EPUB TTS Reader API

Status: testing phase. Some features may not work yet.

A REST API for converting EPUB files to text and generating text-to-speech audio using the Supertonic-3 model. Supports direct text input, EPUB conversion via Calibre, and sentence-level audio generation with on-device inference.

Technology Stack

Backend

FastAPI: Modern, fast web framework for building APIs with Python
Supertonic-3: Lightning-fast, on-device text-to-speech model supporting 31 languages
ONNX Runtime: High-performance inference engine for machine learning models
Calibre: eBook conversion tool for EPUB to text conversion
Python 3.12: Core programming language

Frontend

HTML5/CSS3: Markup and styling
Vanilla JavaScript: Client-side interactivity
Flexbox: CSS layout system

Key Features

On-device TTS inference (no cloud API calls required)
EPUB to text conversion via Calibre CLI
Sentence-level audio generation
Click-to-play from any text position
Auto-scroll toggle functionality
Adjustable font size
Monospace font with borders for clarity
RESTful API with OpenAPI/Swagger documentation
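
To illustrate what sentence-level generation involves, the sketch below splits text into sentence-like segments that could each be sent to the TTS model separately. It is not the splitter used in main.py; the function name `split_sentences` and the regex rule are assumptions chosen for illustration, and the rule is deliberately naive.

```python
import re

# Split after '.', '!' or '?' when followed by whitespace.
# Naive by design: abbreviations such as "Dr." will also end a segment.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Return the non-empty sentence-like segments of `text`, in order."""
    # Collapse newlines and repeated spaces so line wraps do not split sentences.
    normalized = " ".join(text.split())
    if not normalized:
        return []
    return [segment for segment in _SENTENCE_END.split(normalized) if segment]


segments = split_sentences("Call me Ishmael. Some years ago,\nI went to sea! Why?")
for index, sentence in enumerate(segments):
    print(index, sentence)
```

Keeping the segment index alongside each sentence is one way a client could support click-to-play: the index identifies which segment to request audio for.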

API Documentation

Interactive API documentation is available at:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Installation

Local Development

Prerequisites

Python 3.12+
Calibre (for EPUB conversion)
uv (Python package manager)

Setup

Clone the repository and navigate to the project directory
Create a virtual environment and install dependencies:

uv venv
uv pip install fastapi uvicorn python-multipart supertonic

Set your Hugging Face token in the HF_TOKEN environment variable (preferred), or update it in main.py
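Run the server (Windows):

.venv\Scripts\python main.py

On Linux or macOS, the virtual environment's interpreter is under bin instead:

.venv/bin/python main.py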

The API will be available at http://localhost:8000

Docker Deployment

Prerequisites

Docker installed on your system

Build and Run

Build the Docker image:

docker build -t epub-tts-reader .

Run the container:

docker run -d -p 8000:8000 -e HF_TOKEN=your_huggingface_token epub-tts-reader

Access the API at http://localhost:8000

Docker Compose

Create a docker-compose.yml file:

version: '3.8'
services:
  epub-tts-reader:
    build: .
    ports:
      - "8000:8000"
    environment:
      - HF_TOKEN=your_huggingface_token
    volumes:
      - ./uploads:/app/uploads
      - ./audio:/app/audio

Run with:

docker-compose up -d

API Endpoints

Web Interface

GET / - Main web interface for EPUB TTS Reader

EPUB Processing

POST /upload-epub - Upload and convert EPUB to text

Text Processing

POST /load-text - Load and parse direct text input

TTS Generation

POST /generate-audio - Generate audio for text segment

Audio Serving

GET /audio/{filename} - Serve generated audio files
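
As a rough idea of how a client might call the TTS endpoint, the sketch below builds a POST request for /generate-audio using only the standard library. The request schema is not documented here, so the JSON body with a single "text" field is an assumption; check the Swagger UI at /docs for the actual parameters before relying on it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the setup section


def build_generate_audio_request(
    text: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /generate-audio.

    ASSUMPTION: the endpoint accepts a JSON body with a "text" field.
    The real schema may differ; see /docs on a running server.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate-audio",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_audio_request("Hello, world.")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```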

Deployment Considerations

Security

Remove or secure the Hugging Face token before public deployment
Implement rate limiting for API endpoints
Add authentication/authorization for production use
Validate and sanitize all user inputs
Use environment variables for sensitive configuration
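
Input validation matters most where user input reaches the filesystem, as with GET /audio/{filename}. The sketch below shows one way to reject path traversal before serving a file. The helper name `resolve_audio_path` and the `audio` directory are assumptions for illustration, not code taken from main.py.

```python
from pathlib import Path

AUDIO_DIR = Path("audio")  # assumed storage directory for generated audio


def resolve_audio_path(filename: str, audio_dir: Path = AUDIO_DIR) -> Path:
    """Return the absolute path of `filename` inside `audio_dir`.

    Raises ValueError for anything that is not a bare file name.
    """
    is_bare_name = (
        bool(filename)
        and filename not in {".", ".."}
        and "\\" not in filename          # Windows separators, rejected on any OS
        and Path(filename).name == filename  # no directory components
    )
    if not is_bare_name:
        raise ValueError(f"invalid audio filename: {filename!r}")

    root = audio_dir.resolve()
    candidate = (root / filename).resolve()
    # Defence in depth: a symlink could still point outside the directory.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes audio directory: {filename!r}")
    return candidate


print(resolve_audio_path("segment_001.wav").name)
```

In a FastAPI route, the ValueError would typically be translated into a 400 or 404 response rather than propagated.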

Performance

Implement audio file cleanup to prevent disk space issues
Add caching for frequently accessed content
Consider using a CDN for static assets
Implement request queuing for heavy TTS operations

Scalability

Run the app under a production process manager, such as Gunicorn with Uvicorn workers
Implement horizontal scaling with load balancing
Use Redis or similar for session management
Consider containerization with Docker

Monitoring

Add logging for API requests and errors
Implement health check endpoints
Set up error tracking (e.g., Sentry)
Monitor resource usage and response times
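
A health check usually reports a small status payload that a load balancer or monitor can poll. The sketch below computes such a payload with the standard library only. The function name, the field names, and the 100 MiB free-disk threshold are all assumptions; in this project it would presumably be returned from a FastAPI route such as a hypothetical GET /health.

```python
import shutil
import time
from pathlib import Path

_STARTED_AT = time.monotonic()  # recorded once, when the module is imported


def health_status(
    data_dir: Path = Path("."), min_free_bytes: int = 100 * 1024 * 1024
) -> dict:
    """Return a small, JSON-serializable health summary.

    Status is "degraded" when free disk space under `data_dir` falls below
    `min_free_bytes`, since generated audio accumulates on disk.
    """
    free = shutil.disk_usage(data_dir).free
    return {
        "status": "ok" if free >= min_free_bytes else "degraded",
        "uptime_seconds": round(time.monotonic() - _STARTED_AT, 3),
        "disk_free_bytes": free,
    }


print(health_status()["status"])
```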

Environment Variables

HF_TOKEN: Hugging Face API token for model access
PORT: Server port (default: 8000)

License

MIT
