eBookBot: Full-Stack Neural EPUB Reader

A premium, local-first EPUB reader with high-fidelity "Direct Neural" text-to-speech. Built with Next.js, FastAPI, and ONNX.

🚀 Overview

eBookBot converts your EPUB books into immersive audio experiences. It uses a flow-matching-based TTS engine (ReaderAudioEngine) to generate natural speech with word-level synchronization.

TTS model in use: Supertone/supertonic-2.

Highlights

Local-first pipeline with fast, responsive playback.
Word-sync highlighting aligned with neural audio.
Fine-grained reading controls for layout and tempo.
Modular architecture: Next.js UI + FastAPI API + ONNX TTS engine.

Stack

Layer	Technology
Frontend	Next.js (App Router)
Backend	FastAPI
TTS Engine	ONNX Runtime + ReaderAudioEngine
TTS Model	Supertone/supertonic-2

📖 How to Use

Requirements

Python 3.10+
Node.js 18+
ONNX Runtime (a CUDA build is recommended for GPU acceleration; CPU also works)

⚡ Quick Start (Recommended)

The easiest way to run the backend and frontend together is the run.py script:

python run.py

This will:

Start the FastAPI backend.
Start the Next.js frontend.
Handle clean shutdown of both services.
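
As a rough illustration of what a launcher like this involves, the sketch below starts two child processes and stops both when either one exits or when you press Ctrl+C. It is not the project's actual run.py: the names run_services and DEFAULT_COMMANDS are hypothetical, and the commands are simply the ones listed under Manual Setup.

```python
import subprocess
import sys
import time

# Hypothetical command list, taken from the Manual Setup steps.
# Each entry is (argv, working directory).
# On Windows the npm executable is typically "npm.cmd".
DEFAULT_COMMANDS = [
    ([sys.executable, "-m", "uvicorn", "app.main:app"], "ReaderAudioAPI"),
    (["npm", "run", "dev"], "reader-frontend"),
]


def run_services(commands, poll_interval=0.5, stop_timeout=10):
    """Start every command, wait until one exits or Ctrl+C, then stop all."""
    procs = [subprocess.Popen(argv, cwd=cwd) for argv, cwd in commands]
    try:
        # Keep waiting while every child process is still alive.
        while all(p.poll() is None for p in procs):
            time.sleep(poll_interval)
    except KeyboardInterrupt:
        pass  # fall through to the shutdown logic below
    finally:
        # Ask the remaining children to stop, then wait for them.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            try:
                p.wait(timeout=stop_timeout)
            except subprocess.TimeoutExpired:
                p.kill()  # last resort if the child ignores terminate()
                p.wait()
    return [p.returncode for p in procs]
```

Calling run_services(DEFAULT_COMMANDS) from the repository root would start both services under these assumptions. Stopping everything as soon as one service exits is a design choice of this sketch, so that a crashed backend does not leave the frontend running on its own.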

🔧 Manual Setup

If you prefer to run services separately:

1. Backend Setup (ReaderAudioAPI)

cd ReaderAudioAPI
pip install -r requirements.txt
python -m uvicorn app.main:app --reload

2. Frontend Setup (reader-frontend)

cd reader-frontend
npm install
npm run dev

Adding Books

Open http://localhost:3000.
Click the + (Plus) icon in the sidebar.
Upload an EPUB file and wait for processing.
- Tip: You can purchase high-quality EPUBs from official bookstores or find catalogs on community sites like Free Media Collection.
Select the book and click Play.

Storage Notes

An average book requires about 400 MB of local storage (audio + cache). We will optimize this in the future; see TODO below.

Data Directory

By default, runtime data is stored in ReaderAudioAPI/oas_assets/ (uploads, audio, metadata).

You can override this location by setting EBOOKBOT_DATA_DIR before starting the backend.
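
The override rule can be sketched as follows. This is a minimal illustration, not the backend's actual code: the function name resolve_data_dir and the handling of blank values are assumptions; only the variable name EBOOKBOT_DATA_DIR and the default path come from this README.

```python
import os
from pathlib import Path

# Default location stated in this README, relative to the repository root.
DEFAULT_DATA_DIR = Path("ReaderAudioAPI") / "oas_assets"


def resolve_data_dir(env=None):
    """Return the data directory, honoring EBOOKBOT_DATA_DIR when it is set."""
    env = os.environ if env is None else env
    override = env.get("EBOOKBOT_DATA_DIR", "").strip()
    # Assumption: an empty or whitespace-only value means "use the default".
    return Path(override).expanduser() if override else DEFAULT_DATA_DIR
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.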

Performance & Resource Management

TTS worker pool: defaults to 3 GPU workers.
- EBOOKBOT_TTS_WORKERS (default: 3)
- EBOOKBOT_TTS_MAX_INFLIGHT (default: workers * 2)
- EBOOKBOT_TTS_TASK_TIMEOUT_SECONDS (default: 600)
Idle GPU cleanup: when you pause generation and no work remains, the TTS worker processes shut down to free VRAM. Workers restart automatically when the next task is queued.
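
The defaults above, including the derived default for EBOOKBOT_TTS_MAX_INFLIGHT, can be read like this. It is a sketch under assumptions: TTSPoolConfig and load_tts_pool_config are hypothetical names, and the validation step is added for illustration; the backend may parse these variables differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSPoolConfig:
    workers: int
    max_inflight: int
    task_timeout_seconds: int


def load_tts_pool_config(env=None):
    """Read the worker-pool settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    workers = int(env.get("EBOOKBOT_TTS_WORKERS", "3"))
    # The in-flight limit defaults to twice the worker count.
    max_inflight = int(env.get("EBOOKBOT_TTS_MAX_INFLIGHT", str(workers * 2)))
    timeout = int(env.get("EBOOKBOT_TTS_TASK_TIMEOUT_SECONDS", "600"))
    # Assumption: non-positive values are rejected rather than silently fixed.
    if min(workers, max_inflight, timeout) < 1:
        raise ValueError("TTS pool settings must be positive integers")
    return TTSPoolConfig(workers, max_inflight, timeout)
```

For example, setting only EBOOKBOT_TTS_WORKERS=2 yields an in-flight limit of 4 in this sketch, because the limit is derived from the worker count unless it is set explicitly.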

⚙️ Features

Dynamic Controls
- Precise sliders for Reading Size, Line Height, Word Spacing, and Chunk Gaps
- Tempo control (0.5x to 3.0x)
Instant Playback: iterative chunking lets playback start right away while the remaining audio is generated in the background.
Word-Sync: Visual highlighting tracks the neural audio in real time.
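
To make the word-sync idea concrete, the sketch below finds which word to highlight at a given audio position, assuming a sorted list of word start times is available. It is illustrative only: active_word_index and clamp_tempo are hypothetical helpers, and the real frontend logic may differ. The tempo limits are the 0.5x to 3.0x range listed above.

```python
from bisect import bisect_right


def clamp_tempo(tempo, low=0.5, high=3.0):
    """Keep a requested tempo inside the supported 0.5x to 3.0x range."""
    return max(low, min(high, tempo))


def active_word_index(word_starts, audio_time):
    """Return the index of the word being spoken at audio_time (seconds).

    word_starts must be sorted start times in seconds. Returns None before
    the first word starts or when there are no words.
    """
    if not word_starts or audio_time < word_starts[0]:
        return None
    # bisect_right counts the words that have already started;
    # the last of those is the one currently being spoken.
    return bisect_right(word_starts, audio_time) - 1
```

A binary search keeps each lookup logarithmic in the number of words, which matters when the highlight is refreshed many times per second.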

🛠 ReaderAudioEngine (Submodule)

The core engine is included as a submodule. It is responsible for:

Auto-downloading models from HuggingFace.
Low-latency ONNX inference.
Estimating precise word timestamps for highlighting.
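
This README does not describe how the engine derives word timestamps. Purely to illustrate the kind of data involved, here is a naive heuristic that spreads a chunk's audio duration across its words in proportion to word length. The function estimate_word_timestamps is hypothetical and is not ReaderAudioEngine's method.

```python
def estimate_word_timestamps(text, duration):
    """Naively split `duration` seconds across the words of `text`.

    Returns a list of (word, start, end) tuples. Each word receives a share
    of the duration proportional to its character count.
    """
    words = text.split()
    if not words or duration <= 0:
        return []
    total_chars = sum(len(w) for w in words)
    stamps = []
    cursor = 0.0
    for word in words:
        span = duration * len(word) / total_chars
        stamps.append((word, cursor, cursor + span))
        cursor += span
    return stamps
```

A heuristic like this ignores pauses and punctuation, so it is only a baseline; the engine's own estimates should be preferred wherever they are available.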

To contribute or find more details about the engine, visit the ReaderAudioEngine/ directory.

TODO

Optimize per-book storage size (target below ~100 MB).
- Currently about 200 MB.
Add audio compression or streaming for long books.
Provide a cleanup tool for cached audio.

License

MIT (see LICENSE).

Note: the TTS model and the ReaderAudioEngine submodule may be governed by their own separate licenses/terms.

Contact

Type	Details
Author	Izzet Sezer
Email	sezer@imsezer.com
