This package provides two user-friendly frontend applications for the Kokoro Text-to-Speech system:
- Web Frontend - Browser-based interface with modern UI
- Desktop GUI - Native desktop application with tkinter
For users who already have the prerequisites (like pyenv, Python 3.12+, and espeak-ng) installed, follow these steps to get the web application running quickly:
You only need to perform these steps once on your machine. This guide uses pyenv to manage Python versions, which is highly recommended to avoid conflicts with your system's default Python.
Install pyenv using a package manager. If you are on macOS, Homebrew is the recommended method:
brew install pyenv
On Arch / CachyOS, use pacman:
sudo pacman -S pyenv
To also install the libraries pyenv needs to build Python from source:
sudo pacman -S pyenv base-devel openssl zlib xz sqlite bzip2 readline tk ncurses libffi
Install Python 3.12.4:
pyenv install 3.12.4
Configure your shell to load pyenv.
For Bash:
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
For Zsh:
echo 'eval "$(pyenv init -)"' >> ~/.zshrc
source ~/.zshrc
For Fish:
echo 'pyenv init - | source' >> ~/.config/fish/config.fish
Restart your terminal.
# 1. Clone the repository and navigate to the frontend directory
git clone https://github.com/(user)/kokoro.git
cd kokoro/frontend
# 2. Set up the Python environment
pyenv local 3.12.4
python3 -m venv .venv
source .venv/bin/activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Make the script executable and run the web application
chmod +x start_web.sh
sh ./start_web.sh
Once started, access the application in your browser at the local URL printed in the terminal (e.g., http://127.0.0.1:53286).
Both applications include a rich set of features:
- ✅ Voice Selection - Choose from multiple available voices
- ✅ Playback Speed Control - Adjust speech speed from 0.5x to 2.0x
- ✅ Text Input Box - Enter text up to 5000 characters
- ✅ Generate Speech Button - One-click speech generation
- ✅ Audio Playback - Built-in audio player
- ✅ File Download/Save - Save generated audio files
- ✅ Language Support - Multiple language options
- ✅ Real-time Status - Shows system status and progress
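The limits in the feature list (text up to 5000 characters, speed from 0.5x to 2.0x) can be checked before a request is sent. The following is a minimal sketch, not code from app.py or gui_app.py: the function name validate_request and the rule that empty text is rejected are assumptions made for illustration.

```python
MAX_TEXT_CHARS = 5000            # text limit from the feature list
MIN_SPEED, MAX_SPEED = 0.5, 2.0  # speed range from the feature list


def validate_request(text: str, speed: float) -> list[str]:
    """Return a list of problems; an empty list means the input looks acceptable.

    Hypothetical helper: the real applications may validate differently.
    """
    problems = []
    if not text.strip():
        # Assumption: empty or whitespace-only text is treated as invalid.
        problems.append("text is empty")
    if len(text) > MAX_TEXT_CHARS:
        problems.append(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    if not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"speed {speed} is outside {MIN_SPEED} to {MAX_SPEED}")
    return problems
```

Returning a list of messages rather than raising on the first problem lets a frontend show every issue at once.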
Follow these steps to get the frontend applications running on your local machine.
Make sure you have the following installed on your system:
- Git: To clone the repository.
- Python 3.12+: We recommend using pyenv to manage Python versions.
- System Dependencies: espeak-ng for text processing and tk (optional, for the Desktop GUI).
Dependency Installation Commands:
- Arch / CachyOS:
  sudo pacman -S espeak-ng tk
- Debian / Ubuntu:
  sudo apt-get install espeak-ng tk
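To confirm the prerequisites before continuing, a short script can probe for each one. This is a sketch using only the Python standard library; the function name check_prerequisites is hypothetical and the script is not part of the repository.

```python
import shutil
import sys


def _has_tkinter() -> bool:
    """True if tkinter can actually be imported by this interpreter."""
    try:
        import tkinter  # noqa: F401
    except ImportError:
        return False
    return True


def check_prerequisites() -> dict[str, bool]:
    """Map each prerequisite from the list above to whether it was found."""
    return {
        "git": shutil.which("git") is not None,
        "python>=3.12": sys.version_info >= (3, 12),
        "espeak-ng": shutil.which("espeak-ng") is not None,
        "tkinter (optional)": _has_tkinter(),
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{'ok' if found else 'MISSING':8}{name}")
```

Run it with the interpreter you plan to use for the frontend, since the Python version and tkinter checks apply to that interpreter only.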
Using pyenv helps manage multiple Python versions without conflicts.
- Install pyenv: Follow the official pyenv installation guide.
- Configure your shell to load pyenv. Add the correct command for your shell's startup file (~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish).
  - For bash:
    echo 'eval "$(pyenv init -)"' >> ~/.bashrc
  - For zsh:
    echo 'eval "$(pyenv init -)"' >> ~/.zshrc
  - For fish:
    echo 'pyenv init - | source' >> ~/.config/fish/config.fish
- Restart your shell or run source <your_shell_config_file> for the changes to take effect.
- Install Python 3.12.4 (or a newer 3.12+ version):
  pyenv install 3.12.4
- Clone the repository:
  git clone https://github.com/(user)/kokoro
- Navigate to the frontend directory:
  cd kokoro/frontend
- Set the local Python version (if you used pyenv):
  pyenv local 3.12.4
- Create and activate a virtual environment:
  python3 -m venv .venv
  source .venv/bin/activate
  Note: To deactivate the virtual environment later, simply run deactivate.
- Install the required Python packages:
  pip install -r requirements.txt
- Make the startup scripts executable:
  chmod +x start_web.sh start_gui.sh
- Start the Web Frontend:
  sh ./start_web.sh
  Note: The first time you run this, it may take a few minutes to download the necessary TTS models.
- Access the application in your web browser. The terminal will show you which URLs to use:
  - Access from the same machine via: http://127.0.0.1:53286
  - Access from other devices on your local network via: http://<YOUR_LOCAL_IP_ADDRESS>:53286 (e.g., http://192.168.1.100:53286). Your local IP address will vary.
- (Alternative) Start the Desktop GUI:
  ./start_gui.sh
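If you are unsure which address to use from another device, the local IP can be looked up from Python. The sketch below is one common best-effort approach and is not taken from the start scripts; the names local_ip and frontend_urls are hypothetical, and the port is the one used in the examples above.

```python
import socket

PORT = 53286  # port used in the examples above


def local_ip() -> str:
    """Best-effort LAN address of this machine; falls back to loopback."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets. It only asks the OS to
        # pick the outgoing interface. 192.0.2.1 is a reserved
        # documentation address, so nothing real is contacted.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


def frontend_urls(port: int = PORT) -> list[str]:
    """URLs for the same machine and for other devices on the network."""
    return [f"http://127.0.0.1:{port}", f"http://{local_ip()}:{port}"]


if __name__ == "__main__":
    for url in frontend_urls():
        print(url)
```

On a machine with several network interfaces the reported address may not be the one other devices can reach, so treat the URLs printed by the startup script as authoritative.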
The web frontend also provides a REST API:
# Generate speech
curl -X POST http://localhost:53286/generate \
-H "Content-Type: application/json" \
-d '{
"text": "Hello world!",
"voice": "af_heart",
"language": "a",
"speed": 1.0
}'
# Check status
curl http://localhost:53286/status
# Health check
curl http://localhost:53286/health
Edit templates/index.html to customize colors, layout, and add features.
Modify gui_app.py to change the window size, layout, fonts, and colors.
To add new voices, update the voices dictionary in both app.py (web) and gui_app.py (desktop).
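The REST API calls shown with curl can also be made from Python. This sketch uses only the standard library and mirrors the /generate request body from the curl example; the function names are hypothetical, and because the response format is not documented here, the body is returned as raw bytes without parsing.

```python
import json
import urllib.request

BASE_URL = "http://localhost:53286"  # same host and port as the curl examples


def build_generate_request(text, voice="af_heart", language="a", speed=1.0):
    """Build the POST request for /generate without sending it."""
    payload = {"text": text, "voice": voice, "language": language, "speed": speed}
    return urllib.request.Request(
        f"{BASE_URL}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(text, **options) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the caller knows how to interpret the body; its format is
    not described in this document, so nothing is decoded here.
    """
    request = build_generate_request(text, **options)
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read()
```

With the web frontend running, generate("Hello world!") sends the same request as the first curl example.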
- "Kokoro TTS is not available" / "No module named 'kokoro'"
  - Ensure your virtual environment is active (source .venv/bin/activate).
  - Re-run pip install -r requirements.txt.
- "No module named 'tkinter'" (Desktop GUI)
  - Install the tk package using your system's package manager (see prerequisites).
- Port already in use (Web Frontend)
  - Change the port in app.py: app.run(host='0.0.0.0', port=YOUR_PORT)
  - Or stop the existing process using that port.
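Before changing the port in app.py, it can help to confirm which ports are actually free. The following sketch tries to bind a socket, which fails when another process already holds the port; the helper names is_port_free and first_free_port are hypothetical and not part of the project.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def first_free_port(start: int = 53286, attempts: int = 20) -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    raise RuntimeError(f"no free port found from {start} to {start + attempts - 1}")


if __name__ == "__main__":
    print(first_free_port())
```

The check is only a snapshot: another process could take the port between the check and the moment the web frontend starts.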