This guide shows you how to run Kokoro using uv and uvx on your CachyOS laptop.
First, make sure you have uv installed:
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | shAlso ensure you have the system dependency:
# On CachyOS/Arch Linux
sudo pacman -S espeak-ng
# On Ubuntu/Debian
sudo apt-get install espeak-nguvx is perfect for running Kokoro without installing it permanently:
# Basic usage - generate speech from text
uvx kokoro -t "Hello, this is Kokoro TTS!" -m af_heart -o output.wav
# Read from a file
uvx kokoro -i input.txt -m af_heart -o output.wav
# Use different voices and languages
uvx kokoro -t "Bonjour le monde!" -m af_heart -l f -o french.wav
# Adjust speech speed
uvx kokoro -t "This is slower speech" -m af_heart -s 0.8 -o slow.wava- American Englishb- British Englishe- Spanish (es)f- French (fr-fr)h- Hindi (hi)i- Italian (it)j- Japanese (requires:uvx --with misaki[ja] kokoro ...)p- Brazilian Portuguese (pt-br)z- Mandarin Chinese (requires:uvx --with misaki[zh] kokoro ...)
# Spanish
uvx kokoro -t "Hola mundo, esto es una prueba." -m af_heart -l e -o spanish.wav
# French
uvx kokoro -t "Bonjour le monde, ceci est un test." -m af_heart -l f -o french.wav
# Japanese (with additional dependency)
uvx --with misaki[ja] kokoro -t "こんにちは世界" -m af_heart -l j -o japanese.wav
# Chinese (with additional dependency)
uvx --with misaki[zh] kokoro -t "你好世界" -m af_heart -l z -o chinese.wavIf you're working with the source code:
# Clone the repository
git clone https://github.com/hexgrad/kokoro.git
cd kokoro
# Install dependencies
uv sync
# Run with uv
uv run kokoro -t "Hello from development version!" -m af_heart -o dev_output.wavYou can also use the Python API:
# Create a script
cat > tts_script.py << 'EOF'
from kokoro import KPipeline
import soundfile as sf
# Initialize pipeline
pipeline = KPipeline(lang_code='a') # American English
# Generate speech
text = "Hello, this is the Kokoro TTS system running via uvx!"
generator = pipeline(text, voice='af_heart')
# Save audio
for i, (graphemes, phonemes, audio) in enumerate(generator):
print(f"Generated: {graphemes}")
print(f"Phonemes: {phonemes}")
sf.write(f'output_{i}.wav', audio, 24000)
break
print("Audio generated successfully!")
EOF
# Run the script with uvx
uvx --with soundfile --from kokoro python tts_script.py- Missing espeak-ng: Install with your system package manager
- CUDA warnings: These are normal if you don't have a NVIDIA GPU
- Model downloads: First run will download models (~327MB)
- GPU Acceleration: On systems with NVIDIA GPUs, CUDA will be used automatically
- Apple Silicon: Set
PYTORCH_ENABLE_MPS_FALLBACK=1for GPU acceleration on M1/M2/M3/M4 Macs - Memory: The model requires ~1GB RAM when loaded
The default voice is af_heart, but you can explore other voices by checking the voices directory after first run:
# List available voices (after first run)
ls ~/.cache/huggingface/hub/models--hexgrad--Kokoro-82M/snapshots/*/voices/# Process multiple files
for file in *.txt; do
uvx kokoro -i "$file" -m af_heart -o "${file%.txt}.wav"
done# Slower, more dramatic speech
uvx kokoro -t "This is a dramatic announcement." -m af_heart -s 0.7 -o dramatic.wav
# Faster speech
uvx kokoro -t "Quick announcement!" -m af_heart -s 1.3 -o quick.wav# Pipe text directly
echo "Hello from stdin!" | uvx kokoro -m af_heart -o stdin_output.wav
# From a command output
date | uvx kokoro -m af_heart -o current_time.wav#!/bin/bash
# tts.sh - Simple TTS wrapper
TEXT="$1"
OUTPUT="${2:-output.wav}"
VOICE="${3:-af_heart}"
if [ -z "$TEXT" ]; then
echo "Usage: $0 'text to speak' [output.wav] [voice]"
exit 1
fi
uvx kokoro -t "$TEXT" -m "$VOICE" -o "$OUTPUT"
echo "Generated: $OUTPUT"#!/usr/bin/env python3
import subprocess
import sys
def text_to_speech(text, output_file="output.wav", voice="af_heart"):
"""Convert text to speech using uvx kokoro"""
cmd = ["uvx", "kokoro", "-t", text, "-m", voice, "-o", output_file]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode == 0:
print(f"Successfully generated: {output_file}")
return True
else:
print(f"Error: {result.stderr}")
return False
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python tts_wrapper.py 'text to speak'")
sys.exit(1)
text_to_speech(sys.argv[1])This guide should help you get started with Kokoro using uv and uvx on your CachyOS system!