A command-line interface wrapper for VOICEPEAK text-to-speech software with preset management and automatic audio playback.
This wrapper enhances the original VOICEPEAK CLI with several powerful features:
- 🎵 Auto-play with mpv - Automatically plays generated audio when no output file is specified
- 📝 Voice presets - Save and reuse combinations of narrator, emotions, and pitch settings
- 📜 Long text support - Automatically splits texts longer than 140 characters and merges audio chunks
- 🔧 Advanced playback modes - Choose between batch (generate all → merge → play) or sequential (generate → play one by one)
- 🔄 Pipe input support - Accept text from stdin: `echo "text" | vp`
- 🔇 Clean output - Suppresses technical output by default (use `--verbose` to see debug info)
- ⚙️ Configuration file - Store your preferred settings in `~/.config/vp/config.toml`
- Enhanced Workflow: No need to manually save and play audio files - just run and listen
- Batch Processing: Handle long documents without worrying about character limits
- Flexible Input: Works with direct text, files, or piped input from other commands
- Personalization: Save your favorite voice configurations for consistent results
- Professional Output: Clean interface with optional verbose mode for debugging
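
The long text support mentioned above can be pictured with a minimal sketch. This is not the wrapper's actual implementation: `split_text` is a hypothetical helper, and both the choice to count Unicode characters (rather than bytes) and the preference for cutting at sentence punctuation are assumptions made for illustration.

```rust
/// Hypothetical helper: split `text` into chunks of at most `max_chars` characters.
/// Prefers to cut right after sentence-ending punctuation inside the window and
/// falls back to a hard cut when no such boundary exists.
fn split_text(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    let chars: Vec<char> = text.chars().collect();
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let hard_end = (start + max_chars).min(chars.len());
        let mut end = hard_end;

        // Only search for a boundary when the remaining text does not fit in one chunk.
        if hard_end < chars.len() {
            if let Some(pos) = chars[start..hard_end]
                .iter()
                .rposition(|&c| matches!(c, '。' | '！' | '？' | '.' | '!' | '?' | '\n'))
            {
                end = start + pos + 1;
            }
        }

        let chunk: String = chars[start..end].iter().collect();
        let trimmed = chunk.trim();
        if !trimmed.is_empty() {
            chunks.push(trimmed.to_string());
        }
        start = end;
    }
    chunks
}

fn main() {
    // 201 characters in total, with one sentence boundary after the first 100.
    let text = format!("{}。{}", "あ".repeat(100), "い".repeat(100));
    for (i, chunk) in split_text(&text, 140).iter().enumerate() {
        println!("chunk {}: {} chars", i + 1, chunk.chars().count());
    }
}
```

Each chunk would then be synthesized separately and the resulting audio merged or played in order, depending on the playback mode.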
- macOS
- VOICEPEAK installed at `/Applications/voicepeak.app/`
- mpv for audio playback (install via Homebrew: `brew install mpv`)
- ffmpeg for batch mode and multi-chunk file output (install via Homebrew: `brew install ffmpeg`)
cargo install voicepeak-cli

- Clone this repository
- Build and install: `cargo install --path .`

# Simple text-to-speech (requires preset or --narrator)
vp "こんにちは、世界!"
# With explicit narrator
vp "こんにちは、世界!" --narrator "夏色花梨"
# Save to file instead of auto-play
vp "こんにちは、世界!" --narrator "夏色花梨" -o output.wav
# Read from file
vp -t input.txt --narrator "夏色花梨"
# Pipe input
echo "こんにちは、世界!" | vp --narrator "夏色花梨"
cat document.txt | vp -p karin-happy

# List available presets
vp --list-presets
# Use a preset
vp "こんにちは、世界!" -p karin-happy
# Override preset settings
vp "こんにちは、世界!" -p karin-normal --emotion "happy=50"

# Control speech parameters
vp "こんにちは、世界!" --narrator "夏色花梨" --speed 120 --pitch 50
# List available narrators
vp --list-narrator
# List emotions for a specific narrator
vp --list-emotion "夏色花梨"

# Allow automatic text splitting (default)
vp "very long text..."
# Strict mode: reject texts longer than 140 characters
vp "text" --strict-length

# Batch mode: generate all chunks first, merge, then play (default)
vp "long text" --playback-mode batch
# Sequential mode: generate and play chunks one by one
vp "long text" --playback-mode sequential
# Long text file output (uses ffmpeg to merge chunks)
vp "very long text" -o output.wav
# For sequential playback without ffmpeg
vp "long text" --playback-mode sequential

Configuration is stored in `~/.config/vp/config.toml`. The file is automatically created on first run.

default_preset = "karin-custom"
[[presets]]
name = "karin-custom"
narrator = "夏色花梨"
emotions = [
{ name = "hightension", value = 10 },
{ name = "sasayaki", value = 20 },
]
pitch = 30
speed = 120
[[presets]]
name = "karin-normal"
narrator = "夏色花梨"
emotions = []
[[presets]]
name = "karin-happy"
narrator = "夏色花梨"
emotions = [{ name = "hightension", value = 50 }]

- `default_preset`: Optional. Preset to use when no `-p` option is specified
- `presets`: Array of voice presets
  - `name`: Unique preset identifier
  - `narrator`: Voice narrator name
  - `emotions`: Array of emotion parameters with `name` and `value`
  - `pitch`: Optional pitch adjustment (-300 to 300)
  - `speed`: Optional speed adjustment (50 to 200)

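
The constraints in the field list above (unique names, pitch between -300 and 300, speed between 50 and 200) can be checked with a small sketch like the following. The `Preset` struct, the `preset` helper, and `validate_presets` are hypothetical names used only for illustration, and the struct carries just the fields this check needs; how the tool itself validates the file is not documented here.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down view of one `[[presets]]` entry.
struct Preset {
    name: String,
    pitch: Option<i32>,
    speed: Option<i32>,
}

/// Convenience constructor for the examples below.
fn preset(name: &str, pitch: Option<i32>, speed: Option<i32>) -> Preset {
    Preset { name: name.to_string(), pitch, speed }
}

/// Check that preset names are unique and that optional values are in range.
fn validate_presets(presets: &[Preset]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for p in presets {
        if !seen.insert(p.name.as_str()) {
            return Err(format!("duplicate preset name: {}", p.name));
        }
        if let Some(pitch) = p.pitch {
            if !(-300..=300).contains(&pitch) {
                return Err(format!("{}: pitch {} is outside -300..=300", p.name, pitch));
            }
        }
        if let Some(speed) = p.speed {
            if !(50..=200).contains(&speed) {
                return Err(format!("{}: speed {} is outside 50..=200", p.name, speed));
            }
        }
    }
    Ok(())
}

fn main() {
    let presets = vec![
        preset("karin-custom", Some(30), Some(120)),
        preset("karin-normal", None, None),
    ];
    match validate_presets(&presets) {
        Ok(()) => println!("presets look valid"),
        Err(e) => println!("invalid config: {}", e),
    }
}
```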
Usage: vp [OPTIONS] [TEXT]

Arguments:
  [TEXT]  Text to say (or pipe from stdin)

Options:
  -t, --text <FILE>              Text file to say
  -o, --out <FILE>               Path of output file (optional - will play with mpv if not specified)
  -n, --narrator <NAME>          Name of voice
  -e, --emotion <EXPR>           Emotion expression (e.g., happy=50,sad=50)
  -p, --preset <NAME>            Use voice preset
      --list-narrator            Print voice list
      --list-emotion <NARRATOR>  Print emotion list for given voice
      --list-presets             Print available presets
      --speed <VALUE>            Speed (50 - 200)
      --pitch <VALUE>            Pitch (-300 - 300)
      --strict-length            Reject input longer than 140 characters (default: false, allows splitting)
      --playback-mode <MODE>     Playback mode: sequential or batch (default: batch)
  -v, --verbose                  Enable verbose output (show VOICEPEAK debug messages)
  -h, --help                     Print help
  -V, --version                  Print version

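
The `--emotion` option takes an expression such as `happy=50,sad=50`. As a rough illustration of how that format breaks down into name/value pairs, here is a minimal parsing sketch. `parse_emotions` is a hypothetical function, and the tolerance for surrounding whitespace and empty segments is an assumption, not documented behavior of `vp`.

```rust
/// Hypothetical parser: turn "happy=50,sad=50" into (name, value) pairs.
fn parse_emotions(expr: &str) -> Result<Vec<(String, i32)>, String> {
    let mut out = Vec::new();
    // Skip empty segments so that "" or a trailing comma yields no entry.
    for part in expr.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let (name, value) = part
            .split_once('=')
            .ok_or_else(|| format!("expected name=value, got '{}'", part))?;
        let name = name.trim();
        if name.is_empty() {
            return Err(format!("missing emotion name in '{}'", part));
        }
        let value: i32 = value
            .trim()
            .parse()
            .map_err(|_| format!("invalid value in '{}'", part))?;
        out.push((name.to_string(), value));
    }
    Ok(out)
}

fn main() {
    match parse_emotions("happy=50,sad=50") {
        Ok(pairs) => {
            for (name, value) in pairs {
                println!("{} -> {}", name, value);
            }
        }
        Err(e) => println!("error: {}", e),
    }
}
```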
When multiple sources specify the same parameter, the priority order is:
- Command-line options (highest priority)
- Preset values
- Default values / none (lowest priority)
For example:
- `vp "text" -p my-preset --pitch 100` uses pitch=100 (CLI override)
- `vp "text" -p my-preset` uses preset's pitch value
- `vp "text" --narrator "voice"` uses no pitch adjustment

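
The priority order above amounts to taking the first source that provides a value. A minimal sketch, assuming each source is modeled as an `Option` (the `resolve` function is a hypothetical name, not part of the tool):

```rust
/// Hypothetical resolver: the first source that provides a value wins,
/// in the order CLI option, preset value, then nothing.
fn resolve<T>(cli: Option<T>, preset: Option<T>) -> Option<T> {
    cli.or(preset)
}

fn main() {
    let preset_pitch = Some(30);

    // CLI value overrides the preset.
    println!("{:?}", resolve(Some(100), preset_pitch)); // Some(100)
    // No CLI value: the preset's pitch is used.
    println!("{:?}", resolve(None, preset_pitch)); // Some(30)
    // Neither source sets a pitch: no adjustment.
    println!("{:?}", resolve::<i32>(None, None)); // None
}
```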
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to contribute to this project.