A local, portable GUI for Qwen3-TTS. Specially optimized for NVIDIA RTX 50 Series (CUDA 12.8 / PyTorch Nightly), but also runs on previous generations (3090/4090).
🎯 Perfect for anyone who wants to test these models quickly without dealing with complex node graphs (ComfyUI).
- 🎬 Director Mode: Choose presets (Ryan, Vivian) and provide direction instructions ("Angry", "Whispering").
- 🧬 Voice Cloner: Upload a short audio file (3-10s) and clone the voice (supports High-Quality ICL Mode).
- 🎨 Voice Creator: Create completely new voices from scratch using text descriptions (Voice Design).
- 📊 Live Hardware Monitor: Includes a real-time dashboard to watch your VRAM/RAM/CPU usage while generating.
- 📂 Auto-Save: Automatically creates an `outputs_audio` folder and saves every generation with a timestamp.
- Portable: Does not modify your Windows system. Everything stays contained in one folder.
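As a rough illustration of the Auto-Save behavior, the sketch below builds a timestamped file path inside an `outputs_audio` folder. The function name `timestamped_output_path`, the `generation` prefix, the `.wav` extension, and the timestamp format are all assumptions made for this example; the actual naming scheme used by the GUI may differ.

```python
from datetime import datetime
from pathlib import Path


def timestamped_output_path(base_dir=".", prefix="generation", ext=".wav", now=None):
    """Return a path inside <base_dir>/outputs_audio named after the current time.

    Hypothetical helper: prefix, extension, and timestamp format are
    illustrative assumptions, not the project's documented scheme.
    """
    out_dir = Path(base_dir) / "outputs_audio"
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return out_dir / f"{prefix}_{stamp}{ext}"
```

Calling `timestamped_output_path()` from the application folder returns a new path for each second of wall-clock time, so earlier generations are not overwritten unless two are saved within the same second.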
- Download this repository as a ZIP file and extract it.
- Double-click on `install.bat`.
  - The script automatically downloads an isolated Python 3.11 environment.
  - It installs PyTorch Nightly (required for Blackwell / RTX 50 Series support).
- Wait until the installation is complete.
- Double-click on `start.bat`.
- Your browser will open automatically at `http://127.0.0.1:7860`.
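If the browser opens but generation fails, it can help to confirm that the installed PyTorch build actually sees your GPU. The snippet below is a minimal sketch, not part of this repository: `describe_gpu` is a hypothetical helper, and it has to be run with the Python interpreter from the portable environment created during installation (its location inside the folder is not documented here).

```python
def describe_gpu():
    """Summarize what the installed PyTorch build can see (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"
    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    # torch.version.cuda is the CUDA version PyTorch was built against
    return f"{name} (compute capability {major}.{minor}), CUDA {torch.version.cuda}"


if __name__ == "__main__":
    print(describe_gpu())
```

On an RTX 50 Series card, the reported CUDA version should match the 12.8 build mentioned above; an older value suggests that a stable PyTorch build was picked up instead of the nightly one.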
Models are automatically downloaded from HuggingFace the first time you use a specific tab (~4GB per model). Please ensure you have enough disk space.
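To check the disk-space requirement before the first download, a quick estimate like the following may be useful. It is a sketch under stated assumptions: `has_room_for_models` is a hypothetical helper, the 4 GB figure is the approximate per-model size quoted above, and the default of three models is a guess based on the three generation tabs.

```python
import shutil


def has_room_for_models(path=".", n_models=3, gb_per_model=4.0):
    """Return True if the drive holding `path` has enough free space.

    n_models=3 is an assumption (one model per tab); gb_per_model uses
    the approximate size quoted in this README.
    """
    needed_bytes = n_models * gb_per_model * 1024**3
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= needed_bytes


if __name__ == "__main__":
    print("Enough space:", has_room_for_models())
```

Pass the drive or folder where the models will be cached as `path`, since that may differ from the folder you extracted the ZIP into.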
- Windows 10/11
- NVIDIA GPU (Recommended: 12GB+ VRAM)
- Internet connection (required for installation and model download)
This project is a GUI wrapper built to make the amazing work of the Qwen Team easily accessible. All AI capabilities are powered by their models.
- Base Models: Developed by Alibaba Cloud / Qwen Team.
- Please support their original work on HuggingFace and GitHub.
This is a free open-source project. I don't ask for donations. However, if you want to say "Thanks", check out my profile on Spotify.
A follow or a listen is the best way to support me! 🎧
