Detoxfox4234/Qwen3-Voice-Factory

🏭 Qwen3 Voice Factory (RTX 50 Series Optimized)

A local, portable GUI for Qwen3-TTS. It is optimized for the NVIDIA RTX 50 Series (CUDA 12.8 / PyTorch Nightly) and also runs on earlier cards such as the RTX 3090 and 4090.

🎯 Perfect for anyone who wants to test these models quickly without building node graphs in ComfyUI.

Screenshot

Features

  • 🎬 Director Mode: Choose a preset voice (Ryan, Vivian) and add a direction such as "Angry" or "Whispering".
  • 🧬 Voice Cloner: Upload a short audio sample (3-10 s) to clone a voice (supports High-Quality ICL Mode).
  • 🎨 Voice Creator: Create completely new voices from scratch using text descriptions (Voice Design).
  • 📊 Live Hardware Monitor: Includes a real-time dashboard to watch your VRAM/RAM/CPU usage while generating.
  • 📂 Auto-Save: Automatically creates an outputs_audio folder and saves every generation with a timestamp.
  • Portable: Does not modify your Windows system. Everything stays contained in one folder.
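To illustrate the Auto-Save behavior described above, here is a minimal Python sketch of how a timestamped file name inside an outputs_audio folder could be built. The function name `next_output_path`, the `generation` prefix, the `.wav` extension, and the timestamp format are assumptions for illustration only; the app's actual naming scheme may differ.

```python
from datetime import datetime
from pathlib import Path


def next_output_path(base_dir=".", prefix="generation", ext=".wav", now=None):
    """Return a timestamped file path inside an outputs_audio folder.

    The prefix, extension, and timestamp format are illustrative
    assumptions, not the app's confirmed naming scheme.
    """
    out_dir = Path(base_dir) / "outputs_audio"
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return out_dir / f"{prefix}_{stamp}{ext}"


if __name__ == "__main__":
    print(next_output_path())
```

Because the folder is created relative to the working directory, everything stays inside the application folder, which matches the portable design.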

Installation

  1. Download this repository as a ZIP file and extract it.
  2. Double-click on install.bat.
    • The script automatically downloads an isolated Python 3.11 environment.
    • It installs PyTorch Nightly (required for Blackwell / RTX 50 Series support).
  3. Wait until the installation is complete.
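If you want to confirm that the installed PyTorch build can see your GPU, a small check like the following may help. It is only a sketch: `describe_gpu` is a hypothetical helper, not part of this project, and it has to be run with the isolated Python environment that install.bat sets up (its location inside the folder is not documented here).

```python
def describe_gpu():
    """Return a one-line summary of what PyTorch can see.

    Hypothetical helper for illustration; not part of the project.
    """
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment."

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} is installed, but no CUDA device is visible."

    name = torch.cuda.get_device_name(0)
    major, minor = torch.cuda.get_device_capability(0)
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return (
        f"PyTorch {torch.__version__} (CUDA {torch.version.cuda}) sees {name}, "
        f"compute capability {major}.{minor}, {total_gb:.1f} GB VRAM."
    )


if __name__ == "__main__":
    print(describe_gpu())
```

The reported VRAM figure can be compared against the 12GB+ recommendation in the Requirements section.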

Usage

  1. Double-click on start.bat.
  2. Your browser will open automatically at http://127.0.0.1:7860.
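If the browser does not open on its own, you can check whether the local server is accepting connections. The sketch below uses only the Python standard library; `is_listening` is a hypothetical helper name, and the host and port defaults are taken from the address above.

```python
import socket


def is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if is_listening():
        print("Server is up: open http://127.0.0.1:7860 in your browser.")
    else:
        print("Nothing is listening on port 7860 yet.")
```

A successful connection only shows that some process is listening on the port, not that it is this application.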

Models

Models are automatically downloaded from HuggingFace the first time you use a specific tab (~4GB per model). Please ensure you have enough disk space.
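A rough way to check disk space before the first download is sketched below. It assumes three models (one per feature tab) at about 4 GB each; both numbers are estimates based on the text above, and `free_space_ok` is a hypothetical helper, not part of the project.

```python
import shutil

GB = 1024**3


def free_space_ok(path=".", models=3, gb_per_model=4.0):
    """Compare free disk space at `path` with an estimated download size.

    Returns (enough_space, free_gb, needed_gb). The model count and the
    per-model size are assumptions, not measured values.
    """
    needed = models * gb_per_model * GB
    free = shutil.disk_usage(path).free
    return free >= needed, free / GB, needed / GB


if __name__ == "__main__":
    ok, free_gb, needed_gb = free_space_ok()
    status = "enough" if ok else "NOT enough"
    print(f"{free_gb:.1f} GB free, about {needed_gb:.0f} GB needed: {status} space.")
```

Run it from the drive where the application folder lives, since `shutil.disk_usage` reports on the volume that contains the given path.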

Requirements

  • Windows 10/11
  • NVIDIA GPU (Recommended: 12GB+ VRAM)
  • Internet connection (required for installation and model download)

🔗 Credits & Acknowledgements

This project is a GUI wrapper built to make the amazing work of the Qwen Team easily accessible. All AI capabilities are powered by their models.

🤝 Support

This is a free open-source project. I don't ask for donations. However, if you want to say "Thanks", check out my profile on Spotify.

A follow or a listen is the best way to support me! 🎧
