QwenVoice is the public repository for an offline Apple-platform Qwen3-TTS app with Custom Voice, Voice Design, and Voice Cloning.
The currently shipped public release is QwenVoice v1.2.3 for macOS. The next macOS release ships under the Vocello app name while this repository, shared core, and many internal modules keep the QwenVoice identity for continuity.
| Surface | Public name | Artifact | Minimum OS / tools | Runtime | Status |
|---|---|---|---|---|---|
| Shipped release | QwenVoice v1.2.3 | Assets attached to the v1.2.3 GitHub Release | See the v1.2.3 release notes | SwiftUI app with the shipped legacy runtime | Current public download |
Current main |
QwenVoice repo / Vocello app | Local builds produce Vocello.app |
macOS 26.0+, iOS 26.0+, Xcode 26.0 | Native Swift/MLX shared core with macOS XPC isolation and iPhone extension isolation | Active development |
| Next macOS release | Vocello | Vocello-macos26.dmg |
macOS 26.0+ | Native Swift/MLX macOS runtime hosted out of process | Current release target |
| iPhone track | Vocello for iPhone | App Store / TestFlight only | iOS 26.0+, iPhone 15 Pro minimum target | 4-bit Speed variants in an engine extension | Maintained but deferred |
The public landing page remains QwenVoice-led until the Vocello-branded macOS release ships. iPhone is in active development, but it is not a public release surface for the current macOS-first milestone.
Generate speech with the app's built-in English speakers:
- Ryan
- Aiden
- Serena
- Vivian
Voice Design is a standalone destination. Describe the voice you want, then shape tone before generating.
Clone a voice from a short reference clip. The app accepts WAV, MP3, AIFF, M4A, FLAC, and OGG input and can also use an optional transcript for better cloning accuracy. Only clone voices you own or have permission to use.
- no temperature or max-token controls
- no streaming batch UI
Single-generation flows use live streaming preview and sidebar playback. Batch generation remains sequential and final-file-based.
- Native model downloads from Hugging Face
- Live streaming preview for single generations
- Local generation history stored in SQLite via GRDB
- Batch generation for multi-line jobs
- Sidebar waveform playback UI
- Configurable output directory and autoplay preference
- macOS XPC process isolation for native generation on current
main - iPhone engine-extension isolation for the deferred iPhone track
| Requirement | Detail |
|---|---|
| macOS | 26.0+ |
| iOS | 26.0+ for the maintained iPhone targets |
| Chip | Apple Silicon |
| RAM | 8 GB+ on macOS; iPhone 15 Pro is the stated iPhone floor |
| Tools | Xcode 26.0 and XcodeGen |
The shipped v1.2.3 build predates the current native main release track. Use the v1.2.3 GitHub Release notes and attached assets as the source of truth for that historical build.
Download the current public release from Releases.
For the next macOS release, the public artifact is expected to be:
Vocello-macos26.dmg
Then:
- Open the DMG.
- Drag
Vocello.appto/Applications. - Open the app, go to Models, download a model, and generate speech.
Static model metadata comes from Sources/Resources/qwenvoice_contract.json.
| Mode | 8-bit Quality folder | 4-bit Speed folder | Hugging Face repos |
|---|---|---|---|
| Custom Voice | Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit |
Qwen3-TTS-12Hz-1.7B-CustomVoice-4bit |
mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-* |
| Voice Design | Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit |
Qwen3-TTS-12Hz-1.7B-VoiceDesign-4bit |
mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-* |
| Voice Cloning | Qwen3-TTS-12Hz-1.7B-Base-8bit |
Qwen3-TTS-12Hz-1.7B-Base-4bit |
mlx-community/Qwen3-TTS-12Hz-1.7B-Base-* |
On macOS, the Models screen exposes both Speed / 4-bit and Quality / 8-bit rows for each generation mode, for six downloadable rows total. The app marks the hardware-recommended row and the currently active row separately: 8 GB/floor Macs recommend and default to Speed, while mid/high-memory Macs recommend and default to Quality. Users can install both folders side by side and switch the active installed variant per mode with Use.
Downloads come directly from the Hugging Face repos listed above through the contract metadata. iPhone remains Speed-only and exposes the 4-bit rows for its supported model catalog.
Source-build prerequisites for current main:
- macOS 26.0+
- Apple Silicon
- Xcode 26.0
- XcodeGen
git clone https://github.com/PowerBeef/QwenVoice.git
cd QwenVoice
./scripts/regenerate_project.sh
open QwenVoice.xcodeprojBuild the QwenVoice scheme from Xcode, or use:
xcodebuild -project QwenVoice.xcodeproj -scheme QwenVoice buildUseful local checks:
./scripts/check_project_inputs.sh
./scripts/qa.sh validate
./scripts/qa.sh test --layer contract
./scripts/qa.sh test --layer swift
./scripts/qa.sh test --layer native
./scripts/build_foundation_targets.sh macos
./scripts/build_foundation_targets.sh iosqa.sh stores isolated build products and .xcresult bundles under build/harness/. The macOS UI smoke lane is available with ./scripts/qa.sh test --layer e2e; for release signoff on a controlled Mac, run it strictly with QWENVOICE_E2E_STRICT=1.
Performance and audio-QC validation runs through the opt-in perf lane (requires installed models, not part of --layer all):
./scripts/qa.sh test --layer perfFor a local unsigned macOS release build and DMG:
./scripts/release.sh --preflight full
./scripts/verify_release_bundle.sh build/Vocello.app
./scripts/verify_packaged_dmg.sh build/Vocello-macos26.dmg build/release-metadata.txtCustom Voice and Voice Design are guided by natural-language instructions rather than SSML-style sliders or markup.
See docs/qwen_tone.md for app-oriented guidance on tone and prompt writing.
Current main uses a native Apple-platform architecture:
Sources/contains the macOS app shell, shared app models/services/views, and the shipping Mac target.Sources/QwenVoiceCore/contains shared Apple-platform runtime semantics, contract types, model variants, and iOS extension transport.Sources/QwenVoiceNative/contains the macOS app-facing engine proxy/store/client layer.Sources/QwenVoiceEngineSupport/contains shared macOS engine IPC and transport types.Sources/QwenVoiceEngineService/contains the bundled macOS XPC helper.Sources/iOS/,Sources/iOSSupport/, andSources/iOSEngineExtension/contain the deferred iPhone app, support layer, and isolated engine extension.Sources/SharedSupport/contains shared playback and generation-persistence surfaces.
The current codebase does not maintain a repo-owned Python backend, Python setup path, or standalone CLI surface.
Default macOS runtime data layout:
~/Library/Application Support/QwenVoice/
models/
outputs/
CustomVoice/
VoiceDesign/
Clones/
voices/
history.sqlite
See docs/reference/privacy-storage.md for local storage, privacy, and deletion details.
docs/README.md- documentation indexdocs/reference/current-state.md- current repo factsdocs/reference/engineering-status.md- current strengths and caveatsdocs/reference/release-readiness.md- macOS-first release policy and signoff gatesdocs/reference/privacy-storage.md- local storage, privacy, and deletion detailsCONTRIBUTING.md- contributor workflow
QwenVoice is available under the MIT License.
QwenVoice builds on: