VAKH is a minimalist, high-performance desktop application for real-time voice-to-text dictation. It runs entirely offline using OpenAI's Whisper model, ensuring your voice data never leaves your computer.
- 100% Offline: All processing is done locally. No API keys, no subscriptions, and no data collection.
- Fast & Responsive: Powered by Rust and Tauri for minimal overhead and low latency.
- Universal Typing: Works across all applications (Word, Slack, Browser, Terminal, etc.) by simulating keystrokes.
- Smart Slicing: Automatically flushes text every few seconds or at the end of sentences for a smooth flow.
- Visual Feedback: Real-time waveform visualization with color-coded states (Blue for listening, Green for speaking, Red for warnings).
- 5-Minute Sessions: Supports long-form dictation of up to 5 minutes per session.
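
To illustrate the Smart Slicing idea, here is a minimal sketch of a buffer that releases text either at the end of a sentence or after a time limit. It is not VAKH's actual implementation: the `Slicer` type, its `push` method, and the 3-second `FLUSH_INTERVAL` are all assumptions made for this example, since the exact interval is only described as "every few seconds".

```rust
use std::time::Duration;

/// Assumed flush interval; the real value is not documented here.
const FLUSH_INTERVAL: Duration = Duration::from_secs(3);

/// Hypothetical buffer that accumulates transcribed text and decides when to release it.
pub struct Slicer {
    buffer: String,
    since_flush: Duration,
}

impl Slicer {
    pub fn new() -> Self {
        Slicer {
            buffer: String::new(),
            since_flush: Duration::ZERO,
        }
    }

    /// Adds a transcribed fragment along with the amount of time it covered.
    /// Returns the buffered text when it should be typed out, otherwise `None`.
    pub fn push(&mut self, fragment: &str, elapsed: Duration) -> Option<String> {
        self.buffer.push_str(fragment);
        self.since_flush += elapsed;

        // Flush at sentence-ending punctuation (ignoring trailing whitespace)...
        let ends_sentence = self
            .buffer
            .trim_end()
            .ends_with(|c: char| matches!(c, '.' | '!' | '?'));
        // ...or when too much time has passed since the last flush.
        let timed_out = self.since_flush >= FLUSH_INTERVAL;

        if !self.buffer.is_empty() && (ends_sentence || timed_out) {
            self.since_flush = Duration::ZERO;
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }
}
```

With this sketch, pushing `"hello "` and then `"world."` one second apart returns `None` for the first call and the full sentence for the second. Text without punctuation is released once the accumulated time reaches the interval.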
- Launch VAKH: Open the application. A minimalist "orb" UI will appear at the bottom of your screen.
- Start Dictating: Double-tap the Left Ctrl or Right Ctrl key to start listening. The UI will turn Blue.
- Speak: As you talk, the UI will pulse Green. Your words will appear in the focused text field almost instantly.
- Stop/Finalize: Double-tap Ctrl again (or click the stop button) to finalize the transcription.
- Hide/Close: Press Esc or click the close button to hide the window.
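
The double-tap gesture used to start and stop dictation can be sketched as a small state machine over tap timestamps. This is an illustration only: the `DoubleTap` type and the 400 ms `DOUBLE_TAP_WINDOW` are assumptions for this example, and the threshold VAKH actually uses is not stated here.

```rust
use std::time::{Duration, Instant};

/// Assumed maximum gap between two taps for them to count as a double-tap.
const DOUBLE_TAP_WINDOW: Duration = Duration::from_millis(400);

/// Hypothetical detector that remembers the time of the previous unmatched tap.
pub struct DoubleTap {
    last_tap: Option<Instant>,
}

impl DoubleTap {
    pub fn new() -> Self {
        DoubleTap { last_tap: None }
    }

    /// Call once per Ctrl tap. Returns `true` when this tap completes a double-tap.
    pub fn on_tap(&mut self, now: Instant) -> bool {
        match self.last_tap {
            Some(prev) if now.saturating_duration_since(prev) <= DOUBLE_TAP_WINDOW => {
                // Consume the pair so that a third quick tap starts a new sequence.
                self.last_tap = None;
                true
            }
            _ => {
                // First tap, or the previous one was too long ago: start over.
                self.last_tap = Some(now);
                false
            }
        }
    }
}
```

Taking the timestamp as a parameter, rather than calling `Instant::now()` inside the detector, keeps the logic independent of the keyboard hook and easy to test.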
- Download the latest `vakh_0.1.0_x64_en-US.msi` or `vakh.exe` from the Releases page.
- Run the installer.
- Launch VAKH from your Start menu.
If you want to build the project yourself:
- Install Rust and Node.js.
- Clone this repository.
- Run `npm install`.
- Run `npm run tauri build` to generate the production binaries.
- Backend: Rust, Tauri 2.0
- AI Model: Whisper (embedded via `whisper-rs`)
- Frontend: Vanilla HTML/JS/CSS
- Input Simulation: Win32 API (`SendInput`)
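
As a rough illustration of keystroke simulation, the sketch below turns a string into the sequence of key-down and key-up events that would be handed to `SendInput`. It is platform-neutral and does not call the Win32 API: `KeyEvent` and `plan_keystrokes` are hypothetical names for this example, and it assumes text is sent as Unicode input (one UTF-16 code unit per event pair), which may differ from how VAKH actually builds its input records.

```rust
/// Hypothetical stand-in for one keyboard input record.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct KeyEvent {
    /// UTF-16 code unit to type (assumed to be sent as Unicode input).
    pub code_unit: u16,
    /// `false` for key-down, `true` for key-up.
    pub key_up: bool,
}

/// Expands text into a key-down / key-up pair for every UTF-16 code unit.
pub fn plan_keystrokes(text: &str) -> Vec<KeyEvent> {
    text.encode_utf16()
        .flat_map(|code_unit| {
            [
                KeyEvent { code_unit, key_up: false },
                KeyEvent { code_unit, key_up: true },
            ]
        })
        .collect()
}
```

Characters outside the Basic Multilingual Plane encode as two UTF-16 code units, so in this sketch they produce four events instead of two.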
This project is licensed under the MIT License - see the LICENSE file for details.
VAKH uses a small, optimized version of the Whisper model (tiny.en). While very fast, it may occasionally hallucinate (produce text that was not spoken) or make errors in complex environments.
