Keyboard-shortcut voice dictation using the OpenAI Whisper model.
- Docker
- xclip
- arecord (ALSA)
- paplay (PulseAudio, optional for sounds)
./scripts/install.sh
Or install manually:
sudo apt-get install -y docker.io xclip alsa-utils pulseaudio-utils
sudo usermod -aG docker $USER
newgrp docker  # Or log out and back in
make build
make run
make enable-service
# Test recording (USB mic example)
arecord -f cd -t wav recordings/test.wav -d 3 -D hw:1,0
# Test transcription
make test
- WHISPER_LANG (default: auto) - input language hint. Supports auto, en, ru, english, russian. Use auto to let Whisper detect the language.
- WHISPER_OUTPUT_LANG - strict output language hint (e.g., en or ru). Forces transcription to this language.
- WHISPER_MODEL (default: openai/whisper-base) - model id (restart daemon/service after changing). Available models: tiny, base, small, medium, large-v3, large-v3-turbo.
- WHISPER_PORT (default: 8610) - daemon port (host).
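As a rough illustration of how these variables and their defaults fit together, the sketch below normalizes the documented WHISPER_LANG values and fills in the documented defaults. The `normalize_lang` helper is hypothetical and not part of the project; how the project itself reads these variables may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the project): maps the documented
# WHISPER_LANG values onto a short code and rejects anything else.
normalize_lang() {
  case "${1:-auto}" in
    auto)        echo "auto" ;;
    en|english)  echo "en" ;;
    ru|russian)  echo "ru" ;;
    *)           echo "unsupported language: $1" >&2; return 1 ;;
  esac
}

# Fall back to the documented defaults when a variable is unset.
WHISPER_LANG="$(normalize_lang "${WHISPER_LANG:-auto}")" || exit 1
WHISPER_MODEL="${WHISPER_MODEL:-openai/whisper-base}"
WHISPER_PORT="${WHISPER_PORT:-8610}"
export WHISPER_LANG WHISPER_MODEL WHISPER_PORT

echo "lang=$WHISPER_LANG model=$WHISPER_MODEL port=$WHISPER_PORT"
```

Remember that a changed value only takes effect after the daemon/service is restarted.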
- Go to Settings → Keyboard → View and Customize Shortcuts → Custom Shortcuts
- Add shortcut:
  - Name: "Voice Dictate"
  - Command: /home/yuki/Projects/scuffed-whisper/scripts/voice-dictate.sh
  - Shortcut: Ctrl+Shift+D
The service keeps the container and daemon running in the background and will automatically start on the next login.
Enable autostart:
make enable-service
Verify service is running and enabled:
systemctl --user status whisper-dictate.service
To disable:
make disable-service
If you change WHISPER_MODEL, WHISPER_LANG, or WHISPER_OUTPUT_LANG, restart the service:
make disable-service
make enable-service
- Press shortcut to start recording
- Speak
- Press shortcut again to transcribe and copy to clipboard
- Paste anywhere with Ctrl+V
- Recording stored in recordings/ directory
- Docker container runs CPU-only Whisper model (compatible with AMD integrated GPU)
- HuggingFace cache persisted at ~/.cache/huggingface to avoid repeated downloads
- Transcribe script mounted from host, so no rebuilds needed for script changes
- Daemon keeps model warm and accepts HTTP transcription requests
- Transcription written to temp file, then copied via host xclip (Wayland compatible)
- Audio + desktop notifications for recording/transcribing/ready
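The press-once-to-record, press-again-to-transcribe flow described above could be sketched roughly as follows. This is not the contents of scripts/voice-dictate.sh; the PID file location, the `next_action` and `toggle` helpers, and especially the `/transcribe` route and upload format are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical toggle logic; the real scripts/voice-dictate.sh may differ.
PID_FILE="${PID_FILE:-/tmp/whisper-dictate.pid}"   # assumed location
WAV="recordings/dictation.wav"
TRANSCRIPT="/tmp/whisper-transcript.txt"
# Assumed endpoint: check the daemon for its actual route and request format.
TRANSCRIBE_URL="http://127.0.0.1:${WHISPER_PORT:-8610}/transcribe"

# Print "stop" if a recorder started earlier is still alive, else "start".
next_action() {
  if [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
    echo "stop"
  else
    echo "start"
  fi
}

toggle() {
  if [ "$(next_action)" = "start" ]; then
    # First press: record in the background and remember the recorder's PID.
    arecord -f cd -t wav "$WAV" &
    echo $! > "$PID_FILE"
  else
    # Second press: stop the recorder, then transcribe and copy the result.
    kill "$(cat "$PID_FILE")"
    rm -f "$PID_FILE"
    curl -s -F "file=@$WAV" "$TRANSCRIBE_URL" > "$TRANSCRIPT"
    xclip -selection clipboard < "$TRANSCRIPT"
  fi
}
```

Binding `toggle` to a single shortcut keeps the state in the PID file, so each key press is an independent, short-lived process.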
- Recordings: recordings/dictation.wav
- Temp transcript: /tmp/whisper-transcript.txt
- Container: whisper-app
- Daemon: http://127.0.0.1:8610
- Check service status: systemctl --user status whisper-dictate.service
- Check container: docker ps
- View logs: docker logs whisper-app
- Stop: make stop
- Rebuild: make clean && make build
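The checks above can be combined into one quick health check. The sketch below is only an illustration: the `container_listed`, `port_open`, and `check` helpers are hypothetical, and the port probe relies on bash's /dev/tcp redirection, so it assumes bash rather than plain sh.

```shell
#!/usr/bin/env bash
CONTAINER="whisper-app"
PORT="${WHISPER_PORT:-8610}"

# Succeeds if the container name appears as a whole line on stdin.
container_listed() {
  grep -qx "$CONTAINER"
}

# Succeeds if something accepts TCP connections on the daemon port.
# The subshell closes the descriptor again as soon as it exits.
port_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$PORT") 2>/dev/null
}

# Report the state of the service, the container, and the daemon port.
check() {
  systemctl --user is-active --quiet whisper-dictate.service \
    && echo "service: active" || echo "service: not active"
  docker ps --format '{{.Names}}' | container_listed \
    && echo "container: running" || echo "container: not running"
  port_open && echo "daemon: port $PORT open" || echo "daemon: port $PORT closed"
}
```

Run `check` after sourcing the file; if the container is running but the port is closed, `docker logs whisper-app` is the next place to look.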