Self-contained speech-to-text tool. Press a hotkey to record, press it again to stop. Audio is transcribed locally using Whisper and copied to your clipboard. No external server, no cloud, one process.
Yapper has two entry points – same engine, different interfaces:

- `yapper` – Raw/headless mode. Minimal terminal output, no dependencies beyond the core. Good for scripting, running in the background, or when you just want it to work without any fuss.
- `yapper-tui` – Rich terminal dashboard. Live status panel, real-time waveform visualization, activity log, transcription display, and audio feedback sounds (beeps in your headphones on start/stop). Good for when you want to see what the app is doing.

Both accept the same CLI options. The TUI is a separate layer that wraps the core; it never touches the engine logic.
- On launch, loads a Whisper model onto the GPU (or CPU) and keeps it in memory
- Press `Shift+Space` to start recording from your microphone
- Press `Shift+Space` again to stop
- Audio is transcribed locally with VAD filtering (strips silence)
- The result is copied to the clipboard and a notification fires
- Press `Ctrl+Shift+;` to toggle language (EN ↔ HR)
- Ready for the next recording
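The toggle loop above can be sketched as a tiny state machine. This is a simplified illustration only; the `Session` class and the `recorder`/`transcriber`/`clipboard` collaborators are hypothetical stand-ins, not Yapper's actual API:

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    RECORDING = auto()

class Session:
    """Toggle-to-record loop: each hotkey press flips between idle and recording."""

    def __init__(self, recorder, transcriber, clipboard):
        self.state = State.IDLE
        self.recorder = recorder        # starts/stops mic capture
        self.transcriber = transcriber  # audio -> text (Whisper + VAD)
        self.clipboard = clipboard      # receives the final transcript

    def on_hotkey(self):
        if self.state is State.IDLE:
            self.recorder.start()
            self.state = State.RECORDING
        else:
            audio = self.recorder.stop()
            text = self.transcriber(audio)
            self.clipboard(text)
            self.state = State.IDLE
```

A real frontend would call `on_hotkey` from the global hotkey callback and fire the notification after the clipboard write.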
Download the latest release from the Releases page.
On Linux:

```bash
# System dependencies
sudo apt install libportaudio2 xclip

# (Optional) For Wayland clipboard support instead of xclip:
# sudo apt install wl-clipboard

# Clone and install
git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
source .venv/bin/activate

# Install PyTorch with CUDA (for GPU acceleration)
uv pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install yapper
uv pip install -e .
```

On macOS:

```bash
brew install portaudio

git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
source .venv/bin/activate
uv pip install -e .
```

Notes:
- No CUDA on macOS; uses CPU with int8 quantization. Apple Silicon (M-series) runs the `base` model in under a second for short clips.
- macOS will prompt for microphone access and Accessibility permissions (for pynput) on first run.
On Windows:

```powershell
git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
.venv\Scripts\activate

# Install PyTorch with CUDA
uv pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install yapper
uv pip install -e .
```

Note: Install the Visual C++ Redistributable if it is not already present.
```bash
# Terminal dashboard with waveform, status, and sound feedback
yapper-tui

# With options
yapper-tui --model large-v3-turbo --language en
yapper-tui --compute cpu
```

```bash
# Plain terminal output, no rich UI
yapper

# Same options
yapper --model base --compute cpu
```

```bash
# Set language explicitly (skips detection, faster)
yapper-tui --language en

# Croatian
yapper-tui --language hr

# Custom hotkey
yapper-tui --hotkey "<ctrl>+<space>"

# Save recordings to ~/Recordings/yapper/
yapper-tui --save

# List available microphones
yapper-tui --list-devices

# Pick a specific microphone by index
yapper-tui --device 3

# Force CPU even if CUDA is available
yapper-tui --compute cpu
```

| Option | Default | Description |
|---|---|---|
| `--model` | `base` | Whisper model: `tiny`, `base`, `small`, `medium`, `large-v3`, `large-v3-turbo` |
| `--language` | `en` | Language code (`en`, `hr`, etc.) |
| `--hotkey` | `<shift>+<space>` | Global hotkey in pynput format |
| `--lang-hotkey` | `<ctrl>+<shift>+;` | Hotkey to toggle language |
| `--device` | `auto` | Audio input device index, or `auto` |
| `--compute` | `auto` | Force `cuda` or `cpu` (`auto` detects CUDA) |
| `--save` | off | Save audio + transcript to `~/Recordings/yapper/` |
| `--list-devices` | – | List audio input devices and exit |
| Model | Size | Speed (GPU) | Speed (CPU) | Quality |
|---|---|---|---|---|
| tiny | ~75 MB | instant | fast | basic |
| base | ~150 MB | instant | fast | good (default) |
| small | ~500 MB | instant | moderate | better |
| medium | ~1.5 GB | fast | slow | great |
| large-v3-turbo | ~1.5 GB | fast | slow | best |
Models download automatically on first run and are cached in `~/.cache/huggingface/`.
```
yapper/              Core package
  cli.py             CLI entry point (parse_args, main)
  core.py            Engine: recording, transcription, clipboard
  devices.py         Device detection, mic check, paired output
  clipboard.py       Cross-platform clipboard with fallback chain
  compute.py         GPU/CPU detection
  notifications.py   Desktop notifications
  constants.py       Shared constants
yapper_tui.py        TUI entry point: wires core + display + sounds
tui/
  display.py         Rich Live dashboard (status, waveform, log)
  sounds.py          Audio feedback (beeps via dedicated OutputStream)
```
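The fallback chain in `clipboard.py` might look roughly like this sketch; the helper names and the exact candidate order here are assumptions for illustration, not the real implementation:

```python
import shutil
import subprocess

# Candidate clipboard commands, tried in order until one is found on PATH.
CANDIDATES = [
    ["wl-copy"],                           # Wayland (wl-clipboard)
    ["xclip", "-selection", "clipboard"],  # X11
    ["pbcopy"],                            # macOS
]

def pick_clipboard_cmd():
    """Return the first available clipboard command, or None if none exist."""
    for cmd in CANDIDATES:
        if shutil.which(cmd[0]):
            return cmd
    return None

def copy_to_clipboard(text: str) -> bool:
    """Pipe text into the first working clipboard tool; False if none found."""
    cmd = pick_clipboard_cmd()
    if cmd is None:
        return False  # caller can fall back to e.g. pyperclip
    subprocess.run(cmd, input=text.encode(), check=True)
    return True
```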
The core never imports `rich` or anything from `tui/`. The TUI subscribes to events via callbacks and adds its own rendering and sound layer. This separation means:

- The core can be tested and used without any UI
- Alternative frontends (GUI, web, system tray) can wrap the same engine
- The TUI can evolve independently without risking the transcription logic
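In miniature, that callback subscription could look like this; the event and method names are illustrative, not Yapper's actual API:

```python
class Engine:
    """Core engine: knows nothing about rendering, just emits events."""

    def __init__(self):
        self._listeners = {}

    def on(self, event: str, callback):
        """Subscribe a callback to a named event."""
        self._listeners.setdefault(event, []).append(callback)

    def _emit(self, event: str, payload=None):
        for cb in self._listeners.get(event, []):
            cb(payload)

    def finish_transcription(self, text: str):
        # Core work happens here; frontends only observe the result.
        self._emit("transcription", text)

# A frontend (TUI, GUI, tray icon) subscribes without the core importing it.
engine = Engine()
engine.on("transcription", lambda text: print(f"[log] {text}"))
```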
- Status panel – shows idle/recording/transcribing with color-coded borders
- Live waveform – real-time audio level visualization using Unicode block characters
- Recording timer – elapsed time shown during recording
- Activity log – timestamped log of all events
- Last transcription – always visible for reference
- Sound feedback – ascending beep on record start, descending on stop, chirp when the clipboard is ready
- Smart audio routing – sounds play through headphones when using a headset mic (e.g. PlayStation Link, Elgato Wave)
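The waveform trick is simple enough to sketch: map normalized audio levels (0.0–1.0) onto Unicode block characters. This is a standalone illustration, not the TUI's actual rendering code:

```python
BLOCKS = " ▁▂▃▄▅▆▇█"  # 9 steps from silence to full scale

def waveform(levels):
    """Render normalized audio levels as a one-line bar chart."""
    out = []
    for level in levels:
        level = min(max(level, 0.0), 1.0)  # clamp to [0, 1]
        out.append(BLOCKS[round(level * (len(BLOCKS) - 1))])
    return "".join(out)

print(waveform([0.0, 0.2, 0.5, 0.9, 1.0]))  # → " ▂▄▇█"
```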
pynput requires X11 or XWayland for global hotkey capture. On a pure Wayland session, the options are:

- Run under XWayland (most desktop environments still support this)
- Switch to an X11 session
- Future: an evdev-based input backend (planned)
sounddevice works with both PipeWire and PulseAudio. Use `--list-devices` to see what's available.
First run will trigger two permission prompts:

- Microphone access – required for recording
- Accessibility – required for pynput's global hotkey

Both are one-time grants in System Settings.
- CUDA works if you have an NVIDIA GPU and install PyTorch with CUDA support
- CPU mode uses int8 quantization – works fine for the `tiny` and `base` models
- Clipboard uses `pyperclip` (which uses `win32clipboard` under the hood)
- Notifications use `win10toast` – install with `uv pip install yapper[windows]`
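The `--compute auto` behavior can be approximated as below. This is a sketch: the `float16`/`int8` compute types match common faster-whisper usage, but they are assumptions about Yapper's internals, not its documented behavior:

```python
def pick_compute():
    """Prefer CUDA with float16; fall back to CPU with int8 quantization."""
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda", "float16"
    except ImportError:
        pass  # no PyTorch at all -> CPU path
    return "cpu", "int8"

device, compute_type = pick_compute()
# These values would then feed model loading, e.g. with faster-whisper:
# WhisperModel("base", device=device, compute_type=compute_type)
```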