🗣️ Yapper — Voice to Clipboard

Self-contained speech-to-text tool. Press a hotkey to record, press again to stop. Audio is transcribed locally using Whisper and copied to your clipboard. No external server, no cloud, one process. 🔒

🚀 Two Ways to Run

Yapper has two entry points — same engine, different interfaces:

  • yapper — Raw/headless mode. Minimal terminal output, no dependencies beyond the core. Good for scripting, running in the background, or if you just want it to work without any fuss.

  • yapper-tui — Rich terminal dashboard. Live status panel, real-time waveform visualization, activity log, transcription display, and audio feedback sounds (beeps in your headphones on start/stop). Good for when you want to see what the app is doing. 🎧

Both accept the same CLI options. The TUI is a separate layer that wraps the core — it never touches the engine logic.

⚑ How It Works

  1. On launch, loads a Whisper model onto the GPU (or CPU) and keeps it in memory
  2. Press Shift+Space to start recording from your microphone 🎀
  3. Press Shift+Space again to stop
  4. Audio is transcribed locally with VAD filtering (strips silence)
  5. Result is copied to the clipboard and a notification fires 📋 (see the sketch after this list)
  6. Press Ctrl+Shift+; to toggle language (EN ↔ HR)
  7. Ready for the next recording ♻️
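
Under the hood this is a thin loop over a few libraries. A minimal sketch of the same pipeline in plain Python (using sounddevice, faster-whisper, and pyperclip directly, with a fixed recording length instead of Yapper's hotkey toggle; this is illustrative, not Yapper's actual core.py):

# Minimal sketch: record a few seconds, transcribe locally, copy to clipboard.
# Library calls are real; the fixed 5-second recording is purely illustrative.
import numpy as np
import sounddevice as sd
import pyperclip
from faster_whisper import WhisperModel

SAMPLE_RATE = 16000  # Whisper expects 16 kHz mono audio

# Load the model once; it stays in memory between recordings.
model = WhisperModel("base", device="auto", compute_type="auto")

def record(seconds: float) -> np.ndarray:
    """Capture mono float32 audio from the default microphone."""
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()
    return audio.flatten()

audio = record(5.0)
# vad_filter strips silence before transcription.
segments, _info = model.transcribe(audio, language="en", vad_filter=True)
text = " ".join(seg.text.strip() for seg in segments)
pyperclip.copy(text)
print(f"Copied to clipboard: {text!r}")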

📦 Installation

From GitHub Releases (recommended)

Download the latest release from the Releases page.

From source

🐧 Linux (Ubuntu/Debian)

# System dependencies
sudo apt install libportaudio2 xclip

# (Optional) For Wayland clipboard support instead of xclip:
# sudo apt install wl-clipboard

# Clone and install
git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
source .venv/bin/activate

# Install PyTorch with CUDA (for GPU acceleration)
uv pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install yapper
uv pip install -e .

🍎 macOS

brew install portaudio

git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
source .venv/bin/activate
uv pip install -e .

Notes:

  • No CUDA on macOS — uses CPU with int8 quantization. Apple Silicon (M-series) runs the base model in under a second for short clips.
  • macOS will prompt for microphone access and Accessibility permissions (for pynput) on first run.

🪟 Windows

git clone https://github.com/dlozina/yapper.git
cd yapper
uv venv
.venv\Scripts\activate

# Install PyTorch with CUDA
uv pip install torch --index-url https://download.pytorch.org/whl/cu124

# Install yapper
uv pip install -e .

Note: Install Visual C++ Redistributable if not already present.

🎯 Usage

TUI mode (recommended for interactive use)

# Terminal dashboard with waveform, status, and sound feedback
yapper-tui

# With options
yapper-tui --model large-v3-turbo --language en
yapper-tui --compute cpu

Raw mode (headless / minimal)

# Plain terminal output, no rich UI
yapper

# Same options
yapper --model base --compute cpu

Common options

# Set language explicitly (skips detection, faster)
yapper-tui --language en

# Croatian
yapper-tui --language hr

# Custom hotkey
yapper-tui --hotkey "<ctrl>+<space>"

# Save recordings to ~/Recordings/yapper/
yapper-tui --save

# List available microphones
yapper-tui --list-devices

# Pick a specific microphone by index
yapper-tui --device 3

# Force CPU even if CUDA is available
yapper-tui --compute cpu

⚙️ CLI Options

Option           Default            Description
--model          base               Whisper model: tiny, base, small, medium, large-v3, large-v3-turbo
--language       en                 Language code (en, hr, etc.)
--hotkey         <shift>+<space>    Global hotkey in pynput format
--lang-hotkey    <ctrl>+<shift>+;   Hotkey to toggle language
--device         auto               Audio input device index or auto
--compute        auto               Force cuda or cpu (auto detects CUDA)
--save           off                Save audio + transcript to ~/Recordings/yapper/
--list-devices   —                  List audio input devices and exit
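
The hotkey strings are parsed by pynput. A rough sketch of how strings like the defaults above map to global handlers (the callbacks here are placeholders, not Yapper's functions):

# Placeholder callbacks; Yapper wires these to its engine instead.
from pynput import keyboard

def toggle_recording():
    print("toggle recording")

def toggle_language():
    print("toggle language")

# GlobalHotKeys accepts the same "<shift>+<space>" style strings
# that --hotkey and --lang-hotkey take on the command line.
with keyboard.GlobalHotKeys({
    "<shift>+<space>": toggle_recording,
    "<ctrl>+<shift>+;": toggle_language,
}) as listener:
    listener.join()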

🧠 Models

Model            Size     Speed (GPU)  Speed (CPU)  Quality
tiny             ~75MB    instant      fast         basic
base             ~150MB   instant      fast         good ✅
small            ~500MB   instant      moderate     better
medium           ~1.5GB   fast         slow         great
large-v3-turbo   ~1.5GB   fast         slow         best 🏆

Models download automatically on first run and are cached in ~/.cache/huggingface/.
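
Roughly how model choice interacts with hardware, as a sketch rather than Yapper's actual compute.py (pick_compute is a made-up helper name):

import torch
from faster_whisper import WhisperModel

def pick_compute() -> tuple[str, str]:
    """Return (device, compute_type): float16 on CUDA, int8 on CPU."""
    if torch.cuda.is_available():
        return "cuda", "float16"
    return "cpu", "int8"

device, compute_type = pick_compute()
# First run downloads the weights and caches them under ~/.cache/huggingface/;
# later runs load straight from the cache.
model = WhisperModel("base", device=device, compute_type=compute_type)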

🏗️ Architecture

yapper/                  Core package
  cli.py                 CLI entry point (parse_args, main)
  core.py                Engine — recording, transcription, clipboard
  devices.py             Device detection, mic check, paired output
  clipboard.py           Cross-platform clipboard with fallback chain
  compute.py             GPU/CPU detection
  notifications.py       Desktop notifications
  constants.py           Shared constants

yapper_tui.py            TUI entry point — wires core + display + sounds
tui/
  display.py             Rich Live dashboard (status, waveform, log)
  sounds.py              Audio feedback (beeps via dedicated OutputStream)

The core never imports rich or anything from tui/. The TUI subscribes to events via callbacks and adds its own rendering and sound layer (sketched after the list below). This separation means:

  • 🧪 The core can be tested and used without any UI
  • 🔌 Alternative frontends (GUI, web, system tray) can wrap the same engine
  • 🎨 The TUI can evolve independently without risking the transcription logic
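
A rough illustration of that callback boundary (class and method names here are illustrative, not Yapper's actual API):

from dataclasses import dataclass
from typing import Callable

@dataclass
class EngineCallbacks:
    # Subscribers fill in only the events they care about.
    on_record_start: Callable[[], None] = lambda: None
    on_record_stop: Callable[[], None] = lambda: None
    on_transcription: Callable[[str], None] = lambda text: None

class Engine:
    """Core engine: knows nothing about rich, sounds, or the TUI."""
    def __init__(self, callbacks: EngineCallbacks | None = None):
        self.callbacks = callbacks or EngineCallbacks()

    def finish(self, text: str) -> None:
        # The core only fires the event; the subscriber decides whether
        # to render it, log it, or play a chirp.
        self.callbacks.on_transcription(text)

# The TUI (or any other frontend) wires its own handlers into the same engine:
engine = Engine(EngineCallbacks(on_transcription=lambda t: print(f"[TUI] {t}")))
engine.finish("hello world")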

🖥️ TUI Features

  • Status panel — shows idle/recording/transcribing with color-coded borders
  • Live waveform — real-time audio level visualization using Unicode block characters
  • Recording timer — elapsed time shown during recording
  • Activity log — timestamped log of all events
  • Last transcription — always visible for reference
  • Sound feedback — ascending beep on record start, descending on stop, chirp when clipboard is ready 🔔 (see the sketch after this list)
  • Smart audio routing — sounds play through headphones when using a headset mic (e.g. PlayStation Link, Elgato Wave) 🎧
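
The beeps are just short sine tones. A rough standalone equivalent using sounddevice's blocking playback (Yapper's sounds.py uses a dedicated OutputStream and picks the paired headset output; that part is simplified away here):

import numpy as np
import sounddevice as sd

SAMPLE_RATE = 44100

def beep(freqs, duration=0.08, device=None):
    """Play a short sequence of sine tones; pass device= to route output."""
    t = np.linspace(0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    tone = np.concatenate([0.3 * np.sin(2 * np.pi * f * t) for f in freqs])
    sd.play(tone.astype(np.float32), samplerate=SAMPLE_RATE, device=device)
    sd.wait()

beep([660, 880])   # ascending: recording started
beep([880, 660])   # descending: recording stopped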

📝 Platform Notes

🐧 Linux — Wayland

pynput requires X11 or XWayland for global hotkey capture. On a pure Wayland session, the options are:

  • Run under XWayland (most desktop environments still support this)
  • Switch to X11 session
  • Future: evdev-based input backend (planned)

🐧 Linux — PipeWire/PulseAudio

sounddevice works with both PipeWire and PulseAudio. Use --list-devices to see what's available.
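
The same list can be inspected directly from Python with plain sounddevice, which is roughly what --list-devices reports:

import sounddevice as sd

print(sd.query_devices())               # all input/output devices, with indices
print(sd.query_devices(kind="input"))   # the current default input device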

🍎 macOS — Permissions

First run will trigger two permission prompts:

  1. Microphone access — required for recording
  2. Accessibility — required for pynput global hotkey

Both are one-time grants in System Settings.

🪟 Windows — Notes

  • CUDA works if you have an NVIDIA GPU and install PyTorch with CUDA support
  • CPU mode uses int8 quantization — works fine for tiny and base models
  • Clipboard uses pyperclip (which uses win32clipboard under the hood); see the sketch after this list
  • Notifications use win10toast — install with uv pip install yapper[windows]
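
Clipboard handling is the piece that varies most across platforms. The fallback-chain idea behind clipboard.py can be sketched roughly like this (the exact backends and order here are assumptions, not the real implementation):

import shutil
import subprocess

def copy_to_clipboard(text: str) -> bool:
    """Try pyperclip first, then fall back to common CLI tools on Linux."""
    try:
        import pyperclip
        pyperclip.copy(text)
        return True
    except Exception:
        pass
    for cmd in (["wl-copy"], ["xclip", "-selection", "clipboard"]):
        if shutil.which(cmd[0]):
            subprocess.run(cmd, input=text.encode(), check=True)
            return True
    return False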

📄 License

MIT
