A text-to-speech utility for Windows that reads clipboard text aloud using high-quality neural voices. The inverse of Whisper Voice-to-Text.
v0.3.0+ is a C# (.NET 8) rewrite with local neural TTS via Kokoro. The Python version (v0.2.1) remains available.
Accessibility - Have articles, emails, and documents read aloud while you multitask or rest your eyes.
Productivity - Listen to written content while driving, exercising, or doing other tasks.
Learning - Improve comprehension by engaging both visual and auditory learning styles.
Hands-Free - Works globally across any Windows application with a simple hotkey.
| Feature | Herald | Windows TTS |
|---|---|---|
| Voice Quality | Kokoro neural (local) + Edge neural (online) | Basic SAPI voices |
| Speed Range | 100-600+ wpm | Limited range |
| Pause/Resume | ✅ Yes | ❌ No |
| Hotkey Control | ✅ Global hotkeys | ❌ Manual trigger |
| System Tray | ✅ Quick access | ❌ No controls |
| Offline Support | ✅ Kokoro (default) + SAPI fallback | N/A |
- Kokoro Neural TTS (default): Studio-quality local voices — 27 voices, runs offline, Apache 2.0 licensed
- Edge Neural Voices: Microsoft Edge neural voices (Aria, Jenny, Guy, Christopher) — requires internet
- SAPI Fallback: Windows SAPI voices (Zira, David) — always available offline
- OCR Support: Read text from screenshots and images (Win+Shift+S → Ctrl+Shift+S)
- Region Capture: Draw a box on screen to OCR and read (Ctrl+Shift+O) - great for PDFs
- Auto-Copy: Just select text and press Ctrl+Shift+S - no need to Ctrl+C first
- Global Hotkeys: Works in any application
- System Tray: Unobtrusive tray icon with menu controls
- Pause/Resume: Pause mid-speech and resume later
- Adjustable Speed: 100-600+ wpm depending on engine
- Settings Persistence: Remembers your voice and speed preferences
- Verbal Error Alerts: Speaks errors via SAPI fallback so you're never left with silence
- OS: Windows 10/11
- Internet: Not required (Kokoro runs locally). Edge TTS voices need internet.
Download the pre-built executable - no Python or .NET installation required:
- Go to Releases
- Download Herald.zip from the latest release
- Extract and run Herald.exe as Administrator
The executable is portable and self-contained. On first use, Kokoro will download its ONNX model (~320MB).
Requires .NET 8 SDK.
# Build
dotnet build cs/Herald.sln
# Run
dotnet run --project cs/Herald
# Publish self-contained executable
dotnet publish cs/Herald -c Release -r win-x64 --self-contained

The published exe is at cs/Herald/bin/Release/net8.0-windows/win-x64/publish/Herald.exe.
Requires Python 3.10, 3.11, or 3.12.
Download and install Python from python.org
Make sure to check "Add Python to PATH" during installation.
Double-click Launch_Herald.bat - it will:
- Request administrator privileges (required for global hotkeys)
- Create a virtual environment (first run only)
- Install dependencies (first run only)
- Launch the application
That's it! On subsequent runs, it starts immediately.
Read selected text:
- Select any text in any application
- Press Ctrl+Shift+S to hear it read aloud (auto-copies selection)
Read from screenshot:
- Take a screenshot with Win+Shift+S and select a region
- Press Ctrl+Shift+S to OCR and read the text
Read from screen region (great for PDFs):
- Press Ctrl+Shift+O to enter region capture mode
- Draw a box around the text you want to read
- Text is OCR'd and read aloud automatically
Persistent region for PDFs/videos:
- Press Ctrl+Shift+M to define a screen region (green border appears)
- Press Ctrl+Shift+S anytime to read from that region
- Enable "Auto-Read Monitor Region" in tray menu to read automatically when text changes
- Press Ctrl+Shift+M again to clear the region
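
The "read automatically when text changes" behavior can be pictured as a simple change detector over successive OCR results. The sketch below is illustrative only and is not taken from Herald's source: the `RegionWatcher` class, its `should_read` method, and the whitespace normalization are assumptions made for the example.

```python
import hashlib
from typing import Optional


def normalize(text: str) -> str:
    """Collapse runs of whitespace so spacing jitter in OCR output is ignored."""
    return " ".join(text.split())


class RegionWatcher:
    """Hypothetical helper: remembers the last OCR result for a region."""

    def __init__(self) -> None:
        self._last_digest: Optional[str] = None

    def should_read(self, ocr_text: str) -> bool:
        """Return True only when the region shows new, non-empty text."""
        cleaned = normalize(ocr_text)
        if not cleaned:
            return False  # nothing recognizable in the region
        digest = hashlib.sha256(cleaned.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            return False  # same text as last time, stay silent
        self._last_digest = digest
        return True
```

A polling loop would call `should_read` with each fresh OCR result and hand the text to the speech engine only when it returns `True`.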
Controls:
- Press Ctrl+Shift+P to pause/resume
- Press Escape to stop
- Right-click the tray icon to change voice, speed, or hotkeys
To launch automatically when you log in:
Double-click Autostart_Enable.bat (will prompt for admin rights).
To remove auto-start:
Double-click Autostart_Disable.bat (will prompt for admin rights).
All hotkeys are configurable via the system tray menu and config/settings.json.
| Hotkey | Action |
|---|---|
| Ctrl+Shift+S | Speak selection/clipboard (auto-copies, supports OCR for images) |
| Ctrl+Shift+O | OCR region capture (draw box on screen) |
| Ctrl+Shift+M | Toggle persistent OCR region (for PDFs/videos) |
| Ctrl+Shift+P | Pause/resume |
| Ctrl+Shift+N | Skip to next line |
| Ctrl+Shift+B | Go back to previous line |
| Ctrl+Shift+] | Speed up |
| Ctrl+Shift+[ | Slow down |
| Escape | Stop speaking |
| Ctrl+Shift+Q | Quit application |
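
Hotkey strings such as `ctrl+shift+s` follow a modifiers-plus-key pattern. As a rough sketch of how such strings could be parsed and checked for duplicates, the Python below uses hypothetical names (`parse_hotkey`, `find_conflicts`, `MODIFIERS`) that are not part of Herald; the set of accepted modifiers is also an assumption.

```python
# Assumed modifier names; Herald's actual accepted set may differ.
MODIFIERS = {"ctrl", "alt", "shift", "win"}


def parse_hotkey(spec: str) -> tuple[frozenset[str], str]:
    """Split 'ctrl+shift+s' into (frozenset({'ctrl', 'shift'}), 's')."""
    parts = [p.strip().lower() for p in spec.split("+")]
    if any(p == "" for p in parts):
        raise ValueError(f"malformed hotkey: {spec!r}")
    *mods, key = parts
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s) {unknown} in {spec!r}")
    if key in MODIFIERS:
        raise ValueError(f"hotkey {spec!r} has no non-modifier key")
    return frozenset(mods), key


def find_conflicts(bindings: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of setting names that are bound to the same key combination."""
    seen: dict[tuple[frozenset[str], str], str] = {}
    conflicts = []
    for name, spec in bindings.items():
        combo = parse_hotkey(spec)
        if combo in seen:
            conflicts.append((seen[combo], name))
        else:
            seen[combo] = name
    return conflicts
```

Because modifiers are stored as a set, `ctrl+shift+s` and `Shift+Ctrl+S` compare equal, which is what a conflict check needs.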
When you press Ctrl+Shift+S, text is split by newlines and read one line at a time. Use Ctrl+Shift+N to skip ahead or Ctrl+Shift+B to go back. This is useful for:
- Bouncing through code blocks
- Skipping sections you've already heard
- Replaying a line you missed
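
The line-by-line model described above amounts to a list of lines with a movable cursor. The following is a minimal sketch under stated assumptions, not Herald's implementation: the `LineReader` class is hypothetical, and skipping blank lines is an assumption made here for convenience.

```python
from typing import Optional


class LineReader:
    """Hypothetical cursor over text that has been split by newlines."""

    def __init__(self, text: str) -> None:
        # Assumption: blank lines are dropped so there is nothing empty to speak.
        self.lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        self.index = 0

    def current(self) -> Optional[str]:
        """Line under the cursor, or None if there is no text at all."""
        return self.lines[self.index] if self.lines else None

    def next(self) -> Optional[str]:
        """Advance one line (Ctrl+Shift+N); None once the end is reached."""
        if self.index + 1 < len(self.lines):
            self.index += 1
            return self.lines[self.index]
        return None

    def previous(self) -> Optional[str]:
        """Step back one line (Ctrl+Shift+B); stays on the first line."""
        if self.index > 0:
            self.index -= 1
        return self.current()
```

In this sketch, replaying a missed line is just `previous()` followed by speaking the returned text again.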
Right-click the tray icon to access:
- Voice (Online): Aria, Jenny, Guy, Christopher (neural voices, requires internet)
- Voice (Offline): Zira, David (Windows SAPI voices, no internet needed)
- Speed: Preset speeds (150-900 wpm online, up to 1500 wpm offline)
- Read Mode: Line by Line (default) or Continuous (reads all text as one block)
- Line Delay: Add a pause between lines (0-2000ms, only applies in Line by Line mode)
- Pause/Resume: Toggle when speaking
- Grab & Speak Selection: Auto-copy and speak when you select text
- Copy OCR to Clipboard: Save OCR'd text for pasting
- Auto-Read Monitor Region: Continuously read when text changes in persistent region
- Hotkeys: Configure all hotkeys (organized by Reading, Navigation, OCR, App categories)
- Console: Show or hide the console window
- Show Text Preview: Toggle whether text content appears in console/logs (privacy option)
- Quit: Exit the application
Settings are saved to config/settings.json:
{
"engine": "edge",
"voice": "aria",
"rate": 900,
"hotkey_speak": "ctrl+shift+s",
"hotkey_pause": "ctrl+shift+p",
"hotkey_stop": "escape",
"hotkey_speed_up": "ctrl+shift+]",
"hotkey_speed_down": "ctrl+shift+[",
"hotkey_next": "ctrl+shift+n",
"hotkey_prev": "ctrl+shift+b",
"hotkey_ocr": "ctrl+shift+o",
"hotkey_monitor": "ctrl+shift+m",
"hotkey_quit": "ctrl+shift+q",
"line_delay": 0,
"read_mode": "lines",
"log_preview": true
}

| Setting | Options | Description |
|---|---|---|
| engine | kokoro, edge, pyttsx3 | TTS engine (kokoro is default in v0.3.0+) |
| voice | heart, bella, michael, emma, aria, jenny, guy, christopher, zira, david | Voice name (varies by engine) |
| rate | 100-1500 | Words per minute (varies by engine) |
| hotkey_speak | ctrl+shift+s, alt+s, f9, alt+r | Speak hotkey |
| hotkey_pause | ctrl+shift+p, alt+p, f10 | Pause hotkey |
| hotkey_stop | escape, f12 | Stop hotkey |
| hotkey_speed_up | ctrl+shift+], alt+] | Speed up hotkey |
| hotkey_speed_down | ctrl+shift+[, alt+[ | Slow down hotkey |
| hotkey_next | ctrl+shift+n, alt+n, f7 | Next line hotkey |
| hotkey_prev | ctrl+shift+b, alt+b, f6 | Previous line hotkey |
| hotkey_ocr | ctrl+shift+o, ctrl+alt+shift+o, alt+o, f8 | OCR capture hotkey |
| hotkey_monitor | ctrl+shift+m, ctrl+alt+shift+m, alt+m, f11 | Monitor region hotkey |
| hotkey_quit | ctrl+shift+q, alt+q | Quit hotkey |
| line_delay | 0-2000 | Milliseconds to pause between lines |
| read_mode | lines, continuous | Line by line or read all text at once |
| log_preview | true, false | Show text content in console/logs |
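
For readers who edit config/settings.json by hand, the sketch below shows one way a loader could merge the file with defaults and clamp numeric values to the ranges in the table. It is an illustration, not Herald's code: `load_settings`, `validate_settings`, and `DEFAULTS` are hypothetical names, and the default `rate` of 200 is an assumption (the README does not state one).

```python
import json
from pathlib import Path

ENGINES = {"kokoro", "edge", "pyttsx3"}
READ_MODES = {"lines", "continuous"}

# Defaults taken from the README where stated; "rate" is an assumed value.
DEFAULTS = {
    "engine": "kokoro",
    "voice": "heart",
    "rate": 200,
    "line_delay": 0,
    "read_mode": "lines",
    "log_preview": True,
}


def _clamp_int(value, low: int, high: int, default: int) -> int:
    """Coerce to int and clamp to [low, high]; fall back to default if not numeric."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return max(low, min(high, number))


def validate_settings(loaded: dict) -> dict:
    """Overlay loaded values on the defaults and repair out-of-range entries."""
    settings = {**DEFAULTS, **loaded}
    if settings["engine"] not in ENGINES:
        settings["engine"] = DEFAULTS["engine"]
    if settings["read_mode"] not in READ_MODES:
        settings["read_mode"] = DEFAULTS["read_mode"]
    settings["rate"] = _clamp_int(settings["rate"], 100, 1500, DEFAULTS["rate"])
    settings["line_delay"] = _clamp_int(settings["line_delay"], 0, 2000, DEFAULTS["line_delay"])
    return settings


def load_settings(path: str = "config/settings.json") -> dict:
    """Read the JSON file if present; a missing or invalid file yields defaults."""
    file = Path(path)
    loaded = {}
    if file.exists():
        try:
            parsed = json.loads(file.read_text(encoding="utf-8"))
            if isinstance(parsed, dict):
                loaded = parsed
        except json.JSONDecodeError:
            pass  # treat a corrupt file like an empty one
    return validate_settings(loaded)
```

Hotkey entries pass through untouched here; only the engine, read mode, and numeric ranges are checked.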
| Engine | Type | Internet | Voices | Speed Range | Notes |
|---|---|---|---|---|---|
| Kokoro (default) | Local neural | No | 27 | 100-600 wpm | Best quality up to ~260 wpm (1.3x). Quality degrades above that. All values above 600 wpm produce identical output (capped at 3.0x). |
| Edge TTS | Cloud neural | Yes | 4 | 150-900 wpm | Microsoft Edge voices. Not for commercial use. |
| SAPI | Windows built-in | No | 2 | 150-1500 wpm | Basic offline fallback. |
Note: Edge TTS uses Microsoft Edge's neural voices for personal and educational use only. For commercial use, switch to Kokoro (default) or SAPI.
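
The Kokoro figures in the table (about 260 wpm at 1.3x, 600 wpm at 3.0x) are consistent with a linear mapping on a 200 wpm baseline. The sketch below works through that arithmetic; the baseline is inferred from those two data points rather than documented, the behavior below 1.0x is assumed to stay linear, and the function names are hypothetical.

```python
# Speed ranges per engine, in words per minute, as listed in the table above.
ENGINE_WPM_RANGES = {
    "kokoro": (100, 600),
    "edge": (150, 900),
    "sapi": (150, 1500),
}

# Inferred, not documented: 260 / 1.3 and 600 / 3.0 both give 200.
KOKORO_BASE_WPM = 200
KOKORO_MAX_MULTIPLIER = 3.0


def clamp_wpm(engine: str, wpm: float) -> float:
    """Limit a requested rate to the range the given engine supports."""
    low, high = ENGINE_WPM_RANGES[engine]
    return max(low, min(high, wpm))


def kokoro_multiplier(wpm: float) -> float:
    """Convert words per minute to a Kokoro speed multiplier, capped at 3.0x."""
    return min(clamp_wpm("kokoro", wpm) / KOKORO_BASE_WPM, KOKORO_MAX_MULTIPLIER)
```

Under these assumptions, any request above 600 wpm maps to the same 3.0x multiplier, which matches the note that such values produce identical output.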
| Voice | Gender | Accent | Grade |
|---|---|---|---|
| heart (default) | Female | American | A |
| bella | Female | American | A- |
| nicole | Female | American | B- |
| sarah | Female | American | C+ |
| nova, sky, alloy, jessica, kore, aoede, river | Female | American | — |
| michael | Male | American | C+ |
| fenrir, puck | Male | American | C+ |
| adam, echo, eric, liam, onyx | Male | American | — |
| emma | Female | British | B- |
| alice, isabella, lily | Female | British | — |
| daniel, george, lewis, fable | Male | British | — |
| Voice | Description |
|---|---|
| aria | Female, conversational |
| jenny | Female, news anchor |
| guy | Male, friendly |
| christopher | Male, professional |
| Voice | Description |
|---|---|
| zira | Female, Windows default |
| david | Male, Windows default |
- Make sure you've copied text to the clipboard (Ctrl+C)
- Some applications use special clipboard formats; try copying from Notepad
- Check your internet connection
- The app will fall back to offline voices if edge-tts fails
- Ensure the application is running as administrator
- Check for conflicting hotkeys in other applications
- Pause only works with neural voices (edge-tts)
- Offline voices (pyttsx3) don't support true pause
- Kokoro: Best quality up to ~260 wpm (1.3x). Usable up to 600 wpm (3.0x). Values above 600 wpm are capped.
- Edge TTS: Effective range 150-900 wpm
- SAPI: Full 150-1500 wpm range
herald/
├── cs/ # C# rewrite (v0.3.0+)
│ ├── Herald/ # WinForms app (hotkeys, OCR, tray, UI)
│ ├── Herald.Tts/ # Shared TTS library (Kokoro, Edge, SAPI)
│ ├── Herald.Tests/ # xUnit tests
│ └── Herald.sln # Solution file
├── src/ # Python version (v0.2.1)
│ ├── main.py # Application entry point
│ ├── tts_engine.py # TTS engine abstraction
│ ├── tray_app.py # System tray icon
│ ├── text_grab.py # Clipboard handling + OCR
│ ├── region_capture.py # Screen region selection
│ └── config.py # Settings management
├── config/
│ └── settings.json # Your settings (shared format, auto-created)
├── Launch_Herald.bat # Python launcher
└── requirements.txt # Python dependencies
- Whisper Voice-to-Text - The inverse: speech-to-text
Get Herald Pro on the Microsoft Store -- Free, 41 AI neural voices in 6 languages, fully offline. Includes batch audio export, sentence highlighting, voice preview, and EPUB/PDF/DOCX import.
MIT License - feel free to use and modify.