Herald - Text-to-Speech

A text-to-speech utility for Windows that reads clipboard text aloud using high-quality neural voices. The inverse of Whisper Voice-to-Text.

v0.3.0+ is a C# (.NET 8) rewrite with local neural TTS via Kokoro. The Python version (v0.2.1) remains available.

Why Herald?

Accessibility - Have articles, emails, and documents read aloud while you multitask or rest your eyes.

Productivity - Listen to written content while driving, exercising, or doing other tasks.

Learning - Improve comprehension by engaging both visual and auditory learning styles.

Hands-Free - Works globally across any Windows application with a simple hotkey.

vs. Windows Built-in TTS

Feature	Herald	Windows TTS
Voice Quality	Kokoro neural (local) + Edge neural (online)	Basic SAPI voices
Speed Range	100-1500 wpm, depending on engine	Limited range
Pause/Resume	✅ Yes	❌ No
Hotkey Control	✅ Global hotkeys	❌ Manual trigger
System Tray	✅ Quick access	❌ No controls
Offline Support	✅ Kokoro (default) + SAPI fallback	N/A

Features

Kokoro Neural TTS (default): Studio-quality local voices — 27 voices, runs offline, Apache 2.0 licensed
Edge Neural Voices: Microsoft Edge neural voices (Aria, Jenny, Guy, Christopher) — requires internet
SAPI Fallback: Windows SAPI voices (Zira, David) — always available offline
OCR Support: Read text from screenshots and images (Win+Shift+S → Ctrl+Shift+S)
Region Capture: Draw a box on screen to OCR and read (Ctrl+Shift+O) - great for PDFs
Auto-Copy: Just select text and press Ctrl+Shift+S - no need to Ctrl+C first
Global Hotkeys: Works in any application
System Tray: Unobtrusive tray icon with menu controls
Pause/Resume: Pause mid-speech and resume later
Adjustable Speed: 100-1500 wpm depending on engine (Kokoro 100-600, Edge TTS 150-900, SAPI 150-1500)
Settings Persistence: Remembers your voice and speed preferences
Verbal Error Alerts: Speaks errors via SAPI fallback so you're never left with silence

Requirements

OS: Windows 10/11
Internet: Not required for normal use (Kokoro runs locally once its model has been downloaded on first use). Edge TTS voices always need internet.

Installation

Option 1: Standalone Executable (Easiest)

Download the pre-built executable - no Python or .NET installation required:

Go to Releases
Download Herald.zip from the latest release
Extract and run Herald.exe as Administrator

The executable is portable and self-contained. On first use, Kokoro downloads its ONNX model (~320 MB), so that first run needs an internet connection.

Option 2: Build from Source (C# — v0.3.0+)

Requires .NET 8 SDK.

# Build
dotnet build cs/Herald.sln

# Run
dotnet run --project cs/Herald

# Publish self-contained executable
dotnet publish cs/Herald -c Release -r win-x64 --self-contained

The published exe is at cs/Herald/bin/Release/net8.0-windows/win-x64/publish/Herald.exe.

Option 3: Run from Source (Python — v0.2.1)

Requires Python 3.10, 3.11, or 3.12.

1. Install Python

Download and install Python from python.org

Make sure to check "Add Python to PATH" during installation.

2. Run the Application

Double-click Launch_Herald.bat - it will:

Request administrator privileges (required for global hotkeys)
Create a virtual environment (first run only)
Install dependencies (first run only)
Launch the application

That's it! On subsequent runs, it starts immediately.

3. Usage

Read selected text:

Select any text in any application
Press Ctrl+Shift+S to hear it read aloud (auto-copies selection)

Read from screenshot:

Take a screenshot with Win+Shift+S and select a region
Press Ctrl+Shift+S to OCR and read the text

Read from screen region (great for PDFs):

Press Ctrl+Shift+O to enter region capture mode
Draw a box around the text you want to read
Text is OCR'd and read aloud automatically

Persistent region for PDFs/videos:

Press Ctrl+Shift+M to define a screen region (green border appears)
Press Ctrl+Shift+S anytime to read from that region
Enable "Auto-Read Monitor Region" in tray menu to read automatically when text changes
Press Ctrl+Shift+M again to clear the region
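
The "read automatically when text changes" behaviour can be pictured as a small change detector sitting between the OCR step and the speech step. The sketch below is an illustration only, not Herald's actual implementation: the `ChangeDetector` class and its whitespace normalization are assumptions, chosen so that minor OCR spacing jitter does not trigger a re-read.

```python
from typing import Optional


class ChangeDetector:
    """Hypothetical helper: decides whether freshly OCR'd text should be spoken."""

    def __init__(self) -> None:
        self._last: Optional[str] = None  # last text that was read aloud

    def should_read(self, ocr_text: str) -> bool:
        # Collapse all whitespace runs so layout jitter does not count as a change.
        normalized = " ".join(ocr_text.split())
        if not normalized or normalized == self._last:
            return False
        self._last = normalized
        return True
```

A polling loop would OCR the persistent region at some interval and only pass the text to the TTS engine when `should_read` returns `True`.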

Controls:

Press Ctrl+Shift+P to pause/resume (the default; Alt+P is available as an alternative binding)
Press Escape to stop
Right-click the tray icon to change voice, speed, or hotkeys

4. Auto-Start with Windows (Optional)

To launch automatically when you log in:

Double-click Autostart_Enable.bat (will prompt for admin rights).

To remove auto-start:

Double-click Autostart_Disable.bat (will prompt for admin rights).

Hotkeys

All hotkeys are configurable via the system tray menu and config/settings.json.

Hotkey	Action
Ctrl+Shift+S	Speak selection/clipboard (auto-copies, supports OCR for images)
Ctrl+Shift+O	OCR region capture (draw box on screen)
Ctrl+Shift+M	Toggle persistent OCR region (for PDFs/videos)
Ctrl+Shift+P	Pause/resume
Ctrl+Shift+N	Skip to next line
Ctrl+Shift+B	Go back to previous line
Ctrl+Shift+]	Speed up
Ctrl+Shift+[	Slow down
Escape	Stop speaking
Ctrl+Shift+Q	Quit application
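
Hotkeys are stored as strings such as `ctrl+shift+s`. As a rough sketch of how such strings could be parsed and checked for duplicates, the following uses two hypothetical helpers, `parse_hotkey` and `find_conflicts`. Neither is taken from Herald's code, and the modifier set is limited to the modifiers that appear in this README.

```python
from collections import defaultdict

# Modifiers that appear in this README's hotkey strings (assumed to be the full set).
MODIFIERS = {"ctrl", "alt", "shift"}


def parse_hotkey(spec: str):
    """Split 'ctrl+shift+s' into (frozenset of modifiers, key), case-insensitively."""
    parts = [part.strip().lower() for part in spec.split("+")]
    *mods, key = parts  # the last element is the key, everything before it a modifier
    if not key or any(m not in MODIFIERS for m in mods):
        raise ValueError(f"unrecognized hotkey: {spec!r}")
    return frozenset(mods), key


def find_conflicts(settings: dict):
    """Return groups of hotkey_* setting names bound to the same combination."""
    by_combo = defaultdict(list)
    for name, spec in settings.items():
        if name.startswith("hotkey_"):
            by_combo[parse_hotkey(spec)].append(name)
    return [sorted(names) for names in by_combo.values() if len(names) > 1]
```

Because modifiers are held in a set, `shift+ctrl+s` and `ctrl+shift+s` are treated as the same combination.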

Line Navigation

When you press Ctrl+Shift+S, text is split by newlines and read one line at a time. Use Ctrl+Shift+N to skip ahead or Ctrl+Shift+B to go back. This is useful for:

Bouncing through code blocks
Skipping sections you've already heard
Replaying a line you missed
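
A minimal sketch of the line-by-line model described above: text is split on newlines and a cursor moves forward or back. The `LineReader` class is hypothetical, and two details are assumptions rather than documented behaviour: blank lines are dropped, and the cursor stops at the first and last line instead of wrapping.

```python
from typing import Optional


class LineReader:
    """Hypothetical cursor over the non-empty lines of a text block."""

    def __init__(self, text: str) -> None:
        # Split on newlines; skip blank lines so navigation never lands on silence.
        self.lines = [line.strip() for line in text.splitlines() if line.strip()]
        self.index = 0

    def current(self) -> Optional[str]:
        return self.lines[self.index] if self.lines else None

    def next(self) -> Optional[str]:
        # Stay on the last line rather than wrapping around.
        if self.index < len(self.lines) - 1:
            self.index += 1
        return self.current()

    def prev(self) -> Optional[str]:
        # Stay on the first line rather than wrapping around.
        if self.index > 0:
            self.index -= 1
        return self.current()
```

In this picture, Ctrl+Shift+N would call `next()` and Ctrl+Shift+B would call `prev()`, each followed by speaking the returned line.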

System Tray Menu

Right-click the tray icon to access:

Voice (Online): Aria, Jenny, Guy, Christopher (neural voices, requires internet)
Voice (Offline): Zira, David (Windows SAPI voices, no internet needed)
Speed: Preset speeds (150-900 wpm online, up to 1500 wpm offline)
Read Mode: Line by Line (default) or Continuous (reads all text as one block)
Line Delay: Add a pause between lines (0-2000ms, only applies in Line by Line mode)
Pause/Resume: Toggle when speaking
Grab & Speak Selection: Auto-copy and speak when you select text
Copy OCR to Clipboard: Save OCR'd text for pasting
Auto-Read Monitor Region: Continuously read when text changes in persistent region
Hotkeys: Configure all hotkeys (organized by Reading, Navigation, OCR, App categories)
Console: Show or hide the console window
Show Text Preview: Toggle whether text content appears in console/logs (privacy option)
Quit: Exit the application

Configuration

Settings are saved to config/settings.json:

{
  "engine": "edge",
  "voice": "aria",
  "rate": 900,
  "hotkey_speak": "ctrl+shift+s",
  "hotkey_pause": "ctrl+shift+p",
  "hotkey_stop": "escape",
  "hotkey_speed_up": "ctrl+shift+]",
  "hotkey_speed_down": "ctrl+shift+[",
  "hotkey_next": "ctrl+shift+n",
  "hotkey_prev": "ctrl+shift+b",
  "hotkey_ocr": "ctrl+shift+o",
  "hotkey_monitor": "ctrl+shift+m",
  "hotkey_quit": "ctrl+shift+q",
  "line_delay": 0,
  "read_mode": "lines",
  "log_preview": true
}

Setting	Options	Description
engine	kokoro, edge, pyttsx3	TTS engine (kokoro is default in v0.3.0+)
voice	heart, bella, michael, emma, aria, jenny, guy, christopher, zira, david	Voice name (varies by engine)
rate	100-1500	Words per minute (varies by engine)
hotkey_speak	ctrl+shift+s, alt+s, f9, alt+r	Speak hotkey
hotkey_pause	ctrl+shift+p, alt+p, f10	Pause hotkey
hotkey_stop	escape, f12	Stop hotkey
hotkey_speed_up	ctrl+shift+], alt+]	Speed up hotkey
hotkey_speed_down	ctrl+shift+[, alt+[	Slow down hotkey
hotkey_next	ctrl+shift+n, alt+n, f7	Next line hotkey
hotkey_prev	ctrl+shift+b, alt+b, f6	Previous line hotkey
hotkey_ocr	ctrl+shift+o, ctrl+alt+shift+o, alt+o, f8	OCR capture hotkey
hotkey_monitor	ctrl+shift+m, ctrl+alt+shift+m, alt+m, f11	Monitor region hotkey
hotkey_quit	ctrl+shift+q, alt+q	Quit hotkey
line_delay	0-2000	Milliseconds to pause between lines
read_mode	lines, continuous	Line by line or read all text at once
log_preview	true, false	Show text content in console/logs
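
If you edit config/settings.json by hand, a quick sanity check against the ranges in the table above can catch typos before launch. The sketch below is not part of Herald; `validate_settings` and `load_settings` are hypothetical helpers, and only the non-hotkey settings are checked. Keys missing from the file are ignored.

```python
import json
from pathlib import Path

ENGINES = {"kokoro", "edge", "pyttsx3"}
READ_MODES = {"lines", "continuous"}


def _int_in_range(value, low, high) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and low <= value <= high


def validate_settings(settings: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if "engine" in settings and settings["engine"] not in ENGINES:
        problems.append(f"engine must be one of {sorted(ENGINES)}")
    if "rate" in settings and not _int_in_range(settings["rate"], 100, 1500):
        problems.append("rate must be an integer from 100 to 1500")
    if "line_delay" in settings and not _int_in_range(settings["line_delay"], 0, 2000):
        problems.append("line_delay must be an integer from 0 to 2000")
    if "read_mode" in settings and settings["read_mode"] not in READ_MODES:
        problems.append(f"read_mode must be one of {sorted(READ_MODES)}")
    if "log_preview" in settings and not isinstance(settings["log_preview"], bool):
        problems.append("log_preview must be true or false")
    return problems


def load_settings(path: str = "config/settings.json") -> dict:
    """Read the settings file and raise ValueError if any checked value is out of range."""
    settings = json.loads(Path(path).read_text(encoding="utf-8"))
    problems = validate_settings(settings)
    if problems:
        raise ValueError("; ".join(problems))
    return settings
```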

TTS Engines

Engine	Type	Internet	Voices	Speed Range	Notes
Kokoro (default)	Local neural	No	27	100-600 wpm	Best quality up to ~260 wpm (1.3x). Quality degrades above that. All values above 600 wpm produce identical output (capped at 3.0x).
Edge TTS	Cloud neural	Yes	4	150-900 wpm	Microsoft Edge voices. Not for commercial use.
SAPI	Windows built-in	No	2	150-1500 wpm	Basic offline fallback.

Note: Edge TTS uses Microsoft Edge's neural voices for personal and educational use only. For commercial use, switch to Kokoro (default) or SAPI.
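
Since each engine accepts a different speed range, a single `rate` setting has to be interpreted per engine. One way to picture this is a clamp against the ranges in the table above. This is a sketch, not Herald's actual logic; `clamp_rate` is a hypothetical helper, and the engine keys follow the `engine` setting names, with `pyttsx3` standing in for SAPI.

```python
# Speed ranges in words per minute, taken from the TTS Engines table.
ENGINE_RATE_RANGES = {
    "kokoro": (100, 600),
    "edge": (150, 900),
    "pyttsx3": (150, 1500),  # SAPI voices
}


def clamp_rate(engine: str, wpm: int) -> int:
    """Pull a requested rate into the range the given engine supports."""
    low, high = ENGINE_RATE_RANGES[engine]
    return max(low, min(high, wpm))
```

For example, a saved rate of 900 wpm would be usable as-is with Edge TTS but would come out as 600 wpm if the engine were switched to Kokoro.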

Available Voices

Kokoro (local neural — default)

Voice	Gender	Accent	Grade
heart (default)	Female	American	A
bella	Female	American	A-
nicole	Female	American	B-
sarah	Female	American	C+
nova, sky, alloy, jessica, kore, aoede, river	Female	American	—
michael	Male	American	C+
fenrir, puck	Male	American	C+
adam, echo, eric, liam, onyx	Male	American	—
emma	Female	British	B-
alice, isabella, lily	Female	British	—
daniel, george, lewis, fable	Male	British	—

Edge TTS (online)

Voice	Description
aria	Female, conversational
jenny	Female, news anchor
guy	Male, friendly
christopher	Male, professional

SAPI (offline)

Voice	Description
zira	Female, Windows default
david	Male, Windows default

Troubleshooting

"No text to speak"

Make sure you've copied text to the clipboard (Ctrl+C)
Some applications use special clipboard formats; try copying from Notepad

Neural voices not working

Edge TTS voices need internet, so check your connection; Kokoro and SAPI voices run locally and are not affected
The app falls back to offline voices if edge-tts fails

Hotkeys not working

Ensure the application is running as administrator
Check for conflicting hotkeys in other applications

Pause doesn't work

Pause only works with neural voices (edge-tts)
Offline voices (pyttsx3) don't support true pause

Speed limits

Kokoro: Best quality up to ~260 wpm (1.3x). Usable up to 600 wpm (3.0x). Values above 600 wpm are capped.
Edge TTS: Effective range 150-900 wpm
SAPI: Full 150-1500 wpm range
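
The Kokoro figures above imply a simple relationship between words per minute and the speed multiplier: 260 wpm at 1.3x and 600 wpm at 3.0x both correspond to a baseline of 200 wpm at 1.0x. That baseline is inferred from these two numbers rather than stated anywhere in this README, so treat the sketch below as an assumption.

```python
BASELINE_WPM = 200    # inferred: 260 / 1.3 and 600 / 3.0 both give 200
MAX_MULTIPLIER = 3.0  # values above 600 wpm are capped at 3.0x


def kokoro_speed(wpm: float) -> float:
    """Convert a words-per-minute setting to a Kokoro speed multiplier."""
    return min(wpm / BASELINE_WPM, MAX_MULTIPLIER)
```

Under this assumption, any setting above 600 wpm maps to the same 3.0x multiplier, which matches the note that such values produce identical output.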

Project Structure

herald/
├── cs/                        # C# rewrite (v0.3.0+)
│   ├── Herald/                # WinForms app (hotkeys, OCR, tray, UI)
│   ├── Herald.Tts/            # Shared TTS library (Kokoro, Edge, SAPI)
│   ├── Herald.Tests/          # xUnit tests
│   └── Herald.sln             # Solution file
├── src/                       # Python version (v0.2.1)
│   ├── main.py                # Application entry point
│   ├── tts_engine.py          # TTS engine abstraction
│   ├── tray_app.py            # System tray icon
│   ├── text_grab.py           # Clipboard handling + OCR
│   ├── region_capture.py      # Screen region selection
│   └── config.py              # Settings management
├── config/
│   └── settings.json          # Your settings (shared format, auto-created)
├── Launch_Herald.bat          # Python launcher
└── requirements.txt           # Python dependencies

Related Projects

Whisper Voice-to-Text - The inverse: speech-to-text

Herald Pro

Get Herald Pro on the Microsoft Store -- Free, 41 AI neural voices in 6 languages, fully offline. Includes batch audio export, sentence highlighting, voice preview, and EPUB/PDF/DOCX import.

License

MIT License - feel free to use and modify.
