dictate

Privacy-first voice typing for macOS. Hold a hotkey, speak and the transcript is pasted into the focused app. Speech recognition runs on your Mac. Optional LLM cleanup runs locally through Ollama or remotely through OpenRouter and is off by default.

Docs site: https://lewiswigmore.github.io/macOS-dictate/

What you get

Local Whisper ASR on the Mac with VAD-driven endpointing.
Optional cleanup pipeline. Off by default. Toggle on in the WebUI when you want polishing.
Two cleanup backends, both opt-in. Ollama for fully local LLM cleanup. OpenRouter for cloud cleanup when you want a bigger model.
Hotkey state machine with hold, tap and double-tap actions.
Per-app vocab presets (code, work, personal, projects) chosen by the frontmost app.
Voice commands like new line, scratch that and paste raw.
Secret redaction before any cloud call. Patterns in config/redact.yaml.
Correction learning that captures your Cmd+Z edits as few-shot examples for future cleanup prompts.
Loopback-only WebUI at http://127.0.0.1:47843 with a dashboard, stats, history search and settings.
Synthetic Cmd+V insertion with a single Cmd+Z undo and full Unicode.

Install

git clone https://github.com/lewiswigmore/macOS-dictate.git ~/dictate
cd ~/dictate
./install.sh                  # python deps and CLI shim
./install.sh --with-ollama    # also installs Ollama and pulls a default model
./install.sh --autolaunch     # optional: start at login, restart on crash
./run.sh

No API keys are required. dictate is designed as a self-hosted tool: clone, install, run. There is no marketplace listing, signed .app, or Homebrew cask, by design.

First launch opens a wizard that walks through Accessibility, Microphone and Input Monitoring permissions, runs a mic test, checks backend reachability and seeds your vocab files.

Hotkeys

Action	Hotkey
Push to talk	Hold Cmd+H for more than 250 ms, release to insert
Toggle continuous dictation	Tap Cmd+H, tap again to stop
Cancel in-flight	Double-tap Cmd+H within 400 ms or press Esc
Pause Cmd+H override	Menu bar then Pause Cmd+H Override

Remap via config/settings.yaml under hotkey.{mods,key} or the menu bar's Set Hotkey... dialog.

WebUI

The menu-bar app starts the WebUI in the background. Open it from the menu bar or visit http://127.0.0.1:47843.

Page	What it shows
Dashboard	KPI cards, 14-day sparkline, recent dictations, system health, actionable suggestions
Stats	30-day usage chart, latency percentiles for ASR and cleanup, local-vs-cloud ratio, top apps and presets
History	Full transcript search, filters by app, preset, backend and date, bulk delete, export as JSONL or CSV or Markdown
Settings	Cleanup on/off, backend, model, privacy mode, redaction, hotkey, retention

The server binds to 127.0.0.1 only. Middleware rejects non-loopback clients. Mutating requests require a custom X-Dictate-WebUI header so third-party origins cannot forge requests across origins. CSP locks frame-ancestors to none.

Backends

Cleanup ships disabled. Enable it in Settings and pick a backend.

Backend	Where it runs	Setup
ollama	Local Mac	`brew install ollama && ollama pull qwen2.5:3b-instruct`
openrouter	OpenRouter cloud	`export OPENROUTER_API_KEY=...` then enable in Settings
raw	None, pass through ASR output	Always available, the default

If the active backend is unhealthy for over 60 seconds, dictate falls back through cleanup.fallback_chain. If every cleanup backend fails, raw Whisper output is pasted. The menu-bar Privacy Mode toggle forces cleanup.privacy_backend (default ollama), useful if you normally use OpenRouter but want a fully local pass on demand.

cleanup:
  enabled: false
  backend: ollama
  model: qwen2.5:3b-instruct
  fallback_chain: [ollama, raw]

Voice commands

Say any of these as the entire utterance:

new line or new paragraph or tab
scratch that or delete that or nevermind
stop or cancel
fix last or redo to re-clean the previous transcript
paste raw to bypass LLM cleanup for this utterance

Add more in config/commands.yaml.

Context presets

Driven by the frontmost app's bundle ID (config/app_map.yaml):

code preserves identifiers and literal symbols and skips sentence auto-cap
chat is casual with minimal cleanup
prose does full grammar, punctuation and paragraphs
default fallback

Selection as context

If text is selected when you press the hotkey, dictate treats the utterance as a rewrite instruction. Select a paragraph in Mail, hold the hotkey, say "make this more concise" and the selection is replaced. Requires Accessibility permission and an app that exposes kAXSelectedTextAttribute (most native and Chromium apps qualify, some Electron apps do not).

Custom vocabulary

One term per line in config/vocab/{code,work,personal}.txt. Per-project vocab in config/vocab/projects/<repo-name>.txt. Vocab is passed to Whisper as initial_prompt to bias recognition and to the cleanup LLM as a verbatim preservation list.

Automation

Wire dictate into Keyboard Maestro, Hammerspoon, Shortcuts.app and other tools via the dictate:// URL scheme:

open "dictate://record"
open "dictate://history"

The URL scheme requires a packaged .app bundle to register with macOS (tracked in issue #10). During development, invoke URLs directly:

python3 -m dictate "dictate://toggle"

AppleScript terminology lives at assets/dictate.sdef and activates when dictate ships as an .app. A Raycast extension lives in raycast/ with commands for toggle, start, stop, open history and open settings.

CLI

After install, a dictate shim is on your PATH:

dictate start      # launch in the background
dictate stop       # graceful shutdown, falls back to SIGKILL after 8 s
dictate restart    # stop then start
dictate status     # show pid and pidfile path
dictate doctor     # full diagnostics: permissions, audio, backends, models, conflicts
dictate --version  # version, Python and macOS info
dictate --dry-run  # validate config and imports without starting

The pidfile lives at ~/Library/Application Support/dictate/dictate.pid (override via DICTATE_STATE_DIR). Background logs go to ~/Library/Application Support/dictate/dictate.log.

Security

dictate ships with a hardened supply chain and is local-by-default:

All GitHub Actions pinned to commit SHAs, with a hardened-runner egress audit step.
AI code review on every PR via sebastionAI.
OSSF Scorecard published on every push.
Bandit and pip-audit gate pull requests.
Secret scanning and push protection enabled at the repo level.
Branch protection requires code-owner review, dismisses stale reviews, requires last-push approval, requires linear history and applies to admins too.
Symlink-aware file writes refuse to follow links and chmod with follow_symlinks=False.
Per-request DICTATION prompt-injection fence wraps every cleanup call so a transcript cannot impersonate system instructions.

Report security issues through GitHub Security Advisories rather than public issues. See SECURITY.md and THREAT_MODEL.md.

Troubleshooting

Hotkey not detected. Check Accessibility and Input Monitoring permissions, then restart the app.
Event tap stopped silently. Handled automatically (auto-reenable on kCGEventTapDisabledByTimeout). Check logs if it persists.
Whisper hallucinates on silence. VAD should prevent it. Raise vad.threshold if it still occurs.
Selection not replaced. The focused app does not expose AX selection. dictate falls back to appending.
Cmd+V does not paste. Likely a secure-input field. dictate falls back to per-character typing.
AirPods or mic switch breaks recording. Handled via configuration change notification. Restart if it does not recover.
Ollama not reachable. brew services start ollama && ollama pull qwen2.5:3b-instruct.

Menu bar then Export Diagnostics produces a tar of recent logs, the last five transcripts (redacted), your config and system info. Logs rotate at ~/dictate/logs/dictate.log.

Uninstall

launchctl unload ~/Library/LaunchAgents/com.dictate.app.plist 2>/dev/null || true
rm ~/Library/LaunchAgents/com.dictate.app.plist 2>/dev/null || true
rm -rf ~/dictate

Develop

source .venv/bin/activate
pytest               # unit and end-to-end tests
ruff check .

Benchmarks live in bench/ for ASR throughput and synthetic end-to-end latency. See AGENTS.md for project conventions and CONTRIBUTING.md for the PR workflow.

Documentation

Contributing

Pull requests are welcome. The roadmap tracker links every open issue with full acceptance criteria. Items tagged good first issue are friendly starting points. All PRs are reviewed and require green CI plus a code-owner approval before merge.

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.github		.github
assets		assets
bench		bench
config		config
dictate		dictate
docs		docs
logs		logs
raycast		raycast
scripts		scripts
tests		tests
.editorconfig		.editorconfig
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
FAQ.md		FAQ.md
GOVERNANCE.md		GOVERNANCE.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
THIRD_PARTY_NOTICES.md		THIRD_PARTY_NOTICES.md
THREAT_MODEL.md		THREAT_MODEL.md
com.dictate.app.plist.template		com.dictate.app.plist.template
entitlements.plist		entitlements.plist
history.jsonl.lock		history.jsonl.lock
install.sh		install.sh
justfile		justfile
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
requirements-build.in		requirements-build.in
requirements-build.txt		requirements-build.txt
requirements-dev.txt		requirements-dev.txt
requirements-docs.in		requirements-docs.in
requirements-docs.txt		requirements-docs.txt
requirements-tools.in		requirements-tools.in
requirements-tools.txt		requirements-tools.txt
requirements.txt		requirements.txt
run.sh		run.sh
setup_app.py		setup_app.py
uninstall.sh		uninstall.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

dictate

What you get

Install

Hotkeys

WebUI

Backends

Voice commands

Context presets

Selection as context

Custom vocabulary

Automation

CLI

Security

Troubleshooting

Uninstall

Develop

Documentation

Contributing

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

dictate

What you get

Install

Hotkeys

WebUI

Backends

Voice commands

Context presets

Selection as context

Custom vocabulary

Automation

CLI

Security

Troubleshooting

Uninstall

Develop

Documentation

Contributing

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages