🗣️ yap

A CLI for on-device speech transcription using Speech.framework on macOS 26.

Usage

USAGE: yap transcribe [--locale <locale>] [--censor] <input-file> [--txt] [--srt] [--vtt] [--json] [--output-file <output-file>] [--max-length <max-length>] [--word-timestamps]

ARGUMENTS:
  <input-file>            Path to an audio or video file to transcribe.

OPTIONS:
  -l, --locale <locale>   (default: current)
  --censor                Replaces certain words and phrases with a redacted form.
  --txt/--srt/--vtt/--json
                          Output format for the transcription. (default: --txt)
  -o, --output-file <output-file>
                          Path to save the transcription output. If not provided,
                          output will be printed to stdout.
  -m, --max-length <max-length>
                          Maximum sentence length in characters. (default: 40)
  --word-timestamps       Include word-level timestamps in JSON output.
  -h, --help              Show help information.

Installation

Homebrew

brew install yap

Mint

mint install finnvoor/yap

Examples

Transcribe a YouTube video using yap and yt-dlp

yt-dlp "https://www.youtube.com/watch?v=ydejkIvyrJA" -x --exec yap

Summarize a video using yap and llm

yap video.mp4 | uvx llm -m mlx-community/Llama-3.2-1B-Instruct-4bit 'Summarize this transcript:'

Create SRT captions for a video

yap video.mp4 --srt -o captions.srt

Generate WebVTT subtitles

yap video.mp4 --vtt -o subtitles.vtt

Export JSON with word-level timestamps

yap video.mp4 --json --word-timestamps -o transcript.json
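
To caption several files in one go, the SRT example above can be wrapped in a small loop. This is a minimal sketch, not part of yap: `batch_srt` is a hypothetical helper name, it assumes `yap` is on your PATH, and it assumes every input file name has an extension.

```shell
# Create an .srt file next to each input file.
# batch_srt is a hypothetical helper, not a yap command.
batch_srt() {
  for f in "$@"; do
    # Skip arguments that are not regular files.
    [ -f "$f" ] || { echo "skipping $f: not a file" >&2; continue; }
    # ${f%.*} strips the last extension: talk.mp4 -> talk
    yap "$f" --srt -o "${f%.*}.srt" || echo "failed: $f" >&2
  done
}
```

Calling `batch_srt *.mp4` would then write `name.srt` beside each `name.mp4`, reporting any file that fails on stderr and moving on to the next one.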

Live System Audio

yap listen transcribes system audio in real time, capturing anything playing on your computer.

USAGE: yap listen [--locale <locale>] [--censor] [--txt] [--srt] [--vtt] [--json] [--max-length <max-length>] [--word-timestamps]

OPTIONS:
  -l, --locale <locale>   (default: current)
  --censor                Replaces certain words and phrases with a redacted form.
  --txt/--srt/--vtt/--json
                          Output format for the transcription. (default: --txt)
  -m, --max-length <max-length>
                          Maximum sentence length in characters for timed output
                          formats. (default: 40)
  --word-timestamps       Include word-level timestamps in JSON output.
  -h, --help              Show help information.

Screen Recording permission is required. Grant it to your terminal app in System Settings > Privacy & Security > Screen Recording.

Examples

# Transcribe system audio live
yap listen

# Pipe live transcription to another tool
yap listen | uvx llm 'Translate this to French:'

# Save system audio as VTT subtitles
yap listen --vtt > captions.vtt
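
Because the transcription goes to stdout, it can also be post-processed by an ordinary shell filter. The sketch below prefixes each line with the wall-clock time it was read. `stamp_lines` is a hypothetical helper name, and the example assumes the default plain-text output arrives line by line, which may not hold if the output is buffered.

```shell
# Prefix every line read from stdin with the current wall-clock time.
# stamp_lines is a hypothetical helper, not part of yap.
stamp_lines() {
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '[%s] %s\n' "$(date +%H:%M:%S)" "$line"
  done
}
```

Usage would look like `yap listen | stamp_lines > log.txt`. The times reflect when the shell read each line, not the position in the audio; for media timestamps, the --srt, --vtt, or --json formats are the better fit.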

Listen and Dictate

yap listen-and-dictate transcribes system audio and microphone input simultaneously, which makes it well suited to meeting transcription.

USAGE: yap listen-and-dictate [--locale <locale>] [--censor] [--txt] [--srt] [--vtt] [--json] [--max-length <max-length>] [--mic-label <mic-label>] [--system-label <system-label>] [--word-timestamps]

OPTIONS:
  -l, --locale <locale>   (default: current)
  --censor                Replaces certain words and phrases with a redacted form.
  --txt/--srt/--vtt/--json
                          Output format for the transcription. (default: --txt)
  -m, --max-length <max-length>
                          Maximum sentence length in characters for timed output
                          formats. (default: 40)
  --mic-label <mic-label> Speaker label for microphone audio in timed output
                          formats. (default: Mic)
  --system-label <system-label>
                          Speaker label for system audio in timed output
                          formats. (default: System)
  --word-timestamps       Include word-level timestamps in JSON output.
  -h, --help              Show help information.

Both Screen Recording and Microphone permissions are required. Grant them to your terminal app in System Settings > Privacy & Security, under Screen Recording and Microphone.

Examples

# Transcribe a video call (both sides)
yap listen-and-dictate

# Save a meeting transcript
yap listen-and-dictate > meeting.txt

# Save a meeting transcript as VTT with speaker labels
yap listen-and-dictate --vtt > meeting.vtt

# Use custom speaker labels
yap listen-and-dictate --vtt --mic-label Alice --system-label Bob > meeting.vtt
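
For recurring meetings, redirecting to a fixed name such as meeting.vtt overwrites the previous transcript. One possible workaround is to build a dated file name first. `meeting_file` below is a hypothetical helper, not part of yap, and the naming scheme is only an illustration.

```shell
# Print a dated output name such as standup-2025-01-31.vtt.
# meeting_file is a hypothetical helper, not a yap command.
meeting_file() {
  # $1 is an optional prefix; it defaults to "meeting".
  printf '%s-%s.vtt\n' "${1:-meeting}" "$(date +%Y-%m-%d)"
}
```

It could be combined with the labels example as `yap listen-and-dictate --vtt --mic-label Alice --system-label Bob > "$(meeting_file standup)"`.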

Dictation

yap dictate transcribes microphone input in real time.

USAGE: yap dictate [--locale <locale>] [--censor] [--txt] [--srt] [--vtt] [--json] [--max-length <max-length>] [--word-timestamps]

OPTIONS:
  -l, --locale <locale>   (default: current)
  --censor                Replaces certain words and phrases with a redacted form.
  --txt/--srt/--vtt/--json
                          Output format for the transcription. (default: --txt)
  -m, --max-length <max-length>
                          Maximum sentence length in characters for timed output
                          formats. (default: 40)
  --word-timestamps       Include word-level timestamps in JSON output.
  -h, --help              Show help information.

Microphone permission is required. Grant it to your terminal app in System Settings > Privacy & Security > Microphone.

Examples

# Dictate from your microphone
yap dictate

# Dictate and save to a file
yap dictate > notes.txt

MCP Server

yap includes an MCP server that exposes a transcribe tool, allowing any MCP-compatible agent to transcribe audio and video files.

Claude Code

claude mcp add yap -- yap mcp

Codex

codex mcp add yap -- yap mcp
