A small Windows-friendly live caption page backed by xAI Grok Speech-to-Text.

    $env:XAI_API_KEY="xai-your-key"
    node .\server.mjs

Open http://localhost:8787.

- Use 屏幕/标签页音频 (Screen/Tab Audio) for video calls, streams, or browser audio. In the browser share picker, choose a tab, window, or screen and enable audio sharing when offered.
- Use 麦克风 (Microphone) for spoken input near the computer.
- This version sends short 16 kHz WAV segments to https://api.x.ai/v1/stt, then optionally translates them through https://api.x.ai/v1/chat/completions.
- The default segment length is 3 seconds; use 2 seconds for lower latency or 5-7 seconds for fewer requests.
- xAI's documented formatting languages do not currently list Chinese. Leave the language setting on 自动 (Auto) for unsupported languages.
- Translation defaults to Simplified Chinese. Set 翻译到 (Translate to) to 不翻译 (Do not translate) for raw transcription only.
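
To make the segment format and the latency trade-off concrete, here is a minimal sketch of how mono audio samples could be wrapped in a 16 kHz, 16-bit PCM WAV container, plus the arithmetic behind the segment-length choice. The helper names `encodeWav16` and `segmentStats` are hypothetical and are not taken from `server.mjs`. The 16-bit mono layout is also an assumption, since the notes above only state the 16 kHz sample rate.

```javascript
// Hypothetical helper (not from server.mjs): wrap mono Float32 samples
// in the range [-1, 1] in a 16-bit PCM WAV container.
function encodeWav16(samples, sampleRate = 16000) {
  const bytesPerSample = 2; // assumption: 16-bit mono PCM
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // Standard 44-byte RIFF/WAVE header; all numeric fields are little-endian.
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample to [-1, 1] and scale it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return new Uint8Array(buffer);
}

// Payload size and request rate for a given segment length in seconds.
function segmentStats(seconds, sampleRate = 16000) {
  return {
    wavBytes: 44 + seconds * sampleRate * 2,
    requestsPerMinute: 60 / seconds,
  };
}

console.log(segmentStats(2)); // { wavBytes: 64044, requestsPerMinute: 30 }
console.log(segmentStats(3)); // { wavBytes: 96044, requestsPerMinute: 20 }
console.log(segmentStats(5)); // { wavBytes: 160044, requestsPerMinute: 12 }
```

Under these assumptions, the default 3-second segment is roughly 94 KiB and produces 20 requests per minute. Dropping to 2 seconds raises that to 30 requests per minute, while 5 seconds lowers it to 12 at the cost of captions arriving later.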