A Discord voice bot that uses OpenAI's Realtime API for natural voice conversations and hands tool-heavy requests (email, calendar, web search, file operations) to Claude through OpenClaw.
- Voice-to-Voice Conversations: Direct voice interaction using OpenAI's Realtime API (~1s latency)
- Tool Integration: Complex requests automatically routed to OpenClaw/Claude for tool access
- Natural Audio Processing: Handles Discord's Opus audio format and converts to/from OpenAI's PCM format
- Smart Function Calling: Routes requests for email, calendar, web search, commands to Claude
- Conversation Management: Reset conversation history, join/leave voice channels
- Robust Error Handling: Automatic reconnection and graceful error recovery
Discord Voice ↔ Voice Handler ↔ OpenAI Realtime API ↔ OpenClaw Bridge ↔ Claude Tools
- OpenAI Realtime API: Handles direct voice-to-voice for casual conversation
- OpenClaw Integration: Routes complex requests requiring tools to Claude
- Audio Pipeline: Discord Opus ↔ PCM 16-bit 24kHz mono for OpenAI
- Node.js (v18 or higher)
- Discord Bot Token with voice permissions
- OpenAI API Key with Realtime API access
- OpenClaw Gateway running locally at http://localhost:18789
- FFmpeg (for audio processing), installed via:
  - Windows: winget install FFmpeg, or download from https://ffmpeg.org/
  - macOS: brew install ffmpeg
  - Linux: sudo apt install ffmpeg
git clone <this-repo>
cd discord-voice-claude
npm install
Copy .env.example to .env and fill in your credentials:
DISCORD_TOKEN=your_discord_bot_token
DISCORD_CLIENT_ID=your_discord_application_id
OPENAI_API_KEY=your_openai_api_key
OPENCLAW_GATEWAY_URL=http://localhost:18789
OPENCLAW_GATEWAY_TOKEN=your_openclaw_gateway_token
- Go to https://discord.com/developers/applications
- Create a new application
- Go to "Bot" section, create a bot and copy the token
- Enable these scopes and bot permissions:
  - Scopes: bot, applications.commands
  - Bot Permissions: Connect, Speak, Use Voice Activity
- Copy the Application ID for DISCORD_CLIENT_ID
Make sure OpenClaw gateway is running:
openclaw gateway start
Get your gateway token from OpenClaw configuration.
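Once all five variables are set, it can help to validate them at startup so that a missing credential fails fast instead of surfacing later as a WebSocket or gateway error. The following is a minimal sketch, not the project's actual startup code: `BotConfig` and `loadConfig` are hypothetical names, and falling back to http://localhost:18789 when `OPENCLAW_GATEWAY_URL` is unset is an assumption based on the default shown above.

```typescript
// Hypothetical config loader; the names here are illustrative only.
interface BotConfig {
  discordToken: string;
  discordClientId: string;
  openaiApiKey: string;
  gatewayUrl: string;
  gatewayToken: string;
}

function loadConfig(env: Record<string, string | undefined>): BotConfig {
  const missing: string[] = [];

  // Collect every missing variable so the error lists them all at once.
  const need = (name: string): string => {
    const value = env[name];
    if (!value) {
      missing.push(name);
      return "";
    }
    return value;
  };

  const config: BotConfig = {
    discordToken: need("DISCORD_TOKEN"),
    discordClientId: need("DISCORD_CLIENT_ID"),
    openaiApiKey: need("OPENAI_API_KEY"),
    // Assumption: use the documented local gateway address as a default.
    gatewayUrl: env["OPENCLAW_GATEWAY_URL"] || "http://localhost:18789",
    gatewayToken: need("OPENCLAW_GATEWAY_TOKEN"),
  };

  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // The URL constructor throws on a malformed gateway address.
  new URL(config.gatewayUrl);
  return config;
}
```

In a Node.js entry point this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.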
Register the slash commands, then build and start the bot:
npm run register
npm run build
npm start
Or for development:
npm run dev
- /join - Bot joins your voice channel and starts listening
- /leave - Bot leaves the voice channel
- /reset - Clears conversation history
- Use /join while in a voice channel
- Start speaking - the bot will respond with voice for simple questions
- For complex requests (email, calendar, web search, file operations), the bot automatically routes to Claude via OpenClaw
- Use /reset to clear conversation history if needed
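For reference, the three slash commands can be described as plain Discord application-command payloads. This is a sketch of what a registration script might send, not the contents of register-commands.ts: the `SlashCommand` type and `findInvalidCommands` helper are hypothetical, and the name pattern checked here is a deliberately strict subset of what Discord accepts.

```typescript
// Hypothetical payload type; type 1 is Discord's CHAT_INPUT (slash) command.
interface SlashCommand {
  name: string;
  description: string;
  type: 1;
}

const slashCommands: SlashCommand[] = [
  { name: "join", description: "Join your voice channel and start listening", type: 1 },
  { name: "leave", description: "Leave the voice channel", type: 1 },
  { name: "reset", description: "Clear conversation history", type: 1 },
];

// Returns the names of commands that would likely be rejected:
// names must be 1-32 characters and descriptions 1-100 characters.
function findInvalidCommands(commands: SlashCommand[]): string[] {
  const namePattern = /^[a-z0-9_-]{1,32}$/;
  return commands
    .filter(
      (c) =>
        !namePattern.test(c.name) ||
        c.description.length < 1 ||
        c.description.length > 100,
    )
    .map((c) => c.name);
}
```

Checking the payloads locally before calling the Discord API makes a rejected registration easier to diagnose.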
Direct Voice (OpenAI Realtime API):
- "What's the weather like?"
- "Tell me a joke"
- "How are you doing?"
Routed to Claude (via OpenClaw):
- "Check my email for anything urgent"
- "What's on my calendar tomorrow?"
- "Search the web for latest news about AI"
- "Run a command to check disk space"
- Discord Input: Opus 48kHz stereo
- OpenAI Realtime: PCM 16-bit 24kHz mono (base64)
- Conversion Pipeline: Opus → PCM → Resample → Base64
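The resample and Base64 steps of this pipeline can be illustrated in a few lines. The sketch below assumes the Opus packets have already been decoded to interleaved 16-bit stereo PCM at 48kHz; it averages each pair of stereo frames into one mono sample, which is a naive box filter rather than a proper low-pass resampler. The function names are hypothetical, and the project itself may rely on FFmpeg for this step instead.

```typescript
// Downmix interleaved 48kHz stereo PCM16 to 24kHz mono PCM16.
// Each output sample is the mean of two consecutive stereo frames (L, R, L, R).
function stereo48kToMono24k(pcm: Int16Array): Int16Array {
  const framesIn = Math.floor(pcm.length / 2); // one frame = left + right
  const framesOut = Math.floor(framesIn / 2); // 48kHz -> 24kHz halves the frame count
  const out = new Int16Array(framesOut);
  for (let i = 0; i < framesOut; i++) {
    const base = i * 4;
    const sum = pcm[base] + pcm[base + 1] + pcm[base + 2] + pcm[base + 3];
    out[i] = Math.round(sum / 4);
  }
  return out;
}

// Encode PCM16 samples as Base64 for a JSON audio event.
// Int16Array uses the platform byte order, which is little-endian
// on the common x86 and ARM targets.
function pcm16ToBase64(samples: Int16Array): string {
  return Buffer.from(samples.buffer, samples.byteOffset, samples.byteLength).toString("base64");
}
```

The reverse direction (24kHz mono from OpenAI back to 48kHz stereo for Discord) would duplicate or interpolate samples before Opus encoding.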
{
"modalities": ["text", "audio"],
"voice": "alloy",
"turn_detection": { "type": "server_vad" },
"input_audio_transcription": { "model": "whisper-1" }
}
The bot registers these functions with the OpenAI Realtime API:
- ask_claude(message) - General Claude requests
- check_email() - Check for urgent emails
- check_calendar() - Check upcoming events
- search_web(query) - Web search
- run_command(command) - Execute commands
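To show how the session settings and the function list fit together, here is a sketch of a `session.update` event carrying both. The `defineTool` helper, the parameter descriptions, and the choice to type every parameter as a string are assumptions made for illustration; the actual definitions in openclaw-bridge.ts may differ.

```typescript
// Shape of a function tool in a Realtime session (fields at the top level).
interface RealtimeTool {
  type: "function";
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Hypothetical helper: builds a tool whose parameters are all required strings.
function defineTool(
  name: string,
  description: string,
  params: Record<string, string> = {},
): RealtimeTool {
  const properties: Record<string, { type: string; description: string }> = {};
  for (const [key, text] of Object.entries(params)) {
    properties[key] = { type: "string", description: text };
  }
  return {
    type: "function",
    name,
    description,
    parameters: { type: "object", properties, required: Object.keys(params) },
  };
}

const realtimeTools: RealtimeTool[] = [
  defineTool("ask_claude", "General Claude requests", { message: "Request to forward" }),
  defineTool("check_email", "Check for urgent emails"),
  defineTool("check_calendar", "Check upcoming events"),
  defineTool("search_web", "Web search", { query: "Search terms" }),
  defineTool("run_command", "Execute commands", { command: "Command to run" }),
];

// Serialize the event that would be sent over the Realtime WebSocket.
function buildSessionUpdate(): string {
  return JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      voice: "alloy",
      turn_detection: { type: "server_vad" },
      input_audio_transcription: { model: "whisper-1" },
      tools: realtimeTools,
      tool_choice: "auto",
    },
  });
}
```

With `tool_choice` set to "auto", the model decides per utterance whether to answer directly or call one of the functions.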
"FFmpeg not found"
- Install FFmpeg and ensure it's in your PATH
"Cannot connect to OpenClaw gateway"
- Make sure OpenClaw gateway is running: openclaw gateway status
- Check the gateway URL and token in .env
"Bot not responding to voice"
- Check Discord permissions (Connect, Speak, Use Voice Activity)
- Verify OpenAI API key has Realtime API access
- Check console logs for WebSocket errors
"Audio quality issues"
- Ensure stable internet connection
- Check Discord voice channel region/server location
Enable detailed logging by adding to your .env:
DEBUG=discord-voice-claude:*
Check logs for:
- WebSocket connection status
- Audio processing pipeline
- Function call routing
- OpenClaw API responses
src/
├── index.ts # Main bot entry point
├── voice-handler.ts # Discord voice connection management
├── realtime-client.ts # OpenAI Realtime API WebSocket client
├── openclaw-bridge.ts # Bridge to OpenClaw gateway
└── register-commands.ts # Discord slash command registration
npm run build # Compile TypeScript
npm run dev # Build and run
To add a new function:
- Add function definition to openClawFunctions in openclaw-bridge.ts
- Add handler case to handleFunctionCall
- The OpenAI Realtime API will automatically call your function when appropriate
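The steps above might look roughly like the following. Only the names openClawFunctions and handleFunctionCall come from this README; their shapes, the `summarize_file` function, and the injected `askGateway` callback are all hypothetical, chosen so the sketch runs without a gateway.

```typescript
// Assumed shape of an entry in openClawFunctions.
interface FunctionDefinition {
  name: string;
  description: string;
  parameters: Record<string, string>; // parameter name -> description
}

// Hypothetical callback that forwards a prompt to the OpenClaw gateway.
type AskGateway = (prompt: string) => Promise<string>;

const openClawFunctions: FunctionDefinition[] = [
  { name: "search_web", description: "Web search", parameters: { query: "Search terms" } },
  // Step 1: add the new function definition (illustrative example).
  { name: "summarize_file", description: "Summarize a file", parameters: { path: "File path" } },
];

async function handleFunctionCall(
  name: string,
  args: Record<string, unknown>,
  askGateway: AskGateway,
): Promise<string> {
  switch (name) {
    case "search_web":
      return askGateway(`Search the web for: ${String(args["query"] ?? "")}`);
    // Step 2: add a handler case for the new function.
    case "summarize_file":
      return askGateway(`Summarize the file at: ${String(args["path"] ?? "")}`);
    default:
      // Returning a message keeps the voice session alive on unknown names.
      return `Unknown function: ${name}`;
  }
}
```

Passing the gateway call in as a parameter is a design choice for this sketch: it lets the handler be exercised with a stub in tests.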
MIT License - see LICENSE file for details
For issues with:
- OpenAI Realtime API: Check OpenAI documentation and API status
- Discord Integration: Verify bot permissions and Discord.js version
- OpenClaw Integration: Ensure gateway is running and accessible
- Audio Processing: Check FFmpeg installation and codec support