AI-powered meeting transcription project for MinutelyAI.
This repository includes two browser-based experiences:
- meeting.html: live, multi-user transcription with shared real-time output
- minutely.html: upload a recorded audio/video file and get diarized (speaker-labeled) transcription
- Deepgram WebSocket API for live speech-to-text
- Firebase Realtime Database for syncing live transcripts across participants
- Modal GPU endpoint for offline transcription
- Whisper medium + pyannote speaker diarization on the backend
- meeting.html: live meeting page
- minutely.html: recorded file transcription page
- modal_app.py: Modal backend (FastAPI + Whisper + pyannote)
- config.example.js: frontend config template
- requirements.txt: backend Python dependencies
You need:
- A Deepgram API key
- A Firebase project (Realtime Database enabled)
- A Hugging Face token with access to pyannote models
- A Modal account and CLI
- Python 3.11+ for local backend setup/deploy
Create config.js from the template:
cp config.example.js config.js
Then update:
- DEEPGRAM_KEY
- MODAL_URL (after deploying backend)
- FIREBASE object values
Important: config.js is gitignored and should never be committed.
Open meeting.html in Chrome.
Flow:
- Enter your name and create/join a meeting
- Share the invite link
- Each participant starts microphone capture
- Transcript updates in real time for everyone
Open minutely.html in a browser, upload one file, and click transcribe.
Supported formats include .mp3, .wav, .mp4, .m4a, .webm.
Install dependencies and deploy:
pip install -r requirements.txt
modal secret create minutely-secrets HF_TOKEN=your_hf_token
modal deploy modal_app.py
Take the generated endpoint and set it as MODAL_URL in config.js.
POST / expects JSON payload:
{
  "audio": "<base64-audio-bytes>",
  "language": "en",
  "min_speakers": 1,
  "max_speakers": 8
}
Response includes:
- full transcript text
- detected language
- speaker segments with timestamps
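
For testing the deployed endpoint outside the browser, the request above can be sketched in Python using only the standard library. This is a minimal sketch, not the project's own client: the helper names build_payload and transcribe are hypothetical, the 600-second timeout and the min/max speaker check are assumptions, and the response is returned as parsed JSON without assuming any field names.

```python
import base64
import json
import urllib.request


def build_payload(audio_bytes, language="en", min_speakers=1, max_speakers=8):
    """Build the JSON body: base64-encoded audio plus language and speaker bounds."""
    # Assumption: the backend expects min_speakers <= max_speakers.
    if min_speakers > max_speakers:
        raise ValueError("min_speakers must not exceed max_speakers")
    return {
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "language": language,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }


def transcribe(modal_url, audio_path, timeout=600, **options):
    """POST a recorded file to the endpoint and return the parsed JSON response."""
    with open(audio_path, "rb") as f:
        payload = build_payload(f.read(), **options)
    request = urllib.request.Request(
        modal_url,  # the same value set as MODAL_URL in config.js
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Assumption: a long timeout, since GPU transcription may take a while.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling transcribe with the deployed endpoint URL and a local file path, for example a .wav recording, returns the response as a Python dict that can be inspected for the transcript, language, and speaker segments.
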
- If mic capture fails, verify browser mic permissions and use HTTPS when required.
- If live transcript is empty, confirm DEEPGRAM_KEY and Firebase config are valid.
- If recorded transcription fails, verify MODAL_URL and that Modal app is deployed.
- If diarization fails, verify HF_TOKEN is set in minutely-secrets.
- Never commit config.js.
- Rotate any API key that was accidentally exposed.
- Restrict Firebase rules appropriately for production.