Foram Shah • Belle Hsieh • Soham Katdare • Colin Zhao
Document-to-video app with:
- Frontend: Next.js in frontend (default port 3000)
- Backend: FastAPI + generation pipeline in backend (default port 8000)
You upload a .txt, .md, .pdf, .ppt, or .pptx file from the frontend. The backend then:
- Parses the source material.
- Uses a selectable LLM provider (TerpAI, Gemini, or Claude) to plan segments and generate Manim scene code.
- Uses ElevenLabs to synthesize narration.
- Renders video segments with Manim.
- Merges segment audio/video and builds a final video.
- Returns the final video URL to the frontend.
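The order of the stages above can be sketched roughly as follows. This is an illustrative outline only, not the backend's actual code: `Segment`, `plan_segments`, and `run_pipeline_sketch` are hypothetical names, and the LLM, ElevenLabs, and Manim calls are replaced by placeholders that only record which stage would run.

```python
from dataclasses import dataclass
from pathlib import Path

# File types taken from the upload description above.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".ppt", ".pptx"}


@dataclass
class Segment:
    """Hypothetical container for one planned segment."""
    index: int
    narration: str
    scene_code: str


def plan_segments(text: str, max_segments: int) -> list[Segment]:
    # Placeholder for the LLM planning step: one segment per paragraph.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        Segment(index=i, narration=p, scene_code=f"# Manim scene for segment {i}")
        for i, p in enumerate(paragraphs[:max_segments])
    ]


def run_pipeline_sketch(source: Path, text: str, max_segments: int = 4) -> list[str]:
    """Return the ordered stage names a run would go through."""
    if source.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {source.suffix}")
    stages = ["parse"]                      # parse the source material
    segments = plan_segments(text, max_segments)
    stages.append("plan")                   # plan segments and scene code
    for seg in segments:
        # Per segment: narration, Manim render, then audio/video merge.
        stages += [f"tts:{seg.index}", f"render:{seg.index}", f"merge:{seg.index}"]
    stages.append("concat")                 # build the final video
    return stages
```

Calling `run_pipeline_sketch(Path("notes.md"), "intro\n\ndetails")` lists two segments' worth of stages between `parse`/`plan` and `concat`.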
- Python 3.10+
- Node.js 20+
- Yarn 1+
- API keys:
  - ELEVEN_LABS_API_KEY
  - For LLM_PROVIDER=terpai: TERPAI_BEARER_TOKEN, TERPAI_CONVERSATION_ID, TERPAI_PARENT_SEGMENT_ID
  - For LLM_PROVIDER=gemini: GOOGLE_API_KEY
  - For LLM_PROVIDER=claude: ANTHROPIC_API_KEY
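To check which keys a given provider needs before starting the backend, a small helper along these lines may be handy. It is a sketch, not part of the repository: the variable names come from the list above, while `missing_keys` and the two lookup tables are hypothetical.

```python
# Variable names are taken from the API key list above.
REQUIRED_BY_PROVIDER = {
    "terpai": [
        "TERPAI_BEARER_TOKEN",
        "TERPAI_CONVERSATION_ID",
        "TERPAI_PARENT_SEGMENT_ID",
    ],
    "gemini": ["GOOGLE_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY"],
}
ALWAYS_REQUIRED = ["ELEVEN_LABS_API_KEY"]


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required variables that are unset or empty in env."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in REQUIRED_BY_PROVIDER:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
    required = ALWAYS_REQUIRED + REQUIRED_BY_PROVIDER[provider]
    return [name for name in required if not env.get(name, "").strip()]
```

For example, `missing_keys(dict(os.environ))` (after `import os` and loading backend/.env into the environment) returns an empty list when everything the selected provider needs is set.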
System tools required by Manim and media steps:
- pkg-config
- cmake
- cairo
- pango
- ffmpeg (includes ffprobe)
On macOS:
brew install pkg-config cmake cairo pango ffmpeg
Note: uv is for backend Python dependencies. Frontend dependencies use Yarn.
cd backend
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
Create backend/.env:
ELEVEN_LABS_API_KEY=your_elevenlabs_key
ELEVEN_LABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
LLM_PROVIDER=terpai
# Gemini provider settings (required when LLM_PROVIDER=gemini)
GOOGLE_API_KEY=your_google_ai_studio_key
GOOGLE_GEMINI_MODEL=gemini-flash-latest
GEMINI_REQUEST_TIMEOUT_SECONDS=120
# Claude provider settings (required when LLM_PROVIDER=claude)
ANTHROPIC_API_KEY=your_anthropic_api_key
ANTHROPIC_MODEL=claude-sonnet-4-6
ANTHROPIC_REQUEST_TIMEOUT_SECONDS=120
ANTHROPIC_MAX_OUTPUT_TOKENS=4096
# TerpAI provider settings (required when LLM_PROVIDER=terpai)
TERPAI_BEARER_TOKEN=your_terpai_token
TERPAI_CONVERSATION_ID=your_conversation_id
TERPAI_PARENT_SEGMENT_ID=your_parent_segment_id
TERPAI_API_BASE_URL=https://terpai.umd.edu/api/internal
TERPAI_REQUEST_TIMEOUT_SECONDS=120
TERPAI_POLL_INTERVAL_SECONDS=2
TERPAI_RESPONSE_TIMEOUT_SECONDS=120
# Optional TLS controls for TerpAI requests
# TERPAI_CA_BUNDLE=/path/to/cacert.pem
# TERPAI_SSL_VERIFY=0
MANIM_QUALITY_FLAG=-ql
Optional backend vars:
FRONTEND_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
PIPELINE_OUTPUT_DIR=outputs
# Cost savers (see backend/README.md): merge plan+beats, one TTS per segment
# PIPELINE_MERGE_PLAN_BEATS=1
# PIPELINE_SINGLE_TTS_PER_SEGMENT=1
In a separate terminal:
cd frontend
yarn install
cp .env.local.example .env.local
Default frontend env value:
NEXT_PUBLIC_BACKEND_URL=http://localhost:8000
Start the backend:
cd backend
source .venv/bin/activate
uvicorn api:app --reload --host 127.0.0.1 --port 8000
Health check:
curl http://127.0.0.1:8000/api/health
Start the frontend:
cd frontend
yarn dev --hostname 127.0.0.1 --port 3000
Open http://127.0.0.1:3000 and go to /render.
The Run Pipeline CLI command is:
python run_pipeline.py --input path/to/source.pdf --output-dir outputs --max-segments 4 --llm-provider terpai
It runs the generation pipeline directly from the command line, without the frontend and without the API upload route.
Use it when you want to:
- debug or iterate on pipeline behavior quickly
- batch-run local files
- verify backend pipeline independently from UI/API
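For the batch-run case, one option is a small wrapper that builds the same command for every file in a folder. This is a sketch under assumptions: it uses only the flags shown in the command above, assumes the CLI accepts the same file types as the upload route, and assumes it is run from the directory containing run_pipeline.py. `build_command` and `batch_run` are hypothetical helpers.

```python
import subprocess
from pathlib import Path

# Assumption: the CLI accepts the same file types as the upload route.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".ppt", ".pptx"}


def build_command(source: Path, output_dir: str = "outputs",
                  max_segments: int = 4, llm_provider: str = "terpai") -> list[str]:
    """Build the run_pipeline.py invocation using the flags shown above."""
    return [
        "python", "run_pipeline.py",
        "--input", str(source),
        "--output-dir", output_dir,
        "--max-segments", str(max_segments),
        "--llm-provider", llm_provider,
    ]


def batch_run(folder: Path, dry_run: bool = True) -> list[list[str]]:
    """Collect one command per supported file; execute them unless dry_run."""
    commands = [
        build_command(path)
        for path in sorted(folder.iterdir())
        if path.suffix.lower() in SUPPORTED_EXTENSIONS
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing file.
            subprocess.run(cmd, check=True)
    return commands
```

With `dry_run=True` (the default) nothing is executed, so the returned commands can be inspected before committing to a long render.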
- GET /api/health
- POST /api/render (multipart: file, max_segments, optional language_code, optional llm_provider=terpai|gemini|claude, optional stream=1)
- GET /api/files/{path}
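A render request can also be sent from a script instead of the frontend. The sketch below uses the field names listed above and the third-party `requests` library. `build_render_fields` and `post_render` are hypothetical helpers, the 600-second timeout is an arbitrary choice, and the response is returned as raw text because its exact shape is not documented here.

```python
from pathlib import Path
from typing import Optional

BACKEND_URL = "http://127.0.0.1:8000"  # default backend address used in this README
PROVIDERS = {"terpai", "gemini", "claude"}


def build_render_fields(max_segments: int = 4,
                        language_code: Optional[str] = None,
                        llm_provider: Optional[str] = None,
                        stream: bool = False) -> dict[str, str]:
    """Build the non-file multipart fields for POST /api/render."""
    fields = {"max_segments": str(max_segments)}
    if language_code:
        fields["language_code"] = language_code
    if llm_provider:
        if llm_provider not in PROVIDERS:
            raise ValueError(f"Unknown llm_provider: {llm_provider!r}")
        fields["llm_provider"] = llm_provider
    if stream:
        fields["stream"] = "1"
    return fields


def post_render(source: Path, **options) -> str:
    """Upload source to /api/render and return the raw response body."""
    import requests  # third-party; imported lazily so the helper above needs no extras

    with source.open("rb") as fh:
        response = requests.post(
            f"{BACKEND_URL}/api/render",
            files={"file": (source.name, fh)},
            data=build_render_fields(**options),
            timeout=600,  # assumption: renders can take several minutes
        )
    response.raise_for_status()
    return response.text
```

For example, `post_render(Path("notes.pdf"), llm_provider="gemini")` uploads the file with the Gemini provider selected, assuming the backend is running locally.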
- uv pip install fails on pycairo/manim:
  - Install system deps with brew install pkg-config cmake cairo pango ffmpeg.
- Missing required LLM/API keys (for example TerpAI, Gemini, or Claude vars) or ELEVEN_LABS_API_KEY:
  - Ensure backend/.env exists and the required keys for your selected LLM_PROVIDER are not empty.
- TerpAI SSL certificate verification failed:
  - Preferred: set TERPAI_CA_BUNDLE to a trusted PEM CA bundle path.
  - Temporary workaround: set TERPAI_SSL_VERIFY=0 in backend/.env.
- Frontend cannot call backend:
  - Verify NEXT_PUBLIC_BACKEND_URL in frontend/.env.local and restart yarn dev.
- Running yarn from repo root fails with no package.json:
  - Run Yarn commands from frontend.
- zsh reports command not found: l when activating:
  - Use source .venv/bin/activate (do not prefix with l).
