# AdaptEd — Lectures that talk back

Demo reel ↗  ·  Devpost ↗  ·  Watch  ·  Three loops  ·  Architecture  ·  Run it

Google Gemini Challenge — 1st place · Fetch.ai Agentified Winner · License: MIT

Next.js 15 · React 19 · TypeScript · Tailwind CSS · Motion · Gemini 1.5 Pro · Fetch.ai uagents · LangChain · FastAPI · MongoDB · ElevenLabs · Retell AI


> [!NOTE]
> AdaptEd is a hackathon winner first, a portfolio piece second — and this repo is both. The production-style frontend you can `npm run dev` lives in `client/`. The original FastAPI / LangChain / Retell AI / Gemini stack that won the awards at LA Hacks 2024 is preserved verbatim in `server/` as a reference artifact. Nothing in `server/` is invoked at runtime — the showcase mocks every backend interaction in the browser.


## ▌ Watch it run

Watch the AdaptEd demo on YouTube

3 minutes · LA Hacks 2024 finals · UCLA Pauley Pavilion


> [!TIP]
> Prefer to drive it yourself? The frontend showcase ships with a fully scripted "Red-Black Trees" lecture (pre-rendered ElevenLabs narration, scripted prompts, animated research timeline) — see Run the showcase.


## ▌ Why it exists

> Of 16 million U.S. university students, half fall behind on static, one-sided lectures while fewer than three percent have access to quality tutoring programs. So we built a lecturer that talks back.
>
> — from the original Devpost writeup

The thesis was a single inversion: instead of students adapting to the system, the AI lecturer adapts to students. That meant three feedback loops running in parallel.

## ▌ The three loops

*The three loops behind AdaptEd: Voice, Content, Attention*

| Loop | What it does | What it unlocks |
| --- | --- | --- |
| 01 Voice | Retell AI captures speech with end-of-turn detection and streams to a LangChain GPT-3.5 agent that exposes three slide-control tools (sketched below). | The lecturer can be cut off, asked to repeat, or told to skip — and pick up gracefully like a human tutor. |
| 02 Content | Wikipedia · YouTube transcripts (Gemini multimodal) · Serper image search → Gemini 1.5 Pro slide aggregator with the `templates.py` taxonomy. | Slides regenerate live from verbal cues, not from a fixed deck. |
| 03 Attention | Webcam emotion side-channel (Hume on Intel Dev Cloud) signals confusion / disengagement back to the planner. | The pace bends to the student, not the other way around. |
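
A minimal sketch of the Loop 01 agent, matching the `server/llm.py` description (three slide-control tools on a `create_openai_tools_agent` over GPT-3.5). The tool bodies and prompt here are illustrative, not the original code; in the real build each tool call was relayed to the Next.js client over a websocket.

```python
# Illustrative sketch of the Loop 01 voice agent. Tool bodies and the
# prompt are hypothetical; only the tool names and agent constructor
# come from the repo's own description.
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def next_slide() -> str:
    """Advance the deck by one slide."""
    return "ok: advanced"


@tool
def prev_slide() -> str:
    """Go back one slide to repeat material."""
    return "ok: went back"


@tool
def goto_slide(index: int) -> str:
    """Jump directly to the slide at the given 0-based index."""
    return f"ok: now on slide {index}"


tools = [next_slide, prev_slide, goto_slide]
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a live lecturer. Use the slide tools when asked."),
    MessagesPlaceholder("chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

agent = create_openai_tools_agent(
    ChatOpenAI(model="gpt-3.5-turbo-0613"), tools, prompt
)
executor = AgentExecutor(agent=agent, tools=tools)
# executor.invoke({"input": "Skip ahead to the insertion-case slide."})
```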

## ▌ Architecture

### The original LA Hacks build · `server/` + `client/`
```mermaid
flowchart LR
  user([Student]) -- voice --> retell[Retell AI agent]
  retell -- LLM websocket --> voice["server/voice.py"]
  voice --> llm["server/llm.py<br/>LangChain · GPT-3.5"]
  llm -- tool calls --> client[Next.js client]

  user -- topic --> input["client /input"]
  input --> research["client /research"]
  research -- POST /generate-simple --> generate["server/generate_route.py"]
  generate --> aggregate["server/aggregate.py"]
  aggregate --> wiki[(Wikipedia)]
  aggregate --> youtube["YouTube + Gemini multimodal"]
  aggregate --> images["image_agent.py<br/>GPT-4-turbo + Serper"]
  aggregate --> gemini["Gemini 1.5 Pro<br/>slide aggregator"]
  gemini --> lectureView["client /lecture"]
  lectureView -. speaker_notes .-> retell

  classDef hot fill:#1a1a1d,stroke:#c700e7,color:#f5f4f0;
  classDef cold fill:#15151a,stroke:#252528,color:#d8d6cf;
  class retell,voice,llm,gemini hot;
  class input,research,generate,aggregate,wiki,youtube,images,client,lectureView cold;
```
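The dotted `speaker_notes` edge is the sidecar channel from `server/voice.py`: the client pushes the current slide's notes up, and slide tool calls come back down. A minimal sketch of that shape, assuming a FastAPI websocket endpoint; the route, message fields, and `update_agent_context` helper are hypothetical, not the original protocol.

```python
# Hypothetical shape of the sidecar "data" socket from server/voice.py.
# Route name, message fields, and the helper are illustrative only.
from fastapi import FastAPI, WebSocket

app = FastAPI()
data_sockets: dict[str, WebSocket] = {}  # one sidecar socket per active call


def update_agent_context(call_id: str, notes: str) -> None:
    """Hypothetical hook: hand the slide notes to the lecturer agent."""


@app.websocket("/data/{call_id}")
async def data_socket(ws: WebSocket, call_id: str) -> None:
    await ws.accept()
    data_sockets[call_id] = ws
    try:
        while True:
            # Client pushes the current slide's speaker notes upstream.
            msg = await ws.receive_json()
            if msg.get("type") == "speaker_notes":
                update_agent_context(call_id, msg["notes"])
    finally:
        data_sockets.pop(call_id, None)


async def push_tool_call(call_id: str, name: str, args: dict) -> None:
    """Relay an agent tool call (next_slide / prev_slide / goto_slide) down."""
    await data_sockets[call_id].send_json({"tool": name, "args": args})
```
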
### This frontend showcase · browser-only · zero backend
```mermaid
flowchart LR
  landing["/ landing"] --> input["/input topic"]
  input --> research["/research<br/>animated 4-step timeline"]
  research -- localStorage --> lecture["/lecture viewer"]
  lecture --> mockVoice["MockVoice.tsx<br/>ElevenLabs MP3 player"]
  lecture --> slideshow["Slideshow.tsx<br/>custom template renderer"]
  lecture --> sidebar["Sidebar.tsx<br/>live transcript"]
  mockVoice -. audio ends .-> autoAdvance{auto-advance}
  autoAdvance --> slideshow

  classDef hot fill:#1a1a1d,stroke:#c700e7,color:#f5f4f0;
  classDef cold fill:#15151a,stroke:#252528,color:#d8d6cf;
  class mockVoice,slideshow hot;
  class landing,input,research,lecture,sidebar,autoAdvance cold;
```
### What lives in `server/` · preserved, not invoked
| File | Role |
| --- | --- |
| `server/main.py` | FastAPI app, WebSocket manager, MongoDB connection. |
| `server/aggregate.py` | Multi-source slide pipeline driven by Gemini 1.5 Pro with Wikipedia, YouTube audio+frames, and image search. Cycled a list of `GEMINI_API_KEYS` to dodge free-tier quotas (see the sketch after this table). |
| `server/voice.py` | Two websockets per call: Retell's Custom LLM URL + a sidecar "data" socket the client used to push speaker notes and receive `next_slide` / `prev_slide` / `goto_slide` tool calls. |
| `server/llm.py` | LangChain `create_openai_tools_agent` over GPT-3.5-turbo-0613, three slide-control tools. |
| `server/google_agent.py` · `server/mermaid_agent.py` | Fetch.ai uagents experiments for the whiteboard. |
| `intel_dev_cloud/fine_tune.ipynb` | LoRA / PEFT fine-tuning attempt on Intel GPUs via `bigdl-llm[xpu]`. |
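
The quota-dodging trick in `aggregate.py` is worth a sketch: round-robin over a pool of keys, rotating whenever one hits its free-tier limit. This assumes the `google-generativeai` client; the helper names are illustrative, only the key-cycling idea comes from the repo.

```python
# Minimal sketch of the GEMINI_API_KEYS rotation described above.
# Helper names are hypothetical; the round-robin-on-quota idea is the point.
import itertools
import os

import google.generativeai as genai
from google.api_core.exceptions import ResourceExhausted

GEMINI_API_KEYS = os.environ["GEMINI_API_KEYS"].split(",")
_keys = itertools.cycle(GEMINI_API_KEYS)


def generate_with_rotation(prompt: str) -> str:
    """Try each key in turn until one succeeds or the whole pool is spent."""
    last_error: Exception | None = None
    for _ in range(len(GEMINI_API_KEYS)):
        genai.configure(api_key=next(_keys))  # swap to the next free-tier key
        model = genai.GenerativeModel("gemini-1.5-pro")
        try:
            return model.generate_content(prompt).text
        except ResourceExhausted as err:  # 429: this key's quota is spent
            last_error = err
    raise RuntimeError("all Gemini keys exhausted") from last_error
```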

## ▌ Run the showcase

```bash
git clone https://github.com/IdkwhatImD0ing/AdaptEd.git
cd AdaptEd/client
npm install
npm run dev
```

Then open http://localhost:3000. Four routes ship as static pages:

| Route | Purpose |
| --- | --- |
| `/` | Landing — hero, demo embed, feature triptych, origin spec sheet |
| `/input` | Topic entry, pre-filled with Red-Black Trees |
| `/research` | Mocked four-step research timeline (Wikipedia → YouTube → images → drafting) |
| `/lecture` | Slideshow + voice dock + live transcript |

> [!IMPORTANT]
> No `.env` is needed at runtime. Every voice clip is pre-rendered to MP3 and shipped in `client/public/audio/`. The lecturer auto-advances when each clip ends and offers scripted prompt chips (*Explain that again*, *Skip ahead*) to demo the conversational interrupt feature.

### Regenerate the narration · optional · requires an ElevenLabs API key

Slide narration and scripted prompt responses are produced once by `scripts/generate-audio.mjs` using the "Sarah" library voice on `eleven_multilingual_v2`. Re-run after editing slide copy:

```bash
cd client
ELEVENLABS_API_KEY=sk_… node scripts/generate-audio.mjs
# add --force to overwrite existing files
```

## ▌ Map of the repo

```
AdaptEd/
├─ client/                       # The frontend showcase ──────────────────
│  ├─ app/
│  │  ├─ page.tsx                Landing — hero, embed, features, origin
│  │  ├─ layout.tsx              Fonts (Fraunces · Instrument Sans · JetBrains Mono)
│  │  ├─ globals.css             Theme tokens, mesh & grain, utility classes
│  │  ├─ icon.tsx                Generated 64×64 brand favicon (Fraunces "Ed")
│  │  ├─ apple-icon.tsx          Generated 180×180 apple-touch-icon
│  │  ├─ opengraph-image.tsx     Generated 1200×630 OG card
│  │  ├─ input/page.tsx          Topic entry
│  │  ├─ research/page.tsx       Mocked 4-step research timeline
│  │  ├─ lecture/page.tsx        Stage + voice dock + transcript
│  │  ├─ components/
│  │  │  ├─ NavBar.tsx           Top navigation (landing & lecture variants)
│  │  │  ├─ Slideshow.tsx        Custom renderer for templates 0/1/2/4/8/9/10/11
│  │  │  ├─ Diagrams.tsx         Inline animated SVG diagrams (Red-Black Trees)
│  │  │  ├─ MockVoice.tsx        ElevenLabs MP3 player + scripted chips
│  │  │  ├─ Sidebar.tsx          Live transcript pane
│  │  │  ├─ YouTubeEmbed.tsx     Lazy, privacy-enhanced demo embed
│  │  │  └─ Typewriter.tsx       Tiny TS rewrite of react-type-animation
│  │  └─ lib/
│  │     ├─ types.ts             Lecture · Slide · TranscriptEntry types
│  │     ├─ mockLecture.ts       The canned Red-Black Trees lecture
│  │     └─ mockTranscript.ts    Scripted prompts for the conversational interrupt
│  ├─ public/audio/              Pre-rendered ElevenLabs narration MP3s
│  ├─ scripts/generate-audio.mjs Renders public/audio/*.mp3 from copy
│  ├─ tailwind.config.ts         Brand tokens, font variables, custom keyframes
│  └─ next.config.mjs            Allow remote images for slide assets
│
├─ server/                       # The original LA Hacks backend (preserved)
│  ├─ main.py                    FastAPI app + WebSocket manager
│  ├─ voice.py                   Retell AI Custom LLM URL + data sidecar
│  ├─ llm.py                     LangChain GPT-3.5 agent with slide tools
│  ├─ aggregate.py               Gemini 1.5 Pro multi-source slide pipeline
│  ├─ image_agent.py             GPT-4-turbo + Serper image picker
│  ├─ google_agent.py            Fetch.ai uagent — Gemini call
│  └─ mermaid_agent.py           Fetch.ai uagent — whiteboard
│
└─ intel_dev_cloud/
   └─ fine_tune.ipynb            LoRA/PEFT fine-tune on Intel GPUs
```

## ▌ What changed from the original

| | LA Hacks 2024 build | This showcase |
| --- | --- | --- |
| Frontend framework | Next.js 14.2 + React 18 | Next.js 15 + React 19 |
| Slideshow | Spectacle 10 + BroadcastChannel sync | Custom `<Slideshow />` component |
| Voice | Retell AI + GPT-3.5 + ngrok | Pre-rendered ElevenLabs MP3s |
| Animation | react-reveal (React 16 era) | motion (Framer Motion successor) |
| Diagrams | External Wikipedia images | Inline animated SVGs (`Diagrams.tsx`) |
| Auth | Auth0 (never wired) | Removed |
| Typography | Inter | Fraunces · Instrument Sans · JetBrains Mono |
| Styling | SCSS (global.sass) + Tailwind | Tailwind + CSS variables |
| Build deps | 18 prod / 9 dev | 6 prod / 9 dev |

## ▌ Spec sheet

| | |
| --- | --- |
| EVENT | LA Hacks 2024 |
| DATES | Apr 19–21, 2024 |
| VENUE | Pauley Pavilion · UCLA |
| HACKERS | 704 |
| TEAMS | 142 |
| AWARDS | Google · Fetch.ai |
| TEAM SIZE | 4 |
| BUILD TIME | 36 hrs |
| SLEEP | ≈ 4 hrs |
| WORKING NAME | "TeachMe" |

## ▌ Credits

- **Bill Zhang** · Voice loop · Frontend · @IdkwhatImD0ing
- **Spike O'Carroll** · Backend · Agents · @spikecodes · LinkedIn
- **Jay Wu** · Slide pipeline · Gemini · @jotalis · LinkedIn
- **Jasmine Wu** · Frontend · UX · @Jaslavie · LinkedIn

Built in 36 hours under the rafters of Pauley Pavilion · originally branded "TeachMe" during the build, renamed to AdaptEd for submission.


## ▌ License

MIT.

