🎬 Interactive Mythos Director — Multimodal Storytelling Agent

An AI creative director powered by Google Gemini that builds evolving story worlds with interleaved text, images, and audio. Built for the Gemini Live Agent Challenge.

🎯 What is Interactive Mythos Director?

Interactive Mythos Director is a web app that:

Takes a story prompt from the user (character, setting, tone)
Generates cinematic narrative scenes with Gemini
Creates visual assets from scene prompts (Imagen)
Produces voice narration from story scripts (TTS)
Returns a unified story package per turn (text + media URLs + choices)

The experience is designed like a digital storybook, not a plain chat.
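
The per-turn flow listed above can be sketched as a small orchestration function. This is only an illustration under stated assumptions: the stage functions are injected as callables (the names `gen_text`, `gen_image`, `gen_audio` and the `fake_*` stand-ins are hypothetical), the scene dictionary keys are invented for the example, and running image and audio generation concurrently is a design guess rather than something the project confirms.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stage signatures; each stage is injected so the sketch
# does not depend on any particular SDK.
TextFn = Callable[[str], Awaitable[dict]]
MediaFn = Callable[[str], Awaitable[str]]


async def run_turn(prompt: str, gen_text: TextFn,
                   gen_image: MediaFn, gen_audio: MediaFn) -> dict:
    """Build one story package: text first, then image and audio concurrently."""
    scene = await gen_text(prompt)
    # Image and audio only depend on the generated scene, so they can overlap.
    image_url, audio_url = await asyncio.gather(
        gen_image(scene["visual_description"]),
        gen_audio(scene["voiceover"]),
    )
    return {
        "narrative": scene["narrative"],
        "image_url": image_url,
        "audio_url": audio_url,
        "choices": scene["choices"],
    }


# Stand-in stages used only to exercise the flow (assumptions, not real calls).
async def fake_text(prompt: str) -> dict:
    return {
        "narrative": f"Scene for: {prompt}",
        "visual_description": "a neon temple in the rain",
        "voiceover": "Rain hissed against chrome armor.",
        "choices": ["Investigate the altar", "Scan for hidden exits"],
    }


async def fake_image(description: str) -> str:
    return "https://example.com/scene.png"


async def fake_audio(script: str) -> str:
    return "https://example.com/scene.mp3"
```

Calling `asyncio.run(run_turn("A samurai enters a temple.", fake_text, fake_image, fake_audio))` returns a dictionary with the same four fields as the API response described later in this README.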

🛠 Tech Stack

Layer	Technology
Frontend	React
Backend	Python (FastAPI)
AI SDK	Google GenAI SDK
Models	Gemini (logic), Imagen (images), optional Veo (cinematic clips)
Deployment	Google Cloud Run (containerized)
Storage	Google Cloud Storage (generated assets)
Audio	Google Cloud Text-to-Speech

📁 Project Structure

gen-ai/
├── README.md                    # This file
├── Dockerfile                   # Cloud Run deployment image
├── AGENTS.md                    # Local coding agent instructions
│
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI entrypoint, routes
│   │   ├── director.py          # System prompt + orchestration logic
│   │   ├── media.py             # Imagen/TTS integration + GCS upload
│   │   └── schemas.py           # Request/response models
│   ├── requirements.txt
│   ├── .env                     # API keys and config (never commit)
│   └── .env.example
│
└── frontend/
    ├── package.json
    ├── vite.config.js
    └── src/
        ├── main.jsx
        ├── App.jsx              # Main app flow
        └── components/
            ├── StoryInput.jsx       # Prompt + generation controls
            ├── SceneCard.jsx        # Narrative + image/audio render
            ├── ChoiceButtons.jsx    # Branching path controls
            └── TimelinePanel.jsx    # Session story history

🚀 Quick Start (Local Dev)

Prerequisites

Python 3.11+
Node.js 18+
Google Cloud project with the Vertex AI, Cloud Storage, and Cloud Text-to-Speech APIs enabled
Service account credentials, or local Application Default Credentials created with gcloud auth application-default login

Setup & Run

1. Backend setup:

cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your project/config values

2. Start backend (port 8000):

cd backend
source .venv/bin/activate
uvicorn app.main:app --reload --port 8000

3. Frontend setup (new terminal):

cd frontend
npm install
npm run dev

4. Open in browser:

http://localhost:5173

🔌 API Reference

POST `/api/generate-scene`

Send the user's prompt and the story history to generate the next scene package.

Request:

{
  "prompt": "A cyberpunk samurai enters a neon temple.",
  "history": [
    { "role": "user", "content": "Start a dark fantasy adventure." },
    { "role": "assistant", "content": "The moon cracked above the iron forest..." }
  ],
  "style": "cinematic"
}

Response:

{
  "narrative": "Rain hissed against chrome armor as Kael stepped into the temple gate...",
  "image_url": "https://storage.googleapis.com/BUCKET/scenes/scene-12.png",
  "audio_url": "https://storage.googleapis.com/BUCKET/audio/scene-12.mp3",
  "choices": [
    "Investigate the altar",
    "Challenge the masked guardian",
    "Scan for hidden exits"
  ]
}
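
The request and response shapes above could be modelled along these lines. This is a stdlib-only sketch, not the contents of schemas.py: the class names `SceneRequest` and `SceneResponse` are hypothetical, and the real backend may well use Pydantic models instead of dataclasses.

```python
import json
from dataclasses import dataclass


@dataclass
class SceneRequest:
    """Body sent to POST /api/generate-scene (field names from the example)."""
    prompt: str
    history: list[dict]
    style: str

    def to_json(self) -> str:
        return json.dumps(
            {"prompt": self.prompt, "history": self.history, "style": self.style}
        )


@dataclass
class SceneResponse:
    """Story package returned for one turn."""
    narrative: str
    image_url: str
    audio_url: str
    choices: list[str]

    @classmethod
    def from_json(cls, raw: str) -> "SceneResponse":
        data = json.loads(raw)
        required = {"narrative", "image_url", "audio_url", "choices"}
        missing = required - set(data)
        if missing:
            raise ValueError(f"response is missing fields: {sorted(missing)}")
        return cls(
            narrative=data["narrative"],
            image_url=data["image_url"],
            audio_url=data["audio_url"],
            choices=list(data["choices"]),
        )
```

Rejecting a response with missing fields early makes an empty image or audio URL show up as an explicit error rather than a broken scene card.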

🎨 Features

Story Loop State Machine

IDLE -> GENERATING_TEXT -> GENERATING_MEDIA -> RENDERING_SCENE -> WAITING_FOR_CHOICE -> IDLE
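
The loop above is a simple cycle, which can be written as a transition table. The sketch below uses Python for consistency with the backend; in the actual app this logic would more likely live in the React frontend, and the names `StoryState`, `NEXT_STATE`, and `transition` are hypothetical.

```python
from enum import Enum, auto


class StoryState(Enum):
    IDLE = auto()
    GENERATING_TEXT = auto()
    GENERATING_MEDIA = auto()
    RENDERING_SCENE = auto()
    WAITING_FOR_CHOICE = auto()


# Each state has exactly one successor, matching the arrows in the README.
NEXT_STATE = {
    StoryState.IDLE: StoryState.GENERATING_TEXT,
    StoryState.GENERATING_TEXT: StoryState.GENERATING_MEDIA,
    StoryState.GENERATING_MEDIA: StoryState.RENDERING_SCENE,
    StoryState.RENDERING_SCENE: StoryState.WAITING_FOR_CHOICE,
    StoryState.WAITING_FOR_CHOICE: StoryState.IDLE,
}


def transition(current: StoryState, target: StoryState) -> StoryState:
    """Move to `target` only if it is the allowed successor of `current`."""
    if NEXT_STATE[current] is not target:
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```

A guard like this keeps the UI from, for example, rendering a scene before its media has been generated.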

Frontend Components

Component	Purpose
StoryInput.jsx	Prompt entry, tone selector, generate action
SceneCard.jsx	Displays scene text, artwork, and narration player
ChoiceButtons.jsx	Branching actions for next turn
TimelinePanel.jsx	Scrollable history of generated scenes

Content Package Per Turn

Narrative Segment: 100-250 words of scene text
Illustration: Generated image from scene description
Voiceover: Narration audio for accessibility/immersion
Choices: 2-4 branching options to continue the story
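
The numeric limits above lend themselves to a quick check before a package is returned. The function below is a hypothetical helper, not part of the project; it counts words by splitting on whitespace, which is an assumption about how "words" are measured.

```python
def validate_package(narrative: str, choices: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the package is in range."""
    problems = []

    # Whitespace-separated tokens are treated as words (an assumption).
    word_count = len(narrative.split())
    if not 100 <= word_count <= 250:
        problems.append(f"narrative has {word_count} words, expected 100-250")

    if not 2 <= len(choices) <= 4:
        problems.append(f"got {len(choices)} choices, expected 2-4")

    return problems
```

Returning the problems instead of raising lets the caller decide whether to retry generation or pass the scene through anyway.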

🧠 Director Persona (System Prompt)

You are the Interactive Mythos Director.
Create immersive, coherent story scenes based on user input.
For each turn, produce:
1) A vivid narrative segment
2) A precise visual description for image generation
3) A clean voiceover script for narration
4) 2-4 meaningful branching choices
Maintain continuity with prior events and avoid contradictions.
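
To maintain continuity, the prior turns have to reach the model alongside the new prompt. The sketch below converts the history format used by this README's API (roles "user" and "assistant") into the role/parts structure that Gemini conversation contents use, where the model's own turns carry the role "model". The function name `build_contents` is hypothetical, and how director.py actually passes history is not shown in this README; the persona text itself would typically be supplied separately as a system instruction.

```python
# Map the README's history roles onto the roles used in Gemini contents.
ROLE_MAP = {"user": "user", "assistant": "model"}


def build_contents(history: list[dict], prompt: str) -> list[dict]:
    """Turn prior turns plus the new prompt into a list of content dicts."""
    contents = []
    for turn in history:
        role = turn["role"]
        if role not in ROLE_MAP:
            raise ValueError(f"unknown role in history: {role!r}")
        contents.append(
            {"role": ROLE_MAP[role], "parts": [{"text": turn["content"]}]}
        )
    # The new user prompt always goes last so the model responds to it.
    contents.append({"role": "user", "parts": [{"text": prompt}]})
    return contents
```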

🐳 Docker & Cloud Run Deployment

Local Docker Build

docker build -t mythos-director:latest .
docker run -p 8080:8080 \
  -e PORT=8080 \
  -e GOOGLE_CLOUD_PROJECT=your_project_id \
  -e GCS_BUCKET=your_bucket \
  mythos-director:latest

Deploy to Google Cloud Run

1. Authenticate and configure project:

gcloud auth login
gcloud config set project YOUR_PROJECT_ID

2. Build container in Cloud Build:

gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/mythos-director

3. Deploy to Cloud Run:

gcloud run deploy mythos-director \
  --image gcr.io/YOUR_PROJECT_ID/mythos-director \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated

4. Verify service URL:

gcloud run services describe mythos-director --platform managed --region us-central1

📊 Architecture

┌────────────────────┐
│ React Frontend     │
│ (Vite SPA)         │
│ StoryInput/SceneUI │
└─────────┬──────────┘
          │ HTTPS
          ▼
   /api/generate-scene
          │
┌─────────▼────────────┐
│ Python Backend       │
│ (FastAPI on          │
│ Cloud Run)           │
│ - GenAI Orchestrator │
│ - Media Pipeline     │
└──────┬─────┬─────────┘
       │     │
       │     └───────────────┐
       ▼                     ▼
┌───────────────┐      ┌───────────────┐
│ Vertex AI     │      │ Cloud Storage │
│ Gemini/Imagen │      │ Scene Assets  │
└───────────────┘      └───────────────┘
       │
       ▼
┌───────────────┐
│ Cloud TTS     │
│ Narration MP3 │
└───────────────┘

🔑 Environment Variables

# backend/.env
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GCS_BUCKET=your-asset-bucket
PORT=8000
ENV=development

✅ Hackathon Checklist

React frontend connected to backend API
Python backend integrated with Google GenAI SDK
Interleaved multimodal output (text + image + audio)
Assets uploaded and served from GCS
Dockerized backend running on Cloud Run
Architecture diagram finalized
Demo video recorded (<=4 minutes)
Public repository and submission page ready

🐛 Troubleshooting

"Backend cannot access Google services"

Run gcloud auth application-default login for local development and confirm that it completes without errors
Confirm the Vertex AI, Cloud Storage, Cloud Text-to-Speech, and Cloud Run APIs are enabled
Check service account permissions for deployed Cloud Run service

"Image or audio URL is empty"

Ensure GCS_BUCKET is set correctly
Validate bucket write permissions
Inspect backend logs for failed upload operations
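
When debugging empty URLs, it can help to check how the URL is assembled. The helper below is a hypothetical sketch that reproduces the https://storage.googleapis.com/BUCKET/... pattern from the API example and refuses to build a URL when the bucket name is empty. It assumes the objects are publicly readable; a private bucket would need signed URLs instead, which this sketch does not cover.

```python
from urllib.parse import quote


def public_gcs_url(bucket: str, object_name: str) -> str:
    """Build the public URL for an uploaded object, as in the API example."""
    if not bucket:
        raise ValueError("bucket name is empty; check GCS_BUCKET")
    if not object_name:
        raise ValueError("object name is empty; the upload may have failed")
    # Percent-encode unsafe characters but keep "/" as the path separator.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name, safe='/')}"
```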

"Frontend cannot call backend"

Confirm backend is running on http://localhost:8000
Check frontend API base URL configuration
Verify CORS settings in FastAPI

📝 License

Open source for educational and hackathon use.

🙌 Credits

Frontend: React + Vite
Backend: Python + FastAPI
AI: Google GenAI SDK (Gemini)
Media: Imagen + Google Cloud Text-to-Speech
Cloud: Google Cloud Run + Cloud Storage
