Beyond the cartridge: A research and preparation workspace for the Return to Yoshi's Island documentary.
Quick Start · Analysis Pipeline · Project Structure · rtyi.land
This repository is the working hub for a long-form documentary about Return to Yoshi's Island, the Super Mario 64 ROM hack led by Kaze Emanuar. It holds everything needed to prepare for filming: interview question sets, narrative planning, stream evidence, contributor profiles, and curated quotes.
Planned interviews include:
- Kaze Emanuar
- Biobak
- Badub
- Kaze and Zeina together
Prerequisites:

- Node.js 22+
- pnpm
```bash
git clone git@github.com:johannschopplich/rtyi-doc.git
cd rtyi-doc
pnpm install
```

```bash
pnpm docs:dev       # Start local dev server
pnpm docs:build     # Build the site
pnpm docs:preview   # Preview the built site
```

Raw stream transcripts live in `transcripts/`. Running the analysis pipeline extracts structured findings from each one and writes them to `.data/streams/` as JSON. The synthesis step then aggregates those findings into documentary-ready content.
Create a `.env` file in the repository root with your API keys, then:

```bash
pnpm stream-analysis    # Per-stream extraction → .data/streams/
pnpm stream-synthesis   # Cross-stream aggregation → .data/synthesis/
```

Linting, formatting, and type checks:

```bash
pnpm lint
pnpm format:check
pnpm test:types
```

```
rtyi-doc/
├── docs/                     # VitePress documentary research site
│   ├── .vitepress/           # Site config, theme, data loaders
│   ├── drafts/               # Narrative arcs and chapter planning
│   ├── interviews/           # Per-person interview question sets
│   ├── synthesis/            # Documentary prep (generated from analysis)
│   ├── streams/              # Stream pages and dashboard
│   ├── topics/               # Findings grouped by documentary theme
│   ├── team/                 # Contributor profiles
│   ├── prompts/              # Prompt and extraction documentation
│   ├── research/             # Background research on documentary craft
│   ├── public/               # Static assets
│   └── index.md
├── scripts/
│   ├── stream-analysis.ts    # Per-stream transcript extraction
│   └── stream-synthesis.ts   # Cross-stream aggregation
├── src/
│   ├── analysis/
│   │   ├── prompts.ts        # Prompt templates for stream extraction
│   │   ├── schemas.ts        # Zod schemas for analysis output
│   │   └── runner.ts         # Transcript processing and JSON output
│   ├── synthesis/
│   │   ├── prompts.ts        # Prompt templates for aggregation
│   │   ├── schemas.ts        # Zod schemas for synthesis output
│   │   └── runner.ts         # Aggregation execution
│   ├── stt-corrections.ts    # Speech-to-text cleanup rules
│   ├── constants.ts          # Paths and model defaults
│   └── utils.ts              # Provider and model helpers
├── transcripts/              # Raw stream transcript files (.txt)
├── .data/                    # Generated analysis artifacts
│   ├── streams/              # Per-stream extraction output (JSON)
│   └── synthesis/            # Aggregated documentary prep (JSON)
├── package.json
├── pnpm-workspace.yaml
├── wrangler.toml
└── README.md
```
The pipeline has two stages. The first reads each raw transcript and extracts structured findings: development decisions, context and motivation, contributor roles, key stories, and open questions for follow-up interviews. Output goes to `.data/streams/` as one JSON file per stream.
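For orientation, a per-stream findings record might look like the following TypeScript sketch. The field names here are assumptions for illustration only; the authoritative schema is defined with Zod in `src/analysis/schemas.ts`.

```typescript
// Hypothetical shape of one extraction file in .data/streams/.
// Field names are illustrative, not copied from the real schema.
interface StreamFindings {
  streamId: string;
  developmentDecisions: string[]; // e.g. engine or level-design choices
  contributorRoles: { name: string; role: string }[];
  keyStories: string[]; // anecdotes worth retelling on camera
  openQuestions: string[]; // follow-ups for the planned interviews
}

// A purely illustrative record.
const example: StreamFindings = {
  streamId: "stream-001",
  developmentDecisions: ["Chose a custom camera over the vanilla one"],
  contributorRoles: [{ name: "Kaze Emanuar", role: "Lead developer" }],
  keyStories: [],
  openQuestions: ["What motivated the camera change?"],
};

console.log(example.streamId); // stream-001
```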
The second stage aggregates all per-stream output into documentary-ready material in `.data/synthesis/`:

- Story Arcs: arc-first narrative stories with embedded interview questions and quotes
- Narrative Arcs: a thematic filming roadmap forming the documentary's high-level structure
- Topic Arcs: per-topic narrative summaries tracing how each theme evolved across streams
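The core of the aggregation can be sketched as a grouping step over all extracted findings. The `Finding` shape below is an assumption for illustration; the real aggregation logic lives in `src/synthesis/runner.ts`.

```typescript
// Illustrative sketch: collect findings by theme so each topic page can
// trace how the theme evolved across streams.
interface Finding {
  streamId: string;
  topic: string;
  summary: string;
}

function groupByTopic(findings: Finding[]): Map<string, Finding[]> {
  const byTopic = new Map<string, Finding[]>();
  for (const finding of findings) {
    const bucket = byTopic.get(finding.topic) ?? [];
    bucket.push(finding);
    byTopic.set(finding.topic, bucket);
  }
  return byTopic;
}

// Findings on the same topic from different streams end up together.
const grouped = groupByTopic([
  { streamId: "s1", topic: "optimization", summary: "..." },
  { streamId: "s2", topic: "optimization", summary: "..." },
  { streamId: "s2", topic: "level-design", summary: "..." },
]);
console.log(grouped.get("optimization")?.length); // 2
```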
VitePress data loaders in docs/synthesis/ read these JSON files and render them as browsable pages on the research site.
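Such a loader follows VitePress's `*.data.ts` convention: at build time VitePress calls `load()` and exposes the result as the module's `data` export, which pages then import. The file name, glob, and `StoryArc` fields below are assumptions; the actual loaders live in `docs/synthesis/`.

```typescript
// docs/synthesis/story-arcs.data.ts (hypothetical file name)
import { readFileSync } from "node:fs";

export interface StoryArc {
  title: string;
  summary: string;
}

// `watch` globs are resolved relative to this file; load() receives the
// absolute paths of the matched files and returns the page data.
const loader = {
  watch: ["../../.data/synthesis/*.json"],
  load(watchedFiles: string[]): StoryArc[] {
    return watchedFiles.flatMap(
      (file) => JSON.parse(readFileSync(file, "utf8")) as StoryArc[],
    );
  },
};

export default loader;
```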
The site deploys to Cloudflare as a static build. Routing and asset configuration live in `wrangler.toml`, and the site is served at rtyi.land.
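A minimal `wrangler.toml` for a static deployment of this kind might look like the following sketch; the project name, compatibility date, and output directory are assumptions, not copied from the repository's actual config.

```toml
# Hypothetical sketch; exact values live in the repository's wrangler.toml.
name = "rtyi-land"
compatibility_date = "2025-01-01"

# Serve the VitePress build output as static assets.
[assets]
directory = "docs/.vitepress/dist"

# Custom-domain route for rtyi.land.
routes = [
  { pattern = "rtyi.land", custom_domain = true }
]
```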
This project is for internal documentary research and preparation. Content related to the RTYI team and project is used with permission for documentary purposes.