CodeStrate/pedagogy-agent-mastra

Pedagogy Agent — Mastra + JSON-to-Excel Converter

A Mastra agent that ingests textbook PDFs and produces structured pedagogical JSON, plus a standalone CLI that turns those JSONs into per-class Excel workbooks.

The repo has three pieces:

  • src/mastra/ — Mastra agents (pedagogyAgent, pedagogyLiteratureAgent), tools, prompts, and a shared Mastra Workspace (src/mastra/workspace.ts). Exposes a mastra dev server on port 4111.
  • standalone-excel-converter/ — Pure-Bun CLI (cli.ts, batch-convert.ts) that consumes the JSON output and produces Excel workbooks.
  • workspace/ — Persistent agent workspace. The Mastra LocalFilesystem basePath is rooted here, and all generated artifacts land in workspace/outputs/. This is where the agent and converter exchange files.

You can run everything locally with Bun or in Docker.


a. Set-up Guide (Local with Bun)

Prerequisites

  • Bun ≥ 1.0 (curl -fsSL https://bun.sh/install | bash)
  • A Google AI / Vertex AI API key with access to gemini-2.5-pro

1. Clone and install

git clone <repo-url> pedagogy-agent-mastra
cd pedagogy-agent-mastra
bun install

2. Configure environment

cp .env.example .env
# edit .env and set GOOGLE_GENERATIVE_AI_API_KEY

Bun loads .env automatically — no dotenv needed.
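Because the key only matters once the first model call is made, a missing value can go unnoticed until then. A minimal sketch of a fail-fast check, assuming a hypothetical helper named requireEnv (it is not part of this repo), might look like this:

```typescript
// Hypothetical helper (not part of the repo): fail fast when a required
// environment variable is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set; copy .env.example to .env and fill it in`);
  }
  return value;
}

// Typical call at startup, once Bun has merged .env into process.env:
// const apiKey = requireEnv(process.env, "GOOGLE_GENERATIVE_AI_API_KEY");
```

Passing the environment in as an argument keeps the helper easy to exercise without touching the real process environment.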

3. Workspace

The workspace/ directory is checked into the repo (skeleton only). At runtime the agent and converter both read/write workspace/outputs/:

workspace/
└── outputs/
    ├── json_files/    ← agent saveJsonToFile target, converter input
    └── excel_files/   ← converter output, also agent jsonToXLSXTool target

Everything under workspace/outputs/ is gitignored; only the directory skeleton is tracked. There is nothing else to set up: the tools create any missing directories (the equivalent of mkdir -p) on first write.
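The create-on-first-write behaviour can be sketched as follows. This is an illustration only, not the repo's tool code: ensureOutputDirs is a hypothetical name, and the subdirectory names are taken from the layout above.

```typescript
import { mkdirSync } from "node:fs";
import { join } from "node:path";

// Subdirectory names taken from the workspace layout described above.
const OUTPUT_SUBDIRS = ["json_files", "excel_files"];

// Hypothetical helper: create workspace/outputs/<subdir> if it is missing,
// the same way `mkdir -p` would. Safe to call repeatedly.
function ensureOutputDirs(workspaceRoot: string): { dir: string; created: boolean }[] {
  return OUTPUT_SUBDIRS.map((name) => {
    const dir = join(workspaceRoot, "outputs", name);
    // With `recursive: true`, mkdirSync returns the first directory it had
    // to create, or undefined when everything already existed.
    const created = mkdirSync(dir, { recursive: true }) !== undefined;
    return { dir, created };
  });
}
```

Calling it a second time is a no-op, which is why a tool can run it before every write without checking first.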

4. Run the agent

bun run dev

mastra dev boots on http://localhost:4111 (playground UI + REST API). The agent is wired with a Workspace (see src/mastra/workspace.ts), so it has built-in read_file / write_file / list_dir tools scoped to workspace/, plus the project's custom PDF and JSON-saving tools.

5. Verify

Open the playground, pick Pedagogy Agent or Pedagogy Literature Agent, paste a public PDF URL, and watch outputs land in workspace/outputs/json_files/ on the host.


b. Dockerization

Use this path if you don't want Bun on the host or want a reproducible runtime.

Prerequisites

  • Docker ≥ 24
  • Docker Compose v2

Files

  • Dockerfile — Bun 1.x (Debian) base, installs deps, copies source + workspace skeleton, runs bun run dev on :4111.
  • docker-compose.yml — two services on the same image:
    • pedagogy-agent — long-running Mastra dev server.
    • converter — gated behind profiles: ["tools"], intended for docker compose run --rm converter ....
  • .dockerignore — excludes node_modules, .env, .mastra, workspace/outputs/, etc.

1. Configure

cp .env.example .env
# edit .env

2. Build

docker compose build

This bakes the dependencies into the pedagogy-agent:latest image, which both services share.

3. Run the agent server

docker compose up -d pedagogy-agent
docker compose logs -f pedagogy-agent          # tail
docker compose down                            # stop

Reachable at http://localhost:4111.

4. Run the standalone converter

The converter service is opt-in. Either drop into a shell:

docker compose run --rm converter
# inside:
bun batch-convert.ts
bun cli.ts ../workspace/outputs/json_files/class_1-english_bb_class1.json english class_1
exit

…or run a single command:

docker compose run --rm converter bun batch-convert.ts
docker compose run --rm converter bun cli.ts example.json mathematics class5

Or, if the agent is already up, exec into it:

docker compose exec pedagogy-agent bash
cd standalone-excel-converter && bun batch-convert.ts

5. Bind mounts

Only workspace/outputs/ is bind-mounted — that's the persistent boundary between host and container. Everything else (source, node_modules) lives inside the image, so host/container Bun versions can't collide.

Host                   Container                 Purpose
./workspace/outputs    /app/workspace/outputs    Agent + converter artifacts

To pick up source changes after edits, rebuild: docker compose build. (For a live-reload dev workflow, add a ./src:/app/src mount yourself — left out of the default to keep the image self-contained.)

Rebuilding after dependency changes

docker compose build --no-cache pedagogy-agent

c. Usage

c.1 Mastra agent server

Local:

bun run dev

Docker:

docker compose up -d pedagogy-agent

Hit the API:

curl -X POST http://localhost:4111/analyze-pdf \
  -H "Content-Type: application/json" \
  -d '{"pdfUrl": "https://example.com/textbook.pdf"}'

The agent downloads the PDF, parses it, returns the pedagogical structure, and writes JSON to workspace/outputs/json_files/ via the saveJsonToFile tool.
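The same request can be issued from TypeScript. The sketch below mirrors the curl call above (endpoint path and pdfUrl field as shown there); the function names are hypothetical, and the response is left as unknown because its shape is not documented here.

```typescript
// Plain description of the HTTP request, kept separate from the network
// call so it can be inspected or logged before sending.
interface AnalyzeRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper mirroring the curl example above.
function buildAnalyzeRequest(
  pdfUrl: string,
  baseUrl = "http://localhost:4111",
): AnalyzeRequest {
  // Validate early so a malformed URL fails locally, not inside the agent.
  const parsed = new URL(pdfUrl);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`unsupported PDF URL scheme: ${parsed.protocol}`);
  }
  return {
    url: new URL("/analyze-pdf", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ pdfUrl }),
    },
  };
}

// Sends the request; requires the agent server to be running.
async function analyzePdf(pdfUrl: string): Promise<unknown> {
  const { url, init } = buildAnalyzeRequest(pdfUrl);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`analyze-pdf failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

With the server up, `await analyzePdf("https://example.com/textbook.pdf")` would perform the same call as the curl command.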

c.2 Batch-process many PDFs

bun src/batch-process.ts input.json
bun src/batch-process.ts input.json --class=1 --concurrent=2
bun src/batch-process.ts input.json --book=hindi_bb_class1
bun src/batch-process.ts input.json --skip-existing --concurrent=3

Manifest shape (input.json):

{
  "hindi_bb_class1": {
    "url": "https://.../hindi-class1.pdf",
    "class": 1,
    "medium": ["hindi"],
    "filename": "hindi_bb_class1",
    "prompt": "..."
  }
}

--skip-existing checks workspace/outputs/json_files/ for already-processed books.
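To make the manifest shape and the filter flags concrete, here is a rough sketch of how the selection could work. It is not the code in src/batch-process.ts: the type and function names are hypothetical, and the output file name class_<class>-<filename>.json is an assumption inferred from the example file class_1-english_bb_class1.json.

```typescript
// One manifest entry, following the input.json shape shown above.
interface BookEntry {
  url: string;
  class: number;
  medium: string[];
  filename: string;
  prompt: string;
}

type Manifest = Record<string, BookEntry>;

// Hypothetical options corresponding to --class, --book and --skip-existing.
interface SelectOptions {
  classFilter?: number;
  book?: string;
  skipExisting?: boolean;
}

// ASSUMPTION: output naming inferred from class_1-english_bb_class1.json.
function outputJsonName(entry: BookEntry): string {
  return `class_${entry.class}-${entry.filename}.json`;
}

// Returns the manifest keys that would still be processed.
function selectBooks(
  manifest: Manifest,
  opts: SelectOptions,
  existingJsonFiles: ReadonlySet<string> = new Set<string>(),
): string[] {
  return Object.entries(manifest)
    .filter(([key, entry]) => {
      if (opts.book !== undefined && key !== opts.book) return false;
      if (opts.classFilter !== undefined && entry.class !== opts.classFilter) return false;
      if (opts.skipExisting && existingJsonFiles.has(outputJsonName(entry))) return false;
      return true;
    })
    .map(([key]) => key);
}
```

Here existingJsonFiles stands in for a listing of workspace/outputs/json_files/, so the selection logic stays independent of the filesystem.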

c.3 Convert JSONs to Excel

Convert every JSON in workspace/outputs/json_files/ into per-class workbooks:

cd standalone-excel-converter
bun batch-convert.ts

Single file:

bun cli.ts <json-file> <subject> <class>
# example
bun cli.ts ../workspace/outputs/json_files/class_1-english_bb_class1.json english class_1

Output:

workspace/outputs/excel_files/
├── class_1-pedagogy.xlsx
│   ├── english   (english_bb + english_book records merged)
│   ├── hindi
│   └── marathi
└── ...

For converter internals (subject normalization, multi-medium handling, file-naming conventions) see standalone-excel-converter/README.md.
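As a simplified picture of the grouping shown in the output tree, the sketch below plans one workbook per class and one sheet per subject from the JSON file names. It assumes names of the form class_N-subject_type_classN.json with a single-word subject, and the function name planWorkbooks is hypothetical; the real converter also normalizes subjects and handles multiple mediums, which this sketch does not attempt.

```typescript
// ASSUMPTION: file names look like class_1-english_bb_class1.json, where
// "english" is the subject and "bb" the book type.
const FILE_PATTERN = /^class_(\d+)-([a-z]+)_([a-z0-9]+)_class\d+\.json$/;

// workbook file name -> sheet (subject) name -> source JSON files
type WorkbookPlan = Record<string, Record<string, string[]>>;

function planWorkbooks(fileNames: string[]): WorkbookPlan {
  const plan: WorkbookPlan = {};
  for (const name of fileNames) {
    const match = FILE_PATTERN.exec(name);
    if (!match) continue; // ignore files that do not follow the convention
    const classNo = match[1];
    const subject = match[2];
    const workbook = `class_${classNo}-pedagogy.xlsx`;
    if (!plan[workbook]) plan[workbook] = {};
    if (!plan[workbook][subject]) plan[workbook][subject] = [];
    // Files with the same class and subject end up on the same sheet.
    plan[workbook][subject].push(name);
  }
  return plan;
}
```

Under these assumptions, class_1-english_bb_class1.json and class_1-english_book_class1.json both map to the english sheet of class_1-pedagogy.xlsx.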

c.4 Quick reference

Task                         Command
Start agent (local)          bun run dev
Start agent (Docker)         docker compose up -d pedagogy-agent
Tail logs (Docker)           docker compose logs -f pedagogy-agent
Shell into running agent     docker compose exec pedagogy-agent bash
One-off converter shell      docker compose run --rm converter
One-off batch convert        docker compose run --rm converter bun batch-convert.ts
Rebuild after dep changes    docker compose build --no-cache

Mastra Workspace notes

  • src/mastra/workspace.ts instantiates a Workspace with LocalFilesystem({ basePath: <repo>/workspace }) and a LocalSandbox rooted at the same path.
  • The workspace is attached to both agents via new Agent({ ..., workspace }). Mastra automatically exposes filesystem tools (read_file, write_file, list_dir, …) scoped to basePath. The destructive delete tool is disabled.
  • Custom tools (saveJsonToFile, convertToXlsx) write to workspace/outputs/json_files/ and workspace/outputs/excel_files/ via constants exported from workspace.ts — same physical location as the workspace tools, so the agent can also read_file / list_dir over its own outputs.
  • The path is resolved via import.meta.dirname so it's stable regardless of process.cwd() (which differs between mastra dev, bun run, and the bundled build).
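The path resolution described in the last bullet can be sketched roughly as below. This is not the contents of workspace.ts: the function and property names are hypothetical, and the two-levels-up step assumes the file lives in src/mastra/ as stated above.

```typescript
import { join, resolve } from "node:path";

// moduleDir is the directory containing workspace.ts. Inside that file it
// would be import.meta.dirname (available in Bun and in Node.js 20.11+),
// which does not change with process.cwd().
function workspacePaths(moduleDir: string) {
  // src/mastra -> repo root -> workspace
  const basePath = resolve(moduleDir, "..", "..", "workspace");
  const outputs = join(basePath, "outputs");
  return {
    basePath,                                // root handed to LocalFilesystem
    jsonDir: join(outputs, "json_files"),    // target of the JSON-saving tool
    excelDir: join(outputs, "excel_files"),  // target of the Excel tool
  };
}

// In workspace.ts this might be called as:
// const paths = workspacePaths(import.meta.dirname);
```

Deriving every output directory from one base path keeps the custom tools and the built-in filesystem tools pointed at the same physical location.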

Troubleshooting

  • Port 4111 in use: change the host side of the port mapping in docker-compose.yml (e.g. "5111:4111"), then browse to the new host port.
  • GOOGLE_GENERATIVE_AI_API_KEY not set: the agent fails on the first model call. Run docker compose config to inspect the resolved environment.
  • Empty workspace/outputs/excel_files/: make sure workspace/outputs/json_files/ contains files matching class_N-subject_type_classN.json before running batch-convert.ts.
  • Permission errors on Linux bind mounts: the container runs as the bun user. If files under the mount end up unwritable from the host, reclaim them with chown -R $USER workspace/outputs (sudo may be required).
