modal-projects/training-gym

Training Gym

📖 Documentation · Tutorials · API Reference

Modal Training Gym is a Python SDK for RL post-training on Modal, so you don't have to hand-roll a launcher every time. Pick a base model, a dataset, and an RL framework (GRPO, PPO, custom reward / generate functions); the gym handles cluster topology, Ray/NCCL bring-up, volume mounts, checkpointing, and serving for eval and rollouts. SFT and plain distributed training are supported too, but RL is the happy path.

Quickstart

Install with pip:

pip install -q git+https://github.com/modal-projects/training-gym.git@main

Or pin it in pyproject.toml for uv:

training-gym = { git = "https://github.com/modal-projects/training-gym.git", branch = "main" }

Then import the building blocks from your own script:

from modal_training_gym import TrainConfig
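
Before writing a launcher, it can help to confirm that the package is importable in the current environment. The sketch below uses only the standard library; the helper name `is_installed` is hypothetical and not part of the gym's API.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "modal_training_gym" is the import name used in the quickstart.
    name = "modal_training_gym"
    if is_installed(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing; re-run the pip install step")
```

The check only looks the module up; it does not import it, so it is safe to run before Modal credentials are configured.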

Agent set-up

This repository includes an AGENTS.md and a skills/ directory (symlinked to .claude/skills/) that teach Claude Code how to navigate the framework: W&B configuration, custom rollouts and generate functions, custom eval functions, and more.

Clone the repo and run claude from its root; the skills load automatically based on what you ask for.

Observability dashboard

Training Gym ships a dashboard that aggregates training runs, deployments, and eval results in one place. Deploy your own copy:

training-gym setup

Modal prints a URL where you can watch jobs in progress.

Gym Observability Dashboard

Tutorials

The fastest path through the API is the tutorials. Each one ships as a runnable .py and a paired .ipynb narrated cell by cell; the notebook is the canonical walkthrough. Each tutorial below has a one-click Launch button that opens the .ipynb in a fresh Modal Notebook; the first code cell pip-installs modal-training-gym into the notebook kernel, so the rest of the cells run as-is.

Difficulty is a rough self-assessed signal for where to start:

  • Beginner: single-node, introduces one framework concept.
  • Intermediate: 1–2 nodes, or wires up something non-default (custom reward, external script).
  • Advanced: ≥2 nodes with non-trivial parallelism (tensor-parallel, colocated RL, long context); assumes familiarity with the underlying framework.

RL

Tutorial | Summary | Difficulty | Framework | Launch
000_rl_basics | Qwen3-4B haiku evaluation with verifiable rewards: serve, evaluate, train, compare | Beginner | slime | Open in Modal
001_sandboxes | Code RL with Harbor hello-world and sandboxed verification | Intermediate | slime | Open in Modal
002_multiturn | Multi-turn number-guessing RL with custom generate and reward functions | Intermediate | slime | Open in Modal
003_on_policy_distillation | On-policy distillation on math: Qwen3-8B teacher, Qwen3-4B student | Intermediate | slime | Open in Modal

See tutorials/README.md for how to run the .py companions from the CLI and how to author a new tutorial.

Multi-node access

Important

Single-node training is open to everyone. Multi-node clusters, which are required for larger models, are still in Beta. Contact us on Slack for access.

Architecture

Architecture diagram

Documentation

Full docs are hosted at gym.modal.dev:

  • Tutorials: step-by-step runnable examples
  • API Reference: every public class documented with types and defaults

Modal platform references:

License

MIT.


Contributing Guide

Layout

modal_training_gym/        ← installable package
├── common/                ← shared classes (datasets, models, eval, deployment, Ray helpers)
├── deploy_recipes/        ← serving presets for engines like SGLang and vLLM
├── frameworks/            ← launcher implementations that build Modal apps
└── train_recipes/         ← training presets such as SlimeRecipe

tutorials/                 ← runnable examples, one folder per tutorial
├── tutorial_generator/    ← source files; each produces a .py + .ipynb
└── generate_tutorial.py   ← AST-walks the sources, regenerates .py + .ipynb

dashboards/                ← observability dashboard (deploy with `modal deploy dashboards/app.py`)
docs-next/                 ← Starlight docs site (deploy with `modal deploy docs-next/docs_next_app.py`)
.claude/skills/            ← agent skills for navigating this repo

Dev setup

# editable install + pinned dev deps (pre-commit, etc.)
uv sync

# optional: register this venv as a Jupyter kernel for notebook work
uv run python -m ipykernel install --user --name=modal-training-gym

# install the pre-commit hook locally
uv run pre-commit install

Python is pinned to 3.12 (see .python-version and pyproject.toml). Modal's @app.function(serialized=True) requires the local and remote Python versions to match, and the framework images we ship (slime nightly, NeMo 25.11) are all py312.
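
Because a version mismatch only surfaces once a function is shipped to Modal, a quick local check can save a round trip. This is a minimal sketch using the standard library; `python_matches_pin` and `REQUIRED_PYTHON` are hypothetical names, and the (3, 12) value is taken from the pin described above.

```python
import sys

# Major/minor version the repo pins (see .python-version and pyproject.toml).
REQUIRED_PYTHON = (3, 12)


def python_matches_pin(version=None, required=REQUIRED_PYTHON) -> bool:
    """Compare a version tuple (default: the running interpreter) to the pin."""
    major_minor = tuple(sys.version_info[:2] if version is None else version[:2])
    return major_minor == tuple(required)


if __name__ == "__main__":
    if python_matches_pin():
        print("Local Python matches the 3.12 pin.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:2])
        print(f"Local Python is {found}; expected 3.12.")
```

Only the major and minor components are compared, so any 3.12.x patch release passes.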

Authoring a new tutorial

See tutorials/README.md for the generator-source format and the per-tutorial TUTORIAL_METADATA schema.
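
To give a feel for what AST-walking a generator source can look like, here is a small sketch that reads a top-level TUTORIAL_METADATA assignment without executing the file. It is not the repo's generate_tutorial.py: the function name `extract_metadata` and the example fields ("title", "difficulty") are assumptions for illustration, and the real schema is the one documented in tutorials/README.md.

```python
import ast


def extract_metadata(source: str, name: str = "TUTORIAL_METADATA"):
    """Return the literal assigned to `name` at module top level, or None."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    # literal_eval accepts an AST node and never runs code.
                    return ast.literal_eval(node.value)
    return None


# Hypothetical tutorial source; the field names are made up for this example.
EXAMPLE_SOURCE = '''
TUTORIAL_METADATA = {"title": "Example", "difficulty": "Beginner"}

print("tutorial body")
'''

if __name__ == "__main__":
    print(extract_metadata(EXAMPLE_SOURCE))
```

Parsing rather than importing means the tutorial body, which may launch Modal jobs, is never run just to read its metadata.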

Contributing a new recipe

  1. Add a train recipe under modal_training_gym/train_recipes/, or a deploy recipe under modal_training_gym/deploy_recipes/.
  2. If the recipe needs new runtime behavior, wire it into the relevant launcher or serving builder under modal_training_gym/frameworks/ or modal_training_gym/deploy_recipes/*/serve_*.py.
  3. Add or update a source tutorial under tutorials/tutorial_generator/<bucket>/ and run the generator.
  4. Keep shared container objects (dataset, model, wandb, eval) framework-agnostic; recipe layers do the translation into engine-specific flags.
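
The separation described in step 4 can be pictured roughly as follows. This is an illustrative sketch only: `ModelSpec`, `to_engine_flags`, and the flag names are hypothetical and do not correspond to the gym's classes or to any real engine's CLI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Framework-agnostic description of a model (hypothetical)."""

    name: str
    max_context: int


def to_engine_flags(model: ModelSpec) -> list[str]:
    """Recipe-layer translation into made-up CLI flags for one engine."""
    return ["--model", model.name, "--max-len", str(model.max_context)]


if __name__ == "__main__":
    spec = ModelSpec(name="Qwen3-4B", max_context=4096)
    print(to_engine_flags(spec))
```

The shared object carries no engine vocabulary, so supporting another engine would mean adding another translation function rather than changing `ModelSpec`.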

Agent guide

Working on this repo with an AI coding agent? The .claude/skills/ directory contains auto-triggering skills for Modal training workflows, example validation, and repo navigation.
