Documentation · Tutorials · API Reference
Modal Training Gym is a Python SDK for RL post-training on Modal, so you don't have to hand-roll a launcher every time. Pick a base model, a dataset, and an RL framework (GRPO, PPO, custom reward / generate functions); the gym handles cluster topology, Ray/NCCL bring-up, volume mounts, checkpointing, and serving for eval and rollouts. SFT and plain distributed training are supported too, but RL is the happy path.
Install with pip:
pip install -q git+https://github.com/modal-projects/training-gym.git@main

Or pin it in pyproject.toml for uv:

training-gym = { git = "https://github.com/modal-projects/training-gym.git", branch = "main" }

Then import the building blocks from your own script:

from modal_training_gym import TrainConfig

This repository includes an AGENTS.md and a skills/ directory (symlinked to .claude/skills/) that teach Claude Code how to navigate the framework: W&B configuration, custom rollouts and generate functions, custom eval functions, and more.
Clone the repo and run claude from its root; the skills load automatically based on what you ask for.
Training Gym ships a dashboard that aggregates training runs, deployments, and eval results in one place. Deploy your own copy:
training-gym setup

Modal prints a URL where you can watch jobs in progress.
The fastest path through the API is the tutorials. Each one
ships as a runnable .py and a paired .ipynb narrated cell-by-cell;
the notebook is the canonical walkthrough. Each tutorial below has a one-click
Launch button that opens the .ipynb in a fresh Modal Notebook; the first
code cell pip-installs modal-training-gym into the notebook kernel, so the
rest of the cells run as-is.
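
As a rough illustration, a first notebook cell along these lines could be sketched as follows. It builds the same `pip install` command shown in the install step, but routes it through `sys.executable` so the package lands in the environment backing the running kernel. The helper names `pip_install_command` and `install_into_kernel` are hypothetical and not part of the SDK; the actual tutorial cells may differ.

```python
import subprocess
import sys

# Repository URL taken from the install instructions above.
REPO_URL = "https://github.com/modal-projects/training-gym.git"


def pip_install_command(ref: str = "main") -> list[str]:
    """Build a pip command that installs the package at a given git ref.

    Hypothetical helper: using sys.executable targets the interpreter
    that is running this code, i.e. the notebook kernel.
    """
    return [sys.executable, "-m", "pip", "install", "-q", f"git+{REPO_URL}@{ref}"]


def install_into_kernel(ref: str = "main") -> None:
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(ref))


if __name__ == "__main__":
    # Print the command only; call install_into_kernel() in a real cell.
    print(" ".join(pip_install_command()))
```

Pinning `ref` to a tag or commit instead of `main` would make a notebook reproducible across reruns.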
Difficulty is a rough self-assessed signal for where to start:
- Beginner: single-node, introduces one framework concept.
- Intermediate: 1-2 nodes, or wires up something non-default (custom reward, external script).
- Advanced: ≥2 nodes with non-trivial parallelism (tensor-parallel, colocated RL, long context); assumes familiarity with the underlying framework.
| Tutorial | Summary | Difficulty | Framework | Launch |
|---|---|---|---|---|
| 000_rl_basics | Qwen3-4B haiku evaluation with verifiable rewards: serve, evaluate, train, compare | Beginner | slime | |
| 001_sandboxes | Code RL with Harbor hello-world and sandboxed verification | Intermediate | slime | |
| 002_multiturn | Multi-turn number-guessing RL with custom generate and reward functions | Intermediate | slime | |
| 003_on_policy_distillation | On-policy distillation on math: Qwen3-8B teacher, Qwen3-4B student | Intermediate | slime | |

See tutorials/README.md for how to run the .py
companions from the CLI and how to author a new tutorial.
Important
Single-node training is open to everyone. Multi-node clusters, required for larger models, are still in Beta. Contact us on Slack for access.
Full docs are hosted at gym.modal.dev:
- Tutorials β step-by-step runnable examples
- API Reference β every public class documented with types and defaults
Modal platform references:
MIT.
modal_training_gym/ - installable package
├── common/ - shared classes (datasets, models, eval, deployment, Ray helpers)
├── deploy_recipes/ - serving presets for engines like SGLang and vLLM
├── frameworks/ - launcher implementations that build Modal apps
└── train_recipes/ - training presets such as SlimeRecipe
tutorials/ - runnable examples, one folder per tutorial
├── tutorial_generator/ - source files; each produces a .py + .ipynb
└── generate_tutorial.py - AST-walks the sources, regenerates .py + .ipynb
dashboards/ - observability dashboard (deploy with `modal deploy dashboards/app.py`)
docs-next/ - Starlight docs site (deploy with `modal deploy docs-next/docs_next_app.py`)
.claude/skills/ - agent skills for navigating this repo
# editable install + pinned dev deps (pre-commit, etc.)
uv sync
# optional: register this venv as a Jupyter kernel for notebook work
uv run python -m ipykernel install --user --name=modal-training-gym
# install the pre-commit hook locally
uv run pre-commit install

Python is pinned to 3.12 (see .python-version and pyproject.toml). Modal's
@app.function(serialized=True) requires the local and remote Python versions
to match, and the framework images we ship (slime nightly, NeMo 25.11) are all
py312.
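
If it helps to catch a mismatched interpreter before launching anything, a small guard like the one below can run at the top of a local script. This is a minimal sketch, not something the SDK provides: the names `python_matches` and `require_python` are hypothetical, and comparing only the major and minor version is an assumption about what "match" means here.

```python
import sys

# The pin described above; hard-coded here for illustration.
REQUIRED_PYTHON = (3, 12)


def python_matches(required=REQUIRED_PYTHON, current=None) -> bool:
    """Return True when the (major, minor) version equals the required pin.

    `current` defaults to the running interpreter; pass a tuple to check
    another version, for example in tests.
    """
    found = sys.version_info[:2] if current is None else tuple(current)[:2]
    return tuple(found) == tuple(required)


def require_python(required=REQUIRED_PYTHON, current=None) -> None:
    """Raise RuntimeError with a readable message on a version mismatch."""
    if not python_matches(required, current):
        found = sys.version_info[:2] if current is None else tuple(current)[:2]
        raise RuntimeError(
            f"Python {required[0]}.{required[1]} is required, "
            f"found {found[0]}.{found[1]}"
        )
```

Calling `require_python()` before any Modal code runs turns a confusing remote failure into an immediate local error.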
See tutorials/README.md
for the generator-source format and the per-tutorial TUTORIAL_METADATA
schema.
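
To give a feel for what AST-walking a tutorial source for its metadata can look like, here is a minimal sketch using only the standard library. It is not the repository's generator: it assumes `TUTORIAL_METADATA` is a module-level literal dict, and the keys in the example source are hypothetical, loosely mirroring the columns of the tutorial table. The real schema is described in tutorials/README.md.

```python
import ast


def extract_metadata(source: str, name: str = "TUTORIAL_METADATA") -> dict:
    """Return the literal dict assigned to `name` at module level.

    The source is parsed, never executed, so a tutorial's imports and
    side effects do not run.
    """
    tree = ast.parse(source)
    for node in tree.body:  # module-level statements only
        if not isinstance(node, ast.Assign):
            continue
        if any(isinstance(t, ast.Name) and t.id == name for t in node.targets):
            value = ast.literal_eval(node.value)  # accepts literal nodes only
            if not isinstance(value, dict):
                raise TypeError(f"{name} must be a dict literal")
            return value
    raise KeyError(f"{name} not found at module level")


# Hypothetical tutorial source; the keys are illustrative, not the real schema.
EXAMPLE_SOURCE = '''
TUTORIAL_METADATA = {
    "summary": "Multi-turn number-guessing RL",
    "difficulty": "Intermediate",
    "framework": "slime",
}

def main():
    pass
'''

if __name__ == "__main__":
    print(extract_metadata(EXAMPLE_SOURCE))
```

Because `ast.literal_eval` rejects anything that is not a plain literal, metadata built from function calls or variables would fail loudly here instead of being silently evaluated.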
- Add a train recipe under modal_training_gym/train_recipes/, or a deploy recipe under modal_training_gym/deploy_recipes/.
- If the recipe needs new runtime behavior, wire it into the relevant launcher or serving builder under modal_training_gym/frameworks/ or modal_training_gym/deploy_recipes/*/serve_*.py.
- Add or update a source tutorial under tutorials/tutorial_generator/<bucket>/ and run the generator.
- Keep shared container objects (dataset, model, wandb, eval) framework-agnostic; recipe layers do the translation into engine-specific flags.
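
The last guideline can be illustrated with a small sketch. Everything here is hypothetical: `ModelSpec`, `sglang_flags`, and `vllm_flags` are stand-ins rather than classes or functions from this package, and the flag names are shown for illustration only, so check each engine's own documentation before relying on them. The point is the shape: the shared object carries no engine vocabulary, and each recipe layer owns its own translation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical shared container: no engine-specific fields."""

    name: str
    tensor_parallel: int = 1


def sglang_flags(model: ModelSpec) -> list[str]:
    """Recipe-layer translation into SGLang-style launch flags (illustrative)."""
    return ["--model-path", model.name, "--tp-size", str(model.tensor_parallel)]


def vllm_flags(model: ModelSpec) -> list[str]:
    """Recipe-layer translation into vLLM-style launch flags (illustrative)."""
    return [
        "--model", model.name,
        "--tensor-parallel-size", str(model.tensor_parallel),
    ]


if __name__ == "__main__":
    spec = ModelSpec(name="Qwen/Qwen3-4B", tensor_parallel=2)
    print(sglang_flags(spec))
    print(vllm_flags(spec))
```

With this split, supporting another engine means adding one more translation function while `ModelSpec` and every caller that builds it stay unchanged.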
Working on this repo with an AI coding agent? The .claude/skills/ directory
contains auto-triggering skills for Modal training workflows, example
validation, and repo navigation.
