Training Gym

📖 Documentation · Tutorials · API Reference

Modal Training Gym is a Python SDK for RL post-training on Modal, so you don't have to hand-roll a launcher every time. Pick a base model, a dataset, and an RL setup (GRPO, PPO, or custom reward / generate functions); the gym handles cluster topology, Ray/NCCL bring-up, volume mounts, checkpointing, and serving for eval and rollouts. SFT and plain distributed training are supported too, but RL is the happy path.

Quickstart

Install with pip:

pip install -q git+https://github.com/modal-projects/training-gym.git@main

Or pin it in pyproject.toml for uv:

training-gym = { git = "https://github.com/modal-projects/training-gym.git", branch = "main" }

Then import the building blocks from your own script:

from modal_training_gym import TrainConfig
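
To confirm the install landed in the environment you expect, a small check like the one below may help. It is only a sketch: `is_installed` is a helper name made up here, and it relies on nothing beyond the standard library and the `modal_training_gym` module name used in the import above.

```python
import importlib.util


def is_installed(module_name: str = "modal_training_gym") -> bool:
    """Return True if the module can be found on the current import path."""
    # find_spec looks a top-level module up without importing it, so a
    # missing package yields None instead of raising ImportError.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed():
        print("modal_training_gym is importable")
    else:
        print("modal_training_gym not found; re-run the install step")
```

Because the lookup does not import the package, it stays quick even when the package pulls in heavy dependencies.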

Agent set-up

This repository includes an AGENTS.md and a skills/ directory (symlinked to .claude/skills/) that teach Claude Code how to navigate the framework — W&B configuration, custom rollouts and generate functions, custom eval functions, and more.

Clone the repo and run claude from its root; the skills load automatically based on what you ask for.

Observability dashboard

Training Gym ships a dashboard that aggregates training runs, deployments, and eval results in one place. Deploy your own copy:

training-gym setup

Modal prints a URL where you can watch jobs in progress.

Tutorials

The tutorials are the fastest path through the API. Each one ships as a runnable .py and a paired .ipynb narrated cell by cell; the notebook is the canonical walkthrough. Each tutorial below also has a one-click Launch button that opens the .ipynb in a fresh Modal Notebook. The first code cell pip-installs modal-training-gym into the notebook kernel, so the remaining cells run as-is.

Difficulty is a rough self-assessed signal for where to start:

Beginner — single-node, introduces one framework concept.
Intermediate — 1–2 nodes, or wires up something non-default (custom reward, external script).
Advanced — ≥2 nodes with non-trivial parallelism (tensor-parallel, colocated RL, long context); assumes familiarity with the underlying framework.

RL

Tutorial	Summary	Difficulty	Framework
`000_rl_basics`	Qwen3-4B haiku evaluation with verifiable rewards — serve, evaluate, train, compare	Beginner	`slime`
`001_sandboxes`	Code RL with Harbor hello-world and sandboxed verification	Intermediate	`slime`
`002_multiturn`	Multi-turn number-guessing RL with custom generate and reward functions	Intermediate	`slime`
`003_on_policy_distillation`	On-policy distillation on math — Qwen3-8B teacher, Qwen3-4B student	Intermediate	`slime`

See tutorials/README.md for how to run the .py companions from the CLI and how to author a new tutorial.
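
Several of these tutorials revolve around verifiable rewards and custom reward functions. The sketch below illustrates the general idea only; it is not the gym's or slime's actual interface. `guess_reward` and `group_relative_advantages` are hypothetical names, the scoring rule for a number-guessing task is an assumption, and the normalization is the usual group-relative form associated with GRPO (reward minus the group mean, divided by the group standard deviation).

```python
import re
import statistics


def guess_reward(response: str, target: int) -> float:
    """Hypothetical verifiable reward: 1.0 if the last integer in the
    response equals the target, otherwise 0.0."""
    numbers = re.findall(r"-?\d+", response)
    if not numbers:
        return 0.0
    return 1.0 if int(numbers[-1]) == target else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards across the samples drawn for a single prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps keeps the division finite when every sample got the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    responses = ["I guess 42", "maybe 7", "no idea", "final answer: 42"]
    rewards = [guess_reward(r, target=42) for r in responses]
    print(rewards)
    print(group_relative_advantages(rewards))
```

A reward like this is "verifiable" in the sense that it can be checked programmatically, with no learned reward model in the loop.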

Multi-node access

Important

Single-node training is open to everyone. Multi-node clusters, which larger models require, are still in beta. Contact us on Slack for access.

Architecture

Documentation

Full docs are hosted at gym.modal.dev:

Tutorials — step-by-step runnable examples
API Reference — every public class documented with types and defaults

Modal platform references:

License

MIT.

Contributing Guide

Layout

modal_training_gym/        ← installable package
├── common/                ← shared classes (datasets, models, eval, deployment, Ray helpers)
├── deploy_recipes/        ← serving presets for engines like SGLang and vLLM
├── frameworks/            ← launcher implementations that build Modal apps
└── train_recipes/         ← training presets such as SlimeRecipe

tutorials/                 ← runnable examples — one folder per tutorial
├── tutorial_generator/    ← source files; each produces a .py + .ipynb
└── generate_tutorial.py   ← AST-walks the sources, regenerates .py + .ipynb

dashboards/                ← observability dashboard (deploy with `modal deploy dashboards/app.py`)
docs-next/                 ← Starlight docs site (deploy with `modal deploy docs-next/docs_next_app.py`)
.claude/skills/            ← agent skills for navigating this repo
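
The layout above notes that generate_tutorial.py AST-walks the generator sources to produce a .py and an .ipynb. The real source format is described in tutorials/README.md and is not reproduced here. The sketch below only illustrates the general technique under an assumed convention: a bare top-level string literal becomes a markdown cell and every other top-level statement becomes a code cell. `source_to_notebook` is a hypothetical name, not a function from this repository.

```python
import ast
import json


def source_to_notebook(source: str) -> dict:
    """Split Python source into a minimal nbformat-4 style notebook dict.

    Assumed convention (not the repo's actual one): bare top-level string
    literals are narration, everything else is code.
    """
    tree = ast.parse(source)
    cells = []
    for node in tree.body:
        is_bare_string = (
            isinstance(node, ast.Expr)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        )
        if is_bare_string:
            cells.append({
                "cell_type": "markdown",
                "metadata": {},
                "source": node.value.value.strip(),
            })
        else:
            # get_source_segment returns the statement's original text;
            # note that it does not include decorator lines.
            cells.append({
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": ast.get_source_segment(source, node),
            })
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


if __name__ == "__main__":
    demo = '"""# Hello"""\nx = 1\nprint(x)\n'
    print(json.dumps(source_to_notebook(demo), indent=2))
```

Walking the parsed tree rather than splitting on text markers means cell boundaries always fall on whole statements.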

Dev setup

# editable install + pinned dev deps (pre-commit, etc.)
uv sync

# optional: register this venv as a Jupyter kernel for notebook work
uv run python -m ipykernel install --user --name=modal-training-gym

# install the pre-commit hook locally
uv run pre-commit install

Python is pinned to 3.12 (see .python-version and pyproject.toml). Modal's @app.function(serialized=True) requires the local and remote Python versions to match, and the framework images we ship (slime nightly, NeMo 25.11) all use Python 3.12.
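
Since a local interpreter mismatch tends to surface late, a quick guard at the top of a script can catch it early. This is a minimal sketch using only the standard library; `python_matches` and `REQUIRED` are names introduced here for illustration, and the repo may already enforce the pin some other way.

```python
import sys

REQUIRED = (3, 12)  # major, minor taken from the pin described above


def python_matches(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True when the interpreter's major.minor equals the required pair."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    if not python_matches():
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in REQUIRED)
        raise SystemExit(f"Expected Python {wanted}, found {found}")
    print("Python version OK")
```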

Authoring a new tutorial

See tutorials/README.md for the generator-source format and the per-tutorial TUTORIAL_METADATA schema.

Contributing a new recipe

Add a train recipe under modal_training_gym/train_recipes/, or a deploy recipe under modal_training_gym/deploy_recipes/.
If the recipe needs new runtime behavior, wire it into the relevant launcher or serving builder under modal_training_gym/frameworks/ or modal_training_gym/deploy_recipes/*/serve_*.py.
Add or update a source tutorial under tutorials/tutorial_generator/<bucket>/ and run the generator.
Keep shared container objects (dataset, model, wandb, eval) framework-agnostic — recipe layers do the translation into engine-specific flags.
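
The last point, keeping shared objects framework-agnostic while recipe layers translate them, can be pictured roughly as follows. Everything in this sketch is hypothetical: `ModelSpec`, its fields, the two "engines", and their flag names are invented for illustration and do not correspond to classes in this package or to flags of any real engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Framework-agnostic description of a model (hypothetical fields)."""

    name: str
    tensor_parallel: int = 1


def to_engine_a_flags(model: ModelSpec) -> list[str]:
    """Recipe layer for an imaginary 'engine A' (made-up flag names)."""
    return ["--engine-a-model", model.name, "--engine-a-shards", str(model.tensor_parallel)]


def to_engine_b_flags(model: ModelSpec) -> list[str]:
    """Recipe layer for an imaginary 'engine B' (made-up flag names)."""
    return [f"--engine-b-model={model.name}", f"--engine-b-tp={model.tensor_parallel}"]


if __name__ == "__main__":
    spec = ModelSpec(name="Qwen/Qwen3-4B", tensor_parallel=2)
    print(to_engine_a_flags(spec))
    print(to_engine_b_flags(spec))
```

The shared object carries no engine vocabulary, so supporting another engine means adding one translation function rather than touching the container.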

Agent guide

Working on this repo with an AI coding agent? The .claude/skills/ directory contains auto-triggering skills for Modal training workflows, example validation, and repo navigation.
