Add single-node budget tutorials (slime/ms_swift/miles) + cost estimation by devin-ai-integration[bot] · Pull Request #2 · modal-projects/training-gym

devin-ai-integration · 2026-04-27T15:16:35Z

Summary

Adds three new single-node, budget-GPU tutorials using Qwen3-0.6B, plus cost estimation infrastructure so users can see what each tutorial costs before launching.

New tutorials

| Tutorial | Framework | GPU | Est. Cost/hr |
| --- | --- | --- | --- |
| `sft/002_ms_swift_qwen3_0_6b` | ms_swift | 1×A100-40GB | ~$2.10 |
| `rl/005_slime_single_gpu` | slime | 2×A100 | ~$5.00 |
| `rl/005_miles_intro` | miles | 8×A100 | ~$20.00 |

All three use Qwen3-0.6B and target cheaper GPU tiers (A100, A100-40GB) instead of H100. Each includes cost callouts in the narrative, smoke-run defaults (train_iters=5, small batch, tiny dataset slice), and retargets the model's preset via a lightweight subclass.
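A minimal sketch of what "retargeting a preset via a lightweight subclass" could look like. The class names, fields, and the base defaults for `n_gpus` and `train_iters` are hypothetical placeholders; the actual preset classes in this repository are not shown in the PR description.

```python
from dataclasses import dataclass


@dataclass
class Qwen3Preset:
    """Hypothetical stand-in for a model preset; not the repository's real class."""
    model_name: str = "Qwen/Qwen3-0.6B"
    gpu_type: str = "H100"      # the tier the budget tutorials move away from
    n_gpus: int = 8             # placeholder default
    train_iters: int = 1000     # placeholder default


@dataclass
class BudgetQwen3Preset(Qwen3Preset):
    """Overrides only the fields that change for a budget smoke run."""
    gpu_type: str = "A100"
    n_gpus: int = 2
    train_iters: int = 5        # smoke-run default named in the PR summary


preset = BudgetQwen3Preset()
print(preset.model_name, preset.gpu_type, preset.n_gpus, preset.train_iters)
```

Keeping the override in a subclass leaves the base preset untouched, so the same model definition can back both the budget and the full-size tutorials.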

Cost estimation

- `modal_training_gym/common/cost.py`: `GPU_HOURLY_PRICES` dict plus `estimate_cost()` / `format_cost_range()` utilities.
- Runnable notebook cells: every tutorial now has a "Cost estimate" section with a runnable code cell that imports from `cost.py` and prints estimated costs for smoke runs and full runs. Users can adjust `estimated_minutes` before launching.
- README table: new "Est. Cost/hr" column auto-generated from the `gpu_type` + `n_gpus` metadata fields in each tutorial's `TUTORIAL_METADATA`.
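The cost utilities could be sketched roughly as follows. The names `GPU_HOURLY_PRICES`, `estimate_cost`, `format_cost_range`, `gpu_type`, `n_gpus`, and `estimated_minutes` come from the PR description; the signatures, the `hourly_cost_cell` helper, and the per-GPU prices (back-derived from the tutorial table above) are assumptions rather than the actual contents of `cost.py`.

```python
# Sketch only: signatures and prices are assumptions, not the real cost.py.

# Per-GPU hourly prices implied by the tutorial table
# (1×A100-40GB at ~$2.10/hr, 2×A100 at ~$5.00/hr). Real prices may differ.
GPU_HOURLY_PRICES = {
    "A100-40GB": 2.10,
    "A100": 2.50,
}


def estimate_cost(gpu_type: str, n_gpus: int, estimated_minutes: float) -> float:
    """Estimated dollar cost: per-GPU hourly price × GPU count × hours."""
    hourly = GPU_HOURLY_PRICES[gpu_type] * n_gpus
    return hourly * estimated_minutes / 60.0


def format_cost_range(low: float, high: float) -> str:
    """Render a low/high estimate as a short human-readable range."""
    return f"~${low:.2f} to ~${high:.2f}"


def hourly_cost_cell(metadata: dict) -> str:
    """Hypothetical helper: README 'Est. Cost/hr' cell from tutorial metadata."""
    return f"~${estimate_cost(metadata['gpu_type'], metadata['n_gpus'], 60):.2f}"


# A notebook-style cost cell: a short smoke run versus a 1-hour full run.
estimated_minutes = 10  # adjust before launching
smoke = estimate_cost("A100", 2, estimated_minutes)
full = estimate_cost("A100", 2, 60)
print(format_cost_range(smoke, full))
print(hourly_cost_cell({"gpu_type": "A100", "n_gpus": 2}))
```

Deriving the README column from the same price table as the notebook cells means a price change only has to be made in one place.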

Infrastructure

GPUType narrowed to supported tiers: A100, A100-40GB, A100-80GB, H100, H200, B200.
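One way such a narrowed type could be expressed is with `typing.Literal`, as sketched below. Whether `GPUType` is actually a `Literal`, an enum, or something else is not stated in the PR, and `validate_gpu_type` is a hypothetical helper added for illustration.

```python
from typing import Literal, get_args

# Assumed representation; the six tiers are the ones listed in the PR.
GPUType = Literal["A100", "A100-40GB", "A100-80GB", "H100", "H200", "B200"]


def validate_gpu_type(value: str) -> str:
    """Hypothetical runtime check mirroring the static type."""
    if value not in get_args(GPUType):
        raise ValueError(f"Unsupported GPU type: {value!r}")
    return value


print(validate_gpu_type("A100-40GB"))
```

With this shape, a removed tier such as `A10` fails both static type checking and the runtime check.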

Checklist

- Example is documented with comments throughout, in a Literate Programming style
- Example does not require third-party dependencies to be installed locally
- Example pins its dependencies
  - Example pins container images to a stable tag, not a dynamic tag like latest
  - Example specifies a python_version for the base image, if it is used
  - Example pins all dependencies to at least minor version, ~=x.y.z or ==x.y
  - Example dependencies with version < 1 are pinned to patch version, ==0.y.z

Link to Devin session: https://app.devin.ai/sessions/21d21063333249d292cbbab7064ff905
Requested by: @joyliu-q

- Expand GPUType to include T4, L4, A10, L40S, A100 (all variants)
- Add modal_training_gym/common/cost.py with GPU price table + estimate_cost()
- Add 'Est. Cost/hr' column to tutorials/README.md (auto-generated from gpu_type + n_gpus metadata fields)
- New tutorials:
  - sft/002_ms_swift_qwen3_0_6b: Qwen3-0.6B LoRA SFT on 1×A10 (~$1.10/hr)
  - rl/005_slime_single_gpu: Qwen3-0.6B GRPO on 2×A100 (~$5.00/hr)
  - rl/005_miles_intro: Qwen3-0.6B GRPO with Miles on 8×A100 (~$20.00/hr)
- Add cost callouts to existing tutorial narratives (001_slime_intro, 001_ms_swift)
- Add gpu_type/n_gpus metadata to all existing tutorials

Co-Authored-By: Joy <joyliu.q@gmail.com>

devin-ai-integration · 2026-04-27T15:18:07Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them


Every tutorial now has a 'Cost estimate' section with a runnable code cell that imports from modal_training_gym.common.cost and prints the estimated cost for the smoke run and a 1-hour full run. Users can adjust estimated_minutes before launching.

Co-Authored-By: Joy <joyliu.q@gmail.com>


devin-ai-integration Bot assigned joyliu-q Apr 27, 2026

devin-ai-integration Bot and others added 3 commits April 27, 2026 15:26

Narrow GPUType to A100/A100-40GB/A100-80GB/H100/H200/B200

2c57cce

Remove T4, L4, A10, L40S, B300 from supported GPU list. Retarget the ms_swift budget tutorial from A10 to A100-40GB. Update cost tables in cost.py and generate_tutorial.py to match.

Co-Authored-By: Joy <joyliu.q@gmail.com>

Fix stale A10 reference in cost.py docstring example

afbbbfa

Co-Authored-By: Joy <joyliu.q@gmail.com>

joyliu-q closed this May 21, 2026

devin-ai-integration[bot] wants to merge 4 commits into joy/initial-setup from devin/1777302694-single-node-tutorials-cost-estimation
