Add single-node budget tutorials (slime/ms_swift/miles) + cost estimation #2

Closed
devin-ai-integration[bot] wants to merge 4 commits into joy/initial-setup from devin/1777302694-single-node-tutorials-cost-estimation

devin-ai-integration Bot commented Apr 27, 2026

Summary

Adds three new single-node, budget-GPU tutorials using Qwen3-0.6B, plus cost estimation infrastructure so users can see what each tutorial costs before launching.

New tutorials

| Tutorial | Framework | GPU | Est. Cost/hr |
| --- | --- | --- | --- |
| sft/002_ms_swift_qwen3_0_6b | ms_swift | 1×A100-40GB | ~$2.10 |
| rl/005_slime_single_gpu | slime | 2×A100 | ~$5.00 |
| rl/005_miles_intro | miles | 8×A100 | ~$20.00 |

All three use Qwen3-0.6B and target cheaper GPU tiers (A100, A100-40GB) instead of H100. Each includes cost callouts in the narrative and smoke-run defaults (train_iters=5, small batch, tiny dataset slice), and each retargets the model's preset via a lightweight subclass.
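For readers skimming the PR, the "lightweight subclass" pattern might look roughly like the sketch below. The class names, field names, and base-preset defaults are hypothetical placeholders, not the actual preset classes in modal_training_gym; only `train_iters=5`, the Qwen3-0.6B model, and the A100-40GB target come from this description.

```python
from dataclasses import dataclass


@dataclass
class Qwen3Preset:
    """Hypothetical stand-in for a model preset; the real class may differ."""
    model_name: str = "Qwen/Qwen3-0.6B"
    gpu_type: str = "H100"        # placeholder default, assumed for illustration
    n_gpus: int = 8               # placeholder default
    train_iters: int = 1000       # placeholder default
    global_batch_size: int = 128  # placeholder default
    dataset_slice: str = "train"  # placeholder default


@dataclass
class Qwen3BudgetSmoke(Qwen3Preset):
    """Retargets the preset to a cheaper GPU tier with smoke-run defaults."""
    gpu_type: str = "A100-40GB"
    n_gpus: int = 1
    train_iters: int = 5              # smoke-run value named in this PR
    global_batch_size: int = 8        # "small batch": exact size is assumed
    dataset_slice: str = "train[:64]" # "tiny dataset slice": exact slice is assumed


smoke = Qwen3BudgetSmoke()
print(smoke)
```

Overriding only the defaults keeps the budget variant in sync with the base preset: any field the subclass does not mention is inherited unchanged.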

Cost estimation

  • modal_training_gym/common/cost.py: GPU_HOURLY_PRICES dict + estimate_cost() / format_cost_range() utilities.
  • Runnable notebook cells — every tutorial now has a "Cost estimate" section with a runnable code cell that imports from cost.py and prints estimated costs for smoke runs and full runs. Users can adjust estimated_minutes before launching.
  • README table — new "Est. Cost/hr" column auto-generated from gpu_type + n_gpus metadata fields in each tutorial's TUTORIAL_METADATA.
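A minimal sketch of how these pieces could fit together is shown below. It is not the contents of cost.py: the function signatures and the `format_hourly` helper are assumptions, and the per-GPU rates are back-derived from the table in this PR (for example, 2×A100 at ~$5.00/hr implies ~$2.50 per A100-hour), not taken from an official price list.

```python
# Per-GPU hourly rates in USD, back-derived from this PR's tutorial table.
# Assumed values for illustration; the real GPU_HOURLY_PRICES may differ
# and covers more GPU tiers.
GPU_HOURLY_PRICES = {
    "A100-40GB": 2.10,
    "A100": 2.50,
}


def estimate_cost(gpu_type: str, n_gpus: int, estimated_minutes: float) -> float:
    """Return the estimated cost in USD (assumed signature)."""
    if gpu_type not in GPU_HOURLY_PRICES:
        raise ValueError(f"No hourly price known for GPU type {gpu_type!r}")
    return GPU_HOURLY_PRICES[gpu_type] * n_gpus * estimated_minutes / 60.0


def format_cost_range(low: float, high: float) -> str:
    """Format a low/high cost pair for display (assumed signature)."""
    return f"~${low:.2f} to ${high:.2f}"


def format_hourly(gpu_type: str, n_gpus: int) -> str:
    """Hypothetical helper: the value for the README 'Est. Cost/hr' column."""
    return f"~${estimate_cost(gpu_type, n_gpus, 60):.2f}"


# Metadata shaped like the gpu_type + n_gpus fields described above.
TUTORIALS = {
    "sft/002_ms_swift_qwen3_0_6b": {"gpu_type": "A100-40GB", "n_gpus": 1},
    "rl/005_slime_single_gpu": {"gpu_type": "A100", "n_gpus": 2},
    "rl/005_miles_intro": {"gpu_type": "A100", "n_gpus": 8},
}

estimated_minutes = 10  # assumed smoke-run duration; adjust before launching

for name, meta in TUTORIALS.items():
    smoke = estimate_cost(meta["gpu_type"], meta["n_gpus"], estimated_minutes)
    full = estimate_cost(meta["gpu_type"], meta["n_gpus"], 60)
    hourly = format_hourly(meta["gpu_type"], meta["n_gpus"])
    print(f"{name}: {hourly}/hr, {format_cost_range(smoke, full)}")
```

Deriving the README column from the same price table as the notebook cells means a price update only has to happen in one place.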

Infrastructure

  • GPUType narrowed to supported tiers: A100, A100-40GB, A100-80GB, H100, H200, B200.
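One way such a narrowed type could be expressed is with `typing.Literal`, sketched below. This assumes GPUType is a string-literal type; the actual definition in the repo may be an enum or something else, and `validate_gpu_type` is a hypothetical helper.

```python
from typing import Literal, get_args

# The six supported tiers listed in this PR.
GPUType = Literal["A100", "A100-40GB", "A100-80GB", "H100", "H200", "B200"]

# Tuple of the allowed strings, extracted from the Literal for runtime checks.
SUPPORTED_GPUS = get_args(GPUType)


def validate_gpu_type(value: str) -> str:
    """Hypothetical runtime check mirroring the static GPUType."""
    if value not in SUPPORTED_GPUS:
        raise ValueError(
            f"Unsupported GPU type {value!r}; expected one of {SUPPORTED_GPUS}"
        )
    return value


print(validate_gpu_type("A100-40GB"))
```

With this shape, a removed tier such as A10 is rejected both by a type checker and at runtime.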

Checklist

  • Example is documented with comments throughout, in a Literate Programming style.
  • Example does not require third-party dependencies to be installed locally
  • Example pins its dependencies
    • Example pins container images to a stable tag, not a dynamic tag like latest
    • Example specifies a python_version for the base image, if it is used
    • Example pins all dependencies to at least minor version, ~=x.y.z or ==x.y
    • Example dependencies with version < 1 are pinned to patch version, ==0.y.z

Link to Devin session: https://app.devin.ai/sessions/21d21063333249d292cbbab7064ff905
Requested by: @joyliu-q

- Expand GPUType to include T4, L4, A10, L40S, A100 (all variants)
- Add modal_training_gym/common/cost.py with GPU price table + estimate_cost()
- Add 'Est. Cost/hr' column to tutorials/README.md (auto-generated from
  gpu_type + n_gpus metadata fields)
- New tutorials:
  - sft/002_ms_swift_qwen3_0_6b: Qwen3-0.6B LoRA SFT on 1×A10 (~$1.10/hr)
  - rl/005_slime_single_gpu: Qwen3-0.6B GRPO on 2×A100 (~$5.00/hr)
  - rl/005_miles_intro: Qwen3-0.6B GRPO with Miles on 8×A100 (~$20.00/hr)
- Add cost callouts to existing tutorial narratives (001_slime_intro, 001_ms_swift)
- Add gpu_type/n_gpus metadata to all existing tutorials

Co-Authored-By: Joy <joyliu.q@gmail.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

devin-ai-integration Bot and others added 3 commits April 27, 2026 15:26
Every tutorial now has a 'Cost estimate' section with a runnable code
cell that imports from modal_training_gym.common.cost and prints the
estimated cost for the smoke run and a 1-hour full run. Users can adjust
estimated_minutes before launching.

Co-Authored-By: Joy <joyliu.q@gmail.com>
Remove T4, L4, A10, L40S, B300 from supported GPU list. Retarget
the ms_swift budget tutorial from A10 to A100-40GB. Update cost
tables in cost.py and generate_tutorial.py to match.

Co-Authored-By: Joy <joyliu.q@gmail.com>
Co-Authored-By: Joy <joyliu.q@gmail.com>
joyliu-q closed this May 21, 2026