feat: add cloud_type parameter for on-demand vs spot GPU selection by slacki-ai · Pull Request #50 · longtermrisk/openweights

slacki-ai · 2026-03-26T12:58:29Z

Summary

Adds a cloud_type parameter ("SECURE" / "ALL" / "COMMUNITY") to control which RunPod cloud tier workers are provisioned on
Stored in the job's params JSONB — no database migration needed
Defaults to "SECURE" (on-demand) for full backward compatibility
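
A minimal sketch of how the value might be stored in and read back from `params`. The helper names `with_cloud_type` and `resolve_cloud_type` are hypothetical and are not taken from this PR; only the three accepted values and the "SECURE" default come from the summary above.

```python
from typing import Any, Dict, Optional

VALID_CLOUD_TYPES = ("SECURE", "ALL", "COMMUNITY")
DEFAULT_CLOUD_TYPE = "SECURE"  # on-demand, per the PR summary


def with_cloud_type(
    params: Optional[Dict[str, Any]], cloud_type: Optional[str] = None
) -> Dict[str, Any]:
    """Return a copy of params with cloud_type set (hypothetical helper)."""
    if cloud_type is not None and cloud_type not in VALID_CLOUD_TYPES:
        raise ValueError(
            f"cloud_type must be one of {VALID_CLOUD_TYPES}, got {cloud_type!r}"
        )
    merged = dict(params or {})
    if cloud_type is not None:
        # Stored inside the existing JSON params, so no new column is needed.
        merged["cloud_type"] = cloud_type
    return merged


def resolve_cloud_type(params: Optional[Dict[str, Any]]) -> str:
    """Read cloud_type back, falling back to the default when it is absent."""
    return (params or {}).get("cloud_type", DEFAULT_CLOUD_TYPE)
```

Because the default is applied on read, jobs created before this change (with no `cloud_type` key, or with `params` set to `None`) would resolve to "SECURE" under this sketch.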

Changes

Threaded through the entire pipeline (8 files):

openweights/cli/exec.py — new --cloud-type CLI argument
openweights/client/jobs.py — base Jobs.create() extracts and stores cloud_type in params
openweights/jobs/inference/__init__.py — InferenceJobs.create()
openweights/jobs/unsloth/__init__.py — FineTuning.create() + LogProb.create()
openweights/jobs/vllm/__init__.py — API.create() + deploy() + multi_deploy()
openweights/jobs/weighted_sft/__init__.py — SFT.create() + MultipleChoice.create() + LogProb.create()
openweights/cluster/org_manager.py — groups jobs by (cloud_type, allowed_hardware) so each worker is launched on the correct tier
openweights/cluster/start_runpod.py — passes cloud_type to create_pod()
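
The grouping step in `org_manager.py` can be illustrated roughly as follows. This is a sketch under assumptions, not the code from the PR: jobs are modeled as plain dicts with `params` and `allowed_hardware` keys, and the names `group_key` and `group_pending_jobs` are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

GroupKey = Tuple[str, Tuple[str, ...]]


def group_key(job: Dict[str, Any]) -> GroupKey:
    """Build the (cloud_type, allowed_hardware) key for one job."""
    params = job.get("params") or {}  # params may be None
    cloud_type = params.get("cloud_type", "SECURE")
    # Sort so that the same hardware set in a different order shares a key.
    hardware = tuple(sorted(job.get("allowed_hardware") or []))
    return cloud_type, hardware


def group_pending_jobs(jobs: List[Dict[str, Any]]) -> Dict[GroupKey, List[Dict[str, Any]]]:
    """Bucket jobs so each bucket can be served by workers on one tier."""
    groups: Dict[GroupKey, List[Dict[str, Any]]] = defaultdict(list)
    for job in jobs:
        groups[group_key(job)].append(job)
    return dict(groups)


# Illustrative job records; the hardware strings are made up for the example.
jobs = [
    {"id": "a", "params": {"cloud_type": "COMMUNITY"}, "allowed_hardware": ["1x A100", "1x H100"]},
    {"id": "b", "params": None, "allowed_hardware": ["1x H100", "1x A100"]},
    {"id": "c", "params": {}, "allowed_hardware": ["1x A100", "1x H100"]},
]
groups = group_pending_jobs(jobs)

# Each key unpacks into the two values a scaling loop would need.
for (cloud_type, hardware), members in groups.items():
    print(cloud_type, hardware, [job["id"] for job in members])
```

In this example, jobs "b" and "c" share a group because both fall back to "SECURE" and list the same hardware, while job "a" gets its own group despite identical hardware.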

Test plan

Submit a job with cloud_type="COMMUNITY" — verify RunPod pod is created as a spot instance
Submit a job with default (no cloud_type) — verify it uses SECURE (on-demand) as before
Submit two jobs with different cloud_type values and same allowed_hardware — verify they get separate workers
CLI: ow exec --cloud-type COMMUNITY "echo test" — verify argument is passed through
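
For the CLI check, the argument could be declared along these lines. This is a standalone `argparse` sketch, assuming a single positional `command`; the real `ow exec` parser in `openweights/cli/exec.py` likely has more options, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser exposing only the arguments relevant here."""
    parser = argparse.ArgumentParser(prog="ow exec")
    parser.add_argument("command", help="shell command to run on the worker")
    parser.add_argument(
        "--cloud-type",
        choices=["SECURE", "ALL", "COMMUNITY"],
        default="SECURE",  # matches the backward-compatible default
        help="RunPod cloud tier to provision the worker on",
    )
    return parser


args = build_parser().parse_args(["--cloud-type", "COMMUNITY", "echo test"])
print(args.cloud_type, args.command)
```

With `choices` set, `argparse` rejects any other value and exits with a usage error, so invalid tiers would fail before a job is submitted.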

🤖 Generated with Claude Code

Add a `cloud_type` parameter ("SECURE", "ALL", or "COMMUNITY") that controls which RunPod cloud tier is used when provisioning workers. This is stored in the job's `params` JSONB (no DB migration needed) and threaded through the entire pipeline:

- CLI: `ow exec --cloud-type COMMUNITY ...`
- Client: `Jobs.create()` base class extracts and stores cloud_type
- All job types: inference, unsloth (fine-tuning + logprob), vllm (create + deploy + multi_deploy), weighted_sft (SFT + MC + logprob)
- Scheduler: groups pending jobs by (cloud_type, allowed_hardware) so each worker lands on the correct RunPod tier
- Worker start: passes cloud_type to RunPod pod creation

Defaults to "SECURE" (on-demand) for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Tests verify that:

- Jobs are grouped by (cloud_type, allowed_hardware) in the scheduler
- cloud_type defaults to "SECURE" when absent or params is None
- Different cloud_type values produce separate worker groups
- Group keys unpack correctly for the scale_workers loop
- CLI parser accepts valid choices and rejects invalid ones

All tests are pure-Python logic checks, no DB or RunPod needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Keep 5 core tests that exercise the actual grouping logic (same group, separate groups, SECURE default, params=None edge case, hardware sort normalization).

Remove 6 tests: argparse-only tests, redundant grouping variants, and Python tuple-unpacking check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nielsrolf · 2026-04-01T09:32:30Z

Why do we need this?

nielsrolf · 2026-04-14T14:39:58Z

Closing because I don't think that we need it, feel free to reopen if this is wrong / explain why we need it

slacki-ai and others added 3 commits March 26, 2026 08:58

nielsrolf closed this Apr 14, 2026

slacki-ai wants to merge 3 commits into longtermrisk:v0.9 from slacki-ai:support_cloud_types
