verl-mint

Open MinT training runtime on veRL.

MinT SDK . Documentation . Quickstart . MinT Console . Cookbook . veRL

What verl-mint is

verl-mint is the commercial open-source edition of the MinT training runtime. The commercial MinT SaaS gives users hosted training infrastructure through the MinT SDK. verl-mint exposes the same training mental model for teams that want to run MinT-style workloads on their own GPUs, their own Ray or veRL cluster, and their own artifact storage.

The server exposes the MinT training workflow at the API boundary, so clients can keep familiar session, checkpoint, sampler, and future semantics while moving execution onto the MinT and veRL stack.

The project is built in collaboration with veRL as the training execution foundation. MinT exposes a client-facing workflow for creating a training session, running SFT or RL updates, saving checkpoints, handing weights to samplers, collecting rollouts, and continuing from saved state.

Support matrix

Algorithms

Algorithm	Status	Notes
SFT	Available	LoRA SFT, checkpoint save/load, resume, sampler handoff
GRPO	Available	Group Relative Policy Optimization; group-relative advantages without a value critic
DPO	Planned	Preference optimization track

Models

Model family	Size	Status	Notes
Qwen	0.6B	Available	veRL trainer smoke path
Qwen	4B	Available	veRL model-backed run
Qwen	8B	Planned	Not yet validated
Qwen	30B	Available	veRL/Ray run

Install

git clone https://github.com/verl-project/verl-mint.git
cd verl-mint
uv sync --extra smoke

The smoke extra installs the official MinT SDK from the MindLab repository: mindlab-toolkit @ git+https://github.com/MindLab-Research/mindlab-toolkit.git.

Local SDK checkouts are useful for reading source, but smoke runs should use the installed mindlab-toolkit dependency rather than PYTHONPATH.

For cluster runs, verl-mint assumes the execution environment already exists; it does not provision Ray, GPUs, or worker images.

Prerequisite	Requirement
Ray	`ray[default]>=2.46.0,<3` on the API node and every worker
veRL runtime	verl-project/verl importable in the Python environment used by Ray actors
Storage	a shared artifact root visible to the API node and workers; copy `.env.example` to `.env` and set `VERL_MINT_STORAGE_ROOT` plus `VERL_MINT_SHARED_STORAGE_ROOTS`

Install the Ray client/runtime dependency on the API node with:

uv sync --extra ray

Quick start

Path	Entry point
Smoke server: contract	`scripts/smoke_fake_backend.py`
Smoke server: Qwen SFT diagnostic	`scripts/smoke_qwen_sft.py`
Smoke server: veRL trainer job	`scripts/smoke_verl_ppo_job.py`
Remote SDK client smoke	`scripts/client_verl_ppo_job_smoke.py --base-url http://<host>:8000`

Run the contract smoke with a lightweight backend:

uv run --env-file /dev/null --extra smoke \
  python scripts/smoke_fake_backend.py

Run the local Qwen SFT diagnostic path. This checks API compatibility with a small local PyTorch backend; it is not the recommended model-backed training path:

uv run --env-file /dev/null --extra smoke \
  python scripts/smoke_qwen_sft.py

Run an SDK-driven veRL trainer job smoke:

export VERL_CONFIG_DIR=/workspace/verl/verl/trainer/config

uv run --env-file /dev/null --extra smoke --extra ray \
  python scripts/smoke_verl_ppo_job.py \
  --config-dir "$VERL_CONFIG_DIR" \
  --model-path Qwen/Qwen3-0.6B \
  --train-files /path/to/train.parquet \
  --val-files /path/to/val.parquet \
  --total-steps 1

All client_* scripts exercise an already-running service through the installed MinT SDK and require --base-url. Scripts that start a local smoke server and immediately run the matching SDK scenario live under scripts/smoke_*.py. Scripts that post directly to server routes are server/API diagnostics, not client contract tests, and should not be named or treated as client smoke tests.

Simple cluster service

For the open-source runtime, keep deployment simple: start one HTTP service on a network-reachable cluster node, point the official SDK at it, run a smoke or example, and stop the service with Ctrl-C or SIGTERM. This is not a production deployment recipe.

Start the service with the veRL backend:

export VERL_CONFIG_DIR=/workspace/verl/verl/trainer/config
export VERL_MINT_STORAGE_ROOT=/shared/path/verl-mint

uv run --extra smoke --extra ray \
  verl-mint serve \
  --backend verl \
  --host 0.0.0.0 \
  --port 8000 \
  --storage-root "$VERL_MINT_STORAGE_ROOT" \
  --model-path Qwen/Qwen3-0.6B \
  --verl-config-dir "$VERL_CONFIG_DIR" \
  --override data.train_files=/shared/path/data/train.parquet \
  --override data.val_files=/shared/path/data/val.parquet \
  --override trainer.total_training_steps=1

From any non-Ray host that can reach the service:

import mint

client = mint.ServiceClient(
    base_url="http://<cluster-node-ip>:8000",
    api_key="sk-local-smoke",
)
training = client.create_lora_training_client("Qwen/Qwen3-0.6B", rank=16)
print(training.get_info())

Or run a packaged SDK client smoke against that service:

uv run --env-file /dev/null --extra client \
  python scripts/client_verl_ppo_job_smoke.py \
  --base-url http://<cluster-node-ip>:8000 \
  --base-model Qwen/Qwen3-0.6B

Ray cluster run

verl-mint can move model execution into Ray or veRL workers while keeping the FastAPI process as the control plane. Start from an existing Ray 2.x cluster using ray[default]>=2.46.0,<3 where the API node and workers share the same Python dependencies and artifact root. For open-source use, the expected deployment shape is one service process on the Ray or veRL cluster plus clients on any network-reachable non-Ray host.

Use the service command above on the Ray or veRL cluster node, then run a client_* smoke from a network-reachable non-Ray host with --base-url. The scripts/smoke_*.py entries are local one-shot smoke servers; they are not the remote-client path.

Architecture

verl-mint runs a FastAPI control plane that accepts MinT-style training requests, routes them through backend abstractions, and stores checkpoint artifacts through a configurable storage repository. Local diagnostics can use the lightweight fake and Qwen backends, but the recommended model-backed path uses veRL trainer execution while keeping the API process responsible for sessions, futures, checkpoint metadata, and sampler handoff.

API surface

verl-mint exposes MinT-style routes for:

model lifecycle: create, load from state, unload, inspect
training: forward, forward-backward, optimizer step, train step
RL: GRPO/reverse-KL, rollout collection, train-on-experience
inference: sample and compute logprobs
checkpoints: save state, load state, export weights for sampler
futures: retrieve asynchronous training results

Checkpoint endpoints keep save_state/load_state aliases for clients that use state-oriented wording; the runtime stores and returns the same underlying model weights artifacts.

Contributing

See CONTRIBUTING.md for development setup, test commands, pull request expectations, and release versioning. Release notes are tracked in CHANGELOG.md.

MinT ecosystem

Repository	Role
mindlab-toolkit	Python SDK used by MinT users and examples
MinT Console	SaaS product entry point
mint-doc	Product and API documentation
mint-quickstart	First-run tutorials and migration examples
mint-cookbook	Reproducible MinT training recipes
verl	veRL training execution foundation used by the open runtime

Research notes

Requirements

Python >=3.11
CUDA GPU for practical model training
torch, transformers, peft, and accelerate for local diagnostics
ray[default]>=2.46.0,<3 for distributed execution
shared storage for multi-node checkpoint save and load

Citation

If you use verl-mint, please cite the MinT technical report:

@misc{mindlab2026mint,
  title = {MinT: Managed Infrastructure for Training and Serving Millions of LLMs},
  author = {{Mind Lab}},
  year = {2026},
  eprint = {2605.13779},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  doi = {10.48550/arXiv.2605.13779},
  url = {https://arxiv.org/abs/2605.13779},
}

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
scripts		scripts
src/verl_mint		src/verl_mint
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
README_zh.md		README_zh.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock
uv.toml		uv.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

verl-mint

What verl-mint is

Support matrix

Algorithms

Models

Install

Quick start

Simple cluster service

Ray cluster run

Architecture

API surface

Contributing

MinT ecosystem

Research notes

Requirements

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

verl-mint

What verl-mint is

Support matrix

Algorithms

Models

Install

Quick start

Simple cluster service

Ray cluster run

Architecture

API surface

Contributing

MinT ecosystem

Research notes

Requirements

Citation

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages