verl-vla

Experimental VLA training support built on top of verl, currently focused on PI0.5 workflows for Libero, Isaac, and LeRobot-style SFT.

Supported Simulators

Simulator	Env Name	Key differences	Benchmark data source
MuJoCo	`LiberoEnv`	1. Initializes each task from the `init_states` in the Libero dataset. 2. Each env can run a different task.	https://github.com/Lifelong-Robot-Learning/LIBERO
IsaacSim	`IsaacEnv`	1. Initializes from randomized states with more variety than the dataset `init_states`. 2. All envs in a sim process must use the same task.	https://huggingface.co/datasets/china-sae-robotics/IsaacLabPlayGround_Dataset
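The task-assignment difference between the two simulators can be illustrated with a minimal sketch. The function names and the round-robin policy below are hypothetical and are not part of the verl-vla API; they only show the constraint that `LiberoEnv` lets each env take its own task, while `IsaacEnv` needs one task per sim process.

```python
from typing import List


def assign_tasks_per_env(task_ids: List[int], num_envs: int) -> List[int]:
    """LiberoEnv-style (assumed policy): each env may get a different task."""
    # Round-robin over the task list, one task id per env.
    return [task_ids[i % len(task_ids)] for i in range(num_envs)]


def assign_tasks_per_process(
    task_ids: List[int], num_processes: int, envs_per_process: int
) -> List[List[int]]:
    """IsaacEnv-style (assumed policy): all envs in one sim process share a task."""
    # Round-robin over processes; every env inside a process repeats its task id.
    return [
        [task_ids[p % len(task_ids)]] * envs_per_process
        for p in range(num_processes)
    ]


if __name__ == "__main__":
    print(assign_tasks_per_env([0, 1, 2], num_envs=5))
    print(assign_tasks_per_process([0, 1, 2], num_processes=2, envs_per_process=3))
```

Under this sketch, covering many tasks with `IsaacEnv` means running several sim processes, whereas a single `LiberoEnv` batch can mix tasks.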

Hardware Requirements

Simulator GPU: NVIDIA L20 or L40 with 48 GB of memory and RT Cores

Notes:

MuJoCo can fall back to CPU mode, with degraded performance, if RT Cores are unavailable.
IsaacSim requires GPUs with RT Cores.
RTX GPU support is planned, but RTX GPUs currently do not work well in colocated mode because of memory limits.

Docker Image

Isaac Lab support for Libero depends on RobotLearningLab from The Isaac Lab Project Developers team. It is currently bundled into the preview image below.

Example image:

vemlp-demo-cn-beijing.cr.volces.com/verl/pi05-libero-sac:v0.2

Dataset Preparation

Libero parquet generation script:

python scripts/prepare_libero_dataset.py

Before generating data, adjust the paths inside the script, or in your environment, to match your setup.

Training Entry

The default training entry is currently SAC. The repo also includes a LeRobot-based SFT entry.

Main Python entry:

python -m verl_vla.trainer.main_sac

Recommended launcher script:

bash examples/libero_sac/run_pi05_libero_sac.sh

Disaggregated launcher:

bash examples/libero_sac/run_pi05_libero_sac_disagg.sh

SFT entry:

python -m verl_vla.trainer.main_sft

Recommended SFT launcher:

bash examples/lerobot_sft/run_pi05_lerobot_sft.sh

Disaggregation Mode

Train-rollout workers and simulation workers can be placed on different nodes.

Start Ray on the main train-rollout node:

ray start --head --dashboard-host=0.0.0.0 --resources='{"train_rollout": 1}'

Start Ray on each simulation node:

ray start --address='<main_node_ip>:6379' --resources='{"sim": 1}'

Then launch training on the main node only.
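Before launching, it can help to confirm that the custom resources from the `ray start` commands above are registered in the cluster. The following is a minimal sketch, not part of verl-vla: the helper names and the `REQUIRED` amounts are assumptions (one `train_rollout` node and at least one `sim` node). It relies only on `ray.init(address="auto")` and `ray.cluster_resources()`, which returns a dict of resource name to total amount.

```python
from typing import Dict

# Assumed minimum: one train-rollout node and at least one simulation node.
REQUIRED: Dict[str, float] = {"train_rollout": 1.0, "sim": 1.0}


def missing_resources(
    cluster: Dict[str, float], required: Dict[str, float] = REQUIRED
) -> Dict[str, float]:
    """Return the shortfall for each required custom resource (empty if none)."""
    return {
        name: need - cluster.get(name, 0.0)
        for name, need in required.items()
        if cluster.get(name, 0.0) < need
    }


def check_live_cluster() -> Dict[str, float]:
    """Query the running Ray cluster; call this on the main node."""
    import ray  # imported lazily so the pure helper above works without Ray

    ray.init(address="auto")  # attach to the cluster started with `ray start`
    return missing_resources(ray.cluster_resources())


# Example on the main node:
#   shortfall = check_live_cluster()
#   if shortfall: print("Missing Ray resources:", shortfall)
```

An empty result means every required resource is present. A non-empty result names the resource whose `ray start` command has not yet joined the cluster.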
