Democratizing Reinforcement Learning for LLMs
Updated Jan 10, 2026 - Python
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
RL training of the Qwen3-Base family of models on GSM8K using verl: is there an RL power law on downstream tasks?
Using automated curriculum learning to improve the RL training process for LLMs.
Sample for Fine-Tuning LLMs & VLMs
🌐 Streamline LLM development with ready-to-use environment templates for efficient setup and deployment.
A list of uv environment templates for LLM development.