verl

Here are 15 public repositories matching this topic...

rllm-org / rllm

Democratizing Reinforcement Learning for LLMs

machine-learning reinforcement-learning tinker distributed-training ml-infrastructure ml-platform agent-framework search-agent llm-training llm-reasoning agentic-workflow swe-agent verl coding-agent

Updated Jan 10, 2026
Python

TsinghuaC3I / MARTI

A Framework for LLM-based Multi-Agent Reinforced Training and Inference

camel llama gemma multi-agent-systems autogen multi-agent-reinforcement-learning large-language-models qwen large-reasoning-models deepseek-r1 verl openrlhf

Updated Nov 20, 2025
Python

thuml / RLVR-World

Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934

text-game video-generation robotic-manipulation video-prediction web-agent real2sim world-model webarena video-gpt grpo verl rlvr reinforcement-learning-with-verifiable-rewards

Updated Oct 28, 2025
Python

GAIR-NLP / OctoThinker

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

rl llama reasoning post-training pre-training llm qwen verl mid-training

Updated Jul 23, 2025
Jupyter Notebook

NVlabs / GDPO

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

rl reasoning trl llm agentic-ai grpo verl

Updated Jan 9, 2026
Python

Trae1ounG / BuPO

[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

rl interpretability llms llm-reasoning verl

Updated Jan 4, 2026
Python

sylvain-wei / 24-Game-Reasoning

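A very simple reproduction of DeepSeek-R1-Zero and DeepSeek-R1, using the 24 Game as an example. It uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. A clean, minimal, accessible reproduction of DeepSeek R1-Zero and DeepSeek R1.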

alignment reasoning r1 post-training cot sft o1 24game llm rlhf deepseek r1-zero verl long-cot

Updated Apr 5, 2025
Python

josancamon19 / rl-scaling-laws

RL on the Qwen3-base family of models on GSM8K using verl: is there an RL power law on downstream tasks?

rl scaling-laws verl

Updated Oct 19, 2025
Python

Graph-Reasoner / Graph-R1

Long COT RFT and Reinforcement Learning Creates Generalize

ai graph rl reasoning reasoning-language-models verl

Updated Aug 29, 2025
Python

zsychina / Curriculum-LLM

Using automated curriculum learning to enhance LLM's RL training process.

reinforcement-learning curriculum-learning llm qwen verl

Updated Mar 25, 2025
Python

rabiloo / llm-finetuning

Sample for Fine-Tuning LLMs & VLMs

transformers perf moe lora fine-tuning large-language-models llm rlhf qlora qwen llama-factory llama3 grpo verl

Updated Apr 3, 2025
Python

cognichip / Noisy-RL

RLVεR: Reinforcement Learning with Verifiable Noisy Rewards

rl llm grpo verl

Updated Jan 9, 2026
Python

awinml / verl-turing-support

Fork of VeRL to support Turing Family of GPUs

turing verl

Updated Nov 2, 2025
Python

KRESS99 / llm-env-templates

🌐 Streamline LLM development with ready-to-use environment templates for efficient setup and deployment.

python environment deep-learning conda pytorch venv uv llm flash-attn verl openrlhf

Updated Sep 8, 2025

Magnicord / llm-env-templates

A list of uv environments templates for LLM development.

python environment deep-learning conda pytorch venv uv llm flash-attn verl openrlhf

Updated Sep 19, 2025

