
---
title: W&B Training
description: Post-train your models using reinforcement learning
mode: wide
---

Now in public preview, W&B Training offers serverless reinforcement learning (RL) for post-training large language models (LLMs). It aims to make models more reliable on multi-turn, agentic tasks while also increasing speed and reducing costs. RL is a training technique in which models learn to improve their behavior through feedback on their outputs.

W&B Training integrates with:

- ART, a flexible RL fine-tuning framework.
- RULER, a universal verifier.
- A fully managed backend on CoreWeave Cloud.

To get started, satisfy the prerequisites for using the service, then see OpenPipe's Serverless RL quickstart to learn how to post-train your models.
