[doc] feat: add effectiveness figures and example recipes to README by Luosuu · Pull Request #26 · verl-project/vexact

Luosuu · 2026-05-13T23:19:27Z

What does this PR do?

Concise overview of the change. Reference related issues/PRs.

Checklist Before Starting

Search for related PRs/issues and link them here: ...
PR title follows [{modules}] {type}: {description} format (see check_pr_title.py for the full list of allowed modules and types)
- Breaking changes: prepend [BREAKING] — e.g. [BREAKING][ops] feat: new batch-invariant matmul API
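The title format above can be checked for its general shape before opening a PR. The following is only a sketch: `title_shape_ok` is a hypothetical helper, not part of the repository, and it assumes module names sit in one bracket pair and the type is a lowercase word. It does not know the allowed modules and types; `check_pr_title.py` remains the authority for those.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks only the overall shape of a PR title,
# "[{modules}] {type}: {description}", with an optional leading [BREAKING].
# Assumption: the type is a lowercase word; allowed names are NOT validated here.
title_shape_ok() {
  local title="$1"
  # [^][]+ matches one or more characters that are neither ']' nor '['.
  local re='^(\[BREAKING\])?\[[^][]+\] [a-z]+: .+$'
  [[ "$title" =~ $re ]]
}
```

For example, `title_shape_ok "[doc] feat: add effectiveness figures and example recipes to README"` succeeds, while a title without the bracketed module prefix fails.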

Test

Validation results (numeric checks, benchmark metrics) for changes not covered by CI.

API and Usage Example

Show API changes and usage examples if applicable.

Design & Code Changes

High-level design description and specific change list.

Checklist Before Submitting

Read the Contribute Guide
Applied pre-commit checks (pre-commit run --all-files)
Added/updated documentation
Added tests to CI workflow (or explained why not feasible)

gemini-code-assist

Code Review

This pull request enhances the README.md by adding an 'Effectiveness' section with performance comparisons and an 'Example Recipes' section containing a comprehensive table of training scripts. The review feedback identified several areas for improvement, including ensuring model naming consistency, adding a required environment variable to an example command, and correcting an algorithm label in the recipe table to align with the actual script implementation.

gemini-code-assist · 2026-05-13T23:24:28Z


+## Effectiveness 
+
+> **Qwen3-30B-A3B · REINFORCE++ · DAPO dataset**


For consistency with the recipe table below (line 53), the model name should include the -Base suffix when referring to the REINFORCE++ experiment.

Suggested change

-> **Qwen3-30B-A3B · REINFORCE++ · DAPO dataset**
+> **Qwen3-30B-A3B-Base · REINFORCE++ · DAPO dataset**

gemini-code-assist · 2026-05-13T23:24:28Z

+```bash
+bash examples/getting_started/run_qwen3_1b7.sh
+# override paths via env vars
+model_dir=/path/to/model data_dir=/path/to/data bash examples/moe/run_qwen3_30B_A3B_dapo.sh


The example command is missing the test_path environment variable. The script examples/moe/run_qwen3_30B_A3B_dapo.sh explicitly requires test_path to locate the validation dataset (see line 36 of that script). Without it, the command will fail for users who do not have the default Arnold-style mount.

Suggested change

-model_dir=/path/to/model data_dir=/path/to/data bash examples/moe/run_qwen3_30B_A3B_dapo.sh
+model_dir=/path/to/model data_dir=/path/to/data test_path=/path/to/test bash examples/moe/run_qwen3_30B_A3B_dapo.sh
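One way a recipe script could surface a missing override early, rather than failing later on a default path, is a guard near its top. This is a minimal sketch under assumptions: `require_vars` is a hypothetical function that does not exist in the repository, and the actual script may handle `test_path` differently.

```shell
#!/usr/bin/env bash
# Hypothetical guard: report every named variable that is unset or empty.
# Returns 0 when all are set, 1 otherwise.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset.
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could then call `require_vars model_dir data_dir test_path || exit 1` before launching training. Variables passed on the command line, as in the suggested command, are visible to this check.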

gemini-code-assist · 2026-05-13T23:24:28Z

+| Recipe | Model | Dataset | Hardware | Algorithm |
+|---|---|---|---|---|
+| [`getting_started/run_qwen3_1b7.sh`](examples/getting_started/run_qwen3_1b7.sh) | Qwen3-1.7B | gsm8k | 1×8H100 | GRPO |
+| [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | DAPO |


The algorithm for this recipe is listed as DAPO, but the corresponding script examples/moe/run_qwen3_30B_A3B_dapo.sh sets algorithm.adv_estimator=grpo (line 34). Please ensure the algorithm name in the table accurately reflects the implementation in the script.

Suggested change

-| [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | DAPO |
+| [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | GRPO |
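A table label can be cross-checked against a recipe by reading the estimator override out of the script. The sketch below assumes the value is written literally as `algorithm.adv_estimator=<name>`, as the comment above describes; `adv_estimator_of` is a hypothetical helper, and it prints nothing if the script sets the value through a shell variable instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct literal value assigned to
# algorithm.adv_estimator in the given recipe script.
adv_estimator_of() {
  # -o prints only the matching part; cut keeps the text after '='.
  grep -oE 'algorithm\.adv_estimator=[A-Za-z0-9_+-]+' "$1" | cut -d= -f2 | sort -u
}
```

Running `adv_estimator_of examples/moe/run_qwen3_30B_A3B_dapo.sh` from the repository root would show the value to compare against the Algorithm column.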

[doc] feat: add effectiveness figures and example recipes to README

d3039d2

Luosuu merged commit 67b5380 into verl-project:main May 13, 2026
2 checks passed

gemini-code-assist Bot reviewed May 13, 2026

Luosuu merged 1 commit into verl-project:main from Luosuu:doc/readme-results-and-recipes


