[doc] feat: add effectiveness figures and example recipes to README #26
Code Review
This pull request enhances the README.md by adding an 'Effectiveness' section with performance comparisons and an 'Example Recipes' section containing a comprehensive table of training scripts. The review feedback identified several areas for improvement, including ensuring model naming consistency, adding a required environment variable to an example command, and correcting an algorithm label in the recipe table to align with the actual script implementation.
## Effectiveness

> **Qwen3-30B-A3B · REINFORCE++ · DAPO dataset**
```bash
bash examples/getting_started/run_qwen3_1b7.sh
# override paths via env vars
model_dir=/path/to/model data_dir=/path/to/data bash examples/moe/run_qwen3_30B_A3B_dapo.sh
```
The example command is missing the `test_path` environment variable. The script `examples/moe/run_qwen3_30B_A3B_dapo.sh` explicitly requires `test_path` to locate the validation dataset (see line 36 of that script). Without it, the command will fail for users who do not have the default Arnold-style mount.
Suggested change:
- model_dir=/path/to/model data_dir=/path/to/data bash examples/moe/run_qwen3_30B_A3B_dapo.sh
+ model_dir=/path/to/model data_dir=/path/to/data test_path=/path/to/test bash examples/moe/run_qwen3_30B_A3B_dapo.sh
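A launch script could also guard against this class of failure by checking its required path variables up front. The sketch below is hypothetical and is not taken from `run_qwen3_30B_A3B_dapo.sh`; the helper name `require_env` is an assumption introduced purely for illustration. It shows only the general pattern of failing early with a readable message when a variable such as `test_path` is unset.

```bash
# Hypothetical helper (not part of the repository): print the value of the
# environment variable named by $1, or fail with a readable error if that
# variable is unset or empty.
require_env() {
  local name="$1"
  local value="${!name:-}"   # bash indirect expansion; empty string if unset
  if [[ -z "$value" ]]; then
    echo "error: $name is not set; pass $name=/path/to/... before the command" >&2
    return 1
  fi
  printf '%s\n' "$value"
}
```

A script using this pattern might run `test_path="$(require_env test_path)" || exit 1` near the top, so that a missing variable is reported before any training setup starts.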
| Recipe | Model | Dataset | Hardware | Algorithm |
|---|---|---|---|---|
| [`getting_started/run_qwen3_1b7.sh`](examples/getting_started/run_qwen3_1b7.sh) | Qwen3-1.7B | gsm8k | 1×8H100 | GRPO |
| [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | DAPO |
The algorithm for this recipe is listed as DAPO, but the corresponding script `examples/moe/run_qwen3_30B_A3B_dapo.sh` sets `algorithm.adv_estimator=grpo` (line 34). Please ensure the algorithm name in the table accurately reflects the implementation in the script.
Suggested change:
- | [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | DAPO |
+ | [`moe/run_qwen3_30B_A3B_dapo.sh`](examples/moe/run_qwen3_30B_A3B_dapo.sh) | Qwen3-30B-A3B | DAPO-Math-17k / AIME 2025 | 1×8H100 | GRPO |
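To keep the Algorithm column in sync with the scripts, the estimator can be read straight from each recipe rather than inferred from the file name. This is a minimal sketch that assumes the setting appears literally as `algorithm.adv_estimator=<value>` somewhere in the script, as the comment above describes; the function name `adv_estimator_of` is hypothetical.

```bash
# Hypothetical helper: print the first value assigned to
# algorithm.adv_estimator in the script given as $1 (prints nothing if the
# setting is absent).
adv_estimator_of() {
  local script="$1"
  grep -oE 'algorithm\.adv_estimator=[A-Za-z0-9_+-]+' "$script" \
    | head -n 1 \
    | cut -d= -f2
}
```

Running `adv_estimator_of examples/moe/run_qwen3_30B_A3B_dapo.sh` from the repository root would then show the value to compare against the table entry.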