[docs]: sync args with doc by Ratish1 · Pull Request #594 · radixark/miles

Ratish1 · 2026-02-13T06:21:52Z

This is part 1 of #578. Manual sync of the current docs' parameter descriptions into arguments.py.

gemini-code-assist · 2026-02-13T06:23:18Z

Summary of Changes

Hello @Ratish1, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on improving the documentation and clarity of command-line arguments in miles/utils/arguments.py. By synchronizing argument descriptions with the documentation, it enhances the user experience and makes it easier to configure and use the Miles framework. The changes cover various aspects of the framework, including rollout, training, evaluation, and debugging.

Highlights

Argument Descriptions: This PR synchronizes the descriptions of various command-line arguments in miles/utils/arguments.py with the documentation, improving clarity and consistency.
Rollout Configuration: Several arguments related to rollout, such as temperature, top-p, top-k, max context length, and stop tokens, have updated descriptions to provide more accurate guidance.
Dynamic Sampling: The descriptions for dynamic sampling parameters, including over_sampling_batch_size and dynamic_sampling_filter_path, were updated to clarify their purpose and usage.
Customization: Descriptions for arguments related to custom functions (e.g., custom rollout, loss, and reward model functions) were updated to provide more context and links to relevant documentation.
CI and Debugging: Added descriptions for CI-related arguments and debugging tools, enhancing the ability to test and diagnose issues.

Changelog

miles/utils/arguments.py
- Synchronized argument descriptions with documentation for improved clarity.
- Updated descriptions for rollout-related arguments.
- Clarified descriptions for dynamic sampling parameters.
- Enhanced descriptions for arguments related to custom functions.
- Added descriptions for CI-related arguments and debugging tools.

Activity

Manual sync of the current docs' parameter descriptions into arguments.py.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request does a great job of synchronizing the command-line argument help strings with the documentation, which significantly improves usability and clarity for users. The new descriptions are much more detailed and helpful. I've identified a couple of minor areas for improvement to enhance maintainability and user experience.

gemini-code-assist · 2026-02-13T06:24:40Z

miles/utils/arguments.py

                help=(
-                    "Whether to colocate the inference engines and the actor. "
-                    "Turning this on will also set --offload to true."
+                    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs. #TODO: Verify optimal fraction for Blackwell in production"
                ),


The help string for --colocate contains a #TODO comment. This appears to be an internal development note and should not be part of a user-facing help message. It's recommended to remove this from the help string and track it as a code comment elsewhere or in an issue tracker.

Suggested change

                help=(
-                    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs. #TODO: Verify optimal fraction for Blackwell in production"
+                    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs."
                ),

We should move TODOs out of the help text

Yes, I thought of this too, but then the script would just keep failing since docs and arguments are out of sync.

cc: @zhaochenyang20
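One way to keep development notes out of user-facing help text is a small check over the help strings. The sketch below is only an illustration of that idea: `find_dev_notes` and the sample dictionary are hypothetical and are not part of miles or of the sync script mentioned in this thread.

```python
import re

# Markers that usually indicate an internal note rather than user guidance.
DEV_NOTE_PATTERN = re.compile(r"\b(TODO|FIXME)\b")


def find_dev_notes(help_texts: dict[str, str]) -> list[str]:
    """Return the flags whose help text contains a TODO/FIXME marker."""
    return [flag for flag, text in help_texts.items() if DEV_NOTE_PATTERN.search(text)]


# Hypothetical sample input: flag name -> help text.
sample = {
    "--colocate": "Deploy training and rollout on the same GPUs. #TODO: Verify optimal fraction",
    "--example-flag": "A help text without internal notes.",
}
flagged = find_dev_notes(sample)
```

Such a check could run in CI or as a pre-commit hook, so the TODO is tracked in an issue or a code comment instead of the help output.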

miles/utils/arguments.py

zijiexia · 2026-02-20T19:54:37Z

Hi @Ratish1, for the sake of code readability, I think we should not put the help text on a single line. Instead of using a single string, a better way is to stack several substrings, e.g., use

help=(
    "Applies to `megatron` training backend only. "
    "Disables the system that backups model weights (Actor, Ref, Old Actor) to CPU RAM. "
    "Disabling saves significant host memory but prevents features that rely on weight-swapping, such as computing KL-divergence against a reference model. "
    "Note: do not set `--ref-load` and `--keep-old-actor` if disable weights backuper."
),

instead of

help="Applies to `megatron` training backend only. Disables the system that backs up model weights (Actor, Ref, Old Actor) to CPU RAM. Disabling saves significant host memory but prevents features that rely on weight-swapping, such as computing the KL-divergence against a reference model. **Note**: do not set `--ref-load` and `--keep-old-actor` if disable weights backuper."

I believe there are a lot of linting tools that can help you automate this. Please let me know what you think. Thanks.

Also, for the single-line help texts, why do we need parentheses around the text? I think we should remove them to keep the coding style consistent.
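The stacked form relies on Python joining adjacent string literals at compile time, so it yields exactly the same help text as the single-line form. A minimal sketch, assuming a throwaway parser and a made-up `--example-flag` (neither is taken from `miles/utils/arguments.py`):

```python
import argparse

# Single-line form: one long literal.
SINGLE_LINE = "Applies to `megatron` training backend only. Disables the backup of model weights to CPU RAM."

# Stacked form: adjacent literals inside parentheses are joined by the compiler.
# Note the trailing space at the end of every substring except the last.
STACKED = (
    "Applies to `megatron` training backend only. "
    "Disables the backup of model weights to CPU RAM."
)


def build_parser() -> argparse.ArgumentParser:
    """Build a demo parser; the flag name is hypothetical."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--example-flag", action="store_true", help=STACKED)
    return parser


help_output = build_parser().format_help()
```

The parentheses are only needed when the literals span several lines; a help text that fits on one line can be passed as a bare string.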

miles/utils/arguments.py

Ratish1 · 2026-02-21T04:06:58Z

Yes, I will follow this, thanks @zijiexia. I assumed the pre-commit hook would fix this, but it didn't. I will look into it and fix it.

zijiexia · 2026-02-23T22:52:10Z

miles/utils/arguments.py

-                    "Save the rollout data to this path for debugging. "
-                    "The file will be saved to `save_debug_rollout_data.format(rollout_id)`."
-                ),
+                help=("Path to save rollout data for offline analysis. " "[Ref](../developer_guide/debug.md)"),


Here we split on periods even though it's a short string.

zijiexia · 2026-02-23T22:53:06Z

miles/utils/arguments.py

+                "--check-weight-update-equal",
+                action="store_true",
+                help=(
+                    "Use SGLang's weight checker to check and ensure that the loaded weight from HF checkpoint and received from Megatron "


Here we split at a random place in the sentence. I was wondering what the logic behind the splitting is. Will this work with your sync script?
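Whether the split position matters depends on how the sync script reads the help text, which this thread does not show. As a hedged illustration only: a script that parses the source with Python's `ast` module sees each help value as one already-joined string constant, so the split position has no effect. The function `extract_help_strings` and the `--example-path` flag below are hypothetical.

```python
import ast


def extract_help_strings(source: str) -> dict[str, str]:
    """Map the first flag of each add_argument(...) call to its help text."""
    helps: dict[str, str] = {}
    for node in ast.walk(ast.parse(source)):
        is_add_argument = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_argument"
        )
        if not is_add_argument or not node.args:
            continue
        flag = node.args[0]
        if not isinstance(flag, ast.Constant):
            continue
        for keyword in node.keywords:
            # Adjacent string literals are already merged into one Constant node.
            if keyword.arg == "help" and isinstance(keyword.value, ast.Constant):
                helps[flag.value] = keyword.value.value
    return helps


# The source is only parsed, never executed, so `parser` need not exist.
SPLIT_ON_SENTENCE = '''
parser.add_argument(
    "--example-path",
    help=("Path to save rollout data for offline analysis. " "[Ref](../developer_guide/debug.md)"),
)
'''

SPLIT_MID_SENTENCE = '''
parser.add_argument(
    "--example-path",
    help=(
        "Path to save rollout data for offline "
        "analysis. [Ref](../developer_guide/debug.md)"
    ),
)
'''

on_sentence = extract_help_strings(SPLIT_ON_SENTENCE)
mid_sentence = extract_help_strings(SPLIT_MID_SENTENCE)
```

A script that instead matches raw source lines with regular expressions would be sensitive to where the literals are split, so the answer for the real sync script depends on its implementation.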

sync args with doc

cc9efc0

Ratish1 requested review from fzyzcjy, guapisolo, maocheng23 and yueming-yuan as code owners February 13, 2026 06:21

gemini-code-assist bot reviewed Feb 13, 2026

address bot comment

4b41b5f

zijiexia reviewed Feb 21, 2026

address comments

bc9566b

zijiexia reviewed Feb 23, 2026

Ratish1 added 4 commits February 24, 2026 11:13

address comments

ef249f4

address comments

f4b4360

fix conflict

cd80c00

revert submodule

6e3ae68

Ratish1 wants to merge 7 commits into radixark:main from Ratish1:sync/doc-arg

2 participants