[docs]: sync args with doc #594

Open
Ratish1 wants to merge 7 commits into radixark:main from Ratish1:sync/doc-arg

Conversation

@Ratish1 Ratish1 commented Feb 13, 2026

This is part 1 of #578: a manual sync of the parameter descriptions in the current docs into `arguments.py`.

@gemini-code-assist

Summary of Changes

Hello @Ratish1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on improving the documentation and clarity of command-line arguments in miles/utils/arguments.py. By synchronizing argument descriptions with the documentation, it enhances the user experience and makes it easier to configure and use the Miles framework. The changes cover various aspects of the framework, including rollout, training, evaluation, and debugging.

Highlights

  • Argument Descriptions: This PR synchronizes the descriptions of various command-line arguments in miles/utils/arguments.py with the documentation, improving clarity and consistency.
  • Rollout Configuration: Several arguments related to rollout, such as temperature, top-p, top-k, max context length, and stop tokens, have updated descriptions to provide more accurate guidance.
  • Dynamic Sampling: The descriptions for dynamic sampling parameters, including over_sampling_batch_size and dynamic_sampling_filter_path, were updated to clarify their purpose and usage.
  • Customization: Descriptions for arguments related to custom functions (e.g., custom rollout, loss, and reward model functions) were updated to provide more context and links to relevant documentation.
  • CI and Debugging: Added descriptions for CI-related arguments and debugging tools, enhancing the ability to test and diagnose issues.

Changelog

  • miles/utils/arguments.py
    • Synchronized argument descriptions with documentation for improved clarity.
    • Updated descriptions for rollout-related arguments.
    • Clarified descriptions for dynamic sampling parameters.
    • Enhanced descriptions for arguments related to custom functions.
    • Added descriptions for CI-related arguments and debugging tools.

Activity

  • Manual sync of current doc's description of parameters to the argument.py.


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does a great job of synchronizing the command-line argument help strings with the documentation, which significantly improves usability and clarity for users. The new descriptions are much more detailed and helpful. I've identified a couple of minor areas for improvement to enhance maintainability and user experience.

Comment on lines 89 to 91
 help=(
-    "Whether to colocate the inference engines and the actor. "
-    "Turning this on will also set --offload to true."
+    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs. #TODO: Verify optimal fraction for Blackwell in production"
 ),

medium

The help string for --colocate contains a #TODO comment. This appears to be an internal development note and should not be part of a user-facing help message. It's recommended to remove this from the help string and track it as a code comment elsewhere or in an issue tracker.

Suggested change
-help=(
-    "Whether to colocate the inference engines and the actor. "
-    "Turning this on will also set --offload to true."
-    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs. #TODO: Verify optimal fraction for Blackwell in production"
-),
+help=(
+    "Deploy training and rollout on the same GPUs. This mode automatically enables `--offload-train` and `--offload-rollout` to facilitate weight-swapping between the training actor and inference engine. **Note:** The offload parameters are currently only used for AMD GPUs and will be removed soon. **Memory Tip:** When colocating, it is highly recommended to set `--sglang-mem-fraction-static` to **0.8** (especially on **NVIDIA Blackwell B200/B300** GPUs). This leaves sufficient VRAM (~20%) for Megatron to initialize its structures before the first weight offload to CPU occurs. On GB200/GB300, values up to 0.75 are safer for long-running jobs to prevent potential OOMs."
+),

Contributor

We should move TODOs out of the help text
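
A minimal sketch of one way to do that, assuming a stand-in parser and a shortened help string (neither is the actual code in `miles/utils/arguments.py`): inside a parenthesized run of adjacent string literals, Python permits comment lines, so the TODO can stay next to the text it concerns without reaching `--help` output. Whether a docs sync check would accept the resulting difference from the docs is a separate question.

```python
import argparse

# Hypothetical parser used only for illustration.
parser = argparse.ArgumentParser(prog="demo")

# add_argument returns the Action object, which exposes the final help string.
colocate_action = parser.add_argument(
    "--colocate",
    action="store_true",
    help=(
        "Deploy training and rollout on the same GPUs. "
        # TODO: Verify optimal fraction for Blackwell in production
        "On GB200/GB300, values up to 0.75 are safer for long-running jobs."
    ),
)

# The comment is not part of the string, so users never see it.
print(colocate_action.help)
```

The comment line sits between two literals, and the parser still joins them into one string.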

Contributor Author

Yes, I thought of this too, but then the script would just keep failing since docs and arguments are out of sync.

cc: @zhaochenyang20

@zijiexia

zijiexia commented Feb 20, 2026

Hi @Ratish1, for better code readability, I think we should not put the help text on a single line. Instead of using a single string, a better way is to stack several substrings, e.g., use

help=(
    "Applies to `megatron` training backend only. "
    "Disables the system that backs up model weights (Actor, Ref, Old Actor) to CPU RAM. "
    "Disabling saves significant host memory but prevents features that rely on weight-swapping, such as computing the KL-divergence against a reference model. "
    "**Note**: do not set `--ref-load` and `--keep-old-actor` if disable weights backuper."
),

instead of

help="Applies to `megatron` training backend only. Disables the system that backs up model weights (Actor, Ref, Old Actor) to CPU RAM. Disabling saves significant host memory but prevents features that rely on weight-swapping, such as computing the KL-divergence against a reference model. **Note**: do not set `--ref-load` and `--keep-old-actor` if disable weights backuper."

I believe there are many linting tools that can help you automate this. Please let me know what you think. Thanks.

Also, for single-line help text, why do we need parentheses around the text? I think we should remove them to keep the coding style consistent.
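
To illustrate both points, here is a small sketch with shortened, illustrative strings. It shows that adjacent string literals inside parentheses are joined into one string, that parentheses around a single literal are only grouping, and that a forgotten trailing space is the main hazard of the stacked style.

```python
# One long literal.
single_line = (
    "Applies to `megatron` training backend only. Disables the system that backs up model weights to CPU RAM."
)

# The same text stacked as adjacent literals; note the trailing space after "only.".
stacked = (
    "Applies to `megatron` training backend only. "
    "Disables the system that backs up model weights to CPU RAM."
)

# Parentheses around one literal do not create a tuple; they can be dropped.
wrapped = ("Path to save rollout data for offline analysis.")
bare = "Path to save rollout data for offline analysis."

# Forgetting the trailing space silently glues two sentences together.
missing_space = (
    "First sentence."
    "Second sentence."
)

print(stacked == single_line, wrapped == bare, missing_space)
```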

@Ratish1

Ratish1 commented Feb 21, 2026

Yes, I will follow this, thanks @zijiexia. I assumed pre-commit would fix this, but it didn't. I will look into it and fix it.
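
If no existing tool fits, a small helper could generate the stacked form. The sketch below is only an assumption about what such a helper might look like: `stack_help` is a hypothetical name, and the sentence-boundary rule (split after `.`, `!` or `?` followed by whitespace) is a simplification that will also split after abbreviations such as "e.g. ".

```python
import json
import re


def stack_help(text: str, indent: str = "    ") -> str:
    """Render `text` as Python source: one string literal per sentence.

    Each piece keeps its trailing whitespace, so the literals concatenate
    back to exactly the original text.
    """
    # Zero-width split right after sentence punctuation plus one whitespace char.
    pieces = [p for p in re.split(r"(?<=[.!?]\s)", text) if p]
    if len(pieces) <= 1:
        # Short, single-sentence help text stays a bare literal.
        return json.dumps(text, ensure_ascii=False)
    body = "\n".join(indent + json.dumps(p, ensure_ascii=False) for p in pieces)
    return "(\n" + body + "\n)"


example = "Applies to `megatron` training backend only. Disables weight backup. Note: see docs."
print("help=" + stack_help(example) + ",")
```

`json.dumps` is used here only to get double-quoted literals with escaping that is also valid Python for ordinary text.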

-    "Save the rollout data to this path for debugging. "
-    "The file will be saved to `save_debug_rollout_data.format(rollout_id)`."
-),
+help=("Path to save rollout data for offline analysis. " "[Ref](../developer_guide/debug.md)"),

@zijiexia zijiexia Feb 23, 2026

Here we split on periods even though it's a short string

"--check-weight-update-equal",
action="store_true",
help=(
"Use SGLang's weight checker to check and ensure that the loaded weight from HF checkpoint and received from Megatron "

Contributor

Here we split at an arbitrary place in the sentence. What is the logic behind the splitting? Will this work with your sync script?
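
How the sync script reads the help text is not shown in this thread. As a sketch under that caveat: if it imports the parser or walks the source with `ast`, the split points cannot matter, because Python merges adjacent literals at parse time. A script that compares raw source lines would be sensitive to them. The call sources and the `help_value` helper below are hypothetical.

```python
import ast


def help_value(call_source: str) -> str:
    """Return the `help=` string of a single call expression, as the parser sees it."""
    call = ast.parse(call_source, mode="eval").body
    for kw in call.keywords:
        if kw.arg == "help":
            # Adjacent literals are already merged into one constant node.
            return ast.literal_eval(kw.value)
    raise ValueError("no help= keyword found")


split_on_sentences = '''add_argument("--flag", help=(
    "First sentence about the flag. "
    "Second sentence about the flag."
))'''

split_mid_sentence = '''add_argument("--flag", help=(
    "First sentence about "
    "the flag. Second sentence about the flag."
))'''

print(help_value(split_on_sentences) == help_value(split_mid_sentence))
```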
