Move RL code from src/MaxText/rl/ to src/maxtext/trainers/post_train/rl/ by A9isha · Pull Request #3180 · AI-Hypercomputer/maxtext

A9isha · 2026-02-18T15:01:00Z

Description

Migrate RL training code to the new package structure following the same pattern as the SFT (PR #2988) and distillation moves. Old location files are replaced with backward-compatibility shims that delegate to the new modules with deprecation warnings.
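For readers unfamiliar with the shim pattern described above, here is a minimal sketch of how an old-location module can delegate to a relocated function while emitting a deprecation warning. The helper `make_deprecated_shim` and the stand-in `new_main` are hypothetical names used only for illustration; the actual shims in this PR may be structured differently.

```python
import functools
import warnings


def make_deprecated_shim(new_func, old_path, new_path):
  """Wrap new_func so calls made via the old location warn, then delegate."""

  @functools.wraps(new_func)
  def shim(*args, **kwargs):
    warnings.warn(
        f"{old_path} is deprecated; use {new_path} instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not the shim
    )
    return new_func(*args, **kwargs)

  return shim


# Hypothetical stand-in for the relocated entry point (not the real train_rl).
def new_main(argv):
  return f"ran RL with {len(argv)} args"


# What a module at the old location might expose under this assumption.
main = make_deprecated_shim(
    new_main,
    old_path="src.MaxText.rl.train_rl",
    new_path="src.maxtext.trainers.post_train.rl.train_rl",
)
```

Calling `main([...])` here returns whatever `new_main` returns, and a `DeprecationWarning` naming the new module path is raised on each call.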

Tests

Locally ran RL using the following commands:

## new
python3 -m src.maxtext.trainers.post_train.rl.train_rl src/maxtext/configs/post_train/rl.yml \
  model_name=llama3.1-8b \
  tokenizer_path=meta-llama/Llama-3.1-8B-Instruct \
  load_parameters_path=/path/to/checkpoint \
  run_name=maz-8b-$RANDOM \
  base_output_directory=/path/to/storage \
  hf_access_token=<HF_TOKEN> dataset_name=gsm8k steps=4


## old
python3 -m src.MaxText.rl.train_rl src/maxtext/configs/post_train/rl.yml \
  model_name=llama3.1-8b \
  tokenizer_path=meta-llama/Llama-3.1-8B-Instruct \
  load_parameters_path=/path/to/checkpoint \
  run_name=maz-8b-$RANDOM \
  base_output_directory=/path/to/storage \
  hf_access_token=<HF_TOKEN> dataset_name=gsm8k steps=4

Checklist

Before submitting this PR, please make sure (put X in square brackets):

I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
I have necessary comments in my code, particularly in hard-to-understand areas.
I have run end-to-end tests and provided workload links above if applicable.
I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov · 2026-02-18T15:22:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

bvandermoon

For the manual tests you ran, could you also try with the old commands? Not as critical as the train.py shims since RL is a newer feature, but still good to test them since they are added here

A9isha · 2026-02-18T21:56:06Z

Done testing with the old command and updated the description - thanks @bvandermoon

bvandermoon · 2026-02-19T00:19:05Z

> Done testing with the old command and updated the description - thanks @bvandermoon

Thanks @A9isha. Just to double check, can you confirm you saw all logs as expected with the old command? For train.py, I needed to set logging.set_verbosity(logging.INFO) to see the standard completed step output logged
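The point about verbosity can be illustrated with a small sketch. The `logging.set_verbosity(logging.INFO)` call in the comment resembles absl's logging API; to stay dependency-free, the sketch below swaps in Python's standard `logging` module instead, so it shows the same idea (INFO messages are dropped while the threshold is WARNING) rather than the exact call used in `train.py`. The logger name and messages are made up for illustration.

```python
import io
import logging

# Capture output in memory so the effect of the level change is easy to inspect.
stream = io.StringIO()
logger = logging.getLogger("shim_logging_demo")  # hypothetical logger name
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

# With a WARNING threshold, INFO-level step messages are silently dropped.
logger.setLevel(logging.WARNING)
logger.info("completed step: 1")

# After lowering the threshold to INFO, the same kind of message is emitted.
logger.setLevel(logging.INFO)
logger.info("completed step: 2")

captured = stream.getvalue()
```

Under these assumptions, only the second message ends up in `captured`, which mirrors why step logs may be missing until the verbosity is raised.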

A9isha requested review from NicoGrande, NuojCheng, RissyRan, SurbhiJainUSC, aireenmei, bvandermoon, gagika, gobbleturk, hengtaoguo, jacoguzo, jesselu-google, jiangjy1982, khatwanimohit, richjames0, shralex, suexu1025 and vipannalla as code owners February 18, 2026 15:01

bvandermoon approved these changes Feb 18, 2026

hengtaoguo approved these changes Feb 18, 2026

bvandermoon approved these changes Feb 18, 2026

A9isha added the pull ready label Feb 18, 2026

A9isha force-pushed the anisha-rl-refactor branch from ec30219 to 81e5072 Compare February 18, 2026 22:04

Move RL code from src/MaxText/rl/ to src/maxtext/trainers/post_train/rl/

8918852

A9isha force-pushed the anisha-rl-refactor branch from 81e5072 to 8918852 Compare February 18, 2026 22:38

A9isha wants to merge 1 commit into main from anisha-rl-refactor

3 participants