Add rollout trace logging with `trackio` by abidlabs · Pull Request #1360 · areal-project/AReaL

abidlabs · 2026-05-21T22:26:26Z

Hi folks! This PR adds trace logging via Trackio, the free, local-first experiment tracking library from Hugging Face 🤗

AReaL already has a Trackio metrics backend, so this PR extends it to also log traces. Specifically, I did this:

added logging rollout trajectories as trackio.Trace records when stats_logger.trackio.mode is enabled
added logging evaluation rollout trajectories as Trackio traces from the eval rollout path
decoded tensor trajectories into prompt/completion chat messages with reward, step, sequence length, prompt length, and version metadata
added stats_logger.trackio.max_rollout_traces_per_step to cap trace volume per step
documented Trackio trace logging and added mocked Trackio tests
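
A minimal sketch of the decoding and capping steps listed above, not the PR's actual code. It assumes a right-padded batch stored as nested Python lists with `input_ids`, `attention_mask`, `loss_mask`, and optional `rewards` keys; `trajectory_to_records`, `toy_decode`, and the record layout are hypothetical names chosen for illustration, and the real implementation works on tensors and emits `trackio.Trace` objects.

```python
from __future__ import annotations

from typing import Any, Callable


def trajectory_to_records(
    trajectory: dict[str, Any],
    decode: Callable[[list[int]], str],
    step: int,
    max_traces: int | None = None,
) -> list[dict[str, Any]]:
    """Turn a padded batch into prompt/completion records with metadata."""
    records: list[dict[str, Any]] = []
    rewards = trajectory.get("rewards")
    for i, ids in enumerate(trajectory["input_ids"]):
        # Cap the number of records per step (the role played by
        # max_rollout_traces_per_step in the PR description).
        if max_traces is not None and len(records) >= max_traces:
            break
        # Assumption: right padding, so the real length is the mask sum.
        seqlen = sum(trajectory["attention_mask"][i])
        mask = trajectory["loss_mask"][i][:seqlen]
        # Assumption: the prompt is everything before the first trained token.
        prompt_len = mask.index(1) if 1 in mask else seqlen
        metadata = {"step": step, "seqlen": seqlen, "prompt_len": prompt_len}
        if rewards is not None:
            metadata["reward"] = float(rewards[i])
        records.append(
            {
                "messages": [
                    {"role": "user", "content": decode(ids[:prompt_len])},
                    {"role": "assistant", "content": decode(ids[prompt_len:seqlen])},
                ],
                "metadata": metadata,
            }
        )
    return records


def toy_decode(ids: list[int]) -> str:
    """Stand-in for a tokenizer's decode method."""
    return " ".join(str(t) for t in ids)


batch = {
    "input_ids": [[1, 2, 3, 4, 0], [5, 6, 7, 0, 0]],
    "attention_mask": [[1, 1, 1, 1, 0], [1, 1, 1, 0, 0]],
    "loss_mask": [[0, 0, 1, 1, 0], [0, 1, 1, 0, 0]],
    "rewards": [1.0, 0.0],
}
records = trajectory_to_records(batch, toy_decode, step=3, max_traces=1)
```

With `max_traces=1`, only the first sequence is converted, which is how a per-step cap keeps trace volume bounded regardless of batch size.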

Here's what it looks like: [screenshot not included]

AI assistance was used to prepare this PR.

gemini-code-assist

Code Review

This pull request implements rollout and evaluation trace logging for the Trackio backend, allowing tensor trajectories to be decoded into human-readable traces with associated metadata. Key changes include the addition of the max_rollout_traces_per_step configuration, implementation of trace decoding in StatsLogger, and integration into the RLTrainer training and evaluation loops. Feedback focuses on optimizing performance by moving GPU-to-CPU tensor transfers for input IDs, masks, rewards, and versions outside of the per-sample processing loop to reduce synchronization overhead.

gemini-code-assist · 2026-05-21T22:31:08Z

+                metadata["reward"] = float(rewards[sample_index].item())
+            if versions is not None:
+                sample_versions = (
+                    versions[sample_index, :seqlen].detach().cpu().tolist()


The versions tensor should also be moved to CPU outside the loop to avoid repeated GPU-to-CPU transfers for each sample.
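
The suggestion can be sketched as follows. This is an illustration rather than the PR's code: `CountingTensor` is a hypothetical stand-in that mimics the `detach()`, `cpu()`, `tolist()`, and `[i, :n]` indexing calls seen in the diff, so the number of device-to-host transfers can be counted without a GPU. The two helper functions are duck-typed and should behave the same way on a real 2-D `torch.Tensor`.

```python
class CountingTensor:
    """Hypothetical stand-in for a device tensor; counts .cpu() calls."""

    transfers = 0

    def __init__(self, data):
        self.data = data

    def __getitem__(self, key):
        # Supports t[i, :n], mirroring versions[sample_index, :seqlen].
        row, cols = key
        return CountingTensor(self.data[row][cols])

    def detach(self):
        return self

    def cpu(self):
        # Each call models one GPU-to-CPU copy (and its synchronization).
        CountingTensor.transfers += 1
        return self

    def tolist(self):
        return self.data


def versions_per_sample(versions, seqlens):
    """Original pattern: one transfer per sample inside the loop."""
    return [versions[i, :n].detach().cpu().tolist() for i, n in enumerate(seqlens)]


def versions_hoisted(versions, seqlens):
    """Suggested pattern: one transfer up front, then plain list slicing."""
    rows = versions.detach().cpu().tolist()
    return [rows[i][:n] for i, n in enumerate(seqlens)]


versions = CountingTensor([[0, 0, 1, 1], [2, 2, 2, 0], [3, 0, 0, 0]])
seqlens = [4, 3, 1]

CountingTensor.transfers = 0
slow = versions_per_sample(versions, seqlens)
per_sample_transfers = CountingTensor.transfers

CountingTensor.transfers = 0
fast = versions_hoisted(versions, seqlens)
hoisted_transfers = CountingTensor.transfers
```

Both functions return the same per-sample lists; the difference is that the per-sample variant performs one transfer for each row while the hoisted variant performs exactly one for the whole batch.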

sitabulaixizawaluduo · 2026-05-23T07:08:37Z

                    cnt += 1
-            self.eval_rollout.wait(cnt, timeout=None)
+            eval_batch = self.eval_rollout.wait(cnt, timeout=None)
+            self.stats_logger.log_rollout_traces(


The wait method may return None entries, which stand for rejected trajectories. Inside _trajectory_to_trackio_traces the first line is trajectory.get("input_ids"), which will raise AttributeError on any None element. Could you either filter the Nones out in log_rollout_traces, or add a guard at the top of _trajectory_to_trackio_traces?

I also agree this should be fixed. It would be best to add test cases for the None trajectory scenario.

Thanks for catching this. I now handle rejected (None) trajectories from rollout/eval results, updated the trajectory type hints, and added tests for rejected trajectories.
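
A minimal sketch of the kind of guard discussed in this thread, assuming the rollout result is either None or a list whose elements are trajectory dicts or None. The helper names `drop_rejected` and `count_loggable` are hypothetical and do not come from the PR.

```python
from __future__ import annotations

from typing import Any, Optional

Trajectory = dict[str, Any]


def drop_rejected(
    trajectories: Optional[list[Optional[Trajectory]]],
) -> list[Trajectory]:
    """Return only accepted trajectories; rejected ones arrive as None."""
    if trajectories is None:
        return []
    return [t for t in trajectories if t is not None]


def count_loggable(trajectories: Optional[list[Optional[Trajectory]]]) -> int:
    """Count trajectories that are safe to decode into traces."""
    count = 0
    for trajectory in drop_rejected(trajectories):
        # Safe: trajectory is a dict here, so .get() cannot raise AttributeError.
        if trajectory.get("input_ids") is not None:
            count += 1
    return count


mixed = [{"input_ids": [1, 2, 3]}, None, {"rewards": [0.5]}]
accepted = drop_rejected(mixed)
loggable = count_loggable(mixed)
```

Filtering once at the entry point keeps the per-trajectory decoding code free of repeated None checks.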

sitabulaixizawaluduo · 2026-05-23T07:12:25Z

Thanks for your contribution! Please run 'pre-commit' before submitting.

abidlabs · 2026-05-25T17:21:36Z

Ran pre-commit run --all-files and pushed the generated updates. The second pre-commit run passes cleanly. Thanks @sitabulaixizawaluduo!

PrometheusComing · 2026-05-26T12:00:57Z

+                trackio.Trace(
+                    messages=[
+                        {"role": "user", "content": prompt},
+                        {"role": "assistant", "content": completion},


Is there a good way to support multi-turn traces?

I added multi-turn/tool trace support by using structured messages when available and reconstructing user/assistant/tool spans from loss_mask otherwise. Here's what it looks like: [screenshot not included]
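
As a rough illustration of the loss_mask fallback described above (this is not the PR's code), contiguous runs of the mask can be turned into alternating messages. The sketch assumes that tokens with mask value 1 were generated by the model and all others came from the environment. It labels every unmasked run as "user", because separating user turns from tool turns needs information beyond the mask; `spans_to_messages` and `toy_decode` are hypothetical names.

```python
from __future__ import annotations

from itertools import groupby
from typing import Callable


def spans_to_messages(
    token_ids: list[int],
    loss_mask: list[int],
    decode: Callable[[list[int]], str],
) -> list[dict[str, str]]:
    """Split a sequence into chat messages at every loss_mask boundary."""
    messages: list[dict[str, str]] = []
    pos = 0
    for flag, run in groupby(loss_mask):
        length = len(list(run))
        # Simplification: masked-out runs are all attributed to the user.
        role = "assistant" if flag == 1 else "user"
        messages.append(
            {"role": role, "content": decode(token_ids[pos : pos + length])}
        )
        pos += length
    return messages


def toy_decode(ids: list[int]) -> str:
    """Stand-in for a tokenizer's decode method."""
    return " ".join(str(t) for t in ids)


messages = spans_to_messages(
    token_ids=[11, 12, 21, 31, 41, 42],
    loss_mask=[0, 0, 1, 0, 1, 1],
    decode=toy_decode,
)
```

Here the six tokens become four messages (user, assistant, user, assistant), so a two-turn exchange is preserved instead of being flattened into a single prompt/completion pair.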

abidlabs · 2026-05-28T19:45:09Z

Thanks for the review @sitabulaixizawaluduo @PrometheusComing! Addressed all of the comments and reran the pre-commit. All changes have been pushed.

TaoZex

LGTM

Add Trackio rollout trace logging

d0f7606

gemini-code-assist Bot reviewed May 21, 2026

abidlabs changed the title ~~Add Trackio rollout trace logging~~ Add rollout trace logging with trackio May 21, 2026

abidlabs marked this pull request as ready for review May 21, 2026 22:34

abidlabs requested review from fishcrap, garrett4wade, nuzant, rchardx and sitabulaixizawaluduo as code owners May 21, 2026 22:34

sitabulaixizawaluduo reviewed May 23, 2026

docs: update generated cli reference

44a9644

Merge branch 'main' into add-trackio-trace-logging

91d0961

TaoZex reviewed May 26, 2026

Comment thread areal/utils/stats_logger.py Outdated

PrometheusComing reviewed May 26, 2026

abidlabs added 2 commits May 28, 2026 12:26

Address Trackio trace review comments

c65866e

Merge branch 'main' into add-trackio-trace-logging

2c9f48d

TaoZex approved these changes May 30, 2026
