
Add rollout trace logging with trackio #1360

Open
abidlabs wants to merge 5 commits into areal-project:main from abidlabs:add-trackio-trace-logging


@abidlabs abidlabs commented May 21, 2026

Hi folks! This PR adds trace logging via Trackio, the free, local-first experiment tracking library from Hugging Face 🤗

AReaL already has a Trackio metrics backend, so this PR extends it to also log traces. Specifically, this PR:

  • added logging rollout trajectories as trackio.Trace records when stats_logger.trackio.mode is enabled
  • added logging evaluation rollout trajectories as Trackio traces from the eval rollout path
  • decoded tensor trajectories into prompt/completion chat messages with reward, step, sequence length, prompt length, and version metadata
  • added stats_logger.trackio.max_rollout_traces_per_step to cap trace volume per step
  • documented Trackio trace logging and added mocked Trackio tests

Here's what it looks like:

image

AI assistance was used to prepare this PR.
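
For readers skimming the description, the decoding and capping steps listed above might look roughly like the sketch below. This is not the PR's code: `trajectory_to_messages`, `cap_traces`, the toy `VOCAB` decoder, and the rule that "prompt" means the leading tokens with a zero `loss_mask` are all assumptions made for illustration, and the result is a plain dict rather than a `trackio.Trace` object.

```python
from typing import Callable, Optional, Sequence


def trajectory_to_messages(
    input_ids: Sequence[int],
    loss_mask: Sequence[int],
    decode: Callable[[Sequence[int]], str],
    reward: float,
    step: int,
) -> dict:
    """Split one single-turn trajectory into prompt/completion chat messages."""
    seqlen = len(input_ids)
    # Assumption: the prompt is the run of leading tokens that are not trained on.
    prompt_len = next((i for i, m in enumerate(loss_mask) if m), seqlen)
    return {
        "messages": [
            {"role": "user", "content": decode(input_ids[:prompt_len])},
            {"role": "assistant", "content": decode(input_ids[prompt_len:])},
        ],
        "metadata": {
            "reward": float(reward),
            "step": step,
            "seqlen": seqlen,
            "prompt_len": prompt_len,
        },
    }


def cap_traces(traces: list, max_traces_per_step: Optional[int]) -> list:
    """Keep at most max_traces_per_step traces (assumption: None means no cap)."""
    if max_traces_per_step is None:
        return traces
    return traces[: max(0, max_traces_per_step)]


# Toy decoder standing in for a real tokenizer (hypothetical vocabulary).
VOCAB = {0: "What", 1: "is", 2: "2+2?", 3: "4"}


def toy_decode(ids: Sequence[int]) -> str:
    return " ".join(VOCAB[i] for i in ids)


trace = trajectory_to_messages(
    input_ids=[0, 1, 2, 3],
    loss_mask=[0, 0, 0, 1],
    decode=toy_decode,
    reward=1.0,
    step=7,
)
capped = cap_traces([trace, trace, trace], max_traces_per_step=2)
```

Passing the decoder in as a callable keeps the sketch independent of any particular tokenizer; in practice the messages and metadata would be handed to Trackio instead of being returned as a dict.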

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements rollout and evaluation trace logging for the Trackio backend, allowing tensor trajectories to be decoded into human-readable traces with associated metadata. Key changes include the addition of the max_rollout_traces_per_step configuration, implementation of trace decoding in StatsLogger, and integration into the RLTrainer training and evaluation loops. Feedback focuses on optimizing performance by moving GPU-to-CPU tensor transfers for input IDs, masks, rewards, and versions outside of the per-sample processing loop to reduce synchronization overhead.

Comment thread areal/utils/stats_logger.py Outdated
metadata["reward"] = float(rewards[sample_index].item())
if versions is not None:
    sample_versions = (
        versions[sample_index, :seqlen].detach().cpu().tolist()
Contributor

medium

The versions tensor should also be moved to CPU outside the loop to avoid repeated GPU-to-CPU transfers for each sample.
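
A minimal sketch of the suggested change, assuming `versions` is a 2-D PyTorch tensor of shape `[batch, max_seqlen]`. The helper name `collect_versions` and the `seqlens` argument are hypothetical and only illustrate hoisting the transfer out of the per-sample loop:

```python
import torch


def collect_versions(versions: torch.Tensor, seqlens: list) -> list:
    """Return per-sample version lists, each trimmed to its sequence length."""
    # One device-to-host copy for the whole batch (a no-op if already on CPU),
    # instead of one .cpu() call per sample inside the loop.
    versions_cpu = versions.detach().cpu()
    return [
        versions_cpu[sample_index, :seqlen].tolist()
        for sample_index, seqlen in enumerate(seqlens)
    ]


batch_versions = torch.tensor([[1, 1, 2, 0], [3, 3, 3, 3]])
per_sample = collect_versions(batch_versions, seqlens=[3, 4])
```

The slicing and `.tolist()` calls then run on host memory only, so the loop no longer triggers a synchronization per sample.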

Author

Done

@abidlabs abidlabs changed the title from "Add Trackio rollout trace logging" to "Add rollout trace logging with trackio" May 21, 2026
@abidlabs abidlabs marked this pull request as ready for review May 21, 2026 22:34
cnt += 1
self.eval_rollout.wait(cnt, timeout=None)
eval_batch = self.eval_rollout.wait(cnt, timeout=None)
self.stats_logger.log_rollout_traces(
Collaborator

The wait method may return None for rejected trajectories. The first line of _trajectory_to_trackio_traces is trajectory.get("input_ids"), which will raise AttributeError on any None element. Could you either filter out the Nones in log_rollout_traces or add a guard at the top of _trajectory_to_trackio_traces?

Collaborator

I also agree this should be fixed. It would be best to add test cases for the None trajectory scenario.

Author

Thanks for catching this. I handled rejected None trajectories from rollout/eval results, updated the trajectory type hints, and added tests for rejected trajectories.
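
For context, the kind of guard discussed in this thread could be sketched as follows. The helper `drop_rejected` is a hypothetical name and is not taken from the PR; it only shows filtering `None` entries before any `.get(...)` call is made:

```python
from typing import Any, Dict, List, Optional


def drop_rejected(
    trajectories: List[Optional[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Remove None entries (rejected trajectories) before trace conversion."""
    return [trajectory for trajectory in trajectories if trajectory is not None]


# A batch where the middle trajectory was rejected.
batch = [{"input_ids": [1, 2]}, None, {"input_ids": [3]}]
accepted = drop_rejected(batch)

# Safe now: every remaining element is a dict, so .get() cannot raise
# AttributeError on a None element.
first_ids = [trajectory.get("input_ids") for trajectory in accepted]
```

A test for the rejected-trajectory scenario would then cover both a mixed batch like the one above and a batch in which every element is `None`.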

@sitabulaixizawaluduo
Collaborator

Thanks for your contribution! Please run 'pre-commit' before you submit.

@abidlabs
Author

abidlabs commented May 25, 2026

Ran pre-commit run --all-files and pushed the generated updates. The second pre-commit run passes cleanly. Thanks @sitabulaixizawaluduo!

Comment thread areal/utils/stats_logger.py Outdated
trackio.Trace(
    messages=[
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": completion},
Collaborator

Is there a good way to support multi-turn traces?

Author

I added multi-turn/tool trace support by using structured messages when available and reconstructing user/assistant/tool spans from loss_mask otherwise. Here's what it looks like:

image
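
As a rough illustration of the `loss_mask` fallback described in this reply (not the PR's actual logic), one could group consecutive tokens by mask value and emit one message per run. The function name `spans_from_loss_mask`, the toy decoder, and the role assignment are assumptions: a mask alone cannot distinguish user turns from tool output, so every non-trained span is labeled with a single configurable `context_role` here.

```python
from itertools import groupby
from typing import Callable, Dict, List, Sequence


def spans_from_loss_mask(
    input_ids: Sequence[int],
    loss_mask: Sequence[int],
    decode: Callable[[Sequence[int]], str],
    context_role: str = "user",
) -> List[Dict[str, str]]:
    """Turn alternating masked/unmasked token runs into chat messages."""
    messages = []
    pos = 0
    # groupby yields one group per run of equal truthiness in the mask.
    for trained, run in groupby(loss_mask, key=bool):
        length = sum(1 for _ in run)
        # Assumption: trained-on tokens are model output; everything else is context.
        role = "assistant" if trained else context_role
        messages.append(
            {"role": role, "content": decode(input_ids[pos : pos + length])}
        )
        pos += length
    return messages


def toy_decode(ids: Sequence[int]) -> str:
    # Stand-in for a real tokenizer's decode method.
    return " ".join(f"t{i}" for i in ids)


turns = spans_from_loss_mask(
    input_ids=[0, 1, 2, 3, 4, 5],
    loss_mask=[0, 0, 1, 1, 0, 1],
    decode=toy_decode,
)
```

Telling user spans apart from tool spans would need extra information, such as the structured messages mentioned above or chat-template markers in the decoded text.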

@abidlabs
Author

Thanks for the review @sitabulaixizawaluduo @PrometheusComing! I addressed all of the comments and reran pre-commit. All changes have been pushed.

Collaborator

@TaoZex TaoZex left a comment

LGTM


4 participants