Forward-merge release/0.3 into main #187

Merged
GPUtester merged 1 commit into main from release/0.3 on May 29, 2026

rapids-bot (bot) commented on May 29, 2026

This forward-merge was triggered by a push to release/0.3, which creates a PR to keep main up to date. If this PR cannot be merged immediately because of conflicts, it will remain open for the team to merge manually. See the forward-merger docs for more info.

#### Overview

Adds ATIF exporter de-duplication for overlapping LLM spans that represent the same physical provider request, such as a hook-observed span and a gateway-observed span.

- [x] I confirm this contribution is my own work, or I have the right to submit it under this project's license.
- [x] I searched existing issues and open pull requests, and this does not duplicate existing work.

#### Details

Some harnesses can emit multiple LLM spans for one underlying request. Without de-duplication, ATIF can contain repeated user/agent steps for a single model call.

This PR adds an exporter pre-pass that collects complete LLM start/end span candidates, detects overlapping duplicates under the same parent/model, suppresses the lower-fidelity span, and merges metrics from the suppressed span into the canonical step when needed.

It also adds tests for overlapping hook/gateway spans, preferring a higher-fidelity gateway span over a non-exact hook summary, and preserving sequential same-content LLM calls as separate steps.
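
The pre-pass described above can be illustrated with a minimal sketch. This is not the code from `atif.rs`: the `LlmSpan` struct, its fields, and the `same_request` and `dedupe` functions are hypothetical names chosen for illustration, and the sketch assumes each span already carries a precomputed fidelity value and optional token counts.

```rust
/// Hypothetical, simplified view of one complete LLM span (start and end both seen).
#[derive(Debug, Clone, PartialEq)]
struct LlmSpan {
    id: u32,
    parent: u32,
    model: String,
    start_ms: u64,
    end_ms: u64,
    /// Assumed precomputed: higher means closer to the real provider request.
    fidelity: u8,
    prompt_tokens: Option<u64>,
    completion_tokens: Option<u64>,
}

/// Assumption: two spans describe one physical request when they share a
/// parent and model and their time ranges intersect.
fn same_request(a: &LlmSpan, b: &LlmSpan) -> bool {
    a.parent == b.parent
        && a.model == b.model
        && a.start_ms < b.end_ms
        && b.start_ms < a.end_ms
}

/// Returns the canonical spans in start order. A suppressed duplicate only
/// contributes metrics that the canonical span is missing.
fn dedupe(mut spans: Vec<LlmSpan>) -> Vec<LlmSpan> {
    spans.sort_by_key(|s| (s.start_ms, s.id));
    let mut kept: Vec<LlmSpan> = Vec::new();
    for span in spans {
        match kept.iter_mut().find(|k| same_request(k, &span)) {
            Some(canon) => {
                // Pick the higher-fidelity span, then fold in the loser's metrics.
                let (mut winner, loser) = if span.fidelity > canon.fidelity {
                    (span, canon.clone())
                } else {
                    (canon.clone(), span)
                };
                winner.prompt_tokens = winner.prompt_tokens.or(loser.prompt_tokens);
                winner.completion_tokens =
                    winner.completion_tokens.or(loser.completion_tokens);
                *canon = winner;
            }
            // No overlap with any kept span: this is a distinct request.
            None => kept.push(span),
        }
    }
    kept
}
```

Under these assumptions, an overlapping hook span and gateway span collapse into the higher-fidelity one, while a later call with the same parent and model but a non-overlapping time range stays a separate step, which mirrors the sequential same-content case covered by the tests.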

#### Where should the reviewer start?

Start with `crates/core/src/observability/atif.rs`, specifically `build_llm_dedupe`, `same_physical_llm_request`, and `llm_event_fidelity_score`. The main tests are in `crates/core/tests/unit/atif_tests.rs` around `test_exporter_dedupes_overlapping_hook_and_gateway_llm_spans`.
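
As a reading aid for the fidelity comparison, here is one way such a score could be structured. It is a sketch under stated assumptions, not the logic of `llm_event_fidelity_score`: the `Observation` struct, its three flags, and the weights are all hypothetical, and the real function may weigh different signals.

```rust
/// Hypothetical description of what one instrumentation source captured.
#[derive(Debug, Clone, Copy)]
struct Observation {
    /// True when the span carries the exact payload rather than a summary.
    exact_content: bool,
    /// True when token usage is attached to the span.
    has_token_usage: bool,
    /// True when the span was observed at the gateway.
    from_gateway: bool,
}

/// Assumed rubric: exact content dominates, then usage data, then source.
/// The weights 4/2/1 make the ordering lexicographic across the three flags.
fn fidelity_score(o: Observation) -> u8 {
    let mut score = 0;
    if o.exact_content {
        score += 4;
    }
    if o.has_token_usage {
        score += 2;
    }
    if o.from_gateway {
        score += 1;
    }
    score
}
```

With this rubric, a gateway span with exact content always outranks a hook span that only holds a non-exact summary, even if the hook span has token usage attached.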

#### Related Issues

- Relates to #176




## Summary by CodeRabbit

* **New Features**
  * Enhanced ATIF exporter with improved LLM span deduplication and token-metric consolidation from multiple instrumentation sources.

* **Bug Fixes**
  * Fixed metric handling for overlapping LLM requests to prevent inaccurate data consolidation.

* **Tests**
  * Added comprehensive unit tests validating LLM span deduplication behavior across instrumentation scenarios.

Authors:
  - Bryan Bednarski (https://github.com/bbednarski9)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: #183
rapids-bot (bot) requested a review from a team as a code owner on May 29, 2026 at 20:35

GPUtester merged commit dad1ee7 into main on May 29, 2026 (1 check passed)

rapids-bot (bot) commented on May 29, 2026

SUCCESS - forward-merge complete.

github-actions (bot) added the size:L (PR is large) and lang:rust (PR changes/introduces Rust code) labels on May 29, 2026