fix: repair Hermes gateway session fallback by bbednarski9 · Pull Request #189 · NVIDIA/NeMo-Relay

bbednarski9 · 2026-05-29T21:18:21Z

Overview

Fixes the Hermes gateway session fallback and tightens ATIF LLM dedupe so complementary hook/gateway spans are only collapsed when they represent the same physical request.

I confirm this contribution is my own work, or I have the right to submit it under this project's license.
I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

Uses the OpenAI-compatible request body session_id as a gateway fallback when explicit session headers are absent.
Keeps the existing explicit Claude/Codex session fallbacks ahead of the OpenAI body fallback.
Requires complementary hook/gateway LLM spans to share a request signature or strong request correlation key before ATIF dedupes them.
Adds regression coverage for gateway fallback selection and concurrent overlapping LLM spans that should remain distinct.
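
The precedence described above can be sketched as follows. This is a minimal illustration, not the code in crates/cli/src/alignment/mod.rs: the function name `resolve_session_id`, its parameters, and the `Route::Other` variant are hypothetical, and the sketch assumes the header and provider-specific candidates have already been extracted by the caller. It only rejects empty values after trimming; any further validation in the real implementation is not modeled.

```rust
/// Gateway routes. The two OpenAI variant names come from the review summary;
/// `Other` is a hypothetical stand-in for every non-OpenAI route.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Route {
    OpenAiChatCompletions,
    OpenAiResponses,
    Other,
}

/// Trim a candidate and reject it when nothing is left.
fn normalize(raw: Option<&str>) -> Option<String> {
    let trimmed = raw?.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// Hypothetical resolver: explicit header first, then the provider-specific
/// fallback (Claude/Codex), then the request body `session_id` on OpenAI routes.
pub fn resolve_session_id(
    route: Route,
    explicit_header: Option<&str>,
    provider_fallback: Option<&str>,
    body_session_id: Option<&str>,
) -> Option<String> {
    normalize(explicit_header)
        .or_else(|| normalize(provider_fallback))
        .or_else(|| match route {
            Route::OpenAiChatCompletions | Route::OpenAiResponses => {
                normalize(body_session_id)
            }
            Route::Other => None,
        })
}
```

Because the body value is consulted last, a request that carries both an explicit header and a body session_id still resolves to the header value.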

Where should the reviewer start?

Start with crates/cli/src/alignment/mod.rs for the gateway fallback behavior, then review crates/core/src/observability/atif.rs for the strengthened complementary hook/gateway dedupe guard. The focused regression tests are in crates/cli/tests/coverage/alignment_tests.rs, crates/cli/tests/coverage/gateway_tests.rs, and crates/core/tests/unit/atif_tests.rs.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Closes [Bug]: duplicate LLM steps when CLI hook-forward and gateway instrumentation both observe the same request #176

Summary by CodeRabbit

Bug Fixes
- Session ID resolution enhanced to properly support OpenAI-compatible API request formats, including additional fallback to request body identifiers
- LLM span correlation and deduplication logic improved with request-level identifier matching, enabling more accurate observability tracking and better event correlation for request tracing

Add a gateway session-id fallback for OpenAI-compatible requests that carry a top-level session_id in the request body. Explicit relay headers, Claude headers, and Codex prompt-cache routing still take precedence. This lets CLI/plugin harnesses keep gateway-observed LLM calls aligned to the intended session even when no relay session header is available.

Also adds regression coverage for:
- OpenAI chat completions body

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>

coderabbitai · 2026-05-29T21:18:34Z

Walkthrough

The PR improves request identification across two layers: CLI routing now falls back to a generic OpenAI-compatible session_id field in request bodies, and the ATIF exporter now deduplicates overlapping hook and gateway LLM spans using request correlation keys extracted from event metadata. This addresses duplicate ATIF steps when both CLI-integrated agent hooks and gateway instrumentation observe the same request.

Changes

Request Correlation Enhancements

| Layer / File(s) | Summary |
| --- | --- |
| OpenAI body session_id fallback in CLI routing (`crates/cli/src/alignment/mod.rs`, `crates/cli/tests/coverage/alignment_tests.rs`, `crates/cli/tests/coverage/gateway_tests.rs`) | `gateway_session_id` now extracts and validates a trimmed `session_id` from the request body for OpenAI routes only (`OpenAiChatCompletions` and `OpenAiResponses`), falling back after explicit headers and provider-specific routes. New test `gateway_session_id_accepts_openai_body_session_id_fallback` validates trimming, empty/invalid rejection, and route-specific behavior. |
| ATIF request correlation keys for span deduplication (`crates/core/src/observability/atif.rs`, `crates/core/tests/unit/atif_tests.rs`) | `LlmSpanCandidate` gains a `request_correlation_keys` field populated from event metadata paths (`api_call_id`, `llm_correlation_request_id`, etc.). Complementary hook+gateway span matching now accepts spans as equivalent when `request_signature` differs but spans share at least one correlation key, eliminating duplicate ATIF steps for the same physical request. New test `test_exporter_keeps_overlapping_non_exact_hook_and_gateway_spans_without_shared_request_key` verifies behavior when spans lack shared keys. |
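
The dedupe guard summarized in the table can be sketched as below. This is an illustration under stated assumptions, not the code in `crates/core/src/observability/atif.rs`: `SpanCandidate`, `Source`, `candidate`, and `same_physical_request` are hypothetical names, only the `request_signature` and `request_correlation_keys` field names come from the summary, and the time-overlap check that the exporter presumably also applies is left out.

```rust
use std::collections::HashSet;

/// Which instrumentation layer observed the span (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    Hook,
    Gateway,
}

/// Simplified stand-in for an LLM span candidate.
#[derive(Clone, Debug)]
pub struct SpanCandidate {
    pub source: Source,
    pub request_signature: Option<String>,
    pub request_correlation_keys: HashSet<String>,
}

/// Convenience constructor used only for illustration.
pub fn candidate(source: Source, signature: Option<&str>, keys: &[&str]) -> SpanCandidate {
    SpanCandidate {
        source,
        request_signature: signature.map(str::to_string),
        request_correlation_keys: keys.iter().map(|k| k.to_string()).collect(),
    }
}

/// Two complementary spans are collapsed only when they share a request
/// signature or at least one correlation key.
pub fn same_physical_request(a: &SpanCandidate, b: &SpanCandidate) -> bool {
    // Only a hook span and a gateway span can complement each other.
    if a.source == b.source {
        return false;
    }
    let same_signature = match (&a.request_signature, &b.request_signature) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    };
    let shares_key = !a
        .request_correlation_keys
        .is_disjoint(&b.request_correlation_keys);
    same_signature || shares_key
}
```

Under this guard, two concurrent requests that merely overlap in time stay distinct unless their signatures or correlation keys match.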

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/NeMo-Relay#183: Both PRs modify ATIF span deduplication logic; PR #183 introduces/expands the LLM span dedupe pre-pass while this PR extends matching to use request correlation keys.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title follows Conventional Commits format with lowercase type 'fix', concise imperative summary, no scope, no breaking change indicator, under 72 characters, and no trailing period. |
| Description check | ✅ Passed | The description includes all required sections: Overview with confirmations, Details explaining changes, Where should the reviewer start with file guidance, and Related Issues with Closes keyword and issue number. |
| Linked Issues check | ✅ Passed | The PR directly addresses `#176` by implementing gateway session fallback for OpenAI-compatible routes and tightening ATIF LLM deduplication to require shared request signatures or correlation keys for complementary hook/gateway spans. |
| Out of Scope Changes check | ✅ Passed | All code changes in alignment, ATIF deduplication, and test modules are scoped to fixing the gateway session fallback and preventing duplicate ATIF LLM steps, directly addressing the requirements in `#176`. |


github-actions · 2026-05-29T21:21:51Z

Fern docs preview: https://nvidia-preview-pull-request-189.docs.buildwithfern.com/nemo/relay

coderabbitai · 2026-05-29T21:26:22Z

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details

{}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/tests/unit/atif_tests.rs`:
- Around line 2100-2113: The test currently asserts only that each function name
appears more than zero times; change the assertions to check exact counts (2
each) by reusing the existing closure function_name_count and asserting
function_name_count("openrouter") == 2 and
function_name_count("openai.chat_completions") == 2 so that with
trajectory.steps.len() == 4 you validate each function_name in
step.extra.ancestry appears exactly twice.
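
The tightened assertion suggested above might look like the following sketch. The `Trajectory`, `Step`, `Extra`, and `Ancestry` types here are simplified, hypothetical stand-ins for the real ATIF types, `sample_trajectory` is invented test data, and `function_name_count` is written as a free function rather than the closure the test reportedly uses.

```rust
/// Simplified, hypothetical stand-ins for the ATIF trajectory types.
pub struct Ancestry {
    pub function_name: String,
}
pub struct Extra {
    pub ancestry: Ancestry,
}
pub struct Step {
    pub extra: Extra,
}
pub struct Trajectory {
    pub steps: Vec<Step>,
}

/// Count the steps whose ancestry carries the given function name.
pub fn function_name_count(trajectory: &Trajectory, name: &str) -> usize {
    trajectory
        .steps
        .iter()
        .filter(|step| step.extra.ancestry.function_name == name)
        .count()
}

/// Invented data: four distinct steps, two per function name.
pub fn sample_trajectory() -> Trajectory {
    let names = [
        "openrouter",
        "openai.chat_completions",
        "openrouter",
        "openai.chat_completions",
    ];
    Trajectory {
        steps: names
            .iter()
            .map(|name| Step {
                extra: Extra {
                    ancestry: Ancestry {
                        function_name: name.to_string(),
                    },
                },
            })
            .collect(),
    }
}

/// Exact-count assertions: with four steps, each name must appear exactly twice.
pub fn assert_exact_counts(trajectory: &Trajectory) {
    assert_eq!(trajectory.steps.len(), 4);
    assert_eq!(function_name_count(trajectory, "openrouter"), 2);
    assert_eq!(function_name_count(trajectory, "openai.chat_completions"), 2);
}
```

A "greater than zero" check would also pass with a 3/1 split across the four steps, which is the case the exact counts rule out.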


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 1d2c80b3-1794-48b2-8abb-761cda1f4e38

📥 Commits

Reviewing files that changed from the base of the PR and between f1a69c2 and d1c6dc0.

📒 Files selected for processing (5)

crates/cli/src/alignment/mod.rs
crates/cli/tests/coverage/alignment_tests.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Check / Run
GitHub Check: Preview docs

🧰 Additional context used

📓 Path-based instructions (11)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Keep SPDX headers on Rust source files. The project is Apache-2.0.
Use snake_case for Rust binding naming conventions.
Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.
Preserve async behavior on the existing tokio-based model i...

Files:

crates/cli/src/alignment/mod.rs
crates/cli/tests/coverage/alignment_tests.rs
crates/core/src/observability/atif.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

crates/cli/src/alignment/mod.rs
crates/cli/tests/coverage/alignment_tests.rs
crates/core/src/observability/atif.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

crates/cli/src/alignment/mod.rs
crates/cli/tests/coverage/alignment_tests.rs
crates/core/src/observability/atif.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

crates/cli/src/alignment/mod.rs
crates/cli/tests/coverage/alignment_tests.rs
crates/core/src/observability/atif.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/cli/tests/coverage/alignment_tests.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/cli/tests/coverage/alignment_tests.rs
crates/cli/tests/coverage/gateway_tests.rs
crates/core/tests/unit/atif_tests.rs

crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

crates/core/src/observability/atif.rs

{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs

crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs

crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs

crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

crates/core/src/observability/atif.rs
crates/core/tests/unit/atif_tests.rs

🔇 Additional comments (11)

crates/cli/src/alignment/mod.rs (2)

247-261: LGTM!

263-275: LGTM!

crates/cli/tests/coverage/alignment_tests.rs (2)

93-121: LGTM!

123-160: LGTM!

crates/cli/tests/coverage/gateway_tests.rs (1)

264-341: LGTM!

crates/core/src/observability/atif.rs (4)

1197-1197: LGTM!

1252-1252: LGTM!

1279-1332: LGTM!

1347-1364: LGTM!

crates/core/tests/unit/atif_tests.rs (2)

1938-1938: LGTM!

Also applies to: 1952-1955, 1963-1966, 1976-1976

2016-2114: Fix toolchain for cargo fmt, then run full Rust validation for this crates/core test change

cargo fmt --all failed with error: no such command: 'fmt' → ensure the Rust toolchain includes rustfmt (e.g., rustup component add rustfmt) and rerun cargo fmt --all

Run cargo clippy --workspace --all-targets -- -D warnings

Run just test-rust

Since crates/core changed, also run the full language matrix (Rust/Python/Go/Node.js/WebAssembly) per the project guidelines


bbednarski9 · 2026-05-29T21:27:26Z

/ok to test efd8036

willkill07 · 2026-05-29T21:48:14Z

/merge

#### Overview

Fixes the Hermes gateway session fallback and tightens ATIF LLM dedupe so complementary hook/gateway spans are only collapsed when they represent the same physical request.

- [x] I confirm this contribution is my own work, or I have the right to submit it under this project's license.
- [x] I searched existing issues and open pull requests, and this does not duplicate existing work.

#### Details

- Uses the OpenAI-compatible request body session_id as a gateway fallback when explicit session headers are absent.
- Keeps the existing explicit Claude/Codex session fallbacks ahead of the OpenAI body fallback.
- Requires complementary hook/gateway LLM spans to share a request signature or strong request correlation key before ATIF dedupes them.
- Adds regression coverage for gateway fallback selection and concurrent overlapping LLM spans that should remain distinct.

#### Where should the reviewer start?

Start with `crates/cli/src/alignment/mod.rs` for the gateway fallback behavior, then review `crates/core/src/observability/atif.rs` for the strengthened complementary hook/gateway dedupe guard. The focused regression tests are in `crates/cli/tests/coverage/alignment_tests.rs`, `crates/cli/tests/coverage/gateway_tests.rs`, and `crates/core/tests/unit/atif_tests.rs`.

#### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

- Closes NVIDIA#176

## Summary by CodeRabbit

* **Bug Fixes**
  * Session ID resolution enhanced to properly support OpenAI-compatible API request formats, including additional fallback to request body identifiers
  * LLM span correlation and deduplication logic improved with request-level identifier matching, enabling more accurate observability tracking and better event correlation for request tracing

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NeMo-Relay/pull/189?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

Authors:
  - Bryan Bednarski (https://github.com/bbednarski9)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: NVIDIA#189

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

bbednarski9 added 2 commits May 29, 2026 14:12

fix(atif): require request identity for complementary LLM dedupe

d1c6dc0

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>

bbednarski9 requested a review from a team as a code owner May 29, 2026 21:18

github-actions Bot added the size:M (PR is medium) and Bug (issue describes bug; PR fixes bug) labels May 29, 2026

github-actions Bot added the lang:rust (PR changes/introduces Rust code) label May 29, 2026

copy-pr-bot Bot temporarily deployed to fern May 29, 2026 21:18 Inactive

willkill07 approved these changes May 29, 2026


willkill07 assigned bbednarski9 May 29, 2026

willkill07 added this to the 0.3 milestone May 29, 2026

coderabbitai Bot reviewed May 29, 2026


Comment thread crates/core/tests/unit/atif_tests.rs

Merge branch 'release/0.3' into bbednarski/issue-176-gateway-session-fallback-clean

efd8036

copy-pr-bot Bot temporarily deployed to fern May 29, 2026 21:27 Inactive

rapids-bot Bot merged commit 40bd7a0 into NVIDIA:release/0.3 May 29, 2026
70 checks passed

rapids-bot Bot temporarily deployed to fern May 29, 2026 21:48 Inactive

rapids-bot[bot] merged 3 commits into NVIDIA:release/0.3 from bbednarski9:bbednarski/issue-176-gateway-session-fallback-clean

2 participants
