fix: repair Hermes gateway session fallback#189
Conversation
Add a gateway session-id fallback for OpenAI-compatible requests that carry a top-level in the request body. Explicit relay headers, Claude headers, and Codex prompt-cache routing still take precedence. This lets CLI/plugin harnesses keep gateway-observed LLM calls aligned to the intended session even when no relay session header is available. Also adds regression coverage for: - OpenAI chat completions body Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
WalkthroughThe PR improves request identification across two layers: CLI routing now falls back to a generic OpenAI-compatible ChangesRequest Correlation Enhancements
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Caution Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted. Error details |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/core/tests/unit/atif_tests.rs`:
- Around line 2100-2113: The test currently asserts only that each function name
appears more than zero times; change the assertions to check exact counts (2
each) by reusing the existing closure function_name_count and asserting
function_name_count("openrouter") == 2 and
function_name_count("openai.chat_completions") == 2 so that with
trajectory.steps.len() == 4 you validate each function_name in
step.extra.ancestry appears exactly twice.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 1d2c80b3-1794-48b2-8abb-761cda1f4e38
📒 Files selected for processing (5)
crates/cli/src/alignment/mod.rscrates/cli/tests/coverage/alignment_tests.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/src/observability/atif.rscrates/core/tests/unit/atif_tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Check / Run
- GitHub Check: Preview docs
🧰 Additional context used
📓 Path-based instructions (11)
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)
Use
snake_casenaming convention for Rust identifiers (e.g.,nemo_relay_tool_call)
**/*.rs: Any Rust change must runjust test-rust
Any Rust change must runcargo fmt --all
Any Rust change must runcargo clippy --workspace --all-targets -- -D warnings
**/*.rs: Runcargo fmt --allfor all FFI work since it is Rust work
Runjust test-rustto validate FFI changes
Runcargo clippy --workspace --all-targets -- -D warningsto enforce strict linting on FFI workWhen Rust files changed as part of Go work, also run
cargo fmt --all,just test-rust, andcargo clippy --workspace --all-targets -- -D warnings
**/*.rs: Runcargo fmt --allwhen Rust files are changed as part of Node work
Runcargo clippy --workspace --all-targets -- -D warningswhen Rust files are changed as part of Node work
Runjust test-rustwhen Rust files are changed as part of Node work
**/*.rs: Runcargo fmt --allto format all Rust code
Runcargo clippy --workspace --all-targets -- -D warningsto enforce all clippy lints as errors
**/*.rs: Runcargo fmt --allwhen Rust files changed as part of WebAssembly work
Runcargo clippy --workspace --all-targets -- -D warningswhen Rust files changed as part of WebAssembly work
**/*.rs: If any Rust code changed, always runjust test-rust
If any Rust code changed, also runcargo fmt --all
If any Rust code changed, also runcargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting withcargo fmt --all
Run Rust linting withcargo clippy --workspace --all-targets -- -D warnings
**/*.rs: Keep SPDX headers on Rust source files. The project is Apache-2.0.
Usesnake_casefor Rust binding naming conventions.
UseJson = serde_json::Valuein Rust-facing runtime APIs where the existing code expects JSON payloads.
UseResult<T>withFlowErrorin core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.
Preserve async behavior on the existing tokio-based model i...
Files:
crates/cli/src/alignment/mod.rscrates/cli/tests/coverage/alignment_tests.rscrates/core/src/observability/atif.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
**/{Cargo.toml,**/*.rs}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Maintain consistency between Rust package names in
Cargo.tomland their actual usage across the codebase
Files:
crates/cli/src/alignment/mod.rscrates/cli/tests/coverage/alignment_tests.rscrates/core/src/observability/atif.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
**/*.{h,hpp,c,cpp,rs}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Ensure FFI header and library naming follows consistent conventions across platform-specific builds
Files:
crates/cli/src/alignment/mod.rscrates/cli/tests/coverage/alignment_tests.rscrates/core/src/observability/atif.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
**/*.{rs,toml}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update Rust crate names and module prefixes during coordinated rename operations
Files:
crates/cli/src/alignment/mod.rscrates/cli/tests/coverage/alignment_tests.rscrates/core/src/observability/atif.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Maintain documented and tested validation and report behavior for adaptive surfaces
Files:
crates/cli/tests/coverage/alignment_tests.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
crates/cli/tests/coverage/alignment_tests.rscrates/cli/tests/coverage/gateway_tests.rscrates/core/tests/unit/atif_tests.rs
crates/core/src/observability/{atif,otel,openinference}.rs
📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)
When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in
crates/core/src/observability/atif.rs,crates/core/src/observability/otel.rs, andcrates/core/src/observability/openinference.rsin sync
Files:
crates/core/src/observability/atif.rs
{crates/core,crates/adaptive}/**/*
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changes to
crates/coreorcrates/adaptivemust run the full language matrix
Files:
crates/core/src/observability/atif.rscrates/core/tests/unit/atif_tests.rs
crates/core/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)
If the change touched
crates/coreor shared runtime semantics, also usevalidate-changefor broader validation
Files:
crates/core/src/observability/atif.rscrates/core/tests/unit/atif_tests.rs
crates/{core,adaptive}/**
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
If
crates/coreorcrates/adaptivechanged, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly
Files:
crates/core/src/observability/atif.rscrates/core/tests/unit/atif_tests.rs
crates/{core,adaptive}/**/*.rs
⚙️ CodeRabbit configuration file
crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.
Files:
crates/core/src/observability/atif.rscrates/core/tests/unit/atif_tests.rs
🔇 Additional comments (11)
crates/cli/src/alignment/mod.rs (2)
247-261: LGTM!
263-275: LGTM!crates/cli/tests/coverage/alignment_tests.rs (2)
93-121: LGTM!
123-160: LGTM!crates/cli/tests/coverage/gateway_tests.rs (1)
264-341: LGTM!crates/core/src/observability/atif.rs (4)
1197-1197: LGTM!
1252-1252: LGTM!
1279-1332: LGTM!
1347-1364: LGTM!crates/core/tests/unit/atif_tests.rs (2)
1938-1938: LGTM!Also applies to: 1952-1955, 1963-1966, 1976-1976
2016-2114: Fix toolchain forcargo fmt, then run full Rust validation for thiscrates/coretest change
cargo fmt --allfailed witherror: no such command: 'fmt'→ ensure the Rust toolchain includes rustfmt (e.g.,rustup component add rustfmt) and reruncargo fmt --all- Run
cargo clippy --workspace --all-targets -- -D warnings- Run
just test-rust- Since
crates/corechanged, also run the full language matrix (Rust/Python/Go/Node.js/WebAssembly) per the project guidelines
|
/ok to test efd8036 |
|
/merge |
#### Overview Fixes the Hermes gateway session fallback and tightens ATIF LLM dedupe so complementary hook/gateway spans are only collapsed when they represent the same physical request. - [x] I confirm this contribution is my own work, or I have the right to submit it under this project's license. - [x] I searched existing issues and open pull requests, and this does not duplicate existing work. #### Details - Uses the OpenAI-compatible request body session_id as a gateway fallback when explicit session headers are absent. - Keeps the existing explicit Claude/Codex session fallbacks ahead of the OpenAI body fallback. - Requires complementary hook/gateway LLM spans to share a request signature or strong request correlation key before ATIF dedupes them. - Adds regression coverage for gateway fallback selection and concurrent overlapping LLM spans that should remain distinct. #### Where should the reviewer start? Start with `crates/cli/src/alignment/mod.rs` for the gateway fallback behavior, then review `crates/core/src/observability/atif.rs` for the strengthened complementary hook/gateway dedupe guard. The focused regression tests are in `crates/cli/tests/coverage/alignment_tests.rs`, `crates/cli/tests/coverage/gateway_tests.rs`, and `crates/core/tests/unit/atif_tests.rs`. #### Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to) - Closes NVIDIA#176 ## Summary by CodeRabbit * **Bug Fixes** * Session ID resolution enhanced to properly support OpenAI-compatible API request formats, including additional fallback to request body identifiers * LLM span correlation and deduplication logic improved with request-level identifier matching, enabling more accurate observability tracking and better event correlation for request tracing [](https://app.coderabbit.ai/change-stack/NVIDIA/NeMo-Relay/pull/189?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) Authors: - Bryan Bednarski (https://github.com/bbednarski9) Approvers: - Will Killian (https://github.com/willkill07) URL: NVIDIA#189 Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Overview
Fixes the Hermes gateway session fallback and tightens ATIF LLM dedupe so complementary hook/gateway spans are only collapsed when they represent the same physical request.
Details
Where should the reviewer start?
Start with
crates/cli/src/alignment/mod.rsfor the gateway fallback behavior, then reviewcrates/core/src/observability/atif.rsfor the strengthened complementary hook/gateway dedupe guard. The focused regression tests are incrates/cli/tests/coverage/alignment_tests.rs,crates/cli/tests/coverage/gateway_tests.rs, andcrates/core/tests/unit/atif_tests.rs.Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit