
fix: keep late mark spans in parent traces #296

Merged
rapids-bot[bot] merged 2 commits into NVIDIA:main from bbednarski9:bbednarski/RELAY-360
Jun 24, 2026

Conversation

@bbednarski9 bbednarski9 commented Jun 24, 2026

Contributor

Overview

This PR keeps late mark spans attached to the trace of their completed parent scope in the OpenTelemetry and OpenInference exporters. When a mark event arrives after the parent span has already ended, the exporters now reuse the completed parent's span context instead of starting a new root trace.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Add a bounded completed-span-context cache to the OpenTelemetry event processor.
  • Add the same completed-span-context cache to the OpenInference event processor.
  • Preserve the existing active-parent lookup path before falling back to completed parent contexts.
  • Preserve the existing true-orphan mark behavior when no active or completed parent context exists.
  • Add regression tests that cover mark events emitted after a tool span has completed.
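
The lookup order described in the list above can be sketched roughly as follows. This is a minimal illustration, not the exporter code: `SpanCtx`, `ParentResolution`, and `resolve_parent` are hypothetical names, and plain `HashMap`s stand in for the processors' real active-span and completed-span state.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an exporter span context (assumption, not the SDK type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanCtx {
    pub trace_id: u128,
    pub span_id: u64,
}

/// How the parent of a mark event was resolved (hypothetical enum for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum ParentResolution {
    /// The parent scope is still open, so its live span is used.
    Active(SpanCtx),
    /// The parent scope already ended; its context came from the completed cache.
    Completed(SpanCtx),
    /// Neither lookup matched, so the mark keeps the existing orphan behavior.
    Orphan,
}

/// Check active spans first, then completed contexts, then fall back to orphan.
pub fn resolve_parent(
    parent_uuid: &str,
    active: &HashMap<String, SpanCtx>,
    completed: &HashMap<String, SpanCtx>,
) -> ParentResolution {
    if let Some(ctx) = active.get(parent_uuid) {
        return ParentResolution::Active(*ctx);
    }
    if let Some(ctx) = completed.get(parent_uuid) {
        return ParentResolution::Completed(*ctx);
    }
    ParentResolution::Orphan
}
```

Keeping the active lookup first means the new cache only changes the outcome for marks that previously would have started a new root trace.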

Validation:

  • cargo fmt --check
  • cargo test -p nemo-relay observability::otel::tests::
  • cargo test -p nemo-relay observability::openinference::tests::
  • cargo test -p nemo-relay late_parented_marks_reuse_completed_parent_trace_context

Where should the reviewer start?

Start with crates/core/src/observability/otel.rs and crates/core/src/observability/openinference.rs, specifically the parent context lookup and completed span context cache. The matching regression tests are in crates/core/tests/unit/observability/otel_tests.rs and crates/core/tests/unit/observability/openinference_tests.rs.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to: RELAY-360

Summary by CodeRabbit

  • Bug Fixes

    • Improved observability span lineage for late or out-of-order events by reusing trace context from recently completed spans, ensuring correct parent-child relationships and accurate remote-parent handling.
    • OTEL NeMo Relay now preserves trace parentage for orphan/late “End” and “Mark” scenarios using a bounded in-memory cache.
  • Tests

    • Added regression coverage to verify marks correctly reuse the completed parent’s trace context.
    • Added unit coverage for cache bookkeeping when spans are restarted.

Cache completed span contexts in the OpenTelemetry and OpenInference event processors so mark events that arrive after their parent scope has ended still inherit the original trace context.

Keep true orphan marks on the existing fallback path, bound the completed-context cache, and add regression coverage for late parented marks after tool scope completion.

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
@bbednarski9 bbednarski9 requested a review from a team as a code owner June 24, 2026 01:45
@github-actions github-actions Bot added the size:M (PR is medium), Bug (issue describes bug; PR fixes bug), and lang:rust (PR changes/introduces Rust code) labels Jun 24, 2026
@coderabbitai

coderabbitai Bot commented Jun 24, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: e8a093e1-709a-4449-a5cb-5fa975165502

📥 Commits

Reviewing files that changed from the base of the PR and between 2188e3d and 080c520.

📒 Files selected for processing (4)
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
  • crates/core/tests/unit/observability/otel_tests.rs
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (15)
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

  1. Scope stacks decide where work belongs and which scope-local behavior is visible.
  2. Middleware registries decide what guardrails and intercepts run around managed calls.
  3. Plugins install reusable runtime behavior from configuration.
  4. Events record runtime behavior in ATOF form.
  5. Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.

crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • crates/core/tests/unit/observability/otel_tests.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
🔇 Additional comments (5)
crates/core/src/observability/openinference.rs (2)

511-517: 📐 Maintainability & Code Quality

Confirm required Rust/core validation matrix before merge.

Please provide the run results for just test-rust, cargo fmt --all, cargo clippy --workspace --all-targets -- -D warnings, and the full binding test matrix (just test-python, just test-go, just test-node, just test-wasm) for this crates/core change.

As per coding guidelines, “If any Rust code changed, always run just test-rust ... cargo clippy --workspace --all-targets -- -D warnings” and “{crates/core,crates/adaptive}/**/*: Changes to crates/core or crates/adaptive must run the full language matrix.”

Also applies to: 543-567

Source: Coding guidelines


543-567: LGTM!

Also applies to: 597-606, 624-643

crates/core/src/observability/otel.rs (1)

500-511: LGTM!

Also applies to: 537-555, 588-597, 615-634

crates/core/tests/unit/observability/openinference_tests.rs (1)

1811-1888: LGTM!

Also applies to: 1890-1946

crates/core/tests/unit/observability/otel_tests.rs (1)

807-864: LGTM!


Walkthrough

Both OpenInferenceEventProcessor and OtelEventProcessor gain a bounded cache (completed_span_contexts + completed_span_order) that stores SpanContext for recently ended spans. On Start, stale cached entries are evicted; on End, the context is recorded. parent_context now falls back to this cache for late/orphan events. Two regression tests verify the behavior.
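
A bounded map-plus-queue cache of this kind might look like the sketch below. It is an assumption-laden illustration rather than the merged code: `CompletedSpanCache` and `SpanCtx` are hypothetical types, string UUIDs are used for brevity, and the limit is a constructor argument instead of the `COMPLETED_SPAN_CONTEXT_LIMIT` constant whose value is not shown here.

```rust
use std::collections::{HashMap, VecDeque};

/// Simplified stand-in for an exporter span context (assumption, not the SDK type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanCtx {
    pub trace_id: u128,
    pub span_id: u64,
}

/// Bounded cache of contexts for recently ended spans, evicted oldest-first.
pub struct CompletedSpanCache {
    limit: usize,
    contexts: HashMap<String, SpanCtx>,
    order: VecDeque<String>,
}

impl CompletedSpanCache {
    pub fn new(limit: usize) -> Self {
        Self { limit, contexts: HashMap::new(), order: VecDeque::new() }
    }

    /// Drop a UUID from both the map and the eviction queue (e.g. when a span restarts).
    pub fn remove(&mut self, uuid: &str) -> Option<SpanCtx> {
        let removed = self.contexts.remove(uuid);
        if removed.is_some() {
            self.order.retain(|queued| queued.as_str() != uuid);
        }
        removed
    }

    /// Record the context of a span that just ended, evicting the oldest entries over the limit.
    pub fn record(&mut self, uuid: &str, ctx: SpanCtx) {
        // Re-recording a UUID moves it to the back of the queue.
        self.remove(uuid);
        self.contexts.insert(uuid.to_string(), ctx);
        self.order.push_back(uuid.to_string());
        while self.order.len() > self.limit {
            if let Some(oldest) = self.order.pop_front() {
                self.contexts.remove(&oldest);
            }
        }
    }

    /// Look up the context of a completed span, if it is still cached.
    pub fn get(&self, uuid: &str) -> Option<SpanCtx> {
        self.contexts.get(uuid).copied()
    }

    pub fn len(&self) -> usize {
        self.contexts.len()
    }
}
```

The `retain` call is linear in the queue length, which is usually acceptable for a small fixed limit; a larger cache might prefer a structure with cheaper removal.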

Changes

Completed span context cache for late-parented events

| Layer | File(s) | Summary |
|---|---|---|
| Cache state, constants, and initialization | crates/core/src/observability/openinference.rs, crates/core/src/observability/otel.rs | Both processors import VecDeque, define COMPLETED_SPAN_CONTEXT_LIMIT, add completed_span_contexts and completed_span_order fields to track recently finished spans with FIFO eviction order, and initialize them in constructors. |
| Cache management helpers | crates/core/src/observability/openinference.rs, crates/core/src/observability/otel.rs | Implement remove_completed_span_context to delete cache entries and their eviction-order tracking, and record_completed_span_context to insert completed contexts and enforce bounded cache size by evicting oldest entries. |
| Span lifecycle wiring | crates/core/src/observability/openinference.rs, crates/core/src/observability/otel.rs | process_start removes stale cached contexts for the incoming UUID; process_end records the finished span's SpanContext into the cache; parent_context falls back to the completed cache to construct a remote parent when no active span matches the parent UUID. |
| Regression tests | crates/core/tests/unit/observability/openinference_tests.rs, crates/core/tests/unit/observability/otel_tests.rs | One test per subscriber validates late-parented marks: emits a tool span (start/end), fires a mark referencing the completed tool UUID, then asserts trace-id continuity, correct parent span ID linkage, and nemo_relay.mark.orphan = "true". OTEL includes an additional internal state test for cache cleanup on span restart. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Title follows Conventional Commits format with 'fix' type and provides a concise, imperative summary of the change under 72 characters. |
| Description check | ✅ Passed | Description includes all required sections from template: Overview with contribution confirmation, Details listing key changes, Where to start guidance, and Related Issues with action keyword. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/core/src/observability/openinference.rs`:
- Around line 543-544: In the process_start method, when removing an event UUID
from completed_span_contexts, also remove that same UUID from the
completed_span_order queue to maintain synchronization between both data
structures. This prevents stale queue entries from later evicting fresh context
entries and causing loss of parent linkage for completed spans. Additionally,
review and apply the same synchronization fix to any similar span context
removal operations in the code around lines 624-637 where the same issue may
exist.

In `@crates/core/src/observability/otel.rs`:
- Around line 537-538: The process_start method removes entries from
completed_span_contexts but does not remove the corresponding UUID from
completed_span_order queue, causing synchronization issues. When removing an
event UUID from completed_span_contexts, also remove it from
completed_span_order to keep both data structures synchronized. Apply the same
fix to all locations where entries are removed from completed_span_contexts
(including the sections noted around lines 615-628) to ensure the queue and map
remain consistent and prevent stale queue entries from evicting fresh contexts.

In `@crates/core/tests/unit/observability/openinference_tests.rs`:
- Around line 1754-1809: Add a new test function that verifies the bounded
eviction behavior of the completed span context cache. The test should create
and complete more than COMPLETED_SPAN_CONTEXT_LIMIT + 1 spans, then create mark
events parented to both the oldest (evicted) completed span and a recent
completed span. Add assertions to verify that the mark event for the oldest span
does not link to its parent (indicating cache eviction) while the mark event for
a recent span successfully reuses the completed parent trace context. This test
should follow the same pattern as the existing
late_parented_marks_reuse_completed_parent_trace_context test but specifically
validate the cache boundary behavior when the limit is exceeded.

In `@crates/core/tests/unit/observability/otel_tests.rs`:
- Around line 755-805: Add a new regression test after
late_parented_marks_reuse_completed_parent_trace_context that covers the bounded
eviction behavior. The test should create and complete
COMPLETED_SPAN_CONTEXT_LIMIT + 1 span events, then create mark events
referencing both the oldest completed span and a recent completed span. Verify
that the mark for the oldest span becomes a true orphan with distinct trace_id
from its parent and the orphan attribute set to true, while the mark for a
recent span still reuses the parent's trace_id and has the correct
parent_span_id relationship. This ensures the cache boundary behavior is
properly tested alongside the happy path.
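
The synchronization hazard raised in the inline comments can be reproduced with a small model. The sketch below is hypothetical (`Cache`, `remove_map_only`, `remove_synced`, and `tool_context_survives` are illustrative names, and span contexts are reduced to a `u64`); it only shows why a stale queue entry can evict a fresh context when removal touches the map but not the queue.

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal model of the completed-context cache: a map plus a FIFO eviction queue.
pub struct Cache {
    limit: usize,
    contexts: HashMap<String, u64>,
    order: VecDeque<String>,
}

impl Cache {
    pub fn new(limit: usize) -> Self {
        Self { limit, contexts: HashMap::new(), order: VecDeque::new() }
    }

    /// Record a completed span and evict from the front while over the limit.
    pub fn record(&mut self, uuid: &str, span_id: u64) {
        self.contexts.insert(uuid.to_string(), span_id);
        self.order.push_back(uuid.to_string());
        while self.order.len() > self.limit {
            if let Some(oldest) = self.order.pop_front() {
                self.contexts.remove(&oldest);
            }
        }
    }

    /// Problematic removal: the map entry goes away but the queue entry stays behind.
    pub fn remove_map_only(&mut self, uuid: &str) {
        self.contexts.remove(uuid);
    }

    /// Synchronized removal: the map and the queue are updated together.
    pub fn remove_synced(&mut self, uuid: &str) {
        if self.contexts.remove(uuid).is_some() {
            self.order.retain(|queued| queued.as_str() != uuid);
        }
    }

    pub fn contains(&self, uuid: &str) -> bool {
        self.contexts.contains_key(uuid)
    }
}

/// Replay End, Start, End for "tool" and then End for "other" with a limit of 2,
/// and report whether the fresh "tool" context is still cached afterwards.
pub fn tool_context_survives(sync_on_start: bool) -> bool {
    let mut cache = Cache::new(2);
    cache.record("tool", 1); // first End
    if sync_on_start {
        cache.remove_synced("tool"); // Start clears both structures
    } else {
        cache.remove_map_only("tool"); // Start leaves a stale queue entry
    }
    cache.record("tool", 2); // second End
    cache.record("other", 3); // unrelated End pushes the queue over the limit
    cache.contains("tool")
}
```

With map-only removal the queue holds two entries for "tool", so the eviction triggered by "other" pops the stale one and deletes the fresh context; with synchronized removal the fresh context stays cached.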

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: de2008ad-b8f9-464a-8cd7-c94facab5ea2

📥 Commits

Reviewing files that changed from the base of the PR and between 02f12c9 and 2188e3d.

📒 Files selected for processing (4)
  • crates/core/src/observability/openinference.rs
  • crates/core/src/observability/otel.rs
  • crates/core/tests/unit/observability/openinference_tests.rs
  • crates/core/tests/unit/observability/otel_tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout. (2)
  • GitHub Check: Check / Run
  • GitHub Check: Preview docs
🔇 Additional comments (3)
crates/core/src/observability/openinference.rs (2)

19-19: 📐 Maintainability & Code Quality

Confirm the required Rust/core validation.

Please confirm this PR ran the required Rust checks plus the crates/core language matrix, not just the focused observability tests.

As per coding guidelines, “Any Rust change must run just test-rust”, “Run Rust formatting with cargo fmt --all”, and “Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings”.
As per path instructions, “Changes to crates/core or crates/adaptive must run the full language matrix”.

Sources: Coding guidelines, Path instructions


48-49: LGTM!

Also applies to: 503-517, 561-561, 597-604

crates/core/src/observability/otel.rs (1)

19-19: LGTM!

Also applies to: 42-43, 497-511, 555-555, 588-595

@mnajafian-nv

Contributor

This seems like the right small fix to land ahead of the larger exporter/normalization refactors. One thing I would like us to preserve when #291/#293 rebase over this area is the completed-parent context fallback, so late mark spans continue to stay enclosed in the original parent trace.
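The lookup order that this fallback implies can be sketched as follows. This is only an illustration under stated assumptions, not the code in `otel.rs` or `openinference.rs`: `SpanCtx`, `ParentLookup`, and `resolve_parent` are hypothetical names, and plain `HashMap`s stand in for whatever the exporters use to track active and completed spans.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an exporter span context (trace id + span id).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpanCtx {
    pub trace_id: u128,
    pub span_id: u64,
}

/// Where the parent context for a mark event was found.
#[derive(Debug, PartialEq, Eq)]
pub enum ParentLookup {
    /// The parent span is still open.
    Active(SpanCtx),
    /// The parent span already ended, but its context was cached.
    Completed(SpanCtx),
    /// No active or completed parent is known: a true orphan mark.
    Orphan,
}

/// Resolve the parent of a mark event: active spans first, then completed
/// span contexts, otherwise treat the mark as an orphan.
pub fn resolve_parent(
    parent_uuid: Option<&str>,
    active: &HashMap<String, SpanCtx>,
    completed: &HashMap<String, SpanCtx>,
) -> ParentLookup {
    let Some(uuid) = parent_uuid else {
        return ParentLookup::Orphan;
    };
    if let Some(ctx) = active.get(uuid) {
        return ParentLookup::Active(*ctx);
    }
    if let Some(ctx) = completed.get(uuid) {
        return ParentLookup::Completed(*ctx);
    }
    ParentLookup::Orphan
}
```

In this sketch a late mark whose parent is found only in `completed` would reuse that parent's `trace_id` instead of starting a new root trace, which is the behavior the comment asks later refactors to preserve.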

@mnajafian-nv mnajafian-nv left a comment

Contributor

LGTM, pending addressing inline suggestions and green CI :)

@willkill07 willkill07 left a comment

Member

The CodeRabbit feedback seems sensible to resolve. Otherwise LGTM.

Remove completed span UUIDs from both the context map and FIFO order queue when a span restarts, preventing stale queue entries from later evicting fresh parent contexts.

Add regression coverage for restart synchronization and OpenInference completed-context cache eviction behavior.

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
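A minimal sketch of the synchronization described in this commit message, assuming a `HashMap` for the contexts and a `VecDeque` for the FIFO order. The type and method names (`CompletedContexts`, `record_completed`, `forget`) are hypothetical and the real exporter code may be structured differently.

```rust
use std::collections::{HashMap, VecDeque};

/// Bounded FIFO cache of completed span contexts, keyed by span UUID.
/// `C` is whatever context type the exporter stores (hypothetical here).
pub struct CompletedContexts<C> {
    capacity: usize,
    contexts: HashMap<String, C>,
    order: VecDeque<String>,
}

impl<C> CompletedContexts<C> {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            contexts: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Record a span that just ended, evicting the oldest entries when full.
    pub fn record_completed(&mut self, uuid: &str, ctx: C) {
        // Drop any earlier entry so the queue holds each UUID at most once.
        self.forget(uuid);
        if self.capacity == 0 {
            return;
        }
        while self.order.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.contexts.remove(&oldest);
            }
        }
        self.order.push_back(uuid.to_string());
        self.contexts.insert(uuid.to_string(), ctx);
    }

    /// Called when a span with this UUID starts again: remove it from both
    /// the map and the FIFO queue so no stale queue entry survives.
    pub fn forget(&mut self, uuid: &str) {
        if self.contexts.remove(uuid).is_some() {
            self.order.retain(|queued| queued.as_str() != uuid);
        }
    }

    /// Look up the cached context of a completed span, if still retained.
    pub fn get(&self, uuid: &str) -> Option<&C> {
        self.contexts.get(uuid)
    }
}
```

If `forget` removed only the map entry, the old UUID would stay in `order`. A later eviction could then pop that stale entry and delete a context that was recorded more recently under the same UUID, which is the failure mode the commit message describes.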
@bbednarski9

Contributor Author

Ack @mnajafian-nv, I will check for conflicts on those PRs once this is in and comment if anything looks off.

@mnajafian-nv mnajafian-nv left a comment

Contributor

Thanks, LGTM.

@willkill07 willkill07 added this to the 0.5 milestone Jun 24, 2026
@willkill07

Member

/merge

@rapids-bot rapids-bot Bot merged commit fdbf2f0 into NVIDIA:main Jun 24, 2026
70 of 72 checks passed
@bbednarski9

Contributor Author

Thanks @willkill07

Labels

  • Bug: issue describes bug; PR fixes bug
  • lang:rust: PR changes/introduces Rust code
  • size:M: PR is medium


3 participants