NVIDIA NeMo Relay

NVIDIA NeMo Relay helps you see and control what happens inside agent runs without rewriting the agent stack you already have. It gives coding agents, applications, framework integrations, middleware, and observability backends a shared runtime for scopes, policy, plugins, and lifecycle events.

Where To Start

Goal	Start With...
Observe Codex, Claude Code, Cursor, or Hermes locally via CLI	Quick Start CLI
Instrument app-owned LLM or tool calls	Quick Start Application
Use LangChain, LangGraph, Deep Agents, or OpenClaw	Supported Integrations
Build a framework or provider integration	Integrate into Frameworks
Export ATOF, ATIF, OpenTelemetry, or OpenInference	Observability Plugin
Package reusable middleware or exporters	Build Plugins
Develop or test this repository from source	CONTRIBUTING.md

Quick Start CLI

A good first step is to record a real agent run on disk. Once Relay is writing raw events and a trajectory file, there is something concrete to inspect, debug, and build from.

Local Agent Trajectory

This walkthrough covers a minimal end-to-end setup. Install nemo-relay-cli, turn on the local exporters, run either Codex or Claude Code through Relay, and check that Relay wrote both raw events and normalized trajectories.

1. Install the CLI

cargo install nemo-relay-cli

If you use cargo-binstall, you can also install the CLI with:

cargo binstall nemo-relay-cli

2. Enable Local Observability Output

From the project directory you want to observe, open the project-scoped plugin editor:

nemo-relay plugins edit --project

The editor creates or updates the nearest project plugin file at .nemo-relay/plugins.toml. In the menu:

- Enable the Observability component.
- Open ATOF and toggle the section [on]. Optionally set:
  - output_directory to .nemo-relay/atof
  - filename to events.jsonl
  - mode to overwrite
- Open ATIF and toggle the section [on]. Optionally set:
  - output_directory to .nemo-relay/atif
  - filename_template to trajectory-{session_id}.json
- Press p to preview the generated TOML.
- Press s to save.

Note

Use nemo-relay plugins edit without --project only if you want these exporter settings in a user-level Relay config instead of a specific project.

3. Run Codex or Claude Code Through Relay

Use whichever host CLI is installed on your machine. For example:

nemo-relay codex -- exec "Summarize this repository."

nemo-relay claude -- "Summarize this repository."

Refer to the full Quick Start CLI docs for more options.

The transparent wrapper starts a local Relay gateway, injects host-specific hook and provider settings for that launched process, then shuts the gateway down when the agent exits.
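For scripted or repeated runs, the two invocations above can be launched from Python. This is a minimal sketch that only builds the argument lists shown in this section; the function names are hypothetical, and it makes no assumption about additional CLI flags.

```python
from __future__ import annotations

import shutil
import subprocess


def relay_command(host: str, prompt: str) -> list[str]:
    """Build the argv for the two invocations shown in this section."""
    if host == "codex":
        return ["nemo-relay", "codex", "--", "exec", prompt]
    if host == "claude":
        return ["nemo-relay", "claude", "--", prompt]
    raise ValueError(f"unsupported host in this sketch: {host!r}")


def run_through_relay(host: str, prompt: str) -> int:
    """Run the host CLI through Relay and return the nemo-relay exit code."""
    if shutil.which("nemo-relay") is None:
        raise FileNotFoundError("nemo-relay is not on PATH; install the CLI first")
    # check=False so the caller decides how to treat a non-zero exit code.
    return subprocess.run(relay_command(host, prompt), check=False).returncode
```

For example, run_through_relay("codex", "Summarize this repository.") runs the same command as the first line above and blocks until the agent exits.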

Warning

Codex users may need to review and activate generated hooks before events appear. Using the Codex Desktop App also adds further complications. Refer to the Codex CLI guide for the current hook activation caveat and troubleshooting steps.

4. Verify the Run

After the run exits, check that raw events and trajectory files were written. If you used the optional output directories and file names from step 2:

test -s .nemo-relay/atof/events.jsonl
ls .nemo-relay/atif/*.json
for file in .nemo-relay/atif/*.json; do
  python3 -m json.tool "$file" >/dev/null
done

Then verify that at least one raw ATOF 0.1 event exists:

python3 - <<'PY'
from pathlib import Path
import json

events_path = Path(".nemo-relay/atof/events.jsonl")
events = [
    json.loads(line)
    for line in events_path.read_text().splitlines()
    if line.strip()
]

assert events, "no ATOF events were written"
assert any(event.get("atof_version") == "0.1" for event in events), "no ATOF 0.1 events found"
print(f"validated {len(events)} ATOF event(s)")
PY

A successful run creates two kinds of output to inspect:

- .nemo-relay/atof/events.jsonl as the raw canonical event stream.
- One or more .nemo-relay/atif/*.json trajectory files for analysis and evaluation workflows.

Tip

If raw ATOF events exist but LLM spans are missing, provider traffic probably isn't flowing through the Relay gateway. If ATIF is missing, make sure the agent session or turn ended and the output directory is writable.
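The file-level checks from this tip can be automated. The sketch below is illustrative only: the function name is hypothetical, it assumes the optional output paths from step 2, and it cannot detect missing LLM spans because that would require knowledge of the event schema.

```python
from __future__ import annotations

import os
from pathlib import Path


def diagnose_outputs(project_root: Path) -> list[str]:
    """Return hints for missing or unwritable Relay output locations."""
    hints: list[str] = []
    atof_file = project_root / ".nemo-relay" / "atof" / "events.jsonl"
    atif_dir = project_root / ".nemo-relay" / "atif"

    if not atof_file.is_file() or atof_file.stat().st_size == 0:
        hints.append("no raw ATOF events: check that the exporter is on and hooks are active")

    if not atif_dir.is_dir():
        hints.append("ATIF directory is missing: check that the ATIF exporter is on")
    elif not os.access(atif_dir, os.W_OK):
        hints.append("ATIF directory is not writable")
    elif not any(atif_dir.glob("*.json")):
        hints.append("no ATIF trajectory yet: make sure the agent session or turn ended")
    return hints
```

An empty list means both outputs exist at the expected paths; it does not say anything about their contents.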

Next Steps

Go to the full NeMo Relay CLI docs for persistent host plugin installation, gateway configuration, exporter options, and agent-specific diagnostics.

Tip

Start by trusting the raw Agent Trajectory Observability Format (ATOF) JSONL. It shows the lifecycle events Relay actually captured before anything is translated into Agent Trajectory Interchange Format (ATIF), OpenTelemetry, or OpenInference output.
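One way to get a first look at the raw JSONL without knowing its schema is to count which top-level keys appear across events. This is a minimal sketch with a hypothetical function name; the only field it could rely on from this document is atof_version, so it treats every key generically.

```python
from __future__ import annotations

from collections import Counter
from pathlib import Path
import json


def summarize_events(events_path: Path) -> tuple[int, Counter]:
    """Count events and how often each top-level key appears."""
    key_counts: Counter = Counter()
    total = 0
    with events_path.open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines between records
            event = json.loads(line)
            total += 1
            if isinstance(event, dict):
                key_counts.update(event.keys())
    return total, key_counts
```

Running summarize_events(Path(".nemo-relay/atof/events.jsonl")) and printing key_counts.most_common() shows which fields are present on every event and which appear only on some.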

Quick Start Applications

If you write the code that calls the model or tool, install the binding for your language and route that boundary through Relay directly.

Application Trajectory

Install Relay for the application language:

# Python
uv add nemo-relay

# Node.js
# Requires Node.js 24 or newer.
npm install nemo-relay-node

# Rust
cargo add nemo-relay

Then run a minimal example workflow for that binding:

What Relay Adds

Relay is the liaison between agent systems. A production application may combine NeMo Agent Toolkit, LangChain, LangGraph, provider SDKs, custom harness code, NeMo Guardrails, tracing systems, and evaluation pipelines. Relay gives those pieces one runtime contract instead of asking every layer to invent its own wrappers and trace vocabulary.

Relay gives those systems:

- Scopes so runs, turns, tools, LLM calls, and subagents have clear ownership, parent-child lineage, cleanup boundaries, and request isolation.
- Managed LLM and tool calls so the same lifecycle and middleware rules apply around each callback.
- Middleware for the places where Relay must block, sanitize, transform, route, retry, or replace execution.
- Plugins so reusable observability, guardrail, adaptive, and exporter behavior can be turned on from configuration.
- Events and subscribers so raw ATOF, normalized ATIF, OpenTelemetry, and OpenInference output all come from the same runtime stream.

Relay does not replace your framework, model provider, application logic, observability backend, or guardrail authoring system. It gives those systems a common boundary to meet.

flowchart LR
    App[Application, Framework, or CLI Harness]

    subgraph Runtime[NeMo Relay Runtime]
        direction TB
        Scopes[Scopes]
        Middleware[Middleware]
        Plugins[Plugins]
        Events[Lifecycle Events]
    end

    Output[Subscribers and Exporters]

    App --> Scopes
    App --> Middleware
    Plugins --> Middleware
    Scopes --> Events
    Middleware --> Events
    Events --> Output
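As a rough illustration of how scopes, middleware, and lifecycle events relate, here is a toy model in plain Python. It is not the nemo-relay API: every name (ToyRuntime, scope, call_tool, the event kinds) is hypothetical and exists only to show the pattern of nested scopes with parent-child lineage, middleware that can transform or block a call, and subscribers that all read one event stream.

```python
from __future__ import annotations

from contextlib import contextmanager
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Callable, Iterator, Optional

# Hypothetical type: middleware returns new args, or None to block the call.
Middleware = Callable[[str, dict], Optional[dict]]


@dataclass
class ToyRuntime:
    subscribers: list[Callable[[dict], None]] = field(default_factory=list)
    middleware: list[Middleware] = field(default_factory=list)
    _ids: Iterator[int] = field(default_factory=lambda: count(1))
    _stack: list[int] = field(default_factory=list)

    def emit(self, kind: str, **data: Any) -> None:
        """Send one lifecycle event to every subscriber."""
        event = {"kind": kind, **data}
        for subscriber in self.subscribers:
            subscriber(event)

    @contextmanager
    def scope(self, name: str) -> Iterator[int]:
        """Open a scope whose parent is the innermost scope still open."""
        scope_id = next(self._ids)
        parent = self._stack[-1] if self._stack else None
        self._stack.append(scope_id)
        self.emit("scope_start", id=scope_id, parent=parent, name=name)
        try:
            yield scope_id
        finally:
            # Cleanup boundary: the scope always closes, even on error.
            self._stack.pop()
            self.emit("scope_end", id=scope_id)

    def call_tool(self, name: str, args: dict, tool: Callable[..., Any]) -> Any:
        """Run a tool inside its own scope, after every middleware layer."""
        with self.scope(f"tool:{name}"):
            current: Optional[dict] = args
            for layer in self.middleware:
                current = layer(name, current)
                if current is None:
                    self.emit("tool_blocked", name=name)
                    return None
            result = tool(**current)
            self.emit("tool_result", name=name)
            return result
```

To try it, append a list's append method to subscribers, open a "run" scope, and call call_tool inside it. The recorded events show the tool scope carrying the run scope's id as its parent, and a middleware layer that returns None produces a tool_blocked event instead of a result.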

Support Status

Note

The main supported paths today are Rust, Python, and Node.js. Go, WebAssembly, and raw C FFI are available for source-first users, but they are still experimental.

The following table shows which language bindings and CLI features are currently supported:

Binding	Status	Notes
Python	Fully supported	Documented with Quick Start and Guides.
Node.js	Fully supported	Documented with Quick Start and Guides.
Rust	Fully supported	Documented with Quick Start and Guides.
NeMo Relay CLI	Supported	Local observability and hook-backed security are supported; optimization is partial and host-dependent.
Go	Experimental	Source-first under `go/nemo_relay`.
WebAssembly	Experimental	Source-first under `crates/wasm`.
FFI	Experimental	Source-first under `crates/ffi`.

Agent Harness Support

The CLI support matrix separates the supported CLI surface from host-specific coverage.

- Observability works for the listed harnesses.
- Security is supported when the host exposes blocking hooks.
- Optimization remains partial and host-dependent.

Agent	Observability	Security	Optimization	Notes
Claude Code	Yes	Yes	Partial	Hook forwarding, pre-tool blocking, and gateway-routed LLM observability are supported.
Codex	Yes	Yes	Partial	Hook activation is required; missing session-end behavior limits trajectory finalization and full optimization coverage.
Hermes Agent	Yes	Yes	Partial	Hook forwarding, pre-tool blocking, and gateway-routed or hook-backed LLM observability are supported.
Cursor	Partial	Limited	No	Missing hooks under `cursor-agent` and manual gateway routing limit full feature coverage.

Public API Integrations

Use these integrations when the framework exposes stable callbacks, middleware, or plugin hooks that preserve enough lifecycle fidelity.

Agent / Library	Observability	Security	Optimization	Notes
LangChain	Yes	Yes	Yes	Wrapped tool and LLM calling.
LangGraph	Yes	Yes	Yes	Wrapped tool and LLM calling.
Deep Agents	Yes	Yes	Yes	Wrapped tool and LLM calling.
OpenClaw	Yes	Partial	No	Hook-backed telemetry with pre-tool guardrails. Managed execution rewrites require the patch-based integration.

The Python nemo-relay package ships extras for LangChain, LangGraph, and Deep Agents:

uv add "nemo-relay[langchain,langgraph,deepagents]"

Refer to Supported Integrations for setup guides and current caveats.

Patch-Based Integrations

Patch-based integrations are experimental samples maintained against pinned upstream checkouts. Use third_party/README.md for the clone, checkout, and patch-application workflow.

Integration	Observability	Security	Optimization	Notes
LangChain, LangGraph, LangChain NVIDIA	Yes	Yes	Yes	Directly patches behavior into code.
opencode	Yes	Yes	Yes	Directly patches behavior into code.
OpenClaw	Yes	Yes	Yes	Adds middleware support to OpenClaw and a built-in plugin.
Hermes Agent	Yes	Yes	Yes	Directly patches behavior into code.

Documentation

End-user documentation lives at NVIDIA NeMo Relay documentation.

Important local entry points:

For source builds, tests, and contribution workflow, refer to CONTRIBUTING.md.

Roadmap

- NemoClaw support and integration for managed tool and LLM execution flows.
- Deeper NVIDIA NeMo ecosystem integration across agent, guardrail, evaluation, and observability workflows.
- Expanded adaptive optimization capabilities for performance-aware scheduling, hints, and cache behavior.
- First-party plugins and packages for common agent runtimes and frameworks where upstream extension points allow it.

License

NVIDIA NeMo Relay is licensed under the Apache License 2.0.
