NVIDIA NeMo Relay makes it possible to see and control what happens inside agent runs without rewriting an existing agent stack. It gives coding agents, applications, framework integrations, middleware, and observability backends a shared runtime for scopes, policy, plugins, and lifecycle events.
| Goal | Start With... |
|---|---|
| Observe Codex, Claude Code, Cursor, or Hermes locally via CLI | Quick Start CLI |
| Instrument app-owned LLM or tool calls | Quick Start Application |
| Use LangChain, LangGraph, Deep Agents, or OpenClaw | Supported Integrations |
| Build a framework or provider integration | Integrate into Frameworks |
| Export ATOF, ATIF, OpenTelemetry, or OpenInference | Observability Plugin |
| Package reusable middleware or exporters | Build Plugins |
| Develop or test this repository from source | CONTRIBUTING.md |
A good first step is to record a real agent run on disk. Once Relay is writing raw events and a trajectory file, there is something concrete to inspect, debug, and build from.
This walkthrough covers a minimal end-to-end setup: install `nemo-relay-cli`, turn on local exporters, run either Codex or Claude Code through Relay, and check that Relay wrote both raw events and normalized trajectories.

    cargo install nemo-relay-cli

If cargo-binstall is available, the CLI can also be installed with:

    cargo binstall nemo-relay-cli

From the project directory to be observed, open the project-scoped plugin editor:

    nemo-relay plugins edit --project

The editor creates or updates the nearest project plugin file at `.nemo-relay/plugins.toml`. In the menu:
- Enable the `Observability` component.
- Open `ATOF` and toggle the section `[on]`. Optionally set:
  - `output_directory` to `.nemo-relay/atof`
  - `filename` to `events.jsonl`
  - `mode` to `overwrite`
- Open `ATIF` and toggle the section `[on]`. Optionally set:
  - `output_directory` to `.nemo-relay/atif`
  - `filename_template` to `trajectory-{session_id}.json`
- Press `p` to preview the generated TOML.
- Press `s` to save.
Note
Run `nemo-relay plugins edit` without `--project` only when these exporter settings belong in a user-level Relay config rather than in a specific project.
Use whichever host CLI is installed on the machine. For example:

    nemo-relay codex -- exec "Summarize this repository."

    nemo-relay claude -- "Summarize this repository."

Refer to the full Quick Start CLI docs for more options.
The transparent wrapper starts a local Relay gateway, injects host-specific hook and provider settings for that launched process, then shuts the gateway down when the agent exits.
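The start, inject, and shut-down sequence described above is a common wrapper pattern, and it can be sketched generically in Python. This is not Relay's implementation: the placeholder HTTP server stands in for the gateway, and the environment variable name `EXAMPLE_GATEWAY_URL` and the helper names are hypothetical. The sketch only shows that settings reach the launched process alone and that the server stops when that process exits.

```python
import contextlib
import http.server
import os
import subprocess
import sys
import threading


@contextlib.contextmanager
def local_gateway():
    """Run a placeholder HTTP server on a free local port for the duration of the block."""
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.BaseHTTPRequestHandler
    )
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{server.server_address[1]}"
    finally:
        server.shutdown()      # stop the serve_forever loop
        server.server_close()  # release the listening socket
        thread.join()


def run_wrapped(command: list[str]) -> subprocess.CompletedProcess:
    """Launch command with gateway settings injected into its environment only."""
    with local_gateway() as url:
        # EXAMPLE_GATEWAY_URL is a hypothetical name, not a Relay setting.
        child_env = {**os.environ, "EXAMPLE_GATEWAY_URL": url}
        return subprocess.run(command, env=child_env, capture_output=True, text=True)


if __name__ == "__main__":
    child = [sys.executable, "-c", "import os; print(os.environ['EXAMPLE_GATEWAY_URL'])"]
    print(run_wrapped(child).stdout.strip())
```

Because the settings live only in the child's environment, the parent shell is left unchanged after the run, which matches the "for that launched process" behavior described above.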
Warning
Codex users may need to review and activate generated hooks before events appear. Using the Codex Desktop App also adds further complications. Refer to the Codex CLI guide for the current hook activation caveat and troubleshooting steps.
After the run exits, check that raw events and trajectory files were written. If the optional output directories and file names above were used:

    test -s .nemo-relay/atof/events.jsonl
    ls .nemo-relay/atif/*.json
    for file in .nemo-relay/atif/*.json; do
      python3 -m json.tool "$file" >/dev/null
    done

Then verify that at least one raw ATOF 0.1 event exists:

    python3 - <<'PY'
    from pathlib import Path
    import json

    events_path = Path(".nemo-relay/atof/events.jsonl")
    # Parse every non-blank line of the JSONL file as one event.
    events = [
        json.loads(line)
        for line in events_path.read_text().splitlines()
        if line.strip()
    ]
    assert events, "no ATOF events were written"
    # At least one event must declare ATOF version 0.1.
    assert any(event.get("atof_version") == "0.1" for event in events), "no ATOF 0.1 events found"
    print(f"validated {len(events)} ATOF event(s)")
    PY

A successful run creates the following outputs to inspect:

- `.nemo-relay/atof/events.jsonl` as the raw canonical event stream.
- One or more `.nemo-relay/atif/*.json` trajectory files for analysis and evaluation workflows.
Tip
If raw ATOF events exist but LLM spans are missing, provider traffic probably isn't flowing through the Relay gateway. If ATIF is missing, make sure the agent session or turn ended and the output directory is writable.
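The checks in this tip can be gathered into one small diagnostic. The sketch below assumes the optional paths from the walkthrough (`.nemo-relay/atof/events.jsonl` and `.nemo-relay/atif/`); the function name `diagnose_outputs` is hypothetical. It cannot tell whether provider traffic flowed through the gateway. It only reports missing files, an unwritable ATIF directory, and trajectory files that fail to parse as JSON.

```python
import json
import os
from pathlib import Path


def diagnose_outputs(project_root: Path = Path(".")) -> list[str]:
    """Return human-readable findings about the ATOF and ATIF outputs."""
    findings = []
    atof_file = project_root / ".nemo-relay" / "atof" / "events.jsonl"
    atif_dir = project_root / ".nemo-relay" / "atif"

    if not atof_file.is_file() or atof_file.stat().st_size == 0:
        findings.append(f"missing or empty ATOF file: {atof_file}")

    if not atif_dir.is_dir():
        findings.append(f"missing ATIF directory: {atif_dir}")
    else:
        if not os.access(atif_dir, os.W_OK):
            findings.append(f"ATIF directory is not writable: {atif_dir}")
        trajectories = sorted(atif_dir.glob("*.json"))
        if not trajectories:
            findings.append("no ATIF trajectory files; did the session or turn end?")
        for path in trajectories:
            try:
                json.loads(path.read_text())
            except json.JSONDecodeError as error:
                findings.append(f"invalid JSON in {path.name}: {error}")
    return findings


if __name__ == "__main__":
    for finding in diagnose_outputs() or ["no problems found"]:
        print(finding)
```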
Go to the full NeMo Relay CLI docs for persistent host plugin installation, gateway configuration, exporter options, and agent-specific diagnostics.
Tip
Start by trusting the raw Agent Trajectory Observability Format (ATOF) JSONL. It shows the lifecycle events Relay actually captured before anything is translated into Agent Trajectory Interchange Format (ATIF), OpenTelemetry, or OpenInference output.
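One schema-agnostic way to get oriented in the raw JSONL is to count events per `atof_version` and tally which top-level keys appear. The sketch below assumes only that each non-blank line is a JSON object; `atof_version` is the one field name taken from the validation step above, and no other ATOF field is assumed. The function name `summarize_events` is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_events(events_path: Path) -> tuple[Counter, Counter]:
    """Count events per atof_version and occurrences of each top-level key."""
    versions = Counter()
    keys = Counter()
    with events_path.open() as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            event = json.loads(line)
            if not isinstance(event, dict):
                continue  # ignore anything that is not a JSON object
            versions[str(event.get("atof_version"))] += 1
            keys.update(event.keys())
    return versions, keys


if __name__ == "__main__":
    path = Path(".nemo-relay/atof/events.jsonl")
    if path.is_file():
        versions, keys = summarize_events(path)
        print("events per atof_version:", dict(versions))
        print("top-level keys seen:", dict(keys))
    else:
        print(f"{path} not found")
```

The key tally is a quick way to see which fields Relay actually recorded before consulting the format documentation for their meaning.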
When the application owns the code that calls the model or tool, install the binding for its language and route that boundary through Relay directly.
Install Relay for the application language:

    # Python
    uv add nemo-relay

    # Node.js
    # Requires Node.js 24 or newer.
    npm install nemo-relay-node

    # Rust
    cargo add nemo-relay

Then run a minimal example workflow for that binding:
Relay is the liaison between agent systems. A production application may combine NeMo Agent Toolkit, LangChain, LangGraph, provider SDKs, custom harness code, NeMo Guardrails, tracing systems, and evaluation pipelines. Relay gives those pieces one runtime contract instead of asking every layer to invent its own wrappers and trace vocabulary.
Relay gives those systems:
- Scopes so runs, turns, tools, LLM calls, and subagents have clear ownership, parent-child lineage, cleanup boundaries, and request isolation.
- Managed LLM and tool calls so the same lifecycle and middleware rules apply around each callback.
- Middleware for the places where Relay must block, sanitize, transform, route, retry, or replace execution.
- Plugins so reusable observability, guardrail, adaptive, and exporter behavior can be turned on from configuration.
- Events and subscribers so raw ATOF, normalized ATIF, OpenTelemetry, and OpenInference output all come from the same runtime stream.
Relay does not replace a framework, model provider, application logic, observability backend, or guardrail authoring system. It gives those systems a common boundary to meet.
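The way scopes, middleware, and events relate to one another can be illustrated with a toy model. The sketch below is not the NeMo Relay API: `ToyRuntime`, `redact_secrets`, the event field names, and the `scope_start` and `scope_end` kinds are all invented for illustration. It shows only the general shape: nested scopes record parent-child lineage, middleware wraps a managed tool call, and every subscriber reads from the same event stream.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyRuntime:
    """Illustrative stand-in only; this is NOT the NeMo Relay API."""
    subscribers: list = field(default_factory=list)  # callables that receive each event
    middleware: list = field(default_factory=list)   # wrappers applied around tool calls
    _stack: list = field(default_factory=list)       # ids of the currently open scopes
    _ids: count = field(default_factory=lambda: count(1))

    def emit(self, kind, **data):
        event = {"kind": kind, **data}
        for subscriber in self.subscribers:
            subscriber(event)

    @contextmanager
    def scope(self, name):
        scope_id = next(self._ids)
        parent = self._stack[-1] if self._stack else None
        self._stack.append(scope_id)
        self.emit("scope_start", id=scope_id, parent=parent, name=name)
        try:
            yield scope_id
        finally:
            # Cleanup runs even if the wrapped work raises.
            self._stack.pop()
            self.emit("scope_end", id=scope_id, parent=parent, name=name)

    def call_tool(self, name, func, *args):
        handler = func
        # Wrap in reverse so the first middleware listed runs outermost.
        for layer in reversed(self.middleware):
            handler = layer(handler)
        with self.scope(f"tool:{name}"):
            return handler(*args)


def redact_secrets(next_call):
    """Example middleware: sanitize arguments before the tool sees them."""
    def wrapped(*args):
        cleaned = [str(arg).replace("s3cret", "***") for arg in args]
        return next_call(*cleaned)
    return wrapped


def demo():
    events = []
    runtime = ToyRuntime(subscribers=[events.append], middleware=[redact_secrets])
    with runtime.scope("run"):
        result = runtime.call_tool("echo", str.upper, "token s3cret")
    return result, events


if __name__ == "__main__":
    result, events = demo()
    print(result)
    for event in events:
        print(event)
```

In this toy model the tool scope reports the run scope as its parent, and the tool only ever receives the sanitized argument, which mirrors the ownership and middleware roles listed above.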

    flowchart LR
        App[Application, Framework, or CLI Harness]
        subgraph Runtime[NeMo Relay Runtime]
            direction TB
            Scopes[Scopes]
            Middleware[Middleware]
            Plugins[Plugins]
            Events[Lifecycle Events]
        end
        Output[Subscribers and Exporters]
        App --> Scopes
        App --> Middleware
        Plugins --> Middleware
        Scopes --> Events
        Middleware --> Events
        Events --> Output

Note
The main supported paths today are Rust, Python, and Node.js. Go, WebAssembly, and raw C FFI are available for source-first users, but they are still experimental.
The following table shows which language bindings and CLI features are currently supported:
| Binding | Status | Notes |
|---|---|---|
| Python | Fully supported | Documented with Quick Start and Guides. |
| Node.js | Fully supported | Documented with Quick Start and Guides. |
| Rust | Fully supported | Documented with Quick Start and Guides. |
| NeMo Relay CLI | Supported | Local observability and hook-backed security are supported; optimization is partial and host-dependent. |
| Go | Experimental | Source-first under go/nemo_relay. |
| WebAssembly | Experimental | Source-first under crates/wasm. |
| FFI | Experimental | Source-first under crates/ffi. |
The CLI support matrix separates the supported CLI surface from host-specific coverage.
- Observability works for the listed harnesses.
- Security is supported when the host exposes blocking hooks.
- Optimization remains partial and host-dependent.
| Agent | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| Claude Code | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed LLM observability are supported. |
| Codex | Yes | Yes | Partial | Hook activation is required; missing session-end behavior limits trajectory finalization and full optimization coverage. |
| Hermes Agent | Yes | Yes | Partial | Hook forwarding, pre-tool blocking, and gateway-routed or hook-backed LLM observability are supported. |
| Cursor | Partial | Limited | No | Missing hooks under cursor-agent and manual gateway routing limit full feature coverage. |
Use these integrations when the framework exposes stable callbacks, middleware, or plugin hooks that preserve enough lifecycle fidelity.
| Agent / Library | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| LangChain | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| LangGraph | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| Deep Agents | Yes | Yes | Yes | Wrapped tool and LLM calling. |
| OpenClaw | Yes | Partial | No | Hook-backed telemetry with pre-tool guardrails. Managed execution rewrites require the patch-based integration. |
The Python `nemo-relay` package ships extras for LangChain, LangGraph, and Deep Agents:

    uv add "nemo-relay[langchain,langgraph,deepagents]"

Refer to Supported Integrations for setup guides and current caveats.
Patch-based integrations are experimental samples maintained against pinned upstream checkouts. Use third_party/README.md for the clone, checkout, and patch-application workflow.
| Integration | Observability | Security | Optimization | Notes |
|---|---|---|---|---|
| LangChain, LangGraph, LangChain NVIDIA | Yes | Yes | Yes | Directly patches behavior into code. |
| opencode | Yes | Yes | Yes | Directly patches behavior into code. |
| OpenClaw | Yes | Yes | Yes | Adds middleware support to OpenClaw and a built-in plugin. |
| Hermes Agent | Yes | Yes | Yes | Directly patches behavior into code. |
End-user documentation lives at NVIDIA NeMo Relay documentation.
Important local entry points:
For source builds, tests, and contribution workflow, refer to CONTRIBUTING.md.
- NemoClaw support and integration for managed tool and LLM execution flows.
- Deeper NVIDIA NeMo ecosystem integration across agent, guardrail, evaluation, and observability workflows.
- Expanded adaptive optimization capabilities for performance-aware scheduling, hints, and cache behavior.
- First-party plugins and packages for common agent runtimes and frameworks where upstream extension points allow it.
NVIDIA NeMo Relay is licensed under the Apache License 2.0.