diff --git a/docs/about-nemo-relay/architecture.mdx b/docs/about-nemo-relay/architecture.mdx
index 0c502d07..f8417355 100644
--- a/docs/about-nemo-relay/architecture.mdx
+++ b/docs/about-nemo-relay/architecture.mdx
@@ -194,3 +194,4 @@ The following concepts are related to this architecture:
- [Events](/about-nemo-relay/concepts/events)
- [Subscribers](/about-nemo-relay/concepts/subscribers)
- [Plugins](/about-nemo-relay/concepts/plugins)
+- [Codecs](/about-nemo-relay/concepts/codecs)
diff --git a/docs/about-nemo-relay/concepts/codecs.mdx b/docs/about-nemo-relay/concepts/codecs.mdx
new file mode 100644
index 00000000..a0df49ac
--- /dev/null
+++ b/docs/about-nemo-relay/concepts/codecs.mdx
@@ -0,0 +1,112 @@
+---
+title: "Codecs"
+description: ""
+position: 7
+---
+{/* SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0 */}
+
+This page explains how codecs fit into the shared NeMo Relay runtime contract.
+
+## What A Codec Is
+
+A codec is a boundary translator. It converts one runtime-facing data shape into
+another without changing who owns execution.
+
+NeMo Relay uses codecs when runtime behavior needs stable, JSON-compatible, or
+annotated data but the application or provider surface starts from a different
+shape.
+
+## Why Codecs Exist
+
+Codecs let NeMo Relay preserve one execution model across different boundaries:
+
+- application-owned typed values
+- framework-owned callback payloads
+- provider-native LLM request and response shapes
+- exporter or subscriber consumers that need normalized data
+
+Without codecs, request-side middleware and observability would need to reason
+about every framework or provider shape directly.
+
+## Two Main Codec Roles
+
+NeMo Relay documentation uses the word `codec` in two related but different
+senses.
+
+### Typed Value Codecs
+
+Typed value codecs translate application-facing values to and from JSON-friendly
+shapes at the public wrapper boundary.
+
+Use them when:
+
+- application code wants native objects
+- framework callbacks expect typed values
+- runtime events and JSON-based middleware still need stable serialized payloads
+
+### Provider Codecs
+
+Provider codecs translate provider-native LLM payloads at two different points.
+Request codecs decode provider requests into annotated request data for request
+intercepts and request-side middleware, then encode edits back into the provider
+shape. Response codecs decode provider responses into annotated response data
+for LLM end events, subscribers, exporters, and diagnostics.
+
+Use them when:
+
+- provider payloads differ structurally
+- request intercepts need normalized request meaning
+- response annotations such as usage, model names, or tool calls should be
+ exposed in one stable shape for downstream consumers
+
+## Request And Response Responsibilities
+
+Request and response codecs do different jobs.
+
+- **Request codecs** decode raw provider requests into annotated request data
+ before request-oriented runtime behavior runs, then encode edited annotations
+ back into the provider-native shape when execution continues.
+- **Response codecs** decode raw provider responses into annotated response data
+ for lifecycle events, subscribers, and exporters.
+
+Response decoding improves observability and downstream consistency. It does not
+change the value returned to the application unless a separate typed value
+boundary also transforms that value.
+
+## Who Consumes Normalized Data
+
+Normalized codec output matters to several runtime layers:
+
+- request intercepts or request-side middleware that need stable request meaning
+- lifecycle events that should expose consistent semantic payloads
+- subscribers that inspect runtime activity in process
+- exporters that write raw ATOF events or project them into ATIF,
+ OpenTelemetry, or OpenInference
+
+Codecs do not replace scopes, middleware, subscribers, or plugins. They make
+those layers easier to apply consistently across heterogeneous inputs.
+
+## What Codecs Do Not Decide
+
+Codecs do not decide:
+
+- ownership boundaries
+- middleware ordering
+- whether execution is allowed to continue
+- which exporter is active
+
+Those responsibilities belong to scopes, middleware, plugins, and exporter or
+subscriber registration.
+
+## Read Next
+
+- Use [Using Codecs](/integrate-into-frameworks/using-codecs) for typed value
+ codecs at framework-facing boundaries.
+- Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) for
+ provider-native request and response normalization.
+- Use [Provider Response
+ Codecs](/integrate-into-frameworks/provider-response-codecs) when the main
+ need is response-side annotations for subscribers or exporters.
+- Refer to the [Glossary](/resources/glossary) for the stable terminology used
+ across codecs, providers, and observability surfaces.
diff --git a/docs/about-nemo-relay/concepts/events.mdx b/docs/about-nemo-relay/concepts/events.mdx
index 80cba351..6e9c24ad 100644
--- a/docs/about-nemo-relay/concepts/events.mdx
+++ b/docs/about-nemo-relay/concepts/events.mdx
@@ -96,6 +96,19 @@ annotations are part of the canonical event JSON under `category_profile` when
they are present, so ATOF JSONL export and in-process subscriber JSON expose the
same payload shape.
+### Event Contract Boundary
+
+The event is the canonical envelope and handoff point for subscribers and
+exporters. Consumers should prefer typed event fields, binding helpers, and
+annotated request or response data before falling back to raw payload
+re-reading.
+
+Events preserve runtime facts. They do not decide which scope owns a call, how
+replay or cost policy is applied, how redaction policy is configured, how
+streams are blocked, or how exporter-specific semantic projection works. Those
+decisions stay in the session, codec, plugin, guardrail, or exporter layer that
+owns them.
+
## How Events Are Produced
Scope APIs emit `scope` start and end events and can also emit named `mark`
@@ -104,3 +117,6 @@ events. Managed tool and LLM helpers emit `scope` events with `category` set to
the runtime record important state transitions even when there is no full nested
callback to wrap. Subscriber delivery is downstream from event construction and
does not block the emitting call on native bindings.
+
+ATOF export writes this raw canonical event stream. ATIF, OpenTelemetry, and
+OpenInference are downstream projections over the same events.
diff --git a/docs/about-nemo-relay/concepts/framework-integrations.mdx b/docs/about-nemo-relay/concepts/framework-integrations.mdx
index 13262b1b..a916bdeb 100644
--- a/docs/about-nemo-relay/concepts/framework-integrations.mdx
+++ b/docs/about-nemo-relay/concepts/framework-integrations.mdx
@@ -36,6 +36,10 @@ order:
This order preserves the most runtime semantics with the least distortion.
+This order also keeps ownership clear. A framework integration should preserve
+the framework's scheduling, retry, provider routing, and object-lifetime rules
+unless a managed NeMo Relay wrapper explicitly owns that invocation boundary.
+
## First Choice: Execution Wrappers
Execution wrappers are the preferred integration boundary when a framework exposes a
@@ -93,6 +97,11 @@ hooks but does not let NeMo Relay wrap the real invocation.
This fallback preserves lifecycle visibility, but the framework must pair start
and end calls correctly.
+Manual lifecycle calls do not run the full managed execution pipeline by
+themselves. They preserve observability and parentage, but execution intercepts,
+request intercepts, and sanitize guardrails only run when the integration calls
+the corresponding managed or standalone runtime surface.
+
### Conditional Execution
Use standalone conditional-execution helpers when the framework only needs an
@@ -138,6 +147,11 @@ Use these rules to decide where NeMo Relay should wrap framework behavior.
- If you only need request transformation, use request-intercept helpers.
- If you only have milestone visibility, emit mark events.
+When provider request or response payloads matter, prefer NeMo Relay codecs and
+annotated request or response data before introducing ad hoc raw-payload parsing
+in the integration. Keep provider-specific round-trip behavior in the codec or
+adapter that owns that provider shape.
+
## Practical Guidance
Use these practices when applying the concept in application or integration code.
diff --git a/docs/about-nemo-relay/concepts/index.mdx b/docs/about-nemo-relay/concepts/index.mdx
index a4bcc2e0..5a6d93bf 100644
--- a/docs/about-nemo-relay/concepts/index.mdx
+++ b/docs/about-nemo-relay/concepts/index.mdx
@@ -7,7 +7,28 @@ position: 4
SPDX-License-Identifier: Apache-2.0 */}
-Use these pages to understand the NeMo Relay runtime model before applying it in a use-case workflow.
+Use these pages to understand the NeMo Relay runtime model before applying it
+in a use-case workflow.
+
+NeMo Relay's shared runtime model has five parts:
+
+- **Scopes** decide where work belongs and which scope-local behavior is visible.
+- **Middleware** decides what can block, rewrite, sanitize, or wrap managed
+ execution.
+- **Plugins** install reusable runtime behavior from configuration.
+- **Events** record what happened in the canonical lifecycle stream.
+- **Subscribers and exporters** consume that stream in process, write raw ATOF
+ events, or project events into downstream formats.
+
+Managed tool and LLM helpers are the application-owned API path through scopes,
+middleware, events, and subscribers. Codecs are boundary translators that
+normalize typed application values or provider-native payloads when runtime
+behavior needs stable data shapes.
+
+Use the concepts pages for runtime semantics such as ownership, ordering,
+cleanup, and support boundaries. Use the generated [API
+Reference](/reference/api) for symbol lookup after you already know which
+runtime concept or integration boundary you need.
+
+
+Normalization boundaries for typed application values, provider payloads, middleware, and exporters.
+
+
diff --git a/docs/about-nemo-relay/concepts/plugins.mdx b/docs/about-nemo-relay/concepts/plugins.mdx
index 7034855a..fbcb8076 100644
--- a/docs/about-nemo-relay/concepts/plugins.mdx
+++ b/docs/about-nemo-relay/concepts/plugins.mdx
@@ -59,6 +59,18 @@ behavior.
Reporting provides structured diagnostics about what activated successfully and
what did not.
+### Failure Boundary
+
+Plugin validation and initialization are setup boundaries. If configuration is
+invalid, a component kind is unavailable, or initialization fails, callers should
+treat the plugin setup as failed and should not rely on the new runtime behavior.
+Activation reports are the public way to inspect what validated or activated.
+
+Runtime behavior after activation still belongs to the installed component. For
+example, an exporter can report delivery failures without changing tool or LLM
+execution semantics. Keep those component-specific failure rules in the
+component guide rather than redefining them in the plugin concept.
+
```mermaid
@@ -131,6 +143,10 @@ that should activate once for the running process rather than once per request.
Scope-local behavior still matters after plugin installation, but the plugin
system itself is a global activation layer.
+Plugins install runtime behavior; they do not create a separate execution
+model. Scopes still own parentage and cleanup, middleware still owns execution
+ordering, and events still own the canonical runtime record.
+
## Built-In Plugin Examples
Core plugin APIs register built-in components before lookup, validation, and
@@ -171,7 +187,7 @@ The core crate also ships a built-in `nemo_guardrails` plugin component. It is
the first-party Guardrails integration point that NeMo Relay owns through the
shared plugin system.
-The current shipped user-facing lanes are:
+The current shipped user-facing paths are:
- the remote backend for Guardrails-service integration
- the Python-backed local backend for `nemoguardrails` integration through a
diff --git a/docs/contribute/about.mdx b/docs/contribute/about.mdx
index 90079535..d67a06b4 100644
--- a/docs/contribute/about.mdx
+++ b/docs/contribute/about.mdx
@@ -36,6 +36,7 @@ Use these guide links to move from the overview into task-specific instructions.
- [Development Setup](/contribute/development-setup) covers package installation, source setup, branch naming, and code style.
- [Workflow and Reviews](/contribute/workflow-and-reviews) covers pre-commit hooks, pull request expectations, release tag conventions, DCO sign-off, commit messages, and review rules.
- [Testing and Documentation](/contribute/testing-and-docs) covers affected-language test selection, common build and test commands, documentation checks, and licensing expectations.
+- [Runtime Contract Documentation](/contribute/runtime-contract-docs) covers shared terminology, support boundaries, and review guardrails for docs and examples that describe runtime behavior.
Read [Development Setup](/contribute/development-setup) before building locally. Use
[Testing and Documentation](/contribute/testing-and-docs) to choose the smallest
diff --git a/docs/contribute/runtime-contract-docs.mdx b/docs/contribute/runtime-contract-docs.mdx
new file mode 100644
index 00000000..ca89fb2a
--- /dev/null
+++ b/docs/contribute/runtime-contract-docs.mdx
@@ -0,0 +1,123 @@
+---
+title: "Runtime Contract Documentation"
+description: ""
+position: 5
+---
+{/* SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-License-Identifier: Apache-2.0 */}
+
+Use this page when documentation, examples, bindings, plugins, or integrations
+need to update shared NeMo Relay runtime terminology.
+
+The Rust runtime remains the behavioral source of truth. These docs are the
+coordination point for user-facing language about ownership, execution ordering,
+cleanup, support boundaries, and observability semantics. Keep these pages
+aligned before changing examples or workflow-specific guides that depend on
+them.
+
+## Shared Contract Pages
+
+These pages define or route the shared runtime model:
+
+- `docs/about-nemo-relay/concepts/index.mdx`
+- `docs/about-nemo-relay/concepts/scopes.mdx`
+- `docs/about-nemo-relay/concepts/middleware.mdx`
+- `docs/about-nemo-relay/concepts/subscribers.mdx`
+- `docs/about-nemo-relay/concepts/plugins.mdx`
+- `docs/about-nemo-relay/concepts/events.mdx`
+- `docs/about-nemo-relay/concepts/framework-integrations.mdx`
+- `docs/about-nemo-relay/concepts/codecs.mdx`
+- `docs/getting-started/agent-runtime-primer.mdx`
+- `docs/getting-started/installation.mdx`
+- `docs/getting-started/quick-start/index.mdx`
+- `docs/reference/api/index.mdx`
+- `docs/observability-plugin/about.mdx`
+- `docs/observability-plugin/atof.mdx`
+- `docs/observability-plugin/atif.mdx`
+- `docs/observability-plugin/opentelemetry.mdx`
+- `docs/observability-plugin/openinference.mdx`
+- `docs/resources/glossary.mdx`
+
+Support-sensitive summaries should also stay aligned with:
+
+- `README.md`
+- `docs/integrate-into-frameworks/about.mdx`
+- `docs/nemo-relay-cli/about.mdx`
+- `docs/supported-integrations/about.mdx`
+
+## Contract Boundaries
+
+Use these boundaries when reviewing docs and examples.
+
+### Concepts Versus API Reference
+
+Concept pages explain runtime semantics. Generated API reference pages are for
+symbol lookup after the reader already knows which concept or integration
+boundary they need.
+
+### Primary Versus Experimental Bindings
+
+Rust, Python, and Node.js are the primary documented bindings. Go, WebAssembly,
+and raw C FFI are source-first experimental surfaces unless a page explicitly
+says otherwise.
+
+### ATOF Versus Projections
+
+ATOF is the canonical event stream. ATIF, OpenTelemetry, and OpenInference are
+downstream projections over those events. Projection docs should not redefine
+scope ownership, middleware ordering, provider payload decoding, replay policy,
+or cost policy.
+
+### Codecs Versus Ownership
+
+Codecs translate typed application values or provider-native payloads into
+stable runtime shapes. They do not decide ownership, middleware ordering,
+whether execution may continue, or which exporter is active.
+
+### Managed Versus Explicit Integration
+
+Managed execution wrappers preserve the full runtime pipeline around a real tool
+or LLM callback. Explicit lifecycle APIs preserve observability and parentage
+when a framework owns the callback, but they do not run the full managed
+execution pipeline by themselves.
+
+## Page Responsibilities
+
+Use this map to keep shared pages focused.
+
+| Page | Responsibility |
+| --- | --- |
+| Concepts hub | Shared runtime model and routing across core concepts, framework integrations, and codecs. |
+| Scopes | Ownership, hierarchy, scope-local visibility, cleanup, and request isolation. |
+| Middleware | Intercepts versus guardrails, managed execution ordering, streaming placement, and failure behavior that changes execution. |
+| Subscribers | Downstream event consumption, async delivery, flush expectations, and exporter relationship. |
+| Plugins | Process-level activation, config-driven behavior installation, validation, rollback, and component ownership boundaries. |
+| Events | Canonical ATOF envelope, start/end pairing, marks, semantic payloads, category profiles, and annotated request or response data. |
+| Framework integrations | What the framework owns, what Relay owns, preferred wrapper points, explicit fallbacks, and what semantics each fallback loses. |
+| Codecs | Typed value codecs, provider request codecs for request intercepts and request-side middleware, and response codecs for end-event annotations, subscribers, exporters, and diagnostics. |
+| Agent runtime primer | Compact runtime model and first routing choice by owned boundary. |
+| Installation | Install-by-surface routing for CLI, primary bindings, and maintained integrations. |
+| Quick Start index | Boundary-based routing that does not promise more than child pages demonstrate. |
+| API index | Generated reference as symbol lookup, not runtime semantics. |
+| Observability pages | Event consumption, raw ATOF export, downstream projections, exporter selection, and setup versus delivery failure behavior. |
+| Glossary | Stable vocabulary shared across concepts, support pages, examples, and generated reference introductions. |
+
+## Review Checklist
+
+Before landing a change that touches shared runtime wording, check:
+
+- The changed page still says who owns the boundary: application, framework,
+ plugin, subscriber, exporter, or NeMo Relay runtime.
+- Managed execution ordering is not implied on pages that only use explicit
+ lifecycle APIs.
+- Sanitize guardrails are described as emitted-payload changes, not real
+ callback argument or return-value changes.
+- Subscriber and exporter work remains downstream from event construction.
+- ATOF remains the raw canonical event format, while ATIF, OpenTelemetry, and
+ OpenInference remain projections.
+- Examples consume shared terminology instead of defining new support or
+ contract language locally.
+- Support matrices stay aligned across README, CLI docs, supported integrations,
+ and install or quick-start routing pages.
+- Generated API reference pages are not used as the only explanation for runtime
+ ownership, ordering, cleanup, or failure behavior.
diff --git a/docs/getting-started/agent-runtime-primer.mdx b/docs/getting-started/agent-runtime-primer.mdx
index db86384e..ffc03c84 100644
--- a/docs/getting-started/agent-runtime-primer.mdx
+++ b/docs/getting-started/agent-runtime-primer.mdx
@@ -26,26 +26,37 @@ NeMo Relay gives those boundaries one execution model.
NeMo Relay does not decide what your agent should do. It describes and manages
what happens when your agent crosses runtime boundaries.
-The core runtime model has five parts:
+The shared runtime model has five parts; managed calls and codecs operate
+within that model rather than adding separate layers to it:
- **Scopes** describe where work belongs. They preserve parent-child
relationships across requests, agent runs, tools, LLM calls, background work,
and nested functions.
-- **Managed tool and LLM calls** attach work to the active scope, run middleware
- in a consistent order, and emit lifecycle events. The application result is
- preserved unless registered intercepts or guardrails intentionally change the
- execution path.
- **Middleware** runs around managed execution. Intercepts can transform or wrap
real calls. Guardrails can block execution or sanitize emitted observability
payloads.
-- **Events** record what happened. NeMo Relay emits Agent Trajectory
- Observability Format (ATOF) lifecycle records that subscribers and exporters
- can consume.
- **Plugins** package reusable runtime behavior so teams can install middleware,
subscribers, exporters, or adaptive behavior from configuration instead of
repeating setup code in every application.
+- **Events** record what happened. NeMo Relay emits Agent Trajectory
+ Observability Format (ATOF) lifecycle records that subscribers and exporters
+ can consume.
+- **Subscribers and exporters** consume events in process, write raw ATOF
+ events, or project events into ATIF, OpenTelemetry, OpenInference, or other
+ downstream formats.
+
+Managed tool and LLM calls are the main application-owned API path through that
+model: they attach work to the active scope, run middleware in a consistent
+order, and emit lifecycle events. The application result is preserved unless
+registered intercepts or guardrails intentionally change the execution path.
-The simplest mental model is:
+Codecs translate typed application values or provider-native payloads into
+stable runtime shapes when request-side middleware, events, or exporters need
+normalized data. They are boundary translators, not a separate execution model.
+
+The diagram below shows the simplest mental model. Codecs appear only when a
+boundary needs payload normalization, so they are shown as an optional
+translator rather than a required step for every call.
@@ -54,11 +65,15 @@ flowchart LR
boundary["App or framework boundary"]
scope["NeMo Relay scope"]
managedCall["Managed tool or LLM call"]
+ codec["Codec boundary when needed"]
middleware["Middleware"]
event["Lifecycle event"]
sink["Subscriber or exporter"]
boundary --> scope --> managedCall --> middleware --> event --> sink
+ managedCall -. normalize payloads .-> codec
+ codec -. normalized data .-> middleware
+ codec -. annotations .-> event
```
## What NeMo Relay Does Not Replace
@@ -86,18 +101,25 @@ LLM calls your code owns.
If a framework owns scheduling, retries, callbacks, or provider payloads, use a
framework integration. The integration should preserve framework behavior while
-adding NeMo Relay scopes, managed calls, codecs, middleware, and events at stable
+adding the appropriate mix of NeMo Relay scopes, managed wrappers when
+available, explicit lifecycle APIs, codecs, middleware, and events at stable
framework boundaries.
If you need the same behavior across multiple services or teams, package it as a
plugin. Plugins are the configuration-driven path for reusable middleware,
subscribers, exporters, and adaptive components.
+If your main question is how to observe local coding-agent runs, use the CLI
+path. In that mode, the coding-agent harness owns the invocation boundary, while
+NeMo Relay observes hooks, gateway-routed model traffic, and exporter output.
+
## Read Next
The following pages help you choose the next step for your integration.
- Use [Quick Start](/getting-started/quick-start) for the smallest binding-specific example.
+- Use [NeMo Relay CLI](/nemo-relay-cli/about) when a local coding-agent harness
+ owns the invocation boundary.
- Use [Instrument Applications](/instrument-applications/about) when you
own the tool or LLM call site.
- Use [Integrate into Frameworks](/integrate-into-frameworks/about) when a
diff --git a/docs/getting-started/installation.mdx b/docs/getting-started/installation.mdx
index 66bc81a1..b40d4f55 100644
--- a/docs/getting-started/installation.mdx
+++ b/docs/getting-started/installation.mdx
@@ -14,6 +14,17 @@ If you are working from a source checkout, validating unpublished changes, or
contributing to the repository, use
[Development Setup](/contribute/development-setup) instead.
+Choose the package by the boundary you want NeMo Relay to own or observe:
+
+- install the CLI when a local coding-agent harness owns the invocation
+- install a language binding when your application owns the tool or LLM call
+- install an integration package when a supported framework or agent harness
+ owns the boundary
+
+After installation, use [Quick Start](/getting-started/quick-start) to choose
+the matching runtime path. If a supported framework or agent harness owns the
+boundary, start from [Supported Integrations](/supported-integrations/about).
+
## CLI
Install the NeMo Relay CLI when you want the `nemo-relay` executable for
@@ -37,7 +48,7 @@ plugin workflow.
## Python
Install the Python package when your application uses NeMo Relay through the
-Python wrapper.
+Python wrapper and owns the tool or LLM call boundary.
```bash
uv add nemo-relay@0.5.0
@@ -52,7 +63,7 @@ managing the environment with `uv`.
## Node.js
Install the Node.js package when your application uses NeMo Relay through the
-JavaScript API.
+JavaScript API and owns the tool or LLM call boundary.
```bash
npm install nemo-relay-node
diff --git a/docs/getting-started/quick-start/index.mdx b/docs/getting-started/quick-start/index.mdx
index da0c5dbc..754d5831 100644
--- a/docs/getting-started/quick-start/index.mdx
+++ b/docs/getting-started/quick-start/index.mdx
@@ -9,6 +9,17 @@ SPDX-License-Identifier: Apache-2.0 */}
Choose the Quick Start path for the boundary you want Relay to observe or
control.
+Use this page only after you know who owns the boundary:
+
+- use the CLI path when a local coding-agent harness owns the invocation
+- use a binding quick start when your application owns the tool or LLM call
+- use a supported integration guide when a framework or agent harness owns the
+ boundary
+
+If you still need the shared runtime vocabulary for scopes, middleware, events,
+or plugins, read the [Agent Runtime Primer](/getting-started/agent-runtime-primer)
+or [Concepts](/about-nemo-relay/concepts) first.
+
## Local Coding-Agent Runs
Use the NVIDIA NeMo Relay CLI when you want to observe a local Codex, Claude
@@ -28,6 +39,50 @@ the local gateway, and export traces or trajectories.
+## Framework Or Agent-Harness-Owned Code
+
+Use a supported integration guide when LangChain, LangGraph, Deep Agents, or
+OpenClaw owns scheduling, callbacks, tool execution, provider payloads, or
+agent lifecycle hooks. If you are building or reviewing a framework integration
+that is not covered by a maintained guide, use the generic framework integration
+workflow instead.
+
+
+
+
+Choose the maintained integration path and support level for LangChain,
+LangGraph, Deep Agents, or OpenClaw.
+
+
+
+
+Choose the generic workflow for unsupported frameworks, provider adapters, or
+third-party integration patches.
+
+
+
+
+Understand what the framework owns, what Relay owns, and which runtime
+semantics each integration boundary preserves.
+
+
+
+
## Application-Owned Code
Use a binding Quick Start when your application owns the LLM or tool call and
@@ -61,7 +116,8 @@ Smallest Node.js workflow that emits scope, tool, and LLM events.
href="/getting-started/quick-start/rust"
>
-Smallest Rust workflow that emits scope, tool, and LLM events.
+Smallest Rust workflow for direct runtime ownership, starting from scope events
+and mark-based checkpoints.
diff --git a/docs/instrument-applications/instrument-llm-call.mdx b/docs/instrument-applications/instrument-llm-call.mdx
index 388ebc36..f6ef8097 100644
--- a/docs/instrument-applications/instrument-llm-call.mdx
+++ b/docs/instrument-applications/instrument-llm-call.mdx
@@ -245,7 +245,10 @@ Before deploying to production, ensure the following checklist is completed:
- Pass `model_name` separately when the model should be easy to filter or export.
- Keep request and response payloads JSON-compatible.
- Keep SDK clients and transport objects inside the provider callback.
-- Use codecs when middleware needs normalized provider request or response semantics.
+- Use request codecs when request intercepts or request-side middleware need
+ normalized provider request semantics.
+- Use response codecs when LLM end events, subscribers, or exporters need
+ normalized provider response annotations.
- Use response codecs and the `pricing` plugin when exporters need cost
estimates from model pricing.
- Use sanitize guardrails before exporting prompts or model responses in production.
@@ -265,5 +268,7 @@ Use these links to continue from this workflow into the next related task.
- Instrument tools with [Instrument a Tool Call](/instrument-applications/instrument-tool-call).
- Add policy or transformation with [Add Middleware](/instrument-applications/advanced-guide).
-- Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) when middleware needs normalized LLM request and response data.
+- Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) when request
+ intercepts need normalized LLM request data or downstream consumers need
+ normalized response annotations.
- Export events with [Observability](/observability-plugin/about).
diff --git a/docs/integrate-into-frameworks/about.mdx b/docs/integrate-into-frameworks/about.mdx
index 96a81344..13e5d8a5 100644
--- a/docs/integrate-into-frameworks/about.mdx
+++ b/docs/integrate-into-frameworks/about.mdx
@@ -44,7 +44,7 @@ Use these guide links to move from the overview into task-specific instructions.
- [Wrap LLM Calls](/integrate-into-frameworks/wrap-llm-calls) explains where to place managed provider wrappers, model names, streaming behavior, and LLM lifecycle fallbacks.
- [Handle Non-Serializable Data](/integrate-into-frameworks/non-serializable-data) shows how to keep clients, streams, callbacks, and SDK objects outside JSON payloads.
- [Using Codecs](/integrate-into-frameworks/using-codecs) explains typed value codecs for framework-facing wrappers.
-- [Provider Codecs](/integrate-into-frameworks/provider-codecs) explains provider request and response codecs for normalized middleware and event annotations.
+- [Provider Codecs](/integrate-into-frameworks/provider-codecs) explains request codecs for request intercepts and request-side middleware, plus response codecs for event annotations.
- [Provider Response Codecs](/integrate-into-frameworks/provider-response-codecs) focuses on response-only annotations for subscribers and exporters.
- [Code Examples](/integrate-into-frameworks/code-examples) collects fallback APIs, mark events, and repository patch workflow examples.
diff --git a/docs/integrate-into-frameworks/using-codecs.mdx b/docs/integrate-into-frameworks/using-codecs.mdx
index e8201a80..c79eea0b 100644
--- a/docs/integrate-into-frameworks/using-codecs.mdx
+++ b/docs/integrate-into-frameworks/using-codecs.mdx
@@ -40,7 +40,7 @@ Typed value codecs are different from provider codecs:
| Typed value codec | Converts application values to and from JSON. | Dataclasses, Pydantic models, TypeScript object shapes, custom framework types. |
| Provider codec | Converts provider-specific LLM requests and responses to annotated NeMo Relay request or response data. | OpenAI Chat, OpenAI Responses, Anthropic Messages, custom provider payloads. |
-Use this page for typed value codecs. Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) when middleware needs normalized LLM messages, tools, model names, generation parameters, or provider response annotations.
+Use this page for typed value codecs. Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) when request intercepts or request-side middleware need normalized LLM messages, tools, model names, and generation parameters, or when subscribers and exporters need provider response annotations.
## How Typed Value Codecs Work
diff --git a/docs/integrate-into-frameworks/wrap-llm-calls.mdx b/docs/integrate-into-frameworks/wrap-llm-calls.mdx
index 745b7b16..d226a029 100644
--- a/docs/integrate-into-frameworks/wrap-llm-calls.mdx
+++ b/docs/integrate-into-frameworks/wrap-llm-calls.mdx
@@ -32,7 +32,11 @@ Follow this sequence to keep framework work attached to the expected runtime con
4. Pass a stable provider name and `model_name`.
5. Keep provider clients, streams, callbacks, and retry state outside emitted JSON payloads.
-Use a request or response codec when provider payloads need normalization before middleware or events see them. Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) for those cases.
+Use a request codec when provider requests need normalization before request
+intercepts or request-side middleware run. Use a response codec when provider
+responses need normalized LLM end-event annotations for subscribers or
+exporters. See [Provider Codecs](/integrate-into-frameworks/provider-codecs) for
+both cases.
## Concrete LLM Example
diff --git a/docs/observability-plugin/about.mdx b/docs/observability-plugin/about.mdx
index 5e3eeef8..b93e592c 100644
--- a/docs/observability-plugin/about.mdx
+++ b/docs/observability-plugin/about.mdx
@@ -18,6 +18,25 @@ Subscribers consume that stream in process, and exporter-oriented subscribers
write raw ATOF JSONL or translate events into Agent Trajectory Interchange
Format (ATIF), OpenTelemetry, or OpenInference.
+That makes observability downstream from the runtime contract:
+
+- scopes decide ownership and parentage
+- middleware decides whether execution continues and which payloads are sanitized
+- codecs provide normalized request and response data when available
+- events record the canonical runtime stream
+- subscribers and exporters consume or project that stream
+
+Exporters should not redefine execution semantics. They should preserve the
+meaning of the underlying events while translating them into the shape expected
+by the downstream system.
+
+Configuration and activation failures are setup failures for the observability
+component. Delivery failures after activation are downstream exporter failures:
+application work continues, and explicit exporter barriers such as flush or
+shutdown can report stored delivery problems. See
+[Observability Configuration](/observability-plugin/configuration) for the
+failure table.
+
The first-party plugin component has kind `observability`. It can install:
- Agent Trajectory Observability Format (ATOF) JSONL export for raw lifecycle events.
@@ -35,6 +54,10 @@ needs direct control over registration names, collection windows, explicit
flush timing, or per-run exporter objects. The plugin owns subscriber names and
teardown for the sections it enables.
+Both paths consume the same runtime event contract. Choose plugin-managed export
+for reusable process configuration, and choose manual APIs only when the caller
+needs direct control over registration or lifetime.
+
## Use Observability When
Start here when you need to:
diff --git a/docs/observability-plugin/atif.mdx b/docs/observability-plugin/atif.mdx
index 26a1374a..f509a527 100644
--- a/docs/observability-plugin/atif.mdx
+++ b/docs/observability-plugin/atif.mdx
@@ -199,6 +199,11 @@ agent scope UUID. Each step's `extra.ancestry.function_id` is the event UUID,
and `extra.ancestry.parent_id` is the parent event UUID. Trace spans expose the
same values as `nemo_relay.uuid` and `nemo_relay.parent_uuid` attributes.
+ATIF is a trajectory projection over NeMo Relay events. It should preserve the
+meaning of scope parentage, event UUIDs, codec annotations, and exporter-local
+lineage rules without becoming the source of truth for runtime ownership,
+middleware ordering, or provider payload decoding.
+
## Plugin Configuration
Use plugin configuration when the application should let NeMo Relay own the ATIF
diff --git a/docs/observability-plugin/atof.mdx b/docs/observability-plugin/atof.mdx
index 3757aa81..dfa2d010 100644
--- a/docs/observability-plugin/atof.mdx
+++ b/docs/observability-plugin/atof.mdx
@@ -85,6 +85,10 @@ Each emitted scope, tool, LLM, middleware, or mark event is written as one ATOF
JSON object per line. For event field semantics, see
[Events](/about-nemo-relay/concepts/events).
+ATOF is the raw-event export path. It preserves the canonical event shape; it
+does not add semantic extraction, replay policy, scope-owner resolution, or
+exporter projection rules.
+
Register the plugin before instrumented work starts and clear it during
shutdown so file handles flush.
diff --git a/docs/reference/api/index.mdx b/docs/reference/api/index.mdx
index 70fc3735..1f2b9d52 100644
--- a/docs/reference/api/index.mdx
+++ b/docs/reference/api/index.mdx
@@ -6,7 +6,23 @@ position: 1
{/* SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0 */}
-Use these generated library references for symbol-level API documentation:
+Use these generated library references for symbol-level API documentation.
+
+Consult them when you already know which public symbol you need.
+Do not treat generated reference pages as the source of truth for runtime
+semantics such as:
+
+- ownership
+- managed execution ordering
+- cleanup boundaries
+- middleware behavior
+- support boundaries
+
+For those topics, start with [Concepts](/about-nemo-relay/concepts), the
+[Agent Runtime Primer](/getting-started/agent-runtime-primer), or the
+workflow-specific documentation path that matches the boundary you own.
+
+Primary binding references:
- [Python Library Reference](/reference/api/python-library-reference)
- [Node.js Library Reference](/reference/api/nodejs-library-reference)
diff --git a/docs/resources/glossary.mdx b/docs/resources/glossary.mdx
index 2f101e66..aaea8b0e 100644
--- a/docs/resources/glossary.mdx
+++ b/docs/resources/glossary.mdx
@@ -103,6 +103,12 @@ An **explicit lifecycle API** is a manual start, end, or mark helper used when a
An **exporter** is a subscriber-oriented component that translates NeMo Relay events into an external artifact or backend format, such as an ATIF trajectory or OTLP trace spans.
+**Experimental Binding**
+
+An **experimental binding** exposes runtime behavior for source-first users but
+is not the primary documentation path. Go, WebAssembly, and the raw C FFI
+surface are experimental unless a page says otherwise.
+
**FFI**
**FFI** means foreign function interface. NeMo Relay's C FFI layer exposes core runtime behavior to non-Rust languages and is used by the Go binding.
@@ -207,6 +213,19 @@ A **plugin context** is the activation-time object that plugin code uses to regi
**Priority** is the ordering value attached to middleware registrations. NeMo Relay runs visible middleware in priority order after merging global and scope-local registrations.
+**Primary Binding**
+
+A **primary binding** is one of the documented binding surfaces used for
+first-line examples and generated API references. The primary bindings are Rust,
+Python, and Node.js.
+
+**Projection**
+
+A **projection** translates canonical NeMo Relay events into a downstream format
+such as ATIF, OpenTelemetry, or OpenInference. A projection should preserve the
+meaning of the runtime event stream without redefining ownership, middleware
+ordering, or provider-specific codec policy.
+
**Prompt IR**
**Prompt IR** is the internal representation ACG uses to model an LLM request as addressable prompt blocks for stability analysis and cache planning.
@@ -217,11 +236,11 @@ A **prompt-cache breakpoint** is a provider-specific location in a prompt where
**Provider Adapter**
-A **provider adapter** is code that translates between a framework's model-call surface and a provider-specific API shape. Provider adapters often use codecs when middleware needs normalized request or response semantics.
+A **provider adapter** is code that translates between a framework's model-call surface and a provider-specific API shape. Provider adapters often use request codecs when request intercepts or request-side middleware need normalized request semantics, and response codecs when events, subscribers, or exporters need normalized response annotations.
**Provider Codec**
-A **provider codec** converts provider-specific LLM requests or responses into normalized annotated data. Request codecs decode raw provider requests before request intercepts run and encode edited annotations back into the provider request before execution continues.
+A **provider codec** converts provider-specific LLM requests or responses into normalized annotated data. Request codecs decode raw provider requests before request intercepts run and encode edited annotations back into the provider request before execution continues. Response codecs decode raw provider responses into LLM end-event annotations for subscribers, exporters, and diagnostics.
**Request Intercept**