Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/about-nemo-relay/architecture.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -194,3 +194,4 @@ The following concepts are related to this architecture:
- [Events](/about-nemo-relay/concepts/events)
- [Subscribers](/about-nemo-relay/concepts/subscribers)
- [Plugins](/about-nemo-relay/concepts/plugins)
- [Codecs](/about-nemo-relay/concepts/codecs)
112 changes: 112 additions & 0 deletions docs/about-nemo-relay/concepts/codecs.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
---
title: "Codecs"
description: ""
position: 7
---
{/* SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0 */}

This page explains how codecs fit into the shared NeMo Relay runtime contract.

## What A Codec Is

A codec is a boundary translator. It converts one runtime-facing data shape into
another without changing who owns execution.

NeMo Relay uses codecs when runtime behavior needs stable, JSON-compatible, or
annotated data but the application or provider surface starts from a different
shape.

## Why Codecs Exist

Codecs let NeMo Relay preserve one execution model across different boundaries:

- application-owned typed values
- framework-owned callback payloads
- provider-native LLM request and response shapes
- exporter or subscriber consumers that need normalized data

Without codecs, request-side middleware and observability would need to reason
about every framework or provider shape directly.

## Two Main Codec Roles

NeMo Relay documentation uses the word `codec` in two related but different
senses.

### Typed Value Codecs

Typed value codecs translate application-facing values to and from JSON-friendly
shapes at the public wrapper boundary.

Use them when:

- application code wants native objects
- framework callbacks expect typed values
- runtime events and JSON-based middleware still need stable serialized payloads

### Provider Codecs

Provider codecs translate provider-native LLM payloads at two different points.
Request codecs decode provider requests into annotated request data for request
intercepts and request-side middleware, then encode edits back into the provider
shape. Response codecs decode provider responses into annotated response data
for LLM end events, subscribers, exporters, and diagnostics.

Use them when:

- provider payloads differ structurally
- request intercepts need normalized request meaning
- response annotations such as usage, model names, or tool calls should be
exposed in one stable shape for downstream consumers

## Request And Response Responsibilities

Request and response codecs do different jobs.

- **Request codecs** decode raw provider requests into annotated request data
before request-oriented runtime behavior runs, then encode edited annotations
back into the provider-native shape when execution continues.
- **Response codecs** decode raw provider responses into annotated response data
for lifecycle events, subscribers, and exporters.

Response decoding improves observability and downstream consistency. It does not
automatically change the value returned to the application unless a separate
typed value boundary also does so.

## Who Consumes Normalized Data

Normalized codec output matters to several runtime layers:

- request intercepts or request-side middleware that need stable request meaning
- lifecycle events that should expose consistent semantic payloads
- subscribers that inspect runtime activity in process
- exporters that write raw ATOF events or project them into ATIF,
OpenTelemetry, or OpenInference

Codecs do not replace scopes, middleware, subscribers, or plugins. They make
those layers easier to apply consistently across heterogeneous inputs.

## What Codecs Do Not Decide

Codecs do not decide:

- ownership boundaries
- middleware ordering
- whether execution is allowed to continue
- which exporter is active

Those responsibilities belong to scopes, middleware, plugins, and exporter or
subscriber registration.

## Read Next

- Use [Using Codecs](/integrate-into-frameworks/using-codecs) for typed value
codecs at framework-facing boundaries.
- Use [Provider Codecs](/integrate-into-frameworks/provider-codecs) for
provider-native request and response normalization.
- Use [Provider Response
Codecs](/integrate-into-frameworks/provider-response-codecs) when the main
need is response-side annotations for subscribers or exporters.
- Refer to the [Glossary](/resources/glossary) for the stable terminology used
across codecs, providers, and observability surfaces.
16 changes: 16 additions & 0 deletions docs/about-nemo-relay/concepts/events.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,19 @@ annotations are part of the canonical event JSON under `category_profile` when
they are present, so ATOF JSONL export and in-process subscriber JSON expose the
same payload shape.

### Event Contract Boundary

The event is the canonical envelope and handoff point for subscribers and
exporters. Consumers should prefer typed event fields, binding helpers, and
annotated request or response data before falling back to raw payload
re-reading.

Events preserve runtime facts. They do not decide which scope owns a call, how
replay or cost policy is applied, how redaction policy is configured, how
streams are blocked, or how exporter-specific semantic projection works. Those
decisions stay in the session, codec, plugin, guardrail, or exporter layer that
owns them.

## How Events Are Produced

Scope APIs emit `scope` start and end events and can also emit named `mark`
Expand All @@ -104,3 +117,6 @@ events. Managed tool and LLM helpers emit `scope` events with `category` set to
the runtime record important state transitions even when there is no full nested
callback to wrap. Subscriber delivery is downstream from event construction and
does not block the emitting call on native bindings.

ATOF export writes this raw canonical event stream. ATIF, OpenTelemetry, and
OpenInference are downstream projections over the same events.
14 changes: 14 additions & 0 deletions docs/about-nemo-relay/concepts/framework-integrations.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,10 @@ order:

This order preserves the most runtime semantics with the least distortion.

This order also keeps ownership clear. A framework integration should preserve
the framework's scheduling, retry, provider routing, and object-lifetime rules
unless a managed NeMo Relay wrapper explicitly owns that invocation boundary.

## First Choice: Execution Wrappers

Execution wrappers are the preferred integration boundary when a framework exposes a
Expand Down Expand Up @@ -93,6 +97,11 @@ hooks but does not let NeMo Relay wrap the real invocation.
This fallback preserves lifecycle visibility, but the framework must pair start
and end calls correctly.

Manual lifecycle calls do not run the full managed execution pipeline by
themselves. They preserve observability and parentage, but execution intercepts,
request intercepts, and sanitize guardrails only run when the integration calls
the corresponding managed or standalone runtime surface.

### Conditional Execution

Use standalone conditional-execution helpers when the framework only needs an
Expand Down Expand Up @@ -138,6 +147,11 @@ Use these rules to decide where NeMo Relay should wrap framework behavior.
- If you only need request transformation, use request-intercept helpers.
- If you only have milestone visibility, emit mark events.

When provider request or response payloads matter, prefer NeMo Relay codecs and
annotated request or response data before introducing ad hoc raw-payload parsing
in the integration. Keep provider-specific round-trip behavior in the codec or
adapter that owns that provider shape.

## Practical Guidance

Use these practices when applying the concept in application or integration code.
Expand Down
33 changes: 32 additions & 1 deletion docs/about-nemo-relay/concepts/index.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,28 @@ position: 4
SPDX-License-Identifier: Apache-2.0 */}


Use these pages to understand the NeMo Relay runtime model before applying it in a use-case workflow.
Use these pages to understand the NeMo Relay runtime model before applying it
in a use-case workflow.

NeMo Relay's shared runtime model has five parts:

- **Scopes** decide where work belongs and which scope-local behavior is visible.
- **Middleware** decides what can block, rewrite, sanitize, or wrap managed
execution.
- **Plugins** install reusable runtime behavior from configuration.
- **Events** record what happened in the canonical lifecycle stream.
- **Subscribers and exporters** consume that stream in process, write raw ATOF
events, or project events into downstream formats.

Managed tool and LLM helpers are the application-owned API path through scopes,
middleware, events, and subscribers. Codecs are boundary translators that
normalize typed application values or provider-native payloads when runtime
behavior needs stable data shapes.

Use the concepts pages for runtime semantics such as ownership, ordering,
cleanup, and support boundaries. Use the generated [API
Reference](/reference/api) for symbol lookup after you already know which
runtime concept or integration boundary you need.

<CardGroup cols={3}>
<Card
Expand Down Expand Up @@ -70,4 +91,14 @@ Consumers for lifecycle events, including logs, traces, trajectories, analytics,
Integration patterns for frameworks that own invocation boundaries, scheduling, retries, or provider payloads.
</Card>

<Card
title="Codecs"
icon="regular arrows-rotate"
iconPosition="left"
href="/about-nemo-relay/concepts/codecs"
>

Normalization boundaries for typed application values, provider payloads, middleware, and exporters.
</Card>

</CardGroup>
18 changes: 17 additions & 1 deletion docs/about-nemo-relay/concepts/plugins.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,18 @@ behavior.
Reporting provides structured diagnostics about what activated successfully and
what did not.

### Failure Boundary

Plugin validation and initialization are setup boundaries. If configuration is
invalid, a component kind is unavailable, or initialization fails, callers should
treat the plugin setup as failed before relying on the new runtime behavior.
Activation reports are the public way to inspect what validated or activated.

Runtime behavior after activation still belongs to the installed component. For
example, an exporter can report delivery failures without changing tool or LLM
execution semantics. Keep those component-specific failure rules in the
component guide rather than redefining them in the plugin concept.

<MermaidStyles />

```mermaid
Expand Down Expand Up @@ -131,6 +143,10 @@ that should activate once for the running process rather than once per request.
Scope-local behavior still matters after plugin installation, but the plugin
system itself is a global activation layer.

Plugins install runtime behavior; they do not create a separate execution
model. Scopes still own parentage and cleanup, middleware still owns execution
ordering, and events still own the canonical runtime record.

## Built-In Plugin Examples

Core plugin APIs register built-in components before lookup, validation, and
Expand Down Expand Up @@ -171,7 +187,7 @@ The core crate also ships a built-in `nemo_guardrails` plugin component. It is
the first-party Guardrails integration point that NeMo Relay owns through the
shared plugin system.

The current shipped user-facing lanes are:
The current shipped user-facing paths are:

- the remote backend for Guardrails-service integration
- the Python-backed local backend for `nemoguardrails` integration through a
Expand Down
1 change: 1 addition & 0 deletions docs/contribute/about.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ Use these guide links to move from the overview into task-specific instructions.
- [Development Setup](/contribute/development-setup) covers package installation, source setup, branch naming, and code style.
- [Workflow and Reviews](/contribute/workflow-and-reviews) covers pre-commit hooks, pull request expectations, release tag conventions, DCO sign-off, commit messages, and review rules.
- [Testing and Documentation](/contribute/testing-and-docs) covers affected-language test selection, common build and test commands, documentation checks, and licensing expectations.
- [Runtime Contract Documentation](/contribute/runtime-contract-docs) covers shared terminology, support boundaries, and review guardrails for docs and examples that describe runtime behavior.

Read [Development Setup](/contribute/development-setup) before building locally. Use
[Testing and Documentation](/contribute/testing-and-docs) to choose the smallest
Expand Down
123 changes: 123 additions & 0 deletions docs/contribute/runtime-contract-docs.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
---
title: "Runtime Contract Documentation"
description: ""
position: 5
---
{/* SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0 */}

Use this page when documentation, examples, bindings, plugins, or integrations
need to update shared NeMo Relay runtime terminology.

The Rust runtime remains the behavioral source of truth. These docs are the
coordination point for user-facing language about ownership, execution ordering,
cleanup, support boundaries, and observability semantics. Keep these pages
aligned before changing examples or workflow-specific guides that depend on
them.

## Shared Contract Pages

These pages define or route the shared runtime model:

- `docs/about-nemo-relay/concepts/index.mdx`
- `docs/about-nemo-relay/concepts/scopes.mdx`
- `docs/about-nemo-relay/concepts/middleware.mdx`
- `docs/about-nemo-relay/concepts/subscribers.mdx`
- `docs/about-nemo-relay/concepts/plugins.mdx`
- `docs/about-nemo-relay/concepts/events.mdx`
- `docs/about-nemo-relay/concepts/framework-integrations.mdx`
- `docs/about-nemo-relay/concepts/codecs.mdx`
- `docs/getting-started/agent-runtime-primer.mdx`
- `docs/getting-started/installation.mdx`
- `docs/getting-started/quick-start/index.mdx`
- `docs/reference/api/index.mdx`
- `docs/observability-plugin/about.mdx`
- `docs/observability-plugin/atof.mdx`
- `docs/observability-plugin/atif.mdx`
- `docs/observability-plugin/opentelemetry.mdx`
- `docs/observability-plugin/openinference.mdx`
- `docs/resources/glossary.mdx`

Support-sensitive summaries should also stay aligned with:

- `README.md`
- `docs/integrate-into-frameworks/about.mdx`
- `docs/nemo-relay-cli/about.mdx`
- `docs/supported-integrations/about.mdx`

## Contract Boundaries

Use these boundaries when reviewing docs and examples.

### Concepts Versus API Reference

Concept pages explain runtime semantics. Generated API reference pages are for
symbol lookup after the reader already knows which concept or integration
boundary they need.

### Primary Versus Experimental Bindings

Rust, Python, and Node.js are the primary documented bindings. Go, WebAssembly,
and raw C FFI are source-first experimental surfaces unless a page explicitly
says otherwise.

### ATOF Versus Projections

ATOF is the canonical event stream. ATIF, OpenTelemetry, and OpenInference are
downstream projections over those events. Projection docs should not redefine
scope ownership, middleware ordering, provider payload decoding, replay policy,
or cost policy.

### Codecs Versus Ownership

Codecs translate typed application values or provider-native payloads into
stable runtime shapes. They do not decide ownership, middleware ordering,
whether execution may continue, or which exporter is active.

### Managed Versus Explicit Integration

Managed execution wrappers preserve the full runtime pipeline around a real tool
or LLM callback. Explicit lifecycle APIs preserve observability and parentage
when a framework owns the callback, but they do not run the full managed
execution pipeline by themselves.

## Page Responsibilities

Use this map to keep shared pages focused.

| Page | Responsibility |
| --- | --- |
| Concepts hub | Shared runtime model and routing across core concepts, framework integrations, and codecs. |
| Scopes | Ownership, hierarchy, scope-local visibility, cleanup, and request isolation. |
| Middleware | Intercepts versus guardrails, managed execution ordering, streaming placement, and failure behavior that changes execution. |
| Subscribers | Downstream event consumption, async delivery, flush expectations, and exporter relationship. |
| Plugins | Process-level activation, config-driven behavior installation, validation, rollback, and component ownership boundaries. |
| Events | Canonical ATOF envelope, start/end pairing, marks, semantic payloads, category profiles, and annotated request or response data. |
| Framework integrations | What the framework owns, what Relay owns, preferred wrapper points, explicit fallbacks, and what semantics each fallback loses. |
| Codecs | Typed value codecs, provider request codecs for request intercepts and request-side middleware, and response codecs for end-event annotations, subscribers, exporters, and diagnostics. |
| Agent runtime primer | Compact runtime model and first routing choice by owned boundary. |
| Installation | Install-by-surface routing for CLI, primary bindings, and maintained integrations. |
| Quick Start index | Boundary-based routing that does not promise more than child pages demonstrate. |
| API index | Generated reference as symbol lookup, not runtime semantics. |
| Observability pages | Event consumption, raw ATOF export, downstream projections, exporter selection, and setup versus delivery failure behavior. |
| Glossary | Stable vocabulary shared across concepts, support pages, examples, and generated reference introductions. |

## Review Checklist

Before landing a change that touches shared runtime wording, check:

- The changed page still says who owns the boundary: application, framework,
plugin, subscriber, exporter, or NeMo Relay runtime.
- Managed execution ordering is not implied on pages that only use explicit
lifecycle APIs.
- Sanitize guardrails are described as emitted-payload changes, not real
callback argument or return-value changes.
- Subscriber and exporter work remains downstream from event construction.
- ATOF remains the raw canonical event format, while ATIF, OpenTelemetry, and
OpenInference remain projections.
- Examples consume shared terminology instead of defining new support or
contract language locally.
- Support matrices stay aligned across README, CLI docs, supported integrations,
and install or quick-start routing pages.
- Generated API reference pages are not used as the only explanation for runtime
ownership, ordering, cleanup, or failure behavior.
Loading
Loading