Epic: Assimilate Guardrails AI as an Ouroboros-native agentOS capability #41

## Summary

Create an Ouroboros-native plugin/adaptation layer for [`guardrails-ai/guardrails`](https://github.com/guardrails-ai/guardrails) that preserves the Guardrails user experience while translating Guardrails capabilities into the Ouroboros plugin contract: explicit commands, declared capabilities, scoped permissions, audit/provenance events, durable state, and handoff artifacts.

The goal is not to reduce Guardrails to a thin shell wrapper. The goal is to make Guardrails feel like Guardrails inside Ouroboros while letting Ouroboros act as a true agentOS: a substrate that can run external capability systems smoothly, safely, auditably, and resumably.

Related context: #27 defines the plugin layer as the capability assimilation layer for external tools, OSS libraries, and domain workflows.

## Product Goal

A user who already understands Guardrails should be able to bring the same mental model into Ouroboros:

- define or reference a Guardrails guard/spec/config;
- install or use Guardrails validators where appropriate;
- validate LLM inputs, outputs, or generated artifacts;
- inspect familiar Guardrails-style validation outcomes;
- optionally use Guardrails server-style flows;
- receive normalized Ouroboros evidence, audit, provenance, state, and handoff artifacts without losing the original Guardrails workflow feel.

In short:

```text
Guardrails experience
  + Ouroboros plugin contract
  + Seed / Ledger / State / Provenance / Permission / Audit / Handoff
  = Guardrails as an Ouroboros-native agentOS capability
```

## Why This Belongs in `ouroboros-plugins`

Guardrails is a strong reference candidate for the open-source library assimilation model described in #27:

- It is an external OSS capability system, not an Ouroboros core primitive.
- It has a clear domain: LLM input/output validation, risk detection, structured-output enforcement, and validator composition.
- It has multiple execution surfaces: Python SDK, CLI, Hub validators, and optional server mode.
- It requires careful boundary management around file reads/writes, network access, dependency installation, model downloads, and possible remote inference.
- It can produce high-value downstream evidence for `ooo auto`, Seed validation, artifact review, eval loops, and agent run acceptance gates.

This should prove a reusable pattern for future validation/evaluation adapters such as Semgrep, CodeQL, pytest, Hypothesis, mutation testing, OpenAPI validators, and other external harnesses.

## Core Thesis

Do not implement this as:

```bash
ooo guardrails raw -- <anything>
```

Implement it as:

```text
Guardrails guard/spec/config
  + bounded artifact/input/output target
  + declared plugin command
  + declared permissions
  + Guardrails ValidationOutcome
  + normalized Ouroboros report
  + audit/provenance/state updates
  + handoff artifact
```

The plugin should preserve Guardrails semantics while making every invocation legible to Ouroboros.

## Guardrails Capabilities to Assimilate

### Primary Capability: Validation Harness

Assimilate Guardrails as an LLM validation harness:

- validate known LLM output;
- validate generated artifact content;
- validate input/output text using configured validators;
- produce pass/fail/blocking evidence;
- expose validator summaries and remediation information;
- generate a handoff artifact for downstream Ouroboros workflows.

This should be the MVP.

### Secondary Capability: Structured Output Contract Checking

Guardrails supports structured-output generation/validation using Pydantic/RAIL-style specs. The plugin should eventually support using Guardrails as an artifact contract checker for generated JSON, structured reports, and Seed-derived outputs.

### Tertiary Capability: Hub/Validator Lifecycle

Guardrails Hub validators are powerful but require stricter permission boundaries. Hub installation should be optional and separated from read-only validation commands.

### Tertiary Capability: Server Mode

Guardrails server mode may be valuable for long-running agentOS integrations, but it should be treated as a later lifecycle/runtime feature, not the MVP.

## UX Principle: Preserve Guardrails Experience

The plugin should not force users to abandon Guardrails idioms. It should support familiar concepts and names where possible:

- Guard
- validator
- RAIL/spec/config
- validate / parse
- ValidationOutcome
- validation summaries
- Hub validator URI
- local model vs remote inference choices, if/when supported

However, every familiar Guardrails operation must be projected into the Ouroboros contract.

Example desired UX:

```bash
# Validate a known LLM output with a Guardrails spec and emit an Ouroboros report.
ooo guardrails validate-output \
  --spec ./guards/toxic-language.rail \
  --output ./.omx/artifacts/model-output.txt \
  --report ./.omx/reports/guardrails-toxic-language.json

# Validate an Ouroboros artifact and attach the result as a handoff/evidence bundle.
ooo guardrails validate-artifact \
  --spec ./guards/structured-report.py \
  --artifact ./.omx/artifacts/research-summary.md \
  --handoff ./.omx/handoffs/guardrails-research-summary.json
```

The report should keep the Guardrails outcome recognizable to Guardrails users while remaining directly consumable by Ouroboros.

## Proposed Plugin Name and Namespace

Candidate plugin names:

- `guardrails-eval`
- `llm-guardrails`
- `guardrails-ai`

Recommended MVP:

```text
Plugin name: guardrails-eval
Command namespace: guardrails
```

Rationale: the plugin is an evaluation/validation adapter, not a full replacement for Guardrails Hub or server management.

## Proposed Command Surface

### MVP Commands

#### `ooo guardrails validate-output`

Validate a known LLM output string/file against a Guardrails spec/config.

Inputs:

- `--spec <path>`: repo-relative Guardrails RAIL/spec/config path;
- `--output <path>` or `--text <string>`: target output to validate;
- `--metadata <path>` optional runtime metadata JSON;
- `--report <path>` optional output report path;
- `--handoff <path>` optional handoff artifact path.

Expected outputs:

- normalized JSON report;
- human-readable summary;
- plugin audit events;
- provenance references to bounded inputs;
- handoff artifact that downstream Ouroboros workflows can consume.

Risk: `write` when a report or handoff is emitted; `read_only` only if a dry-run/no-write mode is supported.
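
As a rough sketch, the inputs above could map onto an argument parser like the following, using Python's standard `argparse`. The function names `build_parser` and `derive_risk`, and the rule that any emitted file makes the invocation a `write`, are illustrative assumptions rather than part of the plugin contract.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for `ooo guardrails validate-output`."""
    parser = argparse.ArgumentParser(prog="ooo guardrails validate-output")
    parser.add_argument("--spec", required=True, help="Repo-relative Guardrails spec path.")
    # Exactly one of --output / --text names the target to validate.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--output", help="Repo-relative file holding the LLM output.")
    target.add_argument("--text", help="Inline LLM output string.")
    parser.add_argument("--metadata", help="Optional runtime metadata JSON path.")
    parser.add_argument("--report", help="Optional report path.")
    parser.add_argument("--handoff", help="Optional handoff artifact path.")
    return parser


def derive_risk(args: argparse.Namespace) -> str:
    """Assumed rule: emitting a report or handoff file makes the run a `write`."""
    return "write" if (args.report or args.handoff) else "read_only"
```

Deriving the risk from the parsed arguments keeps the stdout-only path eligible for `read_only` if that mode ends up being supported.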

#### `ooo guardrails validate-artifact`

Validate an Ouroboros artifact path against a Guardrails spec/config.

Inputs:

- `--artifact <path>`;
- `--spec <path>`;
- `--report <path>`;
- `--handoff <path>`.

Expected use:

- post-generation checks;
- `ooo auto` acceptance gates;
- Seed output validation;
- research/eval artifact verification.

Risk: `write`.

#### `ooo guardrails summarize-report`

Read a Guardrails/Ouroboros validation report and produce a concise human summary.

Risk: `read_only` when the summary is only printed; `write` if it is persisted.
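
A minimal sketch of the summarization step, assuming the report follows the shape suggested under "Report / Handoff Contract". The per-summary keys `validator_name` and `failure_reason` are assumptions about how validation summaries might be serialized; unknown shapes fall back to their `repr`.

```python
def summarize_report(report: dict) -> str:
    """Render a short human-readable summary from a normalized report dict."""
    outcome = report.get("guardrails_outcome", {})
    target = report.get("input", {}).get("target_path", "<unknown target>")
    verdict = "PASSED" if outcome.get("validation_passed") is True else "FAILED"
    lines = [f"Guardrails validation {verdict} for {target}"]
    for item in outcome.get("validation_summaries") or []:
        if isinstance(item, dict):
            # Assumed keys; adjust once the serialized summary shape is fixed.
            name = item.get("validator_name", "<validator>")
            reason = item.get("failure_reason", "no reason given")
            lines.append(f"- {name}: {reason}")
        else:
            lines.append(f"- {item!r}")
    if outcome.get("error"):
        lines.append(f"Error: {outcome['error']}")
    return "\n".join(lines)
```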

### Later Commands

#### `ooo guardrails install-validator`

Install a Guardrails Hub validator.

This must be a separate, explicitly trusted command because it may involve network access, dependency installation, filesystem writes, and local model downloads.

Potential scopes:

- `network:read`
- `filesystem:write`
- `shell:execute`

Risk: `write`; may require confirmation.

#### `ooo guardrails create-config`

Wrap Guardrails config creation while bounding output paths and recording provenance.

Risk: `write`.

#### `ooo guardrails start-server`

Run Guardrails server mode under Ouroboros runtime supervision.

This is not MVP. It likely needs a stronger lifecycle/process contract.

Risk: `write` or stronger depending on network/process exposure.

## Manifest Direction

The MVP should fit the current `0.1` manifest schema without speculative schema expansion.

Illustrative direction, not necessarily final:

```json
{
  "schema_version": "0.1",
  "name": "guardrails-eval",
  "version": "0.1.0",
  "description": "Assimilate Guardrails AI validation into Ouroboros audit and handoff artifacts.",
  "source": {
    "type": "local_path",
    "path": "plugins/guardrails-eval",
    "repository": "https://github.com/Q00/ouroboros-plugins"
  },
  "commands": [
    {
      "namespace": "guardrails",
      "name": "validate-output",
      "summary": "Validate an LLM output against a Guardrails spec and emit an Ouroboros handoff report.",
      "usage": "ooo guardrails validate-output --spec <path> --output <path> [--report <path>]",
      "risk": "write",
      "requires_confirmation": false,
      "arguments": [
        {
          "name": "spec",
          "type": "path",
          "required": true,
          "description": "Repo-relative Guardrails RAIL/config/spec path."
        },
        {
          "name": "output",
          "type": "path",
          "required": true,
          "description": "Repo-relative LLM output artifact to validate."
        },
        {
          "name": "report",
          "type": "path",
          "required": false,
          "description": "Repo-relative path for validation report JSON."
        }
      ]
    }
  ],
  "capabilities": [
    {
      "name": "ledger",
      "access": "write",
      "reason": "Record validation pass/fail evidence."
    },
    {
      "name": "provenance",
      "access": "write",
      "reason": "Record bounded guard spec and artifact references."
    },
    {
      "name": "handoff",
      "access": "attach",
      "reason": "Attach validation reports for downstream Ouroboros runs."
    },
    {
      "name": "state",
      "access": "write",
      "reason": "Persist validation session status and report paths."
    }
  ],
  "permissions": [
    {
      "scope": "filesystem:read",
      "risk": "read_only",
      "required": true,
      "reason": "Read guard specs and target output artifacts."
    },
    {
      "scope": "filesystem:write",
      "risk": "write",
      "required": false,
      "reason": "Write validation reports and handoff artifacts."
    }
  ],
  "entrypoint": {
    "type": "command",
    "command": "python -m guardrails_eval"
  }
}
```
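
To illustrate how the manifest's risk and permission declarations relate, here is a small consistency check. It is a sketch only: the two rules it enforces are assumptions made for illustration, and `check_manifest` is a hypothetical helper, not the logic of `scripts/validate_contract.py`.

```python
def check_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    scopes = {p["scope"] for p in manifest.get("permissions", [])}
    for command in manifest.get("commands", []):
        label = f'{command.get("namespace")} {command.get("name")}'
        # Assumed rule: a `write` command needs a declared write scope.
        if command.get("risk") == "write" and "filesystem:write" not in scopes:
            problems.append(f"{label}: risk is write but filesystem:write is undeclared")
        # Assumed rule: path-typed arguments imply a read scope.
        has_path_arg = any(a.get("type") == "path" for a in command.get("arguments", []))
        if has_path_arg and "filesystem:read" not in scopes:
            problems.append(f"{label}: path arguments need filesystem:read")
    return problems
```

Run against the illustrative manifest above, both rules hold, so the function would return an empty list.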

## Report / Handoff Contract

The plugin should emit a normalized report that preserves Guardrails-native fields while adding Ouroboros metadata.

Suggested report shape:

```json
{
  "schema_version": "0.1",
  "tool": {
    "name": "guardrails-ai",
    "source_repository": "https://github.com/guardrails-ai/guardrails",
    "package": "guardrails-ai"
  },
  "plugin": {
    "name": "guardrails-eval",
    "version": "0.1.0"
  },
  "input": {
    "spec_path": "./guards/example.rail",
    "target_path": "./.omx/artifacts/output.txt",
    "metadata_path": null
  },
  "guardrails_outcome": {
    "validation_passed": true,
    "validated_output": {},
    "validation_summaries": [],
    "error": null
  },
  "ouroboros_result": {
    "status": "success",
    "risk": "write",
    "permissions_used": ["filesystem:read", "filesystem:write"],
    "capabilities_used": ["ledger:write", "provenance:write", "handoff:attach"]
  },
  "handoff": {
    "consumer_hint": "Use this report as validation evidence for the associated artifact.",
    "artifact_status": "accepted"
  }
}
```

The handoff should be stable enough for future `ooo auto`, workflow IR, run/step/artifact projections, or agentOS gates to consume.
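
The mapping from a Guardrails outcome to the report shape above can be sketched as follows. `OutcomeStub` is a stand-in that carries only the fields named in `guardrails_outcome`, so the example runs without Guardrails installed. The `"rejected"` artifact status and the omission of some metadata fields are assumptions made to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class OutcomeStub:
    """Stand-in for the ValidationOutcome fields used in the report shape."""
    validation_passed: bool
    validated_output: Any = None
    validation_summaries: list = field(default_factory=list)
    error: Optional[str] = None


def build_report(outcome: OutcomeStub, spec_path: str, target_path: str,
                 wrote_files: bool) -> dict:
    """Wrap Guardrails-native fields with Ouroboros metadata."""
    permissions = ["filesystem:read"]
    if wrote_files:
        permissions.append("filesystem:write")
    return {
        "schema_version": "0.1",
        "tool": {"name": "guardrails-ai", "package": "guardrails-ai"},
        "plugin": {"name": "guardrails-eval", "version": "0.1.0"},
        "input": {"spec_path": spec_path, "target_path": target_path,
                  "metadata_path": None},
        # Guardrails-native fields are copied through unchanged.
        "guardrails_outcome": {
            "validation_passed": outcome.validation_passed,
            "validated_output": outcome.validated_output,
            "validation_summaries": list(outcome.validation_summaries),
            "error": outcome.error,
        },
        # `status` reports whether the plugin ran, not whether validation passed.
        "ouroboros_result": {
            "status": "success",
            "risk": "write" if wrote_files else "read_only",
            "permissions_used": permissions,
        },
        # "rejected" is an assumed counterpart to the "accepted" value above.
        "handoff": {
            "artifact_status": "accepted" if outcome.validation_passed else "rejected",
        },
    }
```

Keeping plugin status separate from the validation verdict lets a downstream gate distinguish "the check ran and failed" from "the check could not run".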

## Permission and Risk Model

### MVP permissions

Required:

- `filesystem:read` — read Guardrails specs/configs and target artifacts.

Optional:

- `filesystem:write` — write reports and handoff artifacts.

### Later permissions

Optional and command-specific:

- `network:read` — fetch Guardrails Hub metadata or remote inference metadata;
- `network:write` — only if invoking remote inference or external APIs that mutate/session-log;
- `shell:execute` — run Guardrails CLI or install commands;
- `filesystem:write` — install validators, write configs, cache models;
- `runtime:execute` capability — supervise server mode or long-running validation processes, if supported by contract.

### Important risk split

Local validation and Hub installation must not share the same trust path.

A user should be able to trust local validation without trusting dependency installation or server launch.
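
One way to picture the split is a per-command scope table, checked against what the user has actually granted. The table below is hypothetical; it reuses the scopes listed in this issue, and the `missing_scopes` helper is an illustrative name.

```python
# Hypothetical per-command scope requirements, based on the scopes named in this issue.
COMMAND_SCOPES = {
    "validate-output": {"filesystem:read"},
    "validate-artifact": {"filesystem:read", "filesystem:write"},
    "install-validator": {"network:read", "filesystem:write", "shell:execute"},
}


def missing_scopes(command: str, granted: set[str]) -> set[str]:
    """Return the scopes a command needs that the user has not granted."""
    if command not in COMMAND_SCOPES:
        # Undeclared commands are rejected rather than passed through.
        raise ValueError(f"undeclared command: {command}")
    return COMMAND_SCOPES[command] - granted
```

With only filesystem scopes granted, local validation has nothing missing, while `install-validator` still lacks `network:read` and `shell:execute` and therefore cannot run on the same grant.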

## Boundary Rules

- Default to repo-relative paths.
- Do not persist raw secrets, raw API keys, or unbounded prompts in provenance.
- Do not enable Hub installation by default.
- Do not start long-running servers in the MVP.
- Do not expose arbitrary passthrough execution.
- Do not mutate production systems.
- Do not treat Guardrails Hub as an Ouroboros plugin marketplace.
- Do not add domain-specific Guardrails branches to `ooo auto`; consume handoff artifacts instead.
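
The repo-relative path rule could be enforced with a check along these lines. This is a sketch under assumptions: `resolve_in_repo` is a hypothetical helper name, and it relies on `Path.is_relative_to`, which requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_in_repo(repo_root: Path, user_path: str) -> Path:
    """Resolve user_path under repo_root, rejecting absolute paths and escapes."""
    candidate = Path(user_path)
    if candidate.is_absolute():
        raise ValueError(f"absolute paths are not allowed: {user_path}")
    root = repo_root.resolve()
    resolved = (root / candidate).resolve()
    # resolve() collapses `..` and follows symlinks, so the containment
    # check runs against the real location.
    if not resolved.is_relative_to(root):
        raise ValueError(f"path escapes the repository: {user_path}")
    return resolved
```

Applying the same check to `--spec`, `--output`, `--artifact`, `--report`, and `--handoff` would keep both reads and writes inside the repository.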

## Dependency Strategy

Open questions for implementation planning:

- Should `guardrails-ai` be a plugin dependency installed in the plugin environment?
- Should the plugin call an existing `guardrails` CLI if present?
- Should the plugin avoid vendoring any Guardrails code and instead require explicit environment setup?

Recommended starting point:

- implement as an out-of-process Python plugin package;
- depend on `guardrails-ai` in plugin packaging or dev requirements;
- keep Hub validator installation outside the MVP;
- provide a clean error if `guardrails-ai` is missing.
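
The "clean error" for a missing dependency might look like the sketch below. It relies on the fact that the `guardrails-ai` distribution is imported as `guardrails`; the exception class, helper name, and message wording are illustrative assumptions.

```python
import importlib.util


class MissingDependencyError(RuntimeError):
    """Raised when a required Python package cannot be found."""


def require_module(module_name: str = "guardrails",
                   package_name: str = "guardrails-ai") -> None:
    """Fail fast with an actionable message if the module is not importable."""
    # find_spec returns None for a missing top-level module without importing it.
    if importlib.util.find_spec(module_name) is None:
        raise MissingDependencyError(
            f"The '{package_name}' package is required but was not found. "
            f"Install it in the plugin environment, e.g. `pip install {package_name}`."
        )
```

Calling `require_module()` before loading any spec keeps the failure separate from validation failures and avoids a raw `ImportError` traceback.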

## Implementation Plan

### Phase 0 — Contract spike

- [ ] Confirm current manifest schema can represent the MVP without changes.
- [ ] Define report JSON shape.
- [ ] Define handoff JSON shape.
- [ ] Define path-bounding and redaction rules.
- [ ] Decide how the plugin locates/loads Guardrails specs/configs.

### Phase 1 — MVP plugin skeleton

- [ ] Add `plugins/guardrails-eval/ouroboros.plugin.json`.
- [ ] Add Python entrypoint package, e.g. `plugins/guardrails-eval/guardrails_eval/`.
- [ ] Implement argument parsing for `validate-output`.
- [ ] Implement repo-relative path validation.
- [ ] Load Guardrails spec/config and validate known output.
- [ ] Emit normalized report JSON.
- [ ] Emit concise human summary.
- [ ] Define clear exit semantics, e.g. exit non-zero on validation failure only when configured to do so.
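
One possible shape for the exit semantics, offered as a starting point for the open question rather than a decision. The numeric values, the `fail_on_invalid` switch, and the treatment of a missing verdict as a plugin error are all assumptions.

```python
from enum import IntEnum
from typing import Optional


class ExitCode(IntEnum):
    OK = 0                 # plugin ran; validation passed or failures are non-blocking
    VALIDATION_FAILED = 1  # plugin ran; validation failed and failures are blocking
    PLUGIN_ERROR = 2       # plugin or runtime failure; no trustworthy verdict


def exit_code(validation_passed: Optional[bool], fail_on_invalid: bool,
              plugin_error: bool = False) -> ExitCode:
    """Map a validation verdict and configuration onto a process exit code."""
    if plugin_error or validation_passed is None:
        return ExitCode.PLUGIN_ERROR
    if not validation_passed and fail_on_invalid:
        return ExitCode.VALIDATION_FAILED
    return ExitCode.OK
```

Separating codes 1 and 2 lets an acceptance gate treat "the artifact is invalid" differently from "the check could not be performed".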

### Phase 2 — Ouroboros-native evidence

- [ ] Attach ledger/provenance-compatible event payloads.
- [ ] Emit handoff artifact.
- [ ] Include bounded source references rather than raw sensitive payloads.
- [ ] Include validation summaries and remediation hints.
- [ ] Support artifact validation as a first-class command.
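
A bounded source reference can identify an input without embedding its content. The sketch below records a repo-relative path, a SHA-256 digest, and a byte size; the field names and the `bounded_reference` helper are assumptions, not an existing Ouroboros provenance schema.

```python
import hashlib
from pathlib import Path


def bounded_reference(repo_root: Path, path: Path) -> dict:
    """Describe a file by location, size, and digest instead of embedding its content."""
    data = path.read_bytes()
    return {
        # relative_to raises ValueError if the file lies outside the repository.
        "path": path.resolve().relative_to(repo_root.resolve()).as_posix(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
    }
```

A digest lets a later run confirm that the validated artifact is unchanged, while prompts, outputs, and any secrets they contain stay out of provenance.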

### Phase 3 — Tests and contract validation

- [ ] Add manifest validation test.
- [ ] Add fixture Guardrails spec/config.
- [ ] Add passing validation fixture.
- [ ] Add failing validation fixture.
- [ ] Test report shape.
- [ ] Test path bounding.
- [ ] Test missing dependency error.
- [ ] Run `python3 scripts/validate_contract.py`.

### Phase 4 — Optional Hub support

- [ ] Add explicit `install-validator` command only after MVP is stable.
- [ ] Require separate permission scopes.
- [ ] Record install provenance.
- [ ] Avoid silent model downloads.
- [ ] Ensure that trusting the plugin does not implicitly grant trust to validator installation.

### Phase 5 — Optional server/runtime support

- [ ] Explore server mode as a separate lifecycle pattern.
- [ ] Define process ownership and shutdown semantics.
- [ ] Define network exposure policy.
- [ ] Define state recovery behavior.
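
For the process ownership and shutdown item, one option is for the supervisor to own the child process through a context manager. This is only a sketch with a hypothetical `supervised` helper: it takes an arbitrary command line and deliberately does not assume how Guardrails server mode is launched.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def supervised(argv: list[str], grace_seconds: float = 5.0):
    """Start a child process and guarantee it is stopped when the block exits."""
    proc = subprocess.Popen(argv)
    try:
        yield proc
    finally:
        if proc.poll() is None:
            proc.terminate()      # polite stop request first
            try:
                proc.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                proc.kill()       # escalate if the child ignores the request
                proc.wait()


if __name__ == "__main__":
    # Stand-in child process; a real integration would pass the server command.
    with supervised([sys.executable, "-c", "import time; time.sleep(60)"]) as child:
        print("child running:", child.poll() is None)
```

Network exposure policy and state recovery are outside this sketch and would still need their own contract.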

## Acceptance Criteria

- [ ] An epic-level design for Guardrails assimilation is captured in this issue.
- [ ] The proposed MVP preserves Guardrails concepts rather than replacing them with generic wrapper semantics.
- [ ] The plugin design maps Guardrails validation outcomes into Ouroboros audit/provenance/handoff artifacts.
- [ ] MVP scope avoids Guardrails Hub installation and server mode unless explicitly added in later phases.
- [ ] Local validation can run with bounded filesystem permissions.
- [ ] Report output includes both Guardrails-native outcome data and Ouroboros-native result metadata.
- [ ] The plugin does not expose arbitrary Guardrails CLI passthrough.
- [ ] The plugin does not require manifest schema expansion for MVP.
- [ ] Any future schema expansion is justified by a concrete reference-plugin need.
- [ ] The design keeps `ooo auto` coherent by consuming handoffs instead of embedding Guardrails-specific branching in core.

## Non-goals

- Do not turn `ouroboros-plugins` into a Guardrails Hub mirror or marketplace.
- Do not auto-install arbitrary validators without explicit trust and permission boundaries.
- Do not add unbounded command passthrough.
- Do not start long-running servers in the MVP.
- Do not store raw secrets or unbounded user/model payloads in provenance.
- Do not move Guardrails logic into Ouroboros core.

## Open Questions

1. Should the MVP support RAIL only, Python config only, or both?
2. What should the exact exit-code semantics be for validation failure vs plugin/runtime failure?
3. Where should reports be written by default: `.omx/reports/`, `.ouroboros/`, or caller-provided path only?
4. Should `filesystem:write` be required for the baseline command, or should the plugin support a pure stdout read-only mode?
5. How should Guardrails runtime metadata be bounded and redacted?
6. Should Guardrails Hub validator installation live in this plugin or a separate plugin/lifecycle command?
7. What minimum handoff shape should `ooo auto` or future Workflow IR consume?

## References

- Guardrails AI repository: https://github.com/guardrails-ai/guardrails
- Guardrails documentation: https://www.guardrailsai.com/guardrails/docs
- Guardrails validators documentation: https://guardrailsai.com/guardrails/docs/concepts/validators
- Ouroboros plugin authoring/capability assimilation RFC issue: #27

