Epic: Assimilate Guardrails AI as an Ouroboros-native agentOS capability #41

@shaun0927

Summary

Create an Ouroboros-native plugin/adaptation layer for guardrails-ai/guardrails that preserves the Guardrails user experience while translating Guardrails capabilities into the Ouroboros plugin contract: explicit commands, declared capabilities, scoped permissions, audit/provenance events, durable state, and handoff artifacts.

The goal is not to reduce Guardrails to a thin shell wrapper. The goal is to make Guardrails feel like Guardrails inside Ouroboros while letting Ouroboros act as a true agentOS: a substrate that can run external capability systems smoothly, safely, auditably, and resumably.

Related context: #27 defines the plugin layer as the capability assimilation layer for external tools, OSS libraries, and domain workflows.

Product Goal

A user who already understands Guardrails should be able to bring the same mental model into Ouroboros:

  • define or reference a Guardrails guard/spec/config;
  • install or use Guardrails validators where appropriate;
  • validate LLM inputs, outputs, or generated artifacts;
  • inspect familiar Guardrails-style validation outcomes;
  • optionally use Guardrails server-style flows;
  • receive normalized Ouroboros evidence, audit, provenance, state, and handoff artifacts without losing the original Guardrails workflow feel.

In short:

Guardrails experience
  + Ouroboros plugin contract
  + Seed / Ledger / State / Provenance / Permission / Audit / Handoff
  = Guardrails as an Ouroboros-native agentOS capability

Why This Belongs in ouroboros-plugins

Guardrails is a strong reference candidate for the open-source library assimilation model described in #27:

  • It is an external OSS capability system, not an Ouroboros core primitive.
  • It has a clear domain: LLM input/output validation, risk detection, structured-output enforcement, and validator composition.
  • It has multiple execution surfaces: Python SDK, CLI, Hub validators, and optional server mode.
  • It requires careful boundary management around file reads/writes, network access, dependency installation, model downloads, and possible remote inference.
  • It can produce high-value downstream evidence for ooo auto, Seed validation, artifact review, eval loops, and agent run acceptance gates.

This should prove a reusable pattern for future validation/evaluation adapters such as Semgrep, CodeQL, pytest, Hypothesis, mutation testing, OpenAPI validators, and other external harnesses.

Core Thesis

Do not implement this as:

ooo guardrails raw -- <anything>

Implement it as:

Guardrails guard/spec/config
  + bounded artifact/input/output target
  + declared plugin command
  + declared permissions
  + Guardrails ValidationOutcome
  + normalized Ouroboros report
  + audit/provenance/state updates
  + handoff artifact

The plugin should preserve Guardrails semantics while making every invocation legible to Ouroboros.

Guardrails Capabilities to Assimilate

Primary Capability: Validation Harness

Assimilate Guardrails as an LLM validation harness:

  • validate known LLM output;
  • validate generated artifact content;
  • validate input/output text using configured validators;
  • produce pass/fail/blocking evidence;
  • expose validator summaries and remediation information;
  • generate a handoff artifact for downstream Ouroboros workflows.

This should be the MVP.

Secondary Capability: Structured Output Contract Checking

Guardrails supports structured-output generation/validation using Pydantic/RAIL-style specs. The plugin should eventually support using Guardrails as an artifact contract checker for generated JSON, structured reports, and Seed-derived outputs.

Tertiary Capability: Hub/Validator Lifecycle

Guardrails Hub validators are powerful but require stricter permission boundaries. Hub installation should be optional and separated from read-only validation commands.

Tertiary Capability: Server Mode

Guardrails server mode may be valuable for long-running agentOS integrations, but it should be treated as a later lifecycle/runtime feature, not the MVP.

UX Principle: Preserve Guardrails Experience

The plugin should not force users to abandon Guardrails idioms. It should support familiar concepts and names where possible:

  • Guard
  • validator
  • RAIL/spec/config
  • validate / parse
  • ValidationOutcome
  • validation summaries
  • Hub validator URI
  • local model vs remote inference choices, if/when supported

However, every familiar Guardrails operation must be projected into the Ouroboros contract.

Example desired UX:

# Validate a known LLM output with a Guardrails spec and emit an Ouroboros report.
ooo guardrails validate-output \
  --spec ./guards/toxic-language.rail \
  --output ./.omx/artifacts/model-output.txt \
  --report ./.omx/reports/guardrails-toxic-language.json

# Validate an Ouroboros artifact and attach the result as a handoff/evidence bundle.
ooo guardrails validate-artifact \
  --spec ./guards/structured-report.py \
  --artifact ./.omx/artifacts/research-summary.md \
  --handoff ./.omx/handoffs/guardrails-research-summary.json

The report should still make it easy to recognize the Guardrails outcome, but Ouroboros should also be able to consume it.

Proposed Plugin Name and Namespace

Candidate plugin names:

  • guardrails-eval
  • llm-guardrails
  • guardrails-ai

Recommended MVP:

Plugin name: guardrails-eval
Command namespace: guardrails

Rationale: the plugin is an evaluation/validation adapter, not a full replacement for Guardrails Hub or server management.

Proposed Command Surface

MVP Commands

ooo guardrails validate-output

Validate a known LLM output string/file against a Guardrails spec/config.

Inputs:

  • --spec <path>: repo-relative Guardrails RAIL/spec/config path;
  • --output <path> or --text <string>: target output to validate;
  • --metadata <path> optional runtime metadata JSON;
  • --report <path> optional output report path;
  • --handoff <path> optional handoff artifact path.

Expected outputs:

  • normalized JSON report;
  • human-readable summary;
  • plugin audit events;
  • provenance references to bounded inputs;
  • handoff artifact that downstream Ouroboros workflows can consume.

Risk: write if report/handoff is emitted; read_only only if dry-run/no-write is supported.
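
The inputs above could be parsed roughly as follows. This is a minimal sketch using only the standard library; treating --output and --text as mutually exclusive, and the helper name effective_risk, are assumptions for illustration rather than part of the plugin contract.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the validate-output inputs listed above."""
    parser = argparse.ArgumentParser(prog="ooo guardrails validate-output")
    parser.add_argument("--spec", required=True,
                        help="Repo-relative Guardrails RAIL/spec/config path.")
    # Assumption: exactly one of --output / --text must be supplied.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--output", help="Path to the LLM output to validate.")
    target.add_argument("--text", help="Literal output string to validate.")
    parser.add_argument("--metadata", help="Optional runtime metadata JSON path.")
    parser.add_argument("--report", help="Optional report path.")
    parser.add_argument("--handoff", help="Optional handoff artifact path.")
    return parser


def effective_risk(args: argparse.Namespace) -> str:
    """Return "write" when any file would be emitted, else "read_only"."""
    return "write" if (args.report or args.handoff) else "read_only"
```

Deriving the risk from the parsed arguments keeps the read-only path available whenever no report or handoff file is requested.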

ooo guardrails validate-artifact

Validate an Ouroboros artifact path against a Guardrails spec/config.

Inputs:

  • --artifact <path>;
  • --spec <path>;
  • --report <path>;
  • --handoff <path>.

Expected use:

  • post-generation checks;
  • ooo auto acceptance gates;
  • Seed output validation;
  • research/eval artifact verification.

Risk: write.

ooo guardrails summarize-report

Read a Guardrails/Ouroboros validation report and produce a concise human summary.

Risk: read_only, or write if the summary is persisted.
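
One possible shape for the summary step, assuming the report follows the suggested report shape in this issue (input, guardrails_outcome). The function name and the wording of the summary lines are hypothetical.

```python
def summarize_report(report: dict) -> str:
    """Render a short human-readable summary from a normalized report dict."""
    inputs = report.get("input", {})
    outcome = report.get("guardrails_outcome", {})
    target = inputs.get("target_path", "<unknown target>")
    spec = inputs.get("spec_path", "<unknown spec>")

    verdict = "PASSED" if outcome.get("validation_passed") else "FAILED"
    lines = [f"Guardrails validation {verdict}: {target} against {spec}"]

    # Only the count is reported here; the structure of each summary entry
    # is left to Guardrails and is not assumed by this sketch.
    summaries = outcome.get("validation_summaries") or []
    lines.append(f"Validator summaries: {len(summaries)}")

    if outcome.get("error"):
        lines.append(f"Error: {outcome['error']}")
    return "\n".join(lines)
```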

Later Commands

ooo guardrails install-validator

Install a Guardrails Hub validator.

This must be a separate, explicitly trusted command because it may involve network access, dependency installation, filesystem writes, and local model downloads.

Potential scopes:

  • network:read
  • filesystem:write
  • shell:execute

Risk: write; may require confirmation.

ooo guardrails create-config

Wrap Guardrails config creation while bounding output paths and recording provenance.

Risk: write.

ooo guardrails start-server

Run Guardrails server mode under Ouroboros runtime supervision.

This is not MVP. It likely needs a stronger lifecycle/process contract.

Risk: write or stronger depending on network/process exposure.

Manifest Direction

The MVP should fit the current 0.1 manifest schema without speculative schema expansion.

Illustrative direction, not necessarily final:

{
  "schema_version": "0.1",
  "name": "guardrails-eval",
  "version": "0.1.0",
  "description": "Assimilate Guardrails AI validation into Ouroboros audit and handoff artifacts.",
  "source": {
    "type": "local_path",
    "path": "plugins/guardrails-eval",
    "repository": "https://github.com/Q00/ouroboros-plugins"
  },
  "commands": [
    {
      "namespace": "guardrails",
      "name": "validate-output",
      "summary": "Validate an LLM output against a Guardrails spec and emit an Ouroboros handoff report.",
      "usage": "ooo guardrails validate-output --spec <path> --output <path> --report <path>",
      "risk": "write",
      "requires_confirmation": false,
      "arguments": [
        {
          "name": "spec",
          "type": "path",
          "required": true,
          "description": "Repo-relative Guardrails RAIL/config/spec path."
        },
        {
          "name": "output",
          "type": "path",
          "required": true,
          "description": "Repo-relative LLM output artifact to validate."
        },
        {
          "name": "report",
          "type": "path",
          "required": false,
          "description": "Repo-relative path for validation report JSON."
        }
      ]
    }
  ],
  "capabilities": [
    {
      "name": "ledger",
      "access": "write",
      "reason": "Record validation pass/fail evidence."
    },
    {
      "name": "provenance",
      "access": "write",
      "reason": "Record bounded guard spec and artifact references."
    },
    {
      "name": "handoff",
      "access": "attach",
      "reason": "Attach validation reports for downstream Ouroboros runs."
    },
    {
      "name": "state",
      "access": "write",
      "reason": "Persist validation session status and report paths."
    }
  ],
  "permissions": [
    {
      "scope": "filesystem:read",
      "risk": "read_only",
      "required": true,
      "reason": "Read guard specs and target output artifacts."
    },
    {
      "scope": "filesystem:write",
      "risk": "write",
      "required": false,
      "reason": "Write validation reports and handoff artifacts."
    }
  ],
  "entrypoint": {
    "type": "command",
    "command": "python -m guardrails_eval"
  }
}

Report / Handoff Contract

The plugin should emit a normalized report that preserves Guardrails-native fields while adding Ouroboros metadata.

Suggested report shape:

{
  "schema_version": "0.1",
  "tool": {
    "name": "guardrails-ai",
    "source_repository": "https://github.com/guardrails-ai/guardrails",
    "package": "guardrails-ai"
  },
  "plugin": {
    "name": "guardrails-eval",
    "version": "0.1.0"
  },
  "input": {
    "spec_path": "./guards/example.rail",
    "target_path": "./.omx/artifacts/output.txt",
    "metadata_path": null
  },
  "guardrails_outcome": {
    "validation_passed": true,
    "validated_output": {},
    "validation_summaries": [],
    "error": null
  },
  "ouroboros_result": {
    "status": "success",
    "risk": "write",
    "permissions_used": ["filesystem:read", "filesystem:write"],
    "capabilities_used": ["ledger:write", "provenance:write", "handoff:attach"]
  },
  "handoff": {
    "consumer_hint": "Use this report as validation evidence for the associated artifact.",
    "artifact_status": "accepted"
  }
}

The handoff should be stable enough for future ooo auto, workflow IR, run/step/artifact projections, or agentOS gates to consume.
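
A sketch of how a Guardrails outcome might be projected into the suggested report shape. It reads attributes by name with getattr so it does not import Guardrails itself; the function name build_report, the wrote_files flag, and the "rejected" artifact status for failures are assumptions not fixed by this issue.

```python
import json
from types import SimpleNamespace
from typing import Any, Optional


def build_report(outcome: Any, spec_path: str, target_path: str,
                 metadata_path: Optional[str] = None,
                 wrote_files: bool = True) -> dict:
    """Map a ValidationOutcome-like object onto the suggested report shape."""
    passed = bool(getattr(outcome, "validation_passed", False))
    permissions = ["filesystem:read"]
    if wrote_files:
        permissions.append("filesystem:write")
    return {
        "schema_version": "0.1",
        "tool": {
            "name": "guardrails-ai",
            "source_repository": "https://github.com/guardrails-ai/guardrails",
            "package": "guardrails-ai",
        },
        "plugin": {"name": "guardrails-eval", "version": "0.1.0"},
        "input": {
            "spec_path": spec_path,
            "target_path": target_path,
            "metadata_path": metadata_path,
        },
        "guardrails_outcome": {
            "validation_passed": passed,
            "validated_output": getattr(outcome, "validated_output", None),
            # Entries are passed through as-is; making them JSON-safe is
            # left to the caller in this sketch.
            "validation_summaries": list(
                getattr(outcome, "validation_summaries", None) or []),
            "error": getattr(outcome, "error", None),
        },
        "ouroboros_result": {
            # Assumption: "status" describes the plugin run, not the verdict.
            "status": "success",
            "risk": "write" if wrote_files else "read_only",
            "permissions_used": permissions,
            "capabilities_used": ["ledger:write", "provenance:write", "handoff:attach"],
        },
        "handoff": {
            "consumer_hint": "Use this report as validation evidence for the associated artifact.",
            # Assumption: "rejected" is the counterpart of "accepted".
            "artifact_status": "accepted" if passed else "rejected",
        },
    }


# Usage with a stand-in outcome object (no Guardrails import needed).
example = build_report(
    SimpleNamespace(validation_passed=False, validated_output=None,
                    validation_summaries=[], error="output too long"),
    spec_path="./guards/example.rail",
    target_path="./.omx/artifacts/output.txt",
)
print(json.dumps(example["handoff"], indent=2))
```

Keeping guardrails_outcome and ouroboros_result as separate objects lets a Guardrails user read the former without learning the latter.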

Permission and Risk Model

MVP permissions

Required:

  • filesystem:read — read Guardrails specs/configs and target artifacts.

Optional:

  • filesystem:write — write reports and handoff artifacts.

Later permissions

Optional and command-specific:

  • network:read — fetch Guardrails Hub metadata or remote inference metadata;
  • network:write — only if invoking remote inference or external APIs that mutate/session-log;
  • shell:execute — run Guardrails CLI or install commands;
  • filesystem:write — install validators, write configs, cache models;
  • runtime:execute capability — supervise server mode or long-running validation processes, if supported by contract.

Important risk split

Local validation and Hub installation must not share the same trust path.

A user should be able to trust local validation without trusting dependency installation or server launch.
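
The split can be illustrated with a per-command scope table, so that granting the scopes for local validation never implies the scopes for installation. This is a hypothetical sketch: the table layout and the missing_scopes helper are illustrative, and the scope sets are taken from the command descriptions in this issue.

```python
# Hypothetical per-command scope requirements, drawn from this issue's text.
COMMAND_SCOPES = {
    "validate-output": {"filesystem:read"},
    "validate-artifact": {"filesystem:read", "filesystem:write"},
    "install-validator": {"network:read", "filesystem:write", "shell:execute"},
}


def missing_scopes(command: str, granted: set) -> set:
    """Return the scopes a command needs that the user has not granted."""
    return COMMAND_SCOPES[command] - set(granted)


def is_authorized(command: str, granted: set) -> bool:
    """A command may run only when none of its scopes are missing."""
    return not missing_scopes(command, granted)
```

With only filesystem:read granted, validate-output is authorized while install-validator is not.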

Boundary Rules

  • Default to repo-relative paths.
  • Do not persist raw secrets, raw API keys, or unbounded prompts in provenance.
  • Do not enable Hub installation by default.
  • Do not start long-running servers in the MVP.
  • Do not expose arbitrary passthrough execution.
  • Do not mutate production systems.
  • Do not treat Guardrails Hub as an Ouroboros plugin marketplace.
  • Do not add domain-specific Guardrails branches to ooo auto; consume handoff artifacts instead.
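
The repo-relative default could be enforced with a small path check. A minimal sketch, assuming the plugin knows the repository root; the exception and function names are hypothetical.

```python
from pathlib import Path


class PathOutsideRepoError(ValueError):
    """Raised when a user-supplied path escapes the repository root."""


def resolve_repo_relative(repo_root: Path, candidate: str) -> Path:
    """Resolve a repo-relative path and reject anything outside repo_root."""
    root = repo_root.resolve()
    path = Path(candidate)
    if path.is_absolute():
        raise PathOutsideRepoError(f"absolute paths are not allowed: {candidate}")
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs against the real location.
    resolved = (root / path).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise PathOutsideRepoError(f"path escapes repository root: {candidate}") from None
    return resolved
```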

Dependency Strategy

Open questions for implementation planning:

  • Should guardrails-ai be a plugin dependency installed in the plugin environment?
  • Should the plugin call an existing guardrails CLI if present?
  • Should the plugin vendor no Guardrails code and require explicit environment setup?

Recommended starting point:

  • implement as an out-of-process Python plugin package;
  • depend on guardrails-ai in plugin packaging or dev requirements;
  • keep Hub validator installation outside the MVP;
  • provide a clean error if guardrails-ai is missing.
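
A clean missing-dependency error might look like the sketch below. It probes for the guardrails import name (the module installed by the guardrails-ai package) without importing it; the exception class and function names are hypothetical.

```python
import importlib.util


class MissingDependencyError(RuntimeError):
    """Raised when the Guardrails package is not importable."""


def dependency_available(module_name: str = "guardrails") -> bool:
    """Check importability without actually importing the module."""
    return importlib.util.find_spec(module_name) is not None


def require_guardrails(module_name: str = "guardrails") -> None:
    """Fail early with an actionable message if Guardrails is missing."""
    if not dependency_available(module_name):
        raise MissingDependencyError(
            f"The '{module_name}' module is not installed in the plugin "
            "environment. Install the guardrails-ai package and retry."
        )
```

Probing with find_spec avoids triggering Guardrails import side effects before the plugin has validated its own arguments.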

Implementation Plan

Phase 0 — Contract spike

  • Confirm current manifest schema can represent the MVP without changes.
  • Define report JSON shape.
  • Define handoff JSON shape.
  • Define path-bounding and redaction rules.
  • Decide how the plugin locates/loads Guardrails specs/configs.

Phase 1 — MVP plugin skeleton

  • Add plugins/guardrails-eval/ouroboros.plugin.json.
  • Add Python entrypoint package, e.g. plugins/guardrails-eval/guardrails_eval/.
  • Implement argument parsing for validate-output.
  • Implement repo-relative path validation.
  • Load Guardrails spec/config and validate known output.
  • Emit normalized report JSON.
  • Emit concise human summary.
  • Exit non-zero on validation failure only if configured, or define clear exit semantics.
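
One candidate for the exit semantics, offered only as a starting point since the exact codes remain an open question in this issue. The enum values and the fail_on_invalid flag are assumptions.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0                  # plugin ran; validation passed or failure is tolerated
    VALIDATION_FAILED = 1   # plugin ran; Guardrails reported a failed validation
    PLUGIN_ERROR = 2        # plugin could not run (bad path, missing dependency, ...)


def exit_code_for(validation_passed: bool, fail_on_invalid: bool) -> ExitCode:
    """Map a validation verdict to an exit code under the configured policy."""
    if validation_passed or not fail_on_invalid:
        return ExitCode.OK
    return ExitCode.VALIDATION_FAILED
```

Separating VALIDATION_FAILED from PLUGIN_ERROR lets a caller distinguish a rejected artifact from a broken run.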

Phase 2 — Ouroboros-native evidence

  • Attach ledger/provenance-compatible event payloads.
  • Emit handoff artifact.
  • Include bounded source references rather than raw sensitive payloads.
  • Include validation summaries and remediation hints.
  • Support artifact validation as a first-class command.
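
A bounded source reference can carry a path, a size, and a content digest instead of the payload itself. A minimal sketch; the field names are hypothetical and not part of any existing Ouroboros provenance schema.

```python
import hashlib


def bounded_reference(path_label: str, content: bytes) -> dict:
    """Describe an input by path, size, and digest without storing its content."""
    return {
        "path": path_label,
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
    }
```

A downstream consumer can re-hash the referenced file to confirm it is the artifact that was validated, while the provenance record stays free of raw prompts or secrets.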

Phase 3 — Tests and contract validation

  • Add manifest validation test.
  • Add fixture Guardrails spec/config.
  • Add passing validation fixture.
  • Add failing validation fixture.
  • Test report shape.
  • Test path bounding.
  • Test missing dependency error.
  • Run python3 scripts/validate_contract.py.

Phase 4 — Optional Hub support

  • Add explicit install-validator command only after MVP is stable.
  • Require separate permission scopes.
  • Record install provenance.
  • Avoid silent model downloads.
  • Ensure install cannot be confused with plugin trust.

Phase 5 — Optional server/runtime support

  • Explore server mode as a separate lifecycle pattern.
  • Define process ownership and shutdown semantics.
  • Define network exposure policy.
  • Define state recovery behavior.

Acceptance Criteria

  • An epic-level design for Guardrails assimilation is captured in this issue.
  • The proposed MVP preserves Guardrails concepts rather than replacing them with generic wrapper semantics.
  • The plugin design maps Guardrails validation outcomes into Ouroboros audit/provenance/handoff artifacts.
  • MVP scope avoids Guardrails Hub installation and server mode unless explicitly added in later phases.
  • Local validation can run with bounded filesystem permissions.
  • Report output includes both Guardrails-native outcome data and Ouroboros-native result metadata.
  • The plugin does not expose arbitrary Guardrails CLI passthrough.
  • The plugin does not require manifest schema expansion for MVP.
  • Any future schema expansion is justified by a concrete reference-plugin need.
  • The design keeps ooo auto coherent by consuming handoffs instead of embedding Guardrails-specific branching in core.

Non-goals

  • Do not turn ouroboros-plugins into a Guardrails Hub mirror or marketplace.
  • Do not auto-install arbitrary validators without explicit trust and permission boundaries.
  • Do not add unbounded command passthrough.
  • Do not start long-running servers in the MVP.
  • Do not store raw secrets or unbounded user/model payloads in provenance.
  • Do not move Guardrails logic into Ouroboros core.

Open Questions

  1. Should the MVP support RAIL only, Python config only, or both?
  2. What should the exact exit-code semantics be for validation failure vs plugin/runtime failure?
  3. Where should reports be written by default: .omx/reports/, .ouroboros/, or caller-provided path only?
  4. Should filesystem:write be required for the baseline command, or should the plugin support a pure stdout read-only mode?
  5. How should Guardrails runtime metadata be bounded and redacted?
  6. Should Guardrails Hub validator installation live in this plugin or a separate plugin/lifecycle command?
  7. What minimum handoff shape should ooo auto or future Workflow IR consume?
