Skip to content

AgentOS Wave 2: hardening (H1/H2/H3) + N3 mapping + PR C-docs #1142

@shaun0927

Description

@shaun0927

Goal

Implement the Wave 2 PR queue after the Wave 1 substrate (#1132#1137) merged on 2026-05-20. Wave 2 covers 5 independent PRs scoped tightly inside the locked v1 boundary documents:

All 5 PRs are mutually independent (touch disjoint surfaces) and may merge in any order.

Scope of Wave 2

# Slot Issue Title Risk LOC est.
1 H1 #956 fix(persistence): guard workflow_ir aggregate type against raw append Low 100–150
2 H2 #956 fix(orchestrator): bound workflow lifecycle reason_code/refs and nesting depth Low 150–250
3 H3 #939 fix(plugin): block fail_closed terminal-observability hooks at dispatch Low 200–300
4 N3 #956 / #946 docs+test(agentos): IR ↔ projection mapping contract Low 250–400
5 PR C-docs #939 docs(plugin): define before_tool_call/after_tool_call hook contract Low 100–200 (docs-only)

H1–H3: 3 hardening follow-ups from Wave 1 reviewer findings

H1, H2, H3 each address a HIGH severity reviewer finding from Wave 1 (#1131 final summary). They harden defense-in-depth gaps the reviewers explicitly recommended closing before external consumers (plugin, MCP) attach to workflow_ir.

N3: IR ↔ #946 projection mapping

Locks the boundary contract between #956 Workflow IR and #946 projection vocabulary documented in docs/agentos/workflow-ir-v1.md:

The default boundary fixture must stay local and deterministic: it may pair a validated WorkflowSpec with synthetic EventStore rows to prove source-event linkage, but it must not add dispatch, cache, persistence, or projection-record embedding to the IR.

This is the v1-explicit-supported scope for N3.

PR C-docs: tool-call hook contract

Pure docs PR. Defines the before_tool_call / after_tool_call payload shape, permission scopes, failure policy, and audit event names so a future PR F (dispatch implementation) has an unambiguous target. Per the #939 scope-decision comment, PR C-docs is gated on N1 (#1134) which is now merged.

Excluded from Wave 2 (deferred — boundary decisions still pending)


PR 1 — Hardening H1 · workflow_ir append guard

Refs: #956 · #1134 security review HIGH #1

In-scope

  • In src/ouroboros/persistence/event_store.py:append(), reject any BaseEvent whose aggregate_type == WORKFLOW_LIFECYCLE_AGGREGATE_TYPE (raise PersistenceError instructing the caller to use append_workflow_lifecycle_event()).
  • In src/ouroboros/persistence/event_store.py:replay_workflow_lifecycle(), wrap each WorkflowLifecycleEvent.from_base_event() call in try/except so a single malformed row does not crash the entire replay (recommended fix from security review MEDIUM [Epic 2] Intelligent Task Routing (PAL) #3).

Out-of-scope

  • ❌ Schema changes
  • ❌ Any change to other aggregate types
  • ❌ Behavioral change to append_workflow_lifecycle_event() itself

Files

  • src/ouroboros/persistence/event_store.py
  • tests/unit/persistence/test_event_store.py (or matching test file)

Acceptance

  • Negative test: raw BaseEvent with aggregate_type="workflow_ir" passed to append() raises PersistenceError.
  • Positive test: append_workflow_lifecycle_event() continues to work end-to-end.
  • Replay-resilience test: malformed row co-located with valid rows does not blow up replay_workflow_lifecycle().

PR 2 — Hardening H2 · Bounded reason_code/refs and nesting depth

Refs: #956 · #1134 security review HIGH #2 + MEDIUM #4

In-scope

  • Add validator on WorkflowLifecycleEvent.reason_code: len(reason_code) ≤ 1024.
  • Extend _coerce_refs validator: len(refs) ≤ 64 entries, each len(ref) ≤ 512 chars.
  • Add validator on WorkflowLifecycleEvent.workflow_id: len(workflow_id) ≤ 256 (security review LOW [Epic 5] Three-Stage Evaluation Pipeline #6).
  • Introduce _MAX_NESTING_DEPTH = 32 and pass _depth through _normalize_json_value() recursive calls; raise ValueError on exceedance (security review MEDIUM [Epic 3] Double Diamond Execution #4).

Out-of-scope

  • ❌ Schema_version bump
  • ❌ Blocklist key additions (security review MEDIUM [Epic 4] Resilience & Stagnation Recovery #5 — defer to a separate small PR or normal rotation; not blocking)
  • ❌ Any change to existing event consumers

Files

  • src/ouroboros/orchestrator/workflow_lifecycle.py
  • tests/unit/orchestrator/test_workflow_lifecycle_events.py

Acceptance

  • Each bound has at least one negative test (oversized field rejected with clear ValidationError).
  • Existing 21 lifecycle tests still pass without modification.
  • Nesting depth test: 33-level-deep data rejected cleanly (no RecursionError).

PR 3 — Hardening H3 · Terminal-hook fail-open runtime guard

Refs: #939 · #1137 code review HIGH

In-scope

  • In src/ouroboros/plugin/firewall.py:_run_lifecycle_hooks(), at function entry: if hook_kind in TERMINAL_OBSERVABILITY_HOOK_KINDS AND any iterated hook has failure_policy == "fail_closed":
    • Skip the hook (do not run subprocess).
    • Emit a plugin.hook.blocked audit event documenting the contract violation.
    • Do not let the skip influence the function return (terminal observability is fail-open by contract).

Out-of-scope

  • ❌ Change to non-terminal hook policy (e.g., before_invocation still respects fail_closed)
  • ❌ Manifest layer changes (the JSON schema + loader already prevent fail_closed for on_error/on_cancel; this is purely a runtime defense-in-depth)
  • ❌ New permission scope or event family

Files

  • src/ouroboros/plugin/firewall.py
  • tests/unit/plugin/test_lifecycle_observability.py (or new sibling file)

Acceptance

  • Test: forge a HookSpec(failure_policy="fail_closed") for on_error directly in Python (bypassing manifest validator), pass through invoke_plugin() failure path, assert original cause is preserved AND plugin.hook.blocked audit event is emitted.
  • Existing 13 lifecycle observability tests still pass.
  • Existing test(plugin): lock v0.3 lifecycle conformance #1119 conformance baseline still passes.

PR 4 — N3 · IR ↔ #946 projection mapping contract

Refs: #956 / #946 · docs/agentos/workflow-ir-v1.md

In-scope

  • New doc docs/agentos/workflow-ir-projection-mapping.md defining:
    • How WorkflowNode.node_id maps to projection StepRecord.step_id / VerdictRecord.ac_id.
    • How WorkflowEdge.edge_id corresponds to projection event-pair linkage.
    • How WorkflowLifecycleEvent source IDs project into RunRecord / StageRecord / StepRecord.
  • New test tests/integration/test_ir_projection_consistency.py: build a small validated WorkflowSpec, emit synthetic EventStore rows that follow the documented mapping, verify ProjectionBuilder produces records whose IDs match the IR's node_id / edge_id.

Out-of-scope (per workflow-ir-v1.md boundary)

Files

  • docs/agentos/workflow-ir-projection-mapping.md (new)
  • tests/integration/test_ir_projection_consistency.py (new)

Acceptance

  • Fixture spec + synthetic events produce projection records whose step_id/ac_id/edge_id match the IR's planned identifiers.
  • Mapping doc references both workflow-ir-v1.md and projection-v1-scope.md boundaries explicitly.
  • 100% offline / deterministic / no credentials.

PR 5 — PR C-docs · Tool-call hook contract docs (pure docs)

Refs: #939 · #939 scope-decision comment

In-scope (pure docs)

  • New doc docs/rfc/plugin-tool-call-hook-contract.md defining:
    • before_tool_call / after_tool_call payload shape (tool name, redacted args, bounded output ref).
    • Permission scopes: plugin:tool:intercept (mutating, fail-closed-eligible) vs plugin:tool:observe (observation, fail-open).
    • Failure policy mapping.
    • Audit event names: plugin.tool.intercept.requested / plugin.tool.intercept.completed / plugin.tool.intercept.blocked.
    • Future plugin_schema_version: "0.4" migration plan (introduction deferred to a future PR F dispatch slice; this docs PR does NOT change manifest.py or schemas).

Out-of-scope

  • ❌ ANY code change (no manifest.py, no schemas/, no firewall.py)
  • ❌ v0.4 schema file additions (deferred to PR F)
  • ❌ Runtime dispatch behavior
  • ❌ Permission scope additions in code

Files

  • docs/rfc/plugin-tool-call-hook-contract.md (new, only)

Acceptance


Common acceptance gates (apply to every Wave 2 PR)

  1. Read first: each worker MUST grep / read upstream/main for the affected surface BEFORE pushing, to confirm the slot is genuinely net-new (Wave 1 lesson: S1 worker correctly refused to duplicate already-merged feat(harness): project artifact and verdict records #1061 work).
  2. Schema invariant: no schema_version bump unless explicitly in the PR title.
  3. Boundary doc citation: PR body names the exact section (projection-v1-scope.md, workflow-ir-v1.md, or #939 scope-decision comment) it implements.
  4. Anti-action check: none of the explicit ❌ lists in the lock comments violated.
  5. Test coverage: at least one negative test per acceptance criterion.
  6. No --no-verify: pre-commit hooks must run.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or meaningful improvement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions