Skip to content

feat(auto): additive assumption_sources provenance surface (PR-C2)#1169

Merged
shaun0927 merged 1 commit into
Q00:mainfrom
shaun0927:feat/auto-assumption-source-provenance
May 22, 2026
Merged

feat(auto): additive assumption_sources provenance surface (PR-C2)#1169
shaun0927 merged 1 commit into
Q00:mainfrom
shaun0927:feat/auto-assumption-source-provenance

Conversation

@shaun0927
Copy link
Copy Markdown
Collaborator

Summary

PR-C2 of the L4 Auto Envelope v2 freeze (#1157, #821).

Add an auditable companion to AutoPipelineResult.assumptions so callers can distinguish what the system assumed for them from what the user / repo confirmed. The existing string-only assumptions field is unchanged — this is a strictly additive new field.

Each AssumptionRecord is a frozen dataclass carrying:

  • text — the assumption text
  • source — one of "assumption" (auto-answerer fallback), "inference" (model reasoning), "conservative_default" (safe-default policy). These are the three assumption-class LedgerSource values that, per _EVIDENCE_BACKED_SOURCES, do not count as evidence-backed and therefore land in assumption_only_sections.
  • confidence — per-entry confidence as recorded by the ledger.

Why assumption_sources is broader than assumptions: assumptions() filters to LedgerSource.ASSUMPTION only (backwards-compatible string surface). The new assumption_sources() broadens to all three assumption-class sources so the surface answers the actual user question ("which of these did the system make up?") rather than just the narrow auto-answerer-fallback subset.

Why this scope

Closes the documented PR-C2 gap from the #1157 living SSOT. After PR-B2 (#1167) and this PR, the L4 Envelope v2 lane has all five planned envelope fields in place:

Field PR Status
stop_reason_code #1151 🟢 merged
interview_closure_mode #1148 + #1167 (safe_default) 🟢 merged
defaulted_sections #1146 🟢 merged
assumption_sources this PR 🟡 in flight
(none — surface complete)

What is NOT done here

  • No schema change, no manifest change, no breaking change to existing assumptions callers.
  • CLI / MCP rendering of assumption_sources is intentionally deferred to a follow-up — this PR plumbs the envelope only. The CLI/MCP assumptions bullet list at src/ouroboros/cli/commands/auto.py:748 and src/ouroboros/mcp/tools/auto_handler.py:1940 still works unchanged.
  • No change to the assumptions() / _values_for_sources() filter — the existing ledger.assumptions().count(\"CLI user\") assertion in test_ledger_grading_answerer.py:1182 remains valid (regression-guarded by the new test_assumption_sources_returns_records_for_all_three_assumption_class_sources).

Scope

  • src/ouroboros/auto/ledger.py: new AssumptionRecord frozen dataclass; new SeedDraftLedger.assumption_sources() method using the same inactive-status and dedupe semantics as _values_for_sources.
  • src/ouroboros/auto/pipeline.py: import AssumptionRecord; add assumption_sources: tuple[AssumptionRecord, ...] = () field next to assumptions; populate in _result().
  • skills/auto/SKILL.md: new "Assumption-source provenance" subsection documenting the field shape and source vocabulary.
  • tests/unit/auto/test_ledger_grading_answerer.py: three new tests covering all three source kinds, inactive/evidence-backed exclusion, and same-text dedupe across sections.
  • tests/unit/auto/test_interview_pipeline.py: one new end-to-end pipeline test that asserts the legacy assumptions field is unchanged in scope and the new assumption_sources carries the source tag intact.
  • tests/unit/auto/test_pipeline_lateral.py and tests/unit/auto/test_pipeline_evaluate.py: extend the existing _StubLedger test doubles with the new assumption_sources method so they remain compatible with the additive _result() consumer.

Test plan

  • uv run pytest tests/unit/auto/test_ledger_grading_answerer.py tests/unit/auto/test_interview_pipeline.py -k "assumption_sources or assumption_class or surfaces_assumption" -q → 4 passed
  • uv run pytest tests/unit/auto tests/integration/auto -q → 912 passed (baseline 908 + 4 new)
  • uv run ruff check on touched files → clean
  • uv run ruff format on touched files → no changes
  • uv run mypy src/ouroboros/auto/ledger.py src/ouroboros/auto/pipeline.py → clean

Refs #1157 (L4 lane), #821 (autonomy acceptance matrix), #1146 (PR-C defaulted_sections), #1148 (PR-B1 ledger_only), #1151 (PR-E stop_reason_code), #1167 (PR-B2 safe_default closure).

PR-C2 of the L4 Auto Envelope v2 freeze (Q00#1157, Q00#821).

## Summary

Add an auditable companion to ``AutoPipelineResult.assumptions`` so
callers can distinguish *what the system assumed for them* from *what
the user / repo confirmed*. The existing string-only ``assumptions``
field is unchanged — this is a strictly additive new field.

Each ``AssumptionRecord`` is a frozen dataclass carrying:

- ``text`` — the assumption text
- ``source`` — one of ``"assumption"`` (auto-answerer fallback),
  ``"inference"`` (model reasoning), ``"conservative_default"``
  (safe-default policy). These are the three assumption-class
  ``LedgerSource`` values that, per ``_EVIDENCE_BACKED_SOURCES``, do
  *not* count as evidence-backed and therefore land in
  ``assumption_only_sections``.
- ``confidence`` — per-entry confidence as recorded by the ledger.

Why this is bigger than ``assumptions``: ``assumptions()`` filters to
``LedgerSource.ASSUMPTION`` only (backwards-compatible). The new
``assumption_sources()`` broadens to *all three* assumption-class
sources so the surface answers the actual user question ("which of
these did the system make up?") rather than just the narrow
auto-answerer-fallback subset.

## Scope

- ``src/ouroboros/auto/ledger.py``
  - New ``AssumptionRecord`` frozen dataclass.
  - New ``SeedDraftLedger.assumption_sources()`` method using the same
    inactive-status and dedupe semantics as ``_values_for_sources``.
- ``src/ouroboros/auto/pipeline.py``
  - Import ``AssumptionRecord``.
  - Add ``assumption_sources: tuple[AssumptionRecord, ...] = ()``
    to ``AutoPipelineResult`` next to ``assumptions``.
  - Populate in ``_result()`` via ``ledger.assumption_sources()``.
- ``skills/auto/SKILL.md``
  - New "Assumption-source provenance" subsection documenting the
    field shape and source vocabulary.
- Tests:
  - ``tests/unit/auto/test_ledger_grading_answerer.py``: three new
    tests covering all three source kinds, inactive/evidence-backed
    exclusion, and same-text dedupe across sections.
  - ``tests/unit/auto/test_interview_pipeline.py``: one new
    end-to-end pipeline test that exercises ``AutoPipelineResult``
    surface, asserts the legacy ``assumptions`` field is unchanged
    in scope, and asserts the new ``assumption_sources`` carries
    the source tag intact for each of the three source kinds.
  - ``tests/unit/auto/test_pipeline_lateral.py`` and
    ``tests/unit/auto/test_pipeline_evaluate.py``: extend the
    ``_StubLedger`` test doubles with the new ``assumption_sources``
    method so they remain compatible with the additive
    ``_result()`` consumer.

No schema change, no manifest change, no breaking change to existing
``assumptions`` callers. CLI/MCP rendering of ``assumption_sources``
is intentionally deferred to a follow-up — this PR plumbs the
envelope only.

## Test plan

- ``uv run pytest tests/unit/auto/test_ledger_grading_answerer.py tests/unit/auto/test_interview_pipeline.py -k "assumption_sources or assumption_class or surfaces_assumption" -q`` → 4 passed
- ``uv run pytest tests/unit/auto tests/integration/auto -q`` → 912 passed (baseline 908 + 4 new)
- ``uv run ruff check`` on touched files → clean
- ``uv run ruff format`` on touched files → no changes
- ``uv run mypy src/ouroboros/auto/ledger.py src/ouroboros/auto/pipeline.py`` → clean

Refs Q00#1157 (L4 lane), Q00#821 (autonomy acceptance matrix),
Q00#1146 (PR-C ``defaulted_sections``), Q00#1148 (PR-B1 ``ledger_only``),
Q00#1151 (PR-E ``stop_reason_code``), Q00#1167 (PR-B2 ``safe_default`` closure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 9f84af1 for PR #1169

Review record: 86f88141-9bff-4cb4-a3a8-42134df38f59

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to complete the review: every filesystem command failed before execution with bwrap: No permissions to create a new namespace. I could not read /tmp/pr_diff_1169.patch, the changed-files list, comments, or source files, so I cannot make a defensible code assessment.

Recovery Notes

First recoverable review artifact generated from codex analysis log.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@shaun0927 shaun0927 merged commit 2f19213 into Q00:main May 22, 2026
8 checks passed
Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1169
Branch: feat/auto-assumption-source-provenance | 7 files, +324/-5 | CI: Bridge TypeScript pass 13s https://github.com/Q00/ouroboros/actions/runs/26264268510/job/77304138424
Scope: architecture-level
HEAD checked: 9f84af16840cdbdec94b35054ce3ccfa4a7bbba2

What Improved

  • Added AssumptionRecord and SeedDraftLedger.assumption_sources() so ledger entries from assumption, inference, and conservative_default can retain per-entry provenance.
  • Populated AutoPipelineResult.assumption_sources from the ledger while keeping the legacy assumptions string tuple scoped to LedgerSource.ASSUMPTION.
  • Added focused unit coverage for ledger filtering, dedupe behavior, and pipeline result population.

Issue #N/A Requirements

Requirement Status
Add auditable companion to AutoPipelineResult.assumptions Partially met: Python result field exists and is populated.
Let callers distinguish system assumptions from confirmed user/repo facts Not met at MCP boundary: metadata callers cannot observe assumption_sources.
Preserve legacy assumptions behavior Met by ledger and pipeline tests.
Cover newly added logic with meaningful tests Partially met: ledger and pipeline are covered, MCP contract is not.

Prior Findings Status

Prior Finding Status
Prior review context MODIFIED — No prior review artifact was provided in this audit context; no prior concerns could be maintained, modified, or withdrawn.

Blockers

# File:Line Severity Confidence Finding
1 src/ouroboros/mcp/tools/auto_handler.py:1273 High 92% The new provenance field is computed on AutoPipelineResult but never crosses the MCP metadata boundary: _result_meta() emits ledger_provenance, defaulted_sections, evidence_backed_sections, and assumption_only_sections, then returns without serializing result.assumption_sources. A focused runtime check with an AutoPipelineResult(..., assumption_sources=(AssumptionRecord(...),)) produced metadata with no assumption_sources key. This leaves MCP clients unable to consume the new PR-C2 contract and defeats the stated goal of letting callers distinguish system-made assumptions from confirmed facts.

Follow-ups

# File:Line Priority Confidence Suggestion
None.

Test Coverage

  • Ran uv run pytest tests/unit/auto/test_ledger_grading_answerer.py tests/unit/auto/test_interview_pipeline.py -k "assumption_sources or surfaces_assumption" -q: 4 passed.
  • Ran uv run pytest tests/unit/auto/test_surface.py -k "defaulted_sections or ledger_provenance or auto_handler_meta_exposes_auto_progress_fields" -q: 4 passed.
  • Coverage is missing for the affected MCP consumer contract: no test asserts _result_meta() exposes assumption_sources as JSON-compatible records.

Design / Roadmap Gate

Affected-boundary reasoning: this PR changes an envelope field intended for callers, so the review boundary includes AutoPipelineResult producers and downstream consumer surfaces, not just ledger internals. The ledger and pipeline boundary now carries provenance, but the MCP result metadata boundary drops it before clients can consume it. Because MCP metadata already exposes related envelope fields unconditionally, omitting assumption_sources creates an inconsistent contract and blocks the stated provenance use case for MCP consumers.

Merge Recommendation

Post-merge audit recommendation: patch current HEAD to serialize assumption_sources through MCP metadata as a stable JSON-compatible list of {text, source, confidence} records, and add a test_surface.py contract test proving non-empty and empty cases.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: post_merge_audit
merge_eligible: false
head_sha: 9f84af1
source_read_ok: true
diff_read_ok: true
blocking_count: 0

Copy link
Copy Markdown
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1169
Branch: feat/auto-assumption-source-provenance | 7 files, +324/-5 | CI: Bridge TypeScript pass 13s https://github.com/Q00/ouroboros/actions/runs/26264268510/job/77304138424
Scope: architecture-level
HEAD checked: 9f84af16840cdbdec94b35054ce3ccfa4a7bbba2

What Improved

  • Added a frozen AssumptionRecord and ledger-level assumption_sources() surface for the three assumption-class sources.
  • Populated AutoPipelineResult.assumption_sources from the ledger while preserving legacy assumptions.
  • Added focused unit coverage for ledger filtering/dedupe and the internal pipeline result field.

Issue #N/A Requirements

Requirement Status
Add auditable companion to AutoPipelineResult.assumptions Partially met: internal Python result field is populated.
Include text, source, and confidence per assumption record Met internally via AssumptionRecord.
Cover assumption, inference, and conservative_default sources Met in ledger and internal pipeline tests.
Preserve legacy assumptions behavior Met by keeping assumptions() filtered to LedgerSource.ASSUMPTION.
Make the new envelope usable by callers/clients Not met for MCP clients because assumption_sources is omitted from _result_meta().

Prior Findings Status

Prior Finding Status
Prior review context MODIFIED — No prior review findings were present in the provided artifacts; no prior concerns were maintained, modified, or withdrawn.

Blockers

# File:Line Severity Confidence Finding
1 src/ouroboros/mcp/tools/auto_handler.py:1270 High 90% The new provenance field is not exposed through the MCP result contract. _result_meta() serializes adjacent envelope fields (defaulted_sections, evidence_backed_sections, assumption_only_sections) but never includes result.assumption_sources; AutoHandler.handle() returns only _format_result() text plus this meta, and _format_result() still emits only legacy result.assumptions. As a result, MCP clients cannot observe the newly added inference / conservative_default provenance even though the PR goal is to let callers distinguish system-made assumptions from confirmed facts and skills/auto/SKILL.md:99 documents result.assumption_sources as available.

Follow-ups

# File:Line Priority Confidence Suggestion
None.

Test Coverage

  • Ledger tests cover all three assumption-class sources, inactive/evidence-backed exclusion, and same-text dedupe.
  • Pipeline test covers the internal AutoPipelineResult.assumption_sources field and legacy assumptions compatibility.
  • Missing coverage at the consumer boundary: no MCP _result_meta() / AutoHandler.handle() test asserts assumption_sources is serialized for clients.

Design / Roadmap Gate

The ledger and pipeline internals are coherent, but the affected boundary is the auto result contract consumed by MCP clients. Existing envelope peers are explicitly projected into MCPToolResult.meta, while the new field stops at the in-process dataclass. Because MCP callers do not receive raw AutoPipelineResult objects, this leaves the advertised provenance surface inaccessible at the primary client boundary.

Merge Recommendation

Post-merge audit recommendation: treat current HEAD as still incomplete for PR-C2 until assumption_sources is serialized in MCP metadata with regression coverage.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: post_merge_audit
merge_eligible: false
head_sha: 9f84af1
source_read_ok: true
diff_read_ok: true
blocking_count: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant