feat(auto): ledger-derived task-class inference (L1-b) by shaun0927 · Pull Request #1177 · Q00/ouroboros

shaun0927 · 2026-05-22T13:49:30Z

Summary

L1-b slice of #1171 — the ledger-derive substrate of #1157's L1 lane. Pattern-matches the Socratic interview's already-standardized ledger entries against the L1-a (#1173) catalog and returns one of three outcomes: single match, ambiguous, or unmatched (falls back to LIBRARY). No new LLM call, no eval set, no accuracy floor.

`derive_domain_from_ledger` runs every registered pattern function against the ledger's confirmed/defaulted/inferred entries (inactive statuses excluded) and returns a frozen `DomainInference` value:

Single match — one `matches` predicate fired.
Ambiguous — two or more predicates fired; the interview driver (L1-c, future PR) should ask a disambiguation question rather than silently pick.
Unmatched — no predicate fired; falls to LIBRARY (safest completion gate) with `reason="unmatched"` for telemetry / catalog growth.

Adding a new task class = adding a `matches` function + `_PATTERN_REGISTRY` entry + unit test. ~10 LoC PR per class.

What lands

`src/ouroboros/auto/domain_inference.py` (new): `DomainInference` frozen dataclass + 7 pattern functions + `derive_domain_from_ledger` entry point + `register_pattern` test/extension surface.
`tests/unit/auto/test_domain_inference.py` (new): 13 tests — one positive case per class + ambiguous + unmatched + inactive-status discipline + registry-vs-enum guard.

What is NOT in this PR

Interview-driver disambiguation hook (consumes `is_ambiguous`) — deferred to L1-c, evidence-driven.
Seed Architect AC injection (consumes `DomainInference.single`) — L1-d, separate PR.
Result-envelope surface — L1-e, folded into L1-d PR.

Test plan

`uv run pytest tests/unit/auto/test_domain_inference.py -v` → 13 passed.
`uv run pytest tests/unit/auto -q` → 905 passed (892 baseline + 13 new).
`uv run ruff check` on touched files → clean.
`uv run ruff format` on touched files → clean.
`uv run mypy src/ouroboros/auto/domain_inference.py` → clean.

Refs #1157 (L1 lane, ledger-derive redesign), #1171 (L1 design issue), #1173 (L1-a task-class catalog).

L1-b slice of Q00#1171 — the ledger-derive substrate of Q00#1157's L1 lane. Pattern-matches the Socratic interview's already-standardized ledger entries against the L1-a (Q00#1173) catalog and returns one of three outcomes: single match, ambiguous, or unmatched (falls back to LIBRARY). **No new LLM call, no eval set, no accuracy floor.** ## Summary ``derive_domain_from_ledger`` runs every registered pattern function against the ledger's confirmed/defaulted/inferred entries (inactive statuses excluded) and returns a frozen ``DomainInference`` value: - **Single match** — one ``_matches_<class>`` predicate fired. - **Ambiguous** — two or more predicates fired; the interview driver (L1-c, future PR) should ask a disambiguation question rather than silently pick. - **Unmatched** — no predicate fired; falls to ``LIBRARY`` (safest completion gate, narrowest blast radius) with ``reason="unmatched"`` for telemetry / catalog growth. Adding a new task class = adding a ``_matches_<name>`` function + ``_PATTERN_REGISTRY`` entry + unit test. ~10 LoC PR per class. ## What lands - ``src/ouroboros/auto/domain_inference.py`` (new): - ``DomainInference`` frozen dataclass with ``is_single`` / ``is_ambiguous`` / ``is_unmatched`` / ``single`` convenience properties. - ``_section_text`` helper that concatenates a section's active-status entries (matches the same active-set rule used by ``SeedDraftLedger._values_for_sources``). - 7 pattern functions, one per L1-a TaskClass. - ``register_pattern`` opt-in extension surface for tests / future classes. - ``derive_domain_from_ledger`` entry point. - ``tests/unit/auto/test_domain_inference.py`` (new): 13 tests covering one positive case per class + ambiguous + unmatched + inactive-status discipline + registry-vs-enum guard. ## Test plan - [x] ``uv run pytest tests/unit/auto/test_domain_inference.py -v`` → 13 passed. - [x] ``uv run pytest tests/unit/auto -q`` → 905 passed (892 baseline + 13 new). - [x] ``uv run ruff check`` on touched files → clean. - [x] ``uv run ruff format`` on touched files → no changes. - [x] ``uv run mypy src/ouroboros/auto/domain_inference.py`` → clean. ## What is NOT in this PR - Interview-driver disambiguation hook (consumes ``is_ambiguous``) — deferred to L1-c, *evidence-driven* (none of the canonical scenarios are ambiguous, so L1-c is not blocking the SSOT acceptance gate). - Seed Architect AC injection (consumes ``DomainInference.single``) — L1-d, separate PR. - Result-envelope surface — L1-e, folded into the L1-d PR. ## References - Q00#1157 — Meta SSOT for ``ooo auto`` (L1 lane body, ledger-derive redesign). - Q00#1171 — L1 design issue (this PR's spec). - Q00#1173 — L1-a task-class catalog (this PR consumes ``TASK_CLASS_CATALOG`` and ``TaskClass``). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 7af65a2 for PR #1177

Review record: 18950a45-25e5-44a0-abcc-df6ba8ef34da

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Unable to complete the review: every shell command failed before execution with bwrap: No permissions to create a new namespace, including simple reads of /tmp/pr_diff_1177.patch, changed-files, and review-comment files. I could not inspect the patch or source snapshot.

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

PR #1177
Branch: feat/pP2 | 2 files, +557/-0 | CI: HTTP 401: Bad credentials (https://api.github.com/graphql)
Scope: architecture-level
HEAD checked: 7af65a2beb204d53dffabf959680eb244fd3d04d

What Improved

Added a deterministic DomainInference result type and task-class pattern registry.
Covered one positive case per current TaskClass, unmatched fallback, inactive status filtering, and enum/registry parity.

Issue #N/A Requirements

Requirement	Status
Derive task class deterministically from ledger entries without an LLM call	Met
Return a single match when exactly one predicate fires	Partially met
Return ambiguous when two or more predicates fire	Not met for `WEBHOOK` + `WEB_SERVICE` overlap
Return unmatched with `LIBRARY` fallback when no predicates fire	Met
Exclude inactive statuses from inference	Met
Keep registry in lockstep with `TaskClass` enum	Met
Provide meaningful tests for newly added logic	Incomplete

Prior Findings Status

Prior Finding	Status
Prior review context	MODIFIED — No prior review concerns were present in the provided artifacts, so no concerns were maintained, modified, or withdrawn.

Blockers

#	File:Line	Severity	Confidence	Finding
1	src/ouroboros/auto/domain_inference.py:179	High	95%	`_matches_web_service` suppresses itself whenever `_matches_webhook` fires, so a ledger with both web-service signals (`HTTP response`, `JSON body`) and webhook signals (`webhook POST`, stored DB row) returns a single `WEBHOOK` match instead of the advertised ambiguous outcome. This violates the boundary contract that every registered pattern runs independently and multi-match cases are surfaced for disambiguation, causing downstream completion gates/AC consumers to silently receive the wrong single class.

Follow-ups

#	File:Line	Priority	Confidence	Suggestion
—	—	—	—	None.

Test Coverage

New unit coverage includes representative positives for all seven task classes, unmatched fallback, inactive entries, and registry parity.
Missing meaningful boundary coverage for overlapping WEBHOOK/WEB_SERVICE signals; the existing ambiguity test only covers CLI + WEBHOOK, so it does not catch the cross-predicate suppression at domain_inference.py:179.
PR check data was unavailable in the provided artifacts due GitHub API HTTP 401; I validated the blocker with a local PYTHONPATH=src python reproduction against current HEAD.

Design / Roadmap Gate

Affected-boundary reasoning: this PR adds a new classification boundary that future interview-driver, seed-AC, telemetry, and result-envelope consumers will trust as authoritative. The design says predicates are conservative and independent, with ambiguity surfaced rather than silently resolved. WEB_SERVICE currently calls _matches_webhook and negates it, embedding priority inside one predicate instead of the derive_domain_from_ledger aggregation boundary. That breaks replay/consumer semantics because identical ledger evidence can no longer represent multiple plausible classes, and future consumers reading is_single/single will have no way to recover the suppressed WEB_SERVICE signal.

Merge Recommendation

Retrospective recommendation: fix current HEAD before wiring L1-b into consumers. Remove the webhook exclusion from _matches_web_service or make any priority rule an explicit documented contract, and add a regression test proving webhook/web-service overlap returns is_ambiguous.

Review-Metadata:
verdict: REQUEST_CHANGES
github_event: COMMENT
review_kind: post_merge_audit
merge_eligible: false
head_sha: 7af65a2
source_read_ok: true
diff_read_ok: true
blocking_count: 0

Drops the five hunks called out in PR review against #1157 SSOT / merged design decisions, while keeping every legitimate fix in this PR: - Restore DeferredProbe.passed=True (#1181 contract: probe-PASS placeholder so the L3 verifier flags the gap without failing the grade). - Restore `_inference.single` consumer call so unmatched ledgers still apply the safe LIBRARY fallback (#1177 + #1188 decision). - Drop the `if error is None` last_error_code clearing — sibling PR #1194 owns this hunk with the stricter `next_phase is not BLOCKED` condition. - Restore `env.setdefault(...)` for the three plugin runtime env vars introduced by #1193 so downstream entrypoints can still pre-seed. - Drop the conditional `active_task_class` MCP meta block — sibling PR #1196 owns it with unconditional null surfacing (clients need to tell ambiguous-inference apart from a missing protocol field). Kept fixes: `_matches_web_service` independence, `defaulted_sections` CLI rendering, `status --limit > 0` validation, mechanical-eval evidence linkage, TimeoutExpired stdout/stderr bytes-safe decode, workflow lifecycle same-timestamp restart ordering, MCP meta surfacing for `interview_closure_mode` + `runtime_probe_evidence`, plugin-mode Ralph `product_status=not_verified_complete` downgrade. Verification: - uv run pytest tests/unit/auto/test_pipeline_task_class_envelope.py tests/unit/auto/test_domain_inference.py tests/unit/auto/test_surface.py tests/unit/orchestrator/test_runtime_evidence.py tests/unit/cli/test_status_run.py tests/unit/cli/test_auto_command.py tests/unit/plugin/test_firewall.py tests/unit/orchestrator/test_workflow_lifecycle_events.py tests/integration/test_mechanical_eval_projection.py tests/unit/auto/test_stop_reason_code.py -q → 243 passed - uv run ruff check src tests → passed - uv run ruff format --check src tests → passed

ouroboros-agent Bot approved these changes May 22, 2026

View reviewed changes

shaun0927 merged commit 4ede81f into Q00:main May 22, 2026
8 checks passed

shaun0927 mentioned this pull request May 22, 2026

feat(auto): Seed AC injection + active_task_class envelope (L1-d, L1-e) #1188

Merged

4 tasks

ouroboros-agent Bot reviewed May 23, 2026

View reviewed changes

Q00 mentioned this pull request May 24, 2026

fix: resolve post-merge review blocker foundations #1205

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(auto): ledger-derived task-class inference (L1-b)#1177

feat(auto): ledger-derived task-class inference (L1-b)#1177
shaun0927 merged 1 commit into
Q00:mainfrom
shaun0927:feat/pP2

shaun0927 commented May 22, 2026

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shaun0927 commented May 22, 2026

Summary

What lands

What is NOT in this PR

Test plan

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Blocking Findings

Non-blocking Suggestions

Design Notes

Recovery Notes

Uh oh!

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

What Improved

Issue #N/A Requirements

Prior Findings Status

Blockers

Follow-ups

Test Coverage

Design / Roadmap Gate

Merge Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant