feat(auto): task-class catalog data (L1-a) by shaun0927 · Pull Request #1173 · Q00/ouroboros

shaun0927 · 2026-05-22T11:20:20Z

Summary

L1-a slice of #1171 — the ledger-derive redesign of #1157's L1 lane.

Introduces the 7-class task-class catalog as catalog data only. Frozen enum + per-class profile + immutable lookup table. No inference, no LLM, no eval set, no telemetry — those belong to later L1-b/c/d sub-PRs and the ledger-derive design explicitly forbids a separate classifier model.

The catalog is a task-class concept, distinct from the existing DomainProfile meta-domain concept (coding / research / design). Both layers can coexist; this PR introduces the task_classes module as a sibling, not an extension.

Why this is small

Per the Ouroboros minimal-substrate principle re-emphasized in the 2026-05-22 SSOT self-audit (#1157), L1 was scoped down from a classifier + eval set + accuracy floor + opt-in telemetry pipeline to pattern matching against the interview's standardized ledger. This PR is the smallest possible substrate that downstream L1-b (the derive_domain_from_ledger pattern matcher) and L1-d (the Seed AC injection hook) can consume.

What lands

src/ouroboros/auto/task_classes.py (new):
- CompletionMode StrEnum: CODE_COMPLETE / PRODUCT_COMPLETE.
- TaskClass StrEnum: 7 frozen classes (library, cli, web_service, webhook, data_pipeline, game_2d, refactor_in_place). Deferred classes (game_3d, desktop_app, notebook_analysis) per Meta SSOT slice: L1 — TaskClass Catalog (ledger-derived domain inference + default AC injection) #1171.
- TaskClassProfile frozen dataclass: name, default_completion_mode, default_ac_template (tuple[str, ...] matching Seed.acceptance_criteria exactly), runtime_probe_kinds (plain-string placeholder until L3 lands).
- TASK_CLASS_CATALOG: immutable MappingProxyType over a private dict.
tests/unit/auto/test_task_classes.py (new): 13 tests covering catalog shape, completion-mode invariants, immutability, StrEnum behavior, webhook-vs-web-service distinction pin.

What is NOT in this PR

Pattern-matching derive_domain_from_ledger — L1-b.
Interview-driver disambiguation hook — L1-c.
Seed Architect AC injection — L1-d.
Result-envelope surface (AutoPipelineResult.active_task_class) — L1-e.

Test plan

uv run pytest tests/unit/auto/test_task_classes.py -v → 13 passed.
uv run pytest tests/unit/auto -q → 889 passed.
uv run ruff check on touched files → clean.
uv run ruff format on touched files → no changes.
uv run mypy src/ouroboros/auto/task_classes.py → clean.

Refs #1157 (L1 lane), #1171 (L1 design issue), #849 (existing DomainProfile contract — preserved untouched).

L1-a slice of Q00#1171 — the ledger-derive redesign of Q00#1157's L1 lane.

## Summary

Introduce the 7-class **task-class catalog** as catalog data only. Frozen enum + per-class profile + immutable lookup table. **No inference, no LLM, no eval set, no telemetry** — those belong to later L1-b/c/d sub-PRs and the ledger-derive design explicitly forbids a separate classifier model.

The catalog is a *task-class* concept, distinct from the existing ``DomainProfile`` *meta-domain* concept (coding / research / design). Both layers can coexist; this PR introduces the ``task_classes`` module as a sibling, not an extension.

## Why this is small

Per the Ouroboros minimal-substrate principle re-emphasized in the 2026-05-22 SSOT self-audit, L1 was scoped down from a classifier + eval set + accuracy floor + opt-in telemetry pipeline to **pattern matching against the interview's standardized ledger**. This PR is the smallest possible substrate that downstream L1-b (the ``derive_domain_from_ledger`` pattern matcher) and L1-d (the Seed AC injection hook) can consume.

## What lands

- ``src/ouroboros/auto/task_classes.py`` (new):
  - ``CompletionMode`` StrEnum: ``CODE_COMPLETE`` / ``PRODUCT_COMPLETE``.
  - ``TaskClass`` StrEnum: 7 frozen classes (``library``, ``cli``, ``web_service``, ``webhook``, ``data_pipeline``, ``game_2d``, ``refactor_in_place``). Deferred classes (``game_3d``, ``desktop_app``, ``notebook_analysis``) per Q00#1171.
  - ``TaskClassProfile`` frozen dataclass: ``name``, ``default_completion_mode``, ``default_ac_template`` (``tuple[str, ...]`` matching ``Seed.acceptance_criteria`` exactly), ``runtime_probe_kinds`` (plain-string placeholder until L3 lands).
  - ``TASK_CLASS_CATALOG``: immutable ``MappingProxyType`` over a private dict so consumers cannot mutate.
- ``tests/unit/auto/test_task_classes.py`` (new):
  - 13 tests covering catalog shape, completion-mode invariants, immutability, StrEnum behavior, and the ``webhook`` vs ``web_service`` distinction pin.

## Test plan

- [x] ``uv run pytest tests/unit/auto/test_task_classes.py -v`` → 13 passed.
- [x] ``uv run pytest tests/unit/auto -q`` → 889 passed (878 baseline + 13 new − 2 fixture-renames-not-affected = 889).
- [x] ``uv run ruff check`` on touched files → clean.
- [x] ``uv run ruff format`` on touched files → no changes.
- [x] ``uv run mypy src/ouroboros/auto/task_classes.py`` → clean.

## What is NOT in this PR

- Pattern-matching ``derive_domain_from_ledger`` — L1-b.
- Interview-driver disambiguation hook — L1-c.
- Seed Architect AC injection — L1-d (separate file in ``seed_architect``).
- Result-envelope surface (``AutoPipelineResult.active_task_class``) — L1-e.

## References

- Q00#1157 — Meta SSOT for ``ooo auto`` (L1 lane body, *Substrate honesty* note).
- Q00#1171 — L1 design issue (ledger-derive redesign, this PR's spec).
- Q00#849 — Original DomainProfile contract (kept untouched; task-class is a parallel concept at a finer granularity).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Branch: feat/prL1a | 2 files, +325/-0 | CI: Bridge TypeScript pass 13s https://github.com/Q00/ouroboros/actions/runs/26284794948/job/77369602946
Scope: diff-only
HEAD checked: cd49d6e74517bcec330f171f91509c93a2fbfaad

What Improved

Added a narrow catalog-only L1-a slice with no inference, LLM calls, telemetry, or runtime behavior changes.
Added focused tests for catalog coverage, profile shape, enum serialization, immutability, completion-mode grouping, and webhook/web-service separation.

Issue #1171 Requirements

Requirement	Status
L1-a ships seven classes covering `library`, `cli`, `web-service`, `webhook`, `data-pipeline`, `game-2d`, and `refactor-in-place`.	NOT MET — seven enum members exist at `src/ouroboros/auto/task_classes.py:72`, but several serialized values use underscores rather than the issue’s canonical hyphenated values.
L1-a includes per-class `default_completion_mode`, `default_ac_template`, and `runtime_probe_kinds`.	MET — fields are present at `src/ouroboros/auto/task_classes.py:107`.
L1-a includes per-class `safe_defaults`.	NOT MET — `TaskClassProfile` ends without this field at `src/ouroboros/auto/task_classes.py:109`.
L1-a includes at least one unit test per class.	MET — `tests/unit/auto/test_task_classes.py:33` parametrizes catalog-shape checks across every `TaskClass`.
L1-b/L1-c/L1-d/L1-e inference, interview, AC injection, and result-envelope behavior.	DECLARED OUT OF SCOPE — PR body explicitly defers these later slices; no blocker raised for their absence.

Prior Findings Status

Prior Finding	Status
Previous bot review reported no in-scope blockers but also stated local file reads failed with `bwrap: No permissions to create a new namespace`.	MODIFIED — current review inspected the checked-out files and verified current-HEAD contract blockers.

Blockers

#	File:Line	Severity	Confidence	Finding
1	`src/ouroboros/auto/task_classes.py:74`	Medium	90%	The frozen serialized task-class names use underscores (`web_service`, with the same pattern at lines 76-78), but the linked L1 design defines the canonical class values as hyphenated names (`web-service`, `data-pipeline`, `game-2d`, `refactor-in-place`). Because `TaskClass` is a `StrEnum` intended to serialize directly into downstream envelopes/ledger-facing values, this locks in a contract that does not match the design issue before L1-b/L1-d consume it.
2	`src/ouroboros/auto/task_classes.py:106`	Medium	80%	`TaskClassProfile` only exposes `name`, `default_completion_mode`, `default_ac_template`, and `runtime_probe_kinds` through line 109, but the linked L1-a sub-PR breakdown explicitly includes `safe_defaults` as a per-class field. Since this PR freezes the producer-side catalog schema for later inference and Seed assembly, omitting that field leaves the L1-a contract incomplete rather than merely deferred.

Follow-ups

#	File:Line	Priority	Confidence	Suggestion

Test Coverage

tests/unit/auto/test_task_classes.py:26 and tests/unit/auto/test_task_classes.py:33 cover enum/catalog lockstep and profile shape for every class. However, tests currently pin the underscore serialized values at tests/unit/auto/test_task_classes.py:93, so they do not catch the canonical-name mismatch. They also cannot cover safe_defaults because the field is absent from TaskClassProfile.

I attempted uv run pytest tests/unit/auto/test_task_classes.py -q, but the run failed before tests executed because the editable build’s VCS version hook timed out on git version and then could not derive a version. A direct PYTHONPATH=src import smoke check for the new module passed.

Design / Roadmap Gate

The PR aligns with the design gate on keeping L1-a catalog-only and avoiding a separate classifier, eval set, telemetry, or new LLM path. It does not fully align with the L1-a catalog contract because current HEAD freezes underscore serialized class values and omits the safe_defaults field that the linked issue lists for the per-class schema.

Merge Recommendation

Merge after the catalog’s serialized names and profile schema are brought back into alignment with the L1-a design contract, with tests updated to pin those contract values.

ouroboros-agent[bot]

Two docstring-only clarifications surfaced by code review on PR Q00#1173:

1. Explain why `safe_defaults` is intentionally absent from `TaskClassProfile`. The earlier Q00#1171 schema sketch carried that field forward from Q00#849's `DomainProfile`, but on implementation review it is meta-domain-scoped (applies uniformly across coding task classes) and belongs on the `DomainProfile` layer, not on the within-meta-domain task-class catalog. The decoupling rationale in the module docstring now lists the full set of meta-domain fields kept on `DomainProfile` and notes the deliberate omission.
2. Note that catalog serialization uses underscored identifiers (`web_service`, `data_pipeline`, `game_2d`, `refactor_in_place`) so `TaskClass.value` is a valid Python identifier and JSON-safe ledger key. Prose docs in the Q00#1157 / Q00#1171 issues may render the same names with hyphens; both refer to the same class. This prevents L1-b pattern functions from hardcoding the wrong form.

No behavior change. Tests, lint, mypy unchanged.

shaun0927 · 2026-05-22T12:52:16Z

Merge Justification

This PR is the L1-a slice of #1171 (TaskClass catalog), which is in turn a slice of #1157 (the ooo auto Meta SSOT). I re-audited it against the SSOT-level direction in #961 (AgentOS roadmap) and the minimal-substrate principle re-affirmed by #1157's 2026-05-22 freshness sync, ran two code-review passes, and added one docstring-only follow-up commit. It is now ready to merge.

What this PR is

A catalog-data-only module (src/ouroboros/auto/task_classes.py) plus its tests. It introduces:

CompletionMode (StrEnum): code_complete | product_complete.
TaskClass (StrEnum): the 7 frozen task classes from #1171's resolved taxonomy — library, cli, web_service, webhook, data_pipeline, game_2d, refactor_in_place. Deferred classes (game_3d, desktop_app, notebook_analysis) are explicitly out of scope.
TaskClassProfile (frozen dataclass, slots=True): name, default_completion_mode, default_ac_template, runtime_probe_kinds.
TASK_CLASS_CATALOG: an immutable MappingProxyType view over the 7 entries.
get_task_class_profile(): lookup helper.
13 unit tests covering catalog shape, enum/catalog lockstep, completion-mode invariants, StrEnum serialization semantics, MappingProxyType immutability, and the explicit webhook vs web_service distinction pin.

No inference logic, no LLM, no eval set, no telemetry, no caller wiring — those land in L1-b/c/d/e per the issue's sub-PR breakdown.

Why this aligns with the SSOTs

Meta SSOT: ooo auto Vision — Autonomous Completion Engine #1157 minimal-substrate principle. L1 was scoped down from a classifier + eval set + accuracy floor + opt-in telemetry pipeline to "pattern matching against the interview's standardized ledger." This PR is the smallest data substrate that downstream pattern matchers and AC injectors can consume — no new substrate beyond the catalog itself.
#1171 frozen 7-class taxonomy. The PR's TaskClass enum matches the issue's resolved taxonomy exactly (names, count, completion-mode partitioning). The deferred classes (game_3d / desktop_app / notebook_analysis) are honored. The webhook-vs-web-service separation called out as a resolved design decision in #1171 is explicitly pinned by test_webhook_and_web_service_are_distinct_in_catalog.
Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961 Track B follow-up posture. The PR rides as a narrow L1-a Track B follow-up outside Track C tier gates, matching the pattern #961 documents for feat(auto): task-class catalog data (L1-a) #1173 / feat(tests): canonical acceptance harness skeleton (L0-a) #1174 / feat(auto): route Ralph oscillation_detected through UNSTUCK_LATERAL (L5-a) #1175.
Decoupling from DomainProfile (feat(auto): DomainProfile and VerifiablePredicate contracts (#809 P3, PR 1/6) #849). #1171's prose says L1 should "promote #849's DomainProfile." On implementation review this PR keeps the two as siblings instead, and the module docstring now justifies why: existing DomainProfile carries cross-domain machinery (repo_context_extractor, intent_classifier, vague_terms, detector, verifiable_predicates, safe_defaults) which is structurally orthogonal to within-meta-domain task shape. Coupling them would force every task-class entry to grow meta-domain wiring it does not need. The docstring also notes that safe_defaults belongs on the meta-domain layer, not on the task-class catalog.

What changed in this iteration

One follow-up commit on top of cd49d6e7:

9aae1983 — docs(auto): clarify decoupling rationale and serialization convention. Docstring-only: (1) records why safe_defaults is intentionally absent from TaskClassProfile, (2) documents that underscored identifiers (web_service, game_2d, refactor_in_place) are the canonical serialization form — prose with hyphens in the SSOT issues refers to the same classes. Both edits resolve the two MINOR notes raised by the first code-review pass. Zero behavior change.

Why over-engineering risk is low

Module is ~235 LoC (mostly catalog data + docstrings) and tests are ~106 LoC.
No premature abstraction beyond what's needed: the _profile() helper centralizes the "name always derives from TaskClass.value" invariant so the catalog cannot drift; the MappingProxyType view prevents downstream callers from mutating the catalog by accident; the frozen=True, slots=True dataclass makes profile instances cheap and immutable.
No new dependencies, no new substrate, no caller wiring; nothing here pre-commits L1-b/c/d/e to an implementation strategy beyond the data shape they will consume.

Verification

uv run pytest tests/unit/auto/test_task_classes.py -v → 13 passed.
uv run pytest tests/unit/auto -q → 889 passed (no regression).
uv run ruff check src/ouroboros/auto/task_classes.py tests/unit/auto/test_task_classes.py → clean.
uv run ruff format --check → no changes needed.
uv run mypy src/ouroboros/auto/task_classes.py → clean.
Two independent code-reviewer passes both returned APPROVE; the second pass returned zero findings of any severity.

This PR is small, targeted, well-tested, aligned with the SSOT-confirmed direction, and unblocks L1-b/c/d/e cleanly. Recommending merge.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 9aae198 for PR #1173

Review record: 854cc3f0-6b25-4e02-b2ac-8f4e6ecaa7c8

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

Review could not be completed due to the execution environment blocking all local file reads.

Policy Notes

Omitted 1 finding(s) that referenced files outside the current PR changed-files scope.

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

shaun0927 · 2026-05-22T12:53:45Z

PR Review Summary

Verdict

APPROVE

Scope Reviewed

PR intent: Land the L1-a catalog data — a frozen 7-class TaskClass enum, CompletionMode, immutable TASK_CLASS_CATALOG, and TaskClassProfile dataclass — as the smallest substrate needed by the downstream L1-b/c/d/e sub-PRs of #1171 (the L1 lane of the ooo auto Meta SSOT, #1157). Data only — no inference, no LLM, no Seed wiring, no result-envelope plumbing.
Main changed areas:
- src/ouroboros/auto/task_classes.py — new, 235 LoC including docstrings (PR diff: +219, then +16/-2 docstring clarification in 9aae1983).
- tests/unit/auto/test_task_classes.py — new, 106 LoC, 13 tests.
Tests reviewed: all 13 in test_task_classes.py.
Checks considered: pytest tests/unit/auto/test_task_classes.py -v (13/13 pass locally); pytest tests/unit/auto -q (889 pass, no regression); ruff check, ruff format --check, mypy src/ouroboros/auto/task_classes.py all clean locally; remote CI re-running on 9aae1983 (Ruff Lint, enforce-boundary, enforce-envelope, Bridge TypeScript already green; Test Python 3.12/3.13/3.14 and MyPy pending — identical lint/format/mypy already green on the prior commit cd49d6e7 with the same code surface).

Blocking Issues

None.

Warnings

None.

Mutation-Test Thinking

Likely mutants the current tests would kill:
- Drop or rename an enum value in TaskClass → killed by test_task_classes_match_catalog (lockstep check).
- Add an orphan key to _CATALOG without an enum value → same test kills it.
- Set TaskClassProfile.name to anything other than task_class.value for any entry → killed by the per-class assertion in test_catalog_entry_shape.
- Empty out default_ac_template or any whitespace-only entry → killed by test_catalog_entry_shape.
- Replace MappingProxyType(_CATALOG) with the raw _CATALOG dict → killed by test_catalog_is_immutable_view.
- Switch any of cli / web_service / webhook / data_pipeline / game_2d from PRODUCT_COMPLETE to CODE_COMPLETE → killed by test_library_class_is_code_complete (which pins the partition {LIBRARY, REFACTOR_IN_PLACE} exactly).
- Downgrade CompletionMode or TaskClass from StrEnum to plain Enum → killed by test_completion_modes_are_string_enum / test_task_class_is_string_enum (string-equality semantics).
- Merge webhook and web_service into one class → killed by test_webhook_and_web_service_are_distinct_in_catalog.
Mutants the current tests would not kill (acceptable for L1-a):
- Swapping the string content of two classes' default_ac_template between them. Justified — #1171 explicitly allows AC-template content evolution, and the canonical-scenario L0 tests (Meta SSOT slice: L0 — Canonical Test Harness for ooo auto acceptance #1170) are the agreed downstream pin. Pinning literal AC strings here would create churn for every wording iteration.
- Reassigning a probe-kind set from game_2d to data_pipeline (or similar non-webhook/web-service swap). Same rationale: probe-kind bindings are L3-pinned, not L1-pinned. The single exception (webhook ≠ web_service) is already asserted because #1171 calls out that distinction as a resolved design decision.
Additional tests recommended: None for L1-a. The mutation gaps above are intentional and align with the issue's sub-PR boundary.

Complexity / CRAP-style Risk

High-risk functions/modules: None. The module is pure data plus one one-line lookup helper.
Complexity increase: Cyclomatic complexity is effectively 1 throughout (no branching outside the if isinstance chain inside _freeze_safe_default_value — and that's in domain_profile.py, not in this PR). The _profile() factory is six trivial keyword passes. get_task_class_profile() is a single subscript.
Test coverage concern: None. The 13 tests cover every structural invariant the L1-a contract promises.
Refactoring recommendation: None.

Test Quality Assessment

Strong tests:
- test_task_classes_match_catalog — pins the enum/catalog lockstep so future class additions cannot land without a corresponding profile.
- test_catalog_entry_shape (parametrized over all 7 classes) — pins per-class structural invariants in one place.
- test_library_class_is_code_complete — pins the completion-mode partition as a set equality, which catches both directions of mistake (a CODE_COMPLETE class flipped to PRODUCT_COMPLETE and the inverse).
- test_webhook_and_web_service_are_distinct_in_catalog — pins a design decision that was explicitly resolved in #1171, protecting the catalog from a silent regression in a future "simplification" PR.
Weak tests: None for the L1-a scope.
Missing edge cases: None for a data-only catalog. KeyError on get_task_class_profile(<not-in-enum>) is intentionally unreachable for any valid TaskClass value and is documented as such.
Mocking concerns: None — no mocks, no fixtures beyond pytest parametrization.

Security / Operational Risk

None. The module is import-time data initialization only: no I/O, no network, no subprocess, no logging, no secrets, no auth, no user-input parsing, no migrations, no callers. Adding the module cannot regress any production code path because nothing in src/ouroboros/ imports it yet.

SSOT Alignment Check

#1157 (ooo auto Meta SSOT): Honors the minimal-substrate principle re-affirmed by the 2026-05-22 freshness sync — no separate classifier, no eval set, no telemetry, no opt-in surface. This PR is precisely the data the SSOT's L1 lane calls for.
#1171 (L1 design issue): Frozen 7-class enum names, count, completion-mode partitioning, and webhook/web-service separation all match the issue's Resolved taxonomy decisions and Frozen class taxonomy for L1-a sections exactly. Deferred classes (game_3d, desktop_app, notebook_analysis) are explicitly out of scope as called for.
#961 (AgentOS roadmap): The PR routes as a narrow L1-a Track B follow-up outside Track C tier gates, matching the documented pattern for #1173/#1174/#1175.
Decoupling vs #1171 prose: The issue says L1 should "promote #849's DomainProfile"; the PR keeps the two as siblings. Reviewed and justified: the existing DomainProfile carries repo_context_extractor, intent_classifier, vague_terms, detector, verifiable_predicates, and safe_defaults — all meta-domain-scoped wiring that does not apply to within-meta-domain task shape. The 9aae1983 docstring commit makes the rationale explicit so future readers do not re-litigate it.

Looks Good

Module and per-attribute docstrings are unusually thorough and trace cleanly to issue numbers (#1157, #1171, #849), making the design lineage auditable.
The _profile() factory enforces "name is always derived from TaskClass.value" structurally rather than by convention.
MappingProxyType over a private _CATALOG dict gives downstream callers a read-only view without forcing them to make defensive copies.
AC-template strings are concrete and verifiable — they read like real acceptance criteria a Seed Architect would emit, not placeholder prose.
Test names map 1:1 to the invariant each asserts; the parametrized shape test makes per-class failures point directly at the offending class.
No code wiring or caller import — adding this module is zero-risk to all existing ooo auto flows.

Final Recommendation

APPROVE — ready to merge. The PR is small (catalog data only), aligned with both governing SSOTs (#1157 and #1171), tested at the mutation-survivor level for every structural invariant it promises, and carries zero security or operational risk because it has no callers and no I/O. The single follow-up commit 9aae1983 resolved both MINOR notes raised by an earlier internal review pass via docstring clarifications, with no behavioral change. CI surface that has completed on 9aae1983 is green; the still-running long suites are exercising identical code to the prior commit, on which they were already green. There is nothing here that should hold up merge.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Branch: feat/prL1a | 2 files, +339/-0 | CI: Bridge TypeScript pass 14s https://github.com/Q00/ouroboros/actions/runs/26288809936/job/77383216437
Scope: diff-only
HEAD checked: 9aae19832b7c2c8020afd1866e13ac48453ca5c5

What Improved

Added a narrow catalog-only L1-a module with no runtime wiring, classifier, telemetry, or LLM path.
Added focused unit tests for enum/catalog lockstep, profile shape, completion-mode partitioning, immutability, StrEnum serialization, and webhook/web-service separation.

Issue #1171 Requirements

Requirement	Status
L1-a ships seven classes: library, CLI, web service, webhook, data pipeline, game 2D, refactor in place.	MET — enum values are present at `src/ouroboros/auto/task_classes.py:77`.
Catalog provides per-class completion mode, default AC template, and runtime probe kind hints.	MET — `TaskClassProfile` defines these fields at `src/ouroboros/auto/task_classes.py:120`, with catalog entries at `src/ouroboros/auto/task_classes.py:141`.
L1-a includes at least one unit test per class.	MET — `tests/unit/auto/test_task_classes.py:33` parametrizes checks over `list(TaskClass)`.
No classifier, no LLM, no eval set, no telemetry in L1-a.	MET — changed code is catalog data only; no runtime caller imports it, and the module states this boundary at `src/ouroboros/auto/task_classes.py:10`.
L1-b/L1-c/L1-d/L1-e inference, interview, AC injection, and result-envelope behavior.	DECLARED OUT OF SCOPE — PR body defers these later slices; no current runtime behavior is changed.
Per-class `safe_defaults` from the issue schema sketch.	DECLARED NON-GOAL / PRESERVED — current HEAD documents the task-class/domain-profile split at `src/ouroboros/auto/task_classes.py:19`, while existing `DomainProfile.safe_defaults` remains intact at `src/ouroboros/auto/domain_profile.py:148`.

Prior Findings Status

Prior Finding	Status
`prev_review.txt` reported no in-scope blockers but also said implementation artifacts were inaccessible.	MODIFIED — current review inspected current HEAD files directly and verified the catalog/tests.
Prior artifact blocker: underscored serialized task-class names diverge from hyphenated issue prose.	WITHDRAWN — current HEAD explicitly documents the serialization convention at `src/ouroboros/auto/task_classes.py:38`; no current consumer exists, and tests pin the intended `StrEnum` values at `tests/unit/auto/test_task_classes.py:90`.
Prior artifact blocker: `TaskClassProfile` omits per-class `safe_defaults`.	WITHDRAWN — current HEAD declares `safe_defaults` intentionally remains on the existing meta-domain `DomainProfile` layer at `src/ouroboros/auto/task_classes.py:19`, and the existing field remains present at `src/ouroboros/auto/domain_profile.py:148`.

Blockers

#	File:Line	Severity	Confidence	Finding

Follow-ups

#	File:Line	Priority	Confidence	Suggestion
1	`src/ouroboros/auto/task_classes.py:10`	Low	80%	The module docstring’s follow-up labels are one step off from issue #1171’s sub-PR breakdown: it names Seed AC injection as L1-c and result-envelope as L1-d, while #1171 lists interview integration as L1-c, Seed AC injection as L1-d, and result-envelope as L1-e. This is traceability-only and does not affect the catalog contract.

Test Coverage

tests/unit/auto/test_task_classes.py:26 covers enum/catalog lockstep, tests/unit/auto/test_task_classes.py:33 parametrizes shape checks across every TaskClass, tests/unit/auto/test_task_classes.py:58 pins the CODE_COMPLETE partition, and tests/unit/auto/test_task_classes.py:98 pins the webhook/web-service split. All newly added catalog logic and state-shape invariants have corresponding tests for this data-only slice.

Verification run: SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto/test_task_classes.py -q passed, ruff check passed, and mypy src/ouroboros/auto/task_classes.py passed. Plain uv run pytest ... failed before tests because this environment’s default git wrapper timed out during VCS version detection, not because of test failures.

Design / Roadmap Gate

The PR aligns with the L1-a catalog-only gate in design_context.md: it preserves the no-classifier/no-telemetry ledger-derived direction and defers inference, interview integration, Seed AC injection, and result-envelope plumbing. The only design wrinkle is traceability wording around follow-up slice numbering in the new docstring; it is non-blocking because the changed implementation remains data-only and does not alter runtime contracts.

Merge Recommendation

Ready to merge; the one follow-up is documentation traceability cleanup and does not need to block.

ouroboros-agent[bot]

Resolve the two README/code mismatches surfaced by the latest review on PR Q00#1174:

1. `tests/canonical/README.md` "Full live run" section claimed `OUROBOROS_RUN_CANONICAL=1 uv run pytest tests/canonical/ -v` actually invokes `ouroboros_auto` against each scenario. In L0-a the live wiring is deferred — `test_scenario_live_run_or_skip` unconditionally `pytest.skip`s with a typed reason after the opt-in env var is checked, so the documented invocation behavior is not yet available. Reword the section to call out the L0-a state ("opt-in still skips with a typed reason; shape-check tests still run") while keeping the future L0-b semantics described.
2. `tests/canonical/README.md` "Run a single scenario" section pointed at `tests/canonical/cli-todo/`, but the scenario directory contains only `goal.txt` and `expected.yaml` — pytest collects zero tests there. The actual test bodies live in `tests/canonical/test_canonical.py` and are parametrized per scenario via `pytest_generate_tests`. Replace the command with the working filter form: `uv run pytest tests/canonical/ -v -k <slug>`.
3. `tests/canonical/cli-todo/expected.yaml` had a comment referencing `test_scenario_completion_mode_matches_catalog`, which is not in the harness and is deferred until Q00#1173 (L1-a catalog data) is available on `main`. Update the comment to note the round-trip test is a follow-up, not yet present.

No code change. The hermetic shape-check suite is unchanged (still 6 passed, 1 skipped). `uv run pytest tests/canonical/ -v -k cli-todo` now collects and passes the per-scenario tests, replacing the previously documented zero-collection command.

Align the L1-a catalog documentation with Q00#1171's current sub-PR breakdown so future readers do not treat Seed AC injection as L1-c or result-envelope plumbing as L1-d.

Constraint: PR Q00#1173 is already merged; follow-up must be docs-only and preserve the approved data-only catalog behavior.
Rejected: Change catalog schema or serialized task-class values | ouroboros-agent already approved those after the decoupling rationale, and the remaining bot note was traceability-only.
Confidence: high
Scope-risk: narrow
Directive: Keep L1-c reserved for interview-driver disambiguation, L1-d for Seed AC injection, and L1-e for result-envelope plumbing unless Q00#1171 is explicitly revised.
Tested: SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run pytest tests/unit/auto/test_task_classes.py -q; uv run ruff check src/ouroboros/auto/task_classes.py tests/unit/auto/test_task_classes.py; uv run ruff format --check src/ouroboros/auto/task_classes.py tests/unit/auto/test_task_classes.py; SETUPTOOLS_SCM_PRETEND_VERSION=0.0.0 uv run mypy src/ouroboros/auto/task_classes.py
Not-tested: Full unit suite; change is limited to comments/docstrings.
Co-authored-by: OmX <omx@oh-my-codex.dev>

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

PR #1173
Branch: feat/prL1a | 2 files, +339/-0 | CI: Bridge TypeScript pass 14s https://github.com/Q00/ouroboros/actions/runs/26288809936/job/77383216437
Scope: architecture-level
HEAD checked: 9aae19832b7c2c8020afd1866e13ac48453ca5c5

What Improved

Added a focused ouroboros.auto.task_classes catalog with explicit task-class and completion-mode enums.
Kept the L1-a surface data-only: no inference, no persistence migration, no runtime probe execution, and no Seed mutation hook landed in this slice.
Added unit coverage for catalog shape, enum/catalog parity, direct immutability of the public mapping view, StrEnum serialization behavior, and the webhook/web-service distinction.

Issue #N/A Requirements

Requirement	Status
Add 7-class task-class catalog	Satisfied
Add `CompletionMode` StrEnum with code/product completion modes	Satisfied
Add frozen `TaskClassProfile` with name, default completion mode, default AC template, and runtime probe hints	Satisfied
Keep catalog data-only with no inference, LLM, eval set, telemetry, Seed injection, or result-envelope changes	Satisfied
Keep task classes separate from `DomainProfile` meta-domain layer	Satisfied
Use immutable public catalog view	Satisfied for public mapping mutation
Add meaningful tests for newly added logic	Satisfied for this data-only slice
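The qualifier "for public mapping mutation" in the table is easier to read with the mechanics in view. The sketch below uses a hypothetical stand-in `Profile` class and a one-entry mapping rather than the real catalog, and shows which mutation attempts the standard library rejects when a frozen dataclass with tuple fields sits behind a `MappingProxyType`.

```python
from dataclasses import FrozenInstanceError, dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class Profile:  # hypothetical stand-in for TaskClassProfile
    name: str
    default_ac_template: tuple[str, ...]


_private = {"cli": Profile("cli", ("placeholder criterion",))}
PUBLIC_VIEW = MappingProxyType(_private)


def try_mutations() -> dict[str, str]:
    """Attempt three mutations through the public view; record what is raised."""
    outcomes: dict[str, str] = {}
    try:
        PUBLIC_VIEW["webhook"] = Profile("webhook", ())  # item assignment
    except TypeError as exc:
        outcomes["add_key"] = type(exc).__name__
    try:
        PUBLIC_VIEW["cli"].name = "renamed"  # attribute set on frozen dataclass
    except FrozenInstanceError as exc:
        outcomes["set_field"] = type(exc).__name__
    try:
        PUBLIC_VIEW["cli"].default_ac_template.append("x")  # tuples have no append
    except AttributeError as exc:
        outcomes["append_criterion"] = type(exc).__name__
    return outcomes
```

A `MappingProxyType` is a live read-only view, not a copy: code holding a reference to the underlying private dict can still change what the view shows. That may be what the narrower wording in the status column reflects, though the review does not say so explicitly.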

Prior Findings Status

Prior Finding	Status
Prior review context	MODIFIED — Prior concerns were modified/withdrawn for current HEAD: the current file documents the safe-defaults decoupling rationale and the underscore serialization convention, and I found no remaining current-HEAD blocker with file:line evidence.

Blockers

#	File:Line	Severity	Confidence	Finding

Follow-ups

#	File:Line	Priority	Confidence	Suggestion
—	—	—	—	None.

Test Coverage

Verified uv run pytest tests/unit/auto/test_task_classes.py -q: 13 passed.
Verified uv run ruff check src/ouroboros/auto/task_classes.py tests/unit/auto/test_task_classes.py: passed.
Verified uv run mypy src/ouroboros/auto/task_classes.py: passed.
Coverage is appropriately scoped for this data-only L1-a slice; integration tests for ledger derivation, Seed AC injection, result-envelope persistence, replay, and runtime probes remain correctly deferred because those behaviors are not implemented in current HEAD.

Design / Roadmap Gate

Affected-boundary review covered the new catalog API, Seed.acceptance_criteria shape, DomainProfile separation, auto package export patterns, and downstream persistence/replay surfaces. Because current HEAD only introduces static catalog data and does not wire task class into state, ledger derivation, Seed mutation, result envelopes, or runtime probes, there is no new state-machine, persistence, replay, or consumer-contract blocker visible at current HEAD.

Merge Recommendation

Post-merge audit only: no current-HEAD blocker found for the landed L1-a catalog slice. Keep the follow-on L1-b/L1-c/L1-d work gated on tests that exercise derivation, persistence/replay compatibility, Seed AC injection, and result-envelope consumer behavior.

Review-Metadata:
verdict: APPROVE
github_event: COMMENT
review_kind: post_merge_audit
merge_eligible: false
head_sha: 9aae198
source_read_ok: true
diff_read_ok: true
blocking_count: 0

shaun0927 mentioned this pull request May 22, 2026

feat(tests): canonical acceptance harness skeleton (L0-a) #1174

Merged

5 tasks

Q00 mentioned this pull request May 22, 2026

Meta SSOT: AgentOS roadmap sequencing (#920–#960) #961

Open

ouroboros-agent Bot approved these changes May 22, 2026

shaun0927 merged commit b25b7bb into Q00:main May 22, 2026
8 checks passed

This was referenced May 22, 2026

feat(auto): Seed AC injection + active_task_class envelope (L1-d, L1-e) #1188

Merged

feat(auto): runtime-probe envelope + advisory probe_runner (L3-2) #1190

Merged

feat(tests): L0 live-wire + L1 catalog cross-validate (P1) #1191

Merged

This was referenced May 23, 2026

Meta SSOT: ooo auto Vision — Autonomous Completion Engine #1157

Open

Meta SSOT slice: L1 — TaskClass Catalog (ledger-derived domain inference + default AC injection) #1171

Closed

ouroboros-agent Bot reviewed May 23, 2026

feat(auto): task-class catalog data (L1-a) #1173

shaun0927 merged 2 commits into Q00:main from shaun0927:feat/prL1a
