auto cli-todo canonical run blocked at safe-default closure (R1 evidence)

## Summary
The R1 live canonical run (`OUROBOROS_RUN_CANONICAL=1 pytest tests/canonical/ -k cli-todo` on `265aedb4`) terminated with `phase=blocked` at safe-default synthesis. As a result, `ooo auto` cannot complete the cli-todo canonical scenario, which fails SSOT #1157 success condition #1 (all four canonical goals reach `status=complete`). Two coupled defects produce this single symptom:

- **B1 — Logic**: safe-default fallback cannot close the interview when the backend accepts the synthesis answer but does not flag the resulting turn as `seed_ready`/`completed`. Defaults are then rolled back and the auto pipeline exits blocked.
- **B2 — Envelope**: the resulting blocker is emitted with `last_error_code=None`, violating the canonical 8-code mapping contract introduced by #1151.

This issue tracks the **behavioural** fix. #1211 (open) is the **observability** complement for the same decision point and should land alongside, not instead of, the work proposed here. Bug A (the test-harness `is_ok()` / `unwrap_err()` mis-call that previously masked this evidence behind a `TypeError`) is fixed by #1218 (open).

## Reproduction
```bash
# Once #1218 merges; current main HEAD = 265aedb4 cannot reproduce without it
OUROBOROS_RUN_CANONICAL=1 uv run pytest tests/canonical/ -k cli-todo -v
# ~350s wall, ~$1 LLM cost. Expect phase=blocked terminal.
```

Local evidence preserved in `.ooo-observability/R1-cli-todo-20260525-1739.log` on the reporter's worktree (auto_session_id=`auto_3f44b20d63b7`, interview_id=`interview_169770e8f45c48cf`).

## R1 evidence — interview ambiguity trajectory (13 rounds, never crossed the 0.2 gate)

| round | overall | goal | constraints | success_criteria | ready? |
|------:|--------:|-----:|------------:|-----------------:|:------:|
| 3  | 0.298 | 0.78 | 0.62 | 0.68 | False |
| 4  | 0.304 | 0.78 | 0.62 | 0.66 | False |
| 5  | 0.540 | 0.55 | 0.35 | 0.45 | False |
| 6  | 0.401 | 0.74 | 0.52 | 0.49 | False |
| 7  | 0.464 | 0.68 | 0.46 | 0.42 | False |
| 8  | 0.361 | 0.72 | 0.55 | 0.62 | False |
| 9  | 0.421 | 0.72 | 0.42 | 0.55 | False |
| 10 | 0.353 | 0.74 | 0.55 | 0.62 | False |
| 11 | 0.397 | 0.72 | 0.55 | 0.50 | False |
| 12 | 0.373 | 0.72 | 0.58 | 0.55 | False |

The `overall` score oscillates between 0.298 and 0.540 across rounds 3 to 12 and never drops below the 0.2 readiness gate. After round 12 the auto driver enters the safe-default fallback at 08:45:41:
```
auto.interview.safe_default.entered
  ambiguity_score=0.373  backend_done=False  ledger_done=False
  open_gaps=['runtime_context']  max_rounds=12
```
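
The trajectory can be checked mechanically. The following is a minimal sketch that uses only the `overall` values from the table above and the 0.2 gate quoted in this issue; the variable names are illustrative and not taken from the codebase.

```python
# Overall ambiguity per round, copied from the R1 table (rounds 3-12).
overall_by_round = {
    3: 0.298, 4: 0.304, 5: 0.540, 6: 0.401, 7: 0.464,
    8: 0.361, 9: 0.421, 10: 0.353, 11: 0.397, 12: 0.373,
}
READINESS_GATE = 0.2  # gate value quoted in this issue

# Rounds holding the lowest and highest overall score.
lowest_round = min(overall_by_round, key=overall_by_round.get)
highest_round = max(overall_by_round, key=overall_by_round.get)

# Rounds that would have satisfied the gate (expected: none).
rounds_below_gate = [
    rnd for rnd, score in overall_by_round.items() if score < READINESS_GATE
]

print(f"min={overall_by_round[lowest_round]:.3f} (round {lowest_round})")
print(f"max={overall_by_round[highest_round]:.3f} (round {highest_round})")
print(f"rounds below gate: {rounds_below_gate}")
```

The lowest value belongs to the earliest listed round (round 3), so the later rounds are not trending toward the gate.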

## R1 evidence — terminal state (recovered from log; JSON dump was not retained on disk)
- `phase=blocked`
- blocker: `safe-default synthesis did not close the persisted interview: backend_done=False, ledger defaults rolled back`
- `last_error_code=None`
- `seed_origin=none`
- `runtime_probe_evidence=[]`

## Suspect code

### B1 — Logic: safe-default cannot close when backend silently no-ops synthesis
`src/ouroboros/auto/interview_driver.py:440-454`

```python
state.interview_session_id = synthesis_turn.session_id
state.pending_question = synthesis_turn.question
if not (synthesis_turn.seed_ready or synthesis_turn.completed):
    _revert_safe_default_entries(ledger, finalization.defaulted_sections)
    blocker = (
        "safe-default synthesis did not close the persisted interview: "
        "backend_done=False, ledger defaults rolled back"
    )
    state.ledger = ledger.to_dict()
    state.mark_blocked(blocker, tool_name="interview.safe_default_synthesis")
    record_authoring_backend(state)
    self._save(state)
    return AutoInterviewResult(
        "blocked", state.interview_session_id, ledger, self.max_rounds, blocker
    )
```

When the Socratic backend accepts the synthesis answer but never flags `seed_ready`/`completed` on the resulting turn (the cli-todo `runtime_context` gap reproduces this reliably), #1167's policy rolls every default back and exits blocked. The backend appears to treat the driver-injected synthesis as just another user response, not a terminator.
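
For discussion, the fail-forward option from the sub-tasks could look roughly like the sketch below. Everything in it is hypothetical: `SynthesisTurn`, `ClosureOutcome`, `decide_safe_default_closure` and the `ledger_satisfied` flag are illustrative stand-ins rather than the driver's real types, and the semantics assigned to `mutual_agreement` are an assumption. The choice between this and a third closure mode is still open.

```python
from dataclasses import dataclass
from enum import Enum


class ClosureOutcome(Enum):
    """Hypothetical outcomes; the two mode names are taken from this issue."""

    MUTUAL_AGREEMENT = "mutual_agreement"
    LEDGER_ONLY = "ledger_only"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class SynthesisTurn:
    """Stand-in for the turn returned after the synthesis answer."""

    seed_ready: bool
    completed: bool


def decide_safe_default_closure(
    turn: SynthesisTurn, ledger_satisfied: bool
) -> ClosureOutcome:
    """Pick a closure outcome without reverting defaults unnecessarily.

    `ledger_satisfied` is assumed to mean "no open gaps remain once the
    safe defaults are applied".
    """
    backend_done = turn.seed_ready or turn.completed
    if backend_done:
        # Assumed: the backend flagged closure, so both sides agree.
        return ClosureOutcome.MUTUAL_AGREEMENT
    if ledger_satisfied:
        # Fail forward: the backend only echoed the answer, keep the defaults.
        return ClosureOutcome.LEDGER_ONLY
    # Neither side can close; this is the only path that should roll back.
    return ClosureOutcome.BLOCKED


# The R1 shape: backend never flags the turn, ledger is complete with defaults.
r1_outcome = decide_safe_default_closure(
    SynthesisTurn(seed_ready=False, completed=False), ledger_satisfied=True
)
print(r1_outcome.value)
```

Under this policy the R1 run would close as `ledger_only` instead of exiting blocked; rollback remains only for the case where the ledger itself is still incomplete.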

### B2 — Envelope: `last_error_code` never set for this blocker
`src/ouroboros/auto/state.py:626-636`

```python
def mark_blocked(
    self,
    message: str,
    *,
    tool_name: str | None = None,
    error_code: str | None = None,
) -> None:
    self.last_tool_name = tool_name
    self.last_error_code = error_code
    self.transition(AutoPhase.BLOCKED, message, error=message)
```

Both safe-default failure sites in `interview_driver.py` (lines 434 and 449) call `mark_blocked(blocker, tool_name="interview.safe_default_synthesis")` without passing `error_code=`, so `last_error_code` defaults to `None`. The terminal envelope carries the rich blocker text but no canonical code — breaking the #1151 8-code mapping contract.
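
The effect of the omitted keyword can be shown in isolation. The sketch below uses a stripped-down stand-in for the state class (`MiniState` is hypothetical and only mirrors the fields `mark_blocked` touches) together with the code name proposed in the sub-tasks; whether that exact name is adopted is undecided.

```python
from __future__ import annotations

from dataclasses import dataclass

# Code name proposed in this issue; hypothetical until it joins the #1151 alphabet.
NONCLOSURE_CODE = "INTERVIEW_SAFE_DEFAULT_SYNTHESIS_NONCLOSURE"

BLOCKER = (
    "safe-default synthesis did not close the persisted interview: "
    "backend_done=False, ledger defaults rolled back"
)


@dataclass
class MiniState:
    """Stand-in for the auto state; not the real class."""

    phase: str = "interview"
    blocker: str | None = None
    last_tool_name: str | None = None
    last_error_code: str | None = None

    def mark_blocked(
        self,
        message: str,
        *,
        tool_name: str | None = None,
        error_code: str | None = None,
    ) -> None:
        self.last_tool_name = tool_name
        self.last_error_code = error_code
        self.phase = "blocked"
        self.blocker = message


# Current call shape: no error_code, so the envelope carries None.
current = MiniState()
current.mark_blocked(BLOCKER, tool_name="interview.safe_default_synthesis")

# Proposed call shape: same blocker text plus a canonical code.
fixed = MiniState()
fixed.mark_blocked(
    BLOCKER,
    tool_name="interview.safe_default_synthesis",
    error_code=NONCLOSURE_CODE,
)

print(current.last_error_code)
print(fixed.last_error_code)
```

A regression test can make the same two calls and assert that `last_error_code` is non-`None` and a member of the documented alphabet.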

## Sub-tasks

- [ ] **B1 (logic)** — `src/ouroboros/auto/interview_driver.py:440-454`. Decide closure policy when the backend ack is content-only: either extend the safe-default contract with a third closure mode (alongside `mutual_agreement` and `ledger_only`) that accepts "backend echoed, ledger satisfied" as a close, or fail forward into a deterministic `ledger_only` close instead of reverting defaults. Document the chosen policy on #1167.
- [ ] **B2 (envelope)** — Add `INTERVIEW_SAFE_DEFAULT_SYNTHESIS_NONCLOSURE` (or equivalent — must be drawn from the #1151 alphabet) and pass it as `error_code=` at both `mark_blocked` call sites in `interview_driver.py:434, 449`. Add a regression test under `tests/auto/` asserting that any safe-default blocker emits a non-`None` `last_error_code` from the documented alphabet.

## Test-harness coordination

- **#1218** (open) fixes the test harness that previously masked this evidence behind a `TypeError`. Reproduction above assumes #1218 has merged.
- **#1214** (open) adds a `_Ok` test stub with method-shape `is_ok()` / `unwrap_err()` that mirrors the broken API. When #1218 lands, that stub must be aligned with the real `Result` property API; otherwise it silently re-introduces shape drift. See coordination note on #1218.
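
The shape drift is easy to reproduce in a few lines. This sketch assumes, per the note above, that the real `Result` exposes `is_ok` as a property; `PropertyResult` and `MethodStub` are hypothetical stand-ins, not the project's types.

```python
class PropertyResult:
    """Stand-in for a Result whose is_ok is a property (assumed real shape)."""

    def __init__(self, ok: bool) -> None:
        self._ok = ok

    @property
    def is_ok(self) -> bool:
        return self._ok


class MethodStub:
    """Stand-in for a test stub that exposes is_ok() as a method."""

    def __init__(self, ok: bool) -> None:
        self._ok = ok

    def is_ok(self) -> bool:
        return self._ok


def method_style_check(result) -> bool:
    # Harness code written against the method shape.
    return result.is_ok()


# Method-style code passes against the stub...
stub_passes = method_style_check(MethodStub(True))

# ...but raises TypeError against the property shape,
# because the property already returned a bool.
try:
    method_style_check(PropertyResult(True))
    real_error = None
except TypeError as exc:
    real_error = exc

# The reverse drift is silent: property-style code reads the stub's bound
# method, which is always truthy, even for a failed result.
silently_truthy = bool(MethodStub(False).is_ok)

print(stub_passes, type(real_error).__name__, silently_truthy)
```

The second case is the more dangerous one: a stub with the wrong shape does not crash, it simply reports every result as ok.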

## Prior art / related work

- **#1138** (merged) — safe-default decision-point observability baseline.
- **#1167 / PR-B2** (merged) — added safe-default closure mode + partial-unsafe blocker code; the rollback policy at the heart of B1.
- **#1151** (merged) — canonical 8-code `stop_reason_code` for interview-layer blockers; B2 fills a gap in that alphabet.
- **#1211** (open) — emits structured `auto.interview.safe_default_synthesis_nonclosure` event. Observability-only by design; complementary to this issue, not a substitute.
- **#1122 / #1129** — earlier safe-default work referenced by #1138.

## Cross-refs
- SSOT: #1157
- Harness unblock: #1218
- Logic baseline: #1167, #1122, #1129
- Envelope baseline: #1151
- Observability complement (open): #1211
- Observability base (merged): #1138
- Harness stub coordination: #1214

## Constraints (per evidence-driven minimal-substrate policy)
- No second live R1 run until B1 lands: a rerun would reproduce the same evidence at roughly $1 of LLM cost.
- No new substrates or abstractions; both sub-tasks are edits to existing modules.
- No direct push to `main`.
