T5CoreMLEmbedder CPU fallback path never recovers: 1,038/1,038 fail in production, dead weight #93

## Problem

The CPU fallback path added by #87 (and improved with diagnostics in #90) **does not recover any inferences in practice**. Every attempt to retry an ANE-failed inference on `.cpuOnly` also fails, and the embedder rethrows the original IOSurface error.

What's actually keeping bulk-index alive in production is the **proactive model reload every `reloadInterval` calls** (also from #87). The CPU fallback is dead weight: it costs an extra (failed) `MLModel` load + predict per failure, produces no successful results, and contributes nothing to the recovery story.
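To make the Layer 2 mechanism concrete, here is a minimal sketch of a proactive reload on a fixed call interval. It is not the embedder's code: `ReloadingRunner`, `Predict`, `loadModel`, and `reloadCount` are hypothetical names, a plain closure stands in for the loaded `MLModel`, and only `reloadInterval` comes from the issue. The real implementation may count calls or trigger the reload differently.

```swift
// Hypothetical stand-in for a loaded model: a closure that maps input to output.
typealias Predict = (Int) -> Int

/// Sketch of a proactive reload: once `reloadInterval` calls have been served
/// by the current model instance, a fresh instance is loaded before the next call.
final class ReloadingRunner {
    private let reloadInterval: Int
    private let loadModel: () -> Predict
    private var predict: Predict
    private var callsSinceLoad = 0
    private(set) var reloadCount = 0

    init(reloadInterval: Int, loadModel: @escaping () -> Predict) {
        precondition(reloadInterval > 0, "reloadInterval must be positive")
        self.reloadInterval = reloadInterval
        self.loadModel = loadModel
        self.predict = loadModel()
    }

    func run(_ input: Int) -> Int {
        if callsSinceLoad >= reloadInterval {
            // Replace the model instance unconditionally, whether or not
            // any call has failed yet.
            predict = loadModel()
            callsSinceLoad = 0
            reloadCount += 1
        }
        callsSinceLoad += 1
        return predict(input)
    }
}
```

The point of the sketch is that the reload is unconditional and does not depend on a failure having been observed, which is what distinguishes it from the reactive paths in Layer 3.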

## Production Data

A SafariUnfucker bulk-index run (2026-05-07) completed 5,570 inferences successfully and failed on 1,038:

```
1038  cpu_fallback_failed  (every failure)
1038  Failed to allocate E5 buffer object. E5RT: Failed to allocate memory IOSurface object. (3)  (every failure's reason)
0     recovered_iosurface_exhaustion  (none)
```

`category: "cpu_fallback_failed"` fires when three things happen in sequence: the ANE prediction throws the IOSurface allocation error, the CPU fallback is attempted, and the CPU attempt also throws. Across 1,038 opportunities, **CPU fallback recovered zero inferences**.

The ANE pool did recover during the run: there were three failure bursts, each followed by a recovery period of 5–25 minutes. The recovery timing is consistent with the proactive reload (Layer 2) firing at a 500-call boundary, not with CPU fallback.
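The tallies above can be re-derived from the failure log with a short script. The sketch below assumes each JSONL row is a JSON object with a string `category` field, as the category names in this issue suggest. `tallyCategories` and `recoveryRate` are hypothetical helpers, not part of the package.

```swift
import Foundation

/// Counts rows per `category` in a JSONL string.
/// Lines that are not JSON objects, or that lack a string `category`, are skipped.
func tallyCategories(jsonl: String) -> [String: Int] {
    var counts: [String: Int] = [:]
    for line in jsonl.split(separator: "\n") {
        guard let object = try? JSONSerialization.jsonObject(with: Data(line.utf8)),
              let row = object as? [String: Any],
              let category = row["category"] as? String
        else { continue }
        counts[category, default: 0] += 1
    }
    return counts
}

/// Fraction of fallback attempts that recovered; 0 when there were no attempts.
func recoveryRate(recovered: Int, attempts: Int) -> Double {
    attempts == 0 ? 0 : Double(recovered) / Double(attempts)
}
```

With the figures reported in this issue, `recoveryRate(recovered: 0, attempts: 1038)` evaluates to `0`, the 0% recovery rate cited in the Summary.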

## Summary

Remove the CPU fallback path (Layer 3b) from `T5CoreMLEmbedder` entirely. The path has a measured 0% recovery rate and adds per-failure overhead (one extra `MLModel` load + `predict` call) without benefit. The proactive reload (Layer 2) is the only mitigation that works; it should be the sole recovery mechanism. This simplifies `predictWindow`, eliminates `cpuPredictorFactory` from the actor's internal state, and removes tests for a dead code path.

> **Note on option (b), logging the CPU-side error:** The issue originally proposed logging the CPU-side error as a diagnostic step before removal. Based on source inspection, `logCPUFallbackFailed` (line 613 of `T5CoreMLEmbedder.swift`) already captures `cpuErrorName`, `cpuErrorReason`, and `cpuCallStack` in the JSONL row per ADR 021's addendum, so option (b) appears to be complete already. The Research stage should verify this against actual production JSONL rows; if the CPU error fields are absent, option (b) must be implemented before option (a), the removal itself, is shipped.

## Requirements

- `cpuPredictorFactory` property is removed from `T5CoreMLEmbedder` (both the stored `let` and all assignments in all `init` paths).
- The Layer 3b CPU fallback branch in `predictWindow` (lines ~539–557) is removed. The reactive reload + ANE retry (Layer 3a, lines ~521–537) is retained.
- When an IOSurface exception is caught and the ANE retry also fails, the embedder rethrows the original error and writes the existing `category: "error"` JSONL row, which is the behavior the current code already follows when `cpuPredictorFactory == nil`. The `cpu_fallback_failed` category is no longer produced.
- ADR 021 is amended to reflect Layer 3b's removal and the production evidence that motivated it.
- Tests for CPU fallback success (`testIOSurfaceFallbackLogsRecovery`) and CPU fallback failure (`testCPUFactoryThrowsLogsDistinctError`, `testCPUPredictThrowsLogsDistinctError`) are removed.
- All remaining tests continue to pass: proactive reload, autoreleasepool, IOSurface error logging, ANE retry.
- No change to the public `init` signatures (`cpuPredictorFactory` was never a public parameter).
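The control flow these requirements describe for `predictWindow` after the removal can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `WindowPredictor`, `IOSurfaceError`, `loadPredictor`, and `loggedCategories` are hypothetical stand-ins, the real embedder is an actor built on CoreML types, and its logging writes JSONL rows rather than appending to an array.

```swift
// Hypothetical stand-ins for the CoreML predictor and the IOSurface failure.
typealias Predict = ([Float]) throws -> [Float]
struct IOSurfaceError: Error {}

final class WindowPredictor {
    private let loadPredictor: () throws -> Predict
    private var predictor: Predict
    // Stand-in for the JSONL failure log: records only the category of each row.
    private(set) var loggedCategories: [String] = []

    init(loadPredictor: @escaping () throws -> Predict) throws {
        self.loadPredictor = loadPredictor
        self.predictor = try loadPredictor()
    }

    func predictWindow(_ window: [Float]) throws -> [Float] {
        do {
            return try predictor(window)
        } catch let original as IOSurfaceError {
            // Layer 3a (retained): reload the model, then retry once on the ANE.
            do {
                predictor = try loadPredictor()
                return try predictor(window)
            } catch {
                // No Layer 3b: log the plain "error" row and rethrow the
                // original error instead of attempting a CPU fallback.
                loggedCategories.append("error")
                throw original
            }
        }
    }
}
```

In this shape the only categories the failure path can emit are the existing ones, so `cpu_fallback_failed` disappears without any schema change.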

## Scope

**In scope:**
- Remove `cpuPredictorFactory` property and all wiring in production init paths.
- Remove `cpuPredictorFactory` parameter from the test-only internal init.
- Remove the Layer 3b branch from `predictWindow`.
- Remove the `logCPUFallbackFailed` private method and `extractCPUErrorFields` helper if they are unused after removal (verify — `extractCPUErrorFields` may have no other callers).
- Remove `cpu_fallback_failed` JSONL category (no longer produced).
- Update ADR 021 with a second addendum documenting this removal and the production evidence.
- Delete the three stress-test cases that cover CPU fallback behavior.

**Out of scope:**
- Changing `reloadInterval` default or adding adaptive byte-pressure reload threshold (tracked separately per ADR 021 addendum).
- Changing any other public API.
- Changes to `MLPredictor`, `CoreMLFailureLogEntry`, or the broader JSONL schema (only the `cpu_fallback_failed` category is retired).

## Prior Art / Context

- ADR 021 documents the original three-layer design rationale and the 2026-05-06 addendum (issue #90 improvements). This issue adds a second addendum.
- Issue #87 — introduced CPU fallback (Layer 3b).
- Issue #90 — added `cpu_fallback_failed` category and cpu error fields to JSONL rows; also added Layer 3a (reactive reload + ANE retry).
- The production evidence (0/1,038 CPU recovery rate) is determinative. No further data gathering is needed before removing the path.

## Risks / Dependencies

- **Option (b) completeness**: If the Research stage finds that `cpu_fallback_failed` JSONL rows do NOT include `cpuErrorName`/`cpuErrorReason` in the actual production files (contradicting the source code), then option (b) should be implemented first so future engineers have a record of why CPU fallback failed before the code is deleted.
- **No public API surface change**: `cpuPredictorFactory` was never exposed in a public init — no callers outside the package are affected.
- **Test-only init signature change**: The internal factory-based test init accepts `cpuPredictorFactory` as a parameter. Removing it changes internal-only API — no downstream impact.
