# Test Plan — `staging/api-hardening` Open-Issue Coverage

## Context

`staging/api-hardening` is 152 commits ahead of `main`. It bundles two waves of work:
1. **MCP API hardening** (epic #245) — standardised error codes, validation helpers, tool-description rewrites, default `output_mode` shrinkage, default-status filter, group-by additions, new `distillery_status`/`store_batch` tools.
2. **Feed/sync hardening** — gh-sync project/tag/author backfill, RSS/GitHub real author, watch URL probe, watch liveness metadata, async background sync jobs, Jina truncation.

Plus security follow-up #112 and CI/CVE #271.

This plan enumerates the **34 open issues** in the table below that have code landed on the branch, plus issues 345–351 (Group E). Each has a self-contained test scenario a Claude subagent can execute. The intent is parallel dispatch: 6 worker subagents each pick up one group, split by isolation requirement (pure unit, in-memory store, real HTTP MCP, CI/manifest, the 345–351 batch, and live E2E journeys).

**Branch under test:** `staging/api-hardening`
**Base for diff:** `main`
**Commit range:** `git log main..staging/api-hardening` (152 commits)

## Open issues with landed work

| # | Title | Surface | Group |
|---|---|---|---|
| 112 | Security follow-up (TLS, ownership, CORS, pin deps, log retention) | embedding/*, mcp/auth.py, mcp/middleware.py, pyproject.toml, mcp/webhooks.py | C |
| 232 | `distillery_store` enum omits `github` | models.py:EntryType, mcp/tools/crud.py:_VALID_ENTRY_TYPES | A |
| 238 | `distillery_store` `output_mode="summary"` | mcp/tools/crud.py:_handle_store | B |
| 240 | `/gh-sync` invalid `output_mode="metadata"` | mcp/tools/feeds.py:_handle_gh_sync | B |
| 241 | label→tag sanitiser fails on underscored labels | feeds/github_sync.py:_sanitize_tag_segment | A |
| 244 | bulk ingest: store_batch + watch --sync-history | mcp/tools/crud.py:_handle_store_batch, feeds.py | B |
| 245 | MCP hardening epic | mcp/tools/_errors.py, _common.py, all handlers | A+B |
| 266 | CASCADE on dropping FTS schema | store/duckdb.py:_rebuild_fts_index | B |
| 269 | setup/watch CronCreate uses MCP not webhook | skills/setup/, skills/watch/ | D |
| 271 | suppress upstream CVEs in Docker base | .grype.yaml | D |
| 274 | history sync exceeds Jina 8194-token limit | feeds/truncation.py, embedding/jina.py | B |
| 276 | --sync history async to avoid timeout | feeds/sync_jobs.py, github_sync.py:sync_batched | B+C |
| 278 | gh-sync use store_batch async pipeline | feeds/sync_jobs.py, github_sync.py | B+C |
| 283 | `group_by='tags'` in `distillery_list` | mcp/tools/crud.py:_handle_list | B |
| 286 | stale `distillery_tag_tree` permission | .claude/settings.local.json | D |
| 301 | classify --batch with filters | mcp/tools/classify.py, cli.py, skills/classify | B |
| 302 | sync uses real author | feeds/github_sync.py, feeds/rss.py, feeds/poller.py | B |
| 303 | dynamic MCP transport for SessionStart hook | scripts/hooks/session_start_briefing.py | C |
| 307 | `distillery_stale` missing → route to list | skills/briefing/SKILL.md, mcp/tools/crud.py | D |
| 308 | watch accepts invalid/unreachable URLs | mcp/tools/feeds.py:_validate_url_syntax/_probe_url | B+C |
| 309 | `distillery_list(source=feed_url)` returns 0 | mcp/tools/crud.py:_build_filters_from_arguments | B |
| 310 | watch list omits liveness metadata | feeds/poller.py, store/duckdb.py, migration 12 | B |
| 311 | list default `output_mode=full` floods context | mcp/tools/crud.py:_handle_list | B |
| 312 | gh-sync project=null tags=[] | feeds/github_sync.py, cli.py:backfill_github_metadata | B |
| 313 | no `distillery_status` MCP tool | mcp/tools/meta.py, mcp/server.py | A |
| 314 | ghost entry_ids on dedup-skip | mcp/tools/crud.py:_handle_store dedup branch | B |
| 315 | `resolve_review` reviewer ignored | mcp/tools/classify.py, mcp/server.py | B |
| 316 | `resolve_review` reclassify leaves pending_review | mcp/tools/classify.py | B |
| 317 | list default includes archived | mcp/tools/crud.py:_apply_default_status_filter | B |
| 330 | docs: stale tool count + self-host guidance in plugin-install.md | docs/install/plugin-install.md | D |
| 332 | `dedup_action="merged"` returns independent new entry_id (PR #341 merged) | mcp/tools/crud.py:_handle_store merge branch | B |
| 333 | `resolve_review` double-approve silently bumps version (PR #339 merged) | mcp/tools/classify.py:_handle_resolve_review | B |
| 334 | watch list liveness fields exposed but never populated (PR #338 merged) | feeds/poller.py, store/duckdb.py liveness writes | B |
| 335 | `source=<url>` vs `feed_url=<url>` diverge (PR #340 merged) | mcp/tools/crud.py:_build_filters_from_arguments | B |

**Update 2026-04-19:** PRs #352, #353, #354, #358, #359, #360, #361 (issues 345–351) are now merged to `staging/api-hardening`. Group E scenarios run against staging directly — no separate branch checkouts required. PRs #356/#357 against `main` are superseded; staging itself is landing on main tonight.

**Group key**
- **A** — pure schema/enum/static check. No runtime needed. Subagent reads source + runs targeted pytest.
- **B** — in-memory async store + handler. Subagent runs `pytest -k <pattern>` against in-memory DuckDB fixture or invokes handler directly.
- **C** — needs running MCP HTTP server (TLS/CORS/transport probe).
- **D** — manifest/skill text/CI config. Read-only assertion against repo files.
- **E** — issues 345–351 (now on staging). Subagent runs the listed pytest + direct calls against the staging worktree.
- **F** — agent-driven E2E user journeys against a live MCP. Subagent acts as a real client, chaining tool calls across multiple issues per scenario.

## Subagent dispatch strategy

Spawn **6 worker subagents in parallel**, each owning one group (A, B, C, D, E, F). A 7th orchestrator (this conversation) collects reports and aggregates pass/fail. All groups share the `staging/api-hardening` worktree; Groups C and F additionally need a running HTTP MCP server.

Per-subagent prompt template:

> You are testing the `staging/api-hardening` branch of distillery2. Check out `staging/api-hardening` in a worktree before starting (or operate in the existing worktree if one is provided). For each scenario in the assigned group, execute the listed steps, capture the actual output, and report PASS/FAIL with one-line evidence (test name, error message, or response snippet). Do not modify source. If a test fixture is missing, mark the scenario BLOCKED. Final report: markdown table `| issue | scenario | result | evidence |`.
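
The final-report format in the prompt template can be produced mechanically. The sketch below is one possible helper, not part of the repository: `ScenarioResult` and `render_report` are hypothetical names, and the allowed result values are taken from the template (PASS, FAIL, BLOCKED).

```python
from typing import List, NamedTuple

# Result values named in the prompt template above.
VALID_RESULTS = {"PASS", "FAIL", "BLOCKED"}


class ScenarioResult(NamedTuple):
    issue: str      # e.g. "#241"
    scenario: str   # short scenario label
    result: str     # one of VALID_RESULTS
    evidence: str   # one-line evidence


def _cell(text: str) -> str:
    # Escape pipes and flatten newlines so each row stays on one table line.
    return text.replace("|", "\\|").replace("\n", " ").strip()


def render_report(rows: List[ScenarioResult]) -> str:
    """Render rows as the `| issue | scenario | result | evidence |` table."""
    lines = ["| issue | scenario | result | evidence |", "|---|---|---|---|"]
    for row in rows:
        if row.result not in VALID_RESULTS:
            raise ValueError(f"unknown result {row.result!r}")
        lines.append("| " + " | ".join(_cell(c) for c in row) + " |")
    return "\n".join(lines)
```

Escaping `|` in the evidence column matters because error messages and response snippets often contain pipes, which would otherwise split the row when the orchestrator aggregates tables.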

Common prerequisite for groups A, B, C, and E (every group that runs pytest or imports handlers):
```bash
git worktree add /tmp/distillery-test staging/api-hardening
cd /tmp/distillery-test
pip install -e ".[dev]" --quiet
```

For group C, additionally:
```bash
distillery-mcp --transport http --port 8765 &
```

---

## Group A — Schema / enum / static (1 subagent)

### #232 — `github` entry type
- Read `src/distillery/models.py` → assert `EntryType.GITHUB == "github"` and `TYPE_METADATA_SCHEMAS["github"]` exists with required keys `{repo, ref_type, ref_number}`.
- Read `src/distillery/mcp/tools/crud.py` → assert `_VALID_ENTRY_TYPES` contains `"github"`.
- Run: `pytest tests/ -k "github_entry_type or entry_type_github" -v`
- **Pass:** all assertions hold + tests green.

### #241 — sanitiser
- Run: `pytest tests/ -k "sanitize_tag or sanitiser or sanitizer" -v`
- Direct call test (subagent inlines):
  ```python
  from distillery.feeds.github_sync import _sanitize_tag_segment
  assert _sanitize_tag_segment("github_actions") == "github-actions"
  assert _sanitize_tag_segment("__a__b__") == "a-b"
  assert _sanitize_tag_segment("Foo.Bar") == "foo-bar"
  assert _sanitize_tag_segment("123abc") == "123abc"
  ```
- **Pass:** no exception, all four equalities hold.
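
For orientation, the four equalities above are consistent with a "lowercase, collapse non-alphanumeric runs to a hyphen, trim hyphens" rule. The sketch below is a stand-in written only from those assertions; it is not the repository's `_sanitize_tag_segment`, and the name `sanitize_tag_segment_sketch` is hypothetical.

```python
import re


def sanitize_tag_segment_sketch(label: str) -> str:
    """Stand-in sanitiser inferred from the four assertions in the plan."""
    lowered = label.lower()
    # Any run of characters outside [a-z0-9] (underscores, dots, spaces)
    # becomes a single hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    # Leading and trailing hyphens are dropped.
    return hyphenated.strip("-")
```

If the real function diverges from this rule on inputs outside the four listed cases, the listed assertions remain the source of truth for PASS/FAIL.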

### #245 — error code surface
- Read `src/distillery/mcp/tools/_errors.py` → assert `ToolErrorCode` has exactly: `INVALID_PARAMS, NOT_FOUND, CONFLICT, INTERNAL, FORBIDDEN, BUDGET_EXCEEDED, RATE_LIMITED`.
- Run: `pytest tests/test_mcp_errors.py -v`
- Run: `pytest tests/ -k "validate_required or validate_enum or validate_limit" -v`
- **Pass:** enum members match + suite green.
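
The "has exactly" check is easy to get wrong with a subset test. A minimal sketch of an exact-membership comparison follows; `diff_enum_members` and the stand-in enum are hypothetical, and in a real run the subagent would pass the actual `ToolErrorCode` class imported from `distillery.mcp.tools._errors`.

```python
from enum import Enum

# Member names listed in the scenario above.
EXPECTED_CODES = {
    "INVALID_PARAMS", "NOT_FOUND", "CONFLICT", "INTERNAL",
    "FORBIDDEN", "BUDGET_EXCEEDED", "RATE_LIMITED",
}


def diff_enum_members(enum_cls, expected):
    """Return members missing from, and unexpected in, an Enum class."""
    actual = set(enum_cls.__members__)  # includes aliases, by name
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


# Stand-in so the sketch runs without the repository on the path.
StandInCodes = Enum("StandInCodes", sorted(EXPECTED_CODES))

if __name__ == "__main__":
    print(diff_enum_members(StandInCodes, EXPECTED_CODES))
```

The scenario passes only when both lists come back empty; reporting the two lists separately gives the one-line evidence the report table asks for.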

### #313 — `distillery_status` registered
- Run: `pytest tests/test_mcp_meta.py -v` (or `pytest tests/ -k status -v`)
- Direct check: import `distillery.mcp.server` and assert `distillery_status` is in the registered tool list (introspect FastMCP instance).
- **Pass:** tool registered, returns dict with keys `{status, version, transport, tool_count, store, embedding_provider}`.

---

## Group B — In-memory store + handler (1 subagent, runs full pytest by topic)

Use `tests/conftest.py` fixtures: `store`, `make_entry`, `deterministic_embedding_provider`.

Each scenario: subagent runs the listed pytest pattern AND inlines a direct handler call to verify response shape against the issue's acceptance criteria.

### #238 / #311 / #317 / #309 / #283 — distillery_list extensions
- `pytest tests/test_mcp_tools/test_list_extensions.py -v`
- Direct calls (subagent writes a tmp pytest file using existing fixtures):
  - `_handle_list({})` → response default `output_mode == "summary"` (#311); archived entry NOT in result (#317).
  - `_handle_list({"include_archived": True})` → archived appears (#317).
  - `_handle_list({"group_by": "tags"})` → response shape `{groups: {...}, total: int}` (#283).
  - `_handle_list({"group_by": "invalid"})` → `INVALID_PARAMS` (#283).
  - Seed feed source `https://example.com/rss`, store entries with `source=https://example.com/rss`, then `_handle_list({"source": "https://example.com/rss"})` → returns the seeded entries (#309).

### #232 / #238 / #314 — distillery_store
- `pytest tests/test_store_dedup_response.py -v` (#314)
- `pytest tests/ -k "store_batch or output_mode_summary or entry_type_github" -v`
- Direct calls:
  - `_handle_store({"content":"x","entry_type":"github","source":"github","metadata":{"repo":"o/r","ref_type":"issue","ref_number":1}})` → `persisted=True`, `dedup_action="stored"` (#232, #314).
  - Re-call same args → `persisted=False`, `dedup_action="skipped"`, `existing_entry_id` matches first id, `similarity` present, `entry_id == existing_entry_id` (#314).
  - `_handle_store({"content":"y","entry_type":"reference","output_mode":"summary"})` → success, response omits `dedup_check` and `conflict_check` keys (#238).
  - `_handle_store({"output_mode":"bogus", ...})` → `INVALID_PARAMS` (#245).
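
The #314 invariants for the first and second store call can be checked with one small validator. This is a sketch under the assumption that the handler response is (or can be parsed into) a plain dict with the field names listed above; `dedup_violations` is a hypothetical helper and covers only the `stored` and `skipped` actions from this section.

```python
from typing import List, Optional


def dedup_violations(response: dict, first_entry_id: Optional[str] = None) -> List[str]:
    """List the #314 invariants that a store response breaks."""
    problems = []
    action = response.get("dedup_action")
    if action == "stored":
        if response.get("persisted") is not True:
            problems.append("stored response must have persisted=True")
        if not response.get("entry_id"):
            problems.append("stored response must carry an entry_id")
    elif action == "skipped":
        existing = response.get("existing_entry_id")
        if response.get("persisted") is not False:
            problems.append("skipped response must have persisted=False")
        if not existing:
            problems.append("skipped response must carry existing_entry_id")
        if response.get("entry_id") != existing:
            problems.append("ghost id: entry_id must equal existing_entry_id")
        if "similarity" not in response:
            problems.append("skipped response must include similarity")
        if first_entry_id is not None and existing != first_entry_id:
            problems.append("existing_entry_id must match the first stored id")
    else:
        problems.append(f"dedup_action {action!r} is not covered by this sketch")
    return problems
```

An empty list means PASS; otherwise the first message is usable as the one-line evidence.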

### #244 — store_batch + watch sync_history
- `pytest tests/ -k "store_batch or sync_history" -v`
- Direct: `_handle_store_batch({"entries":[{...},{...},{...}]})` → response has `entry_ids` (3), `count==3`, `results` list of 3 with `persisted=True` per entry.
- `_handle_watch({"action":"add","source_type":"github","url":"https://github.com/python/cpython","sync_history":True})` → response includes `sync_job` with `job_id`.

### #266 — FTS CASCADE
- `pytest tests/ -k "fts_cascade or rebuild_fts" -v`
- Direct: open store, force `_rebuild_fts_index()` twice in sequence → no exception.

### #283 — covered above.

### #301 — classify --batch
- `pytest tests/test_mcp_classify.py -k "batch or filter" -v`
- CLI smoke: `distillery classify --batch` (no filter) → exit non-zero, stderr contains `at least one filter`.
- `distillery classify --batch --inbox` → exits 0; processes ≤50 entries.

### #302 — real author
- `pytest tests/test_real_author.py -v`
- Direct: feed a fake GitHub issue payload with `user.login=alice` through `GitHubSyncAdapter` → resulting Entry has `author=="alice"` and `metadata["imported_by"]=="gh-sync"`.

### #308 — watch URL validation (handler-level)
- `pytest tests/test_mcp_watch.py -v`
- Direct calls:
  - `_handle_watch({"action":"add","source_type":"rss","url":"not a url"})` → `INVALID_PARAMS` (or `INVALID_URL`); no DB row.
  - `_handle_watch({"action":"add","source_type":"github","url":"owner/repo"})` → accepted (bare slug allowed for github).
  - `_handle_watch({"action":"add","source_type":"rss","url":"owner/repo"})` → rejected.
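
The three expectations above amount to "an http(s) URL is always syntactically fine; a bare `owner/repo` slug is fine only for `github`". The sketch below encodes that rule for reference. It is an illustration, not the repository's `_validate_url_syntax`; the function names and the slug pattern are assumptions.

```python
import re
from urllib.parse import urlparse

# Assumed shape of a bare GitHub slug: owner/repo with common characters.
_GITHUB_SLUG = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def is_http_url(value: str) -> bool:
    """True when value parses as an http(s) URL with a host part."""
    if any(ch.isspace() for ch in value):
        return False
    try:
        parsed = urlparse(value)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def url_syntax_ok(url: str, source_type: str) -> bool:
    """Syntax-only check; reachability is a separate probe (Group C)."""
    if is_http_url(url):
        return True
    return source_type == "github" and bool(_GITHUB_SLUG.match(url))
```

Keeping the syntax check separate from the network probe is what lets this scenario stay in Group B with no HTTP server running.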

### #310 — watch liveness metadata
- `pytest tests/test_mcp_feeds.py -k "liveness or last_polled or item_count" -v`
- `pytest tests/test_poller.py -k "record_poll_status" -v`
- Direct: add a source, poll once via `FeedPoller`, then `_handle_watch({"action":"list"})` → entry includes `last_polled_at`, `last_item_count`, `last_error` (null on success), `next_poll_at`.

### #312 — gh-sync project + tags backfill
- `pytest tests/test_mcp_feeds.py -k "project or backfill" -v`
- Direct: sync a single GitHub issue payload → resulting Entry has `project=="<repo-name>"`, `tags` contains `source/github`, `repo/<name>`, `ref-type/issue`, `state/<x>`.
- CLI: `distillery maintenance backfill-github-metadata --dry-run` → reports counts of entries it WOULD update.

### #315 — reviewer parameter
- `pytest tests/test_mcp_classify.py -k "reviewer or actor or on_behalf_of" -v`
- Direct: call `_handle_resolve_review({"entry_id": id, "action":"approve","reviewer":"bob"})` from server context with `actor="alice"` → entry metadata gains `reviewed_by="alice"`, `reviewed_on_behalf_of="bob"`.
- Same call without delegation (`reviewer="alice"`, `actor="alice"`) → no `*_on_behalf_of` field.

### #316 — reclassify status
- `pytest tests/test_mcp_classify.py -k "reclassify_status or reclassify_pending" -v`
- Direct: seed entry with `status="pending_review"`, call `_handle_resolve_review({"action":"reclassify",...})` → entry `status=="active"`. Repeat with seed `status="archived"` → status remains `archived`.

### #274 — Jina truncation
- `pytest tests/test_truncation.py -v`
- Direct: `truncate_content("x" * 60_000)` returns ≤ 30_000 chars + `[truncated]` suffix.
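
As a reference for what the direct check expects, here is a character-based truncation sketch. It assumes the 30_000-character cap applies to the returned string including the `[truncated]` suffix, which is one reading of the scenario above; the real `truncate_content` may count differently, and `truncate_content_sketch` is a hypothetical name.

```python
TRUNCATION_SUFFIX = "[truncated]"


def truncate_content_sketch(text: str, max_chars: int = 30_000,
                            suffix: str = TRUNCATION_SUFFIX) -> str:
    """Cap text at max_chars total, marking the cut with a suffix."""
    if max_chars < len(suffix):
        raise ValueError("max_chars must leave room for the suffix")
    if len(text) <= max_chars:
        return text  # short inputs pass through untouched
    keep = max_chars - len(suffix)
    return text[:keep] + suffix
```

A character cap is only a proxy for the provider's token limit, so the pytest suite in `tests/test_truncation.py` remains the authority on the exact boundary.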

### #276 / #278 — async sync jobs
- `pytest tests/test_async_sync_pipeline.py -v`
- Direct: kick off `run_sync_job_async(...)` against a stub adapter → `SyncJobTracker.get(job_id)` transitions PENDING→RUNNING→COMPLETED with `pages_processed > 0`.

### #332 — `dedup_action="merged"` ghost id (PR #341 merged on staging)
- `pytest tests/ -k "merge or fold or dedup_action_merged" -v`
- Direct: configure dedup thresholds so a second store call lands in the merge band (≥0.80, <0.95). Call `_handle_store(...)` twice → second response: `dedup_action == "merged"`, `entry_id == first_entry_id` (true fold; no fresh row). Verify with `_handle_get(entry_id=first_entry_id)` — content/refs folded into existing row, version incremented.
- Negative: a "stored" path (similarity < 0.60) must NOT report `dedup_action="merged"`.

### #333 — `resolve_review` double-approve idempotency (PR #339 merged on staging)
- `pytest tests/test_mcp_classify.py -k "double_approve or idempotent or no_op" -v`
- Direct: seed entry with `status="active"`, capture `version=N`. Call `_handle_resolve_review({"action":"approve","entry_id":id})` → response indicates no-op (e.g. `changed: false`); entry `version` still N.
- Repeat for `archive` on already-archived entry: no version bump, no audit-log duplicate.

### #334 — watch list liveness actually populated (PR #338 merged on staging)
- `pytest tests/test_mcp_feeds.py -k "liveness or populate" -v`
- Direct: add a feed source, run `FeedPoller.poll_once()` against a stub adapter, then `_handle_watch({"action":"list"})` → row has non-null `last_polled_at` AND non-null `last_item_count` (not just exposed-but-null). After a forced failure, `last_error` non-null and ≤200 chars.
- Sync path: kick off a `sync_history` job; while RUNNING and after COMPLETED, list shows liveness fields update from sync writes too.

### #335 — `source=<url>` aliases to `feed_url` (PR #340 merged on staging)
- `pytest tests/ -k "source_alias or feed_url_alias" -v`
- Direct: seed feed source `https://x.test/rss` with 3 entries. `_handle_list({"source": "https://x.test/rss"})` and `_handle_list({"feed_url": "https://x.test/rss"})` MUST return identical entry sets. `_handle_list({"source": "manual"})` (enum value) still works as a source-type filter — alias only kicks in when the value parses as a URL.

---

## Group C — Real HTTP MCP server (1 subagent, more setup)

### #112 — security follow-up
Subagent runs:
1. **TLS verify=True audit** — list every `httpx.Client(` and `httpx.AsyncClient(` call site that does not pass `verify=True` on the same line, then confirm each one relies on the default (which is `verify=True`). Any explicit `verify=False` is a FAIL.
   ```
   grep -rn "httpx\.Client\|httpx\.AsyncClient" src/ | grep -v "verify=True"
   ```
   Expected: empty output, OR every remaining match uses the default and no `verify=False` appears anywhere. grep only inspects the matching line, so open multi-line constructor calls by hand.
2. **Ownership on classify** — `pytest tests/ -k "ownership and classify" -v`. Direct: as user-A, store entry; as user-B, call `distillery_classify` on it → `FORBIDDEN`.
3. **CORS** — start HTTP server with default config; `curl -H 'Origin: https://evil.test' -i http://localhost:8765/mcp` → response must NOT echo `Access-Control-Allow-Origin: https://evil.test`. Then start with `cors_allowed_origins=["https://ok.test"]` and confirm allowed origin echoes back.
4. **Dep pinning** — open `pyproject.toml` and assert upper bounds present on `pyyaml`, `httpx`, `fastmcp`, `defusedxml`.
5. **Log retention** — invoke `/api/maintenance` with bearer token; assert response includes `search_logs_pruned: <n>` field. Verify config defaults `search_log_retention_days == 90`.
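
For step 1, a line-by-line scan for `verify=False` complements the grep, because it also catches constructor calls whose arguments are split across several lines. The following is a minimal sketch; `find_verify_false` is a hypothetical helper and the `src/` root is taken from the grep command above.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches `verify=False` with optional spaces around the equals sign.
_VERIFY_FALSE = re.compile(r"\bverify\s*=\s*False\b")


def find_verify_false(root: str) -> List[Tuple[str, int, str]]:
    """Return (path, line number, stripped line) for each verify=False."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if _VERIFY_FALSE.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_verify_false("src"):
        print(f"{path}:{lineno}: {line}")
```

The scan can over-report (comments, or `verify=False` passed to something other than httpx), so each hit still needs a human or subagent look before it is recorded as FAIL.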

### #303 — dynamic transport
- `pytest tests/test_session_start_briefing.py -v`
- Manual: with `DISTILLERY_MCP_URL=http://localhost:8765/mcp` set, run `python scripts/hooks/session_start_briefing.py` → exits 0, prints briefing.
- Unset env, place a `.mcp.json` at cwd with stdio entry → re-run, hook resolves stdio.
- Both env unset and no manifest → fallback to `localhost:8000`; hook reports unreachable cleanly.

### #308 — watch URL probe (HTTP layer)
- `_handle_watch add` against an unreachable host (e.g. `https://nonexistent.invalid/feed.xml`, which fails DNS resolution because `.invalid` is a reserved TLD) → returns `UNREACHABLE_URL` unless `force=True`.
- HEAD-405 host (subagent stands up tiny aiohttp stub on :9001 returning 405 for HEAD, 200 for GET) → watch add succeeds (GET fallback).
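
The HEAD-405 stub does not strictly need aiohttp. The sketch below swaps in the standard library's `http.server` instead, which avoids an extra dependency in the test worktree; the handler and `start_stub` names are hypothetical, and the response body is only a minimal RSS-like placeholder.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Head405Handler(BaseHTTPRequestHandler):
    """Rejects HEAD with 405 and serves a tiny body on GET."""

    def do_HEAD(self):
        self.send_response(405)
        self.send_header("Allow", "GET")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = b'<rss version="2.0"><channel><title>stub</title></channel></rss>'
        self.send_response(200)
        self.send_header("Content-Type", "application/rss+xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_stub(port: int = 9001) -> ThreadingHTTPServer:
    """Start the stub in a daemon thread; call .shutdown() when done."""
    server = ThreadingHTTPServer(("127.0.0.1", port), Head405Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With the stub running, the watch-add call would target `http://127.0.0.1:9001/feed.xml`; passing `port=0` lets the OS pick a free port, readable from `server.server_address[1]`.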

### #276 / #278 — async pipeline end-to-end
- Start server. POST `distillery_watch action=add url=<small repo> sync_history=true` via MCP client → response contains `job_id`.
- Poll `distillery_sync_status job_id=<id>` until `status=="completed"` or 60s timeout. Assert `entries_created > 0` and `errors == []`.
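
The poll-until-completed step benefits from a helper with an explicit deadline, so a stuck job is reported as a timeout rather than hanging the subagent. This is a generic sketch: `fetch_status` stands for whatever callable wraps the `distillery_sync_status` tool call, and the injectable `clock` and `sleep` parameters exist only to make the helper testable without waiting.

```python
import time
from typing import Callable, Iterable


def poll_until_terminal(
    fetch_status: Callable[[], dict],
    timeout_s: float = 60.0,
    interval_s: float = 3.0,
    terminal: Iterable[str] = ("completed",),
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until its "status" is terminal or time runs out."""
    terminal_states = set(terminal)
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status.get("status") in terminal_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job not terminal after {timeout_s}s: {status}")
        sleep(interval_s)
```

After it returns, the scenario's own assertions (`entries_created > 0`, `errors == []`) apply to the returned dict. If the server defines a failure state, adding it to `terminal` avoids waiting out the full timeout on a failed job.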

---

## Group D — Skills, manifests, CI (1 subagent, read-only)

### #269 — CronCreate uses MCP tool calls
- Read `skills/setup/SKILL.md` and `skills/setup/references/cron-payloads.md`. Assert no occurrences of `POST /hooks/poll`, `POST /api/maintenance`, or HTTP-only references in cron sections.
- Assert presence of `distillery_list`, `distillery_watch`, `distillery_store` tool calls in payload examples.
- Read `skills/watch/SKILL.md`. Same assertions.

### #271 — CVE suppression
- Read `.grype.yaml`. Assert ≥40 entries.
- Each suppression has `vulnerability:` and a `justification:` (or `reason:`) field with non-empty content.
- Spot-check that CVE-2026-31790, CVE-2026-4786 are present.

### #286 — stale permission
- Read `.claude/settings.local.json`. Assert `distillery_tag_tree` does NOT appear.

### #307 — stale section routing
- Read `skills/briefing/SKILL.md`. Assert it references `distillery_list` with `stale_days` parameter (not a missing `distillery_stale` tool).
- Grep skills/ for `distillery_stale` → no matches outside historical changelogs.

### #330 — docs: stale tool count + self-host guidance
- Read `docs/install/plugin-install.md` (or `docs/skills/index.md` — wherever total tool count is published).
- Assert published count matches the actual registered tool count (introspect `distillery.mcp.server` or count tool decorators in `src/distillery/mcp/tools/`).
- Assert presence of a self-host section covering (a) `DISTILLERY_CONFIG` env var, (b) HTTP transport with GitHub OAuth, (c) plugin user-scope override.
- If the doc still hardcodes the old count (e.g. "12 tools" when current is higher) → FAIL.

---

## Group E — Issues 345–351 (now on staging)

All seven PRs are merged to `staging/api-hardening`. Tests run against the single staging worktree.

| Issue | PR | Merge commit | Key files |
|---|---|---|---|
| #345 | #353 | `8c5f2be` / `7a199e8` | mcp/tools/classify.py, mcp/tools/crud.py, tests/test_entry_type_suggestions.py |
| #346 | #360 | `00c1698` / `6d70436` | store/duckdb.py, tests/test_store_wal_durability.py |
| #347 | #359 | `69a49e8` / `c75231e` | scripts/hooks/session_start_briefing.py, scripts/hooks/session-start-briefing.sh |
| #348 | #354 | `24c6cc8` / `188591a` | mcp/tools/crud.py, tests/test_conflict.py |
| #349 | #358 | `df25c15` / `4c96143` | store/duckdb.py, tests/test_duckdb_store.py |
| #350 | #352 | `921aab1` / `8fe4382` | .github/workflows/staging-deploy.yml |
| #351 | #361 | `5e4f924` / `ab842ec` + CR rounds | embedding/{jina,openai,errors}.py, mcp/budget.py, mcp/tools/{crud,search}.py |

### #345 (PR #353) — entry_type alias suggestions
- `pytest tests/test_entry_type_suggestions.py tests/test_mcp_classify.py tests/test_corrections.py tests/test_bulk_ingest.py -v`
- Direct calls (one each):
  - `_handle_store({"content":"x","entry_type":"note"})` → `INVALID_PARAMS` with `details.suggestion == "inbox"`, `details.allowed` is the 12-element canonical list, `details.field == "entry_type"`, message contains `Did you mean 'inbox'?`.
  - Repeat for `task` → `idea`, `pr` → `github`, `article` → `bookmark`, `summary` → `digest`, `doc` → `reference`, `contact` → `person`, `repo` → `project`.
  - `entry_type="ZzUnknownZz"` → `INVALID_PARAMS`, `details` present but no `suggestion` key.
  - `entry_type="NOTE"` (case) and `"  note  "` (whitespace) both → suggestion `inbox`.
  - Reclassify path: `_handle_resolve_review({"action":"reclassify","new_entry_type":"note",...})` → same suggestion.
  - Regression guard: no alias key collides with a canonical EntryType value (asserted in `test_entry_type_suggestions.py`).
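
The alias behaviour described above (case and whitespace normalisation, no suggestion for unknown values, no alias colliding with a canonical type) can be summarised in a few lines. The mapping below is copied from the scenario list; the function names are hypothetical and the real lookup lives in the handlers under test.

```python
from typing import Iterable, Optional, Set

# Alias -> suggested canonical entry type, as listed in the scenarios.
ALIAS_SUGGESTIONS = {
    "note": "inbox",
    "task": "idea",
    "pr": "github",
    "article": "bookmark",
    "summary": "digest",
    "doc": "reference",
    "contact": "person",
    "repo": "project",
}


def suggest_entry_type(raw: str) -> Optional[str]:
    """Return the suggested canonical type, or None for unknown input."""
    return ALIAS_SUGGESTIONS.get(raw.strip().lower())


def alias_collisions(canonical: Iterable[str]) -> Set[str]:
    """Alias keys that are also canonical values (should be empty)."""
    return set(ALIAS_SUGGESTIONS) & set(canonical)
```

In a real run, `alias_collisions` would be fed the full 12-element canonical list from `EntryType`; only the eight suggestion targets are known from this plan.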

### #346 (PR #360) — checkpoint-after-write WAL durability
- `pytest tests/test_store_wal_durability.py tests/test_duckdb_store.py -v` (target ≥118 passing)
- Direct: open store, `_handle_store(...)`, then inspect `<db>.wal` size — should be 0 or near-0 after the write (CHECKPOINT flushed it).
- Recovery: simulate replay failure (mock `_sync_initialize` to raise), then trigger recovery → assert WAL renamed to `*.wal.corrupt.<ts>`, NOT unlinked. The original WAL bytes remain on disk under the new name.
- Failure swallowing: monkeypatch the connection so `CHECKPOINT` raises → `_handle_store` still returns `persisted=true`; warning logged.

### #347 (PR #359) — briefing hook tools/list probe
- `bash scripts/hooks/test-hooks.sh` → 34/34 pass.
- Manual reproduction:
  - Stand up a stub HTTP server on :9100 returning `404` on `/health` and a valid JSON-RPC `tools/list` response on `POST /mcp`. Run hook with `DISTILLERY_MCP_URL=http://localhost:9100/mcp` → exit 0, briefing rendered (no longer no-ops on the 404).
  - Stub returning 401 on `/mcp` → hook exits 0 with `[Distillery] briefing disabled — auth failed` on stderr.
  - With `DISTILLERY_BRIEFING_QUIET=1` set, the diagnostic stderr line MUST be suppressed.

### #348 (PR #354) — `include_conflict_prompt` flag
- `pytest tests/test_conflict.py -v`
- Direct calls (seed 3 near-duplicates first):
  - `_handle_store({"content":"new...", ...})` (default) → response `conflicts[*]` carries `entry_id`, `content_preview`, `similarity_score` ONLY. NO `conflict_prompt` key. Total response bytes ≤ ~1KB.
  - `_handle_store({"content":"new...","include_conflict_prompt":true, ...})` → each conflict carries `conflict_prompt`. Response size approx 3x larger (~3KB+ per docs).
  - `output_mode="summary"` (existing bulk-store fast path) still skips dedup+conflict entirely — unchanged behaviour, no regression.

### #349 (PR #358) — FTS WAL replay with overwrite=1
- `pytest tests/test_duckdb_store.py::TestWalFtsReplayHardening -v`
- Direct: `store._rebuild_fts_index()` twice in sequence → no `Cannot drop entry "fts_main_entries"` error.
- Reproduce subprocess SIGKILL test: spawn child that opens store, writes, calls rebuild, then SIGKILLs itself before clean shutdown. Reopen store in parent → no WAL replay error, FTS searchable.
- Inspect rebuild path: confirm `_rebuild_fts_index` calls `PRAGMA create_fts_index(..., overwrite=1)` (no manual `DROP SCHEMA ... CASCADE` left in the code). Confirm a `CHECKPOINT` follows.

### #350 (PR #352) — staging-deploy comment escaping
- Read `.github/workflows/staging-deploy.yml`. Assert:
  - Both PR-comment blocks use `gh pr comment --body-file -` with a `<<'EOF'` heredoc (not `--body "...\`url\`..."`).
  - Comment text mentions `GET /mcp` returns **405** (not 404), and bare hostname returns 404.
- Validate parses: `python3 -c "import yaml; yaml.safe_load(open('.github/workflows/staging-deploy.yml'))"` exits 0.
- Optional live check: the next `/deploy-staging` PR comment renders backticked URLs cleanly (no `%5C%60` in Fly access logs).

### #351 (PR #361) — embedding budget default unlimited
- `pytest tests/test_budget.py tests/test_embedding.py tests/test_mcp_errors.py tests/test_mcp_coverage_gaps.py -v` (target 261 passing)
- Direct:
  - Read `src/distillery/config.py` → `embedding_budget_daily` default == `0`.
  - With default config, run 600 embed calls in a loop (mock provider returning fast) → no `EmbeddingBudgetError`. Set `embedding_budget_daily=10` and run 11 calls → 11th raises `EmbeddingBudgetError`.
  - 429 path: monkeypatch Jina/OpenAI client to return HTTP 429 with `Retry-After: 12` after retry exhaustion → `EmbeddingRateLimitError` raised; MCP tool surfaces `RATE_LIMITED` with `details.provider`, `details.endpoint`, `details.http_status==429`, `details.retry_after==12`. WARNING line in logs includes provider name.
  - 429 without `Retry-After` header → error still raised, `retry_after` field absent or null.
  - Follow-on commits (`b247f3e`, `516b694`, `b983903`): OpenAI.embed() routes through embed_batch() (structured errors), non-finite `Retry-After` values pinned, provider errors propagate through store dedup precheck. Spot-check: inject `Retry-After: inf` → error surfaces with `retry_after` clamped/omitted (no traceback).
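
The `Retry-After` spot-checks (a numeric value, a missing header, and a non-finite value such as `inf`) can be reasoned about with a small parser. This sketch takes the "omit" branch of the "clamped/omitted" wording above and handles only the delta-seconds form of the header; `parse_retry_after` is a hypothetical name, not the repository's function.

```python
import math
from typing import Optional


def parse_retry_after(header_value: Optional[str]) -> Optional[float]:
    """Parse a delta-seconds Retry-After value; None when unusable."""
    if header_value is None:
        return None  # header absent
    try:
        seconds = float(header_value.strip())
    except ValueError:
        return None  # e.g. an HTTP-date, which this sketch does not parse
    # float() accepts "inf" and "nan", so reject non-finite values here.
    if not math.isfinite(seconds) or seconds < 0:
        return None
    return seconds
```

The explicit `math.isfinite` guard is the important part: without it, `float("inf")` would flow into the error details and any downstream sleep or JSON serialisation.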

---

## Group F — Agent-driven E2E user journeys (1 subagent, live MCP)

Each scenario drives a **live staging MCP** as a real client would: sequential tool calls across multiple issues, verifying behaviour observable from outside the server. No pytest — the subagent speaks MCP JSON-RPC (or uses the `distillery` CLI / Python client) and inspects responses.

**Setup (once per run):**

```bash
git worktree add /tmp/distillery-e2e staging/api-hardening
cd /tmp/distillery-e2e
pip install -e ".[dev]" --quiet
rm -f /tmp/distillery-e2e.db*
DISTILLERY_DB_PATH=/tmp/distillery-e2e.db \
DISTILLERY_AUTH_ALLOW_LOOPBACK=1 \
DISTILLERY_EMBEDDING_PROVIDER=deterministic \
distillery-mcp --transport http --port 8765 &
export MCP=http://localhost:8765/mcp
sleep 2 && curl -sf $MCP -X POST -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | jq '.result.tools | length'
# expect: tool count > 12 (includes distillery_status, distillery_store_batch)
```

**Agent-driver protocol.** The subagent MUST call MCP tools via JSON-RPC over HTTP (or the Python `FastMCP` client), NOT by importing handlers. Every scenario must close with a cleanup that drops or archives every entry/source it created.
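
For subagents that build JSON-RPC requests by hand, each tool invocation is a `tools/call` request whose `params` carry the tool `name` and an `arguments` object. The sketch below only constructs the payload; `build_tool_call` is a hypothetical helper, and sending it (including the `initialize` handshake and session handling that MCP over HTTP requires) is left to the client library.

```python
import itertools
import json
from typing import Any, Dict, Optional

_request_ids = itertools.count(1)


def build_tool_call(name: str,
                    arguments: Optional[Dict[str, Any]] = None,
                    request_id: Optional[int] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids) if request_id is None else request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    payload = build_tool_call(
        "distillery_store",
        {"content": "Research note about Claude prompt caching TTL",
         "entry_type": "inbox"},
    )
    print(json.dumps(payload, indent=2))
```

Using the Python `FastMCP` client mentioned above is usually simpler, since it performs the handshake itself; the hand-built payload is mainly useful for `curl`-style evidence in the report.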

Each scenario's pass criterion is a cross-cutting predicate — not just "a single field equals X", but "the chained workflow an agent would run actually works".

---

### F1 — Capture-to-classify round trip (covers #245, #311, #313, #314, #317, #332, #333, #348)

1. `distillery_status` → returns `{status:"ok", tool_count, transport:"http"}`. Record `tool_count`.
2. `distillery_store content="Research note about Claude prompt caching TTL" entry_type="inbox"` → response has `persisted:true`, `dedup_action:"stored"`, `entry_id` set, `conflicts` present but each conflict object has NO `conflict_prompt` key (default off).
3. `distillery_store content="Research note about Claude prompt caching TTL" entry_type="inbox"` (same content) → `persisted:false`, `dedup_action:"skipped"`, `entry_id == existing_entry_id` (no ghost). Then `distillery_get entry_id=<returned>` succeeds.
4. `distillery_store content="Research note about Claude prompt caching time-to-live" entry_type="inbox"` (near-dup, merge band) → `dedup_action:"merged"`, `entry_id == first_entry_id`. Version on first entry incremented by 1 (verify with get).
5. `distillery_list` (no args) → default `output_mode="summary"` (response bytes < 1.5KB for 1 entry; no `conflicts`/`versions`/`metadata` on rows). No archived entries included.
6. `distillery_resolve_review entry_id=<id> action="approve"` → on already-active entry, response is no-op (`changed:false`), version NOT bumped (#333 regression guard).
7. `distillery_store content="Similar research on cache" entry_type="inbox" include_conflict_prompt=true` → each conflict object NOW has `conflict_prompt` (~1KB string). Response bytes ≥ 2x a default call.
8. Cleanup: archive created entries.

**Pass:** every assertion above holds. **Fail:** any response shape or field missing.

---

### F2 — Entry-type alias suggestion flow (covers #232, #245, #345)

1. `distillery_store content="todo: wire radar digest" entry_type="note"` → error `code:"INVALID_PARAMS"`, message contains `Did you mean 'inbox'?`, `details.suggestion == "inbox"`, `details.allowed` is a 12-element array including `"github"`, `details.provided == "note"`.
2. Retry with `entry_type="inbox"` → success.
3. `distillery_store content="gh-17" entry_type="pr"` → suggestion `"github"`. Retry with `"github"` + required metadata `{repo, ref_type:"pr", ref_number:17}` → success.
4. `distillery_store content="x" entry_type="ZzZz"` → `INVALID_PARAMS`, `details` present but `details.suggestion` key absent.
5. `distillery_store content="x" entry_type="  NOTE  "` (case + whitespace) → still suggests `inbox`.
6. Cleanup.

**Pass:** alias suggestions work on the `store` path, including case and whitespace normalisation; unknown types still get structured details. (The reclassify path is exercised in Group E, #345.)

---

### F3 — Watch: URL validation → liveness → async sync (covers #276, #278, #302, #308, #310, #312, #334, #335)

1. `distillery_watch action="add" source_type="rss" url="not a url"` → `INVALID_PARAMS` (or `INVALID_URL`), nothing persisted.
2. `distillery_watch action="add" source_type="rss" url="https://nonexistent.invalid.test/feed.xml"` → `UNREACHABLE_URL` (or similar), not persisted. Retry with `force=true` → persists.
3. `distillery_watch action="add" source_type="github" url="https://github.com/norrietaylor/distillery" sync_history=true` → response includes `sync_job.job_id`. Remember the id.
4. Poll `distillery_sync_status job_id=<id>` every 3s until `status == "completed"` or 90s timeout. Assert `entries_created > 0` and `errors == []`.
5. `distillery_watch action="list"` → the GitHub source has non-null `last_polled_at` AND non-null `last_item_count` (#334 — fields actually populated, not just exposed), non-null `next_poll_at`. `last_error` is null.
6. `distillery_list source="https://github.com/norrietaylor/distillery"` AND `distillery_list feed_url="https://github.com/norrietaylor/distillery"` → identical result sets (#335 alias).
7. Pick one entry; assert `entry.project == "distillery"` (#312), `entry.tags` contains `source/github` and `repo/distillery` and a `ref-type/*`, `entry.author` is a real GitHub login (#302 — not `"gh-sync"` literal).
8. Cleanup: `distillery_watch action="remove" url=...`, archive ingested entries.

**Pass:** the full ambient-intel path works end-to-end; the agent can trust the liveness table and the real-author metadata for downstream skills.

---

### F4 — Classify batch + review queue (covers #301, #315, #316)

1. Seed 5 entries via `distillery_store_batch` with `entry_type="inbox"`, distinct content, all with `status="pending_review"` forced (or via classification that routes to review).
2. `distillery_classify` batch endpoint (via CLI: `distillery classify --batch`) with no filter → exit non-zero, stderr contains `at least one filter`.
3. `distillery classify --batch --inbox` → exits 0; processes all 5; output reports counts by disposition.
4. `distillery_resolve_review entry_id=<id-1> action="approve" reviewer="bob"` called as actor `alice` → entry metadata: `reviewed_by:"alice"`, `reviewed_on_behalf_of:"bob"` (#315).
5. Same call with `reviewer="alice"` (= actor) → no `*_on_behalf_of` field written.
6. `distillery_resolve_review entry_id=<id-2> action="reclassify" new_entry_type="reference"` (entry is `pending_review`) → post-state `status == "active"` (#316), not still pending.
7. `distillery_list` (default) → the reclassified entry appears (no longer hidden from default view).
8. Cleanup.

**Pass:** review-queue exits align with reviewer/actor audit expectations; batch CLI composes filters cleanly.
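
The audit expectation in steps 4 and 5 can be written down as a small checker. This is a sketch under stated assumptions: the field names `reviewed_by` and `reviewed_on_behalf_of` come from step 4, while the helper names are hypothetical and treating an omitted `reviewer` like `reviewer == actor` is an assumption, not something the plan states.

```python
from typing import Optional


def expected_review_metadata(actor: str, reviewer: Optional[str]) -> dict:
    """Audit fields expected after resolve_review, per steps 4 and 5."""
    expected = {"reviewed_by": actor}
    # Assumption: an omitted reviewer behaves like reviewer == actor.
    if reviewer is not None and reviewer != actor:
        expected["reviewed_on_behalf_of"] = reviewer
    return expected


def check_review_metadata(metadata: dict, actor: str, reviewer: Optional[str]) -> list:
    """Return human-readable mismatches; an empty list means pass."""
    problems = []
    expected = expected_review_metadata(actor, reviewer)
    for key, value in expected.items():
        if metadata.get(key) != value:
            problems.append(f"{key}: expected {value!r}, got {metadata.get(key)!r}")
    # Step 5: no *_on_behalf_of field may be written when reviewer == actor.
    if "reviewed_on_behalf_of" not in expected:
        stray = [k for k in metadata if k.endswith("_on_behalf_of")]
        problems.extend(f"unexpected field {k}" for k in stray)
    return problems
```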

---

### F5 — WAL durability + FTS replay (covers #266, #346, #349)

1. Start staging MCP against a fresh on-disk DB (not in-memory).
2. Store 10 entries in rapid succession.
3. Trigger FTS rebuild (either via a `distillery_search` call that forces rebuild, or direct CLI maintenance: `distillery maintenance rebuild-fts`). Do it twice in a row — no `Cannot drop entry "fts_main_entries"` error.
4. Kill the server with SIGKILL (not SIGTERM). Restart it against the same DB path.
5. `distillery_list` → all 10 entries present (no WAL discarded by recovery path #346).
6. Look in the DB directory: any `.wal.corrupt.<ts>` files are preserved (if recovery fired). No silently-unlinked WALs.
7. `distillery_search query="..."` → FTS operational, returns expected hits.
8. Cleanup.

**Pass:** a hard kill (SIGKILL) after a burst of writes does not lose committed entries; operators retain any corrupt WAL for forensics.
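
Steps 4 to 6 can be driven with a few helpers like the ones below. This is a sketch only: the helper names are hypothetical, the glob assumes the quarantined files sit next to the database and carry `.wal.corrupt.` in their name (as step 6 describes), and `Popen.kill()` sends SIGKILL on POSIX only.

```python
import subprocess
from pathlib import Path


def hard_kill(proc: subprocess.Popen) -> int:
    """Stop the server without any cleanup; on POSIX Popen.kill() sends SIGKILL."""
    proc.kill()
    return proc.wait(timeout=10)


def lost_entries(ids_before: set, ids_after: set) -> set:
    """Entry ids that were stored before the kill but are gone after restart."""
    return ids_before - ids_after


def preserved_corrupt_wals(db_dir: Path) -> list:
    """Names of quarantined WAL files matching the `.wal.corrupt.<ts>` pattern."""
    return sorted(p.name for p in db_dir.glob("*.wal.corrupt.*"))
```

Step 5 passes when `lost_entries` returns an empty set. Step 6 is informational: an empty list from `preserved_corrupt_wals` is fine if recovery never fired.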

---

### F6 — Briefing hook dynamic transport (covers #303, #347)

The subagent runs the hook itself, not inside a Claude Code session.

1. With `DISTILLERY_MCP_URL=http://localhost:8765/mcp` set, run `python scripts/hooks/session_start_briefing.py` → exit 0, briefing text on stdout (recent entries, corrections, radar).
2. Unset env. Create a temp dir with a `.mcp.json` pointing at the same HTTP URL. Run the hook from that dir → resolves via `.mcp.json`, exit 0.
3. Stand up a stub on :9100 that returns 404 on `/health` and a valid JSON-RPC `tools/list` on `POST /mcp`. Set `DISTILLERY_MCP_URL=http://localhost:9100/mcp` → hook exits 0 with briefing rendered (no silent no-op on `/health` 404).
4. Stub returning 401 on `/mcp` → hook exits 0, stderr has `[Distillery] briefing disabled`.
5. Re-run step 4 with `DISTILLERY_BRIEFING_QUIET=1` → stderr silent.

**Pass:** hook resolves transport from the full env/manifest chain and no longer requires a `/health` sibling.
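
The stub from step 3 could look like the following. It is a minimal sketch using only the standard library. It assumes the hook accepts a plain JSON response body (rather than an event stream) and is satisfied by an empty `tools` array; both are assumptions to verify against the hook. The class and function names are illustrative.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubMCPHandler(BaseHTTPRequestHandler):
    """404 on every GET (including /health); JSON-RPC result on POST /mcp."""

    def do_GET(self):
        self.send_error(404)

    def do_POST(self):
        if self.path != "/mcp":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        # Echo the request id back, as JSON-RPC 2.0 requires.
        body = json.dumps(
            {"jsonrpc": "2.0", "id": request.get("id"), "result": {"tools": []}}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the scenario output quiet


def start_stub(port: int = 9100) -> HTTPServer:
    """Serve the stub on a daemon thread; call .shutdown() when done."""
    server = HTTPServer(("127.0.0.1", port), StubMCPHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

For step 4, the same handler can be varied by replacing the body of `do_POST` with `self.send_error(401)`.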

---

### F7 — Embedding 429 surfacing (covers #245, #351)

1. Start the server with `DISTILLERY_EMBEDDING_PROVIDER=openai` and a stub OpenAI endpoint configured to always return HTTP 429 with `Retry-After: 12`.
2. `distillery_store content="..." entry_type="inbox"` → error, code `INVALID_PARAMS`, `details.provider == "openai"`, `details.endpoint` set, `details.http_status == 429`, `details.retry_after == 12`. No stack trace leaked in message.
3. Server logs a WARNING line with provider context.
4. Flip stub to return 429 without `Retry-After` header → `details.retry_after` absent or null; error still structured.
5. Flip stub to return `Retry-After: inf` (non-finite) → error surfaces, `retry_after` clamped/omitted, no exception (regression guard for `b247f3e`).
6. Restore normal stub. Confirm `embedding_budget_daily == 0` in config — run 500 sequential stores, none hit `EmbeddingBudgetError`.
7. Set `embedding_budget_daily=5`; on the 6th store → `EmbeddingBudgetError` surfaced as a structured MCP error.

**Pass:** upstream provider throttling is the rate limiter; the local budget is opt-in.
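
Steps 2, 4 and 5 all hinge on how the `Retry-After` header is turned into `details.retry_after`. One way that parsing could behave is sketched below. This is not the project's implementation: the function name and the one-hour cap are assumptions, and the HTTP-date form of the header is deliberately treated as unusable to keep the sketch short.

```python
import math
from typing import Optional


def parse_retry_after(header: Optional[str], max_seconds: float = 3600.0) -> Optional[float]:
    """Parse a delay-seconds Retry-After value; return None when unusable."""
    if header is None:
        return None  # step 4: header absent
    try:
        seconds = float(header.strip())
    except ValueError:
        return None  # HTTP-date form or garbage: omitted in this sketch
    if not math.isfinite(seconds) or seconds < 0:
        return None  # step 5: "inf", "nan" and negatives must not propagate
    return min(seconds, max_seconds)  # assumed cap; the plan only says "clamped/omitted"
```

The test stub in steps 1, 4 and 5 then only needs to vary the header value; the expected `retry_after` for each case follows from this function.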

---

### F8 — Security perimeter (covers #112)

1. `curl -i -X POST -H 'Origin: https://evil.test' -H 'content-type: application/json' -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' $MCP` → response does NOT echo `Access-Control-Allow-Origin: https://evil.test`.
2. Restart server with `DISTILLERY_CORS_ORIGINS=https://ok.test`. Repeat curl with `Origin: https://ok.test` → response echoes `Access-Control-Allow-Origin: https://ok.test`.
3. As user-A (one OAuth identity or API key), store entry. As user-B, call `distillery_classify` against that entry_id → `FORBIDDEN` (#112 P2).
4. POST `/api/maintenance` with valid bearer → response body includes `search_logs_pruned: <n>`. Verify default retention: `grep search_log_retention_days src/distillery/config.py` shows default 90.
5. Grep for `verify=False` in `src/` → zero hits (TLS verification is never disabled; #112 P1).
6. Open `pyproject.toml` → `pyyaml`, `httpx`, `fastmcp`, `defusedxml` all have upper bounds (#112 P4).

**Pass:** the server does not echo unconfigured origins, enforces ownership on classify, prunes search logs, never disables TLS verification, and caps its dependency versions.
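
Step 6 can be automated rather than eyeballed. The sketch below checks requirement strings for an upper bound. It is a heuristic, not a full PEP 508 parser, and the helper names are hypothetical: it treats `<`, `<=`, `==` and `~=` as capping the version and ignores environment markers.

```python
import re


def has_upper_bound(requirement: str) -> bool:
    """True if the requirement string caps the version from above."""
    spec = requirement.split(";", 1)[0]  # drop environment markers
    return bool(re.search(r"(<=?|==|~=)\s*\d", spec))


def unbounded(requirements: list, names: list) -> list:
    """Names from `names` that are missing or lack an upper bound."""
    found = {}
    for req in requirements:
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match:
            # Normalise so "Py_YAML" and "py-yaml" compare equal.
            name = match.group(1).lower().replace("_", "-")
            found[name] = has_upper_bound(req)
    return [n for n in names if not found.get(n, False)]
```

On Python 3.11+ the requirement list could be read with `tomllib` from `pyproject.toml`; where exactly the dependencies live in that file is project-specific and not assumed here.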

---

### F9 — Bulk ingest + dedup contract (covers #238, #244, #311, #314, #332, #348)

1. `distillery_store_batch entries=[{...} x 20]` with mixed content (some near-dup, most distinct). Response: `entry_ids` length 20, `count == 20`, `results[i].persisted` varies per item, `results[i].dedup_action` ∈ `{"stored","skipped","merged"}`.
2. Per-item error isolation: inject one invalid entry (missing required metadata for `entry_type="github"`) into the batch → batch returns partial success; other 19 persist; the bad one has `results[i].error` populated, no exception leaked.
3. Call `distillery_list` with default paging → reflects only the unique/non-merged entries (no ghosts).
4. Call `distillery_store_batch` with `output_mode="summary"` against the same content → each item's response object is minimal (no conflicts, no dedup preview). Measured response bytes ≤ 30% of full-mode.
5. Cleanup.

**Pass:** bulk path correctly isolates per-item failures and honours the summary contract.
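
The response contract in steps 1, 2 and 4 can be checked mechanically. The sketch below assumes the response is a dict with the field names used in step 1 (`entry_ids`, `count`, `results[i].persisted`, `results[i].dedup_action`, `results[i].error`); the helper names are hypothetical, and the length checks reflect step 1 only.

```python
ALLOWED_DEDUP_ACTIONS = {"stored", "skipped", "merged"}


def check_batch_response(response: dict, submitted: int) -> list:
    """Return contract violations for a store_batch response (empty = pass)."""
    problems = []
    results = response.get("results", [])
    if response.get("count") != submitted:
        problems.append(f"count is {response.get('count')!r}, expected {submitted}")
    if len(response.get("entry_ids", [])) != submitted:
        problems.append("entry_ids length does not match submitted batch size")
    if len(results) != submitted:
        problems.append("results length does not match submitted batch size")
    for i, item in enumerate(results):
        if item.get("error"):
            continue  # a per-item failure is allowed; it must not abort the batch
        if item.get("dedup_action") not in ALLOWED_DEDUP_ACTIONS:
            problems.append(f"results[{i}].dedup_action is {item.get('dedup_action')!r}")
        if not isinstance(item.get("persisted"), bool):
            problems.append(f"results[{i}].persisted is not a boolean")
    return problems


def summary_within_budget(summary_body: bytes, full_body: bytes, ratio: float = 0.30) -> bool:
    """Step 4: summary-mode response bytes must be at most 30% of full-mode."""
    return len(summary_body) <= ratio * len(full_body)
```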

---

### F10 — Docs/skills/catalog alignment (covers #232, #245, #269, #286, #307, #330)

1. `distillery_status` → records `tool_count`.
2. Parse `docs/install/plugin-install.md` (or equivalent published doc) — assert the published count matches (#330).
3. Call `tools/list` via JSON-RPC — assert `distillery_stale` is NOT in the returned tools (#307 — routed to list instead). Assert `distillery_status` IS present (#313).
4. Read `skills/briefing/SKILL.md` — references `distillery_list stale_days=30`, not `distillery_stale`.
5. Read `skills/setup/SKILL.md` and `skills/watch/SKILL.md` — all CronCreate examples use MCP tool calls, not `POST /hooks/*` (#269).
6. Read `.claude/settings.local.json` — no `distillery_tag_tree` permission (#286).
7. Introspect each entry-type-accepting tool's description string via `tools/list` — every value of `EntryType` appears in the description (#232 regression guard; doc drift = fail).

**Pass:** surfaces agents rely on (docs, skill prompts, tool catalog, permissions) agree with the runtime.
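
The regression guard in step 7 could be scripted as follows. This is a sketch with assumptions called out: it identifies "entry-type-accepting" tools by the presence of an `entry_type` property in the tool's `inputSchema`, and it uses a plain substring match on the description, which can pass spuriously when one value is contained in another word. The function name is hypothetical.

```python
def missing_entry_types(tools: list, entry_types: list, param: str = "entry_type") -> dict:
    """Map tool name to the entry-type values absent from its description."""
    gaps = {}
    for tool in tools:
        props = tool.get("inputSchema", {}).get("properties", {})
        if param not in props:
            continue  # tool does not accept an entry type
        description = tool.get("description", "")
        missing = [value for value in entry_types if value not in description]
        if missing:
            gaps[tool["name"]] = missing
    return gaps
```

`tools` is the `result.tools` array from `tools/list`, and `entry_types` would be the values of `EntryType`. An empty dict means no doc drift.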

---

**Group F teardown:**

```bash
kill %1                         # stop MCP server
rm -f /tmp/distillery-e2e.db*
git worktree remove /tmp/distillery-e2e
```

**Group F subagent prompt template:**

> You are the Group F agent-driver. A staging Distillery MCP is running on `$MCP` and you are authenticated as loopback. For each F-scenario, execute every step AS IF YOU WERE A REAL MCP CLIENT: call tools via JSON-RPC over HTTP (`curl` or `requests`), NEVER by importing Python handlers. Before each scenario, snapshot `distillery_list(limit=0)` count. After each scenario, run the documented cleanup and confirm the count returns to snapshot ±0. Any unhandled exception, non-2xx response code on a step expected to succeed, or schema mismatch is a FAIL. Report `| scenario | issues | result | evidence |`, where evidence is the single curl/Python line that failed (or "all steps ok") per scenario.
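
To keep every step on the wire rather than in-process, the driver can build its requests with helpers like these. The `tools/call` envelope follows the MCP JSON-RPC shape; the helper names are hypothetical, and transport details such as required headers or session handling are left out because they depend on the server configuration.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tools_call(name: str, arguments: dict) -> bytes:
    """Encode an MCP `tools/call` JSON-RPC 2.0 request body."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    ).encode()


def count_restored(snapshot: int, after_cleanup: int) -> bool:
    """Cleanup check: the entry count must return to the snapshot exactly."""
    return after_cleanup == snapshot


def report_row(scenario: str, issues: list, passed: bool, evidence: str = "all steps ok") -> str:
    """Format one `| scenario | issues | result | evidence |` row."""
    result = "PASS" if passed else "FAIL"
    return f"| {scenario} | {', '.join(issues)} | {result} | {evidence} |"
```

The bytes returned by `tools_call` can be passed as the POST body to `curl --data-binary` or to any HTTP client pointed at `$MCP`.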

---

## Critical files reference

| Purpose | Path |
|---|---|
| Error codes | `src/distillery/mcp/tools/_errors.py` |
| Validation helpers | `src/distillery/mcp/tools/_common.py` |
| Store/list/update handlers | `src/distillery/mcp/tools/crud.py` |
| Classify/resolve_review handler | `src/distillery/mcp/tools/classify.py` |
| Watch/gh-sync/store-batch handlers | `src/distillery/mcp/tools/feeds.py` |
| Status tool | `src/distillery/mcp/tools/meta.py` |
| Server registration | `src/distillery/mcp/server.py` |
| Auth | `src/distillery/mcp/auth.py` |
| Middleware (CORS, rate-limit) | `src/distillery/mcp/middleware.py` |
| Webhooks (incl. log pruning) | `src/distillery/mcp/webhooks.py` |
| DuckDB store + migrations | `src/distillery/store/duckdb.py` |
| GitHub sync adapter | `src/distillery/feeds/github_sync.py` |
| RSS adapter | `src/distillery/feeds/rss.py` |
| Poller | `src/distillery/feeds/poller.py` |
| Background jobs | `src/distillery/feeds/sync_jobs.py` |
| Truncation | `src/distillery/feeds/truncation.py` |
| Embedding (Jina, OpenAI) | `src/distillery/embedding/{jina,openai}.py` |
| SessionStart hook | `scripts/hooks/session_start_briefing.py` |
| Skill files | `skills/{setup,watch,briefing,classify}/SKILL.md` |
| CVE suppressions | `.grype.yaml` |
| Pyproject pins | `pyproject.toml` |

## Verification of the test plan itself

Before dispatching subagents, run a smoke check on the orchestrator:
```bash
set -o pipefail                                # so a pytest failure is not masked by tail
git worktree add /tmp/distillery-test staging/api-hardening
cd /tmp/distillery-test
pip install -e ".[dev]" --quiet
pytest --collect-only -q | tail -5            # confirm pytest finds suite
ruff check src/                                # confirm tree is lint-clean
```
If both pass, dispatch the four group subagents in parallel and aggregate `| issue | scenario | result | evidence |` tables into a single coverage matrix. Any FAIL or BLOCKED triggers a follow-up task on the originating issue.
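
The aggregation step can be sketched as below. It assumes each subagent returns a Markdown table with the four columns named above and that results use the vocabulary `PASS`, `FAIL` and `BLOCKED`. `PASS` and the ordering FAIL > BLOCKED > PASS are assumptions; unknown result values are treated as `FAIL`. The function names are illustrative.

```python
def parse_rows(table: str) -> list:
    """Parse `| issue | scenario | result | evidence |` rows, skipping header lines."""
    rows = []
    for line in table.splitlines():
        # maxsplit=3 keeps any "|" inside the evidence cell intact.
        cells = [c.strip() for c in line.strip().strip("|").split("|", 3)]
        if len(cells) != 4 or cells[0].lower() == "issue" or set(cells[0]) <= set("-: "):
            continue
        rows.append(dict(zip(("issue", "scenario", "result", "evidence"), cells)))
    return rows


def coverage_matrix(tables: list) -> dict:
    """Worst result per issue across all subagent tables."""
    severity = {"PASS": 0, "BLOCKED": 1, "FAIL": 2}
    matrix = {}
    for table in tables:
        for row in parse_rows(table):
            result = row["result"].upper()
            if result not in severity:
                result = "FAIL"  # unknown vocabulary is treated as a failure
            current = matrix.get(row["issue"], "PASS")
            if severity[result] >= severity[current]:
                matrix[row["issue"]] = result
    return matrix


def needs_follow_up(matrix: dict) -> list:
    """Issues whose worst result is FAIL or BLOCKED."""
    return sorted(issue for issue, result in matrix.items() if result != "PASS")
```

Because an issue can be covered by several scenarios across groups, taking the worst result per issue keeps a single failing scenario from being hidden by passes elsewhere.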
