fstamatelopoulos · May 9, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -9,7 +9,41 @@ Changes are tracked via git tags. Each release tag corresponds to an entry here.
 
 ## [Unreleased]
 
-_No changes yet._
+### Added — Item 6.33: ollama-models refresh
+
+- **Auto-refresh on server boot.** `cfcf server start` now calls
+  `listOllamaModels()` and persists the result to
+  `availableOllamaModels` in the global config if the live list
+  differs from what's saved. Newly-pulled ollama models propagate to
+  role-picker dropdowns after a server restart without re-running
+  `cfcf init --force`. Best-effort: a missing ollama install, a
+  detection failure, or a config write failure is logged and never
+  blocks boot. The comparison is order-insensitive (`ollama list`
+  reorders by modified time), so boot doesn't flap on every restart.
+- **"Refresh ollama models" button.** New button in the Agent roles
+  section of both web Settings and per-workspace Config tabs. Calls
+  `POST /api/agents/refresh-ollama-models`, displays a status
+  message ("✓ N models detected — list updated" or "list already
+  current"), and triggers a re-fetch of `/api/agents/models` so the
+  `*-ollama` adapter dropdowns pick up new entries without a server
+  restart.
+- New `refreshOllamaModelsInConfig()` helper in `@cfcf/core`. 4 new
+  unit tests + 2 new endpoint tests.
+
+### Fixed — Item 6.33: model-picker UX for ollama-routed adapters
+
+- **Hide "(adapter default)" for `claude-code-ollama` and
+  `opencode-ollama`.** The seed-sourced adapters (`claude-code`,
+  `codex`) have real built-in defaults when `--model` is omitted, so
+  the empty option meaningfully says "let the CLI pick". The
+  ollama-routed adapters don't — `ollama launch <agent>` requires
+  `--model <name>` to know which local model to hand off, and saving
+  `model=""` would produce a silent misconfiguration. The picker now
+  forces a deliberate selection for these adapters.
+- **Empty-state placeholder for ollama-routed adapters** when no
+  models are available: `(no ollama models — pull one or click
+  Refresh)`, rendered as a disabled option so the dropdown isn't
+  visually empty and the user is told what to do next.
 
 ## [0.21.0] -- 2026-05-08
 

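The order-insensitive comparison described in the CHANGELOG entry can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual `refreshOllamaModelsInConfig()` implementation: `sameModelSet` and `planRefresh` are hypothetical names, the error string is made up, and only the `{ models, updated, error? }` result shape comes from the 6.33 notes in `docs/plan.md`.

```typescript
// Hypothetical sketch: helper names and the error text are illustrative only.

/** True when both lists contain the same model names, regardless of order. */
function sameModelSet(a: readonly string[], b: readonly string[]): boolean {
  if (a.length !== b.length) return false;
  // Sort copies so neither input list is mutated.
  const left = [...a].sort();
  const right = [...b].sort();
  return left.every((name, i) => name === right[i]);
}

/** Result shape as described in the 6.33 notes. */
interface RefreshResult {
  models: string[];
  updated: boolean;
  error?: string;
}

/**
 * Decide whether the saved list needs rewriting.
 * `live === null` stands in for "ollama not detected"; the function never throws.
 */
function planRefresh(
  saved: readonly string[],
  live: readonly string[] | null,
): RefreshResult {
  if (live === null) {
    return { models: [...saved], updated: false, error: "ollama not detected" };
  }
  // Only flag an update when the set of names changed, so a list that was
  // merely reordered does not trigger a config write.
  return { models: [...live], updated: !sameModelSet(saved, live) };
}
```

With this shape, a caller would write the config only when `updated` is true, which is what keeps a reordered but otherwise identical `ollama list` from rewriting the config on every boot.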
diff --git a/docs/plan.md b/docs/plan.md
@@ -296,7 +296,7 @@ The tables below are the authoritative view of iteration progress. The **Notes**
 5. **Cerefox-parity Clio improvement** — *Shipped in v0.19.0:* FTS title boosting (**6.24**).
 6. **Adapter expansion driven by Anthropic harness policy** (**6.28**, surfaced 2026-05-07) — Anthropic's Jan→Apr 2026 clarification ([Cherny 2026-04-04](https://x.com/bcherny/status/1808066717213728812)) ties subscription OAuth tokens to interactive use only; cfcf's unattended iteration loop is the third-party-harness pattern the rule targets. Add ollama detection + three new adapters (`opencode` direct, `claude-code-ollama`, `opencode-ollama`) so unattended roles (dev / judge / reflection / documenter / auto-architect) can route to local ollama-served models via the `ollama launch <agent> --model <local-model>` exec wrapper (not via env-var proxy — explicit single command). Skills repository (**6.27**) also entered iter-6 as a research/design item on 2026-05-03.
 
-**Iter-6 active set after the cleanup**: 6.9, 6.11, 6.13, 6.18, 6.27, 6.28, 6.29, 6.30, 6.31, 6.32 (plus 6.19 partial: pre-warm-during-installer). 6.29 + 6.30 + 6.31 + 6.32 surfaced 2026-05-08 during 6.28's dogfood pass — kept in iter-6 to fix-while-debugging rather than defer. **Shipped in v0.18.0**: 6.20, 6.12, 6.26. **Shipped in v0.19.0**: 6.24. **Shipped 2026-05-02 on `feat/structured-pause-actions`**: 6.25. **Shipped in v0.20.0 (2026-05-08)**: 6.28. **Shipped post-v0.20.0 (2026-05-08)**: 6.31, 6.30. Items 6.1, 6.2, 6.10, 6.15 dropped to ⏸ (with rationales in their Notes columns); 6.8 marked ❌ but blocked on 6.11. See the status legend at the end of the section for what each icon means.
+**Iter-6 active set after the cleanup**: 6.9, 6.11, 6.13, 6.18, 6.27, 6.28, 6.29, 6.30, 6.31, 6.32 (plus 6.19 partial: pre-warm-during-installer). 6.29 + 6.30 + 6.31 + 6.32 surfaced 2026-05-08 during 6.28's dogfood pass — kept in iter-6 to fix-while-debugging rather than defer. **Shipped in v0.18.0**: 6.20, 6.12, 6.26. **Shipped in v0.19.0**: 6.24. **Shipped 2026-05-02 on `feat/structured-pause-actions`**: 6.25. **Shipped in v0.20.0 (2026-05-08)**: 6.28. **Shipped in v0.21.0 (2026-05-08)**: 6.31, 6.30. **Shipped post-v0.21.0 (2026-05-08)**: 6.33. Items 6.1, 6.2, 6.10, 6.15 dropped to ⏸ (with rationales in their Notes columns); 6.8 marked ❌ but blocked on 6.11. See the status legend at the end of the section for what each icon means.
 
 | # | Status | Title | Notes |
 |---|--------|-------|-------|
@@ -325,6 +325,7 @@ The tables below are the authoritative view of iteration progress. The **Notes**
 | 6.29 | ❌ | macOS notification fix: per-app bundle ID via shim or `terminal-notifier` | **Surfaced 2026-05-08** during iter-6 dogfood. Symptom: macOS desktop notifications from cfcf events (`loop.paused`, `loop.completed`, `agent.failed`) appear with severe delays — sometimes minutes-to-hours late, often batched. Root cause: cfcf uses `osascript -e 'display notification …'` which macOS attributes to **"Script Editor"** with no separate bundle ID. Three downstream consequences: (1) per-app rate-limiting kicks in around ~5 notifications/min sustained — beyond that, macOS silently queues + batches; (2) DND / Focus Mode queues all of them until DND ends, causing the "minutes-to-hours-later dump" pattern; (3) without a bundle ID, macOS can't merge cfcf events into a coordinated stream — each notification is independent, hitting quotas faster. **Three implementation options**: *(option 1, preferred)* Bundle a tiny `.app` shim (`cfcf-notifier.app/Contents/{Info.plist,MacOS/cfcf-notifier}`) with its own bundle ID + a tiny shell or Swift wrapper that calls `display notification`. cfcf invokes the shim instead of `osascript`. ~50 LOC of plist + a small wrapper script + updates to the install pipeline so the shim ships with cfcf. *(option 2)* Detect `terminal-notifier` if installed (community tool, has its own bundle ID + a `-group` flag for grouping). Use it when present, fall back to `osascript`. Document as a recommended optional dep for macOS users. ~10 LOC of detection + wrapper. *(option 3)* Switch to a Swift / ObjC `NSUserNotification` shim. Overkill compared to option 1; bundle-app shim is simpler. **Out of scope**: rewriting the notification dispatcher or adding new channels — only the macos channel needs the fix. **Alternative if neither option lands by end of iter-6**: remove the macos channel entirely + document terminal-bell as the primary on-the-machine notification path. Linux / cross-platform users already use terminal-bell via the existing dispatcher. 
**Tests**: a smoke test that confirms the shim binary is reachable post-install (`cfcf doctor` check); a CI-skipped integration test that fires a notification and asserts the shim's path was used. The actual delivery is hard to assert in CI — that part stays manual. **Cross-refs**: `packages/core/src/notifications/channels/macos.ts` is the current implementation; `cfcf-notifier.app` would live under a new `scripts/macos-notifier/` directory + be staged into dist by `scripts/stage-dist.sh`. **Effort estimate**: 0.5-1 session for option 1 (bundle shim); 0.25 session for option 2 (terminal-notifier detection). **Surfaced 2026-05-08 during iter-6 dogfood.** |
 | 6.30 | ✅ | API parse errors with non-coder ollama models on claude-code-ollama (refined: opencode-ollama is the fall-back, not "use coder-tuned models" alone) |  **Shipped 2026-05-08**. **Refined finding from re-test**: original framing "gemma4 is bad at tool calls" was wrong. Same gemma4:31b model, two routes: `claude-code-ollama + gemma4:31b` → fails with `API Error: Content block not found` (Anthropic-strict Messages API parser rejects); `opencode-ollama + gemma4:31b` → wrote all four documenter files cleanly. The model IS capable; it's the strict-Anthropic-shape translation layer that rejects its output. The OpenAI-compatible endpoint used by opencode-ollama is more tolerant. **Re-confirmed after the v0.20.0 process-tree-kill fix landed** (so it's not a queue-starvation confound). **Shipped scope**: (a) **`isApiParseRisk()` helper** in `packages/core/src/adapters/index.ts` (sister to `isClaudeCodeHarnessRisk`); 4 unit tests in `adapters.test.ts`. (b) **New blue info callout** in `<HarnessPolicyWarning>` (web Settings + workspace Config) explaining the parse-error symptom and recommending `opencode-ollama` as the fall-back; positioned between the existing yellow policy callout and the blue log-visibility callout. (c) **Inline ⚠ row indicator** next to the adapter selector for any unattended row on `claude-code` (direct) — visual link to the policy callout below the table. Scoped via a new `isUnattendedRole` check in ServerInfo so PA / HA rows don't flag (they're allowed-interactive). (d) **Architect always counted as unattended** for these warnings (`UNATTENDED_ROLE_NAMES` updated in `@cfcf/core`): the loop invokes architect on `refine_plan` resume and judge `NEEDS_REFINEMENT` verdicts as well as the pre-loop `autoReviewSpecs=true` path; the same adapter setting drives all three loop paths AND the manual `cfcf review` path, so the warning has to reflect the worst case. Drops the previous `(autoReviewSpecs=true)` qualifier on the architect role label. 
(e) **CLI `cfcf init` banner** updated to mirror the new web callouts — same three notices in the same order. (f) **`docs/guides/anthropic-policy.md`** documenter row updated with the refined finding: two workable paths (coder-tuned model on claude-code-ollama, OR any model on opencode-ollama). **Out of scope (deferred)**: tolerant retry logic in the iteration-loop spawn (option b from original framing — defer to iter-7 if a similar failure pattern shows up with non-gemma models); deeper investigation of ollama's Anthropic-API translation layer (option c — would belong upstream in ollama, not cfcf). **Cross-refs**: `~/.cfcf/logs/calc-04c553/documenter-001.log` + `documenter-004.log` (the 56-byte claude-code-ollama failure logs); `documenter-005.log` (864-byte opencode-ollama success log on the same model). |
 | 6.31 | ✅ | Orphan agent-process cleanup on `cfcf server stop` / `start` / interactive reap | **Surfaced 2026-05-08** during 6.28's opencode-ollama dogfood. When the user stops + starts the cfcf server while a loop iteration is in flight, the spawned agent processes (`claude` / `codex` / `opencode` / `ollama launch <agent>`) are NOT terminated. They keep running until they finish, fail, or are manually killed. **Failure mode observed**: 4 orphans from earlier loop runs (2 claude documenter+architect, 2 opencode architects) accumulated across server restarts and serialized on ollama's model runner — each held the qwen3-coder model busy with up-to-10-minute timed-out inference requests. The new opencode iteration's `/v1/chat/completions` call queued behind them and starved. From the user's POV: loop "hangs" with no clear error, log file is 40 bytes, no obvious culprit. Required `pgrep -f \"ollama launch\" \| xargs kill` to recover. **Scope**: (a) **[SHIPPED in v0.20.0]** `start.ts` `gracefulShutdown()` enumerates every active spawn from the in-memory `activeProcesses` registry (`packages/core/src/active-processes.ts`) and sends SIGTERM to the **process group** (`process.kill(-pid, "SIGTERM")` — agents are now spawned with `detached: true`), then schedules SIGKILL 1.5s later via `setTimeout(...).unref()`. The shutdown handler awaits a 2-second grace window before `process.exit()` so the SIGKILL timer has time to fire — without that wait, the timer dies with the parent and orphans of `init` accumulate. **(b) [SHIPPED 2026-05-08, post-v0.20.0]** On `startServer()` boot, `start.ts` calls `findOrphanAgentProcesses()` from the new `packages/core/src/orphan-reaper.ts` module. Three conjoined filters (PPID==1 + same effective user + cfcf-spawned command shape) identify orphans confidently — false positives near-zero. 
Auto-reaps via `reapOrphans()` (group-SIGTERM → 1.5s grace → SIGKILL); logs `[server] Reaping N stale agent process(es) from a previous server PID:` followed by one line per orphan. Best-effort: a scan failure never blocks server boot. **(c) [SHIPPED 2026-05-08, post-v0.20.0]** New `cfcf server reap` subcommand uses the same matcher and prints candidates with `pid=… kind=… elapsed=… <command>` lines, then prompts `Kill these N process(es)? [y/N]:`. On `y`: reap. On `N` or empty: `Aborted. No processes killed.`. Empty list: `No zombie agent processes detected.`. Supports `-y / --yes` for non-interactive use. Pure system call — does NOT require the cfcf server to be running. **Tests shipped** (25 in `packages/core/src/orphan-reaper.test.ts`): the classifier covers every cfcf-spawn pattern + negative cases (interactive claude, non-cfcf opencode, ollama serve/pull, unrelated commands like node/python); the parser handles standard `ps` output, header-only input, and malformed lines without throwing; the orphan filter validates each of the three filters in isolation (PPID, user, command shape); `reapOrphans` covers empty input, the SIGTERM-then-SIGKILL flow with mocked `process.kill`, the group-then-direct fallback when group-kill throws ESRCH, and the failed-count when both kills fail. **Cross-refs**: `packages/core/src/orphan-reaper.ts` (matcher + reaper), `packages/server/src/start.ts` (boot-time hook), `packages/cli/src/commands/server.ts` (`reap` subcommand). |
+| 6.33 | ✅ | Auto-refresh `availableOllamaModels` on server boot + manual refresh button | **Shipped 2026-05-08, post-v0.21.0**. **Surfaced 2026-05-08** during dogfood: user pulled a new ollama model, restarted the cfcf server, and the new model didn't show up in the role-picker dropdowns — because `listOllamaModels()` was only invoked at `cfcf init` (interactive setup) and `cfcf doctor` (read-only display); neither the server nor the web UI ever re-detected. **Shipped scope**: (a) **`refreshOllamaModelsInConfig()` helper** in `packages/core/src/ollama-detection.ts` — detects ollama, lists models, persists to `availableOllamaModels` if the live list differs from saved (order-insensitive comparison since `ollama list` reorders by mtime). Returns `{ models, updated, error? }`. Best-effort: never throws; surfaces the missing-ollama case via `error` field. (b) **Boot-time auto-refresh** in `packages/server/src/start.ts` — runs after the orphan reaper, single log line if list changed. (c) **`POST /api/agents/refresh-ollama-models`** endpoint — returns the same shape as the helper. Always 200 (the most common error case is "ollama not installed", which isn't an HTTP failure for this endpoint). (d) **"Refresh ollama models" button** in the Agent roles section of both web Settings (`ServerInfo.tsx`) and per-workspace Config (`ConfigDisplay.tsx`). Clicking calls the endpoint, displays a status message, and bumps a `modelsRev` counter that triggers a re-fetch of `/api/agents/models` so the `*-ollama` dropdowns pick up new entries. **Tests added**: 4 new unit tests in `ollama-detection.test.ts` (shape, no-config-no-write, list-equality persistence guard, ordering insensitivity); 2 new endpoint tests in `agent-models.test.ts` (shape, no-500-on-missing-ollama). **Cross-refs**: `packages/core/src/ollama-detection.ts`, `packages/server/src/start.ts`, `packages/server/src/routes/agent-models.ts`, `packages/web/src/api.ts`, `packages/web/src/pages/ServerInfo.tsx`, `packages/web/src/components/ConfigDisplay.tsx`. |
 | 6.32 | ❌ | Opencode-ollama hang-detection + reduced-deadlock surface | **Surfaced 2026-05-08** during 6.28 dogfood. Despite the documented `opencode run` "scriptable" contract + `--dangerously-skip-permissions` flag, opencode-ollama silently hangs in cf²'s harness pattern when (a) ollama's model runner is busy / has dead orphan requests in queue, (b) the prior session's hardcoded permission denies trip an internal stdin-prompt code path, or (c) opencode's stream-json over the OpenAI-compatible `/v1/chat/completions` API gets aborted server-side (500) and opencode doesn't surface the error loudly. **Symptom**: opencode process at 0% CPU, no TCP connection to ollama, log file frozen mid-session at "service=llm ... stream", cfcf log file 40 bytes. No timeout, no error, no exit. **Three angles to investigate**: *(a)* hard timeout on agent spawns in `process-manager.ts` (e.g. 15-min default per role, configurable per role) so a hung agent eventually kills itself + cfcf marks the iteration failed instead of the loop hanging forever. *(b)* Detect opencode's specific failure modes by following `~/.local/share/opencode/log/<timestamp>.log` in addition to stdout — opencode's internal log captures the full session lifecycle including provider errors. The cfcf log writer could optionally tail this file as a side-channel. *(c)* Test whether `--format json` on `opencode run` produces cleaner streaming events than the default formatted output (might help with the buffering UX too — sister concern to the claude-code-ollama buffering problem). **Recommendation in the meantime**: prefer `claude-code-ollama` over `opencode-ollama` for unattended roles until opencode's stability matures (when running against a local ollama backend). Update `anthropic-policy.md` with this caveat. 
**Cross-refs**: github/anomalyco/opencode#13851 (the permission-deny-causes-cancel-state issue we already knew about); the 2026-05-08 calc-workspace dogfood log session at `~/.local/share/opencode/log/2026-05-08T082733.log` is the canonical reproduction. **Effort**: 0.25 session for the docs caveat (immediately useful); 1–2 sessions for the spawn-timeout + opencode-log-tail investigation (defer until 6.31 ships, since the orphan cleanup is the more-bang-for-buck blocker). |
 
 **Status icons (this section):**