Merged
36 changes: 35 additions & 1 deletion CHANGELOG.md
@@ -9,7 +9,41 @@ Changes are tracked via git tags. Each release tag corresponds to an entry here.

## [Unreleased]

_No changes yet._
### Added — Item 6.33: ollama-models refresh

- **Auto-refresh on server boot.** `cfcf server start` now calls
`listOllamaModels()` and persists the result to
`availableOllamaModels` in the global config if the live list
differs from what's saved. Newly-pulled ollama models propagate to
role-picker dropdowns after a server restart without re-running
`cfcf init --force`. Best-effort: if ollama is not installed,
detection fails, or the config write fails, the server logs the
problem and continues; boot is never blocked. The comparison is
order-insensitive (`ollama list` reorders by modified-time), so the
boot path doesn't rewrite the config on every restart.
- **"Refresh ollama models" button.** New button in the Agent roles
section of both web Settings and per-workspace Config tabs. Calls
`POST /api/agents/refresh-ollama-models`, displays a status
message ("✓ N models detected — list updated" or "list already
current"), and triggers a re-fetch of `/api/agents/models` so the
`*-ollama` adapter dropdowns pick up new entries without a server
restart.
- New `refreshOllamaModelsInConfig()` helper in `@cfcf/core`. 4 new
unit tests + 2 new endpoint tests.
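
As an illustration of the order-insensitive comparison described above, here is a minimal sketch. It is not the code from `@cfcf/core`: `sameModelSet` and `planRefresh` are hypothetical names, and the only details taken from this PR are the `{ models, updated, error? }` result shape (per the 6.33 plan entry) and the rule that the config is written only when the live list differs from the saved one. What `models` holds when ollama is missing is an assumption.

```typescript
// Hypothetical sketch; function names are illustrative, not the real helper.
interface RefreshResult {
  models: string[];
  updated: boolean;
  error?: string;
}

/** True when both lists hold the same model names, ignoring order. */
function sameModelSet(a: readonly string[], b: readonly string[]): boolean {
  if (a.length !== b.length) return false;
  const sortedA = [...a].sort();
  const sortedB = [...b].sort();
  return sortedA.every((name, i) => name === sortedB[i]);
}

/**
 * Decide what a refresh should do, given the saved list and the live list.
 * `live === null` stands in for "ollama not installed / detection failed".
 */
function planRefresh(
  saved: readonly string[],
  live: readonly string[] | null,
): RefreshResult {
  if (live === null) {
    // Assumption: keep the saved list and report the problem via `error`.
    return { models: [...saved], updated: false, error: "ollama not detected" };
  }
  if (sameModelSet(saved, live)) {
    // Same models in a different order: no config write needed.
    return { models: [...saved], updated: false };
  }
  return { models: [...live], updated: true };
}
```

Sorting copies of both lists keeps the comparison correct even if a list contains duplicates, which a plain `Set` membership check would not.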

### Fixed — Item 6.33: model-picker UX for ollama-routed adapters

- **Hide "(adapter default)" for `claude-code-ollama` and
`opencode-ollama`.** The seed-sourced adapters (`claude-code`,
`codex`) have real built-in defaults when `--model` is omitted, so
the empty option meaningfully says "let the CLI pick". The
ollama-routed adapters don't — `ollama launch <agent>` requires
`--model <name>` to know which local model to hand off, and saving
`model=""` would produce a silent misconfiguration. The picker now
forces a deliberate selection for these adapters.
- **Empty-state placeholder for ollama-routed adapters** when no
  models are available: `(no ollama models — pull one or click
  Refresh)` is rendered as a disabled option, so the dropdown is never
  visually empty and the user is told what to do next.
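
The two picker rules above can be sketched as a single option-building function. This is a hedged illustration, not the component code from `ServerInfo.tsx` or `ConfigDisplay.tsx`: `PickerOption`, `OLLAMA_ROUTED`, and `buildModelOptions` are hypothetical names; only the adapter names and the two label strings come from this changelog.

```typescript
// Hypothetical sketch; type and function names are illustrative.
interface PickerOption {
  value: string;
  label: string;
  disabled?: boolean;
}

// Adapters that hand off to `ollama launch <agent> --model <name>`.
const OLLAMA_ROUTED = new Set(["claude-code-ollama", "opencode-ollama"]);

function buildModelOptions(
  adapter: string,
  models: readonly string[],
): PickerOption[] {
  const options: PickerOption[] = models.map((m) => ({ value: m, label: m }));

  if (!OLLAMA_ROUTED.has(adapter)) {
    // Other adapters have a real built-in default, so the empty choice is valid.
    return [{ value: "", label: "(adapter default)" }, ...options];
  }
  if (options.length === 0) {
    // Nothing to pick: show a disabled hint instead of an empty dropdown.
    return [
      {
        value: "",
        label: "(no ollama models — pull one or click Refresh)",
        disabled: true,
      },
    ];
  }
  // Ollama-routed with models available: no empty option, selection is forced.
  return options;
}
```

With this shape, `model=""` can only be saved for adapters where an empty model is meaningful.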

## [0.21.0] -- 2026-05-08

3 changes: 2 additions & 1 deletion docs/plan.md
@@ -296,7 +296,7 @@ The tables below are the authoritative view of iteration progress. The **Notes**
5. **Cerefox-parity Clio improvement** — *Shipped in v0.19.0:* FTS title boosting (**6.24**).
6. **Adapter expansion driven by Anthropic harness policy** (**6.28**, surfaced 2026-05-07) — Anthropic's Jan→Apr 2026 clarification ([Cherny 2026-04-04](https://x.com/bcherny/status/1808066717213728812)) ties subscription OAuth tokens to interactive use only; cfcf's unattended iteration loop is the third-party-harness pattern the rule targets. Add ollama detection + three new adapters (`opencode` direct, `claude-code-ollama`, `opencode-ollama`) so unattended roles (dev / judge / reflection / documenter / auto-architect) can route to local ollama-served models via the `ollama launch <agent> --model <local-model>` exec wrapper (not via env-var proxy — explicit single command). Skills repository (**6.27**) also entered iter-6 as a research/design item on 2026-05-03.

**Iter-6 active set after the cleanup**: 6.9, 6.11, 6.13, 6.18, 6.27, 6.28, 6.29, 6.30, 6.31, 6.32 (plus 6.19 partial: pre-warm-during-installer). 6.29 + 6.30 + 6.31 + 6.32 surfaced 2026-05-08 during 6.28's dogfood pass — kept in iter-6 to fix-while-debugging rather than defer. **Shipped in v0.18.0**: 6.20, 6.12, 6.26. **Shipped in v0.19.0**: 6.24. **Shipped 2026-05-02 on `feat/structured-pause-actions`**: 6.25. **Shipped in v0.20.0 (2026-05-08)**: 6.28. **Shipped post-v0.20.0 (2026-05-08)**: 6.31, 6.30. Items 6.1, 6.2, 6.10, 6.15 dropped to ⏸ (with rationales in their Notes columns); 6.8 marked ❌ but blocked on 6.11. See the status legend at the end of the section for what each icon means.
**Iter-6 active set after the cleanup**: 6.9, 6.11, 6.13, 6.18, 6.27, 6.28, 6.29, 6.30, 6.31, 6.32 (plus 6.19 partial: pre-warm-during-installer). 6.29 + 6.30 + 6.31 + 6.32 surfaced 2026-05-08 during 6.28's dogfood pass — kept in iter-6 to fix-while-debugging rather than defer. **Shipped in v0.18.0**: 6.20, 6.12, 6.26. **Shipped in v0.19.0**: 6.24. **Shipped 2026-05-02 on `feat/structured-pause-actions`**: 6.25. **Shipped in v0.20.0 (2026-05-08)**: 6.28. **Shipped in v0.21.0 (2026-05-08)**: 6.31, 6.30. **Shipped post-v0.21.0 (2026-05-08)**: 6.33. Items 6.1, 6.2, 6.10, 6.15 dropped to ⏸ (with rationales in their Notes columns); 6.8 marked ❌ but blocked on 6.11. See the status legend at the end of the section for what each icon means.

| # | Status | Title | Notes |
|---|--------|-------|-------|
@@ -325,6 +325,7 @@ The tables below are the authoritative view of iteration progress. The **Notes**
| 6.29 | ❌ | macOS notification fix: per-app bundle ID via shim or `terminal-notifier` | **Surfaced 2026-05-08** during iter-6 dogfood. Symptom: macOS desktop notifications from cfcf events (`loop.paused`, `loop.completed`, `agent.failed`) appear with severe delays — sometimes minutes-to-hours late, often batched. Root cause: cfcf uses `osascript -e 'display notification …'` which macOS attributes to **"Script Editor"** with no separate bundle ID. Three downstream consequences: (1) per-app rate-limiting kicks in around ~5 notifications/min sustained — beyond that, macOS silently queues + batches; (2) DND / Focus Mode queues all of them until DND ends, causing the "minutes-to-hours-later dump" pattern; (3) without a bundle ID, macOS can't merge cfcf events into a coordinated stream — each notification is independent, hitting quotas faster. **Three implementation options**: *(option 1, preferred)* Bundle a tiny `.app` shim (`cfcf-notifier.app/Contents/{Info.plist,MacOS/cfcf-notifier}`) with its own bundle ID + a tiny shell or Swift wrapper that calls `display notification`. cfcf invokes the shim instead of `osascript`. ~50 LOC of plist + a small wrapper script + updates to the install pipeline so the shim ships with cfcf. *(option 2)* Detect `terminal-notifier` if installed (community tool, has its own bundle ID + a `-group` flag for grouping). Use it when present, fall back to `osascript`. Document as a recommended optional dep for macOS users. ~10 LOC of detection + wrapper. *(option 3)* Switch to a Swift / ObjC `NSUserNotification` shim. Overkill compared to option 1; bundle-app shim is simpler. **Out of scope**: rewriting the notification dispatcher or adding new channels — only the macos channel needs the fix. **Alternative if neither option lands by end of iter-6**: remove the macos channel entirely + document terminal-bell as the primary on-the-machine notification path. Linux / cross-platform users already use terminal-bell via the existing dispatcher. 
**Tests**: a smoke test that confirms the shim binary is reachable post-install (`cfcf doctor` check); a CI-skipped integration test that fires a notification and asserts the shim's path was used. The actual delivery is hard to assert in CI — that part stays manual. **Cross-refs**: `packages/core/src/notifications/channels/macos.ts` is the current implementation; `cfcf-notifier.app` would live under a new `scripts/macos-notifier/` directory + be staged into dist by `scripts/stage-dist.sh`. **Effort estimate**: 0.5-1 session for option 1 (bundle shim); 0.25 session for option 2 (terminal-notifier detection). **Surfaced 2026-05-08 during iter-6 dogfood.** |
| 6.30 | ✅ | API parse errors with non-coder ollama models on claude-code-ollama (refined: opencode-ollama is the fall-back, not "use coder-tuned models" alone) | **Shipped 2026-05-08**. **Refined finding from re-test**: original framing "gemma4 is bad at tool calls" was wrong. Same gemma4:31b model, two routes: `claude-code-ollama + gemma4:31b` → fails with `API Error: Content block not found` (Anthropic-strict Messages API parser rejects); `opencode-ollama + gemma4:31b` → wrote all four documenter files cleanly. The model IS capable; it's the strict-Anthropic-shape translation layer that rejects its output. The OpenAI-compatible endpoint used by opencode-ollama is more tolerant. **Re-confirmed after the v0.20.0 process-tree-kill fix landed** (so it's not a queue-starvation confound). **Shipped scope**: (a) **`isApiParseRisk()` helper** in `packages/core/src/adapters/index.ts` (sister to `isClaudeCodeHarnessRisk`); 4 unit tests in `adapters.test.ts`. (b) **New blue info callout** in `<HarnessPolicyWarning>` (web Settings + workspace Config) explaining the parse-error symptom and recommending `opencode-ollama` as the fall-back; positioned between the existing yellow policy callout and the blue log-visibility callout. (c) **Inline ⚠ row indicator** next to the adapter selector for any unattended row on `claude-code` (direct) — visual link to the policy callout below the table. Scoped via a new `isUnattendedRole` check in ServerInfo so PA / HA rows don't flag (they're allowed-interactive). (d) **Architect always counted as unattended** for these warnings (`UNATTENDED_ROLE_NAMES` updated in `@cfcf/core`): the loop invokes architect on `refine_plan` resume and judge `NEEDS_REFINEMENT` verdicts as well as the pre-loop `autoReviewSpecs=true` path; the same adapter setting drives all three loop paths AND the manual `cfcf review` path, so the warning has to reflect the worst case. Drops the previous `(autoReviewSpecs=true)` qualifier on the architect role label. 
(e) **CLI `cfcf init` banner** updated to mirror the new web callouts — same three notices in the same order. (f) **`docs/guides/anthropic-policy.md`** documenter row updated with the refined finding: two workable paths (coder-tuned model on claude-code-ollama, OR any model on opencode-ollama). **Out of scope (deferred)**: tolerant retry logic in the iteration-loop spawn (option b from original framing — defer to iter-7 if a similar failure pattern shows up with non-gemma models); deeper investigation of ollama's Anthropic-API translation layer (option c — would belong upstream in ollama, not cfcf). **Cross-refs**: `~/.cfcf/logs/calc-04c553/documenter-001.log` + `documenter-004.log` (the 56-byte claude-code-ollama failure logs); `documenter-005.log` (864-byte opencode-ollama success log on the same model). |
| 6.31 | ✅ | Orphan agent-process cleanup on `cfcf server stop` / `start` / interactive reap | **Surfaced 2026-05-08** during 6.28's opencode-ollama dogfood. When the user stops + starts the cfcf server while a loop iteration is in flight, the spawned agent processes (`claude` / `codex` / `opencode` / `ollama launch <agent>`) are NOT terminated. They keep running until they finish, fail, or are manually killed. **Failure mode observed**: 4 orphans from earlier loop runs (2 claude documenter+architect, 2 opencode architects) accumulated across server restarts and serialized on ollama's model runner — each held the qwen3-coder model busy with up-to-10-minute timed-out inference requests. The new opencode iteration's `/v1/chat/completions` call queued behind them and starved. From the user's POV: loop "hangs" with no clear error, log file is 40 bytes, no obvious culprit. Required `pgrep -f \"ollama launch\" \| xargs kill` to recover. **Scope**: (a) **[SHIPPED in v0.20.0]** `start.ts` `gracefulShutdown()` enumerates every active spawn from the in-memory `activeProcesses` registry (`packages/core/src/active-processes.ts`) and sends SIGTERM to the **process group** (`process.kill(-pid, "SIGTERM")` — agents are now spawned with `detached: true`), then schedules SIGKILL 1.5s later via `setTimeout(...).unref()`. The shutdown handler awaits a 2-second grace window before `process.exit()` so the SIGKILL timer has time to fire — without that wait, the timer dies with the parent and orphans of `init` accumulate. **(b) [SHIPPED 2026-05-08, post-v0.20.0]** On `startServer()` boot, `start.ts` calls `findOrphanAgentProcesses()` from the new `packages/core/src/orphan-reaper.ts` module. Three conjoined filters (PPID==1 + same effective user + cfcf-spawned command shape) identify orphans confidently — false positives near-zero. 
Auto-reaps via `reapOrphans()` (group-SIGTERM → 1.5s grace → SIGKILL); logs `[server] Reaping N stale agent process(es) from a previous server PID:` followed by one line per orphan. Best-effort: a scan failure never blocks server boot. **(c) [SHIPPED 2026-05-08, post-v0.20.0]** New `cfcf server reap` subcommand uses the same matcher and prints candidates with `pid=… kind=… elapsed=… <command>` lines, then prompts `Kill these N process(es)? [y/N]:`. On `y`: reap. On `N` or empty: `Aborted. No processes killed.`. Empty list: `No zombie agent processes detected.`. Supports `-y / --yes` for non-interactive use. Pure system call — does NOT require the cfcf server to be running. **Tests shipped** (25 in `packages/core/src/orphan-reaper.test.ts`): the classifier covers every cfcf-spawn pattern + negative cases (interactive claude, non-cfcf opencode, ollama serve/pull, unrelated commands like node/python); the parser handles standard `ps` output, header-only input, and malformed lines without throwing; the orphan filter validates each of the three filters in isolation (PPID, user, command shape); `reapOrphans` covers empty input, the SIGTERM-then-SIGKILL flow with mocked `process.kill`, the group-then-direct fallback when group-kill throws ESRCH, and the failed-count when both kills fail. **Cross-refs**: `packages/core/src/orphan-reaper.ts` (matcher + reaper), `packages/server/src/start.ts` (boot-time hook), `packages/cli/src/commands/server.ts` (`reap` subcommand). |
| 6.33 | ✅ | Auto-refresh `availableOllamaModels` on server boot + manual refresh button | **Shipped 2026-05-08, post-v0.21.0**. **Surfaced 2026-05-08** during dogfood: user pulled a new ollama model, restarted the cfcf server, and the new model didn't show up in the role-picker dropdowns — because `listOllamaModels()` was only invoked at `cfcf init` (interactive setup) and `cfcf doctor` (read-only display); neither the server nor the web UI ever re-detected. **Shipped scope**: (a) **`refreshOllamaModelsInConfig()` helper** in `packages/core/src/ollama-detection.ts` — detects ollama, lists models, persists to `availableOllamaModels` if the live list differs from saved (order-insensitive comparison since `ollama list` reorders by mtime). Returns `{ models, updated, error? }`. Best-effort: never throws; surfaces the missing-ollama case via `error` field. (b) **Boot-time auto-refresh** in `packages/server/src/start.ts` — runs after the orphan reaper, single log line if list changed. (c) **`POST /api/agents/refresh-ollama-models`** endpoint — returns the same shape as the helper. Always 200 (the most common error case is "ollama not installed", which isn't an HTTP failure for this endpoint). (d) **"Refresh ollama models" button** in the Agent roles section of both web Settings (`ServerInfo.tsx`) and per-workspace Config (`ConfigDisplay.tsx`). Clicking calls the endpoint, displays a status message, and bumps a `modelsRev` counter that triggers a re-fetch of `/api/agents/models` so the `*-ollama` dropdowns pick up new entries. **Tests added**: 4 new unit tests in `ollama-detection.test.ts` (shape, no-config-no-write, list-equality persistence guard, ordering insensitivity); 2 new endpoint tests in `agent-models.test.ts` (shape, no-500-on-missing-ollama). 
**Cross-refs**: `packages/core/src/ollama-detection.ts`, `packages/server/src/start.ts`, `packages/server/src/routes/agent-models.ts`, `packages/web/src/api.ts`, `packages/web/src/pages/ServerInfo.tsx`, `packages/web/src/components/ConfigDisplay.tsx`. |
| 6.32 | ❌ | Opencode-ollama hang-detection + reduced-deadlock surface | **Surfaced 2026-05-08** during 6.28 dogfood. Despite the documented `opencode run` "scriptable" contract + `--dangerously-skip-permissions` flag, opencode-ollama silently hangs in cf²'s harness pattern when (a) ollama's model runner is busy / has dead orphan requests in queue, (b) the prior session's hardcoded permission denies trip an internal stdin-prompt code path, or (c) opencode's stream-json over the OpenAI-compatible `/v1/chat/completions` API gets aborted server-side (500) and opencode doesn't surface the error loudly. **Symptom**: opencode process at 0% CPU, no TCP connection to ollama, log file frozen mid-session at "service=llm ... stream", cfcf log file 40 bytes. No timeout, no error, no exit. **Three angles to investigate**: *(a)* hard timeout on agent spawns in `process-manager.ts` (e.g. 15-min default per role, configurable per role) so a hung agent eventually kills itself + cfcf marks the iteration failed instead of the loop hanging forever. *(b)* Detect opencode's specific failure modes by following `~/.local/share/opencode/log/<timestamp>.log` in addition to stdout — opencode's internal log captures the full session lifecycle including provider errors. The cfcf log writer could optionally tail this file as a side-channel. *(c)* Test whether `--format json` on `opencode run` produces cleaner streaming events than the default formatted output (might help with the buffering UX too — sister concern to the claude-code-ollama buffering problem). **Recommendation in the meantime**: prefer `claude-code-ollama` over `opencode-ollama` for unattended roles until opencode's stability matures (when running against a local ollama backend). Update `anthropic-policy.md` with this caveat. 
**Cross-refs**: github/anomalyco/opencode#13851 (the permission-deny-causes-cancel-state issue we already knew about); the 2026-05-08 calc-workspace dogfood log session at `~/.local/share/opencode/log/2026-05-08T082733.log` is the canonical reproduction. **Effort**: 0.25 session for the docs caveat (immediately useful); 1–2 sessions for the spawn-timeout + opencode-log-tail investigation (defer until 6.31 ships, since the orphan cleanup is the more-bang-for-buck blocker). |
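
The group-SIGTERM, grace period, then SIGKILL flow described in the 6.31 row can be sketched as follows. This is an illustrative reduction under stated assumptions, not the code in `orphan-reaper.ts` or `start.ts`: `KillFn`, `signalTree`, and `terminateWithGrace` are hypothetical names, the kill function is injected so the logic can be exercised without real processes, and the fallback here triggers on any error from the group kill, whereas the row specifically mentions ESRCH.

```typescript
// Hypothetical sketch; names are illustrative, not the cfcf implementation.
type Signal = "SIGTERM" | "SIGKILL";
type KillFn = (pid: number, signal: Signal) => void;

/**
 * Signal the whole process group first (negative pid), then fall back to
 * the single process if the group kill throws. Returns false if both fail.
 */
function signalTree(pid: number, signal: Signal, kill: KillFn): boolean {
  try {
    kill(-pid, signal); // requires the child to have been spawned detached
    return true;
  } catch {
    try {
      kill(pid, signal);
      return true;
    } catch {
      return false;
    }
  }
}

/** Send SIGTERM now and schedule SIGKILL after a grace period (1.5 s default). */
function terminateWithGrace(pid: number, kill: KillFn, graceMs = 1500): boolean {
  const delivered = signalTree(pid, "SIGTERM", kill);
  const timer = setTimeout(() => {
    signalTree(pid, "SIGKILL", kill);
  }, graceMs);
  // Where the runtime supports it, unref so the timer alone does not keep
  // the process alive; the caller must still wait for it to fire.
  (timer as unknown as { unref?: () => void }).unref?.();
  return delivered;
}
```

In a Node process, `(pid, sig) => process.kill(pid, sig)` could be passed as `kill`. Because the timer is unref'd, a caller that exits immediately would never deliver the SIGKILL, which is why the row notes a 2-second wait before `process.exit()`.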

**Status icons (this section):**