91 changes: 91 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,97 @@ Changes are tracked via git tags. Each release tag corresponds to an entry here.

## [Unreleased]

### Added — Item 6.8: role-template management UI (rounds 1 + 2)

**Round 1: full-template editing**

- **New top-level "Agents" tab** in the web UI (between Memory and
Settings). One sub-tab per managed role template; each tab shows
the template content in a scrollable monospace editor with version
selector, edit / save / promote / revert / delete actions, and an
inline help block.
- **Versioning + promote-to-production**: every role's bundled cf²
default is always selectable (read-only, never deletable). Saving
edits creates a new labelled version under
`~/.cfcf/templates-managed/<name>/`. Promoting a version writes its
content to the existing `~/.cfcf/templates/<name>` user-global
override path that `getTemplate()` already reads — no runtime
changes to agent spawning. Reverting to default deletes the override
file so cf² falls back to the embedded default. Editing a
promoted version refreshes the override file in-place.
- **HTTP API**: `GET /api/role-templates`, `GET .../:name`,
`GET .../:name/versions/:versionId`, `POST .../:name/versions`,
`PUT .../:name/versions/:versionId`,
`DELETE .../:name/versions/:versionId`,
`POST .../:name/promote`. Uniform `{ error: string }` envelope.
- **New core module** `packages/core/src/role-templates.ts` with full
CRUD, manifest self-heal on corruption, orphan detection.
`getEmbeddedTemplate(name)` newly exported from `templates.ts` for
read-only access to the bundled default.
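
The save / promote / revert flow described in Round 1 can be sketched as below. This is a minimal in-memory model, not the code in `role-templates.ts`: the `TemplateState` shape and all function names are hypothetical, and the two on-disk locations are stood in for by plain fields.

```typescript
// Sketch only: models the two on-disk locations as in-memory fields.
interface TemplateState {
  versions: Map<string, string>; // stands in for ~/.cfcf/templates-managed/<name>/
  override: string | null;       // stands in for ~/.cfcf/templates/<name>; null = no file
  promotedId: string | null;     // which saved version is currently promoted
}

function newState(): TemplateState {
  return { versions: new Map(), override: null, promotedId: null };
}

// Saving creates or updates a version. Editing the promoted version
// also refreshes the override in place.
function saveVersion(state: TemplateState, id: string, content: string): void {
  state.versions.set(id, content);
  if (state.promotedId === id) state.override = content;
}

// Promoting copies the version's content to the override location.
function promote(state: TemplateState, id: string): void {
  const content = state.versions.get(id);
  if (content === undefined) throw new Error(`unknown version: ${id}`);
  state.override = content;
  state.promotedId = id;
}

// Reverting removes the override, so lookups fall back to the embedded default.
function revertToDefault(state: TemplateState): void {
  state.override = null;
  state.promotedId = null;
}

// What a reader of the override path would effectively see.
function effectiveTemplate(state: TemplateState, embeddedDefault: string): string {
  return state.override ?? embeddedDefault;
}
```

The point of the model is that agent spawning only ever looks at the override location, which is why promotion needs no runtime changes.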

**Round 2: augmented-type versions (auto-upgrade-friendly)**

- **`type: "full" | "augmented"`** added to every saved version.
Round-1 manifests (no `type`) are read as `"full"` for back-compat.
- **`type: "full"`** — body REPLACES the bundled default at
promote time. Maximum flexibility (delete sections, restructure);
no auto-upgrade. UI shows a `ℹ Forked from cf² vX.Y.Z` badge so
the user knows their version may have drifted from the latest
bundled default.
- **`type: "augmented"`** — body is APPENDED to the live bundled
default at promote time. The harness composes
`<bundled-default> + separator + <extension>` and writes that to
the override file. The bundled default is read live (never
duplicated on disk), so when cf² ships a new default, the user's
extension automatically rides along on the new version. **This is
upgrade-friendly by default.**
- **Boot-time auto-recompose**: every server boot re-composes any
promoted augmented version (`refreshAugmentedOverrides()` in core,
called from `start.ts` after the orphan reaper). Cheap idempotent
pass — only writes when the on-disk override actually differs from
the live composition. Single log line on change:
`[server] Re-composed N augmented role-template override(s)…`.
Full versions are NOT touched (frozen by design).
- **New "Augment" button** in the action row, next to "Edit". Augment
always creates a new augmented version on top of the bundled
default (regardless of which version is currently selected — keeps
the upgrade-friendly contract).
- **Split editor view** for augmented versions: bundled default
rendered read-only at the top (~25vh, smaller so the extension
editor has visual prominence), extension textarea below (~30vh)
with a placeholder showing example custom directions. Editing
toggles only the extension to editable.
- **Version dropdown** suffixes each entry with its type:
`My stricter judge — full (2026-05-09) — promoted`.
- **API extension**: `POST /api/role-templates/:name/versions` body
now accepts an optional `type: "full" | "augmented"` (defaults to
`"full"`). Invalid values return 400.
- **UI polish (also round 2)**: status messages moved above the
version-selector row (so "click Promote" reads naturally below the
buttons), directional words removed from messages, "creating new
X version" indicator added to the heading when appropriate.
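
The Round 2 rules (type defaulting, 400 on invalid values, append-style composition, and the write-only-on-change boot pass) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the separator string, the function names, and the use of a `Map` in place of the override file are not taken from the cf² source.

```typescript
type VersionType = "full" | "augmented";

type TypeResult =
  | { ok: true; type: VersionType }
  | { ok: false; status: number; error: string };

// A missing type (round-1 manifests, or an omitted API field) is read as "full";
// anything other than the two known values is rejected with a 400-style result.
function parseVersionType(raw: unknown): TypeResult {
  if (raw === undefined) return { ok: true, type: "full" };
  if (raw === "full" || raw === "augmented") return { ok: true, type: raw };
  return { ok: false, status: 400, error: `invalid type: ${String(raw)}` };
}

// Assumed separator; the real one is not stated in the changelog.
const SEPARATOR = "\n\n---\n\n";

// Augmented composition: <bundled-default> + separator + <extension>.
function compose(bundledDefault: string, extension: string): string {
  return bundledDefault + SEPARATOR + extension;
}

// Idempotent refresh: writes only when the stored override differs from
// the live composition. Returns true when a write happened.
function refreshOverride(
  overrides: Map<string, string>, // stands in for the override files
  name: string,
  bundledDefault: string,
  extension: string,
): boolean {
  const next = compose(bundledDefault, extension);
  if (overrides.get(name) === next) return false;
  overrides.set(name, next);
  return true;
}
```

Because only the extension is stored, a changed bundled default makes the next refresh produce a different composition, which is what lets an augmented version follow upgrades.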

**Managed templates (rounds 1 + 2)**: `cfcf-architect-instructions.md`,
`cfcf-judge-instructions.md`, `cfcf-documenter-instructions.md`,
`cfcf-reflection-instructions.md`, `process.md`. Per-project overrides
at `<repo>/cfcf-templates/<name>` continue to take precedence over the
user-global override (the power-user escape hatch, unmanaged from the UI).
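
The precedence stated above (per-project override first, then the user-global override, then the embedded default) amounts to a first-match lookup. The sketch below is illustrative only: `TemplateSources` and `resolveTemplate` are hypothetical names, and the three locations are modelled as maps rather than files.

```typescript
// Each map stands in for one lookup location, keyed by template file name.
interface TemplateSources {
  projectLocal: Map<string, string>; // <repo>/cfcf-templates/<name>
  userGlobal: Map<string, string>;   // ~/.cfcf/templates/<name>
  embedded: Map<string, string>;     // bundled default
}

// First match wins: project-local, then user-global, then embedded.
function resolveTemplate(name: string, src: TemplateSources): string | undefined {
  return src.projectLocal.get(name) ?? src.userGlobal.get(name) ?? src.embedded.get(name);
}
```

A version promoted from the UI lands in the user-global slot, so a project-local file with the same name still shadows it.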

**Out of scope (deferred)**: dev-role custom-directions block (dev's
instructions are programmatically generated by `context-assembler`,
not file-loaded — needs a different mechanism), Product Architect /
Help Assistant system-prompt management, per-project override
management UI, diff viewer between versions.

**Tests**: 41 unit tests in `role-templates.test.ts` (every flow +
manifest corruption recovery + cross-template isolation +
back-compat with round-1 manifests + augmented composition + boot
refresh), 17 endpoint tests in `routes/role-templates.test.ts`
(including round-2 type validation), 3 route-parser tests for
`/agents` + `?template=`.

**Design doc**: [`docs/design/role-template-management.md`](docs/design/role-template-management.md).

### Added — Item 6.33: ollama-models refresh

- **Auto-refresh on server boot.** `cfcf server start` now calls
30 changes: 26 additions & 4 deletions CLAUDE.md
@@ -15,7 +15,10 @@ cfcf (Cerefox Code Factory, also written cf², pronounced "cf square") is a dete
- Agents run as **local processes** (not containers) in the user's dev environment
- **Git branches** provide isolation between iterations (feature branch per iteration, merge to main)
- **Seven agent roles**: five run inside the iteration loop — dev (writes code), judge (per-iteration assessment), architect (reviews / extends Problem Pack; verdicts: READY / NEEDS_REFINEMENT / BLOCKED / SCOPE_COMPLETE), reflection (cross-iteration strategic review), documenter (produces final docs); two are interactive — Product Architect (live spec iteration before the loop, item 5.14) and Help Assistant (in-shell guidance). Each role independently configurable (adapter + model).
- **Per-adapter model registry** (item 6.26): pickers in web Settings, web workspace Config, and `cfcf init` / `cfcf config edit` source their model dropdown from `packages/core/src/adapters/seed-models.ts` (the bundled seed) merged with the user's optional override on `CfcfGlobalConfig.agentModels[<adapter>]`. Resolution lives in `resolveModelsForAdapter()`. The seed is intentionally minimal — generic aliases (`opus`, `sonnet`, `haiku` for claude-code; `gpt-5-codex`, `gpt-5`, `o3` for codex) so it ages slowly. **Single edit surface**: web Settings → Model registry is the one place to add/remove/reset per-adapter models; pickers themselves are read-only dropdowns (no inline "custom name" affordance — kept the UI predictable: one place to manage models). Hand-edited config values that aren't in the registry are preserved as `(custom)` entries on first render so back-compat doesn't break. **Maintenance**: when an upstream agent CLI ships a new headline model, edit the relevant array in `seed-models.ts` and ship in the next release; user overrides survive the upgrade.
- **Per-adapter model registry** (item 6.26): pickers in web Settings, web workspace Config, and `cfcf init` / `cfcf config edit` source their model dropdown from `packages/core/src/adapters/seed-models.ts` (the bundled seed) merged with the user's optional override on `CfcfGlobalConfig.agentModels[<adapter>]`. Resolution lives in `resolveModelsForAdapter()`. The seed is intentionally minimal — generic aliases (`opus`, `sonnet`, `haiku` for claude-code; `gpt-5-codex`, `gpt-5`, `o3` for codex) so it ages slowly. **Single edit surface**: web Settings → Model registry is the one place to add/remove/reset per-adapter models; pickers themselves are read-only dropdowns (no inline "custom name" affordance — kept the UI predictable: one place to manage models). Hand-edited config values that aren't in the registry are preserved as `(custom)` entries on first render so back-compat doesn't break. **Maintenance**: when an upstream agent CLI ships a new headline model, edit the relevant array in `seed-models.ts` and ship in the next release; user overrides survive the upgrade. **Ollama-models refresh** (item 6.33, v0.21.x): for `*-ollama` adapters the model list comes from `availableOllamaModels` in the global config (populated at `cfcf init`). To pick up newly-pulled ollama models, the list is auto-refreshed on every `cfcf server start` (`refreshOllamaModelsInConfig()` in `@cfcf/core`, called from `start.ts`) AND on demand via the "Refresh ollama models" button in the web UI Agent-roles section (Settings + workspace Config). `POST /api/agents/refresh-ollama-models` is the underlying endpoint.
- **Role-template management** (item 6.8): top-level **Agents** tab in the web UI lets users edit the instruction templates each role reads (`cfcf-architect-instructions.md`, `process.md` for the dev role, `cfcf-judge-instructions.md`, `cfcf-reflection-instructions.md`, `cfcf-documenter-instructions.md`). Each role has a bundled cf² default (always selectable, read-only) and any number of saved versions stored under `~/.cfcf/templates-managed/<name>/`. Two version types: **`full`** REPLACES the default (max flexibility, no auto-upgrade — UI shows a "Forked from cf² vX.Y.Z" badge); **`augmented`** APPENDS to the live default (extension-only on disk; composed at promote time + every server boot via `refreshAugmentedOverrides()` so cf² upgrades automatically propagate). Promoting a version writes the (composed-for-augmented) content to the existing user-global override path `~/.cfcf/templates/<name>` that `getTemplate()` already reads — no runtime changes to the agent-spawn pipeline. Reverting to default deletes the override. Project-local overrides at `<repo>/cfcf-templates/<name>` continue to take precedence (power-user escape hatch, unmanaged from the UI). Core: `packages/core/src/role-templates.ts`. API: `/api/role-templates/*`. UI: `packages/web/src/pages/AgentTemplates.tsx`. Design doc: `docs/design/role-template-management.md`.
- **Orphan agent-process reaper** (item 6.31, v0.21.0): cf² spawns agents with `detached: true` so SIGTERM at server shutdown can kill the whole process group. **Graceful shutdown** (SIGINT/SIGTERM) sends group SIGTERM, waits 1.5s, then group SIGKILL — handles the common case. **Boot-time orphan scan** (`refreshAugmentedOverrides`'s sibling, in `packages/core/src/orphan-reaper.ts`) closes the hard-crash hole: when the previous server died via SIGKILL or OS panic, its agent children get reparented to PID 1; on next `cfcf server start`, the boot scan finds them via three conjoined filters (PPID==1 + same effective user + cfcf-spawned command shape) and reaps them. Orphans on `ollama launch <agent>` are particularly important — they hold ollama's model serializer for up to 10 min after their inference times out. **`cfcf server reap`** is the manual interactive variant (list + y/N + kill); doesn't require the cfcf server to be running. See `findOrphanAgentProcesses()` + `reapOrphans()` in `@cfcf/core`.
- **Architect role is unattended for ALL invocation paths** (item 6.30, v0.21.0): the previous "manually-invoked SA via `cfcf review` is interactive" framing was wrong — verified in code that ALL architect spawns (pre-loop autoReviewSpecs, mid-loop refine_plan, manual `cfcf review`) use `spawnProcess` with `stdout: "pipe"` + log file. There's no `stdio: "inherit"` anywhere in the architect path; `cfcf review` is just a polling client to a server-side background spawn. So architect is in `UNATTENDED_ROLE_NAMES` always (no `autoReviewSpecs=true` gate); `cfcf doctor`'s harness check + the web UI's policy callout fire on architect+claude-code regardless of `autoReviewSpecs`; fresh `cfcf init` defaults architect to `codex` instead of `claude-code` (behaviour change for new installs only — existing user configs unaffected). Only **PA (`cfcf spec`)** and **HA (`cfcf help assistant`)** use `Bun.spawn(... { stdio: "inherit" })` and are within Anthropic's allowed-interactive scope.
- **Structured pause actions** (item 6.25, shipped 2026-05-02): when a paused loop is resumed via `cfcf resume --action <…>`, the user picks one of `continue` / `finish_loop` / `stop_loop_now` / `refine_plan` / `consult_reflection`. `loop-stopped` is a workspace-history event type for user-initiated `stop_loop_now`.
- **Three commits per iteration** when reflection runs: `cfcf iteration N dev (...)`, `cfcf iteration N judge (...)`, `cfcf iteration N reflect (<health>): <key_observation>`.
- **Async execution**: iterate endpoint returns 202, CLI polls for status.
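
The three conjoined filters named in the orphan-reaper bullet above (PPID==1, same effective user, cfcf-spawned command shape) can be sketched as a pure predicate over a process list. This is a hedged illustration: `ProcInfo`, `findOrphans`, and the command-shape matcher are hypothetical, and how the real `findOrphanAgentProcesses()` collects processes or recognises a cfcf-spawned command is not shown in this diff.

```typescript
// Hypothetical snapshot of one process-table row.
interface ProcInfo {
  pid: number;
  ppid: number;
  uid: number;
  command: string;
}

// All three conditions must hold for a process to count as an orphan.
function findOrphans(
  procs: ProcInfo[],
  currentUid: number,
  looksCfcfSpawned: (command: string) => boolean, // caller-supplied, assumed matcher
): ProcInfo[] {
  return procs.filter(
    (p) => p.ppid === 1 && p.uid === currentUid && looksCfcfSpawned(p.command),
  );
}
```

Requiring all three conditions together keeps the scan from touching other users' processes or unrelated daemons that also happen to be parented to PID 1.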
@@ -74,8 +77,25 @@ packages/
iteration-loop.ts # Main iteration loop controller + decision engine
# (preparing -> dev -> judging -> reflecting? -> deciding)
workspace-history.ts # history.json: review / iteration / reflection / document events
adapters/ # Agent adapter implementations (claude-code, codex)
adapters/ # Agent adapter implementations (claude-code, codex,
# opencode, claude-code-ollama, opencode-ollama —
# five total since item 6.28)
ollama-detection.ts # detectOllama, listOllamaModels (item 6.28),
# refreshOllamaModelsInConfig (item 6.33 boot-time
# refresh + button-triggered re-detection)
orphan-reaper.ts # findOrphanAgentProcesses + reapOrphans for boot-time
# reap of stale agents from a hard-crashed prior
# server PID (item 6.31, v0.21.0)
role-templates.ts # Role-instruction-template versioning + promote-to-
# production layer (item 6.8). Two types: full
# (replace default) + augmented (append to live
# default; auto-recomposed on cf² upgrade via
# refreshAugmentedOverrides at server boot).
templates/ # cfcf-docs/ file templates (17 entries incl. reflection + iteration-log + clio-guide)
templates.ts # Template resolver: project-local override → user-global
# override (~/.cfcf/templates/) → embedded default.
# getEmbeddedTemplate(name) exposes the bundled
# default verbatim for role-templates.ts.
clio/ # Clio memory layer (item 5.7)
backend/
types.ts # MemoryBackend interface (swap point for future CerefoxRemote)
@@ -110,7 +130,7 @@ packages/
commands/ # CLI command implementations
init.ts # First-run interactive setup (numbered agent picker, embedder
# pick + inline HF download with progress bar, error classifier)
server.ts # Server start/stop/status
server.ts # Server start/stop/status/reap (item 6.31 added `reap`)
workspace.ts # Workspace init/list/show/delete (--project for Clio assignment)
config.ts # Global config show/edit
run.ts # Start iteration loop (agent) or single iteration (manual)
@@ -131,7 +151,9 @@ packages/
# + cfcf memory alias
web/src/
App.tsx # Root router (dashboard / workspace / server)
pages/ # Dashboard, WorkspaceDetail, ServerInfo
pages/ # Dashboard, WorkspaceDetail, ServerInfo, MemoryPage,
# HelpPage, AgentTemplatesPage (item 6.8 — /agents
# route, role-template management UI)
components/ # Header, PhaseIndicator, WorkspaceHistory,
# ArchitectReview, JudgeDetail, ReflectionDetail, …
api.ts # Client for all /api/* endpoints incl. /activity + /reflect
2 changes: 1 addition & 1 deletion README.md
@@ -32,7 +32,7 @@ cfcf can be driven from the CLI or from the web GUI served by the same Hono serv
- **[Opencode](https://opencode.ai)** (sst.dev) — alternative to Codex; runs against your provider's API or via local ollama models
- **[Ollama](https://ollama.com)** — optional, enables `claude-code-ollama` and `opencode-ollama` adapters that drive the agent CLI against locally-served models

> ⚠️ **Anthropic policy + log-visibility heads-up.** `claude-code` (direct, talking to Anthropic's API/subscription) is **only recommended for interactive roles**: Product Architect (`cfcf spec`), Help Assistant (`cfcf help assistant`), and manually-invoked Solution Architect (`cfcf review`). Anthropic's third-party-harness policy restricts subscription OAuth to interactive use; cf²'s unattended dev / judge / reflection / documenter loop is the violation pattern the rule targets. **`claude-code-ollama` is policy-clean** (no Anthropic credential involved — local ollama serves the model) but shares Claude Code's `-p` stdout-buffering behaviour: log files stay silent during the entire run and dump the final response only when the agent exits. If you want live progress monitoring, use `codex` or the opencode adapters for unattended roles. See [`docs/guides/anthropic-policy.md`](docs/guides/anthropic-policy.md) for the full breakdown + the recommended adapter-per-role table. cf² surfaces warnings in `cfcf init` and the web UI Settings / workspace Config when these recommendations are violated; neither blocks the choice.
> ⚠️ **Anthropic policy + log-visibility heads-up.** `claude-code` (direct, talking to Anthropic's API/subscription) is **only recommended for the two truly-interactive roles** that take over your shell via `stdio: "inherit"`: Product Architect (`cfcf spec`) and Help Assistant (`cfcf help assistant`). All other roles — dev, judge, reflection, documenter, **and Solution Architect** — are spawned headlessly with `claude -p` regardless of how they're invoked (including `cfcf review`, which polls a status endpoint while the server runs the architect in the background). Anthropic's third-party-harness policy restricts subscription OAuth to interactive use, so unattended `claude -p` under a personal subscription is the violation pattern. **`claude-code-ollama` is policy-clean** (no Anthropic credential involved — local ollama serves the model) but shares Claude Code's `-p` stdout-buffering behaviour: log files stay silent during the entire run and dump the final response only when the agent exits. If you want live progress monitoring, use `codex` or the opencode adapters for unattended roles. See [`docs/guides/anthropic-policy.md`](docs/guides/anthropic-policy.md) for the full breakdown + the recommended adapter-per-role table. cf² surfaces warnings in `cfcf init` and the web UI Settings / workspace Config when these recommendations are violated; neither blocks the choice.

cfcf is distributed as a standard npm package (`@cerefox/codefactory`); `bun install -g` resolves the heavy native deps (transformers, ORT, sharp) the same way every JS-ecosystem CLI does. A per-platform `@cerefox/codefactory-native-<platform>` package provides the pinned libsqlite3 + sqlite-vec libs. See [`docs/guides/installing.md`](docs/guides/installing.md) for the install one-liner + local / file-URL install paths.
