feat: containerize squire as a stack + consolidate dashboard into monorepo #1

Open
danmaxis wants to merge 78 commits into main from feat/container-stack

@danmaxis

Owner

Summary

Two related changes that turn squire + dashboard into a single, isolated, self-contained unit.

1. Containerized two-container stack (9ff99a7)

A headless workspace container (orchestrator + command-queue agent + sshd + toolchains: python/git/node/claude/opencode) supervised by s6-overlay, plus the existing dashboard image, as a deploy/docker-compose.yml stack sharing a named squire-state volume.

  • Isolates the inner loop's semi-autonomous LLM-generated code from the host. SSH :2222, dashboard :3101, hard memory caps (mem_limit/memswap_limit, no host swap).
  • Replaces the systemd --user agent unit; the agent runs as uid 1000.
  • Docker-in-Docker escalation seam: tasks whose tests/build invoke Docker (unavailable by design) are blocked with a requires_container_build critical alert instead of burning attempts. The host docker socket is never mounted.
  • Claude credentials are mounted read-only; stale locks left behind by a container restart are handled automatically.

2. Consolidate dashboard into squire/dashboard/ (b638373, fdea4f9)

The dashboard was a sibling repo; the stack hardcoded ../../squire-dashboard and the JSON schema was hand-mirrored across two repos.

  • Imported via git subtree — full 40-commit dashboard history preserved under dashboard/.
  • Stack build context repointed to ../dashboard; dashboard/ excluded from the workspace image context.
  • Docs swept (PT+EN): READMEs, CLAUDE.md (dir layout, doc-sync table, deploy paths), architecture narrative, example repo_path hygiene.

Verification

  • docker compose -f deploy/docker-compose.yml up -d → both containers healthy; dashboard serves the migrated projects.
  • Python suite: 462 passed (incl. 5 new DinD-seam tests); squire doctor: 0 fail.
  • .dockerignore honored — workspace build context stays minimal (no dashboard/node_modules bloat).
  • Stale cross-repo path references: none remaining.

Notes

  • The old /home/ai-debian/squire-dashboard folder is left on disk as rollback; remove after soak.
  • Dashboard vitest not re-run in CI here — the subtree import is byte-identical to what already passed and the image builds/runs healthy.

🤖 Generated with Claude Code

Your Name and others added 30 commits March 25, 2026 15:57
Implements the full orchestrator dashboard — a Next.js app that reads
orchestrator state JSON files from the filesystem and displays them in
real time.

Components and features:
- GlobalStats, AlertBanner, ProjectCard, TaskList, Timeline, CommitLog,
  Sidebar, RefreshIndicator, RefreshController
- useAutoRefresh hook: polls via router.refresh() on a configurable
  interval (NEXT_PUBLIC_REFRESH_INTERVAL, default 30s); uses useState
  for reactive re-renders; setInterval for stable polling cycle
- Main page: server component reads projects/ dir and renders cards;
  RefreshController mounts as client island for the polling indicator
- Containerization: multi-stage Dockerfile (node:20-alpine, standalone
  output, non-root user); docker-compose.yml mounts
  /mnt/user/data/orchestrator as /data:ro on port 3100:3000 with
  healthcheck against /api/health; .dockerignore included

To run locally:
  ORCHESTRATOR_DATA_PATH=../orchestrator-state npm run dev

To deploy (ask Danilo to create the container on Unraid):
  Image: orchestrator-dashboard  Port: 3100:3000
  Volume: /mnt/user/data/orchestrator:/data:ro
  Env: ORCHESTRATOR_DATA_PATH=/data

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rewrite types.ts to exactly mirror squire Python models (snake_case,
  correct enums: ProjectStatus/TaskStatus/CursorStep/Actor/EventType/etc.)
- Rewrite data.ts: import from types.ts, correct JSON wrappers
  (AlertList/TaskList/History/CommitLog), correct projects/ path segment
- Add vitest infrastructure: vitest.config.ts, src/test-setup.ts,
  package.json test scripts, tsconfig.json vitest/globals types
- Add fixture data in fixtures/data/ matching squire serialization format
- Add tests: 18 type-contract tests + 6 data-function tests (24 total)
- Add CLAUDE.md with squire data contract and homologation checklist

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GlobalStats: convert to RSC, accept stats prop, correct snake_case fields
- TaskList: add effort badges, TDD indicator, rejection counter, no-progress warning
- Timeline: use HistoryEvent types with Actor colors and EventType icons
- Add TaskList.test.tsx with 10 passing tests
- Remove old project/[id]/page.tsx stub (replaced by projects/[id] in task-006)
- Fix squire inner_loop: skip --watchAll=false for vitest projects
- Fix test script: vitest run (squire adds --passWithNoTests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…+ tests

- RateLimitGauge: visual progress bar showing Claude Code call budget,
  color-coded green/yellow/red by usage %, with window countdown
- CheckpointPanel: deep-dive into squire session state — cursor, LLM context,
  rate limit, recovery section with amber banner on escalation
- TDDProgressBar: horizontal step chain for TDD workflow phases
  (planning→red_phase→llm_execution→testing→homologation→completed)
- Add tests: 6 RateLimitGauge + 6 CheckpointPanel tests passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- TDDProgressBar: finalize component with correct step highlighting
  and contextual sub-labels (attempt counts, test pass/fail, etc.)
- Add TDDProgressBar.test.tsx: 8 tests covering render, active step,
  hidden when tdd=false, and sub-label display

All 9 planned tasks complete. 54 tests passing, tsc clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- page.tsx: fix DATA_PATH (fixtures/data), alerts wrapper, stats prop
- projects/[id]/page.tsx: replace hardcoded stub with real async reads
- GlobalStats: convert to server component accepting stats prop;
  fix snake_case fields, add cost card, handle projects_touched_today array
- CommitLog: add 'use client' (uses useState)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ator

- AlertBanner: convert to client component, add per-alert × button with
  sessionStorage persistence so each alert is shown only once per session
- useAutoRefresh: fire first refresh 500ms after mount instead of waiting
  the full interval, eliminating the --:--:-- state on initial load
- page.tsx: add pt-24 when alerts are present to prevent banner overlap
- Add tests: AlertBanner (9), RefreshIndicator (6), useAutoRefresh (7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- AlertBanner: add hasMounted pattern — returns null until useEffect fires,
  preventing SSR hydration flash that made dismissed alerts reappear on F5
- Switch sessionStorage → localStorage so dismissed alerts persist across
  browser sessions (not just while the tab is open)
- Rename timestamp → created_at in AlertBanner and page.tsx AlertJson
  interface to match fixtures/data/alerts.json and src/lib/types.ts,
  fixing the "Invalid Date" display bug

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Import Alert from @/lib/types instead of defining a local interface —
  eliminates the project/task vs project_id/task_id mismatch that caused
  the empty " - " header and broke dismiss (no id field in real data)
- Remove AlertJson local interface from page.tsx, use Alert directly
- Use composite key project_id::task_id::created_at for localStorage
  dismiss since Alert has no id field in the squire schema

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Timeline:
- Convert to client component and add pageSize prop (default 20)
- Show only the most recent pageSize events; "Ver mais N eventos"
  button loads the next page progressively (accordion, no collapse)

CommitLog:
- Replace binary show-all toggle with progressive visibleCount state
- Add pageSize prop (default 20) replacing hardcoded MAX_VISIBLE
- Remove internal duplicate <h2> heading (page already has one)
- Import CommitSummary from @/lib/types instead of local Commit interface
- "Ver mais N commits" button follows same pattern as Timeline

Tests: add Timeline.test.tsx (8 tests) and CommitLog.test.tsx (10 tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
getCommits() now falls back to running git log on the project's
repo_path when no commits.json file exists. The squire currently
does not generate commits.json, so this makes CommitLog functional
for all existing projects.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Show first 3 files by default; "+ N arquivo(s)" button reveals the
rest. "Ver menos" collapses back. Each commit tracks its own expanded
state independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Consolidate project naming under the squire-* prefix to match
SQUIRE_* env vars, squire-state directory and the squire CLI.

- package.json + container_name + docker-compose service
- ORCHESTRATOR_DATA_PATH -> SQUIRE_DATA_PATH (data.ts, page.tsx, .env.local, docker-compose)
- Fix docker volume path: /mnt/user/data/orchestrator -> /home/ai-debian/squire-state
- Layout title, sidebar brand, README, CLAUDE.md
- Rename fixtures project folder + fixture project_id/projects_touched_today

102 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror the pilot-project rename in CLAUDE.md, squire CLI usage examples,
and bilingual docs (cli, configuration, state-and-recovery). The Dashboard
env var becomes SQUIRE_DATA_PATH and the state volume is
/home/ai-debian/squire-state (was /mnt/user/data/squire in the original spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lback

P1 hygiene + orphan-component wiring for the squire dashboard.

Sidebar (P1.2):
- Convert to server component fed by getProjects(); split off SidebarShell
  (client) for hamburger + active-link state.
- Replace mock 4-project status union with real ProjectStatus enum.
- Mount in layout.tsx so every page gets the project nav.
- Drop the fake "Admin User" footer.

CheckpointPanel (P1.3): mount RateLimitGauge under the session row so the
Claude Code window budget is visible alongside the cursor state.

Project detail (P1.3): mount TDDProgressBar above the CheckpointPanel,
auto-shown only when the cursor is in red_phase/llm_execution/testing/
homologation for a tdd task — picks the current task from checkpoint.

Data layer (P1.5): drop the getCommitsFromGit fallback now that squire
writes commits.json. Smaller container, no shell exec at runtime.

Tests: all 102 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1.5: squire now writes projects/<id>/commits.json (CommitLog model)
after every successful task commit. The dashboard reads it directly
instead of shelling git log in the container. Implementation:
checkpoint.save_commits + Squire._refresh_commits_json invoked from
_commit_task_completion.

P1.4: refresh source anchors that drifted in the bilingual docs:
  models.py:139 -> 143 (HistoryEvent)
  models.py:171 -> 175 (Cursor)
  inner_loop.py:71  -> 72  (snapshot_test_hashes)
  inner_loop.py:80  -> 81  (check_test_integrity)
  inner_loop.py:243 -> 245 (viking injection)
  inner_loop.py:252 -> 254 (permanent restrictions block)
  squire.py:88   -> 100  (_account_call)
  squire.py:113  -> 154  (_auto_snapshot_commit)
  squire.py:162  -> 203  (_commit_task_completion)
  squire.py:252  -> 293  (_is_looping)
  squire.py:482  -> 545  (_wait_productively)
  squire.py:504  -> 567  (_pre_homologation_checks)
  squire.py:889  -> 951  (is_early_escalation)
  squire.py:1087 -> 1179 (NATO confirmation)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 — bring the dashboard to parity with squire's evolved state model.

types.ts: extend Task with max_usd/cost_usd, RateLimitState with
max_daily_usd/daily_cost_usd/daily_cost_date, GlobalStats with
daily_tokens/cost_by_model/daily_calls_unknown_cost. Test fixtures
updated accordingly.

BudgetCard (new): progress ring of today's spend vs daily cap, with
per-model breakdown bars. Active cap is read from the project
checkpoint with the highest daily_cost_usd. Mounted on the home page
alongside GlobalStats in a 2:1 grid.

GlobalStats: rewrite as a 6-tile grid (projetos tocados, tasks hoje,
chamadas LLM, aprovação 1ª, tokens hoje, calls sem custo). Drop the
duplicate cost tile now that BudgetCard owns that area.

TaskList: add cost chip ($cost / $cap, red on overrun), fast-track
badge for skip_homologation, escalated no-progress vs loop badges
(amber at >=2, red "Loop" at >=3). Two new tests cover both.

Timeline: group events by session_started/session_resumed boundaries,
collapsible per session (latest auto-expanded), pagination follows
opened sessions.

Project detail: add coding_backend pill to the header (⚙ opencode).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bilingual architecture docs now describe the squire-dashboard write API
routes (alerts/ack, projects/<id>/budget, tasks/<taskId>/action), the
session.lock 409 guard, and that squire remains the sole writer while
holding the lock — the dashboard only mutates between sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3 turns the dashboard into an ops console while staying read-only-by-default
when squire is running.

Write actions (P3.1):
- POST /api/alerts/ack {project_id, task_id, created_at, dismiss?}
  marks an alert acknowledged or removes it from alerts.json.
- POST /api/projects/<id>/budget {max_daily_usd?, max_calls_per_window?}
  patches Checkpoint.rate_limit.
- POST /api/projects/<id>/tasks/<taskId>/action {action: retry|approve|skip}
  resets / force-approves / fast-tracks a task.

All routes funnel through writeJsonAtomic (.tmp + rename, matching
checkpoint.atomic_write_json) and check session.lock — if squire is
holding the lock for the target project, they refuse with 409.
New helpers: lib/atomic.ts, lib/squireLock.ts, lib/squireStatePath.ts.
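
The ".tmp + rename" pattern named above can be sketched in Python as follows. This is only an illustration of the technique, not the body of checkpoint.atomic_write_json or writeJsonAtomic; the function name `atomic_write_json_sketch` and the fsync step are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json_sketch(path: Path, payload: dict) -> None:
    """Hypothetical helper: write to a sibling .tmp file, then rename over the target."""
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())  # assumption: force the bytes to disk before the swap
    # os.replace is atomic when source and target live on the same filesystem,
    # so a concurrent reader sees either the old file or the new one, never a partial write.
    os.replace(tmp, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as state_dir:
        target = Path(state_dir) / "alerts.json"
        atomic_write_json_sketch(target, {"alerts": []})
        print(json.loads(target.read_text(encoding="utf-8")))
```

Keeping the temporary file next to the target matters: a rename across filesystems is a copy, which would lose the atomicity the pattern relies on.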

UI integration:
- AlertBanner gets Ack + Descartar buttons that hit /api/alerts/ack.
  Local localStorage fallback preserves UX when the network errors.
- TaskList rows get a ⋯ menu (Resetar tentativas, Aprovar manualmente,
  Pular homologação), gated behind window.confirm and refreshed via
  router.refresh() on success.

Hot-view polling (P3.2):
- RefreshController accepts a `hot` prop; project detail page passes
  `hot={checkpoint.phase === "implementing"}` to drop the poll to 5s
  while squire is actively working. RefreshIndicator shows LIVE/IDLE.

HealthStrip (P3.5):
- Server component mounted in layout.tsx. Reads session.lock mtime +
  TTL, alerts count, and the max budget cap across projects. Sticky
  top bar shows status dot, budget %, alert count. aria-live="polite".

Test setup: vitest mocks next/navigation globally so client components
using useRouter render in JSDOM. AlertBanner tests updated to stub fetch
+ await the async ack flow.

102 + 2 new tests = 104 passing. Production build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes upstream-recommendations #1 and #3 raised by ../integration-squire.

SessionLock now carries a structured `project_id` field so dashboard
consumers can exact-match instead of substring-parsing `holder` — the
old approach 409'd writes across projects sharing a prefix (e.g. a
lock on `proj-happy` blocked `proj`).
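
A minimal sketch of why the substring check misfired and how the structured field avoids it. `LockSketch` is a hypothetical stand-in for SessionLock, and the `holder` string format shown is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LockSketch:
    """Hypothetical stand-in for SessionLock; only two fields are modeled."""
    holder: str       # free-form string (format assumed here)
    project_id: str   # structured field added by this change


def blocks_by_substring(lock: LockSketch, project_id: str) -> bool:
    # Old approach: any project id that appears inside `holder` counts as locked.
    return project_id in lock.holder


def blocks_by_exact_match(lock: LockSketch, project_id: str) -> bool:
    # New approach: compare the structured field directly.
    return lock.project_id == project_id


lock = LockSketch(holder="squire:proj-happy:sess-1", project_id="proj-happy")
print(blocks_by_substring(lock, "proj"))          # True: false positive, "proj" is a prefix
print(blocks_by_exact_match(lock, "proj"))        # False
print(blocks_by_exact_match(lock, "proj-happy"))  # True
```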

commits.json is now always written: at session start (so brand-new
projects have the file) and after each task, with an explicit `error`
field on git failure so the dashboard can distinguish "no commits yet"
from "file disappeared". Previously the writer swallowed git errors
silently and the dashboard masked ENOENT as an empty list.

PT and EN state-and-recovery docs synced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Cursor` moved from models.py:175 to :178 after CommitLog gained an
`error` field; `TokenUsage` moved from :266 to :270 after SessionLock
gained `project_id`. Updates PT + EN architecture and cost-and-budget
docs to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Function definitions had drifted from the line numbers cited in
PT and EN docs. All anchors now verified to land on the exact
def line (or referenced statement). Mirror in both languages.

  squire.py:100  → :102   _account_call
  squire.py:154  → :156   _auto_snapshot_commit
  squire.py:203  → :205   _commit_task_completion
  squire.py:293  → :348   _is_looping
  squire.py:545  → :600   _wait_productively
  squire.py:567  → :622   _pre_homologation_checks
  squire.py:951  → :1003  effort:low early-escalation block

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The LiteLLM gateway on Zordon:4000 is gone; the local LLM now runs via
Ollama at 192.168.50.24:11434/v1 serving journal-synth:latest (same
Qwen3.5-35B-A3B, num_ctx 98304). Machine-specific endpoint config moves
to a gitignored .env with guarded exports (shell env wins), sourced by
the squire wrapper at startup. opencode.json (~/.config) repointed too.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
config.py now defaults STATE_ROOT to /home/ai-debian/squire-state (same
default the bash wrapper injects) instead of raising KeyError on import.
tests/conftest.py points the suite at a tmp dir before config import, so
bare pytest runs out of the box and never touches real state. Also
refresh pre-existing stale config.py:73 anchors (pricing table is :111).
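
The difference between raising KeyError on import and falling back to a default can be sketched like this. The helper name `resolve_state_root` is hypothetical, and the sketch assumes the variable is SQUIRE_STATE_ROOT; the real config.py may be structured differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default taken from the commit message above.
DEFAULT_STATE_ROOT = "/home/ai-debian/squire-state"


def resolve_state_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Hypothetical helper: env[...] raises KeyError when unset, .get() does not."""
    env = os.environ if env is None else env
    return Path(env.get("SQUIRE_STATE_ROOT", DEFAULT_STATE_ROOT))


print(resolve_state_root({}))                                     # falls back to the default
print(resolve_state_root({"SQUIRE_STATE_ROOT": "/tmp/squire-test"}))  # explicit value wins
```

A test suite can then set the variable to a temporary directory before the config module is imported, which is the role the message assigns to tests/conftest.py.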

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
New alerts_cli.py (modeled on tasks_cli.py) gives the CLI parity with
the dashboard for alert triage: list pending with 1-based indexes, ack
by index or --project/--task selectors, rm by index/--acked/--all. Same
acknowledged field and atomic-write contract the dashboard uses; docs
note the index-race caveat when both writers are active.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
doctor.py runs every precondition a session needs: writable state root,
LLM endpoint + configured models, claude binary, backend binaries in
use, session.lock pid/TTL staleness, llm.lock flock state, per-project
sanity (git repo, dirty tree, blocked tasks, dead-but-resumable
sessions), pending alerts, and stats freshness. Exit 1 on any FAIL.
--fix removes only provably dead/free locks, never live ones.
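
One conservative way to decide that a lock is "provably dead" is to require both an expired TTL and a missing process. The sketch below assumes a POSIX host and a lock that records a pid and an acquisition time; the function names and the exact rule are assumptions, not doctor.py's actual logic.

```python
import os
import time
from typing import Optional


def pid_is_alive(pid: int) -> bool:
    """POSIX-only check: signal 0 probes for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_provably_dead(pid: int, acquired_at: float, ttl_s: float,
                          now: Optional[float] = None) -> bool:
    """Hypothetical rule: removable only if the TTL expired AND the holder is gone."""
    now = time.time() if now is None else now
    expired = (now - acquired_at) > ttl_s
    return expired and not pid_is_alive(pid)


# A lock held by this very process is never considered dead, even when expired.
print(lock_is_provably_dead(os.getpid(), acquired_at=0.0, ttl_s=1.0, now=100.0))  # False
```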

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
approval_first_try_rate was never computed; it only ever held the model default.
GlobalStats gains tasks_homologated_today and
tasks_approved_first_try_today; _record_completion_stats updates them on
task completion and derives the rate, excluding skip_homologation tasks
(auto-approved, would inflate it). Adds a regression test proving a
claude --print JSON envelope flows through _account_call into
cost_estimate_usd/cost_by_model (the stale 0.0 in production predates
cost tracking). Also refresh doc anchors shifted by the new helper.
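
The rate derivation described above can be sketched as follows. `StatsSketch` and `record_completion` are hypothetical names, "zero rejections" is an assumed definition of a first-try approval, and the 0-100 scale follows a later commit message in this PR.

```python
from dataclasses import dataclass


@dataclass
class StatsSketch:
    """Hypothetical subset of GlobalStats with the two new counters."""
    tasks_homologated_today: int = 0
    tasks_approved_first_try_today: int = 0
    approval_first_try_rate: float = 0.0  # 0-100 scale


def record_completion(stats: StatsSketch, *, skip_homologation: bool,
                      rejection_count: int) -> None:
    if skip_homologation:
        return  # auto-approved tasks would inflate the rate, so they are not counted
    stats.tasks_homologated_today += 1
    if rejection_count == 0:  # assumption: first try means no prior rejection
        stats.tasks_approved_first_try_today += 1
    stats.approval_first_try_rate = (
        100.0 * stats.tasks_approved_first_try_today / stats.tasks_homologated_today
    )


stats = StatsSketch()
record_completion(stats, skip_homologation=False, rejection_count=0)
record_completion(stats, skip_homologation=False, rejection_count=2)
record_completion(stats, skip_homologation=True, rejection_count=0)
print(stats.approval_first_try_rate)  # 50.0: one of two homologated tasks passed first try
```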

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The container runs on the Ai-Debian VM (squire-state lives on its local
disk; Unraid can't mount it). Port 3101 because 3100 is taken by
browserless. Volume goes rw and the container runs as uid 1000: the
dashboard is a second writer (POST /api/alerts/ack) and state files are
0600 ai-debian. Also sync package-lock.json with package.json (npm ci
failed in the image build) and add the public/ dir the Dockerfile copies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Your Name and others added 24 commits June 11, 2026 23:15
Blocked tasks now show their full rejection history from
homologation_log.json (per-round cards with collapsible feedback and
highlighted fix_suggestion; 'via fix' badge; fallback to the plain
rejection_summaries for pre-log tasks) plus a 'Corrigir com Claude'
button that enqueues fix_task for the host agent. Timeline homologation
events expand to the full verdict when a log entry matches. Home page
distinguishes 'Bloqueado' from 'Pausado' and shows a blocked-task count
badge per project. fix_task is whitelisted in /api/commands with the
same lock pre-check as run/resume (the fix cycle holds the session lock).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Route audit confirms all state-reading routes are dynamic (the static-
prerender bug class is closed). Error boundaries at app root and
projects/[id] turn malformed-state crashes into a recoverable card with
guidance instead of a bare 500. All session-era components (forms,
panels, login, menus) gain dark: variants matching the shell palette;
login converts from always-dark zinc to the standard light+dark pattern.
useCommandPoll tolerates 3 transient network failures before giving up
and its client timeout grows to 15 min (fix_task can run ~13 min on the
host). authedFetch rewrites 503 writes_disabled into an actionable
message instead of a generic error. TaskForm closes on Escape.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
DATA_PATH was defined in 4 places (squireStatePath, squireLock, data,
home page) — single source in squireStatePath now. The token+lock 409
guard copy-pasted across 5 mutation routes becomes
guardProjectWrite(req, projectId) in lib/auth. The hand-rolled enqueue
POST repeated in 5 components becomes enqueueCommand() in clientApi.
Behavior-neutral: same status codes, 165/165 tests unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Squire._account_call and fix_cli.account_usage now share
accounting.record_usage; the guarded add+commit duplicated between the
orchestrator and the fix cycle becomes gitops.commit_all (also
simplifies _commit_task_completion, whose two branches were identical).
Behavior-neutral; snapshot tests updated to the new seam.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
11 specs over next dev on :3199 with SQUIRE_DATA_PATH pointing at a
state fixture rebuilt by global-setup: login (bad/good token), home
(projects, Bloqueado badge, blocked count, alert banner), full task
CRUD through the modal, blocked-task triage (full verdicts from
homologation_log + 'Corrigir com Claude' writing a fix_task to
commands/pending), run controls disabled under a seeded session lock +
409 from the API, alert ack persisting to disk (with state restore so
specs stay independent), and a local-only full roundtrip that runs the
real 'squire agent --once' to execute a UI-enqueued new_project.
npm run test:e2e; e2e/ excluded from vitest; chromium only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The fix cycle aborted instantly with 'no files written' because the
systemd user agent's default PATH lacks ~/.local/bin (claude) — and
implement_directly swallowed the FileNotFoundError silently. It now
prints the actual cause (timeout / non-zero exit + stderr / missing
binary) so the command result tail tells the truth. The squire-agent
unit gains an explicit PATH with claude and opencode.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The recurring 'Parse error: Expecting value: line 1 column 1' (seen
across claw-code-study history and twice in tonight's fix cycle) is
Claude wrapping the verdict JSON in prose. _parse_response now falls
back to raw_decode-scanning for the first valid JSON object before
declaring an infra error — pure-prose responses still classify as infra.
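
The raw_decode scan can be sketched with the standard library alone. This is an illustration of the fallback, not the actual _parse_response code; the function name `first_json_object` and the verdict field names in the example are hypothetical.

```python
import json
from typing import Optional


def first_json_object(text: str) -> Optional[dict]:
    """Return the first valid JSON object embedded in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at `pos` and ignores trailing text.
            obj, _end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            pass  # this brace did not start valid JSON; try the next one
        else:
            if isinstance(obj, dict):
                return obj
        pos = text.find("{", pos + 1)
    return None


wrapped = 'Here is my verdict:\n{"approved": false, "feedback": "tests missing"}\nHope that helps.'
print(first_json_object(wrapped))            # {'approved': False, 'feedback': 'tests missing'}
print(first_json_object("No JSON at all."))  # None, which the caller can treat as an infra error
```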

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Doc-heavy tasks blow past the hardcoded 300s (tonight's fix cycle timed
out generating a multi-crate summary). SQUIRE_IMPLEMENT_TIMEOUT, default
600s, still inside the agent's 1200s command timeout.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…mpty parses

A surgical fix request came back as prose (successful response, zero
filepath blocks parsed). The prompt now states explicitly that even
one-line changes must return the whole modified file, and an empty parse
logs the first 300 chars of the response so the cause is visible in the
command result.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The root layout's Sidebar and HealthStrip read squire state from the
filesystem on every route, but 4 statically-prerendered pages (/login,
/projects/new, /dashboard, /_not-found) froze them at build time — empty
project list and a permanent 'Sem sessão / sem lock' regardless of
reality (the user-reported symptoms). force-dynamic on the root layout
kills the whole class; build now has zero static pages, and the new
chrome.spec e2e guards the previously-static routes.

HealthStrip copy fixed via a pure sessionStatus() helper ('Squire
ocioso / nenhuma sessão ativa'; active sessions show the running
project_id instead of the sess-id). New src/lib/statusMaps.ts puts all
enum labels in PT and replaces the English sidebar pills, raw task
status chips, and raw cursor.step badge (raw values kept in title=).
Scratch project teste-dashboard removed from live state.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…Card dark mode

Yesterday's squire change moved approval_first_try_rate to a 0-100
scale; the GlobalStats tile still multiplied by 100 and showed 5000%.
Regression test pins the scale. ProjectCard (missed by the dark-mode
pass) gets variants — titles were dark-on-dark.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rate

cost_by_model stayed empty forever because recent claude --print
envelopes carry no top-level 'model' — fall back to the first modelUsage
key. Approval-rate counters move to accounting.record_homologated,
shared by the orchestrator and the fix cycle (approved fixes now count
as homologated, never first-try).
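
The model fallback can be sketched as a small lookup. The function name `model_from_envelope`, the "unknown" default, and the placeholder model names are assumptions; only the top-level 'model' field and the modelUsage mapping come from the message above.

```python
def model_from_envelope(envelope: dict) -> str:
    """Hypothetical helper: prefer top-level 'model', else the first modelUsage key."""
    model = envelope.get("model")
    if model:
        return model
    usage = envelope.get("modelUsage") or {}
    # dicts preserve insertion order, so this is the first key as serialized
    return next(iter(usage), "unknown")


print(model_from_envelope({"model": "model-a"}))                         # model-a
print(model_from_envelope({"modelUsage": {"model-b": {}, "model-c": {}}}))  # model-b
print(model_from_envelope({}))                                           # unknown
```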

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cliHints derives context-aware CLI commands from the state the page
already loads (same heuristics as squire doctor): kill when this project
runs, resume when a dead session is resumable, run/bg when idle with
pending tasks, unblock when blocked tasks exist; per task, blocked →
fix/unblock/reset with ids filled in, dead-session task → resume.
CommandChips copies on click with an execCommand fallback (the LAN
deploy is plain HTTP — navigator.clipboard needs a secure context).
Shown on the project page and inside expanded tasks/BlockedTaskPanel;
healthy tasks show nothing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Headless workspace container (orchestrator + command-queue agent + sshd +
toolchains) supervised by s6-overlay, plus the existing dashboard image, as a
docker-compose stack sharing a named squire-state volume. SSH :2222, dashboard
:3101, hard memory caps. Replaces the systemd --user agent unit.

Adds a Docker-in-Docker escalation seam: tasks whose tests/build invoke Docker
(unavailable by design) are blocked with a requires_container_build critical
alert instead of burning attempts. Never mounts the host docker socket.

Docs (PT+EN) and doctor updated; container restart stale-lock auto-handled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
git-subtree-dir: dashboard
git-subtree-mainline: 9ff99a7
git-subtree-split: dfd52f6
Repoint the stack's dashboard build context to ../dashboard (was the sibling
repo ../../squire-dashboard), exclude dashboard/ from the workspace image
context, and update all docs (PT+EN) for the in-repo layout: READMEs repo
structure, CLAUDE.md directory block + doc-sync table + deploy paths,
architecture narrative, and example repo_path hygiene. The dashboard tree was
imported with full history via git subtree in the preceding merge commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@sourcery-ai sourcery-ai Bot left a comment

Sorry @danmaxis, your pull request is larger than the review limit of 300000 diff characters

Copilot AI left a comment

⚠️ Human review recommended

This PR combines broad infra/containerization changes with a large subtree import and includes at least one credential-handling hardening issue that should be validated by a human end-to-end.

Pull request overview

This PR turns squire + dashboard into a single deployable unit by (1) introducing a two-container Docker Compose stack (workspace + dashboard) backed by shared named volumes, and (2) importing the previously-separate dashboard repo into this repo (via subtree) so build contexts and schemas stay in sync. It also adds supporting runtime features (command queue, richer homologation logging, infra retry semantics, accounting helpers) plus extensive test and docs updates.

Changes:

  • Add deploy/docker-compose.yml stack for an isolated workspace container (s6-supervised agent + sshd) plus the dashboard container, sharing squire-state.
  • Introduce command-queue and homologation enhancements (infra vs config error kinds, free infra retry, full verdict logging, stats accounting), with new tests/docs.
  • Vendor dashboard into dashboard/ with Next.js API routes for state mutation (token-gated) and local E2E/unit tests.

File summaries

| File | Description |
| --- | --- |
| tests/test_homologator.py | Adds tests for error_kind classification and tolerant JSON extraction behavior. |
| tests/test_homologation_retry.py | New tests validating one free retry for infra failures in homologation. |
| tests/test_homologation_log.py | New tests for persisting full homologation verdict entries and caps. |
| tests/test_cost_budget.py | Adds regression tests for GlobalStats cost accumulation and approval rate math/model fallback. |
| tests/test_backends.py | Adds tests asserting actionable HTTP error messages from LiteLLM backend. |
| tests/test_backends_parser.py | Adds tests for relpath validation and junk/traversal write protections. |
| tests/conftest.py | Sets default SQUIRE_STATE_ROOT for tests to avoid touching real state. |
| tasks.json | Updates a task description label (“Orchestrator” → “Squire”). |
| README.md | Documents new repo layout entries (dashboard, deploy, docker, workspace Dockerfile). |
| README.en.md | English mirror of repo layout updates. |
| models.py | Adds HomologationLog models, structured project_id in SessionLock, command queue models, and new GlobalStats counters. |
| inner_loop.py | Exposes run_tests() as a public API (for squire fix). |
| gitops.py | New guarded git helper for add+commit used by orchestrator/fix flows. |
| docs/troubleshooting.md | Adds “start with doctor” guidance and updates internal line references. |
| docs/tasks.md | Updates line reference(s) after code movement. |
| docs/padrao-viking.md | Updates line reference(s) after code movement. |
| docs/homologacao.md | Documents infra retry semantics, container-build alert seam, and homologation log format. |
| docs/en/viking-pattern.md | English mirror: viking pattern reference updates. |
| docs/en/troubleshooting.md | English mirror: troubleshooting updates and references. |
| docs/en/tasks.md | English mirror: tasks doc reference updates. |
| docs/en/homologation.md | English mirror: homologation semantics/log docs. |
| docs/en/cost-and-budget.md | English mirror: updated references for pricing/extraction/accounting. |
| docs/en/backends.md | English mirror: path validation and actionable HTTP errors docs + reference updates. |
| docs/custos-e-orcamento.md | PT mirror: updated references for pricing/extraction/accounting. |
| docs/backends.md | PT mirror: docs for path validation + actionable HTTP errors + reference updates. |
| docker/services.d/sshd/run | Adds s6 service to run sshd in foreground. |
| docker/services.d/squire-agent/run | Adds s6 service to run squire agent with correct env/PATH and uid. |
| docker/profile.d/squire.sh | Ensures interactive SSH shells get PATH and runtime env via /etc/squire.env. |
| docker/cont-init.d/00-init | Container init: ssh keys, volume ownership, authorized_keys perms, emit runtime env file. |
| deploy/docker-compose.yml | New stack definition for workspace + dashboard with resource caps and shared volumes. |
| deploy/.env.example | Documents required deploy-time write token env for dashboard writes. |
| dashboard/vitest.config.ts | Adds Vitest config for dashboard unit tests. |
| dashboard/tsconfig.vitest.json | TS config for Vitest config compilation. |
| dashboard/tsconfig.json | Dashboard TypeScript config (includes vitest globals). |
| dashboard/tailwind.config.ts | Tailwind config for dashboard styling. |
| dashboard/src/test-setup.ts | Global mocks for Next navigation hooks in unit tests. |
| dashboard/src/lib/utils.ts | Adds shared formatting/classname utility helpers. |
| dashboard/src/lib/taskDefaults.ts | Mirrors Python Task defaults + editable fields + ID generation logic. |
| dashboard/src/lib/taskDefaults.test.ts | Tests parity between TS defaults and serialized Python Task fixture. |
| dashboard/src/lib/statusMaps.ts | Centralizes PT-BR labels for status enums. |
| dashboard/src/lib/statusMaps.test.ts | Tests non-empty PT labels and guards against raw-enum regressions. |
| dashboard/src/lib/squireStatePath.ts | Centralizes state filesystem paths with env override. |
| dashboard/src/lib/squireLock.ts | Reads/derives session lock status and project blocking logic. |
| dashboard/src/lib/data.ts | Filesystem JSON read layer for projects/tasks/alerts/stats/checkpoints/logs. |
| dashboard/src/lib/data.test.ts | Unit tests for wrapper reading + tasks normalization + homologation log reading. |
| dashboard/src/lib/commands.ts | Implements dashboard-side command queue write + status polling reads. |
| dashboard/src/lib/cliHints.ts | Derives actionable CLI hints from loaded state (doctor-like UX). |
| dashboard/src/lib/clientApi.ts | Client fetch wrapper injecting token + rewrites 503 writes-disabled message. |
| dashboard/src/lib/auth.ts | Token gate + project write guard against active squire lock. |
dashboard/src/lib/auth.test.ts Tests requireWriteToken behavior and 503 rewrite in client fetch.
dashboard/src/lib/atomic.ts Atomic JSON write helper (tmp + rename) + tolerant JSON reads.
dashboard/src/lib/fixtures/python-task-defaults.json Serialized Python Task defaults fixture for parity testing.
dashboard/src/hooks/useLocalStorage.ts Generic localStorage hook with SSR guard and error handling.
dashboard/src/hooks/useCommandPoll.ts Polls command queue status with timeout and transient failure tolerance.
dashboard/src/hooks/useCommandPoll.test.ts Unit tests for polling state transitions and transient failure tolerance.
dashboard/src/hooks/useAutoRefresh.ts Polling-based router.refresh auto-refresh hook with state tracking.
dashboard/src/hooks/useAutoRefresh.test.ts Unit tests for refresh cadence and trigger behavior.
dashboard/src/components/ui/card.tsx Shared Card UI primitives.
dashboard/src/components/ui/button.tsx Shared Button primitive with variants/sizes.
dashboard/src/components/Timeline.test.tsx Unit tests for timeline pagination and ordering.
dashboard/src/components/TDDProgressBar.tsx UI for TDD phase/progress visualization with contextual details.
dashboard/src/components/TDDProgressBar.test.tsx Unit tests for progress bar rendering/labels and extra info.
dashboard/src/components/Sidebar.tsx Server component fetching projects and rendering sidebar shell.
dashboard/src/components/RunControls.tsx Client controls to enqueue run/resume/kill commands and poll status.
dashboard/src/components/RunControls.test.tsx Unit tests for enable/disable behavior based on lock state.
dashboard/src/components/RefreshIndicator.tsx Displays refresh status, timestamp, cadence, and optional mode.
dashboard/src/components/RefreshIndicator.test.tsx Unit tests for indicator text/time display.
dashboard/src/components/RefreshController.tsx Chooses “live” vs “idle” polling cadence and renders indicator.
dashboard/src/components/RateLimitGauge.tsx Displays Claude Code window usage as a progress bar.
dashboard/src/components/HealthStrip.tsx Top chrome showing session status, alerts count, and budget summary.
dashboard/src/components/HealthStrip.test.ts Unit tests for session status copy/behavior.
dashboard/src/components/GlobalStats.test.tsx Regression test for approval rate display scale (0–100).
dashboard/src/components/CommandChips.tsx Copy-to-clipboard CLI hints with insecure-context fallback.
dashboard/src/components/CommandChips.test.tsx Tests clipboard and execCommand fallback behavior.
dashboard/src/components/BlockedTaskPanel.test.tsx Tests blocked task triage rendering + fix_task enqueue behavior.
dashboard/src/app/projects/new/page.tsx New project creation page.
dashboard/src/app/projects/[id]/error.tsx Project-level error boundary UI.
dashboard/src/app/not-found.tsx Not-found page.
dashboard/src/app/login/page.tsx Login page that stores write token in localStorage.
dashboard/src/app/layout.tsx Root layout with Sidebar + HealthStrip and force-dynamic to avoid stale chrome.
dashboard/src/app/globals.css Tailwind base/global styles.
dashboard/src/app/error.tsx Global error boundary UI.
dashboard/src/app/dashboard/page.tsx Legacy route redirect to /.
dashboard/src/app/api/projects/[id]/tasks/route.ts POST create-task handler with validation and lock/token gating.
dashboard/src/app/api/projects/[id]/tasks/route.test.ts Tests create-task handler (auth, defaults, validation, lock).
dashboard/src/app/api/projects/[id]/tasks/[taskId]/route.ts PATCH/DELETE task handler with whitelist application.
dashboard/src/app/api/projects/[id]/tasks/[taskId]/route.test.ts Tests whitelist patching and delete semantics.
dashboard/src/app/api/projects/[id]/tasks/[taskId]/action/route.ts Task action handler (retry/approve/skip) mutating task state.
dashboard/src/app/api/projects/[id]/route.ts PATCH project handler with validation and updated_at bump.
dashboard/src/app/api/projects/[id]/route.test.ts Tests project patch semantics and validation.
dashboard/src/app/api/projects/[id]/budget/route.ts POST budget mutation handler on checkpoint rate limit fields.
dashboard/src/app/api/health/route.ts Dynamic health endpoint to avoid build-time timestamp freezing.
dashboard/src/app/api/commands/route.ts Enqueues host commands with early validation and lock pre-checks.
dashboard/src/app/api/commands/[id]/route.ts Reads command status with UUID guard and token gating.
dashboard/src/app/api/commands/[id]/route.test.ts Tests status endpoint and UUID validation behavior.
dashboard/src/app/api/auth/check/route.ts Validates token (used by login flow).
dashboard/src/app/api/alerts/ack/route.ts Acks/dismisses alerts atomically with token gating.
dashboard/src/app/api/alerts/ack/route.test.ts Tests alert ack/dismiss flows and token requirements.
dashboard/README.md Dashboard-local README for dev/test usage and architecture summary.
dashboard/public/.gitkeep Keeps public/ in repo.
dashboard/postcss.config.js PostCSS config for Tailwind pipeline.
dashboard/playwright.config.ts E2E setup (port, seeded state, token, single-worker constraints).
dashboard/package.json Dashboard dependencies and scripts (unit + e2e).
dashboard/next.config.js Enables Next standalone output (Docker-friendly).
dashboard/fixtures/data/projects/squire-dashboard/tasks.json Seed fixture tasks list for dashboard dev/testing.
dashboard/fixtures/data/projects/squire-dashboard/project.json Seed fixture project.json.
dashboard/fixtures/data/projects/squire-dashboard/history.json Seed fixture history events.
dashboard/fixtures/data/projects/squire-dashboard/commits.json Seed fixture commits log.
dashboard/fixtures/data/projects/squire-dashboard/checkpoint.json Seed fixture checkpoint data (cursor/context/rate_limit).
dashboard/fixtures/data/projects/minimal-template/tasks.json Minimal legacy template fixture for normalization tests.
dashboard/fixtures/data/projects/minimal-template/homologation_log.json Seed homologation log fixture for reading tests.
dashboard/fixtures/data/global-stats.json Seed global-stats fixture for dashboard dev/testing.
dashboard/fixtures/data/alerts.json Seed alerts fixture for banner rendering/dev/testing.
dashboard/e2e/tasks-crud.spec.ts E2E: create/edit/delete task via UI.
dashboard/e2e/run-control.spec.ts E2E: run/resume/kill enablement and 409 behavior with lock.
dashboard/e2e/login.spec.ts E2E: login token flow and persistence.
dashboard/e2e/home.spec.ts E2E: project listing and alert visibility.
dashboard/e2e/helpers.ts E2E helper to inject token into localStorage.
dashboard/e2e/cli-hints.spec.ts E2E: CLI hint rendering + clipboard copy behavior.
dashboard/e2e/chrome.spec.ts E2E: verifies layout chrome is fresh (no prerender staleness regressions).
dashboard/e2e/blocked-triage.spec.ts E2E: blocked task triage log rendering + fix_task enqueue into pending/.
dashboard/e2e/alerts.spec.ts E2E: alert ack persists to filesystem.
dashboard/e2e/agent-roundtrip.spec.ts Local-only E2E: end-to-end new_project via UI + agent --once processing.
dashboard/Dockerfile Multi-stage build to standalone Next runner image.
dashboard/docker-compose.yml Dashboard-only compose for isolated dev; main deploy is stack in deploy/.
dashboard/CLAUDE.md Dashboard developer guide + data contracts + review checklist.
dashboard/.gitignore Dashboard-specific gitignore (node/next outputs, e2e state, env).
dashboard/.dockerignore Dashboard dockerignore for smaller build context.
config.py Defaults STATE_ROOT and adds command queue / timeouts / agent settings.
checkpoint.py Adds homologation log persistence, structured project lock field, and commits save/load.
accounting.py New shared usage accounting + approval-rate bookkeeping helpers.
.env.example Updates env example to guarded exports and revised defaults/backends.
.dockerignore Excludes dashboard/deploy/env/etc from workspace image build context.
.devcontainer/devcontainer.json Devcontainer attaches to workspace service without owning lifecycle.

Copilot's findings

  • Files reviewed: 184/186 changed files
  • Comments generated: 4


Comment thread docker/cont-init.d/00-init Outdated
Comment on lines +35 to +36
} > /etc/squire.env
chmod 644 /etc/squire.env

Owner Author

Fixed in 60b233d: /etc/squire.env is now written 0600 and chowned to ai-debian (the only SSH user, and the one that sources it in profile.d). Verified that the login shell still sources it with the env cleared.
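
For readers who want the permission pattern spelled out, here is a minimal sketch in Python. The real init script is shell and is not reproduced here. The function name `write_private_env` and its arguments are hypothetical, values are written verbatim without shell quoting, and the chown step is left out because it needs root.

```python
import os


def write_private_env(path: str, variables: dict[str, str]) -> None:
    """Write KEY=value lines to `path` with owner-only (0600) permissions.

    Hypothetical helper for illustration; not the project's init script.
    """
    # Creating the file with mode 0o600 avoids a window where it is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # The creation mode is ignored when the file already exists,
        # so tighten the permissions explicitly as well.
        os.fchmod(fd, 0o600)
        for key, value in variables.items():
            handle.write(f"{key}={value}\n")
```

The explicit `os.fchmod` matters on container restarts, where a file left behind at 0644 by an older image would otherwise keep its loose mode.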

Comment thread dashboard/fixtures/data/global-stats.json Outdated
Comment thread dashboard/src/hooks/useCommandPoll.ts Outdated
Comment on lines +76 to +85
let timer: ReturnType<typeof setTimeout>;
const schedule = () => {
  timer = setTimeout(tick, POLL_MS);
};
tick();

return () => {
  cancelled = true;
  clearTimeout(timer);
};

Owner Author

Fixed in 60b233d: timer is now ReturnType<typeof setTimeout> | null = null, and the cleanup guards with if (timer) clearTimeout(timer). The useCommandPoll tests pass.

Comment thread gitops.py Outdated
Comment on lines +26 to +28
subprocess.run(
    ["git", "add", "-A"], cwd=repo_path, capture_output=True, timeout=15
)

Owner Author

Fixed in b119af9: commit_all now checks the return code of git add -A and returns failed: git add: <stderr> instead of falling through to commit with a misleading error.
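
The shape of that guard can be sketched as follows. This is an illustration under assumptions, not the code in gitops.py. The `message` argument, the injectable `run` parameter (added only so the logic can be exercised without a real repository), the `"ok"` return value and the wording of the commit-failure string are all hypothetical. Only the `failed: git add: <stderr>` form comes from the reply above.

```python
import subprocess


def commit_all(repo_path: str, message: str, run=subprocess.run) -> str:
    """Stage all changes and commit them, returning a short status string.

    Sketch only: `run` is an assumed seam for testing, not part of the real helper.
    """
    add = run(
        ["git", "add", "-A"], cwd=repo_path, capture_output=True, text=True, timeout=15
    )
    if add.returncode != 0:
        # Stop here so the caller sees the staging failure itself
        # rather than a confusing error from the commit step.
        return f"failed: git add: {add.stderr.strip()}"

    commit = run(
        ["git", "commit", "-m", message],
        cwd=repo_path, capture_output=True, text=True, timeout=15,
    )
    if commit.returncode != 0:
        # git reports "nothing to commit" on stdout, so fall back to it.
        return f"failed: git commit: {(commit.stderr or commit.stdout).strip()}"
    return "ok"
```

Returning early keeps the two failure modes distinguishable for the orchestrator and fix flows that call the helper.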

Orchestrator and others added 2 commits June 17, 2026 15:44
- docker/cont-init.d/00-init: write /etc/squire.env with 0600 + ai-debian
  ownership (it can hold API keys); was world-readable 0644.
- dashboard/fixtures/data/global-stats.json: approval_first_try_rate uses the
  0–100 scale the UI renders (75.0), not 0–1 (0.75 → misleading "1%").
- dashboard/src/hooks/useCommandPoll.ts: make the poll timer nullable and guard
  clearTimeout in cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
git add -A's return code was ignored, so an add failure (bad permissions,
invalid pathspec, broken repo) would fall through to git commit and report a
misleading error. Capture stderr and return a clear "failed: git add: …".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>