feat: containerize squire as a stack + consolidate dashboard into monorepo #1
danmaxis wants to merge 78 commits into
Implements the full orchestrator dashboard: a Next.js app that reads orchestrator state JSON files from the filesystem and displays them in real time.
Components and features:
- GlobalStats, AlertBanner, ProjectCard, TaskList, Timeline, CommitLog, Sidebar, RefreshIndicator, RefreshController
- useAutoRefresh hook: polls via router.refresh() on a configurable interval (NEXT_PUBLIC_REFRESH_INTERVAL, default 30s); uses useState for reactive re-renders; setInterval for stable polling cycle
- Main page: server component reads projects/ dir and renders cards; RefreshController mounts as client island for the polling indicator
- Containerization: multi-stage Dockerfile (node:20-alpine, standalone output, non-root user); docker-compose.yml mounts /mnt/user/data/orchestrator as /data:ro on port 3100:3000 with healthcheck against /api/health; .dockerignore included
To run locally:
ORCHESTRATOR_DATA_PATH=../orchestrator-state npm run dev
To deploy (ask Danilo to create the container on Unraid):
Image: orchestrator-dashboard
Port: 3100:3000
Volume: /mnt/user/data/orchestrator:/data:ro
Env: ORCHESTRATOR_DATA_PATH=/data
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Rewrite types.ts to exactly mirror squire Python models (snake_case, correct enums: ProjectStatus/TaskStatus/CursorStep/Actor/EventType/etc.)
- Rewrite data.ts: import from types.ts, correct JSON wrappers (AlertList/TaskList/History/CommitLog), correct projects/ path segment
- Add vitest infrastructure: vitest.config.ts, src/test-setup.ts, package.json test scripts, tsconfig.json vitest/globals types
- Add fixture data in fixtures/data/ matching squire serialization format
- Add tests: 18 type-contract tests + 6 data-function tests (24 total)
- Add CLAUDE.md with squire data contract and homologation checklist
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GlobalStats: convert to RSC, accept stats prop, correct snake_case fields
- TaskList: add effort badges, TDD indicator, rejection counter, no-progress warning
- Timeline: use HistoryEvent types with Actor colors and EventType icons
- Add TaskList.test.tsx with 10 passing tests
- Remove old project/[id]/page.tsx stub (replaced by projects/[id] in task-006)
- Fix squire inner_loop: skip --watchAll=false for vitest projects
- Fix test script: vitest run (squire adds --passWithNoTests)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…+ tests
- RateLimitGauge: visual progress bar showing Claude Code call budget, color-coded green/yellow/red by usage %, with window countdown
- CheckpointPanel: deep-dive into squire session state (cursor, LLM context, rate limit, recovery section with amber banner on escalation)
- TDDProgressBar: horizontal step chain for TDD workflow phases (planning→red_phase→llm_execution→testing→homologation→completed)
- Add tests: 6 RateLimitGauge + 6 CheckpointPanel tests passing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- TDDProgressBar: finalize component with correct step highlighting and contextual sub-labels (attempt counts, test pass/fail, etc.)
- Add TDDProgressBar.test.tsx: 8 tests covering render, active step, hidden when tdd=false, and sub-label display
All 9 planned tasks complete. 54 tests passing, tsc clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- page.tsx: fix DATA_PATH (fixtures/data), alerts wrapper, stats prop
- projects/[id]/page.tsx: replace hardcoded stub with real async reads
- GlobalStats: convert to server component accepting stats prop; fix snake_case fields, add cost card, handle projects_touched_today array
- CommitLog: add 'use client' (uses useState)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ator
- AlertBanner: convert to client component, add per-alert × button with sessionStorage persistence so each alert is shown only once per session
- useAutoRefresh: fire first refresh 500ms after mount instead of waiting the full interval, eliminating the --:--:-- state on initial load
- page.tsx: add pt-24 when alerts are present to prevent banner overlap
- Add tests: AlertBanner (9), RefreshIndicator (6), useAutoRefresh (7)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- AlertBanner: add hasMounted pattern (returns null until useEffect fires), preventing SSR hydration flash that made dismissed alerts reappear on F5
- Switch sessionStorage → localStorage so dismissed alerts persist across browser sessions (not just while the tab is open)
- Rename timestamp → created_at in AlertBanner and page.tsx AlertJson interface to match fixtures/data/alerts.json and src/lib/types.ts, fixing the "Invalid Date" display bug
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Import Alert from @/lib/types instead of defining a local interface; this eliminates the project/task vs project_id/task_id mismatch that caused the empty " - " header and broke dismiss (no id field in real data)
- Remove AlertJson local interface from page.tsx, use Alert directly
- Use composite key project_id::task_id::created_at for localStorage dismiss since Alert has no id field in the squire schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Timeline:
- Convert to client component and add pageSize prop (default 20)
- Show only the most recent pageSize events; "Ver mais N eventos" button loads the next page progressively (accordion, no collapse)
CommitLog:
- Replace binary show-all toggle with progressive visibleCount state
- Add pageSize prop (default 20) replacing hardcoded MAX_VISIBLE
- Remove internal duplicate <h2> heading (page already has one)
- Import CommitSummary from @/lib/types instead of local Commit interface
- "Ver mais N commits" button follows same pattern as Timeline
Tests: add Timeline.test.tsx (8 tests) and CommitLog.test.tsx (10 tests)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
getCommits() now falls back to running git log on the project's repo_path when no commits.json file exists. The squire currently does not generate commits.json, so this makes CommitLog functional for all existing projects. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Show first 3 files by default; "+ N arquivo(s)" button reveals the rest. "Ver menos" collapses back. Each commit tracks its own expanded state independently. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Consolidate project naming under the squire-* prefix to match SQUIRE_* env vars, squire-state directory and the squire CLI.
- package.json + container_name + docker-compose service
- ORCHESTRATOR_DATA_PATH -> SQUIRE_DATA_PATH (data.ts, page.tsx, .env.local, docker-compose)
- Fix docker volume path: /mnt/user/data/orchestrator -> /home/ai-debian/squire-state
- Layout title, sidebar brand, README, CLAUDE.md
- Rename fixtures project folder + fixture project_id/projects_touched_today
102 tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror the pilot-project rename in CLAUDE.md, squire CLI usage examples, and bilingual docs (cli, configuration, state-and-recovery). The Dashboard env var becomes SQUIRE_DATA_PATH and the state volume is /home/ai-debian/squire-state (was /mnt/user/data/squire in the original spec). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lback
P1 hygiene + orphan-component wiring for the squire dashboard.
Sidebar (P1.2):
- Convert to server component fed by getProjects(); split off SidebarShell (client) for hamburger + active-link state.
- Replace mock 4-project status union with real ProjectStatus enum.
- Mount in layout.tsx so every page gets the project nav.
- Drop the fake "Admin User" footer.
CheckpointPanel (P1.3): mount RateLimitGauge under the session row so the Claude Code window budget is visible alongside the cursor state.
Project detail (P1.3): mount TDDProgressBar above the CheckpointPanel, auto-shown only when the cursor is in red_phase/llm_execution/testing/homologation for a tdd task; picks the current task from checkpoint.
Data layer (P1.5): drop the getCommitsFromGit fallback now that squire writes commits.json. Smaller container, no shell exec at runtime.
Tests: all 102 pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P1.5: squire now writes projects/<id>/commits.json (CommitLog model) after every successful task commit. The dashboard reads it directly instead of shelling git log in the container. Implementation: checkpoint.save_commits + Squire._refresh_commits_json invoked from _commit_task_completion.
P1.4: refresh source anchors that drifted in the bilingual docs:
models.py:139 -> 143 (HistoryEvent)
models.py:171 -> 175 (Cursor)
inner_loop.py:71 -> 72 (snapshot_test_hashes)
inner_loop.py:80 -> 81 (check_test_integrity)
inner_loop.py:243 -> 245 (viking injection)
inner_loop.py:252 -> 254 (permanent restrictions block)
squire.py:88 -> 100 (_account_call)
squire.py:113 -> 154 (_auto_snapshot_commit)
squire.py:162 -> 203 (_commit_task_completion)
squire.py:252 -> 293 (_is_looping)
squire.py:482 -> 545 (_wait_productively)
squire.py:504 -> 567 (_pre_homologation_checks)
squire.py:889 -> 951 (is_early_escalation)
squire.py:1087 -> 1179 (NATO confirmation)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2: bring the dashboard to parity with squire's evolved state model.
types.ts: extend Task with max_usd/cost_usd, RateLimitState with max_daily_usd/daily_cost_usd/daily_cost_date, GlobalStats with daily_tokens/cost_by_model/daily_calls_unknown_cost. Test fixtures updated accordingly.
BudgetCard (new): progress ring of today's spend vs daily cap, with per-model breakdown bars. Active cap is read from the project checkpoint with the highest daily_cost_usd. Mounted on the home page alongside GlobalStats in a 2:1 grid.
GlobalStats: rewrite as a 6-tile grid (projetos tocados, tasks hoje, chamadas LLM, aprovação 1ª, tokens hoje, calls sem custo). Drop the duplicate cost tile now that BudgetCard owns that area.
TaskList: add cost chip ($cost / $cap, red on overrun), fast-track badge for skip_homologation, escalated no-progress vs loop badges (amber at >=2, red "Loop" at >=3). Two new tests cover both.
Timeline: group events by session_started/session_resumed boundaries, collapsible per session (latest auto-expanded), pagination follows opened sessions.
Project detail: add coding_backend pill to the header (⚙ opencode).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bilingual architecture docs now describe the squire-dashboard write API routes (alerts/ack, projects/<id>/budget, tasks/<taskId>/action), the session.lock 409 guard, and that squire remains the sole writer while holding the lock — the dashboard only mutates between sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3 turns the dashboard into an ops console while staying read-only-by-default
when squire is running.
Write actions (P3.1):
- POST /api/alerts/ack {project_id, task_id, created_at, dismiss?}
marks an alert acknowledged or removes it from alerts.json.
- POST /api/projects/<id>/budget {max_daily_usd?, max_calls_per_window?}
patches Checkpoint.rate_limit.
- POST /api/projects/<id>/tasks/<taskId>/action {action: retry|approve|skip}
resets / force-approves / fast-tracks a task.
All routes funnel through writeJsonAtomic (.tmp + rename, matching
checkpoint.atomic_write_json) and check session.lock — if squire is
holding the lock for the target project, they refuse with 409.
New helpers: lib/atomic.ts, lib/squireLock.ts, lib/squireStatePath.ts.
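The `.tmp` + rename pattern mentioned above can be sketched as follows. This is only an illustration in Python, not the dashboard's `writeJsonAtomic` (TypeScript) or squire's `checkpoint.atomic_write_json`; the name `atomic_write_json_sketch` and the `{"alerts": [...]}` payload shape are assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json_sketch(path: Path, payload: dict) -> None:
    """Write JSON to a sibling .tmp file, then rename it over the target.

    Hypothetical helper: a concurrent reader sees either the old file or the
    new one, never a half-written one, as long as both paths are on the same
    filesystem.
    """
    tmp_path = path.with_name(path.name + ".tmp")
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
        handle.flush()
        os.fsync(handle.fileno())  # push the bytes to disk before renaming
    os.replace(tmp_path, path)  # atomic swap of the directory entry


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as state_dir:
        target = Path(state_dir) / "alerts.json"
        # Payload shape is assumed for illustration only.
        atomic_write_json_sketch(target, {"alerts": []})
        atomic_write_json_sketch(target, {"alerts": [{"acknowledged": True}]})
        print(json.loads(target.read_text(encoding="utf-8")))
```

Writing next to the target (rather than in a system temp dir) keeps the rename on one filesystem, which is what makes the swap atomic.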
UI integration:
- AlertBanner gets Ack + Descartar buttons that hit /api/alerts/ack.
Local localStorage fallback preserves UX when the network errors.
- TaskList rows get a ⋯ menu (Resetar tentativas, Aprovar manualmente,
Pular homologação), gated behind window.confirm and refreshed via
router.refresh() on success.
Hot-view polling (P3.2):
- RefreshController accepts a `hot` prop; project detail page passes
`hot={checkpoint.phase === "implementing"}` to drop the poll to 5s
while squire is actively working. RefreshIndicator shows LIVE/IDLE.
HealthStrip (P3.5):
- Server component mounted in layout.tsx. Reads session.lock mtime +
TTL, alerts count, and the max budget cap across projects. Sticky
top bar shows status dot, budget %, alert count. aria-live="polite".
Test setup: vitest mocks next/navigation globally so client components
using useRouter render in JSDOM. AlertBanner tests updated to stub fetch
+ await the async ack flow.
102 + 2 new tests = 104 passing. Production build clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes upstream-recommendations #1 and #3 raised by ../integration-squire. SessionLock now carries a structured `project_id` field so dashboard consumers can exact-match instead of substring-parsing `holder` — the old approach 409'd writes across projects sharing a prefix (e.g. a lock on `proj-happy` blocked `proj`). commits.json is now always written: at session start (so brand-new projects have the file) and after each task, with an explicit `error` field on git failure so the dashboard can distinguish "no commits yet" from "file disappeared". Previously the writer swallowed git errors silently and the dashboard masked ENOENT as an empty list. PT and EN state-and-recovery docs synced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
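The prefix collision described above is easy to reproduce in a few lines. The sketch below is illustrative only: `SessionLockSketch` is a stand-in with two fields, and the `holder` string format is an assumption, not squire's real one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionLockSketch:
    """Hypothetical stand-in for squire's SessionLock (only two fields shown)."""
    holder: str
    project_id: Optional[str] = None


def blocks_by_substring(lock: SessionLockSketch, project_id: str) -> bool:
    # Old approach: search for the project id anywhere inside `holder`.
    return project_id in lock.holder


def blocks_by_exact_match(lock: SessionLockSketch, project_id: str) -> bool:
    # New approach: compare against the structured field only.
    return lock.project_id == project_id


# The holder format here is assumed for the example.
lock = SessionLockSketch(holder="sess-123:proj-happy", project_id="proj-happy")
print(blocks_by_substring(lock, "proj"))          # True (false positive)
print(blocks_by_exact_match(lock, "proj"))        # False
print(blocks_by_exact_match(lock, "proj-happy"))  # True
```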
`Cursor` moved from models.py:175 to :178 after CommitLog gained an `error` field; `TokenUsage` moved from :266 to :270 after SessionLock gained `project_id`. Updates PT + EN architecture and cost-and-budget docs to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Function definitions had drifted from the line numbers cited in PT and EN docs. All anchors now verified to land on the exact def line (or referenced statement). Mirror in both languages.
squire.py:100 → :102 _account_call
squire.py:154 → :156 _auto_snapshot_commit
squire.py:203 → :205 _commit_task_completion
squire.py:293 → :348 _is_looping
squire.py:545 → :600 _wait_productively
squire.py:567 → :622 _pre_homologation_checks
squire.py:951 → :1003 effort:low early-escalation block
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The LiteLLM gateway on Zordon:4000 is gone; the local LLM now runs via Ollama at 192.168.50.24:11434/v1 serving journal-synth:latest (same Qwen3.5-35B-A3B, num_ctx 98304). Machine-specific endpoint config moves to a gitignored .env with guarded exports (shell env wins), sourced by the squire wrapper at startup. opencode.json (~/.config) repointed too. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
config.py now defaults STATE_ROOT to /home/ai-debian/squire-state (same default the bash wrapper injects) instead of raising KeyError on import. tests/conftest.py points the suite at a tmp dir before config import, so bare pytest runs out of the box and never touches real state. Also refresh pre-existing stale config.py:73 anchors (pricing table is :111). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
New alerts_cli.py (modeled on tasks_cli.py) gives the CLI parity with the dashboard for alert triage: list pending with 1-based indexes, ack by index or --project/--task selectors, rm by index/--acked/--all. Same acknowledged field and atomic-write contract the dashboard uses; docs note the index-race caveat when both writers are active. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
doctor.py runs every precondition a session needs: writable state root, LLM endpoint + configured models, claude binary, backend binaries in use, session.lock pid/TTL staleness, llm.lock flock state, per-project sanity (git repo, dirty tree, blocked tasks, dead-but-resumable sessions), pending alerts, and stats freshness. Exit 1 on any FAIL. --fix removes only provably dead/free locks, never live ones. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
approval_first_try_rate was never computed (only the model default). GlobalStats gains tasks_homologated_today and tasks_approved_first_try_today; _record_completion_stats updates them on task completion and derives the rate, excluding skip_homologation tasks (auto-approved, would inflate it). Adds a regression test proving a claude --print JSON envelope flows through _account_call into cost_estimate_usd/cost_by_model (the stale 0.0 in production predates cost tracking). Also refresh doc anchors shifted by the new helper. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
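A rough sketch of the counter-and-rate logic described above. The field names come from the commit message; the function name, the "zero rejections means first try" rule, and the 0-100 scale (which a later commit in this PR mentions) are assumptions, and the real `_record_completion_stats` may differ.

```python
from dataclasses import dataclass


@dataclass
class StatsSketch:
    """Hypothetical subset of GlobalStats relevant to the approval rate."""
    tasks_homologated_today: int = 0
    tasks_approved_first_try_today: int = 0
    approval_first_try_rate: float = 0.0  # assumed 0-100 scale


def record_completion_sketch(
    stats: StatsSketch, *, rejections: int, skip_homologation: bool
) -> None:
    if skip_homologation:
        # Auto-approved tasks are excluded so they cannot inflate the rate.
        return
    stats.tasks_homologated_today += 1
    if rejections == 0:  # assumption: no rejections means approved first try
        stats.tasks_approved_first_try_today += 1
    stats.approval_first_try_rate = (
        100.0 * stats.tasks_approved_first_try_today / stats.tasks_homologated_today
    )


stats = StatsSketch()
record_completion_sketch(stats, rejections=0, skip_homologation=False)
record_completion_sketch(stats, rejections=2, skip_homologation=False)
record_completion_sketch(stats, rejections=0, skip_homologation=True)  # ignored
print(stats.approval_first_try_rate)  # 50.0
```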
The container runs on the Ai-Debian VM (squire-state lives on its local disk; Unraid can't mount it). Port 3101 because 3100 is taken by browserless. Volume goes rw and the container runs as uid 1000: the dashboard is a second writer (POST /api/alerts/ack) and state files are 0600 ai-debian. Also sync package-lock.json with package.json (npm ci failed in the image build) and add the public/ dir the Dockerfile copies. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Blocked tasks now show their full rejection history from homologation_log.json (per-round cards with collapsible feedback and highlighted fix_suggestion; 'via fix' badge; fallback to the plain rejection_summaries for pre-log tasks) plus a 'Corrigir com Claude' button that enqueues fix_task for the host agent. Timeline homologation events expand to the full verdict when a log entry matches. Home page distinguishes 'Bloqueado' from 'Pausado' and shows a blocked-task count badge per project. fix_task is whitelisted in /api/commands with the same lock pre-check as run/resume (the fix cycle holds the session lock). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Route audit confirms all state-reading routes are dynamic (the static-prerender bug class is closed). Error boundaries at app root and projects/[id] turn malformed-state crashes into a recoverable card with guidance instead of a bare 500. All session-era components (forms, panels, login, menus) gain dark: variants matching the shell palette; login converts from always-dark zinc to the standard light+dark pattern. useCommandPoll tolerates 3 transient network failures before giving up and its client timeout grows to 15 min (fix_task can run ~13 min on the host). authedFetch rewrites 503 writes_disabled into an actionable message instead of a generic error. TaskForm closes on Escape. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
DATA_PATH was defined in 4 places (squireStatePath, squireLock, data, home page) — single source in squireStatePath now. The token+lock 409 guard copy-pasted across 5 mutation routes becomes guardProjectWrite(req, projectId) in lib/auth. The hand-rolled enqueue POST repeated in 5 components becomes enqueueCommand() in clientApi. Behavior-neutral: same status codes, 165/165 tests unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Squire._account_call and fix_cli.account_usage now share accounting.record_usage; the guarded add+commit duplicated between the orchestrator and the fix cycle becomes gitops.commit_all (also simplifies _commit_task_completion, whose two branches were identical). Behavior-neutral; snapshot tests updated to the new seam. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
11 specs over next dev on :3199 with SQUIRE_DATA_PATH pointing at a state fixture rebuilt by global-setup: login (bad/good token), home (projects, Bloqueado badge, blocked count, alert banner), full task CRUD through the modal, blocked-task triage (full verdicts from homologation_log + 'Corrigir com Claude' writing a fix_task to commands/pending), run controls disabled under a seeded session lock + 409 from the API, alert ack persisting to disk (with state restore so specs stay independent), and a local-only full roundtrip that runs the real 'squire agent --once' to execute a UI-enqueued new_project. npm run test:e2e; e2e/ excluded from vitest; chromium only. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The fix cycle aborted instantly with 'no files written' because the systemd user agent's default PATH lacks ~/.local/bin (claude) — and implement_directly swallowed the FileNotFoundError silently. It now prints the actual cause (timeout / non-zero exit + stderr / missing binary) so the command result tail tells the truth. The squire-agent unit gains an explicit PATH with claude and opencode. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The recurring 'Parse error: Expecting value: line 1 column 1' (seen across claw-code-study history and twice in tonight's fix cycle) is Claude wrapping the verdict JSON in prose. _parse_response now falls back to raw_decode-scanning for the first valid JSON object before declaring an infra error — pure-prose responses still classify as infra. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
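The fallback described above can be approximated with `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and ignores trailing text. This is a minimal sketch, not the actual `_parse_response`; the function name and the verdict fields in the example are hypothetical.

```python
import json
from typing import Optional


def first_json_object_sketch(text: str) -> Optional[dict]:
    """Return the first valid JSON object embedded in `text`, or None.

    Hypothetical helper: tries raw_decode at every '{' until one parses.
    """
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not a valid object at this brace, keep scanning
        if isinstance(value, dict):
            return value
    return None  # pure prose: the caller can still classify this as infra


# Verdict field names below are made up for the example.
wrapped = 'Here is my verdict:\n{"approved": false, "feedback": "missing test"}\nThanks!'
print(first_json_object_sketch(wrapped))
print(first_json_object_sketch("I could not review this change."))
```

Scanning only from `{` positions keeps stray numbers or quoted words in the prose from being mistaken for the verdict.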
Doc-heavy tasks blow past the hardcoded 300s (tonight's fix cycle timed out generating a multi-crate summary). SQUIRE_IMPLEMENT_TIMEOUT, default 600s, still inside the agent's 1200s command timeout. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
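One way such an override might be read is sketched below. Only the variable name `SQUIRE_IMPLEMENT_TIMEOUT` and the 600s default come from the commit message; the function name and the fallback on invalid values are assumptions.

```python
import os
from typing import Mapping

DEFAULT_IMPLEMENT_TIMEOUT_S = 600  # default stated in the commit message


def implement_timeout_sketch(env: Mapping[str, str] = os.environ) -> int:
    """Read SQUIRE_IMPLEMENT_TIMEOUT, falling back to the default.

    Hypothetical helper: unset, non-numeric, or non-positive values fall
    back to the default (that fallback is an assumption of this sketch).
    """
    raw = env.get("SQUIRE_IMPLEMENT_TIMEOUT", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_IMPLEMENT_TIMEOUT_S
    return value if value > 0 else DEFAULT_IMPLEMENT_TIMEOUT_S


print(implement_timeout_sketch({}))                                   # 600
print(implement_timeout_sketch({"SQUIRE_IMPLEMENT_TIMEOUT": "900"}))  # 900
```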
…mpty parses A surgical fix request came back as prose (successful response, zero filepath blocks parsed). The prompt now states explicitly that even one-line changes must return the whole modified file, and an empty parse logs the first 300 chars of the response so the cause is visible in the command result. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The root layout's Sidebar and HealthStrip read squire state from the
filesystem on every route, but 4 statically-prerendered pages (/login,
/projects/new, /dashboard, /_not-found) froze them at build time — empty
project list and a permanent 'Sem sessão / sem lock' regardless of
reality (the user-reported symptoms). force-dynamic on the root layout
kills the whole class; build now has zero static pages, and the new
chrome.spec e2e guards the previously-static routes.
HealthStrip copy fixed via a pure sessionStatus() helper ('Squire
ocioso / nenhuma sessão ativa'; active sessions show the running
project_id instead of the sess-id). New src/lib/statusMaps.ts puts all
enum labels in PT and replaces the English sidebar pills, raw task
status chips, and raw cursor.step badge (raw values kept in title=).
Scratch project teste-dashboard removed from live state.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…Card dark mode Yesterday's squire change moved approval_first_try_rate to a 0-100 scale; the GlobalStats tile still multiplied by 100 and showed 5000%. Regression test pins the scale. ProjectCard (missed by the dark-mode pass) gets variants — titles were dark-on-dark. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rate cost_by_model stayed empty forever because recent claude --print envelopes carry no top-level 'model' — fall back to the first modelUsage key. Approval-rate counters move to accounting.record_homologated, shared by the orchestrator and the fix cycle (approved fixes now count as homologated, never first-try). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cliHints derives context-aware CLI commands from the state the page already loads (same heuristics as squire doctor): kill when this project runs, resume when a dead session is resumable, run/bg when idle with pending tasks, unblock when blocked tasks exist; per task, blocked → fix/unblock/reset with ids filled in, dead-session task → resume. CommandChips copies on click with an execCommand fallback (the LAN deploy is plain HTTP — navigator.clipboard needs a secure context). Shown on the project page and inside expanded tasks/BlockedTaskPanel; healthy tasks show nothing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Headless workspace container (orchestrator + command-queue agent + sshd + toolchains) supervised by s6-overlay, plus the existing dashboard image, as a docker-compose stack sharing a named squire-state volume. SSH :2222, dashboard :3101, hard memory caps. Replaces the systemd --user agent unit. Adds a Docker-in-Docker escalation seam: tasks whose tests/build invoke Docker (unavailable by design) are blocked with a requires_container_build critical alert instead of burning attempts. Never mounts the host docker socket. Docs (PT+EN) and doctor updated; container restart stale-lock auto-handled. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Repoint the stack's dashboard build context to ../dashboard (was the sibling repo ../../squire-dashboard), exclude dashboard/ from the workspace image context, and update all docs (PT+EN) for the in-repo layout: READMEs repo structure, CLAUDE.md directory block + doc-sync table + deploy paths, architecture narrative, and example repo_path hygiene. The dashboard tree was imported with full history via git subtree in the preceding merge commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sorry @danmaxis, your pull request is larger than the review limit of 300000 diff characters
⚠️ Human review recommended
This PR combines broad infra/containerization changes with a large subtree import and includes at least one credential-handling hardening issue that a human should validate end-to-end.
Pull request overview
This PR turns squire + dashboard into a single deployable unit by (1) introducing a two-container Docker Compose stack (workspace + dashboard) backed by shared named volumes, and (2) importing the previously-separate dashboard repo into this repo (via subtree) so build contexts and schemas stay in sync. It also adds supporting runtime features (command queue, richer homologation logging, infra retry semantics, accounting helpers) plus extensive test and docs updates.
Changes:
- Add `deploy/docker-compose.yml` stack for an isolated workspace container (s6-supervised agent + sshd) plus the dashboard container, sharing `squire-state`.
- Introduce command-queue and homologation enhancements (infra vs config error kinds, free infra retry, full verdict logging, stats accounting), with new tests/docs.
- Vendor dashboard into `dashboard/` with Next.js API routes for state mutation (token-gated) and local E2E/unit tests.
File summaries
| File | Description |
|---|---|
| tests/test_homologator.py | Adds tests for error_kind classification and tolerant JSON extraction behavior. |
| tests/test_homologation_retry.py | New tests validating one free retry for infra failures in homologation. |
| tests/test_homologation_log.py | New tests for persisting full homologation verdict entries and caps. |
| tests/test_cost_budget.py | Adds regression tests for GlobalStats cost accumulation and approval rate math/model fallback. |
| tests/test_backends.py | Adds tests asserting actionable HTTP error messages from LiteLLM backend. |
| tests/test_backends_parser.py | Adds tests for relpath validation and junk/traversal write protections. |
| tests/conftest.py | Sets default SQUIRE_STATE_ROOT for tests to avoid touching real state. |
| tasks.json | Updates a task description label (“Orchestrator” → “Squire”). |
| README.md | Documents new repo layout entries (dashboard, deploy, docker, workspace Dockerfile). |
| README.en.md | English mirror of repo layout updates. |
| models.py | Adds HomologationLog models, structured project_id in SessionLock, command queue models, and new GlobalStats counters. |
| inner_loop.py | Exposes run_tests() as a public API (for squire fix). |
| gitops.py | New guarded git helper for add+commit used by orchestrator/fix flows. |
| docs/troubleshooting.md | Adds “start with doctor” guidance and updates internal line references. |
| docs/tasks.md | Updates line reference(s) after code movement. |
| docs/padrao-viking.md | Updates line reference(s) after code movement. |
| docs/homologacao.md | Documents infra retry semantics, container-build alert seam, and homologation log format. |
| docs/en/viking-pattern.md | English mirror: viking pattern reference updates. |
| docs/en/troubleshooting.md | English mirror: troubleshooting updates and references. |
| docs/en/tasks.md | English mirror: tasks doc reference updates. |
| docs/en/homologation.md | English mirror: homologation semantics/log docs. |
| docs/en/cost-and-budget.md | English mirror: updated references for pricing/extraction/accounting. |
| docs/en/backends.md | English mirror: path validation and actionable HTTP errors docs + reference updates. |
| docs/custos-e-orcamento.md | PT mirror: updated references for pricing/extraction/accounting. |
| docs/backends.md | PT mirror: docs for path validation + actionable HTTP errors + reference updates. |
| docker/services.d/sshd/run | Adds s6 service to run sshd in foreground. |
| docker/services.d/squire-agent/run | Adds s6 service to run squire agent with correct env/PATH and uid. |
| docker/profile.d/squire.sh | Ensures interactive SSH shells get PATH and runtime env via /etc/squire.env. |
| docker/cont-init.d/00-init | Container init: ssh keys, volume ownership, authorized_keys perms, emit runtime env file. |
| deploy/docker-compose.yml | New stack definition for workspace + dashboard with resource caps and shared volumes. |
| deploy/.env.example | Documents required deploy-time write token env for dashboard writes. |
| dashboard/vitest.config.ts | Adds Vitest config for dashboard unit tests. |
| dashboard/tsconfig.vitest.json | TS config for Vitest config compilation. |
| dashboard/tsconfig.json | Dashboard TypeScript config (includes vitest globals). |
| dashboard/tailwind.config.ts | Tailwind config for dashboard styling. |
| dashboard/src/test-setup.ts | Global mocks for Next navigation hooks in unit tests. |
| dashboard/src/lib/utils.ts | Adds shared formatting/classname utility helpers. |
| dashboard/src/lib/taskDefaults.ts | Mirrors Python Task defaults + editable fields + ID generation logic. |
| dashboard/src/lib/taskDefaults.test.ts | Tests parity between TS defaults and serialized Python Task fixture. |
| dashboard/src/lib/statusMaps.ts | Centralizes PT-BR labels for status enums. |
| dashboard/src/lib/statusMaps.test.ts | Tests non-empty PT labels and guards against raw-enum regressions. |
| dashboard/src/lib/squireStatePath.ts | Centralizes state filesystem paths with env override. |
| dashboard/src/lib/squireLock.ts | Reads/derives session lock status and project blocking logic. |
| dashboard/src/lib/data.ts | Filesystem JSON read layer for projects/tasks/alerts/stats/checkpoints/logs. |
| dashboard/src/lib/data.test.ts | Unit tests for wrapper reading + tasks normalization + homologation log reading. |
| dashboard/src/lib/commands.ts | Implements dashboard-side command queue write + status polling reads. |
| dashboard/src/lib/cliHints.ts | Derives actionable CLI hints from loaded state (doctor-like UX). |
| dashboard/src/lib/clientApi.ts | Client fetch wrapper injecting token + rewrites 503 writes-disabled message. |
| dashboard/src/lib/auth.ts | Token gate + project write guard against active squire lock. |
| dashboard/src/lib/auth.test.ts | Tests requireWriteToken behavior and 503 rewrite in client fetch. |
| dashboard/src/lib/atomic.ts | Atomic JSON write helper (tmp + rename) + tolerant JSON reads. |
| dashboard/src/lib/fixtures/python-task-defaults.json | Serialized Python Task defaults fixture for parity testing. |
| dashboard/src/hooks/useLocalStorage.ts | Generic localStorage hook with SSR guard and error handling. |
| dashboard/src/hooks/useCommandPoll.ts | Polls command queue status with timeout and transient failure tolerance. |
| dashboard/src/hooks/useCommandPoll.test.ts | Unit tests for polling state transitions and transient failure tolerance. |
| dashboard/src/hooks/useAutoRefresh.ts | Polling-based router.refresh auto-refresh hook with state tracking. |
| dashboard/src/hooks/useAutoRefresh.test.ts | Unit tests for refresh cadence and trigger behavior. |
| dashboard/src/components/ui/card.tsx | Shared Card UI primitives. |
| dashboard/src/components/ui/button.tsx | Shared Button primitive with variants/sizes. |
| dashboard/src/components/Timeline.test.tsx | Unit tests for timeline pagination and ordering. |
| dashboard/src/components/TDDProgressBar.tsx | UI for TDD phase/progress visualization with contextual details. |
| dashboard/src/components/TDDProgressBar.test.tsx | Unit tests for progress bar rendering/labels and extra info. |
| dashboard/src/components/Sidebar.tsx | Server component fetching projects and rendering sidebar shell. |
| dashboard/src/components/RunControls.tsx | Client controls to enqueue run/resume/kill commands and poll status. |
| dashboard/src/components/RunControls.test.tsx | Unit tests for enable/disable behavior based on lock state. |
| dashboard/src/components/RefreshIndicator.tsx | Displays refresh status, timestamp, cadence, and optional mode. |
| dashboard/src/components/RefreshIndicator.test.tsx | Unit tests for indicator text/time display. |
| dashboard/src/components/RefreshController.tsx | Chooses “live” vs “idle” polling cadence and renders indicator. |
| dashboard/src/components/RateLimitGauge.tsx | Displays Claude Code window usage as a progress bar. |
| dashboard/src/components/HealthStrip.tsx | Top chrome showing session status, alerts count, and budget summary. |
| dashboard/src/components/HealthStrip.test.ts | Unit tests for session status copy/behavior. |
| dashboard/src/components/GlobalStats.test.tsx | Regression test for approval rate display scale (0–100). |
| dashboard/src/components/CommandChips.tsx | Copy-to-clipboard CLI hints with insecure-context fallback. |
| dashboard/src/components/CommandChips.test.tsx | Tests clipboard and execCommand fallback behavior. |
| dashboard/src/components/BlockedTaskPanel.test.tsx | Tests blocked task triage rendering + fix_task enqueue behavior. |
| dashboard/src/app/projects/new/page.tsx | New project creation page. |
| dashboard/src/app/projects/[id]/error.tsx | Project-level error boundary UI. |
| dashboard/src/app/not-found.tsx | Not-found page. |
| dashboard/src/app/login/page.tsx | Login page that stores write token in localStorage. |
| dashboard/src/app/layout.tsx | Root layout with Sidebar + HealthStrip and force-dynamic to avoid stale chrome. |
| dashboard/src/app/globals.css | Tailwind base/global styles. |
| dashboard/src/app/error.tsx | Global error boundary UI. |
| dashboard/src/app/dashboard/page.tsx | Legacy route redirect to /. |
| dashboard/src/app/api/projects/[id]/tasks/route.ts | POST create-task handler with validation and lock/token gating. |
| dashboard/src/app/api/projects/[id]/tasks/route.test.ts | Tests create-task handler (auth, defaults, validation, lock). |
| dashboard/src/app/api/projects/[id]/tasks/[taskId]/route.ts | PATCH/DELETE task handler with whitelist application. |
| dashboard/src/app/api/projects/[id]/tasks/[taskId]/route.test.ts | Tests whitelist patching and delete semantics. |
| dashboard/src/app/api/projects/[id]/tasks/[taskId]/action/route.ts | Task action handler (retry/approve/skip) mutating task state. |
| dashboard/src/app/api/projects/[id]/route.ts | PATCH project handler with validation and updated_at bump. |
| dashboard/src/app/api/projects/[id]/route.test.ts | Tests project patch semantics and validation. |
| dashboard/src/app/api/projects/[id]/budget/route.ts | POST budget mutation handler on checkpoint rate limit fields. |
| dashboard/src/app/api/health/route.ts | Dynamic health endpoint to avoid build-time timestamp freezing. |
| dashboard/src/app/api/commands/route.ts | Enqueues host commands with early validation and lock pre-checks. |
| dashboard/src/app/api/commands/[id]/route.ts | Reads command status with UUID guard and token gating. |
| dashboard/src/app/api/commands/[id]/route.test.ts | Tests status endpoint and UUID validation behavior. |
| dashboard/src/app/api/auth/check/route.ts | Validates token (used by login flow). |
| dashboard/src/app/api/alerts/ack/route.ts | Acks/dismisses alerts atomically with token gating. |
| dashboard/src/app/api/alerts/ack/route.test.ts | Tests alert ack/dismiss flows and token requirements. |
| dashboard/README.md | Dashboard-local README for dev/test usage and architecture summary. |
| dashboard/public/.gitkeep | Keeps public/ in repo. |
| dashboard/postcss.config.js | PostCSS config for Tailwind pipeline. |
| dashboard/playwright.config.ts | E2E setup (port, seeded state, token, single-worker constraints). |
| dashboard/package.json | Dashboard dependencies and scripts (unit + e2e). |
| dashboard/next.config.js | Enables Next standalone output (Docker-friendly). |
| dashboard/fixtures/data/projects/squire-dashboard/tasks.json | Seed fixture tasks list for dashboard dev/testing. |
| dashboard/fixtures/data/projects/squire-dashboard/project.json | Seed fixture project.json. |
| dashboard/fixtures/data/projects/squire-dashboard/history.json | Seed fixture history events. |
| dashboard/fixtures/data/projects/squire-dashboard/commits.json | Seed fixture commits log. |
| dashboard/fixtures/data/projects/squire-dashboard/checkpoint.json | Seed fixture checkpoint data (cursor/context/rate_limit). |
| dashboard/fixtures/data/projects/minimal-template/tasks.json | Minimal legacy template fixture for normalization tests. |
| dashboard/fixtures/data/projects/minimal-template/homologation_log.json | Seed homologation log fixture for reading tests. |
| dashboard/fixtures/data/global-stats.json | Seed global-stats fixture for dashboard dev/testing. |
| dashboard/fixtures/data/alerts.json | Seed alerts fixture for banner rendering/dev/testing. |
| dashboard/e2e/tasks-crud.spec.ts | E2E: create/edit/delete task via UI. |
| dashboard/e2e/run-control.spec.ts | E2E: run/resume/kill enablement and 409 behavior with lock. |
| dashboard/e2e/login.spec.ts | E2E: login token flow and persistence. |
| dashboard/e2e/home.spec.ts | E2E: project listing and alert visibility. |
| dashboard/e2e/helpers.ts | E2E helper to inject token into localStorage. |
| dashboard/e2e/cli-hints.spec.ts | E2E: CLI hint rendering + clipboard copy behavior. |
| dashboard/e2e/chrome.spec.ts | E2E: verifies layout chrome is fresh (no prerender staleness regressions). |
| dashboard/e2e/blocked-triage.spec.ts | E2E: blocked task triage log rendering + fix_task enqueue into pending/. |
| dashboard/e2e/alerts.spec.ts | E2E: alert ack persists to filesystem. |
| dashboard/e2e/agent-roundtrip.spec.ts | Local-only E2E: end-to-end new_project via UI + agent --once processing. |
| dashboard/Dockerfile | Multi-stage build to standalone Next runner image. |
| dashboard/docker-compose.yml | Dashboard-only compose for isolated dev; main deploy is stack in deploy/. |
| dashboard/CLAUDE.md | Dashboard developer guide + data contracts + review checklist. |
| dashboard/.gitignore | Dashboard-specific gitignore (node/next outputs, e2e state, env). |
| dashboard/.dockerignore | Dashboard dockerignore for smaller build context. |
| config.py | Defaults STATE_ROOT and adds command queue / timeouts / agent settings. |
| checkpoint.py | Adds homologation log persistence, structured project lock field, and commits save/load. |
| accounting.py | New shared usage accounting + approval-rate bookkeeping helpers. |
| .env.example | Updates env example to guarded exports and revised defaults/backends. |
| .dockerignore | Excludes dashboard/deploy/env/etc from workspace image build context. |
| .devcontainer/devcontainer.json | Devcontainer attaches to workspace service without owning lifecycle. |
Copilot's findings
- Files reviewed: 184/186 changed files
- Comments generated: 4
    } > /etc/squire.env
    chmod 644 /etc/squire.env
Fixed in 60b233d — /etc/squire.env is now written 0600 and chowned to ai-debian (the only SSH user / the one that sources it in profile.d). Verified the login shell still sources it with the env cleared.
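For illustration only, the permission tightening described above can be sketched in Python. The actual change lives in the shell init script (docker/cont-init.d/00-init), which is not reproduced here; the names write_secret_env and mode_of, the `export KEY=value` line format, and the sample variable are all assumptions. The chown to ai-debian is left out because it needs root.

```python
import os
import shlex
import stat
import tempfile


def write_secret_env(path: str, variables: dict[str, str]) -> None:
    """Write `export KEY=value` lines to `path`, accessible by the owner only."""
    # Assumed file format: one shell-quoted export per line.
    body = "".join(
        f"export {key}={shlex.quote(value)}\n" for key, value in variables.items()
    )
    # Create with 0600 up front so a new file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # The mode given to os.open only applies to newly created files,
        # so enforce 0600 explicitly in case the file already existed as 0644.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(body)


def mode_of(path: str) -> int:
    """Return only the permission bits of `path`."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_path = os.path.join(tmp, "squire.env")
        write_secret_env(env_path, {"EXAMPLE_API_KEY": "not-a-real-key"})
        print(oct(mode_of(env_path)))
```

Setting the mode both at creation and via fchmod covers the two cases that matter here: a fresh file, and an existing file left over from an earlier run with looser permissions.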
    let timer: ReturnType<typeof setTimeout>;
    const schedule = () => {
      timer = setTimeout(tick, POLL_MS);
    };
    tick();

    return () => {
      cancelled = true;
      clearTimeout(timer);
    };
Fixed in 60b233d — timer is now ReturnType<typeof setTimeout> | null = null and the cleanup guards if (timer) clearTimeout(timer). useCommandPoll tests pass.
    subprocess.run(
        ["git", "add", "-A"], cwd=repo_path, capture_output=True, timeout=15
    )
Fixed in b119af9 — commit_all now checks git add -A's return code and returns failed: git add: <stderr> instead of falling through to commit with a misleading error.
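A minimal sketch of the return-code check described above, not the actual commit_all from the PR. Only the `failed: git add: <stderr>` message comes from the thread; the function signature, the injectable `run` parameter (added so the logic can be exercised without a real repository), the "ok" return value, and the `failed: git commit:` message are assumptions.

```python
import subprocess
from typing import Callable


def commit_all(
    repo_path: str,
    message: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Stage all changes and commit; return "ok" or a "failed: ..." string."""
    add = run(["git", "add", "-A"], cwd=repo_path, capture_output=True, timeout=15)
    if add.returncode != 0:
        # Stop here: committing after a failed add would report the wrong error.
        stderr = add.stderr.decode("utf-8", errors="replace").strip()
        return f"failed: git add: {stderr}"

    commit = run(
        ["git", "commit", "-m", message],
        cwd=repo_path,
        capture_output=True,
        timeout=15,
    )
    if commit.returncode != 0:
        stderr = commit.stderr.decode("utf-8", errors="replace").strip()
        return f"failed: git commit: {stderr}"
    return "ok"
```

Passing a fake `run` that returns a non-zero `CompletedProcess` for the add step shows that the commit step is never reached in that case.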
- docker/cont-init.d/00-init: write /etc/squire.env with 0600 + ai-debian ownership (it can hold API keys); was world-readable 0644.
- dashboard/fixtures/data/global-stats.json: approval_first_try_rate uses the 0–100 scale the UI renders (75.0), not 0–1 (0.75 → misleading "1%").
- dashboard/src/hooks/useCommandPoll.ts: make the poll timer nullable and guard clearTimeout in cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
git add -A's return code was ignored, so an add failure (bad permissions, invalid pathspec, broken repo) would fall through to git commit and report a misleading error. Capture stderr and return a clear "failed: git add: …".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Two related changes that turn squire + dashboard into a single, isolated, self-contained unit.
1. Containerized two-container stack (9ff99a7)
A headless workspace container (orchestrator + command-queue agent + sshd + toolchains: python/git/node/claude/opencode) supervised by s6-overlay, plus the existing dashboard image, as a deploy/docker-compose.yml stack sharing a named squire-state volume.
- :2222, dashboard :3101, hard memory caps (mem_limit/memswap_limit, no host swap).
- systemd --user agent unit; the agent runs as uid 1000.
- requires_container_build critical alert instead of burning attempts. The host docker socket is never mounted.

2. Consolidate dashboard into squire/dashboard/ (b638373, fdea4f9)
The dashboard was a sibling repo; the stack hardcoded ../../squire-dashboard and the JSON schema was hand-mirrored across two repos.
- git subtree: full 40-commit dashboard history preserved under dashboard/.
- ../dashboard; dashboard/ excluded from the workspace image context.
- repo_path hygiene.

Verification
- docker compose -f deploy/docker-compose.yml up -d → both containers healthy; dashboard serves the migrated projects.
- squire doctor: 0 fail.
- .dockerignore honored: workspace build context stays minimal (no dashboard/node_modules bloat).

Notes
- /home/ai-debian/squire-dashboard folder is left on disk as rollback; remove after soak.

🤖 Generated with Claude Code