An autonomous expedition loop that picks Linear issues, implements code changes, opens PRs, and iterates through review cycles until the backlog is drained.
Paintress uses Claude Code to automatically process Linear issues — implementing code, running tests, creating PRs, running code reviews, verifying UI, and fixing bugs — with no human intervention, until every issue is done. In Swarm Mode (--workers N), multiple expeditions run in parallel using git worktrees for isolation.
```bash
paintress --model opus,sonnet ./your-repo
```

This single command makes Paintress repeat the following cycle:
- Fetch an unfinished issue from Linear
- Analyze it and determine the mission type: implement / verify / fix
- Claude Code creates a branch, implements, tests, opens a PR
- Run code review gate — review comments trigger automatic fixes (up to 3 cycles)
- Record results, move to the next issue
- Stop when all issues are complete or max expeditions reached
- Enter D-Mail waiting mode — monitor inbox/ via fsnotify for incoming D-Mails
- On D-Mail arrival, re-run the expedition loop; on timeout (default 30m), exit
The system design is inspired by the world structure of Clair Obscur: Expedition 33, an RPG.
In the game world, a being called the Paintress paints a number on a monolith each year, erasing everyone of that age. Every year, the people send an Expedition to destroy her — but every expedition fails. Only their flags and journals remain as guideposts for the next.
This structure maps directly to AI agent loop design:
| Game Concept | Paintress | Design Meaning |
|---|---|---|
| Paintress | This binary | External force that drives the loop |
| Monolith | Linear backlog | The remaining issue count is inscribed |
| Expedition | One Claude Code execution | Departs with fresh context each time |
| Expedition Flag | `.expedition/.run/flag.md` | Per-worker checkpoint, consolidated at exit |
| Journal | `.expedition/journal/` | Record of past decisions and lessons |
| Canvas | LLM context window | Beautiful but temporary — destroyed each run |
| Lumina | Auto-extracted patterns | Patterns learned from past failures/successes |
| Gradient Gauge | Consecutive success tracker | Momentum unlocks harder challenges |
| Reserve Party | Model fallback | When Opus falls, Sonnet takes over |
- Always destroy the Canvas — LLM context is reset every run. A fresh start beats a polluted context.
- Plant the Flag well — Loop quality depends on what you pass to the next Expedition. Checkpoints and Lumina are the lifeline.
- Make the Gommage your ally — Failure (erasure) isn't the end; it's a chance to accumulate Lumina. Consecutive failures trigger class-aware recovery: transient failures (timeout, rate limit, parse error) retry with cooldown, while persistent failures (blocker, systematic) halt and escalate. Recovery resets counters, injects Lumina hints, and starts a fresh attempt for the same issue type.
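As a rough sketch of the class-aware recovery rule above, transient and persistent failures can be told apart by scanning the recorded failure reason. The function name and keyword lists below are illustrative assumptions, not Paintress's actual signals:

```go
package main

import (
	"fmt"
	"strings"
)

// classifyFailure sketches class-aware recovery: transient failures are
// retried after a cooldown, persistent ones halt and escalate.
// Keyword lists here are illustrative, not Paintress's real detection logic.
func classifyFailure(reason string) string {
	r := strings.ToLower(reason)
	for _, kw := range []string{"timeout", "rate limit", "parse error"} {
		if strings.Contains(r, kw) {
			return "transient" // retry with cooldown
		}
	}
	for _, kw := range []string{"blocker", "systematic"} {
		if strings.Contains(r, kw) {
			return "persistent" // halt and escalate
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(classifyFailure("API rate limit exceeded"))   // transient
	fmt.Println(classifyFailure("systematic test breakage"))  // persistent
}
```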
Three game mechanics autonomously control loop quality:
Consecutive successes fill the gauge, unlocking higher-difficulty issues.
```
[░░░░░] 0/5 → Start with small, safe issues
[██░░░] 2/5 → Normal priority
[████░] 4/5 → High priority OK
[█████] 5/5 → GRADIENT ATTACK: tackle the most complex issue
```
- Success → +1 (Charge)
- Skip → -1 (Decay)
- Failure → Reset to 0 (Discharge)
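The charge/decay/discharge mechanics above reduce to a small state machine. This is a minimal sketch (type and method names are assumptions, not Paintress's actual API):

```go
package main

import "fmt"

// gauge models the Gradient Gauge: successes charge it, skips decay it,
// and a failure discharges it to zero.
type gauge struct{ level, max int }

func (g *gauge) record(outcome string) {
	switch outcome {
	case "success": // Charge
		if g.level < g.max {
			g.level++
		}
	case "skip": // Decay
		if g.level > 0 {
			g.level--
		}
	case "failure": // Discharge
		g.level = 0
	}
}

// difficulty maps the gauge level to the issue tier the next
// expedition may attempt, mirroring the ladder shown above.
func (g *gauge) difficulty() string {
	switch {
	case g.level >= g.max:
		return "gradient-attack"
	case g.level >= 4:
		return "high"
	case g.level >= 2:
		return "normal"
	default:
		return "safe"
	}
}

func main() {
	g := &gauge{max: 5}
	for _, o := range []string{"success", "success", "success", "success", "success"} {
		g.record(o)
	}
	fmt.Println(g.level, g.difficulty()) // 5 gradient-attack
}
```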
Past journals are scanned in parallel goroutines to extract recurring patterns, which are injected directly into the next Expedition's prompt.
- Defensive: Insights from failed expeditions that appear 2+ times → "Avoid — failed N times: ..." (falls back to failure reason if no insight)
- Offensive: Insights from successful expeditions that appear 3+ times → "Proven approach (Nx successful): ..." (falls back to mission type if no insight)
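The two thresholds above amount to a counting pass over journal insights. A minimal sketch, assuming a simple insight→count map (the function name and data shape are illustrative, not Paintress's actual API):

```go
package main

import "fmt"

// extractLumina applies the thresholds described above: defensive hints
// require an insight seen in 2+ failed expeditions, offensive hints
// require 3+ successful ones.
func extractLumina(failed, succeeded map[string]int) []string {
	var hints []string
	for insight, n := range failed {
		if n >= 2 {
			hints = append(hints, fmt.Sprintf("Avoid — failed %d times: %s", n, insight))
		}
	}
	for insight, n := range succeeded {
		if n >= 3 {
			hints = append(hints, fmt.Sprintf("Proven approach (%dx successful): %s", n, insight))
		}
	}
	return hints
}

func main() {
	hints := extractLumina(
		map[string]int{"forgot to run migrations": 2, "flaky e2e suite": 1},
		map[string]int{"small focused PRs pass review": 3},
	)
	for _, h := range hints {
		fmt.Println(h)
	}
}
```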
The output streaming goroutine detects rate limits in real-time and cascades through available models automatically. Each model has an independent 30-minute cooldown, so a three-tier configuration can fall back from Opus to Sonnet to Haiku without waiting.
```bash
# Opus primary, Sonnet reserve
paintress --model opus,sonnet ./repo

# Three-tier cascade fallback
paintress --model opus,sonnet,haiku ./repo
```

- Rate limit detected → put current model in per-model cooldown → switch to next available model
- After 30-min cooldown expires → attempt recovery to primary
- Timeout also triggers cascade switch (possible rate limit)
Additional systems that improve expedition quality across runs:
ClassifyCapabilityViolation scans journal text for signals indicating the expedition hit an environment boundary (network access, filesystem permissions, missing tools, Docker unavailability, auth failures, resource limits). Detected violations are recorded and injected into the Capability Boundary section of subsequent expedition prompts to prevent repeated failures.
ReflectionAccumulator collects review comments across review-fix cycles within a single expedition. It tracks priority tag counts per cycle and detects stagnation (tag counts not decreasing across cycles). FormatForPrompt renders the accumulated history for injection into fix prompts.
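The stagnation check reduces to comparing tagged-comment counts between consecutive cycles. A minimal sketch under the assumption that each cycle's priority-tag count is recorded as an integer (the function name and data shape are illustrative):

```go
package main

import "fmt"

// stagnant reports whether the fix loop has stalled: if the number of
// priority-tagged review comments did not decrease from the previous
// cycle to the latest one, no progress is being made.
func stagnant(tagCountsPerCycle []int) bool {
	n := len(tagCountsPerCycle)
	if n < 2 {
		return false // need at least two cycles to compare
	}
	return tagCountsPerCycle[n-1] >= tagCountsPerCycle[n-2]
}

func main() {
	fmt.Println(stagnant([]int{5, 3, 1})) // false: counts shrinking
	fmt.Println(stagnant([]int{5, 3, 3})) // true: counts flat, stalled
}
```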
StrategyForCycle rotates through three fix strategies across review-fix cycles: Direct (cycle 1) applies review comments directly, Decompose (cycle 2) breaks comments into sub-tasks, Rewrite (cycle 3) rewrites the affected section from scratch. The rotation repeats for longer review chains.
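The rotation is plain modular arithmetic over 1-based cycle numbers. A sketch (the string labels are assumptions for illustration):

```go
package main

import "fmt"

// strategyForCycle rotates Direct -> Decompose -> Rewrite and repeats
// for longer review chains; cycle numbers start at 1.
func strategyForCycle(cycle int) string {
	switch (cycle - 1) % 3 {
	case 0:
		return "direct" // apply review comments as-is
	case 1:
		return "decompose" // break comments into sub-tasks
	default:
		return "rewrite" // rewrite the affected section from scratch
	}
}

func main() {
	for cycle := 1; cycle <= 5; cycle++ {
		fmt.Println(cycle, strategyForCycle(cycle))
	}
}
```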
IssueClaimRegistry prevents multiple parallel workers (Swarm Mode) from working on the same Linear issue simultaneously. Thread-safe via mutex; TryClaim returns the holding expedition number on conflict.
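A mutex-guarded map is enough to show the shape of this deduplication. This is a minimal sketch; the real IssueClaimRegistry's API may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// claimRegistry prevents two parallel workers from claiming the same
// Linear issue. On conflict, tryClaim reports the holder's expedition number.
type claimRegistry struct {
	mu     sync.Mutex
	claims map[string]int // issue ID -> expedition number holding it
}

// tryClaim returns (0, true) on a successful claim, or the holding
// expedition number and false when the issue is already taken.
func (r *claimRegistry) tryClaim(issueID string, expedition int) (int, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if holder, ok := r.claims[issueID]; ok {
		return holder, false
	}
	r.claims[issueID] = expedition
	return 0, true
}

func main() {
	r := &claimRegistry{claims: map[string]int{}}
	_, ok := r.tryClaim("LIN-42", 7)
	holder, ok2 := r.tryClaim("LIN-42", 9)
	fmt.Println(ok, holder, ok2) // true 7 false
}
```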
ExpeditionDurations pairs start/complete events to compute per-expedition durations. DurationPercentiles calculates p50, p90, and p99 from the duration list. Telemetry breakdown attributes track time spent in each expedition phase.
WindowedSuccessRate computes success rate over the most recent N completed expeditions. DetectSuccessRateTrend compares the recent window against the preceding window to detect improvement, decline, or stability (threshold: 10% change).
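The window-vs-window comparison can be sketched as below, assuming outcomes are booleans in completion order (names and the exact API are illustrative, not Paintress's):

```go
package main

import "fmt"

// rate returns the success fraction of a window of outcomes.
func rate(window []bool) float64 {
	if len(window) == 0 {
		return 0
	}
	n := 0
	for _, ok := range window {
		if ok {
			n++
		}
	}
	return float64(n) / float64(len(window))
}

// trend compares the most recent n outcomes against the n before them,
// using the 10% threshold described above.
func trend(outcomes []bool, n int) string {
	if len(outcomes) < 2*n {
		return "insufficient-data"
	}
	recent := rate(outcomes[len(outcomes)-n:])
	prior := rate(outcomes[len(outcomes)-2*n : len(outcomes)-n])
	switch {
	case recent-prior > 0.10:
		return "improving"
	case prior-recent > 0.10:
		return "declining"
	default:
		return "stable"
	}
}

func main() {
	// prior window 1/3 successful, recent window 3/3 successful
	fmt.Println(trend([]bool{false, false, true, true, true, true}, 3)) // improving
}
```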
AcquireContext runs a git status health check on acquired worktrees before returning them to workers. If the worktree is corrupted or inaccessible, it is automatically force-recycled and a fresh worktree is created. Acquired worktrees are also cleaned up on Shutdown.
Per-file and total byte limits prevent oversized context files from bloating the expedition prompt. Files exceeding the per-file limit are excluded with a warning; the total budget caps aggregate context size.
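The two-level budget can be sketched as a single filtering pass: drop oversized files with a warning, then stop once the aggregate budget is spent. Names and the byte limits below are illustrative assumptions:

```go
package main

import "fmt"

type ctxFile struct {
	name string
	size int // bytes
}

// applyBudget excludes files over the per-file limit (with a warning)
// and keeps remaining files until the total budget is exhausted.
func applyBudget(files []ctxFile, perFile, total int) (kept []ctxFile) {
	used := 0
	for _, f := range files {
		if f.size > perFile {
			fmt.Printf("WARN: %s exceeds per-file limit, excluded\n", f.name)
			continue
		}
		if used+f.size > total {
			break // total budget spent
		}
		kept = append(kept, f)
		used += f.size
	}
	return kept
}

func main() {
	kept := applyBudget([]ctxFile{
		{"style.md", 2_000},
		{"huge-dump.md", 900_000}, // over the per-file limit
		{"domain.md", 3_000},      // would blow the total budget
	}, 64_000, 4_000)
	for _, f := range kept {
		fmt.Println(f.name) // style.md
	}
}
```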
ExtractReviewComments parses review tool output into structured ReviewComment values with priority sorting ([P0] highest). Falls back to raw text when structured parsing fails.
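Priority-tag parsing with `[P0]`-first ordering can be sketched with a regular expression and a stable sort (field and function names are assumptions; the real fallback to raw text is omitted here):

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// reviewComment is a structured review finding; priority 0 means [P0],
// the highest urgency.
type reviewComment struct {
	priority int
	text     string
}

var tagRe = regexp.MustCompile(`^\[P([0-4])\]\s*(.*)`)

// extractReviewComments parses priority-tagged lines and sorts them
// with [P0] first; untagged lines are skipped in this sketch.
func extractReviewComments(lines []string) []reviewComment {
	var out []reviewComment
	for _, l := range lines {
		if m := tagRe.FindStringSubmatch(l); m != nil {
			out = append(out, reviewComment{int(m[1][0] - '0'), m[2]})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].priority < out[j].priority })
	return out
}

func main() {
	for _, c := range extractReviewComments([]string{
		"[P2] Consider a table-driven test",
		"[P0] Nil pointer dereference in Release",
	}) {
		fmt.Printf("[P%d] %s\n", c.priority, c.text)
	}
}
```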
Escalation events fire once per failure streak rather than on every consecutive failure. Retry backoff is capped via NewRetryTrackerWithMax with an Exhausted check.
ExcludeIssuesByLabel filters Linear issues by label (case-insensitive match), allowing teams to exclude issues tagged with specific labels from the expedition loop.
Paintress communicates with external tools (phonewave, sightjack, amadeus) via the D-Mail protocol — Markdown files with YAML frontmatter exchanged through inbox/ and outbox/ directories. Each message carries a dmail-schema-version field (currently "1") for protocol compatibility.
- Inbound: External tools write specification/implementation-feedback d-mails to `inbox/`. Paintress scans and embeds them in the expedition prompt.
- Pre-Flight Triage: Before each expedition, `triagePreFlightDMails` processes action fields: `escalate` (consume + emit event), `resolve` (consume + emit resolved event), `retry` (pass through, or escalate if over max retries). Triaged-out D-Mails are archived immediately.
- Outbound: After a successful expedition, a report d-mail is written to `archive/` first, then `outbox/` (archive-first for durability).
- HIGH Severity Gate: HIGH severity d-mails trigger desktop notification + human approval before the expedition starts. See docs/approval-contract.md.
- Skills: Agent skill manifests (`SKILL.md`) in `.expedition/skills/` follow the Agent Skills specification, declaring D-Mail capabilities under `metadata`.
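For orientation, a minimal implementation-feedback D-Mail might look like the file below. Only `dmail-schema-version`, the kind names, the `action` values, and HIGH severity come from this section; the remaining field names and the body are illustrative guesses, so treat docs/dmail-protocol.md as the authoritative schema.

```markdown
---
dmail-schema-version: "1"
kind: implementation-feedback   # consumed by Paintress (was: feedback)
severity: HIGH                  # HIGH triggers the human approval gate
action: retry                   # escalate | resolve | retry
from: sightjack                 # illustrative sender field
---

The settings page renders a blank panel after login; please re-verify
once the fix lands.
```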
BREAKING: The `feedback` kind has been split into `design-feedback` and `implementation-feedback`. Paintress consumes `implementation-feedback` (not the old `feedback`). Run `paintress doctor` to detect deprecated kinds and `paintress init --force [path]` to regenerate SKILL.md files.
Full protocol details: docs/dmail-protocol.md | Directory structure: docs/expedition-directory.md
```
Paintress (binary)          <- Outside the repository
 |
 | Pre-flight:
 |   +-- goroutine: parallel journal scan -> Lumina extraction
 |   +-- PreflightCheckRemote (verify git remote exists)
 |   +-- WorktreePool.Init (when --workers >= 1)
 |
 | Per Expedition:
 |   +-- IssueClaimRegistry.TryClaim (Swarm Mode dedup)
 |   +-- triagePreFlightDMails (escalate/resolve/retry)
 |   +-- Gradient Gauge check -> difficulty hint
 |   +-- Reserve Party check -> primary recovery attempt
 |   +-- StrategyForCycle -> fix strategy selection
 |   +-- ReflectionAccumulator -> stagnation detection
 |
 v
Monolith (Linear)           <- Fully external
 |
 v
WorktreePool                <- Isolated worktrees for parallel workers (Swarm Mode)
 |
 v
Expedition (Claude Code)    <- One session per issue
 |
 v
Review Gate (exec)          <- Code review tool + Claude Code --continue (up to 3 cycles)
 |
 v
D-Mail Waiting Loop         <- fsnotify inbox/ watch (--idle-timeout, default 30m)
 |  On D-Mail arrival: re-run expedition loop
 |  On timeout/signal: clean exit
 v
Continent (Git repo)        <- Persistent world
+-- src/
+-- CLAUDE.md
+-- .expedition/
|   +-- config.yaml         <- Project config (paintress init)
|   +-- journal/
|   |   +-- 001.md, 002.md, ...
|   +-- context/            <- User-provided .md files injected into prompts
|   +-- skills/             <- Agent skill manifests (SKILL.md)
|   +-- inbox/              <- Incoming d-mails (gitignored, transient)
|   +-- outbox/             <- Outgoing d-mails (gitignored, transient)
|   +-- archive/            <- Processed d-mails (tracked, audit trail)
|   +-- insights/           <- Insight Ledger (tracked, lumina.md + gommage.md)
|   +-- events/             <- Append-only event store (JSONL, gitignored)
|   +-- .run/               <- Ephemeral (gitignored)
|       +-- flag.md         <- Checkpoint (consolidated from per-worker flags at exit)
|       +-- logs/           <- Expedition logs
+-- worktrees/              <- Managed by WorktreePool
    +-- worker-001/
    |   +-- .expedition/.run/flag.md  <- Per-worker checkpoint
    +-- worker-002/
        +-- .expedition/.run/flag.md  <- Per-worker checkpoint
```
- Init — `git worktree prune`, then for each worker: force-remove leftover → `git worktree add --detach` → run `--setup-cmd` if set
- Acquire — Worker claims a worktree from the pool (blocks if all in use). `AcquireContext` runs a `git status` health check; corrupted worktrees are force-recycled and re-created automatically.
- Release — After each expedition: `git checkout --detach <base-branch>` → `git reset --hard <base-branch>` → `git clean -fd -e .expedition` → return to pool. The `-e .expedition` exclusion preserves the per-worker flag.md across releases. Checkout/reset failures trigger automatic worktree recycling.
- Consolidate — After all workers complete: `reconcileFlags` scans all worktree flag.md files, picks the one with the highest LastExpedition, and writes it back to the Continent's flag.md for human inspection and the next startup.
- Shutdown — On exit (30s timeout, independent of the parent context): `git worktree remove -f` for each worktree → `git worktree prune`
When `--workers 0` is set, no pool is created and expeditions run directly on the repository. The flag.md path is unified as `flagDir = workDir` (the worktree path when Workers>0, the Continent when Workers=0). No mutex is needed, since each worker has exclusive access to its own flag.md. With workers=0, `reconcileFlags` skips the worktree glob scan and reads only the Continent flag.md, avoiding stale worktree contamination from crashed prior runs.
| Goroutine | Role | Game Concept |
|---|---|---|
| Signal handler | SIGINT/SIGTERM → context cancel | — |
| Dev server | Background startup & monitoring | Camp |
| Journal scanner | Parallel file reads → Lumina extraction | Resting at Flag |
| Worker (N) | Expedition loop per worktree (Swarm Mode) | Expedition Party |
| Output streaming | stdout tee + rate limit detection | Reserve Party standby |
| Flag watcher | fsnotify: detect issue selection in real-time | Expedition Flag |
| Inbox watcher | fsnotify: detect d-mails arriving mid-expedition | D-Mail courier |
| Timeout watchdog | context.WithTimeout | Gommage (time's up) |
After a successful Expedition creates a PR, Paintress runs an automated code review via a configurable command (default: codex review --base main). The review tool is customizable via --review-cmd and can be any linter, code review tool, or custom script. The review runs outside the LLM context window to avoid polluting the Expedition's Canvas.
- Pass: Review finds no actionable issues → proceed to next Expedition
- Fail: Review comments tagged `[P0]`–`[P4]` are detected → Claude Code resumes the Expedition session (`--continue`) to fix them, reusing full implementation context
- Retry: Up to 3 review-fix cycles per Expedition; unresolved insights are recorded in the journal
- Timeout: The entire review loop is bounded by the expedition timeout (`--timeout`)
- Rate limit / error: Review is skipped gracefully (logged as WARN, does not block the loop)
To disable the review gate entirely, pass an empty review command (`--review-cmd ""`).
What Paintress does:
- Autonomously pick Linear issues and implement code changes via Claude Code
- Create branches, run tests, open PRs, and iterate through code review cycles
- Manage parallel expeditions in isolated git worktrees (Swarm Mode)
- Send report D-Mails to downstream tools after successful expeditions
- Enter D-Mail waiting mode after expeditions complete, re-running on incoming D-Mails
What Paintress does NOT do:
- Edit Linear issues directly (only reads issues for implementation)
- Manage git branches on the main repository (uses worktrees for isolation)
- Handle authentication setup (assumes Linear, GitHub CLI, and Claude Code are pre-configured)
- Verify post-merge design integrity (amadeus handles that)
```bash
# Install via Homebrew (WIP — tap may not be published yet)
brew install hironow/tap/paintress

# Or build from source
just install

# Initialize project config (Linear team key, etc.)
paintress init /path/to/your/repo

# Generate Claude subprocess isolation settings
paintress mcp-config generate /path/to/your/repo

# Upgrade existing project (regenerate SKILL.md, etc.)
paintress init --force /path/to/your/repo

# Check external command availability, git remote, deprecated kinds, context-budget per-item diagnostics
paintress doctor

# Run — .expedition/ is created automatically
paintress /path/to/your/repo
```

Paintress creates .expedition/ with config, journal entries, and ephemeral runtime state under .run/ automatically. Mission and Lumina content are embedded directly in the expedition prompt (no separate files on disk).
Git worktrees for Swarm Mode are also fully managed — Paintress creates them
on startup and removes them on shutdown. No manual git worktree commands needed.
Running paintress without a subcommand defaults to run (expedition loop).
| Command | Description |
|---|---|
| `run` | Run expedition loop (default) |
| `init` | Initialize `.expedition/config.yaml` |
| `doctor` | Check environment health |
| `issues` | Query Linear issues via Claude MCP |
| `config show` / `config set` | View or update configuration |
| `status` | Show operational status |
| `clean` | Remove state directory |
| `rebuild` | Rebuild projections from event store |
| `archive-prune` | Prune old archived D-Mail files |
| `version` | Print version info |
| `mcp-config generate` | Generate `.mcp.json` and `.claude/settings.json` for subprocess isolation |
| `update` | Self-update to the latest release |
All commands accept an optional [path] argument (defaults to cwd). For flags, examples, and full reference per subcommand, see docs/cli/.
```bash
paintress init                     # set up .expedition/
paintress mcp-config generate      # Claude subprocess isolation settings
paintress run                      # expedition loop
paintress run -n                   # dry run
paintress run -m opus,sonnet -w 3  # swarm mode
```

Paintress stores project configuration in .expedition/config.yaml (generated by `paintress init`). See docs/expedition-directory.md for the full directory structure.
Paintress instruments key operations (expedition, review loop, worktree pool, dev server) with OpenTelemetry spans and events. Tracing is off by default (noop tracer) and activates when OTEL_EXPORTER_OTLP_ENDPOINT is set.
```bash
# Start Jaeger (all-in-one trace viewer)
docker compose -f docker/compose.yaml up -d

# Run paintress with tracing enabled
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 paintress run

# View traces at http://localhost:16686
```

All code lives in `internal/` (Go convention). The `internal/harness/` layer provides the decision/validation/prompt-rendering boundary between the LLM and the environment, organized as `policy/` (deterministic decisions), `verifier/` (output validation), and `filter/` (prompt construction) behind a single facade. See docs/conformance.md for the full layer architecture, dependency rules, and directory responsibilities. Run `just --list` for available tasks.
Paintress ships three companion binaries for sending notifications and approval requests to chat platforms. They plug into --notify-cmd and --approve-cmd:
| Binary | Platform | Transport | Env Vars |
|---|---|---|---|
| `paintress-tg` | Telegram | Bot API (long polling) | `PAINTRESS_TG_TOKEN`, `PAINTRESS_TG_CHAT_ID` |
| `paintress-discord` | Discord | Bot Gateway (WebSocket) | `PAINTRESS_DISCORD_TOKEN`, `PAINTRESS_DISCORD_CHANNEL_ID` |
| `paintress-slack` | Slack | Socket Mode (WebSocket) | `PAINTRESS_SLACK_TOKEN`, `PAINTRESS_SLACK_CHANNEL_ID`, `PAINTRESS_SLACK_APP_TOKEN` |
Each binary provides three subcommands: notify, approve, and doctor.
```bash
# Example: Slack notifications + Telegram approval
paintress \
  --notify-cmd 'paintress-slack notify "{message}"' \
  --approve-cmd 'paintress-tg approve "{message}"' \
  /path/to/repo

# Check companion setup
paintress-tg doctor
paintress-discord doctor
paintress-slack doctor
```

All companions follow the approval contract: exit 0 = approved, non-zero exit = denied.
Build from source: `just install-all` (installs all 4 binaries to /usr/local/bin). Homebrew (`brew install hironow/tap/paintress`) is WIP.
See docs/conformance.md for the full conformance table (single source).
- docs/ — Full documentation index
- docs/conformance.md — What/Why/How conformance table
- docs/expedition-directory.md — `.expedition/` directory structure
- docs/policies.md — Event → Policy mapping
- docs/otel-backends.md — OTel backend configuration
- docs/approval-contract.md — Three-way approval contract
- docs/testing.md — Test strategy and conventions
- docs/adr/ — Architecture Decision Records
- docs/shared-adr/ — Cross-tool shared ADRs
- Go 1.26+
- just task runner
- Claude Code CLI
- A code review CLI (for the code review gate, customizable via `--review-cmd`, e.g. tools that output `[P0]`–`[P4]` priorities)
- GitHub CLI for Pull Request operations
- Linear: accessible for Issue operations (e.g. Linear MCP)
- Docker for tracing (Jaeger) and container tests
- Browser automation (for verify missions): e.g. Playwright, Chrome DevTools
Apache License 2.0. See LICENSE for details.