- @.claude/docs/REQUIREMENTS.md - Acceptance criteria + UI contract
- @.claude/docs/ARCHITECTURE.md - Data models, file storage, error/exit tables
# Install dependencies
pnpm install
# Run CLI in dev mode
pnpm dev --help
pnpm dev sprint create
# Or run installed CLI (works from any directory)
./bin/ralphctl
# Run without args for the Ink-based terminal app (recommended)
pnpm dev

Before committing any code change, run `/verify` (wraps `pnpm typecheck && pnpm lint && pnpm test`). All three must pass.
- Node.js 24+ (managed via `mise.toml`)
- pnpm 10+
- Claude CLI or GitHub Copilot CLI installed and configured (see Provider Configuration below)
- No `sprint activate` command: `sprint start` auto-activates draft sprints
- `affectedRepositories` stores absolute paths (not names): set during `sprint plan`, persisted per-ticket
- Refinement is per-ticket: template uses `{{TICKET}}` (singular), one AI session per ticket
- Planning is per-sprint: repo selection applies to all tickets, paths saved per-ticket
- `currentSprint` (config.json pointer) is NOT the same as sprint status (lifecycle state)
- `aiProvider` is a global config setting, not per-sprint; it is stored in config.json
- Check scripts come ONLY from explicit repo config, set during `project add` or `project repo add`; heuristic detection (`src/integration/external/detect-scripts.ts`) is used only as editable suggestions during project setup, never as a runtime fallback
- `RALPHCTL_SETUP_TIMEOUT_MS`: env var to override the 5-minute default timeout for check scripts
- Check tracking: `sprint.checkRanAt` records per-repo timestamps; re-runs skip already-completed checks; `--refresh-check` forces re-execution; cleared on sprint close
- Post-task gate: harness runs `checkScript` after every AI task; task not marked done if gate fails
- Branch management: `sprint start` prompts for branch strategy on first run; `sprint.branch` persists the choice; branches created in all repos with tasks; pre-flight verifies correct branch before each task; `--branch` auto-generates `ralphctl/<sprint-id>`; `--branch-name <name>` for custom names; `sprint close --create-pr` creates PRs
- Evaluator pattern: independent code review after each task (see REQUIREMENTS.md § Evaluator Pattern for full spec):
  - `evaluationIterations` is global (config.json). Default `1` = 1 initial eval + up to 1 fix-and-reeval. `0` disables.
  - Claude uses a model ladder (Opus→Sonnet, Sonnet→Haiku, Haiku→Haiku); Copilot uses the same model (no control).
  - Evaluator is autonomous (full tool access) and grades four floor dimensions (Correctness / Completeness / Safety / Consistency) plus optional `extraDimensions` emitted per-task by the planner (undefined = floor-only).
  - Full critique persists to `<sprintDir>/evaluations/<taskId>.md`; `tasks.json` keeps a 2000-char preview + status.
  - Evaluator never blocks: the task always completes and the sprint continues. A failed / malformed / plateau critique logs a warning and persists the full text to `evaluations/<taskId>.md` for later review; `mark-done` still runs.
  - `--no-evaluate` skips evaluation for one run; session mode disables it implicitly; `evaluationIterations: 0` (via settings panel / `config set`) disables it globally.
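The model ladder and iteration budget described above can be sketched as a lookup table plus a small helper. This is an illustration only: `ClaudeModel`, `EVALUATOR_LADDER`, `evaluatorModelFor`, and `maxEvaluatorRuns` are hypothetical names, and generalizing "1 initial eval + up to 1 fix-and-reeval" to any value of `evaluationIterations` is an assumption.

```typescript
// Hypothetical names; the source states only the ladder itself
// (Opus→Sonnet, Sonnet→Haiku, Haiku→Haiku), not how it is implemented.
type ClaudeModel = "opus" | "sonnet" | "haiku";

const EVALUATOR_LADDER: Record<ClaudeModel, ClaudeModel> = {
  opus: "sonnet",
  sonnet: "haiku",
  haiku: "haiku", // bottom rung: the evaluator reuses the same model
};

function evaluatorModelFor(generator: ClaudeModel): ClaudeModel {
  return EVALUATOR_LADDER[generator];
}

// Assumption: N iterations means 1 initial eval + up to N fix-and-reeval rounds.
// 0 disables evaluation entirely.
function maxEvaluatorRuns(evaluationIterations: number): number {
  return evaluationIterations <= 0 ? 0 : 1 + evaluationIterations;
}
```

With the default `evaluationIterations` of `1`, this sketch allows at most two evaluator runs per task.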
- Result boundaries: persistence layer functions throw domain errors. Result types (`wrapAsync`, `zodParse`) are used at command/interactive boundaries to handle errors without throwing. Prefer `.ok` property checks over `.match()` chains.
- Clean Architecture layering: `domain` < `business` < `integration` < `application`. Inner layers never import from outer layers. Use cases depend on service ports (`src/business/ports/`); repository interfaces are pure-domain (every port lives in `src/business/ports/`). Concrete adapters live under `src/integration/`.
- Pipelines are the orchestration layer: every user-triggered workflow (refine, plan, ideate, evaluate, execute) is a composable `PipelineDefinition` in `src/business/pipelines/`, composed via `pipeline()` / `step()` from `src/business/pipelines/framework/helpers.ts` with shared building blocks in `src/business/pipelines/steps/`. CLI commands and TUI views invoke `createXxxPipeline()` factories from `src/application/factories.ts` and call `executePipeline(...)`, never `useCase.execute()` directly. An ESLint `no-restricted-imports` fence in `eslint.config.js` enforces the boundary (type-only imports allowed).
- Pipeline framework primitives: extend with `insertBefore` / `insertAfter` / `replace` (pure builders) rather than rewriting the step array. Use `nested(pipeline)` to embed one pipeline as a step of another (composite pattern); use `forEachTask()` to fan out an inner pipeline per item with mutex-keyed concurrency, retry policy, and a shared rate-limit coordinator + signal-bus lifecycle.
- Integration tests lock step order: each pipeline has a test under `src/business/pipelines/*.test.ts` that asserts `stepResults.map(r => r.stepName)` on the happy path and failure paths. These tests are the architectural fence that prevents silent bypass; docs alone aren't enforcement.
- No barrel files: every import points to the source module directly. Never add an `index.ts` that only re-exports from siblings; tree-shaking and import clarity beat brevity at the call site.
- Ink TUI is the default interactive surface: bare `ralphctl` / `ralphctl interactive` / `ralphctl sprint start` mount the Ink app via `src/integration/ui/tui/runtime/mount.tsx`. The mount path takes over the terminal using the alt-screen buffer (vim/htop-style) and restores it on exit via `src/integration/ui/tui/runtime/screen.ts`. Non-TTY / CI / piped invocations fall back automatically to Commander + PlainTextSink.
- PromptPort is the only interactive-prompt abstraction: call sites use `getPrompt()` from `src/integration/bootstrap.ts`. `InkPromptAdapter` is the single implementation. When a prompt fires and the full dashboard isn't mounted (one-shot commands like `ralphctl project add`), the adapter auto-mounts a minimal Ink tree via `src/integration/ui/prompts/auto-mount.tsx` containing only `<PromptHost />`, drains the prompt queue, and unmounts. Non-interactive environments throw `PromptCancelledError`; pass values as flags instead.
- LoggerPort is the only logging abstraction, with three sinks: `PlainTextSink` (TTY one-shot CLI), `JsonLogger` (non-TTY / piped / CI), `InkSink` (Ink-mounted, publishes to an event bus consumed by the dashboard). Business logic always goes through the injected logger, never `console.log`.
- SignalBusPort is the live observability stream: `ExecuteTasksUseCase` emits on every parsed signal, rate-limit pause/resume, and task lifecycle event. Dashboard subscribes to render live; filesystem signal handler subscribes to persist. Two sinks, one source. `InMemorySignalBus` micro-batches emissions at ~16ms to avoid render storms.
- Live config (no snapshot): `ExecuteTasksUseCase.getEvaluationConfig()` reads fresh per task settlement so mid-execution changes via the settings panel (REQ-12) take effect on the next task without restart.
- Repo onboarding: `ralphctl project onboard <project> [--repo] [--dry-run] [--auto]` writes the project context file the active `config.aiProvider` natively reads: `CLAUDE.md` at repo root for Claude, `.github/copilot-instructions.md` for Copilot. No symlinks, no pointer files. Three modes auto-detected in `repo-preflight`: `bootstrap` (no prior file), `adopt` (authored file, preserve prose, propose additions only), `update` (prior `onboardingVersion` marker, prune + augment with `<changes>` rationale). Content follows the empirical 7-section skeleton (Project Overview · Build & Run · Testing · Architecture · Implementation Style · Security & Safety · Performance Constraints); last two are forced to close the empirical gap and accept `LOW-CONFIDENCE:` prefixes when the repo gives no signal. Structural lint in `src/integration/external/agents-md-linter.ts` enforces 1 H1, ≤7 H2, no H4+, <300 lines, Flesch >40, required sections, plus command-drift warnings. `onboardingVersion` marker on `Repository` drives doctor status (skip / pass / warn). Authored project context files are never overwritten: changes are diff-only.
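To make the "pure builders" idea concrete, here is a minimal, self-contained sketch of a pipeline with `step()`, `pipeline()`, `insertAfter()`, and `executePipeline()`. It is a simplified re-implementation for illustration: the real helpers in `src/business/pipelines/framework/` are likely asynchronous and may have different signatures, and the step names used here are invented.

```typescript
// Illustrative stand-ins; not the framework's actual types or signatures.
interface Step<C> {
  name: string;
  run: (ctx: C) => C;
}

interface PipelineDefinition<C> {
  name: string;
  steps: readonly Step<C>[];
}

function step<C>(name: string, run: (ctx: C) => C): Step<C> {
  return { name, run };
}

function pipeline<C>(name: string, steps: readonly Step<C>[]): PipelineDefinition<C> {
  return { name, steps: [...steps] };
}

// Pure builder: returns a new definition and never mutates the input.
function insertAfter<C>(
  def: PipelineDefinition<C>,
  anchor: string,
  extra: Step<C>,
): PipelineDefinition<C> {
  const index = def.steps.findIndex((s) => s.name === anchor);
  if (index === -1) throw new Error(`Unknown step: ${anchor}`);
  const steps = [...def.steps.slice(0, index + 1), extra, ...def.steps.slice(index + 1)];
  return { ...def, steps };
}

function executePipeline<C>(
  def: PipelineDefinition<C>,
  initial: C,
): { ctx: C; stepResults: { stepName: string }[] } {
  let ctx = initial;
  const stepResults: { stepName: string }[] = [];
  for (const s of def.steps) {
    ctx = s.run(ctx);
    stepResults.push({ stepName: s.name });
  }
  return { ctx, stepResults };
}

type Ctx = { log: string[] };
const append = (label: string) => (ctx: Ctx): Ctx => ({ log: [...ctx.log, label] });

// Hypothetical step names, chosen only for this example.
const base = pipeline<Ctx>("refine", [
  step("load-sprint", append("load")),
  step("save", append("save")),
]);
const extended = insertAfter(base, "load-sprint", step("validate", append("validate")));

// The same assertion shape the integration tests use to lock step order.
const stepOrder = executePipeline(extended, { log: [] }).stepResults.map((r) => r.stepName);
```

Because `insertAfter` returns a new definition, `base` keeps its original two steps, so a step-order test on one pipeline cannot be invalidated by an extension built from it.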
- Don't reference or create a `sprint activate` command; use `sprint start`
- Don't confuse `currentSprint` (which sprint CLI targets) with `sprintStatus` (draft/active/closed)
- Don't store repository names in `affectedRepositories`; store absolute paths
- Don't explore repos during `sprint refine`; refinement is implementation-agnostic (WHAT, not HOW)
- Don't break task `blockedBy` dependencies during planning; preserve dependency chains
- Don't let prompt templates drift from command implementation; verify prompts describe actual workflow (e.g., repo selection timing)
- Don't hardcode provider-specific logic outside `src/integration/ai/providers/`; use the provider abstraction layer
- Don't assume both providers share the same permission model: Claude uses settings files, Copilot uses `--allow-all-tools` (see Provider Differences below)
- Don't add runtime auto-detection of check scripts; detection logic in `src/integration/external/detect-scripts.ts` is for suggestions during `project add` only
- Don't introduce symlinks or pointer files for provider-facing artefacts: `project onboard` writes the native file (`CLAUDE.md` or `.github/copilot-instructions.md`) based on `config.aiProvider`. No `AGENTS.md + symlink` scheme.
- Don't skip file locks for data mutations; use `withFileLock()` to prevent race conditions in concurrent access (30s timeout, configurable via `RALPHCTL_LOCK_TIMEOUT_MS`)
- Don't add `index.ts` barrel files; every import goes directly to its source module
- Don't import `@inquirer/prompts`; it's deleted. Use `getPrompt()` from `src/integration/bootstrap.ts`
- Don't call use cases from CLI commands or TUI views; the ESLint fence blocks it. Use `createXxxPipeline()` from `src/application/factories.ts` + `executePipeline(...)` instead.
- Don't invent new pipeline orchestration primitives; the framework has `step` / `pipeline` / `nested` / `forEachTask` / `insertBefore` / `insertAfter` / `replace` / `renameStep` in `src/business/pipelines/framework/`. Use them.
0. Check setup → ralphctl doctor (environment health check)
1. Add projects → ralphctl project add
2. Create sprint → ralphctl sprint create (draft, becomes current)
3. Add tickets → ralphctl ticket add --project <name>
4. Refine requirements → ralphctl sprint refine (WHAT: clarify requirements)
5. Export requirements → ralphctl sprint requirements (optional, markdown export)
6. Plan tasks → ralphctl sprint plan (HOW: explore repos, generate tasks)
7. Check health → ralphctl sprint health (diagnose blockers, stale tasks)
8. Start work → ralphctl sprint start (auto-activates draft sprints)
9. Close sprint → ralphctl sprint close
Optional: Enable shell tab-completion with ralphctl completion install (bash, zsh, fish).
Optional: Configure your preferred AI provider with ralphctl config set provider <claude|copilot> (prompted on
first use if not set).
ralphctl config set provider claude # Use Claude Code CLI
ralphctl config set provider copilot # Use GitHub Copilot CLI

Auto-prompts on first AI command if not set. Both CLIs must be in PATH and authenticated.
| Aspect | Claude Code | GitHub Copilot |
|---|---|---|
| CLI flags | `--permission-mode acceptEdits`, `--effort xhigh` | `--allow-all-tools` |
| Settings files | `.claude/settings.local.json`, `~/.claude/settings.json` | None |
| Allow/deny patterns | `Bash(git commit:*)`, `Bash(*)`, etc. | Not applicable |
--effort xhigh matches Claude Code's own default for plans (Opus 4.7 introduced the xhigh level between high and
max). Older Claude models accept --effort too; the CLI maps the level down to what the selected model supports.
Direct Tasks: sprint create → task add (repeat) → sprint start
AI-Assisted: sprint create → ticket add → sprint refine → sprint plan → sprint start
Quick Ideation: sprint create → sprint ideate → sprint start (combines refine + plan for quick ideas)
Re-Plan: (draft sprint) ticket add → sprint refine → sprint plan (replaces existing tasks)
Status: draft → active → closed
| Operation | Draft | Active | Closed |
|---|---|---|---|
| Add ticket | β | β | β |
| Edit/remove ticket | β | β | β |
| Refine requirements | β | β | β |
| Ideate (quick) | β | β | β |
| Plan tasks | β | β | β |
| Start (execute) | β* | β | β |
| Update task status | β | β | β |
| Close | β | β | β |
*sprint start auto-activates draft sprints.
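The lifecycle above (draft → active → closed, with `sprint start` auto-activating drafts) can be sketched as a small state machine. The function names and the error behavior for closed or non-active sprints are assumptions made for illustration; only the three statuses and the auto-activation rule come from the text.

```typescript
type SprintStatus = "draft" | "active" | "closed";

interface Sprint {
  id: string;
  status: SprintStatus;
}

// `sprint start`: a draft sprint is activated on the fly, so no separate
// activate command is needed.
function startSprint(sprint: Sprint): Sprint {
  switch (sprint.status) {
    case "draft":
      return { ...sprint, status: "active" };
    case "active":
      return sprint;
    case "closed":
      // Assumption: starting a closed sprint is rejected.
      throw new Error(`Sprint ${sprint.id} is closed`);
  }
}

// Assumption: only active sprints can be closed, following the linear
// draft → active → closed chain.
function closeSprint(sprint: Sprint): Sprint {
  if (sprint.status !== "active") {
    throw new Error(`Sprint ${sprint.id} is ${sprint.status}, not active`);
  }
  return { ...sprint, status: "closed" };
}
```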
Phase 1: Requirements Refinement (sprint refine) - WHAT needs doing
- Per-ticket HITL clarification: Claude asks questions, user approves requirements
- Implementation-agnostic: no code exploration, no repo selection
- Stores results as `requirementStatus: 'approved'` on each ticket

Phase 2: Task Generation (sprint plan) - HOW to implement
- Requires all tickets to have `requirementStatus: 'approved'`
- User selects repos via checkbox UI (before Claude starts); the selection is saved to `ticket.affectedRepositories`
- Claude explores confirmed repos only and generates tasks split by repo with dependencies
- Repo selection persists for resumability
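The Phase 2 precondition (every ticket approved before planning) amounts to a simple guard. The sketch below is hypothetical: `ticketsBlockingPlan` is an invented helper, and `'pending'` is an assumed placeholder for any non-approved status, since the text names only `'approved'`.

```typescript
// 'pending' is an assumed stand-in for "not yet approved".
type RequirementStatus = "pending" | "approved";

interface Ticket {
  id: string;
  requirementStatus: RequirementStatus;
  affectedRepositories?: string[]; // absolute paths, filled in during `sprint plan`
}

// Returns the ids of tickets that still block `sprint plan`.
function ticketsBlockingPlan(tickets: Ticket[]): string[] {
  return tickets.filter((t) => t.requirementStatus !== "approved").map((t) => t.id);
}

const blocking = ticketsBlockingPlan([
  { id: "T-1", requirementStatus: "approved" },
  { id: "T-2", requirementStatus: "pending" },
]);
```

An empty result means planning can proceed; otherwise the listed tickets need another `sprint refine` pass.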
Running sprint plan on a draft sprint that already has tasks triggers re-plan mode:
- Add new tickets to the draft sprint (`ticket add`)
- Refine their requirements (`sprint refine`)
- Run `sprint plan`, which auto-detects existing tasks

Behavior:
- Processes ALL tickets (not just unplanned ones)
- Existing tasks are included as AI context so Claude can reuse, modify, or drop them
- AI generates a complete replacement task set covering all tickets
- New tasks atomically replace all existing tasks via `saveTasks()` (interruption-safe)
- `reorderByDependencies` runs after every import
- Interactive mode shows confirmation prompt before replacing
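One plausible reading of `reorderByDependencies` is a topological sort over each task's `blockedBy` list. The sketch below shows that idea with a depth-first traversal; it is not the project's implementation, and the handling of cycles and unknown ids is an assumption.

```typescript
interface Task {
  id: string;
  blockedBy: string[];
}

// Depth-first topological sort: every task appears after the tasks it is blocked by.
function reorderByDependencies(tasks: Task[]): Task[] {
  const byId = new Map(tasks.map((t) => [t.id, t] as const));
  const state = new Map<string, "visiting" | "done">();
  const ordered: Task[] = [];

  const visit = (task: Task): void => {
    const current = state.get(task.id);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Dependency cycle at ${task.id}`);
    state.set(task.id, "visiting");
    for (const depId of task.blockedBy) {
      const dep = byId.get(depId);
      if (dep) visit(dep); // Assumption: unknown ids are ignored in this sketch.
    }
    state.set(task.id, "done");
    ordered.push(task);
  };

  for (const task of tasks) visit(task);
  return ordered;
}

const orderedIds = reorderByDependencies([
  { id: "c", blockedBy: ["b"] },
  { id: "a", blockedBy: [] },
  { id: "b", blockedBy: ["a"] },
]).map((t) => t.id);
```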
pnpm dev <command> # Run CLI (tsx, no build needed)
pnpm build # Compile for npm distribution (tsup)
pnpm typecheck # Type check
pnpm lint # Lint
pnpm test # Run tests

Pre-commit hook runs lint-staged (ESLint + Prettier on staged files). If commits are rejected, run:
pnpm lint:fix # Auto-fix linting issues
pnpm format # Format all files

- Conditional sections - `{{VARIABLE}}` placeholders in prompts can be empty strings; avoid numbered lists that create gaps (use blockquotes or bullets)
- Em-dash usage - Use — (em-dash) not - (hyphen) for explanatory clauses in .md prompts (consistency across all prompt files)
- Workflow sync - Prompt templates must match actual command flow (e.g., repo selection happens in command before Claude session starts)
- Template builders - `src/integration/ai/prompts/loader.ts` compiles .md templates with placeholder replacement
- Canonical XML vocabulary: structural inputs sit inside known tags (`<harness-context>`, `<task-specification>`, `<context>`, `<requirements>`, `<constraints>`, `<examples>`, `<dimension>`, `<signals>`). The allowlist is enforced by `src/integration/ai/prompts/loader.test.ts` (planner-role rendered prompts wrap top-level inputs inside a known XML tag); extend both the allowlist and this list when adding a new tag.
- No hardcoded package-manager commands: prompts must not embed `pnpm`/`npm`/`pip`/`cargo`/`go test` outside the `{{PROJECT_TOOLING}}` or `{{CHECK_GATE_EXAMPLE}}` placeholders. Downstream ecosystems differ; the placeholders are the seam.
- Conditional placeholders must not sit inside numbered lists: when the substitution is empty the list must still read cleanly. Emit conditional content as a standalone bullet or paragraph, not as trailing prose in a numbered step.
- Downstream `.claude/` is optional context: many downstream repos have no `.claude/` directory. Reference it as "when present" rather than prescriptively; skip silently when absent.
- Absolute rules name their exception: never/always phrasing is fragile when legitimate exceptions exist. Name the exception inline (e.g. "Merge create+use, except when a stable contract makes them independently testable").
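The placeholder replacement mentioned above can be illustrated with a small renderer. This is a sketch, not the code in `src/integration/ai/prompts/loader.ts`: `renderTemplate` is a hypothetical name, and leaving unknown placeholders untouched is an assumption.

```typescript
// Replaces {{NAME}} placeholders with values; an empty string is a valid value.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match: string, name: string) => {
    const known = Object.prototype.hasOwnProperty.call(values, name);
    // Assumption: placeholders without a supplied value are left as-is.
    return known ? values[name] : match;
  });
}

const rendered = renderTemplate("Refine {{TICKET}} using {{PROJECT_TOOLING}}", {
  TICKET: "T-1",
  PROJECT_TOOLING: "",
});
```

The empty `PROJECT_TOOLING` value shows why conditional content is safer as a standalone bullet or paragraph: the surrounding text still has to read cleanly once the placeholder vanishes.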
Six specialized agents in `.claude/agents/` (auditor, designer, implementer, planner, reviewer, tester): invoke via the Task tool with the matching `subagent_type`. These are contributor-side tooling for working on ralphctl's own source; they are not shipped to npm and do not affect ralphctl's runtime behavior.
Two UI surfaces; pick the right one for the command:
- Ink TUI (`src/integration/ui/tui/`): live dashboard, REPL, settings panel, inline editor. Mounted by bare `ralphctl`, `ralphctl interactive`, and `ralphctl sprint start`. Takes over the terminal via the alt-screen buffer (like vim/htop) and restores on exit. Uses `@inkjs/ui` components + the `LoggerPort` event bus for live-updating output.
- Plain-text CLI: one-shot commands (`sprint show`, `config set`, `project add`, etc.) use `PlainTextSink` for structured logging plus the pure formatters in `@src/integration/ui/theme/ui.ts` (grep `^export` there for the full roster: card / table / status / success / warning / info / field / progress families). When a prompt fires, the `InkPromptAdapter` auto-mounts a minimal `<PromptHost />` inline; no Inquirer.
Never add raw emoji or inconsistent formatting; use `emoji`/`colors`/`statusEmoji` from `@src/integration/ui/theme/theme.ts` and the formatters from `@src/integration/ui/theme/ui.ts`. Ink components pull theme tokens via `@src/integration/ui/theme/tokens.ts`.
The Ink TUI has a design system; see `.claude/docs/DESIGN-SYSTEM.md` before adding a view, component, or glyph. It covers the token set (`inkColors` / `glyphs` / `spacing`), component inventory, state surfaces (loading / empty / error / success), navigation contract, copy rules, and anti-patterns. Most needs are already solved: reuse `ViewShell` + `ResultCard` + `FieldList` + `Spinner` before inventing.
See .claude/agents/designer.md for the designer agent's role.
See ARCHITECTURE.md § Clean Architecture Layers for the annotated `src/` tree and the per-port adapter map. Top-level: `domain/` (pure) → `business/` (ports, usecases, pipelines) → `integration/` (adapters, UI, CLI) → `application/` (composition root).
See ARCHITECTURE.md § Harness Signals and `src/domain/signals.ts`. Adding a variant to the `HarnessSignal` union triggers compiler exhaustiveness errors everywhere; let the type system guide the edit. Signals flow to two subscribers in parallel: `FileSystemSignalHandler` (durable) + `SignalBusPort` (live dashboard).
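The exhaustiveness behavior described above relies on a standard TypeScript pattern: a discriminated union handled by a `switch` whose `default` branch accepts only `never`. The variants below are invented for illustration; the real union is defined in `src/domain/signals.ts`.

```typescript
// Hypothetical variants; only the name `HarnessSignal` comes from the text.
type HarnessSignal =
  | { type: "progress"; message: string }
  | { type: "task-complete"; taskId: string }
  | { type: "task-blocked"; taskId: string; reason: string };

function assertNever(value: never): never {
  throw new Error(`Unhandled signal: ${JSON.stringify(value)}`);
}

function describeSignal(signal: HarnessSignal): string {
  switch (signal.type) {
    case "progress":
      return `progress: ${signal.message}`;
    case "task-complete":
      return `done: ${signal.taskId}`;
    case "task-blocked":
      return `blocked: ${signal.taskId} (${signal.reason})`;
    default:
      // Adding a variant to the union without a matching case makes this
      // line a compile error, because `signal` is no longer `never` here.
      return assertNever(signal);
  }
}
```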
Optional, opt-out, runs only after all tasks complete successfully (src/business/usecases/execute.ts):
- Fires when `summary.stopReason === 'all_completed'` AND `!options.session` AND `!options.noFeedback`
- User types free-form feedback; empty input exits the loop immediately
- AI implements the feedback, check scripts re-run, evaluator re-runs
- Hard cap: `MAX_FEEDBACK_ITERATIONS` (safety net against infinite loops)
- Disable per-run with `--no-feedback`; disabled implicitly in `--session` mode
- Dirty working tree after a feedback iteration triggers the same harness auto-commit as post-task settlement (see recover-dirty-tree).
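The gate and loop rules above can be sketched as two small functions. This is an illustration under stated assumptions: the value `5` for `MAX_FEEDBACK_ITERATIONS` is invented (the text names the constant but not its value), and `shouldOfferFeedback` and `runFeedbackLoop` are hypothetical helpers that take pre-collected input instead of prompting.

```typescript
const MAX_FEEDBACK_ITERATIONS = 5; // assumed value, for illustration only

interface ExecutionSummary {
  stopReason: string;
}

interface ExecuteOptions {
  session?: boolean;
  noFeedback?: boolean;
}

// Mirrors the stated condition: all tasks completed, not in session mode,
// and feedback not disabled with --no-feedback.
function shouldOfferFeedback(summary: ExecutionSummary, options: ExecuteOptions): boolean {
  return summary.stopReason === "all_completed" && !options.session && !options.noFeedback;
}

// Applies feedback entries until an empty one arrives or the hard cap is hit.
// Returns the number of iterations that actually ran.
function runFeedbackLoop(inputs: string[], apply: (feedback: string) => void): number {
  let iterations = 0;
  for (const raw of inputs) {
    if (iterations >= MAX_FEEDBACK_ITERATIONS) break;
    const feedback = raw.trim();
    if (feedback === "") break; // empty input exits the loop immediately
    apply(feedback);
    iterations += 1;
  }
  return iterations;
}
```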
sprint start runs tasks in parallel by default (one per unique projectPath):
- Session/step mode forces sequential (`--concurrency 1` equivalent)
- Rate limiting: `RateLimitCoordinator` pauses new task launches globally when any task hits a rate limit; running tasks continue uninterrupted
- Rate-limited tasks auto-resume via `--resume <session_id>` (full session context preserved)
- Errors with rate-limit headers (429, 429-style responses) trigger coordinator pause automatically
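A rough sketch of the two scheduling ideas above: one lane of tasks per unique `projectPath`, and a global pause flag that gates new launches only. `groupByProjectPath` and `SketchRateLimitCoordinator` are hypothetical stand-ins, not the project's `RateLimitCoordinator`.

```typescript
interface Task {
  id: string;
  projectPath: string;
}

// One lane per unique projectPath: tasks in the same lane run one at a time,
// while different lanes can run in parallel.
function groupByProjectPath(tasks: Task[]): Map<string, Task[]> {
  const lanes = new Map<string, Task[]>();
  for (const task of tasks) {
    const lane = lanes.get(task.projectPath) ?? [];
    lane.push(task);
    lanes.set(task.projectPath, lane);
  }
  return lanes;
}

// Minimal stand-in: pausing blocks new launches only; tasks that are
// already running are not touched.
class SketchRateLimitCoordinator {
  private paused = false;

  pause(): void {
    this.paused = true;
  }

  resume(): void {
    this.paused = false;
  }

  canLaunch(): boolean {
    return !this.paused;
  }
}

const lanes = groupByProjectPath([
  { id: "t1", projectPath: "/repos/api" },
  { id: "t2", projectPath: "/repos/web" },
  { id: "t3", projectPath: "/repos/api" },
]);
```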
Customize ralphctl behavior with these environment variables:
| Variable | Default | Range | Purpose |
|---|---|---|---|
| `RALPHCTL_ROOT` | `~/.ralphctl/` | Any valid path | Override data directory (e.g., for testing or multi-workspace setup) |
| `RALPHCTL_SETUP_TIMEOUT_MS` | `300000` (5 min) | > 0 | Timeout for check scripts; overridable per-repo via `Repository.checkTimeout` |
| `RALPHCTL_LOCK_TIMEOUT_MS` | `30000` | 1–3600000 | Stale lock file threshold for concurrent access detection |
| `RALPHCTL_LOG_LEVEL` | `info` | `debug`/`info`/`warn`/`error` | Filter structured-log output (PlainTextSink and JsonLogger) |
| `RALPHCTL_NO_TUI` | unset | any truthy value | Force the plain-text CLI fallback even on a TTY (skip Ink mount) |
| `RALPHCTL_JSON` | unset | any truthy value | Force the JsonLogger sink (one JSON object per line) regardless of TTY |
| `NO_COLOR` | unset | any truthy value | Suppress ANSI colors (honored by `isTTY()` and by `colorette`) |
| `CI` | unset | any truthy value | Auto-detected; disables Ink mount and implicit interactive prompts |
| `VISUAL` / `EDITOR` | unset | editor command | Read by the editor resolver; the Ink inline editor is preferred on TTY. |
Note: In tests, set RALPHCTL_ROOT BEFORE importing persistence modules (e.g., in setup file before describe
blocks).
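As an illustration of how a bounded variable from the table might be read, the sketch below resolves `RALPHCTL_LOCK_TIMEOUT_MS` against its documented default and range. `resolveLockTimeout` is a hypothetical helper, and falling back to the default on invalid or out-of-range input is an assumption; the actual behavior may differ (for example, it could reject the value).

```typescript
const DEFAULT_LOCK_TIMEOUT_MS = 30_000;
const MAX_LOCK_TIMEOUT_MS = 3_600_000;

// Takes the environment as a parameter so it can be exercised without
// touching process.env.
function resolveLockTimeout(env: Record<string, string | undefined>): number {
  const raw = env["RALPHCTL_LOCK_TIMEOUT_MS"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_LOCK_TIMEOUT_MS;
  const parsed = Number(raw);
  const valid = Number.isInteger(parsed) && parsed >= 1 && parsed <= MAX_LOCK_TIMEOUT_MS;
  // Assumption: invalid values silently fall back to the default.
  return valid ? parsed : DEFAULT_LOCK_TIMEOUT_MS;
}
```

In a real call site the argument would typically be `process.env`.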
Prompt templates are distributed with the CLI. The build script copies .md files from
src/integration/ai/prompts/ to dist/prompts/:
pnpm build # Runs: tsup && mkdir -p dist/prompts && cp src/integration/ai/prompts/*.md dist/prompts/

Template loading is dual-mode:
- Dev: Reads from `src/integration/ai/prompts/*.md`
- Bundled (npm): Reads from `dist/prompts/*.md`
Gotcha: If .md files are missing in dist, templates silently fail with empty placeholder values (no file-not-found error). CI verifies dist works by testing node dist/cli.mjs --version from arbitrary cwd.
Releases are automated via GitHub Actions on git tags matching `v[0-9]+.[0-9]+.[0-9]+`:
- Tag must match `package.json` version (e.g., tag `v0.2.2` requires `"version": "0.2.2"` in package.json)
- Changelog: Add a `## [X.Y.Z]` section to `CHANGELOG.md` or release notes will fall back to git log
- NPM publish: Uses provenance attestation (`--provenance`)
- GitHub release: Auto-generated with changelog section + comparison link to previous tag
- Pre-release detection: Tags containing `-` (e.g., `v1.0.0-beta`) are marked as prerelease
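The tag rules above reduce to two checks, sketched here with a hypothetical `checkReleaseTag` helper. The actual workflow performs these checks in GitHub Actions; this function only illustrates the logic.

```typescript
interface ReleaseCheck {
  versionMatches: boolean;
  prerelease: boolean;
}

// Compares a git tag such as "v0.2.2" with the package.json version and
// flags tags containing "-" as prereleases.
function checkReleaseTag(tag: string, packageVersion: string): ReleaseCheck {
  const version = tag.startsWith("v") ? tag.slice(1) : tag;
  return {
    versionMatches: version === packageVersion,
    prerelease: tag.includes("-"),
  };
}

const stable = checkReleaseTag("v0.2.2", "0.2.2");
const beta = checkReleaseTag("v1.0.0-beta", "1.0.0-beta");
```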
When compacting, always preserve: sprint state machine, two-phase planning constraints, architecture constraints, list of modified files, verification commands, and current task context.
- Anthropic: Effective Harnesses for Long-Running Agents. Consult when extending the runner/executor layer.
- Anthropic: Harness Design for Long-Running Apps. Covers the generator-evaluator pattern, context management, iterative refinement, and model-specific tuning strategies.