Agent-agnostic automated development loop. Fetches issues from your GitHub project board (optionally filtered by milestone), implements them with an AI coding agent, runs tests, reviews the code, and creates PRs — then moves to the next issue until all matching issues are done.
The Loop: Plan (GitHub Issues) -> Build (AI Agent) -> Test -> Review -> Verify -> Learn -> Ship (PR)
# Install globally
npm install -g @bradtaylorsf/alpha-loop
# Or run directly
npx @bradtaylorsf/alpha-loop

- Node.js 20+
- git, for worktree isolation
- GitHub CLI (`gh`), authenticated with `gh auth login`
- AI agent CLI: set `agent` in your config to one of:
  - Claude Code (`claude`), the default
  - Codex (`codex`)
  - OpenCode (`opencode`)
- Playwright CLI (optional), for live verification with screenshots
# 1. Initialize — runs full onboarding (config, vision, scan, sync)
cd your-project
alpha-loop init
# 2. Edit .alpha-loop.yaml if needed (agent, model, test_command, etc.)
# 3. Run the loop — you'll be prompted to pick an epic or milestone
alpha-loop run
# Or target a specific milestone directly
alpha-loop run --milestone "v1.0"

For planned feature work, use epics as the unit you schedule and ship:

- `alpha-loop triage` reviews open issues, proposes cleanup, and groups related ready issues into parent epics with ordered child checklists.
- `alpha-loop roadmap` schedules parent epic issues into milestones, while still scheduling standalone issues that are not part of an epic.
- `alpha-loop run --epic <N>` ships the epic's child issues in checklist order. Agents working on each child issue receive the parent epic goal, acceptance criteria, and sibling checklist as context.
- `alpha-loop roadmap --queue` recommends the next ordered epic queue, explains blockers and risks, and prints the exact `alpha-loop run --epics ...` command.
- `alpha-loop run --epics <A,B,C>` runs several parent epics back-to-back in that exact order, with a separate session branch and PR for each epic.
- `alpha-loop run --verify-only <N>` re-runs the epic verification pass when you need to re-check shipped child issues against the parent acceptance criteria.

Milestones answer "when should this epic ship?" The epic checklist answers "what child issues ship, and in what order?"
Alpha Loop implements a 12-step pipeline for each issue:
1. Status Update: Labels issue `in-progress`, assigns to you, updates project board
2. Worktree: Creates an isolated git worktree so work doesn't conflict with other issues
3. Plan: Agent analyzes the issue and enriches it with implementation details
   - 3b. Fetch Comments: Loads issue comments so the agent has the full conversation context
4. Implement: Agent writes the code, guided by project vision, context, comments, and learnings from previous issues
5. Test + Retry: Runs your test command; if tests fail, agent fixes and retries (up to `max_test_retries`)
6. Verify + Retry: Starts your dev server, uses playwright-cli to test the feature like a real user, takes screenshots
7. Review: A review agent reads the diff, checks for gaps, security issues, and missing wiring, and fixes what it can
8. Learn: Extracts learnings (patterns, anti-patterns, what worked/failed) and commits them in the issue worktree
9. Create PR: Opens a PR with the implementation, learning artifact, test results, review summary, and verification status
   - 9b. Assumptions: Agent summarizes assumptions and decisions made, posts as a comment on the issue for user validation
10. Update Issue: Posts results as a comment, updates labels
11. Auto-Merge: Merges the PR to the session branch (if enabled)
12. Cleanup: Removes the worktree

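The Test + Retry step can be pictured as a bounded loop. The sketch below is illustrative only: `runTests` and `fixFailures` are hypothetical callbacks standing in for your test command and the agent's fix pass, and the function is not Alpha Loop's internal API. It assumes `max_test_retries` counts retries after the first run.

```typescript
interface TestOutcome {
  passed: boolean;
  attempts: number; // total test runs, including the first one
}

// runTests and fixFailures are hypothetical stand-ins for illustration.
function testWithRetry(
  runTests: () => boolean,
  fixFailures: () => void,
  maxTestRetries = 3,
): TestOutcome {
  let attempts = 1;
  let passed = runTests();

  // Retry only while tests fail and the retry budget is not exhausted.
  while (!passed && attempts <= maxTestRetries) {
    fixFailures();
    passed = runTests();
    attempts += 1;
  }
  return { passed, attempts };
}

// Example: the suite fails twice, then passes on the third run.
const scriptedRuns = [false, false, true];
let fixCount = 0;
const retryOutcome = testWithRetry(
  () => scriptedRuns.shift() ?? false,
  () => {
    fixCount += 1;
  },
);
console.log(retryOutcome, fixCount);
```

With the default of 3 retries, a suite that never passes is run four times in total before the issue is treated as failed.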
After all issues are processed, Alpha Loop:
- Auto-captures failures as eval cases for regression testing
- Generates a session summary aggregating learnings across issues when a session branch is being finalized
- Runs a post-session code review on the full session diff to catch cross-issue integration problems
- Creates the session PR with all findings included
When you start the loop interactively, Alpha Loop shows open epics above milestones and lets you pick which target to work on:
Open Epics
1 Multi-tenant support #165 (0/7 done · milestone v1.0)
Open Milestones
2 v1.0 — MVP (5 open, 3/8 done · due 2026-04-15 · 1 scheduled epic)
3 v1.1 — Polish (10 open, 0/10 done)
0 All ready issues (no filter)
Select [0-3]: 1
This lets you plan work in GitHub milestones and control exactly how much the loop processes per session. You can also pass --milestone "v1.0" to skip the prompt, or set milestone: v1.0 in your config file. If that milestone has exactly one open parent issue labeled epic, Alpha Loop processes that epic's checklist; if it has multiple scheduled epics, it exits with their numbers and asks you to choose with --epic <N>. Use --skip-epic --milestone "v1.0" to force the flat milestone issue flow.
When auto_merge is enabled (default), Alpha Loop creates a session branch (e.g., session/20260331-002240) and merges each issue's PR into it. This keeps your main branch clean until you're ready to merge the whole session.
Each completed issue produces a learning file in .alpha-loop/learnings/ that is committed with that issue's implementation PR. It includes:
- What worked and what failed
- Reusable patterns discovered
- Anti-patterns to avoid
- Suggested skill/prompt updates
These learnings are automatically fed into future implementation prompts, so the agent gets smarter over time.
Run alpha-loop review to trigger the self-improvement loop. It reads all accumulated learnings, computes metrics (success rate, avg retries, common failures), gathers current agent/skill definitions, and asks the configured agent to propose targeted improvements:
- Agent prompts — bake in recurring patterns, eliminate anti-patterns
- Skill definitions — add/update skills based on what consistently works or fails
- Testing environment — fix Playwright config, port conflicts, auth state, data seeding issues
- Harness configuration — tune timeouts, retries, and defaults
Without --apply, proposals are saved to learnings/proposed-updates/ for review. With --apply, changes are written and a draft PR is created.
Alpha Loop includes a self-improving eval system inspired by Meta-Harness (Lee et al., 2026). It captures real failures as eval cases and tracks composite scores over time to measure whether prompt/skill changes actually help. See the Comprehensive Eval Guide for tutorials, use cases, and how-tos.
# Capture failures from recent sessions as eval cases
alpha-loop eval capture
# Capture quality failures from successful sessions (false positives)
alpha-loop eval capture --quality
alpha-loop eval capture --quality 190 --session my-session
# Run the eval suite and compute composite score
alpha-loop eval run
# View score history, Pareto frontier, or compare runs
alpha-loop eval scores
alpha-loop eval pareto
alpha-loop eval compare 1 2
# Greedy search over model configurations per pipeline step
alpha-loop eval search --models "haiku,sonnet,opus"
# Estimate cost before running
alpha-loop eval estimate
# Compare two config files side-by-side
alpha-loop eval compare-configs config-a.yaml config-b.yaml
# Apply a single routing profile before running
alpha-loop eval run --profile hybrid-v1
# Matrix run: replay the routing-regression set under every profile and emit a side-by-side report
# Defaults to a dry-run (validates profiles and case structure). Pass --execute to run pipelines for real.
alpha-loop eval run --matrix --tags routing-regression
alpha-loop eval run --matrix --profiles "all-frontier,hybrid-v1" --out eval/reports
alpha-loop eval run --matrix --tags routing-regression --execute # real runs (see CASE_FORMAT.md)

Eval cases live in .alpha-loop/evals/ and scores are appended to scores.jsonl (Git-friendly, append-only). The composite score formula is pass-rate primary with lightweight penalties for retries and duration. Recovered session results are flagged and excluded from aggregate scoring. Real API costs (tokens, USD) are tracked per case from agent output and used for the Pareto frontier.
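To make "pass-rate primary with lightweight penalties" concrete, here is a minimal sketch of one way such a score could be computed. The field names, the two penalty weights, and the capping rules are all assumptions for illustration; the README does not state the real formula, so treat this as a shape, not a specification.

```typescript
// Hypothetical shape of one eval case result; field names are assumptions.
interface CaseResult {
  passed: boolean;
  retries: number;
  durationSeconds: number;
  recovered?: boolean; // recovered session results are excluded from scoring
}

// Assumed weights: the README only says the penalties are "lightweight".
const RETRY_PENALTY = 0.05;
const DURATION_PENALTY = 0.05;

function compositeScore(results: CaseResult[], timeoutSeconds = 300): number {
  const scored = results.filter((r) => !r.recovered);
  if (scored.length === 0) return 0;

  const passRate = scored.filter((r) => r.passed).length / scored.length;
  const avgRetries = scored.reduce((sum, r) => sum + r.retries, 0) / scored.length;
  const avgDuration =
    scored.reduce((sum, r) => sum + r.durationSeconds, 0) / scored.length;

  // Each penalty is capped so pass rate always dominates the result.
  const penalty =
    RETRY_PENALTY * Math.min(avgRetries, 1) +
    DURATION_PENALTY * Math.min(avgDuration / timeoutSeconds, 1);

  return Math.max(0, passRate - penalty);
}

const sampleScore = compositeScore([
  { passed: true, retries: 0, durationSeconds: 0 },
  { passed: false, retries: 0, durationSeconds: 0 },
  { passed: false, retries: 2, durationSeconds: 900, recovered: true },
]);
console.log(sampleScore);
```

Because both penalties are capped at 0.05 in this sketch, two configurations with different pass rates can never be reordered by retries or duration unless their pass rates are within 0.1 of each other.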
Step-level evals test individual pipeline stages (plan, implement, test, test-fix, review, learn, skill) and run in seconds using LLM-judge and keyword checks:
# Run only step-level evals (fast, cheap)
alpha-loop eval --suite step
# Run evals for a specific step
alpha-loop eval --suite step --step review
# Convert between AlphaLoop and skill-creator eval formats
alpha-loop eval convert --direction to-skill
alpha-loop eval convert --direction from-skill --input path/to/evals.json
# Import SWE-bench cases from HuggingFace (requires Python + datasets)
alpha-loop eval import-swebench --count 10 --repo "django/django"

The evolve command runs a Meta-Harness-style optimization loop: a proposer agent reads full execution traces, scores, and source code, then proposes targeted changes to prompts, skills, or config. Changes are evaluated against the eval suite: improvements are kept, regressions are reverted (autoresearch keep/discard pattern).
alpha-loop evolve # Run up to 5 iterations
alpha-loop evolve --max-iterations 10 # Run 10 iterations
alpha-loop evolve --continuous # Run until manually stopped (Ctrl-C)
alpha-loop evolve --surface prompts # Only modify agent prompts (safest)
alpha-loop evolve --surface all # Modify prompts + pipeline code (riskier)
alpha-loop evolve --resume # Resume from a previous evolve session
alpha-loop evolve --dry-run # Preview without changes

Propose per-stage routing changes (frontier → local, or revert to fallback) as draft PRs, based on the metrics aggregated by alpha-loop report routing plus the matrix eval.
A stage is promoted to its local candidate when (over ≥30 runs): cost-per-issue savings ≥ 40%, pipeline success delta ≥ −3%, and tool-error rate < 2%. Promotions require a matrix eval run within the last 7 days (alpha-loop eval --matrix --execute).
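The promotion rule above reads naturally as a single predicate. The sketch below encodes the stated thresholds; the `StageMetrics` shape and its field names are hypothetical, and the real command may compute or round these metrics differently.

```typescript
// Hypothetical per-stage metrics; field names are assumptions for illustration.
interface StageMetrics {
  runs: number;
  costSavings: number; // fraction of cost-per-issue saved, e.g. 0.45 = 45%
  successDelta: number; // change in pipeline success rate, e.g. -0.02 = -2%
  toolErrorRate: number; // fraction of runs with tool errors
  daysSinceMatrixEval: number;
}

function shouldPromote(m: StageMetrics): boolean {
  return (
    m.runs >= 30 &&
    m.costSavings >= 0.4 &&
    m.successDelta >= -0.03 &&
    m.toolErrorRate < 0.02 &&
    m.daysSinceMatrixEval <= 7
  );
}

const buildStage: StageMetrics = {
  runs: 42,
  costSavings: 0.55,
  successDelta: -0.01,
  toolErrorRate: 0.01,
  daysSinceMatrixEval: 3,
};
console.log(shouldPromote(buildStage));
```

Every condition must hold at once, so a stage with large savings but a stale matrix eval is not promoted.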
alpha-loop evolve routing # Propose promotions as a draft PR
alpha-loop evolve routing --dry-run # Preview without writing config
alpha-loop evolve routing --demote build # Manually revert a stage to fallback

Every promotion/demotion is appended to .alpha-loop/learnings/routing-history.md. PR bodies include a git revert rollback snippet plus the previous YAML fragment.
By default, Alpha Loop processes issues one at a time — each issue gets its own plan, implement, test, and review agent calls. Batch mode combines multiple issues into single agent calls, dramatically reducing overhead:
# Process issues in batches of 5 (default)
alpha-loop run --batch
# Custom batch size
alpha-loop run --batch --batch-size 3

How it works: If a milestone has 13 issues, batch mode processes them in 3 rounds:
| Batch | Issues | Agent Calls |
|---|---|---|
| Batch 1 | #1-#5 | 1 plan + 1 implement + 1 review |
| Batch 2 | #6-#10 | 1 plan + 1 implement + 1 review |
| Batch 3 | #11-#13 | 1 plan + 1 implement + 1 review |
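The grouping in the table is plain fixed-size chunking. A minimal sketch, assuming issues are already in processing order; `toBatches` is an illustrative helper name, not part of Alpha Loop.

```typescript
// Split an ordered list into consecutive batches of at most batchSize items.
function toBatches<T>(items: T[], batchSize = 5): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 13 issues with the default batch size of 5.
const issueNumbers = Array.from({ length: 13 }, (_, i) => i + 1);
const issueBatches = toBatches(issueNumbers);
console.log(issueBatches.map((b) => b.length)); // [ 5, 5, 3 ]
```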
Each batch goes through the full pipeline:
- Batch plan — One agent call plans all issues, writing per-issue plan JSONs
- Batch implement — One agent call implements all issues, committing per-issue
- Test — Runs the test suite once (with retry loop if needed)
- Batch review — One agent call reviews the entire diff for all issues
- PR — Creates one PR that closes all issues in the batch
- Per-issue updates — Each issue gets individually updated with labels, comments, and PR link
This reduces agent calls from ~3-4 per issue to ~3 per batch. For 5 issues, that's 3 agent calls instead of 15-20.
Or set it permanently in .alpha-loop.yaml:
batch: true
batch_size: 5

If the loop hangs or crashes mid-session, work can be stranded on local branches with no PR. Run alpha-loop resume to recover:
- Reads session crash markers first, then falls back to scanning local `agent/issue-*` branches with commits but no open PR
- Pushes each branch to origin
- Runs code review
- Creates WIP PRs, marks issues `In Review`, and updates the session PR with a verification caveat
- Regenerates missing learning artifacts and the aggregate session summary from recovered session results

Recovered PRs are written with recoveryMode: "resume" and are not marked complete. resume does not rerun the project test suite or final smoke tests, so verify recovered work before merging.
Use --issue <N> to resume a specific issue.
alpha-loop history <session> shows both unrecovered crash-<N>.json markers and recovered result files separately from normal successes and failures.
During live verification, the agent takes screenshots at key states and saves them to .alpha-loop/sessions/<name>/screenshots/issue-<N>/. These are kept locally (not committed to git) for debugging.
| Command | Description |
|---|---|
| `alpha-loop init` | Full onboarding: config, templates, vision, scan, sync, commit |
| `alpha-loop run` | Fetch matching issues, process them all, then exit |
| `alpha-loop run --dry-run` | Preview without making changes |
| `alpha-loop run --epic <N>` | Process an epic: its sub-issues in checklist order, auto-verify on completion (see docs/epics.md) |
| `alpha-loop run --epics <ids>` | Process an ordered comma-separated queue of epics, one session branch and PR per epic |
| `alpha-loop run --epics <ids> --queue-branch-mode independent` | Run queued epics without stacking later session branches on earlier ones |
| `alpha-loop run --verify-only <N>` | Run just the epic verification pass, which evaluates merged PRs against acceptance criteria |
| `alpha-loop scan` | Generate/refresh project context and instructions file |
| `alpha-loop vision` | (deprecated) Use `alpha-loop plan` instead |
| `alpha-loop auth` | Save authenticated browser state for verification |
| `alpha-loop history` | View session and queue history |
| `alpha-loop history <name>` | View a specific session |
| `alpha-loop history queue-<timestamp>` | Inspect a multi-epic queue manifest, including stopped/pending epics |
| `alpha-loop history <name> --qa` | Show QA checklist for session |
| `alpha-loop history <name> --telemetry` | Show per-stage telemetry table (see docs/telemetry.md) |
| `alpha-loop history --clean` | Remove old session data |
| `alpha-loop report routing` | Aggregate per-stage telemetry + cost-per-issue across sessions |
| `alpha-loop sync` | Add/update templated assets in configured harnesses without deleting harness-only files |
| `alpha-loop sync --check` | Check for drift, including target-only harness files, without writing changes |
| `alpha-loop sync --prune` | Sync templates and remove target-only harness files after logging each pruned path |
| `alpha-loop resume` | Resume stranded work: push branches, review, open WIP PRs |
| `alpha-loop resume --issue <N>` | Resume a specific issue |
| `alpha-loop review` | Analyze learnings and propose self-improvements |
| `alpha-loop review --apply` | Apply proposed improvements and create a draft PR |
| `alpha-loop eval` | Run the eval suite and compute composite score |
| `alpha-loop eval capture` | Capture failures as eval cases (interactive) |
| `alpha-loop eval capture --quality` | Capture quality failures from successful sessions (false positives) |
| `alpha-loop eval list` | Show eval cases and recent scores |
| `alpha-loop eval scores` | Show score history over time |
| `alpha-loop eval pareto` | Show score/cost Pareto frontier |
| `alpha-loop eval compare <r1> <r2>` | Compare two eval runs |
| `alpha-loop eval search` | Greedy search over model configurations per pipeline step |
| `alpha-loop eval estimate` | Estimate cost of running the eval suite |
| `alpha-loop eval compare-configs <a> <b>` | Compare two YAML config files side-by-side |
| `alpha-loop eval convert` | Convert between AlphaLoop and skill-creator eval formats |
| `alpha-loop eval import-swebench` | Import eval cases from SWE-bench dataset |
| `alpha-loop eval export <case>` | Export an eval case for contributing back (anonymized by default) |
| `alpha-loop evolve` | Meta-Harness-style automated optimization loop |
| `alpha-loop evolve routing` | Propose routing promotions/demotions as draft PRs based on eval metrics |
| `alpha-loop evolve routing --demote <stage>` | Manually demote a stage to `routing.fallback.escalate_to` |
| `alpha-loop plan` | Generate a full project scope (milestones + issues) from seed inputs using AI |
| `alpha-loop plan --seed <file>` | Read seed description from a file instead of prompting |
| `alpha-loop plan --dry-run` | Display the plan and save .alpha-loop/plan.json without creating GitHub resources |
| `alpha-loop plan --resume` | Create GitHub resources from the saved .alpha-loop/plan.json draft |
| `alpha-loop plan --yes --seed <file>` | Non-interactive mode: accept all AI recommendations |
| `alpha-loop triage` | Analyze open issues, clean up backlog noise, and propose/apply epic groups |
| `alpha-loop triage --dry-run` | Display cleanup findings and epic proposals without making changes |
| `alpha-loop triage --yes` | Non-interactive mode: apply AI-selected cleanup actions and epic proposals |
| `alpha-loop roadmap` | Schedule parent epics and standalone issues into milestones using AI analysis |
| `alpha-loop roadmap --queue` | Recommend the next ordered epic run queue without making changes |
| `alpha-loop roadmap --queue --milestone <name>` | Recommend an epic run queue within a release or sprint milestone |
| `alpha-loop roadmap --dry-run` | Display proposed epic/standalone milestone assignments without making changes |
| `alpha-loop roadmap --yes` | Non-interactive mode: apply all AI-recommended epic and standalone assignments |

alpha-loop run [options]
Options:
--once Process one issue and exit
--dry-run Preview without making changes
--model <model> AI model override (e.g., opus, sonnet, gpt-5.4, gpt-5.3-codex)
--milestone <name> Process this milestone's scheduled epic, or flat issues if none
--skip-tests Skip test execution
--skip-review Skip code review step
--skip-learn Skip learning extraction
--auto-merge Auto-merge PRs to session branch
--merge-to <branch> Use an existing branch instead of creating a new session branch
--batch Batch mode: process multiple issues per agent call (faster, fewer tokens)
--batch-size <n> Issues per batch (default: 5)
--epic <n> Process a specific epic by issue number (skips the picker)
--epics <ids> Process multiple epics in order (comma-separated)
--queue-branch-mode <mode> Branch mode for --epics: stacked or independent
--skip-epic Skip epic discovery, use flat/milestone flow
--verify-only <n> Run only the verification pass on an existing epic

Running alpha-loop init creates a .alpha-loop.yaml file:
# Alpha Loop configuration
repo: owner/repo-name
project: 0 # GitHub Project number (find it in your project URL)
agent: claude # AI agent CLI: claude, codex, opencode, lmstudio, ollama
# model: # omit to use agent's default (e.g., opus, gpt-5.4)
label: ready
base_branch: main
test_command: pnpm test
dev_command: pnpm dev
auto_merge: true
# Coding harnesses to sync skills/agents to (auto-derived from agent if empty)
harnesses:
- claude
# Safety limits (0 = unlimited)
max_issues: 20
max_session_duration: 7200 # 2 hours in seconds
# Post-session review (runs after all issues, reviews full session diff)
post_session:
  review: true
  security_scan: true
# Eval system
auto_capture: true # capture failures as eval cases
eval_dir: .alpha-loop/evals

| Key | Default | Description |
|---|---|---|
| `repo` | (auto-detected) | GitHub repo in owner/name format |
| `project` | `0` | GitHub Project number (from URL: `users/<owner>/projects/<N>`) |
| `agent` | `claude` | AI agent CLI to use: claude, codex, or opencode |
| `model` | (agent default) | AI model (passed via `--model` flag; omit to use agent's default) |
| `review_model` | (agent default) | AI model for code review and learning extraction |
| `label` | `ready` | GitHub label that marks issues as ready for the loop |
| `base_branch` | `master` | Branch to create PRs against |
| `test_command` | `pnpm test` | Command to run tests |
| `dev_command` | `pnpm dev` | Command to start the dev server for verification |
| `max_turns` | (none) | Max conversation turns for the agent |
| `poll_interval` | `60` | Seconds between issue polling |
| `max_test_retries` | `3` | Times to retry failing tests/verification |
| `milestone` | (none) | Only process issues in this milestone |
| `max_issues` | `0` | Max issues to process per session (0 = unlimited) |
| `max_session_duration` | `0` | Max session duration in seconds (0 = unlimited) |
| `auto_merge` | `true` | Auto-merge issue PRs into the session branch |
| `merge_to` | (none) | Use an existing branch instead of creating a session branch |
| `skip_tests` | `false` | Skip test execution |
| `skip_review` | `false` | Skip code review |
| `skip_verify` | `false` | Skip live verification |
| `skip_learn` | `false` | Skip learning extraction |
| `skip_e2e` | `false` | Skip E2E tests |
| `skip_install` | `false` | Skip pnpm install in worktrees |
| `skip_preflight` | `false` | Skip pre-flight test validation |
| `auto_cleanup` | `true` | Auto-remove worktrees after processing |
| `run_full` | `false` | Run full pipeline without skipping any steps |
| `verbose` | `false` | Enable verbose agent output |
| `harnesses` | (auto from agent) | Coding harnesses to sync skills/agents to (e.g., claude, codex) |
| `eval_dir` | `.alpha-loop/evals` | Directory for eval cases and scores |
| `eval_model` | (agent default) | AI model for eval judging |
| `eval_timeout` | `300` | Timeout in seconds for eval case execution |
| `auto_capture` | `true` | Auto-capture failures as eval cases at end of session |
| `batch` | `false` | Enable batch mode: process multiple issues per agent call |
| `batch_size` | `5` | Number of issues per batch when batch mode is enabled |
| `smoke_test` | (none) | Shell command to run as a final smoke test after session review |
| `pipeline` | `{}` | Per-step agent/model overrides (see below) |
| `pricing` | (built-in) | Custom token pricing per model for cost tracking |
| `eval_include_agent_prompts` | `true` | Include repo-specific agent prompts during eval runs |
| `eval_include_skills` | `true` | Include repo-specific skills during eval runs |
| `post_session.review` | `true` | Run holistic code review on full session diff |
| `post_session.security_scan` | `true` | Include security scanning in post-session review |

All config options can be set via environment variables (uppercase, same names):
| Variable | Config Key |
|---|---|
| `REPO` | `repo` |
| `PROJECT` | `project` |
| `AGENT` | `agent` |
| `MODEL` | `model` |
| `REVIEW_MODEL` | `review_model` |
| `POLL_INTERVAL` | `poll_interval` |
| `MAX_TEST_RETRIES` | `max_test_retries` |
| `MILESTONE` | `milestone` |
| `MAX_ISSUES` | `max_issues` |
| `MAX_SESSION_DURATION` | `max_session_duration` |
| `BASE_BRANCH` | `base_branch` |
| `TEST_COMMAND` | `test_command` |
| `DEV_COMMAND` | `dev_command` |
| `DRY_RUN` | `dry_run` |
| `SKIP_TESTS` | `skip_tests` |
| `SKIP_REVIEW` | `skip_review` |
| `SKIP_VERIFY` | `skip_verify` |
| `SKIP_LEARN` | `skip_learn` |
| `SKIP_E2E` | `skip_e2e` |
| `SKIP_INSTALL` | `skip_install` |
| `SKIP_PREFLIGHT` | `skip_preflight` |
| `AUTO_MERGE` | `auto_merge` |
| `AUTO_CLEANUP` | `auto_cleanup` |
| `MERGE_TO` | `merge_to` |
| `RUN_FULL` | `run_full` |
| `VERBOSE` | `verbose` |
| `EVAL_DIR` | `eval_dir` |
| `EVAL_MODEL` | `eval_model` |
| `EVAL_TIMEOUT` | `eval_timeout` |
| `AUTO_CAPTURE` | `auto_capture` |
| `BATCH` | `batch` |
| `BATCH_SIZE` | `batch_size` |
| `SKIP_POST_SESSION_REVIEW` | `post_session.review` (inverted) |
| `SKIP_POST_SESSION_SECURITY` | `post_session.security_scan` (inverted) |

Precedence: CLI flags > environment variables > .alpha-loop.yaml > auto-detection > defaults
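One way to picture this precedence chain is a first-defined-wins merge across layers. The sketch below is an illustration under that assumption; `resolveConfig` and the sample values are hypothetical and do not reflect how Alpha Loop actually loads its config.

```typescript
type ConfigLayer = Record<string, string | number | boolean | undefined>;

// Layers are passed from highest to lowest precedence, matching the order above.
function resolveConfig(...layers: ConfigLayer[]): ConfigLayer {
  const resolved: ConfigLayer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      // Keep the first defined value; lower layers only fill gaps.
      if (value !== undefined && resolved[key] === undefined) {
        resolved[key] = value;
      }
    }
  }
  return resolved;
}

const effectiveConfig = resolveConfig(
  { batch_size: 3 }, // CLI flags
  { batch_size: 10, agent: "codex" }, // environment variables
  { agent: "claude", label: "ready" }, // .alpha-loop.yaml
  { repo: "owner/repo-name" }, // auto-detection
  { batch_size: 5, base_branch: "master" }, // defaults
);
console.log(effectiveConfig);
```

Here `batch_size` resolves to 3 from the CLI layer and `agent` to `codex` from the environment, while `label`, `repo`, and `base_branch` fall through to the lower layers.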
Alpha Loop is agent-agnostic. Set the agent field in .alpha-loop.yaml to switch which CLI runs the pipeline:
# Use Codex instead of Claude
agent: codex

# Use Codex with a specific model
agent: codex
model: gpt-5.3-codex

If you omit model, the agent CLI uses its own default (e.g., Claude uses its configured model, Codex uses gpt-5.4). Set model only when you want to override.
| Agent | Example models | CLI flags used |
|---|---|---|
| `claude` | opus, sonnet, haiku | `-p --model MODEL --dangerously-skip-permissions` |
| `codex` | gpt-5.4, gpt-5.4-mini, gpt-5.3-codex | `exec --model MODEL --full-auto` |
| `opencode` | deepseek, gpt-4 | `run --model MODEL` |

When you change agent, the harness sync automatically targets the correct directories (e.g., .claude/ for Claude, .codex/ for Codex). You can also explicitly list harnesses if you use multiple tools:
agent: codex
harnesses:
- codex
- claude # also sync to Claude for teammates using it

Sync is additive by default: alpha-loop sync copies new or changed files from .alpha-loop/templates/ into each harness path, but it leaves harness-only skills and support files in place. Use alpha-loop sync --check to detect strict drift, including target-only files, and alpha-loop sync --prune only when you explicitly want to remove files from harness paths that are not present in templates.
Use pipeline to assign different models to different pipeline stages. This lets you use cheaper models for simple steps and reserve expensive models for implementation:
agent: claude
model: claude-sonnet-4-6 # default for all steps
pipeline:
  plan:
    model: claude-haiku-4-5 # cheap model for planning
  implement:
    model: claude-sonnet-4-6 # main model for coding
  review:
    model: claude-opus-4-6 # best model for review
  learn:
    model: claude-haiku-4-5 # cheap model for learning

Use alpha-loop eval search to automatically find the best model assignment per step via greedy coordinate descent over your eval suite.
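For readers unfamiliar with greedy coordinate descent, the idea is to change one step's model at a time and keep the change only if the score improves. The sketch below is a generic illustration: `greedySearch` and the toy `score` function are assumptions, and in practice the score would come from an eval run, which this example does not perform.

```typescript
type Assignment = Record<string, string>;

// Optimizes one step at a time while holding the others fixed.
// `score` is a stand-in for running the eval suite; higher is better.
function greedySearch(
  steps: string[],
  models: string[],
  initial: Assignment,
  score: (a: Assignment) => number,
): Assignment {
  let best: Assignment = { ...initial };
  let bestScore = score(best);
  let improved = true;

  // Repeat full passes until no single-step change improves the score.
  while (improved) {
    improved = false;
    for (const step of steps) {
      for (const model of models) {
        if (best[step] === model) continue;
        const candidate: Assignment = { ...best, [step]: model };
        const candidateScore = score(candidate);
        if (candidateScore > bestScore) {
          best = candidate;
          bestScore = candidateScore;
          improved = true;
        }
      }
    }
  }
  return best;
}

// Toy score: pretend haiku is best for planning and opus for review.
const toyScore = (a: Assignment): number =>
  (a["plan"] === "haiku" ? 1 : 0) + (a["review"] === "opus" ? 2 : 0);

const bestAssignment = greedySearch(
  ["plan", "review"],
  ["haiku", "sonnet", "opus"],
  { plan: "sonnet", review: "sonnet" },
  toyScore,
);
console.log(bestAssignment);
```

The search only accepts strict improvements, so it terminates, but like any greedy method it can settle on a local optimum when steps interact.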
For hybrid cloud/local setups, use routing: to target different models and endpoints for each Loop stage. This is how you offload token-heavy middle stages (Build, Test) to local open-weight models while keeping frontier models for Plan and Review:
routing:
  profile: hybrid-v1 # all-frontier | hybrid-v1 | all-local | <custom-name>
  stages:
    plan: { model: claude-opus-4-7, endpoint: anthropic }
    build: { model: qwen3-coder-30b-a3b, endpoint: lmstudio_local }
    test_write: { model: qwen3-coder-30b-a3b, endpoint: lmstudio_local }
    test_exec: { model: qwen3-coder-30b-a3b, endpoint: lmstudio_local }
    review: { model: claude-sonnet-4-6, endpoint: anthropic }
    summary: { model: gemma-4-31b, endpoint: lmstudio_local }
  endpoints:
    anthropic: { type: anthropic, base_url: "https://api.anthropic.com" }
    lmstudio_local: { type: anthropic_compat, base_url: "http://localhost:1234" }
    ollama_local: { type: openai_compat, base_url: "http://localhost:11434/v1" }
  fallback:
    on_tool_error: escalate # escalate | retry | fail
    escalate_to: { model: claude-sonnet-4-6, endpoint: anthropic }

Stages: plan, build, test_write, test_exec, review, summary. Each takes { model, endpoint }, where endpoint references a name defined in routing.endpoints.
Endpoint types: anthropic (native Anthropic API), anthropic_compat (Anthropic-compatible — e.g. LM Studio), or openai_compat (OpenAI-compatible — e.g. Ollama, vLLM). See docs/local-models.md for LM Studio and Ollama setup, including the agent: lmstudio / agent: ollama short form for single-agent local mode.
Fallback modes:
- `escalate`: when a routed stage errors on a tool call, retry on `escalate_to` (typically a frontier model)
- `retry`: retry on the same model/endpoint
- `fail`: surface the error without retry

Profile as a list (A/B): profile may also be an array of names (e.g. [hybrid-v1, all-local]). Alpha Loop picks one deterministically per-issue so reruns of the same issue select the same profile — this makes profile comparisons reproducible.
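A deterministic per-issue pick only needs a pure function of the issue number. The sketch below uses a simple modulo, which is an assumption made for illustration; the README does not say which hash or selection rule Alpha Loop uses, and `pickProfile` is a hypothetical name.

```typescript
// Assumption: modulo on the issue number. The real selection rule may differ;
// the only property illustrated here is that the same issue maps to the same profile.
function pickProfile(profiles: readonly string[], issueNumber: number): string {
  const chosen = profiles[Math.abs(issueNumber) % profiles.length];
  if (chosen === undefined) {
    throw new Error("profile list must not be empty");
  }
  return chosen;
}

const abProfiles = ["hybrid-v1", "all-local"];
console.log(pickProfile(abProfiles, 165)); // all-local
console.log(pickProfile(abProfiles, 166)); // hybrid-v1
console.log(pickProfile(abProfiles, 165)); // all-local again on a rerun
```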
Backwards compatibility: If you don't set routing, alpha-loop uses the top-level agent: / model: / pipeline: exactly as before — no behavior change.
Alpha Loop can run the token-heavy middle of the Loop (Build, Test) against an open-weight coding model on your own machine — typically a 30B-class model in LM Studio or Ollama — while keeping frontier models for Plan and Review. On a 64GB+ Apple Silicon Mac this typically cuts cost-per-issue by 60–80% without sacrificing Plan/Review quality.
- docs/local-models.md — hardware prerequisites, install steps for LM Studio 0.4.1+ / Ollama, recommended models (Qwen3-Coder-Next 30B-A3B, Gemma 4 31B, GLM-4.6), Apple Silicon tuning, and troubleshooting.
- docs/routing-profiles.md, with copy-pasteable profiles: `all-frontier` (baseline), `hybrid-v1` (recommended default), `all-local` (offline / zero-cost), `budget-hawk` (Haiku cloud + local coder).

Quickest path: install LM Studio 0.4.1+, load qwen3-coder-30b-a3b, then drop the hybrid-v1 block from docs/routing-profiles.md into your .alpha-loop.yaml. alpha-loop init detects Apple Silicon + 64GB+ RAM and points you at these docs automatically.
The loop uses these labels. Run alpha-loop init to create any that are missing:
| Label | Purpose |
|---|---|
| `ready` | Issue is ready for the loop to pick up |
| `in-progress` | Loop is actively working on it |
| `in-review` | PR created, awaiting review |
| `failed` | Loop failed after retries |
| `epic` | Parent issue with an ordered sub-issue checklist |

Use GitHub milestones to group issues into planned releases or sprints. When you start the loop, you'll be prompted to pick an epic or milestone; milestone rows show scheduled epic counts when parent epics are assigned to them.
Create milestones at github.com/<owner>/<repo>/milestones/new. Set due dates to keep yourself on track.
An epic is a GitHub issue with the epic label and a task-list body that references sub-issues by number:
## Sub-issues
- [ ] #158 Add tenant column to users table
- [ ] #159 Add tenant middleware
- [ ] #160 Scope queries by tenant

Run alpha-loop init to install the epic issue template at .github/ISSUE_TEMPLATE/epic.yml. It applies the epic label and prompts for the goal, ordered sub-issues, acceptance criteria, dependencies, sequencing notes, and verification expectations.
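The task-list body format shown above is easy to parse line by line. This is a minimal sketch of such a parser, not Alpha Loop's own; `parseChecklist`, `ChecklistItem`, and the regular expression are illustrative assumptions about what counts as a sub-issue line.

```typescript
interface ChecklistItem {
  issue: number;
  title: string;
  done: boolean;
}

// Matches task-list lines such as "- [ ] #158 Add tenant column to users table".
const TASK_LINE = /^\s*[-*]\s+\[( |x|X)\]\s+#(\d+)\s*(.*)$/;

function parseChecklist(body: string): ChecklistItem[] {
  const items: ChecklistItem[] = [];
  for (const line of body.split(/\r?\n/)) {
    const match = TASK_LINE.exec(line);
    if (!match) continue; // headings and prose are ignored
    items.push({
      done: match[1] !== " ",
      issue: Number(match[2]),
      title: (match[3] ?? "").trim(),
    });
  }
  return items; // checklist order is preserved, not issue-number order
}

const epicBody = [
  "## Sub-issues",
  "- [ ] #158 Add tenant column to users table",
  "- [x] #160 Scope queries by tenant",
  "- [ ] #159 Add tenant middleware",
].join("\n");

const checklist = parseChecklist(epicBody);
console.log(checklist.map((item) => item.issue)); // [ 158, 160, 159 ]
```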
When you start the loop, open epics appear above milestones in the picker and show milestone membership when present. You can also target one directly:
alpha-loop run --epic 165

alpha-loop run --milestone "v1.0" checks for open epics assigned to that milestone before fetching flat issues. One scheduled epic is processed automatically, multiple scheduled epics require --epic <N>, and no scheduled epics falls back to ready non-epic issues in that milestone. --epic always wins over --milestone; --skip-epic disables epic discovery and preserves the flat milestone flow.
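The milestone rules above amount to a small decision function. The sketch below models them with hypothetical types; how `--epic` and `--skip-epic` interact when both are passed is not stated in this README, so checking `--epic` first is an assumption.

```typescript
type MilestoneTarget =
  | { kind: "epic"; epic: number } // process this epic's checklist
  | { kind: "choose"; epics: number[] } // exit and ask for --epic <N>
  | { kind: "flat" }; // ready non-epic issues in the milestone

interface ResolveOptions {
  scheduledEpics: number[]; // open epics assigned to the milestone
  epicFlag?: number; // value of --epic <N>, if given
  skipEpic?: boolean; // true when --skip-epic is passed
}

function resolveMilestoneTarget(opts: ResolveOptions): MilestoneTarget {
  // Assumption: an explicit --epic is checked before --skip-epic.
  if (opts.epicFlag !== undefined) return { kind: "epic", epic: opts.epicFlag };
  if (opts.skipEpic) return { kind: "flat" };

  const [only, ...rest] = opts.scheduledEpics;
  if (only === undefined) return { kind: "flat" };
  if (rest.length === 0) return { kind: "epic", epic: only };
  return { kind: "choose", epics: opts.scheduledEpics };
}

console.log(resolveMilestoneTarget({ scheduledEpics: [165] }));
console.log(resolveMilestoneTarget({ scheduledEpics: [165, 205] }));
console.log(resolveMilestoneTarget({ scheduledEpics: [165], skipEpic: true }));
```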
To ask Alpha Loop what to run next, use queue planning:
alpha-loop roadmap --queue
alpha-loop roadmap --queue --milestone "v1.0"

Queue planning is read-only. It inspects open epic issues, milestone assignments, checklist progress, child readiness labels, dependency phrases such as depends on #N, and likely file overlap. When a runnable queue exists, it prints an executable command like alpha-loop run --epics 205,166,214; blocked epics stay out of the command and are listed with their blockers.
To run several epics unattended while keeping review scope separate, pass an explicit queue:
alpha-loop run --epics 205,166,214

The queue is validated before any work starts. Each listed issue must exist, be labeled epic, not be duplicated, and be open unless it is already closed as completed. Alpha Loop processes the epics in the given order, creates/finalizes one session branch and PR per epic, and stops on the first epic failure, verification gap, checklist consistency error, or transient agent/rate-limit stop. By default, queue sessions use stacked ancestry: later epic session branches start from the previous successful session branch while their PRs still target the configured base branch. Use --queue-branch-mode independent for unrelated epics that should all branch from the base branch. Non-dry-run queue attempts write .alpha-loop/sessions/queue-<timestamp>/queue.json; alpha-loop history lists those manifests and alpha-loop history queue-<timestamp> prints stopped/pending epics, session PRs, and rebase notes. --dry-run prints the validated queue without mutating GitHub or git state.
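The four validation rules can be sketched as a pure function over already-fetched issue data. Everything here is illustrative: `IssueInfo`, `validateQueue`, and the error strings are hypothetical, and the sketch takes a prebuilt lookup map instead of calling GitHub.

```typescript
// Hypothetical, simplified view of a GitHub issue for validation purposes.
interface IssueInfo {
  number: number;
  labels: string[];
  state: "open" | "closed";
  stateReason?: "completed" | "not_planned";
}

// Returns a list of problems; an empty list means the queue is valid.
function validateQueue(ids: number[], lookup: Map<number, IssueInfo>): string[] {
  const errors: string[] = [];
  const seen = new Set<number>();

  for (const id of ids) {
    if (seen.has(id)) {
      errors.push(`#${id} is listed more than once`);
      continue;
    }
    seen.add(id);

    const issue = lookup.get(id);
    if (!issue) {
      errors.push(`#${id} does not exist`);
      continue;
    }
    if (!issue.labels.includes("epic")) {
      errors.push(`#${id} is not labeled epic`);
    }
    if (issue.state === "closed" && issue.stateReason !== "completed") {
      errors.push(`#${id} is closed but not completed`);
    }
  }
  return errors;
}

const knownIssues = new Map<number, IssueInfo>([
  [205, { number: 205, labels: ["epic"], state: "open" }],
  [166, { number: 166, labels: ["epic"], state: "closed", stateReason: "completed" }],
  [214, { number: 214, labels: ["bug"], state: "open" }],
]);

console.log(validateQueue([205, 166], knownIssues)); // []
console.log(validateQueue([205, 205, 214, 999], knownIssues));
```

Collecting every problem before reporting, as this sketch does, lets the whole queue be fixed in one pass instead of one error at a time.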
Sub-issues are processed in checklist order (not issue-number order). Each sub-issue PR gets Part of #165 appended, and the epic body's checkboxes auto-flip from - [ ] to - [x] as PRs merge. When every sub-issue has shipped, the loop runs a verification pass against each sub-issue's acceptance criteria — on pass the epic is auto-closed, on partial or fail it stays open with a needs-human-input label and a structured comment explaining the gaps.
See docs/epics.md for the full feature reference, including --verify-only, the prefer_epics config option, skip rules, and safety rails.
Alpha Loop reads issues from a GitHub Project board (v2). Issues are processed in board order, so you control priority by reordering. When combined with milestones, only "Todo" items in the selected milestone are processed.
Set the project number in your config (find it in your project URL: github.com/users/<owner>/projects/<number>).
Issues work best with structured acceptance criteria. Run alpha-loop init to install two GitHub issue templates:
- Agent-Ready Task (.github/ISSUE_TEMPLATE/agent-ready.yml) for standalone or sub-issues the loop can implement.
- Epic (.github/ISSUE_TEMPLATE/epic.yml) for parent issues that group ordered sub-issues and drive final verification.

## Description
What needs to be done.
## Acceptance Criteria
- [ ] Specific, testable criterion
- [ ] Another criterion
## Test Requirements
- Unit test for X
- E2E test for Y
## Affected Files/Areas
- src/...
- tests/...

| Directory | Git-tracked? | Purpose |
|---|---|---|
| `.alpha-loop/vision.md` | Yes | Project vision document |
| `.alpha-loop/context.md` | Yes | Auto-generated project context |
| `.alpha-loop/learnings/` | Yes | Learning files, session manifests, and session summaries (shared with team) |
| `.alpha-loop/evals/` | Yes | Eval cases (YAML) and score history (scores.jsonl) |
| `.alpha-loop/traces/` | No (gitignored) | Meta-Harness style execution traces per session |
| `.alpha-loop/sessions/` | No (gitignored) | Local session logs, results JSON, screenshots |
| `.alpha-loop/sessions/queue-<timestamp>/queue.json` | No (gitignored) | Multi-epic queue manifest with status, session PRs, merge order, and stop reason |
| `.alpha-loop/auth/` | No (gitignored) | Saved browser auth state for verification |
| `.worktrees/` | No (gitignored) | Temporary git worktrees during processing |

git clone https://github.com/bradtaylorsf/alpha-loop.git
cd alpha-loop
pnpm install
pnpm build
pnpm test
# Run in development mode
pnpm dev -- run --dry-run

| Layer | Technology |
|---|---|
| Runtime | Node.js, TypeScript, ESM |
| CLI Framework | Commander.js |
| AI Agents | Any CLI agent (Claude, Codex, OpenCode) |
| Source of Truth | GitHub (Issues = kanban, PRs = reviews) |
| Package Manager | pnpm |
MIT