🌲 evergreen

The migration agent that is never allowed to break your build.

Every. Single. Commit. Passes. Your. Tests.

Point it at a green repo, give it a goal — "migrate Flask → FastAPI", "Angular → React", "unittest → pytest" — and it walks your codebase there in dozens of tiny commits, running your real test suite after each one, refusing to advance while anything is red.

Most "AI migration" tools dump one giant diff in your lap and wish you luck. evergreen sells the opposite — a guarantee:

At every commit the agent creates, the full test suite passes.

No exceptions. Enforced in deterministic code the model cannot reach.

The result is a git history that reads like a careful senior engineer wrote it: atomic, reviewable, and git bisect-able from the first commit to the last.

Watch it think

The magic isn't the code generation — it's the closed loop. Here's a real run (a module rename driven by a local model via Ollama). Watch step 5: the model writes a broken edit, the suite goes red, the agent diagnoses it, repairs it, and only then commits:

baseline green: 14 tests in 0.23s

step 5/8 step-004 · Update calclib.stats to import from arithmetic (expand_contract, move)
  apply  llm edit (attempt 1)
  apply error: replace_in_file: 'import calclib.ops' not found in calclib/stats.py
  diagnose  repair: Update calclib.stats to import from arithmetic
  apply  builtin:apply_edits (attempt 2)
  green (full suite, 0.22s)
  commit a8752e7d (step-004)        ← committed ONLY after green

step 7/8 step-006 · Remove calclib.ops.py (expand_contract, remove)
  apply  llm edit (attempt 1)
  red (full suite, 0.25s) → tests/test_ops.py::TestOps::test_div
  diagnose  repair: Remove calclib.ops.py
  apply  builtin:apply_edits (attempt 2)
  red (full suite, 0.23s) → 3 failing
  rollback step-006: could not reach green within the repair budget   ← NO commit. clean.

done 6 green commits / 8 steps
migration incomplete: rolled back [step-006]. Your branch is untouched.

That's the whole pitch in one screen: the model makes mistakes, the invariant catches every one of them. A weaker model just means more repairs and the occasional honest "I couldn't finish this step" — never a broken commit, never a touched branch.

Quickstart

Install straight from GitHub (no PyPI release needed):

pip install "git+https://github.com/utsabpanta/evergreen-migration-agent"   # Python 3.11+

Or clone for development:

git clone https://github.com/utsabpanta/evergreen-migration-agent
cd evergreen-migration-agent
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

Then point it at any OpenAI-compatible model endpoint and run it inside a target repo:

export LLM_BASE_URL=http://localhost:11434/v1   # e.g. Ollama
export LLM_MODEL=qwen3-coder:480b-cloud

cd ~/path/to/your/green/repo           # must be: a git repo, clean tree, passing tests
evergreen plan "migrate the test suite from unittest to pytest"   # preview the DAG — changes nothing
evergreen run  "migrate the test suite from unittest to pytest"   # do it, with a live trace

The four commands:

Command	What it does
`evergreen plan "<goal>"`	Show the step DAG + chosen strategies. Changes nothing.
`evergreen run "<goal>"`	Run it: plan → apply → test → commit, never advancing on red.
`evergreen status`	Done / next / blocked steps for the saved run.
`evergreen resume`	Continue an interrupted run from the last green commit.

Flags worth knowing: --no-promote (leave the result on the evergreen/* branch instead of fast-forwarding yours), --interactive (approve each step), --plan-file plan.json (run a reviewed plan deterministically), --characterize (generate safety-net tests if coverage is missing), --redact (log only hashes of what's sent to a hosted model), --max-repairs / --max-replans (tune the autonomy budget).

Try it in 30 seconds (no model required)

The unittest→pytest migration ships as a deterministic recipe — it needs no LLM at all:

cp -r tests/fixtures/toy_unittest_pkg /tmp/demo && cd /tmp/demo
git init -b main && git add -A && git commit -m "initial"
evergreen run "migrate the test suite from unittest to pytest"
git log --oneline      # 5 commits — check out any one; pytest is green at every step

What it can migrate

evergreen is language-agnostic; "green" is defined by adapters that read your real test runner:

Ecosystem	Runner	Structured results via
Python	pytest (also collects `unittest` suites)	junit XML
Node.js	`node --test` (built-in)	TAP reporter
JS / TS	jest	`--json` reporter
JS / TS	vitest	`--reporter=json`
anything	any command via `[tool.evergreen] test_command`	exit code

For well-known migrations, playbooks inject battle-tested strategy into the planner: Angular→React, Next.js→TanStack, JS→TS, CommonJS→ESM, Flask→FastAPI, unittest→pytest — each with the right coexistence pattern and the oracle warning (e.g. "Angular TestBed unit tests die with the framework — you need behavior-level tests").

How it works

goal + repo ─► Planner (LLM) ─► DAG of atomic steps ─► Orchestrator (DETERMINISTIC)
                  ▲                                       │  for each step:
                  │ re-plan                               │    apply  → verify → commit
                  │                                       │    red?   → diagnose → repair → retry
            Diagnostician (LLM) ◄── red tests ────────────┤    stuck? → roll back, never commit
                                                          ▼
                                            one green commit per step ✅

The model proposes; your test suite disposes. The LLM plans, edits, and diagnoses. The commit decision lives in CommitManager, which demands the actual green full-suite result as proof and raises PrimeDirectiveViolation otherwise. The model has no path to a commit.
Crossing the valley. You can't atomically swap a framework and stay green, so the planner picks a coexistence strategy per concern — expand→migrate→contract, strangler fig, shim, branch-by-abstraction — and a validator rejects any plan that deletes before it migrates.
Sandboxed & reversible. All work happens in a git worktree on an evergreen/* branch. Snapshot before each step, git reset on failure, your branch untouched until you say so.
Trustworthy green. Zero tests → it refuses to migrate blindly and offers characterization tests. Flaky tests → quarantined so nondeterminism never defines "green." Slow suite → affected tests run first for fast feedback, but the full suite always gates the commit.
Resumable & auditable. Every commit is a safe resume point; each carries an Evergreen-Step: trailer; the whole run replays from a JSONL trace + plan.

The model: bring your own

One env-var interface, never a hard-coded vendor — vLLM, SGLang, Ollama, Z.AI, or any endpoint speaking POST /v1/chat/completions:

export LLM_BASE_URL=...     # required to enable LLM planning/editing/diagnosis
export LLM_API_KEY=...      # if your endpoint needs one
export LLM_MODEL=...        # e.g. glm-5.1, qwen3-coder, deepseek, …

Local-first: the repo, sandbox, test execution, and static analysis never leave your machine. Self-host the model (or run a recipe-only migration) and nothing leaves at all. When a hosted endpoint is used, every payload is logged to .git/evergreen/llm.jsonl — --redact keeps only hashes. Smarter model = better plans and fewer rollbacks; it can never mean a broken commit.

Honest limitations

Trust is the product, so here's what it won't do:

Your test oracle must survive the migration. Tests coupled to the framework you're removing (Angular TestBed, mocked next/router) can't pin behavior across the swap. Use behavior-level tests (HTTP / DOM / E2E); evergreen detects the gap and warns.
Infrastructure migrations are mostly out of scope. "API Gateway → ALB + EC2" has no fast local test oracle for "the ALB routes correctly" — that's verified by deploying to the cloud. evergreen can rehost the application code (handler → server) under the invariant, but the IaC + traffic cutover is human-owned deploy work it can scaffold, not guarantee.
Dependency-changing migrations that require npm install / pip install of new packages need the Docker-isolated sandbox (a documented extension point, not yet built); the worktree baseline assumes deps are already present.

How it's proven

27 acceptance tests gate the spec's four build phases — including the headline proof: a real Flask→FastAPI migration where git bisect run pytest finds no red commit in the produced range (it pins only a deliberately injected bad commit), and the same loop runs end-to-end on a real JavaScript repo under node --test.

pip install -e ".[dev]" && pytest      # 27 passing

Project layout

evergreen/
  cli.py              # typer entrypoints: plan / run / resume / status
  orchestrator.py     # the deterministic loop; enforces the Prime Directive
  planner.py          # LLM-backed DAG planner, grounded by static analysis
  executor.py         # codemod-first, LLM-fallback step application
  verifier.py         # runner adapters → structured pass/fail (sole authority on "green")
  diagnostician.py    # LLM-backed repair on red
  commit.py           # CommitManager: refuses to commit anything not full-suite green
  sandbox.py          # git worktree isolation, snapshot/restore, promote
  suite_assessment.py # baseline check, flaky quarantine, test-impact analysis
  playbooks.py        # per-migration strategy guidance injected into the planner
  strategies/         # expand_contract · strangler_fig · shim · branch_by_abstraction
  recipes.py          # deterministic, LLM-free migrations (e.g. unittest→pytest)
  llm.py              # OpenAI-compatible client (any vendor)
  models.py           # MigrationPlan / Step / StepResult (pydantic)
tests/                # acceptance gates incl. the git-bisect proof; tiny real fixture repos

License

MIT.

The model proposes. Your test suite decides.

Built with Claude Code · runs on open weights

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
demo		demo
evergreen		evergreen
tests		tests
.gitignore		.gitignore
README.md		README.md
pyproject.toml		pyproject.toml
spec.md		spec.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🌲 evergreen

The migration agent that is never allowed to break your build.

At every commit the agent creates, the full test suite passes.

Watch it think

Quickstart

Try it in 30 seconds (no model required)

What it can migrate

How it works

The model: bring your own

Honest limitations

How it's proven

Project layout

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🌲 evergreen

The migration agent that is never allowed to break your build.

At every commit the agent creates, the full test suite passes.

Watch it think

Quickstart

Try it in 30 seconds (no model required)

What it can migrate

How it works

The model: bring your own

Honest limitations

How it's proven

Project layout

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages