A framework for building multi-step AI workflows that can plan, execute, validate, and recover from failures. Supports multiple models (Claude, Gemini, Ollama), human approval steps, and resumable runs with persistent state.
A LangGraph-native workflow framework for solo developers. Orchestrates multi-step AI workflows with durable state, multi-provider routing, and deterministic cost accounting across Gemini (via LiteLLM), Qwen (via Ollama), and Claude Code (via OAuth CLI subprocess).
| Milestone | State |
|---|---|
| M1 — Reconciliation & cleanup | Complete (2026-04-19) |
| M2 — Graph-layer adapters + provider drivers | Complete (2026-04-19) |
| M3 — First workflow (planner, single tier) | Complete (2026-04-20) |
| M4 — MCP server (FastMCP) | Complete (2026-04-20) |
| M5 — Multi-tier planner | Complete (2026-04-20) |
| M6 — slice_refactor DAG | Complete (2026-04-20) |
| M7 — Eval harness | Complete (2026-04-21) |
| M8 — Ollama infrastructure | Complete (2026-04-21) |
| M9 — Claude Code skill packaging | Complete (2026-04-21) |
| M10 — Ollama fault-tolerance hardening | Planned |
| M11 — MCP gate-review surface | Complete (2026-04-22) |
| M12 — Tiered audit cascade | Complete (2026-04-29) |
| M13 — v0.1.0 release + PyPI packaging | Complete (2026-04-22) |
| M14 — MCP HTTP transport | Complete (2026-04-22) |
| M15 — Tier overlay + fallback chains | Complete (2026-04-30) |
| M16 — External workflows + primitives load path | Complete (2026-04-24) |
| M17 — scaffold_workflow meta-workflow | Complete (2026-04-30) |
| M19 — Declarative authoring surface | Complete (2026-04-26) |
| M20 — Autonomy loop optimization | Complete (2026-04-28) |
| M21 — Autonomy loop continuation | Complete (2026-04-29) |
ai-workflows exposes two surfaces over the same workflow registry: an aiw CLI for interactive and scripted use, and an aiw-mcp MCP server for Claude Code, Cursor, Zed, and browser-origin consumers (via streamable-HTTP). A workflow is a Python module that builds a LangGraph StateGraph composed of graph primitives (TieredNode, ValidatorNode, HumanGate, RetryingEdge) and registered by name. There is no hosted control plane and no Anthropic API dependency — Claude access is OAuth-only through the claude CLI subprocess.
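Because both surfaces resolve workflows by name, the registry idea can be illustrated with a small name-to-builder map. This is a minimal sketch only: the `register(name, build_fn)` shape follows the API named under the authoring tiers, but the body, the `build` helper, and the `build_demo_graph` placeholder are assumptions for illustration, not the framework's code.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a workflow registry; the real one inside
# ai-workflows may be structured differently.
_REGISTRY: Dict[str, Callable[[], object]] = {}


def register(name: str, build_fn: Callable[[], object]) -> None:
    """Record a builder function under a unique workflow name."""
    if name in _REGISTRY:
        raise ValueError(f"workflow {name!r} is already registered")
    _REGISTRY[name] = build_fn


def build(name: str) -> object:
    """Look up a builder by name and call it to produce the graph object."""
    try:
        build_fn = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown workflow {name!r}") from None
    return build_fn()


def build_demo_graph() -> dict:
    # Placeholder: a real builder would return a LangGraph StateGraph.
    return {"nodes": ["plan", "validate", "gate"]}


register("demo", build_demo_graph)
```

A CLI command and an MCP tool could then both call `build("demo")` and receive the same graph definition, which is the property the two-surface design relies on.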
Four layers with a one-way dependency direction enforced by import-linter:
surfaces (ai_workflows.cli, ai_workflows.mcp)
↓
workflows (ai_workflows.workflows.*) — concrete LangGraph StateGraphs
↓
graph (ai_workflows.graph.*) — LangGraph adapters over primitives
↓
primitives (ai_workflows.primitives.*) — storage, cost, tiers, providers, retry, logging
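The one-way rule in the diagram can be modelled in a few lines. The sketch below is a simplified illustration of what a layers contract checks, not import-linter's implementation; it treats same-layer imports as allowed, and the example module names in the usage note are hypothetical.

```python
# Layer names from the diagram, ordered from top (index 0) to bottom.
LAYERS = ["surfaces", "workflows", "graph", "primitives"]

# Module prefix -> layer, using the package names shown in the diagram.
PREFIX_TO_LAYER = {
    "ai_workflows.cli": "surfaces",
    "ai_workflows.mcp": "surfaces",
    "ai_workflows.workflows": "workflows",
    "ai_workflows.graph": "graph",
    "ai_workflows.primitives": "primitives",
}


def layer_of(module: str) -> str:
    """Return the layer a dotted module path belongs to."""
    for prefix, layer in PREFIX_TO_LAYER.items():
        if module == prefix or module.startswith(prefix + "."):
            return layer
    raise ValueError(f"{module!r} is not in a known layer")


def is_allowed(importer: str, imported: str) -> bool:
    """Allow an import only if it stays in its layer or points downward."""
    return LAYERS.index(layer_of(importer)) <= LAYERS.index(layer_of(imported))
```

For example, `is_allowed("ai_workflows.workflows.planner", "ai_workflows.graph.nodes")` is true, while an import from `ai_workflows.primitives` up into `ai_workflows.graph` is rejected.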
Full overview in docs/architecture.md. Tutorials for authoring a new workflow or extending the graph layer live at docs/writing-a-workflow.md and docs/writing-a-graph-primitive.md.
Requires Python ≥ 3.12 and uv.
One-shot via uvx — no persistent install; every invocation fetches the wheel into a cache:
uvx --from jmdl-ai-workflows aiw run planner --goal 'Write a release checklist' --run-id demo
Persistent tool install — puts aiw + aiw-mcp on PATH:
uv tool install jmdl-ai-workflows
aiw run planner --goal 'Write a release checklist' --run-id demo
After installing (either path above), set your Gemini API key and drive a planner run end-to-end:
export GEMINI_API_KEY=...
aiw run planner --goal 'Write a release checklist' --run-id demo
aiw resume demo --approve
aiw list-runs
The planner workflow composes two LLM tiers (Qwen explorer via Ollama + Claude Code Opus synth). If you only want the Gemini path for a smoke test, pass --tier-override planner-synth=planner-explorer or omit the Ollama + Claude Code prerequisites and stub the gemini_flash tier.
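To make the `--tier-override` pair concrete, here is a minimal sketch of how a `logical=replacement` string could be applied to a tier table. The semantics are an assumption (the left-hand tier is rerouted to whatever the right-hand tier points at), and the function name and route strings are hypothetical; the framework's actual handling may differ.

```python
from typing import Dict, Iterable


def apply_tier_overrides(tiers: Dict[str, str], overrides: Iterable[str]) -> Dict[str, str]:
    """Return a copy of `tiers` with each 'logical=replacement' pair applied.

    Assumed semantics: the logical tier takes over the route of the
    replacement tier. Both names must already exist in the table.
    """
    resolved = dict(tiers)
    for item in overrides:
        logical, sep, replacement = item.partition("=")
        if not sep or not logical or not replacement:
            raise ValueError(f"expected 'logical=replacement', got {item!r}")
        if logical not in tiers or replacement not in tiers:
            raise KeyError(f"unknown tier in override {item!r}")
        resolved[logical] = tiers[replacement]
    return resolved


# Hypothetical tier table; the route strings are placeholders.
tiers = {"planner-explorer": "route-a", "planner-synth": "route-b"}
overridden = apply_tier_overrides(tiers, ["planner-synth=planner-explorer"])
```

After the call, both tiers in `overridden` resolve to the explorer's route, while the original `tiers` table is left untouched.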
Both aiw and aiw-mcp auto-load a .env from the current working directory at startup (shell-exported values win over .env).
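The precedence rule (shell-exported values win over .env) can be sketched as a two-step merge. This is an illustration under simplifying assumptions: the parser handles only plain KEY=VALUE lines, and neither function name comes from the framework.

```python
from typing import Dict, Mapping


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def merge_env(dotenv: Mapping[str, str], environ: Mapping[str, str]) -> Dict[str, str]:
    """Start from .env values, then let already-exported values overwrite them."""
    merged = dict(dotenv)
    merged.update(environ)
    return merged
```

With a .env containing `GEMINI_API_KEY=from-file` and a shell that exports `GEMINI_API_KEY=from-shell`, the merged result keeps `from-shell`; keys present only in .env are filled in from the file.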
Key env vars:
- GEMINI_API_KEY — required for any workflow using a Gemini tier (most defaults).
- OLLAMA_BASE_URL — default http://localhost:11434; override if your Ollama daemon listens elsewhere.
- AIW_STORAGE_DB / AIW_CHECKPOINT_DB — path overrides for the run registry and checkpoint databases (defaults: ~/.ai-workflows/storage.sqlite3 / ~/.ai-workflows/checkpoint.sqlite3).
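The defaults listed above can be expressed as a small resolver. This is a sketch, not the framework's settings code: the variable names and default values come from the list, while `resolve_settings` and the returned dictionary keys are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_settings(environ: Mapping[str, str]) -> dict:
    """Resolve the documented env vars, falling back to the documented defaults."""

    def db_path(var: str, default: str) -> Path:
        # os.path.expanduser leaves the path unchanged if no home dir is known.
        return Path(os.path.expanduser(environ.get(var, default)))

    return {
        "gemini_api_key": environ.get("GEMINI_API_KEY"),  # None when unset
        "ollama_base_url": environ.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "storage_db": db_path("AIW_STORAGE_DB", "~/.ai-workflows/storage.sqlite3"),
        "checkpoint_db": db_path("AIW_CHECKPOINT_DB", "~/.ai-workflows/checkpoint.sqlite3"),
    }
```

Calling `resolve_settings(os.environ)` gives the effective values for the current shell; passing a plain dictionary instead makes the fallback behaviour easy to inspect in isolation.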
Claude Code tier: some workflows route to the claude CLI via OAuth. Install and authenticate it separately per Anthropic's setup docs. aiw never reads ANTHROPIC_API_KEY and never imports the anthropic SDK — Claude access is OAuth-only through the CLI subprocess.
ai-workflows is a declarative orchestration layer; extension is a first-class capability. Authors engage at four progressively-deeper tiers, each with a dedicated guide:
| Tier | When | Guide |
|---|---|---|
| 1 — Compose | You're combining built-in step types (LLMStep, ValidateStep, GateStep, TransformStep, FanOutStep) into a workflow. The happy path. | docs/writing-a-workflow.md |
| 2 — Parameterise | You're configuring built-in steps (retry policy, response format, gate behaviour, tier choice). | docs/writing-a-workflow.md (same doc) |
| 3 — Author a custom step type | No built-in covers your need. Subclass Step; the framework wires your custom step into the graph like a built-in. | docs/writing-a-custom-step.md |
| 4 — Escape to LangGraph directly | Your topology is genuinely non-standard (dynamic edge conditions, novel control flow). Use the legacy register(name, build_fn) API. | docs/writing-a-graph-primitive.md |
The framework's promise: descending a tier never forces you to reverse-engineer framework source. If you're at the wrong tier, you'll find pointers to the right one in any guide.
Register aiw-mcp with any MCP host — Claude Code, Cursor, Zed, or an HTTP client via the streamable-HTTP transport — to drive the same workflows inside-out:
claude mcp add ai-workflows --scope user -- uvx --from jmdl-ai-workflows aiw-mcp
The HTTP transport is opt-in for browser-origin consumers: aiw-mcp --transport http --port 8080 --cors-origin http://localhost:3000.
Registering your own workflow modules from a downstream package? AIW_EXTRA_WORKFLOW_MODULES=pkg.workflows.your_workflow (or --workflow-module pkg.workflows.your_workflow, repeatable) imports them at startup. See docs/writing-a-workflow.md §External workflows from a downstream consumer.
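Importing extra workflow modules at startup can be sketched with `importlib`. Two things here are assumptions rather than documented behaviour: that AIW_EXTRA_WORKFLOW_MODULES holds a comma-separated list, and that modules named in both places are imported once. The function name is hypothetical.

```python
import importlib
from types import ModuleType
from typing import Iterable, List, Mapping


def load_extra_modules(
    environ: Mapping[str, str], cli_modules: Iterable[str] = ()
) -> List[ModuleType]:
    """Import modules named in the env var, then those passed on the CLI."""
    raw = environ.get("AIW_EXTRA_WORKFLOW_MODULES", "")
    # Assumption: entries are comma-separated dotted module paths.
    names = [name.strip() for name in raw.split(",") if name.strip()]
    names.extend(cli_modules)

    loaded: List[ModuleType] = []
    seen = set()
    for name in names:
        if name in seen:
            continue  # import each module once, keeping first-seen order
        seen.add(name)
        # Importing runs the module's top-level code, which is where a
        # workflow module would register itself by name.
        loaded.append(importlib.import_module(name))
    return loaded
```

A misspelled module path raises `ModuleNotFoundError` from `importlib.import_module`, so a bad entry would surface at startup rather than at first use.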
- Loopback default — aiw-mcp --transport http binds to 127.0.0.1; unreachable from other machines. --host 0.0.0.0 exposes the server to every process on the host and to the LAN. aiw-mcp has no built-in auth; the bind address is the only access boundary. Only pass --host 0.0.0.0 on a machine you own every process on, and put a reverse proxy in front if you need TLS.
- CORS is opt-in, exact-match — --cors-origin <url> adds one origin; without any flags the server emits no Access-Control-Allow-Origin header (same-origin only). Not required for stdio or loopback HTTP.
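Exact-match CORS means the request's Origin header must equal a configured origin character for character. The sketch below illustrates that rule in isolation; it is not the server's middleware, and `cors_headers` is a hypothetical name.

```python
from typing import Dict, Iterable, Optional


def cors_headers(request_origin: Optional[str], allowed_origins: Iterable[str]) -> Dict[str, str]:
    """Return CORS response headers for an exact-match allow list.

    No wildcard and no prefix matching: an origin is echoed back only if it
    equals a configured origin exactly. Otherwise no header is emitted and
    the browser falls back to same-origin behaviour.
    """
    allowed = set(allowed_origins)
    if request_origin is not None and request_origin in allowed:
        return {"Access-Control-Allow-Origin": request_origin, "Vary": "Origin"}
    return {}
```

With `http://localhost:3000` configured, a request from `http://localhost:3000` gets the header, while `http://localhost:3000/` (trailing slash) or `http://localhost:3001` gets none; an empty allow list never emits the header.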
Clone the repo for development or to modify the framework itself:
git clone https://github.com/yeevon/ai-workflows.git
cd ai-workflows
uv sync # install runtime + dev dependencies
uv run aiw version # prints the current __version__ (0.4.0 at M17 close)
For the full builder/auditor workflow — task specs, audit issue files, Builder / Auditor mode conventions — switch to the design_branch.
Three gates guard every change:
uv run pytest # unit + scaffolding tests (hermetic; skips e2e unless AIW_E2E=1)
uv run lint-imports # four-layer import contract
uv run ruff check # style + basic correctness
M21 is complete. The next planned milestone is M22, which will address any operator-resume items from M20/M21 (including T06/T07 dynamic model dispatch if the GO/NO-GO verdict fires) and further autonomy-loop improvements identified from M21's empirical baseline.
Roadmap + per-milestone task files live at design_docs/roadmap.md (builder-only, on design branch).