Releases: peteromallet/megaplan
v0.12.0 — Auto Driver & Cross-Directory Plan Discovery
Auto driver
New megaplan auto --plan <name> drives a plan from its current state to a terminal outcome without human intervention. The driver is intentionally dumb: it reads status, runs next_step, and loops. All real judgment stays in the phase logic.
- Gate escalation policy: ESCALATE defaults to force-proceed. Opt out with `--on-escalate abort` or `--on-escalate fail`.
- Stall detection: bails after N consecutive iterations in the same state (default 5).
- Iteration cap: hard stop at 200 iterations by default to bound runaway loops.
- Structured exit codes, so shell callers and CI can branch on terminal state without parsing output:
| Exit code | Meaning |
|---|---|
| 0 | done |
| 1 | failed |
| 2 | stalled |
| 3 | escalated (under `--on-escalate fail`) |
| 4 | iteration cap hit |
- Emits a JSON outcome with the final state snapshot and event log on exit.
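The loop described above can be sketched roughly as follows. This is an illustration, not megaplan's driver: `get_state`, `run_next_step`, the `Outcome` type, and the state names are hypothetical stand-ins, and the `abort` policy is left out because these notes do not say how it terminates. Only the defaults (5 and 200) and the exit codes come from the notes.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Exit codes as listed in the release notes.
EXIT_DONE, EXIT_FAILED, EXIT_STALLED, EXIT_ESCALATED, EXIT_ITERATION_CAP = 0, 1, 2, 3, 4


@dataclass
class Outcome:
    exit_code: int
    final_state: str
    events: List[str] = field(default_factory=list)


def drive(
    get_state: Callable[[], str],        # hypothetical: reads the plan's current state
    run_next_step: Callable[[], None],   # hypothetical: runs whatever step is next
    on_escalate: str = "force-proceed",  # "fail" stops with exit code 3
    stall_limit: int = 5,
    max_iterations: int = 200,
) -> Outcome:
    events: List[str] = []
    last_state, repeats = None, 0
    for _ in range(max_iterations):
        state = get_state()
        events.append(state)
        # Terminal states end the loop immediately.
        if state == "done":
            return Outcome(EXIT_DONE, state, events)
        if state == "failed":
            return Outcome(EXIT_FAILED, state, events)
        if state == "escalated" and on_escalate == "fail":
            return Outcome(EXIT_ESCALATED, state, events)
        # Stall detection: the same state observed stall_limit times in a row.
        repeats = repeats + 1 if state == last_state else 1
        last_state = state
        if repeats >= stall_limit:
            return Outcome(EXIT_STALLED, state, events)
        # No judgment here: the phase logic decides what the step does.
        run_next_step()
    return Outcome(EXIT_ITERATION_CAP, get_state(), events)
```

A caller would pass `Outcome.exit_code` to `sys.exit` so that shell scripts and CI can branch on it.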
Cross-directory plan discovery
resolve_plan_dir now walks both parent and child directories to locate plans by name, so megaplan commands work from anywhere in a project tree — not just the directory that contains .megaplan/.
- `megaplan list --tree`: list plans in the current subtree.
- `megaplan list --all`: system-wide plan discovery across the whole workspace.
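One way to picture the parent-then-child lookup is the sketch below. It is not the real `resolve_plan_dir`: the function name `find_plan_dir` and the `.megaplan/<name>` layout are assumptions made for illustration, and the actual search order or tie-breaking may differ.

```python
from pathlib import Path
from typing import Optional


def find_plan_dir(name: str, start: Path) -> Optional[Path]:
    """Look for <dir>/.megaplan/<name>, first upward from start, then downward.

    The .megaplan/<name> layout is an assumption, not megaplan's documented layout.
    """
    start = start.resolve()
    # Walk up: the starting directory itself, then each ancestor.
    for directory in (start, *start.parents):
        candidate = directory / ".megaplan" / name
        if candidate.is_dir():
            return candidate
    # Walk down: every .megaplan directory in the subtree, in a stable order.
    for meta_dir in sorted(start.rglob(".megaplan")):
        candidate = meta_dir / name
        if candidate.is_dir():
            return candidate
    return None
```

Searching upward first keeps the common case cheap (running from a subdirectory of the project), while the downward pass covers running from a workspace root that contains several projects.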
Standard robustness: init→plan transition
The standard robustness profile now picks up the init→plan transition that was previously only wired under heavier levels. Standard plans no longer stall after init waiting for a transition that never fires.
Tests
Added test_auto coverage for the driver loop, escalation policies, stall detection, and iteration caps.
v0.11.0 — Robust & Superrobust Levels, Phase Runtime, Codex Hardening
Robustness level rework
The heavy robustness level has been replaced by two new levels that give finer control over review parallelism:
- `robust`: 8 critique checks, parallel critique, monolithic (single-worker) review with pre-check flags and flag verification. This is the new top-tier level for most use cases.
- `superrobust`: everything in `robust`, plus parallel review (review checks split across concurrent subagents). Use when review thoroughness justifies the extra cost.
The default for megaplan loop is now robust.
| Level | Prep | Critique checks | Parallel critique | Gate | Review | Parallel review |
|---|---|---|---|---|---|---|
| tiny | stub | stub | — | stub | — | — |
| light | — | core | — | — | — | — |
| standard | ✓ | core (5) | — | ✓ | monolithic | — |
| robust | ✓ | all (8) | ✓ | ✓ | monolithic | — |
| superrobust | ✓ | all (8) | ✓ | ✓ | parallel | ✓ |
Phase runtime system
New centralized runtime policy in megaplan/_core/phase_runtime.py:
- Per-phase timeout caps: each phase has its own timeout policy instead of sharing the global worker timeout. Non-execute phases are capped at 300s by default.
- Stale step detection: `active_step_is_stale()` now uses phase-aware thresholds instead of a flat 300s.
- Duration hints: `next_step_runtime` in step responses tells orchestrators how long to wait before checking status.
- Phase notices: stderr messages at phase start showing expected duration.
- Every step response now includes `monitor_hint` and `next_step_runtime`.
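A minimal sketch of how a per-phase cap and a phase-aware staleness check could fit together. The function names match those mentioned above, but the signatures, the `grace_seconds` parameter, and the choice to derive the staleness threshold from the phase timeout are assumptions, not the contents of `phase_runtime.py`.

```python
import time
from typing import Optional

WORKER_TIMEOUT_SECONDS = 7200   # global worker timeout quoted elsewhere in these notes
NON_EXECUTE_CAP_SECONDS = 300   # default cap for non-execute phases


def phase_timeout_seconds(phase: str, worker_timeout: int = WORKER_TIMEOUT_SECONDS) -> int:
    """Execute keeps the full worker timeout; every other phase is capped."""
    if phase == "execute":
        return worker_timeout
    return min(worker_timeout, NON_EXECUTE_CAP_SECONDS)


def active_step_is_stale(
    phase: str,
    started_at: float,
    now: Optional[float] = None,
    grace_seconds: int = 60,  # hypothetical slack on top of the phase timeout
) -> bool:
    """A step is stale once it has outlived its phase's timeout plus some slack."""
    now = time.time() if now is None else now
    return (now - started_at) > phase_timeout_seconds(phase) + grace_seconds
```

With a flat 300s threshold, a long execute step would look stale while still healthy. Tying the threshold to the phase avoids that.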
Codex backend improvements
- Retry guidance: error messages now recommend retrying on Codex before switching agents, with step-aware wording.
- Phase timeouts: Codex worker uses `phase_timeout_seconds()` instead of hardcoded caps.
- Exec mode flags: critique and review steps now run with `--full-auto` in addition to execute.
- Codex subagent appendix: new `codex_subagent_appendix.md` for Codex-based orchestration.
Observability & locking
- Lock contention handling: `plan_lock_is_held()` for non-blocking lock probes. `error_response` for `plan_locked` errors now includes `active_step` details and `monitor_hint`.
- Phase observability: `build_phase_observability()` for structured runtime info in status responses.
- Status enrichment: CLI status/watch commands now surface richer phase timing and lock state.
Prompt & schema refinements
- Critique prompts now require at least one finding per check (no empty findings arrays).
- Revise prompt includes explicit `changes_summary` guidance and stricter JSON-only output instruction.
- Execute batch prompt includes `changes_summary` field guidance.
- Finalize now captures test baseline at finalize time (moved from execute).
- Verification task injection refined with baseline failure awareness.
Test coverage
541 tests passing. New test suites for workers, phase runtime, and expanded coverage for config, review robustness, and end-to-end workflow mocking.
v0.10.0 — Codex hardening, plan locking, observability
Codex backend hardening
The Codex (OpenAI) worker path got a major reliability overhaul, fixing a class of issues that caused silent failures, misclassified errors, and lost output when running with --agent codex or Hermes.
- Timeout recovery: when a Codex step times out, megaplan now recovers partial output from the output file and stdout. If valid structured output was produced before the timeout, the step succeeds instead of failing.
- Per-step timeout caps: non-execute steps are capped at 300s instead of inheriting the full 7200s worker timeout.
- Environment isolation: child Codex processes no longer inherit `CODEX_THREAD_ID` or `CODEX_CI`, preventing workers from attaching to the wrong session.
- Error classifier rewrite: connection-level failures are now detected before HTTP status codes, fixing false positives where thread IDs were misclassified as 429s. Bare numeric patterns now use word-boundary regex.
- JSON extraction rewrite: switched from greedy brace-matching to `JSONDecoder.raw_decode()`, correctly handling trailing logs after the JSON object.
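To illustrate why `raw_decode()` copes with trailing logs, here is a small sketch using only the standard library. The helper name `extract_first_json_object` and the scan-forward strategy are assumptions for illustration, not megaplan's extraction code.

```python
import json
from typing import Any, Optional


def extract_first_json_object(text: str) -> Optional[Any]:
    """Return the first decodable JSON object in text, ignoring anything around it."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and reports where
            # it ended, so trailing text after the object is never consumed.
            value, _end = decoder.raw_decode(text, pos)
            return value
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            pos = text.find("{", pos + 1)
    return None
```

Greedy brace-matching would span from the first `{` to the last `}` in the output, so a stray brace in a trailing log line is enough to break parsing. Decoding a single value from a known offset sidesteps that.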
Concurrency & observability
- Plan locking: all step handlers now acquire an `fcntl` file lock, preventing two processes from running steps on the same plan concurrently.
- Active step tracking: `state.json` now carries an `active_step` field set before the worker launches and cleared on completion or failure. Stale detection at 300s.
- `megaplan status`: now returns `active_step`, `last_step`, `total_cost_usd`, notes, and session summaries.
- `megaplan watch`: new command combining `status` + `progress` into a single response.
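The locking idea can be sketched with the standard `fcntl` module (Unix only). The helper names `plan_lock` and `lock_is_held`, the lock-file path, and the choice of `flock` over `lockf` are assumptions here; the notes only say that an `fcntl` file lock is used.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def plan_lock(lock_path: str):
    """Hold an exclusive lock on lock_path for the duration of the with-block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def lock_is_held(lock_path: str) -> bool:
    """Non-blocking probe: True if someone else currently holds the lock."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        # We got the lock, so nobody held it; release it again right away.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

A step handler would wrap its work in `with plan_lock(path):`, while a status command would call the probe so it can report contention instead of waiting.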
Tiny robustness level
New --robustness tiny stubs critique and gate entirely, going straight from plan → gated → finalize. Useful for trivial tasks where the full critique loop is overhead.
Parallel review for heavy mode
Heavy robustness now runs review checks in parallel, splitting mechanical checks, sense checks, and task verification across concurrent workers.
OpenAI strict-mode schema compatibility
- Recursive `required` reconciliation for structured outputs.
- `flag_id` and `source` in review rework items changed from optional to required nullable.
- Gate `flag_resolutions` entries now require both `evidence` and `rationale` fields.
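As a rough illustration of "optional to required nullable", the sketch below walks a JSON Schema, lists every property under `required`, and widens the previously optional ones to accept `null`. The function names and the handling of `type` are assumptions for illustration. Megaplan's actual reconciliation may cover more schema shapes.

```python
import copy
from typing import Any


def reconcile_required(schema: dict) -> dict:
    """Return a copy of schema in which every object lists all its properties as required."""
    result = copy.deepcopy(schema)
    _walk(result)
    return result


def _walk(node: Any) -> None:
    if isinstance(node, dict):
        props = node.get("properties")
        if node.get("type") == "object" and isinstance(props, dict):
            already_required = set(node.get("required", []))
            for key, sub in props.items():
                # Previously optional fields stay expressible by allowing null.
                if key not in already_required and isinstance(sub, dict):
                    _make_nullable(sub)
            node["required"] = list(props)
        for value in node.values():
            _walk(value)
    elif isinstance(node, list):
        for item in node:
            _walk(item)


def _make_nullable(sub: dict) -> None:
    declared = sub.get("type")
    if isinstance(declared, str) and declared != "null":
        sub["type"] = [declared, "null"]
    elif isinstance(declared, list) and "null" not in declared:
        sub["type"] = declared + ["null"]
```

The recursion matters because nested objects (for example items inside an array) need the same treatment as the top-level object.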
Prompt improvements
- Nested harness guard: all worker prompts prevent recursive `megaplan` invocation.
- Plan focus guidance: planning prompt stops the model from over-exploring.
- Standard robustness now includes prep.
Other
- License changed to OSNL 0.2.
- New test suites for parallel review, review checks, tiny robustness, config, and more.
v0.9.0 — Subagent Orchestration Mode
This release adds an autonomous subagent orchestration mode for Claude Code, consolidates all operational settings into one place, and introduces live observability for running plans.
✨ Highlights
Subagent orchestration mode (default for Claude Code)
Megaplan now runs the entire workflow inside a single Claude Code Agent by default, returning control to the main conversation only at defined breakpoints. Auto-approve runs are fully hands-off.
- Inline mode is still available via `megaplan config set orchestration.mode inline`
- Codex and Cursor continue to run inline (subagent mode is Claude Code-specific)
Breakpoint protocol
The subagent orchestrator returns at four well-defined points:
| Type | Triggered when |
|---|---|
| `GATE_ESCALATE` | Gate worker recommends ESCALATE |
| `EXECUTE_APPROVAL` | Review-mode runs awaiting approval after finalize |
| `PHASE_ESCALATE` | A phase failed even after the required `--fresh` retry |
| `EXECUTE_ESCALATE` | Execute hit the no-progress cap |
The outer skill relays the breakpoint to the user, collects the answer, and resumes via SendMessage.
Note injection for mid-run guidance
Inject context into a running subagent without killing it:
megaplan override add-note --plan <name> --note "..."

The subagent picks up notes at the next phase boundary (visible via `megaplan status`). For mid-phase interruption, kill the agent, inject a note, and relaunch: the plan state on disk is the source of truth.
megaplan watch command
megaplan watch --plan <name>

Shows current state, last completed phase, pending notes, and execution progress in a human-readable format.
Settable defaults — single source of truth
All operational defaults are now consolidated in megaplan/types.py and surfaced via megaplan config show. Every numeric setting can be overridden:
megaplan config show # see everything
megaplan config set execution.worker_timeout_seconds 3600 # override timeout
megaplan config set orchestration.max_critique_concurrency 4 # override concurrency
megaplan config set agents.critique hermes # override routing
megaplan config reset # back to defaults

| Key | Default |
|---|---|
| `orchestration.mode` | subagent |
| `orchestration.max_critique_concurrency` | 2 |
| `execution.worker_timeout_seconds` | 7200 |
| `execution.max_execute_no_progress` | 3 |
| `execution.max_review_rework_cycles` | 3 |
| `agents.<step>` | varies |
A new get_effective(section, key) helper checks user config first and falls back to the constants, so consuming code stays in sync automatically.
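The fallback behaviour can be pictured with the sketch below. The `DEFAULTS` mapping reuses the values from the table above, but its shape and the extra `user_config` parameter are assumptions made to keep the example self-contained. The real helper takes only `(section, key)` and presumably loads the user config itself.

```python
from typing import Any, Mapping, Optional

# Defaults copied from the table above; the nested-dict shape is an assumption.
DEFAULTS = {
    "orchestration": {"mode": "subagent", "max_critique_concurrency": 2},
    "execution": {
        "worker_timeout_seconds": 7200,
        "max_execute_no_progress": 3,
        "max_review_rework_cycles": 3,
    },
}


def get_effective(section: str, key: str, user_config: Optional[Mapping] = None) -> Any:
    """Return the user's override if one is set, otherwise the built-in default."""
    overrides = (user_config or {}).get(section, {})
    if key in overrides:
        return overrides[key]
    return DEFAULTS[section][key]
```

For example, after `megaplan config set execution.worker_timeout_seconds 3600`, a lookup of that key would yield 3600, while untouched keys keep returning their defaults.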
Default robustness changed: light → standard
The triage default in the skill prompt is now standard rather than light. Reasoning: cross-cutting changes are the common case for megaplan; light is the explicit opt-in for simple cases.
🔧 Internal changes
- New `claude_subagent_appendix.md` (269 lines, 11 sections), the orchestrator agent prompt template
- Unified Claude skill install path concatenates `instructions.md` + appendix
- Subagent appendix templatized so retry caps come from `types.py` constants
- New `tests/test_config.py` for `get_effective()` and settable-defaults coverage
- Documentation overhaul in README.md and bundled `instructions.md`
📊 Test coverage
175 tests passing across the megaplan test suite.
🙏 Thanks
Built across a single session using megaplan in subagent mode to plan its own subagent mode.