Releases: peteromallet/megaplan

v0.12.0 — Auto Driver & Cross-Directory Plan Discovery

15 Apr 02:25

Auto driver

The new megaplan auto --plan <name> command drives a plan from its current state to a terminal outcome without human intervention. The driver is intentionally dumb: it reads status, runs next_step, and loops. All real judgment stays in the phase logic.

  • Gate escalation policy — ESCALATE defaults to force-proceed. Opt out with --on-escalate abort or --on-escalate fail.
  • Stall detection — bails after N consecutive iterations in the same state (default 5).
  • Iteration cap — hard stop at 200 iterations by default to bound runaway loops.
  • Structured exit codes — so shell callers and CI can branch on terminal state without parsing output:
    Exit code  Meaning
    0          done
    1          failed
    2          stalled
    3          escalated (under --on-escalate fail)
    4          iteration cap hit
  • Emits a JSON outcome with the final state snapshot and event log on exit.
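
The loop described above can be sketched roughly as follows. This is a minimal illustration, not megaplan's actual driver: the state names ("done", "failed", "escalated"), the read_state and run_next_step callables, and the drive function itself are all hypothetical stand-ins. Only the exit codes and the default limits (5 and 200) come from the notes above, and only the fail escalation policy is modelled.

```python
from enum import IntEnum
from typing import Callable, Optional


class ExitCode(IntEnum):
    """Exit codes as listed in the release notes."""
    DONE = 0
    FAILED = 1
    STALLED = 2
    ESCALATED = 3
    ITERATION_CAP = 4


# Hypothetical terminal state names; the real state machine may differ.
TERMINAL = {"done": ExitCode.DONE, "failed": ExitCode.FAILED}


def drive(
    read_state: Callable[[], str],
    run_next_step: Callable[[], None],
    on_escalate: str = "force-proceed",
    stall_limit: int = 5,
    max_iterations: int = 200,
) -> ExitCode:
    """Read status, run the next step, and loop until a terminal outcome."""
    last_state: Optional[str] = None
    repeats = 0
    for _ in range(max_iterations):
        state = read_state()
        if state in TERMINAL:
            return TERMINAL[state]
        if state == "escalated" and on_escalate == "fail":
            return ExitCode.ESCALATED
        # Assumed reading of "N consecutive iterations in the same state":
        # count how many times in a row the same state has been observed.
        repeats = repeats + 1 if state == last_state else 1
        if repeats >= stall_limit:
            return ExitCode.STALLED
        last_state = state
        run_next_step()
    return ExitCode.ITERATION_CAP
```

A caller would typically pass the returned value to sys.exit() so that shell scripts and CI jobs can branch on it.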

Cross-directory plan discovery

resolve_plan_dir now walks both parent and child directories to locate plans by name, so megaplan commands work from anywhere in a project tree — not just the directory that contains .megaplan/.

  • megaplan list --tree — list plans in the current subtree.
  • megaplan list --all — system-wide plan discovery across the whole workspace.
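
As a rough illustration of the parent-then-child walk, the sketch below looks for a plan by name starting from an arbitrary directory. It assumes a .megaplan/plans/<name> layout, which is a guess made for the example; find_plan_dir is a hypothetical helper and not the real resolve_plan_dir.

```python
from pathlib import Path
from typing import Optional


def find_plan_dir(name: str, start: Path) -> Optional[Path]:
    """Locate a plan directory by name, searching parents first, then children.

    Assumes plans live under <dir>/.megaplan/plans/<name>; that layout is an
    assumption made for this sketch.
    """
    start = start.resolve()

    # Walk upward: the start directory itself, then each ancestor.
    for directory in (start, *start.parents):
        candidate = directory / ".megaplan" / "plans" / name
        if candidate.is_dir():
            return candidate

    # Walk downward: every .megaplan directory in the subtree, in sorted order
    # so the result is deterministic when several subprojects exist.
    for marker in sorted(start.rglob(".megaplan")):
        candidate = marker / "plans" / name
        if candidate.is_dir():
            return candidate

    return None
```

Checking parents before children keeps the common case (running from a subdirectory of the project) cheap, since the downward scan has to visit the whole subtree.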

Standard robustness: init→plan transition

The standard robustness profile now picks up the init→plan transition that was previously only wired under heavier levels. Standard plans no longer stall after init waiting for a transition that never fires.

Tests

Added test_auto coverage for the driver loop, escalation policies, stall detection, and iteration caps.

v0.11.0 — Robust & Superrobust Levels, Phase Runtime, Codex Hardening

11 Apr 02:45

Robustness level rework

The heavy robustness level has been replaced by two new levels that give finer control over review parallelism:

  • robust — 8 critique checks, parallel critique, monolithic (single-worker) review with pre-check flags and flag verification. This is the new top-tier level for most use cases.
  • superrobust — everything in robust, plus parallel review (review checks split across concurrent subagents). Use when review thoroughness justifies the extra cost.

The default for megaplan loop is now robust.

Level Prep Critique checks Parallel critique Gate Review Parallel review
tiny stub stub stub
light core
standard core (5) monolithic
robust all (8) monolithic
superrobust all (8) parallel

Phase runtime system

New centralized runtime policy in megaplan/_core/phase_runtime.py:

  • Per-phase timeout caps: each phase has its own timeout policy instead of sharing the global worker timeout. Non-execute phases are capped at 300s by default.
  • Stale step detection: active_step_is_stale() now uses phase-aware thresholds instead of a flat 300s.
  • Duration hints: next_step_runtime in step responses tells orchestrators how long to wait before checking status.
  • Phase notices: stderr messages at phase start showing expected duration.
  • Every step response now includes monitor_hint and next_step_runtime.
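
To make the per-phase policy concrete, here is a minimal sketch of how a timeout cap and a phase-aware staleness check could fit together. The function names match those mentioned above, but the signatures, the constants' names, and the choice to reuse the phase timeout as the staleness threshold are assumptions made for illustration only.

```python
import time
from typing import Optional

# Values taken from the release notes; the constant names are hypothetical.
WORKER_TIMEOUT_SECONDS = 7200   # global worker timeout default
NON_EXECUTE_CAP_SECONDS = 300   # default cap for non-execute phases


def phase_timeout_seconds(phase: str, worker_timeout: int = WORKER_TIMEOUT_SECONDS) -> int:
    """Return the timeout for a phase: execute keeps the full worker timeout,
    every other phase is capped."""
    if phase == "execute":
        return worker_timeout
    return min(worker_timeout, NON_EXECUTE_CAP_SECONDS)


def active_step_is_stale(phase: str, started_at: float, now: Optional[float] = None) -> bool:
    """Treat a step as stale once it has run longer than its phase timeout.

    Using the phase timeout as the threshold is an assumption of this sketch.
    """
    current = time.time() if now is None else now
    return (current - started_at) > phase_timeout_seconds(phase)
```

With a flat 300s threshold, a long execute phase would be flagged as stale after five minutes; making the threshold depend on the phase avoids that false positive.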

Codex backend improvements

  • Retry guidance: error messages now recommend retrying on Codex before switching agents, with step-aware wording.
  • Phase timeouts: Codex worker uses phase_timeout_seconds() instead of hardcoded caps.
  • Exec mode flags: critique and review steps now run with --full-auto in addition to execute.
  • Codex subagent appendix: new codex_subagent_appendix.md for Codex-based orchestration.

Observability & locking

  • Lock contention handling: plan_lock_is_held() for non-blocking lock probes. error_response for plan_locked errors now includes active_step details and monitor_hint.
  • Phase observability: build_phase_observability() for structured runtime info in status responses.
  • Status enrichment: CLI status/watch commands now surface richer phase timing and lock state.

Prompt & schema refinements

  • Critique prompts now require at least one finding per check (no empty findings arrays).
  • Revise prompt includes explicit changes_summary guidance and stricter JSON-only output instruction.
  • Execute batch prompt includes changes_summary field guidance.
  • Finalize now captures test baseline at finalize time (moved from execute).
  • Verification task injection refined with baseline failure awareness.

Test coverage

541 tests passing. New test suites for workers, phase runtime, and expanded coverage for config, review robustness, and end-to-end workflow mocking.

v0.10.0 — Codex hardening, plan locking, observability

09 Apr 22:20

Codex backend hardening

The Codex (OpenAI) worker path got a major reliability overhaul, fixing a class of issues that caused silent failures, misclassified errors, and lost output when running with --agent codex or Hermes.

  • Timeout recovery: when a Codex step times out, megaplan now recovers partial output from the output file and stdout. If valid structured output was produced before the timeout, the step succeeds instead of failing.
  • Per-step timeout caps: non-execute steps are capped at 300s instead of inheriting the full 7200s worker timeout.
  • Environment isolation: child Codex processes no longer inherit CODEX_THREAD_ID or CODEX_CI, preventing workers from attaching to the wrong session.
  • Error classifier rewrite: connection-level failures are now detected before HTTP status codes, fixing false positives where thread IDs were misclassified as 429s. Bare numeric patterns now use word-boundary regex.
  • JSON extraction rewrite: switched from greedy brace-matching to JSONDecoder.raw_decode(), correctly handling trailing logs after the JSON object.
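
The JSON extraction change can be illustrated with the standard library's json.JSONDecoder.raw_decode(), which parses one JSON value starting at a given index and reports where it ended, so trailing log output is simply ignored. The helper below is a sketch of the technique under that assumption; extract_first_json_object is a hypothetical name, not megaplan's function.

```python
import json
from typing import Optional


def extract_first_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object embedded in text, or None.

    Tries raw_decode() at each '{' in turn, so stray braces in log lines
    before the real payload are skipped and anything after it is ignored.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            value, _end = decoder.raw_decode(text, pos)
            return value
        except json.JSONDecodeError:
            # Not valid JSON from this brace; move on to the next candidate.
            pos = text.find("{", pos + 1)
    return None
```

Greedy brace matching would pair the first "{" with the last "}" in the text and fail whenever a log line after the payload contains a brace; raw_decode() stops at the end of the first complete value instead.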

Concurrency & observability

  • Plan locking: all step handlers now acquire an fcntl file lock, preventing two processes from running steps on the same plan concurrently.
  • Active step tracking: state.json now carries an active_step field set before the worker launches and cleared on completion or failure. An active step is treated as stale after 300s.
  • megaplan status: now returns active_step, last_step, total_cost_usd, notes, and session summaries.
  • megaplan watch: new command combining status + progress into a single response.
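
A per-plan file lock of this kind might look like the sketch below. It uses fcntl.flock() in non-blocking mode, which is one plausible way to take an fcntl-based lock on Unix; the notes do not say which call megaplan uses. The lock file path, plan_lock, and PlanLockedError are hypothetical names for the example.

```python
import fcntl
import os
from contextlib import contextmanager
from typing import Iterator


class PlanLockedError(RuntimeError):
    """Raised when another holder already has the plan lock (hypothetical)."""


@contextmanager
def plan_lock(lock_path: str) -> Iterator[None]:
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        raise PlanLockedError(f"plan is locked: {lock_path}")
    try:
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A step handler would wrap its work in with plan_lock(path): so that a second process attempting the same plan gets an immediate error rather than silently racing the first.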

Tiny robustness level

New --robustness tiny stubs critique and gate entirely, going straight from plan→gated→finalize. Useful for trivial tasks where the full critique loop is overhead.

Parallel review for heavy mode

Heavy robustness now runs review checks in parallel, splitting mechanical checks, sense checks, and task verification across concurrent workers.

OpenAI strict-mode schema compatibility

  • Recursive required reconciliation for structured outputs.
  • flag_id and source in review rework items changed from optional to required nullable.
  • Gate flag_resolutions entries now require both evidence and rationale fields.
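
For context, strict structured-output modes generally expect every property of an object to be listed in required, with optional fields expressed as nullable instead. The sketch below shows one way a recursive reconciliation pass could enforce that on a JSON Schema; it is an illustration of the idea under that assumption, not megaplan's implementation, and both function names are hypothetical.

```python
from typing import Any, Dict


def make_nullable(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of schema whose type also admits null."""
    declared = schema.get("type")
    if isinstance(declared, str) and declared != "null":
        return {**schema, "type": [declared, "null"]}
    if isinstance(declared, list) and "null" not in declared:
        return {**schema, "type": [*declared, "null"]}
    return schema


def reconcile_required(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively list every property as required, making formerly
    optional properties nullable. The input schema is not modified."""
    result = dict(schema)
    properties = result.get("properties")
    if isinstance(properties, dict):
        already_required = set(result.get("required", []))
        reconciled = {}
        for name, sub_schema in properties.items():
            sub_schema = reconcile_required(sub_schema)
            if name not in already_required:
                sub_schema = make_nullable(sub_schema)
            reconciled[name] = sub_schema
        result["properties"] = reconciled
        result["required"] = list(reconciled)
        result["additionalProperties"] = False
    # Descend into array item schemas as well.
    if isinstance(result.get("items"), dict):
        result["items"] = reconcile_required(result["items"])
    return result
```

This mirrors the "required nullable" change to flag_id and source above: the field must always be present, but null remains a valid value.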

Prompt improvements

  • Nested harness guard: all worker prompts prevent recursive megaplan invocation.
  • Plan focus guidance: planning prompt stops the model from over-exploring.
  • Standard robustness now includes prep.

Other

  • License changed to OSNL 0.2.
  • New test suites for parallel review, review checks, tiny robustness, config, and more.

v0.9.0 — Subagent Orchestration Mode

07 Apr 13:31

This release adds an autonomous subagent orchestration mode for Claude Code, consolidates all operational settings into one place, and introduces live observability for running plans.

✨ Highlights

Subagent orchestration mode (default for Claude Code)

Megaplan now runs the entire workflow inside a single Claude Code Agent by default, returning control to the main conversation only at defined breakpoints. Auto-approve runs are fully hands-off.

  • Inline mode is still available via megaplan config set orchestration.mode inline
  • Codex and Cursor continue to run inline (subagent mode is Claude Code-specific)

Breakpoint protocol

The subagent orchestrator returns at four well-defined points:

Type              Triggered when
GATE_ESCALATE     Gate worker recommends ESCALATE
EXECUTE_APPROVAL  Review-mode runs awaiting approval after finalize
PHASE_ESCALATE    A phase failed even after the required --fresh retry
EXECUTE_ESCALATE  Execute hit the no-progress cap

The outer skill relays the breakpoint to the user, collects the answer, and resumes via SendMessage.

Note injection for mid-run guidance

Inject context into a running subagent without killing it:

megaplan override add-note --plan <name> --note "..."

The subagent picks up notes at the next phase boundary (visible via megaplan status). For mid-phase interruption, kill the agent, inject a note, and relaunch — the plan state on disk is the source of truth.

megaplan watch command

megaplan watch --plan <name>

Shows current state, last completed phase, pending notes, and execution progress in a human-readable format.

Settable defaults — single source of truth

All operational defaults are now consolidated in megaplan/types.py and surfaced via megaplan config show. Every numeric setting can be overridden:

megaplan config show                                          # see everything
megaplan config set execution.worker_timeout_seconds 3600    # override timeout
megaplan config set orchestration.max_critique_concurrency 4 # override concurrency
megaplan config set agents.critique hermes                   # override routing
megaplan config reset                                         # back to defaults

Key                                     Default
orchestration.mode                      subagent
orchestration.max_critique_concurrency  2
execution.worker_timeout_seconds        7200
execution.max_execute_no_progress       3
execution.max_review_rework_cycles      3
agents.<step>                           varies

A new get_effective(section, key) helper checks user config first and falls back to the constants, so consuming code stays in sync automatically.
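
The lookup order can be sketched as below. The defaults are copied from the table above; the DEFAULTS dictionary and the user_config parameter are assumptions for the example (the real helper takes only section and key and presumably reads the user's config itself).

```python
from typing import Any, Dict, Optional

# Defaults copied from the release notes; the real constants live in megaplan/types.py.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "orchestration": {"mode": "subagent", "max_critique_concurrency": 2},
    "execution": {
        "worker_timeout_seconds": 7200,
        "max_execute_no_progress": 3,
        "max_review_rework_cycles": 3,
    },
}


def get_effective(
    section: str,
    key: str,
    user_config: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Any:
    """Return the user's override if one is set, otherwise the built-in default.

    user_config is a hypothetical parameter standing in for the loaded config file.
    """
    user_section = (user_config or {}).get(section, {})
    if key in user_section:
        return user_section[key]
    return DEFAULTS[section][key]
```

Because every consumer goes through the same helper, changing a default in one place or overriding it via megaplan config set takes effect everywhere without touching call sites.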

Default robustness changed: light → standard

The triage default in the skill prompt is now standard rather than light. Reasoning: cross-cutting changes are the common case for megaplan; light is the explicit opt-in for simple cases.

🔧 Internal changes

  • New claude_subagent_appendix.md (269 lines, 11 sections) — the orchestrator agent prompt template
  • Unified Claude skill install path concatenates instructions.md + appendix
  • Subagent appendix templatized so retry caps come from types.py constants
  • New tests/test_config.py for get_effective() and settable-defaults coverage
  • Documentation overhaul in README.md and bundled instructions.md

📊 Test coverage

175 tests passing across the megaplan test suite.

🙏 Thanks

Built across a single session using megaplan in subagent mode to plan its own subagent mode.