CLI harness, conflict gates, swarm branches, and V4 wiring #108

Merged
cursor[bot] merged 7 commits into main from cursor/full-cli-verification-c260 on May 26, 2026

kartikkabadi (Owner) commented May 26, 2026

Summary

Consolidates verification harness work, conflict approval blocking (formerly PR #107), and incremental V4 fixes.

Harness & CI

  • scripts/full-cli-harness.sh: exercises all dispatch commands; optionally runs a live plan with checkpoint resume
  • scripts/cli-harness.sh: deterministic, pattern-based tmux waits (CI-safe because doctor is not required to exit 0)
  • CI runs both harnesses after demos

#37 Conflict pipeline (DONE)

  • Block foundry approve when open conflicts exist
  • Plan completion sets blocked_actions when conflicts remain
  • Emit conflict_raised on swarm disagreement
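
To illustrate the approval gate listed above, here is a minimal TypeScript sketch. It assumes a simplified in-memory conflict list instead of the real run-directory lookup; the `Conflict` shape, the error message, and the `approveRun` signature shown here are illustrative assumptions, not the code merged in this PR.

```typescript
// Only the error code named in this PR is modeled; the real union has other members.
type RunStateErrorCode = "BLOCKED";

class RunStateError extends Error {
  code: RunStateErrorCode;
  constructor(code: RunStateErrorCode, message: string) {
    super(message);
    this.name = "RunStateError";
    this.code = code;
  }
}

// Hypothetical conflict record; the real on-disk format may differ.
interface Conflict {
  id: string;
  status: "open" | "resolved";
}

// Stand-in for listOpenConflicts(runDir): conflicts are passed in directly.
function openConflicts(conflicts: Conflict[]): Conflict[] {
  return conflicts.filter((c) => c.status === "open");
}

// Refuse to approve while any conflict is open, and name the IDs in the error.
function approveRun(conflicts: Conflict[]): "approved" {
  const open = openConflicts(conflicts);
  if (open.length > 0) {
    const ids = open.map((c) => c.id).join(", ");
    throw new RunStateError("BLOCKED", `Cannot approve: open conflicts (${ids})`);
  }
  return "approved";
}
```

In this sketch the gate runs before any state change, so a blocked approval leaves the run untouched and the caller can surface the conflict IDs from the error message.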

#32 Swarm (progress)

  • foundry plan --swarm-branches N (2–5), parallel research fanout
  • Provenance + per-branch artifacts (existing)
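
As a rough sketch of how a `--swarm-branches N` flag with a 2 to 5 range could be parsed, the TypeScript below may help. The field names `swarmBranches` and `swarmResearch` follow the review walkthrough in this PR; the `idea` field, the error text, and the parsing loop are assumptions made for illustration and may not match packages/cli/src/commands/plan.ts.

```typescript
interface ParsedPlanArgs {
  idea: string;
  swarmResearch: boolean;
  swarmBranches?: number;
}

function parsePlanArgs(argv: string[]): ParsedPlanArgs {
  const rest = argv.slice(); // work on a copy so the caller's array is untouched
  const positional: string[] = [];
  let swarmBranches: number | undefined;

  while (rest.length > 0) {
    const arg = rest.shift() as string;
    if (arg === "--swarm-branches") {
      const raw = rest.shift(); // undefined when the flag is the last token
      const n = Number(raw);
      // Reject missing values, non-integers (including NaN), and out-of-range counts.
      if (raw === undefined || Math.floor(n) !== n || n < 2 || n > 5) {
        throw new Error("--swarm-branches expects an integer from 2 to 5");
      }
      swarmBranches = n;
    } else {
      positional.push(arg);
    }
  }

  // Supplying the flag is what switches swarm research on in this sketch.
  const result: ParsedPlanArgs = {
    idea: positional.join(" "),
    swarmResearch: swarmBranches !== undefined,
  };
  if (swarmBranches !== undefined) {
    result.swarmBranches = swarmBranches;
  }
  return result;
}
```

With this sketch, `parsePlanArgs(["--swarm-branches", "3", "idea"])` yields `swarmBranches` of 3 with `swarmResearch` set to true, while a value such as 6 is rejected before any plan work starts.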

UX

Verification

  • npm test: 234/234 pass
  • bash scripts/full-cli-harness.sh (live plan when CURSOR_API_KEY set)
  • bash scripts/cli-harness.sh

Supersedes draft PRs #106 and #107.

Summary by cubic

Adds an exhaustive CLI harness with optional live Composer verification, and enforces a conflict-aware approval gate that blocks approve until conflicts are resolved. Also adds plan --swarm-branches for parallel research and expands CLI help.

  • New Features

    • scripts/full-cli-harness.sh: runs all dispatch commands end-to-end; optional live plan with resume; secrets leak check; CI runs this and scripts/cli-harness.sh; docs updated with verification steps.
    • Approval gate: blocks approve when open conflicts exist; status/banner show next vs blocked actions; approve exits with a helpful error.
    • foundry plan --swarm-branches N (2–5): enables parallel swarm research; emits conflict_raised events on disagreement; help now lists all commands and flags (tui, daemon, notify, update).
  • Bug Fixes

    • Stabilized tmux waits in scripts/cli-harness.sh using pattern-based checks to avoid the doctor timing race; CI harness tolerates missing CURSOR_API_KEY and sets FOUNDRY_BUILD_MOCK=1 for the mock build.

Written for commit 050150e.

Summary by CodeRabbit

  • New Features

    • Added --swarm-branches to set the number of parallel swarm research branches for plan (2–5).
    • CLI help now shows expanded command flags and usage.
    • Approval is blocked when open conflicts exist; conflict IDs are surfaced.
  • Documentation

    • Updated agent and verification docs with new CLI harness runs and latest verification dates.
  • Tests

    • Added deterministic CLI harnesses for end-to-end checks.
    • New/extended tests for plan args, orchestration swarm behavior, and conflict gating.

cursoragent and others added 3 commits May 25, 2026 15:22
Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
- scripts/full-cli-harness.sh exercises all dispatch commands with git init
  for mock build, team-pack init smoke, and optional live plan when
  CURSOR_API_KEY is set (resume loop through agent-pass checkpoints).
- scripts/cli-harness.sh waits for doctor completion before asserting output.
- Wire both harnesses into CI after demo scripts.
- Update docs/agents/README.md verification commands and counts.

Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>

coderabbitai Bot commented May 26, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6e6e7efd-43fe-4181-b3e1-29c274a8e19f

📥 Commits

Reviewing files that changed from the base of the PR and between 639a5ed and 050150e.

📒 Files selected for processing (1)
  • scripts/cli-harness.sh

Walkthrough

This PR wires a new --swarm-branches flag into the CLI and plan execution to enable parallel swarm research, and blocks approvals with error code BLOCKED while open conflicts exist. Orchestration now emits conflict events and computes conflict-aware approval actions. The PR also adds a deterministic tmux harness and a full end-to-end CLI harness, runs both in CI, and updates docs, tests, and verification records.

Changes

Swarm research and approval conflict gating

| Layer | File(s) | Summary |
| --- | --- | --- |
| State definitions for conflict-aware planning | packages/core/src/state/project-init.ts, packages/cli/src/commands/plan.ts | RunStateErrorCode adds 'BLOCKED' member; ParsedPlanArgs adds optional swarmBranches?: number. |
| CLI plan command with swarm branches support | packages/cli/src/commands/plan.ts, tests/plan-args.test.ts | parsePlanArgs adds --swarm-branches flag parsing (validated 2–5 range), enables swarm research when supplied, and threads value through return paths; executePlan receives swarmBranches in options. New test validates flag parsing and swarm research enabling. |
| CLI help documentation | packages/cli/src/commands/help.ts, tests/cli.test.ts | printHelp expands command descriptions to include flags/arguments; --help test verifies presence of new command names (tui, daemon, notify, update). |
| Approval gating with conflict checks | packages/core/src/state/run-store.ts, tests/approve.test.ts | approveRun queries open conflicts via listOpenConflicts(target.runDir) and throws RunStateError with code BLOCKED before proceeding. Tests validate conflict blocking at run-store and CLI levels, with stderr matching on conflict IDs. |
| Plan orchestration with swarm research and conflict handling | packages/planner/src/plan/orchestrate.ts, tests/plan.test.ts | New approvalGateActions(runDir) helper computes next_actions/blocked_actions based on presence of open conflicts; swarm research now runs with parallel: true; conflict_raised events emitted on swarm disagreement; awaiting-approval status and completion banner use conflict-aware routing and conditionally display conflict IDs. Test orchestration verifies swarm disagreement detection, conflict recording, and approval gating. |
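
The conflict-aware routing summarized above can be sketched in TypeScript as follows. This is a hedged illustration only: the helper in this PR takes `runDir` and looks up open conflicts itself, whereas this version takes the open conflict IDs directly. The `resolve` and `approve` action names for the blocked case come from the sequence diagram in this review; offering `approve` as the next action when no conflicts are open is an assumption.

```typescript
interface GateActions {
  next_actions: string[];
  blocked_actions: string[];
}

// Simplified stand-in for approvalGateActions(runDir): the caller supplies
// the IDs of conflicts that are still open.
function approvalGateActions(openConflictIds: string[]): GateActions {
  if (openConflictIds.length > 0) {
    // Open conflicts: steer the user to resolve them and mark approve as blocked.
    return { next_actions: ["resolve"], blocked_actions: ["approve"] };
  }
  // No open conflicts: assumed default of offering approve with nothing blocked.
  return { next_actions: ["approve"], blocked_actions: [] };
}
```

Keeping this as a pure function of the open conflict IDs means the status output and the completion banner can both call it and stay consistent with each other.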

CLI testing harness infrastructure

| Layer | File(s) | Summary |
| --- | --- | --- |
| Improved tmux-based CLI harness | scripts/cli-harness.sh | Adds wait_for polling helper that repeatedly captures tmux pane output until a regex pattern matches (with configurable timeout). Harness flow interleaves wait_for waits after each CLI command to ensure deterministic synchronization; replaces previous capture pipeline with larger scrollback capture. |
| Comprehensive end-to-end CLI harness | scripts/full-cli-harness.sh | New non-interactive harness exercises complete Foundry workflow: project build and temporary workspace setup; tier-A checks (--version, --help); initialization variants including --team; multi-mode doctor checks (plain, JSON, deep); fixture plan smoke test with run artifact validation; status/pause/approve/publish flow; dry-run and real builds with FOUNDRY_BUILD_MOCK=1; post-v1 commands (tui, daemon, notify --dry-run, update --dry-run); optional live-plan loop with status polling and resume when CURSOR_API_KEY is set; secrets-leak detection gating. |
| CI workflow integration | .github/workflows/ci.yml | Adds steps to run tmux-based harness and full CLI harness in the verify job for continuous validation. |
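
The wait_for helper described above lives in a shell script, but the polling idea can be sketched in TypeScript to keep examples in one language. In this hedged sketch, `capture` stands in for reading the tmux pane text and `sleep` for the delay between polls; both are injected, hypothetical parameters. It also bounds the wait by attempt count rather than by a wall-clock timeout, which is a simplification of the configurable timeout mentioned in the summary.

```typescript
// Poll until the captured text matches the pattern or the attempts run out.
// Pass a non-global RegExp: a /g pattern keeps state between test() calls.
function waitFor(
  capture: () => string,
  pattern: RegExp,
  maxAttempts: number,
  sleep: () => void
): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (pattern.test(capture())) {
      return true; // the expected output appeared
    }
    sleep(); // pause before capturing again
  }
  return false; // gave up; the caller decides how to report the failure
}
```

Waiting on output patterns instead of fixed delays is what makes this kind of harness deterministic: a slow command only lengthens the wait, and a missing pattern fails at a known bound.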

Verification state and agent documentation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Verification state and agent guidance | AGENTS.md, docs/agents/README.md, docs/planning/VERIFIED_STATE.md | AGENTS.md clarifies Cloud agent pi-cli stub behavior; agent README.md documents harness verification commands and updates last-verified timestamp with harness success; VERIFIED_STATE.md records new verification run and updates issue #32 row to cite --swarm-branches CLI flag wiring and tests/plan-args.test.ts coverage. |

Sequence Diagram

sequenceDiagram
  participant Reviewer
  participant CLI as CLI (plan.ts)
  participant Parser as parsePlanArgs
  participant Orchestration as Plan Orchestration
  participant Swarm as Swarm Research
  participant Conflicts as Conflict Query
  participant Approval as Approval Gate

  Reviewer->>CLI: foundry plan --swarm-branches 3 "idea"
  CLI->>Parser: parsePlanArgs(argv)
  Parser->>Parser: Parse --swarm-branches 3
  Parser-->>CLI: Return {swarmBranches: 3, swarmResearch: true, idea: "idea"}
  CLI->>Orchestration: executePlan({swarmBranches: 3, swarmResearch: true, ...})
  Orchestration->>Swarm: runResearchSwarm(parallel: true)
  Swarm-->>Orchestration: Returns differing research outcomes
  Orchestration->>Orchestration: Emit conflict_raised event
  Orchestration->>Conflicts: listOpenConflicts(runDir)
  Conflicts-->>Approval: Returns [swarm-disagreement conflict ID]
  Approval->>Approval: Compute blocked_actions=[approve], next_actions=[resolve]
  Approval-->>Orchestration: Return gated actions
  Orchestration-->>Reviewer: Status awaiting_approval with conflict info

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

hackathon-demo

"I nibbled on flags and hops of code today,
A swarm of branches racing down the way.
When disagreements bloom, I thump and pause,
Approvals wait until we fix the cause.
Harnesses hum — OK! — the tests all say. 🐇"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly relates to the main changes: CLI harness additions (scripts/cli-harness.sh and full-cli-harness.sh), conflict gating (approveRun blocking on open conflicts), swarm branches feature (--swarm-branches flag), and V4 wiring updates (help text, command coverage). It accurately summarizes the primary changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


cursoragent and others added 3 commits May 26, 2026 01:09
Doctor harness steps allow exit 1 when CURSOR_API_KEY is unset (GitHub CI).
Export FOUNDRY_BUILD_MOCK=1 for the entire mock build section.

Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
Integrate conflict approval gate, plan orchestrate blocked_actions, and
deterministic cli-harness waits. Resolve harness merge (doctor wait without
requiring exit 0 for CI).

Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
- Merge conflict approval gate from PR #107 (already integrated).
- Expand help.ts for all dispatch commands and flags.
- Add plan --swarm-branches (2-5) with parallel swarm research.
- Emit conflict_raised events when swarm disagreement writes artifacts.
- Cloud AGENTS.md pi stub note; VERIFIED_STATE #37 done, #32 progress.
- tests: plan-args, cli help coverage (234 tests pass).

Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
@cursor cursor Bot marked this pull request as ready for review May 26, 2026 01:24
@cursor cursor Bot changed the title Exhaustive CLI harness and live Composer verification CLI harness, conflict gates, swarm branches, and V4 wiring May 26, 2026
Co-authored-by: Kartik <kartikkabadi@users.noreply.github.com>
@cursor cursor Bot merged commit 25dd5de into main May 26, 2026
4 checks passed