[FORGE-100] feat(reconcile): /reconcile skill + forge orchestrate reconcile verb (P2.5-T09)#164
Merged
Merged
Conversation
Phase 2.5 task IDs (P2.5-T01) and phase IDs (phase-2.5) didn't match the
schema regexes. Live phases.yaml also used `type: skill` / `type: docs` and
`status: deferred-v0.5` / `dropped` / `paused` fields that weren't in
TaskSchema. Result: loadPhases() throws on the very file forge generates
about itself.
- TASK_TYPES += 'skill', 'docs'
- PHASE_STATUSES += 'paused'
- TaskSchema gains optional status + {deferred,dropped,paused}_{at,reason}
- Phase/Task id regex accept decimal phase numbers (P2.5-T01)
- Fix YAML over-escaped backticks in P2.5-T18 description (\\` → `)
Prep work for FORGE-100 /reconcile, which needs loadPhases() to read
phases.yaml as the first step of the diff. Schema patch is additive — no
existing fixtures break.
Decision: Q4 schema-fix-strategy → user chose option B (fix in this PR)
Future verbs (FORGE-100 /reconcile next) need atomic file writes; the init/scaffold.ts private helper now lives in src/core/fs-atomic.ts and scaffold imports it. Adds tmp-file cleanup on rename failure (previous behavior left the tmp behind). Unit tests cover: happy path, nested mkdir, overwrite, no tmp residue on success, tmp cleanup on rename failure.
Pure functions for tracker ↔ phases.yaml sync:
- diffPull(issues, phases) → { updated, removed, added, unmanaged }
* updated: title or depends_on diverges; tracker blockerIds mapped back
to task IDs via the local tracker_issue_id index
* removed: phases.yaml task with no matching tracker issue (orphan)
* added: tracker issue with forge:task footer but no local task
(informational — phases.yaml only gains tasks via /amend-roadmap)
* unmanaged: tracker issue without forge:task footer (created outside forge)
- diffPush(phases, issues) → { bodies, skipped }
* bodies: each (task, phase) → rendered markdown
* skipped: no_tracker_issue_id | orphan_in_phases
- renderTaskBody(task, phase) — markdown shape mirroring live tracker bodies
(metadata header + description + acceptance checklist)
- applyPullToPhases(phases, plan, { confirmPrune }) — updates titles +
depends_on; prunes orphans only when confirmPrune=true; never auto-inserts
added[] (caller must surface them out-of-band)
Decision: depends_on diff sorts both sides before comparison so blockerId
order changes don't generate false positives.
…pull
Live phases.yaml has ~100 comment lines (phase rationale, deferral reasons,
rescope footnotes). Re-stringifying from the parsed Phases shape strips them
all. Per [[validator-narrower-than-preserver-causes-silent-corruption]], the
--pull writer must use yaml.parseDocument and mutate in place.
applyPlanToDocument(doc, plan, { confirmPrune }) traverses the Document tree,
applies title + depends_on updates, prunes orphans when confirmed. Returns
mutation count so the caller can skip the write when zero.
Decision (mid-implementation AUTO): preserve comments via Document API.
Rejected: re-stringify from Phases — loses ~100 comment lines on the very
file forge generates about itself.
CLI surface for FORGE-100 /reconcile. Flat naming matches the 6 existing
orchestrate-*.ts verb files.
Behavior:
- --pull --dry-run: emits {ok, data:{pull:PullPlan, applied:false}}; exits 0
unless orphans present (exit 1 = PRUNE_PENDING).
- --pull (no --dry-run): with orphans and no --confirm-prune|--no-prune,
exits 1 with the plan in stdout and a stderr hint to re-run with one of
the flags. With --confirm-prune, applies updates and prunes orphans via
applyPlanToDocument so phases.yaml comments survive.
- --push --dry-run: emits the rendered bodies without calling the tracker.
- --push: best-effort updateIssueBody loop; each per-task error collected
into failed[]; exits 2 if any failed (PARTIAL_PUSH_FAILURE per plan §Q3).
- --push for tasks without tracker_issue_id: skipped silently.
- Notion --push: gets NOT_IMPLEMENTED from the adapter stub; surfaced via
failed[] like any other per-task error.
Tracker instantiation inline switch on settings.tracker.type. Tests inject
a fakeTracker via trackerOverride so the unit suite has no network surface.
Decision: Q2 partial-push failure → best-effort + summary (user pick).
Decision: Q1 conflict UI → no conflict prompt (SPEC > LINEAR AC).
Decision: Q3 schema → schema patched in earlier commit, --pull reads via
PhasesSchema.safeParse successfully.
Adds 'reconcile' to the orchestrate subcommand switch and usage banner. Verb already self-contained; bin just forwards process.cwd() + subArgs. Smoke: built dist/bin/forge.cjs prints INVALID_ARGS JSON when --pull/--push absent, confirming end-to-end wiring through the bundler.
Skill wraps `forge orchestrate reconcile`: 1. Direction prompt (--pull / --push) 2. --dry-run --json call; present diff summary to user 3. Orphan-prune confirmation when --pull has removed[] (re-invokes with --confirm-prune or --no-prune) 4. Apply (no --dry-run); parse JSON envelope 5. One-line outcome summary + per-task failure listing on --push partial Notes documented in the skill: - Notion --push is a NOT_IMPLEMENTED stub (FORGE-117) - No conflict-resolution UI (SPEC > LINEAR AC per Q1) - YAML comments preserved across --pull writes - Exit codes 1/2 are user-actionable (PRUNE_PENDING / PARTIAL_PUSH_FAILURE)
Behavioral coverage at the CLI verb level driving a shared FakeTracker: - Round-trip: --pull → --push → --pull is idempotent (no diff on round 3) - Unmanaged tracker issue (no forge:task footer) appears in unmanaged[] and is NOT added to phases.yaml - --pull preserves YAML comments + field ordering when applying writes - Orphan flow: dry-run exits PRUNE_PENDING; --confirm-prune actually prunes - --no-prune applies title/depends updates but keeps orphans in yaml Per [[scope-cut-worldview-check-before-multi-adapter-stdlib]]: one FakeTracker against the Tracker interface contract, not three per-adapter runs. Adapter-specific quirks (Notion stub, Linear/GitHub footer parsing) are covered by FORGE-94's tests.
Supersedes the "Conflict prompt fires when same task edited in both places" acceptance line. Per [[spec-beats-linear-ac]], SPEC + the team-mode minimum architecture doc (spec/ORCHESTRATOR.md §CLI surface line 42) override the older AC text: - "Conflict prompt" removed; replaced with diff-preview + --confirm-prune - File path updated: src/cli/orchestrate/reconcile.ts → src/cli/orchestrate-reconcile.ts (flat naming, matches the 6 existing orchestrate-*.ts verb files) - Comment + ordering preservation called out explicitly (yaml.parseDocument) - Integration test AC reinterpreted: Tracker interface contract via shared FakeTracker, not 3 per-adapter runs; per-adapter quirks covered by FORGE-94 tests Decision: Q1 conflict UI → SPEC > LINEAR AC per user answer.
QA on FORGE-100 caught that exit codes 3 (PHASES_NOT_FOUND/INVALID_CONFIG) and 4 (TRACKER_ERROR at listActiveIssues) had no regression test. Two new unit cases cover: - exitCode 3 when plans/phases.yaml is missing (no settingsOverride flag involved — config-loading path) - exitCode 4 when tracker.listActiveIssues() rejects Both error paths emit JSON envelope to stderr per the spec-diff verb pattern. Also notes the --task <id> filter as deferred to FORGE-119 (filed during QA) — none of the locked ACs depend on it and the diff helpers already accept arbitrary inputs, so the follow-up is a thin filter layer.
3 BLOCKs from code-reviewer + 4 MEDIUMs from security-auditor + 1 LOW
addressed in this commit. The remaining LOW (TASK_STATUSES enum widening)
is deferred to a fast-follow per the auditor's verdict.
## Code-review blocks
1. **Exit code collision (BLOCK)**: INVALID_ARGS now exits 3 (hard error),
not 1 (PRUNE_PENDING). A caller checking `exitCode === 1` to decide
whether to prompt for orphan-prune confirmation can no longer mistake a
malformed call for a prune gate. Unit test updated to pin the new code.
2. **MCP handle leak (BLOCK + MEDIUM)**: `createTracker` now returns
`{ tracker, close? }` and `runOrchestrateReconcile` awaits `close()` in
a try/finally. Notion's StdioClientTransport child process is torn down
on every exit path including thrown errors. Linear/GitHub return
`close: undefined` (no-op).
3. **Local-only depends_on corruption (BLOCK)**: if any local
`depends_on` entry pointed to a task without `tracker_issue_id`, the
diff would silently overwrite to the shorter tracker-mapped list,
dropping local-only deps. Now: `diffTaskAgainstIssue` skips the
`depends_on` diff entirely when not every local dep can be represented
on tracker. Regression test added in
`test/unit/orchestrator/reconcile.test.ts`.
## Security MEDIUMs
4. **settings.yaml symlink check**: `validateUnderRoot` is now called on
both `phasesPath` and `settingsPath` before any read/write. Symlinks
pointing outside `cwd` throw `PATH_ESCAPE` and surface as
`INVALID_CONFIG` with exit 3.
5. **phases.yaml write target symlink check**: same as above; covers the
`writeAtomic(phasesPath, …)` call site.
6. **mcp_command unconstrained spawn**: `createTracker` enforces an
allowlist of `{npx, node}` for `mcp_command[0]`. An attacker with write
access to `.forge/settings.yaml` (e.g. malicious CI PR) can no longer
pin the launcher to an arbitrary binary. Out-of-band launchers can be
added to the allowlist if needed.
## Security LOWs
7. **TOCTOU**: `loadPhasesWithDocument` now stat()s the file BEFORE
`readFileSync`, so a 4 MiB+ adversarial file is rejected without
loading into memory.
## Improvements
- Dead code `void titleAndDepsChanges` removed.
- Orphan-pending path: stdout now emits `ok:false` with
`code: PRUNE_PENDING`, stderr carries the full plan (skill consumers
no longer need to inspect data.pull.removed.length to detect).
- `applyPlanToDocument` gains a comment explaining `added[]` is
intentionally skipped (parity with `applyPullToPhases`).
- SKILL.md documents that NOT_IMPLEMENTED and PRECONDITION_FAILED loop
forever — automated retries should filter to RATE_LIMITED/TRANSPORT/
TIMEOUT codes.
## Deferred to follow-up
- LOW: `task.status` schema widening (`z.string().optional()`). Auditor
recommended `z.enum(TASK_STATUSES)`. Deferred because live phases.yaml
uses values like `'deferred-v0.5'` that aren't trivially enumerable;
the right shape needs its own scoping. Filed as a fast-follow.
## Verification
- `npm run typecheck`: clean
- `npm test`: 945 tests, 934 pass, 11 skipped, 0 fail
- `npm run build`: dual ESM+CJS dist
- `npm run test:pack`: 0 forbidden paths in tarball
- Smoke: `node dist/bin/forge.cjs orchestrate reconcile` returns
`INVALID_ARGS` JSON (exit 3), mutually-exclusive flags rejected.
Two independent reviewers (Codex CLI + Claude code-reviewer) flagged
three new BLOCKs on the post-fix PR. All three plus three high-confidence
improvements addressed in this commit.
## BLOCKs
1. **Duplicate forgeTaskId attribution (Codex conf 9)**:
`diffPull` previously fell back from `byTrackerId.get(issue.id)` to
`byTaskId.get(issue.forgeTaskId)` unconditionally. If a malicious or
duplicate tracker issue carried `<!-- forge:task=P1-T01 -->` while the
local P1-T01 already bound a different `tracker_issue_id`, the fallback
would silently apply the duplicate issue's title/deps to the local
task. Now: fall back only when the local task has no
`tracker_issue_id` yet (legitimate first-sync case); a colliding
footer on a different tracker issue surfaces as `unmanaged[]`.
2. **MCP allowlist was security theater (Codex conf 8)**:
The earlier `{npx, node}` allowlist on `mcp_command[0]` did not
constrain anything meaningful: `node -e '...'` and `npx -y
<attacker-pkg>` are both arbitrary code execution. Removed the
allowlist and documented the honest trust model: settings.yaml is
TRUSTED EXECUTABLE CONFIG (same trust level as package.json scripts
or a Makefile). Mitigation lives at the repo layer (branch protection,
CODEOWNERS), not at the argv[0] layer.
3. **YAML anchor splice crash (Claude 2nd-pass)**:
`tasksNode.items.splice(ti, 1)` on a task node carrying a YAML anchor
leaves any `*alias` elsewhere in the document dangling; yaml v2's
`doc.toString()` then throws "Unresolved alias". forge-generated
phases.yaml never uses anchors, but a hand-edited file might. Now:
`applyPlanToDocument` throws a clear VALIDATION error before splicing,
pointing the user to resolve the anchor manually.
## Improvements
- **PRUNE_PENDING stream convention (Codex conf 8)**: stdout now carries
the structured plan envelope and stderr the human diagnostic, matching
`orchestrate-spec-diff.ts`. Previously inverted (error on stdout).
- **Round-trip e2e was a placebo (Codex conf 7)**: `FakeTracker.updateIssueBody`
now actually stores the body, and the round-trip test asserts the rendered
body reached the tracker (Forge task ID, type, owner, depends_on,
acceptance — all present in the pushed body).
- **Local-only depends_on regression test scope (Codex conf 7)**: assertion
narrowed from "no update at all" to "no depends_on change", so a
future title diff on the same task can't mask a real regression.
## New regression tests
- diffPull duplicate forgeTaskId footer → unmanaged (not attributed)
- diffPull forgeTaskId fallback still works when local tracker_issue_id is undefined
- applyPlanToDocument refuses to prune an anchored task
## Verification
- `npm run typecheck`: clean
- `npm test`: 962 tests, 951 pass, 11 skipped, 0 fail
- `npm run build`: dual ESM+CJS dist
- `npm run test:pack`: 0 forbidden paths
- Smoke: `node dist/bin/forge.cjs orchestrate reconcile` returns INVALID_ARGS (exit 3)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Bidirectional sync between
plans/phases.yamland the configured tracker. Implements P2.5-T09 (FORGE-100) — the minimum-architecture/reconcileper the 2026-05-17 PM team-mode pivot.forge orchestrate reconcile --pull— tracker → phases.yaml. Diffs titles + depends_on, mutates the YAML Document in place (comments preserved). Orphans require--confirm-pruneor--no-prune(exit 1 = PRUNE_PENDING when neither given).forge orchestrate reconcile --push— phases.yaml → tracker viaTracker.updateIssueBody(FORGE-94). Best-effort loop; per-task failures collected intofailed[](exit 2 = PARTIAL_PUSH_FAILURE).--dry-runsupported on both directions.--jsonenvelope matches the canonicalorchestrate-spec-diff.ts:55pattern.skills/reconcile/SKILL.mdwraps the verb with diff preview + orphan-prune confirmation.No conflict-resolution UI per
spec/ORCHESTRATOR.md§CLI surface — supersedes the older "Conflict prompt fires" wording in the LINEAR AC (per[[spec-beats-linear-ac]]).Why
/reconcileis the bridge that lets phases.yaml stay current with team activity in the tracker. Unblocks FORGE-113 (T17 source metadata stanza) and FORGE-116 (T20 /wrap-up).What changed
src/schemas/phases.ts):TASK_TYPES += 'skill','docs',PHASE_STATUSES += 'paused', optionalstatus/*_reasonfields, decimal phase IDs (P2.5-T01). The liveplans/phases.yamlnow loads cleanly throughloadPhases()for the first time.src/core/fs-atomic.ts):writeAtomicextracted frominit/scaffold.tswith tmp-cleanup on rename failure.src/orchestrator/reconcile.ts):diffPull,diffPush,renderTaskBody,applyPlanToDocument(yaml document-mode mutation that preserves comments per[[validator-narrower-than-preserver-causes-silent-corruption]]).src/cli/orchestrate-reconcile.ts): argv parsing, JSON envelope, exit codes 0/1/2/3/4, MCP-command allowlist for Notion,validateUnderRooton read+write paths, MCP handle teardown in try/finally.src/bin/forge.ts):dispatchOrchestrategainsreconcilebranch.skills/reconcile/SKILL.md): UX layer with documented exit code semantics + Notion stub + PRECONDITION_FAILED retry loop warning.Forks (user-decided in /plan-task)
--confirm-pruneloadPhases()unblocked for all future verbsTest plan
npm run typecheckcleannpm test— 959 tests, 948 pass, 11 skipped, 0 failnpm run buildproduces dual ESM+CJSnode dist/bin/forge.cjs orchestrate reconciledispatches (exit 3 INVALID_ARGS when no direction)npm run test:pack— 0 forbidden paths in tarballReview summary
Two REQUEST CHANGES verdicts addressed in commit 08e2a81:
{npx, node}, TOCTOU stat-before-read.--taskfilter → FORGE-119,TASK_STATUSESenum → FORGE-120.Linear
Closes FORGE-100. Follow-ups filed: FORGE-119, FORGE-120.
🤖 Generated with Claude Code