diff --git a/plugins/codex-loop-engineering/skills/loop-engineering/SKILL.md b/plugins/codex-loop-engineering/skills/loop-engineering/SKILL.md index 815efbe..55705e3 100644 --- a/plugins/codex-loop-engineering/skills/loop-engineering/SKILL.md +++ b/plugins/codex-loop-engineering/skills/loop-engineering/SKILL.md @@ -3,7 +3,7 @@ name: loop-engineering description: Use when a substantial coding, research, content, product, or other long-running project needs Codex-centered multi-agent orchestration, role/lane design, cross-model Claude/Codex planning or review, cross-model debate, named Codex agent threads, dispatcher-mediated handoffs, thread ledgers, worklogs, batons, repair loops, or evidence-based arbitration. --- -# Codex Loop Engineering +# Loop Engineering ## Purpose @@ -29,16 +29,14 @@ For tiny edits, config tweaks, docs-only notes, or simple local bug fixes, do no First decide the smallest route that controls risk: 1. Choose the route tier before choosing agents. -2. Ask the user to define the topology before creating lanes: desired roles, parallel vs sequential work, communication rules, and whether Codex-only or Codex-plus-Claude is expected. -3. For each new task, derive the project-specific agent roles first, then build the matching identity table and worklog summary before execution starts. -4. Verify the six-interface contract: goal, state, context, act, capture, stop. -5. For T3/T4, create or identify a Strategic Loop Contract; see `references/strategic-loop-contract.md`. -6. If the strategic target or "good enough" completion criterion is missing, write `strategy-gap: ` and stop for a user checkpoint. -7. For multi-round work, define how state and feedback will be recorded; see `references/state-feedback-schema.md`. -8. Same active loop correction or continuation? Reuse the existing `loop_id` and owner lane. -9. Choose `claude_policy` for handoffs and lane artifacts; see `references/claude-policy.md`. -10. Critical direction or degraded-tool decision? 
Use `references/user-checkpoints.md`. -11. Context or handoff risk? Drop a baton before more work or handoff. +2. Verify the six-interface contract: goal, state, context, act, capture, stop. +3. For T3/T4, create or identify a Strategic Loop Contract; see `references/strategic-loop-contract.md`. +4. If the strategic target or "good enough" completion criterion is missing, write `strategy-gap: ` and stop for a user checkpoint. +5. For multi-round work, define how state and feedback will be recorded; see `references/state-feedback-schema.md`. +6. Same active loop correction or continuation? Reuse the existing `loop_id` and owner lane. +7. Choose `claude_policy` for handoffs and lane artifacts; see `references/claude-policy.md`. +8. Critical direction or degraded-tool decision? Use `references/user-checkpoints.md`. +9. Context or handoff risk? Drop a baton before more work or handoff. ## Planning Lane Execution Firewall @@ -132,7 +130,7 @@ If the strategic target is absent, write `strategy-gap: ` and For T3/T4, the strategic plan and operational route contract should live together as a Strategic Loop Contract, not as duplicate documents. Use `references/strategic-loop-contract.md`, then optionally run: ```bash -python3 skills/loop-engineering/scripts/validate-loop-contract.py +python3 /Users/apple/.codex/skills/loop-engineering/scripts/validate-loop-contract.py ``` ## Execution Batch Sizing And Review Cadence @@ -163,6 +161,14 @@ multi-lane process. For substantial frontend/product-surface work, the execution contract must state the intended visible UI shape before edits: target screen, main panels/cards, empty/degraded states, backend placeholders, and the user calibration point. Prefer landing a visible skeleton tied to stable contracts before filling deep backend behavior when the user needs to judge the interface. 
+### Parallel Worktree Execution + +When a large T4 execution or repair slice has separable surfaces, prefer a main +integrator/arbitrator plus isolated worktree lanes instead of one overloaded +thread. Never let multiple agents edit the same checkout. Detailed contract, +boundary, arbitration-repair, and lane-rotation rules live in +`references/lane-roles.md`. + Do not use long blank windows for ordinary work. Normal execution/review handoffs should use a practical first check and deadline; deadlines above 45 minutes need an explicit reason such as deep planning, whole-phase architecture review, long test/build operations, or slow external tools. For execution lanes, treat the deadline as a recovery threshold only when the lane appears idle, errored, or artifact-missing without active progress; if the execution lane is visibly active and still editing/testing, keep low-frequency artifact/status monitoring instead of interrupting or declaring failure. If the user says a lane is done, blocked, or wrong, treat that as an immediate state signal: check artifacts first, perform one recovery read if needed, update state/feedback, and route the next lane instead of waiting for the old deadline. 
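
The deadline rule in the paragraph above can be sketched as a small check. This is a minimal illustration, not part of the skill: the `Handoff` record and its field names are hypothetical, and the check that the first check should not come after the deadline is an added assumption. Only the 45-minute threshold and the need for an explicit reason come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold from the text: deadlines above 45 minutes need an explicit reason.
LONG_DEADLINE_MINUTES = 45


@dataclass
class Handoff:
    """Hypothetical handoff record; field names are illustrative assumptions."""
    lane: str
    check_after_min: int
    deadline_min: int
    long_deadline_reason: Optional[str] = None


def deadline_problems(handoff: Handoff) -> List[str]:
    """Return human-readable problems with a handoff's timing fields."""
    problems = []
    # Assumption (not stated in the text): the first check precedes the deadline.
    if handoff.check_after_min > handoff.deadline_min:
        problems.append("check_after is later than deadline")
    # Rule from the text: long deadlines must carry an explicit reason.
    if handoff.deadline_min > LONG_DEADLINE_MINUTES and not handoff.long_deadline_reason:
        problems.append("deadline above 45 minutes needs an explicit reason")
    return problems
```

A manager/dispatcher could run such a check before sending a handoff. Reasons such as "deep planning" or "long test/build operations" would be recorded in the reason field.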
## Admission And New-Lane Gates @@ -242,14 +248,14 @@ docs/ai-handoffs/YYYY-MM-DD-slug/ For this project, prefer existing plan conventions: ```text -docs/loop-engineering/plans/YYYY-MM-DD-slug-claude-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-merged-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-execution-report.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-claude-review.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-subagent-review.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-arbitration.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-final-report.md +docs/superpowers/plans/YYYY-MM-DD-slug-claude-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-merged-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-execution-report.md +docs/superpowers/plans/YYYY-MM-DD-slug-claude-review.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-subagent-review.md +docs/superpowers/plans/YYYY-MM-DD-slug-arbitration.md +docs/superpowers/plans/YYYY-MM-DD-slug-final-report.md ``` Rules: @@ -288,16 +294,6 @@ Core defaults: Loop lanes are role contracts, not fixed job titles. For coding loops the default roles are planning, execution, review, and arbitration; for non-code long projects, map the same pattern to domain roles such as producer, researcher, scriptwriter, editor, publisher, or QA. -Before any lanes exist, the skill should ask the user to define the topology rather than assuming one: - -- What are the roles or lane names for this project? -- Which work should run in parallel, and which work must stay sequential? -- Should the manager be the only cross-lane communicator, or are some direct handoffs allowed? -- Do you want a review lane, an arbitration lane, or both? -- Is Claude optional, required, or not part of the topology? -- At which points should the user be brought in before the loop continues? 
-- What conditions mean the current plan should be revised instead of letting the loop continue? - Read `references/lane-roles.md` when creating, steering, recovering, or reviewing any lane. That reference defines planning, execution, review, arbitration, manager, and dispatcher behavior, including continuous manager monitoring and low-frequency artifact checks. Essential constraints: @@ -310,7 +306,6 @@ Essential constraints: - Manager/dispatcher does not own planning/execution/review/arbitration decisions. It tracks artifacts, repairs coordination, and routes handoffs. - Manager/dispatcher monitoring is artifact-driven and deadline-driven. Do not poll active lane threads every few seconds; each handoff should include `check_after`, `deadline`, and expected artifact paths when the lane may run long. - When the user asks the manager/dispatcher to keep a loop moving, do not stop with a normal final while required lane artifacts are pending and no blocker has been reached. -- If the loop reaches a product, scope, or tradeoff decision that the user should own, stop and ask rather than continuing autonomously. - Reviews stay independent: Claude review and Codex review do not read each other before arbitration. - Arbitration repairs implementation defects inside the merged plan. Return to planning only for plan defects, scope-changing fixes, or user-goal mismatches. - Critical direction changes and degraded Claude-required gates need a user checkpoint; see `references/user-checkpoints.md`. @@ -631,7 +626,7 @@ Do not turn a one-off project lesson into a skill unless it generalizes beyond t - Giving every lane broad project context when only planning needs it. - Skipping `thread-ledger.md` rows for `send_message_to_thread`. - Skipping agent worklog entries, losing lessons and repeated pitfalls. -- Making the manager lane the central relay for all messages instead of letting lanes hand off directly. 
+- Making `经理Agent` the central relay for all messages instead of letting lanes hand off directly. - Treating the bootstrap thread as a main agent instead of assigning a real lane role. - Starting a new lane set for a correction to the active loop instead of messaging the existing owner lane. - Auto-archiving lane threads before the user has finished evaluating the loop. diff --git a/plugins/codex-loop-engineering/skills/loop-engineering/references/lane-roles.md b/plugins/codex-loop-engineering/skills/loop-engineering/references/lane-roles.md index a131c15..07d3cf2 100644 --- a/plugins/codex-loop-engineering/skills/loop-engineering/references/lane-roles.md +++ b/plugins/codex-loop-engineering/skills/loop-engineering/references/lane-roles.md @@ -23,6 +23,40 @@ For multi-round or multi-lane loops, use `state-feedback-schema.md` to record ho For non-code workflows, map roles to the task instead of forcing coding labels. Example video workflow: topic planner -> researcher -> scriptwriter -> visual planner/editor -> QA/reviewer -> publisher, with manager/dispatcher coordinating artifacts. The durable value is reusable role memory and artifact flow inside Codex, not the specific plan/execute/review labels. +## Human-Readable Lane Naming + +Lane names, thread titles, branch names, and worktree folder names are part of +the product surface for human operators. Manager/dispatcher must choose short, +stable, human-readable names before creating or continuing lanes. + +Rules: + +- Prefer `role + letter/number + purpose`, not long artifact slugs or opaque + generated ids. +- Keep visible thread titles short enough to scan in a sidebar: usually 3-6 + words. +- Include the work type first: `Plan`, `Exec A`, `Exec B`, `Review`, `Arbitrate`, + `Repair A`, `QA`. +- Include the human purpose second: `Shell Pocket`, `Paper Mode`, `PPT Region`, + `Agent Panel`, `Safety QA`. 
+- Use compact branch/worktree names such as `codex/b12-a-shell-pocket`, + `codex/b12-b-paper`, `codex/b12-c-ppt`, `codex/b12-integrator`, and + `codex/b12-repair-a-paper`. +- Record machine ids separately in the ledger. Do not make humans infer purpose + from thread ids, pending worktree ids, UUIDs, or full artifact filenames. +- If a tool creates a pending worktree/thread with an opaque id, immediately map + it to a human label in the ledger and rename the visible thread when the tool + supports it. + +Examples: + +| lane | good visible title | good branch | avoid | +|---|---|---|---| +| main integration | `Exec Integrator - B12` | `codex/b12-integrator` | `019eef...` | +| shell/pocket | `Exec A - Shell Pocket` | `codex/b12-a-shell-pocket` | `batch12-parallel-preview-first-research-workflow-shell-pocket-relaunch` | +| paper | `Exec B - Paper Mode` | `codex/b12-b-paper` | `worktree-lane-b-phase2-batch12-paper-manuscript-mode` | +| arbitration | `Arbitrate - B12 Repair` | `codex/b12-arbitration` | `standing-arbitration-lane-019eeec2` | + ## Lane Reuse Policy Lanes are persistent roles when continuity helps: planning/product, execution, arbitration/repair, manager, and dispatcher may reuse an existing visible thread for the same loop if the thread is not blocked, polluted by incompatible scope, or too stale to recover. Reuse is preferred when the lane benefits from local project memory and repeated setup would waste context. @@ -358,6 +392,64 @@ Rules: - If a helper discovers a plan gap, scope change, mutation/runtime need, or product-direction mismatch, the execution lane records the blocker and stops instead of silently expanding scope. +### Parallel Worktree Execution Lanes + +For large T4 execution slices, manager/dispatcher may split implementation across +isolated worktrees instead of asking one execution lane to do everything. 
Use this +when the work naturally separates by product surface, data/model contract, +runtime boundary, fixture set, visual QA surface, or test/evidence layer. + +Default shape: + +```text +Main Integrator lane + owns shared contracts, final merge/integration, full verification, consolidated report + +Parallel Worktree lanes A/B/C/... + each owns one isolated worktree/branch, one write scope, one lane report + +Late QA/Safety lane + starts only after lane outputs or an integrated preview exist +``` + +Manager/dispatcher contract requirements: + +- name every worktree path, branch, lane role, write scope, and forbidden scope; +- forbid concurrent edits to the same checkout/worktree; +- require the integrator to define shared component/data/CSS boundaries before + parallel lanes depend on them, or explicitly state the stable boundaries from + existing code; +- give each lane one expected artifact path and focused verification commands; +- require lane reports to include touched files, tests run, known conflicts, + remaining integration work, safety check, and lifecycle; +- require the integrator to consume lane reports/diffs, resolve conflicts, run + focused then full tests/build/browser evidence, and write one consolidated + execution report; +- state the integration order, such as shell/data boundary -> feature surface A + -> feature surface B -> Agent/interaction -> QA/safety. + +Parallel lane rules: + +- A lane may edit only its assigned worktree and write scope. +- A lane must not modify the product root, sibling worktrees, shared Git state, + or another lane's report. +- A lane must not stage, commit, push, reset, stash, or run destructive cleanup + unless that operation is explicitly authorized in the handoff. +- If a lane accidentally edits outside its worktree/scope, it must stop, + compare evidence, recover only clearly attributable out-of-scope edits, and + record `Boundary Incident / Recovery` in its report. 
If attribution is + uncertain, stop and escalate to manager instead of reverting. + +Integrator rules: + +- The integrator is the only lane that merges/selects patches from parallel + lanes. +- The integrator must not invent completed lane work when lane reports or diffs + are absent; it records blockers or partial integration honestly. +- The consolidated execution report must distinguish lane-local verification + from integrator verification, and must include git/worktree state for the + integration worktree plus any relevant boundary incidents. + ## Phase Sizing And Review Cadence For substantial long-running work, spend more effort in planning and reduce review churn during execution. @@ -376,7 +468,7 @@ Planning should define: If planning cannot state the strategic target, write `strategy-gap: ` and stop for a user checkpoint. Do not compensate by writing a longer execution task list. -Execution should group work into larger slices when the pieces belong to the same user workflow. For example, a content production workflow might group research collection, material preparation, draft creation, editing, and QA into one or two coherent execution slices instead of five helper-sized review loops. +Execution should group work into larger slices when the pieces belong to the same user workflow. Example for a biology learning workbench: daily guide/status command center, acquisition refresh facade, queue lifecycle commands, weekly project refresh, and PPT readiness command/report may be grouped into one or two coherent execution slices instead of five helper-sized review loops. An execution slice is too small for a full review loop when it only changes one helper, one internal function, or one local cleanup without landing a visible workflow state, product surface, durable contract, migration boundary, or risk boundary. 
@@ -460,6 +552,43 @@ Arbitration lane should: - stay reusable for the active loop unless the full milestone closes, a replacement lane is confirmed, the lane is corrupted/stale, or the user explicitly asks. +### Parallel Arbitration / Repair Worktrees + +Large repair phases may use isolated worktree repair lanes, but adjudication +authority stays centralized. Use this when accepted findings span separable +surfaces, many files, or conflicting patches that would overload one arbitration +thread. + +Default shape: + +```text +Chief Arbitration lane + owns findings disposition, repair contract, integration, final report + +Repair Worktree lanes A/B/C/... + each owns one accepted finding group or product surface + +Optional QA/Safety lane + checks integrated repairs after the chief arbitration lane has a preview +``` + +Rules: + +- Only the chief arbitration lane decides finding disposition: + `accept`, `reject`, `defer`, `third path`, or `needs more evidence`. +- Repair lanes may implement only already-dispositioned, scoped repairs. They do + not independently reinterpret review findings or expand product direction. +- Each repair lane needs its own worktree/branch, write scope, expected repair + report, focused verification, known conflict list, and stop condition. +- The chief arbitration lane alone integrates repair outputs, resolves + conflicts, runs required verification, and writes arbitration/final artifacts. +- If repair lanes reveal a plan gap, scope change, or uncertain evidence, they + stop and report back to chief arbitration. The chief lane decides whether to + gather evidence, return to planning, or defer. +- Boundary incidents follow the same rule as execution worktrees: recover only + clearly attributable out-of-scope edits and record the incident in the repair + report, ledger, and final arbitration summary. 
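
The disposition rules above can be sketched as a dispatch gate. This is a hedged illustration: the `Finding` record, its field names, and the function name are hypothetical. Treating only `accept` and `third path` as dispositions that produce repair work is an assumption. The text says only that repair lanes implement already-dispositioned, scoped repairs.

```python
from dataclasses import dataclass
from typing import Optional

# Disposition values listed in the rules for the chief arbitration lane.
DISPOSITIONS = {"accept", "reject", "defer", "third path", "needs more evidence"}

# Assumption: only these dispositions lead to implementation work in a repair lane.
REPAIRABLE = {"accept", "third path"}


@dataclass
class Finding:
    """Hypothetical review finding; field names are illustrative assumptions."""
    finding_id: str
    disposition: Optional[str] = None   # set only by the chief arbitration lane
    repair_scope: Optional[str] = None  # write scope assigned to the repair lane


def can_dispatch_to_repair_lane(finding: Finding) -> bool:
    """Allow dispatch only for findings that are dispositioned and scoped."""
    if finding.disposition is None:
        return False  # not yet adjudicated by chief arbitration
    if finding.disposition not in DISPOSITIONS:
        raise ValueError(f"unknown disposition: {finding.disposition!r}")
    return finding.disposition in REPAIRABLE and bool(finding.repair_scope)
```

Keeping the gate in one place mirrors the rule that adjudication stays centralized: a repair lane never sets `disposition` itself.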
+ ## Repair Routing After review finds a problem, route it by defect type: @@ -498,6 +627,39 @@ Manager should: Manager must not silently execute business-code changes, overrule arbitration without evidence, or centralize all lane communication as hidden chat. +### Manager / Planning Lane Rotation + +Long-lived manager and planning lanes can become coordination debt when chat +history accumulates old heartbeats, closed batches, stale lane ids, and +superseded routes. Rotate them deliberately before context corruption affects +dispatch. + +Rotate a manager or planning lane when any of these are true: + +- repeated context compression or interrupted turns make current state + unreliable; +- old route instructions, obsolete heartbeats, or stale lane ids keep resurfacing; +- the lane has accumulated multiple completed phases and the next phase needs a + clean dispatch context; +- the user reports the manager/planner is stuck, slow, confused, or reviving old + work; +- the lane is near context limits and exact dirty state, active lanes, or + blockers matter. + +Rotation contract: + +- Write a baton/context pack before handoff. It must include loop id, roots, + canonical artifacts, active lanes, retired lanes, current phase status, + pending expected artifacts, safety boundaries, stale monitors to delete/update, + dirty git/worktree state, and next routing decision. +- Record the rotation in state-feedback, worklog, and thread-ledger. +- Mark the old manager/planning lane retired, archived, or checkpointed with + `next_expected_use: none` unless a specific future use is named. +- The new lane must start artifact-first from the baton/context pack and current + ledgers, not from inherited chat memory. +- Do not let both old and new manager lanes actively dispatch the same loop. + Overlapping managers are allowed only during explicit handoff validation. + ## Dispatcher Role Dispatcher is a physical delivery role. 
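
The rotation contract in the Manager / Planning Lane Rotation section lists what a baton/context pack must include. A minimal sketch of a completeness check follows, assuming the baton is held as a plain dictionary. The snake_case key names are hypothetical renderings of the items in that list, not a schema defined by the skill.

```python
from typing import Any, Dict, List

# One key per item named in the rotation contract; key spellings are assumptions.
REQUIRED_BATON_FIELDS = (
    "loop_id",
    "roots",
    "canonical_artifacts",
    "active_lanes",
    "retired_lanes",
    "current_phase_status",
    "pending_expected_artifacts",
    "safety_boundaries",
    "stale_monitors",
    "dirty_worktree_state",
    "next_routing_decision",
)


def missing_baton_fields(baton: Dict[str, Any]) -> List[str]:
    """Return required keys that are absent or None.

    Empty lists are accepted, because a field such as retired_lanes can
    legitimately be empty.
    """
    return [name for name in REQUIRED_BATON_FIELDS if baton.get(name) is None]
```

A new manager lane could refuse to start dispatching until this list is empty, which matches the rule that it starts artifact-first rather than from inherited chat memory.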
diff --git a/skills/loop-engineering/SKILL.md b/skills/loop-engineering/SKILL.md index 815efbe..55705e3 100644 --- a/skills/loop-engineering/SKILL.md +++ b/skills/loop-engineering/SKILL.md @@ -3,7 +3,7 @@ name: loop-engineering description: Use when a substantial coding, research, content, product, or other long-running project needs Codex-centered multi-agent orchestration, role/lane design, cross-model Claude/Codex planning or review, cross-model debate, named Codex agent threads, dispatcher-mediated handoffs, thread ledgers, worklogs, batons, repair loops, or evidence-based arbitration. --- -# Codex Loop Engineering +# Loop Engineering ## Purpose @@ -29,16 +29,14 @@ For tiny edits, config tweaks, docs-only notes, or simple local bug fixes, do no First decide the smallest route that controls risk: 1. Choose the route tier before choosing agents. -2. Ask the user to define the topology before creating lanes: desired roles, parallel vs sequential work, communication rules, and whether Codex-only or Codex-plus-Claude is expected. -3. For each new task, derive the project-specific agent roles first, then build the matching identity table and worklog summary before execution starts. -4. Verify the six-interface contract: goal, state, context, act, capture, stop. -5. For T3/T4, create or identify a Strategic Loop Contract; see `references/strategic-loop-contract.md`. -6. If the strategic target or "good enough" completion criterion is missing, write `strategy-gap: ` and stop for a user checkpoint. -7. For multi-round work, define how state and feedback will be recorded; see `references/state-feedback-schema.md`. -8. Same active loop correction or continuation? Reuse the existing `loop_id` and owner lane. -9. Choose `claude_policy` for handoffs and lane artifacts; see `references/claude-policy.md`. -10. Critical direction or degraded-tool decision? Use `references/user-checkpoints.md`. -11. Context or handoff risk? Drop a baton before more work or handoff. +2. 
Verify the six-interface contract: goal, state, context, act, capture, stop. +3. For T3/T4, create or identify a Strategic Loop Contract; see `references/strategic-loop-contract.md`. +4. If the strategic target or "good enough" completion criterion is missing, write `strategy-gap: ` and stop for a user checkpoint. +5. For multi-round work, define how state and feedback will be recorded; see `references/state-feedback-schema.md`. +6. Same active loop correction or continuation? Reuse the existing `loop_id` and owner lane. +7. Choose `claude_policy` for handoffs and lane artifacts; see `references/claude-policy.md`. +8. Critical direction or degraded-tool decision? Use `references/user-checkpoints.md`. +9. Context or handoff risk? Drop a baton before more work or handoff. ## Planning Lane Execution Firewall @@ -132,7 +130,7 @@ If the strategic target is absent, write `strategy-gap: ` and For T3/T4, the strategic plan and operational route contract should live together as a Strategic Loop Contract, not as duplicate documents. Use `references/strategic-loop-contract.md`, then optionally run: ```bash -python3 skills/loop-engineering/scripts/validate-loop-contract.py +python3 /Users/apple/.codex/skills/loop-engineering/scripts/validate-loop-contract.py ``` ## Execution Batch Sizing And Review Cadence @@ -163,6 +161,14 @@ multi-lane process. For substantial frontend/product-surface work, the execution contract must state the intended visible UI shape before edits: target screen, main panels/cards, empty/degraded states, backend placeholders, and the user calibration point. Prefer landing a visible skeleton tied to stable contracts before filling deep backend behavior when the user needs to judge the interface. +### Parallel Worktree Execution + +When a large T4 execution or repair slice has separable surfaces, prefer a main +integrator/arbitrator plus isolated worktree lanes instead of one overloaded +thread. Never let multiple agents edit the same checkout. 
Detailed contract, +boundary, arbitration-repair, and lane-rotation rules live in +`references/lane-roles.md`. + Do not use long blank windows for ordinary work. Normal execution/review handoffs should use a practical first check and deadline; deadlines above 45 minutes need an explicit reason such as deep planning, whole-phase architecture review, long test/build operations, or slow external tools. For execution lanes, treat the deadline as a recovery threshold only when the lane appears idle, errored, or artifact-missing without active progress; if the execution lane is visibly active and still editing/testing, keep low-frequency artifact/status monitoring instead of interrupting or declaring failure. If the user says a lane is done, blocked, or wrong, treat that as an immediate state signal: check artifacts first, perform one recovery read if needed, update state/feedback, and route the next lane instead of waiting for the old deadline. ## Admission And New-Lane Gates @@ -242,14 +248,14 @@ docs/ai-handoffs/YYYY-MM-DD-slug/ For this project, prefer existing plan conventions: ```text -docs/loop-engineering/plans/YYYY-MM-DD-slug-claude-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-merged-plan.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-execution-report.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-claude-review.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-codex-subagent-review.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-arbitration.md -docs/loop-engineering/plans/YYYY-MM-DD-slug-final-report.md +docs/superpowers/plans/YYYY-MM-DD-slug-claude-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-merged-plan.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-execution-report.md +docs/superpowers/plans/YYYY-MM-DD-slug-claude-review.md +docs/superpowers/plans/YYYY-MM-DD-slug-codex-subagent-review.md 
+docs/superpowers/plans/YYYY-MM-DD-slug-arbitration.md +docs/superpowers/plans/YYYY-MM-DD-slug-final-report.md ``` Rules: @@ -288,16 +294,6 @@ Core defaults: Loop lanes are role contracts, not fixed job titles. For coding loops the default roles are planning, execution, review, and arbitration; for non-code long projects, map the same pattern to domain roles such as producer, researcher, scriptwriter, editor, publisher, or QA. -Before any lanes exist, the skill should ask the user to define the topology rather than assuming one: - -- What are the roles or lane names for this project? -- Which work should run in parallel, and which work must stay sequential? -- Should the manager be the only cross-lane communicator, or are some direct handoffs allowed? -- Do you want a review lane, an arbitration lane, or both? -- Is Claude optional, required, or not part of the topology? -- At which points should the user be brought in before the loop continues? -- What conditions mean the current plan should be revised instead of letting the loop continue? - Read `references/lane-roles.md` when creating, steering, recovering, or reviewing any lane. That reference defines planning, execution, review, arbitration, manager, and dispatcher behavior, including continuous manager monitoring and low-frequency artifact checks. Essential constraints: @@ -310,7 +306,6 @@ Essential constraints: - Manager/dispatcher does not own planning/execution/review/arbitration decisions. It tracks artifacts, repairs coordination, and routes handoffs. - Manager/dispatcher monitoring is artifact-driven and deadline-driven. Do not poll active lane threads every few seconds; each handoff should include `check_after`, `deadline`, and expected artifact paths when the lane may run long. - When the user asks the manager/dispatcher to keep a loop moving, do not stop with a normal final while required lane artifacts are pending and no blocker has been reached. 
-- If the loop reaches a product, scope, or tradeoff decision that the user should own, stop and ask rather than continuing autonomously. - Reviews stay independent: Claude review and Codex review do not read each other before arbitration. - Arbitration repairs implementation defects inside the merged plan. Return to planning only for plan defects, scope-changing fixes, or user-goal mismatches. - Critical direction changes and degraded Claude-required gates need a user checkpoint; see `references/user-checkpoints.md`. @@ -631,7 +626,7 @@ Do not turn a one-off project lesson into a skill unless it generalizes beyond t - Giving every lane broad project context when only planning needs it. - Skipping `thread-ledger.md` rows for `send_message_to_thread`. - Skipping agent worklog entries, losing lessons and repeated pitfalls. -- Making the manager lane the central relay for all messages instead of letting lanes hand off directly. +- Making `经理Agent` the central relay for all messages instead of letting lanes hand off directly. - Treating the bootstrap thread as a main agent instead of assigning a real lane role. - Starting a new lane set for a correction to the active loop instead of messaging the existing owner lane. - Auto-archiving lane threads before the user has finished evaluating the loop. diff --git a/skills/loop-engineering/references/lane-roles.md b/skills/loop-engineering/references/lane-roles.md index a131c15..07d3cf2 100644 --- a/skills/loop-engineering/references/lane-roles.md +++ b/skills/loop-engineering/references/lane-roles.md @@ -23,6 +23,40 @@ For multi-round or multi-lane loops, use `state-feedback-schema.md` to record ho For non-code workflows, map roles to the task instead of forcing coding labels. Example video workflow: topic planner -> researcher -> scriptwriter -> visual planner/editor -> QA/reviewer -> publisher, with manager/dispatcher coordinating artifacts. 
The durable value is reusable role memory and artifact flow inside Codex, not the specific plan/execute/review labels. +## Human-Readable Lane Naming + +Lane names, thread titles, branch names, and worktree folder names are part of +the product surface for human operators. Manager/dispatcher must choose short, +stable, human-readable names before creating or continuing lanes. + +Rules: + +- Prefer `role + letter/number + purpose`, not long artifact slugs or opaque + generated ids. +- Keep visible thread titles short enough to scan in a sidebar: usually 3-6 + words. +- Include the work type first: `Plan`, `Exec A`, `Exec B`, `Review`, `Arbitrate`, + `Repair A`, `QA`. +- Include the human purpose second: `Shell Pocket`, `Paper Mode`, `PPT Region`, + `Agent Panel`, `Safety QA`. +- Use compact branch/worktree names such as `codex/b12-a-shell-pocket`, + `codex/b12-b-paper`, `codex/b12-c-ppt`, `codex/b12-integrator`, and + `codex/b12-repair-a-paper`. +- Record machine ids separately in the ledger. Do not make humans infer purpose + from thread ids, pending worktree ids, UUIDs, or full artifact filenames. +- If a tool creates a pending worktree/thread with an opaque id, immediately map + it to a human label in the ledger and rename the visible thread when the tool + supports it. 
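
The naming rules above can be approximated by a small title checker. This is a rough sketch under stated assumptions: the function name, the treatment of six words as a hard upper bound, and the hex-run heuristic for opaque ids are all illustrative choices, not rules from the skill.

```python
import re
from typing import List

# Work-type prefixes named in the rules; a letter such as "A" follows as its own word.
WORK_TYPES = {"Plan", "Exec", "Review", "Arbitrate", "Repair", "QA"}

# Heuristic (assumption): an opaque id is a standalone run of 6+ hex characters
# that contains at least one digit, for example a thread id fragment.
OPAQUE_ID = re.compile(r"\b(?=[0-9a-f]*\d)[0-9a-f]{6,}\b", re.IGNORECASE)


def title_problems(title: str) -> List[str]:
    """Return naming problems for a visible thread title."""
    words = [w for w in title.split() if w != "-"]  # ignore separator dashes
    problems = []
    if not words or words[0] not in WORK_TYPES:
        problems.append("title should start with a work type")
    if len(words) > 6:
        problems.append("title is longer than 6 words")
    if OPAQUE_ID.search(title):
        problems.append("title contains an opaque id")
    return problems
```

Such a check is advisory at best. The rules say titles are "usually" 3-6 words, so a manager could log these problems rather than block lane creation.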
+ +Examples: + +| lane | good visible title | good branch | avoid | +|---|---|---|---| +| main integration | `Exec Integrator - B12` | `codex/b12-integrator` | `019eef...` | +| shell/pocket | `Exec A - Shell Pocket` | `codex/b12-a-shell-pocket` | `batch12-parallel-preview-first-research-workflow-shell-pocket-relaunch` | +| paper | `Exec B - Paper Mode` | `codex/b12-b-paper` | `worktree-lane-b-phase2-batch12-paper-manuscript-mode` | +| arbitration | `Arbitrate - B12 Repair` | `codex/b12-arbitration` | `standing-arbitration-lane-019eeec2` | + ## Lane Reuse Policy Lanes are persistent roles when continuity helps: planning/product, execution, arbitration/repair, manager, and dispatcher may reuse an existing visible thread for the same loop if the thread is not blocked, polluted by incompatible scope, or too stale to recover. Reuse is preferred when the lane benefits from local project memory and repeated setup would waste context. @@ -358,6 +392,64 @@ Rules: - If a helper discovers a plan gap, scope change, mutation/runtime need, or product-direction mismatch, the execution lane records the blocker and stops instead of silently expanding scope. +### Parallel Worktree Execution Lanes + +For large T4 execution slices, manager/dispatcher may split implementation across +isolated worktrees instead of asking one execution lane to do everything. Use this +when the work naturally separates by product surface, data/model contract, +runtime boundary, fixture set, visual QA surface, or test/evidence layer. + +Default shape: + +```text +Main Integrator lane + owns shared contracts, final merge/integration, full verification, consolidated report + +Parallel Worktree lanes A/B/C/... 
+  each owns one isolated worktree/branch, one write scope, one lane report
+
+Late QA/Safety lane
+  starts only after lane outputs or an integrated preview exist
+```
+
+Manager/dispatcher contract requirements:
+
+- name every worktree path, branch, lane role, write scope, and forbidden scope;
+- forbid concurrent edits to the same checkout/worktree;
+- require the integrator to define shared component/data/CSS boundaries before
+  parallel lanes depend on them, or explicitly state the stable boundaries from
+  existing code;
+- give each lane one expected artifact path and focused verification commands;
+- require lane reports to include touched files, tests run, known conflicts,
+  remaining integration work, safety check, and lifecycle;
+- require the integrator to consume lane reports/diffs, resolve conflicts, run
+  focused then full tests/build/browser evidence, and write one consolidated
+  execution report;
+- state the integration order, such as shell/data boundary -> feature surface A
+  -> feature surface B -> Agent/interaction -> QA/safety.
+
+Parallel lane rules:
+
+- A lane may edit only its assigned worktree and write scope.
+- A lane must not modify the product root, sibling worktrees, shared Git state,
+  or another lane's report.
+- A lane must not stage, commit, push, reset, stash, or run destructive cleanup
+  unless that operation is explicitly authorized in the handoff.
+- If a lane accidentally edits outside its worktree/scope, it must stop,
+  compare evidence, recover only clearly attributable out-of-scope edits, and
+  record `Boundary Incident / Recovery` in its report. If attribution is
+  uncertain, stop and escalate to manager instead of reverting.
+
+Integrator rules:
+
+- The integrator is the only lane that merges/selects patches from parallel
+  lanes.
+- The integrator must not invent completed lane work when lane reports or diffs
+  are absent; it records blockers or partial integration honestly.
+- The consolidated execution report must distinguish lane-local verification
+  from integrator verification, and must include git/worktree state for the
+  integration worktree plus any relevant boundary incidents.
+
 ## Phase Sizing And Review Cadence

 For substantial long-running work, spend more effort in planning and reduce review churn during execution.
@@ -376,7 +468,7 @@ Planning should define:
 If planning cannot state the strategic target, write `strategy-gap: ` and stop for a user checkpoint. Do not compensate by writing a longer execution task list.

-Execution should group work into larger slices when the pieces belong to the same user workflow. For example, a content production workflow might group research collection, material preparation, draft creation, editing, and QA into one or two coherent execution slices instead of five helper-sized review loops.
+Execution should group work into larger slices when the pieces belong to the same user workflow. Example for a biology learning workbench: daily guide/status command center, acquisition refresh facade, queue lifecycle commands, weekly project refresh, and PPT readiness command/report may be grouped into one or two coherent execution slices instead of five helper-sized review loops.

 An execution slice is too small for a full review loop when it only changes one helper, one internal function, or one local cleanup without landing a visible workflow state, product surface, durable contract, migration boundary, or risk boundary.
@@ -460,6 +552,43 @@ Arbitration lane should:
 - stay reusable for the active loop unless the full milestone closes, a replacement lane is confirmed, the lane is corrupted/stale, or the user explicitly asks.

+### Parallel Arbitration / Repair Worktrees
+
+Large repair phases may use isolated worktree repair lanes, but adjudication
+authority stays centralized. Use this when accepted findings span separable
+surfaces, many files, or conflicting patches that would overload one arbitration
+thread.
+
+Default shape:
+
+```text
+Chief Arbitration lane
+  owns findings disposition, repair contract, integration, final report
+
+Repair Worktree lanes A/B/C/...
+  each owns one accepted finding group or product surface
+
+Optional QA/Safety lane
+  checks integrated repairs after the chief arbitration lane has a preview
+```
+
+Rules:
+
+- Only the chief arbitration lane decides finding disposition:
+  `accept`, `reject`, `defer`, `third path`, or `needs more evidence`.
+- Repair lanes may implement only already-dispositioned, scoped repairs. They do
+  not independently reinterpret review findings or expand product direction.
+- Each repair lane needs its own worktree/branch, write scope, expected repair
+  report, focused verification, known conflict list, and stop condition.
+- The chief arbitration lane alone integrates repair outputs, resolves
+  conflicts, runs required verification, and writes arbitration/final artifacts.
+- If repair lanes reveal a plan gap, scope change, or uncertain evidence, they
+  stop and report back to chief arbitration. The chief lane decides whether to
+  gather evidence, return to planning, or defer.
+- Boundary incidents follow the same rule as execution worktrees: recover only
+  clearly attributable out-of-scope edits and record the incident in the repair
+  report, ledger, and final arbitration summary.
+
 ## Repair Routing

 After review finds a problem, route it by defect type:
@@ -498,6 +627,39 @@ Manager should:
 Manager must not silently execute business-code changes, overrule arbitration without evidence, or centralize all lane communication as hidden chat.

+### Manager / Planning Lane Rotation
+
+Long-lived manager and planning lanes can become coordination debt when chat
+history accumulates old heartbeats, closed batches, stale lane ids, and
+superseded routes. Rotate them deliberately before context corruption affects
+dispatch.
+
+Rotate a manager or planning lane when any of these are true:
+
+- repeated context compression or interrupted turns make current state
+  unreliable;
+- old route instructions, obsolete heartbeats, or stale lane ids keep resurfacing;
+- the lane has accumulated multiple completed phases and the next phase needs a
+  clean dispatch context;
+- the user reports the manager/planner is stuck, slow, confused, or reviving old
+  work;
+- the lane is near context limits and exact dirty state, active lanes, or
+  blockers matter.
+
+Rotation contract:
+
+- Write a baton/context pack before handoff. It must include loop id, roots,
+  canonical artifacts, active lanes, retired lanes, current phase status,
+  pending expected artifacts, safety boundaries, stale monitors to delete/update,
+  dirty git/worktree state, and next routing decision.
+- Record the rotation in state-feedback, worklog, and thread-ledger.
+- Mark the old manager/planning lane retired, archived, or checkpointed with
+  `next_expected_use: none` unless a specific future use is named.
+- The new lane must start artifact-first from the baton/context pack and current
+  ledgers, not from inherited chat memory.
+- Do not let both old and new manager lanes actively dispatch the same loop.
+  Overlapping managers are allowed only during explicit handoff validation.
+
 ## Dispatcher Role

 Dispatcher is a physical delivery role.