From 081257130acdd0dfbaaabf89c57c9505d2a228f9 Mon Sep 17 00:00:00 2001
From: Patrick Taylor <1963845+pstaylor-patrick@users.noreply.github.com>
Date: Sun, 21 Jun 2026 06:10:54 -0500
Subject: [PATCH 1/3] feat(pst): best-of-N parallel implementation tournament
 for refactors and extensions

Rule 24: spawn 2-5 parallel Sonnet agents in isolated worktrees, each with a
divergent strategy (Conservative, Structural, Extract-first); an Opus judge
picks the winner by MAINTAINABILITY.md criteria. /pst:refactor implements the
pattern with N scaled to complexity: trivial N=1 (no tournament), moderate
N=3, complex N=5.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019BCr5Qxi8jXnG2Rr7zdkw1
---
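Note (below the fold, not part of the commit message): the N-selection rule
in the new Step 4 can be sketched roughly as follows. This is a minimal
Python sketch, not code from the skill. `choose_n` is a hypothetical helper,
and checking the Complex tier first is an assumption, since the patch does
not say which tier wins when smell count and file count point at different
tiers.

```python
def choose_n(smell_count, file_count):
    """Return the tournament size N for a refactor (hypothetical helper).

    Thresholds are copied from Step 4 of the patch. Tier precedence is an
    assumption: the larger tier wins when the two counts disagree.
    """
    if smell_count >= 11 or file_count >= 6:
        return 5  # Complex
    if smell_count >= 4 or file_count >= 3:
        return 3  # Moderate
    return 1  # Trivial: single agent, no tournament


if __name__ == "__main__":
    for smells, files in [(2, 1), (5, 2), (2, 4), (12, 1)]:
        print(smells, files, choose_n(smells, files))
```

Under this reading, two smells spread over four files count as Moderate,
because the Moderate tier is joined by "or" while the Trivial tier needs
both counts to be small.
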
 skills/pst/SKILL.md          |   2 +
 skills/pst:refactor/SKILL.md | 106 +++++++++++++++++++++++++----------
 2 files changed, 78 insertions(+), 30 deletions(-)

diff --git a/skills/pst/SKILL.md b/skills/pst/SKILL.md
index aeebb03..171296e 100644
--- a/skills/pst/SKILL.md
+++ b/skills/pst/SKILL.md
@@ -135,6 +135,8 @@ rules a hook reminds about (non-blocking). Detail and examples are in
 
 23. **Maintainability review on every code change** `[NUDGE]`. After any change that touches at least one code file (non-docs, non-lockfile), run a Fowler-smell pass using `MAINTAINABILITY.md` as the rubric. This is a separate refactoring commit (two hats per rule 15): behavior stays identical, only structure improves. The pass is Haiku-tier since it is lightweight. The only exemption is a changeset that touches zero code files (docs-only, lockfile-only, or pure config values). A small diff does not exempt a change -- smells manifest at any size. Run with `/pst:adversarial-review` or inline. See `MAINTAINABILITY.md` for the 16 canonical smells.
 
+24. **Best-of-N implementation tournament** `[NUDGE]`. For substantive refactors or extensions where multiple valid implementations exist, spawn 2-5 parallel Sonnet agents in isolated worktrees, each with a divergent strategy: Conservative (smallest diff, no new abstractions), Structural (reorganize by responsibility), Extract-first (name every abstraction). For N=5, add Domain-model and Functional strategies. After all agents complete, an Opus judge selects the winner using MAINTAINABILITY.md criteria: cohesion, coupling, intent clarity, and locality of change. Skip for trivial one-way fixes where a single correct path is obvious. Rule 15 two-hats still applies to every implementation agent. `/pst:refactor` implements this pattern.
+
 ## Usage
 
 `/pst` activates, `/pst off` disarms. Mechanics, merge modes, and rule detail are
diff --git a/skills/pst:refactor/SKILL.md b/skills/pst:refactor/SKILL.md
index 8200fc0..69e63e9 100644
--- a/skills/pst:refactor/SKILL.md
+++ b/skills/pst:refactor/SKILL.md
@@ -85,41 +85,87 @@ If `--report-only` was passed, stop here.
 Use `AskUserQuestion`: **Apply all fixes in isolated worktrees?** (Yes / Report
 only). Treat no objection as Yes.
 
-## 4. Parallel Sonnet implementers (isolated worktrees)
-
-Group findings by file. Spawn one background Sonnet agent per file
-(`model: sonnet`, `isolation: worktree`) with:
-
-- The file content.
-- The smell findings for that file.
-- Instruction:
+## 4. Plan gate (foreground)
+
+Determine N from the smell count and file count:
+
+- **Trivial** (1-3 smells, 1-2 files): N=1 -- single Sonnet, skip tournament.
+- **Moderate** (4-10 smells or 3-5 files): N=3.
+- **Complex** (11+ smells or 6+ files): N=5.
+
+Present smell count and N to the user via `AskUserQuestion`: "Run tournament
+with N=\<N\> implementations?" (Yes / Report only / Adjust N). If
+`--report-only` was passed, stop here without asking.
+
+## 5. Parallel implementation tournament (background)
+
+Spawn N background Sonnet agents (`model: sonnet`, `isolation: worktree`),
+each receiving the same smell findings and target files but a different
+strategy directive:
+
+- **Strategy A -- Conservative**: Fix only the highest-impact smells with the
+  smallest possible diff. Prefer inlining over new abstractions when the call
+  site is nearby. Preserve the existing module and file structure entirely.
+- **Strategy B -- Structural**: Reorganize by responsibility. Group related
+  behavior together even if it means creating new files or moving methods
+  between classes. Optimize for locality of future change.
+- **Strategy C -- Extract-first**: Create a named function, class, or module
+  for every smell instance. Err toward more abstractions with clear names.
+  Every duplicated concept gets a home.
+
+If N=5, add:
+
+- **Strategy D -- Domain-model**: Look for primitive obsession and data clumps;
+  introduce domain objects or value types to make implicit business concepts
+  explicit.
+- **Strategy E -- Functional**: Prefer pure functions and immutable data where
+  the language supports it. Move state to the edges. Reduce side-effect surface.
+
+Each agent must:
+
+1. Apply its strategy to fix the smells in the target files.
+2. Run existing tests if present; skip any fix that breaks a test and note it.
+3. Stage and commit: `git add <changed files> && git commit -m "refactor(<strategy-name>): <primary smell fixed>"`.
+4. Include co-author trailer: `Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>`.
+
+## 6. Opus judge selects winner (background)
+
+After all N agents complete, spawn one background Opus agent (`model: opus`)
+with:
+
+- All N diffs (collected via `git show HEAD` in each worktree).
+- The smell findings from step 2.
+- The content of `MAINTAINABILITY.md`.
+- Instruction: evaluate each implementation against the six MAINTAINABILITY.md
+  outcomes (Higher Cohesion, Lower Coupling, Explicit Intent, Locality of
+  Change, Reduced Cognitive Load, Strong Domain Modeling). Score each 1-5 on
+  each dimension. Pick the winner.
+
+Return schema:
+
+```json
+{
+  "winner": "Conservative|Structural|Extract-first|Domain-model|Functional",
+  "scores": { "A": 0, "B": 0, "C": 0 },
+  "reasoning": "one sentence"
+}
+```
 
-  ```
-  Apply each refactoring listed below to this file. Rules:
-  - Behavior-preserving only. Do not change any observable behavior.
-  - Run existing tests if a test runner is available; skip any fix that
-    breaks a test and note it.
-  - After all fixes: stage the changed file and commit:
-      git add <file>
-      git commit -m "refactor(<file>): <primary smell fixed>"
-    Include the co-author trailer:
-      Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
-  - If no fix is safe to apply, commit nothing and return a skip reason.
-  ```
+## 7. Apply and report
 
-## 5. Report
+Cherry-pick the winning commit to the current branch:
 
-After all worktree agents complete:
+```sh
+git cherry-pick <winning-commit-sha>
+```
 
-- List each file: smells fixed, smells skipped (with reason), test result.
-- If any fix broke a test, name the fix and the failing test.
-- State total smells fixed vs. skipped.
+Report: winning strategy, Opus reasoning, scores for each strategy, smells
+fixed vs. skipped, and test results.
 
 ## Notes
 
-- Rule 15 two-hats: every commit from this skill is a pure refactor. If a
-  fix requires a behavior change, skip it and surface it as a follow-up.
-- Rule 7 applies if you intend to merge the result via PR: adversarial review
-  before merge.
-- For very large codebases (> 50 files), run with a scoped path first:
+- Rule 24 best-of-N: N=1 for trivial, N=3 for moderate, N=5 for complex.
+- Rule 15 two-hats applies to every implementation agent: no behavior changes.
+- Rule 7 applies if the result goes to a PR: adversarial review before merge.
+- For very large codebases (50+ files), scope with a path first:
   `/pst:refactor src/auth` then widen.

From 24405e07f0814429a8b62695715d90007ea63cdd Mon Sep 17 00:00:00 2001
From: Patrick Taylor <1963845+pstaylor-patrick@users.noreply.github.com>
Date: Sun, 21 Jun 2026 06:19:07 -0500
Subject: [PATCH 2/3] fix(pst:refactor): resolve 5 adversarial review findings
 in best-of-N tournament

- Step 5: switch from background to foreground parallel agents so
  synchronization is implicit (no explicit gate needed before judge)
- Each agent returns a structured ---tournament-result--- block with
  STRATEGY, STATUS, COMMIT_SHA, and full git diff -- no worktree
  access required after agent completes
- Step 6: judge reads diffs from result blocks directly; returns winning
  letter (A-E) not strategy name; SHA lookup stays in orchestrator
- Step 7: cherry-pick by SHA from result blocks, not from judge output;
  fall back to git apply on conflict
- Failed agents emit STATUS: skipped: <reason> so errors surface
- Rule 24: trimmed to normative pointer only; impl details live in skill

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019BCr5Qxi8jXnG2Rr7zdkw1
---
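Note (below the fold, not part of the commit message): one way the
orchestrator might parse the result block and keep the SHA lookup on its own
side is sketched below. This is a hedged Python sketch under stated
assumptions: `parse_result` and `sha_for_winner` are hypothetical names, and
the sketch assumes the block appears once per agent response with the field
order shown in Step 5.

```python
import re

# Markers and field names are taken from the Step 5 result block.
BLOCK_RE = re.compile(
    r"---tournament-result---\n(.*?)\n---end-tournament-result---",
    re.DOTALL,
)


def parse_result(output):
    """Extract one result block from an agent response, or None if absent."""
    match = BLOCK_RE.search(output)
    if match is None:
        return None
    # Everything after the DIFF: line is the verbatim diff; the rest is fields.
    head, sep, diff = match.group(1).partition("\nDIFF:\n")
    fields = {}
    for line in head.splitlines():
        key, _, value = line.partition(": ")
        fields[key] = value
    return {
        "strategy": fields.get("STRATEGY"),
        "status": fields.get("STATUS"),
        "sha": fields.get("COMMIT_SHA"),
        "diff": diff if sep else None,
    }


def sha_for_winner(results, winner):
    """Look up the winning SHA from parsed blocks, never from judge output."""
    for result in results:
        if result and result["strategy"] == winner and result["status"] == "committed":
            return result["sha"]
    raise KeyError(f"no committed result for strategy {winner}")
```

Splitting the status on the first `": "` only keeps a skip reason such as
`skipped: tests failed` intact in the `status` field.
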
 skills/pst/SKILL.md          |   2 +-
 skills/pst:refactor/SKILL.md | 108 ++++++++++++++++++++++++-----------
 2 files changed, 76 insertions(+), 34 deletions(-)

diff --git a/skills/pst/SKILL.md b/skills/pst/SKILL.md
index 171296e..f2a4376 100644
--- a/skills/pst/SKILL.md
+++ b/skills/pst/SKILL.md
@@ -135,7 +135,7 @@ rules a hook reminds about (non-blocking). Detail and examples are in
 
 23. **Maintainability review on every code change** `[NUDGE]`. After any change that touches at least one code file (non-docs, non-lockfile), run a Fowler-smell pass using `MAINTAINABILITY.md` as the rubric. This is a separate refactoring commit (two hats per rule 15): behavior stays identical, only structure improves. The pass is Haiku-tier since it is lightweight. The only exemption is a changeset that touches zero code files (docs-only, lockfile-only, or pure config values). A small diff does not exempt a change -- smells manifest at any size. Run with `/pst:adversarial-review` or inline. See `MAINTAINABILITY.md` for the 16 canonical smells.
 
-24. **Best-of-N implementation tournament** `[NUDGE]`. For substantive refactors or extensions where multiple valid implementations exist, spawn 2-5 parallel Sonnet agents in isolated worktrees, each with a divergent strategy: Conservative (smallest diff, no new abstractions), Structural (reorganize by responsibility), Extract-first (name every abstraction). For N=5, add Domain-model and Functional strategies. After all agents complete, an Opus judge selects the winner using MAINTAINABILITY.md criteria: cohesion, coupling, intent clarity, and locality of change. Skip for trivial one-way fixes where a single correct path is obvious. Rule 15 two-hats still applies to every implementation agent. `/pst:refactor` implements this pattern.
+24. **Best-of-N implementation tournament** `[NUDGE]`. For substantive refactors or extensions where multiple valid implementations exist, spawn 2-5 parallel Sonnet agents in isolated worktrees with divergent strategies, then have an Opus judge pick the winner using MAINTAINABILITY.md criteria. Skip for trivial one-way fixes. Rule 15 two-hats applies. `/pst:refactor` is the full implementation protocol.
 
 ## Usage
 
diff --git a/skills/pst:refactor/SKILL.md b/skills/pst:refactor/SKILL.md
index 69e63e9..c477593 100644
--- a/skills/pst:refactor/SKILL.md
+++ b/skills/pst:refactor/SKILL.md
@@ -97,70 +97,112 @@ Present smell count and N to the user via `AskUserQuestion`: "Run tournament
 with N=\<N\> implementations?" (Yes / Report only / Adjust N). If
 `--report-only` was passed, stop here without asking.
 
-## 5. Parallel implementation tournament (background)
+## 5. Parallel implementation tournament
 
-Spawn N background Sonnet agents (`model: sonnet`, `isolation: worktree`),
-each receiving the same smell findings and target files but a different
+Spawn N **foreground** Sonnet agents (`model: sonnet`, `isolation: worktree`) in
+the **same response turn** so they run concurrently. Do NOT set
+`run_in_background: true` -- all N must complete before Step 6 begins, and
+synchronization is implicit when they are foreground.
+
+Each agent receives the same smell findings and target files but a different
 strategy directive:
 
-- **Strategy A -- Conservative**: Fix only the highest-impact smells with the
-  smallest possible diff. Prefer inlining over new abstractions when the call
-  site is nearby. Preserve the existing module and file structure entirely.
-- **Strategy B -- Structural**: Reorganize by responsibility. Group related
-  behavior together even if it means creating new files or moving methods
-  between classes. Optimize for locality of future change.
-- **Strategy C -- Extract-first**: Create a named function, class, or module
-  for every smell instance. Err toward more abstractions with clear names.
-  Every duplicated concept gets a home.
+- **A -- Conservative**: Fix only the highest-impact smells with the smallest
+  possible diff. Prefer inlining over new abstractions. Preserve existing module
+  and file structure entirely.
+- **B -- Structural**: Reorganize by responsibility. Group related behavior
+  together even if it means creating new files or moving methods. Optimize for
+  locality of future change.
+- **C -- Extract-first**: Create a named function, class, or module for every
+  smell instance. Err toward more abstractions with clear names.
 
 If N=5, add:
 
-- **Strategy D -- Domain-model**: Look for primitive obsession and data clumps;
-  introduce domain objects or value types to make implicit business concepts
-  explicit.
-- **Strategy E -- Functional**: Prefer pure functions and immutable data where
-  the language supports it. Move state to the edges. Reduce side-effect surface.
+- **D -- Domain-model**: Look for primitive obsession and data clumps; introduce
+  domain objects or value types to make implicit business concepts explicit.
+- **E -- Functional**: Prefer pure functions and immutable data where the
+  language supports it. Move state to the edges. Reduce side-effect surface.
+
+Each agent must end its response with exactly this block so the orchestrator
+can collect results without accessing the worktree afterward:
+
+```
+---tournament-result---
+STRATEGY: <A|B|C|D|E>
+STATUS: committed
+COMMIT_SHA: <full 40-char sha from: git rev-parse HEAD>
+DIFF:
+<output of: git diff HEAD~1..HEAD>
+---end-tournament-result---
+```
+
+If the agent cannot commit (tests fail, conflict, or other error), emit:
 
-Each agent must:
+```
+---tournament-result---
+STRATEGY: <A|B|C|D|E>
+STATUS: skipped: <reason in one line>
+---end-tournament-result---
+```
+
+Steps each agent must follow:
 
 1. Apply its strategy to fix the smells in the target files.
 2. Run existing tests if present; skip any fix that breaks a test and note it.
-3. Stage and commit: `git add <changed files> && git commit -m "refactor(<strategy-name>): <primary smell fixed>"`.
+3. Stage and commit: `git add <changed files> && git commit -m "refactor(<strategy>): <primary smell fixed>"`.
 4. Include co-author trailer: `Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>`.
+5. Run `git rev-parse HEAD` and `git diff HEAD~1..HEAD`; include both verbatim in the result block.
 
-## 6. Opus judge selects winner (background)
+## 6. Opus judge selects winner
 
-After all N agents complete, spawn one background Opus agent (`model: opus`)
-with:
+After all N agents return (foreground means they are all done before this
+step begins), parse each `---tournament-result---` block. Collect the SHA and
+diff for every agent with `STATUS: committed`. If zero agents committed, report
+all skip reasons and stop.
 
-- All N diffs (collected via `git show HEAD` in each worktree).
-- The smell findings from step 2.
+Spawn one **foreground** Opus agent (`model: opus`) with:
+
+- All committed diffs (from the result blocks -- no worktree access needed).
+- The smell findings from Step 2.
 - The content of `MAINTAINABILITY.md`.
-- Instruction: evaluate each implementation against the six MAINTAINABILITY.md
-  outcomes (Higher Cohesion, Lower Coupling, Explicit Intent, Locality of
-  Change, Reduced Cognitive Load, Strong Domain Modeling). Score each 1-5 on
-  each dimension. Pick the winner.
+- Instruction: evaluate each diff against the six MAINTAINABILITY.md outcomes
+  (Higher Cohesion, Lower Coupling, Explicit Intent, Locality of Change,
+  Reduced Cognitive Load, Strong Domain Modeling). Score each 1-5 per
+  dimension. Return the winning strategy letter.
 
 Return schema:
 
 ```json
 {
-  "winner": "Conservative|Structural|Extract-first|Domain-model|Functional",
-  "scores": { "A": 0, "B": 0, "C": 0 },
+  "winner": "A|B|C|D|E",
+  "scores": {
+    "A": {
+      "cohesion": 0,
+      "coupling": 0,
+      "intent": 0,
+      "locality": 0,
+      "cognitive_load": 0,
+      "domain_model": 0
+    }
+  },
   "reasoning": "one sentence"
 }
 ```
 
 ## 7. Apply and report
 
-Cherry-pick the winning commit to the current branch:
+Look up the SHA for the winning strategy from the result blocks collected in
+Step 5 (not from the Opus judge). Cherry-pick it onto the current branch:
 
 ```sh
-git cherry-pick <winning-commit-sha>
+git cherry-pick <sha-from-winning-agent>
 ```
 
+If cherry-pick fails due to conflict, fall back to `git apply` with the
+winning diff from the result block.
+
 Report: winning strategy, Opus reasoning, scores for each strategy, smells
-fixed vs. skipped, and test results.
+fixed vs. skipped, test results, and any agents that skipped with their reason.
 
 ## Notes
 

From 021f72116a26bc06762482fc41f4c4c8e5333af5 Mon Sep 17 00:00:00 2001
From: Patrick Taylor <1963845+pstaylor-patrick@users.noreply.github.com>
Date: Sun, 21 Jun 2026 06:24:19 -0500
Subject: [PATCH 3/3] fix(pst:refactor): address 6 more adversarial findings in
 tournament protocol

- Step 2 Opus was marked 'background', causing Step 3 to run before findings
  arrived; changed to foreground so smell analysis is complete before gate
- N=1 trivial path was undefined; now routes directly to strategy A, no judge
- Co-author trailer commit format specified as HEREDOC to preserve blank line
- git apply fallback specifies write-to-temp-file via Write tool first
- Opus judge schema shows all N strategies, not just A; added instruction
  to include every strategy that ran
- Absent result block (agent crash/overflow) now treated as skipped

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019BCr5Qxi8jXnG2Rr7zdkw1
---
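Note (below the fold, not part of the commit message): the schema fix (a
scores entry for every strategy that ran) suggests a check the orchestrator
could run on the judge's reply before Step 7. The sketch below is Python and
is only an illustration: `check_verdict` is a hypothetical helper, the
dimension keys are copied from the return schema in the skill, and the 1-5
range comes from the judge instruction.

```python
import json

# Score keys as they appear in the judge return schema.
DIMENSIONS = (
    "cohesion",
    "coupling",
    "intent",
    "locality",
    "cognitive_load",
    "domain_model",
)


def check_verdict(raw, committed):
    """Parse the judge's JSON and reject incomplete verdicts.

    `committed` is the set of strategy letters whose agents reported
    STATUS: committed. Hypothetical helper, not part of the skill.
    """
    verdict = json.loads(raw)
    scores = verdict["scores"]
    missing = sorted(set(committed) - set(scores))
    if missing:
        raise ValueError("judge omitted scores for: " + ", ".join(missing))
    if verdict["winner"] not in committed:
        raise ValueError("winner did not commit: " + str(verdict["winner"]))
    for letter in committed:
        for dim in DIMENSIONS:
            value = scores[letter].get(dim)
            if not isinstance(value, int) or not 1 <= value <= 5:
                raise ValueError(f"bad score for {letter}.{dim}: {value!r}")
    return verdict
```

The check deliberately does not require the winner to have the highest
total, since the skill leaves the final pick to the judge.
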
 skills/pst:refactor/SKILL.md | 64 +++++++++++++++++++++++++++++-------
 1 file changed, 52 insertions(+), 12 deletions(-)

diff --git a/skills/pst:refactor/SKILL.md b/skills/pst:refactor/SKILL.md
index c477593..161ee01 100644
--- a/skills/pst:refactor/SKILL.md
+++ b/skills/pst:refactor/SKILL.md
@@ -34,9 +34,9 @@ git ls-files | grep -Ev '\.(lock|snap|min\.(js|css)|pb\.go|pb_test\.go)$' \
   | grep -E '\.(rb|py|ts|tsx|js|jsx|go|rs|java|kt|swift|ex|exs|cs|cpp|c|h)$'
 ```
 
-## 2. Opus smell analysis (background)
+## 2. Opus smell analysis
 
-Spawn one background Opus agent (`model: opus`) with:
+Spawn one **foreground** Opus agent (`model: opus`) with:
 
 - The full content of `MAINTAINABILITY.md` (read via the path resolved above).
 - The list of code files from step 1.
@@ -89,7 +89,10 @@ only). Treat no objection as Yes.
 
 Determine N from the smell count and file count:
 
-- **Trivial** (1-3 smells, 1-2 files): N=1 -- single Sonnet, skip tournament.
+- **Trivial** (1-3 smells, 1-2 files): N=1 -- single Sonnet agent, skip
+  tournament. Spawn one foreground Sonnet agent using strategy A (Conservative).
+  No judge step; cherry-pick its commit directly in Step 7. If it cannot commit,
+  report the reason and stop.
 - **Moderate** (4-10 smells or 3-5 files): N=3.
 - **Complex** (11+ smells or 6+ files): N=5.
 
@@ -149,16 +152,28 @@ Steps each agent must follow:
 
 1. Apply its strategy to fix the smells in the target files.
 2. Run existing tests if present; skip any fix that breaks a test and note it.
-3. Stage and commit: `git add <changed files> && git commit -m "refactor(<strategy>): <primary smell fixed>"`.
-4. Include co-author trailer: `Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>`.
-5. Run `git rev-parse HEAD` and `git diff HEAD~1..HEAD`; include both verbatim in the result block.
+3. Stage and commit using a HEREDOC so the trailer blank line is preserved:
+
+   ```sh
+   git add <changed files>
+   git commit -m "$(cat <<'EOF'
+   refactor(<strategy>): <primary smell fixed>
+
+   Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
+   EOF
+   )"
+   ```
+
+4. Run `git rev-parse HEAD` and `git diff HEAD~1..HEAD`; include both verbatim
+   in the result block.
 
 ## 6. Opus judge selects winner
 
 After all N agents return (foreground means they are all done before this
-step begins), parse each `---tournament-result---` block. Collect the SHA and
-diff for every agent with `STATUS: committed`. If zero agents committed, report
-all skip reasons and stop.
+step begins), parse each `---tournament-result---` block. If an agent's output
+contains no result block at all, treat it as `STATUS: skipped: result block
+missing`. Collect the SHA and diff for every agent with `STATUS: committed`. If
+zero agents committed, report all skip reasons and stop.
 
 Spawn one **foreground** Opus agent (`model: opus`) with:
 
@@ -170,7 +185,8 @@ Spawn one **foreground** Opus agent (`model: opus`) with:
   Reduced Cognitive Load, Strong Domain Modeling). Score each 1-5 per
   dimension. Return the winning strategy letter.
 
-Return schema:
+Return schema (include a scores entry for every strategy letter that appears
+in the committed diffs you received -- do not omit strategies that ran):
 
 ```json
 {
@@ -183,6 +199,22 @@ Return schema:
       "locality": 0,
       "cognitive_load": 0,
       "domain_model": 0
+    },
+    "B": {
+      "cohesion": 0,
+      "coupling": 0,
+      "intent": 0,
+      "locality": 0,
+      "cognitive_load": 0,
+      "domain_model": 0
+    },
+    "C": {
+      "cohesion": 0,
+      "coupling": 0,
+      "intent": 0,
+      "locality": 0,
+      "cognitive_load": 0,
+      "domain_model": 0
     }
   },
   "reasoning": "one sentence"
@@ -198,8 +230,16 @@ Step 5 (not from the Opus judge). Cherry-pick it onto the current branch:
 git cherry-pick <sha-from-winning-agent>
 ```
 
-If cherry-pick fails due to conflict, fall back to `git apply` with the
-winning diff from the result block.
+If cherry-pick fails due to conflict, write the winning diff to a temp file
+and apply it:
+
+```sh
+# Abort the conflicted cherry-pick so the tree is clean, then apply the diff
+git cherry-pick --abort && git apply /tmp/pst-refactor-winner.patch
+```
+
+Use the Write tool to write the diff string to `/tmp/pst-refactor-winner.patch`
+before running that command.
 
 Report: winning strategy, Opus reasoning, scores for each strategy, smells
 fixed vs. skipped, test results, and any agents that skipped with their reason.