From 1337416c29c2c968bae84bb04d70eb3ee4122a79 Mon Sep 17 00:00:00 2001 From: tdwhere123 Date: Sun, 21 Jun 2026 15:53:49 +0800 Subject: [PATCH] feat(quality): sharpen do-it's write-lifecycle quality spine MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Make installed agents write high-quality code by default: tighten the write-before -> write-during -> write-after spine and compress the two heaviest skills so the agent actually reads them. The shared 'write less' primitive is a single decision ladder, reused across skills, not restated. - router: add the decision ladder to § Restraint (does it need to exist? -> stdlib -> native -> existing dep -> one line -> minimal) with never-cut guardrails; grill/brainstorm/lint/YAGNI-lens reference it. - grill: rewrite 290 -> 88-line core + references/checks.md. One core loop at top: necessity first, falsify by exploring not asking, one decision at a time with a recommended answer, checkable+exhaustive completion. Capabilities kept. - brainstorm: rewrite 293 -> 110-line core + references/artifact-format.md. Two viewpoints run inline by default, fan out only at Heavy/explicit (capability unchanged, default cost removed); options map along the ladder; pick-a-branch + fog-of-war skip cut over-firing. - anti-patterns-lint: no-consumer family also flags speculative interface/type/ abstract-class exports as ladder rung 1; family set stays closed per Restraint. - code-quality-cleaner: maintainability + YAGNI/over-engineering with a bounded checklist and a tagged finding format (delete:/stdlib:/native:/yagni:/shrink:, severity preserved, net: -N verdict); wired into review-loop as the YAGNI lens. - fix-loop already mandates batch-by-root-cause (verified, unchanged). Tests: build:codex-plugin regenerates plugins/index mirrors; npm run test green (agents, skill-links, hooks incl. 2 new anti-patterns-lint cases, install). Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 43 ++- agents/code-quality-cleaner.toml | 38 ++- commands/do-it-brainstorm.md | 27 +- hooks/anti-patterns-lint.sh | 27 +- index.json | 6 +- .../do-it/agents/code-quality-cleaner.toml | 38 ++- plugins/do-it/skills/_index.md | 4 +- .../do-it/skills/do-it-brainstorm/SKILL.md | 279 +++-------------- .../references/artifact-format.md | 89 ++++++ plugins/do-it/skills/do-it-grill/SKILL.md | 288 +++--------------- .../skills/do-it-grill/references/checks.md | 52 ++++ .../do-it/skills/do-it-review-loop/SKILL.md | 17 +- plugins/do-it/skills/do-it-router/SKILL.md | 15 + skills/do-it/do-it-brainstorm/SKILL.md | 279 +++-------------- .../references/artifact-format.md | 89 ++++++ skills/do-it/do-it-grill/SKILL.md | 288 +++--------------- skills/do-it/do-it-grill/references/checks.md | 52 ++++ skills/do-it/do-it-review-loop/SKILL.md | 17 +- skills/do-it/do-it-router/SKILL.md | 15 + tests/hooks/anti-patterns-lint.test.sh | 37 +++ 20 files changed, 691 insertions(+), 1009 deletions(-) create mode 100644 plugins/do-it/skills/do-it-brainstorm/references/artifact-format.md create mode 100644 plugins/do-it/skills/do-it-grill/references/checks.md create mode 100644 skills/do-it/do-it-brainstorm/references/artifact-format.md create mode 100644 skills/do-it/do-it-grill/references/checks.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 247f256..9b1bccf 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,7 +2,48 @@ ## Unreleased -- No unreleased changes. +Theme: make installed agents write high-quality code by sharpening do-it's +write-before → write-during → write-after quality spine, and compress the two +heaviest skills so the agent actually reads them. The shared "write less" +primitive is a single decision ladder, reused — not restated — across skills. + +### Added + +- `do-it-router` § Restraint now carries **the decision ladder** (does it need to + exist? → stdlib → native → existing dependency → one line → minimal custom), + with guardrails it never cuts. It is the shared anti-over-engineering primitive + that grill, brainstorm, the write-time lint, and the YAGNI review lens all + reference instead of each defining their own. +- `do-it-review-loop` gains a **YAGNI lens** (`code-quality-cleaner`), default at + Heavy when a diff adds a new abstraction/export/dependency. It emits one-line + findings with a closed tag vocabulary (`delete:` / `stdlib:` / `native:` / + `yagni:` / `shrink:`), each carrying a severity, plus a quantified + `net: - lines possible` / `Lean already. Ship.` verdict. + +### Changed + +- **grill** rewritten and slimmed (290 → 88-line core + a 52-line `references/` + file behind progressive disclosure). The core loop is stated once, at the top: + necessity first (decision-ladder rung 1), falsify cheaply by exploring code + instead of asking, ask one decision at a time with a recommended answer, on a + checkable-and-exhaustive completion criterion. The lens taxonomy, + rationalizations, red flags, severity, and common-mistakes lists moved to + `references/checks.md`; the grill-log artifact is de-ceremonied but retains the + same capability. +- **brainstorm** rewritten and slimmed (293 → 110-line core + an 89-line + `references/artifact-format.md`). The two viewpoints (product + architecture) + now run **inline by default** and fan out to independent subagents only at + Heavy tier or on explicit request — capability unchanged, default cost removed. + Options map along the decision ladder with a deletion-over-addition bias; a + pick-a-branch classifier and a fog-of-war skip cut over-firing. +- `anti-patterns-lint.sh` no-consumer family now also flags speculative + `export interface` / `type` / `abstract class` declarations nothing references, + framed as decision-ladder rung 1. The family set stays closed by design (per + § Restraint); richer YAGNI checks live in the on-demand review lens, not the + write-time hook. +- `code-quality-cleaner` agent extended from maintainability-only to + maintainability + YAGNI/over-engineering, with a bounded checklist and the + tagged finding format above. ## 0.11.0 diff --git a/agents/code-quality-cleaner.toml b/agents/code-quality-cleaner.toml index c8addf7..5777516 100644 --- a/agents/code-quality-cleaner.toml +++ b/agents/code-quality-cleaner.toml @@ -1,33 +1,43 @@ name = "code-quality-cleaner" -description = "Use during do-it-review-loop for read-only maintainability review of redundancy, dead paths, weak abstractions, and cleanup risk." +description = "Use during do-it-review-loop for read-only maintainability and YAGNI review of redundancy, dead paths, single-use or speculative abstractions, reinvented stdlib, and cleanup risk." sandbox_mode = "read-only" developer_instructions = """ -Operate as the do-it maintainability review lens. Stay read-only. +Operate as the do-it maintainability and YAGNI review lens. Stay read-only. Default to Standard slice; never self-escalate to Heavy without explicit assignment. Full dispatch contract: see `do-it-subagent-orchestration` § Required Prompt Contract. Purpose: -- find cleanup issues that create correctness, review, or maintenance risk -- keep the delivered diff smaller and easier to support +- find cleanup and over-engineering issues that create correctness, review, or maintenance risk +- keep the delivered diff smaller and easier to support — the best code is the code that was never written - avoid style-only commentary Workflow: 1. Inspect the actual diff or changed files named by the parent. -2. Look for duplicated logic, stale helpers, dead paths, avoidable abstraction, and brittle tests. -3. Check whether local patterns or helpers were missed. -4. Report only issues with a real consequence. -5. Recommend the smallest cleanup that removes the risk. +2. Maintainability scan: duplicated logic, stale helpers, dead paths, brittle tests, and local patterns or helpers that were missed. +3. YAGNI / over-engineering scan against the decision ladder (see `do-it-router` § Restraint). Bounded checklist: + - an interface, abstract class, or `type` with a single implementation; + - a factory, builder, or registry producing exactly one product; + - a wrapper or adapter that only forwards to one callee; + - a module, hook, or config flag exporting a single unused item; + - hand-rolled code the stdlib or platform already provides; + - speculative parameters, generality, or `Phase 2` scaffolding with no current caller. +4. Report only issues with a real consequence; recommend the smallest change that removes the risk — prefer deletion over addition, boring over clever. + +Never cut the guardrails: input validation at trust boundaries, error handling that prevents data loss, security, accessibility, or behavior the task explicitly required. Lazy, not negligent. Severity: -- Blocking: quality issue can cause wrong behavior, broken tests, data loss, or contract breakage. -- Important: likely future bug, meaningful duplication, misleading abstraction, or brittle coverage. -- Opportunity: local clarity with low delivery risk. +- Blocking: the issue can cause wrong behavior, broken tests, data loss, or contract breakage. +- Important: likely future bug, meaningful duplication, or misleading abstraction. +- Opportunity: local simplification with low delivery risk (most YAGNI findings land here unless tied to correctness). + +Finding format — one line each, ordered by severity, using a CLOSED tag vocabulary: + [] L: : . . +Tags: `delete:` (dead or unused — remove it) · `stdlib:` (reinvents the standard library — use it) · `native:` (reinvents a native language or runtime feature) · `yagni:` (speculative abstraction with one or zero users — inline it) · `shrink:` (works but larger than needed — collapse it). Every finding states both what to cut AND what replaces it, and carries a severity so the batch-emission and closeout gates still function. Do not invent new tags. Token discipline: - do not audit untouched subsystems -- cite exact lines or diff evidence when possible +- cite exact lines or diff evidence - keep findings short and action-oriented -- say when no cleanup finding justifies action -Return findings first, ordered by severity, with evidence, consequence, smallest fix, and residual risk. +End with a quantified verdict: `net: - lines possible` when cuts exist, or `Lean already. Ship.` when none do. Return findings first, ordered by severity, each with evidence and smallest fix, then residual risk. """ diff --git a/commands/do-it-brainstorm.md b/commands/do-it-brainstorm.md index fc44db6..abc9dee 100644 --- a/commands/do-it-brainstorm.md +++ b/commands/do-it-brainstorm.md @@ -1,37 +1,38 @@ --- -description: 在编码前用产品边界 + 架构地基双核心澄清需求形态,并按任务动态补充 UX、用户、Ops、安全、领域语言或计划视角。 +description: 在编码前用产品视角 + 架构视角双核心澄清需求形态与选项,沿决策阶梯发散,并把需要收敛的事项交给 do-it-grill。 --- # /do-it-brainstorm -对当前任务先做 divergent brainstorm,搞清楚需求到底是什么、可能长什么样,再把需要收敛的事项交给 `do-it-grill`。默认运行两个核心视角: +对当前任务先做 divergent brainstorm,搞清楚需求到底是什么、可能长什么样、有哪些选项,再把需要收敛的事项交给 `do-it-grill`。每一遍都由两个核心视角支撑: -- `product-strategist`:产品边界、核心目标、需求形态、多个选项及其好处 / 坏处 / 风险。 -- `architecture-strategist`:核心底层、扩展模块、阶段闭环、边界、验证路线。 +- **产品视角**(`product-strategist`):产品边界、核心目标、需求形态、多个选项及其好处 / 坏处 / 风险。 +- **架构视角**(`architecture-strategist`):核心底层、扩展模块、阶段闭环、边界、验证路线。 -根据任务再动态补充 `ux-designer`、`end-user-advocate`、`ops-sre`、`ceo-reviewer`、`red-team-reviewer`、`domain-language-reviewer` 或 `plan-challenger`。旧的四补充视角组合不再是默认固定流程。 +两个视角固定存在,但**默认由主线程内联推演**;只有 Heavy 或你显式要求时,才真正扇出成独立只读子智能体。按任务再动态补充 `ux-designer`、`end-user-advocate`、`ops-sre`、`ceo-reviewer`、`red-team-reviewer`、`domain-language-reviewer` 或 `plan-challenger`(零到两个,只在能改变实现路线时加)。 ## 调用方式 -- `/do-it-brainstorm` — 默认双核心;由 router / 当前任务决定是否补充视角。 +- `/do-it-brainstorm` — 默认双视角内联;由 router / 当前任务决定是否补充视角与是否扇出。 - `/do-it-brainstorm product,architecture,ops` — 显式指定 lenses(逗号分隔)。 -- `/do-it-brainstorm lenses=ux,end-user,red-team` — 在双核心之外补充指定视角。 -- `/do-it-brainstorm discuss` — 项目上下文还不足时先讨论问题空间,不强制写 artifact。 -- `/do-it-brainstorm all` — 运行双核心 + 所有适用补充视角;只在 Heavy 或用户明确要求时使用。 +- `/do-it-brainstorm lenses=ux,end-user,red-team` — 在双视角之外补充指定视角。 +- `/do-it-brainstorm discuss` — 只做内联讨论,不写 artifact(这是默认形态)。 +- `/do-it-brainstorm all` — 扇出双视角 + 所有适用补充视角为独立子智能体;只在 Heavy 或明确要求时用。 ## 行为 1. 加载 `do-it-brainstorm` skill。 -2. 先确认 task frame 和当前 repo 真相;上下文空白时先讨论需求形态,不强制写 artifact。 -3. 默认选择 `product-strategist` + `architecture-strategist`,再按任务风险选择补充视角。 -4. 输出按 `Requirement Shape` / `Product Boundary` / `Core Goal` / `Options` / `Architecture Foundation` / `Extension Modules` / `Must Resolve In Grill` 分层。 -5. 当结果需要后续 session 复用时,写入 `.do-it/brainstorm/.md`。 +2. 先定「哪个问题主导」(产品形态 / 架构地基 / 某个补充关切),把力气投在那里。 +3. 确认 task frame 和当前 repo 真相;上下文空白或没有真正待决项时,直说并交给 planning,不硬造选项。 +4. 沿决策阶梯(跳过 → stdlib/原生 → 已装依赖 → 最小自建 → 完整自建)铺开选项,偏向减法与无聊的方案。 +5. 输出 `Requirement Shape` / `Product Boundary` / `Core Goal` / `Options` / `Architecture Foundation` / `Extension Modules` / `Must Resolve In Grill` 分层;默认内联,下一 session 需复用或 Heavy 时才写 `.do-it/brainstorm/.md`。 6. **不**在 brainstorm 内做最终收敛;需要收敛时交给 `/do-it-grill`。 ## 跳过条件 - 任务是机械重构、单文件改名、依赖升级、纯内部基础设施。 - 当前 task slug 已存在 `.do-it/brainstorm/.md` 且提案未实质变更。 +- 拷问后发现没有真正的开放决策 —— 直接交给 planning。 ## 与 grill 的衔接 diff --git a/hooks/anti-patterns-lint.sh b/hooks/anti-patterns-lint.sh index 8eaf7e9..d8635cd 100755 --- a/hooks/anti-patterns-lint.sh +++ b/hooks/anti-patterns-lint.sh @@ -7,14 +7,22 @@ # case-list — ≥10 consecutive added lines look like a bash case branch # pattern (`*"..."*`); flags the "大段 case 语句吞所有变体" # shape that should usually be data-driven. -# no-consumer — a freshly-added TS/JS top-level `export {const|function|class} ` -# whose `` does not appear in any other .ts/.tsx/.js/.mjs/.cjs -# file under the repo. +# no-consumer — a freshly-added TS/JS top-level export whose `` does not +# appear in any other .ts/.tsx/.js/.mjs/.cjs file under the repo. +# Covers `const|let|var|function|class` AND the speculative- +# abstraction kinds `interface|type|abstract class` — an exported +# abstraction nobody references is decision-ladder rung 1 (does it +# need to exist?), the highest-signal YAGNI smell catchable at +# write time. # copy-paste — a contiguous chunk of ≥5 non-trivial added lines whose first # AND last lines both appear in another file inside the same # directory. # -# Like comments-lint.sh this hook is advisory only. +# Like comments-lint.sh this hook is advisory only. The family set is +# deliberately small and CLOSED: per do-it-router § Restraint, do not grow it +# into an ever-longer anti-pattern list. Richer YAGNI review (single-impl +# interfaces, one-product factories, pass-through wrappers, reinvented stdlib) +# belongs to the on-demand do-it-review-loop YAGNI lens, not this write-time hook. set -uo pipefail @@ -177,6 +185,8 @@ fi # ------------------------------------------------------------------------ # Family 2: no-consumer — TS/JS top-level export with no other reference. +# Includes the speculative-abstraction kinds (interface / type / abstract +# class): an exported abstraction nobody references is decision-ladder rung 1. # ------------------------------------------------------------------------ case "$FILE_PATH" in *.ts|*.tsx|*.js|*.jsx|*.mjs|*.cjs) @@ -185,11 +195,12 @@ case "$FILE_PATH" in # which is not supported by BSD awk (macOS default). The regex matches: # - `export const|let|var ` # - `export function ` / `export async function ` - # - `export class ` + # - `export class ` / `export abstract class ` + # - `export interface ` / `export type ` # - `export default function ` / `export default class ` EXPORT_NAMES="$(printf '%s\n' "$ADDED_LINES" \ - | grep -E '^[[:space:]]*export[[:space:]]+(default[[:space:]]+(async[[:space:]]+)?(function|class)|async[[:space:]]+function|const|let|var|function|class)[[:space:]]+[A-Za-z_][A-Za-z0-9_]*' \ - | sed -E 's/^[[:space:]]*export[[:space:]]+(default[[:space:]]+)?(async[[:space:]]+function|function|class|const|let|var)[[:space:]]+([A-Za-z_][A-Za-z0-9_]*).*/\3/' \ + | grep -E '^[[:space:]]*export[[:space:]]+(default[[:space:]]+(async[[:space:]]+)?(function|class)|async[[:space:]]+function|abstract[[:space:]]+class|const|let|var|function|class|interface|type)[[:space:]]+[A-Za-z_][A-Za-z0-9_]*' \ + | sed -E 's/^[[:space:]]*export[[:space:]]+(default[[:space:]]+)?(async[[:space:]]+function|abstract[[:space:]]+class|function|class|const|let|var|interface|type)[[:space:]]+([A-Za-z_][A-Za-z0-9_]*).*/\3/' \ | sort -u)" if [[ -n "$EXPORT_NAMES" ]]; then NO_CONSUMER_NAMES="" @@ -213,7 +224,7 @@ case "$FILE_PATH" in done <<<"$EXPORT_NAMES" if [[ -n "$NO_CONSUMER_NAMES" ]]; then _record_family "no-consumer" - _append_detail "no-consumer: ${NO_CONSUMER_NAMES} (newly-exported but no other file references; consider whether the export is needed)" + _append_detail "no-consumer: ${NO_CONSUMER_NAMES} (newly-exported, no other file references it — decision ladder rung 1: does it need to exist, or can it be inlined or dropped?)" fi fi ;; diff --git a/index.json b/index.json index 4483459..987346b 100644 --- a/index.json +++ b/index.json @@ -59,7 +59,7 @@ { "kind": "skill", "name": "do-it-grill", - "description": "Use when hidden assumptions or user-only decisions need fact verification before planning, implementation, or closeout.", + "description": "Pressure-test the load-bearing premises and necessity of a plan before it hardens into code, commits, or claims. Use when hidden assumptions or user-only decisions gate planning, implementation, or closeout, or when asked to grill, challenge, or stress-test.", "origin": "do-it", "optional": false, "group": "mainline", @@ -79,7 +79,7 @@ { "kind": "skill", "name": "do-it-brainstorm", - "description": "Use when planning needs divergent product and architecture option mapping before grill converges on decisions.", + "description": "Map requirement shape and options through a product viewpoint and an architecture viewpoint before grill converges on decisions. Use when planning needs divergent option mapping, or when asked to brainstorm or 脑暴.", "origin": "do-it", "optional": false, "group": "on-demand", @@ -264,7 +264,7 @@ { "kind": "agent", "name": "code-quality-cleaner", - "description": "Use during do-it-review-loop for read-only maintainability review of redundancy, dead paths, weak abstractions, and cleanup risk.", + "description": "Use during do-it-review-loop for read-only maintainability and YAGNI review of redundancy, dead paths, single-use or speculative abstractions, reinvented stdlib, and cleanup risk.", "origin": "do-it", "optional": false, "source": "agents/code-quality-cleaner.toml", diff --git a/plugins/do-it/agents/code-quality-cleaner.toml b/plugins/do-it/agents/code-quality-cleaner.toml index c8addf7..5777516 100644 --- a/plugins/do-it/agents/code-quality-cleaner.toml +++ b/plugins/do-it/agents/code-quality-cleaner.toml @@ -1,33 +1,43 @@ name = "code-quality-cleaner" -description = "Use during do-it-review-loop for read-only maintainability review of redundancy, dead paths, weak abstractions, and cleanup risk." +description = "Use during do-it-review-loop for read-only maintainability and YAGNI review of redundancy, dead paths, single-use or speculative abstractions, reinvented stdlib, and cleanup risk." sandbox_mode = "read-only" developer_instructions = """ -Operate as the do-it maintainability review lens. Stay read-only. +Operate as the do-it maintainability and YAGNI review lens. Stay read-only. Default to Standard slice; never self-escalate to Heavy without explicit assignment. Full dispatch contract: see `do-it-subagent-orchestration` § Required Prompt Contract. Purpose: -- find cleanup issues that create correctness, review, or maintenance risk -- keep the delivered diff smaller and easier to support +- find cleanup and over-engineering issues that create correctness, review, or maintenance risk +- keep the delivered diff smaller and easier to support — the best code is the code that was never written - avoid style-only commentary Workflow: 1. Inspect the actual diff or changed files named by the parent. -2. Look for duplicated logic, stale helpers, dead paths, avoidable abstraction, and brittle tests. -3. Check whether local patterns or helpers were missed. -4. Report only issues with a real consequence. -5. Recommend the smallest cleanup that removes the risk. +2. Maintainability scan: duplicated logic, stale helpers, dead paths, brittle tests, and local patterns or helpers that were missed. +3. YAGNI / over-engineering scan against the decision ladder (see `do-it-router` § Restraint). Bounded checklist: + - an interface, abstract class, or `type` with a single implementation; + - a factory, builder, or registry producing exactly one product; + - a wrapper or adapter that only forwards to one callee; + - a module, hook, or config flag exporting a single unused item; + - hand-rolled code the stdlib or platform already provides; + - speculative parameters, generality, or `Phase 2` scaffolding with no current caller. +4. Report only issues with a real consequence; recommend the smallest change that removes the risk — prefer deletion over addition, boring over clever. + +Never cut the guardrails: input validation at trust boundaries, error handling that prevents data loss, security, accessibility, or behavior the task explicitly required. Lazy, not negligent. Severity: -- Blocking: quality issue can cause wrong behavior, broken tests, data loss, or contract breakage. -- Important: likely future bug, meaningful duplication, misleading abstraction, or brittle coverage. -- Opportunity: local clarity with low delivery risk. +- Blocking: the issue can cause wrong behavior, broken tests, data loss, or contract breakage. +- Important: likely future bug, meaningful duplication, or misleading abstraction. +- Opportunity: local simplification with low delivery risk (most YAGNI findings land here unless tied to correctness). + +Finding format — one line each, ordered by severity, using a CLOSED tag vocabulary: + [] L: : . . +Tags: `delete:` (dead or unused — remove it) · `stdlib:` (reinvents the standard library — use it) · `native:` (reinvents a native language or runtime feature) · `yagni:` (speculative abstraction with one or zero users — inline it) · `shrink:` (works but larger than needed — collapse it). Every finding states both what to cut AND what replaces it, and carries a severity so the batch-emission and closeout gates still function. Do not invent new tags. Token discipline: - do not audit untouched subsystems -- cite exact lines or diff evidence when possible +- cite exact lines or diff evidence - keep findings short and action-oriented -- say when no cleanup finding justifies action -Return findings first, ordered by severity, with evidence, consequence, smallest fix, and residual risk. +End with a quantified verdict: `net: - lines possible` when cuts exist, or `Lean already. Ship.` when none do. Return findings first, ordered by severity, each with evidence and smallest fix, then residual risk. """ diff --git a/plugins/do-it/skills/_index.md b/plugins/do-it/skills/_index.md index f4af020..db70468 100644 --- a/plugins/do-it/skills/_index.md +++ b/plugins/do-it/skills/_index.md @@ -12,7 +12,7 @@ router 推荐按需加载:用 Skill 工具 + skill 名加载详情;不要一 ## 主线 (router 推荐) - **do-it-router** — any non-trivial repo task needs Light, Standard, or Heavy tier selection, fail… -- **do-it-grill** — hidden assumptions or user-only decisions need fact verification before planni… +- **do-it-grill** — Pressure-test the load-bearing premises and necessity of a plan before it hard… - **do-it-planning** — intent must become a durable handoff card with current truth, acceptance crite… - **do-it-tdd** — a behavior change needs a failing regression or contract test before implement… - **do-it-review-loop** — a delivered diff or worker result needs PR-style correctness review before the… @@ -21,7 +21,7 @@ router 推荐按需加载:用 Skill 工具 + skill 名加载详情;不要一 ## 按需触发 -- **do-it-brainstorm** — planning needs divergent product and architecture option mapping before grill… +- **do-it-brainstorm** — Map requirement shape and options through a product viewpoint and an architect… - **do-it-architecture-scan** — a change touches multiple packages or surfaces and you need to audit ownership… - **do-it-interface-drill** — API, schema, CLI, or protocol changes must be checked across producer and cons… - **do-it-debugging** — a bug's root cause is unclear or prior fixes failed and you need a hypothesis,… diff --git a/plugins/do-it/skills/do-it-brainstorm/SKILL.md b/plugins/do-it/skills/do-it-brainstorm/SKILL.md index cc4664f..cc7d15c 100644 --- a/plugins/do-it/skills/do-it-brainstorm/SKILL.md +++ b/plugins/do-it/skills/do-it-brainstorm/SKILL.md @@ -1,107 +1,65 @@ --- name: do-it-brainstorm -description: "Use when planning needs divergent product and architecture option mapping before grill converges on decisions." +description: "Map requirement shape and options through a product viewpoint and an architecture viewpoint before grill converges on decisions. Use when planning needs divergent option mapping, or when asked to brainstorm or 脑暴." --- # Do-It Brainstorm -## Purpose +Diverge before grill converges: figure out what the requirement actually is and what shapes it could take, map the options, and hand the open decisions to `do-it-grill`. Brainstorm never settles the decision itself — any option that would change implementation direction leaves a `Must Resolve In Grill` item instead of being silently chosen. -Use this to understand what the requirement is and what shape it could take before `do-it-grill` narrows it. Brainstorm is divergent, but it is not a fixed four-persona ritual. The default pass always includes two core read-only lenses: +Two viewpoints always inform the pass — a **product** viewpoint (requirement shape, product boundary, core goal, option tradeoffs) and an **architecture** viewpoint (foundation, extension modules, stage closure, boundaries, verification route). The viewpoints are fixed; how they run scales with tier (§ Tiers). -- `product-strategist`: requirement shape, product boundary, core goal, and option tradeoffs. -- `architecture-strategist`: architecture foundation, extension modules, stage closure, boundaries, and verification route. +## The pass -Supplemental lenses are added only when the task needs them. Brainstorm surfaces possible shapes, product and architecture boundaries, option benefits/costs/risks, and the questions grill must converge. It does not settle the decision itself. +1. **Pick the dominant branch.** Decide which question this turn is really about — product shape, architecture foundation, or a supplemental concern (UX / ops / security / domain). Route effort there instead of spraying it evenly across every axis. If the user is unreachable, state the assumption and proceed. +2. **Run the two viewpoints.** By default the parent reasons through both inline. Spawn them as independent read-only subagents only at Heavy tier or on explicit request (§ Fan-out). +3. **Map options along the decision ladder.** Lay each product or architecture option out along the ladder (see `do-it-router` § Restraint): skip-it → stdlib/native → existing dependency → minimal custom → full build. Bias divergence toward the subtractive and boring options, not only the additive and clever ones — the cheapest option that meets the goal usually wins. +4. **Diverge predictably.** Brainstorm's behaviour is fixed even when its tokens vary: surface genuinely different options (not three variations of one) and compare them on the same axes — for architecture: seam count, depth/leverage, reversibility, blast radius. +5. **Skip the fog of war.** If the pass surfaces no real open decision — the requirement shape and route are already clear — say so and hand straight to planning. Do not manufacture options to fill a template. +6. **Hand off to grill.** Dedup `Must Resolve In Grill`, drop anything local verification settles immediately, and preserve each option's tradeoff plus a recommended default for grill to drive. -## How It Differs From Grill +## How it differs from grill | | `do-it-brainstorm` | `do-it-grill` | |---|---|---| -| Mode | Divergent: widen with product, architecture, and task-fit lenses | Convergent: falsify facts and settle load-bearing decisions | -| Default lenses | Product core + architecture core | Single internal challenger loop | -| Output | Discussion-first summary or `.do-it/brainstorm/.md` | `.do-it/grill/.md` | -| Stop condition | Enough lens input to describe requirement shape, boundaries, options, and grill questions | No execution-blocking item remains | +| Mode | Divergent: widen with product, architecture, and task-fit viewpoints | Convergent: falsify facts, settle load-bearing decisions | +| Output | Inline decision stack, or `.do-it/brainstorm/.md` | `.do-it/grill/.md` | +| Stop | Enough input to describe requirement shape, options, and grill questions | No execution-blocking item remains | | Handoff | Hands `Must Resolve In Grill` to grill | Hands resolved decisions to planning or execution | -Brainstorm never converges. It may compare options, name benefits/costs/risks, -and recommend a default for discussion, but the final route is chosen by grill, -planning, or the user. Any option that would change implementation direction -must leave a clear `Must Resolve In Grill` item instead of being silently -chosen inside brainstorm. +If both run, brainstorm runs first. -## When To Use +## When to use / skip -Trigger brainstorm when the proposal: +Use when the proposal genuinely opens a decision: a new user- or operator-visible surface; a product, positioning, or opportunity-cost tradeoff; a change to architecture shape, module boundaries, install/runtime policy, public commands, or verification strategy; or an explicit "脑暴 / brainstorm / 多视角看一下". A Heavy-tier task without an existing brainstorm also qualifies. -- introduces a new user-visible or operator-visible surface; -- has product, business, positioning, workflow, or opportunity-cost implications; -- changes architecture shape, module boundaries, install/runtime policy, public commands, or verification strategy; -- crosses operational, reliability, security, domain-language, or release boundaries; -- lacks enough project context and the user wants to discover the requirement shape before artifacts; -- the user explicitly asks to "脑暴" / "brainstorm" / "多视角看一下"; -- a Heavy-tier task lands without an existing brainstorm or equivalent discussion. +Skip when the change is a tightly bounded mechanical edit, pure internal cleanup with no product/architecture/operator/release tradeoff, or already covered by a current brainstorm artifact whose frame has not materially changed. When in doubt, prefer the fog-of-war skip over running a full pass on a task with no open decision. -Skip brainstorm when: +## Viewpoints -- the change is a tightly bounded mechanical edit; -- the work is pure internal cleanup with no product, architecture, operator, or release tradeoff; -- a current brainstorm artifact exists for the same proposal and the frame has not materially changed. +The two cores are always present. Add a supplemental lens only when it can change the next implementation route — zero to two, never the whole table: -## Lens Selection - -Always run the two core lenses unless the user explicitly requests a single named lens: - -- `product-strategist` -- `architecture-strategist` - -Then add supplements only when the task calls for them: - -| Supplemental lens | Use when | +| Supplemental lens | Add when | |---|---| -| `ux-designer` | UI, flow, discoverability, accessibility, visual hierarchy, or user-facing copy could change the requirement shape. | +| `ux-designer` | UI, flow, discoverability, accessibility, or user-facing copy could change the requirement shape. | | `end-user-advocate` | Real-use conditions, recovery, stale state, interruption, or mental-model mismatch matters. | -| `ops-sre` | Deployment, rollback, observability, migration, scale, on-call, or incident recovery matters. | -| `ceo-reviewer` | Board-level business tradeoff, revenue path, market window, pricing, or competitive pressure matters. | +| `ops-sre` | Deployment, rollback, observability, migration, scale, or on-call matters. | +| `ceo-reviewer` | Board-level business tradeoff, revenue path, market window, or competitive pressure matters. | | `red-team-reviewer` | Security, persistence, concurrency, replay, partial failure, or adversarial misuse might change the route. | | `domain-language-reviewer` | Terms, names, public concepts, or domain models are overloaded or contradictory. | | `plan-challenger` | Scope, acceptance criteria, sequencing, or route sizing is the main uncertainty. | -Do not run every lens by default. If two supplemental lenses overlap, choose the one whose output can change the next implementation route. - -## Tier Rules - -### Light - -Use only when the user explicitly asks for brainstorm on a Light task. - -- Run `product-strategist` if the question is about demand, boundary, core goal, or option tradeoffs. -- Run `architecture-strategist` if the question is about foundation, extension modules, boundary, stage closure, or verification. -- If neither clearly dominates, run both cores. -- Prefer an inline discussion summary over an artifact unless the user asks for a durable file. +If two supplements overlap, run the one whose output changes the route. -### Standard +## Tiers -Default for normal product, UX, command, workflow, release-adjacent, or architecture-adjacent work. +- **Light** — only when explicitly asked. Run the dominant viewpoint inline (both if neither dominates). Inline summary, no artifact unless the user asks for a durable file. +- **Standard** — default. Both viewpoints inline; add zero to two supplements by task frame and repo truth. Produce an inline decision stack, or `.do-it/brainstorm/.md` only when the next session must read it. +- **Heavy** — parent-only unless assigned. Fan the two viewpoints out as independent subagents; add the minimum supplements for product/operator/security/domain/release/scope risk; write the artifact unless the user asked for discussion only. -- Run both core lenses. -- Add zero to two supplemental lenses based on the task frame and repo truth. -- Produce either an inline decision stack or `.do-it/brainstorm/.md` when the next session must read it. +## Fan-out -### Heavy - -Parent-only unless explicitly assigned. - -- Run both core lenses. -- Add the minimum supplemental lenses needed for product, operator, security, domain, release, or scope risk. -- For release/workflow/policy work, include at least one install/release or skill-quality review later in `do-it-review-loop`; review lenses are not substitutes for brainstorm lenses. -- Write the artifact unless the user explicitly asked for discussion only. - -## Dispatch Pattern - -Each selected lens is read-only and should receive the same frame plus the facts the parent already verified. Lenses may inspect current truth when needed, but they must not write. - -Use the `do-it-subagent-orchestration` contract when delegating: +Spawn lenses as independent read-only subagents only at Heavy tier, on explicit request, or when a viewpoint genuinely needs an independent deep read the parent cannot do inline without bias. When you do, use the `do-it-subagent-orchestration` contract: ```text tier: Light or Standard @@ -109,185 +67,44 @@ scope: write ownership: read-only forbidden paths: src/**, packages/**, apps/**, .do-it/**, dist/**, agents/**, skills/** current truth: -failure-mode forecast: -path map: not applicable, unless architecture-strategist needs a high-level boundary map must-verify facts: stop condition: NEEDS_CONTEXT if the frame is too vague, BLOCKED if forbidden writes are required, otherwise return the schema return schema: see the selected agent definition ``` -Run independent lenses in parallel when the host supports parallel subagents. If the host or task cannot support delegation, the parent may run a compact local version of the same lenses, but must label it as local and avoid pretending independent agents ran. - -## Discussion-First Mode - -Use discussion-first mode when the project or task frame is too blank to justify an artifact: - -- no repo exists or the repo has no relevant product/architecture truth; -- the user is exploring direction, not asking for a durable handoff; -- selected lenses return `NEEDS_CONTEXT` because the goal, audience, or boundary is unknown. +Run independent lenses in parallel when the host supports it. If delegation is unavailable, run a compact local version and label it local — do not pretend independent agents ran. -Return a concise stack: +## Research-first (new architectural surface) -- `Requirement shape`: what the demand appears to be and what forms it could take. -- `Product boundary`: what is in scope, what is out of scope, and which line matters. -- `Core goal`: the success target for this stage. -- `Options`: multiple viable paths with benefits, costs, risks, and when to choose each. -- `Architecture foundation`: the core bottom layer and invariants. -- `Extension modules`: later or optional modules and how they attach. -- `Must resolve in grill`: decisions that change product direction, foundation, implementation route, or verification. -- `Can decide during planning`: details that can be settled after direction is chosen. -- `Context still needed`: only facts that cannot be read locally and would change execution. +When an option introduces a new architectural surface (new dependency / datastore / framework / protocol), the architecture viewpoint must surface ≥2 real alternatives with a recency check before recommending one — do NOT batch-ask the user three generic questions. Hand the specific candidate-choice as a `Must Resolve In Grill` item; grill picks the highest-leverage one and asks. Out of scope here: bug fixes, refactors of existing code, incremental changes within a module. -Do not force `.do-it/brainstorm/.md` in discussion-first mode. Create an artifact only when there is enough context for future sessions to reuse. +## Artifact -## Output Artifact - -When durable handoff is useful, write: +Discussion-first is the default — return the inline stack (`Requirement shape`, `Product boundary`, `Core goal`, `Options`, `Architecture foundation`, `Extension modules`, `Must resolve in grill`, `Can decide during planning`) and stop. Write a durable file only when a future session must reuse it or the task is Heavy: ```text .do-it/brainstorm/.md ``` -`` follows the same slug rule as the grill log (see `do-it-grill` § Grill Log Artifact): slug from the user title or a short hash, with an optional session prefix. `/.do-it/brainstorm/.gitkeep` should exist when the project tracks do-it artifacts. - -In the same turn, write the slug to the task pointer so router and other skills can pick it up: +`` follows the grill-log slug rule. When you write the artifact, in the same turn write the pointer so router and other skills pick it up: ```bash mkdir -p .do-it/runtime && printf '%s' "" > .do-it/runtime/pointer ``` -See `do-it-router` § Task Pointer for the full protocol. Discussion-first mode without an artifact does not write the pointer. - -### File Format - -```markdown ---- -task: -session_id: -created: -status: open -lenses_run: [product-strategist, architecture-strategist, ...] -tier: -mode: ---- - -## Frame - - - -## Core Lenses - -### Product Strategy - - -### Architecture Strategy - - -## Supplemental Lenses - -### - - -Or `none selected`. - -## Requirement Shape - - - -## Product Boundary - -- In scope: -- Out of scope: -- Boundary risk: - -## Core Goal - - - -## Options - -### Option A: - -- Benefits: -- Costs: -- Risks: -- Choose when: - -### Option B: - -- Benefits: -- Costs: -- Risks: -- Choose when: - -## Architecture Foundation - -- Core bottom layer: -- Ownership/contract: -- Stage closure: - -## Extension Modules - -- - -## Grill Handoff - -### Must Resolve In Grill - -- - -### Can Decide During Planning - -- - -## Tensions - -- - -Or `none surfaced`. -``` - -`status: open` means grill has not converged on `Must Resolve In Grill`. Grill flips it to `converged` once every execution-blocking item is resolved or explicitly deferred. - -## Research handoff (Heavy or new architectural surface) - -When brainstorm produces options that involve a new architectural surface (new dependency / datastore / framework / protocol), do NOT batch-ask the user three generic questions here. Instead, hand specific candidate-choice questions to grill as `Must Resolve` items. - -Grill follows the "ask one focused question" rule (see `do-it-grill`). The brainstorm artifact lists the architectural-surface decisions that grill must drive; grill picks the highest-leverage one and asks. This preserves the question-budget discipline. - -Out of scope here: bug fixes, refactors of existing code, incremental changes within an existing module. - -## Handoff To Grill - -After discussion or artifact creation: - -1. Inspect selected lens returns for forbidden-path hits, unsupported claims, and schema drift. -2. Deduplicate `Must Resolve In Grill` and remove questions that local verification can answer immediately. -3. Preserve the option tradeoffs; do not collapse them into one answer inside brainstorm. - For each user-facing decision, include the available options, the practical - tradeoff, and the recommended default that grill should present. -4. Trigger or recommend `do-it-grill` when any must-resolve item changes execution. -5. If all must-resolve items can be verified locally and no user preference remains, the parent may proceed to planning/execution and record why grill is not needed. - -## Token Discipline - -- Core lenses are default; supplemental lenses are justified by task risk. -- Do not summarize every lens inline when an artifact exists; return the path plus requirement shape, options, and grill handoff summary. -- Keep each distilled lens section to about 30 lines. -- Do not re-run brainstorm if the existing artifact still matches the proposal; append only if the frame materially changed. +The full artifact `File Format` and section template live in `references/artifact-format.md`; load it only when writing the file. `status: open` means grill has not converged; grill flips it to `converged`. -## Common Mistakes +## Common mistakes -- Treating `ceo-reviewer`, `ux-designer`, `end-user-advocate`, and `ops-sre` as the fixed default set. They are supplements now. -- Calling `architect-reviewer` instead of `architecture-strategist` before implementation. The former reviews delivered or planned architecture risk; the latter explores system shape during brainstorm. -- Forcing an artifact when the user is still exploring a blank product direction. -- Letting brainstorm make the final decision. Brainstorm maps the requirement shape and options; convergence belongs to grill, planning, or the user. -- Running many overlapping supplements when the two core lenses already identify the route. +- Spawning subagents on a Standard task when both viewpoints fit inline — fan-out is a Heavy/explicit cost, not a default. +- Treating `ceo-reviewer` / `ux-designer` / `end-user-advocate` / `ops-sre` as a fixed default set; they are supplements. +- Calling `architect-reviewer` (reviews delivered/planned risk) instead of the architecture viewpoint (explores system shape before implementation). +- Forcing an artifact when the user is still exploring a blank direction, or manufacturing options when there is no open decision. +- Letting brainstorm make the final decision — convergence belongs to grill, planning, or the user. -## Related Skills +## Related skills -- `do-it-router` — sets tier and decides whether brainstorm is useful. -- `do-it-subagent-orchestration` — dispatch contract for selected lenses. -- `do-it-grill` — converges the must-discuss stack and records resolved facts and decisions in its grill log. -- `do-it-review-loop` — reviews delivered diff, skill quality, install readiness, and release risk after implementation. +- `do-it-router` — sets tier, owns the decision ladder, and decides whether brainstorm is useful. +- `do-it-subagent-orchestration` — dispatch contract for fanned-out lenses. +- `do-it-grill` — converges the must-resolve stack and records resolved facts and decisions. - `do-it-context` — sediments terms anchored during brainstorm or grill. diff --git a/plugins/do-it/skills/do-it-brainstorm/references/artifact-format.md b/plugins/do-it/skills/do-it-brainstorm/references/artifact-format.md new file mode 100644 index 0000000..a69b59a --- /dev/null +++ b/plugins/do-it/skills/do-it-brainstorm/references/artifact-format.md @@ -0,0 +1,89 @@ +# Brainstorm artifact — file format + +Load this only when writing a durable `.do-it/brainstorm/.md`. Discussion-first mode (the default) returns the inline stack and does not write a file. `/.do-it/brainstorm/.gitkeep` should exist when the project tracks do-it artifacts. + +```markdown +--- +task: +session_id: +created: +status: open +lenses_run: [product, architecture, ...] +tier: +mode: +--- + +## Frame + + + +## Core Viewpoints + +### Product + + +### Architecture + + +## Supplemental Lenses + +### + + +Or `none selected`. + +## Requirement Shape + + + +## Product Boundary + +- In scope: +- Out of scope: +- Boundary risk: + +## Core Goal + + + +## Options + +Map each option along the decision ladder (skip-it → stdlib/native → existing dependency → minimal custom → full build) and compare on the same axes. + +### Option A: +- Ladder rung: +- Benefits: +- Costs: +- Risks: +- Choose when: + +### Option B: +- Ladder rung: <...> +- Benefits / Costs / Risks / Choose when + +## Architecture Foundation + +- Core bottom layer: +- Ownership/contract: +- Stage closure: + +## Extension Modules + +- + +## Grill Handoff + +### Must Resolve In Grill +- + +### Can Decide During Planning +- + +## Tensions + +- + +Or `none surfaced`. +``` + +`status: open` means grill has not converged on `Must Resolve In Grill`. Grill flips it to `converged` once every execution-blocking item is resolved or explicitly deferred. Do not collapse option tradeoffs into one answer inside brainstorm; for each user-facing decision, keep the available options, the practical tradeoff, and the recommended default that grill should present. diff --git a/plugins/do-it/skills/do-it-grill/SKILL.md b/plugins/do-it/skills/do-it-grill/SKILL.md index 11ca100..21dd055 100644 --- a/plugins/do-it/skills/do-it-grill/SKILL.md +++ b/plugins/do-it/skills/do-it-grill/SKILL.md @@ -1,290 +1,88 @@ --- name: do-it-grill -description: "Use when hidden assumptions or user-only decisions need fact verification before planning, implementation, or closeout." +description: "Pressure-test the load-bearing premises and necessity of a plan before it hardens into code, commits, or claims. Use when hidden assumptions or user-only decisions gate planning, implementation, or closeout, or when asked to grill, challenge, or stress-test." --- # Do-It Grill -## Purpose +Pressure-test reasoning before it hardens into code, commits, or claims. Verify the facts you can verify locally, challenge whether the work needs to exist at all, and surface the one decision that genuinely needs the user — then stop. -Use this to pressure-test reasoning before work hardens into code, docs, commits, or claims. The goal is to verify the facts that can be verified locally, expose the decision that actually needs a user choice, and keep hidden delivery risk from leaking into implementation. +`do-it-brainstorm` diverges (maps requirement shape and options) and runs first when both run; grill converges and never re-diverges. `do-it-review-loop` reviews a delivered diff; grill challenges whether that review is needed, sufficient, or honestly closed. -`do-it-review-loop` owns delivered diff review, QA intake, and multi-perspective findings. `do-it-grill` challenges whether that review is needed, sufficient, or honestly closed. +## The loop -`do-it-brainstorm` runs the divergent step — the product strategy and architecture strategy core lenses, plus any task-fit supplemental lenses, clarify the requirement shape, product boundary, architecture foundation, options, and tradeoffs before grill arrives. Grill then converges the `Must Resolve In Grill` items those lenses surfaced. Brainstorm never converges; grill never diverges. If both run, brainstorm runs first. +Run until every load-bearing premise is either verified against a source or logged as an explicit user decision — no execution-blocking unknown remains. -## How It Differs From a Q&A Template +1. **Necessity first.** Before grilling *how*, grill *whether*: does this need to exist, or does an existing capability already cover it? This is the decision ladder's rung 1 (see `do-it-router` § Restraint) and is usually the most load-bearing premise. +2. **Pick the most load-bearing premise.** The single belief that, if wrong, changes the most downstream work. One at a time — do not dump a five-question template; that gets shallow, scattered answers. +3. **Falsify it cheaply — explore, don't ask.** If `rg`, opening the file, or a quick test can settle it, do that and cite `path:line`. Treat every user claim about behavior ("X is already validated", "called from one place") as a premise to grep, not a fact; if the code disagrees, surface the disagreement with a citation. +4. **Ask only for decisions — one at a time, with a recommended answer.** When local truth cannot decide a preference, priority, or scope tradeoff, ask one focused question: one or two sentences of why-it-matters-now, 2-3 real options with benefit/cost/risk, and your recommended default so the user has something concrete to accept or correct. If the user answers only part, record the settled part and ask the next smallest unresolved decision. +5. **Record and sediment.** Log each result in `.do-it/grill/.md` (§ Grill log). When a term or constraint gets clarified, append one declarative line to `.do-it/CONTEXT.md`. -A common failure is to dump a 5-question template at the user and call it grilling. That gets shallow, scattered answers. Instead: +## When to use / skip -1. **One premise at a time.** Pick the single most-load-bearing premise that, if wrong, would invalidate the most downstream work. Ask only that. Wait for the answer. Then pick the next. -2. **Anchor terms before debating intent.** If the user uses a term that has a definition in `CLAUDE.md`, `.do-it/CONTEXT.md`, or visible in code with a different shape — surface the conflict before going further. -3. **Verify, don't ask, when verification is cheap.** If the question can be answered with `rg`, `cat`, or a quick local command, do that and report what you found. -4. **Ask only for decisions.** Ask the user one focused question only when local truth cannot decide a preference, priority, or scope tradeoff that changes execution. Before asking, explain the decision in plain language, list the viable options with benefits/costs/risks, and recommend a default. -5. **Sediment what you learned.** When a term gets clarified or a constraint surfaces, append it to `.do-it/CONTEXT.md` (one line, declarative). Record the per-task `.do-it/grill/.md` artifact (`kind: fact|decision`, falsifier, status, evidence) per § Grill Log Artifact below. +Grill when: asked to grill, challenge, or stress-test; a plan has multiple plausible paths; acceptance criteria are vague; architecture, interface, release, phase, wave, or multi-agent coordination is involved; a review response seems too agreeable or under-evidenced; a closeout claim may outrun verification. -## When To Use +Skip for tiny mechanical edits with obvious acceptance checks. -Use the grill when: +## Tiers -- the user asks to be grilled, challenged, or stress-tested; -- a plan has multiple plausible paths; -- acceptance criteria are vague; -- architecture, interface, release, phase, wave, or multi-agent coordination is involved; -- a review response seems too agreeable or under-evidenced; -- a closeout claim may outrun verification. +- **Light** — bounded work: inspect the nearest truth, name the one route-changing assumption or risk, ask at most one question (only if a preference blocks the next step), recommend the route. Does not consume brainstorm. +- **Standard** — default for subagents and ordinary non-trivial planning: inspect code/docs/tests before asking, run the loop premise by premise, return blockers, important risks, options, and a recommendation. Must read a brainstorm artifact when one exists. +- **Heavy** — parent-only unless assigned: wave/phase/architecture/interface/release/multi-agent decisions; codebase exploration is mandatory; return scope lock, verified facts, decision options with one recommendation, gates, and residual risk. -Skip the grill for tiny mechanical edits with obvious acceptance checks. +## Anchoring terms -## Tier Rules +When a term feels fuzzy, before debating it: search `CLAUDE.md`, `.do-it/CONTEXT.md`, and `docs/` for an existing definition. If one exists and the user is using it differently, quote both side by side and ask which applies. If none exists and the term will recur, propose one (CONTEXT-FORMAT shape, see `do-it-context`) and write it back. Pick one term per concept; do not let synonyms (`payload` vs `event body`) drift in the same conversation. -### Light +## Assumptions mode -Use for bounded work: +When the user may lack the technical vocabulary, or the codebase has strong existing patterns, do not ask open technical questions. Read current truth first, state the assumptions you would use if the user said nothing (with confidence and evidence), and ask the user to confirm or correct only the one that changes the most downstream work. Rephrase technical choices in product or operator language when that is clearer. Verify cheap assumptions instead of asking; present genuine preferences as a choice — never smuggle an unverified preference in as an assumption. -- inspect the nearest truth; -- name the one assumption or risk that could change the route; -- ask at most one question, and only if a preference is needed; -- recommend the next route. +## Convergence after brainstorm -### Standard +When `.do-it/brainstorm/.md` exists with `status: open` (Standard/Heavy must read it; Light only skims): -Default for subagents and ordinary non-trivial planning: +1. Read `Requirement Shape`, `Options`, `Architecture Foundation`, and the `Must Resolve In Grill` list — each entry is a candidate premise the lenses thought a human or local verification had to settle. +2. Rank by route-impact (cross-lens tensions usually outrank single-lens one-offs) and resolve each via the loop. +3. Add `brainstorm: ` to the grill log frontmatter so `do-it-planning` and `do-it-verification-gate` can trace the lineage. +4. Once every execution-blocking item is resolved, flip the brainstorm artifact's `status: open` → `converged`. Do not delete it. Do not regenerate lens angles inside grill — if a lens return is thin, re-run that brainstorm lens with narrower scope. -- inspect code/docs/tests before asking; -- pick the single highest-leverage premise — ask, verify, decide, then move to the next; -- challenge goal, non-goals, acceptance, sequencing, and verification; -- return blockers, important risks, options, and a recommendation. +## Grill log -### Heavy +Record every pressure-tested fact or decision in `.do-it/grill/.md` so the signal survives context compaction and feeds `do-it-planning` and `do-it-verification-gate`. Standard and Heavy write it; Light and discussion-first may keep it inline. -Parent-only unless explicitly assigned: - -- wave, phase, architecture, interface, release, or multi-agent decisions; -- codebase exploration is mandatory; -- include scope lock, verified facts, decision options, gates, and residual risk. - -## The Iterative Loop - -Run this loop until every execution-blocking item is resolved: facts become `confirmed` or `refuted`, while user preferences become `chosen`, `deferred`, or `needs_user_decision`. - -1. **Pick the most load-bearing premise.** What single belief, if wrong, would change the most downstream decisions? -2. **Try to falsify cheaply.** Read the file, grep for the symbol, run the unit test, glance at the schema. If you find evidence, jump to step 4. -3. **Ask one focused question** only when the remaining unknown is a user decision. Start with the minimum context the user needs, then show 2-3 real options with tradeoffs and a recommended default so the user has something concrete to accept or correct. -4. **Record the outcome** in `.do-it/grill/.md` (see § Grill Log Artifact). Use `kind: fact` with `confirmed/refuted` for evidence, or `kind: decision` with `chosen/deferred/needs_user_decision` for user preferences. -5. **Repeat** with the next-most-load-bearing premise, until the remaining unknowns no longer change the route. - -## Assumptions Mode - -Use assumptions mode when the user may not know the technical vocabulary, the -codebase has strong existing patterns, or open-ended questions would be -needlessly hard to answer. - -1. Read the relevant current truth first: project docs, code map, nearby - patterns, and any brainstorm artifact. -2. State the assumptions you would use if the user said nothing, with confidence - and evidence. -3. Ask the user to confirm or correct only the assumption that changes the most - downstream work. -4. Rephrase technical choices in product or operator language when that is more - useful than implementation jargon. -5. Log confirmed assumptions as decisions or facts before planning consumes - them. - -Do not use assumptions mode to smuggle in unverified preferences. If the -assumption is cheap to verify, verify it; if it is a user preference, present it -as a choice. - -## Convergence After Brainstorm - -When `.do-it/brainstorm/.md` exists with `status: open`, grill enters convergence mode for the same task. The selected brainstorm lenses have already widened the frame; grill's job is to pick the load-bearing items off the brainstorm artifact and resolve them, not to re-run divergence. - -Convergence flow: - -1. **Read the artifact.** Parse `Requirement Shape`, `Product Boundary`, `Core Goal`, `Options`, `Architecture Foundation`, and `Grill Handoff`. Each `Must Resolve In Grill` entry is a candidate premise the brainstorm lenses thought a human, grill, or local verification had to settle. -2. **Rank by route-impact.** Sort candidates by "if this is wrong, how many downstream choices change?" — same rule as the Iterative Loop. Cross-lens tensions usually outrank single-lens one-offs. -3. **Resolve each candidate** via the existing loop (verify cheaply, ask one question if a user decision is needed, log the result in `.do-it/grill/.md` per § Grill Log Artifact). -4. **Reference the brainstorm slug** in the grill log frontmatter: add `brainstorm: ` so `do-it-planning` and `do-it-verification-gate` can trace the lineage. -5. **Flip brainstorm status.** Once every execution-blocking `Must Resolve In Grill` item has a resolution (`chosen` / `deferred` / `needs_user_decision` for decisions; `confirmed` / `refuted` for facts), edit the brainstorm artifact's frontmatter from `status: open` to `status: converged`. Do not delete the artifact; future sessions read it. - -Tier behavior: - -- **Light**: grill does **not** consume brainstorm. If brainstorm exists at Light tier, it was probably explicit; honor it by skimming the personas section briefly, but do the regular single-thread Light grill. -- **Standard / Heavy**: grill **must** read the brainstorm artifact when it exists. Skipping it loses the point of running brainstorm at all. - -Convergence does not regenerate lens angles. If a lens return looks thin or wrong, the answer is to re-run that brainstorm lens with narrower scope, not to invent the missing angle inside grill. - -## Anchoring Terms - -When a term feels fuzzy, before you debate it: - -1. Search `CLAUDE.md`, `.do-it/CONTEXT.md`, and `docs/` for an existing definition. -2. If a definition exists and the user is using it differently, **call out the conflict immediately**. Quote both definitions side by side. Ask which one applies. -3. If no definition exists and the term will be re-used, propose one in CONTEXT-FORMAT shape (see `do-it-context`) and write it back. -4. Avoid synonyms — if `payload` and `event body` mean the same thing in this conversation, pick one and stick to it. - -## Code Reverification - -When the user makes a factual claim about behavior ("the validator already checks X", "this is called from one place", "we don't ship Y in prod"): - -1. Treat it as a premise, not a fact. -2. `grep` / open the file. Two minutes of reading beats five minutes of debate. -3. If the code disagrees, surface the disagreement with a path:line citation. -4. Update the grill log with the verified state. - -## Lenses - -Use these as checks, not personas: - -- Truth: What has been verified locally, and what is only assumed? -- Scope: What is the smallest complete outcome? -- Acceptance: What exact evidence proves the work is done? -- Interface: Which contract or boundary will another part of the system rely on? -- Failure: What is the most likely way this plan ships a bug? -- Review: What would a skeptical reviewer block on? -- Maintenance: What future churn can be avoided without expanding scope? -- Delegation: Does each worker have a tier, ownership, verification, and stop conditions? - -## Question Shape - -When a user decision is needed, use this shape: - -- Context: one or two sentences explaining why the decision matters now. -- Options: 2-3 viable choices, each with benefit, cost, risk, and when to choose it. -- Recommendation: one default with the reason. -- Question: one focused ask. Do not bundle unrelated decisions. - -If the user answers only part of the question, record the settled part and ask -the next smallest unresolved decision. - -## Common Rationalizations - -These are the "I'm done grilling" excuses that should each cost you another loop: - -- *"It feels reasonable."* — Reasonable ≠ verified. Name the evidence. -- *"The user said it works that way."* — Treat as premise. Grep before believing. -- *"It would be a small refactor if wrong."* — Costed in tokens or in a follow-up sprint? Re-state. -- *"We can fix it in review."* — Maybe; but if grill catches it now, review carries fewer findings. -- *"There's no time to anchor terms."* — There's also no time to debug a contract mismatch in production. -- *"Both interpretations probably work."* — Then which one ships? Pick. - -## Red Flags - -Pause and re-grill when you see: - -- The user agreeing immediately with whatever you propose. (Are they confirming or appeasing?) -- Acceptance criteria that say "looks good" or "works". -- A plan that names `Phase 2` of work that doesn't yet exist. -- "Should be straightforward" applied to a multi-package change. -- Terms used inconsistently in the same paragraph. -- A review response that has zero `Blocking` / `Important` items but the change touches a public interface. - -## Codebase Exploration - -For Standard and Heavy grills, inspect enough current truth to avoid theater: - -- current diff and dirty files in scope; -- owning modules, call paths, tests, docs, or plans; -- existing terminology and conventions; -- nearby working examples; -- verification commands or evidence surfaces. - -Stop exploration when the next unknown is a real user decision or the facts are sufficient to recommend a route. - -## Output Shape - -For a light grill: - -- current truth checked; -- one route-changing assumption or risk; -- one question only if user preference blocks the next action; -- recommended route. - -For a standard grill: - -- current truth checked; -- grill log items (facts confirmed/refuted; decisions chosen/deferred/needs_user_decision) — usually written to `.do-it/grill/.md`; -- blockers; -- important risks; -- options considered; -- recommended path; -- verification and review checks. - -For a heavy grill: - -- scope lock; -- verified facts; -- blockers; -- important risks; -- non-blocking opportunities; -- decision options with one recommended path; -- verification and review gates that must be satisfied. - -## Severity - -- `Blocking`: Must resolve before execution or closeout because it can make the work wrong, unsafe, or unverifiable. -- `Important`: Should resolve now because it can create rework, review failure, or unclear ownership. -- `Opportunity`: Useful architecture or quality improvement, but not a blocker unless tied to correctness or delivery risk. - -## Common Mistakes - -- Asking five generic questions at once instead of one focused premise. -- Asking technical open questions when assumptions mode would let the user - confirm or correct a concrete proposal. -- Challenging in the abstract without reading files. -- Turning every idea into a blocker. -- Asking about facts that codebase exploration can answer. -- Accepting a plan because it sounds reasonable while evidence is missing. -- Treating architecture taste as delivery truth. -- Using a fuzzy term repeatedly without ever defining it back to the user or to `.do-it/CONTEXT.md`. - -## Grill Log Artifact - -Grill records every pressure-tested fact or decision in `.do-it/grill/.md` -so the signal outlives the chat turn and feeds `do-it-planning` and -`do-it-verification-gate`. Without it, grilling is just a vibe — once context -compacts, every signal is gone. - -`` is the user's task title (lowercased, dash-separated, ≤32 chars) -when one is obvious, else the first 8 chars of the SHA-1 of the first prompt; -prefix with the short session id when collisions are likely. Keep -`.do-it/grill/.gitkeep` so the directory tracks in git. - -Format: +``: the user's task title (lowercased, dash-separated, ≤32 chars), else the first 8 chars of the SHA-1 of the first prompt; prefix with the short session id on collision. Keep `.do-it/grill/.gitkeep` so the directory tracks in git. ```markdown --- task: session_id: created: -status: open # flips to `resolved` when no execution-blocking item remains +status: open # → resolved when no execution-blocking item remains brainstorm: # only when converging a brainstorm artifact --- ## Items tested - - [ ] **** - kind: - - falsifier: - status: - - evidence: + - evidence: ## Anchored terms - - ****: ← from CLAUDE.md / .do-it/CONTEXT.md / clarified this turn ``` -Rules: use `confirmed`/`refuted` only for facts, `chosen`/`deferred`/`needs_user_decision` -only for decisions. A `needs_user_decision` item blocks only when it changes the -next execution step. Append new items and mutate existing ones in place — never -delete an item; refute or defer it with evidence. Anchored-terms entries get -sedimented to `.do-it/CONTEXT.md` (see `do-it-context`); this section is a -working pad. Flip `status: resolved` once no execution-blocking item remains, and -do not flip it back without a new entry explaining why. +Use `confirmed`/`refuted` only for facts, `chosen`/`deferred`/`needs_user_decision` only for decisions. Append new items and mutate existing ones in place — never delete an item; refute or defer it with evidence. Anchored terms sediment to `.do-it/CONTEXT.md` (see `do-it-context`). A `needs_user_decision` item blocks only when it changes the next execution step; flip `status: resolved` once none remain. + +## Reference + +The challenge taxonomy (eight lenses), the "I'm done grilling" rationalizations to push past, red flags that warrant another loop, severity definitions, and common mistakes live in `references/checks.md`. Load it when a grill stalls or you need the full lens set; the loop above is enough for the common case. -## Related Skills +## Related skills -- `do-it-context` — set up and maintain `.do-it/CONTEXT.md` with project terms and invariants (sediment destination for anchored terms). -- `do-it-brainstorm` — divergent persona pass that runs before grill; produces the open decisions grill converges on. -- `do-it-planning` — consume the grill outcome into a plan card under `.do-it/plans/.md`; reads the grill slug. -- `do-it-review-loop` — apply pressure to the delivered diff after grilling has set the bar. +- `do-it-router` — sets tier and owns the decision ladder that grill's necessity rung references. +- `do-it-brainstorm` — divergent pass that runs first; produces the open decisions grill converges. +- `do-it-context` — `.do-it/CONTEXT.md` is the sediment destination for anchored terms. +- `do-it-planning` — consumes the grill outcome into a plan card under `.do-it/plans/.md`; reads the grill slug. +- `do-it-review-loop` — applies pressure to the delivered diff after grilling sets the bar. diff --git a/plugins/do-it/skills/do-it-grill/references/checks.md b/plugins/do-it/skills/do-it-grill/references/checks.md new file mode 100644 index 0000000..91dca22 --- /dev/null +++ b/plugins/do-it/skills/do-it-grill/references/checks.md @@ -0,0 +1,52 @@ +# Grill reference — lenses, rationalizations, red flags + +Load this when a grill stalls or you want the full challenge set. The loop in `SKILL.md` is enough for the common case; this file is progressive disclosure, not a checklist to run every time. + +## Lenses (checks, not personas) + +- **Truth** — what is verified locally, and what is only assumed? +- **Necessity** — does this need to exist, or does an existing capability cover it? (decision ladder rung 1) +- **Scope** — what is the smallest complete outcome? +- **Acceptance** — what exact evidence proves the work is done? +- **Interface** — which contract or boundary will another part of the system rely on? +- **Failure** — what is the most likely way this plan ships a bug? +- **Review** — what would a skeptical reviewer block on? +- **Maintenance** — what future churn can be avoided without expanding scope? +- **Delegation** — does each worker have a tier, ownership, verification, and stop condition? + +## Rationalizations that should each cost another loop + +- *"It feels reasonable."* — Reasonable ≠ verified. Name the evidence. +- *"The user said it works that way."* — Treat as premise. Grep before believing. +- *"It would be a small refactor if wrong."* — Costed in tokens, or in a follow-up sprint? Re-state. +- *"We can fix it in review."* — Maybe; but caught now, review carries fewer findings. +- *"There's no time to anchor terms."* — There's also no time to debug a contract mismatch in production. +- *"Both interpretations probably work."* — Then which one ships? Pick. +- *"We might need it later."* — That is rung 1. Skip it until the need is real. + +## Red flags — pause and re-grill + +- The user agreeing immediately with whatever you propose. (Confirming, or appeasing?) +- Acceptance criteria that say "looks good" or "works". +- A plan that names `Phase 2` of work that does not yet exist. +- "Should be straightforward" applied to a multi-package change. +- Terms used inconsistently in the same paragraph. +- A review response with zero `Blocking` / `Important` items on a change that touches a public interface. +- A new abstraction, layer, or dependency defended by a longer paragraph than the code it replaces — the prose is complexity smuggled back in. + +## Severity + +- `Blocking` — must resolve before execution or closeout: can make the work wrong, unsafe, or unverifiable. +- `Important` — should resolve now: can create rework, review failure, or unclear ownership. +- `Opportunity` — useful architecture or quality improvement, not a blocker unless tied to correctness or delivery risk. + +## Common mistakes + +- Asking five generic questions at once instead of one focused premise. +- Asking open technical questions when assumptions mode would let the user confirm or correct a concrete proposal. +- Challenging in the abstract without reading files. +- Turning every idea into a blocker. +- Asking about facts that codebase exploration can answer. +- Accepting a plan because it sounds reasonable while evidence is missing. +- Treating architecture taste as delivery truth. +- Using a fuzzy term repeatedly without ever defining it back to the user or to `.do-it/CONTEXT.md`. diff --git a/plugins/do-it/skills/do-it-review-loop/SKILL.md b/plugins/do-it/skills/do-it-review-loop/SKILL.md index 4311a91..5d6bc11 100644 --- a/plugins/do-it/skills/do-it-review-loop/SKILL.md +++ b/plugins/do-it/skills/do-it-review-loop/SKILL.md @@ -81,6 +81,18 @@ datastore, framework, runtime, or protocol. Loads and user confirmation step). Returns findings in the standard shape; a memory-pick without a fresh search is `Blocking`. +### YAGNI lens + +Available at Standard tier (optional) and Heavy tier (default when the diff adds +a new abstraction, export, dependency, or `Phase 2` scaffolding). Loads +`code-quality-cleaner` (maintainability + over-engineering against the decision +ladder, see `do-it-router` § Restraint). Returns one-line findings using its +closed tag vocabulary (`delete:` / `stdlib:` / `native:` / `yagni:` / `shrink:`), +each carrying a severity, plus a quantified `net: - lines possible` / +`Lean already. Ship.` verdict. The `anti-patterns-lint.sh` PostToolUse hook is an +advisory write-time pre-filter for the unreferenced-export case; the lens is the +source of truth and catches single-use abstractions the hook cannot see. + ## Review intensity (graduated) Review intensity is the canonical axis for deciding how much review effort to @@ -111,6 +123,7 @@ Review intensity is orthogonal to the tier: - **review-adversarial** (Heavy default for high-risk surface): parallel multi-lens — `reviewer` + `red-team-reviewer` + `spec-compliance-reviewer` (and `architecture-taste-reviewer` if research-first lens triggers, + `code-quality-cleaner` if the diff adds new abstraction/export surface, `do-it-comments-discipline` lens if comments changed). Use when: - Heavy tier with breaks_interface, crosses_packages, security-sensitive code, or migrations @@ -167,7 +180,9 @@ Check every non-trivial diff through five axes before spending time on polish: - Contracts: do APIs, schemas, CLIs, generated outputs, docs, and consumers agree? - Maintainability: did the change add avoidable coupling, dead code, duplicate - logic, or unclear ownership? + logic, unclear ownership, or a single-use / speculative abstraction the + decision ladder would inline? (the YAGNI lens / `code-quality-cleaner` owns + the over-engineering call) - Verification: do the tests and commands actually prove the claim, or did they mock away the risky collaborator chain? diff --git a/plugins/do-it/skills/do-it-router/SKILL.md b/plugins/do-it/skills/do-it-router/SKILL.md index 702798c..c10af38 100644 --- a/plugins/do-it/skills/do-it-router/SKILL.md +++ b/plugins/do-it/skills/do-it-router/SKILL.md @@ -77,6 +77,21 @@ these by default: This binds how do-it itself evolves and how it shapes changes inside a project. +### The decision ladder + +The best code is the code you never wrote. When a change calls for new code, walk down these rungs and stop at the first that holds — the cheapest sufficient option wins: + +1. **Does it need to exist?** Speculative or "might need it later" → skip it. +2. **Does the stdlib do it?** → use it. +3. **Is there a native platform feature?** → use it. +4. **Is an already-installed dependency enough?** → use it. +5. **Can it be one line?** → one line. +6. Only then: the smallest custom code that works. + +Bias toward deletion over addition and boring over clever. Never cut for the ladder's sake: input validation at trust boundaries, error handling that prevents data loss, security, accessibility, or a feature the user explicitly asked for. Lazy, not negligent. + +This is the shared "write less" primitive: `do-it-grill` opens with rung 1 (does this need to exist?), `do-it-brainstorm` maps options along the ladder, and `hooks/anti-patterns-lint.sh` flags rungs the code skipped. Reference it; do not restate it. + ## Orthogonal Dimensions In addition to the single tier label, the router writes 5 boolean dimensions into per-session state. They narrow *intensity*, not tier itself: a Standard task can still be `breaks_interface=1` and a downstream skill MAY upgrade its review or drill posture accordingly. diff --git a/skills/do-it/do-it-brainstorm/SKILL.md b/skills/do-it/do-it-brainstorm/SKILL.md index cc4664f..cc7d15c 100644 --- a/skills/do-it/do-it-brainstorm/SKILL.md +++ b/skills/do-it/do-it-brainstorm/SKILL.md @@ -1,107 +1,65 @@ --- name: do-it-brainstorm -description: "Use when planning needs divergent product and architecture option mapping before grill converges on decisions." +description: "Map requirement shape and options through a product viewpoint and an architecture viewpoint before grill converges on decisions. Use when planning needs divergent option mapping, or when asked to brainstorm or 脑暴." --- # Do-It Brainstorm -## Purpose +Diverge before grill converges: figure out what the requirement actually is and what shapes it could take, map the options, and hand the open decisions to `do-it-grill`. Brainstorm never settles the decision itself — any option that would change implementation direction leaves a `Must Resolve In Grill` item instead of being silently chosen. -Use this to understand what the requirement is and what shape it could take before `do-it-grill` narrows it. Brainstorm is divergent, but it is not a fixed four-persona ritual. The default pass always includes two core read-only lenses: +Two viewpoints always inform the pass — a **product** viewpoint (requirement shape, product boundary, core goal, option tradeoffs) and an **architecture** viewpoint (foundation, extension modules, stage closure, boundaries, verification route). The viewpoints are fixed; how they run scales with tier (§ Tiers). -- `product-strategist`: requirement shape, product boundary, core goal, and option tradeoffs. -- `architecture-strategist`: architecture foundation, extension modules, stage closure, boundaries, and verification route. +## The pass -Supplemental lenses are added only when the task needs them. Brainstorm surfaces possible shapes, product and architecture boundaries, option benefits/costs/risks, and the questions grill must converge. It does not settle the decision itself. +1. **Pick the dominant branch.** Decide which question this turn is really about — product shape, architecture foundation, or a supplemental concern (UX / ops / security / domain). Route effort there instead of spraying it evenly across every axis. If the user is unreachable, state the assumption and proceed. +2. **Run the two viewpoints.** By default the parent reasons through both inline. Spawn them as independent read-only subagents only at Heavy tier or on explicit request (§ Fan-out). +3. **Map options along the decision ladder.** Lay each product or architecture option out along the ladder (see `do-it-router` § Restraint): skip-it → stdlib/native → existing dependency → minimal custom → full build. Bias divergence toward the subtractive and boring options, not only the additive and clever ones — the cheapest option that meets the goal usually wins. +4. **Diverge predictably.** Brainstorm's behaviour is fixed even when its tokens vary: surface genuinely different options (not three variations of one) and compare them on the same axes — for architecture: seam count, depth/leverage, reversibility, blast radius. +5. **Skip the fog of war.** If the pass surfaces no real open decision — the requirement shape and route are already clear — say so and hand straight to planning. Do not manufacture options to fill a template. +6. **Hand off to grill.** Dedup `Must Resolve In Grill`, drop anything local verification settles immediately, and preserve each option's tradeoff plus a recommended default for grill to drive. -## How It Differs From Grill +## How it differs from grill | | `do-it-brainstorm` | `do-it-grill` | |---|---|---| -| Mode | Divergent: widen with product, architecture, and task-fit lenses | Convergent: falsify facts and settle load-bearing decisions | -| Default lenses | Product core + architecture core | Single internal challenger loop | -| Output | Discussion-first summary or `.do-it/brainstorm/.md` | `.do-it/grill/.md` | -| Stop condition | Enough lens input to describe requirement shape, boundaries, options, and grill questions | No execution-blocking item remains | +| Mode | Divergent: widen with product, architecture, and task-fit viewpoints | Convergent: falsify facts, settle load-bearing decisions | +| Output | Inline decision stack, or `.do-it/brainstorm/.md` | `.do-it/grill/.md` | +| Stop | Enough input to describe requirement shape, options, and grill questions | No execution-blocking item remains | | Handoff | Hands `Must Resolve In Grill` to grill | Hands resolved decisions to planning or execution | -Brainstorm never converges. It may compare options, name benefits/costs/risks, -and recommend a default for discussion, but the final route is chosen by grill, -planning, or the user. Any option that would change implementation direction -must leave a clear `Must Resolve In Grill` item instead of being silently -chosen inside brainstorm. +If both run, brainstorm runs first. -## When To Use +## When to use / skip -Trigger brainstorm when the proposal: +Use when the proposal genuinely opens a decision: a new user- or operator-visible surface; a product, positioning, or opportunity-cost tradeoff; a change to architecture shape, module boundaries, install/runtime policy, public commands, or verification strategy; or an explicit "脑暴 / brainstorm / 多视角看一下". A Heavy-tier task without an existing brainstorm also qualifies. -- introduces a new user-visible or operator-visible surface; -- has product, business, positioning, workflow, or opportunity-cost implications; -- changes architecture shape, module boundaries, install/runtime policy, public commands, or verification strategy; -- crosses operational, reliability, security, domain-language, or release boundaries; -- lacks enough project context and the user wants to discover the requirement shape before artifacts; -- the user explicitly asks to "脑暴" / "brainstorm" / "多视角看一下"; -- a Heavy-tier task lands without an existing brainstorm or equivalent discussion. +Skip when the change is a tightly bounded mechanical edit, pure internal cleanup with no product/architecture/operator/release tradeoff, or already covered by a current brainstorm artifact whose frame has not materially changed. When in doubt, prefer the fog-of-war skip over running a full pass on a task with no open decision. -Skip brainstorm when: +## Viewpoints -- the change is a tightly bounded mechanical edit; -- the work is pure internal cleanup with no product, architecture, operator, or release tradeoff; -- a current brainstorm artifact exists for the same proposal and the frame has not materially changed. +The two cores are always present. Add a supplemental lens only when it can change the next implementation route — zero to two, never the whole table: -## Lens Selection - -Always run the two core lenses unless the user explicitly requests a single named lens: - -- `product-strategist` -- `architecture-strategist` - -Then add supplements only when the task calls for them: - -| Supplemental lens | Use when | +| Supplemental lens | Add when | |---|---| -| `ux-designer` | UI, flow, discoverability, accessibility, visual hierarchy, or user-facing copy could change the requirement shape. | +| `ux-designer` | UI, flow, discoverability, accessibility, or user-facing copy could change the requirement shape. | | `end-user-advocate` | Real-use conditions, recovery, stale state, interruption, or mental-model mismatch matters. | -| `ops-sre` | Deployment, rollback, observability, migration, scale, on-call, or incident recovery matters. | -| `ceo-reviewer` | Board-level business tradeoff, revenue path, market window, pricing, or competitive pressure matters. | +| `ops-sre` | Deployment, rollback, observability, migration, scale, or on-call matters. | +| `ceo-reviewer` | Board-level business tradeoff, revenue path, market window, or competitive pressure matters. | | `red-team-reviewer` | Security, persistence, concurrency, replay, partial failure, or adversarial misuse might change the route. | | `domain-language-reviewer` | Terms, names, public concepts, or domain models are overloaded or contradictory. | | `plan-challenger` | Scope, acceptance criteria, sequencing, or route sizing is the main uncertainty. | -Do not run every lens by default. If two supplemental lenses overlap, choose the one whose output can change the next implementation route. - -## Tier Rules - -### Light - -Use only when the user explicitly asks for brainstorm on a Light task. - -- Run `product-strategist` if the question is about demand, boundary, core goal, or option tradeoffs. -- Run `architecture-strategist` if the question is about foundation, extension modules, boundary, stage closure, or verification. -- If neither clearly dominates, run both cores. -- Prefer an inline discussion summary over an artifact unless the user asks for a durable file. +If two supplements overlap, run the one whose output changes the route. -### Standard +## Tiers -Default for normal product, UX, command, workflow, release-adjacent, or architecture-adjacent work. +- **Light** — only when explicitly asked. Run the dominant viewpoint inline (both if neither dominates). Inline summary, no artifact unless the user asks for a durable file. +- **Standard** — default. Both viewpoints inline; add zero to two supplements by task frame and repo truth. Produce an inline decision stack, or `.do-it/brainstorm/.md` only when the next session must read it. +- **Heavy** — parent-only unless assigned. Fan the two viewpoints out as independent subagents; add the minimum supplements for product/operator/security/domain/release/scope risk; write the artifact unless the user asked for discussion only. -- Run both core lenses. -- Add zero to two supplemental lenses based on the task frame and repo truth. -- Produce either an inline decision stack or `.do-it/brainstorm/.md` when the next session must read it. +## Fan-out -### Heavy - -Parent-only unless explicitly assigned. - -- Run both core lenses. -- Add the minimum supplemental lenses needed for product, operator, security, domain, release, or scope risk. -- For release/workflow/policy work, include at least one install/release or skill-quality review later in `do-it-review-loop`; review lenses are not substitutes for brainstorm lenses. -- Write the artifact unless the user explicitly asked for discussion only. - -## Dispatch Pattern - -Each selected lens is read-only and should receive the same frame plus the facts the parent already verified. Lenses may inspect current truth when needed, but they must not write. - -Use the `do-it-subagent-orchestration` contract when delegating: +Spawn lenses as independent read-only subagents only at Heavy tier, on explicit request, or when a viewpoint genuinely needs an independent deep read the parent cannot do inline without bias. When you do, use the `do-it-subagent-orchestration` contract: ```text tier: Light or Standard @@ -109,185 +67,44 @@ scope: write ownership: read-only forbidden paths: src/**, packages/**, apps/**, .do-it/**, dist/**, agents/**, skills/** current truth: -failure-mode forecast: -path map: not applicable, unless architecture-strategist needs a high-level boundary map must-verify facts: stop condition: NEEDS_CONTEXT if the frame is too vague, BLOCKED if forbidden writes are required, otherwise return the schema return schema: see the selected agent definition ``` -Run independent lenses in parallel when the host supports parallel subagents. If the host or task cannot support delegation, the parent may run a compact local version of the same lenses, but must label it as local and avoid pretending independent agents ran. - -## Discussion-First Mode - -Use discussion-first mode when the project or task frame is too blank to justify an artifact: - -- no repo exists or the repo has no relevant product/architecture truth; -- the user is exploring direction, not asking for a durable handoff; -- selected lenses return `NEEDS_CONTEXT` because the goal, audience, or boundary is unknown. +Run independent lenses in parallel when the host supports it. If delegation is unavailable, run a compact local version and label it local — do not pretend independent agents ran. -Return a concise stack: +## Research-first (new architectural surface) -- `Requirement shape`: what the demand appears to be and what forms it could take. -- `Product boundary`: what is in scope, what is out of scope, and which line matters. -- `Core goal`: the success target for this stage. -- `Options`: multiple viable paths with benefits, costs, risks, and when to choose each. -- `Architecture foundation`: the core bottom layer and invariants. -- `Extension modules`: later or optional modules and how they attach. -- `Must resolve in grill`: decisions that change product direction, foundation, implementation route, or verification. -- `Can decide during planning`: details that can be settled after direction is chosen. -- `Context still needed`: only facts that cannot be read locally and would change execution. +When an option introduces a new architectural surface (new dependency / datastore / framework / protocol), the architecture viewpoint must surface ≥2 real alternatives with a recency check before recommending one — do NOT batch-ask the user three generic questions. Hand the specific candidate-choice as a `Must Resolve In Grill` item; grill picks the highest-leverage one and asks. Out of scope here: bug fixes, refactors of existing code, incremental changes within a module. -Do not force `.do-it/brainstorm/.md` in discussion-first mode. Create an artifact only when there is enough context for future sessions to reuse. +## Artifact -## Output Artifact - -When durable handoff is useful, write: +Discussion-first is the default — return the inline stack (`Requirement shape`, `Product boundary`, `Core goal`, `Options`, `Architecture foundation`, `Extension modules`, `Must resolve in grill`, `Can decide during planning`) and stop. Write a durable file only when a future session must reuse it or the task is Heavy: ```text .do-it/brainstorm/.md ``` -`` follows the same slug rule as the grill log (see `do-it-grill` § Grill Log Artifact): slug from the user title or a short hash, with an optional session prefix. `/.do-it/brainstorm/.gitkeep` should exist when the project tracks do-it artifacts. - -In the same turn, write the slug to the task pointer so router and other skills can pick it up: +`` follows the grill-log slug rule. When you write the artifact, in the same turn write the pointer so router and other skills pick it up: ```bash mkdir -p .do-it/runtime && printf '%s' "" > .do-it/runtime/pointer ``` -See `do-it-router` § Task Pointer for the full protocol. Discussion-first mode without an artifact does not write the pointer. - -### File Format - -```markdown ---- -task: -session_id: -created: -status: open -lenses_run: [product-strategist, architecture-strategist, ...] -tier: -mode: ---- - -## Frame - - - -## Core Lenses - -### Product Strategy - - -### Architecture Strategy - - -## Supplemental Lenses - -### - - -Or `none selected`. - -## Requirement Shape - - - -## Product Boundary - -- In scope: -- Out of scope: -- Boundary risk: - -## Core Goal - - - -## Options - -### Option A: - -- Benefits: -- Costs: -- Risks: -- Choose when: - -### Option B: - -- Benefits: -- Costs: -- Risks: -- Choose when: - -## Architecture Foundation - -- Core bottom layer: -- Ownership/contract: -- Stage closure: - -## Extension Modules - -- - -## Grill Handoff - -### Must Resolve In Grill - -- - -### Can Decide During Planning - -- - -## Tensions - -- - -Or `none surfaced`. -``` - -`status: open` means grill has not converged on `Must Resolve In Grill`. Grill flips it to `converged` once every execution-blocking item is resolved or explicitly deferred. - -## Research handoff (Heavy or new architectural surface) - -When brainstorm produces options that involve a new architectural surface (new dependency / datastore / framework / protocol), do NOT batch-ask the user three generic questions here. Instead, hand specific candidate-choice questions to grill as `Must Resolve` items. - -Grill follows the "ask one focused question" rule (see `do-it-grill`). The brainstorm artifact lists the architectural-surface decisions that grill must drive; grill picks the highest-leverage one and asks. This preserves the question-budget discipline. - -Out of scope here: bug fixes, refactors of existing code, incremental changes within an existing module. - -## Handoff To Grill - -After discussion or artifact creation: - -1. Inspect selected lens returns for forbidden-path hits, unsupported claims, and schema drift. -2. Deduplicate `Must Resolve In Grill` and remove questions that local verification can answer immediately. -3. Preserve the option tradeoffs; do not collapse them into one answer inside brainstorm. - For each user-facing decision, include the available options, the practical - tradeoff, and the recommended default that grill should present. -4. Trigger or recommend `do-it-grill` when any must-resolve item changes execution. -5. If all must-resolve items can be verified locally and no user preference remains, the parent may proceed to planning/execution and record why grill is not needed. - -## Token Discipline - -- Core lenses are default; supplemental lenses are justified by task risk. -- Do not summarize every lens inline when an artifact exists; return the path plus requirement shape, options, and grill handoff summary. -- Keep each distilled lens section to about 30 lines. -- Do not re-run brainstorm if the existing artifact still matches the proposal; append only if the frame materially changed. +The full artifact `File Format` and section template live in `references/artifact-format.md`; load it only when writing the file. `status: open` means grill has not converged; grill flips it to `converged`. -## Common Mistakes +## Common mistakes -- Treating `ceo-reviewer`, `ux-designer`, `end-user-advocate`, and `ops-sre` as the fixed default set. They are supplements now. -- Calling `architect-reviewer` instead of `architecture-strategist` before implementation. The former reviews delivered or planned architecture risk; the latter explores system shape during brainstorm. -- Forcing an artifact when the user is still exploring a blank product direction. -- Letting brainstorm make the final decision. Brainstorm maps the requirement shape and options; convergence belongs to grill, planning, or the user. -- Running many overlapping supplements when the two core lenses already identify the route. +- Spawning subagents on a Standard task when both viewpoints fit inline — fan-out is a Heavy/explicit cost, not a default. +- Treating `ceo-reviewer` / `ux-designer` / `end-user-advocate` / `ops-sre` as a fixed default set; they are supplements. +- Calling `architect-reviewer` (reviews delivered/planned risk) instead of the architecture viewpoint (explores system shape before implementation). +- Forcing an artifact when the user is still exploring a blank direction, or manufacturing options when there is no open decision. +- Letting brainstorm make the final decision — convergence belongs to grill, planning, or the user. -## Related Skills +## Related skills -- `do-it-router` — sets tier and decides whether brainstorm is useful. -- `do-it-subagent-orchestration` — dispatch contract for selected lenses. -- `do-it-grill` — converges the must-discuss stack and records resolved facts and decisions in its grill log. -- `do-it-review-loop` — reviews delivered diff, skill quality, install readiness, and release risk after implementation. +- `do-it-router` — sets tier, owns the decision ladder, and decides whether brainstorm is useful. +- `do-it-subagent-orchestration` — dispatch contract for fanned-out lenses. +- `do-it-grill` — converges the must-resolve stack and records resolved facts and decisions. - `do-it-context` — sediments terms anchored during brainstorm or grill. diff --git a/skills/do-it/do-it-brainstorm/references/artifact-format.md b/skills/do-it/do-it-brainstorm/references/artifact-format.md new file mode 100644 index 0000000..a69b59a --- /dev/null +++ b/skills/do-it/do-it-brainstorm/references/artifact-format.md @@ -0,0 +1,89 @@ +# Brainstorm artifact — file format + +Load this only when writing a durable `.do-it/brainstorm/.md`. Discussion-first mode (the default) returns the inline stack and does not write a file. `/.do-it/brainstorm/.gitkeep` should exist when the project tracks do-it artifacts. + +```markdown +--- +task: +session_id: +created: +status: open +lenses_run: [product, architecture, ...] +tier: +mode: +--- + +## Frame + + + +## Core Viewpoints + +### Product + + +### Architecture + + +## Supplemental Lenses + +### + + +Or `none selected`. + +## Requirement Shape + + + +## Product Boundary + +- In scope: +- Out of scope: +- Boundary risk: + +## Core Goal + + + +## Options + +Map each option along the decision ladder (skip-it → stdlib/native → existing dependency → minimal custom → full build) and compare on the same axes. + +### Option A: +- Ladder rung: +- Benefits: +- Costs: +- Risks: +- Choose when: + +### Option B: +- Ladder rung: <...> +- Benefits / Costs / Risks / Choose when + +## Architecture Foundation + +- Core bottom layer: +- Ownership/contract: +- Stage closure: + +## Extension Modules + +- + +## Grill Handoff + +### Must Resolve In Grill +- + +### Can Decide During Planning +- + +## Tensions + +- + +Or `none surfaced`. +``` + +`status: open` means grill has not converged on `Must Resolve In Grill`. Grill flips it to `converged` once every execution-blocking item is resolved or explicitly deferred. Do not collapse option tradeoffs into one answer inside brainstorm; for each user-facing decision, keep the available options, the practical tradeoff, and the recommended default that grill should present. diff --git a/skills/do-it/do-it-grill/SKILL.md b/skills/do-it/do-it-grill/SKILL.md index 11ca100..21dd055 100644 --- a/skills/do-it/do-it-grill/SKILL.md +++ b/skills/do-it/do-it-grill/SKILL.md @@ -1,290 +1,88 @@ --- name: do-it-grill -description: "Use when hidden assumptions or user-only decisions need fact verification before planning, implementation, or closeout." +description: "Pressure-test the load-bearing premises and necessity of a plan before it hardens into code, commits, or claims. Use when hidden assumptions or user-only decisions gate planning, implementation, or closeout, or when asked to grill, challenge, or stress-test." --- # Do-It Grill -## Purpose +Pressure-test reasoning before it hardens into code, commits, or claims. Verify the facts you can verify locally, challenge whether the work needs to exist at all, and surface the one decision that genuinely needs the user — then stop. -Use this to pressure-test reasoning before work hardens into code, docs, commits, or claims. The goal is to verify the facts that can be verified locally, expose the decision that actually needs a user choice, and keep hidden delivery risk from leaking into implementation. +`do-it-brainstorm` diverges (maps requirement shape and options) and runs first when both run; grill converges and never re-diverges. `do-it-review-loop` reviews a delivered diff; grill challenges whether that review is needed, sufficient, or honestly closed. -`do-it-review-loop` owns delivered diff review, QA intake, and multi-perspective findings. `do-it-grill` challenges whether that review is needed, sufficient, or honestly closed. +## The loop -`do-it-brainstorm` runs the divergent step — the product strategy and architecture strategy core lenses, plus any task-fit supplemental lenses, clarify the requirement shape, product boundary, architecture foundation, options, and tradeoffs before grill arrives. Grill then converges the `Must Resolve In Grill` items those lenses surfaced. Brainstorm never converges; grill never diverges. If both run, brainstorm runs first. +Run until every load-bearing premise is either verified against a source or logged as an explicit user decision — no execution-blocking unknown remains. -## How It Differs From a Q&A Template +1. **Necessity first.** Before grilling *how*, grill *whether*: does this need to exist, or does an existing capability already cover it? This is the decision ladder's rung 1 (see `do-it-router` § Restraint) and is usually the most load-bearing premise. +2. **Pick the most load-bearing premise.** The single belief that, if wrong, changes the most downstream work. One at a time — do not dump a five-question template; that gets shallow, scattered answers. +3. **Falsify it cheaply — explore, don't ask.** If `rg`, opening the file, or a quick test can settle it, do that and cite `path:line`. Treat every user claim about behavior ("X is already validated", "called from one place") as a premise to grep, not a fact; if the code disagrees, surface the disagreement with a citation. +4. **Ask only for decisions — one at a time, with a recommended answer.** When local truth cannot decide a preference, priority, or scope tradeoff, ask one focused question: one or two sentences of why-it-matters-now, 2-3 real options with benefit/cost/risk, and your recommended default so the user has something concrete to accept or correct. If the user answers only part, record the settled part and ask the next smallest unresolved decision. +5. **Record and sediment.** Log each result in `.do-it/grill/.md` (§ Grill log). When a term or constraint gets clarified, append one declarative line to `.do-it/CONTEXT.md`. -A common failure is to dump a 5-question template at the user and call it grilling. That gets shallow, scattered answers. Instead: +## When to use / skip -1. **One premise at a time.** Pick the single most-load-bearing premise that, if wrong, would invalidate the most downstream work. Ask only that. Wait for the answer. Then pick the next. -2. **Anchor terms before debating intent.** If the user uses a term that has a definition in `CLAUDE.md`, `.do-it/CONTEXT.md`, or visible in code with a different shape — surface the conflict before going further. -3. **Verify, don't ask, when verification is cheap.** If the question can be answered with `rg`, `cat`, or a quick local command, do that and report what you found. -4. **Ask only for decisions.** Ask the user one focused question only when local truth cannot decide a preference, priority, or scope tradeoff that changes execution. Before asking, explain the decision in plain language, list the viable options with benefits/costs/risks, and recommend a default. -5. **Sediment what you learned.** When a term gets clarified or a constraint surfaces, append it to `.do-it/CONTEXT.md` (one line, declarative). Record the per-task `.do-it/grill/.md` artifact (`kind: fact|decision`, falsifier, status, evidence) per § Grill Log Artifact below. +Grill when: asked to grill, challenge, or stress-test; a plan has multiple plausible paths; acceptance criteria are vague; architecture, interface, release, phase, wave, or multi-agent coordination is involved; a review response seems too agreeable or under-evidenced; a closeout claim may outrun verification. -## When To Use +Skip for tiny mechanical edits with obvious acceptance checks. -Use the grill when: +## Tiers -- the user asks to be grilled, challenged, or stress-tested; -- a plan has multiple plausible paths; -- acceptance criteria are vague; -- architecture, interface, release, phase, wave, or multi-agent coordination is involved; -- a review response seems too agreeable or under-evidenced; -- a closeout claim may outrun verification. +- **Light** — bounded work: inspect the nearest truth, name the one route-changing assumption or risk, ask at most one question (only if a preference blocks the next step), recommend the route. Does not consume brainstorm. +- **Standard** — default for subagents and ordinary non-trivial planning: inspect code/docs/tests before asking, run the loop premise by premise, return blockers, important risks, options, and a recommendation. Must read a brainstorm artifact when one exists. +- **Heavy** — parent-only unless assigned: wave/phase/architecture/interface/release/multi-agent decisions; codebase exploration is mandatory; return scope lock, verified facts, decision options with one recommendation, gates, and residual risk. -Skip the grill for tiny mechanical edits with obvious acceptance checks. +## Anchoring terms -## Tier Rules +When a term feels fuzzy, before debating it: search `CLAUDE.md`, `.do-it/CONTEXT.md`, and `docs/` for an existing definition. If one exists and the user is using it differently, quote both side by side and ask which applies. If none exists and the term will recur, propose one (CONTEXT-FORMAT shape, see `do-it-context`) and write it back. Pick one term per concept; do not let synonyms (`payload` vs `event body`) drift in the same conversation. -### Light +## Assumptions mode -Use for bounded work: +When the user may lack the technical vocabulary, or the codebase has strong existing patterns, do not ask open technical questions. Read current truth first, state the assumptions you would use if the user said nothing (with confidence and evidence), and ask the user to confirm or correct only the one that changes the most downstream work. Rephrase technical choices in product or operator language when that is clearer. Verify cheap assumptions instead of asking; present genuine preferences as a choice — never smuggle an unverified preference in as an assumption. -- inspect the nearest truth; -- name the one assumption or risk that could change the route; -- ask at most one question, and only if a preference is needed; -- recommend the next route. +## Convergence after brainstorm -### Standard +When `.do-it/brainstorm/.md` exists with `status: open` (Standard/Heavy must read it; Light only skims): -Default for subagents and ordinary non-trivial planning: +1. Read `Requirement Shape`, `Options`, `Architecture Foundation`, and the `Must Resolve In Grill` list — each entry is a candidate premise the lenses thought a human or local verification had to settle. +2. Rank by route-impact (cross-lens tensions usually outrank single-lens one-offs) and resolve each via the loop. +3. Add `brainstorm: ` to the grill log frontmatter so `do-it-planning` and `do-it-verification-gate` can trace the lineage. +4. Once every execution-blocking item is resolved, flip the brainstorm artifact's `status: open` → `converged`. Do not delete it. Do not regenerate lens angles inside grill — if a lens return is thin, re-run that brainstorm lens with narrower scope. -- inspect code/docs/tests before asking; -- pick the single highest-leverage premise — ask, verify, decide, then move to the next; -- challenge goal, non-goals, acceptance, sequencing, and verification; -- return blockers, important risks, options, and a recommendation. +## Grill log -### Heavy +Record every pressure-tested fact or decision in `.do-it/grill/.md` so the signal survives context compaction and feeds `do-it-planning` and `do-it-verification-gate`. Standard and Heavy write it; Light and discussion-first may keep it inline. -Parent-only unless explicitly assigned: - -- wave, phase, architecture, interface, release, or multi-agent decisions; -- codebase exploration is mandatory; -- include scope lock, verified facts, decision options, gates, and residual risk. - -## The Iterative Loop - -Run this loop until every execution-blocking item is resolved: facts become `confirmed` or `refuted`, while user preferences become `chosen`, `deferred`, or `needs_user_decision`. - -1. **Pick the most load-bearing premise.** What single belief, if wrong, would change the most downstream decisions? -2. **Try to falsify cheaply.** Read the file, grep for the symbol, run the unit test, glance at the schema. If you find evidence, jump to step 4. -3. **Ask one focused question** only when the remaining unknown is a user decision. Start with the minimum context the user needs, then show 2-3 real options with tradeoffs and a recommended default so the user has something concrete to accept or correct. -4. **Record the outcome** in `.do-it/grill/.md` (see § Grill Log Artifact). Use `kind: fact` with `confirmed/refuted` for evidence, or `kind: decision` with `chosen/deferred/needs_user_decision` for user preferences. -5. **Repeat** with the next-most-load-bearing premise, until the remaining unknowns no longer change the route. - -## Assumptions Mode - -Use assumptions mode when the user may not know the technical vocabulary, the -codebase has strong existing patterns, or open-ended questions would be -needlessly hard to answer. - -1. Read the relevant current truth first: project docs, code map, nearby - patterns, and any brainstorm artifact. -2. State the assumptions you would use if the user said nothing, with confidence - and evidence. -3. Ask the user to confirm or correct only the assumption that changes the most - downstream work. -4. Rephrase technical choices in product or operator language when that is more - useful than implementation jargon. -5. Log confirmed assumptions as decisions or facts before planning consumes - them. - -Do not use assumptions mode to smuggle in unverified preferences. If the -assumption is cheap to verify, verify it; if it is a user preference, present it -as a choice. - -## Convergence After Brainstorm - -When `.do-it/brainstorm/.md` exists with `status: open`, grill enters convergence mode for the same task. The selected brainstorm lenses have already widened the frame; grill's job is to pick the load-bearing items off the brainstorm artifact and resolve them, not to re-run divergence. - -Convergence flow: - -1. **Read the artifact.** Parse `Requirement Shape`, `Product Boundary`, `Core Goal`, `Options`, `Architecture Foundation`, and `Grill Handoff`. Each `Must Resolve In Grill` entry is a candidate premise the brainstorm lenses thought a human, grill, or local verification had to settle. -2. **Rank by route-impact.** Sort candidates by "if this is wrong, how many downstream choices change?" — same rule as the Iterative Loop. Cross-lens tensions usually outrank single-lens one-offs. -3. **Resolve each candidate** via the existing loop (verify cheaply, ask one question if a user decision is needed, log the result in `.do-it/grill/.md` per § Grill Log Artifact). -4. **Reference the brainstorm slug** in the grill log frontmatter: add `brainstorm: ` so `do-it-planning` and `do-it-verification-gate` can trace the lineage. -5. **Flip brainstorm status.** Once every execution-blocking `Must Resolve In Grill` item has a resolution (`chosen` / `deferred` / `needs_user_decision` for decisions; `confirmed` / `refuted` for facts), edit the brainstorm artifact's frontmatter from `status: open` to `status: converged`. Do not delete the artifact; future sessions read it. - -Tier behavior: - -- **Light**: grill does **not** consume brainstorm. If brainstorm exists at Light tier, it was probably explicit; honor it by skimming the personas section briefly, but do the regular single-thread Light grill. -- **Standard / Heavy**: grill **must** read the brainstorm artifact when it exists. Skipping it loses the point of running brainstorm at all. - -Convergence does not regenerate lens angles. If a lens return looks thin or wrong, the answer is to re-run that brainstorm lens with narrower scope, not to invent the missing angle inside grill. - -## Anchoring Terms - -When a term feels fuzzy, before you debate it: - -1. Search `CLAUDE.md`, `.do-it/CONTEXT.md`, and `docs/` for an existing definition. -2. If a definition exists and the user is using it differently, **call out the conflict immediately**. Quote both definitions side by side. Ask which one applies. -3. If no definition exists and the term will be re-used, propose one in CONTEXT-FORMAT shape (see `do-it-context`) and write it back. -4. Avoid synonyms — if `payload` and `event body` mean the same thing in this conversation, pick one and stick to it. - -## Code Reverification - -When the user makes a factual claim about behavior ("the validator already checks X", "this is called from one place", "we don't ship Y in prod"): - -1. Treat it as a premise, not a fact. -2. `grep` / open the file. Two minutes of reading beats five minutes of debate. -3. If the code disagrees, surface the disagreement with a path:line citation. -4. Update the grill log with the verified state. - -## Lenses - -Use these as checks, not personas: - -- Truth: What has been verified locally, and what is only assumed? -- Scope: What is the smallest complete outcome? -- Acceptance: What exact evidence proves the work is done? -- Interface: Which contract or boundary will another part of the system rely on? -- Failure: What is the most likely way this plan ships a bug? -- Review: What would a skeptical reviewer block on? -- Maintenance: What future churn can be avoided without expanding scope? -- Delegation: Does each worker have a tier, ownership, verification, and stop conditions? - -## Question Shape - -When a user decision is needed, use this shape: - -- Context: one or two sentences explaining why the decision matters now. -- Options: 2-3 viable choices, each with benefit, cost, risk, and when to choose it. -- Recommendation: one default with the reason. -- Question: one focused ask. Do not bundle unrelated decisions. - -If the user answers only part of the question, record the settled part and ask -the next smallest unresolved decision. - -## Common Rationalizations - -These are the "I'm done grilling" excuses that should each cost you another loop: - -- *"It feels reasonable."* — Reasonable ≠ verified. Name the evidence. -- *"The user said it works that way."* — Treat as premise. Grep before believing. -- *"It would be a small refactor if wrong."* — Costed in tokens or in a follow-up sprint? Re-state. -- *"We can fix it in review."* — Maybe; but if grill catches it now, review carries fewer findings. -- *"There's no time to anchor terms."* — There's also no time to debug a contract mismatch in production. -- *"Both interpretations probably work."* — Then which one ships? Pick. - -## Red Flags - -Pause and re-grill when you see: - -- The user agreeing immediately with whatever you propose. (Are they confirming or appeasing?) -- Acceptance criteria that say "looks good" or "works". -- A plan that names `Phase 2` of work that doesn't yet exist. -- "Should be straightforward" applied to a multi-package change. -- Terms used inconsistently in the same paragraph. -- A review response that has zero `Blocking` / `Important` items but the change touches a public interface. - -## Codebase Exploration - -For Standard and Heavy grills, inspect enough current truth to avoid theater: - -- current diff and dirty files in scope; -- owning modules, call paths, tests, docs, or plans; -- existing terminology and conventions; -- nearby working examples; -- verification commands or evidence surfaces. - -Stop exploration when the next unknown is a real user decision or the facts are sufficient to recommend a route. - -## Output Shape - -For a light grill: - -- current truth checked; -- one route-changing assumption or risk; -- one question only if user preference blocks the next action; -- recommended route. - -For a standard grill: - -- current truth checked; -- grill log items (facts confirmed/refuted; decisions chosen/deferred/needs_user_decision) — usually written to `.do-it/grill/.md`; -- blockers; -- important risks; -- options considered; -- recommended path; -- verification and review checks. - -For a heavy grill: - -- scope lock; -- verified facts; -- blockers; -- important risks; -- non-blocking opportunities; -- decision options with one recommended path; -- verification and review gates that must be satisfied. - -## Severity - -- `Blocking`: Must resolve before execution or closeout because it can make the work wrong, unsafe, or unverifiable. -- `Important`: Should resolve now because it can create rework, review failure, or unclear ownership. -- `Opportunity`: Useful architecture or quality improvement, but not a blocker unless tied to correctness or delivery risk. - -## Common Mistakes - -- Asking five generic questions at once instead of one focused premise. -- Asking technical open questions when assumptions mode would let the user - confirm or correct a concrete proposal. -- Challenging in the abstract without reading files. -- Turning every idea into a blocker. -- Asking about facts that codebase exploration can answer. -- Accepting a plan because it sounds reasonable while evidence is missing. -- Treating architecture taste as delivery truth. -- Using a fuzzy term repeatedly without ever defining it back to the user or to `.do-it/CONTEXT.md`. - -## Grill Log Artifact - -Grill records every pressure-tested fact or decision in `.do-it/grill/.md` -so the signal outlives the chat turn and feeds `do-it-planning` and -`do-it-verification-gate`. Without it, grilling is just a vibe — once context -compacts, every signal is gone. - -`` is the user's task title (lowercased, dash-separated, ≤32 chars) -when one is obvious, else the first 8 chars of the SHA-1 of the first prompt; -prefix with the short session id when collisions are likely. Keep -`.do-it/grill/.gitkeep` so the directory tracks in git. - -Format: +``: the user's task title (lowercased, dash-separated, ≤32 chars), else the first 8 chars of the SHA-1 of the first prompt; prefix with the short session id on collision. Keep `.do-it/grill/.gitkeep` so the directory tracks in git. ```markdown --- task: session_id: created: -status: open # flips to `resolved` when no execution-blocking item remains +status: open # → resolved when no execution-blocking item remains brainstorm: # only when converging a brainstorm artifact --- ## Items tested - - [ ] **** - kind: - - falsifier: - status: - - evidence: + - evidence: ## Anchored terms - - ****: ← from CLAUDE.md / .do-it/CONTEXT.md / clarified this turn ``` -Rules: use `confirmed`/`refuted` only for facts, `chosen`/`deferred`/`needs_user_decision` -only for decisions. A `needs_user_decision` item blocks only when it changes the -next execution step. Append new items and mutate existing ones in place — never -delete an item; refute or defer it with evidence. Anchored-terms entries get -sedimented to `.do-it/CONTEXT.md` (see `do-it-context`); this section is a -working pad. Flip `status: resolved` once no execution-blocking item remains, and -do not flip it back without a new entry explaining why. +Use `confirmed`/`refuted` only for facts, `chosen`/`deferred`/`needs_user_decision` only for decisions. Append new items and mutate existing ones in place — never delete an item; refute or defer it with evidence. Anchored terms sediment to `.do-it/CONTEXT.md` (see `do-it-context`). A `needs_user_decision` item blocks only when it changes the next execution step; flip `status: resolved` once none remain. + +## Reference + +The challenge taxonomy (eight lenses), the "I'm done grilling" rationalizations to push past, red flags that warrant another loop, severity definitions, and common mistakes live in `references/checks.md`. Load it when a grill stalls or you need the full lens set; the loop above is enough for the common case. -## Related Skills +## Related skills -- `do-it-context` — set up and maintain `.do-it/CONTEXT.md` with project terms and invariants (sediment destination for anchored terms). -- `do-it-brainstorm` — divergent persona pass that runs before grill; produces the open decisions grill converges on. -- `do-it-planning` — consume the grill outcome into a plan card under `.do-it/plans/.md`; reads the grill slug. -- `do-it-review-loop` — apply pressure to the delivered diff after grilling has set the bar. +- `do-it-router` — sets tier and owns the decision ladder that grill's necessity rung references. +- `do-it-brainstorm` — divergent pass that runs first; produces the open decisions grill converges. +- `do-it-context` — `.do-it/CONTEXT.md` is the sediment destination for anchored terms. +- `do-it-planning` — consumes the grill outcome into a plan card under `.do-it/plans/.md`; reads the grill slug. +- `do-it-review-loop` — applies pressure to the delivered diff after grilling sets the bar. diff --git a/skills/do-it/do-it-grill/references/checks.md b/skills/do-it/do-it-grill/references/checks.md new file mode 100644 index 0000000..91dca22 --- /dev/null +++ b/skills/do-it/do-it-grill/references/checks.md @@ -0,0 +1,52 @@ +# Grill reference — lenses, rationalizations, red flags + +Load this when a grill stalls or you want the full challenge set. The loop in `SKILL.md` is enough for the common case; this file is progressive disclosure, not a checklist to run every time. + +## Lenses (checks, not personas) + +- **Truth** — what is verified locally, and what is only assumed? +- **Necessity** — does this need to exist, or does an existing capability cover it? (decision ladder rung 1) +- **Scope** — what is the smallest complete outcome? +- **Acceptance** — what exact evidence proves the work is done? +- **Interface** — which contract or boundary will another part of the system rely on? +- **Failure** — what is the most likely way this plan ships a bug? +- **Review** — what would a skeptical reviewer block on? +- **Maintenance** — what future churn can be avoided without expanding scope? +- **Delegation** — does each worker have a tier, ownership, verification, and stop condition? + +## Rationalizations that should each cost another loop + +- *"It feels reasonable."* — Reasonable ≠ verified. Name the evidence. +- *"The user said it works that way."* — Treat as premise. Grep before believing. +- *"It would be a small refactor if wrong."* — Costed in tokens, or in a follow-up sprint? Re-state. +- *"We can fix it in review."* — Maybe; but caught now, review carries fewer findings. +- *"There's no time to anchor terms."* — There's also no time to debug a contract mismatch in production. +- *"Both interpretations probably work."* — Then which one ships? Pick. +- *"We might need it later."* — That is rung 1. Skip it until the need is real. + +## Red flags — pause and re-grill + +- The user agreeing immediately with whatever you propose. (Confirming, or appeasing?) +- Acceptance criteria that say "looks good" or "works". +- A plan that names `Phase 2` of work that does not yet exist. +- "Should be straightforward" applied to a multi-package change. +- Terms used inconsistently in the same paragraph. +- A review response with zero `Blocking` / `Important` items on a change that touches a public interface. +- A new abstraction, layer, or dependency defended by a longer paragraph than the code it replaces — the prose is complexity smuggled back in. + +## Severity + +- `Blocking` — must resolve before execution or closeout: can make the work wrong, unsafe, or unverifiable. +- `Important` — should resolve now: can create rework, review failure, or unclear ownership. +- `Opportunity` — useful architecture or quality improvement, not a blocker unless tied to correctness or delivery risk. + +## Common mistakes + +- Asking five generic questions at once instead of one focused premise. +- Asking open technical questions when assumptions mode would let the user confirm or correct a concrete proposal. +- Challenging in the abstract without reading files. +- Turning every idea into a blocker. +- Asking about facts that codebase exploration can answer. +- Accepting a plan because it sounds reasonable while evidence is missing. +- Treating architecture taste as delivery truth. +- Using a fuzzy term repeatedly without ever defining it back to the user or to `.do-it/CONTEXT.md`. diff --git a/skills/do-it/do-it-review-loop/SKILL.md b/skills/do-it/do-it-review-loop/SKILL.md index 4311a91..5d6bc11 100644 --- a/skills/do-it/do-it-review-loop/SKILL.md +++ b/skills/do-it/do-it-review-loop/SKILL.md @@ -81,6 +81,18 @@ datastore, framework, runtime, or protocol. Loads and user confirmation step). Returns findings in the standard shape; a memory-pick without a fresh search is `Blocking`. +### YAGNI lens + +Available at Standard tier (optional) and Heavy tier (default when the diff adds +a new abstraction, export, dependency, or `Phase 2` scaffolding). Loads +`code-quality-cleaner` (maintainability + over-engineering against the decision +ladder, see `do-it-router` § Restraint). Returns one-line findings using its +closed tag vocabulary (`delete:` / `stdlib:` / `native:` / `yagni:` / `shrink:`), +each carrying a severity, plus a quantified `net: - lines possible` / +`Lean already. Ship.` verdict. The `anti-patterns-lint.sh` PostToolUse hook is an +advisory write-time pre-filter for the unreferenced-export case; the lens is the +source of truth and catches single-use abstractions the hook cannot see. + ## Review intensity (graduated) Review intensity is the canonical axis for deciding how much review effort to @@ -111,6 +123,7 @@ Review intensity is orthogonal to the tier: - **review-adversarial** (Heavy default for high-risk surface): parallel multi-lens — `reviewer` + `red-team-reviewer` + `spec-compliance-reviewer` (and `architecture-taste-reviewer` if research-first lens triggers, + `code-quality-cleaner` if the diff adds new abstraction/export surface, `do-it-comments-discipline` lens if comments changed). Use when: - Heavy tier with breaks_interface, crosses_packages, security-sensitive code, or migrations @@ -167,7 +180,9 @@ Check every non-trivial diff through five axes before spending time on polish: - Contracts: do APIs, schemas, CLIs, generated outputs, docs, and consumers agree? - Maintainability: did the change add avoidable coupling, dead code, duplicate - logic, or unclear ownership? + logic, unclear ownership, or a single-use / speculative abstraction the + decision ladder would inline? (the YAGNI lens / `code-quality-cleaner` owns + the over-engineering call) - Verification: do the tests and commands actually prove the claim, or did they mock away the risky collaborator chain? diff --git a/skills/do-it/do-it-router/SKILL.md b/skills/do-it/do-it-router/SKILL.md index 702798c..c10af38 100644 --- a/skills/do-it/do-it-router/SKILL.md +++ b/skills/do-it/do-it-router/SKILL.md @@ -77,6 +77,21 @@ these by default: This binds how do-it itself evolves and how it shapes changes inside a project. +### The decision ladder + +The best code is the code you never wrote. When a change calls for new code, walk down these rungs and stop at the first that holds — the cheapest sufficient option wins: + +1. **Does it need to exist?** Speculative or "might need it later" → skip it. +2. **Does the stdlib do it?** → use it. +3. **Is there a native platform feature?** → use it. +4. **Is an already-installed dependency enough?** → use it. +5. **Can it be one line?** → one line. +6. Only then: the smallest custom code that works. + +Bias toward deletion over addition and boring over clever. Never cut for the ladder's sake: input validation at trust boundaries, error handling that prevents data loss, security, accessibility, or a feature the user explicitly asked for. Lazy, not negligent. + +This is the shared "write less" primitive: `do-it-grill` opens with rung 1 (does this need to exist?), `do-it-brainstorm` maps options along the ladder, and `hooks/anti-patterns-lint.sh` flags rungs the code skipped. Reference it; do not restate it. + ## Orthogonal Dimensions In addition to the single tier label, the router writes 5 boolean dimensions into per-session state. They narrow *intensity*, not tier itself: a Standard task can still be `breaks_interface=1` and a downstream skill MAY upgrade its review or drill posture accordingly. diff --git a/tests/hooks/anti-patterns-lint.test.sh b/tests/hooks/anti-patterns-lint.test.sh index a011078..5fe05bf 100644 --- a/tests/hooks/anti-patterns-lint.test.sh +++ b/tests/hooks/anti-patterns-lint.test.sh @@ -278,6 +278,43 @@ _assert_contains "no-consumer covers export default async function" "$OUT" "- no _assert_contains "names MixedFn" "$OUT" "MixedFn" rm -rf "$DIR" +# ------------------------------------------------------------------------- +echo "Case 6d: no-consumer flags a speculative interface with no reference" +DIR=$(_setup_repo) +FILE="$DIR/ports.ts" +cat > "$FILE" <<'EOF' +// initial +EOF +( cd "$DIR" && git add ports.ts && git commit -q -m base ) >/dev/null 2>&1 +cat > "$FILE" <<'EOF' +// initial +export interface SpeculativePort { run(): void; } +EOF +OUT=$(_run_hook "$FILE") +_assert_contains "no-consumer flags unreferenced interface" "$OUT" "- no-consumer:" +_assert_contains "names SpeculativePort" "$OUT" "SpeculativePort" +rm -rf "$DIR" + +# ------------------------------------------------------------------------- +echo "Case 6e: no-consumer silent for a referenced interface" +DIR=$(_setup_repo) +FILE="$DIR/ports.ts" +cat > "$FILE" <<'EOF' +// initial +EOF +cat > "$DIR/impl.ts" <<'EOF' +import { UsedPort } from './ports'; +export class Impl implements UsedPort { run() {} } +EOF +( cd "$DIR" && git add . && git commit -q -m base ) >/dev/null 2>&1 +cat > "$FILE" <<'EOF' +// initial +export interface UsedPort { run(): void; } +EOF +OUT=$(_run_hook "$FILE") +_assert_not_contains "no-consumer silent for referenced interface" "$OUT" "- no-consumer:" +rm -rf "$DIR" + # ------------------------------------------------------------------------- echo "Case 7: out-of-scope extension — skipped silently" DIR=$(_setup_repo)