From 76b5209fbc4e31d60e1bb5dc4dcb1f0a9081f0d9 Mon Sep 17 00:00:00 2001 From: Bartosz Burda Date: Tue, 5 May 2026 12:33:13 +0200 Subject: [PATCH 01/15] Add Pharaoh setup Bootstrap pharaoh.toml, .pharaoh/project/ tailoring, .github/ Copilot agents, and a setup documentation page on top of the existing Sphinx-Needs configuration. Required link chains in pharaoh.toml reflect the 100%-coverage policy observed in needs.json (spec->req, arch->req, safety_goal->hazard, fsr->safety_goal). Mode is reverse-eng with advisory strictness, so workflow gates start permissive over the existing 268-need catalogue. The .gitignore entries are narrow: only .pharaoh/runs, .pharaoh/plans, .pharaoh/session.json, and .pharaoh/cache are ignored. The tailoring under .pharaoh/project/ is tracked. --- .../pharaoh.activity-diagram-draft.agent.md | 10 + .../pharaoh.api-coverage-check.agent.md | 10 + .github/agents/pharaoh.arch-draft.agent.md | 10 + .github/agents/pharaoh.arch-review.agent.md | 10 + .github/agents/pharaoh.audit-fanout.agent.md | 10 + .../pharaoh.block-diagram-draft.agent.md | 10 + .github/agents/pharaoh.bootstrap.agent.md | 13 + .github/agents/pharaoh.change.agent.md | 125 ++++++++++ .../pharaoh.class-diagram-draft.agent.md | 10 + .../pharaoh.component-diagram-draft.agent.md | 10 + .../agents/pharaoh.context-gather.agent.md | 10 + .github/agents/pharaoh.coverage-gap.agent.md | 10 + .github/agents/pharaoh.decide.agent.md | 115 +++++++++ .../agents/pharaoh.decision-record.agent.md | 10 + .../agents/pharaoh.decision-review.agent.md | 10 + .../pharaoh.deployment-diagram-draft.agent.md | 10 + .github/agents/pharaoh.diagram-lint.agent.md | 13 + .../agents/pharaoh.diagram-review.agent.md | 10 + .../pharaoh.dispatch-signal-check.agent.md | 10 + .github/agents/pharaoh.execute-plan.agent.md | 10 + .../pharaoh.fault-tree-diagram-draft.agent.md | 10 + .github/agents/pharaoh.feat-balance.agent.md | 10 + .../pharaoh.feat-component-extract.agent.md | 10 + 
.../pharaoh.feat-draft-from-docs.agent.md | 10 + .github/agents/pharaoh.feat-file-map.agent.md | 10 + .../agents/pharaoh.feat-flow-extract.agent.md | 10 + .github/agents/pharaoh.feat-review.agent.md | 10 + .../agents/pharaoh.finding-record.agent.md | 10 + .github/agents/pharaoh.flow.agent.md | 10 + .github/agents/pharaoh.fmea-review.agent.md | 10 + .github/agents/pharaoh.fmea.agent.md | 10 + .github/agents/pharaoh.gate-advisor.agent.md | 10 + .github/agents/pharaoh.id-allocate.agent.md | 10 + .../pharaoh.id-convention-check.agent.md | 10 + .../agents/pharaoh.lifecycle-check.agent.md | 10 + .../pharaoh.link-completeness-check.agent.md | 10 + .github/agents/pharaoh.mece.agent.md | 145 +++++++++++ .../agents/pharaoh.output-validate.agent.md | 10 + .../pharaoh.papyrus-non-empty-check.agent.md | 10 + .github/agents/pharaoh.plan.agent.md | 127 ++++++++++ .github/agents/pharaoh.process-audit.agent.md | 10 + .github/agents/pharaoh.prose-migrate.agent.md | 10 + .github/agents/pharaoh.quality-gate.agent.md | 10 + .github/agents/pharaoh.release.agent.md | 110 +++++++++ .../pharaoh.reproducibility-check.agent.md | 10 + .../pharaoh.req-code-grounding-check.agent.md | 10 + .../pharaoh.req-codelink-annotate.agent.md | 10 + .github/agents/pharaoh.req-draft.agent.md | 10 + .github/agents/pharaoh.req-from-code.agent.md | 10 + .../agents/pharaoh.req-regenerate.agent.md | 10 + .github/agents/pharaoh.req-review.agent.md | 10 + .../pharaoh.review-completeness.agent.md | 10 + ...haraoh.self-review-coverage-check.agent.md | 10 + .../pharaoh.sequence-diagram-draft.agent.md | 10 + .github/agents/pharaoh.setup.agent.md | 103 ++++++++ .github/agents/pharaoh.spec.agent.md | 124 ++++++++++ .../pharaoh.sphinx-extension-add.agent.md | 10 + .../pharaoh.standard-conformance.agent.md | 10 + .../pharaoh.state-diagram-draft.agent.md | 10 + .../pharaoh.status-lifecycle-check.agent.md | 13 + .../agents/pharaoh.tailor-bootstrap.agent.md | 10 + ...aoh.tailor-code-grounding-filters.agent.md | 10 + 
.github/agents/pharaoh.tailor-detect.agent.md | 10 + .github/agents/pharaoh.tailor-fill.agent.md | 10 + .github/agents/pharaoh.tailor-review.agent.md | 10 + .github/agents/pharaoh.toctree-emit.agent.md | 10 + .github/agents/pharaoh.trace.agent.md | 107 ++++++++ .../pharaoh.use-case-diagram-draft.agent.md | 10 + .github/agents/pharaoh.vplan-draft.agent.md | 10 + .github/agents/pharaoh.vplan-review.agent.md | 10 + .github/agents/pharaoh.write-plan.agent.md | 10 + .github/copilot-instructions.md | 84 +++++++ .github/prompts/pharaoh.author.prompt.md | 3 + .github/prompts/pharaoh.change.prompt.md | 3 + .github/prompts/pharaoh.mece.prompt.md | 3 + .github/prompts/pharaoh.plan.prompt.md | 3 + .github/prompts/pharaoh.release.prompt.md | 3 + .github/prompts/pharaoh.trace.prompt.md | 3 + .github/prompts/pharaoh.verify.prompt.md | 3 + .gitignore | 6 + .pharaoh/project/artefact-catalog.yaml | 228 ++++++++++++++++++ .pharaoh/project/checklists/arch.md | 11 + .pharaoh/project/checklists/component.md | 11 + .pharaoh/project/checklists/fsr.md | 11 + .pharaoh/project/checklists/hazard.md | 11 + .pharaoh/project/checklists/impl.md | 11 + .pharaoh/project/checklists/interface.md | 11 + .pharaoh/project/checklists/need.md | 8 + .pharaoh/project/checklists/person.md | 8 + .pharaoh/project/checklists/release.md | 8 + .pharaoh/project/checklists/req.md | 12 + .pharaoh/project/checklists/requirement.md | 6 + .pharaoh/project/checklists/safety_goal.md | 11 + .pharaoh/project/checklists/seq_msg.md | 8 + .pharaoh/project/checklists/spec.md | 12 + .pharaoh/project/checklists/swarch.md | 11 + .pharaoh/project/checklists/swreq.md | 11 + .pharaoh/project/checklists/sys-arch.md | 11 + .pharaoh/project/checklists/sysreq.md | 11 + .pharaoh/project/checklists/team.md | 8 + .pharaoh/project/checklists/test.md | 12 + .pharaoh/project/id-conventions.yaml | 26 ++ .pharaoh/project/workflows.yaml | 227 +++++++++++++++++ docs/index.rst | 1 + docs/pharaoh.rst | 180 ++++++++++++++ pharaoh.toml | 46 ++++ 106 
files changed, 2617 insertions(+) create mode 100644 .github/agents/pharaoh.activity-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.api-coverage-check.agent.md create mode 100644 .github/agents/pharaoh.arch-draft.agent.md create mode 100644 .github/agents/pharaoh.arch-review.agent.md create mode 100644 .github/agents/pharaoh.audit-fanout.agent.md create mode 100644 .github/agents/pharaoh.block-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.bootstrap.agent.md create mode 100644 .github/agents/pharaoh.change.agent.md create mode 100644 .github/agents/pharaoh.class-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.component-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.context-gather.agent.md create mode 100644 .github/agents/pharaoh.coverage-gap.agent.md create mode 100644 .github/agents/pharaoh.decide.agent.md create mode 100644 .github/agents/pharaoh.decision-record.agent.md create mode 100644 .github/agents/pharaoh.decision-review.agent.md create mode 100644 .github/agents/pharaoh.deployment-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.diagram-lint.agent.md create mode 100644 .github/agents/pharaoh.diagram-review.agent.md create mode 100644 .github/agents/pharaoh.dispatch-signal-check.agent.md create mode 100644 .github/agents/pharaoh.execute-plan.agent.md create mode 100644 .github/agents/pharaoh.fault-tree-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.feat-balance.agent.md create mode 100644 .github/agents/pharaoh.feat-component-extract.agent.md create mode 100644 .github/agents/pharaoh.feat-draft-from-docs.agent.md create mode 100644 .github/agents/pharaoh.feat-file-map.agent.md create mode 100644 .github/agents/pharaoh.feat-flow-extract.agent.md create mode 100644 .github/agents/pharaoh.feat-review.agent.md create mode 100644 .github/agents/pharaoh.finding-record.agent.md create mode 100644 .github/agents/pharaoh.flow.agent.md create mode 100644 
.github/agents/pharaoh.fmea-review.agent.md create mode 100644 .github/agents/pharaoh.fmea.agent.md create mode 100644 .github/agents/pharaoh.gate-advisor.agent.md create mode 100644 .github/agents/pharaoh.id-allocate.agent.md create mode 100644 .github/agents/pharaoh.id-convention-check.agent.md create mode 100644 .github/agents/pharaoh.lifecycle-check.agent.md create mode 100644 .github/agents/pharaoh.link-completeness-check.agent.md create mode 100644 .github/agents/pharaoh.mece.agent.md create mode 100644 .github/agents/pharaoh.output-validate.agent.md create mode 100644 .github/agents/pharaoh.papyrus-non-empty-check.agent.md create mode 100644 .github/agents/pharaoh.plan.agent.md create mode 100644 .github/agents/pharaoh.process-audit.agent.md create mode 100644 .github/agents/pharaoh.prose-migrate.agent.md create mode 100644 .github/agents/pharaoh.quality-gate.agent.md create mode 100644 .github/agents/pharaoh.release.agent.md create mode 100644 .github/agents/pharaoh.reproducibility-check.agent.md create mode 100644 .github/agents/pharaoh.req-code-grounding-check.agent.md create mode 100644 .github/agents/pharaoh.req-codelink-annotate.agent.md create mode 100644 .github/agents/pharaoh.req-draft.agent.md create mode 100644 .github/agents/pharaoh.req-from-code.agent.md create mode 100644 .github/agents/pharaoh.req-regenerate.agent.md create mode 100644 .github/agents/pharaoh.req-review.agent.md create mode 100644 .github/agents/pharaoh.review-completeness.agent.md create mode 100644 .github/agents/pharaoh.self-review-coverage-check.agent.md create mode 100644 .github/agents/pharaoh.sequence-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.setup.agent.md create mode 100644 .github/agents/pharaoh.spec.agent.md create mode 100644 .github/agents/pharaoh.sphinx-extension-add.agent.md create mode 100644 .github/agents/pharaoh.standard-conformance.agent.md create mode 100644 .github/agents/pharaoh.state-diagram-draft.agent.md create mode 100644 
.github/agents/pharaoh.status-lifecycle-check.agent.md create mode 100644 .github/agents/pharaoh.tailor-bootstrap.agent.md create mode 100644 .github/agents/pharaoh.tailor-code-grounding-filters.agent.md create mode 100644 .github/agents/pharaoh.tailor-detect.agent.md create mode 100644 .github/agents/pharaoh.tailor-fill.agent.md create mode 100644 .github/agents/pharaoh.tailor-review.agent.md create mode 100644 .github/agents/pharaoh.toctree-emit.agent.md create mode 100644 .github/agents/pharaoh.trace.agent.md create mode 100644 .github/agents/pharaoh.use-case-diagram-draft.agent.md create mode 100644 .github/agents/pharaoh.vplan-draft.agent.md create mode 100644 .github/agents/pharaoh.vplan-review.agent.md create mode 100644 .github/agents/pharaoh.write-plan.agent.md create mode 100644 .github/copilot-instructions.md create mode 100644 .github/prompts/pharaoh.author.prompt.md create mode 100644 .github/prompts/pharaoh.change.prompt.md create mode 100644 .github/prompts/pharaoh.mece.prompt.md create mode 100644 .github/prompts/pharaoh.plan.prompt.md create mode 100644 .github/prompts/pharaoh.release.prompt.md create mode 100644 .github/prompts/pharaoh.trace.prompt.md create mode 100644 .github/prompts/pharaoh.verify.prompt.md create mode 100644 .pharaoh/project/artefact-catalog.yaml create mode 100644 .pharaoh/project/checklists/arch.md create mode 100644 .pharaoh/project/checklists/component.md create mode 100644 .pharaoh/project/checklists/fsr.md create mode 100644 .pharaoh/project/checklists/hazard.md create mode 100644 .pharaoh/project/checklists/impl.md create mode 100644 .pharaoh/project/checklists/interface.md create mode 100644 .pharaoh/project/checklists/need.md create mode 100644 .pharaoh/project/checklists/person.md create mode 100644 .pharaoh/project/checklists/release.md create mode 100644 .pharaoh/project/checklists/req.md create mode 100644 .pharaoh/project/checklists/requirement.md create mode 100644 .pharaoh/project/checklists/safety_goal.md 
create mode 100644 .pharaoh/project/checklists/seq_msg.md create mode 100644 .pharaoh/project/checklists/spec.md create mode 100644 .pharaoh/project/checklists/swarch.md create mode 100644 .pharaoh/project/checklists/swreq.md create mode 100644 .pharaoh/project/checklists/sys-arch.md create mode 100644 .pharaoh/project/checklists/sysreq.md create mode 100644 .pharaoh/project/checklists/team.md create mode 100644 .pharaoh/project/checklists/test.md create mode 100644 .pharaoh/project/id-conventions.yaml create mode 100644 .pharaoh/project/workflows.yaml create mode 100644 docs/pharaoh.rst create mode 100644 pharaoh.toml diff --git a/.github/agents/pharaoh.activity-diagram-draft.agent.md b/.github/agents/pharaoh.activity-diagram-draft.agent.md new file mode 100644 index 0000000..cb40cfe --- /dev/null +++ b/.github/agents/pharaoh.activity-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. +handoffs: [] +--- + +# @pharaoh.activity-diagram-draft + +Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. + +See [`skills/pharaoh-activity-diagram-draft/SKILL.md`](../../skills/pharaoh-activity-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.api-coverage-check.agent.md b/.github/agents/pharaoh.api-coverage-check.agent.md new file mode 100644 index 0000000..f76eb9c --- /dev/null +++ b/.github/agents/pharaoh.api-coverage-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in needs.json. 
Reverse direction of pharaoh-req-from-code — language-parametric via the shared regex table; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. +handoffs: [] +--- + +# @pharaoh.api-coverage-check + +Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in `needs.json`. Reverse direction of `pharaoh-req-from-code` — language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md`; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. + +See [`skills/pharaoh-api-coverage-check/SKILL.md`](../../skills/pharaoh-api-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. diff --git a/.github/agents/pharaoh.arch-draft.agent.md b/.github/agents/pharaoh.arch-draft.agent.md new file mode 100644 index 0000000..36d408c --- /dev/null +++ b/.github/agents/pharaoh.arch-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Draft a single sphinx-needs architecture element from one parent requirement. +handoffs: [] +--- + +# @pharaoh.arch-draft + +Draft a single sphinx-needs architecture element from one parent requirement. + +See [`skills/pharaoh-arch-draft/SKILL.md`](../../skills/pharaoh-arch-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.arch-review.agent.md b/.github/agents/pharaoh.arch-review.agent.md new file mode 100644 index 0000000..c46b8fb --- /dev/null +++ b/.github/agents/pharaoh.arch-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single architecture element against ISO 26262-8 §6 axes. +handoffs: [] +--- + +# @pharaoh.arch-review + +Audit a single architecture element against ISO 26262-8 §6 axes. 
+ +See [`skills/pharaoh-arch-review/SKILL.md`](../../skills/pharaoh-arch-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.audit-fanout.agent.md b/.github/agents/pharaoh.audit-fanout.agent.md new file mode 100644 index 0000000..e809101 --- /dev/null +++ b/.github/agents/pharaoh.audit-fanout.agent.md @@ -0,0 +1,10 @@ +--- +description: Run a full project audit in parallel across atomic audit skills, sharing findings via Papyrus. +handoffs: [] +--- + +# @pharaoh.audit-fanout + +Run a full project audit in parallel across atomic audit skills, sharing findings via Papyrus. + +See [`skills/pharaoh-audit-fanout/SKILL.md`](../../skills/pharaoh-audit-fanout/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.block-diagram-draft.agent.md b/.github/agents/pharaoh.block-diagram-draft.agent.md new file mode 100644 index 0000000..0436607 --- /dev/null +++ b/.github/agents/pharaoh.block-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. +handoffs: [] +--- + +# @pharaoh.block-diagram-draft + +Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. + +See [`skills/pharaoh-block-diagram-draft/SKILL.md`](../../skills/pharaoh-block-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.bootstrap.agent.md b/.github/agents/pharaoh.bootstrap.agent.md new file mode 100644 index 0000000..bca421a --- /dev/null +++ b/.github/agents/pharaoh.bootstrap.agent.md @@ -0,0 +1,13 @@ +--- +description: Inject minimum sphinx-needs configuration into an existing Sphinx project so sphinx-build produces a valid needs.json. +handoffs: + - label: Detect and scaffold Pharaoh + agent: pharaoh.setup + prompt: Detect the freshly configured sphinx-needs project and scaffold pharaoh.toml +--- + +# @pharaoh.bootstrap + +Inject the minimum sphinx-needs configuration — extension entry, need types, optional extra links — into an existing Sphinx project that does not yet have sphinx-needs configured. Does not seed RST content, does not build, does not write `pharaoh.toml`. + +See [`skills/pharaoh-bootstrap/SKILL.md`](../../skills/pharaoh-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.change.agent.md b/.github/agents/pharaoh.change.agent.md new file mode 100644 index 0000000..1ed8190 --- /dev/null +++ b/.github/agents/pharaoh.change.agent.md @@ -0,0 +1,125 @@ +--- +description: Analyze the impact of changing a requirement, specification, or any sphinx-needs item. Traces through all link types and codelinks to produce a Change Document. +handoffs: + - label: MECE Check + agent: pharaoh.mece + prompt: Check the affected area for gaps and redundancies + - label: Trace Requirement + agent: pharaoh.trace + prompt: Trace the changed requirement through all levels +--- + +# @pharaoh.change: Change Impact Analysis + +Analyze the full impact of a proposed change to any sphinx-needs item. Trace through ALL link types -- standard `links`, `extra_links` (implements, tests, etc.), and sphinx-codelinks -- to produce a structured Change Document listing every affected need and code file with a recommended action. 
+ +## Data Access + +Use the best available data source in this priority order: + +1. **ubc CLI**: Run `ubc --version`. If available, use `ubc build needs --format json` for the needs index. Use `ubc diff` for structural change detection. +2. **ubCode MCP**: Check for MCP tools with names containing `ubcode` or `useblocks`. Use for pre-indexed data and link graph. +3. **Raw file parsing**: Search for `ubproject.toml` or `conf.py` for configuration. Grep for need directives (`.. ::`) in RST/MD files. Parse options (`:id:`, `:status:`, `:links:`, extra_links). Build the link graph manually. + +Read `pharaoh.toml` for strictness level, workflow gates, traceability requirements, and codelinks settings. + +## Process + +### Step 1: Understand the Change + +Extract from the user's request: +- **Target need ID(s)**: One or more need IDs. If described by title, resolve after data access. +- **Nature of change**: Value change, addition, removal, or restructuring. +- **Change description**: What changes and why. + +Clarify ambiguity with at most one round of questions. + +### Step 2: Get Project Data + +Detect the project and build the needs index. Present summary: + +``` +Project: () +Types: +Links: +Data source: +Needs found: +Codelinks: +Strictness: +``` + +Resolve need IDs from descriptions if needed (title/content matching). + +### Step 3: Impact Analysis + +**Direct impact (1 hop)**: Find every need directly linked to the target through any link type and direction. + +**Transitive impact (full graph)**: BFS from directly impacted needs through all link types. Track distance from target. Use a visited set to handle cycles. + +**Code impact** (if codelinks enabled): For every affected need, search code files for codelink annotations (`# codelink: `, `// codelink: `, etc.). + +**Classify severity** for each affected item: +- **Must update**: Content references the specific value being changed. 
+- **Review needed**: Linked but impact unclear; content relates to the changed property without referencing the specific value. +- **No change needed**: Linked but addresses a different concern entirely. + +### Step 4: Produce Change Document + +``` +## Change Document + +### Change Request +- **Target**: () +- **Change**: <description> +- **Date**: <ISO 8601 date> + +### Direct Impact (1 hop) + +| Need ID | Type | Title | Link Type | Direction | Action | +|---------|------|-------|-----------|-----------|--------| + +### Transitive Impact + +| Need ID | Type | Title | Distance | Path | Action | +|---------|------|-------|----------|------|--------| + +### Code Impact + +| File | Location | Linked Need | Action | +|------|----------|-------------|--------| + +### Summary +- Needs requiring update: <count> +- Needs requiring review: <count> +- No change needed: <count> +- Code files affected: <count> +- Recommendation: <proceed / escalate / discuss> +``` + +**Recommendation**: Proceed if <= 5 must-update items and no safety-tagged needs affected. Escalate if safety/critical/regulatory-tagged needs are "Must update" or >10 items need update. Discuss if impact is ambiguous. + +### Step 5: Update Session State + +Write to `.pharaoh/session.json`: +- Set `changes.<target_id>.change_analysis` to current timestamp. +- Set `acknowledged` to `false` initially. + +### Step 6: Ask for Acknowledgment + +``` +Acknowledge this change analysis? Acknowledging allows proceeding to @pharaoh.author for the affected needs. +``` + +If acknowledged, set `changes.<target_id>.acknowledged = true` in session state. + +## Strictness Behavior + +This agent has **no prerequisites** and runs freely in both advisory and enforcing modes. However, its output gates `@pharaoh.author` in enforcing mode -- authoring requires an acknowledged change analysis. + +## Constraints + +1. Always trace ALL configured link types. Read the project's `extra_links` configuration. +2. 
Handle circular links with a visited set. Report cycles. +3. Support multi-project setups. Label cross-project needs. +4. For large impact scopes (>50 needs), recommend escalation. +5. Never modify need source files. This agent is read-only except for session state. diff --git a/.github/agents/pharaoh.class-diagram-draft.agent.md b/.github/agents/pharaoh.class-diagram-draft.agent.md new file mode 100644 index 0000000..7eef840 --- /dev/null +++ b/.github/agents/pharaoh.class-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). +handoffs: [] +--- + +# @pharaoh.class-diagram-draft + +Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). + +See [`skills/pharaoh-class-diagram-draft/SKILL.md`](../../skills/pharaoh-class-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.component-diagram-draft.agent.md b/.github/agents/pharaoh.component-diagram-draft.agent.md new file mode 100644 index 0000000..b0bea11 --- /dev/null +++ b/.github/agents/pharaoh.component-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. +handoffs: [] +--- + +# @pharaoh.component-diagram-draft + +Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. 
+ +See [`skills/pharaoh-component-diagram-draft/SKILL.md`](../../skills/pharaoh-component-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.context-gather.agent.md b/.github/agents/pharaoh.context-gather.agent.md new file mode 100644 index 0000000..24926d3 --- /dev/null +++ b/.github/agents/pharaoh.context-gather.agent.md @@ -0,0 +1,10 @@ +--- +description: Retrieve rationale memories from a Papyrus workspace before authoring or review. +handoffs: [] +--- + +# @pharaoh.context-gather + +Retrieve rationale memories from a Papyrus workspace before authoring or review. + +See [`skills/pharaoh-context-gather/SKILL.md`](../../skills/pharaoh-context-gather/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.coverage-gap.agent.md b/.github/agents/pharaoh.coverage-gap.agent.md new file mode 100644 index 0000000..ef820ca --- /dev/null +++ b/.github/agents/pharaoh.coverage-gap.agent.md @@ -0,0 +1,10 @@ +--- +description: Detect one gap category (orphan / unverified / duplicate / contradictory / lifecycle) in a sphinx-needs corpus. +handoffs: [] +--- + +# @pharaoh.coverage-gap + +Detect one gap category (orphan / unverified / duplicate / contradictory / lifecycle) in a sphinx-needs corpus. + +See [`skills/pharaoh-coverage-gap/SKILL.md`](../../skills/pharaoh-coverage-gap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.decide.agent.md b/.github/agents/pharaoh.decide.agent.md new file mode 100644 index 0000000..2b6938e --- /dev/null +++ b/.github/agents/pharaoh.decide.agent.md @@ -0,0 +1,115 @@ +--- +description: Record a design decision as a traceable sphinx-needs object with alternatives, rationale, and links to affected requirements. 
+handoffs: + - label: Trace Decision + agent: pharaoh.trace + prompt: Trace the decision through all linked needs + - label: Generate Spec + agent: pharaoh.spec + prompt: Generate a spec document from the affected requirements +--- + +# @pharaoh.decide + +Record design decisions as `decision` needs with `decided_by`, `alternatives`, `rationale` fields and `:decides:` links. Delegates RST writing to @pharaoh.author to avoid duplicating directive-writing logic. + +## Data Access + +1. **ubc CLI**: `ubc build needs --format json` for index, `ubc config` for schema. +2. **ubCode MCP**: Pre-indexed needs data. +3. **Raw file parsing**: Read `ubproject.toml`/`conf.py` for types, extra_links, ID settings. Grep for directives. Parse needs. + +Read `pharaoh.toml` for strictness level and workflow settings. + +## Process + +### Step 1: Get Project Data + +Build needs index. Present detection summary. Verify that a `decision` type is configured. If missing, show the user the TOML to add: + +```toml +[[needs.types]] +directive = "decision" +title = "Decision" +prefix = "DEC_" +color = "#E8D0A9" +style = "node" +``` + +Also verify `decided_by`, `alternatives`, `rationale` extra options and the `decides` extra link type exist. Ask user to confirm before proceeding if anything is missing. + +### Step 2: Gather Decision Context + +Collect all required fields: + +- **Title**: What is being decided. +- **Affected needs**: Need IDs for the `:decides:` link. +- **decided_by**: Who made the decision. Default to `claude` when AI decides autonomously. +- **alternatives**: Rejected alternatives, semicolon-separated. +- **rationale**: Why this option was chosen. +- **status**: One of `proposed`, `accepted`, `superseded`, `rejected`. + +**Standalone**: Prompt the user for each missing piece. Do not proceed until all five fields are populated. + +**Called by @pharaoh.spec**: Accept all context programmatically. Do not prompt. 
+ +**Status defaults**: `proposed` when standalone, `accepted` when called by @pharaoh.spec. User may override. + +### Step 3: Generate ID + +Reuse @pharaoh.author ID generation logic: + +1. Check `pharaoh.toml` for `[pharaoh.id_scheme]`. Apply pattern with `{TYPE}` resolving to `DEC`. +2. If no scheme, infer from existing `decision` needs (look for `DEC_*` numbering). +3. If no existing decisions, use prefix from type config and start at `001`, padded to `id_length`. +4. Validate uniqueness against the full needs index. + +### Step 4: Write the Need + +Delegate to @pharaoh.author with all fields: + +```rst +.. decision:: <title> + :id: <generated_id> + :status: <proposed|accepted> + :decides: <need_id1>, <need_id2> + :decided_by: <name or claude> + :alternatives: <alt1>; <alt2> + :rationale: <why this option> + + <expanded description> +``` + +**Superseding**: When replacing an old decision, set old status to `superseded` via @pharaoh.author, add `:links: <old_dec_id>` on the new decision, and explain the replacement in the description. + +### Step 5: File Placement + +Place in `decisions.rst` in the same directory as the first need in `:decides:`. Create the file with proper RST title if it does not exist. If no `:decides:` links, fall back to @pharaoh.author file placement. Delegate actual writing to @pharaoh.author. + +### Step 6: Update Session State + +Write to `.pharaoh/session.json`: set `changes.<dec_id>.authored = true` with current ISO 8601 timestamp. + +### Step 7: Follow-up + +**Standalone**: Suggest `Run @pharaoh.verify to validate the decision against its linked requirements.` + +**Called by @pharaoh.spec**: Return the decision ID silently. No follow-up. + +## Strictness Behavior + +**Advisory mode**: Execute freely. No gates. No tips needed -- decisions can be recorded at any time. + +**Enforcing mode**: Execute freely. No gates. Decisions are gate-free in both modes. + +Strictness has no effect on decision recording. 
Both modes follow the same process. + +## Constraints + +1. **All three fields mandatory.** Always populate `decided_by`, `alternatives`, `rationale`. Ask explicitly if any are missing. +2. **Default `decided_by` to `claude`** when the AI decides autonomously (e.g., during @pharaoh.spec). +3. **Default `status`** to `proposed` (standalone) or `accepted` (called by @pharaoh.spec). +4. **Superseding requires two writes.** Update old decision to `superseded` AND add `:links:` on the new decision. +5. **Reuse @pharaoh.author** for RST writing, file placement, and ID generation. Do not duplicate logic. +6. **Validate `:decides:` targets exist.** Warn if a target is missing from the needs index. +7. **Semicolons for alternatives.** Separate with semicolons, not commas. diff --git a/.github/agents/pharaoh.decision-record.agent.md b/.github/agents/pharaoh.decision-record.agent.md new file mode 100644 index 0000000..6b6b044 --- /dev/null +++ b/.github/agents/pharaoh.decision-record.agent.md @@ -0,0 +1,10 @@ +--- +description: Record a canonical decision, fact, or preference in the shared Papyrus workspace with (type, canonical_name) dedup. +handoffs: [] +--- + +# @pharaoh.decision-record + +Record a canonical decision, fact, or preference in the shared Papyrus workspace with (type, canonical_name) dedup. + +See [`skills/pharaoh-decision-record/SKILL.md`](../../skills/pharaoh-decision-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.decision-review.agent.md b/.github/agents/pharaoh.decision-review.agent.md new file mode 100644 index 0000000..e644a61 --- /dev/null +++ b/.github/agents/pharaoh.decision-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single recorded decision against context/alternatives/consequences structure and traceability. 
+handoffs: [] +--- + +# @pharaoh.decision-review + +Audit a single recorded decision against context/alternatives/consequences structure and traceability. + +See [`skills/pharaoh-decision-review/SKILL.md`](../../skills/pharaoh-decision-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.deployment-diagram-draft.agent.md b/.github/agents/pharaoh.deployment-diagram-draft.agent.md new file mode 100644 index 0000000..8e99724 --- /dev/null +++ b/.github/agents/pharaoh.deployment-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). +handoffs: [] +--- + +# @pharaoh.deployment-diagram-draft + +Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). + +See [`skills/pharaoh-deployment-diagram-draft/SKILL.md`](../../skills/pharaoh-deployment-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.diagram-lint.agent.md b/.github/agents/pharaoh.diagram-lint.agent.md new file mode 100644 index 0000000..4aa99bd --- /dev/null +++ b/.github/agents/pharaoh.diagram-lint.agent.md @@ -0,0 +1,13 @@ +--- +description: Walk a directory of RST files and check every `.. mermaid::` / `.. uml::` block against the real renderer parser (mmdc, plantuml). Catches silent parse failures that sphinx-build misses. 
+handoffs: + - label: Aggregate into quality gate + agent: pharaoh.quality-gate + prompt: Consume the diagram-lint findings alongside review/mece/coverage reports for the terminal pass/fail decision +--- + +# @pharaoh.diagram-lint + +Walk a directory of RST files, extract every Mermaid / PlantUML block, and parse each block with the real renderer CLI (`mmdc -i tmp.mmd -o /dev/null`, `plantuml -checkonly`). Emits structured findings. Read-only — does not modify RST. When a renderer CLI is unavailable, degrades gracefully with a warning and install command. + +See [`skills/pharaoh-diagram-lint/SKILL.md`](../../skills/pharaoh-diagram-lint/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.diagram-review.agent.md b/.github/agents/pharaoh.diagram-review.agent.md new file mode 100644 index 0000000..5b7d396 --- /dev/null +++ b/.github/agents/pharaoh.diagram-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. +handoffs: [] +--- + +# @pharaoh.diagram-review + +Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. + +See [`skills/pharaoh-diagram-review/SKILL.md`](../../skills/pharaoh-diagram-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.dispatch-signal-check.agent.md b/.github/agents/pharaoh.dispatch-signal-check.agent.md new file mode 100644 index 0000000..52d8c1e --- /dev/null +++ b/.github/agents/pharaoh.dispatch-signal-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. +handoffs: [] +--- + +# @pharaoh.dispatch-signal-check + +Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. 
+ +See [`skills/pharaoh-dispatch-signal-check/SKILL.md`](../../skills/pharaoh-dispatch-signal-check/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.execute-plan.agent.md b/.github/agents/pharaoh.execute-plan.agent.md new file mode 100644 index 0000000..2c1cf46 --- /dev/null +++ b/.github/agents/pharaoh.execute-plan.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when executing a plan. +handoffs: [] +--- + +# @pharaoh.execute-plan + +Use when executing a plan. + +See [`skills/pharaoh-execute-plan/SKILL.md`](../../skills/pharaoh-execute-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md new file mode 100644 index 0000000..e7ad53d --- /dev/null +++ b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). +handoffs: [] +--- + +# @pharaoh.fault-tree-diagram-draft + +Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). + +See [`skills/pharaoh-fault-tree-diagram-draft/SKILL.md`](../../skills/pharaoh-fault-tree-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-balance.agent.md b/.github/agents/pharaoh.feat-balance.agent.md new file mode 100644 index 0000000..3829768 --- /dev/null +++ b/.github/agents/pharaoh.feat-balance.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). +handoffs: [] +--- + +# @pharaoh.feat-balance + +Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). + +See [`skills/pharaoh-feat-balance/SKILL.md`](../../skills/pharaoh-feat-balance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-component-extract.agent.md b/.github/agents/pharaoh.feat-component-extract.agent.md new file mode 100644 index 0000000..c29c37e --- /dev/null +++ b/.github/agents/pharaoh.feat-component-extract.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. +handoffs: [] +--- + +# @pharaoh.feat-component-extract + +Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. + +See [`skills/pharaoh-feat-component-extract/SKILL.md`](../../skills/pharaoh-feat-component-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-draft-from-docs.agent.md b/.github/agents/pharaoh.feat-draft-from-docs.agent.md new file mode 100644 index 0000000..63216a1 --- /dev/null +++ b/.github/agents/pharaoh.feat-draft-from-docs.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. +handoffs: [] +--- + +# @pharaoh.feat-draft-from-docs + +Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. + +See [`skills/pharaoh-feat-draft-from-docs/SKILL.md`](../../skills/pharaoh-feat-draft-from-docs/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-file-map.agent.md b/.github/agents/pharaoh.feat-file-map.agent.md new file mode 100644 index 0000000..0680f3b --- /dev/null +++ b/.github/agents/pharaoh.feat-file-map.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. +handoffs: [] +--- + +# @pharaoh.feat-file-map + +Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. + +See [`skills/pharaoh-feat-file-map/SKILL.md`](../../skills/pharaoh-feat-file-map/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-flow-extract.agent.md b/.github/agents/pharaoh.feat-flow-extract.agent.md new file mode 100644 index 0000000..444be70 --- /dev/null +++ b/.github/agents/pharaoh.feat-flow-extract.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. +handoffs: [] +--- + +# @pharaoh.feat-flow-extract + +Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. + +See [`skills/pharaoh-feat-flow-extract/SKILL.md`](../../skills/pharaoh-feat-flow-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-review.agent.md b/.github/agents/pharaoh.feat-review.agent.md new file mode 100644 index 0000000..af64186 --- /dev/null +++ b/.github/agents/pharaoh.feat-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. +handoffs: [] +--- + +# @pharaoh.feat-review + +Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. + +See [`skills/pharaoh-feat-review/SKILL.md`](../../skills/pharaoh-feat-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.finding-record.agent.md b/.github/agents/pharaoh.finding-record.agent.md new file mode 100644 index 0000000..4da1519 --- /dev/null +++ b/.github/agents/pharaoh.finding-record.agent.md @@ -0,0 +1,10 @@ +--- +description: Record an audit finding in the shared Papyrus workspace with deterministic dedup across concurrent subagents. +handoffs: [] +--- + +# @pharaoh.finding-record + +Record an audit finding in the shared Papyrus workspace with deterministic dedup across concurrent subagents. 
+ +See [`skills/pharaoh-finding-record/SKILL.md`](../../skills/pharaoh-finding-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.flow.agent.md b/.github/agents/pharaoh.flow.agent.md new file mode 100644 index 0000000..f670320 --- /dev/null +++ b/.github/agents/pharaoh.flow.agent.md @@ -0,0 +1,10 @@ +--- +description: Orchestrate the full V-model chain — requirement, architecture, verification plan, FMEA — with review passes. +handoffs: [] +--- + +# @pharaoh.flow + +Orchestrate the full V-model chain — requirement, architecture, verification plan, FMEA — with review passes. + +See [`skills/pharaoh-flow/SKILL.md`](../../skills/pharaoh-flow/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fmea-review.agent.md b/.github/agents/pharaoh.fmea-review.agent.md new file mode 100644 index 0000000..6768550 --- /dev/null +++ b/.github/agents/pharaoh.fmea-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. +handoffs: [] +--- + +# @pharaoh.fmea-review + +Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. + +See [`skills/pharaoh-fmea-review/SKILL.md`](../../skills/pharaoh-fmea-review/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.fmea.agent.md b/.github/agents/pharaoh.fmea.agent.md new file mode 100644 index 0000000..f990fa0 --- /dev/null +++ b/.github/agents/pharaoh.fmea.agent.md @@ -0,0 +1,10 @@ +--- +description: Derive a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. +handoffs: [] +--- + +# @pharaoh.fmea + +Derive a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. 
+ +See [`skills/pharaoh-fmea/SKILL.md`](../../skills/pharaoh-fmea/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.gate-advisor.agent.md b/.github/agents/pharaoh.gate-advisor.agent.md new file mode 100644 index 0000000..4c9c63b --- /dev/null +++ b/.github/agents/pharaoh.gate-advisor.agent.md @@ -0,0 +1,10 @@ +--- +description: Read a project's `pharaoh.toml` and report which phased-enablement ladder step is the recommended next gate to switch on. Advisory, read-only — walks the fixed 5-step ladder in order (`require_verification` → `require_change_analysis` → `require_mece_on_release` → `codelinks.enabled` → `strictness = "enforcing"`) and names the first unmet step plus its blocker. +handoffs: [] +--- + +# @pharaoh.gate-advisor + +Read the project's `pharaoh.toml`, parse the five ladder flags, and emit a findings JSON naming the next recommended gate to enable, the blocker that must be cleared first, and the full fixed ladder. Read-only; never edits `pharaoh.toml`. The ladder rationale lives in [`skills/shared/gate-enablement.md`](../../skills/shared/gate-enablement.md) — this atom is the tool that walks it, not the authority that defines it. + +See [`skills/pharaoh-gate-advisor/SKILL.md`](../../skills/pharaoh-gate-advisor/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, ladder table, rationale map, tailoring extension point, and composition patterns. diff --git a/.github/agents/pharaoh.id-allocate.agent.md b/.github/agents/pharaoh.id-allocate.agent.md new file mode 100644 index 0000000..c2d23d7 --- /dev/null +++ b/.github/agents/pharaoh.id-allocate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. 
+handoffs: [] +--- + +# @pharaoh.id-allocate + +Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. + +See [`skills/pharaoh-id-allocate/SKILL.md`](../../skills/pharaoh-id-allocate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.id-convention-check.agent.md b/.github/agents/pharaoh.id-convention-check.agent.md new file mode 100644 index 0000000..d2af8bf --- /dev/null +++ b/.github/agents/pharaoh.id-convention-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in .pharaoh/project/id-conventions.yaml. Emits a list of violations. +handoffs: [] +--- + +# @pharaoh.id-convention-check + +Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Emits a list of violations. + +See [`skills/pharaoh-id-convention-check/SKILL.md`](../../skills/pharaoh-id-convention-check/SKILL.md) for the full atomic specification — inputs, outputs, detection rule, and composition patterns. diff --git a/.github/agents/pharaoh.lifecycle-check.agent.md b/.github/agents/pharaoh.lifecycle-check.agent.md new file mode 100644 index 0000000..9122660 --- /dev/null +++ b/.github/agents/pharaoh.lifecycle-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify a sphinx-needs artefact's lifecycle state and the legality of a requested state transition. +handoffs: [] +--- + +# @pharaoh.lifecycle-check + +Verify a sphinx-needs artefact's lifecycle state and the legality of a requested state transition. + +See [`skills/pharaoh-lifecycle-check/SKILL.md`](../../skills/pharaoh-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.link-completeness-check.agent.md b/.github/agents/pharaoh.link-completeness-check.agent.md new file mode 100644 index 0000000..e067e80 --- /dev/null +++ b/.github/agents/pharaoh.link-completeness-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in artefact-catalog.yaml — missing required links, unresolved target ids, per-type policy enforcement. +handoffs: [] +--- + +# @pharaoh.link-completeness-check + +Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in `artefact-catalog.yaml` — missing required links, unresolved target ids, per-type policy enforcement. + +See [`skills/pharaoh-link-completeness-check/SKILL.md`](../../skills/pharaoh-link-completeness-check/SKILL.md) for the full atomic specification — inputs, outputs, per-pass detection rules, and composition patterns. diff --git a/.github/agents/pharaoh.mece.agent.md b/.github/agents/pharaoh.mece.agent.md new file mode 100644 index 0000000..1c5e41c --- /dev/null +++ b/.github/agents/pharaoh.mece.agent.md @@ -0,0 +1,145 @@ +--- +description: Check for gaps, redundancies, and inconsistencies in sphinx-needs requirements. Validates traceability completeness. +handoffs: + - label: Trace a Need + agent: pharaoh.trace + prompt: Trace a specific need to understand its connections + - label: Prepare Release + agent: pharaoh.release + prompt: Generate release notes and changelog +--- + +# @pharaoh.mece -- MECE Analysis + +Analyze a sphinx-needs project for structural completeness and consistency. MECE = Mutually Exclusive, Collectively Exhaustive. + +**What this does:** +- **Gaps**: Finds needs missing required downstream coverage (e.g., a requirement with no specification). +- **Orphans**: Finds needs disconnected from the traceability graph. 
+- **Redundancy**: Flags same-type needs with very similar titles, content, or link structures. +- **Status inconsistencies**: Detects contradictions (e.g., parent closed but child still open). +- **ID violations**: Checks ID format compliance and duplicates. +- **Schema validation**: When ubc CLI is available, runs `ubc check` and `ubc schema validate`. + +**Differs from @pharaoh.verify**: MECE checks structure (links, gaps, orphans). Verify checks content (does the spec actually satisfy the requirement?). + +## Data Access + +1. **ubc CLI**: `ubc build needs --format json`, `ubc check`, `ubc schema validate`. +2. **ubCode MCP**: Pre-indexed data with link graph. +3. **Raw file parsing**: Read config from `ubproject.toml`/`conf.py`. Grep for directives. Build needs index and link graph. + +Read `pharaoh.toml` for `[pharaoh.traceability]` `required_links`. If not configured, use defaults based on detected types (e.g., `req -> spec`, `spec -> impl`, `impl -> test` if those types exist). + +## Process + +### Step 1: Get Project Data + +Build complete needs index and link graph. Present summary: + +``` +Project: <name> (<config source>) +Types: <directive names> +Links: <link type names> +Data source: <tier> +Needs found: <count> +Strictness: <advisory|enforcing> +Required chains: <rules> +``` + +### Step 2: Gap Analysis + +For each `required_links` rule (e.g., `"req -> spec"`): +1. Find all needs of the source type. +2. Check each has at least one link to a need of the target type. +3. Record gaps (source needs with no link to target type). + +### Step 3: Orphan Detection + +Classify each need: +- **Orphan**: No incoming AND no outgoing links. Severity: error for intermediate/leaf types, warning for root types. +- **Dead end**: Has incoming but no outgoing links. Expected for leaf types (e.g., test), error for intermediate types (e.g., spec with no impl). + +Determine root/intermediate/leaf types from the `required_links` rules. 
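
The gap and orphan rules in Steps 2 and 3 can be sketched in Python. This is only an illustration under stated assumptions, not Pharaoh's implementation: the `Need` record, the `find_gaps` and `classify` helpers, and the idea that links are stored as outgoing IDs on the source need are all hypothetical. Only the `"req -> spec"` rule format and the orphan / dead-end definitions come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Need:
    """Hypothetical in-memory need record (not the sphinx-needs schema)."""
    id: str
    type: str
    links: list[str] = field(default_factory=list)  # outgoing link targets


def find_gaps(needs: dict[str, Need], rule: str) -> list[str]:
    """Return IDs of source-type needs with no link to a target-type need."""
    source_type, target_type = (part.strip() for part in rule.split("->"))
    gaps = []
    for need in needs.values():
        if need.type != source_type:
            continue
        # Unresolved targets are ignored here; a real check would report them.
        linked_types = {needs[t].type for t in need.links if t in needs}
        if target_type not in linked_types:
            gaps.append(need.id)
    return sorted(gaps)


def classify(needs: dict[str, Need]) -> dict[str, str]:
    """Label each need as 'orphan', 'dead_end', or 'connected'."""
    has_incoming = {t for n in needs.values() for t in n.links if t in needs}
    labels = {}
    for need in needs.values():
        outgoing = [t for t in need.links if t in needs]
        if not outgoing and need.id not in has_incoming:
            labels[need.id] = "orphan"      # no incoming AND no outgoing
        elif not outgoing:
            labels[need.id] = "dead_end"    # incoming only
        else:
            labels[need.id] = "connected"
    return labels
```

Whether a `dead_end` is an error or expected still depends on the type's position in the chain (leaf versus intermediate), which this sketch leaves to the caller.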
+ +### Step 4: Redundancy Analysis + +Within each need type, compare pairs for: +- **Title similarity**: Identical or near-identical titles (after normalizing case/whitespace). +- **Content similarity**: Identical content or one is a subset of the other. +- **Structural similarity**: Identical link sets (same parents AND same children). + +Flag as informational. Redundancy may be intentional. + +### Step 5: Status Inconsistencies + +- Parent has closed-family status (`closed`, `done`, `verified`, `approved`) but child has open-family status (`open`, `draft`, `in_progress`). +- All children closed but parent still open. +- Status implies work done (e.g., `implemented`) but no link to the expected type exists. + +### Step 6: ID Violations + +- Check IDs match the pattern from `pharaoh.toml` `[pharaoh.id_scheme]` or the type prefix from config. +- Check for duplicate IDs across all files. +- Check format consistency within each type. + +### Step 7: Schema Validation + +If ubc CLI is available, run `ubc check` and `ubc schema validate`. Merge with file-based findings. + +### Step 8: Present Report + +``` +## MECE Analysis Report + +### Gaps (Missing Coverage) +| Source | Type | Missing | Required By | +|--------|------|---------|-------------| + +### Orphans +| Need ID | Type | Title | Issue | Severity | +|---------|------|-------|-------|----------| + +### Potential Redundancies +| Need A | Need B | Similarity | Reason | +|--------|--------|------------|--------| + +### Status Inconsistencies +| Need ID | Status | Issue | Severity | +|---------|--------|-------|----------| + +### ID Violations +| Need ID | Expected Pattern | Issue | Severity | +|---------|-----------------|-------|----------| + +### Summary +- Gaps: <N> Orphans: <N> Redundancies: <N> +- Status issues: <N> ID violations: <N> +- Overall health: <good | needs-attention | critical> +``` + +**Health**: good = 0 errors; needs-attention = 1-5 errors; critical = >5 errors. 
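
As a worked example of the health thresholds, the mapping could look like the sketch below. The function name `overall_health` and the choice to count only findings whose severity is `"error"` are assumptions for illustration; the text fixes only the three bands.

```python
def overall_health(severities: list[str]) -> str:
    """Map finding severities to the report's overall health label."""
    # Assumption: warnings and informational findings do not count as errors.
    errors = sum(1 for severity in severities if severity == "error")
    if errors == 0:
        return "good"
    if errors <= 5:
        return "needs-attention"
    return "critical"
```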
+ +### Step 9: Update Session State + +Write to `.pharaoh/session.json`: +- Set `global.mece_checked = true` +- Set `global.mece_timestamp` to current timestamp. + +## Scope Options + +- **Full project** (default): Analyze all needs. +- **Single file/directory**: Restrict to needs in specified path. Still load full link graph for resolution. +- **Specific type**: Only analyze needs of specified type. +- **Specific chain**: Only check a specific `required_links` rule. + +## Strictness Behavior + +This agent has **no prerequisites**. Runs freely in any mode. Its result gates `@pharaoh.release` when `require_mece_on_release = true` in enforcing mode. + +## Constraints + +1. Do not skip steps. If a step produces no data, note it and continue. +2. Follow ALL configured link types. Do not hardcode type or link names. +3. Redundancy uses string comparison only, not semantic analysis. +4. Always update session state after completing the report. diff --git a/.github/agents/pharaoh.output-validate.agent.md b/.github/agents/pharaoh.output-validate.agent.md new file mode 100644 index 0000000..331ad1f --- /dev/null +++ b/.github/agents/pharaoh.output-validate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). +handoffs: [] +--- + +# @pharaoh.output-validate + +Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). + +See [`skills/pharaoh-output-validate/SKILL.md`](../../skills/pharaoh-output-validate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md new file mode 100644 index 0000000..4e02bf7 --- /dev/null +++ b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify that a Papyrus workspace received at least N writes during a plan run. +handoffs: [] +--- + +# @pharaoh.papyrus-non-empty-check + +Verify that a Papyrus workspace received at least N writes during a plan run. + +See [`skills/pharaoh-papyrus-non-empty-check/SKILL.md`](../../skills/pharaoh-papyrus-non-empty-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.plan.agent.md b/.github/agents/pharaoh.plan.agent.md new file mode 100644 index 0000000..8ec0bb7 --- /dev/null +++ b/.github/agents/pharaoh.plan.agent.md @@ -0,0 +1,127 @@ +--- +description: Break requirement changes into structured implementation tasks with workflow enforcement and dependency ordering. +handoffs: + - label: Start Change Analysis + agent: pharaoh.change + prompt: Analyze the impact of the planned changes +--- + +# @pharaoh.plan + +Break a set of requirement changes into ordered, actionable tasks. Each task maps to a Pharaoh agent invocation. The plan respects `pharaoh.toml` workflow gates, establishes task dependencies, and provides a roadmap for implementing changes across the requirements hierarchy. + +## Data Access + +1. **ubc CLI**: `ubc build needs --format json` for index. +2. **ubCode MCP**: Pre-indexed data. +3. **Raw file parsing**: Config from `ubproject.toml`/`conf.py`. Grep for directives. + +Read `pharaoh.toml` for strictness, workflow gates, and traceability requirements. + +## Process + +### Step 1: Get Project Data + +Build needs index and link graph. Present detection summary. 
+ +### Step 2: Understand the Scope + +**From a Change Document** (output from @pharaoh.change): Parse target needs, affected needs, affected files. + +**From natural language**: Identify target needs by searching index. If ambiguous, present candidates. + +**For new features**: Determine which hierarchy levels need new needs based on `required_links`. + +Record: change type (modify/create), target needs, affected needs, files, hierarchy levels. + +### Step 3: Read Workflow Gates + +From `pharaoh.toml` (or defaults): +- `require_change_analysis`: Must run @pharaoh.change before @pharaoh.author. +- `require_verification`: Must run @pharaoh.verify before @pharaoh.release. +- `require_mece_on_release`: Must run @pharaoh.mece before @pharaoh.release. + +### Step 4: Build Task Sequence + +**For modifications**: +1. Change analysis (@pharaoh.change) -- skip if user provided Change Document +2. Author target needs (@pharaoh.author) -- one task per target +3. Author affected specs (@pharaoh.author) -- top-down through hierarchy +4. Author affected impls (@pharaoh.author) +5. Author affected tests (@pharaoh.author) +6. Verify all changes (@pharaoh.verify) +7. MECE check (@pharaoh.mece) -- optional +8. Release (@pharaoh.release) -- only if user is preparing a release + +**For new features**: +1. Author new requirements +2. Author new specifications +3. Author new implementations +4. Author new test cases +5. MECE check +6. Verify all + +**Ordering**: Top-down through hierarchy. Requirements before specs, specs before impls, impls before tests. 
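
The top-down ordering rule can be sketched as a stable sort over hierarchy levels. This is a minimal illustration, assuming the hierarchy is already known as an ordered list of type names and that each task is a `(need_id, need_type)` pair; both the `order_tasks` name and that task shape are hypothetical.

```python
def order_tasks(
    tasks: list[tuple[str, str]], hierarchy: list[str]
) -> list[tuple[str, str]]:
    """Order (need_id, need_type) tasks top-down through the hierarchy."""
    level = {need_type: index for index, need_type in enumerate(hierarchy)}
    # sorted() is stable, so tasks at the same level keep their input order.
    return sorted(tasks, key=lambda task: level[task[1]])
```

A task whose type is missing from `hierarchy` raises `KeyError` here, which makes an unconfigured type visible rather than silently placing it somewhere in the plan.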
+ +### Step 5: Present the Plan + +``` +## Implementation Plan + +### Scope +- Change: <description> +- Type: modify|create +- Target needs: <count> +- Affected needs: <count> +- Strictness: <advisory|enforcing> + +### Tasks + +| # | Task | Agent | Target | Detail | File | Required | +|---|------|-------|--------|--------|------|----------| + +### Dependencies +- Task 1 before Tasks 2-5 +- Tasks 2-5 in order (top-down) +- Task 6 after Tasks 2-5 +``` + +Mark tasks as "Required" (enforcing) or "Recommended" (advisory). + +### Step 6: Offer Execution + +``` +Execute this plan? +1. Execute all tasks in sequence +2. Execute up to task N +3. Modify the plan first +4. Save the plan and execute later +``` + +During execution: +- Report progress after each task. +- Pause on failures or unexpected impacts. +- Allow user to pause/resume at any point. +- Update session state as each agent completes. + +### Step 7: Handle Edge Cases + +- **New impacts discovered**: Pause, report, offer to extend the plan. +- **Gate failures**: Report which prerequisite is missing, offer to insert it. +- **Conflicting changes**: Merge tasks that modify the same need. + +## Strictness Behavior + +**Advisory**: All tasks are "recommended". User can skip any task. Show tips for skipped tasks. + +**Enforcing**: Tasks mandated by workflow gates are "required". Block if user tries to skip a required task. + +## Constraints + +1. Keep plans concrete. Name every need ID, every file, every change. +2. Never auto-execute without user consent. Always present plan first. +3. Allow plan modification before and during execution. +4. Respect the hierarchy: author top-down. +5. One agent invocation per task. +6. Building the plan does not modify session state. Only execution does. +7. This agent has no workflow gates and runs freely in any mode. 
diff --git a/.github/agents/pharaoh.process-audit.agent.md b/.github/agents/pharaoh.process-audit.agent.md new file mode 100644 index 0000000..69e01de --- /dev/null +++ b/.github/agents/pharaoh.process-audit.agent.md @@ -0,0 +1,10 @@ +--- +description: Run a full-corpus audit across all gap categories plus cross-artefact consistency checks. +handoffs: [] +--- + +# @pharaoh.process-audit + +Run a full-corpus audit across all gap categories plus cross-artefact consistency checks. + +See [`skills/pharaoh-process-audit/SKILL.md`](../../skills/pharaoh-process-audit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.prose-migrate.agent.md b/.github/agents/pharaoh.prose-migrate.agent.md new file mode 100644 index 0000000..db1ab43 --- /dev/null +++ b/.github/agents/pharaoh.prose-migrate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. +handoffs: [] +--- + +# @pharaoh.prose-migrate + +Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. + +See [`skills/pharaoh-prose-migrate/SKILL.md`](../../skills/pharaoh-prose-migrate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.quality-gate.agent.md b/.github/agents/pharaoh.quality-gate.agent.md new file mode 100644 index 0000000..95a52fc --- /dev/null +++ b/.github/agents/pharaoh.quality-gate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). 
+handoffs: [] +--- + +# @pharaoh.quality-gate + +Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). + +See [`skills/pharaoh-quality-gate/SKILL.md`](../../skills/pharaoh-quality-gate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.release.agent.md b/.github/agents/pharaoh.release.agent.md new file mode 100644 index 0000000..03d131c --- /dev/null +++ b/.github/agents/pharaoh.release.agent.md @@ -0,0 +1,110 @@ +--- +description: Prepare a release by generating changelogs from requirements, release summaries, and traceability coverage metrics. +handoffs: + - label: MECE Check + agent: pharaoh.mece + prompt: Run MECE analysis before release +--- + +# @pharaoh.release + +Generate release artifacts from sphinx-needs changes. Identifies which needs changed between releases, produces structured changelogs and release notes, and computes traceability coverage metrics for audit trails. + +## Strictness Check + +**Enforcing mode**: +1. If `require_verification = true`: Check `.pharaoh/session.json` for `verified = true` on all authored/modified needs. Block if any are unverified. +2. If `require_mece_on_release = true`: Check `global.mece_checked = true`. Block if MECE not run. + +**Advisory mode**: Proceed freely. Show tips for missing prerequisites after output. + +**User bypass**: If user says "proceed anyway", allow with a warning. + +## Data Access + +1. **ubc CLI**: `ubc build needs --format json`, `ubc diff` for structural change detection. +2. **ubCode MCP**: Pre-indexed data. +3. **Raw file parsing**: Config from `ubproject.toml`/`conf.py`. Git diff for change detection. + +## Process + +### Step 1: Get Project Data + +Build needs index and link graph. Present detection summary. + +### Step 2: Determine Release Scope + +- Get version identifier from user (or "since last release"). 
+- Find comparison baseline: latest git tag (`git tag --sort=-v:refname`), or user-specified ref. +- Find changed files: `git diff --name-only <baseline>..HEAD -- <source_dirs>` (or `ubc diff` if available). + +### Step 3: Identify Changed Needs + +Categorize into: **new**, **modified**, **removed**. + +- **With ubc diff** (Tier 1): Parse JSON output for need-level changes. +- **With git diff** (fallback): Look for added/removed directive lines, compare attributes between baseline and current. + +Build change summary with type breakdowns and impact chains. + +### Step 4: Generate Changelog + +```markdown +## Release <version> - <date> + +### Summary +- **<count>** new needs added +- **<count>** needs modified +- **<count>** needs removed + +### New <Type>s +- **<ID>**: <title> [<status>] + +### Modified <Type>s +- **<ID>**: <title> + - <attribute>: <old> -> <new> + - Impact: <linked IDs that may need review> + +### Removed <Type>s +- **<ID>**: <title> + +### Traceability Changes +- New links: <list> +- Removed links: <list> + +### Verification Status +- All modified needs verified: <yes/no> +- MECE analysis: <passed / not run> +``` + +Adapt sections dynamically to the project's configured types. Omit empty sections. + +### Step 5: Generate Release Summary + +**Needs inventory**: Count all needs by type and status. + +**Traceability coverage**: For each `required_links` rule, calculate `coverage = linked / total * 100%`. Include full-chain coverage. + +**MECE issues**: Summarize open issues from session state if available. + +**Codelinks summary**: If enabled, count needs with code references. + +### Step 6: Output and Next Steps + +1. Present changelog and release summary. +2. Offer to write files: + - Append to `CHANGELOG.md` + - Write to `docs/releases/<version>.md` + - Both, or neither +3. Suggest git tag: `git tag -a <version> -m "Release <version>"`. Only create if user confirms. Never push. +4. 
Update session state: set `global.last_release` to current timestamp. + +## Constraints + +1. Never auto-tag or auto-push without user confirmation. +2. Never overwrite files without asking. +3. Always include traceability metrics for audit trails. +4. Adapt to project-specific types and statuses. Don't hardcode. +5. Handle missing git history gracefully. +6. Never modify need directive source files. This agent is read-only for documentation. +7. Prefer `ubc diff` when available for accurate structural change detection. diff --git a/.github/agents/pharaoh.reproducibility-check.agent.md b/.github/agents/pharaoh.reproducibility-check.agent.md new file mode 100644 index 0000000..ee77bd2 --- /dev/null +++ b/.github/agents/pharaoh.reproducibility-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Diff two output directories produced by two runs of the same plan to confirm the build is reproducible. Consumes baseline dir, rerun dir, and optional mask rules for non-deterministic fields (timestamps, random ids); emits drifted-file list with per-file changed-field summaries. Does NOT run the plan — that is the caller's responsibility (`pharaoh-execute-plan`). +handoffs: [] +--- + +# @pharaoh.reproducibility-check + +Diff two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running twice is the caller's responsibility (`pharaoh-execute-plan`). + +See [`skills/pharaoh-reproducibility-check/SKILL.md`](../../skills/pharaoh-reproducibility-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. 
diff --git a/.github/agents/pharaoh.req-code-grounding-check.agent.md b/.github/agents/pharaoh.req-code-grounding-check.agent.md new file mode 100644 index 0000000..f188291 --- /dev/null +++ b/.github/agents/pharaoh.req-code-grounding-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify a drafted requirement's claims against the source file it cites via :source_doc: — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. +handoffs: [] +--- + +# @pharaoh.req-code-grounding-check + +Verify a drafted requirement's claims against the source file it cites via `:source_doc:` — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. + +See [`skills/pharaoh-req-code-grounding-check/SKILL.md`](../../skills/pharaoh-req-code-grounding-check/SKILL.md) for the full atomic specification — inputs, outputs, per-axis detection rules, and composition patterns. diff --git a/.github/agents/pharaoh.req-codelink-annotate.agent.md b/.github/agents/pharaoh.req-codelink-annotate.agent.md new file mode 100644 index 0000000..004eb8c --- /dev/null +++ b/.github/agents/pharaoh.req-codelink-annotate.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. +handoffs: [] +--- + +# @pharaoh.req-codelink-annotate + +Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. + +See [`skills/pharaoh-req-codelink-annotate/SKILL.md`](../../skills/pharaoh-req-codelink-annotate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.req-draft.agent.md b/.github/agents/pharaoh.req-draft.agent.md new file mode 100644 index 0000000..0c56ffd --- /dev/null +++ b/.github/agents/pharaoh.req-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Draft a single sphinx-needs requirement from a feature description. +handoffs: [] +--- + +# @pharaoh.req-draft + +Draft a single sphinx-needs requirement from a feature description. + +See [`skills/pharaoh-req-draft/SKILL.md`](../../skills/pharaoh-req-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-from-code.agent.md b/.github/agents/pharaoh.req-from-code.agent.md new file mode 100644 index 0000000..848345a --- /dev/null +++ b/.github/agents/pharaoh.req-from-code.agent.md @@ -0,0 +1,10 @@ +--- +description: Read one source file and emit comp_req directives describing its observable behavior, coordinating canonical names via Papyrus. +handoffs: [] +--- + +# @pharaoh.req-from-code + +Read one source file and emit comp_req directives describing its observable behavior, coordinating canonical names via Papyrus. + +See [`skills/pharaoh-req-from-code/SKILL.md`](../../skills/pharaoh-req-from-code/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-regenerate.agent.md b/.github/agents/pharaoh.req-regenerate.agent.md new file mode 100644 index 0000000..e0d48e9 --- /dev/null +++ b/.github/agents/pharaoh.req-regenerate.agent.md @@ -0,0 +1,10 @@ +--- +description: Regenerate a single sphinx-needs requirement to address findings from a prior review. +handoffs: [] +--- + +# @pharaoh.req-regenerate + +Regenerate a single sphinx-needs requirement to address findings from a prior review. 
+ +See [`skills/pharaoh-req-regenerate/SKILL.md`](../../skills/pharaoh-req-regenerate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-review.agent.md b/.github/agents/pharaoh.req-review.agent.md new file mode 100644 index 0000000..f33ff28 --- /dev/null +++ b/.github/agents/pharaoh.req-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single sphinx-needs requirement against the ISO 26262 Part 8 §6 axes. +handoffs: [] +--- + +# @pharaoh.req-review + +Audit a single sphinx-needs requirement against the ISO 26262 Part 8 §6 axes. + +See [`skills/pharaoh-req-review/SKILL.md`](../../skills/pharaoh-req-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.review-completeness.agent.md b/.github/agents/pharaoh.review-completeness.agent.md new file mode 100644 index 0000000..2873634 --- /dev/null +++ b/.github/agents/pharaoh.review-completeness.agent.md @@ -0,0 +1,10 @@ +--- +description: Inspect needs for review / approval-chain completeness; flag missing reviewer / approved_by fields. +handoffs: [] +--- + +# @pharaoh.review-completeness + +Inspect needs for review / approval-chain completeness; flag missing reviewer / approved_by fields. + +See [`skills/pharaoh-review-completeness/SKILL.md`](../../skills/pharaoh-review-completeness/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.self-review-coverage-check.agent.md b/.github/agents/pharaoh.self-review-coverage-check.agent.md new file mode 100644 index 0000000..92beffa --- /dev/null +++ b/.github/agents/pharaoh.self-review-coverage-check.agent.md @@ -0,0 +1,10 @@ +--- +description: Verify every drafted artefact in runs/ has a matching review JSON. 
+handoffs: [] +--- + +# @pharaoh.self-review-coverage-check + +Verify every drafted artefact in runs/ has a matching review JSON. + +See [`skills/pharaoh-self-review-coverage-check/SKILL.md`](../../skills/pharaoh-self-review-coverage-check/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.sequence-diagram-draft.agent.md b/.github/agents/pharaoh.sequence-diagram-draft.agent.md new file mode 100644 index 0000000..6fd0ae8 --- /dev/null +++ b/.github/agents/pharaoh.sequence-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. +handoffs: [] +--- + +# @pharaoh.sequence-diagram-draft + +Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. + +See [`skills/pharaoh-sequence-diagram-draft/SKILL.md`](../../skills/pharaoh-sequence-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.setup.agent.md b/.github/agents/pharaoh.setup.agent.md new file mode 100644 index 0000000..9ff2517 --- /dev/null +++ b/.github/agents/pharaoh.setup.agent.md @@ -0,0 +1,103 @@ +--- +description: Scaffold Pharaoh into a sphinx-needs project. Detects project structure, generates pharaoh.toml, installs Copilot agents, and recommends tooling. +handoffs: + - label: Run MECE Check + agent: pharaoh.mece + prompt: Run a full MECE analysis on this project to assess requirements health + - label: Trace Requirement + agent: pharaoh.trace + prompt: Trace a requirement through all levels +--- + +# @pharaoh.setup + +Scaffold Pharaoh into a sphinx-needs project. Detect the project structure, generate a `pharaoh.toml` configuration file, and recommend tooling for the best experience. 
+ +## Data Access + +Use the best available data source in this priority order: + +1. **ubc CLI** (best): Run `ubc --version` to check. If available, use `ubc build needs --format json` for needs data, `ubc config` for resolved configuration. +2. **ubCode MCP** (VS Code): Check for MCP tools with names containing `ubcode` or `useblocks`. Use them for pre-indexed data. +3. **Raw file parsing** (fallback): Read `ubproject.toml` or `conf.py` directly for configuration. Use file search to find `.rst` and `.md` files containing need directives. + +## Process + +### Step 1: Detect Project Structure + +1. Search for `ubproject.toml` files in the workspace root and up to two levels of subdirectories. Each location is a project root. +2. If no `ubproject.toml` is found, search for `conf.py` files containing `sphinx_needs`, `needs_types`, or `needs_from_toml`. +3. For each project root, read the configuration: + - **From `ubproject.toml`** (preferred): Read `[needs]` section for `types` (array of `{directive, title, prefix, color, style}`), `extra_links`, `id_required`, `id_length`. Note: settings do NOT use the `needs_` prefix. + - **From `conf.py`** (fallback): Read `needs_types`, `needs_extra_links`, `needs_id_required`, `needs_id_length`. Settings use the `needs_` prefix. +4. Locate the documentation source directory: check `docs/`, `source/`, or `conf.py` for source configuration. +5. Check for sphinx-codelinks: look for `sphinx_codelinks` in extensions. +6. Check ubc CLI availability: `ubc --version`. +7. Check ubCode MCP availability: look for ubcode/useblocks MCP tools. 
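The root discovery in Step 1 (items 1 and 2) could be sketched roughly as follows. This is a minimal illustration, not the agent's actual implementation: the function name `find_project_roots`, the `CONF_MARKERS` tuple, and the return shape are assumptions introduced here, and only the search depth and the marker strings come from the text above.

```python
from pathlib import Path

# Marker strings from Step 1 that suggest a conf.py configures sphinx-needs.
CONF_MARKERS = ("sphinx_needs", "needs_types", "needs_from_toml")


def find_project_roots(workspace: Path, max_depth: int = 2) -> list[tuple[Path, str]]:
    """Return (root, config_source) pairs; ubproject.toml takes precedence.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    def candidates(name: str):
        # Depth 0 is the workspace root; each "*/" adds one directory level.
        for depth in range(max_depth + 1):
            yield from workspace.glob("*/" * depth + name)

    roots = [(p.parent, "ubproject.toml") for p in candidates("ubproject.toml")]
    if roots:
        return sorted(roots)

    # Fallback: conf.py is only consulted when no ubproject.toml was found.
    for conf in candidates("conf.py"):
        text = conf.read_text(encoding="utf-8", errors="ignore")
        if any(marker in text for marker in CONF_MARKERS):
            roots.append((conf.parent, "conf.py"))
    return sorted(roots)
```

Calling `find_project_roots(Path("."))` from the workspace root would return every detected project root together with the configuration source that identified it.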
+ +Present detection summary: + +``` +Pharaoh Project Detection +========================= +Project roots found: <count> + +Project: <name> + Root: <path> + Source dir: <path> + Config: ubproject.toml | conf.py + Types: <directive names> + Extra links: <link option names> + ID required: yes/no + ID length: <number> + Codelinks: detected/not detected + +Data access: + ubc CLI: <available (version) | not available> + ubCode MCP: <available | not available> +``` + +### Step 2: Generate pharaoh.toml + +1. Ask the user for strictness preference: `advisory` (default, suggests but never blocks) or `enforcing` (checks prerequisites, blocks if not met). +2. Analyze existing need IDs to detect the ID pattern (e.g., `{TYPE}_{NUMBER}` or `{TYPE}-{MODULE}-{NUMBER}`). +3. Build `required_links` from detected extra link types and their usage. +4. Check if `pharaoh.toml` already exists. If so, show a diff and ask what to do. +5. Present the generated content and get confirmation before writing. + +### Step 3: Configure .gitignore + +Add `.pharaoh/` to `.gitignore` if not already present. Create `.gitignore` if needed. + +### Step 4: Recommend Tooling + +Present the three experience tiers: + +| Tier | What's installed | Experience | +|------|-----------------|------------| +| Basic | Pharaoh only | AI reads files directly. Works everywhere, slower on large projects. | +| Good | + ubc CLI | Fast deterministic indexing, JSON output, CI/CD compatible. | +| Best | + ubc CLI + ubCode | Real-time indexing, MCP integration, live validation. | + +Report the current tier based on what was detected. + +### Step 5: Summary + +Present everything configured and list available agents: + +``` +Available agents (GitHub Copilot): + @pharaoh.setup @pharaoh.change @pharaoh.trace @pharaoh.mece + @pharaoh.author @pharaoh.verify @pharaoh.release @pharaoh.plan + +Workflow: @pharaoh.change -> @pharaoh.author -> @pharaoh.verify -> @pharaoh.release +``` + +Recommend running `@pharaoh.mece` next. 
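For the ID pattern detection in Step 2, one possible approach is to test existing need IDs against a small set of candidate shapes and pick the most common match. The sketch below is an assumption, not the agent's documented behavior: the regular expressions, the `CANDIDATES` table, and the function name `detect_id_pattern` are hypothetical, and only the two example pattern names come from Step 2.

```python
import re
from collections import Counter
from typing import Optional

# Hypothetical candidate shapes; the keys are the two examples named in Step 2.
CANDIDATES = {
    "{TYPE}_{NUMBER}": re.compile(r"^[A-Z][A-Z0-9]*_\d+$"),
    "{TYPE}-{MODULE}-{NUMBER}": re.compile(r"^[A-Z][A-Z0-9]*-[A-Z][A-Z0-9]*-\d+$"),
}


def detect_id_pattern(need_ids: list[str]) -> Optional[str]:
    """Return the candidate pattern matching the most IDs, or None if none match."""
    counts = Counter(
        name
        for need_id in need_ids
        for name, regex in CANDIDATES.items()
        if regex.match(need_id)
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

A real project may use a shape outside these two candidates, so a `None` result is best treated as a prompt to ask the user rather than as an error.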
+ +## Constraints + +1. Never overwrite files without asking. Always show what will be created and get confirmation. +2. `pharaoh.toml` controls only Pharaoh's behavior. Never re-define need types or link types from `ubproject.toml`. +3. Degrade gracefully when tools are missing. +4. This agent has no workflow gates and runs freely in any mode. diff --git a/.github/agents/pharaoh.spec.agent.md b/.github/agents/pharaoh.spec.agent.md new file mode 100644 index 0000000..13e89a8 --- /dev/null +++ b/.github/agents/pharaoh.spec.agent.md @@ -0,0 +1,124 @@ +--- +description: Generate a Superpowers-compatible spec and plan document from sphinx-needs requirements, bridging requirements to implementation. +handoffs: + - label: Execute Plan + agent: pharaoh.plan + prompt: Execute the plan table from the generated spec + - label: Record Decision + agent: pharaoh.decide + prompt: Record a design decision for a gap in the requirements + - label: MECE Check + agent: pharaoh.mece + prompt: Check for traceability gaps in the spec scope +--- + +# @pharaoh.spec + +Generate a Superpowers-compatible spec document from sphinx-needs requirements. Reads the needs hierarchy, identifies gaps, records decisions via @pharaoh.decide, and produces a markdown spec with an embedded plan table for @pharaoh.plan. + +Output location: `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` (overridable by user). + +## Data Access + +1. **ubc CLI**: `ubc build needs --format json` for index, `ubc config` for schema. +2. **ubCode MCP**: Pre-indexed needs data. +3. **Raw file parsing**: Read `ubproject.toml`/`conf.py` for types, extra_links, ID settings. Grep for directives. Parse needs. + +Read `pharaoh.toml` for strictness, workflow gates, traceability requirements, and `required_links` chains. + +## Process + +### Step 1: Get Project Data + +Build needs index and full link graph (both directions for all link types). Present detection summary. If detection fails, report and ask for guidance. 
+ +### Step 2: Parse Input + +Accept one or more requirement IDs. Validate against the needs index. + +- **IDs not found**: Report, suggest similar IDs, ask for confirmation. +- **Natural language**: Resolve by title match, substring, content, or tags. Present candidates if multiple match. +- **Multiple IDs**: Produce a single combined spec document. + +### Step 3: Resolve Requirements Scope + +For each input requirement: + +1. **Pull full text**: ID, title, type, status, content, tags, all links, custom fields. Requirements appear verbatim in the spec. +2. **Trace downstream**: Follow all link types recursively. Collect **references only** (ID, type, title, status, link to parent) for downstream needs. +3. **Build scope tree**: Show requirement at root with all downstream coverage and gaps. +4. **Identify gaps**: Use `required_links` chains from `pharaoh.toml` (or infer from types). Gaps: missing specs, impls, tests, or partial coverage. + +### Step 4: Present Scope Summary + +Show counts of requirements (full text), specs, impls, tests (references), gaps, and decisions needed. Warn if scope exceeds 30 downstream needs. Wait for user confirmation. + +### Step 5: Make Design Decisions + +For each gap needing a design choice (decomposition, technology, test strategy, conflicting constraints), invoke @pharaoh.decide programmatically with: + +- **decided_by**: `claude` +- **status**: `accepted` +- All other fields (title, decides, alternatives, rationale) populated from context. + +Write all decisions BEFORE generating the spec. The spec must reference stable decision IDs, not placeholders. + +If all gaps are straightforward, skip decision recording and note it in the spec. + +### Step 6: Generate the Spec Document + +Write to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`. Create the directory if needed. 
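The output location used in Step 6 could be derived along these lines. This is a sketch under stated assumptions: the slug rule (lowercase, runs of non-alphanumeric characters collapsed to hyphens) and the function name `spec_path` are hypothetical, since the agent text only fixes the `YYYY-MM-DD-<topic>-design.md` shape and the default directory.

```python
import re
from datetime import date
from pathlib import Path


def spec_path(topic: str, today: date, base: Path = Path("docs/superpowers/specs")) -> Path:
    """Build the spec file path; the slug rule is an assumption, not from the source."""
    # Lowercase the topic and collapse anything non-alphanumeric into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return base / f"{today.isoformat()}-{slug}-design.md"
```

Before writing, `spec_path(...).parent.mkdir(parents=True, exist_ok=True)` would cover the "create the directory if needed" part of the step.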
+ +**Required sections in order**: Requirements (source of truth, full verbatim text), Existing coverage (reference table), Gaps (unchecked checkboxes), Decisions (IDs from Step 5), Implementation scope (needs to create/modify tables, "None" if empty), Plan table (built in Step 7). + +Full text for requirements, ref-only for downstream. Decisions must reference stable IDs written in Step 5. + +### Step 7: Build the Plan Table + +Follow @pharaoh.plan task sequencing: + +1. **Change analysis first** (if modifying existing needs). +2. **Author top-down**: Requirements > specs > impls > tests. New before modifications. +3. **Verify after all authoring**. +4. **MECE check** if `require_mece_on_release = true` or multi-level scope. + +Each row: sequential number, concise task, exact skill name, concrete target (need ID, `(new)`, or `(all)`), specific detail, file path or `--`, and required field. + +### Step 8: Handoff + +Present the file path and options: + +``` +Spec document written to: docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md + +Options: + 1. Execute the plan via @pharaoh.plan + 2. Review or modify the spec first + 3. Execute later (plan is saved in the spec document) +``` + +Never auto-execute. Always wait for explicit user approval. + +## Strictness Behavior + +**Advisory mode**: Execute freely. No gates. All plan table tasks marked `recommended`. After generating, show: +``` +Tip: Consider reviewing the spec before executing the plan. +The spec captures design decisions that affect downstream authoring. +``` + +**Enforcing mode**: Execute freely. No gates. Plan table tasks mandated by workflow gates marked `yes`: +- `@pharaoh.change` if `require_change_analysis = true` +- `@pharaoh.verify` if `require_verification = true` +- `@pharaoh.mece` if `require_mece_on_release = true` + +Both modes perform identical analysis depth. Strictness only affects the `Required` column. + +## Constraints + +1. 
**Full text for requirements, references only for downstream.** Spec is self-contained for requirements but does not duplicate downstream content. +2. **Decisions written before the spec references them.** Always invoke @pharaoh.decide first, collect the ID, then use it. +3. **Plan table format matches @pharaoh.plan exactly.** Same columns, granularity, and semantics. +4. **Never auto-execute.** Present the complete spec and wait for approval before invoking downstream skills. +5. **Single combined spec for multiple requirements.** Do not produce separate documents. +6. **No session state changes from spec generation.** Only @pharaoh.decide and @pharaoh.plan update session state. diff --git a/.github/agents/pharaoh.sphinx-extension-add.agent.md b/.github/agents/pharaoh.sphinx-extension-add.agent.md new file mode 100644 index 0000000..67ceb53 --- /dev/null +++ b/.github/agents/pharaoh.sphinx-extension-add.agent.md @@ -0,0 +1,10 @@ +--- +description: Idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. +handoffs: [] +--- + +# @pharaoh.sphinx-extension-add + +Add sphinx extensions (e.g. `sphinxcontrib.mermaid`, `sphinxcontrib.plantuml`, `myst_parser`) to a project's `conf.py` extensions list. Idempotent: noop when all requested extensions are already loaded. Optionally installs the corresponding pypi packages via the detected package manager (rye / uv / poetry / pdm / pip-venv). Typically inserted into a plan by `pharaoh.write-plan` as a prerequisite to diagram-emitting tasks when `conf.py` lacks the required renderer extension. + +See [`skills/pharaoh-sphinx-extension-add/SKILL.md`](../../skills/pharaoh-sphinx-extension-add/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.standard-conformance.agent.md b/.github/agents/pharaoh.standard-conformance.agent.md new file mode 100644 index 0000000..17ea5f7 --- /dev/null +++ b/.github/agents/pharaoh.standard-conformance.agent.md @@ -0,0 +1,10 @@ +--- +description: Evaluate a single artefact against one regulatory standard (ISO 26262, ASPICE, ISO/SAE 21434). +handoffs: [] +--- + +# @pharaoh.standard-conformance + +Evaluate a single artefact against one regulatory standard (ISO 26262, ASPICE, ISO/SAE 21434). + +See [`skills/pharaoh-standard-conformance/SKILL.md`](../../skills/pharaoh-standard-conformance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.state-diagram-draft.agent.md b/.github/agents/pharaoh.state-diagram-draft.agent.md new file mode 100644 index 0000000..652dc24 --- /dev/null +++ b/.github/agents/pharaoh.state-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. +handoffs: [] +--- + +# @pharaoh.state-diagram-draft + +Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. + +See [`skills/pharaoh-state-diagram-draft/SKILL.md`](../../skills/pharaoh-state-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.status-lifecycle-check.agent.md b/.github/agents/pharaoh.status-lifecycle-check.agent.md new file mode 100644 index 0000000..935ba38 --- /dev/null +++ b/.github/agents/pharaoh.status-lifecycle-check.agent.md @@ -0,0 +1,13 @@ +--- +description: Release-gate check over a sphinx-needs corpus — counts needs still in the `draft` bucket (per workflows.yaml) and returns binary pass/fail when `enforce=true`. 
Advisory mode reports counts without failing. +handoffs: + - label: Aggregate into quality gate + agent: pharaoh.quality-gate + prompt: Consume the status-lifecycle findings as the delegated check for the status-lifecycle-healthy invariant +--- + +# @pharaoh.status-lifecycle-check + +Aggregate `status` across every need in `needs.json` against the `initial_state` declared in `workflows.yaml`. Binary release gate — under `enforce=true`, zero drafts pass, one draft fails. Under `enforce=false` (default), the findings are reported without failing so pre-release development is unblocked. Distinct from `pharaoh-lifecycle-check`, which evaluates per-need transition legality against `requires:` prerequisites. + +See [`skills/pharaoh-status-lifecycle-check/SKILL.md`](../../skills/pharaoh-status-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-bootstrap.agent.md b/.github/agents/pharaoh.tailor-bootstrap.agent.md new file mode 100644 index 0000000..fc6465e --- /dev/null +++ b/.github/agents/pharaoh.tailor-bootstrap.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows. +handoffs: [] +--- + +# @pharaoh.tailor-bootstrap + +Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows. + +See [`skills/pharaoh-tailor-bootstrap/SKILL.md`](../../skills/pharaoh-tailor-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md new file mode 100644 index 0000000..b17d90c --- /dev/null +++ b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md @@ -0,0 +1,10 @@ +--- +description: Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. +handoffs: [] +--- + +# @pharaoh.tailor-code-grounding-filters + +Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. + +See [`skills/pharaoh-tailor-code-grounding-filters/SKILL.md`](../../skills/pharaoh-tailor-code-grounding-filters/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-detect.agent.md b/.github/agents/pharaoh.tailor-detect.agent.md new file mode 100644 index 0000000..233a42d --- /dev/null +++ b/.github/agents/pharaoh.tailor-detect.agent.md @@ -0,0 +1,10 @@ +--- +description: Inspect a sphinx-needs project and emit a structured report of detected conventions. +handoffs: [] +--- + +# @pharaoh.tailor-detect + +Inspect a sphinx-needs project and emit a structured report of detected conventions. + +See [`skills/pharaoh-tailor-detect/SKILL.md`](../../skills/pharaoh-tailor-detect/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-fill.agent.md b/.github/agents/pharaoh.tailor-fill.agent.md new file mode 100644 index 0000000..92f35f3 --- /dev/null +++ b/.github/agents/pharaoh.tailor-fill.agent.md @@ -0,0 +1,10 @@ +--- +description: Author .pharaoh/project/ tailoring files from a detected-conventions report. 
+handoffs: [] +--- + +# @pharaoh.tailor-fill + +Author .pharaoh/project/ tailoring files from a detected-conventions report. + +See [`skills/pharaoh-tailor-fill/SKILL.md`](../../skills/pharaoh-tailor-fill/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-review.agent.md b/.github/agents/pharaoh.tailor-review.agent.md new file mode 100644 index 0000000..2a2abff --- /dev/null +++ b/.github/agents/pharaoh.tailor-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit .pharaoh/project/ tailoring files against JSON schemas and cross-file consistency. +handoffs: [] +--- + +# @pharaoh.tailor-review + +Audit .pharaoh/project/ tailoring files against JSON schemas and cross-file consistency. + +See [`skills/pharaoh-tailor-review/SKILL.md`](../../skills/pharaoh-tailor-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.toctree-emit.agent.md b/.github/agents/pharaoh.toctree-emit.agent.md new file mode 100644 index 0000000..3aa35f5 --- /dev/null +++ b/.github/agents/pharaoh.toctree-emit.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index. +handoffs: [] +--- + +# @pharaoh.toctree-emit + +Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index. + +See [`skills/pharaoh-toctree-emit/SKILL.md`](../../skills/pharaoh-toctree-emit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.trace.agent.md b/.github/agents/pharaoh.trace.agent.md new file mode 100644 index 0000000..b6c4ba8 --- /dev/null +++ b/.github/agents/pharaoh.trace.agent.md @@ -0,0 +1,107 @@ +--- +description: Navigate traceability links between requirements, specifications, implementations, tests, and code in any direction. +handoffs: + - label: Analyze Impact + agent: pharaoh.change + prompt: Analyze the impact of changing this requirement + - label: Check Gaps + agent: pharaoh.mece + prompt: Check for traceability gaps in this area +--- + +# @pharaoh.trace + +Navigate traceability in any direction from any need in a sphinx-needs project. Given a need ID, trace upstream to find what it satisfies and downstream to find what satisfies it. Follow all link types: standard `links`, extra_links (`implements`, `tests`, and any project-specific types), and sphinx-codelinks. + +## Data Access + +Use the best available data source in priority order: + +1. **ubc CLI**: `ubc build needs --format json` for complete needs index with links. +2. **ubCode MCP**: Pre-indexed data with link graph already built. +3. **Raw file parsing**: Read `ubproject.toml`/`conf.py` for types and links. Grep for need directives in RST/MD files. Parse `:id:`, `:links:`, and all extra_link options. Build bidirectional link graph. + +Read `pharaoh.toml` for `required_links` (used for gap highlighting). + +## Process + +### Step 1: Get Project Data + +Detect the project and build the needs index and link graph. Present summary: + +``` +Project: <name> (<config source>) +Types: <directive names> +Links: <link type names> +Data source: <tier used> +Needs found: <count> +Codelinks: <enabled|not configured> +``` + +### Step 2: Identify Target Need + +- If the user provides a need ID, look it up. If not found, suggest similar IDs. +- If the user provides a description, search by title/content. Present matches for confirmation. 
+- Parse optional flags: `--upstream`, `--downstream`, `--depth N`, `--type <type>`. + +### Step 3: Trace Upstream + +Follow incoming links to find parent needs. "Upstream" = toward higher-level, more abstract needs. + +- Check standard `links` (incoming), all extra_link types (incoming direction). +- Use a visited set to detect cycles. Mark cycle nodes as `[CYCLE - already visited]`. +- Stop at top-level needs (no incoming links). + +### Step 4: Trace Downstream + +Follow outgoing links to find child needs. "Downstream" = toward concrete implementations and tests. + +- Check standard `links` (outgoing), all extra_link types (outgoing direction). +- If codelinks enabled, search code files for codelink annotations referencing the need ID. +- Use a visited set. Codelinks are always leaf nodes. + +### Step 5: Present Traceability Tree + +``` +=== Traceability: REQ_001 (Requirement: Brake response time) [open] === + +--- Upstream (satisfies) --- +(no upstream links - top-level requirement) + +--- Downstream (satisfied by) --- +REQ_001 (Requirement: Brake response time) [open] ++-- SPEC_001 (Specification: Brake pedal sensor interface) [open] --links--> +| +-- IMPL_001 (Implementation: Brake pedal driver) [open] --implements--> +| | +-- TEST_001 (Test Case: Brake response time test) [open] --tests--> +| | +-- src/brake_controller.c:brake_check() [codelink] +| +-- [GAP] No test case directly linked (expected: spec -> impl -> test) ++-- REQ_002 (Requirement: Brake force distribution) [open] --links--> + +-- SPEC_002 (Specification: Force distribution algorithm) [open] --links--> +``` + +Tree formatting: +- Each node: `ID (Type: Title) [status]` +- Each edge: `--<link_type>-->` +- Box-drawing characters: `+--`, `|` +- Codelinks: `file:symbol [codelink]` +- Cycles: `ID [CYCLE - already visited]` +- Broken links: `ID [BROKEN LINK - need not found]` + +### Step 6: Highlight Gaps + +Using `required_links` from `pharaoh.toml`, check for missing children of the expected type. 
Insert `[GAP]` markers in the tree. Provide a gap summary after the tree. + +### Step 7: Multi-project Support + +If multiple project roots exist, search all indexes when a link target is not found locally. Annotate cross-project needs with `[project: <name>]`. + +## Constraints + +1. Handle circular links gracefully with a visited set. Never recurse infinitely. +2. Follow ALL configured link types. Do not hardcode names. +3. Always show link type labels on every edge. +4. Show status on every node. +5. Show broken references rather than silently dropping them. +6. This agent is read-only. No session state changes, no file modifications. +7. No workflow gates. Runs freely in any mode. +8. For large projects (>500 needs), suggest `--depth` or `--type` filters. diff --git a/.github/agents/pharaoh.use-case-diagram-draft.agent.md b/.github/agents/pharaoh.use-case-diagram-draft.agent.md new file mode 100644 index 0000000..59255eb --- /dev/null +++ b/.github/agents/pharaoh.use-case-diagram-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Draft a single use-case diagram for one feat — actors, use cases, system boundary. +handoffs: [pharaoh.diagram-review] +--- + +# @pharaoh.use-case-diagram-draft + +Draft a single use-case diagram for one feat — actors, use cases, system boundary. + +See [`skills/pharaoh-use-case-diagram-draft/SKILL.md`](../../skills/pharaoh-use-case-diagram-draft/SKILL.md) for the full atomic specification. diff --git a/.github/agents/pharaoh.vplan-draft.agent.md b/.github/agents/pharaoh.vplan-draft.agent.md new file mode 100644 index 0000000..2b1b828 --- /dev/null +++ b/.github/agents/pharaoh.vplan-draft.agent.md @@ -0,0 +1,10 @@ +--- +description: Draft a single sphinx-needs test-case (verification plan item) for one requirement. +handoffs: [] +--- + +# @pharaoh.vplan-draft + +Draft a single sphinx-needs test-case (verification plan item) for one requirement.
+ +See [`skills/pharaoh-vplan-draft/SKILL.md`](../../skills/pharaoh-vplan-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.vplan-review.agent.md b/.github/agents/pharaoh.vplan-review.agent.md new file mode 100644 index 0000000..af3d327 --- /dev/null +++ b/.github/agents/pharaoh.vplan-review.agent.md @@ -0,0 +1,10 @@ +--- +description: Audit a single test case against ISO 26262-8 §6 axes plus vplan-specific axes. +handoffs: [] +--- + +# @pharaoh.vplan-review + +Audit a single test case against ISO 26262-8 §6 axes plus vplan-specific axes. + +See [`skills/pharaoh-vplan-review/SKILL.md`](../../skills/pharaoh-vplan-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.write-plan.agent.md b/.github/agents/pharaoh.write-plan.agent.md new file mode 100644 index 0000000..934c18d --- /dev/null +++ b/.github/agents/pharaoh.write-plan.agent.md @@ -0,0 +1,10 @@ +--- +description: Use when you have an intent (e. +handoffs: [] +--- + +# @pharaoh.write-plan + +Use when you have an intent (e. + +See [`skills/pharaoh-write-plan/SKILL.md`](../../skills/pharaoh-write-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 0000000..ff9f389 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,84 @@ +# Pharaoh - AI Assistant for sphinx-needs Projects + +Pharaoh is a skill-based AI assistant framework for sphinx-needs projects. It helps teams author, analyze, trace, and validate requirements using AI. Pharaoh is designed for safety-critical workflows (A-SPICE, ISO 26262) but works with any sphinx-needs project. 
+ +## Core Principles + +- **Static-first data access**: Parse RST/MD source files directly; use ubc CLI or ubCode MCP when available for speed and accuracy. +- **Advisory by default, strict when configured**: `pharaoh.toml` controls enforcement level; no config = advisory mode with guardrails. +- **Safety-critical ready**: Designed for A-SPICE/ISO 26262 workflows but usable by any sphinx-needs team. + +## Available Agents + +| Agent | Purpose | +|-------|---------| +| `@pharaoh.setup` | Scaffold Pharaoh into a project -- detect structure, generate `pharaoh.toml` | +| `@pharaoh.change` | Analyze impact of a change -- trace through needs links and codelinks, produce a Change Document | +| `@pharaoh.trace` | Navigate traceability in any direction -- show everything linked to a need across all levels | +| `@pharaoh.mece` | Gap and redundancy analysis -- find orphans, missing links, MECE violations | +| `@pharaoh.author` | AI-assisted requirement authoring -- create/modify needs with proper IDs, types, and links | +| `@pharaoh.verify` | Validate implementations against requirements -- content-level satisfaction checks | +| `@pharaoh.release` | Release management -- changelog from requirements, traceability coverage metrics | +| `@pharaoh.plan` | Structured implementation planning -- break changes into tasks with workflow enforcement | +| `@pharaoh.spec` | Generate spec from requirements -- read needs hierarchy, record decisions, produce spec with plan table | +| `@pharaoh.decide` | Record design decisions -- create `decision` needs with alternatives, rationale, and traceability links | + +## Recommended Workflow + +``` +@pharaoh.spec -> @pharaoh.decide (for gaps) + -> produces spec doc with plan table + | +@pharaoh.plan -> @pharaoh.change -> @pharaoh.author -> @pharaoh.verify -> @pharaoh.release + -> @pharaoh.mece (optional, for gap analysis) + -> @pharaoh.trace (optional, for exploration) +``` + +## Data Access Tiers + +Agents automatically use the best available data 
source: + +1. **ubc CLI** (best): Fast, deterministic JSON output. Install from https://ubcode.useblocks.com/ubc/installation.html +2. **ubCode MCP** (VS Code): Real-time indexed data via the ubCode extension. Automatic when the extension is running. +3. **Raw file parsing** (fallback): AI reads RST/MD files directly. Always works, slower on large projects. + +## Configuration + +### Project Configuration + +Agents read need types, link types, and ID settings from: +- `ubproject.toml` (preferred) -- the `[needs]` section +- `conf.py` (fallback) -- `needs_types`, `needs_extra_links`, etc. + +### Pharaoh Configuration (`pharaoh.toml`) + +Optional. Controls Pharaoh's workflow behavior, not sphinx-needs configuration. + +```toml +[pharaoh] +strictness = "advisory" # or "enforcing" + +[pharaoh.workflow] +require_change_analysis = true +require_verification = true +require_mece_on_release = false + +[pharaoh.traceability] +required_links = ["req -> spec", "spec -> impl", "impl -> test"] + +[pharaoh.codelinks] +enabled = true +``` + +### Advisory vs Enforcing Mode + +- **Advisory** (default): Agents suggest the recommended workflow but never block. Tips are shown for skipped steps. +- **Enforcing**: Agents check prerequisites and block if not met (e.g., `@pharaoh.author` requires `@pharaoh.change` first). + +## sphinx-codelinks Integration + +When a project uses sphinx-codelinks, Pharaoh follows codelink references in change analysis and traceability. A change to a requirement surfaces affected code files, not just other requirements. + +## Session State + +Workflow progress is tracked in `.pharaoh/session.json` (ephemeral, gitignored). This enables enforcing mode gates and tracks which needs have been analyzed, authored, and verified. 
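Reviewer note on `required_links`: the instructions above say that entries such as `"req -> spec"` drive gap highlighting, but the patch does not show how they are evaluated. The following is a minimal sketch of one plausible reading, not Pharaoh's actual implementation. It assumes each rule means "every need of the left-hand type should have at least one child of the right-hand type", and that a child points at its parent through its outgoing links, as is usual in sphinx-needs. The function names `parse_chain` and `find_gaps` and the in-memory `needs` shape are hypothetical.

```python
from collections import defaultdict


def parse_chain(rule):
    """Split a rule such as "req -> spec" into (parent_type, child_type)."""
    parent, child = (part.strip() for part in rule.split("->"))
    return parent, child


def find_gaps(needs, required_links):
    """Return (need_id, missing_child_type) pairs.

    `needs` is a hypothetical index: need ID -> {"type": str, "links": [parent IDs]}.
    """
    # Invert the outgoing links so each parent knows the types of its children.
    child_types = defaultdict(set)
    for need in needs.values():
        for target in need["links"]:
            child_types[target].add(need["type"])

    gaps = []
    for rule in required_links:
        parent_type, child_type = parse_chain(rule)
        for need_id, need in needs.items():
            if need["type"] == parent_type and child_type not in child_types[need_id]:
                gaps.append((need_id, child_type))
    return gaps


# Illustrative data only: IMPL_001 has no test linked to it.
needs = {
    "REQ_001": {"type": "req", "links": []},
    "SPEC_001": {"type": "spec", "links": ["REQ_001"]},
    "IMPL_001": {"type": "impl", "links": ["SPEC_001"]},
}
required_links = ["req -> spec", "spec -> impl", "impl -> test"]
print(find_gaps(needs, required_links))  # [('IMPL_001', 'test')]
```

Under this reading, each reported pair corresponds to one `[GAP]` marker in the `@pharaoh.trace` tree. Projects that use typed link options (`implements`, `tests`) would need to merge those into the `links` list first.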
diff --git a/.github/prompts/pharaoh.author.prompt.md b/.github/prompts/pharaoh.author.prompt.md new file mode 100644 index 0000000..f8a06dd --- /dev/null +++ b/.github/prompts/pharaoh.author.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.author +--- diff --git a/.github/prompts/pharaoh.change.prompt.md b/.github/prompts/pharaoh.change.prompt.md new file mode 100644 index 0000000..7bcd3b1 --- /dev/null +++ b/.github/prompts/pharaoh.change.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.change +--- diff --git a/.github/prompts/pharaoh.mece.prompt.md b/.github/prompts/pharaoh.mece.prompt.md new file mode 100644 index 0000000..ebb2eed --- /dev/null +++ b/.github/prompts/pharaoh.mece.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.mece +--- diff --git a/.github/prompts/pharaoh.plan.prompt.md b/.github/prompts/pharaoh.plan.prompt.md new file mode 100644 index 0000000..9babb20 --- /dev/null +++ b/.github/prompts/pharaoh.plan.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.plan +--- diff --git a/.github/prompts/pharaoh.release.prompt.md b/.github/prompts/pharaoh.release.prompt.md new file mode 100644 index 0000000..ea5d165 --- /dev/null +++ b/.github/prompts/pharaoh.release.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.release +--- diff --git a/.github/prompts/pharaoh.trace.prompt.md b/.github/prompts/pharaoh.trace.prompt.md new file mode 100644 index 0000000..250bcd8 --- /dev/null +++ b/.github/prompts/pharaoh.trace.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.trace +--- diff --git a/.github/prompts/pharaoh.verify.prompt.md b/.github/prompts/pharaoh.verify.prompt.md new file mode 100644 index 0000000..1726175 --- /dev/null +++ b/.github/prompts/pharaoh.verify.prompt.md @@ -0,0 +1,3 @@ +--- +agent: pharaoh.verify +--- diff --git a/.gitignore b/.gitignore index 50d7d54..77c7991 100644 --- a/.gitignore +++ b/.gitignore @@ -159,3 +159,9 @@ cython_debug/ # option (not recommended) you can uncomment the following to ignore the entire idea folder. 
#.idea/ docs/github_images + +# Pharaoh ephemeral state (do not commit). Project tailoring at .pharaoh/project/ IS committed. +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ diff --git a/.pharaoh/project/artefact-catalog.yaml b/.pharaoh/project/artefact-catalog.yaml new file mode 100644 index 0000000..06716f9 --- /dev/null +++ b/.pharaoh/project/artefact-catalog.yaml @@ -0,0 +1,228 @@ +req: + required_fields: + - id + - status + optional_fields: + - source_doc + - reviewer + - approved_by + - release + - author + child_of: [] + lifecycle_ref: workflows.yaml#req + +spec: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - reqs + child_of: + - req + lifecycle_ref: workflows.yaml#spec + +impl: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - implements + child_of: + - spec + - component + - interface + lifecycle_ref: workflows.yaml#impl + +test: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - specs + child_of: + - spec + lifecycle_ref: workflows.yaml#test + +person: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#person + +team: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - persons + child_of: [] + lifecycle_ref: workflows.yaml#team + +release: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - based_on + child_of: [] + lifecycle_ref: workflows.yaml#release + +arch: + required_fields: + - id + - status + optional_fields: + - source_doc + - reviewer + - approved_by + - depends_on + - realizes + child_of: [] + lifecycle_ref: workflows.yaml#arch + +need: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#need + +swarch: + required_fields: + - id + - status + optional_fields: + - source_doc 
+ - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#swarch + +component: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - provides + - consumes + - uses + child_of: + - swarch + - sys-arch + lifecycle_ref: workflows.yaml#component + +interface: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + - provided_by + child_of: [] + lifecycle_ref: workflows.yaml#interface + +seq_msg: + required_fields: + - id + - status + optional_fields: + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#seq_msg + +swreq: + required_fields: + - id + - status + optional_fields: + - source_doc + - reviewer + - approved_by + child_of: + - sysreq + lifecycle_ref: workflows.yaml#swreq + +sys-arch: + required_fields: + - id + - status + optional_fields: + - source_doc + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#sys-arch + +hazard: + required_fields: + - id + - status + - asil + optional_fields: + - severity + - exposure + - controllability + - scenario + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#hazard + +safety_goal: + required_fields: + - id + - status + - asil + - mitigates + optional_fields: + - safe_state + - reviewer + - approved_by + child_of: + - hazard + lifecycle_ref: workflows.yaml#safety_goal + +fsr: + required_fields: + - id + - status + - derives_from + optional_fields: + - asil + - reviewer + - approved_by + child_of: + - safety_goal + lifecycle_ref: workflows.yaml#fsr + +sysreq: + required_fields: + - id + - status + optional_fields: + - source_doc + - reviewer + - approved_by + child_of: [] + lifecycle_ref: workflows.yaml#sysreq diff --git a/.pharaoh/project/checklists/arch.md b/.pharaoh/project/checklists/arch.md new file mode 100644 index 0000000..bc926fa --- /dev/null +++ b/.pharaoh/project/checklists/arch.md @@ -0,0 +1,11 @@ +--- +applies_to: arch +required_before: [reviewed] +--- + +# 
Architecture element review checklist + +- [ ] Element has a single, named responsibility +- [ ] `:depends_on:` / `:realizes:` correctly capture relations to other architecture elements +- [ ] Element is traceable back to at least one requirement (`req`, `sysreq`, or `swreq`) +- [ ] Decomposition is balanced (not too broad, not too granular) diff --git a/.pharaoh/project/checklists/component.md b/.pharaoh/project/checklists/component.md new file mode 100644 index 0000000..380c601 --- /dev/null +++ b/.pharaoh/project/checklists/component.md @@ -0,0 +1,11 @@ +--- +applies_to: component +required_before: [reviewed] +--- + +# Component review checklist + +- [ ] Component has a single, named responsibility +- [ ] `:provides:` lists every interface offered to consumers +- [ ] `:consumes:` / `:uses:` lists every interface depended on +- [ ] Component is traceable to at least one architectural element (`arch` / `swarch` / `sys-arch`) diff --git a/.pharaoh/project/checklists/fsr.md b/.pharaoh/project/checklists/fsr.md new file mode 100644 index 0000000..468fe25 --- /dev/null +++ b/.pharaoh/project/checklists/fsr.md @@ -0,0 +1,11 @@ +--- +applies_to: fsr +required_before: [reviewed] +--- + +# Functional safety requirement review checklist + +- [ ] Body uses a single `shall` clause expressing a safety-relevant behavior +- [ ] `:derives_from:` names at least one real `safety_goal` ID +- [ ] ASIL inheritance / decomposition rationale is stated (if ASIL differs from parent) +- [ ] Allocation to component / sub-system is explicit diff --git a/.pharaoh/project/checklists/hazard.md b/.pharaoh/project/checklists/hazard.md new file mode 100644 index 0000000..2534859 --- /dev/null +++ b/.pharaoh/project/checklists/hazard.md @@ -0,0 +1,11 @@ +--- +applies_to: hazard +required_before: [reviewed] +--- + +# Hazard review checklist + +- [ ] `:asil:` is set to one of QM / A / B / C / D +- [ ] Severity, exposure, and controllability are stated (S/E/C rationale) +- [ ] `:scenario:` describes 
the operational situation that exposes the hazard +- [ ] Hazard is mitigated by at least one `safety_goal` diff --git a/.pharaoh/project/checklists/impl.md b/.pharaoh/project/checklists/impl.md new file mode 100644 index 0000000..40be838 --- /dev/null +++ b/.pharaoh/project/checklists/impl.md @@ -0,0 +1,11 @@ +--- +applies_to: impl +required_before: [reviewed] +--- + +# Implementation review checklist + +- [ ] `:implements:` names a real `spec`, `component`, or `interface` ID +- [ ] Title names the implementing module / function (or its commit anchor) +- [ ] Body summarizes the realised behavior, not unrelated context +- [ ] At least one `test` verifies the same `spec` diff --git a/.pharaoh/project/checklists/interface.md b/.pharaoh/project/checklists/interface.md new file mode 100644 index 0000000..e73bc20 --- /dev/null +++ b/.pharaoh/project/checklists/interface.md @@ -0,0 +1,11 @@ +--- +applies_to: interface +required_before: [reviewed] +--- + +# Interface review checklist + +- [ ] Operations are listed with explicit parameters and return contracts +- [ ] Pre / post-conditions or error modes are stated where relevant +- [ ] `:provided_by:` names exactly one provider component (or is intentionally blank for an abstract contract) +- [ ] Interface is referenced by at least one consumer (`component :uses:` / `:consumes:`) diff --git a/.pharaoh/project/checklists/need.md b/.pharaoh/project/checklists/need.md new file mode 100644 index 0000000..8e32e34 --- /dev/null +++ b/.pharaoh/project/checklists/need.md @@ -0,0 +1,8 @@ +--- +applies_to: need +required_before: [reviewed] +--- + +# Generic need review checklist + +- [ ] Review this need for clarity, correctness, and traceability. 
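Reviewer note on the checklist files: each one carries `applies_to` and `required_before` front matter followed by `- [ ]` items, but the patch does not describe how the review skills consume them. The sketch below shows one way the front matter could gate a lifecycle transition, assuming an unticked item blocks entry into any state listed in `required_before`. This is an illustration under that assumption, not Pharaoh's documented behavior. The helper names `parse_checklist` and `blocks_transition` are hypothetical, and the front matter is parsed by hand rather than with a YAML library to keep the example self-contained.

```python
import re


def parse_checklist(text):
    """Return (front_matter dict, list of (checked, item_text)) for a checklist file."""
    front, items = {}, []
    lines = text.splitlines()
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = i + 1
                break
            key, _, value = line.partition(":")
            front[key.strip()] = value.strip()
    for line in lines[body_start:]:
        match = re.match(r"^- \[( |x|X)\] (.+)$", line)
        if match:
            items.append((match.group(1) != " ", match.group(2)))
    return front, items


def blocks_transition(text, need_type, target_state):
    """True if an unticked item should block `need_type` from entering `target_state`."""
    front, items = parse_checklist(text)
    if front.get("applies_to") != need_type:
        return False
    raw_states = front.get("required_before", "").strip("[]")
    states = [state.strip() for state in raw_states.split(",")]
    return target_state in states and any(not checked for checked, _ in items)


# Sample content shaped like .pharaoh/project/checklists/hazard.md, one item ticked.
sample = """---
applies_to: hazard
required_before: [reviewed]
---

# Hazard review checklist

- [x] `:asil:` is set to one of QM / A / B / C / D
- [ ] Hazard is mitigated by at least one `safety_goal`
"""
print(blocks_transition(sample, "hazard", "reviewed"))
```

In advisory mode the result would presumably surface as a tip rather than a hard stop, which fits the advisory strictness named in the commit message.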
diff --git a/.pharaoh/project/checklists/person.md b/.pharaoh/project/checklists/person.md new file mode 100644 index 0000000..eb7546f --- /dev/null +++ b/.pharaoh/project/checklists/person.md @@ -0,0 +1,8 @@ +--- +applies_to: person +required_before: [reviewed] +--- + +# Person review checklist + +- [ ] Review this person for clarity, correctness, and traceability. diff --git a/.pharaoh/project/checklists/release.md b/.pharaoh/project/checklists/release.md new file mode 100644 index 0000000..983819c --- /dev/null +++ b/.pharaoh/project/checklists/release.md @@ -0,0 +1,8 @@ +--- +applies_to: release +required_before: [reviewed] +--- + +# Release review checklist + +- [ ] Review this release for clarity, correctness, and traceability. diff --git a/.pharaoh/project/checklists/req.md b/.pharaoh/project/checklists/req.md new file mode 100644 index 0000000..7b9cd6c --- /dev/null +++ b/.pharaoh/project/checklists/req.md @@ -0,0 +1,12 @@ +--- +applies_to: req +required_before: [reviewed] +--- + +# Requirement review checklist + +- [ ] Body uses a single `shall` clause expressing observable behavior +- [ ] Body does not describe implementation (package names, function names, internal data structures) +- [ ] Scope is a single observable behavior, not a conjunction +- [ ] `:source_doc:` references an existing doc file (if set) +- [ ] At least one `spec` refines this requirement (else it is untracked work) diff --git a/.pharaoh/project/checklists/requirement.md b/.pharaoh/project/checklists/requirement.md new file mode 100644 index 0000000..1686069 --- /dev/null +++ b/.pharaoh/project/checklists/requirement.md @@ -0,0 +1,6 @@ +--- +applies_to: req +required_before: [reviewed] +--- + +# See [req.md](req.md) — canonical requirement checklist diff --git a/.pharaoh/project/checklists/safety_goal.md b/.pharaoh/project/checklists/safety_goal.md new file mode 100644 index 0000000..167ad07 --- /dev/null +++ b/.pharaoh/project/checklists/safety_goal.md @@ -0,0 +1,11 @@ +--- 
+applies_to: safety_goal +required_before: [reviewed] +--- + +# Safety goal review checklist + +- [ ] `:asil:` is set to A, B, C, or D (QM not allowed for safety goals) +- [ ] `:mitigates:` names at least one real `hazard` ID +- [ ] `:safe_state:` is defined and reachable +- [ ] Goal is decomposed by at least one `fsr` diff --git a/.pharaoh/project/checklists/seq_msg.md b/.pharaoh/project/checklists/seq_msg.md new file mode 100644 index 0000000..f9f1714 --- /dev/null +++ b/.pharaoh/project/checklists/seq_msg.md @@ -0,0 +1,8 @@ +--- +applies_to: seq_msg +required_before: [reviewed] +--- + +# Sequence message review checklist + +- [ ] Review this seq_msg for clarity, correctness, and traceability. diff --git a/.pharaoh/project/checklists/spec.md b/.pharaoh/project/checklists/spec.md new file mode 100644 index 0000000..b126374 --- /dev/null +++ b/.pharaoh/project/checklists/spec.md @@ -0,0 +1,12 @@ +--- +applies_to: spec +required_before: [reviewed] +--- + +# Specification review checklist + +- [ ] `:reqs:` (or `:links:`) names at least one parent `req` +- [ ] Body refines exactly one observable behavior of the parent requirement +- [ ] Body uses precise wording (no "should/may/usually") +- [ ] At least one `impl` implements this spec +- [ ] At least one `test` verifies this spec diff --git a/.pharaoh/project/checklists/swarch.md b/.pharaoh/project/checklists/swarch.md new file mode 100644 index 0000000..557d2ac --- /dev/null +++ b/.pharaoh/project/checklists/swarch.md @@ -0,0 +1,11 @@ +--- +applies_to: swarch +required_before: [reviewed] +--- + +# Software architecture review checklist + +- [ ] Element has a single, named responsibility within the software boundary +- [ ] Interfaces / dependencies on other software components are explicit +- [ ] Element is traceable back to at least one `swreq` or `sysreq` +- [ ] Decomposition is balanced and consistent with sibling elements diff --git a/.pharaoh/project/checklists/swreq.md b/.pharaoh/project/checklists/swreq.md new 
file mode 100644 index 0000000..0a8850f --- /dev/null +++ b/.pharaoh/project/checklists/swreq.md @@ -0,0 +1,11 @@ +--- +applies_to: swreq +required_before: [reviewed] +--- + +# Software requirement review checklist + +- [ ] Body uses a single `shall` clause describing observable software behavior +- [ ] `:links:` (or another parent link) names a real `sysreq` +- [ ] Wording is implementation-free (no specific module / class names) +- [ ] At least one `spec` or `impl` covers this requirement diff --git a/.pharaoh/project/checklists/sys-arch.md b/.pharaoh/project/checklists/sys-arch.md new file mode 100644 index 0000000..6770874 --- /dev/null +++ b/.pharaoh/project/checklists/sys-arch.md @@ -0,0 +1,11 @@ +--- +applies_to: sys-arch +required_before: [reviewed] +--- + +# System architecture review checklist + +- [ ] Element captures a system-level structural or behavioral decision +- [ ] Hardware / software allocation (if relevant) is explicit +- [ ] Element is traceable back to at least one `sysreq` +- [ ] No leakage of software-internal detail into the system view diff --git a/.pharaoh/project/checklists/sysreq.md b/.pharaoh/project/checklists/sysreq.md new file mode 100644 index 0000000..0703fd8 --- /dev/null +++ b/.pharaoh/project/checklists/sysreq.md @@ -0,0 +1,11 @@ +--- +applies_to: sysreq +required_before: [reviewed] +--- + +# System requirement review checklist + +- [ ] Body uses a single `shall` clause describing observable system behavior +- [ ] Wording is allocation-free (does not pre-bind hardware or software boundaries unless required) +- [ ] `:source_doc:` references the originating document (if set) +- [ ] At least one `swreq` or `arch` element decomposes this requirement diff --git a/.pharaoh/project/checklists/team.md b/.pharaoh/project/checklists/team.md new file mode 100644 index 0000000..bcb7f2e --- /dev/null +++ b/.pharaoh/project/checklists/team.md @@ -0,0 +1,8 @@ +--- +applies_to: team +required_before: [reviewed] +--- + +# Team review checklist 
+ +- [ ] Review this team for clarity, correctness, and traceability. diff --git a/.pharaoh/project/checklists/test.md b/.pharaoh/project/checklists/test.md new file mode 100644 index 0000000..ca35349 --- /dev/null +++ b/.pharaoh/project/checklists/test.md @@ -0,0 +1,12 @@ +--- +applies_to: test +required_before: [reviewed] +--- + +# Test case review checklist + +- [ ] `:specs:` names at least one parent `spec` covered by this test +- [ ] Inputs / preconditions are stated explicitly +- [ ] Steps are ordered and reproducible +- [ ] Expected outcome is observable and unambiguous +- [ ] No combined assertions hiding multiple behaviors diff --git a/.pharaoh/project/id-conventions.yaml b/.pharaoh/project/id-conventions.yaml new file mode 100644 index 0000000..a8c78bc --- /dev/null +++ b/.pharaoh/project/id-conventions.yaml @@ -0,0 +1,26 @@ +prefixes: + req: R_ + spec: S_ + impl: I_ + test: T_ + person: P_ + team: T_ + release: R_ + arch: _ + need: _ + swarch: SWARCH_ + component: COMP_ + interface: INTF_ + seq_msg: SEQMSG_ + swreq: SWREQ_ + sys-arch: SYSARCH_ + hazard: HAZ_ + safety_goal: SG_ + fsr: FSR_ + sysreq: SYSREQ_ + +# Project id_regex from docs/ubproject.toml [needs] id_regex (preserved verbatim). +# Note: the strict OR-of-prefixes form would collide for R_ (req/release), T_ (test/team) +# and the bare _ (arch/need); the project intentionally uses a permissive regex instead. 
+id_regex: "^[A-Z_]{3,10}(_[0-9]{1,3})*$" +separator: "_" diff --git a/.pharaoh/project/workflows.yaml b/.pharaoh/project/workflows.yaml new file mode 100644 index 0000000..88a9bd4 --- /dev/null +++ b/.pharaoh/project/workflows.yaml @@ -0,0 +1,227 @@ +req: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +spec: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +impl: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +test: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +person: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +team: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +release: + states: + - draft + - reviewed + - 
approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +arch: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +need: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +swarch: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +component: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +interface: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +seq_msg: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +swreq: + states: + - draft + - reviewed + - 
approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +sys-arch: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +hazard: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +safety_goal: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +fsr: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved + +sysreq: + states: + - draft + - reviewed + - approved + transitions: + - {from: draft, to: reviewed, gate: "reviewer_present"} + - {from: reviewed, to: approved, gate: "approver_present"} + - {from: reviewed, to: draft, gate: "reviewer_rejected"} + initial: draft + final: approved diff --git a/docs/index.rst b/docs/index.rst index 3937d3c..dbf6403 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -106,6 +106,7 @@ Page Content demo_details online_editor + pharaoh .. 
toctree:: :maxdepth: 2 diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst new file mode 100644 index 0000000..ce55d89 --- /dev/null +++ b/docs/pharaoh.rst @@ -0,0 +1,180 @@ +{% set page="pharaoh.rst" %} +{% include "demo_page_header.rst" with context %} + +🦅 Pharaoh setup +================ + +`Pharaoh <https://github.com/useblocks/pharaoh>`__ is an authoring and +review layer that sits on top of Sphinx-Needs. It does not change how +needs are stored or built. Sphinx-Needs remains the source of truth. +Pharaoh adds: + +* atomic skills for drafting, reviewing, and auditing requirements, + architecture elements, FMEA entries, test cases, and decisions. +* a per-project tailoring layer (``.pharaoh/project/``) that captures + workflow states, artefact catalogs, ID conventions, and review + checklists in YAML. +* a traceability gate (``pharaoh:mece``) that reports gaps, orphans, + and link inconsistencies against the project's declared chains. +* GitHub Copilot agents (``@pharaoh.req-draft``, ``@pharaoh.mece``, + ``@pharaoh.flow``, ...) installed under ``.github/agents``. + +This page records how Pharaoh was bootstrapped on the +sphinx-needs-demo repository so the setup can be reproduced. + +What Pharaoh adds on top of Sphinx-Needs +---------------------------------------- + +The demo project already declares its need types and link options in +``docs/ubproject.toml``. Pharaoh re-uses those declarations and adds +the files described below. + +.. list-table:: + :header-rows: 1 + :widths: 30 50 20 + + * - File / directory + - Purpose + - Tracked in git? + * - ``pharaoh.toml`` + - Pharaoh strictness, mode, traceability rules, codelinks toggle. + Only Pharaoh's own behavior, not Sphinx-Needs configuration. + - yes + * - ``.pharaoh/project/workflows.yaml`` + - Lifecycle state machine per need type + (``draft → reviewed → approved``). + - yes + * - ``.pharaoh/project/id-conventions.yaml`` + - Per-type ID prefixes and the project's ``id_regex`` mirror. 
+ - yes + * - ``.pharaoh/project/artefact-catalog.yaml`` + - Required and optional fields per type, parent-of relations, + lifecycle reference. + - yes + * - ``.pharaoh/project/checklists/<type>.md`` + - Review checklist consumed by ``pharaoh:<type>-review`` skills. + - yes + * - ``.pharaoh/runs/``, ``.pharaoh/plans/``, + ``.pharaoh/session.json``, ``.pharaoh/cache/`` + - Run-time artefacts emitted by Pharaoh skills. + - **no** (gitignored) + * - ``.github/agents/pharaoh.*.agent.md`` + - Custom Copilot Chat agents. + - yes + * - ``.github/prompts/pharaoh.*.prompt.md`` + - Reusable prompts (``/pharaoh.author``, ``/pharaoh.mece``, ...). + - yes + * - ``.github/copilot-instructions.md`` + - Repository-wide Copilot preamble that loads Pharaoh context. + - yes + +Setup +----- + +The setup is performed by the ``pharaoh:pharaoh-setup`` skill. It is +idempotent: re-running it on an already-configured project shows a +diff and asks before overwriting any file. + +Prerequisites +^^^^^^^^^^^^^ + +* a Sphinx project with Sphinx-Needs already configured + (``ubproject.toml`` or ``conf.py`` declaring at least one + ``[[needs.types]]`` entry). +* the Pharaoh plugin installed in your Claude Code or Copilot CLI + workspace. + +Optional but recommended: + +* the ``ubc`` CLI on ``PATH`` (faster, deterministic data access). +* the ubCode VS Code extension (live indexing and MCP integration). + +Running the setup +^^^^^^^^^^^^^^^^^ + +In Claude Code or Copilot CLI: + +.. code-block:: text + + /pharaoh:pharaoh-setup + +The skill walks through five steps: + +1. **Detect project structure.** Reads ``ubproject.toml``, lists + declared types and link options, samples existing IDs, and detects + whether sphinx-codelinks is configured. +2. **Generate** ``pharaoh.toml``. Asks for a strictness mode and a + project-lifecycle mode (``reverse-eng``, ``greenfield``, or + ``steady-state``) and writes the file at the workspace root. +3. 
**Scaffold Copilot agents** under ``.github/agents/`` and + ``.github/prompts/``. +4. **Configure** ``.gitignore``. Adds narrow entries for the + ephemeral ``.pharaoh/`` paths and leaves ``.pharaoh/project/`` + tracked. +5. **Bootstrap tailoring.** Calls ``pharaoh-tailor-bootstrap`` to + write the YAML tailoring files and the per-type checklists into + ``.pharaoh/project/``. + +This demo's ``pharaoh.toml`` +---------------------------- + +.. literalinclude:: ../pharaoh.toml + :language: toml + :caption: pharaoh.toml + +A few decisions worth calling out: + +* **Mode** is ``reverse-eng``. The catalogue already exists, so the + ``require_change_analysis`` and ``require_mece_on_release`` gates + start permissive. They can be tightened with ``pharaoh:gate-advisor`` + once the workflow stabilises. +* **Strictness** is ``advisory``. Pharaoh skills suggest the + recommended workflow but never block authoring. +* **Required link chains** reflect the 100%-coverage policy that the + existing 268 needs already satisfy: ``spec → req``, ``arch → req``, + ``safety_goal → hazard``, ``fsr → safety_goal``. Chains for + ``impl`` and ``test`` were intentionally left out because the + corpus shows mixed parent types below 90% coverage. +* **Codelinks** are enabled. Pharaoh's change-impact analysis follows + the ``[codelinks.projects.coffee_machine]`` configuration declared + in ``docs/ubproject.toml``. + +Verifying the setup +------------------- + +Build the documentation and run ``pharaoh:mece`` to confirm the +traceability rules match the corpus: + +.. code-block:: bash + + uv sync + uv run sphinx-build -b html docs docs/_build/html -W + +After the build emits ``docs/_build/html/needs.json``, invoke the +MECE skill from your agent: + +.. code-block:: text + + /pharaoh:pharaoh-mece + +For the demo's current state the report shows zero gaps against the +configured chains. 
Other findings (status mismatches, ID-regex +violations, undeclared types injected by ``sphinx-test-reports``) are +pre-existing properties of the corpus and are surfaced for review, +not introduced by Pharaoh. + +Tailoring layer +--------------- + +The YAML files under ``.pharaoh/project/`` are intentionally +human-readable so that they can be hand-tuned. Key entry points: + +* ``workflows.yaml``: change the allowed states or add a + ``deprecated`` terminal state. +* ``artefact-catalog.yaml``: promote a field from optional to + required, or restrict ``child_of`` to a smaller set of parent types. +* ``checklists/<type>.md``: edit the review questions consumed by + ``pharaoh:<type>-review``. + +Re-run ``pharaoh:tailor-detect`` once the catalogue grows past a few +dozen needs to refresh the inferred conventions. diff --git a/pharaoh.toml b/pharaoh.toml new file mode 100644 index 0000000..3d50a07 --- /dev/null +++ b/pharaoh.toml @@ -0,0 +1,46 @@ +# Pharaoh configuration for sphinx-needs-demo. +# Need types and link types are read from docs/ubproject.toml. They are not +# re-defined here. + +[pharaoh] +# "advisory" (default): suggests workflow but never blocks. +# "enforcing": skills gate each other per workflow rules. +strictness = "advisory" + +[pharaoh.id_scheme] +# Available placeholders: {TYPE}, {MODULE}, {NUMBER}. +# Existing project IDs use a domain prefix (e.g. BRAKE_CTRL_01, FSR_POWER_01). +# The default below is a safe starting point for new IDs allocated by Pharaoh. +pattern = "{TYPE}_{NUMBER}" +auto_increment = true + +[pharaoh.workflow] +# mode: reverse-eng. Pharaoh is being introduced over an existing catalogue. +# Tighten as the catalogue stabilises (see skills/shared/gate-enablement.md). +require_change_analysis = false +require_verification = true +require_mece_on_release = false + +[pharaoh.traceability] +# Required link chains. pharaoh:mece reports violations. +# Format: "source_type -> target_type". 
Every <source> must have at least one +# outgoing link to a need of <target> type (across any link option). +# Chains below reflect 100%-coverage policy observed in this repo's needs.json: +# spec :reqs: -> req +# arch :links: -> req +# safety_goal :mitigates: -> hazard +# fsr :derives_from: -> safety_goal +# Chains for impl/test/component were intentionally omitted: corpus shows mixed +# parent types (swreq, swarch, interface, ...) below 90% coverage, so they are +# not project-wide policy yet. Add them once the convention stabilises. +required_links = [ + "spec -> req", + "arch -> req", + "safety_goal -> hazard", + "fsr -> safety_goal", +] + +[pharaoh.codelinks] +# sphinx-codelinks detected in docs/ubproject.toml [codelinks]. Pharaoh's +# change-impact analysis follows it. +enabled = true From 72143181e196b7aae6040842d9d02dfba471203c Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 12:47:20 +0200 Subject: [PATCH 02/15] Address PR review: align doc invocations with installed agents * docs/pharaoh.rst: use @pharaoh.setup / @pharaoh.mece for the Copilot Chat examples (these match the agents installed under .github/agents/), and reference /pharaoh.mece + the Claude Code /pharaoh:pharaoh-mece form as alternates. Drop the inaccurate /pharaoh:pharaoh-setup example since this repo does not install a matching prompt. * .pharaoh/project/id-conventions.yaml: clarify that id_regex is the anchored, ASCII-digit normalised form of the regex declared in docs/ubproject.toml, not a verbatim copy. 
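
Reviewer note on the id_regex wording: a minimal Python sketch, assuming the anchored pattern written to .pharaoh/project/id-conventions.yaml in this patch is the one IDs get checked against. The helper name `is_valid_id` is hypothetical and is not part of Pharaoh or Sphinx-Needs. It only illustrates what "anchored, ASCII-digit normalised" means in practice: the whole ID must match, and only the digits 0-9 count.

```python
import re

# Pattern copied from .pharaoh/project/id-conventions.yaml in this patch.
ID_REGEX = re.compile(r"^[A-Z_]{3,10}(_[0-9]{1,3})*$")


def is_valid_id(need_id: str) -> bool:
    """Hypothetical helper: True when the whole ID matches the project id_regex.

    fullmatch() is used on purpose: with match(), a trailing "$" would also
    accept an ID that ends in a newline.
    """
    return ID_REGEX.fullmatch(need_id) is not None


if __name__ == "__main__":
    # The first two samples are the example IDs quoted in pharaoh.toml.
    for candidate in ("BRAKE_CTRL_01", "FSR_POWER_01", "req_1", "AB", "REQ_\u0661"):
        print(repr(candidate), is_valid_id(candidate))
```

The last sample uses a non-ASCII digit. A `\d` class would accept it in Python's default Unicode mode, whereas `[0-9]` does not, which is presumably why the YAML copy spells the class out.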
--- .pharaoh/project/id-conventions.yaml | 8 +++++--- docs/pharaoh.rst | 28 +++++++++++++++++++--------- 2 files changed, 24 insertions(+), 12 deletions(-) diff --git a/.pharaoh/project/id-conventions.yaml b/.pharaoh/project/id-conventions.yaml index a8c78bc..5e7dd22 100644 --- a/.pharaoh/project/id-conventions.yaml +++ b/.pharaoh/project/id-conventions.yaml @@ -19,8 +19,10 @@ prefixes: fsr: FSR_ sysreq: SYSREQ_ -# Project id_regex from docs/ubproject.toml [needs] id_regex (preserved verbatim). -# Note: the strict OR-of-prefixes form would collide for R_ (req/release), T_ (test/team) -# and the bare _ (arch/need); the project intentionally uses a permissive regex instead. +# Anchored, ASCII-digit normalised form of the id_regex declared in +# docs/ubproject.toml [needs] (which uses the equivalent [\d] character class +# without anchors). The strict OR-of-prefixes form would collide for R_ +# (req/release), T_ (test/team) and the bare _ (arch/need). The project +# intentionally uses this permissive regex instead. id_regex: "^[A-Z_]{3,10}(_[0-9]{1,3})*$" separator: "_" diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index ce55d89..32a25f5 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -71,9 +71,9 @@ the files described below. Setup ----- -The setup is performed by the ``pharaoh:pharaoh-setup`` skill. It is -idempotent: re-running it on an already-configured project shows a -diff and asks before overwriting any file. +The setup is performed by the Pharaoh setup agent. It is idempotent: +re-running it on an already-configured project shows a diff and asks +before overwriting any file. Prerequisites ^^^^^^^^^^^^^ @@ -92,13 +92,20 @@ Optional but recommended: Running the setup ^^^^^^^^^^^^^^^^^ -In Claude Code or Copilot CLI: +After the agents in this PR are committed, GitHub Copilot Chat +exposes ``@pharaoh.setup`` as the entry point: + +.. code-block:: text + + @pharaoh.setup + +In Claude Code, invoke the same skill via its plugin name: .. 
code-block:: text /pharaoh:pharaoh-setup -The skill walks through five steps: +Either form runs the same five steps: 1. **Detect project structure.** Reads ``ubproject.toml``, lists declared types and link options, samples existing IDs, and detects @@ -142,7 +149,7 @@ A few decisions worth calling out: Verifying the setup ------------------- -Build the documentation and run ``pharaoh:mece`` to confirm the +Build the documentation and run the MECE check to confirm the traceability rules match the corpus: .. code-block:: bash @@ -150,12 +157,15 @@ traceability rules match the corpus: uv sync uv run sphinx-build -b html docs docs/_build/html -W -After the build emits ``docs/_build/html/needs.json``, invoke the -MECE skill from your agent: +After the build emits ``docs/_build/html/needs.json``, invoke MECE +from your agent. In Copilot Chat: .. code-block:: text - /pharaoh:pharaoh-mece + @pharaoh.mece + +The same prompt is available as ``/pharaoh.mece`` and, in Claude +Code, as ``/pharaoh:pharaoh-mece``. For the demo's current state the report shows zero gaps against the configured chains. Other findings (status mismatches, ID-regex From 60a8e2c5e3ba7cdc25a04ff06354d7356351c3d3 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 13:04:06 +0200 Subject: [PATCH 03/15] Address PR review: fix truncated agent descriptions and gitignore step Five agent files copied from the Pharaoh plugin had content issues that GitHub Copilot review flagged on this PR. Restoring the full source descriptions locally so this repo's .github/agents/ renders cleanly and matches the upstream skill specs. Tracked upstream as useblocks/pharaoh#12. * pharaoh.write-plan, pharaoh.toctree-emit, pharaoh.execute-plan, pharaoh.feat-file-map: descriptions were truncated mid-sentence at the first '.' character (e.g. "Use when you have an intent (e."), in some cases leaving unmatched parentheses or unclosed backticks. 
Restored from skills/pharaoh-*/SKILL.md description fields verbatim. * pharaoh.setup: Step 3 instructed to add a wholesale .pharaoh/ rule to .gitignore, contradicting the skill's own SKILL.md Step 4b which requires narrow ignores so .pharaoh/project/ tailoring stays tracked. Rewrote the step to match the skill's policy. --- .github/agents/pharaoh.execute-plan.agent.md | 4 ++-- .github/agents/pharaoh.feat-file-map.agent.md | 4 ++-- .github/agents/pharaoh.setup.agent.md | 15 ++++++++++++++- .github/agents/pharaoh.toctree-emit.agent.md | 4 ++-- .github/agents/pharaoh.write-plan.agent.md | 4 ++-- 5 files changed, 22 insertions(+), 9 deletions(-) diff --git a/.github/agents/pharaoh.execute-plan.agent.md b/.github/agents/pharaoh.execute-plan.agent.md index 2c1cf46..357b32d 100644 --- a/.github/agents/pharaoh.execute-plan.agent.md +++ b/.github/agents/pharaoh.execute-plan.agent.md @@ -1,10 +1,10 @@ --- -description: Use when executing a plan. +description: Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. The plan is the orchestrator, this skill is the engine. handoffs: [] --- # @pharaoh.execute-plan -Use when executing a plan. +Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. The plan is the orchestrator, this skill is the engine. See [`skills/pharaoh-execute-plan/SKILL.md`](../../skills/pharaoh-execute-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.feat-file-map.agent.md b/.github/agents/pharaoh.feat-file-map.agent.md index 0680f3b..5726391 100644 --- a/.github/agents/pharaoh.feat-file-map.agent.md +++ b/.github/agents/pharaoh.feat-file-map.agent.md @@ -1,10 +1,10 @@ --- -description: Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. +description: Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. Reads the source tree, returns a YAML entry `{feat_id: {files: [...], rationale: "..."}}`. Does NOT read docs. Does NOT emit reqs. Does NOT create or modify source files. handoffs: [] --- # @pharaoh.feat-file-map -Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. +Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. Reads the source tree, returns a YAML entry `{feat_id: {files: [...], rationale: "..."}}`. Does NOT read docs. Does NOT emit reqs. Does NOT create or modify source files. See [`skills/pharaoh-feat-file-map/SKILL.md`](../../skills/pharaoh-feat-file-map/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.setup.agent.md b/.github/agents/pharaoh.setup.agent.md index 9ff2517..14c5d01 100644 --- a/.github/agents/pharaoh.setup.agent.md +++ b/.github/agents/pharaoh.setup.agent.md @@ -67,7 +67,20 @@ Data access: ### Step 3: Configure .gitignore -Add `.pharaoh/` to `.gitignore` if not already present. Create `.gitignore` if needed. +Add narrow entries for the ephemeral `.pharaoh/` subpaths only. The +project tailoring under `.pharaoh/project/` is shared across the team +and must stay tracked. Create `.gitignore` if needed. 
+ +``` +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ +``` + +Do not add a wholesale `.pharaoh/` rule, that would hide the tailoring +files (`workflows.yaml`, `id-conventions.yaml`, `artefact-catalog.yaml`, +`checklists/`). ### Step 4: Recommend Tooling diff --git a/.github/agents/pharaoh.toctree-emit.agent.md b/.github/agents/pharaoh.toctree-emit.agent.md index 3aa35f5..eaeaaf5 100644 --- a/.github/agents/pharaoh.toctree-emit.agent.md +++ b/.github/agents/pharaoh.toctree-emit.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index. +description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree, that is a caller concern. handoffs: [] --- # @pharaoh.toctree-emit -Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index. +Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree, that is a caller concern. See [`skills/pharaoh-toctree-emit/SKILL.md`](../../skills/pharaoh-toctree-emit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
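
Reviewer note on the toctree-emit description: a rough Python sketch of what "add (or regenerate) an `index.rst` with a Sphinx toctree" could look like. This is an illustration under assumptions, not the skill's implementation. The function name `build_index_rst`, its parameters and the flat single-directory layout are all hypothetical.

```python
from pathlib import PurePosixPath


def build_index_rst(title: str, rst_files: list[str], maxdepth: int = 2) -> str:
    """Hypothetical helper: render an index.rst body listing sibling RST files.

    Assumes every file sits in the same directory as the generated index.rst,
    so toctree entries are plain document names without the .rst suffix.
    """
    entries = sorted(
        PurePosixPath(name).stem
        for name in rst_files
        # Skip non-RST files and the index itself to avoid a self-reference.
        if name.endswith(".rst") and PurePosixPath(name).name != "index.rst"
    )
    lines = [
        title,
        "=" * len(title),  # RST title underline must be at least as long as the title
        "",
        ".. toctree::",
        f"   :maxdepth: {maxdepth}",
        "",
    ]
    lines += [f"   {entry}" for entry in entries]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_index_rst("Features", ["b.rst", "index.rst", "a.rst", "notes.txt"]))
```

Listing every emitted file in a toctree is what keeps Sphinx from warning that a document is not included in any toctree, which would fail a `sphinx-build -W` run.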
diff --git a/.github/agents/pharaoh.write-plan.agent.md b/.github/agents/pharaoh.write-plan.agent.md index 934c18d..a9584a4 100644 --- a/.github/agents/pharaoh.write-plan.agent.md +++ b/.github/agents/pharaoh.write-plan.agent.md @@ -1,10 +1,10 @@ --- -description: Use when you have an intent (e. +description: Use when you have an intent (e.g. "reverse-engineer features and reqs from this module") and need a concrete plan.yaml that pharaoh-execute-plan can run. Picks a plan template by intent, fills project-specific values, emits a plan that validates against schema.md. Does NOT execute anything. handoffs: [] --- # @pharaoh.write-plan -Use when you have an intent (e. +Use when you have an intent (e.g. "reverse-engineer features and reqs from this module") and need a concrete plan.yaml that pharaoh-execute-plan can run. Picks a plan template by intent, fills project-specific values, emits a plan that validates against schema.md. Does NOT execute anything. See [`skills/pharaoh-write-plan/SKILL.md`](../../skills/pharaoh-write-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. From c691ee5ced79c0d986c09dc415c174ba3df37d9a Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 13:30:15 +0200 Subject: [PATCH 04/15] Add Pharaoh plugin install instructions to setup docs Cover Claude Code (/plugin marketplace add useblocks/pharaoh, /plugin install pharaoh@pharaoh-dev, /reload-plugins) and Copilot CLI (copilot plugin marketplace add / install). Mention pinning to a tag via /plugin marketplace add useblocks/pharaoh#v1.0.0. 
--- docs/pharaoh.rst | 39 ++++++++++++++++++++++++++++++++++----- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index 32a25f5..aa7f155 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -81,25 +81,54 @@ Prerequisites * a Sphinx project with Sphinx-Needs already configured (``ubproject.toml`` or ``conf.py`` declaring at least one ``[[needs.types]]`` entry). -* the Pharaoh plugin installed in your Claude Code or Copilot CLI - workspace. +* the Pharaoh plugin installed in your AI assistant of choice + (see below). Optional but recommended: * the ``ubc`` CLI on ``PATH`` (faster, deterministic data access). * the ubCode VS Code extension (live indexing and MCP integration). +Installing the Pharaoh plugin +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Claude Code.** Add the marketplace, install the plugin, and reload +plugins: + +.. code-block:: text + + /plugin marketplace add useblocks/pharaoh + /plugin install pharaoh@pharaoh-dev + /reload-plugins + +To pin a specific Pharaoh release instead of tracking ``pharaoh-dev``, +add the marketplace at a tag: + +.. code-block:: text + + /plugin marketplace add useblocks/pharaoh#v1.0.0 + +**GitHub Copilot CLI.** Same flow with the ``copilot`` command: + +.. code-block:: bash + + copilot plugin marketplace add useblocks/pharaoh + copilot plugin install pharaoh@pharaoh-dev + +Once the plugin is installed, the agents and skills described below +become available in the assistant. + Running the setup ^^^^^^^^^^^^^^^^^ -After the agents in this PR are committed, GitHub Copilot Chat -exposes ``@pharaoh.setup`` as the entry point: +After the plugin is installed and the agents in this PR are committed, +GitHub Copilot Chat exposes ``@pharaoh.setup`` as the entry point: .. code-block:: text @pharaoh.setup -In Claude Code, invoke the same skill via its plugin name: +In Claude Code, invoke the same skill via its plugin slash form: .. 
code-block:: text From 59c8a34010a282325b165b1aba2f3a9d6f759344 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 13:49:35 +0200 Subject: [PATCH 05/15] Address PR review: rewrite artefact-catalog child_of to match corpus The original catalog placed V-model textbook parent relations under child_of (e.g. impl.child_of: [spec], test.child_of: [spec]). The existing 268-need corpus does not honour those: impls link mostly to swreq/swarch/interface/component, tests link mostly to swreq, swreqs link to req. Pharaoh review skills consuming the catalog would have flagged 100+ false-positive missing-parent findings. Rewrote child_of using a conservative empirical rule: include a parent type only when (a) the corpus shows >=90% coverage and (b) the link semantics are child-to-parent (reqs, implements, specs, mitigates, derives_from, provided_by, generic links, parent_needs). Owner-style links (author, persons), planning links (release), and parent-to-child links (provides, startup_calls, shutdown_calls) are excluded even at high coverage. Types whose corpus parents are below the threshold or use ambiguous link semantics get child_of: []. Also tightened optional_fields per type to project-declared [needs.fields.X] entries plus the Pharaoh-internal reviewer / approved_by / source_doc trio. Project-specific fields (asil, severity, exposure, controllability, scenario, safe_state, customer, date, role, contact, image, jira, github, effort, approved) now sit on the type that actually carries them. 
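
Reviewer note on the child_of rule: the "conservative empirical rule" in this commit message can be sketched in Python as follows. This is a sketch under stated assumptions, not the procedure actually used to produce the catalog. The function name `infer_child_of` and the simplified need shape (a list of dicts with `id`, `type` and one list of target IDs per link option) are hypothetical. Reading "generic links" as the `links` option is also an assumption.

```python
# Link options the commit message treats as child-to-parent.
CHILD_TO_PARENT_LINKS = frozenset(
    {
        "reqs",
        "implements",
        "specs",
        "mitigates",
        "derives_from",
        "provided_by",
        "links",  # assumed spelling of the "generic links" option
        "parent_needs",
    }
)
COVERAGE_THRESHOLD = 0.90  # ">=90% coverage" from the commit message


def infer_child_of(needs: list[dict], need_type: str) -> list[str]:
    """Hypothetical helper: parent types that qualify for child_of.

    A parent type qualifies when at least 90% of the needs of `need_type`
    reach it through a child-to-parent link option. Owner-style, planning
    and parent-to-child options are ignored because they are not in
    CHILD_TO_PARENT_LINKS.
    """
    type_by_id = {need["id"]: need["type"] for need in needs}
    children = [need for need in needs if need["type"] == need_type]
    if not children:
        return []

    hits: dict[str, int] = {}
    for child in children:
        # Collect each parent type once per child, however many links point at it.
        parent_types = set()
        for option in CHILD_TO_PARENT_LINKS:
            for target_id in child.get(option, []):
                target_type = type_by_id.get(target_id)
                if target_type is not None:
                    parent_types.add(target_type)
        for parent_type in parent_types:
            hits[parent_type] = hits.get(parent_type, 0) + 1

    return sorted(
        parent_type
        for parent_type, count in hits.items()
        if count / len(children) >= COVERAGE_THRESHOLD
    )
```

With this rule a type whose links are spread across several parent types, none reaching the threshold, ends up with an empty list, which matches the `child_of: []` entries for impl and test in the diff.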
--- .pharaoh/project/artefact-catalog.yaml | 56 +++++++++++++------------- 1 file changed, 29 insertions(+), 27 deletions(-) diff --git a/.pharaoh/project/artefact-catalog.yaml b/.pharaoh/project/artefact-catalog.yaml index 06716f9..4be833b 100644 --- a/.pharaoh/project/artefact-catalog.yaml +++ b/.pharaoh/project/artefact-catalog.yaml @@ -1,3 +1,10 @@ +# child_of values are intentionally conservative: only listed where the +# corpus shows >=90% coverage AND the link semantics are child-to-parent +# (decomposition / refinement). Owner-style links (:author:, :persons:), +# planning links (:release:), and parent-to-child links (:provides:, +# :startup_calls:, :shutdown_calls:) are excluded even when their corpus +# coverage is high. Types with no qualifying parent get child_of: []. + req: required_fields: - id @@ -6,8 +13,8 @@ req: - source_doc - reviewer - approved_by - - release - - author + - customer + - approved child_of: [] lifecycle_ref: workflows.yaml#req @@ -18,7 +25,6 @@ spec: optional_fields: - reviewer - approved_by - - reqs child_of: - req lifecycle_ref: workflows.yaml#spec @@ -30,11 +36,11 @@ impl: optional_fields: - reviewer - approved_by - - implements - child_of: - - spec - - component - - interface + - jira + - github + - effort + - approved + child_of: [] lifecycle_ref: workflows.yaml#impl test: @@ -44,9 +50,7 @@ test: optional_fields: - reviewer - approved_by - - specs - child_of: - - spec + child_of: [] lifecycle_ref: workflows.yaml#test person: @@ -54,6 +58,9 @@ person: - id - status optional_fields: + - role + - contact + - image - reviewer - approved_by child_of: [] @@ -66,7 +73,6 @@ team: optional_fields: - reviewer - approved_by - - persons child_of: [] lifecycle_ref: workflows.yaml#team @@ -75,9 +81,9 @@ release: - id - status optional_fields: + - date - reviewer - approved_by - - based_on child_of: [] lifecycle_ref: workflows.yaml#release @@ -89,9 +95,8 @@ arch: - source_doc - reviewer - approved_by - - depends_on - - realizes - child_of: 
[] + child_of: + - req lifecycle_ref: workflows.yaml#arch need: @@ -112,7 +117,8 @@ swarch: - source_doc - reviewer - approved_by - child_of: [] + child_of: + - swreq lifecycle_ref: workflows.yaml#swarch component: @@ -122,12 +128,7 @@ component: optional_fields: - reviewer - approved_by - - provides - - consumes - - uses - child_of: - - swarch - - sys-arch + child_of: [] lifecycle_ref: workflows.yaml#component interface: @@ -137,8 +138,8 @@ interface: optional_fields: - reviewer - approved_by - - provided_by - child_of: [] + child_of: + - component lifecycle_ref: workflows.yaml#interface seq_msg: @@ -159,8 +160,10 @@ swreq: - source_doc - reviewer - approved_by + - jira + - github child_of: - - sysreq + - req lifecycle_ref: workflows.yaml#swreq sys-arch: @@ -207,7 +210,6 @@ fsr: required_fields: - id - status - - derives_from optional_fields: - asil - reviewer From 794b6375b127aac08d8ae6ad6e8a74a6d21e14e8 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 16:35:32 +0200 Subject: [PATCH 06/15] Sync .github/agents and .github/prompts with useblocks/pharaoh#16 Pulls the agent and prompt template state from the upstream PR that addresses my consolidated feedback (useblocks/pharaoh#13). Adds the two missing user-entry agents (@pharaoh.author, @pharaoh.verify) so the matching slash commands stop dispatching to nothing, and picks up the description and gitignore-guidance fixes from the same upstream branch (covers useblocks/pharaoh#11 and #12 in one shot). Replaces the local hand-edits applied earlier in this PR for the five agent files I had patched (pharaoh.write-plan, pharaoh.toctree-emit, pharaoh.execute-plan, pharaoh.feat-file-map, pharaoh.setup) with the upstream versions, so the templates stay in sync with the plugin. 
--- .../pharaoh.activity-diagram-draft.agent.md | 4 ++-- .../pharaoh.api-coverage-check.agent.md | 6 +++--- .github/agents/pharaoh.arch-draft.agent.md | 4 ++-- .github/agents/pharaoh.arch-review.agent.md | 4 ++-- .github/agents/pharaoh.audit-fanout.agent.md | 4 ++-- .github/agents/pharaoh.author.agent.md | 19 ++++++++++++++++++ .../pharaoh.block-diagram-draft.agent.md | 4 ++-- .github/agents/pharaoh.bootstrap.agent.md | 4 ++-- .github/agents/pharaoh.change.agent.md | 2 +- .../pharaoh.class-diagram-draft.agent.md | 4 ++-- .../pharaoh.component-diagram-draft.agent.md | 4 ++-- .../agents/pharaoh.context-gather.agent.md | 4 ++-- .github/agents/pharaoh.coverage-gap.agent.md | 4 ++-- .github/agents/pharaoh.decide.agent.md | 2 +- .../agents/pharaoh.decision-record.agent.md | 4 ++-- .../agents/pharaoh.decision-review.agent.md | 6 +++--- .../pharaoh.deployment-diagram-draft.agent.md | 4 ++-- .github/agents/pharaoh.diagram-lint.agent.md | 4 ++-- .../agents/pharaoh.diagram-review.agent.md | 6 +++--- .../pharaoh.dispatch-signal-check.agent.md | 6 +++--- .github/agents/pharaoh.execute-plan.agent.md | 4 ++-- .../pharaoh.fault-tree-diagram-draft.agent.md | 4 ++-- .github/agents/pharaoh.feat-balance.agent.md | 4 ++-- .../pharaoh.feat-component-extract.agent.md | 4 ++-- .../pharaoh.feat-draft-from-docs.agent.md | 4 ++-- .../agents/pharaoh.feat-flow-extract.agent.md | 4 ++-- .github/agents/pharaoh.feat-review.agent.md | 6 +++--- .../agents/pharaoh.finding-record.agent.md | 4 ++-- .github/agents/pharaoh.flow.agent.md | 4 ++-- .github/agents/pharaoh.fmea-review.agent.md | 6 +++--- .github/agents/pharaoh.fmea.agent.md | 4 ++-- .github/agents/pharaoh.gate-advisor.agent.md | 6 +++--- .github/agents/pharaoh.id-allocate.agent.md | 4 ++-- .../pharaoh.id-convention-check.agent.md | 6 +++--- .../agents/pharaoh.lifecycle-check.agent.md | 4 ++-- .../pharaoh.link-completeness-check.agent.md | 6 +++--- .github/agents/pharaoh.mece.agent.md | 2 +- .../agents/pharaoh.output-validate.agent.md | 4 
++-- .../pharaoh.papyrus-non-empty-check.agent.md | 4 ++-- .github/agents/pharaoh.plan.agent.md | 2 +- .github/agents/pharaoh.process-audit.agent.md | 4 ++-- .github/agents/pharaoh.prose-migrate.agent.md | 4 ++-- .github/agents/pharaoh.quality-gate.agent.md | 4 ++-- .github/agents/pharaoh.release.agent.md | 2 +- .../pharaoh.reproducibility-check.agent.md | 6 +++--- .../pharaoh.req-code-grounding-check.agent.md | 6 +++--- .../pharaoh.req-codelink-annotate.agent.md | 4 ++-- .github/agents/pharaoh.req-draft.agent.md | 4 ++-- .github/agents/pharaoh.req-from-code.agent.md | 4 ++-- .../agents/pharaoh.req-regenerate.agent.md | 4 ++-- .github/agents/pharaoh.req-review.agent.md | 4 ++-- .../pharaoh.review-completeness.agent.md | 4 ++-- ...haraoh.self-review-coverage-check.agent.md | 6 +++--- .../pharaoh.sequence-diagram-draft.agent.md | 4 ++-- .github/agents/pharaoh.setup.agent.md | 20 ++++++++++++------- .github/agents/pharaoh.spec.agent.md | 2 +- .../pharaoh.sphinx-extension-add.agent.md | 4 ++-- .../pharaoh.standard-conformance.agent.md | 4 ++-- .../pharaoh.state-diagram-draft.agent.md | 4 ++-- .../pharaoh.status-lifecycle-check.agent.md | 4 ++-- .../agents/pharaoh.tailor-bootstrap.agent.md | 4 ++-- ...aoh.tailor-code-grounding-filters.agent.md | 4 ++-- .github/agents/pharaoh.tailor-detect.agent.md | 4 ++-- .github/agents/pharaoh.tailor-fill.agent.md | 4 ++-- .github/agents/pharaoh.tailor-review.agent.md | 4 ++-- .github/agents/pharaoh.toctree-emit.agent.md | 4 ++-- .github/agents/pharaoh.trace.agent.md | 2 +- .../pharaoh.use-case-diagram-draft.agent.md | 6 +++--- .github/agents/pharaoh.verify.agent.md | 19 ++++++++++++++++++ .github/agents/pharaoh.vplan-draft.agent.md | 4 ++-- .github/agents/pharaoh.vplan-review.agent.md | 4 ++-- .github/prompts/pharaoh.author.prompt.md | 20 +++++++++++++++++++ .github/prompts/pharaoh.verify.prompt.md | 19 ++++++++++++++++++ 73 files changed, 232 insertions(+), 149 deletions(-) create mode 100644 .github/agents/pharaoh.author.agent.md 
create mode 100644 .github/agents/pharaoh.verify.agent.md diff --git a/.github/agents/pharaoh.activity-diagram-draft.agent.md b/.github/agents/pharaoh.activity-diagram-draft.agent.md index cb40cfe..36bba5e 100644 --- a/.github/agents/pharaoh.activity-diagram-draft.agent.md +++ b/.github/agents/pharaoh.activity-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. +description: Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. Typical ASPICE usage — SWE.3 Software Detailed Design. Renderer tailored via `pharaoh.toml`. Does NOT emit other diagram kinds. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.activity-diagram-draft -Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. +Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. Typical ASPICE usage — SWE.3 Software Detailed Design. Renderer tailored via `pharaoh.toml`. Does NOT emit other diagram kinds. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-activity-diagram-draft/SKILL.md`](../../skills/pharaoh-activity-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.api-coverage-check.agent.md b/.github/agents/pharaoh.api-coverage-check.agent.md index f76eb9c..0fac840 100644 --- a/.github/agents/pharaoh.api-coverage-check.agent.md +++ b/.github/agents/pharaoh.api-coverage-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in needs.json. Reverse direction of pharaoh-req-from-code — language-parametric via the shared regex table; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. +description: Use when verifying that a source file is covered by the need catalogue on two axes — (1) at least one CREQ declares the file as its `:source_doc:`, and (2) every project-defined exception class raised in the file is named by some CREQ's title or content. Exception classes not defined in the project source tree (stdlib, third-party deps) are reported as `external` and do not fail the axis. Classifies non-behavioral files (constants, type aliases, bare re-exports) as skipped. Language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md` (python / rust / typescript / go / c / cpp / java). Single mechanical structural check. handoffs: [] --- # @pharaoh.api-coverage-check -Verify that every public symbol and every raise-site exception in a source file is covered by at least one need in `needs.json`. Reverse direction of `pharaoh-req-from-code` — language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md`; emits per-symbol and per-raise-site coverage plus a ratio against a tailored threshold. +Use when verifying that a source file is covered by the need catalogue on two axes — (1) at least one CREQ declares the file as its `:source_doc:`, and (2) every project-defined exception class raised in the file is named by some CREQ's title or content. 
Exception classes not defined in the project source tree (stdlib, third-party deps) are reported as `external` and do not fail the axis. Classifies non-behavioral files (constants, type aliases, bare re-exports) as skipped. Language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md` (python / rust / typescript / go / c / cpp / java). Single mechanical structural check. -See [`skills/pharaoh-api-coverage-check/SKILL.md`](../../skills/pharaoh-api-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. +See [`skills/pharaoh-api-coverage-check/SKILL.md`](../../skills/pharaoh-api-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.arch-draft.agent.md b/.github/agents/pharaoh.arch-draft.agent.md index 36d408c..3f9b5e3 100644 --- a/.github/agents/pharaoh.arch-draft.agent.md +++ b/.github/agents/pharaoh.arch-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Draft a single sphinx-needs architecture element from one parent requirement. +description: Use when drafting a single sphinx-needs architecture element (component / interface / module) from one parent requirement. Emits an RST directive block linking back to the parent via :satisfies:. handoffs: [] --- # @pharaoh.arch-draft -Draft a single sphinx-needs architecture element from one parent requirement. +Use when drafting a single sphinx-needs architecture element (component / interface / module) from one parent requirement. Emits an RST directive block linking back to the parent via :satisfies:. See [`skills/pharaoh-arch-draft/SKILL.md`](../../skills/pharaoh-arch-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
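Reviewer note on the second axis of `pharaoh-api-coverage-check` above: a minimal sketch of how the exception check could behave, under stated assumptions. The function name, the `covered` / `uncovered` labels, the dict shape of a need, and the plain substring match are all hypothetical; only the `external` label and the rule that external classes do not fail the axis come from the description. The real rules live in the skill's `SKILL.md`.

```python
from typing import Dict, Iterable, List, Set


def check_exception_axis(
    raised: Iterable[str],
    project_defined: Set[str],
    needs: List[Dict[str, str]],
) -> Dict[str, object]:
    """Classify each raised exception class (hypothetical helper).

    - "external": class is not defined in the project source tree
      (stdlib or third-party); reported but never fails the axis.
    - "covered": some need names the class in its title or content.
    - "uncovered": project-defined class that no need mentions.
    """
    per_exception: Dict[str, str] = {}
    for exc in sorted(set(raised)):
        if exc not in project_defined:
            per_exception[exc] = "external"
            continue
        # Assumption: a simple substring match stands in for "named by".
        named = any(
            exc in need.get("title", "") or exc in need.get("content", "")
            for need in needs
        )
        per_exception[exc] = "covered" if named else "uncovered"
    # The axis passes only when no project-defined class is left uncovered.
    passed = "uncovered" not in per_exception.values()
    return {"per_exception": per_exception, "passed": passed}
```

Sorting the class names keeps the report deterministic across runs, which matters if the output is compared between audits.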
diff --git a/.github/agents/pharaoh.arch-review.agent.md b/.github/agents/pharaoh.arch-review.agent.md index c46b8fb..ac6afb0 100644 --- a/.github/agents/pharaoh.arch-review.agent.md +++ b/.github/agents/pharaoh.arch-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single architecture element against ISO 26262-8 §6 axes. +description: Use when auditing a single architecture element against the 10 ISO 26262-8 §6 axes plus arch-specific axes (traceability back to requirement). Emits structured findings JSON. handoffs: [] --- # @pharaoh.arch-review -Audit a single architecture element against ISO 26262-8 §6 axes. +Use when auditing a single architecture element against the 10 ISO 26262-8 §6 axes plus arch-specific axes (traceability back to requirement). Emits structured findings JSON. See [`skills/pharaoh-arch-review/SKILL.md`](../../skills/pharaoh-arch-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.audit-fanout.agent.md b/.github/agents/pharaoh.audit-fanout.agent.md index e809101..c314f96 100644 --- a/.github/agents/pharaoh.audit-fanout.agent.md +++ b/.github/agents/pharaoh.audit-fanout.agent.md @@ -1,10 +1,10 @@ --- -description: Run a full project audit in parallel across atomic audit skills, sharing findings via Papyrus. +description: Use when running a full project audit in parallel by dispatching 5 atomic audit skills, each writing findings to a shared Papyrus workspace via pharaoh-finding-record for automatic deduplication. Emits the aggregated deduplicated findings list. handoffs: [] --- # @pharaoh.audit-fanout -Run a full project audit in parallel across atomic audit skills, sharing findings via Papyrus. +Use when running a full project audit in parallel by dispatching 5 atomic audit skills, each writing findings to a shared Papyrus workspace via pharaoh-finding-record for automatic deduplication. 
Emits the aggregated deduplicated findings list. See [`skills/pharaoh-audit-fanout/SKILL.md`](../../skills/pharaoh-audit-fanout/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.author.agent.md b/.github/agents/pharaoh.author.agent.md new file mode 100644 index 0000000..e1a8e7c --- /dev/null +++ b/.github/agents/pharaoh.author.agent.md @@ -0,0 +1,19 @@ +--- +description: Use when authoring or modifying a single sphinx-needs artefact (requirement, architecture element, test case, decision) by routing to the matching atomic drafting skill based on the project's artefact catalog. Returns the drafted RST directive with an ID, file placement suggestion, and parent link. +handoffs: + - label: Verify Authored Need + agent: pharaoh.verify + prompt: Check that the authored artefact addresses the substance of its parent + - label: Review Drafted Requirement + agent: pharaoh.req-review + prompt: Audit the drafted requirement against the ISO 26262 §6 axes + - label: Trace the Authored Need + agent: pharaoh.trace + prompt: Trace the new artefact through all link types +--- + +# @pharaoh.author + +Use when authoring or modifying a single sphinx-needs artefact (requirement, architecture element, test case, decision) by routing to the matching atomic drafting skill based on the project's artefact catalog. Returns the drafted RST directive with an ID, file placement suggestion, and parent link. + +See [`skills/pharaoh-author/SKILL.md`](../../skills/pharaoh-author/SKILL.md) for the full atomic specification — inputs, dispatch table, and composition patterns. 
diff --git a/.github/agents/pharaoh.block-diagram-draft.agent.md b/.github/agents/pharaoh.block-diagram-draft.agent.md index 0436607..99e72ac 100644 --- a/.github/agents/pharaoh.block-diagram-draft.agent.md +++ b/.github/agents/pharaoh.block-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. +description: Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. Typical ASPICE usage — SYS.2/SYS.3 for system-level architecture, and SWE.2 for software architecture on SysML-heavy projects. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.block-diagram-draft -Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. +Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. Typical ASPICE usage — SYS.2/SYS.3 for system-level architecture, and SWE.2 for software architecture on SysML-heavy projects. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-block-diagram-draft/SKILL.md`](../../skills/pharaoh-block-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.bootstrap.agent.md b/.github/agents/pharaoh.bootstrap.agent.md index bca421a..213249a 100644 --- a/.github/agents/pharaoh.bootstrap.agent.md +++ b/.github/agents/pharaoh.bootstrap.agent.md @@ -1,5 +1,5 @@ --- -description: Inject minimum sphinx-needs configuration into an existing Sphinx project so sphinx-build produces a valid needs.json. +description: Use when a Sphinx project has no sphinx-needs configured and you need minimum viable scaffolding — adding the extension and declaring need types — so that sphinx-build produces a valid needs.json for downstream Pharaoh skills. handoffs: - label: Detect and scaffold Pharaoh agent: pharaoh.setup @@ -8,6 +8,6 @@ handoffs: # @pharaoh.bootstrap -Inject the minimum sphinx-needs configuration — extension entry, need types, optional extra links — into an existing Sphinx project that does not yet have sphinx-needs configured. Does not seed RST content, does not build, does not write `pharaoh.toml`. +Use when a Sphinx project has no sphinx-needs configured and you need minimum viable scaffolding — adding the extension and declaring need types — so that sphinx-build produces a valid needs.json for downstream Pharaoh skills. See [`skills/pharaoh-bootstrap/SKILL.md`](../../skills/pharaoh-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.change.agent.md b/.github/agents/pharaoh.change.agent.md index 1ed8190..df8fdf3 100644 --- a/.github/agents/pharaoh.change.agent.md +++ b/.github/agents/pharaoh.change.agent.md @@ -1,5 +1,5 @@ --- -description: Analyze the impact of changing a requirement, specification, or any sphinx-needs item. Traces through all link types and codelinks to produce a Change Document. 
+description: Use when analyzing the impact of changing a requirement, specification, or any sphinx-needs item, including traceability to code via codelinks handoffs: - label: MECE Check agent: pharaoh.mece diff --git a/.github/agents/pharaoh.class-diagram-draft.agent.md b/.github/agents/pharaoh.class-diagram-draft.agent.md index 7eef840..5911999 100644 --- a/.github/agents/pharaoh.class-diagram-draft.agent.md +++ b/.github/agents/pharaoh.class-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). +description: Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.class-diagram-draft -Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). +Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-class-diagram-draft/SKILL.md`](../../skills/pharaoh-class-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.component-diagram-draft.agent.md b/.github/agents/pharaoh.component-diagram-draft.agent.md index b0bea11..e4bf231 100644 --- a/.github/agents/pharaoh.component-diagram-draft.agent.md +++ b/.github/agents/pharaoh.component-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. +description: Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. Renderer tailored via `pharaoh.toml`. Does NOT emit sequence, class, or state diagrams — those are separate skills. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.component-diagram-draft -Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. +Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. Renderer tailored via `pharaoh.toml`. Does NOT emit sequence, class, or state diagrams — those are separate skills. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-component-diagram-draft/SKILL.md`](../../skills/pharaoh-component-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.context-gather.agent.md b/.github/agents/pharaoh.context-gather.agent.md index 24926d3..8f21d82 100644 --- a/.github/agents/pharaoh.context-gather.agent.md +++ b/.github/agents/pharaoh.context-gather.agent.md @@ -1,10 +1,10 @@ --- -description: Retrieve rationale memories from a Papyrus workspace before authoring or review. +description: Use when retrieving rationale memories relevant to an authoring context from a Papyrus workspace, before invoking any draft or review skill. Returns a structured list of memories (memory_id, text, relevance_score). Does NOT draft, review, or modify artefacts. handoffs: [] --- # @pharaoh.context-gather -Retrieve rationale memories from a Papyrus workspace before authoring or review. +Use when retrieving rationale memories relevant to an authoring context from a Papyrus workspace, before invoking any draft or review skill. Returns a structured list of memories (memory_id, text, relevance_score). Does NOT draft, review, or modify artefacts. See [`skills/pharaoh-context-gather/SKILL.md`](../../skills/pharaoh-context-gather/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.coverage-gap.agent.md b/.github/agents/pharaoh.coverage-gap.agent.md index ef820ca..b95341d 100644 --- a/.github/agents/pharaoh.coverage-gap.agent.md +++ b/.github/agents/pharaoh.coverage-gap.agent.md @@ -1,10 +1,10 @@ --- -description: Detect one gap category (orphan / unverified / duplicate / contradictory / lifecycle) in a sphinx-needs corpus. +description: Use when detecting one gap category (orphan / unverified / duplicate / contradictory / lifecycle / ...) in a sphinx-needs corpus. Returns ordered list of needs falling into that gap. handoffs: [] --- # @pharaoh.coverage-gap -Detect one gap category (orphan / unverified / duplicate / contradictory / lifecycle) in a sphinx-needs corpus. 
+Use when detecting one gap category (orphan / unverified / duplicate / contradictory / lifecycle / ...) in a sphinx-needs corpus. Returns ordered list of needs falling into that gap. See [`skills/pharaoh-coverage-gap/SKILL.md`](../../skills/pharaoh-coverage-gap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.decide.agent.md b/.github/agents/pharaoh.decide.agent.md index 2b6938e..4e942fb 100644 --- a/.github/agents/pharaoh.decide.agent.md +++ b/.github/agents/pharaoh.decide.agent.md @@ -1,5 +1,5 @@ --- -description: Record a design decision as a traceable sphinx-needs object with alternatives, rationale, and links to affected requirements. +description: Use when recording a design decision as a traceable sphinx-needs object with alternatives, rationale, and links to affected requirements handoffs: - label: Trace Decision agent: pharaoh.trace diff --git a/.github/agents/pharaoh.decision-record.agent.md b/.github/agents/pharaoh.decision-record.agent.md index 6b6b044..c7cccc6 100644 --- a/.github/agents/pharaoh.decision-record.agent.md +++ b/.github/agents/pharaoh.decision-record.agent.md @@ -1,10 +1,10 @@ --- -description: Record a canonical decision, fact, or preference in the shared Papyrus workspace with (type, canonical_name) dedup. +description: Use when recording a canonical decision, fact, or preference in the shared Papyrus workspace with automatic dedup on (type, canonical_name). Returns {action: wrote|duplicate, papyrus_id}. Generalizes pharaoh-finding-record beyond audit findings. handoffs: [] --- # @pharaoh.decision-record -Record a canonical decision, fact, or preference in the shared Papyrus workspace with (type, canonical_name) dedup. +Use when recording a canonical decision, fact, or preference in the shared Papyrus workspace with automatic dedup on (type, canonical_name). Returns {action: wrote|duplicate, papyrus_id}. 
Generalizes pharaoh-finding-record beyond audit findings. See [`skills/pharaoh-decision-record/SKILL.md`](../../skills/pharaoh-decision-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.decision-review.agent.md b/.github/agents/pharaoh.decision-review.agent.md index e644a61..35d3bb0 100644 --- a/.github/agents/pharaoh.decision-review.agent.md +++ b/.github/agents/pharaoh.decision-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single recorded decision against context/alternatives/consequences structure and traceability. +description: Use when auditing a single recorded decision (DR / ADR / design note) against the generic decision review axes in `shared/checklists/decision.md`. Checks context/alternatives/consequences structure, traceability to affected artefacts, rationale completeness. Emits structured findings JSON. handoffs: [] --- # @pharaoh.decision-review -Audit a single recorded decision against context/alternatives/consequences structure and traceability. +Use when auditing a single recorded decision (DR / ADR / design note) against the generic decision review axes in `shared/checklists/decision.md`. Checks context/alternatives/consequences structure, traceability to affected artefacts, rationale completeness. Emits structured findings JSON. -See [`skills/pharaoh-decision-review/SKILL.md`](../../skills/pharaoh-decision-review/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-decision-review/SKILL.md`](../../skills/pharaoh-decision-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
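Reviewer note on the dedup contract stated for `pharaoh-decision-record` above: the sketch below illustrates "automatic dedup on (type, canonical_name)" with the documented return shape `{action: wrote|duplicate, papyrus_id}`. The `InMemoryWorkspace` class, its `record` method, and the `mem-N` id format are hypothetical stand-ins; the actual Papyrus workspace API is not shown in this patch and may differ.

```python
import itertools
from typing import Dict, Tuple


class InMemoryWorkspace:
    """Hypothetical stand-in for a shared Papyrus workspace."""

    def __init__(self) -> None:
        # Maps the dedup key (type, canonical_name) to (id, text).
        self._by_key: Dict[Tuple[str, str], Tuple[str, str]] = {}
        self._ids = itertools.count(1)

    def record(self, type_: str, canonical_name: str, text: str) -> Dict[str, str]:
        key = (type_, canonical_name)
        existing = self._by_key.get(key)
        if existing is not None:
            # Same key seen before: report the earlier id, write nothing.
            return {"action": "duplicate", "papyrus_id": existing[0]}
        papyrus_id = f"mem-{next(self._ids)}"  # assumed id format
        self._by_key[key] = (papyrus_id, text)
        return {"action": "wrote", "papyrus_id": papyrus_id}
```

Keying on the pair rather than on the text means two parallel writers that phrase the same decision differently still collapse to one entry, provided they agree on the canonical name.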
diff --git a/.github/agents/pharaoh.deployment-diagram-draft.agent.md b/.github/agents/pharaoh.deployment-diagram-draft.agent.md index 8e99724..b35581c 100644 --- a/.github/agents/pharaoh.deployment-diagram-draft.agent.md +++ b/.github/agents/pharaoh.deployment-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). +description: Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). Typical ASPICE usage — SYS.3 System Architectural Design; essential for automotive HW/SW allocation per ISO 26262 Part 5 (HW) and Part 6 (SW). Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.deployment-diagram-draft -Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). +Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). Typical ASPICE usage — SYS.3 System Architectural Design; essential for automotive HW/SW allocation per ISO 26262 Part 5 (HW) and Part 6 (SW). Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-deployment-diagram-draft/SKILL.md`](../../skills/pharaoh-deployment-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.diagram-lint.agent.md b/.github/agents/pharaoh.diagram-lint.agent.md index 4aa99bd..45014e4 100644 --- a/.github/agents/pharaoh.diagram-lint.agent.md +++ b/.github/agents/pharaoh.diagram-lint.agent.md @@ -1,5 +1,5 @@ --- -description: Walk a directory of RST files and check every `.. mermaid::` / `.. uml::` block against the real renderer parser (mmdc, plantuml). Catches silent parse failures that sphinx-build misses. +description: Use when running a terminal validation step over a directory of RST files to catch Mermaid / PlantUML parse failures that sphinx-build cannot detect. Extracts every `.. mermaid::` and `.. uml::` block and pipes it to the real renderer parser (mmdc / plantuml -checkonly). Returns structured findings. Does NOT modify the RST files. handoffs: - label: Aggregate into quality gate agent: pharaoh.quality-gate @@ -8,6 +8,6 @@ handoffs: # @pharaoh.diagram-lint -Walk a directory of RST files, extract every Mermaid / PlantUML block, and parse each block with the real renderer CLI (`mmdc -i tmp.mmd -o /dev/null`, `plantuml -checkonly`). Emits structured findings. Read-only — does not modify RST. When a renderer CLI is unavailable, degrades gracefully with a warning and install command. +Use when running a terminal validation step over a directory of RST files to catch Mermaid / PlantUML parse failures that sphinx-build cannot detect. Extracts every `.. mermaid::` and `.. uml::` block and pipes it to the real renderer parser (mmdc / plantuml -checkonly). Returns structured findings. Does NOT modify the RST files. See [`skills/pharaoh-diagram-lint/SKILL.md`](../../skills/pharaoh-diagram-lint/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
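Reviewer note on the extraction step of `pharaoh-diagram-lint` above: a rough sketch of how `.. mermaid::` and `.. uml::` blocks might be pulled out of an RST file before being handed to the renderer CLIs the description names (`mmdc`, `plantuml -checkonly`). The function name and the returned tuple shape are hypothetical, and the sketch assumes inline diagram bodies only; directives that take a file argument are skipped. It does not invoke any renderer.

```python
import re
import textwrap
from typing import List, Tuple

# Matches a directive line with no argument, e.g. "   .. mermaid::".
DIRECTIVE = re.compile(r"^(\s*)\.\. (mermaid|uml)::\s*$")


def extract_diagram_blocks(rst_text: str) -> List[Tuple[str, int, str]]:
    """Return (kind, directive_line_number, body) for each diagram block."""
    lines = rst_text.splitlines()
    blocks: List[Tuple[str, int, str]] = []
    i = 0
    while i < len(lines):
        match = DIRECTIVE.match(lines[i])
        if not match:
            i += 1
            continue
        base_indent = len(match.group(1))
        kind = match.group(2)
        directive_line = i + 1  # 1-based, for use in findings
        body: List[str] = []
        i += 1
        # The block continues while lines are blank or indented deeper
        # than the directive itself.
        while i < len(lines):
            line = lines[i]
            if line.strip() and len(line) - len(line.lstrip()) <= base_indent:
                break
            body.append(line)
            i += 1
        # Drop directive options such as ":caption: ..." at the top.
        while body and body[0].strip().startswith(":"):
            body.pop(0)
        content = textwrap.dedent("\n".join(body)).strip("\n")
        if content:
            blocks.append((kind, directive_line, content))
    return blocks
```

Each extracted body could then be written to a temporary file and checked by the matching renderer; keeping the directive line number lets a finding point back at the RST source.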
diff --git a/.github/agents/pharaoh.diagram-review.agent.md b/.github/agents/pharaoh.diagram-review.agent.md index 5b7d396..deb7695 100644 --- a/.github/agents/pharaoh.diagram-review.agent.md +++ b/.github/agents/pharaoh.diagram-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. +description: Use when auditing a single diagram block (Mermaid or PlantUML) emitted by any diagram-emitting skill. Single review atom covering all diagram types — trace/caption/element-count/parser/required-elements checks plus LLM-judge axes for purpose clarity and granularity consistency. Per-type required-element checks dispatched based on `diagram_type` input. handoffs: [] --- # @pharaoh.diagram-review -Audit a single diagram block (Mermaid or PlantUML) against generic + per-type axes. +Use when auditing a single diagram block (Mermaid or PlantUML) emitted by any diagram-emitting skill. Single review atom covering all diagram types — trace/caption/element-count/parser/required-elements checks plus LLM-judge axes for purpose clarity and granularity consistency. Per-type required-element checks dispatched based on `diagram_type` input. -See [`skills/pharaoh-diagram-review/SKILL.md`](../../skills/pharaoh-diagram-review/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-diagram-review/SKILL.md`](../../skills/pharaoh-diagram-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.dispatch-signal-check.agent.md b/.github/agents/pharaoh.dispatch-signal-check.agent.md index 52d8c1e..60f80e4 100644 --- a/.github/agents/pharaoh.dispatch-signal-check.agent.md +++ b/.github/agents/pharaoh.dispatch-signal-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. 
+description: Use when verifying that a plan's declared `execution_mode` matches observed subagent artefacts in `runs/`. Detects the "LLM-executor collapsed subagents into inline" failure class observed during dogfooding. One mechanical structural check. handoffs: [] --- # @pharaoh.dispatch-signal-check -Verify declared execution_mode in plan.yaml matches observed artefacts in runs/. +Use when verifying that a plan's declared `execution_mode` matches observed subagent artefacts in `runs/`. Detects the "LLM-executor collapsed subagents into inline" failure class observed during dogfooding. One mechanical structural check. -See [`skills/pharaoh-dispatch-signal-check/SKILL.md`](../../skills/pharaoh-dispatch-signal-check/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-dispatch-signal-check/SKILL.md`](../../skills/pharaoh-dispatch-signal-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.execute-plan.agent.md b/.github/agents/pharaoh.execute-plan.agent.md index 357b32d..2324022 100644 --- a/.github/agents/pharaoh.execute-plan.agent.md +++ b/.github/agents/pharaoh.execute-plan.agent.md @@ -1,10 +1,10 @@ --- -description: Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. The plan is the orchestrator, this skill is the engine. +description: Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. Generic — the plan is the orchestrator, this skill is the engine. 
handoffs: [] --- # @pharaoh.execute-plan -Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. The plan is the orchestrator, this skill is the engine. +Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. Generic — the plan is the orchestrator, this skill is the engine. See [`skills/pharaoh-execute-plan/SKILL.md`](../../skills/pharaoh-execute-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md index e7ad53d..044a7ad 100644 --- a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md +++ b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). +description: Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). Typical ISO 26262 usage — Part 3 Hazard Analysis & Risk Assessment, and Part 5 supporting hardware architectural metrics. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). 
handoffs: [] --- # @pharaoh.fault-tree-diagram-draft -Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). +Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). Typical ISO 26262 usage — Part 3 Hazard Analysis & Risk Assessment, and Part 5 supporting hardware architectural metrics. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-fault-tree-diagram-draft/SKILL.md`](../../skills/pharaoh-fault-tree-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-balance.agent.md b/.github/agents/pharaoh.feat-balance.agent.md index 3829768..fd28551 100644 --- a/.github/agents/pharaoh.feat-balance.agent.md +++ b/.github/agents/pharaoh.feat-balance.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). +description: Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). Reports health and suggestions; does not mutate. 
handoffs: [] --- # @pharaoh.feat-balance -Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). +Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). Reports health and suggestions; does not mutate. See [`skills/pharaoh-feat-balance/SKILL.md`](../../skills/pharaoh-feat-balance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-component-extract.agent.md b/.github/agents/pharaoh.feat-component-extract.agent.md index c29c37e..b3db215 100644 --- a/.github/agents/pharaoh.feat-component-extract.agent.md +++ b/.github/agents/pharaoh.feat-component-extract.agent.md @@ -1,10 +1,10 @@ --- -description: Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. +description: Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. Walks import edges between the listed files and emits a Mermaid or PlantUML diagram whose output shape is compatible with pharaoh-component-diagram-draft. Does NOT hand-author nodes or edges; extraction is rule-based. handoffs: [] --- # @pharaoh.feat-component-extract -Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. 
+Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. Walks import edges between the listed files and emits a Mermaid or PlantUML diagram whose output shape is compatible with pharaoh-component-diagram-draft. Does NOT hand-author nodes or edges; extraction is rule-based. See [`skills/pharaoh-feat-component-extract/SKILL.md`](../../skills/pharaoh-feat-component-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-draft-from-docs.agent.md b/.github/agents/pharaoh.feat-draft-from-docs.agent.md index 63216a1..d2c4477 100644 --- a/.github/agents/pharaoh.feat-draft-from-docs.agent.md +++ b/.github/agents/pharaoh.feat-draft-from-docs.agent.md @@ -1,10 +1,10 @@ --- -description: Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. +description: Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. Does NOT read source code. Does NOT emit component requirements. Does NOT map features to files — that is `pharaoh-feat-file-map`. handoffs: [] --- # @pharaoh.feat-draft-from-docs -Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. 
+Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. Does NOT read source code. Does NOT emit component requirements. Does NOT map features to files — that is `pharaoh-feat-file-map`. See [`skills/pharaoh-feat-draft-from-docs/SKILL.md`](../../skills/pharaoh-feat-draft-from-docs/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-flow-extract.agent.md b/.github/agents/pharaoh.feat-flow-extract.agent.md index 444be70..5d9cc2d 100644 --- a/.github/agents/pharaoh.feat-flow-extract.agent.md +++ b/.github/agents/pharaoh.feat-flow-extract.agent.md @@ -1,10 +1,10 @@ --- -description: Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. +description: Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. Walks the call graph up to a bounded depth and emits a Mermaid or PlantUML sequence diagram whose output shape matches pharaoh-sequence-diagram-draft. Complements pharaoh-feat-component-extract (static view); this is the dynamic view. handoffs: [] --- # @pharaoh.feat-flow-extract -Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. +Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. Walks the call graph up to a bounded depth and emits a Mermaid or PlantUML sequence diagram whose output shape matches pharaoh-sequence-diagram-draft. 
Complements pharaoh-feat-component-extract (static view); this is the dynamic view. See [`skills/pharaoh-feat-flow-extract/SKILL.md`](../../skills/pharaoh-feat-flow-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.feat-review.agent.md b/.github/agents/pharaoh.feat-review.agent.md index af64186..73e9d33 100644 --- a/.github/agents/pharaoh.feat-review.agent.md +++ b/.github/agents/pharaoh.feat-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. +description: Use when auditing a single feature-level need (feat) against the generic feat review axes in `shared/checklists/feat.md` plus any per-project addenda in `.pharaoh/project/checklists/feat.md`. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes. Mirrors `pharaoh-req-review`'s shape for feat-level artefacts. handoffs: [] --- # @pharaoh.feat-review -Audit a single feature-level need against the generic feat review axes plus any project-specific addenda. +Use when auditing a single feature-level need (feat) against the generic feat review axes in `shared/checklists/feat.md` plus any per-project addenda in `.pharaoh/project/checklists/feat.md`. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes. Mirrors `pharaoh-req-review`'s shape for feat-level artefacts. -See [`skills/pharaoh-feat-review/SKILL.md`](../../skills/pharaoh-feat-review/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-feat-review/SKILL.md`](../../skills/pharaoh-feat-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.finding-record.agent.md b/.github/agents/pharaoh.finding-record.agent.md index 4da1519..35e2162 100644 --- a/.github/agents/pharaoh.finding-record.agent.md +++ b/.github/agents/pharaoh.finding-record.agent.md @@ -1,10 +1,10 @@ --- -description: Record an audit finding in the shared Papyrus workspace with deterministic dedup across concurrent subagents. +description: Use when recording an audit finding in the shared Papyrus workspace with automatic dedup. Uses deterministic ID to ensure the same {category, subject_id} tuple never appears twice across concurrent subagents. Returns {action: wrote|duplicate, papyrus_id}. handoffs: [] --- # @pharaoh.finding-record -Record an audit finding in the shared Papyrus workspace with deterministic dedup across concurrent subagents. +Use when recording an audit finding in the shared Papyrus workspace with automatic dedup. Uses deterministic ID to ensure the same {category, subject_id} tuple never appears twice across concurrent subagents. Returns {action: wrote|duplicate, papyrus_id}. See [`skills/pharaoh-finding-record/SKILL.md`](../../skills/pharaoh-finding-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.flow.agent.md b/.github/agents/pharaoh.flow.agent.md index f670320..ac61e8d 100644 --- a/.github/agents/pharaoh.flow.agent.md +++ b/.github/agents/pharaoh.flow.agent.md @@ -1,10 +1,10 @@ --- -description: Orchestrate the full V-model chain — requirement, architecture, verification plan, FMEA — with review passes. +description: Use when orchestrating the full V-model chain for one feature context — requirement → architecture element → verification plan → FMEA, each with a review pass. Invokes pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, pharaoh-fmea in sequence. 
handoffs: [] --- # @pharaoh.flow -Orchestrate the full V-model chain — requirement, architecture, verification plan, FMEA — with review passes. +Use when orchestrating the full V-model chain for one feature context — requirement → architecture element → verification plan → FMEA, each with a review pass. Invokes pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, pharaoh-fmea in sequence. See [`skills/pharaoh-flow/SKILL.md`](../../skills/pharaoh-flow/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fmea-review.agent.md b/.github/agents/pharaoh.fmea-review.agent.md index 6768550..9f8c623 100644 --- a/.github/agents/pharaoh.fmea-review.agent.md +++ b/.github/agents/pharaoh.fmea-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. +description: Use when auditing a single FMEA entry (failure-mode row) against the generic FMEA review axes in `shared/checklists/fmea.md` plus per-project addenda. Checks severity/occurrence/detection scales, RPN computation, cause/effect well-formedness, traceability to the analyzed artefact. Emits structured findings JSON. handoffs: [] --- # @pharaoh.fmea-review -Audit a single FMEA entry against severity/occurrence/detection scales, RPN correctness, and cause/effect well-formedness. +Use when auditing a single FMEA entry (failure-mode row) against the generic FMEA review axes in `shared/checklists/fmea.md` plus per-project addenda. Checks severity/occurrence/detection scales, RPN computation, cause/effect well-formedness, traceability to the analyzed artefact. Emits structured findings JSON. -See [`skills/pharaoh-fmea-review/SKILL.md`](../../skills/pharaoh-fmea-review/SKILL.md) for the full atomic specification. 
+See [`skills/pharaoh-fmea-review/SKILL.md`](../../skills/pharaoh-fmea-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.fmea.agent.md b/.github/agents/pharaoh.fmea.agent.md index f990fa0..9b3f4b4 100644 --- a/.github/agents/pharaoh.fmea.agent.md +++ b/.github/agents/pharaoh.fmea.agent.md @@ -1,10 +1,10 @@ --- -description: Derive a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. +description: Use when deriving a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. Emits structured JSON with cause, effect, severity (1-10), occurrence (1-10), detection (1-10), and RPN. handoffs: [] --- # @pharaoh.fmea -Derive a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. +Use when deriving a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. Emits structured JSON with cause, effect, severity (1-10), occurrence (1-10), detection (1-10), and RPN. See [`skills/pharaoh-fmea/SKILL.md`](../../skills/pharaoh-fmea/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.gate-advisor.agent.md b/.github/agents/pharaoh.gate-advisor.agent.md index 4c9c63b..0991f5c 100644 --- a/.github/agents/pharaoh.gate-advisor.agent.md +++ b/.github/agents/pharaoh.gate-advisor.agent.md @@ -1,10 +1,10 @@ --- -description: Read a project's `pharaoh.toml` and report which phased-enablement ladder step is the recommended next gate to switch on. Advisory, read-only — walks the fixed 5-step ladder in order (`require_verification` → `require_change_analysis` → `require_mece_on_release` → `codelinks.enabled` → `strictness = "enforcing"`) and names the first unmet step plus its blocker. 
+description: Use when reading a project's `pharaoh.toml` to report which phased-enablement ladder step is the recommended next gate to switch on. Single mechanical advisory check — parses five flags (`strictness`, `require_verification`, `require_change_analysis`, `require_mece_on_release`, `codelinks.enabled`), walks the fixed ladder in order, and emits the first unmet step plus its blocker note. Read-only; never edits `pharaoh.toml`. handoffs: [] --- # @pharaoh.gate-advisor -Read the project's `pharaoh.toml`, parse the five ladder flags, and emit a findings JSON naming the next recommended gate to enable, the blocker that must be cleared first, and the full fixed ladder. Read-only; never edits `pharaoh.toml`. The ladder rationale lives in [`skills/shared/gate-enablement.md`](../../skills/shared/gate-enablement.md) — this atom is the tool that walks it, not the authority that defines it. +Use when reading a project's `pharaoh.toml` to report which phased-enablement ladder step is the recommended next gate to switch on. Single mechanical advisory check — parses five flags (`strictness`, `require_verification`, `require_change_analysis`, `require_mece_on_release`, `codelinks.enabled`), walks the fixed ladder in order, and emits the first unmet step plus its blocker note. Read-only; never edits `pharaoh.toml`. -See [`skills/pharaoh-gate-advisor/SKILL.md`](../../skills/pharaoh-gate-advisor/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, ladder table, rationale map, tailoring extension point, and composition patterns. +See [`skills/pharaoh-gate-advisor/SKILL.md`](../../skills/pharaoh-gate-advisor/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
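
Reviewer note on `pharaoh.gate-advisor`: the ladder walk described above can be sketched roughly as follows. This is a minimal illustration, not the skill's implementation. The step order is taken from the previous description of this agent; the flat dotted-key dictionary, the `LADDER` constant, and the `next_gate` function are hypothetical names chosen for the example, and the real `pharaoh.toml` table layout and blocker notes are not shown in this patch.

```python
from typing import Optional

# Fixed ladder order, as listed in the earlier description of this agent.
# Each entry is (flag name, value the flag must have for the step to count as met).
# ASSUMPTION: flags are given as a flat dict with dotted keys; the actual
# pharaoh.toml table structure is not visible in this patch.
LADDER = [
    ("require_verification", True),
    ("require_change_analysis", True),
    ("require_mece_on_release", True),
    ("codelinks.enabled", True),
    ("strictness", "enforcing"),
]


def next_gate(flags: dict) -> Optional[str]:
    """Return the first ladder step that is not yet met, or None if all are met."""
    for key, target in LADDER:
        # A missing flag is treated as unmet (hypothetical choice for this sketch).
        if flags.get(key) != target:
            return key
    return None


if __name__ == "__main__":
    current = {
        "require_verification": True,
        "require_change_analysis": False,
        "strictness": "advisory",
    }
    print(next_gate(current))  # prints: require_change_analysis
```

The function only reads its input, which matches the read-only, advisory contract stated in the description; attaching a blocker note to each step would need the ladder table from `SKILL.md`.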
diff --git a/.github/agents/pharaoh.id-allocate.agent.md b/.github/agents/pharaoh.id-allocate.agent.md index c2d23d7..bb80390 100644 --- a/.github/agents/pharaoh.id-allocate.agent.md +++ b/.github/agents/pharaoh.id-allocate.agent.md @@ -1,10 +1,10 @@ --- -description: Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. +description: Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. Each subagent receives its pre-allocated pool and emits only from that pool, so parallel agents cannot collide on stem choice. Does NOT invoke emitters, does NOT write RST. handoffs: [] --- # @pharaoh.id-allocate -Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. +Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. Each subagent receives its pre-allocated pool and emits only from that pool, so parallel agents cannot collide on stem choice. Does NOT invoke emitters, does NOT write RST. See [`skills/pharaoh-id-allocate/SKILL.md`](../../skills/pharaoh-id-allocate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.id-convention-check.agent.md b/.github/agents/pharaoh.id-convention-check.agent.md index d2af8bf..f8e2ff2 100644 --- a/.github/agents/pharaoh.id-convention-check.agent.md +++ b/.github/agents/pharaoh.id-convention-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in .pharaoh/project/id-conventions.yaml. 
Emits a list of violations. +description: Use when verifying that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Single mechanical structural check — applies the tailored per-type regex, emits a list of violations. Does NOT auto-detect how many schemes coexist — scheme policy is the tailoring author's responsibility (declare an alternation to allow multiple forms). handoffs: [] --- # @pharaoh.id-convention-check -Verify that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Emits a list of violations. +Use when verifying that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Single mechanical structural check — applies the tailored per-type regex, emits a list of violations. Does NOT auto-detect how many schemes coexist — scheme policy is the tailoring author's responsibility (declare an alternation to allow multiple forms). -See [`skills/pharaoh-id-convention-check/SKILL.md`](../../skills/pharaoh-id-convention-check/SKILL.md) for the full atomic specification — inputs, outputs, detection rule, and composition patterns. +See [`skills/pharaoh-id-convention-check/SKILL.md`](../../skills/pharaoh-id-convention-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.lifecycle-check.agent.md b/.github/agents/pharaoh.lifecycle-check.agent.md index 9122660..b7474ef 100644 --- a/.github/agents/pharaoh.lifecycle-check.agent.md +++ b/.github/agents/pharaoh.lifecycle-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify a sphinx-needs artefact's lifecycle state and the legality of a requested state transition. 
+description: Use when verifying a sphinx-needs artefact's current lifecycle state and the legality of a requested state transition per the project's workflows.yaml state machine. handoffs: [] --- # @pharaoh.lifecycle-check -Verify a sphinx-needs artefact's lifecycle state and the legality of a requested state transition. +Use when verifying a sphinx-needs artefact's current lifecycle state and the legality of a requested state transition per the project's workflows.yaml state machine. See [`skills/pharaoh-lifecycle-check/SKILL.md`](../../skills/pharaoh-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.link-completeness-check.agent.md b/.github/agents/pharaoh.link-completeness-check.agent.md index e067e80..6d91804 100644 --- a/.github/agents/pharaoh.link-completeness-check.agent.md +++ b/.github/agents/pharaoh.link-completeness-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in artefact-catalog.yaml — missing required links, unresolved target ids, per-type policy enforcement. +description: Use when verifying outgoing-link coverage across a full needs.json graph. For each declared link type in `artefact-catalog.yaml`, confirms every need of the governed type carries a non-empty value AND every target id resolves to an existing need. Closes the "catalogue declares `verifies` required but half the reqs ship without it" failure class. handoffs: [] --- # @pharaoh.link-completeness-check -Verify outgoing-link coverage across a full needs.json graph against required/optional link types declared per artefact type in `artefact-catalog.yaml` — missing required links, unresolved target ids, per-type policy enforcement. +Use when verifying outgoing-link coverage across a full needs.json graph. 
For each declared link type in `artefact-catalog.yaml`, confirms every need of the governed type carries a non-empty value AND every target id resolves to an existing need. Closes the "catalogue declares `verifies` required but half the reqs ship without it" failure class. -See [`skills/pharaoh-link-completeness-check/SKILL.md`](../../skills/pharaoh-link-completeness-check/SKILL.md) for the full atomic specification — inputs, outputs, per-pass detection rules, and composition patterns. +See [`skills/pharaoh-link-completeness-check/SKILL.md`](../../skills/pharaoh-link-completeness-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.mece.agent.md b/.github/agents/pharaoh.mece.agent.md index 1c5e41c..ec4d116 100644 --- a/.github/agents/pharaoh.mece.agent.md +++ b/.github/agents/pharaoh.mece.agent.md @@ -1,5 +1,5 @@ --- -description: Check for gaps, redundancies, and inconsistencies in sphinx-needs requirements. Validates traceability completeness. +description: Use when checking for gaps, redundancies, and inconsistencies in sphinx-needs requirements, or validating traceability completeness handoffs: - label: Trace a Need agent: pharaoh.trace diff --git a/.github/agents/pharaoh.output-validate.agent.md b/.github/agents/pharaoh.output-validate.agent.md index 331ad1f..bfa1914 100644 --- a/.github/agents/pharaoh.output-validate.agent.md +++ b/.github/agents/pharaoh.output-validate.agent.md @@ -1,10 +1,10 @@ --- -description: Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). +description: Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). 
Returns {valid, errors, parsed, recovery}. Callers gate subagent output through this before writing anything to disk. handoffs: [] --- # @pharaoh.output-validate -Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). +Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). Returns {valid, errors, parsed, recovery}. Callers gate subagent output through this before writing anything to disk. See [`skills/pharaoh-output-validate/SKILL.md`](../../skills/pharaoh-output-validate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md index 4e02bf7..712d563 100644 --- a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md +++ b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify that a Papyrus workspace received at least N writes during a plan run. +description: Use when verifying that a Papyrus workspace actually received writes during a plan run. Single mechanical check — counts directives across `.papyrus/memory/*.rst` and returns pass/fail against a configured minimum. Wired into `pharaoh-quality-gate` to detect the "LLM-executor skipped the atomic Papyrus writes" failure class observed in prior dogfooding. handoffs: [] --- # @pharaoh.papyrus-non-empty-check -Verify that a Papyrus workspace received at least N writes during a plan run. +Use when verifying that a Papyrus workspace actually received writes during a plan run. 
Single mechanical check — counts directives across `.papyrus/memory/*.rst` and returns pass/fail against a configured minimum. Wired into `pharaoh-quality-gate` to detect the "LLM-executor skipped the atomic Papyrus writes" failure class observed in prior dogfooding. See [`skills/pharaoh-papyrus-non-empty-check/SKILL.md`](../../skills/pharaoh-papyrus-non-empty-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.plan.agent.md b/.github/agents/pharaoh.plan.agent.md index 8ec0bb7..db2b52f 100644 --- a/.github/agents/pharaoh.plan.agent.md +++ b/.github/agents/pharaoh.plan.agent.md @@ -1,5 +1,5 @@ --- -description: Break requirement changes into structured implementation tasks with workflow enforcement and dependency ordering. +description: Use when breaking requirement changes into structured implementation tasks with workflow enforcement and dependency ordering handoffs: - label: Start Change Analysis agent: pharaoh.change diff --git a/.github/agents/pharaoh.process-audit.agent.md b/.github/agents/pharaoh.process-audit.agent.md index 69e01de..3911cc0 100644 --- a/.github/agents/pharaoh.process-audit.agent.md +++ b/.github/agents/pharaoh.process-audit.agent.md @@ -1,10 +1,10 @@ --- -description: Run a full-corpus audit across all gap categories plus cross-artefact consistency checks. +description: Use when running a full-corpus audit against a sphinx-needs project. Orchestrates pharaoh-coverage-gap across all gap categories plus cross-artefact consistency checks. Emits a prioritised gap report. handoffs: [] --- # @pharaoh.process-audit -Run a full-corpus audit across all gap categories plus cross-artefact consistency checks. +Use when running a full-corpus audit against a sphinx-needs project. Orchestrates pharaoh-coverage-gap across all gap categories plus cross-artefact consistency checks. Emits a prioritised gap report. 
See [`skills/pharaoh-process-audit/SKILL.md`](../../skills/pharaoh-process-audit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.prose-migrate.agent.md b/.github/agents/pharaoh.prose-migrate.agent.md index db1ab43..de450d1 100644 --- a/.github/agents/pharaoh.prose-migrate.agent.md +++ b/.github/agents/pharaoh.prose-migrate.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. +description: Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. Produces a sentence-by-sentence migration proposal — keep-as-user-guide, merge-into-feat-body, discard. Does NOT overwrite anything; the caller applies the proposal manually. handoffs: [] --- # @pharaoh.prose-migrate -Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. +Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. Produces a sentence-by-sentence migration proposal — keep-as-user-guide, merge-into-feat-body, discard. Does NOT overwrite anything; the caller applies the proposal manually. See [`skills/pharaoh-prose-migrate/SKILL.md`](../../skills/pharaoh-prose-migrate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.quality-gate.agent.md b/.github/agents/pharaoh.quality-gate.agent.md index 95a52fc..d500423 100644 --- a/.github/agents/pharaoh.quality-gate.agent.md +++ b/.github/agents/pharaoh.quality-gate.agent.md @@ -1,10 +1,10 @@ --- -description: Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). +description: Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). Consumes an aggregated review+mece+coverage summary plus a gate spec; returns pass/fail with named breaches. Never produces summaries itself — thin gate layer over upstream atomic checkers. handoffs: [] --- # @pharaoh.quality-gate -Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). +Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). Consumes an aggregated review+mece+coverage summary plus a gate spec; returns pass/fail with named breaches. Never produces summaries itself — thin gate layer over upstream atomic checkers. See [`skills/pharaoh-quality-gate/SKILL.md`](../../skills/pharaoh-quality-gate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.release.agent.md b/.github/agents/pharaoh.release.agent.md index 03d131c..089e004 100644 --- a/.github/agents/pharaoh.release.agent.md +++ b/.github/agents/pharaoh.release.agent.md @@ -1,5 +1,5 @@ --- -description: Prepare a release by generating changelogs from requirements, release summaries, and traceability coverage metrics. 
+description: Use when preparing a release, generating changelogs from requirements, or summarizing requirement changes for version management handoffs: - label: MECE Check agent: pharaoh.mece diff --git a/.github/agents/pharaoh.reproducibility-check.agent.md b/.github/agents/pharaoh.reproducibility-check.agent.md index ee77bd2..a42da34 100644 --- a/.github/agents/pharaoh.reproducibility-check.agent.md +++ b/.github/agents/pharaoh.reproducibility-check.agent.md @@ -1,10 +1,10 @@ --- -description: Diff two output directories produced by two runs of the same plan to confirm the build is reproducible. Consumes baseline dir, rerun dir, and optional mask rules for non-deterministic fields (timestamps, random ids); emits drifted-file list with per-file changed-field summaries. Does NOT run the plan — that is the caller's responsibility (`pharaoh-execute-plan`). +description: Use when diffing two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running is the caller's responsibility (`pharaoh-execute-plan`). handoffs: [] --- # @pharaoh.reproducibility-check -Diff two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running twice is the caller's responsibility (`pharaoh-execute-plan`). +Use when diffing two output directories produced by running the same plan twice to confirm the build is reproducible. 
Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running is the caller's responsibility (`pharaoh-execute-plan`). -See [`skills/pharaoh-reproducibility-check/SKILL.md`](../../skills/pharaoh-reproducibility-check/SKILL.md) for the full atomic specification — inputs, outputs, per-step process, failure modes, and composition patterns. +See [`skills/pharaoh-reproducibility-check/SKILL.md`](../../skills/pharaoh-reproducibility-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-code-grounding-check.agent.md b/.github/agents/pharaoh.req-code-grounding-check.agent.md index f188291..c8d4075 100644 --- a/.github/agents/pharaoh.req-code-grounding-check.agent.md +++ b/.github/agents/pharaoh.req-code-grounding-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify a drafted requirement's claims against the source file it cites via :source_doc: — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. +description: Use when verifying a single drafted requirement against the source file it cites via `:source_doc:`. Single mechanical fidelity check — compares the CREQ's claims about exceptions, triggers, types, structural symbols, backtick-quoted identifiers, grounding density, adjectives, quantifiers, and branch count against the cited source, returning per-axis findings JSON. Complements `pharaoh-req-review` (which grades prose quality) with code-grounded axes. 
handoffs: [] --- # @pharaoh.req-code-grounding-check -Verify a drafted requirement's claims against the source file it cites via `:source_doc:` — exception raise sites, trigger conditions, type-framework imports, named symbols, weasel adjectives, quantifier enumeration, branch count. +Use when verifying a single drafted requirement against the source file it cites via `:source_doc:`. Single mechanical fidelity check — compares the CREQ's claims about exceptions, triggers, types, structural symbols, backtick-quoted identifiers, grounding density, adjectives, quantifiers, and branch count against the cited source, returning per-axis findings JSON. Complements `pharaoh-req-review` (which grades prose quality) with code-grounded axes. -See [`skills/pharaoh-req-code-grounding-check/SKILL.md`](../../skills/pharaoh-req-code-grounding-check/SKILL.md) for the full atomic specification — inputs, outputs, per-axis detection rules, and composition patterns. +See [`skills/pharaoh-req-code-grounding-check/SKILL.md`](../../skills/pharaoh-req-code-grounding-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-codelink-annotate.agent.md b/.github/agents/pharaoh.req-codelink-annotate.agent.md index 004eb8c..c4f79b1 100644 --- a/.github/agents/pharaoh.req-codelink-annotate.agent.md +++ b/.github/agents/pharaoh.req-codelink-annotate.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. +description: Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. 
Two modes — `codelinks` (sphinx-codelinks-compatible multi-field `@ title, id, type, [links]` form; the comment IS the need) and `backref` (minimal `@req ID: title` pointer back to an RST-hosted need). Mode is tailored via `ubproject.toml` / `pharaoh.toml`, not hardcoded. handoffs: [] --- # @pharaoh.req-codelink-annotate -Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. +Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. Two modes — `codelinks` (sphinx-codelinks-compatible multi-field `@ title, id, type, [links]` form; the comment IS the need) and `backref` (minimal `@req ID: title` pointer back to an RST-hosted need). Mode is tailored via `ubproject.toml` / `pharaoh.toml`, not hardcoded. See [`skills/pharaoh-req-codelink-annotate/SKILL.md`](../../skills/pharaoh-req-codelink-annotate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-draft.agent.md b/.github/agents/pharaoh.req-draft.agent.md index 0c56ffd..1e34763 100644 --- a/.github/agents/pharaoh.req-draft.agent.md +++ b/.github/agents/pharaoh.req-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Draft a single sphinx-needs requirement from a feature description. +description: Use when drafting a single sphinx-needs requirement from a feature description. Produces a new RST directive block with ID, status=draft, and a single shall-clause body, linking to a parent requirement or workflow per the project's artefact-catalog. handoffs: [] --- # @pharaoh.req-draft -Draft a single sphinx-needs requirement from a feature description. +Use when drafting a single sphinx-needs requirement from a feature description. 
Produces a new RST directive block with ID, status=draft, and a single shall-clause body, linking to a parent requirement or workflow per the project's artefact-catalog. See [`skills/pharaoh-req-draft/SKILL.md`](../../skills/pharaoh-req-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-from-code.agent.md b/.github/agents/pharaoh.req-from-code.agent.md index 848345a..9e7bb93 100644 --- a/.github/agents/pharaoh.req-from-code.agent.md +++ b/.github/agents/pharaoh.req-from-code.agent.md @@ -1,10 +1,10 @@ --- -description: Read one source file and emit comp_req directives describing its observable behavior, coordinating canonical names via Papyrus. +description: Use when reading one source file and emitting one or more requirement RST directives (typed by `target_level`) describing the observable behavior in that file. Queries shared Papyrus for canonical terms before naming concepts; writes newly surfaced concepts back. Does not draft architecture, plans, or FMEA. handoffs: [] --- # @pharaoh.req-from-code -Read one source file and emit comp_req directives describing its observable behavior, coordinating canonical names via Papyrus. +Use when reading one source file and emitting one or more requirement RST directives (typed by `target_level`) describing the observable behavior in that file. Queries shared Papyrus for canonical terms before naming concepts; writes newly surfaced concepts back. Does not draft architecture, plans, or FMEA. See [`skills/pharaoh-req-from-code/SKILL.md`](../../skills/pharaoh-req-from-code/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.req-regenerate.agent.md b/.github/agents/pharaoh.req-regenerate.agent.md index e0d48e9..28f7407 100644 --- a/.github/agents/pharaoh.req-regenerate.agent.md +++ b/.github/agents/pharaoh.req-regenerate.agent.md @@ -1,10 +1,10 @@ --- -description: Regenerate a single sphinx-needs requirement to address findings from a prior review. +description: Use when regenerating a single sphinx-needs requirement to address findings from pharaoh-req-review. Consumes the original RST + findings JSON, emits a revised RST directive that passes the flagged axes. handoffs: [] --- # @pharaoh.req-regenerate -Regenerate a single sphinx-needs requirement to address findings from a prior review. +Use when regenerating a single sphinx-needs requirement to address findings from pharaoh-req-review. Consumes the original RST + findings JSON, emits a revised RST directive that passes the flagged axes. See [`skills/pharaoh-req-regenerate/SKILL.md`](../../skills/pharaoh-req-regenerate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.req-review.agent.md b/.github/agents/pharaoh.req-review.agent.md index f33ff28..dabf554 100644 --- a/.github/agents/pharaoh.req-review.agent.md +++ b/.github/agents/pharaoh.req-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single sphinx-needs requirement against the ISO 26262 Part 8 §6 axes. +description: Use when auditing a single sphinx-needs requirement against the 11 ISO 26262 Part 8 §6 axes. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes, with action items for any failure. handoffs: [] --- # @pharaoh.req-review -Audit a single sphinx-needs requirement against the ISO 26262 Part 8 §6 axes. +Use when auditing a single sphinx-needs requirement against the 11 ISO 26262 Part 8 §6 axes. 
Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes, with action items for any failure. See [`skills/pharaoh-req-review/SKILL.md`](../../skills/pharaoh-req-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.review-completeness.agent.md b/.github/agents/pharaoh.review-completeness.agent.md index 2873634..df8be0e 100644 --- a/.github/agents/pharaoh.review-completeness.agent.md +++ b/.github/agents/pharaoh.review-completeness.agent.md @@ -1,10 +1,10 @@ --- -description: Inspect needs for review / approval-chain completeness; flag missing reviewer / approved_by fields. +description: Use when inspecting one or more needs for review / approval-chain completeness. Flags needs missing required :reviewer: or :approved_by: fields per the project's artefact catalog. Emits one finding per incomplete need via pharaoh-finding-record. handoffs: [] --- # @pharaoh.review-completeness -Inspect needs for review / approval-chain completeness; flag missing reviewer / approved_by fields. +Use when inspecting one or more needs for review / approval-chain completeness. Flags needs missing required :reviewer: or :approved_by: fields per the project's artefact catalog. Emits one finding per incomplete need via pharaoh-finding-record. See [`skills/pharaoh-review-completeness/SKILL.md`](../../skills/pharaoh-review-completeness/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.self-review-coverage-check.agent.md b/.github/agents/pharaoh.self-review-coverage-check.agent.md index 92beffa..e93ca48 100644 --- a/.github/agents/pharaoh.self-review-coverage-check.agent.md +++ b/.github/agents/pharaoh.self-review-coverage-check.agent.md @@ -1,10 +1,10 @@ --- -description: Verify every drafted artefact in runs/ has a matching review JSON. 
+description: Use when verifying that every artefact emitted during a plan run received a matching review. For every drafted artefact in `runs/`, confirms a matching `<id>_review.json` exists and is non-empty. Closes the "draft emitted but review was skipped" failure class. handoffs: [] --- # @pharaoh.self-review-coverage-check -Verify every drafted artefact in runs/ has a matching review JSON. +Use when verifying that every artefact emitted during a plan run received a matching review. For every drafted artefact in `runs/`, confirms a matching `<id>_review.json` exists and is non-empty. Closes the "draft emitted but review was skipped" failure class. -See [`skills/pharaoh-self-review-coverage-check/SKILL.md`](../../skills/pharaoh-self-review-coverage-check/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-self-review-coverage-check/SKILL.md`](../../skills/pharaoh-self-review-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.sequence-diagram-draft.agent.md b/.github/agents/pharaoh.sequence-diagram-draft.agent.md index 6fd0ae8..666e39c 100644 --- a/.github/agents/pharaoh.sequence-diagram-draft.agent.md +++ b/.github/agents/pharaoh.sequence-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. +description: Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. Renderer tailored via `pharaoh.toml`. Does NOT emit component, class, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). 
 handoffs: []
 ---

 # @pharaoh.sequence-diagram-draft

-Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time.
+Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. Renderer tailored via `pharaoh.toml`. Does NOT emit component, class, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented).

 See [`skills/pharaoh-sequence-diagram-draft/SKILL.md`](../../skills/pharaoh-sequence-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns.
diff --git a/.github/agents/pharaoh.setup.agent.md b/.github/agents/pharaoh.setup.agent.md
index 14c5d01..e72a89e 100644
--- a/.github/agents/pharaoh.setup.agent.md
+++ b/.github/agents/pharaoh.setup.agent.md
@@ -1,5 +1,5 @@
 ---
-description: Scaffold Pharaoh into a sphinx-needs project. Detects project structure, generates pharaoh.toml, installs Copilot agents, and recommends tooling.
+description: Use when setting up Pharaoh in a sphinx-needs project for the first time, scaffolding Copilot agents, or reconfiguring project detection
 handoffs:
   - label: Run MECE Check
     agent: pharaoh.mece
@@ -67,9 +67,17 @@ Data access:

 ### Step 3: Configure .gitignore

-Add narrow entries for the ephemeral `.pharaoh/` subpaths only. The
-project tailoring under `.pharaoh/project/` is shared across the team
-and must stay tracked. Create `.gitignore` if needed.
+`.pharaoh/` contains a mix of committed tailoring and ephemeral run state. Ignoring the whole tree is wrong — it hides `.pharaoh/project/` tailoring which IS shared across the team. Ignore only the ephemeral subpaths:
+
+| Path                    | Purpose                                                  | Commit? |
+| ----------------------- | -------------------------------------------------------- | ------- |
+| `.pharaoh/project/`     | Tailoring: workflows, id-conventions, artefact-catalog, checklists | **yes** |
+| `.pharaoh/runs/`        | `pharaoh-execute-plan` run artefacts (report.yaml, staged RST) | no      |
+| `.pharaoh/plans/`       | plan.yaml files emitted by `pharaoh-write-plan`          | no      |
+| `.pharaoh/session.json` | Session / gate state                                     | no      |
+| `.pharaoh/cache/`       | Derived caches                                           | no      |
+
+Entries to add (create `.gitignore` if missing):

 ```
 .pharaoh/runs/
@@ -78,9 +86,7 @@ and must stay tracked. Create `.gitignore` if needed.
 .pharaoh/cache/
 ```

-Do not add a wholesale `.pharaoh/` rule, that would hide the tailoring
-files (`workflows.yaml`, `id-conventions.yaml`, `artefact-catalog.yaml`,
-`checklists/`).
+If `.gitignore` already contains a bare `.pharaoh/` (or `.pharaoh`) line, leave it alone and warn the user that the wide form hides `.pharaoh/project/` tailoring which should be committed; recommend narrowing to the four ephemeral entries above. Do not auto-migrate — respect user control.

 ### Step 4: Recommend Tooling

diff --git a/.github/agents/pharaoh.spec.agent.md b/.github/agents/pharaoh.spec.agent.md
index 13e89a8..a3b87b5 100644
--- a/.github/agents/pharaoh.spec.agent.md
+++ b/.github/agents/pharaoh.spec.agent.md
@@ -1,5 +1,5 @@
 ---
-description: Generate a Superpowers-compatible spec and plan document from sphinx-needs requirements, bridging requirements to implementation.
+description: Use when generating a Superpowers-compatible spec and plan document from sphinx-needs requirements, bridging requirements to implementation handoffs: - label: Execute Plan agent: pharaoh.plan diff --git a/.github/agents/pharaoh.sphinx-extension-add.agent.md b/.github/agents/pharaoh.sphinx-extension-add.agent.md index 67ceb53..9f9e20c 100644 --- a/.github/agents/pharaoh.sphinx-extension-add.agent.md +++ b/.github/agents/pharaoh.sphinx-extension-add.agent.md @@ -1,10 +1,10 @@ --- -description: Idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. +description: Use when you need to idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. Invoked by plans produced by pharaoh-write-plan when a diagram-emitting task requires a renderer extension that `conf.py` does not yet load. Does NOT emit RST. Does NOT build. handoffs: [] --- # @pharaoh.sphinx-extension-add -Add sphinx extensions (e.g. `sphinxcontrib.mermaid`, `sphinxcontrib.plantuml`, `myst_parser`) to a project's `conf.py` extensions list. Idempotent: noop when all requested extensions are already loaded. Optionally installs the corresponding pypi packages via the detected package manager (rye / uv / poetry / pdm / pip-venv). Typically inserted into a plan by `pharaoh.write-plan` as a prerequisite to diagram-emitting tasks when `conf.py` lacks the required renderer extension. +Use when you need to idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. Invoked by plans produced by pharaoh-write-plan when a diagram-emitting task requires a renderer extension that `conf.py` does not yet load. Does NOT emit RST. 
Does NOT build. See [`skills/pharaoh-sphinx-extension-add/SKILL.md`](../../skills/pharaoh-sphinx-extension-add/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.standard-conformance.agent.md b/.github/agents/pharaoh.standard-conformance.agent.md index 17ea5f7..ae0163c 100644 --- a/.github/agents/pharaoh.standard-conformance.agent.md +++ b/.github/agents/pharaoh.standard-conformance.agent.md @@ -1,10 +1,10 @@ --- -description: Evaluate a single artefact against one regulatory standard (ISO 26262, ASPICE, ISO/SAE 21434). +description: Use when evaluating a single sphinx-needs artefact against one regulatory standard (ISO 26262-8 §6, ASPICE 4.0, ISO/SAE 21434). Emits per-indicator findings JSON with pass/fail on mechanizable indicators and 0-3 scores on subjective ones. handoffs: [] --- # @pharaoh.standard-conformance -Evaluate a single artefact against one regulatory standard (ISO 26262, ASPICE, ISO/SAE 21434). +Use when evaluating a single sphinx-needs artefact against one regulatory standard (ISO 26262-8 §6, ASPICE 4.0, ISO/SAE 21434). Emits per-indicator findings JSON with pass/fail on mechanizable indicators and 0-3 scores on subjective ones. See [`skills/pharaoh-standard-conformance/SKILL.md`](../../skills/pharaoh-standard-conformance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.state-diagram-draft.agent.md b/.github/agents/pharaoh.state-diagram-draft.agent.md index 652dc24..ec261c4 100644 --- a/.github/agents/pharaoh.state-diagram-draft.agent.md +++ b/.github/agents/pharaoh.state-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. 
+description: Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or class diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). handoffs: [] --- # @pharaoh.state-diagram-draft -Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. +Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or class diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). See [`skills/pharaoh-state-diagram-draft/SKILL.md`](../../skills/pharaoh-state-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.status-lifecycle-check.agent.md b/.github/agents/pharaoh.status-lifecycle-check.agent.md index 935ba38..f04f403 100644 --- a/.github/agents/pharaoh.status-lifecycle-check.agent.md +++ b/.github/agents/pharaoh.status-lifecycle-check.agent.md @@ -1,5 +1,5 @@ --- -description: Release-gate check over a sphinx-needs corpus — counts needs still in the `draft` bucket (per workflows.yaml) and returns binary pass/fail when `enforce=true`. Advisory mode reports counts without failing. +description: Use when running a release-gate check over a full sphinx-needs corpus to confirm that zero needs remain in the initial `draft` status. Single mechanical binary gate — aggregates `status` across every need in `needs.json`, compares against the initial-state declaration in `workflows.yaml`, and returns pass/fail plus per-status counts. 
Advisory by default (pre-release development passes); release pipelines override `enforce=true` so any draft blocks the gate. handoffs: - label: Aggregate into quality gate agent: pharaoh.quality-gate @@ -8,6 +8,6 @@ handoffs: # @pharaoh.status-lifecycle-check -Aggregate `status` across every need in `needs.json` against the `initial_state` declared in `workflows.yaml`. Binary release gate — under `enforce=true`, zero drafts pass, one draft fails. Under `enforce=false` (default), the findings are reported without failing so pre-release development is unblocked. Distinct from `pharaoh-lifecycle-check`, which evaluates per-need transition legality against `requires:` prerequisites. +Use when running a release-gate check over a full sphinx-needs corpus to confirm that zero needs remain in the initial `draft` status. Single mechanical binary gate — aggregates `status` across every need in `needs.json`, compares against the initial-state declaration in `workflows.yaml`, and returns pass/fail plus per-status counts. Advisory by default (pre-release development passes); release pipelines override `enforce=true` so any draft blocks the gate. See [`skills/pharaoh-status-lifecycle-check/SKILL.md`](../../skills/pharaoh-status-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-bootstrap.agent.md b/.github/agents/pharaoh.tailor-bootstrap.agent.md index fc6465e..2e37a48 100644 --- a/.github/agents/pharaoh.tailor-bootstrap.agent.md +++ b/.github/agents/pharaoh.tailor-bootstrap.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows. 
+description: Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows.yaml, id-conventions.yaml, artefact-catalog.yaml, and per-type checklists — without requiring any needs to exist. Complements pharaoh-tailor-detect which requires ≥10 needs. handoffs: [] --- # @pharaoh.tailor-bootstrap -Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows. +Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows.yaml, id-conventions.yaml, artefact-catalog.yaml, and per-type checklists — without requiring any needs to exist. Complements pharaoh-tailor-detect which requires ≥10 needs. See [`skills/pharaoh-tailor-bootstrap/SKILL.md`](../../skills/pharaoh-tailor-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md index b17d90c..f2f5bb6 100644 --- a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md +++ b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md @@ -1,10 +1,10 @@ --- -description: Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. +description: Use when authoring a project's `code-grounding-filters.yaml` from observed stack conventions. Detects language + CLI framework + config-object style in the project source tree and emits a tailoring YAML populated with the four parameterised filter strategies. 
Does not invoke `pharaoh-req-code-grounding-check`; purely produces tailoring. handoffs: [] --- # @pharaoh.tailor-code-grounding-filters -Detect language + CLI framework + config-default idiom in a project source tree and emit a code-grounding-filters.yaml wiring the four parameterised filter strategies to the detected stack. +Use when authoring a project's `code-grounding-filters.yaml` from observed stack conventions. Detects language + CLI framework + config-object style in the project source tree and emits a tailoring YAML populated with the four parameterised filter strategies. Does not invoke `pharaoh-req-code-grounding-check`; purely produces tailoring. See [`skills/pharaoh-tailor-code-grounding-filters/SKILL.md`](../../skills/pharaoh-tailor-code-grounding-filters/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-detect.agent.md b/.github/agents/pharaoh.tailor-detect.agent.md index 233a42d..680f063 100644 --- a/.github/agents/pharaoh.tailor-detect.agent.md +++ b/.github/agents/pharaoh.tailor-detect.agent.md @@ -1,10 +1,10 @@ --- -description: Inspect a sphinx-needs project and emit a structured report of detected conventions. +description: Use when inspecting a sphinx-needs project to emit a structured report of detected conventions — prefixes, ID regex candidates, separator, lifecycle states, artefact types with observed required/optional fields. Does NOT author tailoring files (see pharaoh-tailor-fill). handoffs: [] --- # @pharaoh.tailor-detect -Inspect a sphinx-needs project and emit a structured report of detected conventions. +Use when inspecting a sphinx-needs project to emit a structured report of detected conventions — prefixes, ID regex candidates, separator, lifecycle states, artefact types with observed required/optional fields. Does NOT author tailoring files (see pharaoh-tailor-fill). 
See [`skills/pharaoh-tailor-detect/SKILL.md`](../../skills/pharaoh-tailor-detect/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-fill.agent.md b/.github/agents/pharaoh.tailor-fill.agent.md index 92f35f3..11afcd1 100644 --- a/.github/agents/pharaoh.tailor-fill.agent.md +++ b/.github/agents/pharaoh.tailor-fill.agent.md @@ -1,10 +1,10 @@ --- -description: Author .pharaoh/project/ tailoring files from a detected-conventions report. +description: Use when authoring the .pharaoh/project/ tailoring files (id-conventions.yaml, workflows.yaml, artefact-catalog.yaml, checklists/requirement.md) from detected-conventions JSON produced by pharaoh-tailor-detect. handoffs: [] --- # @pharaoh.tailor-fill -Author .pharaoh/project/ tailoring files from a detected-conventions report. +Use when authoring the .pharaoh/project/ tailoring files (id-conventions.yaml, workflows.yaml, artefact-catalog.yaml, checklists/requirement.md) from detected-conventions JSON produced by pharaoh-tailor-detect. See [`skills/pharaoh-tailor-fill/SKILL.md`](../../skills/pharaoh-tailor-fill/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.tailor-review.agent.md b/.github/agents/pharaoh.tailor-review.agent.md index 2a2abff..30a5d4e 100644 --- a/.github/agents/pharaoh.tailor-review.agent.md +++ b/.github/agents/pharaoh.tailor-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit .pharaoh/project/ tailoring files against JSON schemas and cross-file consistency. +description: Use when auditing .pharaoh/project/ tailoring files against JSON schemas (id-conventions, workflows, artefact-catalog, checklists frontmatter) plus cross-file consistency checks (every lifecycle state referenced in artefact-catalog exists in workflows.yaml, every prefix in artefact-catalog is declared in id-conventions, etc.). 
handoffs: [] --- # @pharaoh.tailor-review -Audit .pharaoh/project/ tailoring files against JSON schemas and cross-file consistency. +Use when auditing .pharaoh/project/ tailoring files against JSON schemas (id-conventions, workflows, artefact-catalog, checklists frontmatter) plus cross-file consistency checks (every lifecycle state referenced in artefact-catalog exists in workflows.yaml, every prefix in artefact-catalog is declared in id-conventions, etc.). See [`skills/pharaoh-tailor-review/SKILL.md`](../../skills/pharaoh-tailor-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.toctree-emit.agent.md b/.github/agents/pharaoh.toctree-emit.agent.md index eaeaaf5..1532bdb 100644 --- a/.github/agents/pharaoh.toctree-emit.agent.md +++ b/.github/agents/pharaoh.toctree-emit.agent.md @@ -1,10 +1,10 @@ --- -description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree, that is a caller concern. +description: Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree — that is a caller concern. handoffs: [] --- # @pharaoh.toctree-emit -Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. 
Does NOT wire the emitted directory into any parent toctree, that is a caller concern. +Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree — that is a caller concern. See [`skills/pharaoh-toctree-emit/SKILL.md`](../../skills/pharaoh-toctree-emit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.trace.agent.md b/.github/agents/pharaoh.trace.agent.md index b6c4ba8..095c023 100644 --- a/.github/agents/pharaoh.trace.agent.md +++ b/.github/agents/pharaoh.trace.agent.md @@ -1,5 +1,5 @@ --- -description: Navigate traceability links between requirements, specifications, implementations, tests, and code in any direction. +description: Use when navigating traceability links between requirements, specifications, implementations, tests, and code in a sphinx-needs project handoffs: - label: Analyze Impact agent: pharaoh.change diff --git a/.github/agents/pharaoh.use-case-diagram-draft.agent.md b/.github/agents/pharaoh.use-case-diagram-draft.agent.md index 59255eb..a32fd11 100644 --- a/.github/agents/pharaoh.use-case-diagram-draft.agent.md +++ b/.github/agents/pharaoh.use-case-diagram-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Draft a single use-case diagram for one feat — actors, use cases, system boundary. +description: Use when drafting one use-case diagram for a single feat — actors (primary, secondary, external systems), use cases (one per user-facing capability), and system boundary. Renderer-aware (mermaid or plantuml per `.pharaoh/project/diagram-conventions.yaml`). First concrete `*-diagram-draft` skill — others follow the same shape. 
handoffs: [pharaoh.diagram-review] --- # @pharaoh.use-case-diagram-draft -Draft a single use-case diagram for one feat — actors, use cases, system boundary. +Use when drafting one use-case diagram for a single feat — actors (primary, secondary, external systems), use cases (one per user-facing capability), and system boundary. Renderer-aware (mermaid or plantuml per `.pharaoh/project/diagram-conventions.yaml`). First concrete `*-diagram-draft` skill — others follow the same shape. -See [`skills/pharaoh-use-case-diagram-draft/SKILL.md`](../../skills/pharaoh-use-case-diagram-draft/SKILL.md) for the full atomic specification. +See [`skills/pharaoh-use-case-diagram-draft/SKILL.md`](../../skills/pharaoh-use-case-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.verify.agent.md b/.github/agents/pharaoh.verify.agent.md new file mode 100644 index 0000000..f02b890 --- /dev/null +++ b/.github/agents/pharaoh.verify.agent.md @@ -0,0 +1,19 @@ +--- +description: Use when checking whether one sphinx-needs artefact actually addresses the substance of every parent it links to via :satisfies: or :verifies:. Cross-need content check — distinct from structural MECE, schema-level tailor-review, and per-axis req-review/arch-review. +handoffs: + - label: MECE Check + agent: pharaoh.mece + prompt: Run a structural gap-and-orphan analysis around the verified need + - label: Trace the Need + agent: pharaoh.trace + prompt: Trace the verified need through every link type + - label: Re-author the Need + agent: pharaoh.author + prompt: Revise the body to address the missing parent claims +--- + +# @pharaoh.verify + +Use when checking whether one sphinx-needs artefact actually addresses the substance of every parent it links to via :satisfies: or :verifies:. Cross-need content check — distinct from structural MECE, schema-level tailor-review, and per-axis req-review/arch-review. 
+ +See [`skills/pharaoh-verify/SKILL.md`](../../skills/pharaoh-verify/SKILL.md) for the full atomic specification — inputs, scoring scale, and composition patterns. diff --git a/.github/agents/pharaoh.vplan-draft.agent.md b/.github/agents/pharaoh.vplan-draft.agent.md index 2b1b828..037c33c 100644 --- a/.github/agents/pharaoh.vplan-draft.agent.md +++ b/.github/agents/pharaoh.vplan-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Draft a single sphinx-needs test-case (verification plan item) for one requirement. +description: Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. Emits an RST tc__ directive with inputs, steps, and expected outcome, linking to the parent req via :verifies:. handoffs: [] --- # @pharaoh.vplan-draft -Draft a single sphinx-needs test-case (verification plan item) for one requirement. +Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. Emits an RST tc__ directive with inputs, steps, and expected outcome, linking to the parent req via :verifies:. See [`skills/pharaoh-vplan-draft/SKILL.md`](../../skills/pharaoh-vplan-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.vplan-review.agent.md b/.github/agents/pharaoh.vplan-review.agent.md index af3d327..e01ddbf 100644 --- a/.github/agents/pharaoh.vplan-review.agent.md +++ b/.github/agents/pharaoh.vplan-review.agent.md @@ -1,10 +1,10 @@ --- -description: Audit a single test case against ISO 26262-8 §6 axes plus vplan-specific axes. +description: Use when auditing a single test case against ISO 26262-8 §6 axes plus vplan-specific axes (coverage of parent req, completeness of steps, clarity of expected outcome). Emits structured findings JSON. handoffs: [] --- # @pharaoh.vplan-review -Audit a single test case against ISO 26262-8 §6 axes plus vplan-specific axes. 
+Use when auditing a single test case against ISO 26262-8 §6 axes plus vplan-specific axes (coverage of parent req, completeness of steps, clarity of expected outcome). Emits structured findings JSON. See [`skills/pharaoh-vplan-review/SKILL.md`](../../skills/pharaoh-vplan-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/prompts/pharaoh.author.prompt.md b/.github/prompts/pharaoh.author.prompt.md index f8a06dd..1234b39 100644 --- a/.github/prompts/pharaoh.author.prompt.md +++ b/.github/prompts/pharaoh.author.prompt.md @@ -1,3 +1,23 @@ --- agent: pharaoh.author --- + +# /pharaoh.author + +Author or modify one sphinx-needs artefact — a requirement, an architecture element, a test +case, or a decision — by routing to the right atomic drafting skill based on the project's +artefact catalog. One invocation produces one drafted RST directive with an ID, a parent link, +and a suggested file placement. + +Hand the agent: + +- the **target type** (e.g. `req`, `arch`, `tc`, `decision`), +- a short **draft seed** describing what to author, +- the **parent link** the new artefact will trace to (need-id), and +- any type-specific extras the dispatched drafter needs (e.g. `arch_type`, + `verification_level`). + +The agent picks the matching atomic drafter (`pharaoh-req-draft`, `pharaoh-arch-draft`, +`pharaoh-vplan-draft`, or `pharaoh-decide`), forwards the inputs, and returns the drafted RST +directive plus an authoring summary. Run `@pharaoh.verify` next to check the new artefact +against the substance of its parent. 
diff --git a/.github/prompts/pharaoh.verify.prompt.md b/.github/prompts/pharaoh.verify.prompt.md index 1726175..57d5ee9 100644 --- a/.github/prompts/pharaoh.verify.prompt.md +++ b/.github/prompts/pharaoh.verify.prompt.md @@ -1,3 +1,22 @@ --- agent: pharaoh.verify --- + +# /pharaoh.verify + +Check whether one sphinx-needs artefact actually addresses the substance of every parent it +links to via `:satisfies:` or `:verifies:`. This is a cross-need content check — distinct from +structural MECE (`@pharaoh.mece`), schema-level tailoring review (`@pharaoh.tailor-review`), +and per-axis prose review (`@pharaoh.req-review`, `@pharaoh.arch-review`, +`@pharaoh.vplan-review`). + +Hand the agent: + +- the **need-id** to verify, and +- optionally `transitive: true` to walk the full parent chain rather than just direct parents. + +The agent reads `needs.json`, walks the parent links, scores each (child, parent) pair on a +0-3 ordinal for substantive coverage, and returns a JSON document with per-pair verdicts and +concrete missing aspects. Use the result to decide whether to re-author the body via +`@pharaoh.author`, regenerate it per-axis via `@pharaoh.req-regenerate`, or move on to a +corpus-wide check with `@pharaoh.mece`. From 2b6ac331a52defbf64778ceb1ad86cbb51172340 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 16:38:44 +0200 Subject: [PATCH 07/15] Conform .pharaoh/project tailoring to canonical schemas Rewrites workflows.yaml and artefact-catalog.yaml to validate against the canonical JSON schemas shipped at pharaoh/examples/score/.pharaoh/project/schemas/. * workflows.yaml: flat top-level lifecycle_states + flat transitions with requires lists, replacing the per-type maps and {from, to, gate} inline form. Single Pharaoh review-workflow lifecycle, distinct from the sphinx-needs :status: field which carries authoring state. 
* artefact-catalog.yaml: drops child_of and lifecycle_ref keys (additionalProperties: false in the canonical schema). Adds per-type lifecycle arrays for review-eligible types and omits lifecycle for meta types (person, team, release, seq_msg, need). Both files now pass jsonschema validation against the canonical schemas. --- .pharaoh/project/artefact-catalog.yaml | 256 +++++++------------------ .pharaoh/project/workflows.yaml | 244 ++--------------------- 2 files changed, 81 insertions(+), 419 deletions(-) diff --git a/.pharaoh/project/artefact-catalog.yaml b/.pharaoh/project/artefact-catalog.yaml index 4be833b..3ad87cf 100644 --- a/.pharaoh/project/artefact-catalog.yaml +++ b/.pharaoh/project/artefact-catalog.yaml @@ -1,230 +1,102 @@ -# child_of values are intentionally conservative: only listed where the -# corpus shows >=90% coverage AND the link semantics are child-to-parent -# (decomposition / refinement). Owner-style links (:author:, :persons:), -# planning links (:release:), and parent-to-child links (:provides:, -# :startup_calls:, :shutdown_calls:) are excluded even when their corpus -# coverage is high. Types with no qualifying parent get child_of: []. +# Per-type catalogue. Shape conforms to the canonical schema in +# pharaoh/examples/score/.pharaoh/project/schemas/artefact-catalog.schema.json +# (additionalProperties: false). Allowed per-type keys: required_fields, +# optional_fields, lifecycle, required_body_sections. +# +# lifecycle entries are a subset of workflows.lifecycle_states (cross-file +# rule C2). lifecycle is omitted for meta types (person, team, release, +# seq_msg, need) that are reference data, not review-eligible artefacts. +# +# Parent-of relations between types are not encoded here (canonical schema +# has no slot for them). They are captured in pharaoh.toml [pharaoh.traceability] +# required_links instead. 
req: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - - approved_by - - customer - - approved - child_of: [] - lifecycle_ref: workflows.yaml#req + required_fields: [id, status] + optional_fields: [reviewer, approved_by, customer, approved] + lifecycle: [draft, reviewed, approved] spec: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: - - req - lifecycle_ref: workflows.yaml#spec + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] impl: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - - jira - - github - - effort - - approved - child_of: [] - lifecycle_ref: workflows.yaml#impl + required_fields: [id, status] + optional_fields: [reviewer, approved_by, jira, github, effort, approved] + lifecycle: [draft, reviewed, approved] test: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#test + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] person: - required_fields: - - id - - status - optional_fields: - - role - - contact - - image - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#person + required_fields: [id, status] + optional_fields: [role, contact, image, reviewer, approved_by] team: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#team + required_fields: [id, status] + optional_fields: [reviewer, approved_by] release: - required_fields: - - id - - status - optional_fields: - - date - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#release + required_fields: [id, status] + optional_fields: [date, reviewer, approved_by] arch: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - 
- approved_by - child_of: - - req - lifecycle_ref: workflows.yaml#arch + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] need: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#need + required_fields: [id, status] + optional_fields: [reviewer, approved_by] swarch: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - - approved_by - child_of: - - swreq - lifecycle_ref: workflows.yaml#swarch + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] component: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#component + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] interface: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: - - component - lifecycle_ref: workflows.yaml#interface + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] seq_msg: - required_fields: - - id - - status - optional_fields: - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#seq_msg + required_fields: [id, status] + optional_fields: [reviewer, approved_by] swreq: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - - approved_by - - jira - - github - child_of: - - req - lifecycle_ref: workflows.yaml#swreq + required_fields: [id, status] + optional_fields: [reviewer, approved_by, jira, github] + lifecycle: [draft, reviewed, approved] sys-arch: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#sys-arch + required_fields: [id, status] + optional_fields: 
[reviewer, approved_by] + lifecycle: [draft, reviewed, approved] hazard: - required_fields: - - id - - status - - asil - optional_fields: - - severity - - exposure - - controllability - - scenario - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#hazard + required_fields: [id, status, asil] + optional_fields: [severity, exposure, controllability, scenario, reviewer, approved_by] + lifecycle: [draft, reviewed, approved] safety_goal: - required_fields: - - id - - status - - asil - - mitigates - optional_fields: - - safe_state - - reviewer - - approved_by - child_of: - - hazard - lifecycle_ref: workflows.yaml#safety_goal + required_fields: [id, status, asil, mitigates] + optional_fields: [safe_state, reviewer, approved_by] + lifecycle: [draft, reviewed, approved] fsr: - required_fields: - - id - - status - optional_fields: - - asil - - reviewer - - approved_by - child_of: - - safety_goal - lifecycle_ref: workflows.yaml#fsr + required_fields: [id, status] + optional_fields: [asil, reviewer, approved_by] + lifecycle: [draft, reviewed, approved] sysreq: - required_fields: - - id - - status - optional_fields: - - source_doc - - reviewer - - approved_by - child_of: [] - lifecycle_ref: workflows.yaml#sysreq + required_fields: [id, status] + optional_fields: [reviewer, approved_by] + lifecycle: [draft, reviewed, approved] diff --git a/.pharaoh/project/workflows.yaml b/.pharaoh/project/workflows.yaml index 88a9bd4..615ec5a 100644 --- a/.pharaoh/project/workflows.yaml +++ b/.pharaoh/project/workflows.yaml @@ -1,227 +1,17 @@ -req: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -spec: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: 
approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -impl: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -test: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -person: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -team: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -release: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -arch: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -need: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, 
gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -swarch: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -component: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -interface: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -seq_msg: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -swreq: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -sys-arch: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -hazard: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: 
approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -safety_goal: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -fsr: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved - -sysreq: - states: - - draft - - reviewed - - approved - transitions: - - {from: draft, to: reviewed, gate: "reviewer_present"} - - {from: reviewed, to: approved, gate: "approver_present"} - - {from: reviewed, to: draft, gate: "reviewer_rejected"} - initial: draft - final: approved +# Pharaoh review-workflow lifecycle. Flat shape per the canonical schema +# in pharaoh/examples/score/.pharaoh/project/schemas/workflows.schema.json. +# Distinct from sphinx-needs :status: field, which carries the project's +# authoring status (open / closed / passed / approved). + +lifecycle_states: [draft, reviewed, approved] + +transitions: + - from: draft + to: reviewed + requires: [reviewer_present] + - from: reviewed + to: approved + requires: [approver_present] + - from: reviewed + to: draft + requires: [reviewer_rejected] From 152e77cb6e51b7496ac69a03dfbd140ef46c111c Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 16:42:32 +0200 Subject: [PATCH 08/15] Add source_doc to req and sysreq optional_fields Verification flagged that req.md and sysreq.md checklists reference :source_doc: in their bullets, but source_doc was not listed in those types' optional_fields. 
The other top-level types (arch, swarch, sys-arch, swreq) already carried source_doc as an optional Pharaoh provenance field. Adding it for req and sysreq aligns the catalog with the checklist content and with the convention used by sibling top-level types. --- .pharaoh/project/artefact-catalog.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.pharaoh/project/artefact-catalog.yaml b/.pharaoh/project/artefact-catalog.yaml index 3ad87cf..f7a4e40 100644 --- a/.pharaoh/project/artefact-catalog.yaml +++ b/.pharaoh/project/artefact-catalog.yaml @@ -13,7 +13,7 @@ req: required_fields: [id, status] - optional_fields: [reviewer, approved_by, customer, approved] + optional_fields: [source_doc, reviewer, approved_by, customer, approved] lifecycle: [draft, reviewed, approved] spec: @@ -98,5 +98,5 @@ fsr: sysreq: required_fields: [id, status] - optional_fields: [reviewer, approved_by] + optional_fields: [source_doc, reviewer, approved_by] lifecycle: [draft, reviewed, approved] From a10df375de2b124979ea827b6d01202a8d4f5bed Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 16:43:48 +0200 Subject: [PATCH 09/15] Encode link-target requirements in catalog required_fields Without child_of (which the canonical schema rejects), the only way to tell an authoring agent what to link is to list the link-option name in the type's required_fields. Score example does this: gd_req's required fields include satisfies, which means every gd_req must have a non-empty :satisfies: link. Aligns spec and fsr with the same convention already used by safety_goal: * spec: required_fields includes reqs (every spec must link to a req via :reqs:, matching pharaoh.toml [pharaoh.traceability].required_links spec -> req chain). * fsr: required_fields includes derives_from (every fsr must link to a safety_goal via :derives_from:, matching the fsr -> safety_goal chain). * safety_goal already had mitigates in required_fields. 
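
As a rough illustration of the convention above (a link option listed in required_fields must be non-empty on every need of that type), here is a minimal sketch. It is not Pharaoh's implementation: the helper name, the flat need layout, and the sample IDs SPEC_001 and FSR_001 are assumptions made for the example. Only the three required_fields lists come from this series.

```python
# Hypothetical helper, not part of Pharaoh. The catalogue excerpt mirrors the
# required_fields this series sets for spec, fsr and safety_goal.
REQUIRED_FIELDS = {
    "spec": ["id", "status", "reqs"],
    "fsr": ["id", "status", "derives_from"],
    "safety_goal": ["id", "status", "asil", "mitigates"],
}


def missing_required_fields(needs, catalog=REQUIRED_FIELDS):
    """Return {need_id: [field, ...]} for needs with empty required fields.

    `needs` maps need IDs to flat dicts, with link options stored as lists
    of target IDs (an assumed layout, loosely modelled on needs.json).
    """
    findings = {}
    for need_id, need in needs.items():
        required = catalog.get(need.get("type"), [])
        # An absent key, an empty string and an empty list all count as missing.
        missing = [field for field in required if not need.get(field)]
        if missing:
            findings[need_id] = missing
    return findings


# Made-up sample data: the spec links to a req, the fsr has no parent link.
sample = {
    "SPEC_001": {"id": "SPEC_001", "type": "spec", "status": "open",
                 "reqs": ["REQ_001"]},
    "FSR_001": {"id": "FSR_001", "type": "fsr", "status": "open",
                "derives_from": []},
}
print(missing_required_fields(sample))  # {'FSR_001': ['derives_from']}
```

Types that are absent from the excerpt are simply not checked, which matches the decision to leave arch out of this convention.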
arch -> req also exists in required_links but is omitted here because the project uses generic :links: for that hop, and listing :links: as required would be too broad. --- .pharaoh/project/artefact-catalog.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.pharaoh/project/artefact-catalog.yaml b/.pharaoh/project/artefact-catalog.yaml index f7a4e40..6daceb2 100644 --- a/.pharaoh/project/artefact-catalog.yaml +++ b/.pharaoh/project/artefact-catalog.yaml @@ -17,7 +17,7 @@ req: lifecycle: [draft, reviewed, approved] spec: - required_fields: [id, status] + required_fields: [id, status, reqs] optional_fields: [reviewer, approved_by] lifecycle: [draft, reviewed, approved] @@ -92,7 +92,7 @@ safety_goal: lifecycle: [draft, reviewed, approved] fsr: - required_fields: [id, status] + required_fields: [id, status, derives_from] optional_fields: [asil, reviewer, approved_by] lifecycle: [draft, reviewed, approved] From a6f17bc82ad1a3605ef81217a81086a87fdc1a03 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 17:17:16 +0200 Subject: [PATCH 10/15] Add walkthrough section to docs/pharaoh.rst Four-step demo flow on the existing Lane Detection corpus that a regular user can paste verbatim into Copilot Chat or Claude Code: * Step 1 (`@pharaoh.mece`): gap analysis against the configured traceability chains plus consistency / ID-regex / undeclared-type checks. * Step 2 (`@pharaoh.req-from-code`): reverse-engineer a focused API-encapsulation requirement from `src/automotive_adas.py` as a child of REQ_001. * Step 3 (`@pharaoh.vplan-draft`): write the missing system test that verifies the lane-detection requirement. * Step 4 (`@pharaoh.change`): change-impact analysis for a revision to REQ_001, scoped to outgoing and incoming :links: edges so the blast radius stays in the lane-detection domain. 
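
To make the Step 4 scoping concrete, the walk over outgoing and incoming :links: edges can be sketched roughly as below. This is an assumption-laden illustration, not the behaviour of the pharaoh.change skill: the function name, the simplified need layout, and every sample ID other than REQ_001 and SWREQ_001 are made up.

```python
from collections import deque

# Node types the walkthrough prompt asks the agent to skip.
SKIPPED_TYPES = {"person", "team", "release"}


def impact_set(needs, start_id, skipped_types=SKIPPED_TYPES):
    """Collect need IDs reachable from start_id over :links: edges.

    Edges are followed in both directions. `needs` maps IDs to dicts with a
    "type" and a "links" list (an assumed, simplified layout).
    """
    # Build an undirected adjacency map from the outgoing links.
    neighbours = {need_id: set() for need_id in needs}
    for need_id, need in needs.items():
        for target in need.get("links", []):
            if target in needs:
                neighbours[need_id].add(target)
                neighbours[target].add(need_id)

    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            # Skipped types are neither reported nor traversed through.
            if other in seen or needs[other]["type"] in skipped_types:
                continue
            seen.add(other)
            queue.append(other)
    seen.discard(start_id)
    return sorted(seen)


# Made-up corpus: REQ_002 shares only a release with REQ_001.
corpus = {
    "REQ_001": {"type": "req", "links": ["REL_1"]},
    "SWREQ_001": {"type": "swreq", "links": ["REQ_001"]},
    "T_001": {"type": "test", "links": ["SWREQ_001"]},
    "REL_1": {"type": "release", "links": []},
    "REQ_002": {"type": "req", "links": ["REL_1"]},
}
print(impact_set(corpus, "REQ_001"))  # ['SWREQ_001', 'T_001']
```

Because the release node is not traversed, REQ_002 stays out of the result even though both requirements are scheduled into the same release. This is one way the blast radius can stay inside a single domain.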
Each step lists the prompt body, the constraints that match the project's declared types/links/status conventions, and the expected output shape. A trailing sanity-check block re-runs sphinx-build -W and ubc check after the new needs are pasted. Also tightens the existing Tailoring layer section to reference real entry points (workflows.yaml, id-conventions.yaml) instead of the deprecated `child_of` field that the canonical schema does not allow. --- docs/pharaoh.rst | 147 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 140 insertions(+), 7 deletions(-) diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index aa7f155..72ba519 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -202,18 +202,151 @@ violations, undeclared types injected by ``sphinx-test-reports``) are pre-existing properties of the corpus and are surfaced for review, not introduced by Pharaoh. +🛠️ Walkthrough: AI-assisted V-model authoring +--------------------------------------------- + +Once setup is complete, the typical Pharaoh workflow on this demo +takes four prompts. Each prompt is self-contained, so they can be run +top-to-bottom or in isolation. The example below uses the GitHub +Copilot Chat syntax (``@pharaoh.<name>``); in Claude Code the +equivalent slash form is ``/pharaoh:pharaoh-<name>``. + +The shared scenario across the four prompts is the existing +``REQ_001 — Lane Detection Algorithm`` and the ``LaneDetection`` class +in ``src/automotive_adas.py``. The walkthrough fills a focused +API-encapsulation requirement, writes the test that verifies it, and +then asks what changes if ``REQ_001`` itself is revised. + +Step 1: gap analysis +^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: text + + @pharaoh.mece + + Analyse this sphinx-needs project for traceability gaps and + consistency issues. Use the corpus at docs/_build/html/needs.json. 
+ Report required-link-chain violations, orphans, status mismatches, + ID-pattern violations against the project's id_regex, and any need + type observed in the corpus that is not declared in + [[needs.types]]. Quote concrete need IDs in every finding. + +Expected: the four declared traceability chains hit 100 percent +coverage, but the report surfaces parent-closed/child-open status +mismatches (for example ``EX_SPEC_001 (closed) -> EX_REQ_001 (open)``), +a handful of needs of types injected by ``sphinx-test-reports`` that +are not declared in ``[[needs.types]]``, and a number of existing +need IDs that fail the project's own ``id_regex``. + +Step 2: reverse-engineer a focused requirement from code +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: text + + @pharaoh.req-from-code + + Read src/automotive_adas.py. The LaneDetection class is already + linked to SWREQ_001/002/003 and REQ_001 via three IMPL directives + in its docstrings, so the broad behaviour is captured. What is + not captured is a focused requirement for the API encapsulation + contract of the class — what the caller can rely on without + inspecting raw image data. Emit a single sphinx-needs `req` + directive that captures that contract as a child of REQ_001. + + Constraints: ID prefix at most 10 chars (e.g. REQ_LANE_01); + :status: open; :links: REQ_001; only use fields and link options + declared in docs/ubproject.toml. + +Expected: a ``.. req:: REQ_LANE_01`` block (or similar short ID), +``:links: REQ_001``, and a single shall-clause grounded in the +``LaneDetection`` class's public methods, ready to paste under +``docs/automotive-adas/``. + +Step 3: write the missing test +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: text + + @pharaoh.vplan-draft + + Pick requirement REQ_001 (Lane Detection Algorithm) from the + corpus and generate a single test case that verifies its + observable behaviour. 
+ + Constraints: directive name `test` (NOT `tc`); ID prefix at most + 10 chars (e.g. T_LANE_DET_001); :status: open; link to the parent + via the project's generic option (`:links: REQ_001`), not + `:verifies:`; body must contain explicit Inputs, Steps, and + Expected sections with at least one measurable threshold. + +Expected: a ``.. test:: T_LANE_DET_001`` block with three RST +paragraphs (Inputs / Steps / Expected) and a measurable acceptance +criterion (for example a lateral-deviation threshold across lighting +scenarios). + +Step 4: change-impact analysis +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: text + + @pharaoh.change + + The existing requirement REQ_001 (Lane Detection Algorithm) is + being revised: tighten the lateral-deviation tolerance and require + the algorithm to run on a constrained embedded ECU instead of the + development host. + + Walk the traceability graph in docs/_build/html/needs.json + starting from REQ_001, following outgoing and incoming :links: + edges only. Skip person, team, and release nodes. Report: + + * every architecture element linked to REQ_001 (must update) + * every software requirement linked to REQ_001 (must update) + * every test case that exercises REQ_001 or the affected swreqs + (must re-run) + * the release REQ_001 is scheduled into; flag whether the change + fits the release window + * any newly authored child needs from Steps 2 and 3 of this + walkthrough that depend on REQ_001 + + End with a one-paragraph summary suitable for a change-board + ticket. + +Expected: a tight blast radius of 6 to 8 needs in the lane-detection +domain — one architecture element, three software requirements, two +system tests, the release window, and the two new needs authored in +Steps 2 and 3. + +Sanity-check the artefacts +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The two RST blocks emitted by Steps 2 and 3 paste directly under +``docs/automotive-adas/``. Re-run the build and the ubc check to +confirm both still pass: + +.. 
code-block:: bash + + uv run sphinx-build -W -b html docs docs/_build/html + ubc check docs + +Both commands should report success with the new needs included. + Tailoring layer --------------- -The YAML files under ``.pharaoh/project/`` are intentionally -human-readable so that they can be hand-tuned. Key entry points: +The files under ``.pharaoh/project/`` are intentionally human-readable +so that they can be hand-tuned. Key entry points: -* ``workflows.yaml``: change the allowed states or add a +* ``workflows.yaml``: change the allowed lifecycle states or add a ``deprecated`` terminal state. * ``artefact-catalog.yaml``: promote a field from optional to - required, or restrict ``child_of`` to a smaller set of parent types. + required, or add a project-specific link option (for example + ``mitigates`` for safety goals) to a type's ``required_fields``. * ``checklists/<type>.md``: edit the review questions consumed by - ``pharaoh:<type>-review``. + the per-type review skills. +* ``id-conventions.yaml``: tighten or relax the ID regex once the + ID-policy collisions in the project's own ``[[needs.types]]`` are + resolved. -Re-run ``pharaoh:tailor-detect`` once the catalogue grows past a few -dozen needs to refresh the inferred conventions. +Re-run ``@pharaoh.tailor-detect`` once the catalogue grows past a +few dozen new needs to refresh the inferred conventions. From 2266aef5afda87db83e4ee7ad04bf32f85c6a8f5 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 17:23:05 +0200 Subject: [PATCH 11/15] Tighten walkthrough prompts to demo-grade brevity The previous walkthrough hand-fed the agent a paragraph of constraints per step (ID regex bounds, declared field set, status enum, link options). That made every prompt longer than the artefact it produced and missed the point: the agent is supposed to read the project's tailoring (.pharaoh/project/) and ubproject.toml itself. 
Each step now ships a one-line prompt plus a "Why so short" note that points the missing-skill-grounding back to upstream Pharaoh, not to the user. If a skill needs hand-fed constraints to produce build-clean output, that is a Pharaoh-skill gap, not a property of this walkthrough. Word counts before / after, per step: * Step 1: ~70 -> 0 (just the agent invocation) * Step 2: ~120 -> ~15 * Step 3: ~85 -> ~10 * Step 4: ~110 -> ~15 --- docs/pharaoh.rst | 95 ++++++++++++++++-------------------------------- 1 file changed, 32 insertions(+), 63 deletions(-) diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index 72ba519..f6ed2c7 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -167,7 +167,7 @@ A few decisions worth calling out: * **Strictness** is ``advisory``. Pharaoh skills suggest the recommended workflow but never block authoring. * **Required link chains** reflect the 100%-coverage policy that the - existing 268 needs already satisfy: ``spec → req``, ``arch → req``, + existing needs already satisfy: ``spec → req``, ``arch → req``, ``safety_goal → hazard``, ``fsr → safety_goal``. Chains for ``impl`` and ``test`` were intentionally left out because the corpus shows mixed parent types below 90% coverage. @@ -224,17 +224,9 @@ Step 1: gap analysis @pharaoh.mece - Analyse this sphinx-needs project for traceability gaps and - consistency issues. Use the corpus at docs/_build/html/needs.json. - Report required-link-chain violations, orphans, status mismatches, - ID-pattern violations against the project's id_regex, and any need - type observed in the corpus that is not declared in - [[needs.types]]. Quote concrete need IDs in every finding. 
- Expected: the four declared traceability chains hit 100 percent coverage, but the report surfaces parent-closed/child-open status -mismatches (for example ``EX_SPEC_001 (closed) -> EX_REQ_001 (open)``), -a handful of needs of types injected by ``sphinx-test-reports`` that +mismatches, needs of types injected by ``sphinx-test-reports`` that are not declared in ``[[needs.types]]``, and a number of existing need IDs that fail the project's own ``id_regex``. @@ -245,22 +237,14 @@ Step 2: reverse-engineer a focused requirement from code @pharaoh.req-from-code - Read src/automotive_adas.py. The LaneDetection class is already - linked to SWREQ_001/002/003 and REQ_001 via three IMPL directives - in its docstrings, so the broad behaviour is captured. What is - not captured is a focused requirement for the API encapsulation - contract of the class — what the caller can rely on without - inspecting raw image data. Emit a single sphinx-needs `req` - directive that captures that contract as a child of REQ_001. - - Constraints: ID prefix at most 10 chars (e.g. REQ_LANE_01); - :status: open; :links: REQ_001; only use fields and link options - declared in docs/ubproject.toml. + src/automotive_adas.py — focused API contract for the LaneDetection + class as a child of REQ_001. -Expected: a ``.. req:: REQ_LANE_01`` block (or similar short ID), -``:links: REQ_001``, and a single shall-clause grounded in the -``LaneDetection`` class's public methods, ready to paste under -``docs/automotive-adas/``. +Expected: a ``.. req::`` block with a short domain-shaped ID, status +``open``, ``:links: REQ_001``, and a single shall-clause grounded in +the ``LaneDetection`` class's public methods. The skill reads +``id-conventions.yaml`` and ``ubproject.toml`` to pick a regex-valid +ID and use only declared fields and link options. 
Step 3: write the missing test ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -269,20 +253,13 @@ Step 3: write the missing test @pharaoh.vplan-draft - Pick requirement REQ_001 (Lane Detection Algorithm) from the - corpus and generate a single test case that verifies its - observable behaviour. - - Constraints: directive name `test` (NOT `tc`); ID prefix at most - 10 chars (e.g. T_LANE_DET_001); :status: open; link to the parent - via the project's generic option (`:links: REQ_001`), not - `:verifies:`; body must contain explicit Inputs, Steps, and - Expected sections with at least one measurable threshold. + System test for REQ_001 with measurable lighting-condition + thresholds. -Expected: a ``.. test:: T_LANE_DET_001`` block with three RST -paragraphs (Inputs / Steps / Expected) and a measurable acceptance -criterion (for example a lateral-deviation threshold across lighting -scenarios). +Expected: a ``.. test::`` block with a short domain-shaped ID, three +body sections (Inputs, Steps, Expected), and a measurable acceptance +criterion such as a lateral-deviation threshold across lighting +scenarios. Step 4: change-impact analysis ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -291,31 +268,23 @@ Step 4: change-impact analysis @pharaoh.change - The existing requirement REQ_001 (Lane Detection Algorithm) is - being revised: tighten the lateral-deviation tolerance and require - the algorithm to run on a constrained embedded ECU instead of the - development host. - - Walk the traceability graph in docs/_build/html/needs.json - starting from REQ_001, following outgoing and incoming :links: - edges only. Skip person, team, and release nodes. 
Report: - - * every architecture element linked to REQ_001 (must update) - * every software requirement linked to REQ_001 (must update) - * every test case that exercises REQ_001 or the affected swreqs - (must re-run) - * the release REQ_001 is scheduled into; flag whether the change - fits the release window - * any newly authored child needs from Steps 2 and 3 of this - walkthrough that depend on REQ_001 - - End with a one-paragraph summary suitable for a change-board - ticket. - -Expected: a tight blast radius of 6 to 8 needs in the lane-detection -domain — one architecture element, three software requirements, two -system tests, the release window, and the two new needs authored in -Steps 2 and 3. + REQ_001 is being revised: tighter lateral tolerance, ECU port. + Scope to the lane-detection domain. + +Expected: a tight blast radius of about 6 to 8 needs — one +architecture element, three software requirements, two system tests, +the release window, and any newly authored child needs from Steps 2 +and 3. + +Why so short +^^^^^^^^^^^^ + +The whole point is that the agent reads the project's tailoring +(``.pharaoh/project/``) and ``ubproject.toml`` itself. The user +should never have to remind it of the ID regex, the available link +options, or the canonical status set. If a skill needs hand-fed +constraints to produce build-clean output, that is a Pharaoh-skill +gap (tracked upstream), not a property of this walkthrough. 
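The change-impact walk that Step 4 delegates to ``@pharaoh.change`` can be approximated by hand as a breadth-first traversal of the needs graph. This is a minimal sketch, not Pharaoh's implementation. It assumes the usual sphinx-needs ``needs.json`` shape, where each need carries ``links`` and ``links_back`` lists, and it runs on a small in-memory corpus. Only ``REQ_001`` comes from the walkthrough; every other ID and the ``blast_radius`` helper are hypothetical. The skipped types mirror the earlier long-form prompt.

```python
from collections import deque

# Hypothetical miniature corpus in the shape of the "needs" mapping of a
# sphinx-needs needs.json file. Only REQ_001 appears in the walkthrough;
# all other IDs are invented for illustration.
NEEDS = {
    "REQ_001": {
        "type": "req",
        "links": [],
        "links_back": ["ARCH_X", "SWREQ_X", "TEST_X", "PERSON_X"],
    },
    "ARCH_X": {"type": "arch", "links": ["REQ_001"], "links_back": []},
    "SWREQ_X": {"type": "swreq", "links": ["REQ_001"], "links_back": ["TEST_Y"]},
    "TEST_X": {"type": "test", "links": ["REQ_001"], "links_back": []},
    "TEST_Y": {"type": "test", "links": ["SWREQ_X"], "links_back": []},
    "PERSON_X": {"type": "person", "links": ["REQ_001"], "links_back": []},
    "REQ_OTHER": {"type": "req", "links": [], "links_back": []},
}

# Node types the earlier long-form prompt asked the agent to skip.
SKIPPED_TYPES = {"person", "team", "release"}


def blast_radius(needs, start, skipped_types=SKIPPED_TYPES):
    """Return the sorted IDs reachable from `start` over links and links_back."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = needs[queue.popleft()]
        neighbours = current.get("links", []) + current.get("links_back", [])
        for neighbour in neighbours:
            # Ignore already visited IDs and dangling references.
            if neighbour in seen or neighbour not in needs:
                continue
            # Do not traverse into or through organisational nodes.
            if needs[neighbour]["type"] in skipped_types:
                continue
            seen.add(neighbour)
            queue.append(neighbour)
    return sorted(seen - {start})


affected = blast_radius(NEEDS, "REQ_001")
print(affected)
```

Because the walk follows edges in both directions without a depth limit, a densely linked corpus can pull in sibling needs. A real run would probably need a depth bound or a domain filter to stay within the 6 to 8 needs the walkthrough expects.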
Sanity-check the artefacts ^^^^^^^^^^^^^^^^^^^^^^^^^^ From 884f2a806b932695a8196bf0296c21c55f2cbcf8 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 17:49:02 +0200 Subject: [PATCH 12/15] Tighten walkthrough Steps 2 and 3 to keep skills atomic MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit End-to-end claude -p sonnet validation surfaced two atomicity drifts in the canonical Pharaoh skills under live invocation: * pharaoh-req-from-code happily emits 3 SW-level requirements when 1 feature-level requirement was asked for. One-line nudge: "Emit a single req directive using only declared fields and link options." * pharaoh-vplan-draft happily follows its own draft with a self-review pass and returns only the review. One-line nudge: "Emit only the RST .. test:: directive ready to paste, no review and no self-evaluation." The Why-so-short section now names both behaviours and notes they are tracked upstream rather than papering over them with longer prompts. Steps 1 (mece) and 4 (change) remain one-line invocations — both pass clean against the corpus when run with --permission-mode bypassPermissions to skip session-state write prompts. --- docs/pharaoh.rst | 32 +++++++++++++++++++------------- 1 file changed, 19 insertions(+), 13 deletions(-) diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index f6ed2c7..be3783b 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -238,13 +238,14 @@ Step 2: reverse-engineer a focused requirement from code @pharaoh.req-from-code src/automotive_adas.py — focused API contract for the LaneDetection - class as a child of REQ_001. + class as a child of REQ_001. Emit a single `req` directive using + only declared fields and link options. Expected: a ``.. req::`` block with a short domain-shaped ID, status ``open``, ``:links: REQ_001``, and a single shall-clause grounded in -the ``LaneDetection`` class's public methods. 
The skill reads -``id-conventions.yaml`` and ``ubproject.toml`` to pick a regex-valid -ID and use only declared fields and link options. +the ``LaneDetection`` class's public methods. The skill follows the +project's id-conventions and uses ``:links:`` rather than the +Pharaoh-internal ``:verification:`` field. Step 3: write the missing test ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -254,12 +255,13 @@ Step 3: write the missing test @pharaoh.vplan-draft System test for REQ_001 with measurable lighting-condition - thresholds. + thresholds. Emit only the RST `.. test::` directive ready to paste, + no review and no self-evaluation. Expected: a ``.. test::`` block with a short domain-shaped ID, three body sections (Inputs, Steps, Expected), and a measurable acceptance -criterion such as a lateral-deviation threshold across lighting -scenarios. +criterion such as a lateral-deviation or IoU threshold across +lighting scenarios. Step 4: change-impact analysis ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -279,12 +281,16 @@ and 3. Why so short ^^^^^^^^^^^^ -The whole point is that the agent reads the project's tailoring -(``.pharaoh/project/``) and ``ubproject.toml`` itself. The user -should never have to remind it of the ID regex, the available link -options, or the canonical status set. If a skill needs hand-fed -constraints to produce build-clean output, that is a Pharaoh-skill -gap (tracked upstream), not a property of this walkthrough. +The agent reads the project's tailoring (``.pharaoh/project/``) and +``ubproject.toml`` itself, so the user does not have to repeat the +declared types, link options, ID regex, or status enum. 
The two short +nudges in Steps 2 and 3 (``single req directive``, ``emit only the +RST``) keep specific Pharaoh atomic skills from drifting into adjacent +behaviour: the canonical ``pharaoh-req-from-code`` skill is happy to +emit several SW-level requirements when one feature-level requirement +was asked for, and ``pharaoh-vplan-draft`` is happy to follow itself +with a self-review. Both behaviours are tracked as upstream Pharaoh +issues. Sanity-check the artefacts ^^^^^^^^^^^^^^^^^^^^^^^^^^ From 5e6ec392f7c2b54a262c3329d4866c30fe5cb2c1 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 18:26:00 +0200 Subject: [PATCH 13/15] Sync .github/agents with useblocks/pharaoh@v1.2.0 Pharaoh PR #16 (Fix #13: tailoring, atomic-skill, and setup gaps) merged to upstream main and tagged v1.2.0. Pulls the seven agent files that received refinements during that PR's review: * pharaoh.arch-draft, pharaoh.change, pharaoh.flow, pharaoh.plan, pharaoh.req-draft, pharaoh.setup, pharaoh.vplan-draft. The other 64 agent files are byte-equal to what this PR was already shipping from the PR-16 head sync. --- .github/agents/pharaoh.arch-draft.agent.md | 4 +-- .github/agents/pharaoh.change.agent.md | 6 +++++ .github/agents/pharaoh.flow.agent.md | 6 +++-- .github/agents/pharaoh.plan.agent.md | 6 +++++ .github/agents/pharaoh.req-draft.agent.md | 4 +-- .github/agents/pharaoh.setup.agent.md | 29 ++++++++++++++------- .github/agents/pharaoh.vplan-draft.agent.md | 4 +-- 7 files changed, 42 insertions(+), 17 deletions(-) diff --git a/.github/agents/pharaoh.arch-draft.agent.md b/.github/agents/pharaoh.arch-draft.agent.md index 3f9b5e3..ccb3362 100644 --- a/.github/agents/pharaoh.arch-draft.agent.md +++ b/.github/agents/pharaoh.arch-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting a single sphinx-needs architecture element (component / interface / module) from one parent requirement. 
Emits an RST directive block linking back to the parent via :satisfies:. +description: Use when drafting a single sphinx-needs architecture element from one parent requirement. The artefact type is parameterised via `target_level` (any catalog-declared architecture type — e.g. `arch`, `swarch`, `sys-arch`, `module`, `component`, `interface`). Emits an RST directive block linking back to the parent via `:satisfies:`. handoffs: [] --- # @pharaoh.arch-draft -Use when drafting a single sphinx-needs architecture element (component / interface / module) from one parent requirement. Emits an RST directive block linking back to the parent via :satisfies:. +Use when drafting a single sphinx-needs architecture element from one parent requirement. The artefact type is parameterised via `target_level` (any catalog-declared architecture type — e.g. `arch`, `swarch`, `sys-arch`, `module`, `component`, `interface`). Emits an RST directive block linking back to the parent via `:satisfies:`. See [`skills/pharaoh-arch-draft/SKILL.md`](../../skills/pharaoh-arch-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.change.agent.md b/.github/agents/pharaoh.change.agent.md index df8fdf3..9ffa829 100644 --- a/.github/agents/pharaoh.change.agent.md +++ b/.github/agents/pharaoh.change.agent.md @@ -1,6 +1,12 @@ --- description: Use when analyzing the impact of changing a requirement, specification, or any sphinx-needs item, including traceability to code via codelinks handoffs: + - label: Author the affected needs + agent: pharaoh.author + prompt: Author the needs flagged in this change analysis + - label: Verify the affected needs + agent: pharaoh.verify + prompt: Verify the authored needs against their parents and review axes - label: MECE Check agent: pharaoh.mece prompt: Check the affected area for gaps and redundancies diff --git a/.github/agents/pharaoh.flow.agent.md b/.github/agents/pharaoh.flow.agent.md index ac61e8d..028d3e8 100644 --- a/.github/agents/pharaoh.flow.agent.md +++ b/.github/agents/pharaoh.flow.agent.md @@ -1,10 +1,12 @@ --- -description: Use when orchestrating the full V-model chain for one feature context — requirement → architecture element → verification plan → FMEA, each with a review pass. Invokes pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, pharaoh-fmea in sequence. +description: Use when orchestrating the full V-model chain for one feature context across the optional ISO 26262 safety V (hazard / safety_goal / fsr), the ASPICE SYS layer (sysreq / sys-arch), the ASPICE SW layer (swreq / swarch), and the classical component V (req / comp_req then arch then vplan then fmea), each with a review pass. Auto-detects which layers to run from the project's artefact-catalog.yaml; the caller can pass a stages argument to skip layers explicitly. 
Dispatches to pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, and pharaoh-fmea — safety-V types route through pharaoh-req-draft with the appropriate target_level, no new safety-V drafting skills are introduced. handoffs: [] --- # @pharaoh.flow -Use when orchestrating the full V-model chain for one feature context — requirement → architecture element → verification plan → FMEA, each with a review pass. Invokes pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, pharaoh-fmea in sequence. +Use when orchestrating the full V-model chain for one feature context across the optional ISO 26262 safety V (hazard / safety_goal / fsr), the ASPICE SYS layer (sysreq / sys-arch), the ASPICE SW layer (swreq / swarch), and the classical component V (req / comp_req then arch then vplan then fmea), each with a review pass. Auto-detects which layers to run from the project's artefact-catalog.yaml; the caller can pass a stages argument to skip layers explicitly. + +Dispatches to pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, and pharaoh-fmea. Safety-V types route through pharaoh-req-draft with the appropriate target_level (hazard, safety_goal, fsr) — no new safety-V drafting skills are introduced. See [`skills/pharaoh-flow/SKILL.md`](../../skills/pharaoh-flow/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
diff --git a/.github/agents/pharaoh.plan.agent.md b/.github/agents/pharaoh.plan.agent.md index db2b52f..027ff2b 100644 --- a/.github/agents/pharaoh.plan.agent.md +++ b/.github/agents/pharaoh.plan.agent.md @@ -4,6 +4,12 @@ handoffs: - label: Start Change Analysis agent: pharaoh.change prompt: Analyze the impact of the planned changes + - label: Author the planned needs + agent: pharaoh.author + prompt: Author the needs identified in this plan, one per task + - label: Verify the authored needs + agent: pharaoh.verify + prompt: Verify the authored needs against their parents and review axes --- # @pharaoh.plan diff --git a/.github/agents/pharaoh.req-draft.agent.md b/.github/agents/pharaoh.req-draft.agent.md index 1e34763..6021bfd 100644 --- a/.github/agents/pharaoh.req-draft.agent.md +++ b/.github/agents/pharaoh.req-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting a single sphinx-needs requirement from a feature description. Produces a new RST directive block with ID, status=draft, and a single shall-clause body, linking to a parent requirement or workflow per the project's artefact-catalog. +description: Use when drafting a single sphinx-needs requirement-shaped artefact (req, comp_req, sysreq, swreq, hazard, safety_goal, fsr, etc.) from a feature description. The artefact type is parameterised via target_level (any catalog-declared requirement-shaped type, including ISO 26262 safety-V types). Produces a new RST directive block with ID, status=draft, and either a shall-clause body or a hazard/goal-shaped body, linking to a parent per the project's artefact-catalog. handoffs: [] --- # @pharaoh.req-draft -Use when drafting a single sphinx-needs requirement from a feature description. Produces a new RST directive block with ID, status=draft, and a single shall-clause body, linking to a parent requirement or workflow per the project's artefact-catalog. 
+Use when drafting a single sphinx-needs requirement-shaped artefact (req, comp_req, sysreq, swreq, hazard, safety_goal, fsr, etc.) from a feature description. The artefact type is parameterised via target_level (any catalog-declared requirement-shaped type, including ISO 26262 safety-V types). Produces a new RST directive block with ID, status=draft, and either a shall-clause body or a hazard/goal-shaped body, linking to a parent per the project's artefact-catalog. See [`skills/pharaoh-req-draft/SKILL.md`](../../skills/pharaoh-req-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. diff --git a/.github/agents/pharaoh.setup.agent.md b/.github/agents/pharaoh.setup.agent.md index e72a89e..c0b857c 100644 --- a/.github/agents/pharaoh.setup.agent.md +++ b/.github/agents/pharaoh.setup.agent.md @@ -1,5 +1,5 @@ --- -description: Use when setting up Pharaoh in a sphinx-needs project for the first time, scaffolding Copilot agents, or reconfiguring project detection +description: Use when setting up Pharaoh in a sphinx-needs project for the first time, scaffolding Copilot agents, or reconfiguring project detection. Reads project state (declared types, fields, links, observed RST IDs and statuses) before imposing Pharaoh-internal defaults. handoffs: - label: Run MECE Check agent: pharaoh.mece @@ -11,7 +11,7 @@ handoffs: # @pharaoh.setup -Scaffold Pharaoh into a sphinx-needs project. Detect the project structure, generate a `pharaoh.toml` configuration file, and recommend tooling for the best experience. +Scaffold Pharaoh into a sphinx-needs project. Detect the project structure, read its declared conventions and existing artefacts, generate a `pharaoh.toml` configuration file, seed `.pharaoh/project/` tailoring descriptively from observation, and recommend tooling for the best experience. 
## Data Access @@ -57,13 +57,19 @@ Data access: ubCode MCP: <available | not available> ``` -### Step 2: Generate pharaoh.toml +### Step 2: Generate pharaoh.toml and seed .pharaoh/project/ descriptively 1. Ask the user for strictness preference: `advisory` (default, suggests but never blocks) or `enforcing` (checks prerequisites, blocks if not met). -2. Analyze existing need IDs to detect the ID pattern (e.g., `{TYPE}_{NUMBER}` or `{TYPE}-{MODULE}-{NUMBER}`). -3. Build `required_links` from detected extra link types and their usage. -4. Check if `pharaoh.toml` already exists. If so, show a diff and ask what to do. -5. Present the generated content and get confirmation before writing. +2. **Classify project mode from declared types and RST content** — not from `needs.json` existence (a gitignored build artefact). `[[needs.types]]` declared + RST has needs with ≥10% in matured statuses → `steady-state`; types declared + RST has needs → `reverse-eng`; types declared + no needs → `greenfield`. +3. **Detect the ID pattern from up to 20 sampled IDs** in `<source-dir>/**/*.rst`. Recognise `{TYPE}_{NUMBER}`, `{TYPE}-{MODULE}-{NUMBER}`, and `{DOMAIN}_{NUMBER}` (leading token is not a declared type prefix — e.g. `BRAKE_CTRL_01`). Reject the heuristic `{TYPE}_{NUMBER}` default when observed IDs do not conform. +4. **Read `[needs.fields.X]` and `[needs.links.X]` from `ubproject.toml`** to populate `optional_fields`, `required_metadata_fields`, `required_links`, `optional_links`, `required_roles` per declared type for `.pharaoh/project/artefact-catalog.yaml`. Fall back to Pharaoh-internal defaults (`reviewer`, `approved_by`, `source_doc`) only when the project declares no fields. +5. **Compute `lifecycle_states` from a status histogram** of `:status:` values in existing RST files. Fall back to `[draft, reviewed, approved]` only when no `:status:` is observed anywhere. +6. 
**Detect ID-prefix collisions in `[[needs.types]]`.** In `advisory` strictness, WARN with a remediation hint and proceed; in `enforcing`, FAIL and refuse to write `id-conventions.yaml`. +7. **Direction-infer `required_links` from edges**, not link names. Apply the rule uniformly to every link option declared in `[needs.links.<name>]` (no per-name allow-list). Source 1 (built `needs.json` ≥3 instances + ≥90% coverage) → Source 2 (declared `from`/`to` hint) → Source 3 (refuse to guess; emit TODO comment). +8. Check if `pharaoh.toml` already exists. If so, show a diff and ask what to do. +9. Present the generated content and get confirmation before writing. + +After writing `pharaoh.toml`, invoke `pharaoh-tailor-bootstrap` with the descriptive overrides from steps 4-7 above so `.pharaoh/project/{workflows,id-conventions,artefact-catalog}.yaml` capture what the project declares and uses, not Pharaoh-internal placeholders. ### Step 3: Configure .gitignore @@ -106,8 +112,13 @@ Present everything configured and list available agents: ``` Available agents (GitHub Copilot): - @pharaoh.setup @pharaoh.change @pharaoh.trace @pharaoh.mece - @pharaoh.author @pharaoh.verify @pharaoh.release @pharaoh.plan + <enumerate from the `.github/agents/pharaoh.*.agent.md` files installed + in this project, one entry per file in alphabetical order, formatted as + @pharaoh.<name>. 
Do not hardcode this list — the skill set has grown + beyond the original happy-path agents to include atomic skills like + pharaoh.req-draft, pharaoh.req-review, pharaoh.arch-draft, + pharaoh.tailor-detect, pharaoh.tailor-fill, pharaoh.audit-fanout, and + others.> Workflow: @pharaoh.change -> @pharaoh.author -> @pharaoh.verify -> @pharaoh.release ``` diff --git a/.github/agents/pharaoh.vplan-draft.agent.md b/.github/agents/pharaoh.vplan-draft.agent.md index 037c33c..24c2d1f 100644 --- a/.github/agents/pharaoh.vplan-draft.agent.md +++ b/.github/agents/pharaoh.vplan-draft.agent.md @@ -1,10 +1,10 @@ --- -description: Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. Emits an RST tc__ directive with inputs, steps, and expected outcome, linking to the parent req via :verifies:. +description: Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. The artefact type is parameterised via `target_level` (any catalog-declared verification-plan / test-case type — e.g. `tc`, `test`, `vplan`). Emits an RST directive with inputs, steps, and expected outcome, linking to the parent req via `:verifies:`. handoffs: [] --- # @pharaoh.vplan-draft -Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. Emits an RST tc__ directive with inputs, steps, and expected outcome, linking to the parent req via :verifies:. +Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. The artefact type is parameterised via `target_level` (any catalog-declared verification-plan / test-case type — e.g. `tc`, `test`, `vplan`). Emits an RST directive with inputs, steps, and expected outcome, linking to the parent req via `:verifies:`. See [`skills/pharaoh-vplan-draft/SKILL.md`](../../skills/pharaoh-vplan-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
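The project-mode rule and the status histogram from Step 2 of the `pharaoh.setup` agent can be illustrated with a few lines of Python. This is only a sketch under stated assumptions, not the skill's actual code. The function names, the regular expressions, and the default `matured` status set are hypothetical, and returning `None` when no types are declared is a choice made here because the quoted rule does not cover that case.

```python
import re
from collections import Counter

# Matches an RST option line such as "   :status: open".
STATUS_RE = re.compile(r"^[ \t]*:status:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def status_histogram(rst_texts):
    """Count every :status: value observed across the given RST sources."""
    counts = Counter()
    for text in rst_texts:
        counts.update(STATUS_RE.findall(text))
    return counts


def count_needs(rst_texts, declared_types):
    """Count directives whose name is one of the declared need types."""
    names = "|".join(re.escape(name) for name in sorted(declared_types))
    directive_re = re.compile(r"^[ \t]*\.\.[ \t]+(?:%s)::" % names, re.MULTILINE)
    return sum(len(directive_re.findall(text)) for text in rst_texts)


def classify_mode(declared_types, rst_texts, matured=frozenset({"reviewed", "approved"})):
    """Apply the three-way rule: steady-state, reverse-eng, or greenfield.

    `matured` is an assumed set of statuses; the agent text does not
    spell out which statuses count as matured.
    """
    if not declared_types:
        return None  # not covered by the rule as quoted
    total = count_needs(rst_texts, declared_types)
    if total == 0:
        return "greenfield"
    histogram = status_histogram(rst_texts)
    matured_count = sum(n for status, n in histogram.items() if status in matured)
    if matured_count / total >= 0.10:
        return "steady-state"
    return "reverse-eng"


# Hypothetical RST snippets; the IDs and titles are illustrative only.
SAMPLE = [
    ".. req:: Lane detection\n   :id: REQ_001\n   :status: open\n\n   Body.\n",
    ".. test:: Lane test\n   :id: T_001\n   :status: open\n",
]

mode = classify_mode({"req", "test"}, SAMPLE)
print(mode, dict(status_histogram(SAMPLE)))
```

With only `open` statuses in the sample, nothing counts as matured under the assumed set, so the sketch lands on `reverse-eng`, the same mode the first commit message reports for this project.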
From 8859553131855a2b481317f27394b77473a94967 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 18:47:04 +0200 Subject: [PATCH 14/15] Tighten walkthrough nudges for Pharaoh v1.2 retest results MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit End-to-end claude -p sonnet retest against pharaoh@v1.2.0 surfaced two new behaviours and confirmed one prior workaround still needed: * Step 2 (req-from-code): one-req guard now works — single req emitted instead of three swreqs. Skill still emits Pharaoh-internal :source_doc: and :verification: placeholders even though the demo project does not declare them. Tighten the prompt to forbid those placeholders explicitly and document the strip-before-paste fallback. * Step 4 (change): pharaoh-change in v1.2 writes a session-state acknowledgement gate and stops at "acknowledge this change?" instead of emitting the impact report. Add "emit directly; do not stop at acknowledgement" to the prompt body to bypass the gate. The Why-so-short section now lists all three skill-drift behaviours covered by the nudges (req-from-code over-emit + placeholder fields, vplan-draft self-review, change acknowledgement gate). Steps 1 (mece) and 3 (vplan-draft) work unchanged. P3 in particular benefits from the v1.2 fix that drops the hardcoded tc__ prefix. --- docs/pharaoh.rst | 42 ++++++++++++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 14 deletions(-) diff --git a/docs/pharaoh.rst b/docs/pharaoh.rst index be3783b..261ca02 100644 --- a/docs/pharaoh.rst +++ b/docs/pharaoh.rst @@ -238,14 +238,17 @@ Step 2: reverse-engineer a focused requirement from code @pharaoh.req-from-code src/automotive_adas.py — focused API contract for the LaneDetection - class as a child of REQ_001. Emit a single `req` directive using - only declared fields and link options. + class as a child of REQ_001. Emit a single `req` directive. 
Use only + fields declared in ubproject.toml; do not emit :source_doc: or + :verification: unless the project's catalogue declares them. Expected: a ``.. req::`` block with a short domain-shaped ID, status ``open``, ``:links: REQ_001``, and a single shall-clause grounded in the ``LaneDetection`` class's public methods. The skill follows the -project's id-conventions and uses ``:links:`` rather than the -Pharaoh-internal ``:verification:`` field. +project's id-conventions and uses only declared fields. If the +emitted RST still carries Pharaoh-internal placeholders, strip them +before pasting under ``docs/automotive-adas/`` or add the fields to +``ubproject.toml``. Step 3: write the missing test ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -271,26 +274,37 @@ Step 4: change-impact analysis @pharaoh.change REQ_001 is being revised: tighter lateral tolerance, ECU port. - Scope to the lane-detection domain. + Scope to the lane-detection domain. Emit the full change-impact + report directly; do not stop at an acknowledgement prompt. Expected: a tight blast radius of about 6 to 8 needs — one architecture element, three software requirements, two system tests, the release window, and any newly authored child needs from Steps 2 -and 3. +and 3. The ``emit directly`` nudge bypasses the acknowledgement +workflow gate added in Pharaoh v1.2. Why so short ^^^^^^^^^^^^ The agent reads the project's tailoring (``.pharaoh/project/``) and ``ubproject.toml`` itself, so the user does not have to repeat the -declared types, link options, ID regex, or status enum. The two short -nudges in Steps 2 and 3 (``single req directive``, ``emit only the -RST``) keep specific Pharaoh atomic skills from drifting into adjacent -behaviour: the canonical ``pharaoh-req-from-code`` skill is happy to -emit several SW-level requirements when one feature-level requirement -was asked for, and ``pharaoh-vplan-draft`` is happy to follow itself -with a self-review. Both behaviours are tracked as upstream Pharaoh -issues. 
+declared types, link options, ID regex, or status enum. The short +nudges in Steps 2, 3, and 4 keep specific Pharaoh atomic skills from +drifting into adjacent behaviour: + +* ``pharaoh-req-from-code`` happily emits several SW-level + requirements when one feature-level requirement was asked for, and + emits Pharaoh-internal placeholder fields (``:source_doc:``, + ``:verification:``) regardless of whether the project declares + them. +* ``pharaoh-vplan-draft`` happily follows its draft with a + self-review pass and returns the review instead of the test. +* ``pharaoh-change`` (since Pharaoh v1.2) writes a session-state + acknowledgement gate and waits for the user to approve before + emitting the full impact report. + +All three behaviours are tracked as upstream Pharaoh issues. The +nudges are workshop-grade workarounds, not project-specific tailoring. Sanity-check the artefacts ^^^^^^^^^^^^^^^^^^^^^^^^^^ From 4b3d790b0d9a227e51ae2699b207a066bc87c1f6 Mon Sep 17 00:00:00 2001 From: Bartosz Burda <bartoszburda93@gmail.com> Date: Tue, 5 May 2026 21:43:23 +0200 Subject: [PATCH 15/15] Inline SKILL.md content into Copilot agent files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Each agent.md shipped under .github/agents/ previously ended with a relative `See [skills/pharaoh-X/SKILL.md](../../skills/pharaoh-X/SKILL.md)` link. The relative path resolves correctly inside the upstream Pharaoh repo (where dogfooding happens) but is a dead link in every user project, because the `skills/` directory only exists in the plugin source. 65 broken local refs across 71 agent files. More importantly: the LLM that handles the @pharaoh.X invocation in Copilot Chat sees only the agent.md content, never the SKILL.md the file points at. Most atomic agents shipped as 10-line stubs (10 lines agent.md vs ~300-500 lines SKILL.md), so the LLM had to improvise the skill's procedure from background knowledge. 
That is the root cause of the v1.2 retest findings (req-from-code emits Pharaoh-internal placeholders not in the project's tailoring; vplan-draft self-reviews when asked to draft). This commit appends the full SKILL.md content to every agent.md (stripping the SKILL.md frontmatter), and rewrites every reference to `skills/shared/X` and `../shared/X` to absolute URLs pinned to useblocks/pharaoh@v1.2.0 so the inlined content stays clickable. The broken `(../../skills/pharaoh-X/SKILL.md)` trailer is removed in the process. Result: * 73 of 73 agent files self-contained — no further file fetches required for an LLM to execute the skill correctly. * 0 broken local markdown refs (verified by parser). * 41 `../shared/X` and 24 `skills/shared/X` references rewritten to pinned absolute URLs. * Diff is +17367 / -65 lines, mostly the inlined skill specs. * sphinx-build -W and ubc check both still pass. Tracked upstream as `useblocks/pharaoh#18` (the workflow-gate + incomplete-#13 follow-up). The proper structural fix is a VS Code extension that registers @pharaoh.X chat participants from the plugin install dir; this in-PR inline is the workshop-grade workaround. 
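For reviewers who want to reproduce the transformation, the inlining step described above could look roughly like the sketch below. It is not the script used for this commit. The regular expressions, the helper names, the sample file contents, and the exact GitHub URL layout behind "pinned to useblocks/pharaoh@v1.2.0" are all assumptions, and the sketch only rewrites references that appear as Markdown link targets.

```python
import re

# Assumed URL layout for a tag-pinned file; the commit message only states
# that links are pinned to useblocks/pharaoh@v1.2.0.
PINNED_BASE = "https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/"

# Leading YAML frontmatter block delimited by "---" lines.
FRONTMATTER_RE = re.compile(r"\A---\n.*?\n---\n", re.DOTALL)
# The "See [...](../../skills/pharaoh-X/SKILL.md) ..." trailer line.
TRAILER_RE = re.compile(
    r"^See \[[^\]]*\]\(\.\./\.\./skills/[^)]*/SKILL\.md\).*\n?", re.MULTILINE
)
# Markdown link targets of the form (../shared/X) or (skills/shared/X).
SHARED_REF_RE = re.compile(r"\((?:\.\./|skills/)shared/([^)\s]+)\)")


def strip_frontmatter(text):
    """Drop the frontmatter block if the text starts with one."""
    return FRONTMATTER_RE.sub("", text, count=1)


def pin_shared_links(text):
    """Rewrite relative shared-file links to the pinned absolute URL."""
    return SHARED_REF_RE.sub(lambda m: "(" + PINNED_BASE + m.group(1) + ")", text)


def inline_skill(agent_md, skill_md):
    """Remove the dead trailer from the agent stub and append the skill body."""
    stub = TRAILER_RE.sub("", agent_md).rstrip("\n")
    body = pin_shared_links(strip_frontmatter(skill_md)).strip("\n")
    return stub + "\n\n" + body + "\n"


# Hypothetical inputs; "pharaoh-demo" is not a real skill name.
AGENT_STUB = (
    "---\ndescription: demo\nhandoffs: []\n---\n"
    "# @pharaoh.demo\n\nShort stub.\n\n"
    "See [`skills/pharaoh-demo/SKILL.md`](../../skills/pharaoh-demo/SKILL.md)"
    " for the full atomic specification.\n"
)
SKILL_MD = (
    "---\nname: pharaoh-demo\n---\n"
    "## Inputs\n\n"
    "Read [the schema](../shared/schema.md) and [the rules](skills/shared/rules.md).\n"
)

result = inline_skill(AGENT_STUB, SKILL_MD)
print(result)
```

The agent file keeps its own frontmatter, since that is where the `description` and `handoffs` live, while the frontmatter of the skill file is dropped before appending.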
--- .../pharaoh.activity-diagram-draft.agent.md | 85 +- .../pharaoh.api-coverage-check.agent.md | 175 +++- .github/agents/pharaoh.arch-draft.agent.md | 301 +++++- .github/agents/pharaoh.arch-review.agent.md | 319 +++++- .github/agents/pharaoh.audit-fanout.agent.md | 78 +- .github/agents/pharaoh.author.agent.md | 306 +++++- .../pharaoh.block-diagram-draft.agent.md | 99 +- .github/agents/pharaoh.bootstrap.agent.md | 464 ++++++++- .github/agents/pharaoh.change.agent.md | 497 ++++++++++ .../pharaoh.class-diagram-draft.agent.md | 106 +- .../pharaoh.component-diagram-draft.agent.md | 99 +- .../agents/pharaoh.context-gather.agent.md | 106 +- .github/agents/pharaoh.coverage-gap.agent.md | 379 ++++++- .github/agents/pharaoh.decide.agent.md | 346 +++++++ .../agents/pharaoh.decision-record.agent.md | 114 ++- .../agents/pharaoh.decision-review.agent.md | 45 +- .../pharaoh.deployment-diagram-draft.agent.md | 82 +- .github/agents/pharaoh.diagram-lint.agent.md | 184 +++- .../agents/pharaoh.diagram-review.agent.md | 76 +- .../pharaoh.dispatch-signal-check.agent.md | 66 +- .github/agents/pharaoh.execute-plan.agent.md | 283 +++++- .../pharaoh.fault-tree-diagram-draft.agent.md | 97 +- .github/agents/pharaoh.feat-balance.agent.md | 111 ++- .../pharaoh.feat-component-extract.agent.md | 156 ++- .../pharaoh.feat-draft-from-docs.agent.md | 187 +++- .github/agents/pharaoh.feat-file-map.agent.md | 172 +++- .../agents/pharaoh.feat-flow-extract.agent.md | 139 ++- .github/agents/pharaoh.feat-review.agent.md | 55 +- .../agents/pharaoh.finding-record.agent.md | 83 +- .github/agents/pharaoh.flow.agent.md | 514 +++++++++- .github/agents/pharaoh.fmea-review.agent.md | 50 +- .github/agents/pharaoh.fmea.agent.md | 328 ++++++- .github/agents/pharaoh.gate-advisor.agent.md | 161 ++- .github/agents/pharaoh.id-allocate.agent.md | 78 +- .../pharaoh.id-convention-check.agent.md | 135 ++- .../agents/pharaoh.lifecycle-check.agent.md | 252 ++++- .../pharaoh.link-completeness-check.agent.md | 139 ++- 
.github/agents/pharaoh.mece.agent.md | 506 ++++++++++ .../agents/pharaoh.output-validate.agent.md | 233 ++++- .../pharaoh.papyrus-non-empty-check.agent.md | 67 +- .github/agents/pharaoh.plan.agent.md | 366 +++++++ .github/agents/pharaoh.process-audit.agent.md | 324 +++++- .github/agents/pharaoh.prose-migrate.agent.md | 126 ++- .github/agents/pharaoh.quality-gate.agent.md | 213 +++- .github/agents/pharaoh.release.agent.md | 568 +++++++++++ .../pharaoh.reproducibility-check.agent.md | 213 +++- .../pharaoh.req-code-grounding-check.agent.md | 235 ++++- .../pharaoh.req-codelink-annotate.agent.md | 323 +++++- .github/agents/pharaoh.req-draft.agent.md | 525 +++++++++- .github/agents/pharaoh.req-from-code.agent.md | 293 +++++- .../agents/pharaoh.req-regenerate.agent.md | 315 +++++- .github/agents/pharaoh.req-review.agent.md | 335 ++++++- .../pharaoh.review-completeness.agent.md | 73 +- ...haraoh.self-review-coverage-check.agent.md | 76 +- .../pharaoh.sequence-diagram-draft.agent.md | 99 +- .github/agents/pharaoh.setup.agent.md | 927 ++++++++++++++++++ .github/agents/pharaoh.spec.agent.md | 523 ++++++++++ .../pharaoh.sphinx-extension-add.agent.md | 156 ++- .../pharaoh.standard-conformance.agent.md | 349 ++++++- .../pharaoh.state-diagram-draft.agent.md | 95 +- .../pharaoh.status-lifecycle-check.agent.md | 113 ++- .../agents/pharaoh.tailor-bootstrap.agent.md | 181 +++- ...aoh.tailor-code-grounding-filters.agent.md | 187 +++- .github/agents/pharaoh.tailor-detect.agent.md | 325 +++++- .github/agents/pharaoh.tailor-fill.agent.md | 427 +++++++- .github/agents/pharaoh.tailor-review.agent.md | 408 +++++++- .github/agents/pharaoh.toctree-emit.agent.md | 112 ++- .github/agents/pharaoh.trace.agent.md | 413 ++++++++ .../pharaoh.use-case-diagram-draft.agent.md | 118 ++- .github/agents/pharaoh.verify.agent.md | 328 ++++++- .github/agents/pharaoh.vplan-draft.agent.md | 413 +++++++- .github/agents/pharaoh.vplan-review.agent.md | 360 ++++++- .github/agents/pharaoh.write-plan.agent.md | 240 
++++- 73 files changed, 17367 insertions(+), 65 deletions(-) diff --git a/.github/agents/pharaoh.activity-diagram-draft.agent.md b/.github/agents/pharaoh.activity-diagram-draft.agent.md index 36bba5e..8262191 100644 --- a/.github/agents/pharaoh.activity-diagram-draft.agent.md +++ b/.github/agents/pharaoh.activity-diagram-draft.agent.md @@ -7,4 +7,87 @@ handoffs: [] Use when drafting one activity diagram showing control flow (actions, decisions, forks/joins, swimlanes) for one procedure or algorithm. Typical ASPICE usage — SWE.3 Software Detailed Design. Renderer tailored via `pharaoh.toml`. Does NOT emit other diagram kinds. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-activity-diagram-draft/SKILL.md`](../../skills/pharaoh-activity-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-activity-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-activity-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.activity]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one activity diagram. 
Captures **control flow** within a single procedure: sequential actions, branching (decisions), forking (parallel activities), joining, and swimlanes (partitions) showing which actor/component performs which action. + +Typical ASPICE context: +- **SWE.3 Software Detailed Design**: algorithm breakdown per function. +- **SYS.3 System Architectural Design**: activity within a subsystem. + +Does NOT capture ordered inter-component message exchange (→ `pharaoh-sequence-diagram-draft`). Does NOT capture state lifecycle (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One procedure in → one diagram out. +- (b) Input: `{view_title: str, actions: list[ActionSpec], decisions: list[DecisionSpec], forks: list[ForkSpec], edges: list[EdgeSpec], initial: str, finals: list[str], swimlanes?: list[SwimlaneSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ActionSpec = {id: str, label: str, swimlane?: str}`, `DecisionSpec = {id: str, label: str, swimlane?: str}`, `ForkSpec = {id: str, kind: "fork"|"join", swimlane?: str}`, `EdgeSpec = {from: str, to: str, guard?: str, label?: str}`, `SwimlaneSpec = {id: str, label: str}`. Output: one RST directive block. +- (c) Reward: fixture — procedure `receive_can_frame` with actions [parse, validate, dispatch], one decision (valid?), two finals (accepted, rejected). Scorer: + 1. Output starts with renderer directive. + 2. Exactly one initial marker (`[*]` or equivalent) pointing to `initial`. + 3. Every action/decision/fork id appears as a node. + 4. Every edge renders with renderer-specific syntax; guards shown in `[...]`. + 5. Swimlanes (if any) group their members visually (Mermaid: no native swimlane, emit comment + `subgraph`; PlantUML: `|SwimlaneX|` partition). + 6. Every id in `finals` has an outgoing edge to `[*]`. + + Pass = all 6. +- (d) Reusable for any procedural spec: SW detailed design, system workflows, operator procedures. +- (e) One diagram per call. 
+ +## Dangling edges + +FAIL on edge endpoint not in `actions ∪ decisions ∪ forks ∪ {initial}`. An activity diagram with a transition to an undeclared action is an incomplete procedure. + +## Output + +**PlantUML (preferred for swimlane support):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + |Driver| + start + :parse CAN frame; + if (valid?) then (yes) + |Dispatcher| + :dispatch to handler; + stop + else (no) + |Driver| + :log error; + end + endif + @enduml +``` + +**Mermaid (limited — no native swimlanes):** +```rst +.. mermaid:: + :caption: <view_title> + + flowchart TD + Start([Start]) --> Parse[parse CAN frame] + Parse --> Valid{valid?} + Valid -->|yes| Dispatch[dispatch to handler] + Valid -->|no| Log[log error] + Dispatch --> End([End]) + Log --> End +``` + +## Non-goals + +- No pin/action-parameter visualization — out of scope. +- No code-to-activity extraction — caller provides the structure; a future `pharaoh-activity-from-cfg` could infer from control-flow graphs. +- No interrupt/exception flows — model those as explicit edges if needed. diff --git a/.github/agents/pharaoh.api-coverage-check.agent.md b/.github/agents/pharaoh.api-coverage-check.agent.md index 0fac840..86bb742 100644 --- a/.github/agents/pharaoh.api-coverage-check.agent.md +++ b/.github/agents/pharaoh.api-coverage-check.agent.md @@ -7,4 +7,177 @@ handoffs: [] Use when verifying that a source file is covered by the need catalogue on two axes — (1) at least one CREQ declares the file as its `:source_doc:`, and (2) every project-defined exception class raised in the file is named by some CREQ's title or content. Exception classes not defined in the project source tree (stdlib, third-party deps) are reported as `external` and do not fail the axis. Classifies non-behavioral files (constants, type aliases, bare re-exports) as skipped. Language-parametric via the shared regex table in `skills/shared/public-symbol-patterns.md` (python / rust / typescript / go / c / cpp / java). 
Single mechanical structural check. -See [`skills/pharaoh-api-coverage-check/SKILL.md`](../../skills/pharaoh-api-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-api-coverage-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant `api_coverage_clean`), from a pre-release CI job, or standalone when auditing whether the need catalogue has kept pace with the code. Reads one source file plus one `needs.json` and emits a binary verdict on two axes — file-level citation AND raise-site coverage. Non-behavioral files (constants, type aliases, bare re-exports) are skipped so they never fail the gate. + +This is the reverse coverage direction of `pharaoh-req-from-code`. The forward direction answers "for this file, which reqs should be drafted?". The reverse direction answers "does the catalogue acknowledge this file's existence AND every exception it raises?". Missing coverage here points at CREQs that were never authored, not at CREQs that are poorly written. + +Do NOT use to grade need prose quality — that is `pharaoh-req-review`. Do NOT use to verify that a CREQ's claims about the file are accurate — that is `pharaoh-req-code-grounding-check` (the forward fidelity check). Do NOT use to author or modify reqs (read-only). Python classification runs on an AST; other languages use regex approximations consistent with the shared public-symbol-patterns table. + +## Atomicity + +- (a) Indivisible: one source file + one `needs.json` in → one findings JSON out. No req drafting, no set-level analysis, no dispatch of other skills. +- (b) Input: `{source_file: str, needs_json_path: str, project_root: str | null, language: "auto" | "python" | "rust" | "typescript" | "go" | "c" | "cpp" | "java"}`. Output: findings JSON per the shape in `## Output` below. 
+- (c) Reward: fixtures under `skills/pharaoh-api-coverage-check/fixtures/` cover each verdict class and each supported language. Pass = each fixture's actual output matches `expected-output.json` modulo ordering of the `covered`, `uncovered`, and `external` arrays under `raise_site_coverage` (sorted ascending in the emitted output) and of `file_coverage.citing_creqs` (sorted ascending). +- (d) Reusable across projects — the language regex table is read from [`skills/shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md), not inlined. No project-specific symbol names baked in. +- (e) Read-only. Does not modify the source file, `needs.json`, or any on-disk state. Running twice on identical inputs yields byte-identical output. + +## Input + +- `source_file`: absolute path to the source file under audit, OR a path relative to `project_root`. Extension is used for language inference when `language=auto`. +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {<id>: {...}}}` shape or the versioned `{"versions": {"<v>": {"needs": {...}}}}` shape. Each need dict must carry at least `id`, `title`, and `content` (or a synonymous body field); the `source_doc` option is what the file-coverage axis reads. +- `project_root`: optional absolute path. Used for three things: (1) resolve `source_file` when relative, (2) resolve a need's `:source_doc:` value when relative, (3) scope the project-definition scan that distinguishes project-defined exception classes from external (stdlib / third-party) ones in raise-site coverage. When omitted, relative paths are resolved against the current working directory and the project-definition scan is skipped — every raised class is then treated as project-defined. +- `language`: one of `"auto"`, `"python"`, `"rust"`, `"typescript"`, `"go"`, `"c"`, `"cpp"`, `"java"`. Default `"auto"`. 
When `"auto"`, the skill resolves the language from the source-file extension via the globs column of [`skills/shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md). When an explicit language is given, extension is ignored — this is the dogfood escape hatch for literate source (e.g. a `.txt` with Python snippets). + +Edge cases: +- `source_file` missing or unreadable → `overall: "fail"`, blocker `"source_file unresolved: <path>"`. +- `needs_json_path` missing or unparseable → `overall: "fail"`, blocker `"needs.json unresolved: <path>"`. +- `language="auto"` and extension matches no row in the shared table → `overall: "fail"`, blocker `"unsupported language: <ext>"`, `language: "unknown"`. + +## Output + +```json +{ + "source_file": "/abs/path/src/module/client.py", + "language": "python", + "classification": "behavioral", + "file_coverage": { + "passed": true, + "citing_creqs": ["CREQ_inventory_client", "CREQ_inventory_load"] + }, + "raise_site_coverage": { + "total": 2, + "project_defined": 1, + "covered": ["InventoryError"], + "uncovered": [], + "external": ["ValueError"], + "passed": true + }, + "overall": "pass", + "blockers": [] +} +``` + +Fields (in canonical order): +- `source_file`: echo of the input path (as supplied — absolute or `project_root`-relative). +- `language`: resolved language string (one of the seven supported names) or `"unknown"` on unsupported-language failure. +- `classification`: `"behavioral"` or `"non-behavioral"`. +- `file_coverage.passed`: `true` iff ≥1 CREQ in the catalogue has `:source_doc:` resolving to this file. `null` when `classification == "non-behavioral"`. +- `file_coverage.citing_creqs`: list of CREQ IDs whose `:source_doc:` resolves to this file, sorted ascending. Empty list when no CREQ cites the file; still emitted for `non-behavioral` classification (diagnostic). 
+- `raise_site_coverage.total`: count of distinct exception class names extracted from `raise` / `throw` sites in the file. +- `raise_site_coverage.project_defined`: count of those names that resolve to a class / struct / enum defined somewhere under `project_root` for the file's language (see `## Project-definition scan`). `external` = `total - project_defined`. +- `raise_site_coverage.covered`: list of project-defined raised class names that appear (case-sensitive substring) in the title or content of some CREQ anywhere in the catalogue (not scoped to citing CREQs), sorted ascending. +- `raise_site_coverage.uncovered`: list of project-defined raised class names absent from every CREQ's title and content, sorted ascending. +- `raise_site_coverage.external`: list of raised class names that do not resolve to any project-local class definition — stdlib exceptions, third-party dep types. Diagnostic only; does not contribute to the pass/fail decision. Sorted ascending. +- `raise_site_coverage.passed`: `true` iff `uncovered` is empty. `null` when `classification == "non-behavioral"`. +- `overall`: + - `"pass"` — `classification == "behavioral"` AND `file_coverage.passed` AND `raise_site_coverage.passed`. + - `"fail"` — `classification == "behavioral"` AND either sub-axis is false. + - `"skipped"` — `classification == "non-behavioral"`. +- `blockers`: list of blocker strings (input errors — unreadable source, unreadable needs.json, unsupported language). Always present; empty list on pass / skipped / clean fail. + +On input errors the shape still carries every field. `classification` is `"non-behavioral"`, `file_coverage.citing_creqs` is `[]`, `raise_site_coverage` carries `total: 0, project_defined: 0, covered: [], uncovered: [], external: []` with `passed: null`, and `blockers` is populated with the error strings. 
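
As an illustration of how the `overall` field follows from the two sub-axes, here is a minimal Python sketch. It is not code shipped with the skill: the function name `overall_verdict` and its parameter names are hypothetical. It assumes that a non-empty `blockers` list always forces `"fail"`, which is how the edge cases under `## Input` read.

```python
from typing import List, Optional


def overall_verdict(
    classification: str,
    file_passed: Optional[bool],
    raise_passed: Optional[bool],
    blockers: List[str],
) -> str:
    """Combine classification and the two sub-axes into `overall`.

    Hypothetical helper; the name and signature are illustrative only.
    """
    # Input errors (unreadable source, unreadable needs.json, unsupported
    # language) fail the check even though `classification` is then
    # reported as "non-behavioral".
    if blockers:
        return "fail"
    # Non-behavioral files never fail the gate; both sub-axes are null.
    if classification == "non-behavioral":
        return "skipped"
    # Behavioral files pass only when both sub-axes pass.
    return "pass" if (file_passed and raise_passed) else "fail"
```

A caller such as a quality gate would then accept any result in `{"pass", "skipped"}`.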
+ +## Path resolution + +- `source_file` resolution: absolute path used verbatim; relative path joined with `project_root` (or CWD if unset), then normalised via `os.path.normpath` (resolve `./` and `../`, collapse double slashes). Comparison is case-sensitive on POSIX. On Windows drive-letter paths the drive letter and path separators are normalised case-insensitively. +- `source_doc` resolution per need: the need's `source_doc` option value is resolved the same way (absolute verbatim, relative joined with `project_root`). A need cites the file iff its resolved `source_doc` equals the resolved `source_file`. + +## Process + +### Step 1: Resolve the language + +If `language == "auto"`, read the globs column of [`skills/shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md) and find the first row whose glob list contains the source file's extension. If no row matches, emit an error output with `language: "unknown"`, `classification: "non-behavioral"`, `overall: "fail"`, `blockers: ["unsupported language: <ext>"]`, and stop. If an explicit language is given, use it verbatim and skip extension resolution. + +### Step 2: Classify the file + +A file is `behavioral` iff ANY of the following holds: + +1. **Non-trivial function body**: ≥1 function / async function / method whose body contains more than 2 top-level statements. A body of `pass`, `...`, a single return, a single expression, or a single delegation call does not qualify. Docstrings are statements but do not count (strip them before measuring length). +2. **Exception surface**: ≥1 `raise X(...)` statement anywhere in the file. For languages whose exception syntax is `throw`, the equivalent `throw X(...)` / `throw new X(...)` counts. +3. **Method-rich class**: ≥1 class whose body contains ≥2 method definitions (public or private — the count is structural, not visibility-scoped). 
+ +Otherwise the file is `non-behavioral`: constants, type aliases, bare re-exports, empty `__init__.py` forwarders. Emit `classification: "non-behavioral"`, `overall: "skipped"`, both sub-axes with empty content and `passed: null`, and stop. + +**Python**: parse with `ast`. Count `FunctionDef`/`AsyncFunctionDef` body length after stripping the module-level / function-level docstring (the first statement if it is a bare `Expr(Constant(str))`); check for `Raise` nodes anywhere; count `FunctionDef`/`AsyncFunctionDef` children inside each `ClassDef`. + +**Other languages**: regex approximations colocated in the language table below. + +### Step 3: File coverage + +Load `needs.json`, flatten the needs map, and collect every CREQ whose `source_doc` option resolves (per `## Path resolution`) to the input `source_file`. `file_coverage.passed` is `true` iff the collected list is non-empty; `file_coverage.citing_creqs` is the sorted list of their IDs. + +### Step 4: Raise-site coverage + +Extract every exception class name `X` from every `raise X(...)` / `throw X(...)` / `throw new X(...)` site across the file (de-duplicate). + +For each `X`, classify as **project-defined** or **external** per `## Project-definition scan` below. External classes go into the `external` diagnostic list and are excluded from pass/fail evaluation. + +For each project-defined `X`, check whether `X` appears (case-sensitive substring) in the `title` OR `content` (or synonymous body field) of any CREQ anywhere in the catalogue — NOT scoped to citing CREQs. `raise_site_coverage.passed` is `true` iff every project-defined `X` is covered. + +### Step 5: Emit the findings JSON + +Populate every field per the `## Output` shape. Sort `file_coverage.citing_creqs`, `raise_site_coverage.covered`, `raise_site_coverage.uncovered`, and `raise_site_coverage.external` ascending. Set `overall` per the rules in the output-field description. 
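
Step 4 can be sketched for the Python case roughly as follows. This is an assumption-laden illustration, not the skill's implementation: the helper names `raised_class_names` and `raise_site_coverage` are hypothetical, `needs` is assumed to be the already flattened `{id: need}` map, `project_classes` stands in for the result of the project-definition scan, and dotted raises such as `raise errors.Foo(...)` are ignored for brevity.

```python
import ast
from typing import Dict, Set


def raised_class_names(source: str) -> Set[str]:
    """Collect X from every `raise X(...)` site in Python source."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        # Only the call form `raise X(...)` with a bare name is considered.
        if isinstance(node, ast.Raise) and isinstance(node.exc, ast.Call):
            if isinstance(node.exc.func, ast.Name):
                names.add(node.exc.func.id)
    return names


def raise_site_coverage(
    source: str, needs: Dict[str, dict], project_classes: Set[str]
) -> dict:
    """Build the `raise_site_coverage` object for one behavioral file."""
    raised = raised_class_names(source)
    project_defined = {n for n in raised if n in project_classes}

    def is_covered(name: str) -> bool:
        # Case-sensitive substring match over the whole catalogue,
        # not only over the needs that cite this file.
        return any(
            name in (need.get("title") or "") or name in (need.get("content") or "")
            for need in needs.values()
        )

    uncovered = sorted(n for n in project_defined if not is_covered(n))
    return {
        "total": len(raised),
        "project_defined": len(project_defined),
        "covered": sorted(n for n in project_defined if is_covered(n)),
        "uncovered": uncovered,
        # External classes are diagnostic only and never fail the axis.
        "external": sorted(raised - project_defined),
        "passed": not uncovered,
    }
```

Using an AST here rather than a regex is what keeps raise statements inside comments and strings out of the result for Python.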
+ +## Project-definition scan + +When `project_root` is provided, the skill walks the tree and collects every class-like definition name matching the input file's language: + +- Python: `class X` top-level or nested. +- Rust: `struct X`, `enum X`. +- TypeScript: `class X`, `class X extends Y`, `export class X`, `export default class X`, plus `interface X` / `type X = ...` shapes used as error types. +- Go: `type X struct`, `type X interface`. +- Java: `class X`, `interface X`, nested declarations. +- C: `struct X` (C has no exception classes in practice — typically empty set). +- C++: `class X`, `struct X`. + +Only files whose extension matches the language's glob (per the table) are scanned. The regex uses the `public symbol regex` column from [`skills/shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md) (named capture `name`) filtered to class-like kinds. + +A raised class `X` is **project-defined** iff its name appears in the collected set. Otherwise `X` is **external** — stdlib (`ValueError`, `RuntimeError`, `java.lang.IllegalArgumentException`, `std::runtime_error`) or third-party dep types. + +When `project_root` is omitted, the scan is skipped and every raised class is treated as project-defined (the `external` list is empty). This keeps the skill usable in single-file contexts but the external-filter value is only available when `project_root` is supplied. + +## Language table + +| language | extension globs | classifier notes | raise-site regex | +|------------|------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------| +| python | `*.py` | Parse with `ast`. Body length computed after stripping the leading docstring. 
Class methods counted via `FunctionDef`/`AsyncFunctionDef` children of `ClassDef`. | `raise\s+(?P<exc>[A-Z][A-Za-z0-9_]+)\s*\(` | +| rust | `*.rs` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting (trailing expressions count as one statement). Methods in an `impl` block count as class methods. | n/a — Rust uses `Result<T, E>` returns | +| typescript | `*.ts`, `*.tsx` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. Class methods counted inside `class` / `export class` blocks. | `throw\s+(?:new\s+)?(?P<exc>[A-Z][A-Za-z0-9_]+)\s*\(` | +| go | `*.go` | Function-body statement count via brace-delimited body plus `;`-or-newline-terminated statement splitting. Interface methods and struct methods counted toward the method-rich-class rule (Go has no `class` keyword — `type T struct` plus ≥2 methods declared as `func (r *T)` qualifies). | n/a — Go uses `error` return values | +| java | `*.java` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. Methods counted inside `class` / `interface` blocks. | `throw\s+(?:new\s+)?(?P<exc>[A-Z][A-Za-z0-9_]+)\s*\(` | +| c | `*.c`, `*.h` | Function-body statement count via brace-delimited body plus `;`-terminated statement splitting. C has no classes — method-rich-class rule vacuously unsatisfied. | n/a — C uses integer return codes | +| cpp | `*.cpp`, `*.hpp`, `*.cc`, `*.h` | Function-body statement count as in C. Methods counted inside `class` / `struct` blocks. | `throw\s+(?:new\s+)?(?P<exc>[A-Z][A-Za-z0-9_]+)\s*\(` | + +The regex-based classifiers for non-Python languages share the accuracy ceiling of `shared/public-symbol-patterns.md` — known false positives in comments/strings, conservative over-reporting. + +## Detection rule + +One mechanical check, implemented as the five-step process above. No LLM judgement.
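
The `throw` pattern from the raise-site regex column of the language table can be exercised directly with Python's `re` module, which supports the `(?P<exc>...)` named-group syntax. The sketch below is illustrative only: the helper name `thrown_class_names` and the sample source text are hypothetical and not taken from the skill.

```python
import re
from typing import List

# Pattern copied from the typescript / java / cpp rows of the language table.
THROW_SITE = re.compile(r"throw\s+(?:new\s+)?(?P<exc>[A-Z][A-Za-z0-9_]+)\s*\(")


def thrown_class_names(source: str) -> List[str]:
    """Return the de-duplicated, ascending list of thrown class names."""
    return sorted({match.group("exc") for match in THROW_SITE.finditer(source)})


# Hypothetical TypeScript fragment used only to exercise the pattern.
SAMPLE = """
function refresh(token: string) {
  if (!token) throw new RefreshTokenError("missing token");
  try { call(); } catch (err) { throw err; }
  throw new Error("unreachable");
}
"""

names = thrown_class_names(SAMPLE)
```

Because the match is purely textual, a `throw new X(...)` that appears inside a comment or string literal is extracted as well. A rethrow such as `throw err;` is not matched, since the pattern requires a capitalised name followed by an opening parenthesis.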
+ +## Failure modes + +- **Regex false positives inside comments and strings (non-Python).** A `// throw new FooError(...)` line in a block comment at column 0 is extracted as a raise site. Python avoids this because it uses the AST. +- **Case-sensitive substring matching for raise-site coverage.** Deliberate: sphinx-needs ids and class names are case-sensitive in practice, and a CREQ that describes `refreshTokenError` does not cover `RefreshTokenError`. Projects whose naming convention differs between code and prose must normalise at authoring time. +- **Raise-site extraction is shallow.** `raise Exc(...)` / `throw new Exc(...)` literals are detected. A function that raises by calling a helper that raises is not counted here — the question is "does this source file raise this class?", not "does execution of this file ever produce this class?". +- **`source_doc` must be declared.** Needs without the option cannot be collected by step 3, so a file with no citing needs is marked uncovered even if its behavior is described in free-floating prose elsewhere. That is the design — a CREQ that does not declare which file it describes cannot honestly claim to cover that file. +- **Raise-site coverage is catalogue-wide.** A CREQ for `bar.py` that names `InventoryError` in its body does cover the raise of `InventoryError` in `foo.py`. This is intentional: exception types are shared surface that multiple CREQs may reference. Projects that want scoped coverage should narrow via `source_doc` filters downstream. + +## Tailoring extension point + +None. The language regex table is shared ([`skills/shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md)) — a project that needs a new language supports it by adding a row there plus a corresponding entry in this skill's language table, which benefits both `pharaoh-req-from-code` and this skill. + +## Composition + +Role: `atom-check`. 
+ +Called from `pharaoh-quality-gate` under the invariant key `api_coverage_clean` (pass requirement: `overall ∈ {"pass", "skipped"}`). Also callable standalone from any CI job that already knows which source files and which `needs.json` to feed it. Never dispatches other skills. Never modifies the source file or the need catalogue. + +Complements `pharaoh-req-code-grounding-check`: that skill runs the forward direction (does the CREQ's cited exception actually get raised in the file?), this skill runs the reverse direction (does every raised exception have a covering CREQ?). The two atoms share the language-regex table but no other code — they answer genuinely different questions and fail on different inputs. diff --git a/.github/agents/pharaoh.arch-draft.agent.md b/.github/agents/pharaoh.arch-draft.agent.md index ccb3362..9deb1a5 100644 --- a/.github/agents/pharaoh.arch-draft.agent.md +++ b/.github/agents/pharaoh.arch-draft.agent.md @@ -7,4 +7,303 @@ handoffs: [] Use when drafting a single sphinx-needs architecture element from one parent requirement. The artefact type is parameterised via `target_level` (any catalog-declared architecture type — e.g. `arch`, `swarch`, `sys-arch`, `module`, `component`, `interface`). Emits an RST directive block linking back to the parent via `:satisfies:`. -See [`skills/pharaoh-arch-draft/SKILL.md`](../../skills/pharaoh-arch-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-arch-draft + +## When to use + +Invoke when the user has a validated requirement (ideally reviewed by `pharaoh-req-review`) and +wants to derive one architecture element from it. 
The element's directive name and ID prefix +come from the project's `artefact-catalog.yaml` / `id-conventions.yaml`; this skill is +type-agnostic and supports any architecture-shaped type the catalog declares +(`arch`, `swarch`, `sys-arch`, `module`, `component`, `interface`, …). + +Do NOT draft multiple architecture elements in a single invocation — one element per call. +Do NOT create architecture elements without a parent requirement — every arch element must trace +back to at least one req via `:satisfies:`. +Do NOT review — use `pharaoh-arch-review` after drafting. + +--- + +## Inputs + +- **parent_req_id** (from user): need-id of the parent requirement — must exist in needs.json +- **target_level** (from user): the artefact-catalog type name to emit. Any type declared in + `.pharaoh/project/artefact-catalog.yaml` is accepted (typical examples: `arch`, `swarch`, + `sys-arch`, `module`, `component`, `interface`). The emitted directive uses `target_level` + verbatim as the directive name; the ID prefix is resolved from the catalog / id-conventions. +- **element_description** (from user): 1-3 sentences describing the element's responsibility +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — look up the entry for `target_level` to read `required_fields`, + `optional_fields`, and `lifecycle` + - `id-conventions.yaml` — `prefixes` map (key = type name → value = identifier prefix + string), `separator`, `id_regex` +- **needs.json**: required for parent resolution and ID uniqueness + +> Note: A `shared/tailoring-access.md` helper module is planned. Until it exists, Steps 1-2 below +> inline the tailoring-access logic directly. When that file is created, this skill should be +> updated to delegate to it. 
+ +--- + +## Outputs + +A single RST directive block for the architecture element, containing: + +- Unique ID using the prefix resolved for `target_level` from `id-conventions.yaml` +- `:status: draft` +- `:satisfies:` pointing to parent_req_id +- Every field listed in the catalog entry's `required_fields` (the directive name itself + carries the type, so a separate `:type:` option is only emitted when the catalog entry + declares it as required) +- Body: 1-3 sentences describing the element's responsibility; no `shall` — architecture elements + state what something *is*, not what it *shall do* (requirements do that) + +--- + +## Process + +### Step 1: Read tailoring + +**1a. `artefact-catalog.yaml`** + +Look up the entry whose key equals `target_level`. If found, read: + +- `required_fields` — fields that must be present in the emitted directive +- `optional_fields` — fields that may be added +- `lifecycle` — valid `:status:` values + +If the entry is absent, FAIL: + +``` +FAIL: target_level "<value>" is not declared in .pharaoh/project/artefact-catalog.yaml. +Add an entry for "<value>" (with required_fields, optional_fields, lifecycle) before +drafting, or pass a target_level that is already declared. +``` + +**1b. `id-conventions.yaml`** + +Read the `prefixes:` map and look up the prefix for `target_level`. Also extract +`separator` and `id_regex`. + +If `prefixes` does not declare `target_level`, FAIL: + +``` +FAIL: id-conventions.yaml prefixes map has no entry for "<value>". +Declare a prefix for "<value>" (e.g. SWARCH_) before drafting. +``` + +The resolved prefix is the value of `prefixes[target_level]`. + +--- + +### Step 2: Locate and parse needs.json + +Find `needs.json` (check `docs/_build/needs/needs.json`, then `_build/needs/needs.json`, then +any `needs.json` under a `_build` directory). If not found, FAIL: + +``` +FAIL: needs.json not found. Build the Sphinx project first (`sphinx-build docs/ docs/_build/`), +then re-run this skill. 
+``` + +Extract a flat map of `id → {id, type, status}` and the set of all existing IDs. + +--- + +### Step 3: Validate parent_req_id + +1. Look up `parent_req_id` in the needs.json map. If not found, FAIL: + +``` +FAIL: parent_req_id "<id>" not found in needs.json. +Specify an existing requirement ID or build the project first. +``` + +2. Confirm the parent is a requirement type (prefix ends in `req` or `_req`). If it is a + different type (e.g. `wf`, `wp`), warn but do not block: + +``` +[WARNING] parent_req_id "<id>" has type "<type>" which is not a requirement. +Architecture elements should trace to requirements. Proceeding at user's discretion. +``` + +--- + +### Step 4: Assign a unique ID + +**4a. Derive local-ID part** + +Format: `<prefix><local>` where `<prefix>` is the value resolved in Step 1b. The local part +is derived from `element_description`: + +- Lowercase, words separated by underscores +- Maximum 5 words; trim articles, prepositions, conjunctions +- Example: "Power management module for ECU startup" → local `power_management_module` + +**4b. Check uniqueness** + +Candidate = `<prefix><local>` (or `<prefix><separator><local>` if `id-conventions.yaml` +declares an explicit separator distinct from the prefix's trailing punctuation). +If the candidate is already in the needs.json ID set, append `_2`, `_3`, etc. + +**4c. Validate against id_regex** + +If the candidate does not match the `id_regex` declared in `id-conventions.yaml`, FAIL: + +``` +FAIL: generated ID "<id>" does not match id_regex "<regex>". +Revise element_description to use lowercase ASCII words. +``` + +--- + +### Step 5: Draft the element body + +Write 1-3 sentences describing: + +1. What the element *is* (its role in the system) +2. What it contains or depends on (if known from parent req) +3. Its boundary (what it does NOT include) — only if the parent req implies a clear scope limit + +Do NOT use `shall` in the body. 
Architecture descriptions use present tense: "The X module +manages Y" / "The X interface provides Z". + +Single-responsibility check: the description must describe one coherent unit. If +`element_description` implies multiple distinct concerns (e.g. "handles user authentication AND +logs all activity"), FAIL: + +``` +FAIL: element_description describes multiple responsibilities. +Invoke pharaoh-arch-draft once per responsibility. +Primary responsibility identified: "<extracted primary>". +``` + +--- + +### Step 6: Self-check + +Before emitting: + +**Check A — required fields present** +Every field in `required_fields` from Step 1 must appear in the directive. + +**Check B — parent resolves** +`:satisfies:` value is present in needs.json (confirmed in Step 3). + +**Check C — ID unique** +Chosen ID not in needs.json (confirmed in Step 4). + +**Check D — no `shall` in body** +Body must not contain `shall`. If found, rewrite in descriptive present tense. + +If any check fails after one rewrite attempt, emit with `[DIAGNOSTIC]`: + +``` +[DIAGNOSTIC] Self-check "<check name>" failed after rewrite. +Manual correction required before running pharaoh-arch-review. +``` + +--- + +### Step 7: Emit the directive block + +```rst +.. <target_level>:: <element title> + :id: <id> + :status: draft + :satisfies: <parent_req_id> + + <1-3 sentence description> +``` + +Add any catalog-declared `required_fields` not already shown above (the catalog is the source +of truth — emit every field it lists). + +--- + +## Guardrails + +**G1 — Parent not found** + +If parent_req_id is absent from needs.json, FAIL immediately (Step 3 handles this). + +**G2 — Multiple responsibilities** + +If element_description covers more than one distinct concern, FAIL (Step 5 handles this). Do not +silently draft a compound element. 
+ +**G3 — target_level not declared** + +If `target_level` is not declared in `artefact-catalog.yaml` or has no entry in +`id-conventions.yaml`'s `prefixes` map, FAIL (Step 1 handles this). Do not silently fall +back to a hardcoded default — the catalog is the contract. + +**G4 — needs.json unavailable** + +If needs.json cannot be found, FAIL and instruct the user to build first (Step 2 handles this). + +--- + +## Advisory chain + +After successfully emitting the directive: + +``` +Consider running `pharaoh-arch-review <new_id>` to audit against ISO 26262-8 §6 axes. +``` + +Do not show this if the emit included a `[DIAGNOSTIC]`. + +--- + +## Worked example + +**User input:** +> Parent: `gd_req__abs_pump_activation`; target_level: `arch`; description: "Manages the ABS +> pump drive circuit, including PWM duty-cycle control and over-current protection." + +**Step 1:** `artefact-catalog.yaml` has an `arch` entry with `required_fields: +[id, status, satisfies]`. `id-conventions.yaml` `prefixes` map has `arch: arch__`. + +**Step 2:** needs.json found at `docs/_build/needs/needs.json`; 185 IDs loaded. + +**Step 3:** `gd_req__abs_pump_activation` found in needs.json; type `gd_req`. Parent valid. + +**Step 4:** local derived: `abs_pump_driver`. Candidate: `arch__abs_pump_driver`. Not in +needs.json. Passes id_regex `^[a-z][a-z_]*__[a-z0-9_]+$`. ID assigned. + +**Step 5:** Single responsibility — manages ABS pump drive circuit only. No `shall` in body. OK. + +**Step 6:** All checks pass. + +**Step 7 output:** + +```rst +.. arch:: ABS pump driver component + :id: arch__abs_pump_driver + :status: draft + :satisfies: gd_req__abs_pump_activation + + The ABS pump driver component manages the pump drive circuit, controlling output + PWM duty cycle and providing over-current protection for the pump motor. +``` + +``` +Consider running `pharaoh-arch-review arch__abs_pump_driver` to audit against ISO 26262-8 §6 axes. 
+``` + +For a project that distinguishes system-level and software-level architecture, the same skill +serves both — pass `target_level: sys-arch` to draft a system architecture element, or +`target_level: swarch` for a software architecture element. The directive name and prefix come +from the project's catalog and id-conventions; nothing in this skill is hardcoded to the three +classical names `module` / `component` / `interface`. + +## Last step + +After emitting the artefact, invoke `pharaoh-arch-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must address the action items and re-invoke this skill with the revised target as input. + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/.github/agents/pharaoh.arch-review.agent.md b/.github/agents/pharaoh.arch-review.agent.md index ac6afb0..c2972d0 100644 --- a/.github/agents/pharaoh.arch-review.agent.md +++ b/.github/agents/pharaoh.arch-review.agent.md @@ -7,4 +7,321 @@ handoffs: [] Use when auditing a single architecture element against the 10 ISO 26262-8 §6 axes plus arch-specific axes (traceability back to requirement). Emits structured findings JSON. -See [`skills/pharaoh-arch-review/SKILL.md`](../../skills/pharaoh-arch-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-arch-review + +## When to use + +Invoke when the user has a single architecture element (either just drafted by `pharaoh-arch-draft` +or retrieved from needs.json by ID) and wants per-axis inspection against ISO 26262-8 §6. + +Do NOT review sets of arch elements — each invocation audits exactly one element. +Do NOT re-author or fix — emit findings only. Re-authoring is a follow-up step outside this skill's scope. +Do NOT audit requirements — use `pharaoh-req-review` for those. + +--- + +## Inputs + +- **target**: either an RST directive block (from `pharaoh-arch-draft`) OR a need-id present + in needs.json +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — `arch` entry; required/optional fields + - `id-conventions.yaml` — id_regex and prefix map +- **needs.json**: required for traceability axis (verifying `:satisfies:` resolves to a req) + +--- + +## Outputs + +A single JSON document with **no prose wrapper**. Shape mirrors `pharaoh-req-review`, with +`traceability` added as a binary axis in place of `verifiability`: + +```json +{ + "need_id": "arch__example", + "axes": { + "atomicity": {"score": 0, "reason": "..."}, + "internal_consistency": {"score": 0, "reason": "..."}, + "traceability": {"score": 0, "reason": "..."}, + "completeness": {"score": "deferred", "reason": "set-level axis — see note"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — see note"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — see note"}, + "schema": {"score": 0, "reason": "..."}, + "maintainability": {"score": null, "reason": "chain-level axis — see note"}, + "unambiguity_prose": {"score": 0, "reason": "..."}, + "comprehensibility": {"score": 0, "reason": "..."}, + "feasibility": {"score": 0, "reason": "..."} + }, + "action_items": ["..."], + "overall": "pass" +} +``` + +Note: `verifiability` from req-review is replaced by `traceability` for arch 
elements. Architecture +elements are not directly tested; they must *trace* to a requirement that carries the verification +link. + +### Score scales + +**Binary (0 or 1) — mechanized axes:** + +| Axis | Score 0 = FAIL | Score 1 = PASS | +|---|---|---| +| `atomicity` | element bundles more than one distinct architectural concern (e.g. two independent responsibilities in the body) | element represents exactly one coherent architectural unit | +| `internal_consistency` | body contains a self-contradictory statement | no self-contradiction detectable within this element | +| `traceability` | `:satisfies:` field absent, empty, or the linked ID does not exist in needs.json as a requirement type | `:satisfies:` present and resolves to a requirement-type need in needs.json | +| `schema` | any field listed in `required_fields` for the `arch` entry in artefact-catalog.yaml is missing | all required fields present and non-empty | + +**Ordinal (0–3) — subjective LLM-judge axes:** + +| Axis | 0 | 1 | 2 | 3 | +|---|---|---|---|---| +| `unambiguity_prose` | multiple conflicting interpretations of the element's responsibility | single interpretation but phrasing is awkward | single clear interpretation; minor phrasing issues | unambiguous and precise | +| `comprehensibility` | adjacent-level reader (test engineer or architect) cannot follow | mostly unclear without extra context | mostly clear; minor jargon | fully self-contained and clear | +| `feasibility` | obviously unrealisable given item-development constraints | realisable but significant unknowns | realisable with known engineering effort | clearly realisable; well-bounded | + +### Deferred set-level axes + +`completeness`, `external_consistency`, and `no_duplication` require the full set of sibling +architecture elements. Defer to a planned `pharaoh-arch-set-review` skill. + +In the output JSON, record each as +`{"score": "deferred", "reason": "set-level axis — assess with pharaoh-arch-set-review"}`. 
+ +### Chain-level axis + +`maintainability` requires observing convergence across regeneration iterations. Record as +`{"score": null, "reason": "chain-level axis — assess after the parent requirement and architecture revisions land"}`. + +### `overall` field + +Computed from non-deferred, non-null axes (atomicity, internal_consistency, traceability, schema, +unambiguity_prose, comprehensibility, feasibility): + +- `"pass"` — all binary axes score 1, all subjective axes score ≥ 2 +- `"needs_work"` — no binary axis fails, but ≥ 1 subjective axis scores < 2 +- `"fail"` — ≥ 1 binary axis scores 0 + +--- + +## Process + +### Step 1: Read tailoring + +Read `.pharaoh/project/artefact-catalog.yaml` and `.pharaoh/project/id-conventions.yaml`. + +For the `arch` artefact type, extract: +- `required_fields` (expected: `[id, status, satisfies, type]`) +- `id_regex` + +If the `arch` entry is absent from artefact-catalog, apply defaults: +- `required_fields`: `[id, status, satisfies, type]` +Note the fallback in output. + +--- + +### Step 2: Resolve target + +**If target is a need-id:** + +1. Find needs.json (`docs/_build/needs/needs.json`, then `_build/needs/needs.json`). +2. Look up the need-id in the needs map. +3. If not found, FAIL (see Guardrails G1). +4. Extract title, all option fields, and body text. + +**If target is an RST directive block:** + +1. Parse the block — extract `:id:`, `:status:`, `:satisfies:`, `:type:`, and body. +2. Determine artefact type from the directive name (e.g. `.. arch::` → type `arch`). +3. If needs.json is available, check whether the ID already exists (may be a re-review). + +--- + +### Step 3: Evaluate binary axes + +**Atomicity:** + +Read the body. Assess whether it describes a single coherent architectural concern or bundles +multiple independent responsibilities. 
Look for: +- Multiple distinct systems or subsystems described as owned by this element +- Compound responsibility lists joined by "and" describing unrelated concerns + +Score 1 if the body describes one coherent unit. Score 0 if two or more clearly separable +responsibilities are described. + +**Internal consistency:** + +Read the body for contradictory statements (e.g. "is responsible for X" followed by "does not +handle X"). Score 1 if no contradiction; score 0 if one is identifiable. + +**Traceability:** + +Check `:satisfies:` option: +1. Present and non-empty — if absent, score 0. +2. Resolves in needs.json — if the ID does not exist, score 0. +3. The resolved need is a requirement type (prefix ends in `req` or equivalent) — if it is a + non-requirement type (e.g. `wf`, `wp`), score 0 and note the type mismatch. + +Score 1 only if all three conditions hold. + +**Schema:** + +Verify every field in `required_fields` is present and non-empty. For default arch: +`id`, `status`, `satisfies`, `type` must all be present. Score 0 with reason listing missing +field(s) if any are absent. + +--- + +### Step 4: Evaluate subjective axes + +**Unambiguity (prose):** + +Read the body. Assess whether a reader can derive only one interpretation of the element's +responsibility. Vague ownership ("may handle", "is sometimes responsible for") → score 1. +Clear, bounded description → score 3. + +**Comprehensibility:** + +Assess whether a test engineer or software architect at an adjacent level could understand what +this element does without reading any other document. Check for undefined acronyms, missing +subject, and missing context. Full clarity → score 3; requires significant additional context → +score 0. + +**Feasibility:** + +Assess whether the element as described could be realised within typical automotive software +item-development constraints. 
Flag obviously infeasible claims (score 0), heavily under-constrained +elements (score 1), normal engineering effort (score 2), or tightly and clearly bounded elements +(score 3). + +--- + +### Step 5: Record deferred and null axes + +Set `completeness`, `external_consistency`, `no_duplication` to +`{"score": "deferred", "reason": "set-level axis — assess with pharaoh-arch-set-review"}` and +`maintainability` to +`{"score": null, "reason": "chain-level axis — assess after the parent requirement and architecture revisions land"}`. + +--- + +### Step 6: Compute overall and action items + +Compute `overall` from non-deferred, non-null axes per the policy above. + +For each binary axis scoring 0 and each subjective axis scoring 0 or 1, add a concrete action +item naming the axis and stating what must change. + +If all 7 evaluated axes pass/score ≥ 2, `action_items` is an empty array. + +--- + +### Step 7: Emit JSON + +Emit only the JSON document. Do not prepend or append prose. + +--- + +## Guardrails + +**G1 — Unresolved target** + +If target is a need-id and it does not appear in needs.json: + +``` +FAIL: need-id "<id>" not found in needs.json. +Verify the ID is correct or build the Sphinx project first. +``` + +**G2 — Malformed JSON output** + +If the emitted JSON is syntactically invalid or missing any axis key, self-correct once. If still +malformed after one correction attempt: + +```json +{ + "need_id": "<id>", + "diagnostic": "JSON self-correction failed. Raw findings follow.", + "raw": "<free-text findings>" +} +``` + +**G3 — Insufficient context for subjective axis** + +If the element body is empty or too short to assess a subjective axis, record: +`{"score": 0, "reason": "insufficient context — body is empty or too short to assess"}`. +Continue evaluating remaining axes. 
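The `overall` policy and the action-item rule from Step 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `compute_overall` and `axes_needing_action` are hypothetical, and the input is assumed to be the `axes` object of the output JSON with integer scores on all seven evaluated axes.

```python
BINARY_AXES = ("atomicity", "internal_consistency", "traceability", "schema")
SUBJECTIVE_AXES = ("unambiguity_prose", "comprehensibility", "feasibility")


def compute_overall(axes):
    """Apply the pass / needs_work / fail policy to the evaluated axes.

    Deferred and null axes are never read, so they cannot affect the result.
    """
    if any(axes[name]["score"] == 0 for name in BINARY_AXES):
        return "fail"
    if any(axes[name]["score"] < 2 for name in SUBJECTIVE_AXES):
        return "needs_work"
    return "pass"


def axes_needing_action(axes):
    """List the axes that require an action item (binary 0, subjective 0 or 1)."""
    flagged = [name for name in BINARY_AXES if axes[name]["score"] == 0]
    flagged += [name for name in SUBJECTIVE_AXES if axes[name]["score"] <= 1]
    return flagged
```

Checking the binary axes first means a single binary failure yields `"fail"` even when subjective scores are also low, which matches the ordering of the three policy bullets.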
+ +--- + +## Advisory chain + +If `overall` is `"needs_work"` or `"fail"`, append — after the JSON — a single line: + +``` +Re-run this review after action items are addressed to confirm the findings are resolved. +``` + +This is the only prose permitted after the JSON. + +--- + +## Worked example + +**Target (RST block from pharaoh-arch-draft):** + +```rst +.. arch:: ABS pump driver component + :id: arch__abs_pump_driver + :status: draft + :satisfies: gd_req__abs_pump_activation + :type: component + + The ABS pump driver component manages the pump drive circuit, controlling output + PWM duty cycle and providing over-current protection for the pump motor. +``` + +**Step 2:** RST parsed. needs.json found; `gd_req__abs_pump_activation` resolves as type `gd_req`. + +**Step 3 — binary axes:** +- atomicity: body describes one coherent unit (pump drive circuit management); no separate + unrelated concerns → score 1 +- internal_consistency: no self-contradiction → score 1 +- traceability: `:satisfies: gd_req__abs_pump_activation` present; resolves in needs.json; + type is `gd_req` → score 1 +- schema: `id`, `status`, `satisfies`, `type` all present → score 1 + +**Step 4 — subjective axes:** +- unambiguity_prose: single clear interpretation (manages pump drive circuit) → score 3 +- comprehensibility: subject, responsibility, and constraints explicit; no undefined acronyms + (PWM is standard automotive abbreviation) → score 3 +- feasibility: standard automotive power electronics function; well-constrained → score 3 + +**Step 6:** all 7 evaluated axes pass or score ≥ 2 → `overall = "pass"`, `action_items = []`. 
+ +**Step 7 output:** + +```json +{ + "need_id": "arch__abs_pump_driver", + "axes": { + "atomicity": {"score": 1, "reason": "single coherent unit: pump drive circuit management"}, + "internal_consistency": {"score": 1, "reason": "no self-contradictory statement detected"}, + "traceability": {"score": 1, "reason": ":satisfies: gd_req__abs_pump_activation resolves as type gd_req"}, + "schema": {"score": 1, "reason": "id, status, satisfies, type all present"}, + "completeness": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-arch-set-review"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-arch-set-review"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-arch-set-review"}, + "maintainability": {"score": null, "reason": "chain-level axis — assess after the parent requirement and architecture revisions land"}, + "unambiguity_prose": {"score": 3, "reason": "single clear interpretation: manages pump drive circuit"}, + "comprehensibility": {"score": 3, "reason": "subject, responsibility, and constraints all explicit"}, + "feasibility": {"score": 3, "reason": "standard automotive power electronics function; well-constrained"} + }, + "action_items": [], + "overall": "pass" +} +``` diff --git a/.github/agents/pharaoh.audit-fanout.agent.md b/.github/agents/pharaoh.audit-fanout.agent.md index c314f96..55949b0 100644 --- a/.github/agents/pharaoh.audit-fanout.agent.md +++ b/.github/agents/pharaoh.audit-fanout.agent.md @@ -7,4 +7,80 @@ handoffs: [] Use when running a full project audit in parallel by dispatching 5 atomic audit skills, each writing findings to a shared Papyrus workspace via pharaoh-finding-record for automatic deduplication. Emits the aggregated deduplicated findings list. 
-See [`skills/pharaoh-audit-fanout/SKILL.md`](../../skills/pharaoh-audit-fanout/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-audit-fanout + +## When to use + +Invoke at the start of a full project audit, CI gate, or pre-release review. Produces a deduplicated, terminology-consistent findings report covering 5 sub-areas in parallel. + +## Compositional design (not atomic by design) + +This is an orchestrator skill. Atomicity criterion (a) does not apply — the skill is compositional by construction. Its 5 component skills each pass (a)-(e) individually. + +## Sub-areas + +| # | Atomic skill | Target issues | +|---|---|---| +| 1 | `pharaoh-coverage-gap` | orphans, unverified req, uncovered req-verification | +| 2 | `pharaoh-lifecycle-check` | invalid state transitions, stale reviews | +| 3 | `pharaoh-standard-conformance` | ISO 26262 §6 / ASPICE indicator gaps | +| 4 | `pharaoh-review-completeness` | missing reviewer / approved_by fields | +| 5 | `pharaoh-process-audit` | duplicate req, contradictory pair, missing FMEA, broken back-link | + +## Shared memory protocol + +All 5 subagents write findings to the project's `.papyrus/` workspace using `pharaoh-finding-record`. The deterministic-ID scheme in `pharaoh-finding-record` provides natural dedup via Papyrus's `FileLock`-guarded ID-collision semantics. + +Subagents SHOULD NOT retry on `action: duplicate` — that's the intended signal that another subagent already covered this issue. + +## Process + +### Step 1: Initialize workspace + +Ensure `<project_dir>/.papyrus/` exists and is writable. If not, initialize via `papyrus --workspace <project_dir>/.papyrus init` (zero-config, silos preset). + +### Step 2: Dispatch 5 subagents in parallel + +In a production Pharaoh deployment, dispatch via the Agent tool with `subagent_type: general-purpose`. 
In the Phase 4b harness, parallelism is achieved at the harness level via `concurrent.futures.ThreadPoolExecutor` over 5 separate `claude -p` invocations, each primed with one sub-area's task. + +Each subagent receives: +- The project directory path +- Its assigned sub-area skill name (one of the 5) +- Explicit instruction to invoke `pharaoh-finding-record` for every finding + +### Step 3: Aggregate + +Once all 5 subagents complete, read the Papyrus workspace via: + +```bash +papyrus --workspace <project_dir>/.papyrus recall --tag category:* --format full +``` + +Emit as the final audit report (JSON list of findings, ordered by category). + +## Input / output + +**Input:** `{project_dir: path, scope?: "full"|"partial"}`. Default scope is `full` (all 5 sub-areas). Partial scope limits to a subset of sub-areas (not used in Phase 4b testing). + +**Output:** JSON list of aggregated findings, one object per `(category, subject_id)` pair: + +```json +[ + {"category": "orphan_arch", "subject_id": "arch__orphan_0", "finding_text": "...", "reporter_id": "coverage-gap"}, + ... +] +``` + +## Failure modes + +- `.papyrus/` workspace uninitializable → fall back to sequential single-agent `pharaoh-process-audit` with stderr warning. +- Any subagent errors → other 4 continue; final report notes the missing sub-area. +- >1 subagent timeout → abort and surface partial Papyrus state. + +## Composition + +Consumed by CI gates, on-demand user audits, and release governance workflows. Output is directly consumable by `pharaoh-audit-report` (future skill) or printed as-is. 
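The fan-out and aggregation behaviour described above can be sketched in Python roughly as follows. This is an assumption-laden illustration, not the harness itself: `run_sub_area` is a hypothetical callable standing in for one subagent invocation, and `aggregate_findings` only mimics the shape of the final report, since the real deduplication happens inside Papyrus through ID collisions.

```python
from concurrent.futures import ThreadPoolExecutor

SUB_AREAS = (
    "pharaoh-coverage-gap",
    "pharaoh-lifecycle-check",
    "pharaoh-standard-conformance",
    "pharaoh-review-completeness",
    "pharaoh-process-audit",
)


def run_fanout(run_sub_area, project_dir):
    """Run every sub-area in parallel and return the names that errored.

    run_sub_area(skill_name, project_dir) is a hypothetical stand-in for
    dispatching one subagent; a failure in one does not stop the others.
    """
    missing = []
    with ThreadPoolExecutor(max_workers=len(SUB_AREAS)) as pool:
        futures = {name: pool.submit(run_sub_area, name, project_dir) for name in SUB_AREAS}
        for name, future in futures.items():
            try:
                future.result()
            except Exception:
                missing.append(name)  # the final report should note this sub-area
    return missing


def aggregate_findings(findings):
    """Keep one finding per (category, subject_id) pair, ordered by category.

    Assumption: the first reporter wins, mirroring the `action: duplicate`
    signal; ordering by subject_id within a category is also an assumption.
    """
    unique = {}
    for finding in findings:
        key = (finding["category"], finding["subject_id"])
        unique.setdefault(key, finding)
    return sorted(unique.values(), key=lambda f: (f["category"], f["subject_id"]))
```

Collecting the failed sub-area names rather than re-raising keeps the remaining four results usable, which is the behaviour the failure-mode list asks for.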
diff --git a/.github/agents/pharaoh.author.agent.md b/.github/agents/pharaoh.author.agent.md index e1a8e7c..5f923f2 100644 --- a/.github/agents/pharaoh.author.agent.md +++ b/.github/agents/pharaoh.author.agent.md @@ -16,4 +16,308 @@ handoffs: Use when authoring or modifying a single sphinx-needs artefact (requirement, architecture element, test case, decision) by routing to the matching atomic drafting skill based on the project's artefact catalog. Returns the drafted RST directive with an ID, file placement suggestion, and parent link. -See [`skills/pharaoh-author/SKILL.md`](../../skills/pharaoh-author/SKILL.md) for the full atomic specification — inputs, dispatch table, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-author + +## When to use + +Invoke when the user wants to create or modify one sphinx-needs artefact (one need per call) +and needs the right atomic drafting skill picked for them based on the artefact type. + +This skill is a thin type-router. It does not author content itself — it dispatches to one of +the atomic drafting skills (`pharaoh-req-draft`, `pharaoh-arch-draft`, `pharaoh-vplan-draft`, +`pharaoh-decide`) and forwards their RST output verbatim, plus a file-placement hint and the +parent link. + +Do NOT use when: + +- The user wants the full V-model chain in one call — use `pharaoh-flow`. +- The user wants to draft multiple artefacts at once — invoke this skill once per artefact. +- The user wants to review or verify content — use `pharaoh-req-review`, `pharaoh-arch-review`, + or `pharaoh-verify` instead. + +> This is a compositional orchestrator. The atomicity criterion (a) does not apply: by design +> it dispatches to one atomic skill. Scope is bounded to "one artefact → one drafted directive". + +--- + +## Inputs + +- **target_type** (from user or inferred): the artefact type to draft. Recognised values come + from `.pharaoh/project/artefact-catalog.yaml`. 
Common values: `req`, `gd_req`, `arch`, + `tc`, `decision`, plus ISO 26262 safety-V types `hazard`, `safety_goal`, `fsr`, `tsr`. + Synonyms are tolerated — e.g. `requirement` → `req`/`gd_req`, `architecture` / `spec` → + `arch`, `test_case` / `verification` → `tc`, `safety_requirement` → `fsr`/`tsr`. +- **target_id** (optional): if the user is modifying an existing need, the need-id to update. + When absent, the dispatched drafter generates a fresh ID. +- **draft_seed** (from user): short prose describing what to author. Forwarded as + `feature_context` / `element_description` / parent claim depending on the dispatch target. +- **parent_link** (from user, required for everything except top-level requirements and + decisions): the need-id this artefact will trace to via `:satisfies:` or `:verifies:`. +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — used to resolve `target_type` to a known prefix and confirm a + drafter is available + - `id-conventions.yaml` — read by the dispatched drafter (not by this skill directly) +- **needs.json**: required by every dispatched drafter for parent resolution and ID uniqueness + +--- + +## Outputs + +The drafted RST directive block exactly as produced by the dispatched skill, plus a thin +authoring-summary block: + +``` +=== [ARTEFACT] <type>: <id> === +<RST directive block from the dispatched drafter> + +=== [AUTHORING SUMMARY] === +{ + "need_id": "<id>", + "type": "<resolved target_type>", + "dispatched_skill": "pharaoh-req-draft|pharaoh-arch-draft|pharaoh-vplan-draft|pharaoh-decide", + "parent_link": "<id or null>", + "file_placement": "<suggested .rst path>", + "stop_reason": null +} +``` + +If the dispatched drafter returns a hard FAIL or `[DIAGNOSTIC]`, forward it verbatim and set +`stop_reason` in the summary. 
+ +--- + +## Process + +### Step 0: Resolve target_type + +Normalise the user-supplied type to a catalog key: + +| User input (any case) | Resolves to | +|---|---| +| `req`, `requirement`, `gd_req`, `comp_req`, `sysreq`, `swreq`, `feat` | the catalog key matching the user's `target_type` exactly when present; otherwise the catalog key whose suffix matches the request and which appears first in `artefact-catalog.yaml`'s declaration order. If two or more catalog keys match equally, FAIL with a clear `ambiguous target_type` error and list the candidates so the caller can resolve | +| `hazard` | `hazard` (catalog key) | +| `safety_goal`, `sg` | `safety_goal` (catalog key) | +| `fsr`, `safety_requirement_functional`, `functional_safety_requirement` | `fsr` (catalog key) | +| `tsr`, `safety_requirement_technical`, `technical_safety_requirement` | `tsr` (catalog key) | +| `arch`, `architecture`, `spec`, `specification`, `module`, `component`, `interface` | `arch` | +| `tc`, `test_case`, `test`, `verification_plan`, `vplan` | `tc` | +| `decision`, `dec`, `adr` | `decision` | + +If the user's input does not map to a key present in `.pharaoh/project/artefact-catalog.yaml`, +emit: + +``` +FAIL: target_type "<value>" is not in the project's artefact catalog. +Catalog keys available: <list>. +``` + +Read `.pharaoh/project/artefact-catalog.yaml` to confirm the resolved key exists. If the +catalog file is missing, fall back to the bundled defaults — `gd_req`, `arch`, `tc` — and note +the fallback in the authoring summary. + +--- + +### Step 1: Select the dispatch skill + +Apply the routing table: + +| Resolved key | Dispatched skill | Notes | +|---|---|---| +| `req`, `gd_req`, `comp_req`, `sysreq`, `swreq`, `feat`, or any key whose suffix matches `req`; ISO 26262 safety-V types (`hazard`, `safety_goal`, `fsr`, `tsr`) | `pharaoh-req-draft` | type-agnostic; `target_level` forwarded verbatim. 
Safety-V types route here because `pharaoh-req-draft` is the canonical drafter for any requirement-shaped artefact and reads required fields / links / metadata fields from the catalog. | +| any catalog key whose suffix matches `arch` (e.g. `arch`, `swarch`, `sys-arch`) or whose synonym in the Step 0 table resolves to architecture (`module`, `component`, `interface`) | `pharaoh-arch-draft` | type-agnostic; `target_level` forwarded verbatim | +| any catalog key whose suffix matches `tc`, `test`, `vplan`, or `safety_v` | `pharaoh-vplan-draft` | type-agnostic; `target_level` forwarded verbatim | +| `decision` | `pharaoh-decide` | | + +Routing is driven solely by the synonym table in Step 0 plus the suffix-matching rules +above. The artefact-catalog schema does not carry a `category` field on per-type entries +(`additionalProperties: false`), so the router does not read one. Any catalog-declared +type whose key matches one of the suffix patterns above routes correctly without +modification to this skill; types whose keys do not match are caught by Guardrail G2. + +These routing entries were thin passthroughs in the initial `pharaoh-author` commit. This +update reflects the parameterised interfaces of the three drafting skills — +`pharaoh-req-draft`, `pharaoh-arch-draft`, and `pharaoh-vplan-draft` all now accept any +catalog-declared type via `target_level` (no more hardcoded `arch_type ∈ {module, +component, interface}` allow-list, no more hardcoded `tc__` prefix, no more "no drafter for +type X yet" FAIL on safety-V types). The router forwards `target_level` verbatim and lets +the drafter resolve prefix, required fields, required links, and required metadata fields +from `artefact-catalog.yaml` / `id-conventions.yaml`. + +--- + +### Step 2: Forward inputs to the dispatched skill + +Pack a minimal input set per skill and invoke it. Pass through any field the user supplied; +let the dispatched skill apply its own defaults and FAIL when its inputs are insufficient. 
+ +**`pharaoh-req-draft`** + +- `target_level` ← the resolved catalog key from Step 0 (e.g. `gd_req`, `comp_req`, `sysreq`, + `swreq`, `hazard`, `safety_goal`, `fsr`, `tsr`). The drafter looks up the entry in + `artefact-catalog.yaml` and reads `required_fields`, `required_metadata_fields`, and + `required_links`; if the type is missing from the catalog it FAILs with a clear + "type X not declared" message +- `feature_context` ← `draft_seed` +- `parent_link` ← `parent_link` (may be a workflow-id when drafting a top-level requirement) + +**`pharaoh-arch-draft`** + +- `parent_req_id` ← `parent_link` +- `target_level` ← the resolved catalog key from Step 0 (e.g. `arch`, `swarch`, `sys-arch`, + `module`, `component`, `interface`). The drafter looks up the entry in + `artefact-catalog.yaml` and the prefix in `id-conventions.yaml`; if either is missing it + FAILs with a clear "type X not declared" message +- `element_description` ← `draft_seed` + +**`pharaoh-vplan-draft`** + +- `parent_id` ← `parent_link` +- `target_level` ← the resolved catalog key from Step 0 (e.g. `tc`, `test`, `vplan`). The + drafter derives the directive name from `target_level` and the ID prefix from + `id-conventions.yaml`'s `prefixes` map — a project whose `test` type uses prefix `T_` + emits compliant `T_…` IDs without modifying the skill +- `verification_level` ← user-supplied if present (`unit` / `integration` / `system`); the + drafter will FAIL on a missing or unrecognised value + +**`pharaoh-decide`** + +- forward `draft_seed`, the `:decides:` link list (parsed from `parent_link` — comma-separated + is supported), and any `decided_by` / `alternatives` / `rationale` provided by the caller + +If the caller provides additional fields (e.g. `safety_context`, `tags`, `level`), forward +them as-is. The dispatched skill ignores keys it does not recognise. 
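The routing and input-packing rules from Steps 1 and 2 can be sketched in Python as follows. The function names `select_drafter` and `pack_inputs` are hypothetical, and reading "suffix matches" as a plain `str.endswith` test is an assumption about the intended matching rule; the key `decides` is likewise an illustrative name for the `:decides:` link list.

```python
# Keys that route to the requirement drafter without ending in "req".
REQ_SHAPED = {"feat", "hazard", "safety_goal", "fsr", "tsr"}
# Synonyms that resolve to architecture in the Step 0 table.
ARCH_SYNONYMS = {"module", "component", "interface"}
VPLAN_SUFFIXES = ("tc", "test", "vplan", "safety_v")


def select_drafter(key):
    """Map a resolved catalog key to the drafting skill that handles it."""
    if key == "decision":
        return "pharaoh-decide"
    if key in REQ_SHAPED or key.endswith("req"):
        return "pharaoh-req-draft"
    if key in ARCH_SYNONYMS or key.endswith("arch"):
        return "pharaoh-arch-draft"
    if key.endswith(VPLAN_SUFFIXES):
        return "pharaoh-vplan-draft"
    # No silent fallback: an unclassified key is a loud error.
    raise ValueError(f'no drafter category matches catalog key "{key}"')


def pack_inputs(drafter, key, draft_seed, parent_link=None, **extra):
    """Build the minimal input set for the chosen drafter.

    Extra caller-supplied fields (e.g. verification_level, tags) are
    forwarded as-is; the dispatched skill ignores keys it does not know.
    """
    if drafter == "pharaoh-req-draft":
        packed = {"target_level": key, "feature_context": draft_seed, "parent_link": parent_link}
    elif drafter == "pharaoh-arch-draft":
        packed = {"parent_req_id": parent_link, "target_level": key, "element_description": draft_seed}
    elif drafter == "pharaoh-vplan-draft":
        packed = {"parent_id": parent_link, "target_level": key}
    else:  # pharaoh-decide: parent_link may be a comma-separated list
        decides = [part.strip() for part in (parent_link or "").split(",") if part.strip()]
        packed = {"draft_seed": draft_seed, "decides": decides}
    packed.update(extra)
    return packed
```

For example, `select_drafter("sys-arch")` picks the architecture drafter, and `pack_inputs` then renames `parent_link` to `parent_req_id` and `draft_seed` to `element_description` for it.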
+ +--- + +### Step 3: Capture the drafter output + +Capture the full output of the dispatched skill — the RST directive block plus any +`[NOTE]` / `[DIAGNOSTIC]` annotations. + +If the dispatched skill returned a hard FAIL, propagate it verbatim and set +`stop_reason = "<dispatched-skill> FAIL"` in the authoring summary. Do not attempt to fix +the inputs and retry — the caller decides. + +--- + +### Step 4: File placement + +Suggest where to write the new RST directive. The author router does not write files — it +only suggests a path that the caller can use. + +Default placement rules: + +| Dispatched skill | Suggested file | +|---|---| +| `pharaoh-req-draft` | same directory as `parent_link`'s source file; filename matching the project's existing convention (e.g. `requirements.rst`) | +| `pharaoh-arch-draft` | sibling `architecture.rst` in the same directory as `parent_link`'s source | +| `pharaoh-vplan-draft` | `tests/` subdirectory if one exists, else sibling `tests.rst` | +| `pharaoh-decide` | `decisions.rst` next to the first `:decides:` target | + +If `parent_link` is empty or its source location cannot be resolved from `needs.json`, emit +`file_placement: null` and let the caller decide. + +--- + +### Step 5: Emit the authoring summary + +Emit the `=== [ARTEFACT] ===` block followed by the `=== [AUTHORING SUMMARY] ===` JSON. No +prose wrapper. + +If the dispatched drafter advised a follow-up (e.g. `pharaoh-arch-review`), preserve that +line below the summary so callers can chain. + +--- + +## Guardrails + +**G1 — Unknown target_type** + +If the resolved key is not in the artefact catalog, FAIL (Step 0). Do not invent a new +artefact type at runtime. + +**G2 — No drafter for known type** + +If the resolved key is in the catalog but does not match any of the four router categories +(requirement-shaped, architecture, verification-plan, decision), FAIL with the catalog +key and the four supported categories listed. Do not silently fall back to a drafter. 
+This guardrail is rarely hit in practice — the four categories cover every artefact type +declared in the bundled catalogs and every safety-V type — but stays in place to make +"my catalog declared a type the router does not classify" a loud error rather than a +silent miscategorisation. + +**G3 — Dispatched drafter failed** + +Forward FAIL / `[DIAGNOSTIC]` verbatim. Record `stop_reason` in the summary. Do not retry. + +**G4 — Tailoring missing** + +If `.pharaoh/project/artefact-catalog.yaml` is absent, the dispatched drafter will operate on +its built-in defaults. Note the fallback in the authoring summary so the caller knows the +output is not catalog-validated. + +--- + +## Advisory chain + +After a successful authoring summary, advise the caller: + +``` +Consider running `pharaoh-verify <need_id>` to confirm the drafted artefact +addresses the substance of its parent. For a per-axis review of the prose itself, +use `pharaoh-req-review` / `pharaoh-arch-review` / `pharaoh-vplan-review`. +``` + +--- + +## Worked example + +**User input:** + +> target_type: `arch` +> draft_seed: "Manages the ABS pump drive circuit, including PWM duty-cycle control and +> over-current protection." +> parent_link: `gd_req__abs_pump_activation` + +**Step 0:** `arch` resolves directly to catalog key `arch`. + +**Step 1:** routing table → `pharaoh-arch-draft` (architecture category). + +**Step 2:** forward `parent_req_id`, `target_level=arch`, `element_description`. + +**Step 3:** `pharaoh-arch-draft` returns its RST block for `arch__abs_pump_driver`. + +**Step 4:** `gd_req__abs_pump_activation` lives in `docs/requirements/braking.rst`. +Suggested placement: `docs/requirements/architecture.rst`. + +**Step 5 output (condensed):** + +``` +=== [ARTEFACT] arch: arch__abs_pump_driver === +.. 
arch:: ABS pump driver component + :id: arch__abs_pump_driver + :status: draft + :satisfies: gd_req__abs_pump_activation + + The ABS pump driver component manages the pump drive circuit, controlling + output PWM duty cycle and providing over-current protection for the pump motor. + +=== [AUTHORING SUMMARY] === +{ + "need_id": "arch__abs_pump_driver", + "type": "arch", + "dispatched_skill": "pharaoh-arch-draft", + "parent_link": "gd_req__abs_pump_activation", + "file_placement": "docs/requirements/architecture.rst", + "stop_reason": null +} +``` + +``` +Consider running `pharaoh-verify arch__abs_pump_driver` to confirm the drafted +artefact addresses the substance of its parent. +``` diff --git a/.github/agents/pharaoh.block-diagram-draft.agent.md b/.github/agents/pharaoh.block-diagram-draft.agent.md index 99e72ac..fbf9775 100644 --- a/.github/agents/pharaoh.block-diagram-draft.agent.md +++ b/.github/agents/pharaoh.block-diagram-draft.agent.md @@ -7,4 +7,101 @@ handoffs: [] Use when drafting one SysML-style block diagram — Block Definition Diagram (BDD) showing block structure and composition, or Internal Block Diagram (IBD) showing ports, flows, and part interconnections. Typical ASPICE usage — SYS.2/SYS.3 for system-level architecture, and SWE.2 for software architecture on SysML-heavy projects. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-block-diagram-draft/SKILL.md`](../../skills/pharaoh-block-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-block-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-block-diagram-draft is planned but not implemented; see SKILL.md"`. 
+ +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.block]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one block diagram, either a **BDD** (structural — blocks, parts, value properties, composition hierarchy) or **IBD** (internal — parts, ports, item flows, constraint properties). Which variant is rendered depends on input: presence of `ports` and `flows` implies IBD; absence implies BDD. + +Typical ASPICE context: +- **SYS.2 System Requirements Analysis**: BDD for the system under analysis. +- **SYS.3 System Architectural Design**: BDD for subsystem decomposition; IBD for internal wiring. +- **SWE.2 Software Architectural Design**: same, applied at SW component level. + +Distinct from `pharaoh-component-diagram-draft` (UML component view — looser, allows external ghost nodes) because BDD/IBD are closed SysML models with strict composition semantics. + +## Atomicity + +- (a) One block scope in → one diagram out. Variant (BDD vs IBD) inferred from input presence. 
+- (b) Input: `{view_title: str, blocks: list[BlockSpec], parts: list[PartSpec], compositions: list[CompositionSpec], ports?: list[PortSpec], flows?: list[FlowSpec], associations?: list[AssocSpec], project_root: str, variant_override?: "bdd"|"ibd", renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `BlockSpec = {id: str, label: str, stereotype?: "block"|"subsystem"|"valueType", value_properties?: list[str], operations?: list[str]}`, `PartSpec = {id: str, label: str, type_id: str, multiplicity?: str}`, `CompositionSpec = {whole: str, part: str, label?: str}`, `PortSpec = {id: str, label: str, direction: "in"|"out"|"inout", owner_block_or_part: str}`, `FlowSpec = {from_port: str, to_port: str, item_type?: str, label?: str}`, `AssocSpec = {from: str, to: str, kind: "reference"|"depend", label?: str}`. Output: one RST directive block. +- (c) Reward: two fixtures. + + **BDD fixture** — blocks [Vehicle, ECU, Sensor], composition Vehicle◆━ECU, Vehicle◆━Sensor. Scorer: + 1. Output starts with renderer directive. + 2. Every block rendered with `<<block>>` stereotype. + 3. Compositions rendered with filled-diamond arrow. + 4. Value properties (if any) shown inside block compartments. + 5. `ports`/`flows` absent → no IBD syntax emitted. + + Pass = all 5. + + **IBD fixture** — one block Vehicle with parts ecu:ECU, sensor:Sensor, ports [Vehicle.can_out: out, ECU.can_in: in], one flow Vehicle.can_out → ECU.can_in item_type=CANFrame. Scorer: + 1. Output starts with renderer directive. + 2. The enclosing block rendered as the diagram frame. + 3. Parts rendered inside the frame with `:TypeName` notation. + 4. Ports rendered on the boundary (block port) or inside (part port), with direction indicated (triangle/arrow). + 5. Flows rendered with item type label. + 6. All ports / parts have valid `owner_block_or_part` references. + + Pass = all 6. + +- (d) Reusable for any SysML-based systems engineering workflow. +- (e) One diagram per call. 
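
Because the skill is still planned, the following is only a sketch of how variant inference might look. The function name `infer_variant` is hypothetical, and treating a non-empty `ports` or `flows` list as sufficient for IBD is one reading of "presence of `ports` and `flows`"; the specification does not say what happens when only one of the two is supplied.

```python
def infer_variant(ports=None, flows=None, variant_override=None):
    """Pick 'bdd' or 'ibd' for one invocation (hypothetical helper)."""
    if variant_override is not None:
        if variant_override not in ("bdd", "ibd"):
            raise ValueError(f"unknown variant_override: {variant_override!r}")
        return variant_override
    # Assumption: either input being non-empty is enough to select IBD.
    return "ibd" if (ports or flows) else "bdd"
```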
+ +## Dangling references + +FAIL on `part.type_id` not in `blocks`, `composition.whole`/`composition.part` not in `blocks ∪ parts`, `port.owner_block_or_part` not in `blocks ∪ parts`, `flow.from_port`/`flow.to_port` not in `ports`. + +## Output + +**PlantUML (SysML-style; BDD):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + class Vehicle <<block>> { + + mass : kg + } + class ECU <<block>> + class Sensor <<block>> + Vehicle *-- "1" ECU : ecu + Vehicle *-- "1..*" Sensor : sensor + @enduml +``` + +**PlantUML (IBD):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + rectangle "Vehicle" as veh { + component "ecu : ECU" as ecu + component "sensor : Sensor" as sns + } + portout "can_out" as p1 + portin "can_in" as p2 + veh - p1 + ecu - p2 + p1 --> p2 : <<flow>> CANFrame + @enduml +``` + +**Mermaid** — no native SysML support; render as annotated flowchart with stereotypes in labels. Emit a `%% NOTE: Mermaid approximation of SysML block diagram` comment. + +## Non-goals + +- No parametric diagrams (constraint properties with equations) — separate future skill. +- No BDD / IBD round-trip to SysML XMI — out of scope; this skill emits diagrams only. +- No automatic BDD-from-code inference — caller provides structure. diff --git a/.github/agents/pharaoh.bootstrap.agent.md b/.github/agents/pharaoh.bootstrap.agent.md index 213249a..9cec0de 100644 --- a/.github/agents/pharaoh.bootstrap.agent.md +++ b/.github/agents/pharaoh.bootstrap.agent.md @@ -10,4 +10,466 @@ handoffs: Use when a Sphinx project has no sphinx-needs configured and you need minimum viable scaffolding — adding the extension and declaring need types — so that sphinx-build produces a valid needs.json for downstream Pharaoh skills. -See [`skills/pharaoh-bootstrap/SKILL.md`](../../skills/pharaoh-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-bootstrap + +## When to use + +Invoke when a project has a working Sphinx setup (`conf.py` builds without sphinx-needs) but does not yet load `sphinx_needs` as an extension. This skill injects the minimum configuration required for sphinx-needs to produce a valid `needs.json` on the next build. Downstream skills (`pharaoh-setup`, `pharaoh-tailor-detect`, `pharaoh-req-draft`, etc.) require that output. + +Do NOT invoke if `sphinx_needs` is already listed in extensions — use `pharaoh-setup` for that case. Do NOT invoke on a directory that is not yet a Sphinx project — `sphinx-quickstart` is a prerequisite, not part of this skill. Do NOT seed stub RST files, build the project, or write `pharaoh.toml` — those are separate concerns. + +## sphinx-needs version policy + +Pharaoh **recommends** `sphinx-needs >= 8.0.0` (8.x consolidated TOML loading, type-field schema validation, and extra-link declaration format). It does **not** require it. Many real projects pin older versions for lockfile stability or compliance reasons; the skill respects that choice. + +The version handling is a three-way branch: + +| Detected state | Default behavior | +|---|---| +| Not installed | Propose installing the latest available `sphinx-needs`. User confirms → install; rejects → abort (no config written, since no install = nothing to configure). | +| Installed, `>= recommended` | Proceed silently. | +| Installed, `< recommended` | Propose upgrading to recommended. User picks: (a) upgrade; (b) accept current version and proceed; (c) abort. | + +The skill never silently installs or upgrades; every mutation is gated by explicit confirmation (or by a caller passing `on_version_mismatch="install"` / `"accept"` for unattended flows). 
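
The three-way branch above could be expressed roughly as follows. This is a sketch under stated assumptions: `classify_version` and the state names are invented for illustration, and the tuple-based fallback is a simplified stand-in for `packaging.version.parse` that only handles plain dotted numeric versions.

```python
try:
    from packaging.version import parse as parse_version
except ImportError:
    # Simplified stand-in (assumption): numeric dotted versions only.
    def parse_version(text):
        return tuple(int(part) for part in text.split(".") if part.isdigit())


def classify_version(detected, recommended="8.0.0"):
    """Map a detected sphinx-needs version to one of the three table rows."""
    if detected is None:
        return "not_installed"
    if parse_version(detected) >= parse_version(recommended):
        return "ok"
    return "older"


# Default behavior per state, paraphrasing the table above.
DEFAULT_BEHAVIOR = {
    "not_installed": "propose install; abort if the user rejects",
    "ok": "proceed silently",
    "older": "propose upgrade; user may upgrade, accept, or abort",
}
```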
+ +## Atomicity + +- (a) Indivisible — one `project_dir` + config spec in → `conf.py` and/or `ubproject.toml` edits out, plus at most one sphinx-needs version-alignment action (install / upgrade) gated by user confirmation. No directory creation beyond opening existing files; no RST seeding; no docs content; no Pharaoh-level config. The install step is a guarded side effect — it runs only when the caller's confirmation is received, never speculatively. +- (b) Input: `{project_dir: str, config_target: "auto"|"conf.py"|"ubproject.toml", types: list[TypeSpec], extra_links?: list[LinkSpec], extra_options?: list[str], id_required?: bool, id_length?: int, recommended_sphinx_needs_version?: str, on_version_mismatch?: "fail"|"prompt"|"install"|"accept"}` where `TypeSpec = {directive: str, title: str, prefix: str, color?: str, style?: str}` and `LinkSpec = {option: str, incoming: str, outgoing: str}`. `extra_options` defaults to `["source_doc"]` (Pharaoh convention — emitters like `pharaoh-feat-draft-from-docs` set `:source_doc:` on every emitted need; without the declaration, `sphinx-build -nW` fails with `Unknown option 'source_doc'`). `recommended_sphinx_needs_version` defaults to `"8.0.0"`; skill uses this to compute "latest satisfying" when installing, and as the threshold for "older than recommended" proposals. `on_version_mismatch` defaults to `"prompt"`. Output: JSON `{files_modified: list[str], config_target_used: "conf.py"|"ubproject.toml", sphinx_needs_version_before: str|null, sphinx_needs_version_after: str, version_action: "installed"|"upgraded"|"accepted_current"|"already_ok", install_command_used: str|null, warnings: list[str], next_step: str}`. On `"prompt"` path → single JSON `{status: "needs_confirmation", proposal: ...}` with no file writes and no install. +- (c) Reward: fixture covers four scenarios in separate test environments: + + **(i) fresh env (no sphinx-needs installed), `on_version_mismatch="install"`** — scorer checks: + 1. 
After run, `import sphinx_needs` succeeds with version `>= recommended`. + 2. `version_action == "installed"`, `sphinx_needs_version_before == null`. + 3. Config written and `sphinx-build -b needs` succeeds, producing empty `needs.json`. + + **(ii) env with `sphinx-needs >= recommended`, any `on_version_mismatch`** — scorer checks: + 1. `version_action == "already_ok"`, `install_command_used == null`. + 2. `sphinx_needs_version_before == sphinx_needs_version_after`. + 3. Config written and build succeeds. + + **(iii) env with old version (6.3.0), `on_version_mismatch="accept"`** — scorer checks: + 1. `version_action == "accepted_current"`, `install_command_used == null`. + 2. Config written and build succeeds with the OLD version (assuming the old version is still functional — this is the "user pinned deliberately" path). + 3. Output contains a warning naming the version gap. + + **(iv) env with old version (6.3.0), `on_version_mismatch="prompt"`** — scorer checks: + 1. Output is `{status: "needs_confirmation", proposal: ...}`. + 2. Proposal offers both `upgrade` and `accept` paths. + 3. No files modified. `import sphinx_needs` still reports the old version. + + Idempotence: re-run in the already-aligned state is a no-op (`version_action == "already_ok"`). + + Pass = all scenarios pass their checks. +- (d) Reusable: any first-time sphinx-needs adoption; migration from plain Sphinx; reverse-engineering pilots on projects that start without requirements. Independent of downstream Pharaoh workflow. +- (e) Composable: edits config + at most one guarded install. Does not call other skills, does not write `.pharaoh/`, does not build. + +## Input + +- `project_dir`: absolute path to the Sphinx project root. Must contain `conf.py`. +- `config_target`: where to declare sphinx-needs settings. + - `"auto"` (default): if `ubproject.toml` exists in `project_dir`, use it; otherwise use `conf.py`. + - `"ubproject.toml"`: force TOML. Create the file if missing. 
+ - `"conf.py"`: force Python-level declarations (`needs_types`, `needs_extra_links`, etc.). +- `types`: list of need types to declare. **Only declare types that will have at least one need on day one.** Declaring speculative types (e.g. adding `test` because you plan to write tests "eventually") produces dead type registrations and forces downstream `pharaoh.toml` traceability chains to alarm on empty targets — observed during dogfooding where declaring `test` + `verifies` link made 100% of `comp_req` needs appear unverified on day one. Add new types when the first need of that type lands, not before. + + Each `TypeSpec` has: + - `directive` (required): snake_case directive name, e.g. `"req"`, `"spec"`, `"impl"`, `"test"`. + - `title` (required): human-readable title, e.g. `"Requirement"`. + - `prefix` (required): ID prefix used by sphinx-needs, e.g. `"REQ_"`. + - `color` (optional): hex color or name; defaults left to sphinx-needs. + - `style` (optional): node style; defaults left to sphinx-needs. + + At least one type is required; sphinx-needs builds with defaults but Pharaoh workflows expect explicit declarations. +- `extra_links` (optional): list of `LinkSpec` entries for typed relationships beyond the default `links` option. +- `extra_options` (optional): list of custom option names to declare. Default `["source_doc"]`. Pharaoh emitters (e.g. `pharaoh-feat-draft-from-docs`) always set `:source_doc:` on emitted needs to track provenance back to the authoring document; without this declaration, `sphinx-build -nW` fails with `Unknown option 'source_doc'`. Caller may pass additional option names — the skill unions them with the default. Passing `[]` explicitly suppresses the default (caller accepts the -nW warning as trade-off). Declaration SHAPE is version-dependent: on sphinx-needs ≥ 8.0.0 the skill emits `[needs.fields.NAME]` dict-of-dicts (config option `needs_fields`); on < 8 it emits the legacy `[[needs.extra_options]]` / `needs_extra_options`. 
The input name stays `extra_options` for API stability — callers pass a list of names and the skill picks the right shape. +- `id_required` (optional): if `true`, declare `needs_id_required = True`. Default: omit (sphinx-needs default is `False`). +- `id_length` (optional): integer; if provided, declare `needs_id_length`. Default: omit (sphinx-needs default). +- `recommended_sphinx_needs_version` (optional): the version Pharaoh recommends. Default `"8.0.0"`. Used as (a) the threshold for "older than recommended" proposals, and (b) the version installed when the skill runs in install mode (or the latest release that satisfies `>=recommended` — see Step 0c). Compared with `packaging.version.parse`. +- `on_version_mismatch` (optional): `"fail"` | `"prompt"` | `"install"` | `"accept"`. Default `"prompt"`. Applies when the detected version is absent OR `< recommended`: + - `"fail"`: abort with a remediation-focused error. + - `"prompt"`: emit a `needs_confirmation` proposal with BOTH an upgrade option and an accept-current option (or install/abort if nothing is installed). The caller picks one and re-invokes with `"install"` or `"accept"`. + - `"install"`: if nothing installed → install recommended; if older version installed → upgrade to recommended. Non-interactive. + - `"accept"`: proceed with whatever is installed. If nothing is installed → FAIL (there is no "current" to accept). + +## Output + +A single JSON object — no prose wrapper. Shape: + +```json +{ + "files_modified": ["docs/conf.py"], + "config_target_used": "conf.py", + "sphinx_needs_version_before": "6.3.0", + "sphinx_needs_version_after": "8.0.0", + "version_action": "upgraded", + "install_command_used": "uv pip install --upgrade sphinx-needs==8.0.0", + "sphinx_build_command": "sphinx-build -b needs docs docs/_build/needs", + "warnings": [], + "next_step": "Run `sphinx-build -b needs docs docs/_build/needs` (see sphinx_build_command) to generate needs.json, then run pharaoh-setup." 
+} +``` + +`sphinx_build_command` is a concrete, copy-pasteable invocation that assumes the caller's cwd is the project root. Resolution: +- Builder flag: `-b needs`. +- `<sourcedir>`: the relative path from the detected project root to `project_dir` (the argument the skill was invoked with). If `project_dir` contains both `conf.py` and the .rst source tree (typical `sphinx-quickstart` flat layout), `<sourcedir>` is the `project_dir` path. If `conf.py` lives in one directory and .rst sources live in a separate directory (e.g. `conf.py` in `docs/` but RST files under `docs/source/`), the command uses `-c <conf_dir> <source_dir>`; the skill detects this by checking whether the `conf.py` directory contains any `*.rst` files. +- `<outdir>`: `<sourcedir>/_build/needs` by convention. +- If the skill cannot resolve the project root relative to `project_dir` (e.g. `project_dir` is absolute with no parent that looks like a project root), it falls back to absolute paths. + +`version_action` is one of: +- `"installed"` — was missing, installed recommended +- `"upgraded"` — was older than recommended, upgraded +- `"accepted_current"` — was older than recommended, user opted to keep it +- `"already_ok"` — detected version already `>= recommended`, no action taken + +When `on_version_mismatch == "prompt"` and a mismatch is detected, the response is: + +```json +{ + "status": "needs_confirmation", + "proposal": { + "detected_version": "6.3.0", + "recommended_version": "8.0.0", + "detected_package_manager": "rye", + "options": [ + { + "action": "upgrade", + "description": "Install sphinx-needs 8.0.0 (recommended). 
Unlocks TOML loading, schema validation, new extra-link format.", + "install_command": "rye add sphinx-needs~=8.0.0", + "alt_commands": [ + "uv pip install --upgrade sphinx-needs==8.0.0", + "pip install --upgrade sphinx-needs==8.0.0" + ], + "pyproject_patch": { + "target_file": "pyproject.toml", + "section": "[project].dependencies", + "replace": {"sphinx-needs>=6.3.0": "sphinx-needs>=8.0.0"} + } + }, + { + "action": "accept", + "description": "Keep sphinx-needs 6.3.0. Bootstrap proceeds against the current version. Some Pharaoh features that depend on 8.x (schema validation, latest TOML loader) may be degraded or unavailable.", + "caveats": [ + "Downstream Pharaoh skills may warn about missing features.", + "Upgrade can be deferred — re-run pharaoh-bootstrap later to revisit." + ] + }, + { + "action": "abort", + "description": "Cancel bootstrap without writing config or installing anything." + } + ], + "rationale": "Pharaoh recommends sphinx-needs >= 8.0.0 for the richest feature set, but respects pinned older versions where the project has stability or compliance constraints." + } +} +``` + +No files are modified and no installs happen when the response is `needs_confirmation`. The caller (human or outer LLM) picks an option and re-invokes with `on_version_mismatch` set accordingly (`"install"` for upgrade, `"accept"` for accept, or simply stop for abort). + +The "nothing installed" variant of the same proposal drops the `accept` option (since there is no current version to accept) and the `upgrade` action becomes `install`. + +## Process + +### Step 0: Determine sphinx-needs version action + +Before touching any config file, resolve what the skill should do about `sphinx-needs` — install, upgrade, accept, or proceed without action. + +**0a. Detect current version.** + +Run `python -c "import sphinx_needs; print(sphinx_needs.__version__)"` in the project's interpreter (virtualenv-preferred, active shell Python as fallback). 
+ +- Import succeeds → record printed version. +- Import fails → record `null`. + +**0b. Classify and branch.** + +Three classes: + +1. **Installed and `>= recommended_sphinx_needs_version`** → set `version_action = "already_ok"`, `install_command_used = null`, `sphinx_needs_version_before = sphinx_needs_version_after = detected`. Skip to Step 1. + +2. **Not installed** → branch on `on_version_mismatch`: + - `"fail"` → FAIL with remediation message. + - `"prompt"` → emit `needs_confirmation` proposal with options `["install", "abort"]` (no `accept` — nothing to accept). Return. + - `"install"` → go to Step 0c with action=install. + - `"accept"` → FAIL: `"on_version_mismatch='accept' requires an existing install, but sphinx-needs is not installed."` + +3. **Installed but `< recommended`** → branch on `on_version_mismatch`: + - `"fail"` → FAIL with remediation. + - `"prompt"` → emit `needs_confirmation` proposal with options `["upgrade", "accept", "abort"]`. Return. + - `"install"` → go to Step 0c with action=upgrade. + - `"accept"` → set `version_action = "accepted_current"`, emit a warning naming the version gap, set `sphinx_needs_version_before = sphinx_needs_version_after = detected`, `install_command_used = null`. Skip to Step 1. + +**0c. Detect package manager and run install/upgrade.** + +Only reached when `on_version_mismatch == "install"`. 
Detect package manager by scanning `project_dir` and up to 3 parent levels: + +| Indicator | Package manager | Install command | Upgrade command | +|---|---|---|---| +| `.python-version` + `pyproject.toml` with `[tool.rye]` or `rye.lock` | rye | `rye add sphinx-needs~=<rec>` | `rye add sphinx-needs~=<rec>` (rye resolves by constraint) | +| `uv.lock` or `pyproject.toml` with `[tool.uv]` | uv | `uv add sphinx-needs==<rec>` | `uv pip install --upgrade sphinx-needs==<rec>` | +| `poetry.lock` | poetry | `poetry add sphinx-needs@^<rec>` | `poetry add sphinx-needs@^<rec>` | +| `Pipfile.lock` | pipenv | `pipenv install sphinx-needs==<rec>` | `pipenv install sphinx-needs==<rec>` | +| `pdm.lock` | pdm | `pdm add sphinx-needs==<rec>` | `pdm update sphinx-needs` | +| otherwise, with active venv detectable via `VIRTUAL_ENV` or `project_dir/.venv` | pip (venv) | `<venv_python> -m pip install sphinx-needs==<rec>` | `<venv_python> -m pip install --upgrade sphinx-needs==<rec>` | +| otherwise | unknown | FAIL: "Cannot detect package manager. Install sphinx-needs manually and re-run with `on_version_mismatch='accept'` or `'install'` after install." | + +Closer indicator wins if multiple match. `<rec>` substituted with `recommended_sphinx_needs_version`. + +Run the selected command. Capture exit code and stdout/stderr. + +**0d. Verify post-install.** + +Re-run the probe from 0a. Determine final state: + +- Import still fails → FAIL naming the attempted command and exit code. +- Version now `>= recommended` → set `version_action = "installed"` (if 0b class was "not installed") or `"upgraded"` (if class was "older"). Record `install_command_used` = the command. Proceed to Step 1. +- Version below recommended but installed (install appeared to succeed but resolver picked an older version, e.g. constrained by lockfile) → emit warning, set `version_action = "accepted_current"`, proceed. The caller's lockfile constraints win over Pharaoh's recommendation. 
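
The package-manager scan in Step 0c could look roughly like the sketch below. It is deliberately simplified and makes several assumptions: `detect_package_manager` is a hypothetical name, rye and uv are recognised by lockfile or by a plain substring match on `pyproject.toml`, the `.python-version` check and the venv/pip fallback rows are omitted, and when one directory holds several indicators the first entry in the list wins (the specification only says the closer directory wins).

```python
from pathlib import Path

# Lockfile indicators from the table, checked in this (arbitrary) order.
LOCKFILES = [
    ("rye.lock", "rye"),
    ("uv.lock", "uv"),
    ("poetry.lock", "poetry"),
    ("Pipfile.lock", "pipenv"),
    ("pdm.lock", "pdm"),
]
# pyproject.toml section markers, matched as plain substrings (simplification).
PYPROJECT_MARKERS = [("[tool.rye]", "rye"), ("[tool.uv]", "uv")]


def detect_package_manager(project_dir, max_parents=3):
    """Return the manager named by the closest indicator, or None."""
    start = Path(project_dir).resolve()
    # project_dir itself plus up to max_parents ancestors, closest first.
    candidates = [start, *start.parents][: max_parents + 1]
    for directory in candidates:
        for filename, manager in LOCKFILES:
            if (directory / filename).is_file():
                return manager
        pyproject = directory / "pyproject.toml"
        if pyproject.is_file():
            text = pyproject.read_text(encoding="utf-8", errors="replace")
            for marker, manager in PYPROJECT_MARKERS:
                if marker in text:
                    return manager
    return None
```

Returning `None` corresponds to the last two table rows; a fuller version would then look for an active venv before failing with the "Cannot detect package manager" message.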
+ +### Step 1: Verify project_dir is a Sphinx project + +Read `<project_dir>/conf.py`. If it does not exist, FAIL: + +``` +FAIL: <project_dir>/conf.py not found. +This skill scaffolds sphinx-needs INTO an existing Sphinx project. +Run `sphinx-quickstart` first to create a Sphinx project, then re-invoke. +``` + +### Step 2: Verify sphinx-needs is not already configured + +Search `conf.py` and (if present) `ubproject.toml` for the string `sphinx_needs`. If found in either file, FAIL: + +``` +FAIL: sphinx_needs is already referenced in <file>. +This skill is for projects without sphinx-needs. Use pharaoh-setup instead. +``` + +Rationale: mutating an existing config belongs to a separate skill (future: `pharaoh-setup-reconfigure`). Atomicity demands that `pharaoh-bootstrap` only handles first-time injection. + +### Step 3: Resolve config_target + +If `config_target == "auto"`: +- If `<project_dir>/ubproject.toml` exists → use `"ubproject.toml"`. +- Else → use `"conf.py"`. + +Record the resolved target. Emit a warning if the caller passed `"ubproject.toml"` but the file does not exist (the skill will create it). + +### Step 4: Inject `sphinx_needs` into the `extensions` list + +This always happens in `conf.py`, regardless of `config_target` (sphinx loads extensions from `conf.py` only). + +Read `conf.py`. Locate the `extensions = [...]` assignment. Two cases: + +**4a. Extensions list exists.** Append `"sphinx_needs"` as the last entry, preserving existing indentation and trailing comma conventions. If the list is empty (`extensions = []`), replace with `extensions = ["sphinx_needs"]`. + +**4b. Extensions list missing.** Append a new line `extensions = ["sphinx_needs"]` after the last existing top-level assignment (heuristic: find the last line that looks like `NAME = ...` at column 0, insert after it). Add a blank line before for readability. + +Do NOT reorder, rename, or reflow existing content. Do NOT add comments. + +### Step 5: Declare need types + +**5a. 
If `config_target_used == "ubproject.toml"`:** + +If the file does not exist, create it with a `$schema` header pointing at the public ubproject schema: + +```toml +"$schema" = "https://ubcode.useblocks.com/ubproject.schema.json" +``` + +Append a `[needs]` section with the types array. Example: + +```toml +[[needs.types]] +directive = "req" +title = "Requirement" +prefix = "REQ_" + +[[needs.types]] +directive = "spec" +title = "Specification" +prefix = "SPEC_" +``` + +Include `color` and `style` entries only if the caller provided them. + +Emit typed links and custom fields. **The shape depends on the detected `sphinx_needs_version_after` from Step 0.** sphinx-needs 8.x deprecated the pre-8 array-of-tables shape in favour of dict-of-dicts keyed by option name; emitting the legacy shape on 8.x triggers deprecation warnings at every build (`Config option "needs_extra_options" is deprecated. Please use "needs_fields" instead.`), and emitting the new shape on < 8 fails to load. The skill picks the right shape for the detected version. + +**sphinx-needs ≥ 8.0.0 — dict-of-dicts:** + +```toml +[needs.links.satisfies] +incoming = "is satisfied by" +outgoing = "satisfies" + +[needs.fields.source_doc] +description = "Relative path to the documentation file that authored this need (Pharaoh provenance)." +schema = "string" +default = "" +``` + +On 8.x, `description` + `schema` + `default` are all required on each field entry; omitting any of them triggers a backward-compatibility warning. For caller-supplied option names without explicit metadata, the skill synthesises: `description = "<NAME> (Pharaoh-declared custom field)"`, `schema = "string"`, `default = ""`. 
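
As a rough illustration of the synthesis rule for sphinx-needs 8.x, the sketch below renders `[needs.fields.NAME]` tables for a list of option names. The helper names are hypothetical, and `json.dumps` is used only as a convenient way to produce a quoted string that is also a valid TOML basic string for ordinary text.

```python
import json

SOURCE_DOC_DESCRIPTION = (
    "Relative path to the documentation file that authored this need "
    "(Pharaoh provenance)."
)


def synthesise_field(name):
    """Return the three metadata keys that 8.x expects on a field entry."""
    if name == "source_doc":
        description = SOURCE_DOC_DESCRIPTION
    else:
        description = f"{name} (Pharaoh-declared custom field)"
    return {"description": description, "schema": "string", "default": ""}


def render_fields_toml(names):
    """Render one [needs.fields.NAME] table per option name."""
    blocks = []
    for name in names:
        field = synthesise_field(name)
        lines = [f"[needs.fields.{name}]"]
        for key in ("description", "schema", "default"):
            lines.append(f"{key} = {json.dumps(field[key])}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```

Calling `render_fields_toml(["source_doc"])` reproduces the `[needs.fields.source_doc]` table shown above; any other name receives the synthesised description.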
+ +**sphinx-needs < 8.0.0 — array-of-tables (legacy shape):** + +```toml +[[needs.extra_links]] +option = "satisfies" +incoming = "is satisfied by" +outgoing = "satisfies" + +[[needs.extra_options]] +name = "source_doc" +``` + +On pre-8, `[[needs.extra_links]]` MUST be an array of tables — dict form (`[needs.extra_links.satisfies]`) fails with `TypeError: string indices must be integers`. + +Version comparison uses `packaging.version.parse`. Resolved `extra_options` = default `["source_doc"]` unioned with caller-provided extras, deduplicated, sorted for determinism. + +If caller explicitly passed `extra_options = []`, emit no field/option section and record a warning: `"extra_options suppressed by caller; Pharaoh emitters that set :source_doc: will trigger -nW warnings"`. + +Include `id_required` and `id_length` only if the caller provided them: + +```toml +[needs] +id_required = true +id_length = 8 +``` + +Also add the `needs_from_toml` hook to `conf.py` so sphinx-needs reads the TOML: + +```python +needs_from_toml = "ubproject.toml" +``` + +Insert this line after the `extensions = [...]` assignment that was touched in Step 4. + +**5b. If `config_target_used == "conf.py"`:** + +Append `needs_types` plus version-dependent link/field declarations plus optional ID settings directly to `conf.py` after the `extensions` assignment. The config-option names match the TOML shape chosen in Step 5a. 
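
The option resolution and the version-dependent choice of config-option names might be sketched like this. Both function names are hypothetical, and the major-version check is a simplification of the `packaging.version.parse` comparison the skill specifies.

```python
SUPPRESSED_WARNING = (
    "extra_options suppressed by caller; Pharaoh emitters that set "
    ":source_doc: will trigger -nW warnings"
)


def resolve_extra_options(caller_options=None):
    """Union the default with caller extras; an explicit [] suppresses it."""
    if caller_options is not None and len(caller_options) == 0:
        return [], [SUPPRESSED_WARNING]
    names = sorted({"source_doc"} | set(caller_options or []))
    return names, []


def config_option_names(sphinx_needs_version):
    """Pick the conf.py option names for the detected version."""
    # Simplification (assumption): compare the major component only.
    major = int(sphinx_needs_version.split(".")[0])
    if major >= 8:
        return {"links": "needs_links", "fields": "needs_fields"}
    return {"links": "needs_extra_links", "fields": "needs_extra_options"}
```

On 6.3.0, `config_option_names` selects the legacy names, so the resolved list is written as `needs_extra_options = ["source_doc"]` rather than as a `needs_fields` dict.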
+ +**sphinx-needs ≥ 8.0.0 — `needs_links` / `needs_fields` dicts:** + +```python +needs_types = [ + {"directive": "req", "title": "Requirement", "prefix": "REQ_"}, + {"directive": "spec", "title": "Specification", "prefix": "SPEC_"}, +] + +needs_links = { + "satisfies": { + "incoming": "is satisfied by", + "outgoing": "satisfies", + }, +} + +needs_fields = { + "source_doc": { + "description": "Relative path to the documentation file that authored this need.", + "schema": "string", + "default": "", + }, +} + +needs_id_required = True +needs_id_length = 8 +``` + +**sphinx-needs < 8.0.0 — legacy `needs_extra_links` / `needs_extra_options`:** + +```python +needs_types = [ + {"directive": "req", "title": "Requirement", "prefix": "REQ_"}, + {"directive": "spec", "title": "Specification", "prefix": "SPEC_"}, +] + +needs_extra_links = [ + {"option": "satisfies", "incoming": "is satisfied by", "outgoing": "satisfies"}, +] + +needs_extra_options = ["source_doc"] + +needs_id_required = True +needs_id_length = 8 +``` + +Omit link/field declarations the caller did not supply (no `extra_links` input AND default `extra_options` not suppressed → emit only the `source_doc` field/option). Omit `needs_id_*` entries the caller did not supply. Always emit the `source_doc` declaration unless caller explicitly passed `extra_options = []` (in which case omit and warn — see Step 5a). Do NOT add comments. + +### Step 6: Emit output + +Emit the JSON object per the Output shape. Populate: +- `files_modified`: every file the skill wrote to, relative to `project_dir`. +- `config_target_used`: resolved target from Step 3. +- `warnings`: accumulated warnings (e.g., created `ubproject.toml` that did not exist). +- `sphinx_build_command`: resolved per the rules in the Output section. Prefer paths relative to the detected project root (the nearest ancestor of `project_dir` that contains a `pyproject.toml`, `.git`, or similar marker). If no project root is detectable, use absolute paths. 
If `project_dir` does not contain any `*.rst` files but `<project_dir>/source/` does (separated layout), emit `sphinx-build -b needs -c <project_dir> <project_dir>/source <project_dir>/source/_build/needs`. +- `next_step`: interpolate `sphinx_build_command` into the sentence `"Run \`<sphinx_build_command>\` to generate needs.json, then run pharaoh-setup."` + +## Guardrails + +**G1 — No Sphinx project.** `conf.py` missing → FAIL per Step 1. + +**G2 — sphinx-needs already present.** Any reference to `sphinx_needs` found → FAIL per Step 2. Do not attempt merge; that is a different skill. + +**G3 — Empty types list.** If `types == []`, FAIL: + +``` +FAIL: At least one type must be declared. +Pharaoh workflows expect explicit type declarations. Provide at least one TypeSpec. +``` + +**G4 — Directive collision.** If two entries in `types` have the same `directive`, FAIL with the offending directive name. Deduplication is the caller's responsibility. + +**G5 — Partial write.** If Step 4 succeeds but Step 5 fails, revert Step 4's change so the project is not left in a half-configured state. Report the failure and the rollback. + +## Advisory chain + +After successfully emitting output, always advise with the CONCRETE command resolved for this project (the value of the `sphinx_build_command` output field), not a placeholder: + +``` +Run `<sphinx_build_command>` to generate needs.json. +Then invoke `pharaoh-setup` to detect the fresh configuration and author pharaoh.toml. +``` + +Rationale: prior dogfooding had `conf.py` in `docs/` and RST files in `docs/source/`; the generic `<source> <outdir>` template forced the caller to grep `pyproject.toml` to find the right `-c` flag before the first build succeeded. Surfacing the concrete invocation in the bootstrap report removes that lookup. 
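The resolution of `sphinx_build_command` described in Step 6 might look like the following sketch. It is not the skill's implementation: the function names are hypothetical, the marker tuple is an assumed reading of "or similar marker", and the non-separated command form is inferred from the worked example rather than from the Output section.

```python
from pathlib import Path

# Markers that identify a project root. The spec says "or similar marker",
# so this tuple is an assumption and can be extended.
ROOT_MARKERS = ("pyproject.toml", ".git")


def find_project_root(project_dir):
    """Return the nearest directory at or above project_dir holding a marker, else None."""
    start = Path(project_dir).resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    return None


def has_rst(directory):
    """True if the directory exists and directly contains at least one *.rst file."""
    return directory.is_dir() and any(directory.glob("*.rst"))


def resolve_build_command(project_dir):
    docs = Path(project_dir).resolve()
    root = find_project_root(docs)
    # Relative to the project root when one is found, absolute otherwise.
    shown = docs.relative_to(root) if root else docs
    if not has_rst(docs) and has_rst(docs / "source"):
        # Separated layout: conf.py in project_dir, RST files in project_dir/source.
        source = shown / "source"
        out = source / "_build" / "needs"
        return f"sphinx-build -b needs -c {shown.as_posix()} {source.as_posix()} {out.as_posix()}"
    out = shown / "_build" / "needs"
    return f"sphinx-build -b needs {shown.as_posix()} {out.as_posix()}"
```

For a project with `pyproject.toml` at the repository root and RST files directly in `docs/`, this yields `sphinx-build -b needs docs docs/_build/needs`. With the RST files under `docs/source/` it yields the `-c` form instead.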
+ +## Worked example + +**User input:** +```json +{ + "project_dir": "/work/my-project/docs", + "config_target": "auto", + "types": [ + {"directive": "feat", "title": "Feature", "prefix": "FEAT_"}, + {"directive": "comp_req", "title": "Component Requirement", "prefix": "CREQ_"} + ], + "extra_links": [ + {"option": "satisfies", "incoming": "is satisfied by", "outgoing": "satisfies"} + ] +} +``` + +**Step 1:** `/work/my-project/docs/conf.py` exists. OK. + +**Step 2:** Neither `conf.py` nor `ubproject.toml` mentions `sphinx_needs`. OK. + +**Step 3:** `ubproject.toml` exists in `/work/my-project/docs/` → resolve to `"ubproject.toml"`. + +**Step 4:** Append `"sphinx_needs"` to the existing `extensions = [...]` list in `conf.py`. + +**Step 5:** Add `[[needs.types]]` tables for `feat` and `comp_req` to `ubproject.toml`. Emit link and field declarations in the shape matching the detected sphinx-needs version — `[needs.links.satisfies]` and `[needs.fields.source_doc]` on ≥ 8.0.0, or the legacy `[[needs.extra_links]]` / `[[needs.extra_options]]` on < 8. Add `needs_from_toml = "ubproject.toml"` to `conf.py`. + +**Step 6 output:** + +```json +{ + "files_modified": ["conf.py", "ubproject.toml"], + "config_target_used": "ubproject.toml", + "sphinx_build_command": "sphinx-build -b needs docs docs/_build/needs", + "warnings": [], + "next_step": "Run `sphinx-build -b needs docs docs/_build/needs` to generate needs.json, then run pharaoh-setup." +} +``` diff --git a/.github/agents/pharaoh.change.agent.md b/.github/agents/pharaoh.change.agent.md index 9ffa829..fb93c92 100644 --- a/.github/agents/pharaoh.change.agent.md +++ b/.github/agents/pharaoh.change.agent.md @@ -129,3 +129,500 @@ This agent has **no prerequisites** and runs freely in both advisory and enforci 3. Support multi-project setups. Label cross-project needs. 4. For large impact scopes (>50 needs), recommend escalation. 5. Never modify need source files. This agent is read-only except for session state. 
+ +--- + +## Full atomic specification + +# pharaoh-change: Change Impact Analysis + +Analyze the full impact of a proposed change to any sphinx-needs item. Trace through ALL link types -- standard `links`, `extra_links` (implements, tests, etc.), and sphinx-codelinks -- to produce a structured Change Document listing every affected need and code file with a recommended action. + +This is a **gate-free skill**. It can be invoked at any time in any strictness mode. Authoring skills depend on its output. + +--- + +## 1. Understand the Change + +Before accessing any project data, establish exactly what is being changed. + +### Step 1a: Identify the target need(s) + +Extract from the user's request: + +- **Target need ID(s)**: One or more need IDs (e.g., `REQ_001`, `SPEC_002`). If the user describes the need by title or content instead of ID, note it and resolve the ID in Step 2. +- **Nature of the change**: Classify as one of: + - **Value change** -- An attribute or content value is being modified (e.g., latency from 100ms to 50ms). + - **Addition** -- A new attribute, link, or content section is being added to the need. + - **Removal** -- An attribute, link, or the entire need is being removed. + - **Restructuring** -- The need is being split, merged, or moved to a different type or module. +- **Change description**: A plain-language summary of what changes and why. + +### Step 1b: Clarify ambiguity + +If the user's request is ambiguous, ask exactly one round of clarifying questions before proceeding. Cover only what is missing: + +- Which need ID(s) are affected? +- What specifically is changing (attribute, content, links)? +- What is the new value or desired state? + +Do not ask questions whose answers can be determined from the project data. + +--- + +## 2. 
Get Project Data + +Follow the instructions in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md) to detect the project structure and build the needs index. Specifically: + +1. **Detect project structure** (Section 1 of data-access.md) -- find `ubproject.toml`, `conf.py`, and the documentation source tree. +2. **Read project configuration** (Section 2) -- extract need types, link types, and ID settings. +3. **Three-tier data access** (Section 3) -- use the best available source: + - **Tier 1: ubc CLI** -- `ubc build needs --format json` for the complete needs index. + - **Tier 2: ubCode MCP** -- MCP tools for pre-indexed data. + - **Tier 3: Raw file parsing** -- Grep for need directives, parse options, build the index manually. +4. **Detect sphinx-codelinks** (Section 4) -- determine if code traceability is available. +5. **Read pharaoh.toml** (Section 5) -- load strictness, workflow gates, traceability requirements, and codelinks settings. + +After data access completes, present a brief detection summary: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Codelinks: <enabled/disabled/not configured> +Strictness: <advisory/enforcing> +``` + +### Step 2a: Resolve need IDs from descriptions + +If the user described the target need by title or content rather than ID, search the needs index now. Match by: + +1. Exact title match (case-insensitive). +2. Substring match in title. +3. Substring match in content. + +If multiple needs match, list them and ask the user to confirm which one(s) to analyze. If exactly one matches, proceed with it and inform the user of the resolved ID. + +--- + +## 3. Impact Analysis + +With the needs index, link graph, and target need(s) identified, perform a full impact analysis. 
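Steps 3a and 3b of this section amount to one breadth-first traversal over an undirected view of the link graph. The sketch below is illustrative only: the `links` input shape (need ID mapped to `(link_type, target_id)` pairs), the function name, and the sample graph are assumptions. The sample graph is reconstructed from the hop counts in the Brake System walkthrough in Section 10, not read from the fixture itself.

```python
from collections import defaultdict, deque


def impact_distances(links, target_ids, max_depth=None):
    """Map every reachable need ID to its hop distance from the nearest target."""
    # Undirected adjacency, so incoming and outgoing links are both followed.
    neighbours = defaultdict(set)
    for source, edges in links.items():
        for _link_type, dest in edges:
            neighbours[source].add(dest)
            neighbours[dest].add(source)

    distance = {target: 0 for target in target_ids}
    queue = deque(target_ids)
    while queue:
        current = queue.popleft()
        if max_depth is not None and distance[current] >= max_depth:
            continue  # optional depth limit; the default traverses everything
        for need_id in sorted(neighbours[current]):
            if need_id not in distance:  # doubles as the visited set, so cycles terminate
                distance[need_id] = distance[current] + 1
                queue.append(need_id)
    # Drop the targets: distance 1 is direct impact, 2 or more is transitive.
    return {need_id: hops for need_id, hops in distance.items() if hops > 0}


# Assumed link directions for the Brake System example.
brake_links = {
    "SPEC_001": [("links", "REQ_001")],
    "REQ_002": [("links", "REQ_001")],
    "SPEC_002": [("links", "REQ_002")],
    "IMPL_001": [("implements", "SPEC_001")],
    "TEST_001": [("tests", "IMPL_001")],
    "IMPL_002": [("implements", "SPEC_002")],
    "TEST_002": [("tests", "IMPL_002")],
}
distances = impact_distances(brake_links, ["REQ_001"])
```

Seeding the queue with the targets at distance 0 keeps every directly impacted need in the visited set from the start, so no direct need can be rediscovered later at a larger distance.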
+ +### Step 3a: Direct impact (one hop) + +Find every need that is directly linked to ANY target need. Check all link directions: + +- **Outgoing links from the target**: needs referenced in the target's `links`, `implements`, `tests`, and all other extra_link options. +- **Incoming links to the target**: needs whose `links`, `implements`, `tests`, or other extra_link options reference the target's ID. +- **Bidirectional**: For each extra_link type, check both the `outgoing` and `incoming` directions as defined in the project configuration. + +For each directly linked need, record: + +- Need ID +- Need type (directive name) +- Need title +- The link type connecting it to the target (e.g., `links`, `implements`, `tests`) +- The link direction (incoming or outgoing relative to the target) + +### Step 3b: Transitive impact (full graph) + +Starting from the set of directly impacted needs, follow all links breadth-first to find transitively affected needs. + +**Algorithm:** + +``` +direct_ids = {need.id for need in directly impacted needs} +visited = set(target_ids) | direct_ids +queue = [all directly impacted needs] +distance = {need_id: 1 for need_id in direct_ids} + +while queue is not empty: + current = queue.pop(0) + for each need linked to current (all link types, both directions): + if need.id not in visited: + visited.add(need.id) + distance[need.id] = distance[current.id] + 1 + queue.append(need) +``` + +Stop conditions: +- The queue is empty (all reachable needs have been visited). +- A configurable maximum depth has been reached (default: no limit -- traverse the entire reachable graph). + +For each transitively impacted need, record: + +- Need ID +- Need type +- Need title +- Distance from the target (number of hops) +- The path of link types traversed to reach it + +### Step 3c: Code impact (sphinx-codelinks) + +Perform this step only if sphinx-codelinks is enabled (detected in Step 2 or configured in `pharaoh.toml`). 
+ +For every need in the affected set (direct + transitive), search for code files that reference the need's ID via codelink annotations. + +**Search strategy:** + +1. **If ubc CLI is available**: Use ubc commands that resolve codelinks. If ubc provides a codelinks-aware query, prefer it. +2. **Raw search fallback**: Use Grep to search the project's source code directories for codelink patterns: + - `# codelink: <NEED_ID>` + - `// codelink: <NEED_ID>` + - `/* codelink: <NEED_ID> */` + - Any custom codelink pattern configured in `conf.py` or `ubproject.toml`. + +Exclude documentation directories (`docs/`, `_build/`) and common non-source directories (`node_modules/`, `.git/`, `__pycache__/`). + +For each code file found, record: + +- File path (relative to project root) +- Line number or function/class name where the codelink appears +- The need ID it references +- Context: a brief excerpt of the surrounding code (the line containing the codelink and 2 lines above/below) + +### Step 3d: Classify impact severity + +For each affected item (need or code file), classify the required action: + +**Must update** -- The change directly invalidates this item. Apply when: +- A specification references a specific value from the target need that is being changed (e.g., the spec mentions "100ms" and the requirement is changing to "50ms"). +- A test case validates the exact property being changed (e.g., test checks response time against the old threshold). +- An implementation encodes the changed value as a constant, threshold, or parameter. +- A code file contains the changed value as a literal (found via codelinks). + +**Review needed** -- The change may affect this item but requires human judgment. Apply when: +- The item is linked to the target but does not directly reference the changed value. +- The item is a sibling (e.g., another requirement linked to the same parent) that might have implicit dependencies. 
+- The item is transitively linked (2+ hops) and its content relates to the changed property. +- A code file references an affected need but the specific impact on the code is unclear. + +**No change needed** -- The item is linked but unaffected by this specific change. Apply when: +- The item is linked to the target but addresses an entirely different property or concern. +- The item is transitively linked through a need that itself requires no change. +- The link is structural (e.g., both needs belong to the same module) but not functional. + +**Classification rules:** + +1. Read the content of each affected need. If the content contains the specific value being changed (e.g., "100ms", "8m/s2"), classify as "Must update". +2. If the content references the property being changed but not the specific value (e.g., "response time" without a number), classify as "Review needed". +3. If the content does not reference the changed property at all, classify as "No change needed". +4. For transitively linked needs (2+ hops), default to "Review needed" unless content analysis clearly indicates "Must update" or "No change needed". +5. For code files, default to "Review needed" unless the code contains the specific changed value as a literal. + +--- + +## 4. Produce the Change Document + +Present results in the following structured format. Use markdown tables for readability. 
+ +``` +## Change Document + +### Change Request +- **Target**: <NEED_ID> (<need title>) +- **Change**: <plain-language description of the change> +- **Requested by**: user +- **Date**: <current ISO 8601 date> + +### Direct Impact (1 hop) + +| Need ID | Type | Title | Link Type | Direction | Action | +|---------|------|-------|-----------|-----------|--------| +| <ID> | <type> | <title> | <link_type> | <in/out> | <Must update / Review needed / No change needed> | + +### Transitive Impact + +| Need ID | Type | Title | Distance | Path | Action | +|---------|------|-------|----------|------|--------| +| <ID> | <type> | <title> | <N hops> | <link chain> | <Must update / Review needed / No change needed> | + +### Code Impact + +| File | Location | Linked Need | Action | +|------|----------|-------------|--------| +| <relative path> | <function/line> | <NEED_ID> | <Must update / Review needed> | + +If codelinks are not enabled or no code references are found, display: + +> No code impact detected. sphinx-codelinks is not configured for this project. + +or: + +> No code files reference the affected needs via codelinks. + +### Summary +- **Needs requiring update**: <count> +- **Needs requiring review**: <count> +- **Needs with no change needed**: <count> +- **Code files affected**: <count> +- **Total items in impact scope**: <count> +- **Maximum traversal depth**: <N hops> +- **Recommendation**: <proceed / escalate / discuss> +``` + +**Recommendation logic:** + +- **Proceed**: 5 or fewer items require update, no safety-tagged needs are in "Must update", and the change is localized. +- **Escalate**: Any need tagged with `safety`, `critical`, or `regulatory` (or similar domain-specific tags) is classified as "Must update", OR more than 10 items require update. +- **Discuss**: The impact is ambiguous -- many items are "Review needed" with unclear severity, or the change affects needs across multiple unrelated modules. 
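The countable parts of the recommendation logic can be sketched as follows. This is a rough illustration, not the skill's decision procedure: the item shape (`action` plus optional `tags`) and the function name are hypothetical, the judgment-based criteria ("localized", "ambiguous") are not modelled, and mapping the unspecified range of 6 to 10 required updates to "discuss" is an assumption.

```python
ESCALATION_TAGS = {"safety", "critical", "regulatory"}


def recommend(items):
    """items: iterable of dicts with an 'action' string and an optional 'tags' list."""
    must_update = [item for item in items if item["action"] == "Must update"]
    tagged = any(ESCALATION_TAGS & set(item.get("tags", ())) for item in must_update)
    if tagged or len(must_update) > 10:
        return "escalate"
    if len(must_update) <= 5:
        # Still subject to the "change is localized" judgment call.
        return "proceed"
    # 6 to 10 required updates: not covered by the rules above, assumed to need discussion.
    return "discuss"
```

A result of "proceed" from this sketch is only a candidate: a reviewer would still need to check locality and the share of "Review needed" items before accepting it.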
+ +### Multiple targets + +If the user requested changes to multiple needs, produce one Change Document per target need. If the impact sets overlap, note the overlap at the end: + +``` +### Overlap +The following needs appear in the impact scope of multiple targets: +- <NEED_ID>: affected by both <TARGET_1> and <TARGET_2> +``` + +--- + +## 5. Update Session State + +After producing the Change Document, update the session state file so other skills can check whether change analysis was performed. + +Follow the session state instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md) (Section 4). + +### Step 5a: Read or initialize session state + +1. Check if `.pharaoh/session.json` exists. +2. If it does not exist, create the `.pharaoh/` directory and initialize the session state: + +```json +{ + "version": 1, + "created": "<current ISO 8601 timestamp>", + "updated": "<current ISO 8601 timestamp>", + "changes": {}, + "global": { + "mece_checked": false, + "mece_timestamp": null, + "last_release": null + } +} +``` + +3. If it exists, read and parse it. If the JSON is malformed, warn the user and re-initialize. + +### Step 5b: Record the change analysis + +For each target need ID, add or update an entry in the `changes` dictionary: + +```json +{ + "changes": { + "<TARGET_ID>": { + "change_analysis": "<current ISO 8601 timestamp>", + "acknowledged": false, + "authored": false, + "verified": false + } + } +} +``` + +Key points: +- Set `change_analysis` to the current timestamp. +- Set `acknowledged` to `false`. The user must explicitly acknowledge before this gate is satisfied. +- Do not overwrite `authored` or `verified` if the entry already exists -- preserve those values. +- Update the top-level `updated` timestamp. + +### Step 5c: Write the session state + +Write the updated JSON to `.pharaoh/session.json`. Ensure the JSON is properly formatted (indented for readability). + +--- + +## 6. 
Ask for Acknowledgment + +After presenting the Change Document and updating session state, ask the user to acknowledge the analysis. + +Present exactly this prompt: + +``` +Acknowledge this change analysis? Acknowledging allows proceeding to the authoring skill for the affected needs. +``` + +### If the user acknowledges + +Update `.pharaoh/session.json`: set `acknowledged` to `true` for each target need ID analyzed in this invocation. Update the `updated` timestamp. + +Respond with: + +``` +Change analysis for <TARGET_ID(s)> acknowledged. You may now proceed with the appropriate authoring skill. +``` + +### If the user does not acknowledge + +Do not update the session state. The `acknowledged` field remains `false`. + +If the user asks questions about the Change Document, answer them. If the user requests modifications to the analysis (e.g., "also check the impact on module X"), re-run the relevant parts of the analysis and present an updated Change Document. Then ask for acknowledgment again. + +### If the user ignores the acknowledgment prompt + +Do not force the issue. The session state remains with `acknowledged: false`. In advisory mode this has no effect. In enforcing mode, any authoring skill will check and block if acknowledgment is missing. + +--- + +## 7. Strictness Behavior + +Follow the instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md) for strictness handling. The specifics for this skill: + +### Advisory mode + +- Always produce the full Change Document regardless of workflow state. +- No gating -- this skill has no prerequisites. +- After completing the analysis, the acknowledgment step is optional. If the user skips it, other skills will show a tip but will not block. 
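The session-state transitions from Sections 5 and 6, and the way the two strictness modes read them, can be sketched as pure functions over the parsed `.pharaoh/session.json`. The function names and the `"ok"` / `"tip"` / `"block"` return values are hypothetical. Reading and writing the file is left out.

```python
from datetime import datetime, timezone


def _now():
    """Current time as an ISO 8601 timestamp."""
    return datetime.now(timezone.utc).isoformat()


def new_session():
    """Initial session state as laid out in Step 5a."""
    now = _now()
    return {
        "version": 1,
        "created": now,
        "updated": now,
        "changes": {},
        "global": {"mece_checked": False, "mece_timestamp": None, "last_release": None},
    }


def record_change_analysis(session, target_ids):
    """Step 5b: stamp the analysis and reset acknowledgment, keeping authored/verified."""
    now = _now()
    for target_id in target_ids:
        entry = session["changes"].setdefault(target_id, {"authored": False, "verified": False})
        entry["change_analysis"] = now
        entry["acknowledged"] = False
    session["updated"] = now
    return session


def acknowledge(session, target_ids):
    """Section 6: the user explicitly acknowledged the Change Document."""
    for target_id in target_ids:
        session["changes"][target_id]["acknowledged"] = True
    session["updated"] = _now()
    return session


def authoring_gate(session, target_id, strictness):
    """What an authoring skill would do for this need: proceed, show a tip, or block."""
    if session["changes"].get(target_id, {}).get("acknowledged", False):
        return "ok"
    return "block" if strictness == "enforcing" else "tip"
```

Keeping the transitions free of file I/O makes the preservation rule for `authored` and `verified` easy to check in isolation.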
+ +### Enforcing mode + +- This skill itself has no prerequisites (it is gate-free per [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md) Section 3, "Skills with no gates"). +- However, its output gates any authoring skill. In enforcing mode, authoring skills check `.pharaoh/session.json` for `acknowledged: true` on the relevant need IDs. +- Always perform the full analysis. Always update session state. Always ask for acknowledgment. + +### Strictness has no effect on analysis depth + +Both advisory and enforcing modes perform the same analysis. Strictness only affects whether downstream skills gate on the results. + +--- + +## 8. Using ubc diff + +If the ubc CLI is available (detected in Step 2), use `ubc diff` to supplement or replace parts of the manual impact analysis. + +### When to use ubc diff + +- The user is proposing a change that has already been partially implemented in the source files (e.g., they edited the RST but want to understand the impact before committing). +- The project uses version control and the user wants to compare the current state against a baseline (e.g., a branch or tag). + +### How to use ubc diff + +```bash +ubc diff +``` + +or with a specific baseline: + +```bash +ubc diff --base <ref> +``` + +`ubc diff` returns a structured diff including: +- Which needs were added, modified, or removed. +- Which attributes changed on each need. +- Impact tracing: which linked needs are affected by each change. + +### Integrating ubc diff output + +If `ubc diff` provides impact tracing: +1. Use its output as the primary source for the "Direct Impact" and "Transitive Impact" sections. +2. Supplement with manual link-graph traversal only for needs or link types that ubc diff does not cover. +3. Still perform the code impact analysis (Step 3c) separately, since ubc diff may not include codelink information. +4. Still classify severity (Step 3d) using content analysis. 
+ +If `ubc diff` does not provide impact tracing (older version), use it only for identifying which needs changed, then perform the full manual analysis from Step 3. + +--- + +## 9. Edge Cases + +### Target need does not exist + +If the target need ID is not found in the needs index: +1. Report that the need ID was not found. +2. Suggest possible matches (typo correction, similar IDs). +3. Ask the user to confirm or provide the correct ID. +4. Do not proceed with analysis until a valid target is identified. + +### Target need has no links + +If the target need has zero outgoing and zero incoming links: +1. Report that the need is an orphan (no links in any direction). +2. The Direct Impact and Transitive Impact sections are empty. +3. Still check for code impact via codelinks. +4. In the Summary, note that the change is fully isolated and recommend proceeding. + +### Circular links + +If the link graph contains cycles (A links to B, B links to C, C links to A): +- The traversal algorithm in Step 3b uses a `visited` set, so cycles are handled automatically. +- No need is visited twice. +- Report the cycle in the Change Document under a note: + +``` +Note: Circular link chain detected: <A> -> <B> -> <C> -> <A>. Each need appears once in the impact analysis. +``` + +### Very large impact scope + +If the analysis yields more than 50 affected needs: +1. Present the full Change Document. +2. Add a warning in the Summary: + +``` +Warning: This change has a large impact scope (N needs). Consider breaking the change into smaller increments or reviewing the link structure for overly broad connections. +``` + +3. Recommend "escalate" regardless of other factors. + +### Multi-project impact + +If the workspace contains multiple sphinx-needs projects and the target need links to needs in a different project: +1. Identify cross-project links (need IDs that do not exist in the target's project but do exist in another project). +2. Follow the links into the other project. +3. 
In the Change Document, clearly mark cross-project needs: + +``` +| SPEC_EXT_001 | Specification | External sensor spec | 1 hop | Review needed | (project: sensor-subsystem) | +``` + +### Need described by title, not ID + +Handled in Step 2a. If the user says "change the brake response time requirement", resolve to `REQ_001` using title matching, then proceed normally. + +--- + +## 10. Complete Workflow Example + +To illustrate the full process, here is a walkthrough using the Brake System test fixture. + +**User request**: "Change REQ_001 latency from 100ms to 50ms" + +**Step 1** -- Target: `REQ_001` (Brake response time). Nature: value change. Change: response time threshold from 100ms to 50ms. + +**Step 2** -- Data access detects: +``` +Project: Brake System (ubproject.toml) +Types: req, spec, impl, test +Links: links, implements, tests +Data source: Tier 3 (raw file parsing) +Needs found: 8 +Codelinks: not configured +Strictness: advisory +``` + +**Step 3** -- Impact analysis: + +Direct (1 hop from REQ_001): +- `SPEC_001` -- linked via `links` (incoming: SPEC_001 links to REQ_001). Content mentions "10ms signal update rate" -- related to timing. Action: **Must update** (spec for the sensor interface must reflect the tighter timing budget). +- `REQ_002` -- linked via `links` (incoming: REQ_002 links to REQ_001). Content about "force distribution" -- different property. Action: **Review needed** (sibling requirement, may have implicit timing dependency). + +Transitive: +- `SPEC_002` -- 2 hops (REQ_001 -> REQ_002 -> SPEC_002 via links). Content about "force distribution algorithm". Action: **No change needed**. +- `IMPL_001` -- 2 hops (REQ_001 -> SPEC_001 -> IMPL_001 via links/implements). Content about "CAN driver for brake pedal sensor". Action: **Review needed** (driver timing may need adjustment for 50ms budget). +- `TEST_001` -- 3 hops (REQ_001 -> SPEC_001 -> IMPL_001 -> TEST_001 via links/implements/tests). Content: "Verify brake response within 100ms". 
Action: **Must update** (test threshold must change to 50ms). +- `IMPL_002` -- 3 hops (REQ_001 -> REQ_002 -> SPEC_002 -> IMPL_002 via links/implements). Action: **No change needed**. +- `TEST_002` -- 4 hops (via IMPL_002). Action: **No change needed**. + +Code impact: Not applicable (codelinks not configured). + +**Step 4** -- Change Document produced with the tables above. Summary: 2 must update, 2 review needed, 3 no change needed, 0 code files. Recommendation: proceed. + +**Step 5** -- Session state written: `REQ_001` entry with `acknowledged: false`. + +**Step 6** -- User asked to acknowledge. User says "yes". Session updated: `acknowledged: true`. diff --git a/.github/agents/pharaoh.class-diagram-draft.agent.md b/.github/agents/pharaoh.class-diagram-draft.agent.md index 5911999..c6b3547 100644 --- a/.github/agents/pharaoh.class-diagram-draft.agent.md +++ b/.github/agents/pharaoh.class-diagram-draft.agent.md @@ -7,4 +7,108 @@ handoffs: [] Use when drafting one class diagram showing a bounded set of types/entities with their fields, methods, and relationships (inheritance, composition, aggregation, association). Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-class-diagram-draft/SKILL.md`](../../skills/pharaoh-class-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-class-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-class-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.class]` for per-type overrides. 
+ +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one class/type diagram. Captures **structural relationships** between types: inheritance hierarchies, composition, aggregation, plain association, with optional per-class fields and methods. + +Does NOT capture runtime behavior over time (→ `pharaoh-sequence-diagram-draft`). Does NOT capture high-level component topology (→ `pharaoh-component-diagram-draft`). Does NOT capture lifecycle FSM (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One class set in → one diagram out. No splitting across diagrams; if the set is too large to fit, caller invokes multiple times with different scopes. +- (b) Input: `{view_title: str, classes: list[ClassSpec], relationships: list[RelationSpec], project_root: str, show_fields?: bool, show_methods?: bool, visibility_filter?: list["public"|"protected"|"private"], renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ClassSpec = {id: str, label: str, stereotype?: "abstract"|"interface"|"enum"|"struct", fields?: list[FieldSpec], methods?: list[MethodSpec]}`, `FieldSpec = {name: str, type?: str, visibility?: "public"|"protected"|"private"}`, `MethodSpec = {name: str, params?: str, return_type?: str, visibility?: "public"|"protected"|"private"}`, `RelationSpec = {from: str, to: str, kind: "inherits"|"implements"|"composes"|"aggregates"|"associates"|"depends", label?: str, cardinality_from?: str, cardinality_to?: str}`. Output: one RST directive block. 
+- (c) Reward: fixture — abstract base `Shape` with method `area()`, concrete `Circle` and `Square` inheriting, plus `Canvas` composing 1..* Shapes. Scorer: + 1. Output starts with renderer-specific directive. + 2. All class IDs declared. + 3. Inheritance edges Circle→Shape, Square→Shape render in inheritance syntax (hollow triangle in both Mermaid/PlantUML). + 4. Composition edge Canvas→Shape renders in composition syntax (filled diamond). + 5. Cardinality `1..*` on composition edge is present. + 6. With `show_fields=false, show_methods=false`, no field/method lines appear. + 7. With `show_methods=true`, abstract `area()` on Shape is rendered with stereotype (italic/abstract marker). + + Pass = all 7. +- (d) Reusable: any OOP codebase, domain model extraction, data schema visualization. +- (e) One diagram per call. + +## Input highlights (others per shared doc) + +- `classes`: declared order = render order (usually doesn't matter for class diagrams but preserved for determinism). +- `relationships`: every `from`/`to` MUST reference a class ID in `classes`. Dangling relationship → FAIL (class diagrams don't tolerate ghost classes in the same way component diagrams tolerate out-of-scope links; either the class is in the diagram or it is not). +- `show_fields` / `show_methods` (optional): default `true`. Set to `false` for overview diagrams. +- `visibility_filter` (optional): include only members matching these visibilities. Default: all. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + classDiagram + class Shape { + <<abstract>> + +area() double + } + class Circle { + -radius: double + +area() double + } + class Canvas { + +shapes: List~Shape~ + } + Shape <|-- Circle + Shape <|-- Square + Canvas "1" *-- "1..*" Shape +``` + +**PlantUML:** +```rst +.. 
uml:: + :caption: <view_title> + + @startuml + abstract class Shape { + +area() : double + } + class Circle { + -radius : double + +area() : double + } + class Canvas + Shape <|-- Circle + Shape <|-- Square + Canvas "1" *-- "1..*" Shape + @enduml +``` + +## Relationship kind → renderer syntax + +| Kind | Mermaid | PlantUML | +|---|---|---| +| `inherits` | `A <\|-- B` | `A <\|-- B` | +| `implements` | `A <\|.. B` | `A <\|.. B` | +| `composes` | `A *-- B` | `A *-- B` | +| `aggregates` | `A o-- B` | `A o-- B` | +| `associates` | `A -- B` | `A -- B` | +| `depends` | `A ..> B` | `A ..> B` | + +Both renderers converge on UML-standard arrows; the syntax is virtually identical. + +## Non-goals + +- No generics/template detection — callers pass rendered forms (`List~Shape~`, `Option<T>`) in field types as strings. +- No automatic abstract detection — caller sets `stereotype` explicitly. +- No private-member hiding by default — caller uses `visibility_filter=["public"]` if needed. +- No class-from-code extraction — separate future skill (`pharaoh-classes-from-source`) could infer; out of scope here. diff --git a/.github/agents/pharaoh.component-diagram-draft.agent.md b/.github/agents/pharaoh.component-diagram-draft.agent.md index e4bf231..902cd08 100644 --- a/.github/agents/pharaoh.component-diagram-draft.agent.md +++ b/.github/agents/pharaoh.component-diagram-draft.agent.md @@ -7,4 +7,101 @@ handoffs: [] Use when drafting one component-relationship diagram (nodes = sphinx-needs, edges = link relations) for a bounded scope — one feature, one module, one architectural view. Renderer tailored via `pharaoh.toml`. Does NOT emit sequence, class, or state diagrams — those are separate skills. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). 
-See [`skills/pharaoh-component-diagram-draft/SKILL.md`](../../skills/pharaoh-component-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-component-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-component-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). This skill reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.component]` from the consumer project's `pharaoh.toml`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one component-relationship diagram (static containment + link-relation edges between needs). Analogue to UML component diagrams, C4 container/component views. + +Does NOT show behavior over time (→ `pharaoh-sequence-diagram-draft`). Does NOT show type hierarchies with fields/methods (→ `pharaoh-class-diagram-draft`). Does NOT show lifecycle/FSM (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One scope in → one diagram out. No multi-scope bundling. No mutation of needs. +- (b) Input: `{view_title: str, scope_ids: list[str], project_root: str, renderer_override?: "mermaid"|"plantuml", direction_override?: "TB"|"LR"|"BT"|"RL", ghost_nodes?: bool, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. 
Output: one RST directive block (`.. mermaid::` or `.. uml::`) with caption. No surrounding prose. +- (c) Reward: fixture with 3 in-scope needs (A, B, C) chained A→B→C via `:links:`, plus one out-of-scope need D that B links to. Scorer: + 1. Output starts with the renderer-specific directive matching tailoring. + 2. Every ID in `scope_ids` appears as a node in the diagram body. + 3. Edges A→B and B→C render in renderer syntax (`A --> B`). + 4. With default `ghost_nodes=true`: D appears as a ghost node (dashed outline / muted color / external stereotype), edge B→D is rendered. + 5. With `ghost_nodes=false`: D does NOT appear, edge B→D is dropped, and a warning is logged naming D as a dangling dependency. + 6. `renderer_override="mermaid"` on a PlantUML-tailored project produces Mermaid. + + Pass = all 6. +- (d) Reusable for any sphinx-needs project needing static architecture diagrams. +- (e) One phase, one skill. No cross-skill calls. + +## Input + +- `view_title`: human-readable title (→ diagram caption). +- `scope_ids`: list of sphinx-needs IDs to include. Skill reads each via `ubc` / file fallback to extract type, title, and outgoing link options. +- `project_root`: absolute path to consumer project root. Used for `pharaoh.toml` tailoring lookup. +- `renderer_override` (optional): per-call override. Resolution order in `shared/diagram-tailoring.md`. +- `direction_override` (optional): `TB` | `LR` | `BT` | `RL`. Falls back to `[pharaoh.diagrams.component].direction` → `"TB"`. +- `ghost_nodes` (optional): if `true` (default), edges whose target is outside `scope_ids` render as ghost nodes — dashed outline, muted color, visually distinct from in-scope nodes — so reviewers see the boundary between "our scope" and "external dependencies." If `false`, the dangling edge is dropped and a warning is logged. Default `true`. +- `on_missing_config` (optional): see shared doc. Default `"prompt"`. 
+- `papyrus_workspace` (optional): for consistent node labeling across diagrams (same canonical names as `pharaoh-req-from-code`). +- `reporter_id`: short agent identifier. + +## Output + +Single RST directive block. Renderer-dependent body: + +**Mermaid (default):** +```rst +.. mermaid:: + :caption: <view_title> + + graph TB + FEAT_csv_export[CSV Export]:::feat + CREQ_csv_export_01[Write CSV header row]:::comp_req + CREQ_csv_export_02[Serialize rows]:::comp_req + CREQ_csv_export_01 --> FEAT_csv_export + CREQ_csv_export_02 --> FEAT_csv_export + classDef feat fill:#4ECDC4 + classDef comp_req fill:#BFD8D2 +``` + +**PlantUML:** +```rst +.. uml:: + :caption: <view_title> + + @startuml + component "CSV Export" as FEAT_csv_export #4ECDC4 + component "Write CSV header row" as CREQ_csv_export_01 #BFD8D2 + component "Serialize rows" as CREQ_csv_export_02 #BFD8D2 + CREQ_csv_export_01 --> FEAT_csv_export + CREQ_csv_export_02 --> FEAT_csv_export + @enduml +``` + +`classDef`/color fills come from `[pharaoh.diagrams.type_styles]` if tailored; otherwise renderer defaults (no styling). + +## Process (sketch) + +1. Resolve renderer, direction, type_styles from `pharaoh.toml` (see shared doc for order). If any mandatory field missing AND `on_missing_config == "prompt"` → emit structured proposal. +2. Read each need in `scope_ids` via data-access layer (`ubc` CLI preferred). +3. Build internal graph: nodes = scope_ids, edges = outgoing links. +4. For each edge: if target ∈ scope_ids → render as in-scope edge. If target ∉ scope_ids → behavior depends on `ghost_nodes`: + - `ghost_nodes=true` (default): add the target as a ghost node (dashed outline, muted color, `<<external>>` stereotype or renderer-equivalent). Render the edge normally. Log info-level note listing all ghost nodes. + - `ghost_nodes=false`: drop the edge. Log warning naming the dangling pair. +5. Emit renderer-specific syntax. 
Ghost nodes are grouped visually apart from in-scope nodes where the renderer supports it (Mermaid: separate `subgraph External`; PlantUML: `package "external" { ... }`). +6. Wrap in RST directive with caption. +7. Return. + +## Non-goals + +- Not sequences, not classes, not state machines — separate skills. +- Not auto-layout tuning — emit simple directional graphs. +- Not diagram-to-needs sync — edges are DERIVED from needs, never a source of truth. diff --git a/.github/agents/pharaoh.context-gather.agent.md b/.github/agents/pharaoh.context-gather.agent.md index 8f21d82..aadffe5 100644 --- a/.github/agents/pharaoh.context-gather.agent.md +++ b/.github/agents/pharaoh.context-gather.agent.md @@ -7,4 +7,108 @@ handoffs: [] Use when retrieving rationale memories relevant to an authoring context from a Papyrus workspace, before invoking any draft or review skill. Returns a structured list of memories (memory_id, text, relevance_score). Does NOT draft, review, or modify artefacts. -See [`skills/pharaoh-context-gather/SKILL.md`](../../skills/pharaoh-context-gather/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-context-gather + +## When to use + +Invoke BEFORE any draft skill (`pharaoh-req-draft`, `pharaoh-arch-draft`, `pharaoh-vplan-draft`, `pharaoh-fmea`) when the project directory contains a `.papyrus/` workspace. The output is a compact bundle of rationale decisions (past design choices, constraints, conventions) that the downstream draft skill must respect. + +Do NOT invoke for project directories without `.papyrus/` — this skill is a no-op in that case. Do NOT invoke after drafting; the purpose is precondition-retrieval only. + +## Atomicity + +- (a) Indivisible — single retrieval action. Does not draft, review, route, or modify artefacts. +- (b) Input: `{feature_context: str, artefact_type: "req"|"arch"|"vplan"|"fmea", project_dir: path}`. 
Output: JSON list of `{memory_id, text, relevance_score}`. +- (c) Reward: recall@k + MRR vs ground-truth relevant IDs. Deterministic, IR-style. +- (d) Reusable: consumed by all 4 draft skills; standalone "show rationale for this change" flow. +- (e) Composable: read-only upstream; never mutates artefact files. + +## Process + +### Step 1: Detect workspace + +Check that `<project_dir>/.papyrus/` exists as a directory AND contains +`memory/` OR `.papyrus/index.json` (Papyrus's internal index). If no +`.papyrus/` at all, return `[]` (no memories, no error) and stop. +If the `project_dir` passed in already points at a path whose basename +is `.papyrus`, treat that as the workspace directly. + +### Step 2: Run semantic recall + +Pass `feature_context` verbatim as the query; Papyrus does cosine similarity +over pre-embedded memory vectors. Requires the workspace to have been indexed +via `papyrus rebuild-index` (with the `papyrus[semantic]` extra installed). + +```bash +papyrus --workspace <project_dir>/.papyrus recall --semantic \ + -q "<feature_context>" --top-k 10 --show-scores --format full +``` + +The output is plaintext with one memory per block. Each block is prefixed by +a line `[score=<cosine>]` and starts with a line `# <id>` (e.g. +`# dec__utc_only`), followed by header lines (`Type:`, `Status:`, +`Confidence:`, `Scope:`, `Title:`, `Tags:`, `Source:`), an empty line, and +the body paragraphs. Optional `Links:` block follows the body. + +Parse by splitting on `^\[score=...\]\s*\n# ` and extracting: +- `id`: the token after `# ` on the line following the score +- `score`: the float inside `[score=...]` +- `text`: everything between the empty-line-after-headers and the start of the `Links:` block (or end of block, whichever comes first) + +### Step 3: Score relevance + +Use the cosine `score` parsed from `[score=<cosine>]` verbatim as +`relevance_score`. Scores are already in `[0, 1]` and already ranked by +Papyrus. 
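
The parsing and scoring in Steps 2 and 3 could be sketched roughly as below. This is only an illustration: the block layout is taken from the prose description above rather than from Papyrus itself, and `parse_recall_output` is a hypothetical helper name, not part of any Papyrus API.

```python
import re

# Assumed layout, as described in Step 2: a "[score=<float>]" line, a "# <id>"
# line, header lines, one blank line, the body, then an optional "Links:" block.
BLOCK_START = re.compile(r"^\[score=([0-9.]+)\]\s*\n# (\S+)[^\n]*\n", re.MULTILINE)


def parse_recall_output(raw: str, top_k: int = 10) -> list[dict]:
    """Turn plaintext recall output into the JSON-ready list from Step 5."""
    starts = list(BLOCK_START.finditer(raw))
    memories = []
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(raw)
        rest = raw[match.end():end]
        # The body begins after the first blank line, which closes the headers.
        _, _, body = rest.partition("\n\n")
        # Drop the optional trailing "Links:" block.
        body = re.split(r"^Links:", body, maxsplit=1, flags=re.MULTILINE)[0]
        memories.append(
            {
                "memory_id": match.group(2),
                "text": body.strip(),
                # Step 3: the cosine score is used verbatim.
                "relevance_score": float(match.group(1)),
            }
        )
    memories.sort(key=lambda m: m["relevance_score"], reverse=True)
    return memories[:top_k]
```

Sorting again is redundant if Papyrus already ranks its output, but it keeps the result ordered even if blocks arrive out of order.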
+ +### Step 4: (reserved) + +### Step 5: Emit structured output + +Return a JSON list ordered by relevance_score descending, top 10 only: + +```json +[ + { + "memory_id": "dec__utc_only", + "text": "All timestamps recorded by the tooling SHALL use UTC. ...", + "relevance_score": 1.0 + }, + ... +] +``` + +Print as a single fenced block: + +```` +```json +<list> +``` +```` + +No surrounding prose. + +## Input + +- `feature_context` (from the caller): 1-3 sentences describing the feature being authored. +- `artefact_type`: one of `req`, `arch`, `vplan`, `fmea` (reserved for future filtering; v1 does not use it). +- `project_dir` (from the caller or inferred from cwd): path whose `.papyrus/` is searched. + +## Output + +JSON array of 0-10 memory objects. Empty array means no workspace or no matches — not an error. + +## Failure modes + +- `papyrus` binary missing → return `[]` (skill is a best-effort assist; do not block chain). +- Workspace empty → return `[]`. +- `papyrus recall` exits non-zero → return `[]`. +- Semantic extra not installed (exits non-zero with "requires papyrus[semantic]") → return `[]`. Caller should install `papyrus[semantic]` and run `papyrus rebuild-index` to enable. + +## Composition + +Upstream: any draft skill. The draft skill reads this skill's JSON output and inserts the memory `text` entries as a "Design decisions to respect" section in its own input context. diff --git a/.github/agents/pharaoh.coverage-gap.agent.md b/.github/agents/pharaoh.coverage-gap.agent.md index b95341d..3ea1611 100644 --- a/.github/agents/pharaoh.coverage-gap.agent.md +++ b/.github/agents/pharaoh.coverage-gap.agent.md @@ -7,4 +7,381 @@ handoffs: [] Use when detecting one gap category (orphan / unverified / duplicate / contradictory / lifecycle / ...) in a sphinx-needs corpus. Returns ordered list of needs falling into that gap. 
-See [`skills/pharaoh-coverage-gap/SKILL.md`](../../skills/pharaoh-coverage-gap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-coverage-gap + +## When to use + +Invoke when you want to scan a sphinx-needs corpus for a single, named gap category and get +back the list of needs that fall into it. + +**One category per invocation.** For a full-corpus audit across all 10 gap categories at once, +use `pharaoh-process-audit` instead. + +Do NOT use to fix the gaps — this skill only detects and lists them. + +--- + +## Inputs + +- **project_root**: path to the sphinx-needs project (must contain `.pharaoh/project/` tailoring + and a built `needs.json` under `docs/_build/needs/needs.json` or equivalent) +- **category**: one of the 10 supported gap categories (see Detection rules below) + +--- + +## Outputs + +A single JSON document — no prose wrapper. Shape: + +```json +{ + "category": "unverified_req", + "detection_rule": "gd_req needs with no tc__* linking to them via :verifies:", + "matches": [ + { + "need_id": "gd_req__abs_pump_activation", + "evidence": ":verification: field absent; no tc found with :verifies: gd_req__abs_pump_activation", + "severity_hint": "high" + } + ], + "match_count": 1, + "false_positive_risk": "low" +} +``` + +Fields: + +| Field | Type | Description | +|---|---|---| +| `category` | string | Echoes the input `category` | +| `detection_rule` | string | One-sentence description of the rule applied | +| `matches` | array | Ordered list of gaps found (most severe / most prominent first) | +| `matches[].need_id` | string | ID of the need falling into the gap | +| `matches[].evidence` | string | Specific observation — what is missing or wrong | +| `matches[].severity_hint` | `"high"` / `"medium"` / `"low"` | Per-match severity for process-audit aggregation | +| `match_count` | integer | `len(matches)` | +| `false_positive_risk` | `"low"` / 
`"medium"` / `"high"` | Signal reliability flag (see Detection rules) | + +--- + +## Detection rules + +### `orphan_arch` + +**Rule:** arch needs that have no `:satisfies:` link, or whose `:satisfies:` target does not +resolve to a `gd_req` in needs.json. + +**Detection:** For every need of type `arch`, check (a) `:satisfies:` field is present and +non-empty, (b) the resolved need exists in needs.json, (c) the resolved need is of type `gd_req`. +Report any that fail any of (a)–(c). + +**false_positive_risk:** `low` — purely link-resolution, no model judgment. + +--- + +### `unverified_req` + +**Rule:** `gd_req` needs with no test case pointing to them via `:verifies:`. + +**Detection:** Build an inverted index: for each `tc` need, collect need IDs in its `:verifies:` +field. For each `gd_req`, check whether its id appears in this index. Report any `gd_req` that +does not appear. + +**false_positive_risk:** `low` — deterministic graph query. + +--- + +### `invalid_lifecycle_transition` + +**Rule:** need whose `status` value is not reachable from the previous recorded state per the +`workflows.yaml` state machine. + +**Detection:** Read `workflows.yaml` transitions. For each need, check: +- `status` is declared in `lifecycle_states`. +- If the need carries a `previous_status` field (or history metadata), the transition is legal + per `transitions`. If no history is available, check only that `status` is a declared state. + +Report needs with undeclared status values, and — when history is available — needs that +skipped a required intermediate state. + +**false_positive_risk:** `medium` — history fields may not be present in all projects; partial +detection when `previous_status` is absent. + +--- + +### `duplicate_req` + +**Rule:** pair of `gd_req` bodies with cosine distance < 0.15 (near-identical text). + +**Detection:** For each pair of `gd_req` needs, compute the cosine similarity of their body +text (TF-IDF or embedding representation). 
Flag pairs where cosine distance < 0.15 (similarity +> 0.85). Report the lower-priority need in each pair as the duplicate (the one with the higher +alphabetical id, as a tie-breaker). + +**false_positive_risk:** `medium` — embedding/TF-IDF similarity is approximation; near-identical +bodies may be intentional specialisations. Include both need IDs in `evidence` to let the user +decide. + +--- + +### `contradictory_req_pair` + +**Rule:** pair of `gd_req` needs where NLI (natural language inference) scores the pair as +"contradiction". + +**Detection:** For each pair of `gd_req` body texts, apply NLI entailment check. Flag pairs +where the contradiction label has the highest logit. Report both need IDs in `evidence`. + +**false_positive_risk:** `high` — NLI models misfire on domain-specific requirements language. +Always include both full bodies in `evidence` for human review. + +--- + +### `missing_fmea` + +**Rule:** `gd_req` needs carrying a safety-relevant tag (any tag matching `ASIL-[ABCD]` or +`safety_goal__*`) but with no `fmea` need referencing them. + +**Detection:** For each `gd_req`, check its `tags` field for ASIL or safety-goal markers. +Build an inverted index: for each `fmea` need, collect the IDs in any reference field +(`:parent_id:` or `:satisfies:` or `:id:` prefix `fmea__<req_stem>`). Report any safety-tagged +`gd_req` not covered. + +**false_positive_risk:** `low` if ASIL tags are present; `medium` if safety relevance must be +inferred from body text. + +--- + +### `stale_review` + +**Rule:** need with `status: inspected` but no review record dated within the last 12 months. + +**Detection:** For each need where `status = inspected`, check for a `:reviewed_by:` or +`:inspection_record:` field. If the field contains a date, verify it is within 365 days of +today. If absent or older than 365 days, report as stale. 
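
A minimal sketch of the `stale_review` check, assuming the review field embeds an ISO 8601 date (`YYYY-MM-DD`) somewhere in its value. That convention, the unprefixed field keys, and the helper name `is_stale` are assumptions for illustration; a project with a different date format would need its own pattern.

```python
import re
from datetime import date, timedelta

# Assumption: review fields carry an ISO date somewhere in their text.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_stale(need: dict, today: date, max_age_days: int = 365) -> bool:
    """Return True for an inspected need with no sufficiently recent review date."""
    if need.get("status") != "inspected":
        return False
    for field in ("reviewed_by", "inspection_record"):
        found = ISO_DATE.search(need.get(field) or "")
        if not found:
            continue
        try:
            reviewed = date.fromisoformat(found.group(0))
        except ValueError:
            continue  # looks like a date but is not a valid one
        if today - reviewed <= timedelta(days=max_age_days):
            return False
    return True
```

Passing `today` explicitly keeps the check deterministic in tests.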
+ +**false_positive_risk:** `medium` — date parsing depends on the convention used in the field +value; projects with no date convention will show `medium` risk. + +--- + +### `broken_back_link` + +**Rule:** need whose `:satisfies:` (or `:verifies:`) target does not exist in needs.json. + +**Detection:** For every need that carries a `:satisfies:` or `:verifies:` field, look up each +referenced ID in needs.json. Report any ID that does not resolve. + +**false_positive_risk:** `low` — pure existence check. + +--- + +### `schema_violation` + +**Rule:** need missing one or more fields listed as `required_fields` in artefact-catalog.yaml +for its type. + +**Detection:** For each need, look up its type in artefact-catalog.yaml. Check that every +`required_fields` entry is present and non-empty in the need's field map. Report violations. + +**false_positive_risk:** `low` — deterministic field presence check. + +--- + +### `wrong_prefix_id` + +**Rule:** need whose `id` does not match the `id_regex` for its type in id-conventions.yaml. + +**Detection:** For each need, look up the `id_regex` for its type in id-conventions.yaml. Apply +the regex to the need's `id` field. Report any mismatch. + +**false_positive_risk:** `low` — deterministic regex match. + +--- + +## Process + +### Step 1: Validate inputs + +Confirm `project_root` and `category` are provided. If `category` is not one of the 10 +supported values, FAIL: + +``` +FAIL: unknown category "<value>". +Supported categories: orphan_arch, unverified_req, invalid_lifecycle_transition, +duplicate_req, contradictory_req_pair, missing_fmea, stale_review, +broken_back_link, schema_violation, wrong_prefix_id. +``` + +--- + +### Step 2: Load tailoring and needs.json + +Read `.pharaoh/project/artefact-catalog.yaml`, `.pharaoh/project/id-conventions.yaml`, and +`.pharaoh/project/workflows.yaml` from `project_root`. 
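
As a rough sketch, the presence check for the three tailoring files might look like this. The helper name `check_tailoring` is hypothetical, and the message wording follows guardrail G3; actual YAML parsing is left out.

```python
from pathlib import Path

TAILORING_FILES = ("artefact-catalog.yaml", "id-conventions.yaml", "workflows.yaml")


def check_tailoring(project_root: Path) -> list[str]:
    """Return one FAIL message per missing tailoring file (empty list = all present)."""
    tailoring_dir = project_root / ".pharaoh" / "project"
    return [
        f"FAIL: {name} not found at .pharaoh/project/{name}."
        for name in TAILORING_FILES
        if not (tailoring_dir / name).is_file()
    ]
```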
+ +Find needs.json: check `<project_root>/docs/_build/needs/needs.json`, then +`<project_root>/_build/needs/needs.json`. Extract the flat needs ID map. + +If needs.json is missing, FAIL with path attempted and rebuild hint. +If tailoring files are missing, FAIL with missing-file name. + +--- + +### Step 3: Apply detection rule + +Apply the detection rule for `category` (see Detection rules above) to the full needs map. +Collect all matching need IDs with evidence and severity hints. + +For categories that require pairwise comparison (`duplicate_req`, `contradictory_req_pair`): +process all pairs. Log the first occurrence of each pair; do not double-report. + +For categories that require external models (cosine similarity, NLI): apply the model; flag +`false_positive_risk` accordingly. + +--- + +### Step 4: Order results + +Sort `matches` by severity_hint: `high` first, then `medium`, then `low`. Within each +severity tier, sort alphabetically by `need_id`. + +--- + +### Step 5: Emit JSON + +Emit the single JSON document. No prose before or after. + +--- + +## Guardrails + +**G1 — Unknown category** + +Unsupported `category` value → FAIL (Step 1) with enumerated list. Do not proceed. + +**G2 — Missing needs.json** + +``` +FAIL: needs.json not found at expected paths. +Rebuild the Sphinx project first: sphinx-build docs/ docs/_build/ +``` + +**G3 — Missing tailoring** + +``` +FAIL: <filename> not found at .pharaoh/project/<filename>. +Run pharaoh-tailor-fill to generate tailoring files. +``` + +**G4 — Empty corpus** + +If needs.json contains zero needs, return: + +```json +{ + "category": "<category>", + "detection_rule": "<rule>", + "matches": [], + "match_count": 0, + "false_positive_risk": "low" +} +``` + +Do not FAIL — an empty corpus is valid (no gaps by definition). + +--- + +## Advisory chain + +`chains_to: []` — this skill is terminal. If `match_count > 0`, append after the JSON: + +``` +Use `pharaoh-process-audit` to run all 10 gap categories in one pass. 
+``` + +--- + +## Worked example + +**Run against the Score project — two categories:** + +### Category 1: `unverified_req` + +**Inputs:** `project_root = examples/my-project`, `category = unverified_req` + +**Step 2:** tailoring loaded; needs.json found with 185 `gd_req` needs and 63 `tc` needs. + +**Step 3:** inverted index built from `tc` `:verifies:` fields. After scanning all 185 +`gd_req` ids, 3 are not found in the index: +- `gd_req__impl_complexity_analysis` — no `:verification:` field; no tc found +- `gd_req__power_budget_monitoring` — `:verification:` present but referenced tc not in needs.json +- `gd_req__diag_log_rotation` — no tc with matching `:verifies:` + +```json +{ + "category": "unverified_req", + "detection_rule": "gd_req needs with no tc__* linking to them via :verifies:", + "matches": [ + { + "need_id": "gd_req__impl_complexity_analysis", + "evidence": ":verification: field absent; no tc found with :verifies: gd_req__impl_complexity_analysis", + "severity_hint": "high" + }, + { + "need_id": "gd_req__power_budget_monitoring", + "evidence": ":verification: tc__power_budget_001 present but tc not found in needs.json (broken link)", + "severity_hint": "high" + }, + { + "need_id": "gd_req__diag_log_rotation", + "evidence": "no tc found with :verifies: gd_req__diag_log_rotation", + "severity_hint": "medium" + } + ], + "match_count": 3, + "false_positive_risk": "low" +} +``` + +--- + +### Category 2: `schema_violation` + +**Inputs:** `project_root = examples/my-project`, `category = schema_violation` + +**Step 3:** artefact-catalog.yaml loaded. For `gd_req`, required fields are `[id, status, satisfies]`. +After scanning all needs: 1 `gd_req` missing `:satisfies:` field; 2 `arch` needs missing +`:type:` field. 
+ +```json +{ + "category": "schema_violation", + "detection_rule": "needs missing required_fields listed in artefact-catalog.yaml for their type", + "matches": [ + { + "need_id": "arch__diag_subsystem", + "evidence": "arch required field 'type' is absent", + "severity_hint": "medium" + }, + { + "need_id": "arch__power_mgmt_module", + "evidence": "arch required field 'type' is absent", + "severity_hint": "medium" + }, + { + "need_id": "gd_req__impl_complexity_analysis", + "evidence": "gd_req required field 'satisfies' is absent", + "severity_hint": "medium" + } + ], + "match_count": 3, + "false_positive_risk": "low" +} +``` + +Use `pharaoh-process-audit` to run all 10 gap categories in one pass. diff --git a/.github/agents/pharaoh.decide.agent.md b/.github/agents/pharaoh.decide.agent.md index 4e942fb..457e826 100644 --- a/.github/agents/pharaoh.decide.agent.md +++ b/.github/agents/pharaoh.decide.agent.md @@ -113,3 +113,349 @@ Strictness has no effect on decision recording. Both modes follow the same proce 5. **Reuse @pharaoh.author** for RST writing, file placement, and ID generation. Do not duplicate logic. 6. **Validate `:decides:` targets exist.** Warn if a target is missing from the needs index. 7. **Semicolons for alternatives.** Separate with semicolons, not commas. + +--- + +## Full atomic specification + +# pharaoh-decide + +Record design decisions as traceable sphinx-needs `decision` directives. Each decision captures the chosen option, rejected alternatives, rationale, and explicit links to the requirements or specifications it affects. This skill ensures every decision has proper `decided_by`, `alternatives`, and `rationale` fields. + +--- + +## When to Use + +- When a design choice has been made and must be recorded for traceability. +- When comparing alternatives for a requirement or specification and committing to one. +- When `pharaoh:spec` identifies a gap and calls this skill programmatically to record the resolution. 
+- When superseding an earlier decision with a new one. + +## Prerequisites + +- The workspace must contain at least one sphinx-needs project with a `decision` type configured. +- No workflow gates. This skill runs freely in both advisory and enforcing modes. + +--- + +## Process + +Execute the following steps in order. + +--- + +### Step 1: Get Project Data + +Follow the full detection and data access algorithm defined in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md). + +1. Detect project structure (project roots, source directories, configuration). +2. Read project configuration (need types, link types, ID settings). +3. Build the needs index using the best available data tier (ubc CLI, ubCode MCP, or raw file parsing). +4. Read `pharaoh.toml` for strictness level and workflow settings. + +Present the detection summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Strictness: <advisory|enforcing> +``` + +#### Verify `decision` type exists + +After reading the project configuration, confirm that a type with directive name `decision` is present in the types list. + +If `decision` is not configured, show the user the exact TOML to add: + +```toml +[[needs.types]] +directive = "decision" +title = "Decision" +prefix = "DEC_" +color = "#E8D0A9" +style = "node" +``` + +Also verify that these extra options are configured: + +```toml +[needs.fields.decided_by] +description = "Who made the decision" + +[needs.fields.alternatives] +description = "Rejected alternatives, semicolon-separated" + +[needs.fields.rationale] +description = "Why this option was chosen" +``` + +And verify the `decides` link type exists: + +```toml +[needs.extra_links.decides] +incoming = "is decided by" +outgoing = "decides for" +``` + +Ask the user to confirm before proceeding if any of these are missing. 
+ +--- + +### Step 2: Gather Decision Context + +Determine what to record. The following pieces of information are required: + +- **Title**: What is being decided (e.g., "Use CAN bus for brake pedal sensor"). +- **Affected needs**: Need IDs for the `:decides:` link (e.g., `REQ_001, SPEC_001`). +- **decided_by**: Who made this decision. Default to `claude` when the AI is generating the decision autonomously. Ask the user otherwise. +- **alternatives**: Rejected alternatives, semicolon-separated (e.g., `SPI at 1MHz; Direct analog input`). +- **rationale**: Why this option was chosen over the alternatives. +- **status**: One of `proposed`, `accepted`, `superseded`, `rejected`. + +#### When called standalone + +Prompt the user for each missing piece. Do not proceed until all five fields (title, affected needs, decided_by, alternatives, rationale) are populated. If the user omits any field, ask for it explicitly. + +#### When called by `pharaoh:spec` + +Accept all context programmatically. Do not prompt. All required fields must be provided by the calling skill. + +#### Status defaults + +- **Standalone invocation**: Default status to `proposed`. +- **Called by `pharaoh:spec`**: Default status to `accepted`. + +The user may override the default in either case. + +--- + +### Step 3: Generate ID + +1. Check `pharaoh.toml` for `[pharaoh.id_scheme]`. If a pattern exists, apply it with `{TYPE}` resolving to `DEC`. +2. If no id_scheme is configured, infer the pattern from existing `decision` needs in the index. Look for `DEC_*` IDs and determine the numbering scheme. +3. If no existing decisions exist, use the prefix from the type configuration (e.g., `DEC_`) and start at `001`, padded to match `id_length`. +4. Validate uniqueness against the full needs index. If the generated ID already exists, increment until a unique ID is found. + +--- + +### Step 4: Write the Need + +Write the directive directly to the target file with all fields populated: + +```rst +.. 
decision:: <title> + :id: <generated_id> + :status: <proposed|accepted> + :decides: <need_id1>, <need_id2> + :decided_by: <name or claude> + :alternatives: <alt1>; <alt2> + :rationale: <why this option> + + <expanded description> +``` + +The expanded description should summarize the decision in one to three sentences, covering what was chosen and why the alternatives were rejected. + +#### Superseding an existing decision + +When a new decision replaces an old one: + +1. Locate the old decision's directive in its RST file and change its `:status:` field value to `superseded`. +2. On the new decision, add `:links: <old_dec_id>` to establish the supersession chain. +3. The new decision's description should reference the old decision ID and explain why it is being replaced. + +--- + +### Step 5: File Placement + +Place the decision in `decisions.rst` in the same directory as the affected requirements. + +1. Identify the directory of the first need listed in `:decides:`. Use the needs index to find its file path. +2. Check if `decisions.rst` exists in that directory. If it does, append the new decision after the last existing `decision` directive in the file. +3. If `decisions.rst` does not exist, create it with a proper RST title: + +```rst +Decisions +========= + +.. decision:: <title> + :id: <id> + ... +``` + +4. If no `:decides:` links are specified (rare case), place the decision in `decisions.rst` at the sphinx-needs source root. + +--- + +### Step 6: Update Session State + +After successfully writing the decision: + +1. Read `.pharaoh/session.json` (or initialize if it does not exist). +2. Create the `.pharaoh/` directory if it does not exist. +3. For the decision need ID, set `changes.<dec_id>.authored = true`: + +```json +{ + "<dec_id>": { + "change_analysis": null, + "acknowledged": false, + "authored": true, + "verified": false + } +} +``` + +4. Set `updated` to the current ISO 8601 timestamp. +5. Write the updated JSON back to `.pharaoh/session.json`. 
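
The session update in Step 6 might be sketched as follows. It assumes the per-need entries live under a top-level `changes` key next to `updated`, which is how the step text reads; the helper name `mark_authored` is hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def mark_authored(project_root: Path, dec_id: str) -> dict:
    """Set changes.<dec_id>.authored = true in .pharaoh/session.json."""
    session_path = project_root / ".pharaoh" / "session.json"
    session_path.parent.mkdir(parents=True, exist_ok=True)
    if session_path.exists():
        session = json.loads(session_path.read_text(encoding="utf-8"))
    else:
        session = {}  # initialize when no session exists yet
    entry = session.setdefault("changes", {}).setdefault(
        dec_id,
        {"change_analysis": None, "acknowledged": False, "authored": False, "verified": False},
    )
    entry["authored"] = True
    session["updated"] = datetime.now(timezone.utc).isoformat()
    session_path.write_text(json.dumps(session, indent=2) + "\n", encoding="utf-8")
    return session
```

Using `setdefault` leaves flags that other skills already recorded for the same ID untouched.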
+ +--- + +### Step 7: Follow-up + +#### Standalone invocation + +After writing the decision, suggest the next step: + +``` +Next step: Run pharaoh:req-review to validate the decision against its linked requirements. +``` + +#### Called by `pharaoh:spec` + +Return the decision ID silently. Do not print follow-up suggestions. The calling skill manages the workflow. + +--- + +## Strictness Behavior + +Follow the instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md). + +### Advisory mode + +Execute freely. No gates. Decisions have no prerequisites and do not gate other skills. Do not show tips -- decisions can be recorded at any time without prior analysis. + +### Enforcing mode + +Execute freely. No gates. Decisions can be recorded at any time regardless of strictness level. This skill is gate-free in both modes. + +Strictness has no effect on decision recording. Both modes follow the same process. + +--- + +## Key Constraints + +1. **All three fields are mandatory.** Always populate `decided_by`, `alternatives`, and `rationale`. If the user omits any of them, ask explicitly. Do not write a decision with missing fields. +2. **Default `decided_by` to `claude`** when the AI is making the decision autonomously (e.g., during `pharaoh:spec` execution). +3. **Default `status` to `proposed`** when standalone, `accepted` when called by `pharaoh:spec`. +4. **Superseding requires two writes.** When replacing an old decision, update the old decision's status to `superseded` AND add `:links:` on the new decision referencing the old one. +5. **Write RST directly.** Generate the directive text and write it to the target file. Do not delegate to any other skill for file operations. +6. **Validate `:decides:` targets exist.** Every need ID in the `:decides:` field must exist in the needs index. If a target does not exist, warn the user and ask whether to proceed. +7. 
**Semicolons for alternatives.** Separate rejected alternatives with semicolons, not commas. Commas are reserved for need ID lists. + +--- + +## Examples + +### Example 1: Standalone decision recording + +**User request**: "Record a decision that we chose PostgreSQL over MongoDB for the data store" + +**Step 1** -- Data access detects: + +``` +Project: Backend Service (ubproject.toml) +Types: req, spec, impl, test, decision +Links: links, implements, tests, decides +Data source: Tier 3 (raw file parsing) +Needs found: 12 +Strictness: advisory +``` + +**Step 2** -- Gather context. The user provided the title and one alternative. Prompt for missing fields: + +``` +Recording decision: "Use PostgreSQL for the data store" + +Which requirements or specs does this decision affect? +> REQ_003, SPEC_005 + +Who is making this decision? +> engineering-lead + +Any other rejected alternatives besides MongoDB? +> Redis as primary store + +Why was PostgreSQL chosen over the alternatives? +> PostgreSQL provides ACID transactions and mature JSON support, + which MongoDB and Redis cannot guarantee for our consistency requirements. +``` + +**Step 3** -- Generate ID. Existing decisions: DEC_001, DEC_002. Next ID: `DEC_003`. + +**Step 4** -- Directive written directly to file: + +```rst +.. decision:: Use PostgreSQL for the data store + :id: DEC_003 + :status: proposed + :decides: REQ_003, SPEC_005 + :decided_by: engineering-lead + :alternatives: MongoDB; Redis as primary store + :rationale: PostgreSQL provides ACID transactions and mature JSON support required for data consistency + + Selected PostgreSQL over MongoDB and Redis. MongoDB lacks full ACID transaction + support across collections. Redis does not provide durable storage guarantees + suitable for a primary data store. PostgreSQL meets the consistency requirements + defined in REQ_003. +``` + +**Step 5** -- Written to `docs/decisions.rst` (same directory as REQ_003). 
+ +**Step 6** -- Session state updated: `DEC_003.authored = true`. + +**Step 7** -- Follow-up: + +``` +Next step: Run pharaoh:req-review to validate the decision against its linked requirements. +``` + +--- + +### Example 2: Decision during spec generation (programmatic call) + +`pharaoh:spec` identifies that no specification covers the communication protocol for subsystem X. It calls `pharaoh:decide` with all context provided: + +- **Title**: "Use CAN bus for brake pedal sensor communication" +- **decides**: `REQ_001, SPEC_001` +- **decided_by**: `claude` +- **alternatives**: `SPI at 1MHz; Direct analog input` +- **rationale**: "CAN bus provides noise immunity required for safety-critical braking" +- **status**: `accepted` + +`pharaoh:decide` executes without prompting: + +1. Verifies `decision` type exists in configuration. +2. Generates ID: `DEC_004`. +3. Writes directive directly to `docs/decisions.rst`: + +```rst +.. decision:: Use CAN bus for brake pedal sensor communication + :id: DEC_004 + :status: accepted + :decides: REQ_001, SPEC_001 + :decided_by: claude + :alternatives: SPI at 1MHz; Direct analog input + :rationale: CAN bus provides noise immunity required for safety-critical braking + + Selected CAN bus over SPI and direct analog based on EMC requirements. + SPI at 1MHz lacks sufficient noise immunity for the safety-critical braking + subsystem. Direct analog input introduces unacceptable signal degradation + over the required cable lengths. +``` + +4. Updates session state: `DEC_004.authored = true`. +5. Returns `DEC_004` to `pharaoh:spec`. No follow-up message printed. 
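
The ID allocation used in both examples (Step 3 without a configured `id_scheme`) might look roughly like this. The helper name `next_decision_id` and its `digits` parameter are hypothetical; in particular, treating the padding as a plain digit count is an assumption about how `id_length` is meant to apply.

```python
import re


def next_decision_id(existing_ids: set[str], prefix: str = "DEC_", digits: int = 3) -> str:
    """Return the next unused decision ID, inferred from IDs already in the index."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    # Collect numeric suffixes of existing decision IDs; other need types are ignored.
    used = [int(m.group(1)) for need_id in existing_ids if (m := pattern.match(need_id))]
    number = max(used, default=0) + 1
    candidate = f"{prefix}{number:0{digits}d}"
    # Uniqueness check against the full index: increment until the ID is free.
    while candidate in existing_ids:
        number += 1
        candidate = f"{prefix}{number:0{digits}d}"
    return candidate
```

With `DEC_001` and `DEC_002` already present this yields `DEC_003`, matching Example 1; an empty index starts at `DEC_001`.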
diff --git a/.github/agents/pharaoh.decision-record.agent.md b/.github/agents/pharaoh.decision-record.agent.md index c7cccc6..0e9e67d 100644 --- a/.github/agents/pharaoh.decision-record.agent.md +++ b/.github/agents/pharaoh.decision-record.agent.md @@ -7,4 +7,116 @@ handoffs: [] Use when recording a canonical decision, fact, or preference in the shared Papyrus workspace with automatic dedup on (type, canonical_name). Returns {action: wrote|duplicate, papyrus_id}. Generalizes pharaoh-finding-record beyond audit findings. -See [`skills/pharaoh-decision-record/SKILL.md`](../../skills/pharaoh-decision-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-decision-record + +## When to use + +Invoke from any authoring or multi-agent coordination workflow that needs to record a canonical named item (architecture decision, domain fact, style preference) in a way that survives concurrent writers without duplication. Typical callers: `pharaoh-req-from-code` (when a type or concept is first surfaced), `pharaoh-decide` chains, multi-agent reverse-engineering fan-out. + +Do NOT invoke for informational or non-canonical observations, and do NOT invoke when `pharaoh-finding-record` is a better fit (audit-finding category + subject_id tuple). + +## Atomicity + +- (a) Indivisible — single write-or-dedup action. Does not author, classify, or retrieve. +- (b) Input: `{type: "dec"|"fact"|"pref", canonical_name: str, body: str, tags?: list[str], reporter_id: str}`. Output: `{action: "wrote"|"duplicate"|"error", papyrus_id: str, dup_of?: str, message?: str}`. +- (c) Reward: deterministic — two writers for the same `(type, canonical_name)` must produce exactly 1 `"wrote"` + 1 `"duplicate"`; measured via fixture. +- (d) Reusable: any multi-agent workflow needing canonical-vocabulary coordination; ADR capture; design-decision dedup. 
+- (e) Composable: Papyrus write-only; never modifies artefact files or invokes other skills. + +## Input + +- `type`: one of `dec` (decision), `fact` (domain fact), `pref` (preference). Controls the Papyrus need type. +- `canonical_name`: canonical identifier for the subject. Style (snake_case, CamelCase, etc.) MUST match the frozen vocabulary if one exists for the project. If the caller is unsure, it MUST first invoke `pharaoh-context-gather` and reuse an existing canonical before coining a new one. +- `body`: 1-3 sentence description. Stored verbatim as the Papyrus need body. +- `tags`: optional list of free-form tags. +- `reporter_id`: caller identifier (e.g. `req-from-code:health_monitor.cpp`). Stored in the Papyrus need `source` field for traceability. + +## Output + +Exactly one single-line JSON object, no prose: + +```json +{"action": "wrote", "papyrus_id": "FACT_HealthMonitor"} +``` + +or: + +```json +{"action": "duplicate", "papyrus_id": "FACT_HealthMonitor", "dup_of": "FACT_HealthMonitor"} +``` + +or, on subprocess failure: + +```json +{"action": "error", "papyrus_id": "FACT_HealthMonitor", "message": "<stderr-first-line>"} +``` + +## Process + +### Step 1: Construct deterministic ID + +``` +papyrus_id = uppercase(type) + "_" + sanitize(canonical_name) +``` + +`sanitize` replaces non-alphanumeric characters with underscores, collapses consecutive underscores, strips leading/trailing underscores. Case is PRESERVED (do not lowercase). 
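
A small sketch of the ID construction, assuming "alphanumeric" means ASCII letters and digits (the spec does not say how non-ASCII characters are treated). The function names mirror the pseudocode above but are otherwise illustrative.

```python
import re


def sanitize(canonical_name: str) -> str:
    """Replace non-alphanumerics with '_', collapse runs, strip the ends; case is kept."""
    replaced = re.sub(r"[^A-Za-z0-9]", "_", canonical_name)  # ASCII-only: an assumption
    collapsed = re.sub(r"_+", "_", replaced)
    return collapsed.strip("_")


def papyrus_id(type_: str, canonical_name: str) -> str:
    """Build the deterministic ID: uppercased type, underscore, sanitized name."""
    return f"{type_.upper()}_{sanitize(canonical_name)}"
```

Because the function is pure, two writers that agree on `(type, canonical_name)` always compute the same ID, which is what the dedup step relies on.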
+ +Examples: +- `("fact", "HealthMonitor")` → `FACT_HealthMonitor` +- `("fact", "heartbeat_timeout")` → `FACT_heartbeat_timeout` +- `("dec", "use thread pool for monitors")` → `DEC_use_thread_pool_for_monitors` + +### Step 2: Attempt `papyrus add` + +```bash +papyrus --workspace .papyrus add <papyrus_need_type> \ + "<canonical_name>" \ + --id <papyrus_id> \ + --body "<body>" \ + --tags "canonical:<canonical_name>,<joined_tags>" \ + --source "<reporter_id>" \ + --scope local +``` + +The `<papyrus_need_type>` argument maps as: `dec` → `decision`, `fact` → `fact`, `pref` → `preference`. + +### Step 3: Interpret result + +- Exit 0 → emit `{"action": "wrote", "papyrus_id": "<id>"}`. +- Exit non-zero with stderr containing `"already exists"` → emit `{"action": "duplicate", "papyrus_id": "<id>", "dup_of": "<id>"}`. +- Any other non-zero exit → emit `{"action": "error", "papyrus_id": "<id>", "message": "<stderr-first-line>"}` and return; the caller must not retry. + +No surrounding prose. Emit exactly one JSON object per invocation. + +## Dedup semantics + +- Match key is `(type, canonical_name)`. Body and tag differences do NOT suppress dedup — first writer wins and sets canonical body. +- `reporter_id` difference does NOT suppress dedup — two callers arriving at the same concept from different files still collapse to one record. +- Concurrent writes for the same `papyrus_id` are serialized by the Papyrus `FileLock`; only one succeeds, the others get `"duplicate"`. +- Case is significant in the ID: `FACT_HealthMonitor` and `FACT_healthmonitor` do NOT dedup. The caller is responsible for consistent casing (via `pharaoh-context-gather` lookup before coining). + +## Failure modes + +- `papyrus` binary missing → emit `{"action": "error", "message": "papyrus CLI not found"}`. +- `.papyrus/` workspace missing → emit `{"action": "error", "message": "no .papyrus/ workspace in cwd"}`. +- Any other subprocess failure → emit `{"action": "error", "message": "<stderr-first-line>"}`. 
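
The exit-code mapping from Step 3 could be expressed as follows. This is a sketch under the assumption that the caller has already run `papyrus add` and captured its return code and stderr; `interpret_result` is a hypothetical name, and no Papyrus API is called here.

```python
import json


def interpret_result(papyrus_id: str, returncode: int, stderr: str) -> str:
    """Map a finished `papyrus add` subprocess to the single-line JSON output."""
    if returncode == 0:
        result = {"action": "wrote", "papyrus_id": papyrus_id}
    elif "already exists" in stderr:
        # First writer wins; later writers report the existing record.
        result = {"action": "duplicate", "papyrus_id": papyrus_id, "dup_of": papyrus_id}
    else:
        # Any other failure: surface only the first stderr line, no retry.
        lines = stderr.strip().splitlines()
        result = {"action": "error", "papyrus_id": papyrus_id, "message": lines[0] if lines else ""}
    return json.dumps(result)
```

`json.dumps` without `indent` keeps the object on one line, in line with the "exactly one single-line JSON object" output contract.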
+ +## Relationship to `pharaoh-finding-record` + +`pharaoh-finding-record` is the audit-specialized sibling: it constrains `category` to a known enum and derives IDs from `(category, subject_id)`. `pharaoh-decision-record` accepts any `(type, canonical_name)` pair. Existing Phase 4b audit fan-out continues to use `pharaoh-finding-record`; Phase 4c's reverse-engineering fan-out uses `pharaoh-decision-record`. + +A future cleanup may reimplement `pharaoh-finding-record` as a thin wrapper over `pharaoh-decision-record`; that refactor is out of scope for Phase 4c. + +## Last step + +After emitting the artefact, invoke `pharaoh-decision-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-decision-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. + +## Composition + +Each caller invokes this skill once per canonical subject surfaced. The orchestrator or harness then reads the final Papyrus workspace via `papyrus recall` for the aggregated vocabulary. 
diff --git a/.github/agents/pharaoh.decision-review.agent.md b/.github/agents/pharaoh.decision-review.agent.md index 35d3bb0..c614e53 100644 --- a/.github/agents/pharaoh.decision-review.agent.md +++ b/.github/agents/pharaoh.decision-review.agent.md @@ -7,4 +7,47 @@ handoffs: [] Use when auditing a single recorded decision (DR / ADR / design note) against the generic decision review axes in `shared/checklists/decision.md`. Checks context/alternatives/consequences structure, traceability to affected artefacts, rationale completeness. Emits structured findings JSON. -See [`skills/pharaoh-decision-review/SKILL.md`](../../skills/pharaoh-decision-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-decision-review + +## When to use + +Invoke after `pharaoh-decision-record` wrote a decision memory. Part of the self-review invariant. + +## Atomicity + +- (a) One decision + one checklist in → one findings JSON out. +- (b) Input: `{target: <decision_rst_or_memory_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON. +- (c) Reward: fixtures `passing-decision.rst` + `failing-decision.rst` with expected findings. +- (d) Reusable. +- (e) Read-only. + +## Input + +- `target`: RST directive block for a `decision` directive, OR a Papyrus memory_id of type `decision`. +- `checklist_path`: `shared/checklists/decision.md`. 
+ +## Output + +```json +{ + "need_id": "dr__example", + "type": "decision", + "axes": { + "context_section_present": {"passed": true}, + "alternatives_listed": {"passed": true, "reason": "3 alternatives + chosen=4"}, + "consequences_section_present":{"passed": true}, + "trace_to_affected_artefacts":{"passed": true, "reason": "links 2 reqs and 1 arch"}, + "canonical_name_unique": {"passed": true, "reason": "no dup in papyrus"}, + "rationale_quality": {"score": 3} + }, + "overall": "pass" +} +``` + +## Review axes + +See [`shared/checklists/decision.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/decision.md). diff --git a/.github/agents/pharaoh.deployment-diagram-draft.agent.md b/.github/agents/pharaoh.deployment-diagram-draft.agent.md index b35581c..4a3aa4f 100644 --- a/.github/agents/pharaoh.deployment-diagram-draft.agent.md +++ b/.github/agents/pharaoh.deployment-diagram-draft.agent.md @@ -7,4 +7,84 @@ handoffs: [] Use when drafting one deployment diagram showing physical nodes (ECUs, servers, boards), the software artefacts deployed on each, and communication channels (buses, networks). Typical ASPICE usage — SYS.3 System Architectural Design; essential for automotive HW/SW allocation per ISO 26262 Part 5 (HW) and Part 6 (SW). Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-deployment-diagram-draft/SKILL.md`](../../skills/pharaoh-deployment-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-deployment-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-deployment-diagram-draft is planned but not implemented; see SKILL.md"`. 
+ +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.deployment]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one deployment diagram. Captures **execution environment topology**: physical/virtual nodes, the software artefacts (components, containers, services) deployed on each, and the communication channels between nodes (CAN bus, Ethernet, IPC, HTTP, etc.). + +Typical ASPICE / ISO 26262 context: +- **SYS.3 System Architectural Design**: allocation of system elements to HW. +- **ISO 26262 Part 5 (Hardware level)**: mapping safety goals to HW elements. +- **ISO 26262 Part 6 (Software level)**: SW partitioning across ECUs with ASIL tagging. + +## Atomicity + +- (a) One deployment topology in → one diagram out. +- (b) Input: `{view_title: str, nodes: list[NodeSpec], artefacts: list[ArtefactSpec], deployments: list[DeploymentSpec], channels: list[ChannelSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `NodeSpec = {id: str, label: str, kind?: "device"|"ecu"|"server"|"cloud"|"container", stereotype?: str, asil?: "A"|"B"|"C"|"D"|"QM"}`, `ArtefactSpec = {id: str, label: str, kind?: "component"|"container"|"library"|"binary"|"config"}`, `DeploymentSpec = {node: str, artefact: str}`, `ChannelSpec = {from: str, to: str, label?: str, protocol?: str}`. Output: one RST directive block. 
+- (c) Reward: fixture — two ECUs (ECU_A ASIL B, ECU_B ASIL D), three artefacts, deployments mapping artefacts to ECUs, CAN bus channel between them. Scorer: + 1. Output starts with renderer directive. + 2. Every node rendered with cube/node shape. + 3. Every artefact rendered inside its deployed node. + 4. Every channel rendered with labeled arrow and protocol annotation. + 5. ASIL tag (when present) visible on node label or as stereotype. + 6. Deployment without a matching node/artefact → FAIL (dangling deployment). + + Pass = all 6. +- (d) Reusable across embedded / distributed / cloud projects; not automotive-specific (ASIL tag is optional). +- (e) One diagram per call. + +## Dangling relationships + +FAIL on `deployments.node` not in `nodes`, `deployments.artefact` not in `artefacts`, `channels.from`/`channels.to` not in `nodes`. + +## Output + +**PlantUML (richest deployment syntax):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + node "ECU_A\n<<ASIL B>>" as ecuA { + artifact "sensor_driver" as art1 + artifact "can_stack" as art2 + } + node "ECU_B\n<<ASIL D>>" as ecuB { + artifact "brake_controller" as art3 + } + ecuA ..> ecuB : CAN (500kbit/s) + @enduml +``` + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + flowchart LR + subgraph ECU_A["ECU_A (ASIL B)"] + art1[sensor_driver] + art2[can_stack] + end + subgraph ECU_B["ECU_B (ASIL D)"] + art3[brake_controller] + end + ECU_A -.CAN 500kbit/s.-> ECU_B +``` + +## Non-goals + +- No electrical schematic — use dedicated EE tools for that. +- No real-time timing analysis on channels — a separate `pharaoh-timing-diagram-draft` could cover message schedules. +- No auto-derivation from HW description files (e.g. ARXML) — caller provides explicit node and deployment specs. 
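
Since the skill is design-only, the dangling-relationship rule has no reference implementation yet. One possible shape for that check, using plain dicts for the `NodeSpec` / `ArtefactSpec` / `DeploymentSpec` / `ChannelSpec` inputs, is sketched here; `find_dangling` and its message wording are hypothetical.

```python
def find_dangling(nodes: list[dict], artefacts: list[dict],
                  deployments: list[dict], channels: list[dict]) -> list[str]:
    """Return one message per reference to an undeclared node or artefact."""
    node_ids = {n["id"] for n in nodes}
    artefact_ids = {a["id"] for a in artefacts}
    errors: list[str] = []

    for dep in deployments:
        if dep["node"] not in node_ids:
            errors.append(f"deployment references unknown node {dep['node']!r}")
        if dep["artefact"] not in artefact_ids:
            errors.append(f"deployment references unknown artefact {dep['artefact']!r}")

    for ch in channels:
        for end in ("from", "to"):
            if ch[end] not in node_ids:
                errors.append(f"channel {end!r} endpoint references unknown node {ch[end]!r}")

    return errors
```

A non-empty result would correspond to the FAIL described under "Dangling relationships"; an empty list means every deployment and channel resolves.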
diff --git a/.github/agents/pharaoh.diagram-lint.agent.md b/.github/agents/pharaoh.diagram-lint.agent.md index 45014e4..5c1bce2 100644 --- a/.github/agents/pharaoh.diagram-lint.agent.md +++ b/.github/agents/pharaoh.diagram-lint.agent.md @@ -10,4 +10,186 @@ handoffs: Use when running a terminal validation step over a directory of RST files to catch Mermaid / PlantUML parse failures that sphinx-build cannot detect. Extracts every `.. mermaid::` and `.. uml::` block and pipes it to the real renderer parser (mmdc / plantuml -checkonly). Returns structured findings. Does NOT modify the RST files. -See [`skills/pharaoh-diagram-lint/SKILL.md`](../../skills/pharaoh-diagram-lint/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-diagram-lint + +## When to use + +Invoke after a reverse-engineering or diagram-emission plan has written RST files containing Mermaid or PlantUML diagram blocks, as a terminal check before the plan's `pharaoh-quality-gate` consumes the results. `sphinx-build` does not validate diagram bodies at build time — it hands them to the browser renderer unchanged. A parse failure is therefore invisible in CI logs and surfaces only when a human opens the page. This skill is the parser in the validation loop. + +Do NOT invoke to modify diagrams (this skill is read-only). Do NOT invoke on a single RST file where you already hand-validate with `mmdc` — that workflow does not need an atomic skill. Do NOT use this skill to replace `pharaoh-quality-gate`; it is one of the checks the gate consumes. + +## Why this skill exists + +Mermaid diagrams can pass `sphinx-build -nW --keep-going -b html` with zero warnings while rendering as `Syntax error in text` in the browser. Prose review of surrounding artefacts does not catch this because it has no Mermaid parser. 
Running every diagram through `@mermaid-js/mermaid-cli` (matching the version sphinxcontrib-mermaid pins) surfaces parse errors the sphinx build misses. + +Structural validation of RST (directive options, needs schema) is necessary but insufficient. Every artefact type with its own render pipeline needs its own parser in the validation loop. This skill is that parser for Mermaid and PlantUML. + +## Atomicity + +- (a) **Indivisible.** One directory in → one findings report out. No RST mutation. No diagram authoring. No scope outside Mermaid/PlantUML block parsing. +- (b) **Typed I/O.** + - Input: `{docs_dir: str, strictness: "fail_on_any" | "report_only", renderers?: list["mermaid" | "plantuml"], mermaid_cli?: str, plantuml_cli?: str, reporter_id: str, papyrus_workspace?: str}`. + - Output: `{findings: list[{file: str, line: int, renderer: "mermaid"|"plantuml", block_index: int, parser_exit_code: int, parser_stderr: str, severity: "error"|"warning"}], summary: {blocks_scanned: int, blocks_failed: int, renderers_covered: list[str]}, status: "pass" | "fail" | "degraded"}`. `degraded` = scanner ran but at least one renderer CLI was not installed; findings cover the renderers that WERE available. +- (c) **Execution-based reward.** Fixture `pharaoh-validation/fixtures/pharaoh-diagram-lint/`: + - `docs/good.rst` — two valid Mermaid blocks (one sequenceDiagram, one flowchart). + - `docs/bad_semicolon.rst` — one Mermaid sequence diagram with a `;` in a message label (prior dogfooding defect). + - `docs/bad_pipe.rst` — one Mermaid flowchart with an unescaped `|` in an edge label. + - `docs/bad_plantuml.rst` — one `.. uml::` block with an unterminated `@startuml`. + - `docs/good.rst` must score zero findings; each `bad_*.rst` must produce at least one finding with `parser_exit_code != 0`. 
+ - Scorer runs `pharaoh-diagram-lint` against `fixtures/pharaoh-diagram-lint/docs` with `strictness: report_only` and asserts: `summary.blocks_scanned == 5`, `summary.blocks_failed == 3`, one finding per bad file, `status == "fail"`. + - Idempotence: re-running on the same directory returns the same findings list (order stable by `file, line`). +- (d) **Reusable.** Any directory of RST. Not tied to Pharaoh pipelines — CI integrations, editor-in-the-loop lint, pre-commit hooks can use it. +- (e) **Composable.** `pharaoh-quality-gate` reads the `findings` and aggregates into its report under a `diagram_lint` section. The reverse-engineer-project template adds this skill as a dependency of `quality_gate`. + +## Input + +- `docs_dir` (required): absolute path to a directory. Scanner walks `**/*.rst` under it. +- `strictness` (required): `"fail_on_any"` returns `status: "fail"` if any finding has `severity: error`; `"report_only"` always returns the findings list with `status: "fail"` or `"pass"` based on findings but does not treat this as a skill failure. Plans wire this to `pharaoh.toml [pharaoh.quality_gate].strict`. +- `renderers` (optional): subset of `["mermaid", "plantuml"]`. Default: both. Useful for projects that emit one renderer only. +- `mermaid_cli` (optional): command name / path to the Mermaid CLI. Default `"mmdc"` (on `$PATH`). The skill uses whatever is resolved; no bundled tool. If unresolved and mermaid blocks are present, emit a `degraded` status + warning naming the installation command (`npm install -g @mermaid-js/mermaid-cli@11`). +- `plantuml_cli` (optional): command name / path to the PlantUML CLI. Default `"plantuml"`. Fallback installation command: `brew install plantuml` or `apt-get install plantuml`. +- `reporter_id` (required): short agent id, passed to `pharaoh-finding-record` calls. +- `papyrus_workspace` (optional): path to `.papyrus/` for recording findings as dedup-aware records. If absent, findings are only returned; not persisted. 
+ +## Output + +Single JSON object. Example: + +```json +{ + "findings": [ + { + "file": "docs/source/spec/feature/jama.rst", + "line": 66, + "renderer": "mermaid", + "block_index": 2, + "parser_exit_code": 1, + "parser_stderr": "Error: Parse error on line 3: ... Expecting 'SOLID_ARROW'... got 'NEWLINE'", + "severity": "error" + } + ], + "summary": { + "blocks_scanned": 5, + "blocks_failed": 1, + "renderers_covered": ["mermaid", "plantuml"] + }, + "status": "fail" +} +``` + +`line` refers to the starting line of the `.. mermaid::` / `.. uml::` directive inside the RST file (where a human would look to fix it). `block_index` is the zero-indexed position of the block within the file (0 = first diagram in the file, 1 = second, etc.) to disambiguate when multiple blocks live in the same file. + +## Process + +### Step 1: Enumerate RST files under `docs_dir` + +Use the Glob tool to list `${docs_dir}/**/*.rst`. If empty, emit warning `"no RST files under docs_dir"` and return `{findings: [], summary: {blocks_scanned: 0, blocks_failed: 0, renderers_covered: []}, status: "pass"}`. + +### Step 2: Extract diagram blocks + +For each file, scan for directive openings. Recognise: + +- `.. mermaid::` — start of a Mermaid block. Body = subsequent lines indented by ≥ 3 spaces. +- `.. uml::` — start of a PlantUML block. +- `.. plantuml::` — alias for `.. uml::` (some projects use this spelling). + +A block ends at the first subsequent line that is either (a) non-blank and indented by < 3 spaces, or (b) end of file. Directive options (e.g. `:caption:`) between the opening line and the body are skipped (not part of the renderer input). + +Record for each block: `{file, start_line, renderer, block_index, body}` where `body` is the concatenation of body lines with the leading indent stripped. + +### Step 3: Check CLI availability + +For each renderer whose blocks were found (AND requested by `renderers` input): + +- Run `<cli> --version` via Bash. Capture exit code. 
+- If non-zero, emit a degraded-status warning naming the renderer + install command. Skip parsing for this renderer (findings empty for it). + +### Step 4: Parse each block + +For each block whose renderer CLI is available: + +1. Write `body` to a temp file (`/tmp/pharaoh-diagram-lint-${pid}-${idx}.mmd` or `.puml`). +2. Invoke the parser: + - **Mermaid**: `<mermaid_cli> -i <tmp_in> -o <tmp_out.svg>`. mmdc 11.x requires an output path with a recognised extension (`.svg` / `.png` / `.pdf` / `.md` / `.markdown`); a sentinel like `/dev/null` is rejected with `Output file must end with ...`. Delete `<tmp_out.svg>` afterwards. + - **PlantUML**: `<plantuml_cli> -checkonly <tmp>`. `-checkonly` parses without rendering. +3. Determine parse failure: + - **Mermaid** — mmdc 11.x returns exit code 0 even when the mermaid parse fails inside puppeteer. Treat stderr as authoritative: if stderr contains any of `"Error:"`, `"Parse error"`, `"Expecting "`, or `"UnknownDiagramError"`, the block failed. Callers synthesise a non-zero `parser_exit_code` in the finding for consistency across renderers. + - **PlantUML** — `plantuml -checkonly` exits 200 on parse failure. Exit code alone is reliable. +4. On failure, emit a finding with the captured stderr (trimmed to the first 200 chars, after stripping mmdc's success noise like `Generating single mermaid chart`). + +Each finding is: + +```json +{ + "file": "<relative path from docs_dir>", + "line": <start_line of the directive>, + "renderer": "<mermaid|plantuml>", + "block_index": <int>, + "parser_exit_code": <int>, + "parser_stderr": "<first 20 lines, stripped of CLI framing>", + "severity": "error" +} +``` + +### Step 5: Aggregate and return + +Sort findings by `(file, line, block_index)` for stable output. + +Compute `summary`: +- `blocks_scanned`: total blocks extracted (across all renderers). +- `blocks_failed`: length of findings list. +- `renderers_covered`: renderers whose CLI was available and actually parsed at least one block. 
+ +Compute `status`: +- `degraded` if any requested renderer was unavailable AND blocks of that renderer exist. +- `fail` if any finding has `severity: error` AND `strictness == "fail_on_any"`, OR if `strictness == "report_only"` AND findings list is non-empty. +- `pass` otherwise. + +### Step 6: Optional Papyrus persistence + +If `papyrus_workspace` is provided, for each finding invoke `pharaoh-finding-record` with: + +- `category`: `"diagram_parse_failure"` +- `subject_id`: `<file>:L<line>:B<block_index>` (deterministic id: re-running on the same broken diagram returns `"duplicate"`, so findings don't accumulate across runs) +- `body`: the stderr excerpt +- `reporter_id`: the input `reporter_id` +- `tags`: `["renderer:<name>", "origin:diagram-lint"]` + +Skip this step if `papyrus_workspace` is absent — in-memory return is sufficient for plans that do not use shared memory. + +## Failure modes + +| Condition | Response | +| ------------------------------------------------- | ------------------------------------------------------------ | +| `docs_dir` missing | FAIL: `"docs_dir <path> does not exist"`. | +| No RST files under `docs_dir` | Return empty findings with warning; `status: pass`. | +| Mermaid CLI unresolved, mermaid blocks present | `status: degraded`; warning with install command; findings empty for mermaid. | +| PlantUML CLI unresolved, plantuml blocks present | `status: degraded`; warning with install command; findings empty for plantuml. | +| CLI reports parse failure (exit code OR stderr markers) | Emit finding. Continue with next block. mmdc uses stderr markers; plantuml uses exit code 200. | +| CLI hangs (> 30s) | Kill child process; emit finding with `parser_stderr: "timeout after 30s"`. | +| Temp-file write fails | Abort with FAIL naming the temp path. | + +## Non-goals + +- **No auto-fix.** The skill reports; it does not patch RST files. 
Fixing belongs to whatever emitted the broken diagram (usually a `pharaoh-*-diagram-draft` or `pharaoh-feat-*-extract` skill), or to a human. +- **No render output.** `mmdc` can render PNG/SVG; we discard that. Rendering is sphinx-build's job at HTML build time. +- **No semantic linting.** This skill checks syntactic validity per the renderer parser. Style complaints ("this diagram has too many participants") belong in a future `pharaoh-diagram-review` skill. +- **No other renderers.** Graphviz (`.. graphviz::`), KaTeX (`:math:`), and others are out of scope. Extend the skill (new renderer entries in Step 2's recognition table) when their silent-failure mode becomes a concrete problem. + +## Advisory chain + +After emitting findings: + +- If `status == "fail"` and strictness is `"fail_on_any"`: downstream `pharaoh-quality-gate` should flip to red. Callers should not ship the documentation build. +- If `status == "degraded"`: the missing CLI is a local-environment gap. Install it and re-run before considering the lint report authoritative. +- If `status == "pass"`: does NOT guarantee every diagram is *good*. Semantic correctness (right messages in the right order, right participant set) is unverified — this skill catches syntactic defects only. + +## Composition + +- The `reverse-engineer-project.yaml.j2` template adds `pharaoh-diagram-lint` as a dependency of `pharaoh-quality-gate` (and the gate's input includes the findings). +- `pharaoh-quality-gate` SHOULD expose a `diagram_lint` section in its aggregated report summarising the findings count per renderer and the first 5 errors verbatim. +- A future `pharaoh-render-check` generalisation covering Graphviz, KaTeX, etc. can subsume this skill; until then this is the only parser-in-the-loop validator for Mermaid and PlantUML. 
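
The status rules from Step 5 can be condensed into a few lines. This sketch assumes the three rules are evaluated in the order listed there (so `degraded` takes precedence over `fail`), which the specification implies but does not state outright; `compute_status` and its parameter names are illustrative.

```python
def compute_status(findings: list[dict], strictness: str,
                   unavailable_renderers: set[str], renderers_with_blocks: set[str]) -> str:
    """Derive "pass" | "fail" | "degraded" from findings and CLI availability."""
    # A requested renderer whose CLI is missing only matters if it has blocks to parse.
    if unavailable_renderers & renderers_with_blocks:
        return "degraded"
    if strictness == "fail_on_any" and any(f["severity"] == "error" for f in findings):
        return "fail"
    if strictness == "report_only" and findings:
        return "fail"
    return "pass"
```

Note that under `fail_on_any` a findings list containing only warnings still yields `pass`, whereas `report_only` reports `fail` for any non-empty list.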
diff --git a/.github/agents/pharaoh.diagram-review.agent.md b/.github/agents/pharaoh.diagram-review.agent.md index deb7695..56bb006 100644 --- a/.github/agents/pharaoh.diagram-review.agent.md +++ b/.github/agents/pharaoh.diagram-review.agent.md @@ -7,4 +7,78 @@ handoffs: [] Use when auditing a single diagram block (Mermaid or PlantUML) emitted by any diagram-emitting skill. Single review atom covering all diagram types — trace/caption/element-count/parser/required-elements checks plus LLM-judge axes for purpose clarity and granularity consistency. Per-type required-element checks dispatched based on `diagram_type` input. -See [`skills/pharaoh-diagram-review/SKILL.md`](../../skills/pharaoh-diagram-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-diagram-review + +## When to use + +Invoke after any diagram-emitting skill produced a single diagram block. Part of the self-review invariant — every `*-diagram-draft` and `*-extract` skill chains into this review. + +One diagram per invocation. A plan emitting N diagrams invokes this skill N times. + +## Atomicity + +- (a) One diagram block + one checklist + one diagram_type in → one findings JSON out. No multi-diagram aggregation, no re-emission. +- (b) Input: `{diagram_block: <rst_directive_string>, diagram_type: <one of 11 canonical types>, parent_need_id: <str>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON with per-axis entries, mirroring `pharaoh-req-review` shape. +- (c) Reward: fixtures for each diagram_type — `passing-<type>.rst` + `failing-<type>.rst` with expected findings. Mechanized axes verified by grep / mmdc / plantuml; subjective axes spot-checked against golden JSON. +- (d) Reusable for every diagram-emitting skill regardless of renderer (mermaid / plantuml). +- (e) Read-only. Does not re-emit or modify the diagram. 
+ +## Input + +- `diagram_block`: the full RST directive (`.. mermaid::` or `.. uml::`) including options and body, as a single string. Must be the complete directive, not just the body. +- `diagram_type`: one of `use_case | sequence | component | class | state | activity | block | deployment | fault_tree | feat_component_extract | feat_flow_extract`. Determines which per-type required-elements check runs. +- `parent_need_id`: need_id of the artefact the diagram is attached to (feat, arch, comp_req). Used for `trace_to_parent` check. +- `checklist_path`: `shared/checklists/diagram.md`. Per-project additions loaded from `.pharaoh/project/checklists/diagram.md` if present. +- `tailoring_path`: `.pharaoh/project/` for renderer preference and element-count threshold. + +## Output + +```json +{ + "parent_need_id": "FEAT_jama_import", + "diagram_type": "sequence", + "renderer": "mermaid", + "axes": { + "trace_to_parent": {"passed": true, "reason": "caption names FEAT_jama_import"}, + "caption_present": {"passed": true}, + "element_count_within_bounds": {"passed": true, "reason": "7 participants, limit 12"}, + "parser_clean": {"passed": true, "reason": "mmdc exit 0"}, + "required_elements_for_type": {"passed": true, "reason": "≥2 participants, ≥1 message"}, + "conditional_branches_marked": {"passed": true, "reason": "source has 2 branches; diagram uses 1 alt block"}, + "external_library_participant": {"passed": true, "reason": "requests imported and called; participant Requests present"}, + "returns_match_call_stack": {"passed": true, "reason": "4 returns, all terminate at prior caller or entrypoint"}, + "purpose_clarity": {"score": 3}, + "granularity_consistency": {"score": 3}, + "naming_clarity": {"score": 3} + }, + "overall": "pass" +} +``` + +Axes `conditional_branches_marked`, `external_library_participant`, and `returns_match_call_stack` apply only to the diagram types noted in 
[`shared/checklists/diagram.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/diagram.md). When a diagram's `diagram_type` falls outside the applicable set (e.g. `class`, `state`, `deployment`), the corresponding axis entry is `{"passed": "n/a", "reason": "axis applies only to sequence diagrams"}` and does NOT contribute to `overall`. + +## Review axes + +See [`shared/checklists/diagram.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/diagram.md) for the canonical axes. Per-type required-elements: + +| diagram_type | Required elements | +| ----------------- | -------------------------------------------------------------------- | +| use_case | ≥1 actor, 1 system boundary (`rectangle`/`package`), ≥1 use case | +| sequence | ≥2 participants, ≥1 message | +| component | ≥1 component node, ≥1 interface or arrow | +| class | ≥1 class node with at least one field OR one method | +| state | 1 initial pseudo-state, 1 final pseudo-state, ≥1 transition | +| activity | 1 start node, 1 end node, ≥1 action | +| block (BDD / IBD) | BDD: ≥1 block with `<<block>>` stereotype. IBD: ≥1 port, ≥1 connector | +| deployment | ≥1 node (physical), ≥1 artefact deployed | +| fault_tree | 1 top event, ≥1 gate (AND/OR), ≥1 basic event | +| feat_component_extract | ≥1 file node, ≥1 import arrow | +| feat_flow_extract | ≥1 participant, ≥1 call arrow | + +## Composition + +Invoked explicitly as a task in plans emitted by `pharaoh-write-plan`, directly after every diagram-emitting task. Coverage enforced by `pharaoh-self-review-coverage-check`. 
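
The rule that `"n/a"` axes do not contribute to `overall` can be sketched as below. The function name and the `"fail"` token are hypothetical, and the sketch assumes that score-only axes (`purpose_clarity` and similar) are left out of the verdict; the specification does not say how scores feed into `overall`.

```python
def overall_verdict(axes):
    """Return "pass" unless an applicable pass/fail axis did not pass."""
    for entry in axes.values():
        passed = entry.get("passed")
        if passed is None:
            continue  # score-only axis: ignored here (assumption)
        if passed == "n/a":
            continue  # axis does not apply to this diagram_type
        if passed is not True:
            return "fail"  # hypothetical token; only "pass" appears in the spec
    return "pass"
```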
diff --git a/.github/agents/pharaoh.dispatch-signal-check.agent.md b/.github/agents/pharaoh.dispatch-signal-check.agent.md index 60f80e4..e59cd04 100644 --- a/.github/agents/pharaoh.dispatch-signal-check.agent.md +++ b/.github/agents/pharaoh.dispatch-signal-check.agent.md @@ -7,4 +7,68 @@ handoffs: [] Use when verifying that a plan's declared `execution_mode` matches observed subagent artefacts in `runs/`. Detects the "LLM-executor collapsed subagents into inline" failure class observed during dogfooding. One mechanical structural check. -See [`skills/pharaoh-dispatch-signal-check/SKILL.md`](../../skills/pharaoh-dispatch-signal-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-dispatch-signal-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan with `execution_mode: subagents` in any task. Compares declared mode against presence of per-task artefacts in `runs/`. Returns pass/fail + a list of tasks whose declared mode was not honoured. + +Do NOT use to enforce dispatch at runtime — that is `pharaoh-execute-plan`. This skill observes after the fact. + +## Atomicity + +- (a) Indivisible: one plan.yaml + one runs directory in → pass/fail + mismatch list out. No retry, no dispatch, no re-execution. +- (b) Input: `{plan_path: str, runs_path: str}`. Output: JSON `{passed: bool, mismatches: list[{task_id, declared, observed, evidence}]}`. +- (c) Reward: fixtures in `pharaoh-validation/fixtures/pharaoh-dispatch-signal-check/`: + 1. `match/`: plan declares `subagents` for two tasks, `runs/task_1/return.json` and `runs/task_2/return.json` both exist → matches `expected-match-pass.json` (`passed: true, mismatches: []`). + 2.
`parallel-declared-inline-observed/`: plan declares `subagents`, runs only has `runs/aggregated.json` (no per-task files) → matches `expected-collapsed-fail.json` (`passed: false`, `mismatches` names the task, `observed: "inline"`). + 3. `inline-declared-parallel-observed/`: plan declares `inline`, runs has per-task files anyway → passed: true (over-dispatch is not a failure, only under-dispatch is). + 4. Idempotent. + + Pass = all 4. +- (d) Reusable by any plan-executing flow. +- (e) Read-only. + +## Input + +- `plan_path`: absolute path to `plan.yaml`. Accepts the full schema enum `execution_mode ∈ {inline, subagents, family-bundle, ask}` declared in `pharaoh-execute-plan/schema.md`. Default `ask` if omitted (per schema). This skill only enforces its detection rule on tasks with `execution_mode == subagents`; `inline`, `family-bundle`, and `ask` modes are skipped (no check). +- `runs_path`: absolute path to the plan's runs directory. Convention: `<project_root>/.pharaoh/runs/<run_id>/`. + +## Output + +```json +{ + "passed": false, + "mismatches": [ + { + "task_id": "reqs_from_code", + "declared": "subagents", + "observed": "inline", + "evidence": "expected per-item return.json under runs/reqs_from_code/; found aggregated.json at runs/ root instead" + } + ] +} +``` + +## Detection rule + +For each task `T` in the plan with `execution_mode: subagents` and a non-empty `foreach`: + +- **Expected shape:** at least `len(foreach)` return artefacts under `<runs_path>/<T.id>/`. Canonical form: per-item subdirectories `task_1/`, `task_2/`, ..., each with a `return.json`. +- **Collapse patterns that fail this check:** + 1. Only `<runs_path>/<T.id>/return.json` exists (single aggregated file under a task-named subdir, no per-item split). + 2. `<runs_path>/aggregated.json` exists at the runs root with no `<T.id>/` subdirectory (flat collapse, as seen in an earlier dogfooding iteration). + 3. Any other pattern with fewer artefacts than `len(foreach)` items. 
+- The `evidence` field in a mismatch entry names the concrete pattern observed ("single return.json under task subdir", "aggregated.json at the runs root", "N artefacts found, expected M"). + +For each task `T` with `execution_mode ∈ {inline, family-bundle, ask}`: + +- No check. These modes have different (or user-resolved) dispatch semantics that this skill does not model; under-dispatch detection here would produce false positives. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `dispatch_signal_matches_plan: true`. Never called directly. diff --git a/.github/agents/pharaoh.execute-plan.agent.md b/.github/agents/pharaoh.execute-plan.agent.md index 2324022..fd46715 100644 --- a/.github/agents/pharaoh.execute-plan.agent.md +++ b/.github/agents/pharaoh.execute-plan.agent.md @@ -7,4 +7,285 @@ handoffs: [] Use when executing a plan.yaml produced by pharaoh-write-plan. Reads the plan, runs each task (inline or via subagent dispatch), threads outputs between tasks per the ref grammar, validates outputs via pharaoh-output-validate, persists artefacts and report.yaml. Generic — the plan is the orchestrator, this skill is the engine. -See [`skills/pharaoh-execute-plan/SKILL.md`](../../skills/pharaoh-execute-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-execute-plan + +## Invariant: every `completed` task has output on disk + +A task marked `status: completed` MUST have its declared output present on disk — an artefact file at the path from Step 4.6 (emission tasks) or a `return.json` at `<workspace_dir>/runs/<task_id>[/<foreach_index>]/return.json` (check / review / gate tasks). Tasks that "completed inlined without output" do not exist. If a task cannot produce its output, mark it `failed` or `skipped` with a reason — never `completed`. 
Step 4.10 (`output_presence_audit`) enforces this before `report.yaml` is written; missing output rewrites the task status to `reporting_error` and the plan status to `failed` with reason `missing_task_output`. Skipping or collapsing per-foreach-instance `return.json` files into one summary defeats `pharaoh-self-review-coverage-check`, which reads the files directly. + +## When to use + +Invoke when you already have a plan.yaml and want to execute it. The plan carries everything the executor needs: task graph, skill references, input refs, validation rules, execution-mode defaults. This skill never authors plans (that is `pharaoh-write-plan`) and never reviews results (that is `pharaoh-quality-gate` or a human). + +Also: this skill replaces the prose-orchestration of old composition skills like `pharaoh-feats-from-project` and `pharaoh-reqs-from-module`. Those made the LLM execute a 12-step process by reading prose; this skill executes a DAG declared as data. If you find yourself reading a multi-step prose-orchestration skill, stop and look for a plan.yaml instead. + +## Atomicity + +- (a) **Indivisible.** One plan.yaml in, one report.yaml plus an artefacts directory out. No plan authoring. No review. No domain-specific behaviour. Adding a feature to the executor means extending the schema, not this skill. +- (b) **Typed I/O.** + - Input: `{plan_path: str, project_root: str, workspace_dir?: str, execution_mode_override?: "inline"|"subagents"}`. + - Output: `{status: "completed"|"aborted"|"partial", report_path: str, artefacts_dir: str, failed_task_ids: list[str]}`. +- (c) **Execution-based reward.** Fixture in `pharaoh-validation/fixtures/execute-plan-smoke/` contains a 3-task plan using mock-emit skills (each returns a deterministic string given its input). After the executor runs: + 1. `report.yaml` exists and parses. + 2. All three tasks appear under `tasks:` with `status: completed`. + 3. 
Artefact files exist under the workspace at the paths declared in the report. + 4. Ref resolution worked: task 2 and task 3 received task 1's output verbatim (captured by the mock). +- (d) **Reusable.** Any plan that conforms to `schema.md`. Forward-engineering, reverse-engineering, migration — all look alike to the executor. +- (e) **Composable.** Called by higher-level skills or directly by the user. The only skill it calls internally is `pharaoh-output-validate`; per-task skill dispatch is parameterised by `skill:` in the plan. + +## Input + +- `plan_path`: absolute path to plan.yaml. +- `project_root`: absolute path. Must match the plan's `project_root` (else fail with `project_root_mismatch`). +- `workspace_dir` (optional): absolute path. If omitted, resolved from plan's `workspace_dir` or default `<project_root>/.pharaoh/runs/<plan.name>-<timestamp>/`. +- `execution_mode_override` (optional): overrides `defaults.execution_mode` in the plan. Individual tasks with their own `execution_mode:` still win over this override. + +## Output + +A mapping: + +```yaml +status: completed | aborted | partial +report_path: <abs_path_to_report.yaml> +artefacts_dir: <abs_path_to_artefacts_dir> +failed_task_ids: [<id>, ...] +``` + +`status`: +- `completed` — all tasks reached `completed` status. +- `partial` — some tasks `failed` or `skipped` under `on_fail: skip_dependents`; others ran to completion. +- `aborted` — an `on_fail: abort_plan` rule fired, or the plan itself was rejected at static validation. + +## Process + +### Step 1: Load and validate plan + +1. Read `plan_path`. Parse as YAML. On parse error → return `{status: "aborted", ...}` with the parse error recorded in `report.yaml`. +2. Validate against `schema.md`: + - Required top-level fields present. + - No unknown top-level fields. + - `version == 1`. + - Every task has required fields; no unknowns. + - Every `skill:` references a directory present under `<pharaoh>/skills/` or `<papyrus>/skills/`. +3. 
Confirm `project_root` input matches plan's declared `project_root`. Mismatch → abort. +4. Resolve `workspace_dir`. Create directory if missing. + +### Step 2: Static ref analysis + +Before any task runs, walk every task's `inputs`, `depends_on`, `foreach`: + +1. Parse each `${...}` ref. Syntax errors → abort plan; record which task and which field. +2. For each ref, resolve the producing task id. Unknown producer → abort. +3. For each ref using a helper, confirm the helper exists in the helper set declared in `schema.md`. Unknown helper → abort. +4. Build the dependency graph (explicit `depends_on` ∪ implicit deps from refs). +5. Detect cycles via DFS. Any cycle → abort; list the cycle in the report. +6. Validate `parallel_group` invariants: every group's members share the same `depends_on` set, no intra-group deps. +7. Warn (do not abort) on declared `outputs:` refs to fields not enumerated in the producer's `outputs:` map. Documentary only. + +Abort here means `status: "aborted"`, zero tasks executed, report written with the specific error. + +### Step 3: Topological order + +Produce a partial order: list-of-lists where each inner list is a "wave" — tasks with all upstream deps satisfied. Within a wave, tasks sharing a `parallel_group` are candidates for concurrent dispatch in subagents mode. + +Foreach-expanded tasks are expanded into concrete instances at this step: if `foreach: ${upstream}` produces N items, emit N logical tasks with ids `<task.id>[0]`, …, `<task.id>[N-1]`. Instance inputs are resolved per-iteration with `${item}` bound. + +### Step 3.5: Resolve execution mode (interactive for ambiguous foreach) + +Before Step 4 dispatches anything, walk every foreach-expanded task and determine its effective execution mode. Priority (first match wins): + +1. Executor was invoked with `execution_mode_override` → use the override. Skip prompting. +2. The task has an explicit `execution_mode:` field → use that value. +3. 
`plan.defaults.execution_mode` is a concrete mode (`inline`, `subagents`, `family-bundle`) → use the plan default. +4. Plan default is `ask` OR plan default is absent → GATE. Emit the prompt below, collect the user's answer, apply it to every expanded instance of this task. + +The gate fires at most once per foreach-originating task, not once per instance. Non-foreach tasks default to `inline` (no prompt); the gate exists specifically to prevent silent scope collapse on large fan-outs. + +**Prompt shape.** For each ambiguous foreach task (N instances), emit to the controller: + +``` +Task `<task_id>` has foreach over `<upstream_ref>` and expanded to <N> instances. +How should the executor dispatch them? + + [inline] Run instances sequentially in this conversation. + Cheapest. No cross-instance atomicity — the controlling + agent sees every instance's inputs and outputs. Good for + N ≤ 3 and deterministic skills. + + [subagents] Dispatch one subagent per instance. Full atomicity — each + subagent sees only its resolved inputs. Respects per- + instance caps (e.g. "5-7 comp_reqs per feat"). Expensive + at N > 20. + + [family-bundle] Group instances by a bundle key and dispatch one subagent + per bundle. Middle ground. Per-instance caps are NOT + enforced across the bundle — prior dogfooding confirmed + sibling instances leak into each other when one subagent + sees multiple foreach scopes at once. + +Choose one (inline | subagents | family-bundle): +``` + +If the user picks `family-bundle`, follow up with: + +``` +bundle_key (ref, e.g. `${item.feat_id}` or `${heuristics.<helper>(item.file)}`): +``` + +The user's answer is a valid ref per the schema's ref grammar. Validate it syntactically; on malformed ref, re-prompt once; on second failure, fall back to `subagents` mode and warn. 
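
The priority list at the start of Step 3.5 can be read as a small pure function. The sketch below is illustrative: the function name and the `None` return for "gate must prompt" are assumptions. It follows the order of the Step 3.5 list as written; the Input section describes task-level modes as winning over the override, so the order of the first two checks should be treated as an open point.

```python
CONCRETE_MODES = {"inline", "subagents", "family-bundle"}


def resolve_execution_mode(override, task_mode, plan_default):
    """Return (resolved_mode, source); resolved_mode is None when the gate must prompt."""
    if override is not None:
        return override, "override"          # priority 1: executor-level override
    if task_mode is not None:
        return task_mode, "task_level"       # priority 2: explicit per-task field
    if plan_default in CONCRETE_MODES:
        return plan_default, "plan_default"  # priority 3: concrete plan default
    # Priority 4: plan default is "ask" or absent, so the caller has to prompt
    # (or abort when it cannot prompt, as in a non-interactive run).
    return None, "user_prompt"
```

Returning the `source` alongside the mode keeps the decision auditable without a second lookup when the result is written to the report.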
+ +**Recording.** Every gate decision lands in `report.yaml` under the task's entry: + +```yaml +tasks: + <task_id>: + execution_mode_decision: + resolved_mode: inline | subagents | family-bundle + source: override | task_level | plan_default | user_prompt + bundle_key: <ref> # only when resolved_mode=family-bundle + prompted_at: <iso8601> # only when source=user_prompt +``` + +This makes the decision auditable — if the pilot review says "executor silently bundled", the report either proves or disproves it. + +**Non-interactive callers.** When the executor cannot accept a response (e.g. running under a CI harness), treat `ask` as an error: abort the plan with `status: aborted` and note `execution_mode_gate_cannot_prompt`. Callers that want unattended execution must set `defaults.execution_mode` to a concrete mode or pass `execution_mode_override`. + +### Step 4: Per-task execution loop + +For each wave in order: + +4.1. **Dispatch plan.** Per-task resolved execution mode (from Step 3.5) drives dispatch shape: + + - **inline**: the controlling agent executes each instance sequentially in-context. `parallel_group` is informational only. + - **subagents**: dispatch one subagent per task (or per foreach instance). Group members in the same `parallel_group` dispatch in one turn via parallel Task tool calls. + - **family-bundle**: evaluate the task's `bundle_key` for every foreach instance. Partition instances by resolved key. Dispatch one subagent per bundle; each subagent receives the family-bundle variant of `implementer-prompt.md` and runs the skill once per item in its bundle. Bundles sharing a `parallel_group` dispatch concurrently. + + Tasks without foreach with `family-bundle` configured are a schema error (caught at Step 2); at this point every family-bundle task is foreach-expanded. + +4.2. **Per task, resolve runtime refs.** Look up each input ref in the in-memory artefact store. 
If any ref is unresolvable (upstream failed/skipped), mark this task `blocked`, apply its `on_fail` policy, continue. + +4.3. **Render implementer prompt.** Use `implementer-prompt.md` as the template. Fill variables: + - `{skill_name}` — from task's `skill:` + - `{skill_body}` — full contents of `<skills>/<skill_name>/SKILL.md`, minus frontmatter + - `{task_id}` — e.g. `map_files[3]` for foreach instance 3 + - `{task_inputs_yaml}` — the resolved input map as YAML + - `{expected_output_schema}` — task's `expected_output_schema` or "unspecified" + - `{project_root}`, `{workspace}` — absolute paths. + +4.4. **Dispatch.** + - `inline` mode: the controlling agent (the one running this skill) reads the rendered prompt and performs the atomic skill's process directly in-context. Record the output when done. + - `subagents` mode: invoke the Task tool with the rendered prompt as the subagent's whole brief. Capture the subagent's return message. + - `family-bundle` mode: render the family-bundle variant of `implementer-prompt.md` (one subagent covers all bundle items). Dispatch via Task tool. Capture the subagent's multi-output return (one artefact per bundle item, in the order the subagent was handed them). Validate each artefact independently per Step 4.5. + +4.5. **Validate output.** Run `pharaoh-output-validate` with: + - `output_text` = dispatched task's return value + - `target_schema` = `expected_output_schema` if set, else any `validation` rule targeting this task, else skip validation. + - `schema_context` = `{directive: ..., required_options: [...]}` when the schema is `rst_directive`; empty for other schemas. + - `strip_fences: true` + +4.6. **Handle validation result.** + - `valid: true` → persist the parsed/stripped artefact to `<workspace>/artefacts/<task_id>.<ext>` where `.ext` is `.rst` for directives, `.yaml` for yaml, `.txt` default. Mark task `completed`. Update in-memory artefact store with the task's output. 
+ - `valid: false` and retries remaining → increment retry counter, rebuild prompt with stricter preamble (see below), re-dispatch. + - `valid: false` and retries exhausted → apply the validation rule's `on_fail` policy. + +4.7. **Retry preamble.** When re-dispatching after a validation failure, prepend to the prompt: + +``` +STRICT OUTPUT REQUIRED. Your previous attempt failed validation with: +<errors joined with ';'> +Emit ONLY the artefact content expected by the target schema. No prose wrapper. No markdown fences. No typos in option keys. +``` + +4.8. **On_fail policies.** + - `retry` — already consumed; after exhaustion treat as `skip_dependents`. + - `skip_dependents` — mark this task `failed`. Mark every transitive dependent `skipped`. Continue with independent branches of the DAG. Final plan status becomes `partial`. + - `abort_plan` — mark this task `failed`. Stop dispatching. Emit report. Status `aborted`. + +4.9. **Parallel dispatch.** In subagents mode within a parallel_group, dispatch all tasks in the group in one message (multiple Task tool calls in one turn). Wait for all to return before moving to the next wave. Per-task validation and retry still happen; retry re-dispatches only the failing task, not the whole group. + +4.10. **`output_presence_audit` — run before Step 5.** For every task whose in-memory status is `completed`, verify that its declared output exists on disk AND is non-empty: + + - **Emission tasks** (skill emits RST directives, YAML, diagrams, etc.): the task's artefact file persisted in Step 4.6 (`<workspace>/artefacts/<task_id>.<ext>`, or `<...>/artefacts/<task_id>/<foreach_index>.<ext>` for foreach). File must exist and `size > 0`. 
+ - **Check / review / gate tasks** (skill emits JSON findings — `pharaoh-req-review`, `pharaoh-req-code-grounding-check`, `pharaoh-diagram-review`, `pharaoh-feat-review`, `pharaoh-diagram-lint`, `pharaoh-quality-gate`, `pharaoh-output-validate`, `pharaoh-self-review-coverage-check`, any future atom-check role): a `return.json` under `<workspace>/runs/<task_id>[/<foreach_index>]/return.json`. File must exist, parse as JSON, and be a non-empty object. For foreach tasks, verify one `return.json` per foreach instance (count ≥ instance count from the expansion in Step 3). + - **Composition / plumbing tasks** (skill emits only in-memory data used by downstream refs — `pharaoh-id-allocate`, `pharaoh-feat-file-map`, `pharaoh-context-gather`): require a `return.json` at `<workspace>/runs/<task_id>/return.json` capturing the in-memory output that downstream refs resolved. + + For each task that fails this audit: + 1. Rewrite its report status from `completed` to `reporting_error`. + 2. Append a `reporting_errors` entry naming the missing path. + 3. Mark the plan's overall status as `failed` with reason `missing_task_output` (overrides a previously-clean status; does NOT override `aborted`). + + The audit is mandatory. Skipping or collapsing it is the exact failure mode called out in the invariant at the top of this skill. + +### Step 5: Emit report + +After the loop terminates (completion, partial, or abort): + +1. Write `report.yaml` to `<workspace_dir>/report.yaml` per the schema in `schema.md#report-yaml`. +2. Include every task (completed, failed, skipped, blocked, `reporting_error`). +3. Include foreach instances under `foreach_instances:` for tasks that had foreach. +4. Include top-level `reporting_errors:` list if Step 4.10's audit caught anything; each entry is `{task_id, foreach_index?, expected_path, reason: "missing" | "empty" | "unparseable"}`. +5. Return `{status, report_path, artefacts_dir, failed_task_ids, reporting_errors}`.
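
The per-file checks behind the Step 4.10 audit, and the `reason` values recorded in `reporting_errors`, could look roughly like this. Both function names are hypothetical, and mapping a parsed-but-non-object JSON value (for example a list) to `"empty"` is an assumption; the specification only names the three reasons.

```python
import json
from pathlib import Path


def audit_artefact(path):
    """Emission task: the artefact must exist and have size > 0. Returns a reason or None."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    return None


def audit_return_json(path):
    """Check/review/gate task: return.json must exist, parse, and be a non-empty object."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return "unparseable"
    # Anything other than a non-empty JSON object is reported as "empty" (assumption).
    if not isinstance(data, dict) or not data:
        return "empty"
    return None
```

A `None` result means the output passed the audit; any other value is the `reason` for a `reporting_errors` entry.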
+ +## Failure modes + +| Condition | Response | +| ------------------------------------------ | -------------------------------------------------------------- | +| Plan YAML invalid | status=aborted; report notes `plan_invalid: <parse_error>`. | +| Schema violation | status=aborted; report notes which rule failed. | +| project_root mismatch | status=aborted. | +| Unknown skill | status=aborted at static validation. | +| Cyclic dep | status=aborted; cycle printed. | +| Unresolvable ref at runtime | task=blocked; on_fail policy applies. | +| pharaoh-output-validate errors internally | Log, treat as validation failure (conservative). | +| Task dispatch returns empty | Treat as validation failure with error `empty_output`. | +| Subagent Task tool fails | Retry once; on second failure mark task failed. | +| `completed` task has no output on disk | Step 4.10 rewrites to `reporting_error`; plan status=`failed`. | + +## Worked example + +Plan (excerpt): + +```yaml +name: smoke +version: 1 +project_root: /tmp/fixture +tasks: + - id: feats + skill: pharaoh-feat-draft-from-docs + inputs: + docs_root: docs + outputs: + feats: list + expected_output_schema: rst_directive + - id: map + skill: pharaoh-feat-file-map + foreach: ${feats.feats} + inputs: + feat_id: ${item.id} + feat_title: ${item.title} + feat_body: ${item.body} + src_root: src + depends_on: [feats] + parallel_group: map_files +``` + +Execution trace (2 feats discovered): + +1. Wave 1: `feats` runs inline. Returns 2 directive blocks. Validated as rst_directive. Parsed `feats:` list cached to store. +2. Wave 2: `map` expands to `map[0]`, `map[1]`. Both share parallel_group `map_files`. In subagents mode: dispatched together in one turn. Each returns YAML; validated against yaml schema; persisted. +3. Report lists `feats: completed` and `map: completed` with `foreach_instances: [index:0 completed, index:1 completed]`. + +## Non-goals + +- Does not author plans. 
+- Does not choose `execution_mode` based on heuristics — that is the plan's business (via `defaults` or per-task override). +- Does not perform cross-plan dedup or impact analysis — those are separate skills. +- Does not log progress to stdout beyond the final return value; structured progress lives in report.yaml. + +## Relationship to deleted composition skills + +`pharaoh-feats-from-project` and `pharaoh-reqs-from-module` previously encoded orchestration in prose. They have been deleted in favour of this skill + `pharaoh-write-plan`. The domain heuristics those skills carried (split_strategy selection, preseed-before-reqs ordering, quality-gate wiring, id-allocate positioning) moved to `pharaoh-write-plan`'s plan-authoring logic. The executor itself is domain-free. diff --git a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md index 044a7ad..81c2967 100644 --- a/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md +++ b/.github/agents/pharaoh.fault-tree-diagram-draft.agent.md @@ -7,4 +7,99 @@ handoffs: [] Use when drafting one fault tree for FTA (Fault Tree Analysis) — a top hazard event decomposed through AND/OR gates into basic events (component failures, random hardware faults, human errors). Typical ISO 26262 usage — Part 3 Hazard Analysis & Risk Assessment, and Part 5 supporting hardware architectural metrics. Renderer tailored via `pharaoh.toml`. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-fault-tree-diagram-draft/SKILL.md`](../../skills/pharaoh-fault-tree-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-fault-tree-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-fault-tree-diagram-draft is planned but not implemented; see SKILL.md"`. 
+ +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.fault_tree]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one fault tree. Captures top-down deductive decomposition of one **top hazard event** via logical gates (AND, OR, NOT, inhibit, priority-AND, exclusive-OR) into **intermediate events** and **basic events** with optional failure probabilities. + +Typical ISO 26262 context: +- **Part 3 HARA**: qualitative fault trees identifying top-level hazards. +- **Part 5 §9 Hardware architectural metrics (SPFM, LFM, PMHF)**: quantitative fault trees — each basic event has a failure rate λ; the tree propagates to the top event. +- **Safety case argumentation**: showing how a safety goal violation would have to occur, and what barriers prevent it. + +## Atomicity + +- (a) One top event in → one tree out. 
+- (b) Input: `{view_title: str, top_event: EventSpec, gates: list[GateSpec], basic_events: list[BasicEventSpec], edges: list[TreeEdgeSpec], project_root: str, show_probabilities?: bool, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `EventSpec = {id: str, label: str, probability?: float}`, `GateSpec = {id: str, kind: "AND"|"OR"|"NOT"|"INHIBIT"|"PAND"|"XOR", label?: str}`, `BasicEventSpec = {id: str, label: str, probability?: float, kind?: "hardware"|"software"|"human"|"environmental"}`, `TreeEdgeSpec = {from: str, to: str}` (tree edges go from parent to child, parent = top event or gate, child = gate or basic event). Output: one RST directive block. +- (c) Reward: fixture — top event "Unintended Acceleration", OR gate decomposing to [ECU software fault, sensor stuck signal], sensor fault AND-gated by [sensor failure, fallback disabled]. Scorer: + 1. Output starts with renderer directive. + 2. Top event appears at the graph root (no incoming edges). + 3. Every gate rendered with UML / FTA-standard shape: AND = flat-bottom D, OR = curved-bottom D, NOT = triangle with bar, etc. (Mermaid approximation: labeled diamond + annotation.) + 4. Basic events rendered as circles (standard FTA notation) or leaves. + 5. With `show_probabilities=true`, every event/basic-event with a `probability` shows it numerically; gates show computed result if all children have probabilities. + 6. No basic event has outgoing edges (leaves). + 7. Every non-leaf node has ≥1 outgoing edge (gates can't be childless). + + Pass = all 7. +- (d) Reusable across safety-critical domains (automotive, medical, aerospace, industrial). +- (e) One tree per call. + +## Dangling edges / cycles + +- FAIL on edge endpoint not in `{top_event} ∪ gates ∪ basic_events`. +- FAIL on cycle (fault trees are DAGs; a cycle means the model is wrong). +- FAIL if the top event has an incoming edge (root must be the top). 
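
The three FAIL conditions above amount to a structural check on a directed graph. Since the skill itself is design-only, the following is just a sketch of how such a check might look; the function name, the `(parent, child)` tuple form of edges, and the message strings are all hypothetical.

```python
def validate_tree(top_id, gate_ids, basic_event_ids, edges):
    """Return a list of failure messages; an empty list means the structure is acceptable.

    edges is a list of (parent, child) id pairs.
    """
    nodes = {top_id} | set(gate_ids) | set(basic_event_ids)
    errors = []
    children = {n: [] for n in nodes}
    for parent, child in edges:
        if parent not in nodes or child not in nodes:
            errors.append(f"dangling edge: {parent} -> {child}")
            continue
        if child == top_id:
            errors.append(f"top event has incoming edge from {parent}")
        children[parent].append(child)

    # Cycle detection with an iterative three-colour depth-first search.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in nodes}
    for start in nodes:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(children[start]))]
        while stack:
            node, it = stack[-1]
            nxt = next(it, None)
            if nxt is None:
                colour[node] = BLACK  # all children explored
                stack.pop()
            elif colour[nxt] == GREY:
                errors.append(f"cycle through {nxt}")  # back edge found
                return errors
            elif colour[nxt] == WHITE:
                colour[nxt] = GREY
                stack.append((nxt, iter(children[nxt])))
    return errors
```

Collecting messages instead of raising on the first problem lets a caller report every dangling edge in one pass; the cycle check stops at the first back edge because one cycle is enough to reject the model.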
+ +## Output + +**PlantUML (has dedicated FTA symbols via GraphViz DOT syntax embedded):** +```rst +.. uml:: + :caption: <view_title> + + @startuml + skinparam defaultFontSize 11 + rectangle "TOP:\nUnintended Acceleration\nλ=1e-8/h" as TOP + rectangle "OR" as G1 + rectangle "AND" as G2 + circle "Sensor stuck\nλ=5e-7/h" as BE1 + circle "ECU SW fault\nλ=2e-8/h" as BE2 + circle "Fallback disabled\nλ=1e-6/h" as BE3 + TOP --> G1 + G1 --> BE2 + G1 --> G2 + G2 --> BE1 + G2 --> BE3 + @enduml +``` + +**Mermaid (flowchart approximation — no native FTA gate shapes):** +```rst +.. mermaid:: + :caption: <view_title> + + flowchart TD + TOP["TOP: Unintended Acceleration<br/>λ=1e-8/h"] + G1{{OR}} + G2{{AND}} + BE1(("Sensor stuck<br/>λ=5e-7/h")) + BE2(("ECU SW fault<br/>λ=2e-8/h")) + BE3(("Fallback disabled<br/>λ=1e-6/h")) + TOP --> G1 + G1 --> BE2 + G1 --> G2 + G2 --> BE1 + G2 --> BE3 +``` + +## Interaction with `pharaoh-fmea` + +FMEA and FTA are complementary: FMEA is bottom-up (component → effect), FTA is top-down (hazard → component). Pharaoh already has `pharaoh-fmea` for FMEA entries. A future orchestrator (`pharaoh-hazard-analysis`) may pair the two: extract top hazards from FMEA entries with high RPN, then generate FTA per hazard. Out of scope here. + +## Non-goals + +- No cut-set minimization — quantitative FTA tools (e.g. FaultTree+, CAFTA) handle this; this skill just emits the tree structure. +- No probability computation beyond trivial AND-of-independents / OR-of-independents — caller provides computed probabilities if needed. +- No dynamic fault trees (Markov chains, repair rates) — static FT only. +- No common-cause-failure (CCF) modeling — would need extra node kind; a future extension. 
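The "trivial" gate arithmetic named in the non-goals is the standard rule for independent events: an AND gate multiplies its children's probabilities, and an OR gate is one minus the product of their complements. A minimal sketch, assuming independence; `gate_probability` is a hypothetical helper, not part of the skill.

```python
from math import prod


def gate_probability(kind, child_probabilities):
    """Combine child probabilities for an AND or OR gate, assuming independent children."""
    if kind == "AND":
        # All children must occur.
        return prod(child_probabilities)
    if kind == "OR":
        # At least one child occurs: complement of "none occur".
        return 1.0 - prod(1.0 - p for p in child_probabilities)
    # Other gate kinds (NOT, INHIBIT, PAND, XOR) are outside this sketch.
    raise ValueError(f"unsupported gate kind: {kind}")
```

Anything beyond these two cases, such as dependent events or priority-AND ordering, is left to the caller, as the non-goals state.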
diff --git a/.github/agents/pharaoh.feat-balance.agent.md b/.github/agents/pharaoh.feat-balance.agent.md index fd28551..9efff5b 100644 --- a/.github/agents/pharaoh.feat-balance.agent.md +++ b/.github/agents/pharaoh.feat-balance.agent.md @@ -7,4 +7,113 @@ handoffs: [] Use when a plan emitted by `pharaoh-write-plan` has completed its feature + comp_req emission and you need to check for granularity skew — features with too many reqs (under-decomposed feature model), too few (over-decomposed), fused sub-features (generic names like "utilities"), or redundancy (symmetric import/export pairs). Reports health and suggestions; does not mutate. -See [`skills/pharaoh-feat-balance/SKILL.md`](../../skills/pharaoh-feat-balance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-feat-balance + +## When to use + +Invoke after a composition emits feats + comp_reqs, before running `pharaoh-quality-gate`, to catch granularity issues that quality-gate thresholds don't cover. Two typical failure shapes: a feat with an outsized CREQ count (the feat spans multiple capabilities and should be split) and a feat whose title contains a "utilities" / "misc" / "helpers" smell (two unrelated subcommands fused under one name). This skill surfaces both. + +Do NOT use to reshape the feature set — it only reports. Caller acts on suggestions manually. + +## Atomicity + +- (a) Indivisible — one distribution in → one report out. No mutations. No parallel fan-out. +- (b) Input: `{distribution_path: str, thresholds?: {max_reqs_per_feat: int, min_reqs_per_feat: int, name_smell_patterns: list[str], redundancy_title_overlap_min: float, redundancy_count_tolerance: float}}`. Output: YAML report with summary, outliers, redundancy_candidates, overall_health. Defaults: max=15, min=3, name_smell=["utilities","helpers","misc","other","general"], title_overlap=0.5, count_tolerance=0.20. 
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-balance/input_distribution.yaml` modelled on a skewed-distribution example. Skill run against it with defaults produces output byte-exact matching `expected_output_skewed.yaml`: + - `FEAT_reqif_export` flagged `too_many` (19 > 15). + - `FEAT_jama_utilities` flagged `fused_subfeatures` (title matches smell pattern). + - `(FEAT_csv_export, FEAT_csv_import)` flagged as redundancy candidate (symmetric import/export with matching counts). + - `overall_health: "skewed"` (any flag = skewed). +- (d) Reusable on any feature catalogue. +- (e) Composable: never calls other skills. + +## Input + +- `distribution_path`: absolute path to a YAML file containing the feature distribution. Expected shape: + ```yaml + features: + - feat_id: FEAT_csv_export + title: "CSV Export" + reqs_count: 12 + - feat_id: FEAT_reqif_export + title: "ReqIF Export" + reqs_count: 19 + ... + ``` +- `thresholds` (optional): override defaults. Partial override supported (missing keys use defaults). + +## Output + +```yaml +summary: + feat_count: <int> + total_reqs: <int> + mean_reqs_per_feat: <float> + median: <int> + min: <int> + max: <int> + stdev: <float> + +outliers: + - feat_id: <id> + reqs_count: <int> + flag: too_many | too_few | fused_subfeatures + suggestion: <string> + +redundancy_candidates: + - feats: [<id1>, <id2>] + reason: <string> + +overall_health: healthy | skewed | critical +``` + +## Process + +### Step 1: Load + compute summary + +Read `distribution_path` via `yaml.safe_load`. Compute `feat_count`, `total_reqs`, `mean`, `median`, `min`, `max`, `stdev` across `features[*].reqs_count`. Round mean/stdev to one decimal. + +### Step 2: Flag outliers + +For each feature: +- If `reqs_count > thresholds.max_reqs_per_feat` (default 15) → flag `too_many`. Suggestion: `"Consider splitting: <N> reqs suggests the feature spans multiple distinct capabilities. Look for natural boundaries (e.g. 
<hint based on title>)."` +- If `reqs_count < thresholds.min_reqs_per_feat` (default 3) → flag `too_few`. Suggestion: `"Feature has only <N> req(s) — verify it's a distinct capability and not a stub. Consider merging into a parent feature if the scope is thin."` +- If feature title (lowercased) matches any `thresholds.name_smell_patterns` (default `["utilities","helpers","misc","other","general"]`) as a substring → flag `fused_subfeatures`. Suggestion: `"Feature title <title> is a code smell — 'utilities' and similar names often lump unrelated capabilities. Consider splitting by the specific capabilities it includes."` + +One feature may carry multiple flags — emit one outlier entry per flag. + +### Step 3: Detect redundancy candidates + +For each pair of features `(A, B)` where A != B: +- Compute title-token overlap: tokenize both titles on whitespace/punctuation, lowercase; `overlap = len(common_tokens) / len(tokens_A ∪ tokens_B)`. +- Compute count ratio: `ratio = min(A.count, B.count) / max(A.count, B.count)`. +- If `overlap >= thresholds.redundancy_title_overlap_min` (default 0.5) AND `ratio >= (1 - thresholds.redundancy_count_tolerance)` (default tolerance 0.20 → ratio ≥ 0.80) → add to `redundancy_candidates`. + +Deduplicate by sorting the pair (`[A.id, B.id]` lexicographic). + +For each candidate, compose a reason: `"Same title-token overlap (<overlap:.0%>), symmetric counts (<A.count> vs <B.count>). Consider <merged_name>."` where `merged_name` strips the differing token (e.g. `csv_export` + `csv_import` → `csv_exchange`). + +### Step 4: Determine overall_health + +- `healthy`: zero outliers AND zero redundancy_candidates. +- `skewed`: at least one outlier OR redundancy_candidate. +- `critical`: > 25% of features flagged (outliers only; redundancy does not count toward this). + +### Step 5: Return + +Return the YAML report. + +## Failure modes + +- `distribution_path` not readable → FAIL. +- Distribution parses but has no `features` key or empty list → FAIL. 
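Steps 1 to 4 above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the skill's implementation: `balance_report` and `title_tokens` are hypothetical names, the feature list is passed in already parsed (YAML loading is left out), suggestions and reasons are omitted, `statistics.pstdev` is an assumed choice because the text does not say whether `stdev` is the sample or population figure, and `critical` is assumed to take precedence over `skewed` and to count distinct flagged features.

```python
import re
import statistics
from itertools import combinations

DEFAULTS = {
    "max_reqs_per_feat": 15,
    "min_reqs_per_feat": 3,
    "name_smell_patterns": ["utilities", "helpers", "misc", "other", "general"],
    "redundancy_title_overlap_min": 0.5,
    "redundancy_count_tolerance": 0.20,
}


def title_tokens(title):
    """Lowercase the title and split it on whitespace / punctuation."""
    return {tok for tok in re.split(r"[\W_]+", title.lower()) if tok}


def balance_report(features, thresholds=None):
    t = {**DEFAULTS, **(thresholds or {})}  # partial override: missing keys use defaults
    counts = [f["reqs_count"] for f in features]

    # Step 1: summary statistics.
    summary = {
        "feat_count": len(features),
        "total_reqs": sum(counts),
        "mean_reqs_per_feat": round(statistics.mean(counts), 1),
        "median": statistics.median(counts),
        "min": min(counts),
        "max": max(counts),
        "stdev": round(statistics.pstdev(counts), 1),  # assumption: population stdev
    }

    # Step 2: one outlier entry per flag.
    outliers = []
    for f in features:
        if f["reqs_count"] > t["max_reqs_per_feat"]:
            outliers.append({"feat_id": f["feat_id"], "reqs_count": f["reqs_count"], "flag": "too_many"})
        if f["reqs_count"] < t["min_reqs_per_feat"]:
            outliers.append({"feat_id": f["feat_id"], "reqs_count": f["reqs_count"], "flag": "too_few"})
        if any(pattern in f["title"].lower() for pattern in t["name_smell_patterns"]):
            outliers.append({"feat_id": f["feat_id"], "reqs_count": f["reqs_count"], "flag": "fused_subfeatures"})

    # Step 3: title-token overlap plus count ratio, over unordered pairs.
    redundancy = []
    for a, b in combinations(features, 2):
        ta, tb = title_tokens(a["title"]), title_tokens(b["title"])
        union = ta | tb
        overlap = len(ta & tb) / len(union) if union else 0.0
        larger = max(a["reqs_count"], b["reqs_count"])
        ratio = min(a["reqs_count"], b["reqs_count"]) / larger if larger else 0.0
        if overlap >= t["redundancy_title_overlap_min"] and ratio >= 1 - t["redundancy_count_tolerance"]:
            redundancy.append(sorted([a["feat_id"], b["feat_id"]]))

    # Step 4: assumed precedence critical > skewed > healthy.
    flagged = {o["feat_id"] for o in outliers}
    if features and len(flagged) / len(features) > 0.25:
        health = "critical"
    elif outliers or redundancy:
        health = "skewed"
    else:
        health = "healthy"

    return {
        "summary": summary,
        "outliers": outliers,
        "redundancy_candidates": redundancy,
        "overall_health": health,
    }
```

Under this literal tokenisation, "CSV Export" and "CSV Import" share one of three tokens, so their overlap is about 0.33; in this sketch that pair qualifies only when `redundancy_title_overlap_min` is lowered, for example `balance_report(features, {"redundancy_title_overlap_min": 0.3})`, or when titles are tokenised differently.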
+ +## Non-goals + +- No mutation of the feature set. +- No re-draft suggestions — this skill describes the shape problem, not the fix. +- No cross-project comparison — one distribution per invocation. diff --git a/.github/agents/pharaoh.feat-component-extract.agent.md b/.github/agents/pharaoh.feat-component-extract.agent.md index b3db215..5c14e2d 100644 --- a/.github/agents/pharaoh.feat-component-extract.agent.md +++ b/.github/agents/pharaoh.feat-component-extract.agent.md @@ -7,4 +7,158 @@ handoffs: [] Use when reverse-engineering a feat and you need to derive a component composition diagram automatically from the feat + its source files. Walks import edges between the listed files and emits a Mermaid or PlantUML diagram whose output shape is compatible with pharaoh-component-diagram-draft. Does NOT hand-author nodes or edges; extraction is rule-based. -See [`skills/pharaoh-feat-component-extract/SKILL.md`](../../skills/pharaoh-feat-component-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-feat-component-extract + +## When to use + +Invoke after `pharaoh-feat-file-map` has produced a `{feat_id, files}` mapping, when you want a static architecture view of the feature showing which modules/classes compose it and how they depend on each other. The diagram output shape matches `pharaoh-component-diagram-draft` so downstream tooling (sphinx-needs rendering, diff review) treats auto-extracted diagrams identically to hand-authored ones. + +Do NOT use to draft a diagram from scratch when you already have explicit node+edge data — that is `pharaoh-component-diagram-draft`. Do NOT use to extract runtime flow — that is `pharaoh-feat-flow-extract`. + +## Tailoring awareness + +Shared tailoring rules: see `shared/diagram-tailoring.md`. 
Reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.component]` from the consumer project's `pharaoh.toml` for renderer choice and styling. Respects `on_missing_config` per the shared `check → propose → confirm` pattern. + +Safe-label rules: see `shared/diagram-safe-labels.md`. Node IDs derived from file paths MUST be aliased (path characters `/` and `.` are invalid in Mermaid / PlantUML identifier positions). Edge labels MUST be sanitised — call-labels like `foo(arg1; arg2)` become `foo(arg1, arg2)` before emit. A parse failure in the emitted block is invisible under `sphinx-build` and surfaces only at browser render time; `pharaoh-diagram-lint` (run as part of `pharaoh-quality-gate`) is the second guard. + +## Atomicity + +- (a) Indivisible — one feat + one file list in → one diagram RST block out. No multi-feat bundling. No mutation of source files. No req emission. +- (b) Input: `{feat_id: str, feat_title: str, files: list[str], project_root: str, src_root: str, renderer_override?: "mermaid"|"plantuml", include_external?: bool, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. Output: one RST directive block (`.. mermaid::` or `.. uml::`) with caption `<feat_id> — component composition`. No surrounding prose. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-component-extract/`: + - `input_feat.yaml` declares `feat_id: FEAT_csv_export`, `feat_title: "CSV Export"`, `files: [csv/export.py, csv/writer.py, commands/csv.py]`. + - `input_files/` contains three Python files with explicit imports: `commands/csv.py` imports `from csv.export import run_export`; `csv/export.py` imports `from csv.writer import CSVWriter`; `csv/writer.py` has no project-internal imports. + - Expected diagram at `expected_diagram.rst` has 3 nodes (one per file), 2 directed edges (`commands/csv.py → csv/export.py`, `csv/export.py → csv/writer.py`), no external nodes. + + Scorer: + 1. Output starts with the renderer directive. + 2. 
All 3 nodes appear by label. + 3. Both directed edges render with correct arrow syntax. + 4. With default `include_external=false`, no external imports (e.g. `typer`, `pathlib`, `csv`) appear as nodes. + 5. With `include_external=true`, external imports render as ghost nodes (dashed outline, muted color, `<<external>>` stereotype). + 6. Output matches `pharaoh-component-diagram-draft` output shape (same directive, same caption format, same node/edge syntax). + + Pass = all 6. +- (d) Reusable for any language whose import graph the extractor supports. Python initial target; regex-based import detection so adding Rust/TypeScript is a configuration table entry, not a rewrite. +- (e) Composable: one feat per call. A plan emitted by `pharaoh-write-plan` may include a `foreach` task over feats that dispatches N instances (one per feat) in parallel via `pharaoh-execute-plan`. This skill never invokes other skills. + +## Input + +- `feat_id`: the feature's sphinx-needs ID, used as the diagram caption prefix. +- `feat_title`: human-readable title, shown in caption. +- `files`: list of source file paths relative to `src_root`. These become the diagram's in-scope nodes. +- `project_root`: absolute path, for `pharaoh.toml` tailoring lookup. +- `src_root`: absolute path, the import-graph resolution root. `files[*]` resolve as `<src_root>/<file>`. +- `renderer_override` (optional): per shared doc. +- `include_external` (optional): if `true`, imports that resolve outside `files` but inside `src_root` become ghost nodes. Imports resolving outside `src_root` entirely (stdlib, third-party) are ignored regardless. Default `false`. +- `on_missing_config` (optional): per shared doc. Default `"prompt"`. +- `papyrus_workspace` (optional): for consistent node labeling with other skills that reference the same files. +- `reporter_id`: short agent identifier. + +## Output + +**Mermaid (default):** +```rst +.. 
mermaid:: + :caption: FEAT_csv_export — component composition + + graph TD + commands_csv[commands/csv.py<br/>run_export] + csv_export[csv/export.py<br/>run_export] + csv_writer[csv/writer.py<br/>CSVWriter] + commands_csv --> csv_export + csv_export --> csv_writer +``` + +Node IDs (left-hand side of the bracket) are sanitized forms of the file path (replace `/` and `.` with `_`). Node labels show the file path plus the primary symbol (largest top-level def/class, or the one whose name matches feat title tokens). + +**PlantUML:** +```rst +.. uml:: + :caption: FEAT_csv_export — component composition + + @startuml + component "commands/csv.py\n(run_export)" as commands_csv + component "csv/export.py\n(run_export)" as csv_export + component "csv/writer.py\n(CSVWriter)" as csv_writer + commands_csv --> csv_export + csv_export --> csv_writer + @enduml +``` + +## Process + +### Step 1: Enumerate nodes + +For each file in `files`, read via absolute path (`<src_root>/<file>`). Parse top-level symbol declarations: + +- Python: `^class <Name>`, `^def <Name>`, `^async def <Name>`. +- Rust: `^(pub )?(fn|struct|enum|trait|impl) <Name>`. +- JS/TS: `^(export )?(function|class|const|let|var) <Name>`. +- Go: `^func (<Receiver>) <Name>` / `^type <Name>`. + +Pick the primary symbol per file: longest body OR name matching `feat_title` tokens (case-insensitive substring match on any token). If ambiguous, pick the one defined earliest. + +Node label: `<file>\n(<primary_symbol>)` (Mermaid uses `<br/>`, PlantUML uses `\n`). +Node ID: `<file>` with `/` → `_` and `.` → `_`, then stripped of the trailing `_py` extension marker (e.g. `commands/csv.py` → `commands_csv`). + +### Step 2: Enumerate edges + +For each file, parse imports via language-specific regex: + +- Python: `^(?:from|import) (?P<module>[\w.]+)`. +- Rust: `^use (?P<module>[\w:]+)`. +- JS/TS: `^import .* from ["'](?P<module>[^"']+)["']`. +- Go: `^\s*"(?P<module>[\w/.]+)"` within an `import (...)` block. 
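The Python rows of the two pattern lists above, together with the node-ID rule, can be sketched as follows. This is a minimal illustration, not the extractor itself: the helper names `node_id`, `top_level_symbols`, and `imported_modules` are hypothetical, and the import pattern uses a single named group because Python's `re` module rejects a pattern that defines the same group name twice.

```python
import re

# Top-level Python declarations: class, def, async def (no leading indentation).
PY_SYMBOL = re.compile(r"^(?:class|def|async def)\s+(?P<name>\w+)", re.MULTILINE)

# Top-level Python imports; `from X import ...` and `import X` both yield X.
PY_IMPORT = re.compile(r"^(?:from|import)\s+(?P<module>[\w.]+)", re.MULTILINE)


def node_id(path):
    """Turn a relative file path into an identifier-safe node ID."""
    stem = path[:-3] if path.endswith(".py") else path  # drop the .py extension
    return stem.replace("/", "_").replace(".", "_")


def top_level_symbols(source):
    """Names declared at column zero, in definition order."""
    return [m.group("name") for m in PY_SYMBOL.finditer(source)]


def imported_modules(source):
    """Dotted module names imported at column zero, in source order."""
    return [m.group("module") for m in PY_IMPORT.finditer(source)]
```

Because both patterns are anchored at the start of a line, imports and definitions nested inside functions or classes are not picked up; whether that is desired depends on how the extractor defines "top-level".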
+ +For each imported module, resolve to a file path: +- Try `<src_root>/<module_path>.py` (replacing `.` with `/`). +- Try `<src_root>/<module_path>/__init__.py`. +- Try other language-appropriate conventions. + +If the resolved path is in `files`, emit an edge `<importer_file> → <resolved_file>`. + +If the resolved path is outside `files` but inside `src_root` AND `include_external=true`, add a ghost node `external::<module>` and emit the edge. + +If the import resolves outside `src_root` entirely (stdlib, third-party), drop silently. + +### Step 3: Emit diagram + +Resolve renderer per shared doc's resolution order (`renderer_override` → `pharaoh.toml [pharaoh.diagrams].renderer` → default `mermaid`). + +Emit the diagram with direction `TD` (top-down, showing call depth). Caption: `<feat_id> — component composition`. + +For ghost nodes (when `include_external=true`), group them visually apart where the renderer supports it: +- Mermaid: separate `subgraph External` block. +- PlantUML: `package "external" { ... }` block. + +Ghost-node styling: dashed outline + muted color (specifics per renderer — consult `shared/diagram-tailoring.md` for the type_styles lookup). + +### Step 4: Return + +Single RST block. No prose before or after. + +## Failure modes + +- `files` empty → FAIL. +- Any file in `files` unreadable → log + skip that file (do not abort unless all files unreadable). +- Cycles in the import graph → emit the diagram anyway (Mermaid/PlantUML handle cycles); log a note. +- Zero edges resolved inside `files` → emit nodes only, log a note ("no intra-scope edges detected — check files list or use include_external=true"). +- `include_external=true` AND zero in-scope edges AND zero external edges → still emit nodes only, log note. + +## Non-goals + +- No runtime / call-graph inference — that is `pharaoh-feat-flow-extract`. +- No type hierarchy — that is `pharaoh-class-diagram-draft` (hand-authored) or a future `pharaoh-feat-class-extract`. 
+- No transitive import resolution beyond one hop — depth > 1 explodes scope. +- No dead-code detection — every file in `files` is a node, whether imported or not. + +## Last step + +After emitting the artefact, invoke `pharaoh-diagram-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-diagram-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/.github/agents/pharaoh.feat-draft-from-docs.agent.md b/.github/agents/pharaoh.feat-draft-from-docs.agent.md index d2c4477..d6abf59 100644 --- a/.github/agents/pharaoh.feat-draft-from-docs.agent.md +++ b/.github/agents/pharaoh.feat-draft-from-docs.agent.md @@ -7,4 +7,189 @@ handoffs: [] Use when reading one or more existing documentation files (unstructured prose, README, tutorial) and emitting one or more feature-level RST directives (typed by `target_level`, default `feat`) that describe the user-facing capabilities documented in those files. Does NOT read source code. Does NOT emit component requirements. Does NOT map features to files — that is `pharaoh-feat-file-map`. -See [`skills/pharaoh-feat-draft-from-docs/SKILL.md`](../../skills/pharaoh-feat-draft-from-docs/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-feat-draft-from-docs + +## When to use + +Invoke when a project has unstructured documentation (e.g. `docs/source/features/*.rst`, `README.md`, product overview pages) that describes user-facing capabilities in prose, and you need to extract those capabilities as sphinx-needs `feat` (or equivalent) directives. This is the first step of reverse-engineering a requirements model from an existing project: docs → features. The follow-up skill `pharaoh-feat-file-map` maps each emitted feature to source files; `pharaoh-req-from-code` then generates component requirements per-file from code. + +Do NOT use to draft features from scratch (that is `pharaoh-req-draft` with `target_level="feat"`). Do NOT use to emit reqs from code (that is `pharaoh-req-from-code`). Do NOT use to generate architecture diagrams (a separate future skill). + +## Tailoring awareness + +The emitted directive name and ID prefix come from the consumer project's `ubproject.toml` `[[needs.types]]` (or `.pharaoh/project/id-conventions.yaml` if present). The caller passes `target_level` — use it verbatim as the directive name. Do NOT hardcode `feat` as the only acceptable type. Projects may call their top-level artefact `story`, `capability`, `feature`, `use_case`, etc. + +## Atomicity + +- (a) Indivisible — one invocation reads `doc_files` and emits N feature directives. No source-code reads. No file-mapping. No inter-feature dependency analysis. One artefact × one phase. +- (b) Input: `{doc_files: list[str], target_level: str, project_root: str, papyrus_workspace?: str, reporter_id: str, on_missing_config?: "fail"|"prompt"|"use_default"}`. Output: single JSON object `{"feats": [{"id", "title", "type", "body", "source_doc", "raw_rst"}, ...]}`. The `raw_rst` field of each feat is the full RST directive block; downstream skills that want raw RST read it from there. 
On `on_missing_config="prompt"` with `target_level` undeclared → single JSON object `{status: "needs_confirmation", proposal: ...}`. +- (c) Reward: deterministic fixture — a 2-file doc tree with known feature vocabulary (e.g. `features/csv.rst` mentioning "CSV import" and "CSV export"; `features/jama.rst` mentioning "Jama pull" and "Jama push"). After skill runs, scorer checks: + 1. Every emitted block uses `target_level` as the directive name. + 2. Every emitted block has a `:id:` option. + 3. Every emitted ID prefix equals the ID prefix resolved from the project's tailoring (see Output). + 4. Every emitted block contains a `:source_doc:` option pointing to one of the `doc_files` paths. + 5. For each fixture doc paragraph marked as "must_yield_feat" in the fixture metadata, at least one emitted block's title or body mentions the paragraph's canonical vocabulary (substring match, case-insensitive). + 6. At least 1 feat is emitted (non-empty output). + + Pass = all 6 checks pass. +- (d) Reusable: any reverse-engineering workflow on projects with existing prose docs; migration from README-only to sphinx-needs; extracting features from product specs. +- (e) Composable: strictly one phase (docs → feat directives). Never invokes `pharaoh-req-from-code`, `pharaoh-arch-draft`, or `pharaoh-feat-file-map`. A plan emitted by `pharaoh-write-plan` composes this skill with `pharaoh-feat-file-map` and downstream req-emission tasks — not vice versa. + +## Input + +- `doc_files`: list of absolute paths to documentation files to read. Typically `.rst`, `.md`, or `.txt`. At least one must be provided. Files are read but not modified. +- `target_level`: directive name for the emitted features. Must match a `[[needs.types]].directive` in the consumer project's `ubproject.toml` (e.g. `"feat"`, `"story"`, `"capability"`). The emitted directive uses this name verbatim. 
+- `project_root`: absolute path to the consumer project's root, used to resolve the ID prefix from `ubproject.toml` (`[[needs.types]]` entry whose `directive` equals `target_level`). If the `ubproject.toml` does not declare a prefix for `target_level`, fall back to `<target_level>__` (double-underscore convention). +- `papyrus_workspace` (optional): path to `.papyrus/` directory for canonical-term coordination with concurrent agents. If omitted, the skill operates in no-memory mode. +- `reporter_id`: short identifier for this agent (e.g. `feat-draft-from-docs:features`). Passed to `pharaoh-decision-record` calls. +- `granularity` (optional): `"doc" | "top_section" | "manual_hint"`. Default `"doc"`. Controls decomposition of each doc file into feats: + - `"doc"` — one feat per input doc file. Simplest; right for "one topic per doc" layouts. Current default and the shape that has been stable since initial dogfooding. + - `"top_section"` — split each doc at its top-level headings (RST: title underlined with `===`; Markdown: a line starting with a single `#`). Emit one feat per top-level section. Right for docs that cover multiple capabilities under one roof (e.g. a ReqIF connector doc that covers both export and import — granularity `top_section` produces `FEAT_reqif_export` + `FEAT_reqif_import` instead of a single fused `FEAT_reqif_exchange`). + - `"manual_hint"` — look for explicit split markers inside each doc: `.. feat-split::` comment-directive (RST) or `<!-- feat-split -->` marker (Markdown). Emit one feat per segment separated by those markers. Caller-controlled; useful when prose organisation does not match the desired feat boundary. +- `on_missing_config` (optional): `"fail" | "prompt" | "use_default"`. Default `"prompt"`. Determines behavior when `target_level` is not declared in `ubproject.toml`. See shared `check → propose → confirm` pattern in `shared/diagram-tailoring.md` (same semantics, different subject matter). 
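The `top_section` granularity described above can be sketched for both formats. This is an illustration under stated assumptions, not the skill's implementation: the function names are hypothetical, an RST heading is taken to be a non-empty line followed by a line of `=` characters of the same length, and a document without any top-level heading is returned as one untitled segment.

```python
def split_markdown_top_sections(text):
    """Return (title, body) pairs, one per line starting with a single '# '."""
    sections = []
    title, body = None, []
    for line in text.splitlines():
        if line.startswith("# "):
            if title is not None:
                sections.append((title, "\n".join(body).strip()))
            title, body = line[2:].strip(), []
        elif title is not None:
            body.append(line)
    if title is not None:
        sections.append((title, "\n".join(body).strip()))
    # No top-level heading: the whole document is one segment.
    return sections or [(None, text.strip())]


def split_rst_top_sections(text):
    """Return (title, body) pairs, one per title underlined with '=' of equal length."""
    lines = text.splitlines()
    starts = [
        i
        for i in range(len(lines) - 1)
        if lines[i].strip()
        and set(lines[i + 1]) == {"="}
        and len(lines[i + 1]) == len(lines[i])
    ]
    if not starts:
        return [(None, text.strip())]
    sections = []
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(lines)
        sections.append((lines[start], "\n".join(lines[start + 2:end]).strip()))
    return sections
```

The sketch drops any text before the first heading and does not skip `# ` lines inside fenced code blocks; a fuller version would need to handle both.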
+ +## Output + +A single JSON object with one top-level key `feats` (list of feat objects). One feat object per emitted feature. Shape: + +```json +{ + "feats": [ + { + "id": "<id_prefix><snake_case_id>", + "title": "<short_title>", + "type": "<target_level>", + "body": "<one-sentence feature statement in user-facing language>", + "source_doc": "<relative_path_to_doc_file>", + "raw_rst": ".. <target_level>:: <short_title>\n :id: ...\n :status: draft\n :source_doc: ...\n\n <body>\n" + } + ] +} +``` + +The `raw_rst` field MUST be exactly the directive block as it would appear if pasted into an RST file. Downstream skills (e.g. `pharaoh-req-review`, `pharaoh-feat-review`) read `raw_rst` when they need the directive text; helpers that consume `feats` (e.g. `to_papyrus_seeds`) read `id`, `title`, `body`. + +`<id_prefix>` resolution: +1. Read `<project_root>/ubproject.toml`. +2. Find the `[[needs.types]]` entry whose `directive` equals `target_level`. +3. If it has a `prefix` field, use that verbatim (e.g. `prefix = "FEAT_"` → `FEAT_csv_import`). +4. Otherwise use `<target_level>__` (e.g. `feat__csv_import`). + +`<snake_case_id>` is derived from the feature's short_title (lowercase, spaces → underscores, non-alphanumeric stripped). + +`source_doc` — relative path (from `project_root`) to the doc file this feature was derived from. This is a Pharaoh convention for provenance. `pharaoh-bootstrap` declares `source_doc` under `[[needs.extra_options]]` by default so sphinx-needs does not warn under `-nW`; callers who opted out of the default must declare it manually or accept the warnings. Downstream skills (`pharaoh-feat-file-map`, plans emitted by `pharaoh-write-plan`) read this to group features by source doc. + +The output is one JSON object — no surrounding prose, no concatenated RST outside the JSON. + +## Output schema + +Validated as `json_obj` by `pharaoh-output-validate`. Validator checks: +1. Top-level is a JSON object with exactly one required key `feats` (list). 
+2. Every `feats[*]` has the keys `id`, `title`, `type`, `body`, `source_doc`, `raw_rst`. +3. `feats[*].type` equals input `target_level` (default `feat`). +4. `feats[*].source_doc` references a path present in the input `doc_files` list. +5. `feats[*].raw_rst` matches the RST directive Stage 1 + Stage 2 regex from `pharaoh-req-from-code` `## Output schema`, with directive name = `feats[*].type` and `:id:` / `:status:` / `:source_doc:` options present. +6. `feats[*].id` matches the resolved `<id_prefix><snake_case_id>` pattern. + +## Process + +### Step 1: MANDATORY — query Papyrus for canonical feature names (if workspace provided) + +For each feature concept you identify in the docs, query `pharaoh-context-gather` with a semantic description ("the capability that exports needs to CSV"). If a canonical feature name already exists, reuse it verbatim. This prevents drift when the same doc is re-processed or when multiple docs describe overlapping capabilities. + +Skip this step if `papyrus_workspace` is not provided (no-memory mode). + +### Step 2: Read all doc_files + +Read every file in `doc_files`. Concatenate into working memory. Identify user-facing capability boundaries: + +- Section headers often signal capability boundaries ("## Import from ReqIF", "## Export to Jama"). +- Imperative verbs describing what users can do with the product ("You can import …", "Users can export …"). +- Top-level bullet lists in README "Features" sections. +- sphinx-design cards with short capability labels. + +Ignore: +- Installation/setup instructions. +- Contributing guidelines. +- License text. +- Changelog entries. + +### Step 3: Resolve ID prefix from tailoring (with check → propose → confirm) + +Read `<project_root>/ubproject.toml`. Find the `[[needs.types]]` entry matching `target_level`. Extract its `prefix` field. Three resolution paths: + +1. **Type declared, prefix present** → use the declared prefix. Proceed. +2. 
**Type declared, prefix absent** → use `<target_level>__` silently (this is a minor-enough gap to default). +3. **Type NOT declared**, OR `ubproject.toml` missing entirely → branch on `on_missing_config`: + - `"fail"` → FAIL with: `"target_level=<value> not declared in <project_root>/ubproject.toml. Run pharaoh-bootstrap first, or pass on_missing_config='prompt' to negotiate."` + - `"prompt"` (default) → emit a `needs_confirmation` proposal: + ```json + { + "status": "needs_confirmation", + "proposal": { + "target_level": "<value>", + "proposed_prefix": "<uppercase value>_", + "rationale": "target_level is not declared as a type in ubproject.toml. Propose adding it so downstream skills have a stable type.", + "tailoring_patch": { + "target_file": "ubproject.toml", + "table": "[[needs.types]]", + "entry": {"directive": "<value>", "title": "<Title Case value>", "prefix": "<uppercase>_"} + } + } + } + ``` + Return without emitting features. The caller confirms, runs `pharaoh-tailor-fill` or edits manually, then re-invokes with `on_missing_config="use_default"`. + - `"use_default"` → synthesize defaults silently: treat `target_level` as declared with prefix `<target_level>__`. Proceed. + +### Step 4: Record newly surfaced canonical feature names in Papyrus + +Only if `papyrus_workspace` is provided. For each feature concept you will emit that was NOT returned by Step 1, invoke `pharaoh-decision-record` with: + +- `type`: `"fact"` +- `canonical_name`: the short_title you chose for this feature (space-separated, Title Case — e.g. `"CSV Import"`) +- `body`: one sentence describing the capability +- `reporter_id`: your `reporter_id` input +- `tags`: `["origin:feat-draft-from-docs", "doc:<doc_basename>"]` + +If `pharaoh-decision-record` returns `"duplicate"`, re-query and adopt the existing canonical name. + +### Step 5: Emit feature directives + +The set of emitted capabilities depends on `granularity`: + +- `"doc"` (default): one feat per input doc file. 
If a doc covers multiple topics, pick the dominant theme for the title/body and rely on downstream skills (e.g. `pharaoh-feat-balance`) to flag under-decomposition. +- `"top_section"`: enumerate top-level headings across all input docs. For RST, a top-level heading is a line followed by a line of `=` characters of matching length. For Markdown, a top-level heading is a line starting with `# ` (single hash, not `##`). Each top-level section becomes one feat; the section's prose is the body source. A doc with no top-level headings falls back to one feat for the whole doc (same as `"doc"`). +- `"manual_hint"`: scan each doc for split markers. RST: lines of form `.. feat-split::` (optionally followed by a feat title on the same line, e.g. `.. feat-split:: CSV Import`). Markdown: lines exactly matching `<!-- feat-split -->` or `<!-- feat-split: Title -->`. Segments between markers (and the implicit segment before the first marker, if any) each become one feat. A doc with zero markers falls back to one feat for the whole doc. + +Emit one block per the Output shape per resolved capability. Target: 3-15 features total across all `doc_files`. Fewer than 3 suggests under-decomposition (lumping); more than 15 suggests over-decomposition (every button becomes a feature). If you hit these bounds, log a warning and proceed anyway — the eval will flag it. + +Body text must be user-facing: "The system shall import needs from ReqIF files", not "The `from_reqif` command parses XML via lxml". Implementation detail belongs in `comp_req`, not `feat`. + +### Step 6: Return + +Emit one JSON object `{"feats": [...]}` per the Output shape. For each emitted capability build the per-feat mapping with `id`, `title`, `type` (= `target_level`), `body`, `source_doc`, and `raw_rst` (the literal RST block that would render the directive). Nothing else on stdout — no prose wrapper, no fenced code block. + +## No-memory mode + +If `papyrus_workspace` is absent, skip Steps 1 and 4. 
Proceed directly to 2, 3, 5, 6. + +## Failure modes + +- `doc_files` empty → FAIL: "At least one doc file required." +- Any file in `doc_files` unreadable → log and skip that file; do not abort unless all files are unreadable. +- `ubproject.toml` missing or `target_level` undeclared → resolved per Step 3 according to `on_missing_config`: FAIL only under `"fail"`; `"prompt"` returns a `needs_confirmation` proposal and `"use_default"` proceeds with synthesized defaults. +- `pharaoh-context-gather` / `pharaoh-decision-record` errors → log and proceed as if no match (never abort on memory-layer issues). + +## Last step + +After emitting the artefact, invoke `pharaoh-feat-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-feat-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. + +## Composition + +A plan emitted by `pharaoh-write-plan` calls this skill once with the full `doc_files` list in the initial wave, then a foreach task dispatches `pharaoh-feat-file-map` once per emitted feature to produce the feat→files mapping. diff --git a/.github/agents/pharaoh.feat-file-map.agent.md b/.github/agents/pharaoh.feat-file-map.agent.md index 5726391..5238bbd 100644 --- a/.github/agents/pharaoh.feat-file-map.agent.md +++ b/.github/agents/pharaoh.feat-file-map.agent.md @@ -7,4 +7,174 @@ handoffs: [] Use when mapping one feature (already emitted as an RST directive) to the source files that implement it. Reads the source tree, returns a YAML entry `{feat_id: {files: [...], rationale: "..."}}`. Does NOT read docs. Does NOT emit reqs.
Does NOT create or modify source files. -See [`skills/pharaoh-feat-file-map/SKILL.md`](../../skills/pharaoh-feat-file-map/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-feat-file-map + +## When to use + +Invoke after `pharaoh-feat-draft-from-docs` has emitted one or more feature directives, when you need to know which source files implement each feature. The emitted mapping feeds downstream `pharaoh-req-from-code` tasks (one invocation per file, with `parent_feat_ids` set from this mapping), producing `comp_req` directives that link back to the parent feature via `:satisfies:`. + +One invocation handles exactly one feature. To map N features, a plan emitted by `pharaoh-write-plan` uses a `foreach` task over feats to dispatch N instances concurrently. + +Do NOT use to draft features (that is `pharaoh-feat-draft-from-docs`). Do NOT use to emit reqs (that is `pharaoh-req-from-code`). Do NOT modify source files (that is a future bidirectional-trace skill). + +## Tailoring awareness + +This skill does not emit RST directives, so it is type-agnostic. It does, however, respect the consumer project's source layout: if `pharaoh.toml` or `ubproject.toml` declares a `[pharaoh.codelinks]` section or a sphinx-codelinks `source_discover.src_dir`, the skill uses that as the default `src_root`. Otherwise the caller must pass `src_root` explicitly. + +## Atomicity + +- (a) Indivisible — one feature in → one YAML entry out. No RST emit. No other feature analysis. One artefact × one phase. +- (b) Input: `{feat_id: str, feat_title: str, feat_body: str, src_root: str, file_glob?: str, exclude_glob?: list[str], papyrus_workspace?: str, reporter_id: str}`. Output: a single YAML object in FLAT shape `{feat_id: <str>, files: [<relative_path>, ...], rationale: "<one-sentence explanation>", entry_point?: <mapping>, shared_with?: [<feat_id>]}`. 
No wrapping prose, no outer `{<feat_id>: ...}` key — the `feat_id` lives as a sibling scalar alongside `files` and `rationale` so downstream aggregation over foreach results is a trivial list-of-mappings, not a merge of single-key mappings. +- (c) Reward: deterministic fixture — a 5-file source tree where 3 files clearly implement feature "FEAT_csv_export" (e.g. `csv/export.py`, `csv/writer.py`, `commands/csv.py`) and 2 are unrelated (`jama/client.py`, `reqif/parser.py`). After the skill runs, the scorer checks: + 1. Output is valid YAML parseable by PyYAML. + 2. Output has top-level keys including `feat_id` (equal to input `feat_id`), `files` (list), `rationale` (string). + 3. No other top-level keys are present beyond the optional `entry_point` and `shared_with`. + 4. Every path in `files` exists under `src_root`. + 5. Precision: of emitted files, ≥80% are in the fixture's ground-truth positive set. + 6. Recall: of the fixture's ground-truth positive files, ≥60% are in emitted `files`. + + Precision and recall targets are deliberately asymmetric — we accept more false positives than false negatives because downstream `pharaoh-req-from-code` can tolerate an extra file (just produces an extra req the human can delete), but missing a file means a behavior gets no requirement at all. +- (d) Reusable: any reverse-engineering workflow; impact analysis ("which files does this feature touch?"); rough component boundary detection. +- (e) Composable: one feature per call. A plan emitted by `pharaoh-write-plan` dispatches N instances via `foreach` when multiple feats exist. This skill never calls `pharaoh-feat-draft-from-docs` or `pharaoh-req-from-code`. + +## Input + +- `feat_id`: the feature's sphinx-needs ID (e.g. `"FEAT_csv_export"` or `"feat__csv_export"`). Used verbatim as the value of the output's top-level `feat_id` key. +- `feat_title`: the feature's short title (e.g. `"CSV Export"`). Used for semantic reasoning about file relevance. +- `feat_body`: the feature's one-sentence statement (e.g.
`"The system shall export needs to CSV files."`). Used for semantic reasoning. +- `src_root`: absolute path to the source tree to scan. All emitted file paths are relative to this root. +- `file_glob` (optional): glob pattern for candidate files. Default: `"**/*"` minus common excludes (see `exclude_glob`). Callers for a Python project may pass `"**/*.py"`; for a polyglot project, a combined pattern. +- `exclude_glob` (optional): list of glob patterns to exclude. Default: `["**/__pycache__/**", "**/.git/**", "**/node_modules/**", "**/*.pyc", "**/tests/**", "**/test_*.py", "**/*_test.py"]`. Tests are excluded by default because they describe verification, not implementation; a separate skill can map tests to features if needed. +- `papyrus_workspace` (optional): path to `.papyrus/` directory. If provided, the skill queries for prior knowledge about which files implement which concepts (enables cross-run consistency). +- `reporter_id`: short identifier for this agent (e.g. `feat-file-map:FEAT_csv_export`). + +## Output + +A single YAML document, no prose wrapper: + +```yaml +feat_id: FEAT_csv_export +files: + - csv/export.py + - csv/writer.py + - commands/csv.py +rationale: "Export pipeline: export.py orchestrates, writer.py serializes rows, commands/csv.py registers the CLI entrypoint." +``` + +Top-level keys: `feat_id` (equal to input), `files`, `rationale`. Optional top-level keys: + +- `shared_with: list[feat_id]` — populated by the orchestrator when the same file serves multiple features (see below). +- `entry_point: {file: str, symbol: str}` — names the file + symbol where feature flow begins (typically a CLI command, HTTP route, test entry, event handler). Downstream `pharaoh-feat-flow-extract` reads this to know where to start the call-chain walk. Leave absent when no single entry point applies (e.g. the feature is a pure data model with no orchestrating function). 
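
The top-level key rules above can be checked mechanically once the YAML has been parsed into a mapping. The following is a minimal sketch only: the function name `check_top_level_keys` and the wording of the problem messages are assumptions for illustration, not part of the skill or its validator.

```python
# Hypothetical helper; names and messages are illustrative assumptions.
REQUIRED_KEYS = {"feat_id", "files", "rationale"}
OPTIONAL_KEYS = {"shared_with", "entry_point"}


def check_top_level_keys(doc, expected_feat_id):
    """Return a list of problems; an empty list means the key rules hold."""
    if not isinstance(doc, dict):
        return ["root is not a mapping"]
    problems = []
    present = set(doc)
    missing = REQUIRED_KEYS - present
    extra = present - REQUIRED_KEYS - OPTIONAL_KEYS
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected keys: {sorted(map(str, extra))}")
    if doc.get("feat_id") != expected_feat_id:
        problems.append("feat_id does not equal the input feat_id")
    entry_point = doc.get("entry_point")
    if entry_point is not None:
        # entry_point, when present, needs string `file` and `symbol` fields.
        ok = (
            isinstance(entry_point, dict)
            and isinstance(entry_point.get("file"), str)
            and isinstance(entry_point.get("symbol"), str)
        )
        if not ok:
            problems.append("entry_point must map file and symbol to strings")
    return problems


flat_doc = {
    "feat_id": "FEAT_csv_export",
    "files": ["csv/export.py", "csv/writer.py", "commands/csv.py"],
    "rationale": "Export pipeline from CLI through the writer.",
    "entry_point": {"file": "commands/csv.py", "symbol": "export"},
}
nested_doc = {"FEAT_csv_export": {"files": [], "rationale": "..."}}

flat_problems = check_top_level_keys(flat_doc, "FEAT_csv_export")
nested_problems = check_top_level_keys(nested_doc, "FEAT_csv_export")
```

The nested document is rejected because its only top-level key is the feat ID itself, which is the wrapped shape the flat contract rules out.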
+ +`files` is a list of strings (each a path relative to `src_root`) and `rationale` is a one-sentence string explaining why these files were chosen. + +Example with entry_point (recommended when one clearly exists): + +```yaml +feat_id: FEAT_csv_export +files: + - csv/export.py + - csv/writer.py + - commands/csv.py +rationale: "Export pipeline from CLI through the writer." +entry_point: + file: commands/csv.py + symbol: export +``` + +When a file implements behavior across multiple features (e.g. `commands/reqif.py` serves both ReqIF import and export), the `to_files_flat` helper in a plan emitted by `pharaoh-write-plan` detects this by seeing the same path appear under multiple feat entries (each entry a flat mapping produced by one `foreach` instance of this skill). It denormalises so the file appears once with `parents: [<feat_ids>]` listing all parents. Example (two instances from different foreach iterations): + +```yaml +# instance 1 (feat: FEAT_reqif_export) +feat_id: FEAT_reqif_export +files: + - reqif/needs2reqif.py + - commands/reqif.py +rationale: "..." + +# instance 2 (feat: FEAT_reqif_import) +feat_id: FEAT_reqif_import +files: + - reqif/reqif2needs.py + - commands/reqif.py # shared with FEAT_reqif_export +rationale: "..." +``` + +This atomic skill emits one entry at a time; cross-entry consolidation happens in the plan via `to_files_flat`, not in this skill. + +If no files match, emit: + +```yaml +feat_id: <input feat_id> +files: [] +rationale: "No source files matched this feature — check whether the feature is implemented in src_root or whether file_glob/exclude_glob are too restrictive." +``` + +Empty `files` is a valid output; the orchestrator decides whether to surface it as a warning. + +## Output schema + +Output must parse as a YAML document via `yaml.safe_load`. Validator checks: +1. Parsed root is a mapping with required keys `feat_id` (string equal to input `feat_id`), `files` (list of strings), `rationale` (non-empty string). +2. 
Optional top-level keys `shared_with` (list of strings) and `entry_point` (mapping with required `file: str` and `symbol: str`) are permitted; no other top-level keys accepted. +3. Every entry in `files` is a non-empty string. +4. `rationale` is a non-empty string. + +## Process + +### Step 1: Query Papyrus for prior file associations (if workspace provided) + +Query `pharaoh-context-gather` with `feat_title + " " + feat_body` against `papyrus_workspace`. If any prior memories link this feature (or a canonically-equivalent one) to specific files, bias toward those files in Step 3. If not, proceed. + +### Step 2: Enumerate candidate files + +Apply `file_glob` under `src_root`, then filter out everything matching `exclude_glob`. Read the resulting list of candidate files. + +### Step 3: Score each candidate for relevance to the feature + +For each candidate, read the first ~200 lines (or full file if smaller). Reason about relevance: + +- Strong positive signals: file name matches feature keywords (e.g. `csv_export.py` for CSV export); top-level function/class names use feature keywords; docstrings mention the feature's capability. +- Weak positive signals: imports from modules whose names match feature keywords; file is in a subdirectory whose name matches feature keywords. +- Negative signals: file name matches a different feature's keywords; file is clearly a helper/utility imported by many unrelated modules. + +Do NOT use file size as a signal. Do NOT use modification date as a signal. Do NOT follow imports transitively (that explodes scope). + +Assign each candidate an internal relevance score (high / medium / low / none). Emit all `high` and `medium` files. Drop `low` and `none`. + +### Step 4: Write rationale + +One sentence, ≤ 25 words, explaining the emitted file set. Example: `"reqif/reqif2needs.py parses XML, reqif/section.py handles section groups, commands/reqif.py wires the CLI."` + +Do NOT list every file in the rationale (that duplicates `files`). 
Instead describe the ROLE each file plays. + +### Step 4b: Identify entry_point (optional) + +After selecting the emitted files, identify the "entry point" — the file + symbol where user-facing flow begins. Heuristics, in order of preference: + +1. File in a directory named `commands/`, `cli/`, `api/`, `routes/`, `handlers/`, or `entrypoints/`. +2. File whose primary symbol name matches feat title tokens (case-insensitive substring). +3. File with a decorator-style entry marker (`@app.command()`, `@click.command()`, `@router.get()`, `@fastapi.*`). + +If exactly one candidate matches, emit `entry_point`. If multiple match, pick the one closest to the feat title tokens. If zero match, OMIT `entry_point` entirely (downstream skill detects absence and skips flow extraction). + +Do NOT invent an entry_point when the feat is a data model, a shared utility, or a configuration artefact with no orchestrating function. + +### Step 5: Return YAML + +Return the YAML object. No prose before or after. + +## Failure modes + +- `src_root` not readable → FAIL: "src_root unreadable: <path>". +- `feat_id` missing or not a string → FAIL: "feat_id must be a non-empty string". +- Zero candidate files after glob filtering → emit empty `files` with explanatory `rationale`, do NOT fail. +- `pharaoh-context-gather` errors → log and proceed without Papyrus bias. + +## Composition + +A plan emitted by `pharaoh-write-plan` calls `pharaoh-feat-draft-from-docs` once, then uses a `foreach` task to dispatch one `pharaoh-feat-file-map` per emitted feature in parallel. Merging / denormalisation to a flat file list happens in the plan via the `to_files_flat` helper — this skill never reads or writes a merged file. 
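
The denormalisation described above can be sketched as follows. This is a hypothetical illustration, not the actual `to_files_flat` helper: the function name `files_flat_sketch` and the `file` key in each output row are assumptions. Only the idea comes from the text: each file appears once, with `parents` listing every feat whose entry claimed it.

```python
# Hypothetical sketch of the plan-side merge; not the real to_files_flat helper.
def files_flat_sketch(entries):
    """Collapse per-feat flat mappings into one row per file."""
    parents_by_file = {}  # dicts keep insertion order, so files stay in first-seen order
    for entry in entries:
        feat_id = entry["feat_id"]
        for path in entry.get("files", []):
            parents = parents_by_file.setdefault(path, [])
            if feat_id not in parents:
                parents.append(feat_id)
    return [
        {"file": path, "parents": parents}
        for path, parents in parents_by_file.items()
    ]


# Two foreach results that both claim commands/reqif.py.
entries = [
    {
        "feat_id": "FEAT_reqif_export",
        "files": ["reqif/needs2reqif.py", "commands/reqif.py"],
        "rationale": "...",
    },
    {
        "feat_id": "FEAT_reqif_import",
        "files": ["reqif/reqif2needs.py", "commands/reqif.py"],
        "rationale": "...",
    },
]
flat = files_flat_sketch(entries)
```

Because each instance emits a flat mapping, the merge is a single pass over a list and needs no unwrapping of single-key mappings.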
diff --git a/.github/agents/pharaoh.feat-flow-extract.agent.md b/.github/agents/pharaoh.feat-flow-extract.agent.md index 5d9cc2d..50ab345 100644 --- a/.github/agents/pharaoh.feat-flow-extract.agent.md +++ b/.github/agents/pharaoh.feat-flow-extract.agent.md @@ -7,4 +7,141 @@ handoffs: [] Use when reverse-engineering a feat and you need to derive a sequence diagram showing the control flow from its entry point through its source files. Walks the call graph up to a bounded depth and emits a Mermaid or PlantUML sequence diagram whose output shape matches pharaoh-sequence-diagram-draft. Complements pharaoh-feat-component-extract (static view); this is the dynamic view. -See [`skills/pharaoh-feat-flow-extract/SKILL.md`](../../skills/pharaoh-feat-flow-extract/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-feat-flow-extract + +## When to use + +Invoke after `pharaoh-feat-file-map` has produced a `{feat_id, files, entry_point}` mapping (with `entry_point` naming the file+symbol where flow begins — typically a CLI command, HTTP route, test entry, or event handler), when you want to see the call chain that realizes the feat. Output is a sequence diagram matching `pharaoh-sequence-diagram-draft`'s shape so downstream tooling treats it identically. + +Do NOT use for static architecture — that is `pharaoh-feat-component-extract`. Do NOT use when `entry_point` is not known — the skill fails fast rather than inventing one. + +## Tailoring awareness + +Shared tailoring rules: see `shared/diagram-tailoring.md`. Reads `[pharaoh.diagrams]` and `[pharaoh.diagrams.sequence]` from `pharaoh.toml` for renderer choice. Respects `on_missing_config` per shared `check → propose → confirm`. + +Safe-label rules: see `shared/diagram-safe-labels.md`. 
**Critical for this skill:** messages derived from call expressions (`foo(a; b, c)`) often contain `;` which Mermaid 11 treats as a statement terminator, and path fragments like `csv/export.py` are not valid participant IDs. Rules to apply before emit: (a) replace `;` in any message label with `,`; (b) use participant aliases (`participant Export as csv/export.py`), never raw paths as IDs; (c) strip backticks from symbol names. A message label containing `;` (e.g. `J->>J: filter by type; skip SET/Folder`) parses cleanly under `sphinx-build -nW` but renders as `Syntax error in text` in the browser — sanitisation catches this class before emit. + +## Atomicity + +- (a) Indivisible — one feat + one file list + one entry_point in → one sequence diagram out. No multi-scenario bundling. No mutation. No req emission. +- (b) Input: `{feat_id: str, feat_title: str, files: list[str], entry_point: {file: str, symbol: str}, project_root: str, src_root: str, renderer_override?: "mermaid"|"plantuml", max_depth?: int, on_missing_config?: "fail"|"prompt"|"use_default", papyrus_workspace?: str, reporter_id: str}`. Output: one RST directive block matching `pharaoh-sequence-diagram-draft`'s output shape. Default `max_depth=5`. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-feat-flow-extract/`: + - `input_feat.yaml` declares `entry_point: {file: commands/csv.py, symbol: export}`. + - `input_files/` is shared with `pharaoh-feat-component-extract` — the call chain `commands.csv:export → csv.export:run_export → csv.writer:CSVWriter.write_header / write_rows`. + - `expected_diagram.rst` has 3 participants (one per file touched) and the 4 messages representing the call chain. + + Scorer: + 1. Output starts with the renderer's sequence-diagram directive. + 2. Every participant in the call chain appears (one per distinct file). + 3. Messages render in call order with correct arrow syntax. + 4. 
Message count equals call count at `max_depth=5` (should resolve to 4 for this fixture). + 5. Participants are declared in first-seen order (entry point first). + 6. Output shape matches `pharaoh-sequence-diagram-draft`. + + Pass = all 6. +- (d) Reusable for any language whose call-graph the extractor supports. Python initial target (AST or regex). +- (e) Composable: one feat per call. A plan emitted by `pharaoh-write-plan` may include a `foreach` task over feats (with entry_point set) that dispatches this skill alongside `pharaoh-feat-component-extract` in the same `parallel_group`. Never invokes other skills. + +## Input + +- `feat_id`: diagram caption prefix. +- `feat_title`: human-readable, shown in caption. +- `files`: list of source file paths relative to `src_root`. Only calls resolving to files in this list are traced; calls to stdlib / third-party / out-of-scope files are silently dropped (they are not part of the feature). +- `entry_point`: + - `file`: path relative to `src_root` — must be in `files`. + - `symbol`: name of the function or method where flow begins. +- `project_root`, `src_root`: as in Task 19's skill. +- `renderer_override` (optional): per shared doc. +- `max_depth` (optional): maximum recursion depth when walking the call chain. Default `5`. +- `on_missing_config`, `papyrus_workspace`, `reporter_id`: standard. +- `scenarios` (optional): list of scenario names, default `["default"]`. Each scenario produces one diagram block. Scenario names drive annotations in the output (e.g. `:caption: FEAT_x — flow, scenario: error_handling`). Project tailoring declares the canonical scenario set via `.pharaoh/project/diagram-conventions.yaml > dynamic_view_scenarios`. See [`shared/diagram-view-selection.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-view-selection.md). + +## Output + +Output is a JSON document with shape: + +```json +{ + "diagrams": [ + { + "scenario": "default", + "diagram_block": ".. 
mermaid::\n :caption: FEAT_x — flow\n\n sequenceDiagram\n ...", + "element_count": 7, + "renderer": "mermaid" + } + ] +} +``` + +One entry per scenario. Callers invoke `pharaoh-diagram-review` per entry (plan template foreach-expands over `diagrams[]`). + +## Process + +### Step 1: Locate entry point + +Read `<src_root>/<entry_point.file>`. Locate the definition of `<entry_point.symbol>` via regex: + +- Python: `^(\s*)(def|async def|class) <symbol>\b`. +- Other languages per shared doc. + +If not found → FAIL: `"entry_point.symbol <symbol> not found in <entry_point.file>"`. + +Capture the body of the symbol (lines until the next line with indentation ≤ the symbol's definition line). + +### Step 2: Walk call chain up to max_depth + +Starting from the entry symbol's body, identify direct function/method calls. Regex (Python): + +- Bare calls: `(?<!\.)(?P<name>\w+)\(` (a bare identifier followed by `(`, not preceded by `.`). +- Method calls: `\.(?P<name>\w+)\(`. +- Constructor calls: `(?P<name>[A-Z]\w+)\(` (uppercased → probably a class instantiation). + +For each call, resolve the target: + +- Check if the name matches a top-level symbol defined in any of the `files` (use the primary-symbol detection from Task 19's skill). If so, record `(from_file, to_file, call_label)` where `call_label` is `<symbol>()` or `<method_name>()`. +- If the call resolves to a stdlib / third-party / imported-external symbol, drop it silently. +- If the call resolves to a local helper within the same file, drop it (same-file calls clutter the diagram; the participant-per-file abstraction collapses them). + +Recurse into resolved cross-file calls up to `max_depth` (default 5). Collect all resolved cross-file calls in call order (the order they appear in the body, top-to-bottom). + +### Step 3: Emit sequence diagram + +Resolve renderer per shared doc. Declare one participant per distinct file encountered in the call chain, in first-seen order. Emit messages in the order collected. 
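
A minimal sketch of this emission step, assuming the Mermaid renderer and synchronous calls only. The function name `emit_sequence_sketch` and the alias scheme (`P0`, `P1`, and so on) are assumptions for illustration. The sketch applies the safe-label rules from the Tailoring awareness section: participants are aliased rather than named by raw path, `;` in a message label becomes `,`, and backticks are stripped.

```python
# Hypothetical sketch; renders only the sequenceDiagram body, not the RST directive.
def emit_sequence_sketch(calls):
    """Render (from_file, to_file, call_label) tuples as Mermaid sequence text."""
    aliases = {}
    for from_file, to_file, _label in calls:
        for path in (from_file, to_file):
            # First-seen order: a path gets the next alias only once.
            aliases.setdefault(path, f"P{len(aliases)}")

    lines = ["sequenceDiagram"]
    for path, alias in aliases.items():
        lines.append(f"    participant {alias} as {path}")
    for from_file, to_file, label in calls:
        safe_label = label.replace(";", ",").replace("`", "")
        lines.append(f"    {aliases[from_file]}->>{aliases[to_file]}: {safe_label}")
    return "\n".join(lines)


# Example call list in call order; the entry point's file comes first.
calls = [
    ("commands/csv.py", "csv/export.py", "run_export()"),
    ("csv/export.py", "csv/writer.py", "write_header()"),
    ("csv/export.py", "csv/writer.py", "write_rows(rows; batch)"),
]
diagram = emit_sequence_sketch(calls)
```

Keeping the aliases in a dictionary filled in call order gives the first-seen participant ordering without a separate sort.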
+ +Arrow syntax: + +- Synchronous call: Mermaid `->>`, PlantUML `->`. +- If the call target is `async def`, use async arrow: Mermaid `-)`, PlantUML `->>`. + +No return arrows are emitted by default — they clutter at this granularity. Callers who want them can use `pharaoh-sequence-diagram-draft` with explicit messages. + +Caption: `<feat_id> — flow from entry point`. + +### Step 4: Return + +Single RST block. No prose. + +## Failure modes + +- `entry_point.file` not in `files` → FAIL: `"entry_point.file <file> is not in the files list"`. +- `entry_point.symbol` not found in `entry_point.file` → FAIL per Step 1. +- Zero calls detected from the entry symbol's body → emit a minimal diagram with one participant (the entry point's file) and a self-note `Note over <participant>: entry point has no cross-file calls` instead of failing. +- Max depth exceeded → truncate at depth, log a note. + +## Non-goals + +- No return-arrow inference. Use `pharaoh-sequence-diagram-draft` if needed. +- No activation-bar insertion (PlantUML activates/deactivates). +- No concurrent / async branch handling beyond marking the arrow shape. Complex async flow is hand-authored via `pharaoh-sequence-diagram-draft`. +- No multi-entry-point diagrams. One entry → one diagram. If a feat has multiple entry points (e.g. a CLI with two subcommands), the orchestrator dispatches the skill twice. +- No code-to-sequence inference below function granularity (no per-statement trace). The unit of traceability is a function/method call crossing file boundaries. + +## Last step + +After emitting the artefact, invoke `pharaoh-diagram-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. 
If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-diagram-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/.github/agents/pharaoh.feat-review.agent.md b/.github/agents/pharaoh.feat-review.agent.md index 73e9d33..d1cb1d3 100644 --- a/.github/agents/pharaoh.feat-review.agent.md +++ b/.github/agents/pharaoh.feat-review.agent.md @@ -7,4 +7,57 @@ handoffs: [] Use when auditing a single feature-level need (feat) against the generic feat review axes in `shared/checklists/feat.md` plus any per-project addenda in `.pharaoh/project/checklists/feat.md`. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes. Mirrors `pharaoh-req-review`'s shape for feat-level artefacts. -See [`skills/pharaoh-feat-review/SKILL.md`](../../skills/pharaoh-feat-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-feat-review + +## When to use + +Invoke after `pharaoh-feat-draft-from-docs` emitted a feat, or on an existing feat need-id in needs.json. Part of the self-review invariant — see [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md). + +Do NOT review comp_reqs or architecture — use `pharaoh-req-review` or `pharaoh-arch-review`. Do NOT re-author — invoke `pharaoh-feat-draft-from-docs` again with the review findings as input if regeneration is needed. 
+ +## Atomicity + +- (a) One feat + one checklist in → one findings JSON out. +- (b) Input: `{target: <feat_directive_rst_or_need_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON with per-axis entries. +- (c) Reward: fixtures mirror `pharaoh-req-review` — one `passing.rst` and one `failing.rst` feat with expected findings JSON. +- (d) Reusable by any flow emitting feats. +- (e) Read-only. + +## Input + +- `target`: RST directive block for a feat, OR a `need_id` with `type: feat` present in needs.json. +- `checklist_path`: absolute path to `shared/checklists/feat.md`. Per-project extensions in `.pharaoh/project/checklists/feat.md` are appended if present. +- `tailoring_path`: absolute path to `.pharaoh/project/` root. Reads `artefact-catalog.yaml` for required/optional fields per the feat artefact type. + +## Output + +```json +{ + "need_id": "FEAT_example", + "type": "feat", + "axes": { + "trace_to_parent_or_workflow": {"passed": true, "reason": "links to wf__onboarding via :satisfies:"}, + "single_user_capability": {"score": 3, "reason": "scope is one feature"}, + "source_doc_present_and_valid": {"passed": true, "reason": "source_doc=docs/source/features/x.rst exists"}, + "required_fields_complete": {"passed": true, "reason": "id, status, source_doc present"}, + "shall_clause_user_observable": {"score": 2, "reason": "minor: names internal module"}, + "body_length_within_bounds": {"passed": true, "reason": "body=8 lines, limit=15"}, + "no_comp_level_mechanism_leak": {"score": 3, "reason": "no class / method names in body"}, + "naming_clarity": {"score": 3, "reason": "FEAT_reqif_export — clear"} + }, + "overall": "pass", + "actions": [] +} +``` + +## Review axes + +See [`shared/checklists/feat.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/feat.md) for the canonical axis list and rubric. Per-project extensions (e.g. 
Score's ASIL-level guidance, connector-family consistency (project-specific example)) are appended from `.pharaoh/project/checklists/feat.md` if present, with their own axis keys namespaced under `tailoring.*`. + +## Composition + +Invoked by `pharaoh-write-plan`-generated plans after every `pharaoh-feat-draft-from-docs` task. Also invoked ad-hoc per the self-review invariant. Coverage enforced by `pharaoh-self-review-coverage-check`. diff --git a/.github/agents/pharaoh.finding-record.agent.md b/.github/agents/pharaoh.finding-record.agent.md index 35e2162..d637991 100644 --- a/.github/agents/pharaoh.finding-record.agent.md +++ b/.github/agents/pharaoh.finding-record.agent.md @@ -7,4 +7,85 @@ handoffs: [] Use when recording an audit finding in the shared Papyrus workspace with automatic dedup. Uses deterministic ID to ensure the same {category, subject_id} tuple never appears twice across concurrent subagents. Returns {action: wrote|duplicate, papyrus_id}. -See [`skills/pharaoh-finding-record/SKILL.md`](../../skills/pharaoh-finding-record/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-finding-record + +## When to use + +Invoke from any audit subagent (`pharaoh-coverage-gap`, `pharaoh-lifecycle-check`, `pharaoh-standard-conformance`, `pharaoh-review-completeness`, `pharaoh-process-audit`) whenever the subagent has identified an issue that should be reported. Do NOT invoke for informational or non-actionable observations. + +Do NOT invoke for new audit categories not in the known list — category must be one of: `orphan_arch`, `unverified_req`, `invalid_lifecycle_transition`, `duplicate_req`, `contradictory_req_pair`, `missing_fmea`, `stale_review`, `broken_back_link`, `schema_violation`, `wrong_prefix_id`, `missing_reviewer`, `missing_approval`. + +## Atomicity + +- (a) Indivisible — single write-or-dedup action. Does not audit, classify, or author. 
+- (b) Input: `{category: str, subject_id: str, finding_text: str, reporter_id: str}`. Output: `{action: "wrote"|"duplicate", papyrus_id: str, dup_of?: str}`. +- (c) Reward: deterministic — 2 reporters for same `(category, subject_id)` must produce exactly 1 `"wrote"` + 1 `"duplicate"` response; measured via fixture. +- (d) Reusable: any audit subagent; standalone CI gate; bug-tracking dedup. +- (e) Composable: Papyrus write-only; never modifies artefact files or invokes other skills. + +## Input + +- `category`: one of the known categories listed above. +- `subject_id`: the project need ID affected by the finding (e.g. `arch__orphan_0`). +- `finding_text`: 1-3 sentence description. Used as the Papyrus need body. +- `reporter_id`: the calling subagent's area tag (e.g. `coverage-gap`, `lifecycle-check`). Stored in the Papyrus need `source` field for traceability. + +## Output + +A single-line JSON object, no prose: + +`{"action": "wrote", "papyrus_id": "FACT_orphan_arch_arch_orphan_0"}` + +or: + +`{"action": "duplicate", "papyrus_id": "FACT_orphan_arch_arch_orphan_0", "dup_of": "FACT_orphan_arch_arch_orphan_0"}` + +## Process + +### Step 1: Construct deterministic ID + +``` +papyrus_id = "FACT_" + sanitize(category) + "_" + sanitize(subject_id) +``` + +`sanitize` replaces non-alphanumeric characters with underscores, collapses consecutive underscores, strips leading/trailing underscores. Example: `(orphan_arch, arch__orphan_0)` → `FACT_orphan_arch_arch_orphan_0`. + +### Step 2: Attempt `papyrus add` + +```bash +papyrus --workspace .papyrus add fact \ + "<short_title_from_finding_text>" \ + --id <papyrus_id> \ + --body <finding_text> \ + --tags "category:<category>,subject:<subject_id>" \ + --source "<reporter_id>" \ + --scope local +``` + +### Step 3: Interpret result + +- Exit 0 → emit `{"action": "wrote", "papyrus_id": "<id>"}`. +- Exit non-zero with stderr containing `"already exists"` → emit `{"action": "duplicate", "papyrus_id": "<id>", "dup_of": "<id>"}`. 
+- Any other non-zero exit → emit `{"action": "error", "papyrus_id": "<id>", "message": "<stderr-first-line>"}` and return; the caller should not retry. + +No surrounding prose. Emit exactly one JSON object per invocation. + +## Dedup semantics + +- Match key is `(category, subject_id)`. `finding_text` differences do NOT suppress dedup — the first writer wins and sets canonical phrasing. +- `reporter_id` difference does NOT suppress dedup — two subagents finding the same issue from different angles still collapse to one record. +- Concurrent writes for the same `papyrus_id` are serialized by the Papyrus `FileLock`; only one succeeds, the others get `"duplicate"`. + +## Failure modes + +- `papyrus` binary missing → emit `{"action": "error", "message": "papyrus CLI not found"}` and return. +- `.papyrus/` workspace missing → emit `{"action": "error", "message": "no .papyrus/ workspace in cwd"}`. +- Any other subprocess failure → emit `{"action": "error", "message": "<stderr-first-line>"}`. + +## Composition + +Each audit subagent invokes this skill once per finding. The orchestrator or harness then reads the final Papyrus workspace via `papyrus recall` for the aggregated report. diff --git a/.github/agents/pharaoh.flow.agent.md b/.github/agents/pharaoh.flow.agent.md index 028d3e8..00bd4fc 100644 --- a/.github/agents/pharaoh.flow.agent.md +++ b/.github/agents/pharaoh.flow.agent.md @@ -9,4 +9,516 @@ Use when orchestrating the full V-model chain for one feature context across the Dispatches to pharaoh-req-draft, pharaoh-req-review, pharaoh-arch-draft, pharaoh-arch-review, pharaoh-vplan-draft, pharaoh-vplan-review, and pharaoh-fmea. Safety-V types route through pharaoh-req-draft with the appropriate target_level (hazard, safety_goal, fsr) — no new safety-V drafting skills are introduced. -See [`skills/pharaoh-flow/SKILL.md`](../../skills/pharaoh-flow/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-flow + +## When to use + +Invoke when the user wants to produce a complete V-model artefact chain from a single feature +context in one operation. This skill orchestrates the atomic drafting and review skills; it +does not author content itself. + +**Scope is bounded to one feature context.** Each layer that runs emits exactly one artefact +of each type the layer covers, with a review pass after every drafted artefact. + +The orchestrator walks up to four optional layers, top-down through the V: + +| Layer | Stages drafted | Catalog types it expects | +|---|---|---| +| `safety_v` | hazard → safety_goal → fsr | `hazard`, `safety_goal`, `fsr` | +| `sys` | sysreq → sys-arch | `sysreq`, `sys-arch` | +| `sw` | swreq → swarch | `swreq`, `swarch` | +| `component` | req (or `comp_req` / `gd_req`) → arch → vplan → fmea | one requirement-shaped key (e.g. `req`, `comp_req`, `gd_req`), `arch`, `tc`, `fmea` | + +A layer runs only if its types are declared in `.pharaoh/project/artefact-catalog.yaml`. A +project that declares only the classical types runs only the `component` layer (the prior +behaviour of this skill is preserved exactly). + +Do NOT invoke when the user wants to draft only one artefact type — use the individual atomic +skills directly. Do NOT invoke when the feature context implies multiple requirements at the +same level (the orchestrator drafts the single most direct requirement per layer and advises +re-invocation). + +> This is a compositional orchestrator. The atomicity criterion (a) does not apply: by design +> it invokes multiple skills. Scope is bounded to "one feature → one V-model chain across the +> declared layers". + +--- + +## Inputs + +- **feature_context** (from user): short prose describing the feature / hazard / safety goal + (1–5 sentences). Forwarded to the top-most layer that runs. +- **parent_link** (from user): need-id of the parent the top-most layer's first artefact will + trace to (e.g.
a workflow id for safety-V, an upstream sysreq id when entering at SYS, etc.). + When a higher layer runs first, the lower layers chain off the IDs produced upstream — the + caller does not supply an extra parent for each layer. +- **safety_context** (from user, optional): ASIL level (A–D) or safety-goal handle if known — + forwarded to `pharaoh-req-draft` (for safety-V types) and to `pharaoh-fmea`. +- **stages** (from user, optional): explicit list of layers to run. Allowed values: + `safety_v`, `sys`, `sw`, `component`. Order in the input is ignored — the orchestrator always + walks the layers top-down (`safety_v` → `sys` → `sw` → `component`). When omitted, + auto-detect from the catalog. +- **tailoring** (from `.pharaoh/project/`): every sub-skill consumes `artefact-catalog.yaml`, + `id-conventions.yaml`, `workflows.yaml`, and per-type checklists. +- **needs.json**: required for parent resolution and uniqueness checks in each sub-skill. + +### Auto-detect vs. explicit `stages` + +Default behaviour is **auto-detect**: the orchestrator runs every layer whose types are +declared in the catalog. This is the correct mode for a project that has finished tailoring +and wants the full V emitted in one call. + +The caller passes `stages` explicitly when: + +- A project is **bootstrapping** the safety V and wants only the upper-V emitted while the + classical layer is still being shaped. Without an explicit `stages: ["safety_v"]`, + auto-detect would also try to run the lower layers. +- An audit run wants to **regenerate one layer in isolation** without retouching others + already emitted (e.g. `stages: ["sys"]` to refresh the SYS layer after a tailoring edit). +- The catalog declares a layer's types but the caller knows that layer's parent IDs aren't + yet stable, so the layer should be deferred. + +When `stages` is supplied, every requested layer must have its types declared in the catalog; +a missing declaration is a hard FAIL (see Guardrail G2). 
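
The selection behaviour described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the orchestrator's actual implementation: `select_layers`, `missing_types`, and the constant names are hypothetical, the catalog is reduced to a plain collection of declared type names, and `fmea` is treated as optional for the `component` layer.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Layers are always walked top-down through the V, whatever order the caller used.
LAYER_ORDER = ["safety_v", "sys", "sw", "component"]

# Types each layer needs (hypothetical constant; `fmea` is assumed optional here).
REQUIRED_TYPES = {
    "safety_v": ["hazard", "safety_goal", "fsr"],
    "sys": ["sysreq", "sys-arch"],
    "sw": ["swreq", "swarch"],
    "component": ["arch", "tc"],
}
# The component layer additionally needs one requirement-shaped key.
COMPONENT_REQ_KEYS = ["req", "comp_req", "gd_req"]


def missing_types(layer: str, declared: set) -> List[str]:
    """List the catalog types a layer needs but the catalog does not declare."""
    missing = [t for t in REQUIRED_TYPES[layer] if t not in declared]
    if layer == "component" and not any(k in declared for k in COMPONENT_REQ_KEYS):
        missing.insert(0, "one of req/comp_req/gd_req")
    return missing


def select_layers(
    declared_types: Iterable[str], stages: Optional[Iterable[str]] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Return (layers to run, skip reasons); raise ValueError on a hard FAIL."""
    declared = set(declared_types)
    requested = None if stages is None else set(stages)
    if requested is not None:
        unknown = sorted(requested - set(LAYER_ORDER))
        if unknown:
            raise ValueError("FAIL: unknown stages: " + ", ".join(unknown))

    run: List[str] = []
    skip_reasons: Dict[str, str] = {}
    for layer in LAYER_ORDER:
        missing = missing_types(layer, declared)
        if requested is not None and layer not in requested:
            skip_reasons[layer] = "not requested by caller"
        elif missing and requested is not None:
            # An explicit request never falls back silently.
            raise ValueError(
                f'FAIL: stages argument requested "{layer}" but the catalog does '
                "not declare the required types: " + ", ".join(missing)
            )
        elif missing:
            skip_reasons[layer] = (
                "auto-detect: catalog does not declare " + ", ".join(missing)
            )
        else:
            run.append(layer)

    if not run:
        raise ValueError("FAIL: no layers selected.")
    return run, skip_reasons
```

For a catalog that declares only `gd_req`, `arch`, and `tc`, `select_layers({"gd_req", "arch", "tc"})` selects just the `component` layer and records an auto-detect skip reason for the other three. The skip-reason wording is simplified and does not distinguish a partially declared layer from a fully absent one.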
+ +--- + +## Outputs + +For every layer that runs, emit each drafted artefact and its review (where the layer +includes one) in a labeled fenced block, then a single flow summary at the end. Block +ordering follows the V (top-down then layer-internal): + +``` +=== [SAFETY_V 1/3] hazard: <id> === (only if layer ran) +=== [REVIEW SAFETY_V 1/3] req-review: <id> === +=== [SAFETY_V 2/3] safety_goal: <id> === +=== [REVIEW SAFETY_V 2/3] req-review: <id> === +=== [SAFETY_V 3/3] fsr: <id> === +=== [REVIEW SAFETY_V 3/3] req-review: <id> === + +=== [SYS 1/2] sysreq: <id> === (only if layer ran) +=== [REVIEW SYS 1/2] req-review: <id> === +=== [SYS 2/2] sys-arch: <id> === +=== [REVIEW SYS 2/2] arch-review: <id> === + +=== [SW 1/2] swreq: <id> === (only if layer ran) +=== [REVIEW SW 1/2] req-review: <id> === +=== [SW 2/2] swarch: <id> === +=== [REVIEW SW 2/2] arch-review: <id> === + +=== [COMPONENT 1/4] <req-key>: <id> === (only if layer ran) +=== [REVIEW COMPONENT 1/4] req-review: <id> === +=== [COMPONENT 2/4] arch: <id> === +=== [REVIEW COMPONENT 2/4] arch-review: <id> === +=== [COMPONENT 3/4] tc: <id> === +=== [REVIEW COMPONENT 3/4] vplan-review: <id> === +=== [COMPONENT 4/4] fmea: <id> === + +=== [FLOW SUMMARY] === +<summary JSON> +``` + +Each `[REVIEW …]` block is omitted only when its corresponding draft step was skipped or +failed. 
+ +**Flow summary shape:** + +```json +{ + "feature_context_summary": "one sentence", + "stages_run": ["safety_v", "sys", "sw", "component"], + "stages_skipped": [], + "skip_reasons": { + "<stage>": "auto-detect: catalog does not declare <missing types>" + }, + "artefacts": { + "safety_v": { + "hazard": {"id": "hazard__...", "overall": "pass|needs_work|fail"}, + "safety_goal": {"id": "safety_goal__...", "overall": "pass|needs_work|fail"}, + "fsr": {"id": "fsr__...", "overall": "pass|needs_work|fail"} + }, + "sys": { + "sysreq": {"id": "sysreq__...", "overall": "pass|needs_work|fail"}, + "sys-arch": {"id": "sys_arch__...", "overall": "pass|needs_work|fail"} + }, + "sw": { + "swreq": {"id": "swreq__...", "overall": "pass|needs_work|fail"}, + "swarch": {"id": "swarch__...", "overall": "pass|needs_work|fail"} + }, + "component": { + "req": {"id": "<req-key>__...", "overall": "pass|needs_work|fail"}, + "arch": {"id": "arch__...", "overall": "pass|needs_work|fail"}, + "tc": {"id": "tc__...", "overall": "pass|needs_work|fail"}, + "fmea": {"id": "fmea__...", "rpn": 160} + } + }, + "stop_reason": null +} +``` + +When a layer is skipped, its key in `artefacts` is omitted and the layer name appears in +`stages_skipped` with the reason in `skip_reasons`. When the chain stops early (Guardrail +G3), `stop_reason` carries the diagnostic from the failing skill. + +--- + +## Process + +### Step 0: Validate inputs + +Confirm `feature_context` and `parent_link` are provided. If either is missing, FAIL before +invoking any sub-skill: + +``` +FAIL: pharaoh-flow requires feature_context and parent_link. +Provide both before invoking the orchestrator. +``` + +If the caller supplied `stages`, validate every entry against the allowed set +(`safety_v`, `sys`, `sw`, `component`). Unknown values FAIL. + +--- + +### Step 1: Resolve which layers will run + +Read `.pharaoh/project/artefact-catalog.yaml`. 
For each layer, mark it `present-in-catalog` +when **every** required type for that layer is declared: + +| Layer | Required catalog keys | +|---|---| +| `safety_v` | `hazard`, `safety_goal`, `fsr` | +| `sys` | `sysreq`, `sys-arch` | +| `sw` | `swreq`, `swarch` | +| `component` | one of (`req`, `comp_req`, `gd_req`), plus `arch`, `tc`. (`fmea` is best-effort and may be absent — see Step 5g of the component layer.) | + +Selection rules: + +- **No `stages` argument (auto-detect)** — every layer where `present-in-catalog` is true + runs; layers whose types are not declared are silently skipped, with `skip_reasons["<layer>"] + = "auto-detect: catalog does not declare <missing types>"`. +- **Explicit `stages` argument** — only requested layers run; layers not requested record + `skip_reasons["<layer>"] = "not requested by caller"`. For every requested layer that is + NOT `present-in-catalog`, FAIL hard: + + ``` + FAIL: stages argument requested "<layer>" but artefact-catalog.yaml does not + declare the required types: <missing types>. + Either declare the types in the catalog (run pharaoh-tailor-fill), or remove + "<layer>" from the stages argument. + ``` + +If neither auto-detect nor an explicit `stages` argument selects any layer, FAIL: + +``` +FAIL: no layers selected. Catalog declares none of {safety_v, sys, sw, component} +artefact types and the caller did not pass a stages argument. +``` + +For the `component` layer, also resolve which requirement key to use (the first of +`req` / `comp_req` / `gd_req` declared in the catalog wins). Record the chosen key as +`<req-key>` and use it consistently throughout the layer. + +--- + +### Step 2: Run the safety_v layer (if selected) + +For each stage in order — `hazard`, `safety_goal`, `fsr` — invoke `pharaoh-req-draft` with +`target_level=<stage>`, then `pharaoh-req-review` on the drafted RST. 
+ +Inputs forwarded to `pharaoh-req-draft`: + +| Stage | feature_context | parent_link | +|---|---|---| +| `hazard` | the user's `feature_context` | the user's `parent_link` | +| `safety_goal` | "Safety goal addressing hazard `<hazard-id>`: …" — derived from the user's `feature_context` and the hazard ID emitted in the prior step | the `hazard` ID | +| `fsr` | "Functional safety requirement deriving safety goal `<safety_goal-id>`: …" | the `safety_goal` ID | + +Forward `safety_context` to every step. The drafter uses the catalog's `required_links` +(e.g. `derives_from`, `safety_goal_for`) to attach the correct link relation; the +orchestrator does not hardcode link names. + +Capture the IDs emitted by the layer; the `fsr` ID becomes the parent for the SYS layer's +`sysreq` (if SYS runs). If the SYS layer is skipped, the `fsr` ID becomes the parent for the +SW layer's `swreq` (if SW runs). If SYS and SW are both skipped, the `fsr` ID becomes the +parent for the `component` layer's requirement. + +Review-policy: a `pharaoh-req-review` returning `overall: needs_work` or `overall: fail` does +NOT stop the chain (Guardrail G4). A hard FAIL from `pharaoh-req-draft` does (Guardrail G3). + +--- + +### Step 3: Run the sys layer (if selected) + +Step 3a — `pharaoh-req-draft` with `target_level=sysreq`. Parent is whichever upstream ID +was last produced (the `fsr` ID when safety-V ran, otherwise the user's `parent_link`). + +Step 3b — `pharaoh-req-review` on the `sysreq`. + +Step 3c — `pharaoh-arch-draft` with `target_level=sys-arch`. Parent is the `sysreq` ID. + +Step 3d — `pharaoh-arch-review` on the `sys-arch`. + +Capture both IDs. The `sys-arch` ID is the upstream parent for the SW layer (if SW runs); when +SW is skipped, the `sys-arch` ID becomes the parent for the `component` layer's requirement. + +--- + +### Step 4: Run the sw layer (if selected) + +Step 4a — `pharaoh-req-draft` with `target_level=swreq`. 
Parent is whichever upstream ID was +last produced (`sys-arch` when SYS ran, `fsr` when safety-V ran without SYS, or the user's +`parent_link` otherwise). + +Step 4b — `pharaoh-req-review` on the `swreq`. + +Step 4c — `pharaoh-arch-draft` with `target_level=swarch`. Parent is the `swreq` ID. + +Step 4d — `pharaoh-arch-review` on the `swarch`. + +Capture both IDs. The `swarch` ID is the upstream parent for the `component` layer's +requirement (if the `component` layer runs). + +--- + +### Step 5: Run the component layer (if selected) + +This is the classical chain preserved from the prior behaviour, with an explicit review pass +after every drafted artefact (req, arch, tc) and an FMEA at the end. + +Step 5a — `pharaoh-req-draft` with `target_level=<req-key>` (the key resolved in Step 1). +Parent is the closest upstream ID — `swarch` when SW ran, else `sys-arch` when SYS ran, else +`fsr` when safety-V ran, else the user's `parent_link`. + +Step 5b — `pharaoh-req-review` on the requirement. + +Step 5c — `pharaoh-arch-draft` with `target_level=arch` and the requirement's ID as parent. + +Step 5d — `pharaoh-arch-review` on the architecture element. + +Step 5e — `pharaoh-vplan-draft` with `target_level=tc` and the requirement's ID as parent. + +Step 5f — `pharaoh-vplan-review` on the test case. + +Step 5g — `pharaoh-fmea` with `parent_id=<requirement-id>` and the user's `safety_context`. + +For Steps 5e (vplan-draft) and 5g (fmea), a hard FAIL emits a `=== [WARNING …] ===` block but +does NOT stop the chain — the artefact is recorded as `null` in the summary. Steps 5a and +5c (the load-bearing draft steps) DO stop the chain on hard FAIL, per Guardrail G3. + +--- + +### Step 6: Emit all outputs and the flow summary + +Emit each artefact and review block in the order shown in the Outputs section. Emit the +flow summary last. List every layer that ran in `stages_run` and every layer that was +skipped in `stages_skipped`, with the reason recorded under `skip_reasons`. 
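
The parent chaining described in Steps 2 to 5 follows one rule: the first artefact of a layer traces to the hand-off ID of the closest upstream layer that ran, and falls back to the caller's `parent_link` when none did. The sketch below illustrates that rule only. `resolve_parent`, `HANDOFF_TYPE`, and the shape of `emitted_ids` are hypothetical names chosen for illustration, not part of the skill.

```python
from typing import Dict

LAYER_ORDER = ["safety_v", "sys", "sw", "component"]

# The artefact type whose ID each layer hands down to the next layer that runs.
HANDOFF_TYPE = {"safety_v": "fsr", "sys": "sys-arch", "sw": "swarch"}


def resolve_parent(
    layer: str, emitted_ids: Dict[str, Dict[str, str]], parent_link: str
) -> str:
    """Return the parent ID for the first artefact drafted in `layer`.

    `emitted_ids` maps each layer that already ran to its {type: id} results.
    """
    upstream = LAYER_ORDER[: LAYER_ORDER.index(layer)]
    # Search from the closest upstream layer back towards the top of the V.
    for prior in reversed(upstream):
        handoff = emitted_ids.get(prior, {}).get(HANDOFF_TYPE[prior])
        if handoff:
            return handoff
    # No upstream layer ran: fall back to the parent supplied by the caller.
    return parent_link
```

With only the safety-V layer emitted, `resolve_parent("sw", ...)` returns the `fsr` ID, which matches the Step 4a rule for a run where SYS was skipped.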
+ +--- + +## Guardrails + +**G1 — Missing required inputs** + +`feature_context` or `parent_link` absent → FAIL before any sub-skill runs (Step 0). + +**G2 — Explicit stages argument referencing un-declared types** + +When the caller passes `stages` and a requested layer's types are not declared in +`artefact-catalog.yaml`, FAIL hard with the missing types listed (Step 1). Auto-detect +silently skips a missing layer; an explicit request never falls back silently. + +**G3 — Hard failure in a load-bearing draft step** + +Within each layer, the draft skills are load-bearing. A hard FAIL from any +`pharaoh-req-draft` or `pharaoh-arch-draft` invocation stops the chain at that point and +records `stop_reason` with the failing skill name and its diagnostic. The vplan and fmea +steps in the `component` layer are best-effort (warnings but not chain stops). + +**G4 — Review findings don't block the chain** + +A review returning `overall: needs_work` or `overall: fail` is informational. The chain +continues. Action items are preserved in the review block for the user to address. The +orchestrator never auto-regenerates; that requires `pharaoh-req-regenerate`. + +**G5 — Tailoring unavailable** + +If `.pharaoh/project/` tailoring files are missing, the sub-skills will fail. Fail fast +with: + +``` +FAIL: pharaoh-flow cannot run without tailoring files at .pharaoh/project/. +Run pharaoh-tailor-detect → pharaoh-tailor-fill first. +``` + +**G6 — Catalog declares safety-V partial set** + +A project that declares only some of `hazard`, `safety_goal`, `fsr` is mis-tailored. In +auto-detect mode the safety_v layer skips with reason +`auto-detect: catalog declares only <subset>; safety_v layer requires the full set`. +In explicit-stages mode this is a hard FAIL via Guardrail G2. + +--- + +## Advisory chain + +This skill has `chains_to: []` — it is a terminal orchestrator. 
After the flow summary, +advise only when reviews returned action items: + +``` +Review action items in the [REVIEW …] blocks above. +Use `pharaoh-req-regenerate` or `pharaoh-arch-draft` (with corrections) to address them. +``` + +--- + +## Worked example — safety-V on a project that declares the full V + +**User input:** + +> feature_context: "Unintended ABS pump activation while the brake pedal is released can +> destabilise the vehicle on slippery surfaces. The brake controller must prevent activation +> outside the slip-detection window." +> parent_link: `wf__brake_system_design` +> safety_context: ASIL B +> stages: (omitted — auto-detect) + +The project's `artefact-catalog.yaml` declares `hazard`, `safety_goal`, `fsr`, `sysreq`, +`sys-arch`, `swreq`, `swarch`, `comp_req`, `arch`, `tc`. All four layers will run. + +**Layer 1 — safety_v:** + +- `pharaoh-req-draft` (target_level=hazard) emits `hazard__unintended_abs_pump_activation`, + parent `wf__brake_system_design`, body describes the hazardous event. +- `pharaoh-req-draft` (target_level=safety_goal) emits + `safety_goal__no_unintended_abs_activation` linked via `:derives_from:` to the hazard. +- `pharaoh-req-draft` (target_level=fsr) emits `fsr__abs_activation_window_check` linked via + `:safety_goal_for:` to the safety goal. +- Each is reviewed by `pharaoh-req-review`; all three pass. + +**Layer 2 — sys:** + +- `sysreq__abs_activation_window_check` derives from `fsr__abs_activation_window_check`. +- `sys_arch__brake_controller_supervision` satisfies the sysreq. +- Both reviewed. + +**Layer 3 — sw:** + +- `swreq__pedal_state_gate` derives from `sys_arch__brake_controller_supervision`. +- `swarch__abs_supervision_module` satisfies the swreq. +- Both reviewed. + +**Layer 4 — component:** + +- `comp_req__abs_pump_activation` (the `<req-key>` resolved to `comp_req`) derives from + `swarch__abs_supervision_module`. +- `arch__abs_pump_driver` satisfies the comp_req. 
+- `tc__abs_pump_activation_001` verifies the comp_req. +- `fmea__abs_pump_activation__no_activation` derived from the comp_req; RPN = 160. + +**Flow summary (condensed):** + +``` +=== [FLOW SUMMARY] === +{ + "feature_context_summary": "Prevent unintended ABS pump activation outside slip window (ASIL B)", + "stages_run": ["safety_v", "sys", "sw", "component"], + "stages_skipped": [], + "skip_reasons": {}, + "artefacts": { + "safety_v": { + "hazard": {"id": "hazard__unintended_abs_pump_activation", "overall": "pass"}, + "safety_goal": {"id": "safety_goal__no_unintended_abs_activation","overall": "pass"}, + "fsr": {"id": "fsr__abs_activation_window_check", "overall": "pass"} + }, + "sys": { + "sysreq": {"id": "sysreq__abs_activation_window_check", "overall": "pass"}, + "sys-arch": {"id": "sys_arch__brake_controller_supervision", "overall": "pass"} + }, + "sw": { + "swreq": {"id": "swreq__pedal_state_gate", "overall": "pass"}, + "swarch": {"id": "swarch__abs_supervision_module", "overall": "pass"} + }, + "component": { + "req": {"id": "comp_req__abs_pump_activation", "overall": "pass"}, + "arch": {"id": "arch__abs_pump_driver", "overall": "pass"}, + "tc": {"id": "tc__abs_pump_activation_001", "overall": "pass"}, + "fmea": {"id": "fmea__abs_pump_activation__no_activation", "rpn": 160} + } + }, + "stop_reason": null +} +``` + +--- + +## Worked example — classical V on a project without safety-V or SYS/SWE split + +**User input:** + +> feature_context: "The brake controller shall engage the ABS pump when wheel slip exceeds a +> calibrated threshold. Target level: component." +> parent_link: `wf__brake_system_design` +> safety_context: ASIL B +> stages: (omitted — auto-detect) + +The project's catalog declares only `gd_req`, `arch`, `tc`. Auto-detect skips `safety_v`, +`sys`, and `sw`; only the `component` layer runs. + +**Layer 4 — component (only):** + +- `gd_req__abs_pump_activation` parent `wf__brake_system_design`. 
+- `arch__brake_controller_abs_module` satisfies the requirement. +- `tc__abs_pump_activation_001` verifies the requirement. +- `fmea__abs_pump_activation__no_activation` derived from the requirement; RPN = 160. + +**Flow summary:** + +``` +=== [FLOW SUMMARY] === +{ + "feature_context_summary": "Brake controller engages ABS pump on wheel-slip threshold exceedance (ASIL B)", + "stages_run": ["component"], + "stages_skipped": ["safety_v", "sys", "sw"], + "skip_reasons": { + "safety_v": "auto-detect: catalog does not declare hazard, safety_goal, fsr", + "sys": "auto-detect: catalog does not declare sysreq, sys-arch", + "sw": "auto-detect: catalog does not declare swreq, swarch" + }, + "artefacts": { + "component": { + "req": {"id": "gd_req__abs_pump_activation", "overall": "pass"}, + "arch": {"id": "arch__brake_controller_abs_module", "overall": "pass"}, + "tc": {"id": "tc__abs_pump_activation_001", "overall": "pass"}, + "fmea": {"id": "fmea__abs_pump_activation__no_activation", "rpn": 160} + } + }, + "stop_reason": null +} +``` + +This is the prior behaviour of the skill, preserved exactly. + +--- + +## Worked example — bootstrapping safety V in isolation + +**User input:** + +> feature_context: "Loss of brake pedal feedback while ABS is intervening can lead to a delayed +> driver response and longer stopping distances." +> parent_link: `wf__hara` +> safety_context: ASIL C +> stages: ["safety_v"] + +Even though the project's catalog also declares `sys`, `sw`, and `component` types, the +caller restricts the run to the safety V — typical when bootstrapping HARA outputs before the +lower V is mature enough to chain. + +Only Layer 1 runs: hazard → safety_goal → fsr, each reviewed. The `[FLOW SUMMARY]` +records `stages_run: ["safety_v"]`, `stages_skipped: ["sys", "sw", "component"]`, and a +`skip_reasons` map indicating each was skipped because the caller did not request it. 
diff --git a/.github/agents/pharaoh.fmea-review.agent.md b/.github/agents/pharaoh.fmea-review.agent.md index 9f8c623..f969eb8 100644 --- a/.github/agents/pharaoh.fmea-review.agent.md +++ b/.github/agents/pharaoh.fmea-review.agent.md @@ -7,4 +7,52 @@ handoffs: [] Use when auditing a single FMEA entry (failure-mode row) against the generic FMEA review axes in `shared/checklists/fmea.md` plus per-project addenda. Checks severity/occurrence/detection scales, RPN computation, cause/effect well-formedness, traceability to the analyzed artefact. Emits structured findings JSON. -See [`skills/pharaoh-fmea-review/SKILL.md`](../../skills/pharaoh-fmea-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-fmea-review + +## When to use + +Invoke after `pharaoh-fmea` emitted a single failure-mode entry. Part of the self-review invariant. + +Do NOT review sets of FMEA rows — this skill reviews one entry. A fleet review is a separate flow that invokes this skill per entry. + +## Atomicity + +- (a) One FMEA entry + one checklist in → one findings JSON out. +- (b) Input: `{target: <fmea_entry_json_or_need_id>, checklist_path: <path>, tailoring_path: <path>}`. Output: findings JSON. +- (c) Reward: fixtures `passing-fmea.json` + `failing-fmea.json` with expected findings. +- (d) Reusable. +- (e) Read-only. + +## Input + +- `target`: JSON object with the FMEA entry shape emitted by `pharaoh-fmea`, OR a need_id with type `fmea` in needs.json. +- `checklist_path`: `shared/checklists/fmea.md`. +- `tailoring_path`: `.pharaoh/project/` for optional scale extensions. 
+ +## Output + +```json +{ + "need_id": "fmea__example_01", + "type": "fmea", + "axes": { + "trace_to_analyzed_artefact": {"passed": true}, + "severity_in_range": {"passed": true, "reason": "sev=7, scale=1..10"}, + "occurrence_in_range": {"passed": true, "reason": "occ=4"}, + "detection_in_range": {"passed": true, "reason": "det=3"}, + "rpn_computed_correctly": {"passed": true, "reason": "7*4*3=84, entry reports 84"}, + "cause_well_formed": {"score": 3}, + "effect_well_formed": {"score": 3}, + "mitigation_proposed_if_rpn_high": {"score": 2, "reason": "RPN 84 > threshold 60; mitigation text thin"} + }, + "overall": "pass" +} +``` + +## Review axes + +See [`shared/checklists/fmea.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/fmea.md). diff --git a/.github/agents/pharaoh.fmea.agent.md b/.github/agents/pharaoh.fmea.agent.md index 9b3f4b4..528d8ea 100644 --- a/.github/agents/pharaoh.fmea.agent.md +++ b/.github/agents/pharaoh.fmea.agent.md @@ -7,4 +7,330 @@ handoffs: [] Use when deriving a single failure-mode entry (FMEA / DFA row) from one requirement or architecture element. Emits structured JSON with cause, effect, severity (1-10), occurrence (1-10), detection (1-10), and RPN. -See [`skills/pharaoh-fmea/SKILL.md`](../../skills/pharaoh-fmea/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-fmea + +## When to use + +Invoke when the user has a requirement or architecture element and wants to derive one failure-mode +entry from it. Each invocation produces exactly one FMEA row. + +Do NOT derive multiple failure modes in one invocation — one mode per call. If the parent element +implies several potential failure modes, derive the most safety-critical one and advise the user +to re-invoke for others. +Do NOT produce RST output — FMEA entries are tabular/JSON artefacts, not sphinx-needs directives. 
+Do NOT score whole systems — one failure mode, one parent element per call. + +--- + +## Inputs + +- **parent_id** (from user): need-id of the parent requirement or architecture element + — must exist in needs.json +- **failure_mode** (from user): short description of the specific failure to analyse + (optional — if omitted, derive the most apparent failure mode from the parent body) +- **safety_context** (from user, optional): ASIL level (A–D) or safety goal if known; informs + severity rating +- **needs.json**: required for parent resolution + +--- + +## Outputs + +A single JSON object — no RST, no prose wrapper: + +```json +{ + "fmea_id": "fmea__<parent_local_id>__<mode>", + "parent": "<parent_id>", + "failure_mode": "short description of how the element fails", + "cause": "root cause or failure mechanism", + "effect": "downstream consequence if failure is not detected", + "severity": 1, + "occurrence": 1, + "detection": 1, + "rpn": 1, + "mitigations": ["mitigation or design control"], + "justifications": { + "severity": "one-sentence rationale for the severity score", + "occurrence": "one-sentence rationale for the occurrence score", + "detection": "one-sentence rationale for the detection score" + } +} +``` + +`rpn` is always `severity × occurrence × detection`. Do not emit a pre-computed value that +differs from this product. 
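
The RPN rule above is easy to check mechanically. The following is a minimal sketch of such a check, assuming the entry is already parsed into a Python dict with the field names shown in the output shape. `check_fmea_entry` is a hypothetical helper, not a function provided by Pharaoh.

```python
from typing import Any, Dict, List


def check_fmea_entry(entry: Dict[str, Any]) -> List[str]:
    """Return contract violations for one FMEA entry; an empty list means consistent."""
    problems: List[str] = []
    scores: Dict[str, int] = {}

    for dim in ("severity", "occurrence", "detection"):
        value = entry.get(dim)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
            problems.append(f"{dim} must be an integer in 1..10, got {value!r}")
        else:
            scores[dim] = value

    # Only compare the product when all three scores are usable.
    if len(scores) == 3:
        expected = scores["severity"] * scores["occurrence"] * scores["detection"]
        if entry.get("rpn") != expected:
            problems.append(f"rpn must be {expected}, got {entry.get('rpn')!r}")

    return problems
```

An entry with severity 8, occurrence 4, detection 5, and `rpn` 160 yields no problems. The same entry with `rpn` 150 yields one.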
+ +--- + +## Rating scales + +All three ordinal dimensions use 1–10 per AIAG-VDA FMEA / ISO 26262 severity-risk convention: + +### Severity (S) — consequence of the failure effect + +| Score | Meaning | +|---|---| +| 9–10 | Hazardous: safety-critical failure; may result in injury or loss of life; ASIL C/D item | +| 7–8 | High: significant damage or non-compliance; ASIL B item | +| 5–6 | Moderate: degraded function; customer/operator dissatisfied; ASIL A item | +| 3–4 | Low: minor degraded function; slight annoyance | +| 1–2 | None to negligible: no discernible effect | + +### Occurrence (O) — likelihood of the cause occurring + +| Score | Meaning | +|---|---| +| 9–10 | Very high: failure is almost certain (≥ 1 in 100 operations) | +| 7–8 | High: repeated failures (≥ 1 in 1000 operations) | +| 5–6 | Moderate: occasional failures (≥ 1 in 10 000 operations) | +| 3–4 | Low: relatively few failures | +| 1–2 | Remote: failure is unlikely | + +### Detection (D) — likelihood that the failure is NOT detected before reaching the customer / next level + +| Score | Meaning | +|---|---| +| 9–10 | Almost impossible to detect; no control mechanism in place | +| 7–8 | Very low chance of detection | +| 5–6 | Moderate detection probability | +| 3–4 | High detection probability; test or monitoring in place | +| 1–2 | Almost certain detection; continuous monitoring or mandatory test | + +> LLM-judge is the source of truth for scoring. Scores reflect the LLM's reasoning from the +> parent element body and safety_context. Harness spot-checks against human-rated samples. + +--- + +## Process + +### Step 1: Locate and parse needs.json + +Find `needs.json` (check `docs/_build/needs/needs.json`, then `_build/needs/needs.json`, then +any `needs.json` under a `_build` directory). If not found, FAIL: + +``` +FAIL: needs.json not found. Build the Sphinx project first (`sphinx-build docs/ docs/_build/`), +then re-run this skill. +``` + +Extract a flat map of `id → {id, type, status, body}`. 
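
The lookup order in Step 1 can be sketched as follows. Two things here are assumptions rather than facts from this specification: the file is assumed to follow the usual Sphinx-Needs layout (`versions` → `<version>` → `needs`), and the need body is assumed to live under a `description` key with `content` as a fallback. `find_needs_json` and `load_need_map` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Dict, Optional, Union


def find_needs_json(root: Union[str, Path]) -> Optional[Path]:
    """Locate needs.json using the search order from Step 1; None if absent."""
    root = Path(root)
    for candidate in (
        root / "docs" / "_build" / "needs" / "needs.json",
        root / "_build" / "needs" / "needs.json",
    ):
        if candidate.is_file():
            return candidate
    # Fallback: any needs.json that sits somewhere below a `_build` directory.
    for candidate in sorted(root.rglob("needs.json")):
        if "_build" in candidate.relative_to(root).parts[:-1]:
            return candidate
    return None


def load_need_map(path: Union[str, Path]) -> Dict[str, Dict[str, object]]:
    """Flatten needs.json into id -> {id, type, status, body}."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    versions = data.get("versions", {})
    version = data.get("current_version")
    if version not in versions:
        # Assumption: fall back to the first listed version if none is marked current.
        version = next(iter(versions), None)
    needs = versions.get(version, {}).get("needs", {})
    return {
        need_id: {
            "id": need_id,
            "type": need.get("type"),
            "status": need.get("status"),
            # Assumption: body text is stored under `description` (or `content`).
            "body": need.get("description") or need.get("content") or "",
        }
        for need_id, need in needs.items()
    }
```

A caller would treat a `None` result from `find_needs_json` as the FAIL case described above and stop before any parent lookup.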
+ +--- + +### Step 2: Validate parent_id + +Look up `parent_id` in the needs.json map. If not found, FAIL: + +``` +FAIL: parent_id "<id>" not found in needs.json. +Specify an existing requirement or architecture element ID. +``` + +Extract the parent body — this is the primary source for deriving failure mode, cause, and effect. + +--- + +### Step 3: Determine failure_mode + +If `failure_mode` was provided by the user, use it as-is (cleaned to ≤ 8 words, lowercase with +underscores for the ID slug). + +If `failure_mode` was not provided: +1. Read the parent body. +2. Identify the primary function or constraint being specified. +3. Derive the most apparent failure mode: "What happens if this function does NOT occur, occurs + incorrectly, or occurs at the wrong time?" +4. State the failure mode as a short noun phrase: e.g. "no ABS pump activation on wheel slip". + +If the parent body is too vague to derive a failure mode, FAIL: + +``` +FAIL: parent "<parent_id>" body is too vague to derive a failure mode. +Provide explicit failure_mode in the input or improve the parent element first. +``` + +--- + +### Step 4: Derive cause, effect, and mitigations + +**Cause:** Root cause or failure mechanism at the element level. Focus on the element itself: +hardware fault, software logic error, interface signal loss, incorrect parameterisation, etc. +One concrete cause per FMEA entry. + +**Effect:** Downstream consequence if this failure propagates undetected. Describe the effect +at the system / vehicle / user level (not just at the element level). Reference the parent's +role in the safety chain. + +**Mitigations:** One to three design controls or mitigations that reduce severity, occurrence, +or detection probability. Examples: watchdog monitoring, redundant sensor path, end-of-line +calibration test, periodic self-test. + +If safety_context (ASIL level) was provided, reference it in the effect and severity +justification. 
+ +--- + +### Step 5: Assign S / O / D scores and compute RPN + +For each dimension, select a score from the 1–10 scale using the tables above and the parent +body plus safety_context. Write a one-sentence justification per dimension. + +**Severity:** driven primarily by the effect description and ASIL level if known. + +**Occurrence:** driven by the failure mechanism type (random hardware fault vs. systematic +software defect) and any field-experience hints in the parent body or safety_context. + +**Detection:** driven by what monitoring or testing is stated or implied in the parent element. +If the parent has a `:verification:` link, lower detection score (better detection). If no +verification is present, default to score 7 (poor detection assumed). + +Compute: `rpn = severity × occurrence × detection`. + +--- + +### Step 6: Assign fmea_id + +Format: `fmea__<parent_local>__<mode_slug>` + +Where: +- `parent_local` is the local part of parent_id (after the separator `__`) +- `mode_slug` is the failure_mode short form: lowercase, underscores, ≤ 5 words + +Example: parent `gd_req__abs_pump_activation`, mode "no activation on slip" → +`fmea__abs_pump_activation__no_activation`. + +--- + +### Step 7: Self-check + +Before emitting: + +- `rpn` == `severity × occurrence × detection` (recompute and verify) +- All three scores are integers in 1–10 +- All three justification strings are non-empty +- `fmea_id` matches the `fmea__<parent_local>__<mode_slug>` format +- `mitigations` contains at least one item + +If any check fails, correct and re-check once. If still failing, emit with `[DIAGNOSTIC]`. + +--- + +### Step 8: Emit JSON + +Emit the single JSON object. No prose before or after except the advisory note. + +If multiple failure modes were apparent from the parent element and only one is emitted, append +after the JSON: + +``` +[NOTE] Additional failure modes may be apparent from this element. 
+Re-invoke pharaoh-fmea with an explicit failure_mode argument for each additional mode. +``` + +--- + +## Guardrails + +**G1 — Parent not found** + +parent_id absent from needs.json → FAIL (Step 2). + +**G2 — Parent body too vague** + +No derivable failure mode → FAIL (Step 3). Do not invent a generic placeholder failure mode. + +**G3 — Multiple failure modes inferred** + +If the parent implies more than one distinct failure mode, derive the highest-severity one and +emit the `[NOTE]` advisory (Step 8). Do not bundle two failure modes in one JSON entry. + +**G4 — needs.json unavailable** + +Cannot find needs.json → FAIL (Step 1). + +**G5 — RPN mismatch** + +If the emitted `rpn` field does not equal `severity × occurrence × detection`, self-correct +before emitting. Never emit a mismatched RPN. + +--- + +## Advisory chain + +This skill has no downstream chain (`chains_to: []`). No advisory is appended unless multiple +failure modes were detected (see Step 8 `[NOTE]`). + +--- + +## Worked example + +**User input:** +> Parent: `gd_req__abs_pump_activation`; no explicit failure_mode; safety_context: ASIL B. + +**Parent body (from needs.json):** +> "The brake controller shall engage the ABS pump when measured wheel slip exceeds the +> calibrated activation threshold." + +**Step 3 — derive failure_mode:** +Primary function: engage ABS pump on slip threshold exceedance. +Most apparent failure: pump is not engaged despite slip threshold being exceeded. +failure_mode = "no ABS pump activation on slip threshold exceedance" + +**Step 4:** +- cause: "Brake controller software fails to detect slip threshold exceedance due to ADC input + signal corruption or watchdog reset during the slip-detection window." +- effect: "ABS pump does not activate; wheel locks up during emergency braking; braking distance + increases; driver loses directional control. ASIL B safety goal potentially violated." 
+- mitigations: ["Watchdog monitoring on slip-detection task", "Redundant wheel speed sensor + input path", "End-of-line ABS activation test before vehicle delivery"] + +**Step 5 — S/O/D:** +- severity = 8; justification: "Failure to activate ABS during emergency braking can cause loss + of directional control, significant injury risk — ASIL B item." +- occurrence = 4; justification: "ADC signal corruption or watchdog reset is a low-probability + systematic defect in a mature ECU design." +- detection = 5; justification: "No continuous runtime monitoring of pump-activation response + is stated in the parent requirement; EoL test improves detection but does not eliminate + in-field risk." +- rpn = 8 × 4 × 5 = 160 + +**Step 6:** fmea_id = `fmea__abs_pump_activation__no_activation` + +**Step 7:** rpn == 160; all scores 1–10; justifications non-empty; at least one mitigation. Pass. + +**Step 8 output:** + +```json +{ + "fmea_id": "fmea__abs_pump_activation__no_activation", + "parent": "gd_req__abs_pump_activation", + "failure_mode": "no ABS pump activation on slip threshold exceedance", + "cause": "Brake controller software fails to detect slip threshold exceedance due to ADC input signal corruption or watchdog reset during the slip-detection window.", + "effect": "ABS pump does not activate; wheel locks up during emergency braking; braking distance increases; driver loses directional control. 
ASIL B safety goal potentially violated.", + "severity": 8, + "occurrence": 4, + "detection": 5, + "rpn": 160, + "mitigations": [ + "Watchdog monitoring on slip-detection task", + "Redundant wheel speed sensor input path", + "End-of-line ABS activation test before vehicle delivery" + ], + "justifications": { + "severity": "Failure to activate ABS during emergency braking can cause loss of directional control and significant injury risk — ASIL B item.", + "occurrence": "ADC signal corruption or watchdog reset is a low-probability systematic defect in a mature ECU design.", + "detection": "No continuous runtime monitoring of pump-activation response stated in parent requirement; EoL test improves but does not eliminate in-field detection gap." + } +} +``` + +## Last step + +After emitting the artefact, invoke `pharaoh-fmea-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-fmea-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/.github/agents/pharaoh.gate-advisor.agent.md b/.github/agents/pharaoh.gate-advisor.agent.md index 0991f5c..5ab311d 100644 --- a/.github/agents/pharaoh.gate-advisor.agent.md +++ b/.github/agents/pharaoh.gate-advisor.agent.md @@ -7,4 +7,163 @@ handoffs: [] Use when reading a project's `pharaoh.toml` to report which phased-enablement ladder step is the recommended next gate to switch on. 
Single mechanical advisory check — parses five flags (`strictness`, `require_verification`, `require_change_analysis`, `require_mece_on_release`, `codelinks.enabled`), walks the fixed ladder in order, and emits the first unmet step plus its blocker note. Read-only; never edits `pharaoh.toml`. -See [`skills/pharaoh-gate-advisor/SKILL.md`](../../skills/pharaoh-gate-advisor/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-gate-advisor + +## When to use + +Invoke after `pharaoh-bootstrap` + `pharaoh-setup` have landed a `pharaoh.toml`, whenever an auditor asks "which gate should we switch on next?", or as a recurring prompt in a project-health review. Reads `pharaoh.toml`, reports the current state of the five ladder knobs, and names the FIRST ladder step whose flag is not yet enabled along with the pre-work that blocks enabling it. When every step is satisfied, returns `recommended_next_gate: null` with `rationale: "ladder complete"`. + +The ladder is fixed and ordered by value / cost ratio — cheapest-and-most-effective first, hardest-and-most-disruptive last. Advancing one step at a time makes the transition from "advisory everywhere" to "enforcing everywhere" debuggable — a project that flips `strictness = "enforcing"` before any individual gate is on ships a gate that enforces nothing, then gets blamed when a later flip fails. + +Do NOT invoke to modify `pharaoh.toml` — this skill is advisory, read-only. Auto-enablement belongs in `pharaoh-setup` or a future `pharaoh-setup-reconfigure`, not here. Do NOT invoke to grade the QUALITY of the gates' effects (whether review coverage is good, whether MECE is clean) — that is `pharaoh-quality-gate`. Do NOT invoke to reason about gates not in the ladder (e.g. `[pharaoh.traceability]`); the ladder is deliberately five steps. + +## Atomicity + +- (a) Indivisible: one `pharaoh.toml` in → one findings JSON out. 
No file writes, no dispatch of other skills, no reasoning about anything besides the five ladder flags. +- (b) Input: `{pharaoh_toml_path: str}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-gate-advisor/fixtures/` — one per ladder outcome: + 1. `fresh-from-bootstrap/` — every flag at its advisory default (`strictness = "advisory"`, all four booleans `false`). Expected: `recommended_next_gate == "require_verification"`, rationale names step 1 as the lowest-cost enablement, `ladder[0].blocker == "none — safe to enable now"`. + 2. `step-1-enabled/` — `require_verification = true`, the remaining three booleans `false`, strictness advisory. Expected: `recommended_next_gate == "require_change_analysis"`, rationale names step 2 and the pharaoh-change tailoring blocker. + 3. `all-steps-enabled/` — `strictness = "enforcing"`, all four booleans `true`. Expected: `recommended_next_gate == null`, rationale `"ladder complete"`, every ladder entry reports its flag as enabled. + + Pass = each fixture's actual output matches `expected-output.json` byte-for-byte (the ladder array is fixed and deterministic). +- (d) Reusable across projects — the ladder ships with zero project-specific vocabulary. Only `pharaoh.toml`'s own key names appear, and those are the same for every Pharaoh consumer. Tailoring extension point: projects may override `rationale` text via `tailoring.gate_advisor_rationale_overrides` if they want house-style blocker notes, but the ladder ORDER is fixed. +- (e) Read-only. Does not modify `pharaoh.toml`, `pharaoh.toml.example`, or any on-disk state. Running twice on identical input yields byte-identical output. + +## Input + +- `pharaoh_toml_path`: absolute path to the project's `pharaoh.toml`. The skill reads exactly five keys: + - `[pharaoh].strictness` — string; treated as `"advisory"` unless the value is exactly `"enforcing"`. + - `[pharaoh.workflow].require_verification` — boolean. 
+ - `[pharaoh.workflow].require_change_analysis` — boolean. + - `[pharaoh.workflow].require_mece_on_release` — boolean. + - `[pharaoh.codelinks].enabled` — boolean. + + Default values are NOT redeclared in this skill. If a flag is absent from the project's `pharaoh.toml`, the skill treats it as the value declared in `pharaoh.toml.example` at the Pharaoh repo root. The example currently sets `(require_change_analysis=true, require_verification=true, require_mece_on_release=false)` and `codelinks.enabled=true`; for absent strictness the example sets `"advisory"`. To change the defaults this skill walks against, edit `pharaoh.toml.example` only — never reintroduce competing defaults here. + +Edge cases: +- `pharaoh_toml_path` missing or unreadable → emit `overall: "error"` with `errors: ["pharaoh.toml unresolved: <path>"]` and no other keys. Callers branch on `overall` first. No ladder array is emitted on this path — the ladder is meaningful only when the file parsed. +- TOML parse error (syntax bad) → same `overall: "error"` shape with the parser message included. +- Keys present but with unexpected types (e.g. `require_verification = "yes"` as a string) → treat as the typed default declared in `pharaoh.toml.example` (`true`/`false`/`"advisory"` per the example) and add a note `"unexpected type for <key>; treated as default"` in `notes`. +- Entire `[pharaoh.workflow]` or `[pharaoh.codelinks]` section absent → every flag in that section resolves to its example default; no error. + +## Output + +```json +{ + "current_state": { + "strictness": "advisory", + "require_verification": false, + "require_change_analysis": false, + "require_mece_on_release": false, + "codelinks_enabled": false + }, + "recommended_next_gate": "require_verification", + "rationale": "require_verification = true is the highest-value, lowest-cost step — it wires the review skills that are already ship-ready into the release gate and catches every PARTIAL finding via pharaoh-req-review. 
No pre-work required.", + "ladder": [ + {"step": 1, "gate": "require_verification = true", "blocker": "none — safe to enable now"}, + {"step": 2, "gate": "require_change_analysis = true", "blocker": "needs pharaoh-change to be tailored"}, + {"step": 3, "gate": "require_mece_on_release = true", "blocker": "needs release-gate workflow"}, + {"step": 4, "gate": "codelinks.enabled = true", "blocker": "needs codelink annotations in source"}, + {"step": 5, "gate": "strictness = enforcing", "blocker": "requires steps 1-4 satisfied"} + ], + "overall": "pass", + "notes": [] +} +``` + +Fields (in canonical order): +- `current_state`: echo of the five parsed flags, using the canonical key names above. `codelinks_enabled` is underscored here even though the TOML key is `codelinks.enabled`, so the JSON is flat and one-shape. +- `recommended_next_gate`: the canonical key name of the FIRST ladder step whose corresponding config field is not yet at its enabled value. One of `"require_verification"`, `"require_change_analysis"`, `"require_mece_on_release"`, `"codelinks_enabled"`, `"strictness_enforcing"`, or `null` when the ladder is complete. +- `rationale`: one or two sentences naming why this step is the next one — what it unlocks and what (if anything) blocks enabling it right now. On `null` recommendation, the string is exactly `"ladder complete"`. +- `ladder`: the fixed five-entry array shown above, shipped verbatim in every non-error response. Each entry has `step` (1–5), `gate` (the TOML-style line the project would add), and `blocker` (the pre-work the project must complete first, or `"none — safe to enable now"` for step 1). +- `overall`: `"pass"` when the file parsed and the ladder computed. `"error"` when the file failed to resolve or parse (see Edge cases). +- `notes`: any non-fatal observations (e.g. mistyped value, absent section treated as default). Empty list when clean. + +## Detection rule + +Two passes over the input; both mechanical, no LLM judgement. + +### 1. 
Parse the five flags + +**Check:** Load `pharaoh.toml` as TOML. Read each of the five keys per the `## Input` section. Apply defaults for missing keys. Coerce unexpected types to their defaults and add a note. + +**Detection:** +```python +import tomllib + +with open(pharaoh_toml_path, "rb") as fh: + data = tomllib.load(fh) + +strictness = data.get("pharaoh", {}).get("strictness", "advisory") +if strictness != "enforcing": + strictness = "advisory" + +wf = data.get("pharaoh", {}).get("workflow", {}) +rv = wf.get("require_verification", False) is True +rca = wf.get("require_change_analysis", False) is True +rmr = wf.get("require_mece_on_release", False) is True + +cl = data.get("pharaoh", {}).get("codelinks", {}) +ce = cl.get("enabled", False) is True +``` + +### 2. Walk the fixed ladder + +**Check:** Iterate the five ladder entries in order. The first entry whose corresponding flag is NOT at its enabled value is the `recommended_next_gate`. If all five are at their enabled value, `recommended_next_gate` is `null`. + +Enabled values per step (canonical): +1. `require_verification` enabled iff `rv is True`. +2. `require_change_analysis` enabled iff `rca is True`. +3. `require_mece_on_release` enabled iff `rmr is True`. +4. `codelinks_enabled` enabled iff `ce is True`. +5. `strictness_enforcing` enabled iff `strictness == "enforcing"`. 
+ +**Detection:** +```python +LADDER = [ + ("require_verification", rv, "none — safe to enable now"), + ("require_change_analysis", rca, "needs pharaoh-change to be tailored"), + ("require_mece_on_release", rmr, "needs release-gate workflow"), + ("codelinks_enabled", ce, "needs codelink annotations in source"), + ("strictness_enforcing", strictness == "enforcing", "requires steps 1-4 satisfied"), +] + +recommended = next((name for name, enabled, _ in LADDER if not enabled), None) +``` + +The ladder array in the output is derived once from a static template (see `## Output`); only `recommended_next_gate`, `rationale`, and `current_state` vary per input. `overall` is `"pass"` on any successful parse. + +`rationale` text is drawn from a static map keyed by `recommended_next_gate`: + +| `recommended_next_gate` | Default rationale | +|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------| +| `require_verification` | "require_verification = true is the highest-value, lowest-cost step — it wires the review skills that are already ship-ready into the release gate and catches every PARTIAL finding via pharaoh-req-review. No pre-work required." | +| `require_change_analysis` | "require_change_analysis = true is the next step. Blocker: pharaoh-change must be tailored for this project before the gate is meaningful — otherwise every authoring task will trip an alarm with no mitigation path." | +| `require_mece_on_release` | "require_mece_on_release = true is the next step. Blocker: the project needs a release-gate workflow that understands how to invoke pharaoh-mece and act on its findings." | +| `codelinks_enabled` | "codelinks.enabled = true is the next step. Blocker: the source tree needs codelink annotations (`@req`, `@impl`, etc.) on the symbols this project wants to trace, otherwise the flag activates an empty traceability view." 
| +| `strictness_enforcing` | "strictness = enforcing is the final step. Blocker: steps 1-4 must all be satisfied first — flipping strictness before the individual gates are on ships a gate that enforces nothing." | +| `null` | "ladder complete" | + +Projects override any row via `tailoring.gate_advisor_rationale_overrides[<key>]` in `.pharaoh/project/checklists/gate-advisor.md` (optional). The ladder ORDER and the `gate` / `blocker` strings are fixed and not overridable. + +## Tailoring extension point + +- `tailoring.gate_advisor_rationale_overrides`: map of `{recommended_next_gate: rationale_string}` that replaces the default rationale when emitted. A project that prefers short blocker notes, or that wants to surface internal-ticket links in the rationale, uses this. The key set must match the canonical `recommended_next_gate` names above; unknown keys are ignored with a `notes` entry. + +No other knobs are exposed. The ladder itself is the shared reference [`skills/shared/gate-enablement.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/gate-enablement.md) — a project that disagrees with the ladder order should file an issue against the shared reference, not fork this atom. + +## Composition + +Role: `atom-check`. + +Called standalone by auditors, by `pharaoh-process-audit` as an optional health check, or from a CI job that wants a deterministic recommendation in the project dashboard. Never invoked by `pharaoh-quality-gate` (this atom is advisory, not a gate invariant — the gate invariants check the effects of the flags, not whether the flags themselves are set). Never dispatches other skills. Never modifies `pharaoh.toml`. + +Related but distinct: +- `pharaoh-setup` ships step 1 (`require_verification = true`) on by default — so a fresh project running this skill straight after setup lands on step 2 as the recommendation. 
+- `shared/gate-enablement.md` documents the rationale for the ladder order; projects read it to understand WHY each step is where it is. +- `pharaoh-quality-gate` runs the invariants that the ladder flags control — it answers "are my gates passing?", not "which gate should I enable next?". diff --git a/.github/agents/pharaoh.id-allocate.agent.md b/.github/agents/pharaoh.id-allocate.agent.md index bb80390..0b4def9 100644 --- a/.github/agents/pharaoh.id-allocate.agent.md +++ b/.github/agents/pharaoh.id-allocate.agent.md @@ -7,4 +7,80 @@ handoffs: [] Use when about to dispatch a fan-out of emission subagents (pharaoh-req-from-code, pharaoh-feat-draft-from-docs) and you need to pre-allocate globally-unique sphinx-needs IDs. Each subagent receives its pre-allocated pool and emits only from that pool, so parallel agents cannot collide on stem choice. Does NOT invoke emitters, does NOT write RST. -See [`skills/pharaoh-id-allocate/SKILL.md`](../../skills/pharaoh-id-allocate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-id-allocate + +## When to use + +Invoke from a plan emitted by `pharaoh-write-plan` (executed via `pharaoh-execute-plan`) before any task that fans out req-emission. Produces a deterministic mapping from `(parent_feat_id, stem)` to a unique list of IDs, so each req-emission task gets its own pre-reserved slots. Without this step, parallel req-emitters choose stems independently and may emit colliding IDs. + +Do NOT use to rename existing IDs. Do NOT use to emit reqs. Do NOT use to delete IDs. + +## Atomicity + +- (a) Indivisible — one request set in → one list of unique IDs out. No subagent dispatch. No file writes. No mutation of the source of `existing_ids`. +- (b) Input: `{existing_ids_file?: str, existing_ids?: list[str], requests: list[{parent_feat_id: str, stem: str, count: int, type: str, prefix: str}]}`. 
Output: list of allocated ID strings, one per requested slot, in request order. Globally unique across `existing_ids` AND within the returned list.
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-id-allocate/input_spec.json` with 27 planned IDs across 3 features. When `existing_ids` contains `CREQ_writer_01`, the allocator's output for the first `writer` request starts at `CREQ_writer_02`. Output list length equals sum of `requests[].count`.
+- (d) Reusable: any fan-out workflow where subagents emit IDs; CI allocators; renumbering utilities.
+- (e) Composable: a pure function. No side effects. No cross-skill calls.
+
+## Input
+
+- `existing_ids_file` (optional): path to a `needs.json` file. The allocator reads every need's `id` field into the existing-id set. If not provided, falls back to `existing_ids`.
+- `existing_ids` (optional): explicit list of IDs to treat as already-allocated. Used when no `needs.json` is available.
+- `requests`: list of allocation requests. Each request has:
+  - `parent_feat_id`: the parent feature this batch belongs to. Used for log messages only; IDs do not include it.
+  - `stem`: the per-file / per-symbol disambiguator (e.g. `writer`, `cli`, `exporter`). Usually the file stem normalized to snake_case.
+  - `count`: how many IDs to allocate in this batch.
+  - `type`: the sphinx-needs directive name (e.g. `comp_req`, `feat`) — recorded in log, not used for ID generation.
+  - `prefix`: the ID prefix (e.g. `CREQ_`, `FEAT_`). Determines the allocated ID format.
+
+At least one of `existing_ids_file` or `existing_ids` MUST be provided. Pass `existing_ids=[]` to signal a greenfield project.
+
+## Output
+
+A JSON array of ID strings, in request order. For `requests = [{stem: "a", count: 2, prefix: "CREQ_"}, {stem: "b", count: 1, prefix: "CREQ_"}]`, the output is exactly:
+
+```json
+["CREQ_a_01", "CREQ_a_02", "CREQ_b_01"]
+```
+
+(assuming no collisions with existing IDs).
Callers parse with `json.loads` — no line-oriented or comma-separated alternative.
+
+On any collision, the allocator advances an independent per-stem sequence counter until a free slot is found, and it emits exactly `count` IDs per request. If the counter passes 99 before `count` free slots have been emitted, FAIL — excessive collision means the caller chose a poor stem (too generic, reused across many features). The cap aligns with the 2-digit `_<seq:02d>` format: emitted `seq` values stay in `01..99`, never wider.
+
+## Process
+
+### Step 1: Collect existing IDs
+
+If `existing_ids_file` is provided, read it, parse as JSON, extract every `needs[*].id` value into an existing-id set. If `existing_ids` is provided, union its contents into the set. If both are missing, FAIL (caller error).
+
+### Step 2: Allocate per request
+
+Maintain an "allocated in this call" set alongside `existing_ids`. For each request, keep a per-stem `seq` counter (starts at 1). Produce exactly `request.count` IDs by looping `slots_emitted` from 0 to `count - 1`:
+
+1. Generate candidate `<prefix><stem>_<seq:02d>` (e.g. `CREQ_writer_01`).
+2. If candidate collides with either set, increment `seq` and retry — the slot is not consumed. If `seq` exceeds 99 without emitting `count` free IDs for this request, FAIL naming the stem and the collision rate.
+3. On a non-colliding candidate: add to the "allocated in this call" set, append to the output list, increment `seq`, increment `slots_emitted`.
+
+Exactly `count` IDs per request end up in the output. The two counters move independently: `seq` advances after every candidate, whether it collided or was emitted, while `slots_emitted` advances only on a successful emit.
+
+### Step 3: Return
+
+Emit the output as a JSON array of strings (per the wire format declared above). Nothing else on stdout.
+
+## Failure modes
+
+- Neither `existing_ids_file` nor `existing_ids` provided → FAIL.
+- `existing_ids_file` path unreadable → FAIL.
+- Counter exceeds 99 for any request → FAIL naming the stem. +- Any request has `count < 1` → FAIL. + +## Non-goals + +- No ID minting strategy beyond sequential numbering — if a project wants UUID-based IDs, this skill is not the right fit. +- No bulk renumbering of existing IDs — this skill only allocates new ones. +- No cross-project uniqueness — scoped to the one project whose `needs.json` (or `existing_ids`) was provided. diff --git a/.github/agents/pharaoh.id-convention-check.agent.md b/.github/agents/pharaoh.id-convention-check.agent.md index f8e2ff2..8cf6197 100644 --- a/.github/agents/pharaoh.id-convention-check.agent.md +++ b/.github/agents/pharaoh.id-convention-check.agent.md @@ -7,4 +7,137 @@ handoffs: [] Use when verifying that every need id in a sphinx-needs corpus matches the regex declared for its type in `.pharaoh/project/id-conventions.yaml`. Single mechanical structural check — applies the tailored per-type regex, emits a list of violations. Does NOT auto-detect how many schemes coexist — scheme policy is the tailoring author's responsibility (declare an alternation to allow multiple forms). -See [`skills/pharaoh-id-convention-check/SKILL.md`](../../skills/pharaoh-id-convention-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-id-convention-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant: `id-convention-consistent`) or directly after a build to confirm the corpus obeys its declared id scheme. Reads `id-conventions.yaml` + `needs.json`, returns findings JSON listing every need whose `id` does not match the regex for its `type`. + +Do NOT use to discover or count id schemes — the tailoring author declares ONE canonical regex per type and this atom only reports violations against that regex. If multiple forms are legal (e.g. 
legacy plus new prefix), the tailoring author encodes that as an alternation in the regex (`^CREQ_.+$|^gd_req__.+$`). Do NOT use to rename ids or mutate the corpus — read-only. + +## Atomicity + +- (a) Indivisible: one `id-conventions.yaml` + one `needs.json` in → one findings JSON out. No scheme counting, no regex inference, no id rewriting, no dispatch of other skills. +- (b) Input: `{id_conventions_path: str, needs_json_path: str}`. Output: JSON `{needs_checked: int, violations: [{need_id, type, expected_regex, reason}], overall: "pass" | "fail"}`. +- (c) Reward: fixtures under `skills/pharaoh-id-convention-check/fixtures/` — one per outcome: + 1. `all-conform/` — every id matches its type's regex → matches `expected-output.json` (`overall: "pass"`, empty `violations`, `needs_checked == len(needs)`). + 2. `some-violate/` — mix of conforming and non-conforming ids across two types → `overall: "fail"`, `violations` lists each offender with its `type`, the `expected_regex` applied, and a short `reason`. + 3. `alternation-regex/` — tailoring declares `^CREQ_.+$|^gd_req__.+$` and both forms are used in the corpus → `overall: "pass"` because the alternation matches both. + + Pass = all 3 fixture outputs match `expected-output.json` exactly (modulo ordering of `violations`, which is sorted by `need_id` in the emitted output). +- (d) Reusable across projects — the regex is data-driven via tailoring; no project-specific prefix or separator is hardcoded. Works for any sphinx-needs corpus with an `id-conventions.yaml`. +- (e) Read-only. No side effects. Does not modify the tailoring file or the needs corpus. Running twice on identical inputs yields byte-identical output. + +## Input + +- `id_conventions_path`: absolute path to the tailoring file `.pharaoh/project/id-conventions.yaml`. 
Schema accepted: + + ```yaml + # top-level default regex applied to any type without an override + id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + + # per-type overrides — the regex applied to needs of that type + id_regex_exceptions: + comp_req: "^CREQ_[a-z]+_[a-z]+_[a-z]+$" + gd_req: "^CREQ_.+$|^gd_req__.+$" + ``` + + Resolution order for a need of type `T`: `id_regex_exceptions[T]` if declared, else `id_regex` (top-level default), else fail the whole check with `reason: "no regex declared for type <T>"` on every need of that type. + +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {<id>: {id, type, ...}, ...}}` shape or the versioned `{"versions": {"<v>": {"needs": {...}}}}` shape (uses `current_version` if declared, else the latest key). Each need object must carry at least `id` and `type`; needs missing either field are reported as violations with `reason: "missing id or type field"`. + +Edge cases: +- Empty corpus (`needs` is `{}`) → `needs_checked: 0, violations: [], overall: "pass"` (vacuously true). +- `id-conventions.yaml` has neither `id_regex` nor `id_regex_exceptions` → every need is a violation with `reason: "no regex declared for type <T>"`. +- Regex compilation error (invalid Python regex syntax in the tailoring) → `overall: "fail"` with a single violation `{need_id: "*", type: "<T>", expected_regex: "<bad regex>", reason: "regex compile error: <python error>"}` and `needs_checked: 0`. +- Need `type` not mentioned in `id_regex_exceptions` and no top-level default → violation with `reason: "no regex declared for type <T>"`. 
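The resolution order and the two accepted `needs.json` shapes described above can be sketched as two small helpers. This is a minimal sketch under stated assumptions, not the atom's own code: the function names `flatten_needs` and `resolve_regex` are hypothetical, and "latest key" is read here as the last key in file order, which the specification does not spell out.

```python
from typing import Optional


def flatten_needs(needs_json: dict) -> dict:
    """Return the {id: need} map from either accepted needs.json shape."""
    if "needs" in needs_json:  # flat shape
        return needs_json["needs"]
    versions = needs_json.get("versions") or {}
    if not versions:
        return {}
    current = needs_json.get("current_version")
    if current not in versions:
        # Assumption: "latest key" means the last key in file order.
        current = list(versions)[-1]
    return versions[current].get("needs", {})


def resolve_regex(conventions: dict, need_type: str) -> Optional[str]:
    """Per-type override first, then the top-level default, else None."""
    overrides = conventions.get("id_regex_exceptions") or {}
    return overrides.get(need_type, conventions.get("id_regex"))
```

With the schema shown earlier, `resolve_regex(conv, "gd_req")` would return the alternation and any other type would fall back to `id_regex`; a `None` result corresponds to the "no regex declared" violation.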
+ +## Output + +```json +{ + "needs_checked": 44, + "violations": [ + { + "need_id": "comp_req__login_ok", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + }, + { + "need_id": "CREQ_a", + "type": "comp_req", + "expected_regex": "^CREQ_[a-z]+_[a-z]+_[a-z]+$", + "reason": "does not match" + } + ], + "overall": "fail" +} +``` + +`overall` is `"pass"` iff `violations` is empty. `needs_checked` counts every need that was read from `needs.json` (including ones that triggered a "no regex declared" violation — they are still counted). `violations` is sorted by `need_id` ascending for deterministic fixture comparison. `reason` is a short human string: one of `"does not match"`, `"missing id or type field"`, `"no regex declared for type <T>"`, or `"regex compile error: <python error>"`. + +## Detection rule + +For every need `N` in the flattened needs map: + +1. Read `N.id` and `N.type`. If either is absent, emit violation `{need_id: <whatever id is, or "<missing>">, type: <or "<missing>">, expected_regex: null, reason: "missing id or type field"}` and continue. +2. Resolve the regex for `N.type`: first `id_regex_exceptions[N.type]`, else top-level `id_regex`. If neither is declared, emit violation `{need_id: N.id, type: N.type, expected_regex: null, reason: "no regex declared for type <N.type>"}` and continue. +3. Compile the regex with Python `re.compile(pattern)`. On `re.error`, emit a single synthetic violation (see Edge cases above) and abort. +4. Apply `re.fullmatch(pattern, N.id)`. If `None`, emit violation `{need_id: N.id, type: N.type, expected_regex: <pattern>, reason: "does not match"}`. + +`fullmatch` (not `search` or `match`) is load-bearing: the regex describes the entire id, anchors or not. This rule is what lets the tailoring author write `^CREQ_.+$|^gd_req__.+$` and have both forms pass without the alternation implicitly anchoring only the first branch. 
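To make the difference concrete, the sketch below contrasts `fullmatch` with `search` on a pattern whose anchors are written per branch. The pattern and the helper names `conforms` and `loosely_matches` are illustrative assumptions, not part of the atom.

```python
import re

# Anchors written per branch: "^" binds only to the first branch,
# "$" only to the second.
pattern = re.compile(r"^CREQ_[a-z]+|gd_req__[a-z]+$")


def conforms(need_id: str) -> bool:
    """Apply the pattern to the whole id, as the detection rule requires."""
    return pattern.fullmatch(need_id) is not None


def loosely_matches(need_id: str) -> bool:
    """What a search-based check would accept (shown only for contrast)."""
    return pattern.search(need_id) is not None
```

Under this sketch, `CREQ_writer-01` is rejected by `conforms` but accepted by `loosely_matches`, because `^` binds only to the first branch and `$` only to the second.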
+
+Minimum viable Python reference implementation (≤ 30 lines):
+
+```python
+import json, re, yaml, sys
+
+conv = yaml.safe_load(open(id_conventions_path)) or {}
+nj = json.load(open(needs_json_path))
+vers = nj.get("versions") or {}
+cur = nj.get("current_version") if nj.get("current_version") in vers else next(reversed(vers), None)
+needs = nj.get("needs") or vers.get(cur, {}).get("needs", {})  # current_version, else last-listed version
+
+default = conv.get("id_regex")
+by_type = conv.get("id_regex_exceptions", {}) or {}
+
+violations = []
+for nid, n in needs.items():
+    t = n.get("type"); i = n.get("id", nid)
+    if not t or not i:
+        violations.append({"need_id": i or "<missing>", "type": t or "<missing>",
+                           "expected_regex": None, "reason": "missing id or type field"}); continue
+    pat = by_type.get(t, default)
+    if pat is None:
+        violations.append({"need_id": i, "type": t, "expected_regex": None,
+                           "reason": f"no regex declared for type {t}"}); continue
+    try:
+        rx = re.compile(pat)
+    except re.error as e:
+        print(json.dumps({"needs_checked": 0, "violations": [
+            {"need_id": "*", "type": t, "expected_regex": pat,
+             "reason": f"regex compile error: {e}"}], "overall": "fail"})); sys.exit(0)
+    if not rx.fullmatch(i):
+        violations.append({"need_id": i, "type": t, "expected_regex": pat, "reason": "does not match"})
+
+violations.sort(key=lambda v: v["need_id"])
+print(json.dumps({"needs_checked": len(needs),
+                  "violations": violations,
+                  "overall": "pass" if not violations else "fail"}))
+```
+
+## Failure modes
+
+- **Scheme auto-detection is explicitly out of scope.** This atom does NOT answer "how many id schemes exist in this corpus?" — that is a tailoring-authoring concern, served by `pharaoh-tailor-detect`. If a project wants to allow two prefixes, the tailoring author writes an alternation regex; this check applies whatever regex is declared.
+- **No Unicode normalisation.** Ids are matched exactly as stored in `needs.json`, with no normalisation applied. Non-ASCII ids work only if the regex accounts for them. Sphinx-needs ids are ASCII in practice, so this is not a blocker.
+- **No type-name validation against `artefact-catalog.yaml`.** An id of type `T` whose `T` is absent from the artefact catalog will still be checked against the default regex (or flagged with "no regex declared"). Cross-file consistency of type names is `pharaoh-tailor-review`'s job, not this atom's. +- **`fullmatch` semantics.** Writers of the tailoring must know their regex will be `fullmatch`-ed. Adding redundant anchors (`^...$`) is harmless; omitting anchors also works. Using `re.search`-style partial patterns that were intended to match substrings will misbehave — document this in project tailoring. + +## Composition + +Role: `atom-check`. + +Called by `pharaoh-quality-gate` when `required_checks` contains `id_convention_consistent: true`, under the invariant delegation entry `id-convention-consistent`. Never invokes other skills; never dispatched from emission skills. May also be invoked directly by a human auditor inspecting a corpus. diff --git a/.github/agents/pharaoh.lifecycle-check.agent.md b/.github/agents/pharaoh.lifecycle-check.agent.md index b7474ef..da9e981 100644 --- a/.github/agents/pharaoh.lifecycle-check.agent.md +++ b/.github/agents/pharaoh.lifecycle-check.agent.md @@ -7,4 +7,254 @@ handoffs: [] Use when verifying a sphinx-needs artefact's current lifecycle state and the legality of a requested state transition per the project's workflows.yaml state machine. -See [`skills/pharaoh-lifecycle-check/SKILL.md`](../../skills/pharaoh-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-lifecycle-check + +## When to use + +Invoke when you want to check whether a proposed state transition is allowed for a given need, +or to audit a need's current state against the state machine. + +This skill is a deterministic state-machine check — no LLM judgment. It reads the project's +`workflows.yaml` and evaluates legality mechanically. 
+ +Do NOT invoke to change state — this skill only checks. Do NOT invoke for bulk transition +checks — one need per invocation. + +--- + +## Inputs + +- **need_id** (from user): ID of the need to check — must exist in `needs.json` +- **target_state** (from user, optional): the desired target state. If omitted, the skill + audits the current state only (checks it is a valid declared state, no illegal prerequisites + outstanding) +- **tailoring** (from `.pharaoh/project/`): + - `workflows.yaml` — lifecycle state machine (states + transitions with `requires:` lists) + - `artefact-catalog.yaml` — maps artefact type to lifecycle list +- **needs.json**: required to look up the need's current status and type + +--- + +## Outputs + +A single JSON document — no prose wrapper. Shape: + +```json +{ + "need_id": "gd_req__abs_pump_activation", + "artefact_type": "gd_req", + "current_state": "draft", + "target_state": "valid", + "legal": true, + "missing_prerequisites": [], + "transition_path": ["draft", "valid"], + "notes": [] +} +``` + +Fields: +- `legal`: `true` if the transition is permitted per `workflows.yaml`; `false` otherwise +- `missing_prerequisites`: list of requirement strings from the `requires:` list that cannot + be confirmed met from the need's current data (e.g. `"independent_review_complete"`) +- `transition_path`: ordered list of states from current to target (direct or multi-hop + shortest path); `null` if unreachable +- `notes`: informational observations (e.g. current state not declared, indirect path needed) + +If `target_state` was omitted, `target_state` is `null` and `legal` reflects whether the +current state is a valid declared state for this artefact type. + +--- + +## Process + +### Step 1: Load tailoring and needs.json + +**1a.** Read `workflows.yaml` from `.pharaoh/project/`. The file shape is fixed by +`schemas/workflows.schema.json` (flat `lifecycle_states` array, `transitions` array of +`{from, to, requires}`). 
Extract: +- `lifecycle_states` — flat list of declared state-name strings +- `transitions` list: each entry has `from`, `to`, and a `requires` list of gate-name + strings (always a list per the schema, never a scalar) + +**1b.** Read `artefact-catalog.yaml`. Find the entry for the artefact type of `need_id`. +Record the `lifecycle` list for that type (if present). + +**1c.** Find and parse `needs.json` (search order: `docs/_build/needs/needs.json`, +`_build/needs/needs.json`, any `needs.json` under `_build`). Extract the flat ID map. + +If any required file is missing, FAIL with the missing-file path and a hint to rebuild or +run `pharaoh-tailor-fill`. + +--- + +### Step 2: Resolve need + +Look up `need_id` in the needs.json ID map. If not found, FAIL: + +``` +FAIL: need_id "<id>" not found in needs.json. +Verify the ID or rebuild the project. +``` + +Extract: +- `type` (artefact type, e.g. `gd_req`) +- `status` (current lifecycle state) + +--- + +### Step 3: Validate current state + +Check whether `status` (current state) is declared in `workflows.yaml.lifecycle_states`. + +If not declared: +- Set `legal: false` +- Add to `notes`: `"Current state '<status>' is not declared in workflows.yaml lifecycle_states"` +- If `target_state` was not requested, emit result and stop. + +If declared, current state is valid. + +--- + +### Step 4: Check target state (if provided) + +If `target_state` was not provided, emit with `target_state: null`, `legal: true` +(assuming current state is valid), and stop. + +If `target_state` is provided: + +**4a.** Confirm `target_state` is declared in `workflows.yaml.lifecycle_states`. If not: +- Set `legal: false` +- Add to `notes`: `"Target state '<target_state>' is not declared in workflows.yaml lifecycle_states"` +- Emit and stop. + +**4b.** Build the transition graph from `workflows.yaml.transitions`. Find the shortest path +from `current_state` to `target_state` using BFS. 
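
A minimal sketch of the shortest-path search in Step 4b, assuming each transition has already been parsed into a dict with `from`, `to`, and `requires` keys as the schema describes. The function name `shortest_transition_path` is hypothetical, and returning a one-element path when `current_state == target_state` is an assumption, since the specification does not cover that case.

```python
from collections import deque


def shortest_transition_path(transitions, current_state, target_state):
    """Return the shortest list of states from current to target, or None."""
    # Adjacency map built from the {from, to, requires} entries.
    graph = {}
    for t in transitions:
        graph.setdefault(t["from"], []).append(t["to"])

    # Assumption: asking for the state the need is already in yields a trivial path.
    if current_state == target_state:
        return [current_state]

    queue = deque([[current_state]])
    seen = {current_state}
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt in seen:
                continue
            if nxt == target_state:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None  # unreachable: maps to transition_path: null


# Transitions taken from the worked example in this specification.
transitions = [
    {"from": "draft", "to": "valid", "requires": ["independent_review_complete"]},
    {"from": "valid", "to": "inspected", "requires": ["inspection_record_present"]},
    {"from": "inspected", "to": "draft", "requires": []},
]
print(shortest_transition_path(transitions, "draft", "inspected"))
# ['draft', 'valid', 'inspected']
```

Because BFS visits states in order of hop count, the first path that reaches the target is a shortest one. A `None` result corresponds to the unreachable case handled next.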
+ +If no path exists: +- Set `legal: false` +- Set `transition_path: null` +- Add to `notes`: `"No transition path from '<current_state>' to '<target_state>' in workflows.yaml"` + +If a path exists, set `transition_path` to the ordered list of states. + +--- + +### Step 5: Check prerequisites + +For each transition in the found path, read the `requires:` list. For each requirement string +in `requires:`: + +- Check whether the requirement can be confirmed met from the need's current data. The + following heuristics apply: + - `"independent_review_complete"` — check whether the need has a `:reviewed_by:` or + `:inspection_record:` field with a non-empty value in needs.json. If not present, + mark as missing. + - `"inspection_record_present"` — check whether the need has an `:inspection_record:` field + with a non-empty value. If not present, mark as missing. + - Any other requirement string — cannot be automatically confirmed; mark as missing with a + note that manual verification is required. + +Populate `missing_prerequisites` with strings that are not confirmed met. + +If `missing_prerequisites` is non-empty, set `legal: false`. +If all prerequisites are met (or the `requires:` lists are empty), `legal` remains `true`. + +--- + +### Step 6: Emit JSON + +Emit the single JSON document. No prose before or after. + +--- + +## Guardrails + +**G1 — need_id not in needs.json** + +FAIL immediately (Step 2). Do not proceed with guessed data. + +**G2 — workflows.yaml missing** + +If `workflows.yaml` is absent from `.pharaoh/project/`, FAIL: + +``` +FAIL: workflows.yaml not found at .pharaoh/project/workflows.yaml. +Run pharaoh-tailor-fill to generate tailoring files. +``` + +**G3 — Current state not declared** + +Proceed but set `legal: false` and note the anomaly (Step 3). Do not abort — the output is +still useful for diagnosing the state machine gap. + +--- + +## Advisory chain + +No downstream chain. 
If `legal: false`, append after the JSON: + +``` +Resolve missing prerequisites or fix the state declaration before performing the transition. +``` + +--- + +## Worked example + +**User input:** +- `need_id`: `gd_req__abs_pump_activation` +- `target_state`: `valid` + +**Step 1:** `workflows.yaml` loaded. States: `draft`, `valid`, `inspected`. Transitions: +`draft → valid` (requires: `independent_review_complete`), `valid → inspected` +(requires: `inspection_record_present`), `inspected → draft` (requires: `[]`). + +**Step 2:** Need found in needs.json. `type = gd_req`, `status = draft`. + +**Step 3:** `draft` is declared in lifecycle_states. Current state valid. + +**Step 4a:** `valid` declared. Continue. + +**Step 4b:** BFS finds direct path `[draft, valid]`. + +**Step 5:** Transition `draft → valid` requires `independent_review_complete`. The need has no +`:reviewed_by:` or `:inspection_record:` field in needs.json → prerequisite unconfirmed → +added to `missing_prerequisites`. + +```json +{ + "need_id": "gd_req__abs_pump_activation", + "artefact_type": "gd_req", + "current_state": "draft", + "target_state": "valid", + "legal": false, + "missing_prerequisites": ["independent_review_complete"], + "transition_path": ["draft", "valid"], + "notes": [ + "Prerequisite 'independent_review_complete' cannot be confirmed from needs.json fields — manual verification required" + ] +} +``` + +``` +Resolve missing prerequisites or fix the state declaration before performing the transition. 
+``` + +**Variant — current state only (no target_state):** + +```json +{ + "need_id": "gd_req__abs_pump_activation", + "artefact_type": "gd_req", + "current_state": "draft", + "target_state": null, + "legal": true, + "missing_prerequisites": [], + "transition_path": null, + "notes": [] +} +``` diff --git a/.github/agents/pharaoh.link-completeness-check.agent.md b/.github/agents/pharaoh.link-completeness-check.agent.md index 6d91804..b06207f 100644 --- a/.github/agents/pharaoh.link-completeness-check.agent.md +++ b/.github/agents/pharaoh.link-completeness-check.agent.md @@ -7,4 +7,141 @@ handoffs: [] Use when verifying outgoing-link coverage across a full needs.json graph. For each declared link type in `artefact-catalog.yaml`, confirms every need of the governed type carries a non-empty value AND every target id resolves to an existing need. Closes the "catalogue declares `verifies` required but half the reqs ship without it" failure class. -See [`skills/pharaoh-link-completeness-check/SKILL.md`](../../skills/pharaoh-link-completeness-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-link-completeness-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` (invariant `link-types-covered`) or from any corpus-level lint that wants to fail the build when declared link coverage slips. Reads one `artefact-catalog.yaml` + one `needs.json`, returns coverage metrics per link type plus the list of uncovered need ids. + +Scope clarification — NOT a schema check for individual directive blocks. Use `pharaoh-output-validate` for block-level schema validation (required fields present, no unknown options, well-formed RST / YAML / JSON). This atom operates on the full needs.json graph: coverage of link types across all needs of each type, link-target resolution, per-type policy enforcement. 
+ +Do NOT use to author the catalog (that is `pharaoh-tailor-fill`). Do NOT use to re-link or patch missing links (read-only). Do NOT use to grade prose quality of the linked needs. + +## Atomicity + +- (a) Indivisible: one artefact catalog + one needs.json in → one findings JSON out. No re-linking, no re-authoring, no dispatch of other skills. +- (b) Input: `{artefact_catalog_path: str, needs_json_path: str}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-link-completeness-check/fixtures/`: + 1. `all-covered/` — every need of every governed type carries every declared-required outgoing link AND every target resolves → `expected-output.json` with `overall: "pass"`, zero `missing` across `coverage_by_link_type`, empty `uncovered_needs`. + 2. `partial-coverage/` — some `comp_req` needs lack `:verifies:` and one need points at a non-existent id → `overall: "fail"`, `coverage_by_link_type.verifies.missing > 0`, `uncovered_needs` lists every failing id once. + 3. `tailoring-declares-verifies-optional/` — artefact-catalog marks `verifies` as optional for `comp_req`; those needs have no `:verifies:` field → `overall: "pass"`, `coverage_by_link_type.verifies.required: false`, no entries in `uncovered_needs` for that link type. + + Pass = each fixture's actual output matches `expected-output.json` modulo ordering of list elements. +- (d) Reusable across projects — consumes only the generic `artefact-catalog.yaml` + `needs.json` shapes. No project-specific link names or prefixes baked in. Tailoring extension point: the set of governed types and their `required_links` / `optional_links` is declared entirely in the catalog. +- (e) Read-only. Does not modify catalog, needs, or any on-disk state. Running twice on identical inputs produces identical output. + +## Input + +- `artefact_catalog_path`: absolute path to `artefact-catalog.yaml`. Each top-level key is a need `type`. 
Each type may declare `required_links: [<link_name>, ...]` and `optional_links: [<link_name>, ...]`. If a type declares neither, it is skipped (no policy, no failures). If a link name appears in both lists, `required_links` wins. +- `needs_json_path`: absolute path to `needs.json` produced by `sphinx-build`. Must contain a top-level `needs` object keyed by need id. Each need dict carries at least `type`, `id`, and any link-name keys whose values are lists of target ids. + +Edge cases: empty `needs.json` (no needs) → `overall: "pass"`, `needs_checked: 0`, empty `coverage_by_link_type`; missing `artefact-catalog.yaml` → fail with `overall: "error"`, `errors: ["artefact_catalog not found: <path>"]`; malformed YAML / JSON → fail with `overall: "error"` and the parser message; needs of a type not declared in the catalog → counted in `needs_checked` but contribute no link-coverage rows. + +## Output + +```json +{ + "needs_checked": 40, + "coverage_by_link_type": { + "satisfies": {"required": true, "covered": 40, "missing": 0}, + "verifies": {"required": true, "covered": 11, "missing": 29} + }, + "uncovered_needs": ["comp_req__auth_login", "comp_req__auth_logout"], + "unresolved_targets": [ + {"need_id": "comp_req__auth_login", "link": "verifies", "target": "tc__auth_login_ok", "reason": "target id not in needs.json"} + ], + "overall": "fail" +} +``` + +`overall` is `"pass"` iff every required link type has `missing == 0` AND `unresolved_targets` is empty. A single required-link gap OR a single unresolved target promotes `overall: "fail"`. Optional link types are reported in `coverage_by_link_type` with `required: false` and never contribute to the gate outcome — their `missing` counts are informational only. Needs whose type is absent from the catalog are counted in `needs_checked` but never populate `uncovered_needs`. + +`uncovered_needs` lists each need id at most once, even when it misses more than one required link type. 
`unresolved_targets` enumerates every broken target separately so the caller can name each dangling pointer. + +On input errors, the shape is `{"overall": "error", "errors": [<msg>, ...]}` with no other keys — callers branch on `overall` first. + +## Detection rule + +Three passes over the inputs; all mechanical, no LLM judgement. + +### 1. Load and index + +**Check:** Parse `artefact-catalog.yaml` into `{type: {required_links: set, optional_links: set}}`. Parse `needs.json` into `{need_id: need_dict}`. Build `known_ids = set(needs.keys())`. + +**Detection:** +```python +catalog = yaml.safe_load(open(artefact_catalog_path)) +needs = json.load(open(needs_json_path))["needs"] +known_ids = set(needs.keys()) +``` + +### 2. Per-need outgoing-link coverage + +**Check:** For each need, look up the catalog entry for its `type`. For every link name in `required_links`, the need's dict must have that key AND the value must be a non-empty list. Missing key OR empty list records the need id in `uncovered_needs` and increments `coverage_by_link_type[<link>].missing`. + +Needs whose type is not declared in the catalog contribute to `needs_checked` but generate no coverage rows. Optional links that are absent do not fail; when present, their targets are still resolved (step 3). + +**Detection:** +```python +for nid, need in needs.items(): + policy = catalog.get(need["type"]) + if not policy: + continue + for link_name in policy.get("required_links", []): + value = need.get(link_name) or [] + if not value: + uncovered.add(nid) + coverage[link_name]["missing"] += 1 + else: + coverage[link_name]["covered"] += 1 +``` + +### 3. Target resolution + +**Check:** For every link value (required OR optional) whose list is non-empty, each target id must appear in `known_ids`. A target absent from `known_ids` records an entry in `unresolved_targets` with `{need_id, link, target, reason}`. 
Unresolved targets count as coverage failures even when the link itself is present and non-empty — a link that points nowhere is worse than no link. + +**Detection:** +```python +for nid, need in needs.items(): + policy = catalog.get(need["type"]) + if not policy: + continue + all_links = set(policy.get("required_links", [])) | set(policy.get("optional_links", [])) + for link_name in all_links: + for target in need.get(link_name) or []: + if target not in known_ids: + unresolved.append({ + "need_id": nid, + "link": link_name, + "target": target, + "reason": "target id not in needs.json", + }) +``` + +### 4. Aggregate + +**Check:** `overall = "pass"` iff every required link has `missing == 0` AND `unresolved_targets == []`. Otherwise `"fail"`. `uncovered_needs` is the deduplicated sorted list of need ids that missed at least one required link. + +## Tailoring extension point + +All policy is declared in `artefact-catalog.yaml` — no frontmatter knobs on this skill. Projects add or remove link types by editing the catalog entry for each type: + +```yaml +comp_req: + required_links: [satisfies, verifies] + optional_links: [refines, supersedes] +``` + +Moving a link name from `required_links` to `optional_links` (or vice versa) is the single tailoring lever. The base skill ships with zero hardcoded link names. + +## Composition + +Role: `atom-check`. + +Called from `pharaoh-quality-gate.required_checks` under the invariant key `link-types-covered`, which passes iff `overall == "pass"`. Also directly invokable from any corpus-level lint or CI job that produces a `needs.json`. + +Never invoked by end users mid-authoring — authoring-time link checks belong in `pharaoh-req-review` / `pharaoh-arch-review` for the single artefact in hand. This atom is for full-graph sweeps after `sphinx-build` has produced `needs.json`. 
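
The aggregate rule from "### 4. Aggregate" can be sketched as below. This is an illustration, not the shipped implementation: the function name `aggregate` is hypothetical, and the sketch assumes the coverage rows already carry the `required`, `covered`, and `missing` keys shown in the Output section.

```python
def aggregate(needs_checked, coverage, uncovered, unresolved):
    """Combine the per-pass results into the findings document."""
    # Only link types marked required can fail the gate.
    required_gap = any(
        row["required"] and row["missing"] > 0 for row in coverage.values()
    )
    return {
        "needs_checked": needs_checked,
        "coverage_by_link_type": coverage,
        # Each need id appears once, in sorted order.
        "uncovered_needs": sorted(set(uncovered)),
        "unresolved_targets": unresolved,
        "overall": "fail" if required_gap or unresolved else "pass",
    }


coverage = {
    "satisfies": {"required": True, "covered": 2, "missing": 0},
    "refines": {"required": False, "covered": 0, "missing": 2},
}
result = aggregate(2, coverage, uncovered=[], unresolved=[])
print(result["overall"])
# pass
```

The optional `refines` row reports two missing links but does not change the outcome, which matches the rule that optional link types are informational only.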
diff --git a/.github/agents/pharaoh.mece.agent.md b/.github/agents/pharaoh.mece.agent.md index ec4d116..421398f 100644 --- a/.github/agents/pharaoh.mece.agent.md +++ b/.github/agents/pharaoh.mece.agent.md @@ -143,3 +143,509 @@ This agent has **no prerequisites**. Runs freely in any mode. Its result gates ` 2. Follow ALL configured link types. Do not hardcode type or link names. 3. Redundancy uses string comparison only, not semantic analysis. 4. Always update session state after completing the report. + +--- + +## Full atomic specification + +# pharaoh:mece -- MECE Analysis + +Analyze a sphinx-needs project for structural completeness and consistency. +MECE stands for Mutually Exclusive, Collectively Exhaustive. This skill finds +gaps in traceability coverage (not exhaustive), redundant or overlapping +requirements (not mutually exclusive), and status or ID inconsistencies across +the needs set. + +## 1. Overview + +### What this skill does + +- **Gap analysis**: Finds needs that are missing required downstream coverage + (e.g., a requirement with no linked specification). +- **Orphan detection**: Finds needs that are completely disconnected from the + traceability graph, or that terminate unexpectedly. +- **Redundancy analysis**: Flags needs of the same type with very similar titles, + content, or identical link structures that may be unintentional duplicates. +- **Status inconsistency checks**: Detects contradictions between a need's status + and the statuses of its linked parents or children. +- **ID scheme validation**: Ensures all need IDs conform to the project's naming + convention and that no duplicates exist. +- **Schema validation**: When ubc CLI is available, runs full ontology and lint + checks. + +### How it differs from review skills (e.g. pharaoh:req-review) + +Review skills (e.g. 
`pharaoh:req-review`) check the **content** of individual needs -- whether the +requirement text is clear, whether test cases adequately cover what they claim, +whether implementations match their specifications. + +`pharaoh:mece` checks the **structure** of the requirements set as a whole -- +whether every need is connected to the traceability chain, whether the chain has +gaps, and whether the set is internally consistent. + +Both skills are complementary. Running them together gives full coverage of +content quality and structural integrity. + +### Why it matters + +In safety-critical domains (ISO 26262, IEC 62304, DO-178C, A-SPICE), regulatory +audits require evidence of complete bidirectional traceability from top-level +requirements through specifications, implementations, and tests. A single orphan +requirement or broken link chain can result in audit findings. This skill +automates the structural checks that catch these problems before an auditor does. + +--- + +## 2. Process + +Follow these steps in order. Do not skip steps. If a step fails or produces no +data, note that in the final report and continue to the next step. + +--- + +### Step 1: Get project data + +Follow the instructions in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md) completely. At the +end of data access you must have: + +1. **Project roots**: All identified project root paths. +2. **Source directories**: Documentation source path for each project. +3. **Need types**: List of valid directive types with their prefixes (e.g., + `req`, `spec`, `impl`, `test`). +4. **Link types**: Standard `links` plus all extra_link names (e.g., + `implements`, `tests`). +5. **Data access tier**: Which tier is active (ubc CLI / ubCode MCP / raw files). +6. **Needs index**: Complete index of all needs with IDs, types, titles, + statuses, links, content, and source file locations. +7. **Link graph**: Bidirectional graph of need relationships. 
+8. **Codelinks status**: Whether sphinx-codelinks is configured. +9. **Pharaoh config**: Strictness level, workflow gates, traceability + requirements. + +Read `pharaoh.toml` and extract the `[pharaoh.traceability]` section. Record +the `required_links` array. These rules drive gap analysis in Step 2. + +If `pharaoh.toml` does not exist or `required_links` is missing/empty, use the +following defaults based on the detected need types: + +- If `req` and `spec` types exist: `"req -> spec"` +- If `spec` and `impl` types exist: `"spec -> impl"` +- If `impl` and `test` types exist: `"impl -> test"` + +Only apply default rules for type pairs that actually exist in the project +configuration. Do not assume types that are not configured. + +Present the data access summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier and version> +Needs found: <count> +Strictness: <advisory or enforcing> +Required chains: <list of required_links rules> +``` + +--- + +### Step 2: Gap analysis (Collectively Exhaustive) + +For each rule in the `required_links` list (or the defaults from Step 1): + +1. Parse the rule. Each rule has the form `"<source_type> -> <target_type>"` + (e.g., `"req -> spec"`). +2. Find all needs whose `type` matches `<source_type>`. +3. For each source need, check whether it has at least one outgoing link + (through any link type: `links`, `implements`, `tests`, or any extra_link) + to a need whose `type` matches `<target_type>`. +4. A source need that has **no** outgoing link to any need of the target type + is a **gap**. + +When checking links, consider direct links only; do not resolve them +transitively. Do not follow chains (e.g., if `req -> spec` and `spec -> impl` +are two separate rules, check each rule independently). + +When checking link targets, match on the **type** of the target need, not on +the ID prefix.
Look up each linked ID in the needs index and check its `type` +field. + +Record each gap as: + +- Source need ID +- Source need type +- Missing target type +- The rule that requires the link (e.g., `"req -> spec"`) + +If ubc CLI is available, also run `ubc check` and incorporate any traceability +warnings it reports. Merge ubc results with the file-based analysis to avoid +duplicates. + +--- + +### Step 3: Orphan detection + +Scan the needs index and link graph to classify each need: + +**Completely disconnected (orphans)**: +A need that has NO incoming links AND no outgoing links of any kind. It is +entirely isolated from the traceability graph. + +**Dead ends**: +A need that has incoming links but no outgoing links. This is expected for +**leaf types** -- types that sit at the end of the traceability chain (typically +test cases, or top-level requirements that have no parent). It is unexpected for +**intermediate types** (e.g., a specification with no implementation link). + +Determine leaf types from the `required_links` rules: +- A type that never appears as a `<source_type>` in any rule is a leaf type. + Example: if the rules are `req -> spec`, `spec -> impl`, `impl -> test`, + then `test` is a leaf type because it never appears on the left side. +- A type that never appears as a `<target_type>` in any rule is a root type. + Example: `req` is a root type because it never appears on the right side. 
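
The leaf and root rules above can be sketched as follows. The helper name `classify_types` is hypothetical, and the sketch assumes every rule is a well-formed `"<source_type> -> <target_type>"` string; a malformed rule raises `ValueError` here rather than being skipped.

```python
def classify_types(required_links):
    """Split the types named in required_links into root, intermediate, leaf."""
    sources, targets = set(), set()
    for rule in required_links:
        src, dst = (part.strip() for part in rule.split("->"))
        sources.add(src)
        targets.add(dst)
    all_types = sources | targets
    return {
        # Never on the right-hand side of any rule.
        "root": sorted(all_types - targets),
        # On both sides of some rule.
        "intermediate": sorted(sources & targets),
        # Never on the left-hand side of any rule.
        "leaf": sorted(all_types - sources),
    }


print(classify_types(["req -> spec", "spec -> impl", "impl -> test"]))
# {'root': ['req'], 'intermediate': ['impl', 'spec'], 'leaf': ['test']}
```

Types that appear in no rule at all are not classified by this sketch; how to treat them is left to the project.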
+ +Classification: + +| Condition | Root type | Intermediate type | Leaf type | +|---|---|---|---| +| No incoming, no outgoing | Orphan (warning) | Orphan (error) | Orphan (error) | +| Has incoming, no outgoing | N/A (root has no incoming by definition) | Dead end (error) | Expected (ok) | +| No incoming, has outgoing | Expected (ok) | Missing parent (warning) | Unexpected parent (info) | +| Has incoming, has outgoing | Unexpected (info) | Expected (ok) | Unexpected (info) | + +Record each finding with: +- Need ID +- Need type +- Title +- Issue description +- Severity (error, warning, info) + +--- + +### Step 4: Redundancy analysis (Mutually Exclusive) + +Check for potential duplicates within each need type. Compare needs of the +**same type** only -- needs of different types are expected to have related +content (e.g., a spec that mirrors a req is correct, not redundant). + +**Title similarity**: +For each pair of needs of the same type, compare their titles. Flag pairs where: +- Titles are identical (after normalizing whitespace and case). +- Titles differ only by a trailing number or minor suffix (e.g., + "User login" vs "User login v2"). + +**Content similarity**: +For each pair of needs of the same type, compare their content bodies. Flag +pairs where: +- Content is identical or nearly identical (ignoring whitespace differences). +- One content body is a strict subset of the other. + +**Structural similarity**: +For each pair of needs of the same type, compare their link sets. Flag pairs +where: +- Both needs link to the exact same set of parent needs AND the exact same set + of child needs. + +Do not attempt full semantic analysis. Use string-level comparison only. The +goal is to surface obvious duplicates that a human reviewer should evaluate. 
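
As one possible reading of the title-similarity rule, the sketch below uses only string operations. The function names, the `need_a` / `need_b` keys, and the exact suffix rule (a trailing number or `vN` after whitespace) are assumptions made for illustration; a project may define "minor suffix" differently.

```python
import re
from itertools import combinations


def _normalize(title):
    # Lower-case and collapse runs of whitespace.
    return " ".join(title.lower().split())


def _strip_suffix(title):
    # Drop a trailing number or "vN" marker, e.g. "user login v2" -> "user login".
    return re.sub(r"\s+v?\d+$", "", title)


def title_redundancies(needs):
    """Return informational findings for same-type needs with near-equal titles."""
    findings = []
    ordered = sorted(needs, key=lambda n: n["id"])
    for a, b in combinations(ordered, 2):
        if a["type"] != b["type"]:
            continue  # cross-type similarity is expected, not redundant
        ta, tb = _normalize(a["title"]), _normalize(b["title"])
        if ta == tb:
            reason = "Identical titles"
        elif _strip_suffix(ta) == _strip_suffix(tb):
            reason = "Titles differ only by a trailing suffix"
        else:
            continue
        findings.append({"need_a": a["id"], "need_b": b["id"],
                         "similarity": "title", "reason": reason})
    return findings


needs = [
    {"id": "REQ_001", "type": "req", "title": "User login"},
    {"id": "REQ_002", "type": "req", "title": "User  login v2"},
    {"id": "SPEC_001", "type": "spec", "title": "User login"},
]
print(title_redundancies(needs))
```

In this example only the `REQ_001` / `REQ_002` pair is reported; `SPEC_001` has the same title but a different type, so it is skipped.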
+ +For each potential redundancy, record: +- Need A ID +- Need B ID +- Similarity type (title, content, structural) +- Brief reason (e.g., "Identical titles", "Same link set") + +Always flag redundancies as **informational**. Redundancy may be intentional. +Never suggest automatic resolution. + +--- + +### Step 5: Status inconsistencies + +Check for contradictions between the statuses of linked needs. These checks +apply regardless of the link type used. + +**Parent closed, child open**: +If a parent need has status matching any of `closed`, `done`, `verified`, +`approved` (case-insensitive), but a child need linked from it has status +matching any of `open`, `draft`, `in_progress`, `todo` (case-insensitive), +flag the inconsistency. The parent appears complete, but work remains on its +child. + +**Child closed, parent open**: +If all children of a need have a closed-family status, but the parent itself +has an open-family status, flag it. The parent may be ready to close. + +**Status vs. link existence**: +- A need with status `implemented` (or similar) that has no outgoing link to + any `impl`-type need is suspicious. +- A need with status `verified` or `tested` that has no outgoing link to any + `test`-type need is suspicious. +- Only flag these if the relevant need types exist in the project configuration. + +For each inconsistency, record: +- Need ID +- Current status +- Issue description +- Severity (warning) + +--- + +### Step 6: ID scheme violations + +Check that all need IDs conform to the project's expected patterns. + +**From pharaoh.toml**: +If `pharaoh.toml` defines `[pharaoh.id_scheme]` with a `pattern`, use that +pattern as the expected format. The pattern is a template string like +`"{TYPE}-{MODULE}-{NUMBER}"`. Convert it to a validation check: +- `{TYPE}` should match the need's type prefix. +- `{MODULE}` should be an uppercase alphanumeric string. +- `{NUMBER}` should be a zero-padded integer of at least `id_length` digits. 
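
The placeholder rules above could be turned into a regular expression along these lines. This is a sketch under stated assumptions: the helper name `pattern_to_regex` is hypothetical, `type_prefix` and `id_length` are passed in by the caller, and the example prefix `REQ` is illustrative rather than taken from any project configuration.

```python
import re


def pattern_to_regex(pattern, type_prefix, id_length):
    """Translate a template such as "{TYPE}-{MODULE}-{NUMBER}" into a regex."""
    placeholders = {
        "{TYPE}": re.escape(type_prefix),
        "{MODULE}": r"[A-Z0-9]+",              # uppercase alphanumeric
        "{NUMBER}": r"[0-9]{%d,}" % id_length,  # at least id_length digits
    }
    # Split on the placeholders (keeping them) and escape the literal text between.
    parts = re.split(r"(\{TYPE\}|\{MODULE\}|\{NUMBER\})", pattern)
    body = "".join(placeholders.get(p, re.escape(p)) for p in parts)
    return re.compile(body)


rx = pattern_to_regex("{TYPE}-{MODULE}-{NUMBER}", "REQ", 3)
print(bool(rx.fullmatch("REQ-AUTH-001")))  # True
print(bool(rx.fullmatch("REQ-auth-1")))    # False
```

Using `fullmatch` means the whole ID has to conform, so the pattern does not need explicit anchors.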
+ +**From ubproject.toml**: +If no pharaoh.toml pattern exists, use the type prefixes from ubproject.toml. +Each need's ID should start with its type's `prefix` value (e.g., `REQ_` for +`req` type needs). + +**Duplicate ID check**: +Scan all need IDs across all files. Report any ID that appears more than once. +Include the file paths and line numbers of each occurrence. + +**ID format consistency**: +Even without an explicit pattern, check that all IDs of the same type follow a +consistent format. If most `req` IDs are `REQ_NNN` but one is `req-42`, flag +the outlier. + +For each violation, record: +- Need ID +- Expected pattern +- Issue description +- File path and line number +- Severity (error for duplicates, warning for format issues) + +--- + +### Step 7: Schema validation (if ubc CLI is available) + +If the data access tier is ubc CLI (Tier 1): + +1. Run `ubc schema validate` from the project root. Parse the output for + validation errors and warnings. +2. Run `ubc check` from the project root. Parse the output for lint findings. +3. Include all results in the report under a dedicated section. +4. Merge any findings that overlap with Steps 2-6 to avoid duplicate reporting. + If ubc reports the same gap or orphan that the file-based analysis found, + keep only one entry and note the source. + +If ubc CLI is not available, skip this step and note in the report: +``` +Schema validation: Skipped (ubc CLI not available) +``` + +--- + +### Step 8: Present MECE report + +Compile all findings into a single structured report. Use the format below +exactly. Omit sections that have no findings (but mention "None found" in the +summary counts). 
+ +``` +## MECE Analysis Report + +### Project +- Name: <project name> +- Data source: <tier> +- Needs analyzed: <total count> +- Types: <list> +- Required chains: <list of rules> + +### Gaps (Missing Coverage) + +| Source | Type | Missing | Required By | +|--------|------|---------|-------------| +| <id> | <type> | No <target_type> | <rule> | +| ... | ... | ... | ... | + +### Orphans + +| Need ID | Type | Title | Issue | Severity | +|---------|------|-------|-------|----------| +| <id> | <type> | <title> | <description> | error/warning/info | +| ... | ... | ... | ... | ... | + +### Potential Redundancies + +| Need A | Need B | Similarity | Reason | +|--------|--------|------------|--------| +| <id> | <id> | Title/Content/Structural | <brief reason> | +| ... | ... | ... | ... | + +### Status Inconsistencies + +| Need ID | Status | Issue | Severity | +|---------|--------|-------|----------| +| <id> | <status> | <description> | warning | +| ... | ... | ... | ... | + +### ID Violations + +| Need ID | Expected Pattern | Issue | Severity | +|---------|-----------------|-------|----------| +| <id> | <pattern> | <description> | error/warning | +| ... | ... | ... | ... | + +### Schema Validation + +<ubc results or "Skipped (ubc CLI not available)"> + +### Summary + +- Gaps found: <N> (errors) +- Orphans: <N> (<X> errors, <Y> warnings, <Z> info) +- Redundancies: <N> (info) +- Status issues: <N> (warnings) +- ID violations: <N> (<X> errors, <Y> warnings) +- Schema issues: <N or "skipped"> +- Overall health: <good / needs-attention / critical> +``` + +**Overall health classification**: + +- **good**: Zero errors across all categories. Warnings and info items are + acceptable. +- **needs-attention**: One or more errors exist, but total error count is 5 or + fewer. The project has issues that should be addressed but is not + fundamentally broken. +- **critical**: More than 5 errors, or any category has more errors than valid + needs of that type. 
The traceability structure has significant problems. + +--- + +### Step 9: Update session state + +After presenting the report, update the session state file +(`.pharaoh/session.json`) as described in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md): + +1. Read the current `.pharaoh/session.json` (or create the default structure + if it does not exist). +2. Set `global.mece_checked` to `true`. +3. Set `global.mece_timestamp` to the current ISO 8601 timestamp. +4. Set `updated` to the current ISO 8601 timestamp. +5. Write the file back. + +This records that MECE analysis was performed, which satisfies the +`require_mece_on_release` gate if `pharaoh.toml` has it enabled. + +--- + +## 3. Strictness Behavior + +Follow the instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md) for all strictness +decisions. + +### pharaoh:mece has no prerequisites + +This skill has no gates. It executes freely in both advisory and enforcing +mode. There is no prerequisite skill that must run before MECE analysis. + +### pharaoh:mece as a prerequisite for others + +When `pharaoh.toml` contains: + +```toml +[pharaoh.workflow] +require_mece_on_release = true +``` + +then `pharaoh:release` requires a passing MECE check before it can proceed +(in enforcing mode). The session state field `global.mece_checked` must be +`true`. + +In advisory mode, if `require_mece_on_release = true` and MECE has not been +run, `pharaoh:release` shows: + +``` +Tip: Consider running pharaoh:mece to check for gaps before release. +``` + +### Re-running after changes + +If the user modifies needs after a MECE check, the session state is NOT +automatically invalidated. The recorded `mece_checked = true` persists until +the session is reset. This is by design -- the user decides when to re-run. 
+ +If you observe that needs were modified since the last MECE timestamp (by +comparing file modification times to `global.mece_timestamp`), mention this +in your output: + +``` +Note: Needs files were modified after the last MECE check +(<timestamp>). Consider re-running pharaoh:mece for an up-to-date analysis. +``` + +--- + +## 4. Scope Options + +The user may request a scoped analysis instead of a full project scan. Support +the following scope modifiers. If no scope is specified, default to full project +analysis. + +### Full project (default) + +Analyze all needs across all files in all detected project roots. This is the +default when the user invokes `pharaoh:mece` with no arguments. + +### Single file or directory + +When the user specifies a file path or directory: + +- Restrict the needs index to needs found in the specified file or directory. +- Still load the **full** link graph (all needs) so that link targets outside + the scope can be resolved. A need in `auth/requirements.rst` may link to a + need in `shared/types.rst` -- that link must still be validated. +- Report only findings for needs within the scope. Do not report issues for + needs outside the scope, even if they are linked to scoped needs. +- Note the scope in the report header: + ``` + Scope: auth/requirements.rst (12 of 47 needs) + ``` + +### Specific need type + +When the user specifies a type (e.g., "check only specs" or "mece for +requirements"): + +- Restrict analysis to needs of the specified type. +- Gap analysis: Only check rules where the specified type is the source type. +- Orphan detection: Only report orphans of the specified type. +- Redundancy analysis: Only compare needs of the specified type (this is + already the default behavior since redundancy only checks within a type). +- Status checks and ID checks: Only for the specified type. 
+- Note the scope in the report header: + ``` + Scope: type=spec (15 of 47 needs) + ``` + +### Specific traceability level + +When the user specifies a level of the chain (e.g., "check the spec -> impl +link" or "mece for the implementation level"): + +- Run gap analysis only for the specified rule or rules involving the + specified type. +- Run orphan detection for types involved in the specified rule. +- Skip redundancy analysis, status checks, and ID checks (these are + type-level concerns, not chain-level). +- Note the scope in the report header: + ``` + Scope: chain spec -> impl (15 specs, 22 impls) + ``` + +### Combining scopes + +Scopes can be combined. For example, "check specs in auth/" applies both the +type filter and the directory filter. Apply all filters as a logical AND -- +a need must match all specified criteria to be in scope. diff --git a/.github/agents/pharaoh.output-validate.agent.md b/.github/agents/pharaoh.output-validate.agent.md index bfa1914..3d91c48 100644 --- a/.github/agents/pharaoh.output-validate.agent.md +++ b/.github/agents/pharaoh.output-validate.agent.md @@ -7,4 +7,235 @@ handoffs: [] Use when `pharaoh-execute-plan` (or any caller) has dispatched a subagent whose output must match one of the documented schemas (RST directive, sphinx-codelinks one-line comment, YAML mapping, JSON object). Returns {valid, errors, parsed, recovery}. Callers gate subagent output through this before writing anything to disk. -See [`skills/pharaoh-output-validate/SKILL.md`](../../skills/pharaoh-output-validate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-output-validate + +## When to use + +Invoke from `pharaoh-execute-plan` after each dispatched task returns, to check that the task's raw output matches the emitting skill's declared `## Output schema` section. Also invoke directly from any other skill or human checking emitted content. 
Reject output that fails validation; optionally retry with stricter prompt; never write drifted output to disk. + +Do NOT use to generate output (that is the emitting skill). Do NOT use to parse output that already passed validation (the `parsed` field carries the structured form for you). + +## Atomicity + +- (a) Indivisible — one target description in → one validation result out. The atom has a single responsibility: **"validate required fields for this artefact type are present and well-formed."** Two input shapes are exposed via the `mode` input: + - `mode: "block"` (default; backward-compatible): one output string + one target schema + schema context. Validates one directive block against a declared schema. + - `mode: "graph"`: one `needs.json` + one `artefact-catalog.yaml` path. Validates every need's tailored `required_metadata_fields` across the full graph. + + The mode toggle selects the input shape — it does NOT add a second responsibility. In both modes the atom asks the same question per-artefact ("does this need carry the required fields declared for its type?") and returns the same verdict axis. No mutation of inputs. No re-dispatch. No logging beyond the structured return. +- (b) Input: + - block mode: `{mode?: "block", output_text: str, target_schema: "rst_directive"|"codelinks_comment"|"yaml_map"|"json_obj", schema_context: dict, strip_fences?: bool}`. `schema_context` fields vary per `target_schema`; documented in `## Schema context`. + - graph mode: `{mode: "graph", needs_json_path: str, artefact_catalog_path: str}`. + + Output (shared shape; `parsed` / `recovery` are block-mode-only): + - block mode: `{valid: bool, errors: list[str], parsed: object|null, recovery: {stripped_text: str|null}}`. + - graph mode: `{valid: bool, errors: list[str], needs_checked: int, violations: [{need_id, type, missing_fields: [str]}]}`. 
+- (c) Reward: block-mode fixtures in `pharaoh-validation/fixtures/pharaoh-output-validate/` + graph-mode fixtures in `skills/pharaoh-output-validate/fixtures/`. + + Block-mode (4 fixtures): + 1. `sample_clean.rst` with `target_schema="rst_directive"`, `schema_context={directive: "feat", required_options: ["id", "status", "source_doc"]}` → `valid=true`, `parsed` contains one block with the expected fields. + 2. `sample_fenced.md` with the same schema → `valid=false` without `strip_fences`; with `strip_fences=true` → `valid=true` and `recovery.stripped_text` set. + 3. `sample_prose_wrapped.rst` → `valid=false` regardless of `strip_fences` (prose is not a fence). Errors name the surrounding prose. + 4. `sample_typo_option.rst` with `schema_context={directive: "comp_req", required_options: ["id", "status"], allowed_options: ["id", "status", "satisfies"]}` → `valid=false`. Errors name `subsatisfies` as unknown. + + Graph-mode (3 fixtures in `skills/pharaoh-output-validate/fixtures/`): + 5. `graph-all-metadata-present/` — catalog declares `required_metadata_fields` for each type; every need carries every field non-empty → `valid=true`, empty `violations`. + 6. `graph-missing-tags/` — catalog declares `:tags:` required for `comp_req`; several `comp_req` needs lack `tags` → `valid=false`, `violations` lists the offenders with `missing_fields: ["tags"]`. + 7. `graph-empty-required-list/` — catalog declares `required_metadata_fields: []` (or omits the key) for a type; that type's needs carry no metadata → `valid=true` (nothing to check). + + Pass = all 7 produce the stated result. +- (d) Reusable: any composition skill that dispatches emission subagents needs this. +- (e) Composable: this skill never calls emission skills back. It is purely a parser. + +## Input + +- `mode` (optional, default `"block"`): one of `"block"` or `"graph"`. Selects the input shape and processing branch. Existing callers that omit `mode` get block mode — fully backward-compatible. 
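
As a rough illustration of the `mode` default, the sketch below resolves the mode for an incoming request and checks that the fields of the matching input shape are present. The function name `resolve_mode`, the `REQUIRED_FIELDS` table, and the use of `ValueError` to signal a caller error are assumptions made for this example; the field names themselves come from the input shapes listed under Atomicity (b).

```python
# Required top-level fields per mode, taken from the input shapes above.
REQUIRED_FIELDS = {
    "block": ("output_text", "target_schema", "schema_context"),
    "graph": ("needs_json_path", "artefact_catalog_path"),
}


def resolve_mode(request: dict) -> str:
    """Return the effective mode, defaulting to "block" when omitted."""
    mode = request.get("mode", "block")
    if mode not in REQUIRED_FIELDS:
        # Assumption: a caller error is surfaced as an exception.
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [key for key in REQUIRED_FIELDS[mode] if key not in request]
    if missing:
        raise ValueError(f"{mode} mode is missing fields: {missing}")
    return mode
```

A request that predates graph mode carries no `mode` key and still resolves to `"block"`, which is what keeps existing callers working.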
+ +### Block-mode fields (used when `mode == "block"`) + +- `output_text`: the raw text the subagent returned. May include prefixes like `# emit=rst` from `pharaoh-req-from-code` — the validator strips documented prefixes before parsing. +- `target_schema`: one of: + - `"rst_directive"` — expect one or more RST directive blocks per `pharaoh-req-from-code`'s Output schema Stage 1 / Stage 2 regex. + - `"codelinks_comment"` — expect one or more sphinx-codelinks one-line comments parseable by the tailored `oneline_comment_style`. + - `"yaml_map"` — expect a YAML document with a specific top-level key shape. + - `"json_obj"` — expect a JSON object with specific required keys. +- `schema_context`: schema-specific context. See `## Schema context`. +- `strip_fences` (optional, default `false`): if `true`, one automatic recovery attempt strips a leading/trailing triple-backtick fence (with optional language hint) before re-validating. + +### Graph-mode fields (used when `mode == "graph"`) + +- `needs_json_path`: absolute path to the built sphinx-needs corpus `needs.json`. Accepts either the flat `{"needs": {<id>: {...}, ...}}` shape or the versioned `{"versions": {"<v>": {"needs": {...}}}}` shape (uses `current_version` if declared, else the latest key). +- `artefact_catalog_path`: absolute path to `.pharaoh/project/artefact-catalog.yaml`. Each top-level key is a need `type`; the validator reads `required_metadata_fields: [<field_name>, ...]` per type. Empty list → no metadata check for that type. Absent key → treated as empty (no check, not an error). + +## Schema context + +Per `target_schema`: + +- `"rst_directive"`: `{directive: str, required_options: list[str], allowed_options?: list[str], parent_ids?: list[str]}`. `allowed_options` extends the built-in sphinx-needs options + `source_doc` Pharaoh convention. If `parent_ids` is non-empty, the validator checks that `satisfies` (or tailored link name) is present and lists every id. 
+- `"codelinks_comment"`: `{oneline_style: {start_sequence: str, field_split_char: str, needs_fields: list[dict]}}` — exact shape of `[codelinks.projects.<name>.analyse.oneline_comment_style]`. +- `"yaml_map"`: `{required_top_level_key: str, required_sub_keys: list[str], allowed_sub_keys: list[str]}`. +- `"json_obj"`: `{required_keys: list[str], allowed_unknown_keys: bool}`. + +## Output + +### Block mode + +```json +{ + "valid": true, + "errors": [], + "parsed": [ + { + "directive": "feat", + "title": "CSV Export", + "options": {"id": "FEAT_csv_export", "status": "draft", "source_doc": "features/csv.rst"}, + "body": "The system shall export sphinx-needs data to CSV files." + } + ], + "recovery": {"stripped_text": null} +} +``` + +On `valid=false`, `parsed` is `null`. `errors` is a list of human-readable strings naming each violation with line numbers where possible. + +### Graph mode + +```json +{ + "valid": false, + "errors": [], + "needs_checked": 44, + "violations": [ + {"need_id": "comp_req__auth_login", "type": "comp_req", "missing_fields": ["tags"]}, + {"need_id": "comp_req__auth_logout", "type": "comp_req", "missing_fields": ["tags", "priority"]} + ] +} +``` + +`valid` is `true` iff `violations` is empty. `needs_checked` counts every need read from `needs.json` (including ones whose type has no `required_metadata_fields` declared — they are still counted). `violations` is sorted by `need_id` ascending for deterministic fixture comparison. `errors` is reserved for structural problems (missing / unparseable input files) and is disjoint from `violations`: an error short-circuits with `valid: false`, empty `violations`, and `needs_checked: 0`. + +## Recovery modes + +Strict by default. One automatic recovery when `strip_fences=true`: + +- If `output_text` starts with a triple-backtick fence (optionally with language hint) and ends with closing fence, strip fences and re-validate. If re-validation passes, return `valid=true` with `recovery.stripped_text` set. 
If it still fails, return `valid=false` with both original and stripped errors. + +The validator never silently recovers from prose wrapping or option typos — those are always `valid=false`. The caller decides whether to re-dispatch the subagent or fail. + +## Process + +If `mode == "graph"`, skip directly to `## Graph mode` below. The steps in this section apply to block mode only. + +### Step 1: Strip emit-header prefix if present + +If `output_text` starts with `# emit=rst\n` or `# emit=codelinks_comment\n`, remove that line. Record what was stripped (for error messages). + +### Step 2: Handle fence recovery (if `strip_fences=true`) + +If `output_text` (after emit-header strip) matches `^```[a-z]*\n(.+?)\n```\s*$` (with `re.DOTALL`), capture the inner content. Validate the inner content as if it were the original. If it validates, return `valid=true` with `recovery.stripped_text` set. If it does not, fall through to validate the original and include both error sets. + +### Step 3: Dispatch to schema-specific parser + +Per `target_schema`, apply the parser: + +- `rst_directive`: Stage 1 + Stage 2 regex from `pharaoh-req-from-code` `## Output schema`. Iterate blocks, enumerate options per block. +- `codelinks_comment`: invoke sphinx-codelinks' own `oneline_parser.parse_line()` per line. +- `yaml_map`: `yaml.safe_load`, check shape. +- `json_obj`: `json.loads`, check keys. + +### Step 4: Apply schema-specific checks + +Per `target_schema` and `schema_context`: + +- `rst_directive`: directive equals `directive`; every `required_options` present; no option outside `allowed_options ∪ {required_options}`; if `parent_ids` given, `satisfies` value contains each; no non-blank content after last block. +- `codelinks_comment`: `parse_line()` returns a dict with every `needs_fields[].name` populated (or default applied). 
+- `yaml_map`: exactly one top-level key equal to `required_top_level_key`; sub-keys include every `required_sub_keys`; no sub-key outside `allowed_sub_keys ∪ required_sub_keys`. +- `json_obj`: every `required_keys` present; if `allowed_unknown_keys` is `false`, no unknown keys. + +### Step 5: Return + +```json +{"valid": true|false, "errors": [...], "parsed": ..., "recovery": {"stripped_text": ...}} +``` + +## Graph mode + +Graph mode validates the tailored `required_metadata_fields` across every need in `needs.json`. It is the delegated check for the `metadata_fields_present` invariant in `pharaoh-quality-gate`. + +### Process + +1. **Load.** Parse `artefact_catalog_path` via `yaml.safe_load` into `{type: {required_metadata_fields: [str]}}`. Parse `needs_json_path` via `json.load` and extract the needs map (handle flat `needs` key or versioned `versions` shape). On either parse failure or missing file, return `{valid: false, errors: ["<message>"], needs_checked: 0, violations: []}`. +2. **Resolve per-type required-field lists.** For each type `T` present in `needs.json`, look up `catalog[T].required_metadata_fields`. Absent type or absent key → treat as `[]` (no check for that type; this is not an error). Empty list → no check for that type. +3. **Iterate needs.** For each need `N`: + - Let `required = catalog[N.type].required_metadata_fields` (resolved per step 2; defaults to `[]`). + - For each `field` in `required`, check the need dict. The field counts as **present and non-empty** when `field` is a key on the need AND the value is neither `None`, `""`, nor `[]`. + - Collect all missing/empty field names for this need into `missing_fields`. + - If `missing_fields` is non-empty, append `{need_id: N.id, type: N.type, missing_fields: <sorted>}` to `violations`. +4. **Aggregate.** Sort `violations` by `need_id` ascending for deterministic output. Set `valid = len(violations) == 0`. 
+
+### Detection rule (reference)
+
+```python
+import json
+
+import yaml
+
+with open(artefact_catalog_path) as fh:
+    catalog = yaml.safe_load(fh) or {}
+with open(needs_json_path) as fh:
+    nj = json.load(fh)
+
+# Flat shape: top-level "needs". Versioned shape: prefer "current_version",
+# else fall back to the last key in "versions" (taken here as "latest").
+needs = nj.get("needs")
+if needs is None:
+    versions = nj.get("versions") or {}
+    version = nj.get("current_version")
+    if version not in versions:
+        version = next(reversed(versions), None)
+    needs = (versions.get(version) or {}).get("needs") or {}
+
+violations = []
+for nid, n in needs.items():
+    t = n.get("type")
+    required = (catalog.get(t) or {}).get("required_metadata_fields") or []
+    # n.get(f) is None for absent keys, so this covers missing and empty fields.
+    missing = [f for f in required if n.get(f) in (None, "", [])]
+    if missing:
+        violations.append({"need_id": nid, "type": t, "missing_fields": sorted(missing)})
+
+violations.sort(key=lambda v: v["need_id"])
+result = {
+    "valid": len(violations) == 0,
+    "errors": [],
+    "needs_checked": len(needs),
+    "violations": violations,
+}
+```
+
+### Tailoring extension point
+
+The full policy lives in `artefact-catalog.yaml`. Each type declares its own `required_metadata_fields` independently:
+
+```yaml
+comp_req:
+  required_metadata_fields: [tags, priority]
+feat:
+  required_metadata_fields: [tags]
+tc:
+  required_metadata_fields: []   # explicitly no check
+gd_req:
+  # required_metadata_fields omitted   # treated as empty, no check
+```
+
+No hardcoded field names in the base skill. Projects that do not care about metadata completeness either set empty lists or omit the key; either way, graph mode returns `valid: true` with no violations.
+
+## Failure modes
+
+Block mode:
+- `output_text` empty → `valid=false`, errors=["empty output"].
+- `target_schema` unknown → FAIL (caller error).
+- `schema_context` missing required fields → FAIL (caller error).
+- Parser throws (malformed YAML/JSON/RST) → `valid=false`, errors=["parser exception: <message>"].
+
+Graph mode:
+- `needs_json_path` or `artefact_catalog_path` missing or unparseable → `valid=false`, `errors` names the offending path, `needs_checked=0`, empty `violations`.
+- Empty corpus (`needs` is `{}`) → `valid=true`, `needs_checked=0`, empty `violations` (vacuously true).
+- `mode` value not in `{"block", "graph"}` → FAIL (caller error). + +## Non-goals + +- No side effects — never writes files, never dispatches subagents, never retries. +- No semantic validation beyond option-name/key-name presence — e.g. does not check whether `parent_feat_id` values exist in the project; that is a downstream concern. +- No repair — output is either valid, fence-strippable, or rejected. +- Graph mode does NOT validate link-target resolution, id convention, or status lifecycle — those live in `pharaoh-link-completeness-check`, `pharaoh-id-convention-check`, and `pharaoh-status-lifecycle-check` respectively. Graph mode only checks tailored required-metadata-field presence, keeping the atom's single responsibility intact. diff --git a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md index 712d563..9617f25 100644 --- a/.github/agents/pharaoh.papyrus-non-empty-check.agent.md +++ b/.github/agents/pharaoh.papyrus-non-empty-check.agent.md @@ -7,4 +7,69 @@ handoffs: [] Use when verifying that a Papyrus workspace actually received writes during a plan run. Single mechanical check — counts directives across `.papyrus/memory/*.rst` and returns pass/fail against a configured minimum. Wired into `pharaoh-quality-gate` to detect the "LLM-executor skipped the atomic Papyrus writes" failure class observed in prior dogfooding. -See [`skills/pharaoh-papyrus-non-empty-check/SKILL.md`](../../skills/pharaoh-papyrus-non-empty-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-papyrus-non-empty-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan that declared Papyrus writes (every plan produced by `pharaoh-write-plan` where `preseed_papyrus: true`). Returns `{passed: bool, actual_count: int, required_min: int}` so the gate can decide. 
+ +Do NOT use to read or interpret memory content — that is `papyrus-query` / `pharaoh-context-gather`. This skill only counts. + +## Atomicity + +- (a) Indivisible: one workspace path + one minimum count in → one pass/fail + actual count out. No memory classification, no dedup check, no content inspection. +- (b) Input: `{workspace_path: str, required_min: int}`. Output: JSON `{passed: bool, actual_count: int, required_min: int, workspace_path: str}`. +- (c) Reward: fixtures `pharaoh-validation/fixtures/pharaoh-papyrus-non-empty-check/`: + 1. `empty-workspace/` (7 `.rst` files, only headers, 0 directives) + `required_min: 1` → matches `expected-empty-fail.json` (`passed: false, actual_count: 0`). + 2. `populated-workspace/` (facts.rst with 3 `.. fact::` directives) + `required_min: 1` → matches `expected-populated-pass.json` (`passed: true, actual_count: 3`). + 3. Missing `.papyrus/` directory under workspace_path → `passed: false, actual_count: 0, note: "no papyrus workspace"` (same shape, extra field). + 4. Idempotent: same inputs produce same output. + + Pass = all 4. +- (d) Reusable by any composition that declared Papyrus writes. +- (e) Read-only. No side effects. + +## Input + +- `workspace_path`: absolute path to a directory containing `.papyrus/memory/*.rst`. If the directory does not exist, check returns `passed: false, actual_count: 0, note: "no papyrus workspace"`. +- `required_min`: integer ≥ 0. Minimum number of directives (lines matching `^\.\.\s+[a-z_]+::`) across all `.papyrus/memory/*.rst` files summed together. 
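
The check these two inputs describe can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's implementation: the function name `papyrus_non_empty_check` is hypothetical, "pass" is taken to mean `actual_count >= required_min`, and a workspace that has `.papyrus/` but no `memory/` subdirectory is counted as zero directives rather than as a missing workspace.

```python
import re
from pathlib import Path

# Directive lines as defined for `required_min`, e.g. ".. fact::".
DIRECTIVE_RE = re.compile(r"^\.\.\s+[a-z_]+::")


def papyrus_non_empty_check(workspace_path: str, required_min: int) -> dict:
    """Count directive lines across .papyrus/memory/*.rst (hypothetical helper)."""
    result = {
        "passed": False,
        "actual_count": 0,
        "required_min": required_min,
        "workspace_path": workspace_path,
        "note": None,
    }
    papyrus_dir = Path(workspace_path) / ".papyrus"
    if not papyrus_dir.is_dir():
        result["note"] = "no papyrus workspace"
        return result

    memory_dir = papyrus_dir / "memory"
    files = sorted(memory_dir.glob("*.rst")) if memory_dir.is_dir() else []
    count = 0
    for rst_file in files:
        for line in rst_file.read_text(encoding="utf-8").splitlines():
            if DIRECTIVE_RE.match(line):
                count += 1

    result["actual_count"] = count
    # Assumption: the minimum is inclusive.
    result["passed"] = count >= required_min
    return result
```

The function only reads files and returns a dictionary, so calling it twice with the same inputs gives the same result, in line with the idempotence fixture.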
+ +## Output + +```json +{ + "passed": true, + "actual_count": 3, + "required_min": 1, + "workspace_path": "/absolute/path/to/workspace", + "note": null +} +``` + +On missing workspace: + +```json +{ + "passed": false, + "actual_count": 0, + "required_min": 1, + "workspace_path": "/absolute/path/to/workspace", + "note": "no papyrus workspace" +} +``` + +## Counting rule + +```bash +grep -rEh '^\.\.\s+[a-z_]+::' <workspace_path>/.papyrus/memory/*.rst 2>/dev/null | wc -l +``` + +An RST directive line must match `^\.\.\s+[a-z_]+::`. Header underlines (`====`) and blank lines do not count. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `papyrus_non_empty: {required_min: N}`. Never called directly by user-facing flows. diff --git a/.github/agents/pharaoh.plan.agent.md b/.github/agents/pharaoh.plan.agent.md index 027ff2b..09fe80c 100644 --- a/.github/agents/pharaoh.plan.agent.md +++ b/.github/agents/pharaoh.plan.agent.md @@ -131,3 +131,369 @@ During execution: 5. One agent invocation per task. 6. Building the plan does not modify session state. Only execution does. 7. This agent has no workflow gates and runs freely in any mode. + +--- + +## Full atomic specification + +# pharaoh-plan + +Break a set of requirement changes into ordered, actionable tasks. Each task maps +to a Pharaoh skill invocation. The plan respects `pharaoh.toml` workflow gates, +establishes task dependencies, and provides a roadmap for implementing changes +across the requirements hierarchy. + +## When to Use + +- A requirement needs to change and you want a structured sequence of steps to propagate that change through specifications, implementations, and test cases. +- A new feature is being added and you need to create needs at every level of the hierarchy with proper traceability. +- Multiple requirements are changing at once and you need to coordinate the work. 
+- You have a Change Document from `pharaoh:change` and want to turn its impact analysis into an execution plan. +- You want to ensure workflow compliance (change analysis before authoring, verification before release) without manually tracking what has been done. + +## Prerequisites + +- The workspace must contain at least one sphinx-needs project. +- No other Pharaoh skills are required before running this one. `pharaoh:plan` has no workflow gates and runs freely in both advisory and enforcing modes. + +--- + +## Process + +Execute the following steps in order. + +--- + +### Step 1: Get project data + +Follow the instructions in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md) to: + +1. Detect the project structure (find `ubproject.toml`, `conf.py`, source directories). +2. Read the project configuration (need types, extra_links, ID settings). +3. Determine the data access tier (ubc CLI > ubCode MCP > raw file parsing). +4. Build the needs index with all needs and their attributes. +5. Build the link graph with all relationships in both directions. +6. Detect sphinx-codelinks configuration. +7. Read `pharaoh.toml` for strictness level, workflow gates, and traceability requirements. + +After completing data access, present the detection summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Strictness: <advisory|enforcing> +``` + +If detection fails (no project found, no needs in source files), report the issue +and ask the user for guidance. Do not proceed with empty data. + +--- + +### Step 2: Understand the scope + +Determine what changes the user wants to make. + +**If the user provides a Change Document** (output from a previous `pharaoh:change` invocation): + +1. Parse the Change Document to extract: + - The target need ID(s) and what is changing. 
+ - The list of affected needs (downstream impacts). + - The affected files. +2. Use this as the authoritative scope. Do not re-run change analysis. + +**If the user describes the change in natural language** (e.g., "change brake response time from 100ms to 50ms"): + +1. Identify which need(s) the user is referring to. Search the needs index by title, content, and tags. +2. If the target need is ambiguous, present candidates and ask the user to choose: + ``` + Multiple matches found: + 1. REQ_001 (Requirement: Brake response time) [open] + 2. REQ_007 (Requirement: Brake pedal response) [approved] + Which need(s) are you changing? Enter numbers or IDs. + ``` +3. Once the target is confirmed, determine the scope by running the `pharaoh:change` impact analysis logic: + - Trace downstream from each target need to find all affected specifications, implementations, test cases, and code references. + - Record every affected need with its type, file, and the nature of the expected change. + +**If the user wants to add a new feature** (no existing need to change): + +1. Confirm what the feature is and which levels of the hierarchy need new needs (requirements, specifications, implementations, test cases). +2. The scope is "create new" rather than "modify existing." There are no affected needs yet, only needs to be created. +3. Determine how many needs are expected at each level based on the feature description and the project's traceability requirements (`required_links` from `pharaoh.toml`). + +**Scope summary:** + +After determining the scope, record: +- **Change type**: `modify` (changing existing needs) or `create` (adding new needs). +- **Target needs**: The need ID(s) being changed or the description of new needs to create. +- **Affected needs**: All needs that must be updated as a consequence (for `modify` type). +- **Affected files**: The source files that contain the target and affected needs. 
+- **Hierarchy levels touched**: Which need types are involved (e.g., req, spec, impl, test). + +--- + +### Step 3: Read workflow gates + +Follow the instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md) to determine which workflow gates apply. + +Read `pharaoh.toml` (or use defaults if absent): + +- `strictness`: `"advisory"` or `"enforcing"`. +- `require_change_analysis`: Whether authoring skills require a prior `pharaoh:change`. +- `require_verification`: Whether `pharaoh:release` requires a prior review skill. +- `require_mece_on_release`: Whether `pharaoh:release` requires a prior `pharaoh:mece`. + +These gates determine which tasks are mandatory in the plan versus optional. + +--- + +### Step 4: Build the task sequence + +Assemble the ordered list of tasks based on the scope and workflow gates. + +#### Task sequence for modifying existing needs + +When the change type is `modify`, use this default sequence: + +1. **Change analysis** (`pharaoh:change`): Analyze impact of the change on the target need(s). Produces a Change Document listing all affected needs. + - Skip this task if the user already provided a Change Document. + - In enforcing mode with `require_change_analysis = true`, this task is mandatory before any authoring tasks. + +2. **Author target need(s)** (authoring skill, e.g. `pharaoh:req-draft`): Modify each target need with the requested change. One task per target need. Use the skill matching the need type (req-draft for requirements, arch-draft for architecture, vplan-draft for verification plans). + +3. **Author affected specifications** (authoring skill): Update each specification that traces to a modified need. One task per affected specification. + +4. **Author affected implementations** (authoring skill): Update each implementation that traces to a modified specification. One task per affected implementation. + +5. 
**Author affected test cases** (authoring skill): Update each test case that traces to a modified implementation. One task per affected test case. + +6. **Verify all changes** (review skill, e.g. `pharaoh:req-review`): Verify that all modified needs satisfy their parents and meet traceability requirements. + - In enforcing mode with `require_verification = true`, this task is mandatory before any release task. + +7. **MECE check** (`pharaoh:mece`): Check for gaps, redundancies, or inconsistencies introduced by the changes. This task is optional by default. + - In enforcing mode with `require_mece_on_release = true`, this task is mandatory before any release task. + +8. **Release** (`pharaoh:release`): Generate a changelog entry for the changes. Include this task only if the user indicates they are preparing a release. + +**Ordering within authoring tasks**: Follow the hierarchy top-down. Modify needs in this order: requirements first, then specifications, then implementations, then test cases. This ensures that each level is updated before its children are modified, so authors can reference the updated parent content. + +#### Task sequence for adding new needs + +When the change type is `create`, use this sequence: + +1. **Author new requirement(s)** (`pharaoh:req-draft`): Create the new top-level requirement(s). One task per requirement. + +2. **Author new specifications** (authoring skill matching need type): Create specifications for each new requirement. One task per specification. + +3. **Author new implementations** (authoring skill matching need type): Create implementations for each new specification. One task per implementation. + +4. **Author new test cases** (authoring skill matching need type): Create test cases for each new implementation. One task per test case. + +5. **MECE check** (`pharaoh:mece`): Verify complete coverage across the new needs. Confirm that every `required_links` chain is satisfied. + +6. **Verify all new needs** (review skill, e.g. 
`pharaoh:req-review`): Verify that all new needs have proper links and satisfy their parents. + +**Note on hierarchy levels**: Not every project uses all four levels (req, spec, impl, test). Only include tasks for need types that exist in the project's configuration. If the project defines only `req` and `test`, the plan should include only those levels. + +#### Task sequence adjustments + +- If the scope involves both modifications and new needs, interleave them: modify existing needs first, then create new needs that fill gaps. +- If the user explicitly requests to skip a step, omit it from the plan. In enforcing mode, warn that skipping a mandatory gate may block downstream tasks. +- If `pharaoh.toml` is absent, include all steps as recommendations but mark none as mandatory. + +--- + +### Step 5: Present the plan + +Format the plan as a structured document the user can review before execution. + +**Plan format:** + +``` +## Implementation Plan + +### Scope +- Change: <description of what is changing> +- Type: <modify|create> +- Target needs: <count> +- Affected needs: <count> +- Affected files: <count> +- Strictness: <advisory|enforcing> + +### Tasks + +| # | Task | Skill | Target | Detail | File | Required | +|----|-----------------------|------------------|------------|-------------------------------------------|--------------------------|----------| +| 1 | Analyze impact | pharaoh:change | REQ_001 | Trace downstream impact of latency change | docs/requirements.rst | yes | +| 2 | Update requirement | pharaoh:req-draft / pharaoh:req-regenerate | REQ_001 | Change latency from 100ms to 50ms | docs/requirements.rst | yes | +| 3 | Update specification | pharaoh:arch-draft | SPEC_001 | Update signal timing to match new latency | docs/specifications.rst | yes | +| 4 | Update implementation | pharaoh:arch-draft | IMPL_001 | Adjust timer configuration | docs/implementations.rst | yes | +| 5 | Update test case | pharaoh:vplan-draft | TC_001 | Update expected timing in 
assertions | docs/test_cases.rst | yes | +| 6 | Verify all changes | pharaoh:req-review, pharaoh:arch-review, pharaoh:vplan-review | (all) | Run the per-type review for each updated artefact | -- | yes* | +| 7 | MECE check | pharaoh:mece | (all) | Check for gaps in modified area | -- | no | + +*Required in enforcing mode when require_verification = true. + +### Dependencies +- Task 1 must complete before Tasks 2-5 (change analysis before authoring). +- Tasks 2-5 should execute in order (top-down through hierarchy). +- Task 6 requires Tasks 2-5 to complete (verification after all authoring). +- Task 7 can run after Task 6 or independently. + +### Estimated scope +- Needs to modify: <count> +- Needs to create: <count> +- Files to touch: <count> +``` + +**Presentation rules:** + +- Number every task sequentially starting from 1. +- For each task, specify the exact skill to invoke, the target need ID, a concise description of the change, and the source file. +- Mark tasks as "Required" based on workflow gates and strictness mode: + - In enforcing mode: tasks mandated by workflow gates are marked `yes`. + - In advisory mode: all tasks are marked `recommended` instead of `yes`. No task is strictly required. +- Show dependencies explicitly so the user understands the execution order. +- If the plan has more than 10 tasks, group them by hierarchy level with subtotals. + +--- + +### Step 6: Offer execution + +After presenting the plan, ask the user how they want to proceed: + +``` +Execute this plan step by step? + +Options: + 1. Execute all tasks in sequence + 2. Execute up to task N (partial execution) + 3. Modify the plan first + 4. Save the plan and execute later +``` + +**If the user chooses to execute:** + +1. Begin with Task 1. Invoke the specified skill with the specified target and parameters. +2. After each task completes, report progress: + ``` + Task 2/7 complete: Updated REQ_001 (latency changed to 50ms). + Proceeding to Task 3: Update SPEC_001... + ``` +3. 
If a task fails or produces unexpected results, pause and report: + ``` + Task 3 encountered an issue: SPEC_001 has additional links to IMPL_002 + that were not in the original scope. + + Options: + 1. Add IMPL_002 to the plan and continue + 2. Skip SPEC_001 and continue with Task 4 + 3. Stop execution and revise the plan + ``` +4. Allow the user to pause at any point. If the user says "pause," "stop," or "wait," halt execution immediately and report the current state: + ``` + Plan paused after Task 3/7. + Completed: Tasks 1-3 + Remaining: Tasks 4-7 + + Resume with "continue" or modify the remaining tasks. + ``` +5. Update `.pharaoh/session.json` as each skill completes, following the state management rules in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md). This ensures that workflow gates are satisfied as the plan progresses. + +**If the user chooses to modify the plan:** + +1. Ask what changes they want: add tasks, remove tasks, reorder tasks, or change task details. +2. Rebuild the plan with the modifications. +3. Re-present the updated plan and offer execution again. + +**If the user chooses to save:** + +1. Present the plan in a copyable format (the table above). +2. Inform the user they can invoke individual skills manually following the plan order. +3. Note which tasks are required by workflow gates if in enforcing mode. + +--- + +### Step 7: Handle edge cases during execution + +**New impacts discovered mid-execution:** + +If authoring a need reveals additional downstream impacts not captured in the original Change Document: + +1. Pause execution. +2. Report the newly discovered impacts. +3. Offer to extend the plan with additional tasks for the new impacts. +4. Resume only after the user confirms the updated plan. + +**Enforcing mode gate failures:** + +If a task cannot execute because a prerequisite gate is not satisfied: + +1. Report which gate failed and which prerequisite is missing. +2. 
Check if the missing prerequisite is a task earlier in the plan that was skipped. +3. Offer to insert or re-run the prerequisite task. +4. Do not silently skip the blocked task. + +**Conflicting changes:** + +If two tasks in the plan modify the same need (e.g., a specification is affected by changes to two different requirements): + +1. Detect the conflict when building the plan in Step 4. +2. Merge the two tasks into a single authoring task that addresses both changes. +3. Note the merge in the plan: `Update SPEC_001 (affected by REQ_001 and REQ_003)`. + +--- + +## Strictness Behavior + +### Advisory mode + +- The plan includes all recommended tasks in the proper order. +- No task is marked as strictly required. +- The user can skip any task during execution. +- After skipping a recommended task, show a tip: + ``` + Tip: Skipping change analysis. Consider running pharaoh:change later + to document the impact of these modifications. + ``` +- Do not block execution for any reason. + +### Enforcing mode + +- Tasks mandated by workflow gates are marked as required in the plan. +- During execution, if the user attempts to skip a required task, block with a clear message: + ``` + Blocked: Task 1 (change analysis) is required before authoring tasks + can execute. This is enforced by pharaoh.toml: + [pharaoh.workflow] + require_change_analysis = true + + Run the change analysis first, or switch to advisory mode in pharaoh.toml. + ``` +- The plan must include change analysis before any authoring tasks when `require_change_analysis = true`. +- The plan must include verification before any release task when `require_verification = true`. +- The plan must include a MECE check before any release task when `require_mece_on_release = true`. +- If a required task fails, execution halts. The user must resolve the failure before continuing. + +--- + +## Key Constraints + +1. 
**Keep plans concrete.** Every task must specify which need to modify or create, what the change is, and which file is involved. Vague tasks like "update related specs" are not acceptable. Name each need explicitly. + +2. **Never auto-execute without user consent.** Always present the plan and wait for explicit confirmation before invoking any skill. This applies even if the plan has only one task. + +3. **Allow plan modification before and during execution.** The user can add, remove, reorder, or change tasks at any point. Re-present the modified plan before resuming execution. + +4. **Respect the hierarchy.** Author needs top-down: requirements before specifications, specifications before implementations, implementations before test cases. This ensures parent content is finalized before children are updated. + +5. **One skill invocation per task.** Each task in the plan maps to exactly one Pharaoh skill call. Do not combine multiple skill invocations into a single task. + +6. **Handle partial execution gracefully.** If execution is paused or interrupted, the plan state (which tasks are complete, which remain) must be clear to the user. Completed tasks should already be reflected in `.pharaoh/session.json`. + +7. **No session state changes from planning alone.** Building and presenting the plan does not modify `.pharaoh/session.json`. State is only updated when tasks are actually executed. + +8. **No workflow gates on this skill.** As noted in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md), `pharaoh:plan` has no prerequisites and executes freely in both advisory and enforcing modes. 
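
Constraint 4 (top-down authoring) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the `order_tasks` helper, the `level` key on each task, and the sample tasks are hypothetical and not part of Pharaoh. Levels that the project does not configure are dropped, in line with the earlier note on hierarchy levels.

```python
# Hypothetical helper: orders authoring tasks top-down through the hierarchy.
# The four level names are the ones this document mentions (req, spec, impl, test).
HIERARCHY = ["req", "spec", "impl", "test"]


def order_tasks(tasks, configured_levels):
    """Return tasks sorted top-down, keeping only levels the project defines.

    `tasks` is a list of dicts with an assumed `level` key. Tasks on the same
    level keep their original relative order because sorted() is stable.
    """
    active = [level for level in HIERARCHY if level in configured_levels]
    rank = {level: index for index, level in enumerate(active)}
    kept = [task for task in tasks if task["level"] in rank]
    return sorted(kept, key=lambda task: rank[task["level"]])


sample_tasks = [
    {"target": "TC_001", "level": "test"},
    {"target": "REQ_001", "level": "req"},
    {"target": "SPEC_001", "level": "spec"},
]

# A project that defines only `req` and `test` gets a two-level plan.
ordered = order_tasks(sample_tasks, configured_levels={"req", "test"})
print([task["target"] for task in ordered])  # ['REQ_001', 'TC_001']
```

Using a stable sort keeps the user's own ordering within a level, which matters when the plan has been reordered by hand before execution.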
diff --git a/.github/agents/pharaoh.process-audit.agent.md b/.github/agents/pharaoh.process-audit.agent.md index 3911cc0..afb2d78 100644 --- a/.github/agents/pharaoh.process-audit.agent.md +++ b/.github/agents/pharaoh.process-audit.agent.md @@ -7,4 +7,326 @@ handoffs: [] Use when running a full-corpus audit against a sphinx-needs project. Orchestrates pharaoh-coverage-gap across all gap categories plus cross-artefact consistency checks. Emits a prioritised gap report. -See [`skills/pharaoh-process-audit/SKILL.md`](../../skills/pharaoh-process-audit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-process-audit + +## When to use + +Invoke when you want a single, prioritised gap report covering all known gap categories in a +sphinx-needs project in one operation. This skill orchestrates `pharaoh-coverage-gap` across +all 10 categories; it does not detect gaps itself. + +**Scope:** one project root → one gap report across all 10 categories. + +Do NOT invoke when you want to check a single gap category — use `pharaoh-coverage-gap` +directly. Do NOT invoke when you want indicator-level standard conformance on a single +artefact — use `pharaoh-standard-conformance`. + +> This is a compositional orchestrator. Atomicity criterion (a) does not apply: by design +> it delegates to `pharaoh-coverage-gap` for each category. Scope is bounded to +> "one project → one full-corpus gap report". + +--- + +## Inputs + +- **project_root**: path to the sphinx-needs project (must contain `.pharaoh/project/` tailoring + and a built `needs.json` under `docs/_build/needs/needs.json` or equivalent) + +--- + +## Outputs + +A single JSON document — no prose wrapper. 
Shape: + +```json +{ + "project_path": "examples/my-project", + "needs_total": 401, + "gaps": [ + { + "category": "unverified_req", + "severity": "high", + "count": 3, + "exemplars": ["gd_req__power_budget_monitoring", "gd_req__impl_complexity_analysis"], + "detection_rule": "gd_req needs with no tc__* linking to them via :verifies:" + } + ], + "summary_by_severity": { + "high": 2, + "medium": 5, + "low": 1 + }, + "recommended_next_actions": [ + "Address 3 unverified_req gaps: add :verification: fields and corresponding tc__ needs.", + "Fix 2 broken_back_link gaps before next build: link targets do not exist in needs.json." + ] +} +``` + +### Fields + +| Field | Type | Description | +|---|---|---| +| `project_path` | string | Echoes `project_root` | +| `needs_total` | integer | Total number of needs in needs.json | +| `gaps` | array | One entry per category where `match_count > 0`, sorted by severity | +| `gaps[].category` | string | Gap category name (one of 10) | +| `gaps[].severity` | `"high"` / `"medium"` / `"low"` | Highest severity_hint seen in that category | +| `gaps[].count` | integer | Total matches for that category | +| `gaps[].exemplars` | array | Up to 3 representative need IDs (highest-severity first) | +| `gaps[].detection_rule` | string | One-sentence description from `pharaoh-coverage-gap` | +| `summary_by_severity` | object | Count of categories (not needs) at each severity level | +| `recommended_next_actions` | array | Up to 5 concrete actions, highest-priority first | + +Categories with zero matches are omitted from `gaps`. + +--- + +## Process + +### Step 0: Validate inputs + +Confirm `project_root` is provided and exists. Confirm `.pharaoh/project/` directory is +present. If tailoring is missing, FAIL before running any sub-skill: + +``` +FAIL: .pharaoh/project/ not found at <project_root>/.pharaoh/project/. +Run pharaoh-tailor-detect → pharaoh-tailor-fill first. 
+``` + +Find needs.json: check `<project_root>/docs/_build/needs/needs.json`, then +`<project_root>/_build/needs/needs.json`. If not found: + +``` +FAIL: needs.json not found at expected paths under <project_root>. +Rebuild the Sphinx project first: sphinx-build docs/ docs/_build/ +``` + +Record `needs_total` from the top-level count in needs.json. + +--- + +### Step 1: Run pharaoh-coverage-gap for each category + +Invoke `pharaoh-coverage-gap` once per category, in this order: + +1. `broken_back_link` +2. `schema_violation` +3. `wrong_prefix_id` +4. `orphan_arch` +5. `unverified_req` +6. `invalid_lifecycle_transition` +7. `missing_fmea` +8. `stale_review` +9. `duplicate_req` +10. `contradictory_req_pair` + +The order is deterministic → results ordered by cheapest-to-detect first (low +`false_positive_risk` categories first; expensive NLI/embedding categories last). + +For each category, pass `project_root` and `category`. Capture the returned JSON. + +If a sub-skill returns a FAIL rather than a JSON result, record the category as: + +```json +{ + "category": "<category>", + "severity": "high", + "count": -1, + "exemplars": [], + "detection_rule": "FAILED: <fail message>" +} +``` + +Continue with remaining categories — do not abort the full audit on one category failure. + +--- + +### Step 2: Aggregate results + +For each category result: + +- If `match_count == 0`: skip (omit from `gaps`). +- If `match_count > 0`: construct a gap entry: + - `severity` = highest `severity_hint` seen across all matches in that category + - `exemplars` = first 3 `need_id` values from `matches` (already ordered high→low) + - `detection_rule` from the sub-skill result + +--- + +### Step 3: Sort and summarise + +Sort `gaps` by severity: `high` first, then `medium`, then `low`. Within each tier, preserve +the detection order from Step 1. + +Compute `summary_by_severity` by counting entries at each severity tier. 
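
Steps 2 and 3 can be sketched in a few lines. This is a minimal illustration, not the skill's implementation: the `aggregate_gaps` name is hypothetical, and the exact shape of a `pharaoh-coverage-gap` result (a dict with `category`, `match_count`, `matches`, and `detection_rule`, where each match has `need_id` and `severity_hint`) is assumed from the field names used above. Categories recorded as failed (count -1) are left out of the sketch.

```python
from collections import Counter

# Lower rank means higher severity.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def aggregate_gaps(category_results):
    """Build `gaps` and `summary_by_severity` from per-category results.

    `category_results` must already be in Step 1 detection order.
    """
    gaps = []
    for result in category_results:
        if result["match_count"] == 0:
            continue  # clean categories are omitted from `gaps`
        matches = result["matches"]
        gaps.append({
            "category": result["category"],
            # highest severity_hint seen across all matches
            "severity": min(
                (match["severity_hint"] for match in matches),
                key=SEVERITY_RANK.__getitem__,
            ),
            "count": result["match_count"],
            # first three need IDs; matches are assumed ordered high to low
            "exemplars": [match["need_id"] for match in matches[:3]],
            "detection_rule": result["detection_rule"],
        })
    # list.sort is stable, so detection order is preserved within each tier
    gaps.sort(key=lambda gap: SEVERITY_RANK[gap["severity"]])
    counts = Counter(gap["severity"] for gap in gaps)
    summary = {tier: counts[tier] for tier in SEVERITY_RANK}
    return gaps, summary


results = [
    {"category": "schema_violation", "match_count": 1, "detection_rule": "r1",
     "matches": [{"need_id": "A", "severity_hint": "medium"}]},
    {"category": "wrong_prefix_id", "match_count": 0, "detection_rule": "r2",
     "matches": []},
    {"category": "orphan_arch", "match_count": 2, "detection_rule": "r3",
     "matches": [{"need_id": "B", "severity_hint": "high"},
                 {"need_id": "C", "severity_hint": "low"}]},
]
gaps, summary = aggregate_gaps(results)
print([gap["category"] for gap in gaps])  # ['orphan_arch', 'schema_violation']
print(summary)  # {'high': 1, 'medium': 1, 'low': 0}
```

Note that `summary_by_severity` counts categories, not needs, so `orphan_arch` contributes 1 to `high` even though it has two matches.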
+ +--- + +### Step 4: Generate recommended next actions + +For the top-5 highest-severity gaps, generate one concrete action each. Each action must: +- Name the category +- State the count +- Suggest the most direct remediation (e.g. "add :verification: field", "fix link target", + "split requirement body") + +If fewer than 5 gaps exist, generate one action per gap. If zero gaps, set +`recommended_next_actions: ["No gaps detected — corpus is clean."]`. + +--- + +### Step 5: Emit JSON + +Emit the single JSON document. No prose before or after. + +--- + +## Guardrails + +**G1 — Missing tailoring** + +`.pharaoh/project/` absent → FAIL before any sub-skill runs (Step 0). + +**G2 — Missing needs.json** + +needs.json not found → FAIL before any sub-skill runs (Step 0). + +**G3 — Sub-skill failure** + +A single `pharaoh-coverage-gap` failure does not abort the audit. Record as count -1 and +continue (Step 1). At audit completion, if any categories failed, append a note: + +``` +NOTE: <n> categories failed during detection — results for those categories are incomplete. +``` + +**G4 — Corrupted needs.json** + +If needs.json is present but cannot be parsed as JSON: + +``` +FAIL: needs.json at <path> is not valid JSON. +Check for incomplete builds or file-system errors. +``` + +--- + +## Advisory chain + +After the gap report, if `gaps` is non-empty: + +- For each `high`-severity gap, append after the JSON: + ``` + Run `pharaoh-coverage-gap <project_root> <category>` for the full match list. + ``` +- For standard conformance on individual flagged artefacts, suggest `pharaoh-standard-conformance`. + +--- + +## Worked example + +**Input:** `project_root = examples/my-project` + +**Step 0:** `.pharaoh/project/` found; needs.json found with 401 needs total. 
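
The Step 0 checks exercised here, together with guardrails G1, G2, and G4, could look roughly like the sketch below. The `load_needs_json` helper is hypothetical; only the two candidate paths and the FAIL messages come from this document. How `needs_total` is read from the parsed file is not shown, because the sketch makes no assumption about the internal layout of needs.json.

```python
import json
import tempfile
from pathlib import Path

# The two locations this document says to check, in order.
CANDIDATES = ("docs/_build/needs/needs.json", "_build/needs/needs.json")


def load_needs_json(project_root):
    """Locate and parse needs.json, failing before any sub-skill would run."""
    root = Path(project_root)
    if not (root / ".pharaoh" / "project").is_dir():
        raise FileNotFoundError(
            f"FAIL: .pharaoh/project/ not found at {root}/.pharaoh/project/."
        )
    for relative in CANDIDATES:
        path = root / relative
        if path.is_file():
            try:
                return path, json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"FAIL: needs.json at {path} is not valid JSON."
                ) from exc
    raise FileNotFoundError(
        f"FAIL: needs.json not found at expected paths under {root}."
    )


# Demo against a throwaway project tree that only has the second candidate path.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / ".pharaoh" / "project").mkdir(parents=True)
    needs_dir = demo_root / "_build" / "needs"
    needs_dir.mkdir(parents=True)
    (needs_dir / "needs.json").write_text("{}", encoding="utf-8")
    found_path, data = load_needs_json(demo_root)
    found_relative = found_path.relative_to(demo_root).as_posix()

print(found_relative)  # _build/needs/needs.json
```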
+ +**Step 1 — detection results (condensed):** + +| Category | match_count | highest severity | +|---|---|---| +| broken_back_link | 2 | high | +| schema_violation | 3 | medium | +| wrong_prefix_id | 0 | — | +| orphan_arch | 1 | high | +| unverified_req | 3 | high | +| invalid_lifecycle_transition | 0 | — | +| missing_fmea | 2 | medium | +| stale_review | 4 | medium | +| duplicate_req | 1 | medium | +| contradictory_req_pair | 0 | — | + +**Step 2:** 7 categories with matches; 3 categories clean. + +**Step 3:** sorted by severity; `summary_by_severity` computed. + +**Step 4:** top-5 actions generated for high and medium gaps. + +**Step 5 output:** + +```json +{ + "project_path": "examples/my-project", + "needs_total": 401, + "gaps": [ + { + "category": "broken_back_link", + "severity": "high", + "count": 2, + "exemplars": ["arch__diag_subsystem", "tc__power_budget_001"], + "detection_rule": "needs whose :satisfies: or :verifies: target does not exist in needs.json" + }, + { + "category": "orphan_arch", + "severity": "high", + "count": 1, + "exemplars": ["arch__legacy_watchdog_module"], + "detection_rule": "arch needs with no :satisfies: link resolving to a gd_req" + }, + { + "category": "unverified_req", + "severity": "high", + "count": 3, + "exemplars": ["gd_req__power_budget_monitoring", "gd_req__impl_complexity_analysis", "gd_req__diag_log_rotation"], + "detection_rule": "gd_req needs with no tc__* linking to them via :verifies:" + }, + { + "category": "schema_violation", + "severity": "medium", + "count": 3, + "exemplars": ["gd_req__impl_complexity_analysis", "arch__diag_subsystem", "arch__power_mgmt_module"], + "detection_rule": "needs missing required_fields listed in artefact-catalog.yaml for their type" +
}, + { + "category": "missing_fmea", + "severity": "medium", + "count": 2, + "exemplars": ["gd_req__abs_pump_activation", "gd_req__wheel_speed_plausibility"], + "detection_rule": "gd_req with safety-relevant tag (ASIL-A/B/C/D) but no fmea need referencing them" + }, + { + "category": "stale_review", + "severity": "medium", + "count": 4, + "exemplars": ["gd_req__brake_pedal_response", "gd_req__ecu_reset_recovery", "arch__abs_controller"], + "detection_rule": "needs with status=inspected but no review record within the last 12 months" + }, + { + "category": "duplicate_req", + "severity": "medium", + "count": 1, + "exemplars": ["gd_req__ecu_watchdog_timeout"], + "detection_rule": "pair of gd_req bodies with cosine distance < 0.15" + } + ], + "summary_by_severity": { + "high": 3, + "medium": 4, + "low": 0 + }, + "recommended_next_actions": [ + "Fix 2 broken_back_link gaps: verify targets arch__diag_subsystem and tc__power_budget_001 exist in needs.json or correct the link values.", + "Resolve 1 orphan_arch gap: add :satisfies: link to arch__legacy_watchdog_module pointing to its parent gd_req.", + "Add verification for 3 unverified_req needs: create tc__ needs with :verifies: gd_req__power_budget_monitoring, gd_req__impl_complexity_analysis, gd_req__diag_log_rotation.", + "Fix 3 schema_violation gaps: add the required fields listed in artefact-catalog.yaml to gd_req__impl_complexity_analysis, arch__diag_subsystem, and arch__power_mgmt_module.", + "Create fmea entries for 2 ASIL-tagged requirements: gd_req__abs_pump_activation and gd_req__wheel_speed_plausibility." + ] +} +``` + +Run `pharaoh-coverage-gap examples/my-project broken_back_link` for the full match list. +Run `pharaoh-coverage-gap examples/my-project orphan_arch` for the full match list. +Run `pharaoh-coverage-gap examples/my-project unverified_req` for the full match list.
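
The three trailing lines above follow the advisory chain rule: one hint per `high`-severity gap, in report order. A small sketch of that rule, where `advisory_lines` is a hypothetical helper name and the gap entries are trimmed to the two fields the rule needs:

```python
def advisory_lines(project_root, gaps):
    """Return one follow-up hint per high-severity gap, in report order."""
    return [
        f"Run `pharaoh-coverage-gap {project_root} {gap['category']}` "
        "for the full match list."
        for gap in gaps
        if gap["severity"] == "high"
    ]


# Categories and severities taken from the worked example.
example_gaps = [
    {"category": "broken_back_link", "severity": "high"},
    {"category": "orphan_arch", "severity": "high"},
    {"category": "unverified_req", "severity": "high"},
    {"category": "duplicate_req", "severity": "medium"},
]
hints = advisory_lines("examples/my-project", example_gaps)
for line in hints:
    print(line)
```

Medium and low gaps produce no hint, so the `duplicate_req` entry is skipped.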
diff --git a/.github/agents/pharaoh.prose-migrate.agent.md b/.github/agents/pharaoh.prose-migrate.agent.md index de450d1..7f2c72b 100644 --- a/.github/agents/pharaoh.prose-migrate.agent.md +++ b/.github/agents/pharaoh.prose-migrate.agent.md @@ -7,4 +7,128 @@ handoffs: [] Use when a reverse-engineering run (a plan emitted by pharaoh-write-plan) finds pre-existing prose documentation files in the target output directory that would collide with generated feat RST files. Produces a sentence-by-sentence migration proposal — keep-as-user-guide, merge-into-feat-body, discard. Does NOT overwrite anything; the caller applies the proposal manually. -See [`skills/pharaoh-prose-migrate/SKILL.md`](../../skills/pharaoh-prose-migrate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-prose-migrate + +## When to use + +Invoke when a feature-extraction run is about to write `features/<stem>.rst` into a directory that already contains a human-authored prose file with a colliding stem (e.g. `features/reqif.rst` was written by a human as user documentation; the orchestrator is about to emit `features/reqif_export.rst` and `features/reqif_import.rst`). Without migration guidance, both files end up in the tree with no cross-reference and unclear canonicity — the exact confusion observed during dogfooding. + +Do NOT use to apply a migration (that is a future `pharaoh-prose-apply` skill). Do NOT use to generate new prose — this skill only processes existing content. + +## Atomicity + +- (a) Indivisible — one prose file + one set of emitted feats → one migration proposal. No file mutation. No deletion. No writes to the new feat RSTs. +- (b) Input: `{prose_file: str, emitted_feats: list[{id: str, title: str, body: str, source_doc: str}]}`. Output: YAML migration proposal (see Output schema). 
+- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-prose-migrate/` contains `input_reqif.rst` (a prose file describing ReqIF usage) and `expected_proposal.yaml`. Scorer: + 1. Output parses as YAML. + 2. `decisions` covers every sentence (sentence count in `decisions[*].source_sentences` sums to the total sentence count of `input_reqif.rst`). + 3. Sentences classified as `merge_into_feat_body` target an `emitted_feats[*].id`. + 4. Sentences classified as `keep_as_user_guide` have a `target_file` under `user_guide/`. + 5. Sentences classified as `discard` include a rationale naming why (boilerplate, outdated, changelog-like). + 6. `summary` totals match the `decisions` aggregation. + + Pass = all 6. +- (d) Reusable for any project with pre-existing prose in its docs/source/features/ directory. +- (e) Composable: emits only a proposal. Never mutates. Caller decides whether to apply. + +## Input + +- `prose_file`: absolute path to the existing prose RST file to migrate. +- `emitted_feats`: list of features just emitted by `pharaoh-feat-draft-from-docs` that may cover content in `prose_file`. Each entry has: + - `id`: feat ID (e.g. 
`FEAT_reqif_export`) + - `title`: short title + - `body`: one-sentence feat statement + - `source_doc`: the doc the feat was derived from (used to distinguish "this feat's source was this prose file" from "this feat came from elsewhere") + +## Output + +```yaml +source_file: <relative path> +decisions: + - type: keep_as_user_guide + target_file: user_guide/<stem>.rst + source_sentences: [<line-numbers or sentence-indices>] + content_preview: "<first 40 chars of the preserved sentences>" + rationale: <string> + - type: merge_into_feat_body + target_feat_id: FEAT_<name> + source_sentences: [<indices>] + content_preview: "<first 40 chars>" + rationale: <string> + - type: discard + source_sentences: [<indices>] + content_preview: "<first 40 chars>" + rationale: <string> +summary: + total_sentences: <int> + keep_as_user_guide: <int> + merge_into_feat: <int> + discard: <int> +``` + +## Process + +### Step 1: Read and sentence-split + +Read `prose_file`. Strip RST directives (lines starting `.. ` and their indented bodies) from the split candidates — directives are not migratable prose. On the remaining text, split into sentences via a simple splitter: + +```python +re.split(r'(?<=[.!?])\s+(?=[A-Z])', text) +``` + +Index sentences 1..N. Preserve their original text for `content_preview`. + +### Step 2: Classify each sentence + +Apply classification rules in order; first match wins: + +**discard** — matches any of: +- Contains `TODO`, `FIXME`, `XXX`, `deprecated`, `see also`, or matches a changelog pattern (`Version \d+\.\d+`, `- Added:`, `- Fixed:`). +- Is pure boilerplate: ≤ 5 words with no noun-phrase content (e.g. "More details below.", "The following sections explain this."). +- References an outdated feature that is not in `emitted_feats`. + +**keep_as_user_guide** — matches any of: +- Imperative voice addressing the user ("You can ...", "Users should ...", "To import, run ...", "Invoke the CLI with ..."). 
+- Describes CLI commands, config syntax, or step-by-step instructions. +- Contains an RST code block reference (fenced with `::` or `.. code-block::`). +- Describes usage scenarios with concrete inputs/outputs. + +Target file: `user_guide/<stem>.rst` where `<stem>` is derived from `prose_file`'s basename (e.g. `reqif.rst` → `user_guide/reqif.rst`). This keeps user documentation co-located with the feature name. + +**merge_into_feat_body** — matches any of: +- Describes what the system does in declarative voice ("The system imports ReqIF files.", "ReqIF export preserves hierarchy."). +- Overlaps semantically with one of `emitted_feats[*].title + body`. Use substring match on feat title keywords as a first pass; if multiple feats match, pick the one with highest keyword overlap. +- Target: the matching feat's `id`. + +**discard (fallthrough)** — if none of the above rules fire after three passes, classify as `discard` with rationale `"Sentence did not fit any migration category — likely boilerplate or stale."` + +### Step 3: Group consecutive same-class sentences + +Adjacent sentences with the same `type` and same `target_file`/`target_feat_id` are grouped into one `decisions` entry with `source_sentences` listing all their indices. + +### Step 4: Emit summary + +Count per-type: +- `total_sentences` = sum of all `source_sentences` lengths. +- `keep_as_user_guide`, `merge_into_feat`, `discard` = per-type counts. + +### Step 5: Return + +Return the YAML proposal. + +## Failure modes + +- `prose_file` not readable → FAIL. +- `emitted_feats` empty and no `discard`-only output possible → FAIL: `"no feats provided to merge into, and prose_file has no discardable content"`. +- All sentences fall through to `discard` → emit the proposal anyway with 100% discard. The caller likely misinvoked this skill; the output makes that clear. + +## Non-goals + +- No automatic application. 
The caller reviews the proposal, manually moves sentences, deletes the legacy prose file (or leaves it — skill does not prescribe). +- No cross-file prose migration. One prose file per invocation. +- No LLM re-writing of sentences. The proposal is sentence-by-sentence verbatim; the caller edits after applying. +- No user_guide/ directory creation. If the caller applies `keep_as_user_guide` decisions, they create the directory themselves. diff --git a/.github/agents/pharaoh.quality-gate.agent.md b/.github/agents/pharaoh.quality-gate.agent.md index d500423..5365793 100644 --- a/.github/agents/pharaoh.quality-gate.agent.md +++ b/.github/agents/pharaoh.quality-gate.agent.md @@ -7,4 +7,215 @@ handoffs: [] Use when running the final validation step of any Pharaoh composition that emits artefacts (reqs, features, architecture elements). Consumes an aggregated review+mece+coverage summary plus a gate spec; returns pass/fail with named breaches. Never produces summaries itself — thin gate layer over upstream atomic checkers. -See [`skills/pharaoh-quality-gate/SKILL.md`](../../skills/pharaoh-quality-gate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-quality-gate + +## When to use + +Invoke as the terminal task of a plan (emitted by `pharaoh-write-plan`, executed by `pharaoh-execute-plan`) after `pharaoh-req-review`, `pharaoh-mece`, and `pharaoh-coverage-gap` tasks have produced their reports. This skill aggregates their findings against configured thresholds and decides whether the run may declare itself "complete". Without this gate, a plan that emits N artefacts with zero quality checks can return success on `sphinx-build exit 0` alone — the exact failure mode observed during dogfooding. + +Do NOT use to produce the reports it consumes — that is upstream atomic skills. 
Do NOT use to halt execution — this skill returns pass/fail; the plan's `on_fail` policy or the human decides what to do with that. + +## Atomicity + +- (a) Indivisible — one artefacts summary + one gate spec + one project_root in → one pass/fail report out. No new review judgment, no need-file reads, no MECE analysis. Pure threshold check. +- (b) Input: `{artefacts_summary_path: str, gate_spec_path: str, project_root: str}`. Output: JSON `{pass: bool, breaches: list[str], report_path: str}`. +- (c) Reward: fixtures `pharaoh-validation/fixtures/pharaoh-quality-gate/`: + 1. `input_artefacts.yaml` + `gate_spec.yaml` where all thresholds pass → output `pass: true`, `breaches: []`, report written to `<project_root>/.pharaoh/quality-gate-report-<timestamp>.yaml` matching `expected_report_pass.yaml` (timestamp masked). + 2. Same input_artefacts with a gate_spec variant where `testability_fail_rate_max: 0.20` but observed is `0.25` → `pass: false`, `breaches` names that threshold, report matches `expected_report_fail.yaml`. + 3. Idempotent: same inputs produce same output content (up to timestamp). + 4. Missing `artefacts_summary_path` → FAIL. + 5. `gate_spec.invariants.self_review_coverage.enabled: true` + runs path where one artefact is missing its review → `pass: false`, `breaches` includes entry naming the specific artefact and referring to `pharaoh-self-review-coverage-check` output. Same structure for `papyrus_non_empty` and `dispatch_signal_matches_plan`. + + Gate aggregates by calling each configured invariant check as a separate delegated skill, never duplicating the check logic itself — atomicity (a) preserved. + + Pass = all 5. +- (d) Reusable by any composition skill that has upstream review/mece/coverage reports. +- (e) Composable: composition skills invoke this at end; this skill never calls composition or atomic skills back. 
+ +## Input + +- `artefacts_summary_path` (optional on plans that ran no review/mece/coverage tasks): absolute path to a YAML document produced by aggregating `pharaoh-req-review`, `pharaoh-mece`, and `pharaoh-coverage-gap` reports. Must parse via `yaml.safe_load`. Expected shape: + ```yaml + review_axis_fail_rates: + <axis_name>: <float 0..1> + ... + duplicate_rate: <float 0..1> + orphan_rate: <float 0..1> + unverified_rate: <float 0..1> + ``` +- `gate_spec_path` (optional): absolute path to a YAML document declaring thresholds. Shape: + ```yaml + thresholds: + review_axis_fail_rate_max: <float 0..1> + duplicate_rate_max: <float 0..1> + orphan_rate_max: <float 0..1> + unverified_rate_max: <float 0..1> + diagram_lint_errors_max: <int> # default 0; any error finding breaches + sampling: + method: stratified + per_feat_min: <int> + per_feat_fraction: <float 0..1> + ``` +- `diagram_lint_findings` (optional, inline): list of finding objects as produced by `pharaoh-diagram-lint`. Each entry matches the shape `{file, line, renderer, block_index, parser_exit_code, parser_stderr, severity}`. Passed by ref from the plan (e.g. `diagram_lint_findings: ${diagram_lint.findings}`), not via file path. When absent, diagram lint is assumed not run and no diagram breach is evaluated. +- `diagram_lint_status` (optional, inline): one of `"pass" | "fail" | "degraded"` as reported by `pharaoh-diagram-lint`. Used by the report's `diagram_lint` section for transparency (a `degraded` status surfaces as a warning in the report, not a breach). +- `project_root`: absolute path used to resolve the report output location (`<project_root>/.pharaoh/quality-gate-report-<timestamp>.yaml`). + +### Invariants + +Invariant checks are delegated to atomic check skills. Added to close the "skipped atomic step" class of failure observed during dogfooding, plus the structural-lint gaps (ID convention, link coverage, status lifecycle, metadata fields) surfaced by prior catalogue reviews. 
+ +```yaml +# gate_spec.yaml — invariants block +invariants: + papyrus_non_empty: + enabled: true # default true when preseed_papyrus was used; false otherwise + required_min: 1 # minimum directive count across .papyrus/memory/*.rst + dispatch_signal_matches_plan: + enabled: true # default true + self_review_coverage: + enabled: true # default true + self_review_map_path: skills/shared/self-review-map.yaml # resolved relative to pharaoh/ + id_convention_consistent: + enabled: true # default true when id-conventions.yaml exists + id_conventions_path: .pharaoh/project/id-conventions.yaml + needs_json_path: docs/_build/needs/needs.json + link_types_covered: + enabled: true # default true when artefact-catalog.yaml declares required_links + artefact_catalog_path: .pharaoh/project/artefact-catalog.yaml + needs_json_path: docs/_build/needs/needs.json + status_lifecycle_healthy: + enabled: false # default false (advisory); release pipelines override to true + workflow_path: .pharaoh/project/workflows.yaml + needs_json_path: docs/_build/needs/needs.json + enforce: true # release-gate only — binary pass/fail on zero drafts + metadata_fields_present: + enabled: true # default true when artefact-catalog.yaml declares required_metadata_fields + artefact_catalog_path: .pharaoh/project/artefact-catalog.yaml + needs_json_path: docs/_build/needs/needs.json + api_coverage_clean: + enabled: true # default true when any source file under source_doc tree is declared + needs_json_path: docs/_build/needs/needs.json + source_file: null # resolved per-file by the plan's scatter-gather; null here means "no default — template must supply" + language: auto + task_output_present: + enabled: true # default true — independent second signal against "completed but no output" tasks + report_path: .pharaoh/runs/<latest>/report.yaml + workspace_dir: .pharaoh/runs/<latest> +``` + +Every new key follows the same pattern as the existing three: a boolean `enabled` plus whatever paths the delegated check 
needs. Adding a future invariant is a config-only change to this block plus one row in the delegation table below. + +## Invariant delegation + +For every key under `gate_spec.invariants.*` where `enabled: true`, the gate invokes the correspondingly named atomic check: + +| Invariant key | Delegated skill | Pass requirement | +| ------------------------------- | ------------------------------------------ | ------------------------------------------------------------ | +| `papyrus_non_empty` | `pharaoh-papyrus-non-empty-check` | `passed == true` | +| `dispatch_signal_matches_plan` | `pharaoh-dispatch-signal-check` | `passed == true` | +| `self_review_coverage` | `pharaoh-self-review-coverage-check` | `passed == true` | +| `id_convention_consistent` | `pharaoh-id-convention-check` | `overall == "pass"` | +| `link_types_covered` | `pharaoh-link-completeness-check` | `overall == "pass"` | +| `status_lifecycle_healthy` | `pharaoh-status-lifecycle-check` | `overall == "pass"` (release-gate only; `enforce=true` is typically supplied by the release pipeline) | +| `metadata_fields_present` | `pharaoh-output-validate` (graph mode) | every need carries the tailored `required_metadata_fields` for its type (delegated atom returns `valid == true`) | +| `api_coverage_clean` | `pharaoh-api-coverage-check` | `overall ∈ {"pass", "skipped"}`; invoked per source file, aggregated pass = every behavioral file has both a citing CREQ and every raised exception class named in some CREQ, non-behavioral files are skipped | +| `task_output_present` | inline check (no delegate) — re-runs `pharaoh-execute-plan` Step 4.10 audit against `report_path` + `workspace_dir` | every task with `status: completed` in the report has a non-empty artefact or `return.json` on disk at the declared path; any `reporting_error` status fails the gate | + +Each delegated check returns either `{passed: bool, ...}` or the atom's native `{overall: "pass"|"fail", ...}` / `{valid: bool, ...}` shape. 
The gate normalises each return against the pass requirement in the table and, on failure, merges the atom's breach fields into its top-level `breaches` list under a namespaced prefix (`invariant.<invariant_key>.<field>`). This keeps the gate itself a pure aggregator — atomicity (a) is preserved because the check logic lives in the delegated skills, not here. + +`metadata_fields_present` delegates to the existing `pharaoh-output-validate` atom invoked in `mode: "graph"` (see that skill's `## Graph mode`). The tailored `required_metadata_fields` list is declared per-type in `artefact-catalog.yaml`; empty list disables the check for that type, absent key is treated as empty. No new atom is introduced for this invariant — graph mode is a second input-shape on the existing block-validator. + +The four release-gate fields backing `link_types_covered`, `metadata_fields_present`, and the `pharaoh-review-completeness` invariant (`required_links`, `optional_links`, `required_metadata_fields`, `required_roles`) are declared per-type in `artefact-catalog.yaml`. Their canonical schema lives at `schemas/artefact-catalog.schema.json` (see `schemas/README.md`); the absence of any of the three required-* keys is surfaced as a finding by `pharaoh-tailor-review` rule C6, so a project running the gate after a clean `pharaoh-tailor-review` has explicitly declared (possibly as empty arrays) what every consumer reads. + +If a delegated check is not yet implemented in the skill tree, the gate records a warning in the report but does not fail — so that adding new invariants in future is a config-only change. + +## Output + +```json +{ + "pass": false, + "breaches": [ + "review_axis 'testability' fail rate 0.25 exceeds 0.20", + "orphan_rate 0.02 exceeds 0.00" + ], + "report_path": "/abs/path/.pharaoh/quality-gate-report-2026-04-20T14:03:12Z.yaml" +} +``` + +On `pass: true`, `breaches` is `[]` but the report file is still written. 
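The normalisation described under "Invariant delegation" can be sketched roughly as follows. This is a minimal illustration, not the gate's actual code: the `PASS_RULES` mapping and the `normalise_invariant` helper are hypothetical names, only a few table rows are shown, and because this excerpt does not say which fields count as "breach fields", the sketch assumes every non-verdict field of a failing result is one.

```python
from typing import Any, Callable

# Pass requirement per invariant key, read off the delegation table.
# Only a few rows are shown; the predicates are an illustrative reading.
PASS_RULES: dict[str, Callable[[dict[str, Any]], bool]] = {
    "papyrus_non_empty": lambda r: r.get("passed") is True,
    "id_convention_consistent": lambda r: r.get("overall") == "pass",
    "metadata_fields_present": lambda r: r.get("valid") is True,
    "api_coverage_clean": lambda r: r.get("overall") in {"pass", "skipped"},
}

# Fields that carry the verdict itself rather than breach detail (assumption).
VERDICT_FIELDS = {"passed", "overall", "valid"}


def normalise_invariant(key: str, result: dict[str, Any]) -> list[str]:
    """Return namespaced breach strings; an empty list means the invariant passed."""
    if PASS_RULES[key](result):
        return []
    details = {f: v for f, v in result.items() if f not in VERDICT_FIELDS}
    if not details:
        # Failing result with no detail: still record the failure itself.
        return [f"invariant.{key}"]
    return [
        f"invariant.{key}.{field}: {value!r}"
        for field, value in sorted(details.items())
    ]
```

Keeping the per-invariant predicate in a table mirrors the "config-only change" intent: a new invariant adds one entry rather than new branching logic.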
+ +## Process + +### Step 1: Load inputs + +Read `artefacts_summary_path` (if provided) and `gate_spec_path` (if provided) via `yaml.safe_load`. If `artefacts_summary_path` is provided but the file is missing or malformed, FAIL naming the path. Same for `gate_spec_path`. When both are absent (plan did not run review/mece/coverage), the gate degrades to a diagram-lint-only pass/fail and `thresholds_evaluated` in the report will be empty for the review/mece/coverage axes. + +### Step 2: Check each threshold + +For each threshold in `gate_spec.thresholds` (if gate spec loaded): + +- `review_axis_fail_rate_max`: iterate `artefacts_summary.review_axis_fail_rates`. For each axis where observed > max, add `"review_axis '<axis>' fail rate <observed> exceeds <max>"` to breaches. +- `duplicate_rate_max`: if observed > max, add breach. +- `orphan_rate_max`: if observed > max, add breach. +- `unverified_rate_max`: if observed > max, add breach. If max is 1.00, this threshold is inactive (skip — it's a no-op). + +Sampling thresholds (`per_feat_min`, `per_feat_fraction`) are informational — they constrain upstream sampling in `pharaoh-req-review`, not checks here. Do not evaluate. + +### Step 2.5: Check diagram-lint findings (if provided) + +If `diagram_lint_findings` is non-null, count findings with `severity == "error"`. Compare against `gate_spec.thresholds.diagram_lint_errors_max` (default `0`): + +- `error_count > max` → add breach `"diagram_lint emitted <error_count> parser-error finding(s), exceeds max <max>"` followed by one sub-breach per finding of shape `"diagram_lint: <file>:L<line> (<renderer>) — <parser_stderr first 120 chars>"`. + +If `diagram_lint_status == "degraded"`, add a WARNING (not a breach) to the report: `"diagram_lint ran in degraded mode — at least one renderer CLI was missing; lint coverage is incomplete"`. Warnings surface in the report's `warnings` field but do not flip `pass` to `false`. + +### Step 3: Compute pass + +`pass = len(breaches) == 0`. 
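Steps 2 and 3 can be sketched as below. This is a hedged illustration only: the function name `check_thresholds` is hypothetical, and the observed-value keys `duplicate_rate`, `orphan_rate`, and `unverified_rate` in the summary are assumed names, since this excerpt only names `review_axis_fail_rates` explicitly. The diagram-lint check from Step 2.5 is left out, and axes are sorted only to make the output order deterministic.

```python
def check_thresholds(summary: dict, thresholds: dict) -> list[str]:
    """Collect breach strings for the rate thresholds (Step 2)."""
    breaches: list[str] = []

    # Per-axis review fail rates are compared against one shared maximum.
    axis_max = thresholds.get("review_axis_fail_rate_max")
    if axis_max is not None:
        rates = summary.get("review_axis_fail_rates", {})
        for axis, observed in sorted(rates.items()):
            if observed > axis_max:
                breaches.append(
                    f"review_axis '{axis}' fail rate {observed:.2f} "
                    f"exceeds {axis_max:.2f}"
                )

    # Scalar rates; the summary key names here are assumptions.
    for name in ("duplicate_rate", "orphan_rate", "unverified_rate"):
        limit = thresholds.get(f"{name}_max")
        observed = summary.get(name)
        if limit is None or observed is None:
            continue
        if name == "unverified_rate" and limit >= 1.0:
            continue  # a maximum of 1.00 makes this threshold inactive
        if observed > limit:
            breaches.append(f"{name} {observed:.2f} exceeds {limit:.2f}")

    return breaches


def gate_passes(breaches: list[str]) -> bool:
    """Step 3: the gate passes only when no breach was recorded."""
    return len(breaches) == 0
```

Sampling thresholds such as `per_feat_min` are deliberately not read here, matching the note that they constrain upstream sampling rather than the gate.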
+ +### Step 4: Write report + +Write a full report to `<project_root>/.pharaoh/quality-gate-report-<iso8601_timestamp>.yaml` with: + +```yaml +timestamp: <iso8601> +pass: <bool> +breaches: [...] +warnings: [...] # non-breach issues (e.g. diagram_lint degraded mode) +thresholds_evaluated: + <threshold_name>: {max: <float>, observed: <float>} + ... +diagram_lint: # omit this section if diagram_lint_findings was null + status: <"pass"|"fail"|"degraded"> + errors_count: <int> + findings: + - {file, line, renderer, block_index, parser_exit_code, parser_stderr, severity} + ... +inputs: + artefacts_summary_path: <abs_path or null> + gate_spec_path: <abs_path or null> + diagram_lint_findings_count: <int> +``` + +Create `.pharaoh/` directory if it does not exist. + +### Step 5: Return + +Return the JSON object. `report_path` is the absolute path of the file written in Step 4. + +## Failure modes + +- `artefacts_summary_path` missing or unparseable → FAIL. +- `gate_spec_path` missing or unparseable → FAIL. +- `project_root/.pharaoh/` unwritable → FAIL. + +## Non-goals + +- Does not produce review / mece / coverage reports — those are `pharaoh-req-review`, `pharaoh-mece`, `pharaoh-coverage-gap`. +- Does not DECIDE thresholds — that's the gate spec authored by the project. +- Does not HALT anything — returns pass/fail; the orchestrator decides. +- No tiered thresholds (e.g. "soft" and "hard" gates) — everything is a hard threshold. diff --git a/.github/agents/pharaoh.release.agent.md b/.github/agents/pharaoh.release.agent.md index 089e004..8f87acf 100644 --- a/.github/agents/pharaoh.release.agent.md +++ b/.github/agents/pharaoh.release.agent.md @@ -108,3 +108,571 @@ Adapt sections dynamically to the project's configured types. Omit empty section 5. Handle missing git history gracefully. 6. Never modify need directive source files. This agent is read-only for documentation. 7. Prefer `ubc diff` when available for accurate structural change detection. 
+ +--- + +## Full atomic specification + +# pharaoh-release + +Generate release artifacts from sphinx-needs changes. This skill identifies which +requirements, specifications, implementations, and test cases changed between +releases, produces structured changelogs and release notes, and computes +traceability coverage metrics suitable for safety-critical audit trails. In +enforcing mode, release is gated by prior verification (and optionally MECE +analysis). + +## When to Use + +- Preparing a release and need a changelog of requirement-level changes. +- Summarizing what needs were added, modified, or removed since the last release or tag. +- Producing traceability coverage metrics for compliance or audit documentation. +- Generating release notes that include requirement impact chains. + +## Prerequisites + +This skill has workflow gates. Follow the strictness check in Step 1 before +proceeding to the release process. + +--- + +## Process + +Execute the following steps in order. + +--- + +### Step 1: Strictness Check + +Follow the decision flow defined in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md), Section 5. + +#### 1a. Read strictness configuration + +1. Look for `pharaoh.toml` in the workspace root. +2. Read `[pharaoh]` to determine the `strictness` level (`"advisory"` or `"enforcing"`). +3. Read `[pharaoh.workflow]` for the gate settings: + - `require_verification` (default: `true`) + - `require_mece_on_release` (default: `false`) +4. If `pharaoh.toml` does not exist, treat strictness as `"advisory"`. + +#### 1b. Advisory mode + +If `strictness = "advisory"`: + +- Proceed directly to Step 2 without blocking. +- After the release output is complete (Step 6), check whether prerequisites were + skipped and show relevant tips (at most one per missing prerequisite): + +| Missing prerequisite | Tip | +|---|---| +| no review skill run | `Tip: Consider running the appropriate review skill (e.g. 
pharaoh:req-review) to validate implementations before release.` | +| `pharaoh:mece` not run | `Tip: Consider running pharaoh:mece to check for gaps before release.` | + +Do not show a tip if the corresponding workflow gate is disabled in `pharaoh.toml` +(e.g., do not show the MECE tip if `require_mece_on_release = false`). + +#### 1c. Enforcing mode + +If `strictness = "enforcing"`: + +1. Read `.pharaoh/session.json` (see [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md), Section 4). +2. Check **verification gate** (if `require_verification = true`): + - Look at the `changes` dictionary in session state. + - Every need that was authored or modified in this session must have `verified = true`. + - If any need has `verified = false` or is missing from session state, block: + ``` + Blocked: Verification required before release. + Run the appropriate review skill (e.g. pharaoh:req-review) to validate implementations first. + + Unverified needs: + - REQ_001 (authored but not verified) + - SPEC_003 (authored but not verified) + ``` +3. Check **MECE gate** (if `require_mece_on_release = true`): + - Check `global.mece_checked` in session state. + - If `mece_checked = false` or `null`, block: + ``` + Blocked: MECE analysis required before release. + Run pharaoh:mece to check for gaps first. + ``` +4. If any gate fails, stop. Do not proceed to Step 2. +5. If all gates pass, proceed to Step 2. + +#### 1d. User bypass + +If the user explicitly requests to skip a gate check (e.g., "proceed anyway" or +"skip the check"), respect the request. Log a warning: + +``` +Warning: Skipping verification gate at user request. Workflow compliance is not guaranteed. +``` + +Proceed with Step 2. Do not update session state to indicate the prerequisite was met. 
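The enforcing-mode gates from Step 1c can be sketched as follows. This is a minimal sketch under stated assumptions: `check_release_gates` is a hypothetical helper, the session layout (`changes.<need_id>.verified`, `global.mece_checked`) follows the fields named in this section, and the list of needs touched in the session is passed in explicitly because this excerpt does not say how it is derived.

```python
def check_release_gates(
    session: dict, workflow: dict, touched_ids: list[str]
) -> list[str]:
    """Return one block message per failed gate; an empty list means proceed."""
    blocks: list[str] = []
    changes = session.get("changes") or {}

    # Verification gate (defaults to enabled).
    if workflow.get("require_verification", True):
        unverified = [
            need_id
            for need_id in touched_ids
            # A need missing from session state counts as unverified.
            if (changes.get(need_id) or {}).get("verified") is not True
        ]
        if unverified:
            blocks.append(
                "Blocked: Verification required before release. "
                "Unverified needs: " + ", ".join(unverified)
            )

    # MECE gate (defaults to disabled); false and null both block.
    if workflow.get("require_mece_on_release", False):
        if (session.get("global") or {}).get("mece_checked") is not True:
            blocks.append("Blocked: MECE analysis required before release.")

    return blocks
```

A user bypass (Step 1d) would sit outside this function: the caller logs the warning and continues without changing the session state that the function reads.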
+ +--- + +### Step 2: Get Project Data + +Follow the instructions in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md) to: + +1. Detect the project structure (find `ubproject.toml`, `conf.py`, source directories). +2. Read the project configuration (need types, extra_links, ID settings). +3. Determine the data access tier (ubc CLI > ubCode MCP > raw file parsing). +4. Build the needs index with all needs and their attributes. +5. Build the link graph with all relationships in both directions. +6. Detect sphinx-codelinks configuration. +7. Read `pharaoh.toml` for traceability requirements (`required_links`). + +After completing data access, present the detection summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Strictness: <advisory|enforcing> +``` + +If detection fails (no project found, no needs in source files), report the issue +and ask the user for guidance. Do not proceed with empty data. + +--- + +### Step 3: Determine Release Scope + +Identify which version this release covers and what to compare against. + +#### 3a. Get version identifier from user + +Ask the user for the release version if not already provided: + +``` +What version is this release? + - Enter a version string (e.g., v1.2.0) + - Or type "since last release" to auto-detect from the latest git tag +``` + +If the user provides a version string, record it as the release version. + +#### 3b. Find the comparison baseline + +Determine the baseline to compare against: + +**If the user said "since last release" or provided no baseline:** + +1. Run `git tag --sort=-v:refname` to list tags in reverse version order. +2. Take the most recent tag as the baseline. +3. If no tags exist, use the initial commit as the baseline. +4. Confirm with the user: + ``` + Comparing against: <tag> (<date>) + Proceed? 
[yes/no] + ``` + +**If the user provided a specific baseline** (a tag, branch, or commit): + +1. Verify the reference exists with `git rev-parse <ref>`. +2. If it does not exist, report the error and ask for a valid reference. + +#### 3c. Determine changed files + +Use git to find which documentation files changed between the baseline and HEAD: + +``` +git diff --name-only <baseline>..HEAD -- <source_directories> +``` + +Filter to only RST and MD files (the files that can contain need directives). +Record the list of changed files. + +If `ubc diff` is available (the ubc CLI was detected and supports the `diff` +subcommand), prefer it for structural diff: + +``` +ubc diff <baseline>..HEAD --format json +``` + +This provides a precise need-level diff rather than file-level, which is more +accurate for detecting added, modified, and removed needs. + +--- + +### Step 4: Identify Changed Needs + +Categorize every need change into one of three buckets: new, modified, or removed. + +#### 4a. Using ubc diff (Tier 1) + +If `ubc diff` is available and returned JSON output in Step 3c, parse it directly. +The output provides need-level changes with before/after states. Map each entry to +the appropriate bucket. + +#### 4b. Using git diff with raw parsing (fallback) + +If `ubc diff` is not available, analyze changes manually: + +**Find new needs:** + +1. For each changed file, run `git diff <baseline>..HEAD -- <file>`. +2. In the diff output, look for added lines (prefixed with `+`) that contain + need directives: `.. <type>::` where `<type>` is one of the configured + directive names. +3. For each added directive, parse the full need (title, ID, options) from the + current version of the file. +4. Record as a new need. + +**Find removed needs:** + +1. In the diff output, look for removed lines (prefixed with `-`) that contain + need directives. +2. Parse the full need from the old version (use `git show <baseline>:<file>` + to read the old content). +3. 
Verify the need ID no longer exists in the current needs index. +4. Record as a removed need. + +**Find modified needs:** + +1. For each need ID that appears in both the baseline and current versions of + changed files, compare the need's attributes: + - Title changed + - Status changed + - Content/body changed + - Options changed (tags, links, custom options) + - Link targets changed +2. For each attribute that differs, record the old and new values. +3. Record as a modified need with a list of changed attributes. + +#### 4c. Build the change summary + +Organize all changes into a structured summary: + +``` +Changes detected: + New needs: <count> (<breakdown by type>) + Modified needs: <count> (<breakdown by type>) + Removed needs: <count> (<breakdown by type>) + Unchanged: <count> +``` + +For each modified need, also identify its **impact chain**: the set of needs +linked to it (upstream and downstream) that may be affected by the change. Use the +link graph built in Step 2 to trace one level in each direction. + +--- + +### Step 5: Generate Changelog + +Build a structured changelog from the change summary. + +#### 5a. Changelog format + +Use the following markdown template. Group changes by need type. Within each +group, sort by need ID. 
+ +```markdown +## Release <version> - <date> + +### Summary + +- **<count>** new needs added +- **<count>** needs modified +- **<count>** needs removed +- **<total>** total needs in project + +### New Requirements +- **REQ_004**: <title> [<status>] + <brief description or first sentence of content> + +### New Specifications +- **SPEC_005**: <title> [<status>] + Linked to: REQ_004 + +### New Implementations +- (none) + +### New Test Cases +- (none) + +### Modified Requirements +- **REQ_001**: <title> + - <attribute>: <old value> -> <new value> + - Impact: <list of directly linked need IDs that may need review> + +### Modified Specifications +- **SPEC_001**: <title> + - content: updated acceptance criteria + - Impact: IMPL_001, TEST_001 + +### Modified Implementations +- (none) + +### Modified Test Cases +- (none) + +### Removed Requirements +- (none) + +### Removed Specifications +- (none) + +### Removed Implementations +- (none) + +### Removed Test Cases +- (none) + +### Traceability Changes +- New links: <list of new link pairs, e.g., SPEC_005 -> REQ_004> +- Removed links: <list of removed link pairs> +- Modified links: <list of needs whose link targets changed> + +### Verification Status +- All modified needs verified: <yes/no> +- Unverified needs: <list of IDs, or "none"> +- MECE analysis: <passed / not run> +- Open MECE issues: <count, or "N/A"> +``` + +**Adapt the type sections to the project's configured types.** If the project +defines custom types beyond the standard four (e.g., `story`, `hazard`, +`constraint`), generate sections for each type that has changes. Omit type +sections that have no changes in any category (no new, no modified, no removed). + +#### 5b. Changelog for custom need types + +For each need type defined in the project configuration, generate the +corresponding "New", "Modified", and "Removed" sections only if that type has at +least one change. 
Use the type's `title` field from the configuration for the +section heading (e.g., "New Hazard Analyses" for a type with `title = "Hazard Analysis"`). + +#### 5c. Verification status section + +Populate the verification status from session state (`.pharaoh/session.json`): + +- Check `changes.<need_id>.verified` for each modified or new need. +- Check `global.mece_checked` for MECE status. +- If session state does not exist, report: + ``` + Verification Status + - Session state not available. Run the appropriate review skill and pharaoh:mece for status. + ``` + +--- + +### Step 6: Generate Release Summary + +Produce a high-level summary with coverage metrics. + +#### 6a. Needs inventory + +Count all needs in the current project by type and status: + +``` +Needs Inventory +=============== +Type Total draft approved implemented verified +Requirement 12 2 5 3 2 +Specification 10 1 4 3 2 +Implementation 8 0 2 4 2 +Test Case 7 0 1 3 3 +-------------------------------------------------------------- +Total 37 3 12 13 9 +``` + +Adapt the status columns to the project's actual status values. If the project +uses different statuses (e.g., `open`, `in_progress`, `closed`), use those instead. + +#### 6b. Traceability coverage metrics + +Calculate coverage percentages based on `required_links` from `pharaoh.toml`: + +For each required link chain (e.g., `"req -> spec"`): + +1. Count the number of source-type needs (e.g., all `req` needs). +2. Count how many have at least one link to a target-type need (e.g., linked to + at least one `spec`). +3. Calculate: `coverage = linked_count / total_count * 100` + +Present as: + +``` +Traceability Coverage +===================== +Chain Covered Total Coverage +req -> spec 10 12 83.3% +spec -> impl 8 10 80.0% +impl -> test 7 8 87.5% +Full chain (req->test) 6 12 50.0% +``` + +The "Full chain" row traces the complete path from top-level needs to leaf needs. 
+A source need is "fully covered" only if there is a complete path through all +intermediate types to the final type in the chain. + +If `required_links` is not configured, skip this section and note: + +``` +Traceability coverage: Not configured. +Add [pharaoh.traceability] required_links to pharaoh.toml to enable coverage metrics. +``` + +#### 6c. Open issues from MECE analysis + +If `pharaoh:mece` was run in this session (check `global.mece_checked` in session +state), summarize any open issues that were identified: + +- Orphaned needs (needs with no incoming or outgoing links) +- Gaps in required link chains +- Redundant or overlapping needs +- Inconsistent statuses (e.g., a `verified` need linked to a `draft` specification) + +If MECE was not run, note: + +``` +MECE Issues: Not available (pharaoh:mece has not been run this session). +``` + +#### 6d. Codelinks summary + +If sphinx-codelinks is enabled, include a code traceability summary: + +``` +Code Traceability +================= +Needs with code references: <count> / <total> +Code files referencing needs: <count> +``` + +--- + +### Step 7: Output and Next Steps + +#### 7a. Present changelog to user + +Display the complete changelog (from Step 5) and release summary (from Step 6) +to the user in a single output. + +#### 7b. Offer to write to file + +After presenting the output, ask the user: + +``` +Would you like to save these release artifacts? + + 1. Write changelog to CHANGELOG.md (append at top) + 2. Write full release notes to docs/releases/<version>.md + 3. Write both + 4. Do not write any files + +Choose an option: [1/2/3/4] +``` + +**Option 1: Append to CHANGELOG.md** + +1. Check if `CHANGELOG.md` exists in the workspace root. +2. If it exists, read its current content. Insert the new changelog entry at the + top of the file, after any existing header (e.g., after a `# Changelog` line). +3. If it does not exist, create it with a `# Changelog` header followed by the + new entry. +4. 
Show the user what will be written and confirm before writing. + +**Option 2: Write to docs/releases/** + +1. Create the `docs/releases/` directory if it does not exist. +2. Write the full release notes (changelog + release summary) to + `docs/releases/<version>.md`. +3. Show the user what will be written and confirm before writing. + +**Option 3: Both** + +Execute both Option 1 and Option 2. + +**Option 4: No files** + +Do nothing. The output was already presented on screen. + +#### 7c. Suggest git tag + +After file output is handled, suggest tagging: + +``` +Suggested next step: + git tag -a <version> -m "Release <version>" + +Create this tag now? [yes/no] +``` + +If the user confirms, run the `git tag` command. Do **not** push the tag. If the +user wants to push, they must explicitly request it. + +If the user declines, do nothing. + +#### 7d. Update session state + +After the release process completes successfully (regardless of whether files were +written), update `.pharaoh/session.json`: + +1. Read the current session state (or create the initial structure). +2. Set `global.last_release` to the current ISO 8601 timestamp. +3. Set `updated` to the current ISO 8601 timestamp. +4. Write the updated JSON back to `.pharaoh/session.json`. + +--- + +## Key Constraints + +1. **Never auto-tag or auto-push without user confirmation.** The `git tag` and + `git push` commands must always be explicitly confirmed by the user. Never run + them silently. + +2. **Never overwrite files without asking.** Before writing to `CHANGELOG.md` or + any release notes file, show the user what will be written and get explicit + confirmation. If the file already exists, show how it will be modified. + +3. **Include traceability metrics for safety-critical audit trails.** The release + summary must always include the needs inventory and traceability coverage + metrics. These are essential for compliance in regulated industries (automotive, + aerospace, medical device). + +4. 
**Support both full release notes and incremental changelogs.** The changelog + (Step 5) captures incremental changes for this release. The release summary + (Step 6) captures the full project state. Both are produced every time. + +5. **Adapt to project-specific types and statuses.** Do not hardcode need types + (`req`, `spec`, `impl`, `test`) or status values (`draft`, `approved`). Read + the project configuration and use whatever types and statuses the project + defines. Generate changelog sections dynamically. + +6. **Handle missing git history gracefully.** If the project is not a git + repository, or if there are no tags, report the limitation: + ``` + This workspace is not a git repository (or has no tags). + Cannot determine changes since last release automatically. + + Please provide the set of changed need IDs manually, or specify two + file snapshots to compare. + ``` + Then allow the user to provide change information manually and proceed + with changelog generation from that input. + +7. **Handle empty releases.** If no needs changed between the baseline and HEAD, + report: + ``` + No requirement changes detected between <baseline> and HEAD. + + If documentation files changed but no need directives were affected, + this is a documentation-only release with no requirements impact. + ``` + Still offer to generate the release summary (Step 6) since the project + inventory and coverage metrics may be useful even without changes. + +8. **Respect the data access tier hierarchy.** Always prefer ubc CLI (`ubc diff`) + for change detection when available. Fall back to git diff with raw parsing + only when ubc is not available. The structural diff from ubc is more precise + than text-based diff parsing. + +9. **Do not modify any need directives.** This skill is read-only with respect to + documentation source files. It generates release artifacts and writes them to + dedicated output files (CHANGELOG.md, docs/releases/). 
It never modifies RST + or MD files containing need directives. diff --git a/.github/agents/pharaoh.reproducibility-check.agent.md b/.github/agents/pharaoh.reproducibility-check.agent.md index a42da34..b1526a4 100644 --- a/.github/agents/pharaoh.reproducibility-check.agent.md +++ b/.github/agents/pharaoh.reproducibility-check.agent.md @@ -7,4 +7,215 @@ handoffs: [] Use when diffing two output directories produced by running the same plan twice to confirm the build is reproducible. Consumes a baseline directory, a rerun directory, and an optional list of mask rules for known-non-deterministic fields (timestamps, randomly-generated ids); emits a list of drifted files with per-file changed-field summaries. Does NOT run the plan — running is the caller's responsibility (`pharaoh-execute-plan`). -See [`skills/pharaoh-reproducibility-check/SKILL.md`](../../skills/pharaoh-reproducibility-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-reproducibility-check + +## When to use + +Invoke from a reproducibility-audit CI job (or directly by a human) after the caller has produced two output directories from two independent runs of the same plan. Takes the two directories plus an optional list of `mask_rules` for known-non-deterministic fields and emits a findings JSON listing which files drifted and which fields inside them changed. Passes when every file is byte-identical after masking; fails when at least one file differs. + +**This skill does NOT run the plan.** Running the plan twice is the caller's responsibility — `pharaoh-execute-plan` is the atom that executes plans, and the orchestrator that calls `pharaoh-execute-plan` twice and then this check is future work (deferred from this plan's scope). This atom only diffs two pre-existing output directories. + +Do NOT use to re-author artefacts, to regenerate the rerun directory, or to repair drift — read-only. 
Do NOT use to mask the baseline in place or rewrite it with placeholders — the masking is done on in-memory copies for the comparison only. Do NOT use to infer mask rules automatically — the caller declares them; no hardcoded Pharaoh-specific masks. + +## Atomicity + +- (a) Indivisible: one baseline directory + one rerun directory + optional mask rules in → one drift report out. No plan execution, no artefact emission, no side effects. +- (b) Input: `{plan_path: str, baseline_output_dir: str, rerun_output_dir: str, mask_rules: list[{path: str, field: str, regex: str}]}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-reproducibility-check/fixtures/` — one per outcome: + 1. `identical-output/` — baseline and rerun are byte-identical after masking (timestamps masked out; everything else matches) → `overall: "pass"`, `drifted_files: []`, empty `drift_summary`. + 2. `drifted-titles/` — rerun has different need titles (e.g. `"Login requirement"` → `"Login req"`) that no mask rule targets → `overall: "fail"`, `drifted_files` names the file, `drift_summary[file].fields_changed` lists the `.title` paths of the drifted records. + 3. `drifted-ids-but-masked/` — rerun has different generated need ids (`REQ_abc123` vs `REQ_def456`) but `mask_rules` includes an entry that replaces any matching `id` value with a placeholder; after masking the files are equal → `overall: "pass"`. + + Pass = each fixture's actual output matches `expected-output.json` modulo ordering of `drifted_files` (sorted ascending) and `fields_changed` (also sorted ascending). +- (d) Reusable across projects — the diff is tree-of-files generic and the mask rules are data-driven. No Pharaoh-specific field names, id shapes, or timestamp formats are baked in. Works for any plan whose output directory is a tree of JSON / YAML / text files. +- (e) Read-only. 
Does not modify the baseline or rerun directories, does not write the masked copies to disk, does not touch the plan file. Running twice on identical inputs yields byte-identical output. + +## Input + +- `plan_path`: absolute path to the plan YAML the two runs came from. Used as diagnostic metadata in the emitted report (echoed under the plan key in a future shape) but is NOT semantically load-bearing for the diff itself — the skill does not re-read or re-execute the plan. An unreadable or missing path is surfaced as a blocker but does not abort the diff if both output directories are readable. +- `baseline_output_dir`: absolute path to the output directory produced by the first plan run. Must exist and be readable. +- `rerun_output_dir`: absolute path to the output directory produced by the second plan run on the same plan. Must exist and be readable. +- `mask_rules`: optional list of `{path: str, field: str, regex: str}` entries. Each entry declares that, inside every file matched by `path` (a glob relative to the output-dir root), before comparing, replace the value at `field` (a dotted JSON-path into the parsed file) with the placeholder string `"<masked>"` if the current value matches `regex`. Defaults to `[]` (no masking). + +Edge cases: +- `baseline_output_dir` or `rerun_output_dir` missing → `overall: "fail"`, `blockers: ["baseline_output_dir unresolved: <path>"]` (or the rerun equivalent). +- One side contains files the other does not — file-level drift: the absent file is listed in `drifted_files` with `drift_summary[file] = {"fields_changed": [], "reason": "file only present in <baseline|rerun>"}`. +- A `mask_rules` entry's `regex` fails to compile → `overall: "fail"`, blocker `"mask regex invalid: <entry>"`; no files are diffed. +- A mask rule targets a path that no file matches, or a field that no parsed record carries → silently ignored for that file (masking is best-effort per-entry). 
+- Non-parseable files (binary, malformed JSON) are compared byte-for-byte; masking is skipped for them and any bytes-difference is reported as `fields_changed: ["<byte-diff>"]`. + +## Output + +```json +{ + "baseline": "/abs/path/baseline/", + "rerun": "/abs/path/rerun/", + "drifted_files": [ + "docs/_build/needs/needs.json" + ], + "drift_summary": { + "docs/_build/needs/needs.json": { + "fields_changed": [ + "comp_req__foo_01.title" + ], + "count": 1 + } + }, + "overall": "fail" +} +``` + +Fields (in canonical order): +- `baseline`: echo of the input `baseline_output_dir`. +- `rerun`: echo of the input `rerun_output_dir`. +- `drifted_files`: list of file paths (relative to the respective output-dir roots) that differ after masking, sorted ascending. +- `drift_summary`: mapping from each drifted file path to `{fields_changed: list[str], count: int}`. `fields_changed` is the sorted list of dotted field paths whose values changed; `count` is `len(fields_changed)`. For files that exist on only one side, `fields_changed` is empty and an extra `reason` field explains the asymmetry. For byte-level diffs on non-parseable files, `fields_changed` is `["<byte-diff>"]`. +- `overall`: `"pass"` iff `drifted_files` is empty AND no blocker fired. `"fail"` otherwise. + +On input errors (unresolved paths, invalid mask regex) the shape still carries every field with empty `drifted_files`, empty `drift_summary`, `overall: "fail"`, plus a top-level `blockers` list containing the error strings, so downstream callers can diff one shape. + +**What counts as drift.** Drift is reported at two granularities: the outer `drifted_files` list names files at file-level (present on both sides but differing, OR present on only one side), and the inner `drift_summary` reports field-level detail for each drifted parseable file. The gate is file-level (any entry in `drifted_files` fails the check); the per-field detail exists so the caller can see WHAT drifted without re-running the diff. 
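The output shape described above can be assembled along the following lines. This is only a sketch under stated assumptions: the helper name `build_report` is hypothetical, and omitting the `blockers` key when no blocker fired is a guess based on the example JSON, which shows it only for the error shape.

```python
def build_report(baseline_dir, rerun_dir, drift_summary, blockers=()):
    """Assemble the findings dict from per-file drift entries (hypothetical helper).

    `drift_summary` maps a relative file path to a dict holding at least
    `fields_changed`, plus an optional `reason` for one-sided files.
    """
    report = {
        "baseline": baseline_dir,
        "rerun": rerun_dir,
        # The gate is file-level: every key of drift_summary is a drifted file.
        "drifted_files": sorted(drift_summary),
        "drift_summary": {
            path: {
                **entry,
                "fields_changed": sorted(entry["fields_changed"]),
                "count": len(entry["fields_changed"]),
            }
            for path, entry in sorted(drift_summary.items())
        },
        # "pass" only if nothing drifted AND no blocker fired.
        "overall": "pass" if not drift_summary and not blockers else "fail",
    }
    if blockers:
        # Assumption: the key is emitted only for the error shape.
        report["blockers"] = list(blockers)
    return report


drifted = build_report(
    "/abs/path/baseline/",
    "/abs/path/rerun/",
    {"docs/_build/needs/needs.json": {"fields_changed": ["comp_req__foo_01.title"]}},
)
errored = build_report(
    "/abs/path/baseline/",
    "/abs/path/rerun/",
    {},
    blockers=["baseline_output_dir unresolved: /abs/path/baseline/"],
)
```

Passing an empty `drift_summary` together with a non-empty `blockers` list yields the error shape, so callers can handle both outcomes through one code path.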
+ +## Process + +### Step 1: Validate inputs + +Resolve `baseline_output_dir` and `rerun_output_dir`. If either is missing or unreadable, populate `blockers` and emit the error shape. Compile every `mask_rules[i].regex` eagerly; on any `re.error`, populate `blockers` with `"mask regex invalid: <entry>"` and emit the error shape. `plan_path` is echoed into diagnostic logs but validation is soft — a missing plan file does not abort the diff. + +### Step 2: Enumerate files + +Walk `baseline_output_dir` recursively, collect the relative path of every file. Do the same for `rerun_output_dir`. Compute the union of the two sets. For each file path in the union: + +- If present on only one side, flag it as drifted with `reason: "file only present in <baseline|rerun>"`. +- If present on both sides, continue to Step 3. + +### Step 3: Load and mask + +For each file present on both sides: + +1. Attempt to parse both copies (JSON for `*.json`, YAML for `*.yaml`/`*.yml`, plain text otherwise). Non-parseable files short-circuit to byte-comparison (Step 4b). +2. For each `mask_rules` entry whose `path` glob matches the current file's relative path, apply the mask: traverse `field` (dotted JSON-path, e.g. `needs.comp_req__foo_01.created_at`; supports `*` wildcard segments for per-item masking like `needs.*.created_at`) on the parsed structure. At each leaf the mask visits, if the current value is a string matching `regex`, replace it with `"<masked>"`. Apply masks to both the baseline and rerun copies in memory. +3. Proceed to Step 4a. + +### Step 4: Compare + +**4a (parseable files):** Deep-compare the two masked structures. Any field whose value differs is added to `fields_changed` for this file, expressed as a dotted path (`<top-key>.<sub-key>...`). Added or removed keys are reported as `<path>` with a trailing `+` or `-` respectively. If `fields_changed` is non-empty, the file is drifted. + +**4b (byte-comparable files):** Byte-compare the two files. 
If they differ, the file is drifted with `fields_changed: ["<byte-diff>"]`. + +### Step 5: Emit the findings JSON + +Populate every field per the `## Output` shape. Sort `drifted_files` ascending; sort each `fields_changed` ascending. `overall` is `"pass"` iff `drifted_files` is empty and no blocker fired; `"fail"` otherwise. + +## Detection rule + +One mechanical check, implemented as the five-step process above. No LLM judgement. + +Minimum viable Python reference implementation (≤ 60 lines, omitting glob and dotted-path helpers for brevity): + +```python +import json, os, re, fnmatch, yaml +from pathlib import Path + +def walk(root): + root = Path(root) + return {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()} + +def load(p): + s = open(p, "rb").read() + try: + if p.endswith(".json"): + return "parsed", json.loads(s) + if p.endswith((".yaml", ".yml")): + return "parsed", yaml.safe_load(s) + except Exception: + pass + return "bytes", s + +def apply_masks(obj, field_path, regex): + # Traverse dotted field_path (with `*` wildcards). At each leaf, if the + # current value is a string matching regex, replace it with "<masked>". 
+ segs = field_path.split(".") + def visit(node, i): + if i == len(segs): + return "<masked>" if isinstance(node, str) and regex.search(node) else node + if segs[i] == "*" and isinstance(node, dict): + return {k: visit(v, i + 1) for k, v in node.items()} + if isinstance(node, dict) and segs[i] in node: + node[segs[i]] = visit(node[segs[i]], i + 1) + return node + return visit(obj, 0) + +def diff(a, b, prefix=""): + changed = [] + if type(a) != type(b): + return [prefix or "<root>"] + if isinstance(a, dict): + for k in sorted(set(a) | set(b)): + p = f"{prefix}.{k}" if prefix else k + if k not in a: changed.append(p + "+") + elif k not in b: changed.append(p + "-") + else: changed += diff(a[k], b[k], p) + return changed + if a != b: return [prefix or "<root>"] + return [] + +# Main +compiled = [(r["path"], r["field"], re.compile(r["regex"])) for r in mask_rules] +b_files, r_files = walk(baseline), walk(rerun) +drifted, summary = [], {} + +for rel in sorted(b_files | r_files): + if rel not in b_files: + drifted.append(rel); summary[rel] = {"fields_changed": [], "count": 0, + "reason": "file only present in rerun"}; continue + if rel not in r_files: + drifted.append(rel); summary[rel] = {"fields_changed": [], "count": 0, + "reason": "file only present in baseline"}; continue + + kind_b, a = load(os.path.join(baseline, rel)) + kind_r, c = load(os.path.join(rerun, rel)) + if kind_b != kind_r or kind_b == "bytes": + if a != c: + drifted.append(rel); summary[rel] = {"fields_changed": ["<byte-diff>"], "count": 1} + continue + + for glob, field, rx in compiled: + if fnmatch.fnmatch(rel, glob): + a = apply_masks(a, field, rx); c = apply_masks(c, field, rx) + + fc = sorted(diff(a, c)) + if fc: + drifted.append(rel); summary[rel] = {"fields_changed": fc, "count": len(fc)} + +overall = "pass" if not drifted else "fail" +``` + +The full implementation adds the blocker propagation for unresolved paths, the eager regex compilation, and the canonical-field emission order. 
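As a worked illustration of the masking step, the sketch below replays the `drifted-ids-but-masked` scenario on in-memory data. It is not the skill's implementation: `mask` is a hypothetical, non-mutating variant of `apply_masks` that also descends into lists (by index or `*`), and the sample records and the `needs.*.id` rule are invented for the example.

```python
import re

MASK = "<masked>"


def mask(node, segments, pattern):
    """Return a copy of `node` with matching string leaves replaced by MASK."""
    if not segments:
        # Leaf reached: only strings are candidates, and re.search matches
        # anywhere inside the value (not a full match).
        return MASK if isinstance(node, str) and pattern.search(node) else node
    head, rest = segments[0], segments[1:]
    if isinstance(node, dict):
        return {k: mask(v, rest, pattern) if head in ("*", k) else v
                for k, v in node.items()}
    if isinstance(node, list):
        return [mask(v, rest, pattern) if head in ("*", str(i)) else v
                for i, v in enumerate(node)]
    return node  # the path runs past a scalar: nothing to mask


# Hypothetical mask rule and sample data, for illustration only.
rule = {"path": "needs.json", "field": "needs.*.id", "regex": r"^REQ_[0-9a-f]{6}$"}
segments = rule["field"].split(".")
pattern = re.compile(rule["regex"])

baseline = {"needs": {"login": {"id": "REQ_abc123", "title": "Login requirement"}}}
rerun_ids = {"needs": {"login": {"id": "REQ_def456", "title": "Login requirement"}}}
rerun_title = {"needs": {"login": {"id": "REQ_def456", "title": "Login req"}}}

ids_equal = mask(baseline, segments, pattern) == mask(rerun_ids, segments, pattern)
title_equal = mask(baseline, segments, pattern) == mask(rerun_title, segments, pattern)
```

Here the id-only difference disappears after masking, while the title difference survives because no rule targets `title`. The regex is anchored with `^...$` on purpose: since matching uses search semantics, an unanchored pattern would also mask values that merely contain an id-like substring.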
+ +## Failure modes + +- **Dotted field paths are a simplified JSON-pointer.** Segments are literal keys; `*` wildcards any key at that level; arrays are addressed by index (`needs.0.title`). Projects whose data has keys containing literal dots must split those keys before emitting the output — documented limitation, acceptable for every Pharaoh output shape observed to date. +- **Masking is per-leaf, not per-subtree.** A mask rule targeting `needs.*.created_at` replaces only the `created_at` scalar, not the whole need record. Projects wanting to mask out entire subtrees should declare a rule per leaf field or pre-process the output. +- **Regex matching is `re.search`, not `re.fullmatch`.** The rule fires when the regex finds a match anywhere in the string value; this is deliberate so a regex like `\d{10,}` can mask out Unix timestamps without requiring the field value to be exactly a timestamp. +- **Binary or malformed files fall back to byte compare.** A corrupt JSON on either side is compared byte-for-byte. That is usually what the caller wants (a malformed file is itself drift), but a project relying on lenient parsing should repair the file before invoking this check. +- **`plan_path` is metadata-only.** The skill does NOT parse or execute the plan; it does not verify that the two output directories actually came from it. Callers that need that assurance should assert it before invoking. +- **File-level is the gate.** Any drifted file fails the check. The per-field detail does not downgrade a one-field diff to a warning — reproducibility is binary. Projects that want per-field tolerance should encode it via mask rules. + +## Tailoring extension point + +- `tailoring.reproducibility_mask_rules`: projects can declare a canonical list of mask rules in their tailoring and pipe it into this skill's `mask_rules` input. Typical entries cover timestamps (`created_at`, `updated_at`, `build_timestamp`) and randomly-generated ids (`run_id`, `session_id`). 
No other knobs are exposed. + +No other knobs. The skill is deliberately a thin diff engine — every policy decision (what to mask, what threshold) lives in the caller or the tailoring. + +## Composition + +Role: `atom-check`. + +Callable standalone from any CI job that already holds two output directories plus a mask-rule list. The orchestrator that invokes `pharaoh-execute-plan` twice and then this check is out of scope for this atom. Never dispatches other skills. Never modifies the baseline or rerun directories. + +Complements `pharaoh-dispatch-signal-check` (which audits whether a plan's declared execution mode was respected in `runs/`) — that skill checks run structure, this skill checks output-byte stability across reruns. The two atoms operate on different artefacts and neither dispatches the other. diff --git a/.github/agents/pharaoh.req-code-grounding-check.agent.md b/.github/agents/pharaoh.req-code-grounding-check.agent.md index c8d4075..6df8c7e 100644 --- a/.github/agents/pharaoh.req-code-grounding-check.agent.md +++ b/.github/agents/pharaoh.req-code-grounding-check.agent.md @@ -7,4 +7,237 @@ handoffs: [] Use when verifying a single drafted requirement against the source file it cites via `:source_doc:`. Single mechanical fidelity check — compares the CREQ's claims about exceptions, triggers, types, structural symbols, backtick-quoted identifiers, grounding density, adjectives, quantifiers, and branch count against the cited source, returning per-axis findings JSON. Complements `pharaoh-req-review` (which grades prose quality) with code-grounded axes. -See [`skills/pharaoh-req-code-grounding-check/SKILL.md`](../../skills/pharaoh-req-code-grounding-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-code-grounding-check + +## When to use + +Invoke as a sibling review alongside `pharaoh-req-review` whenever an emission skill (e.g. 
`pharaoh-req-from-code`) has just produced a requirement that declares `:source_doc:`. Reads the RST directive block + the cited source file, emits findings JSON with per-axis pass/fail so the caller can decide whether to finalize, regenerate, or reject the requirement. + +Do NOT use to grade prose quality (atomicity, verifiability, ambiguity) — that is `pharaoh-req-review`. Do NOT use for requirements lacking `:source_doc:` — axis #8 will fail immediately and the remaining axes cannot be evaluated. Do NOT use to re-author or modify the requirement — this skill is read-only and emits findings only. + +## Atomicity + +- (a) Indivisible: one CREQ + one source file in → one findings JSON out. No re-authoring, no set-level analysis, no dispatch of other skills. +- (b) Input: `{target: <need_id_or_rst>, source_doc_path: <str>, tailoring_path: <str>}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-req-code-grounding-check/fixtures/` — one per failure mode: + 1. `passing-case/` — all axes pass; matches `expected-output.json` (`overall: "pass"`, empty `blockers`). + 2. `dead-exception/` — CREQ names 5-class hierarchy; source raises 2 of 5 → `exception_raise_sites_exist` fails with 3 missing names in `evidence`. + 3. `inverted-trigger/` — CREQ says `when origin == "Sphinx-Needs"`; source has `if origin != "Sphinx-Needs"` → `trigger_condition_literal_match` fails. + 4. `pydantic-halluc/` — CREQ says "Pydantic model"; source imports `dataclasses` → `type_framework_matches_imports` fails. + 5. `weasel-adjectives/` — CREQ body contains `structured`, `comprehensive`, `full` → `no_weasel_adjectives` fails with the 3 matches in `evidence`. + 6. `unbounded-all/` — CREQ says "all validation errors" without enumeration → `quantifier_enumerated` fails. + 7. `collapsed-branches/` — CREQ is one shall-clause; source function has 4 visible branches → `branch_count_aligned` scores 1. + 8. 
`misattributed-config-field/` — CREQ body backtick-cites a default literal and a config field name; declared source_doc is the consumer module which uses them only through attribute access. Fixture ships a `code-grounding-filters.yaml` enabling the `cross_file_literal_default` strategy so the skill can emit the actionable "lives in config, cite attribute instead" evidence. Without the YAML the tokens would still fail axis #5 with the generic "not in source_doc" message. + 9. `typer-kebab-filter/` — CREQ body cites ``--license-key``; source defines `license_key` as a Typer parameter. Fixture ships a `code-grounding-filters.yaml` enabling `kebab_to_snake_or_pascal` with `morphology_prefixes: ["Opt"]`. The filter resolves; without the YAML, universal filters do not cover this pattern and the axis would fail. + 10. `toml-section-filter/` — CREQ body cites ``[myapp.export_config]``; skipped by universal filter #1 (TOML section). No tailoring YAML required. + 11. `external-dotted-path/` — CREQ body cites ``rich.console.Console``; source imports `from rich.console import Console`. Fixture ships a `code-grounding-filters.yaml` enabling `dotted_import_resolution` with the Python separator / import patterns. Without the YAML, universal filters do not cover this pattern. + 12. `env-var-glob/` — CREQ body cites ``JAMA_*``; source defines `JAMA_URL_ENV`, `JAMA_USERNAME_ENV`, etc. Fixture ships a `code-grounding-filters.yaml` enabling `prefix_glob_expansion`. Without the YAML, universal filters do not cover this pattern. + 13. `abstract-prose/` — CREQ body uses only "the component shall" / "caller-configured" with zero backtick-quoted identifiers; fails axis #8 (`source_doc_resolves`) because the file contains no symbols the shall clause names, exposing that the CREQ is untestable against the cited file. No tailoring YAML required — axis mechanics are language-agnostic. + + Pass = all 13 fixture outputs match `expected-output.json` modulo `evidence` field substring match. 
+- (d) Reusable across projects — any corpus whose CREQs declare `:source_doc:`. Two extension points, both optional: (i) weasel blacklist via `tailoring.weasel_extra`; (ii) axis-#5 pluggable language-specific filter chain via `code-grounding-filters.yaml` (schema: [`shared/code-grounding-filters.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/code-grounding-filters.md)). Without any tailoring the skill runs three universal axis-#5 filters and the base weasel blacklist — stricter signal, language-agnostic, usable in any project out of the box. +- (e) Read-only. Does not modify the CREQ RST or the source file. Never invokes other skills (caller runs `pharaoh-req-review` as a sibling). + +## Input + +- `target`: either a `need_id` resolvable in `needs.json`, or a raw RST directive block for one CREQ. The block must contain the `:source_doc:` option; if absent, axis #8 (`source_doc_resolves`) fails with `"source_doc missing — cannot ground check"` and every other axis records `passed: "n/a"`. +- `source_doc_path` (optional when `target` is an RST block): path to the cited source file. Accepts either an absolute path or a path relative to `project_root`; relative paths are joined with `project_root` before opening. Extension determines the raise-site / import regex flavour (Python MVP; other languages via `shared/public-symbol-patterns.md`). If the resolved path does not exist, axis #8 fails with `"source_doc unresolved"`. When `target` is a raw RST block AND `source_doc_path` is omitted, the skill auto-derives it from the block's `:source_doc:` option and resolves via `project_root`. +- `project_root` (optional, required when `source_doc_path` is relative or omitted): absolute path to the consumer project's root. Used to resolve relative or auto-derived source docs to absolute paths before opening. +- `tailoring_path`: absolute path to the project's tailoring directory (`.pharaoh/project/`). 
Two files are read: + - `checklists/requirement.md` frontmatter for `tailoring.weasel_extra: [<word>, ...]` (axis #6 extension). + - `code-grounding-filters.yaml` for axis #5's pluggable language-specific filter chain; schema in [`shared/code-grounding-filters.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/code-grounding-filters.md). Missing, empty, or malformed YAML is acceptable — only the three universal filters apply. + +Edge cases: empty source file → axes #1, #2, #3, #4, #5, #9 fail with `"source file empty"` evidence (axes that read the source body); missing tailoring file → base blacklist applies silently, no pluggable filters load; malformed `code-grounding-filters.yaml` → skill logs a warning in `notes`, falls back to universal filters only; language-specific axes (#1 raise-sites, #2 trigger, #3 named-symbol) use the Python MVP regex by default and record `passed: "n/a", reason: "language not yet supported"` for non-Python sources where the regex does not apply. 
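The path-resolution rules for `source_doc_path` and `project_root` could be sketched roughly as follows. The function name, the returned status strings, and the regex used to pull `:source_doc:` out of an RST block are assumptions made for illustration; mapping a status to the exact evidence string is left to the caller, and paths containing spaces are not handled.

```python
import re
from pathlib import Path

# Assumed shape of the option line inside a directive block.
SOURCE_DOC_OPTION = re.compile(r"^\s*:source_doc:\s*(\S+)\s*$", re.MULTILINE)


def resolve_source_doc(rst_block=None, source_doc_path=None, project_root=None):
    """Return (status, path) where status is 'ok', 'missing', or 'unresolved'."""
    raw = source_doc_path
    if raw is None and rst_block is not None:
        # Auto-derive the path from the block's :source_doc: option.
        match = SOURCE_DOC_OPTION.search(rst_block)
        raw = match.group(1) if match else None
    if raw is None:
        return "missing", None
    path = Path(raw)
    if not path.is_absolute():
        if project_root is None:
            # A relative path cannot be resolved without a project root.
            return "unresolved", path
        path = Path(project_root) / path
    return ("ok", path) if path.is_file() else ("unresolved", path)
```

A caller would typically translate `"missing"` and `"unresolved"` into the corresponding axis #8 failures before evaluating any other axis.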
+ +## Output + +```json +{ + "need_id": "REQ_example_read", + "source_doc": "src/example/reader.py", + "axes": { + "exception_raise_sites_exist": {"passed": false, "evidence": "ReadError cited; 1 raise site found in reader.py:159 — PASS; UnicodeDecodeError cited as raised, 0 raise sites — FAIL"}, + "trigger_condition_literal_match": {"passed": true}, + "named_symbol_exists": {"passed": true}, + "type_framework_matches_imports": {"passed": "n/a", "reason": "no type-framework claim in body"}, + "backtick_symbol_in_source_doc": {"passed": false, "evidence": "4 backtick tokens checked: 'read_file' ✓, 'strict' ✓, 'uuid_target' ✓, 'reqif_uuid' ✗ (not in reader.py, lives in config/reqif_config.py — cross-file leak)"}, + "no_weasel_adjectives": {"passed": false, "evidence": "'structured diagnostic' — 'structured' blacklisted"}, + "quantifier_enumerated": {"passed": false, "evidence": "'all unrecoverable input failures' — unbounded 'all' without enumeration"}, + "source_doc_resolves": {"passed": true}, + "branch_count_aligned": {"score": 1, "evidence": "function read_file has 4 visible branches (encoding err / parse err / empty / success); CREQ is one shall-clause"} + }, + "overall": "fail", + "blockers": ["exception_raise_sites_exist", "backtick_symbol_in_source_doc", "no_weasel_adjectives", "quantifier_enumerated"], + "actions": [ + "Remove 'UnicodeDecodeError raised' claim OR add raise site for UnicodeDecodeError in reader.py", + "Replace 'reqif_uuid' with the consumer-side attribute form (e.g. 'uuid_target' accessed via 'self.config.uuid_target') OR change :source_doc: to config/reqif_config.py", + "Replace 'structured diagnostic' with concrete term (list[str])", + "Enumerate 'all unrecoverable input failures' or replace with specific classes" + ] +} +``` + +`overall` is `"pass"` iff every mechanical axis has `passed: true` (or `"n/a"`) AND `branch_count_aligned.score >= 2`. 
Any mechanical `passed: false` OR a branch-count score of 0 or 1 promotes the axis name into `blockers` and sets `overall: "fail"`. `actions` enumerates one remediation per blocker, with enough specificity to guide regeneration. + +## Detection rule + +Eight mechanical axes plus one subjective axis. Every mechanical axis resolves to a grep over the CREQ body and/or the source file; no LLM judgement on mechanical axes. Axes are listed in the order they should execute, cheap greps first, so a failing body short-circuits before expensive AST / import resolution on axes 4, 7, and 9. + +### 1. `exception_raise_sites_exist` + +**Check:** For each class name `X` mentioned in a `raises X` / `shall raise X` / `throws X` clause in the CREQ body, grep `raise X(` in the source file. Each cited class must have ≥1 raise site. Missing raise sites promote the axis to `passed: false` with the evidence listing every missing class name. + +**Detection:** +```bash +# Extract cited exceptions from CREQ body: +grep -oE '(raises?|throws?|shall raise)\s+(the\s+|an?\s+)?[A-Z][A-Za-z0-9_]+' <creq> \ + | awk '{print $NF}' | sort -u +# For each, verify raise site in source: +grep -cE "raise\s+<X>\s*\(" <source_doc> +``` + +### 2. `trigger_condition_literal_match` + +**Check:** Detect `when <field> == "<value>"` / `when <field> is <value>` in the CREQ body. Extract `<field>` and `<value>`. Grep source for `<field>\s*==\s*"<value>"` vs `<field>\s*!=\s*"<value>"`. Mismatch between claimed operator / value and the source code fails. + +**Detection:** +```bash +grep -oE 'when\s+[a-z_]+\s*(==|is)\s*"[^"]*"' <creq> +# then in source: +grep -E '<field>\s*(==|!=)\s*"<value>"' <source_doc> +``` + +### 3. 
`named_symbol_exists` + +**Check:** Extract symbol names from the CREQ body ONLY in bounded structural contexts: + +- Verb-prefix pattern: `(?:raises?|throws?|uses?|wraps?|calls?|invokes?|extends?|subclasses?)\s+(?:the\s+|an?\s+)?(?P<sym>[A-Z][A-Za-z0-9_]+)` +- Function-call shape: `(?P<fn>[a-z_][a-z0-9_]+)\(` + +Every extracted `sym` / `fn` must appear as a definition or call site in the source file. This narrowing — verb prefix OR trailing parens — is load-bearing: unrestricted `[A-Z][a-zA-Z0-9]+` matching produces false positives on stdlib generics (`List`, `Dict`, `Optional`) and sentence-initial capitalization (`Parser`, `User`). + +**Detection:** +```bash +grep -E "(raises?|throws?|uses?|wraps?|calls?|invokes?|extends?|subclasses?)\s+(the\s+|an?\s+)?[A-Z][A-Za-z0-9_]+" <creq> +grep -E "[a-z_][a-z0-9_]+\(" <creq> +# each extracted name must exist in <source_doc> as def/class/call site +``` + +### 4. `type_framework_matches_imports` + +**Check:** If CREQ body mentions "Pydantic model", source must `from pydantic import ...` or `import pydantic`. "dataclass" → source must have `@dataclass` or `from dataclasses`. "attrs class" → `@attr` or `import attr`. "TypedDict" → `from typing import TypedDict` or `typing.TypedDict`. Mismatch between the claim and the imports fails. + +**Detection:** +```bash +grep -oEi 'pydantic|dataclass|attrs\s+class|typeddict' <creq> +# Python imports: +grep -E '^(from|import)\s+(pydantic|dataclasses|attr|typing)' <source_doc> +grep -E '^@(dataclass|attr\.s|attrs\.define)' <source_doc> +``` + +### 5. `backtick_symbol_in_source_doc` + +**Check:** For every backtick-quoted token ``` ``X`` ``` in the CREQ body, verify that `X` appears as a literal substring in the declared `:source_doc:`. This catches cross-file leaks that the verb-prefixed axis #3 (`named_symbol_exists`) misses, because many backtick-cited identifiers sit in running prose without a structural verb in front of them (e.g. 
"honouring ``include_links`` and per-field delimiters"). + +The check runs **after** normalising the token through a two-tier filter chain: three universal filters built into the base skill plus zero or more pluggable language-specific filters loaded from `<tailoring_path>/code-grounding-filters.yaml`. Tokens surviving both tiers are looked up in the source file; the first unresolved token fails the axis with `evidence` naming each unresolved token and — when the token is found elsewhere in the project source tree — the file it actually lives in, so the caller knows whether to retarget `:source_doc:` or rewrite the CREQ. + +#### Universal filters (always active, language-agnostic) + +Apply in order; a token that matches any step is counted as resolved (or skipped) without penalty. + +1. **TOML section / table header** — token matches `^\[[a-z_][\w.]*\]$` (e.g. ``[myapp.export_config]``, ``[foo]``). Not a code identifier in any language; skip. +2. **File path / command-string** — token contains `/` or a space (e.g. ``commands/csv.py``, ``jama check``). Skip; file paths are covered by axis #8 (`source_doc_resolves`) and multi-word strings are user-facing UI text, not code symbols. Language-agnostic — `/` and whitespace are not valid identifier characters in any mainstream language. +3. **Short-prose guard** — tokens that are lowercase English words under 4 chars (e.g. ``id``, ``to``, ``or``) OR all-caps domain acronyms in a closed list (``API``, ``CSV``, ``JSON``, ``REST``, ``TOML``, ``CLI``, ``URL``, ``HTTP``) are treated as prose, not symbols. Skip. The acronym list is conservative and stays in the base skill. + +#### Pluggable filters (from tailoring YAML) + +After the three universal filters run, the skill loads `<tailoring_path>/code-grounding-filters.yaml` (if present) and applies every filter declared there in order. 
The YAML schema and the four supported strategies (`kebab_to_snake_or_pascal`, `prefix_glob_expansion`, `dotted_import_resolution`, `cross_file_literal_default`) are documented in [`shared/code-grounding-filters.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/code-grounding-filters.md). Each strategy is a parameterised shape; projects supply the language-specific regex / separator / patterns per strategy. Python / Typer projects get CLI-kebab + env-glob + stdlib-import + dataclass-default filters; Rust / Clap projects get the same strategies with different patterns (`use X::Y`, `#[serde(default=...)]`). An absent YAML means only the universal filters run — acceptable default, stricter signal, no wrong-language false negatives. + +Any token that fails every filter AND does not literally appear in `:source_doc:` fails the axis. + +**Detection (pseudocode):** +```python +for tok in re.findall(r'``([^`]+)``', creq_body): + # Tier 1 — universal filters (always active) + if match_toml_section(tok): continue + if '/' in tok or ' ' in tok: continue + if is_short_prose(tok): continue + + # Tier 2 — pluggable filters from tailoring + if any(f.resolves(tok, source_text, project_root) + for f in tailored_filters): + continue + + # Baseline — literal substring match + if tok in source_text: continue + + # Not resolved — record with cross-file lookup for evidence + elsewhere = locate_in_project(tok, project_root) + violations.append({"token": tok, "found_in": elsewhere}) +``` + +The ordering is load-bearing: universal filters short-circuit before the YAML is even opened, so missing-tailoring projects pay zero cost for the three cheap greps. Pluggable filters run before the baseline substring check so that language-specific resolutions take precedence over accidental substring coincidences. + +### 6. 
`no_weasel_adjectives` + +**Check:** Grep the CREQ body against the base blacklist: + +``` +structured, comprehensive, full, absolute, paginated, robust, complete, proper +``` + +Any match fails with the matched word in evidence. These words imply mechanised behaviour without grounding. Tailoring extension: `tailoring.weasel_extra` (list) is union-merged with the base before the grep. + +**Detection:** +```bash +grep -iwE '\b(structured|comprehensive|full|absolute|paginated|robust|complete|proper)\b' <creq> +``` + +### 7. `quantifier_enumerated` + +**Check:** Narrow, mechanical quantifier detection only. Regex: + +``` +\b(?:all|every|each)\s+(?:[a-z]+\s+){0,3}(?:error|errors|exception|exceptions|failure|failures|case|cases|command|commands|branch|branches|mode|modes|validator|validators)s?\b +``` + +If matched, the same sentence or the next sentence must contain either: +- `:\s` (enumeration colon), OR +- ` namely ` / ` specifically ` / ` including ` (enumeration marker), OR +- a Sphinx list directive (`.. list-table::` / `- ` bullet in an adjacent block). + +Otherwise fail. Broader quantifier judgement — "does 'the system' implicitly quantify?" — is deferred to the subjective `unambiguity_prose` axis in `pharaoh-req-review`. This axis catches only the specific pattern where the noun signals an expected enumeration that is missing. + +### 8. `source_doc_resolves` + +**Check:** The CREQ's `:source_doc:` option must (a) be present, (b) point at an existing file, and (c) the file must contain at least one symbol the CREQ names (per axis #3 extraction). Three fail modes: + +- `:source_doc:` absent → `passed: false, evidence: "source_doc missing — cannot ground check"`. +- Path does not exist → `passed: false, evidence: "source_doc unresolved: <path>"`. +- Symbols from CREQ body absent in file → `passed: false, evidence: "source_doc-symbol mismatch: none of [<sym>, ...] found in <path>"`. + +### 9. 
`branch_count_aligned` (subjective, 0-3) + +**Check:** Count `if` / `elif` / `else` / `match` branches in the function named by `:source_doc:` (parse via Python `ast` where available, regex fallback). If CREQ is one shall-clause but the function has ≥3 branches producing visibly different outputs, score ≤ 2. If CREQ enumerates branches or is a set of short CREQs covering them, score 3. + +Rubric: +- 3 — CREQ structure matches source branch count (1 shall per branch, or a single CREQ with explicit per-branch enumeration). +- 2 — CREQ groups branches under a justified umbrella (e.g. "validation errors" for 2-3 similar branches). +- 1 — CREQ collapses ≥3 distinct branches into one shall-clause with no enumeration; different projects would reasonably want these split. +- 0 — CREQ omits entire branches that produce observable output. + +## Tailoring extension point + +See [`shared/checklists/requirement.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/checklists/requirement.md) — the canonical location of the `tailoring.weasel_extra` frontmatter key consumed by axis #6 (`no_weasel_adjectives`). No other project-specific state in the base skill; all regulatory-standard vocabulary (ASIL, ARC, ASPICE process IDs) stays out of the base. + +## Composition + +Role: `atom-check`. + +Called as a sibling alongside `pharaoh-req-review` from the `## Last step` of any emission skill that drafts requirements with `:source_doc:` (e.g. `pharaoh-req-from-code`). The two atoms run independently; neither dispatches the other. The caller merges findings under `review.iso_axes` (from req-review) and `review.code_grounding` (from this skill). Emission fails if either atom returns a mechanical-axis failure. + +Never invoked directly by end users — always from an emission skill's Last step or from `pharaoh-quality-gate.required_checks` in invariant-delegation mode. 
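For axis #9 (`branch_count_aligned`), the branch count could be obtained with Python's `ast` module roughly as sketched below. The counting convention is an assumption rather than the skill's documented rule: each `if` / `elif` body counts once, a trailing `else` counts once, and each `match` case counts once. An `else:` whose only statement is an `if` is indistinguishable from `elif` in the AST, and branches inside nested functions are included.

```python
import ast


def count_branches(source, function_name):
    """Count if/elif/else/match branches in one function, or None if absent."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) \
                and node.name == function_name:
            return _count(node)
    return None


def _count(func):
    match_type = getattr(ast, "Match", ())  # ast.Match exists on Python 3.10+
    total = 0
    for node in ast.walk(func):
        if isinstance(node, ast.If):
            total += 1  # the `if` (or `elif`) body itself
            is_elif = len(node.orelse) == 1 and isinstance(node.orelse[0], ast.If)
            if node.orelse and not is_elif:
                total += 1  # a genuine trailing `else`
        elif isinstance(node, match_type):
            total += len(node.cases)
    return total


# Hypothetical source resembling the read_file example in the output above.
SAMPLE = '''
def read_file(path):
    if path is None:
        return "encoding error"
    elif path == "":
        return "parse error"
    elif path == "empty":
        return "empty"
    else:
        return "success"
'''

branches = count_branches(SAMPLE, "read_file")
```

The resulting count only feeds the rubric; deciding whether the branches produce visibly different outputs remains the subjective part of the axis.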
diff --git a/.github/agents/pharaoh.req-codelink-annotate.agent.md b/.github/agents/pharaoh.req-codelink-annotate.agent.md index c4f79b1..74fce50 100644 --- a/.github/agents/pharaoh.req-codelink-annotate.agent.md +++ b/.github/agents/pharaoh.req-codelink-annotate.agent.md @@ -7,4 +7,325 @@ handoffs: [] Use when a requirement has been drafted (either as an RST block by `pharaoh-req-from-code` or implicitly) and you need to insert a one-line comment into the source file that carries the trace. Two modes — `codelinks` (sphinx-codelinks-compatible multi-field `@ title, id, type, [links]` form; the comment IS the need) and `backref` (minimal `@req ID: title` pointer back to an RST-hosted need). Mode is tailored via `ubproject.toml` / `pharaoh.toml`, not hardcoded. -See [`skills/pharaoh-req-codelink-annotate/SKILL.md`](../../skills/pharaoh-req-codelink-annotate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-codelink-annotate + +## When to use + +Invoke when you want a source file to carry a machine- and human-readable reference to a requirement. The resulting comment either: + +- **`codelinks` mode** — contains the full need definition in sphinx-codelinks one-line format (e.g. `# @ Write CSV header row, CREQ_csv_export_01, comp_req, [FEAT_csv_export]`). When the project builds Sphinx with `sphinx_codelinks` loaded, this comment becomes an actual need directive at render time. The comment IS the source of truth. +- **`backref` mode** — contains only a pointer to a need defined elsewhere (e.g. `# @req CREQ_csv_export_01: Write CSV header row`). The RST block (e.g. from `pharaoh-req-from-code`) is the source of truth; the comment exists for grep/IDE navigation only. + +Mode is not a per-call preference — it is a project-level decision baked into `ubproject.toml` / `pharaoh.toml` (see Tailoring awareness). 
The caller may override per invocation, but the default MUST come from tailoring so that the whole codebase stays consistent. + +Do NOT use to generate the requirement's text (that is `pharaoh-req-from-code` with appropriate `emit` mode). Do NOT use to delete or update comments after the fact — this skill only inserts, per atomicity. + +## Tailoring awareness — mode resolution + +The skill resolves `mode` in this order: + +1. Input `mode_override` parameter (per-call, highest precedence). +2. `pharaoh.toml` → `[pharaoh.codelink_comments].mode` (explicit project choice). +3. Auto-detect: if `ubproject.toml` contains `[codelinks.projects.*]` (i.e. sphinx-codelinks is in use) → `"codelinks"`; otherwise `"backref"`. +4. Fallback: `"backref"` (conservative — a dumb grep-able pointer is safe even if the project later adopts sphinx-codelinks). + +### `codelinks` mode configuration + +Comes from `ubproject.toml` under `[codelinks.projects.<name>.analyse.oneline_comment_style]` — exactly the table sphinx-codelinks itself reads. The skill does NOT re-invent this schema; it reads the same `start_sequence`, `end_sequence`, `field_split_char`, and `needs_fields` that sphinx-codelinks will use to parse the comment at build time. This guarantees round-trip safety: what this skill writes, sphinx-codelinks reads. + +The caller must indicate which codelinks project the file belongs to, via input `codelinks_project_name`. If omitted, the skill tries to infer from `file_path` + each project's `source_discover.src_dir`; if exactly one project matches, use it; if zero or multiple match → FAIL asking the caller to be explicit. This keeps the skill atomic (no hidden "which project" guessing) while staying ergonomic. 
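
The resolution order and the project inference described above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's implementation: the function names `resolve_mode` and `infer_codelinks_project` are hypothetical, the configuration is assumed to be already parsed into dictionaries (for example with the standard-library `tomllib` on Python 3.11+), and `src_dir` is assumed to be a POSIX path relative to the project root. The `on_missing_config` prompt path is deliberately left out.

```python
from pathlib import PurePosixPath
from typing import Optional

VALID_MODES = ("codelinks", "backref")


def resolve_mode(mode_override: Optional[str], pharaoh_cfg: dict, ubproject_cfg: dict) -> tuple[str, str]:
    """Return (mode, source) following the documented precedence (hypothetical helper)."""
    # 1. A per-call override has the highest precedence.
    if mode_override is not None:
        if mode_override not in VALID_MODES:
            raise ValueError(f"unknown mode_override: {mode_override!r}")
        return mode_override, "override"
    # 2. Explicit project choice under [pharaoh.codelink_comments] in pharaoh.toml.
    tailored = pharaoh_cfg.get("pharaoh", {}).get("codelink_comments", {}).get("mode")
    if tailored is not None:
        return tailored, "pharaoh.toml"
    # 3. Auto-detect: any [codelinks.projects.*] table in ubproject.toml.
    if ubproject_cfg.get("codelinks", {}).get("projects"):
        return "codelinks", "auto-detect"
    # 4. Conservative fallback.
    return "backref", "fallback"


def infer_codelinks_project(file_path: str, ubproject_cfg: dict) -> str:
    """Pick the single codelinks project whose src_dir contains file_path."""
    projects = ubproject_cfg.get("codelinks", {}).get("projects", {})
    target = PurePosixPath(file_path).parts
    matches = []
    for name, cfg in projects.items():
        src_dir = cfg.get("source_discover", {}).get("src_dir")
        if src_dir is None:
            continue
        root = PurePosixPath(src_dir).parts
        # file_path belongs to the project when src_dir is a leading path prefix.
        if target[: len(root)] == root:
            matches.append(name)
    if len(matches) != 1:
        # Zero or several candidates: fail and ask the caller to be explicit.
        raise LookupError(f"expected exactly one matching project, found {matches}")
    return matches[0]
```

Returning the source of the decision alongside the mode is a design choice of this sketch only; it makes it easy to tell a tailored choice from the fallback when debugging.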
+ +### `backref` mode configuration + +Comes from `pharaoh.toml`: + +```toml +[pharaoh.codelink_comments] +mode = "backref" +prefix = "@req" # marker for grep +format = "{prefix} {id}: {title}" # template +``` + +If `pharaoh.codelink_comments` is absent, defaults: `prefix = "@req"`, `format = "{prefix} {id}: {title}"`. + +### `check → propose → confirm` + +If `on_missing_config == "prompt"` (default) AND no tailoring is found (neither `pharaoh.codelink_comments` nor `[codelinks.projects.*]`), the skill does NOT silent-default. It returns a structured proposal object: + +```json +{ + "status": "needs_confirmation", + "proposal": { + "mode": "backref", + "rationale": "No [codelinks.projects.*] table in ubproject.toml — project does not appear to use sphinx-codelinks. Proposing minimal backref mode.", + "tailoring_patch": { + "target_file": "pharaoh.toml", + "section": "[pharaoh.codelink_comments]", + "patch": {"mode": "backref", "prefix": "@req", "format": "{prefix} {id}: {title}"} + } + } +} +``` + +The caller confirms (humans or an outer LLM), the tailoring gets written (typically via `pharaoh-tailor-fill`), and the skill is re-invoked — now finding the config and proceeding silently with `on_missing_config="use_default"` semantics. + +### Language-to-comment-syntax mapping + +Derived from file extension (both modes): + +| Extension | Prefix | +|---|---| +| `.py`, `.rb`, `.sh`, `.toml`, `.yaml`, `.yml` | `#` | +| `.c`, `.cpp`, `.cxx`, `.cc`, `.h`, `.hpp`, `.hxx`, `.ts`, `.tsx`, `.js`, `.jsx`, `.rs`, `.go`, `.java`, `.kt`, `.swift`, `.scala`, `.groovy`, `.dart` | `//` | +| `.sql`, `.hs`, `.lua`, `.ada` | `--` | + +Unknown extension → FAIL rather than guess. + +## Atomicity + +- (a) Indivisible — one (req_id, file_path, anchor) triple in → one comment line inserted. No multi-req batching, no req text modification, no RST file modification. 
+- (b) Input: `{req_id: str, req_title: str, req_type: str, file_path: str, anchor: AnchorSpec, project_root: str, parent_links?: list[str], mode_override?: "codelinks"|"backref", codelinks_project_name?: str, on_missing_config?: "fail"|"prompt"|"use_default", dry_run?: bool}`. Output: JSON `{mode_used: "codelinks"|"backref", inserted_line: int, inserted_text: str, file_modified: bool}`. On `dry_run=true` → `file_modified=false`, no write. On `on_missing_config="prompt"` with no tailoring → `{status: "needs_confirmation", proposal: ...}` (see Tailoring awareness). +- (c) Reward: three deterministic fixtures: one per mode, plus one for the prompt path. + + **Fixture A — `codelinks` mode** (project tailored with `[codelinks.projects.demo.analyse.oneline_comment_style]` matching sphinx-codelinks defaults: `start_sequence="@"`, `field_split_char=","`, `needs_fields=[title, id, type, links]`). 20-line `.py` file, known anchor, known (req_id, req_title, req_type, parent_links=["FEAT_X"]). Scorer: + 1. File is still syntactically valid Python. + 2. Exactly one line added. + 3. Inserted line starts with `# @ ` (comment prefix + start_sequence + space). + 4. Parsing the inserted line with sphinx-codelinks' own oneline parser (or a faithful reimplementation) yields a need with `title==req_title`, `id==req_id`, `type==req_type`, `links==parent_links`. + 5. `mode_used == "codelinks"` in output. + 6. Idempotent re-run: second invocation detects `req_id` substring, no-op, `file_modified=false`. + + **Fixture B — `backref` mode** (project without `[codelinks.*]` and without `[pharaoh.codelink_comments]`, `on_missing_config="use_default"`). Same file shape. Scorer: + 1. File is still syntactically valid Python. + 2. Exactly one line added. + 3. Inserted line starts with `# @req ` (default backref prefix). + 4. Inserted line contains both `req_id` and `req_title` as substrings. + 5. `mode_used == "backref"` in output. + 6. Idempotent re-run: no-op.
+ + **Fixture C — prompt mode** (no tailoring, `on_missing_config="prompt"`). Scorer: + 1. Output has `status == "needs_confirmation"`. + 2. Output has `proposal.mode == "backref"` (conservative default). + 3. Output has `proposal.tailoring_patch` pointing at `pharaoh.toml`. + 4. File was NOT modified. + + Pass = all checks in all three fixtures. +- (d) Reusable: any source tree in any supported language; bidirectional trace for reverse-engineered reqs; IDE-navigable "where is this req implemented" queries via grep. +- (e) Composable: strictly one phase (source mutation, one comment). Never modifies RST, never calls `pharaoh-req-from-code` or other skills. A plan emitted by `pharaoh-write-plan` MAY include a foreach task over req-emission outputs that dispatches this skill per req, but this skill itself does not orchestrate. + +## Input + +- `req_id`: the requirement's sphinx-needs ID (e.g. `"CREQ_csv_export_01"`). +- `req_title`: the requirement's short title (e.g. `"Write CSV header row"`). +- `req_type`: the requirement's directive name (e.g. `"comp_req"`, `"impl"`). In `codelinks` mode used for the `type` field. In `backref` mode used only if the tailored `format` template includes `{type}`. +- `file_path`: path to the source file to annotate. Accepts either an absolute path or a path relative to `project_root`; relative paths are joined with `project_root` before the file is opened. +- `anchor`: `AnchorSpec` — one of: + - `{type: "top_of_file"}` — insert after shebang/encoding lines but before any other content. + - `{type: "before_symbol", symbol: "<name>"}` — insert immediately before the line where `<name>` is defined. Regex-based detection, not AST-level. + - `{type: "before_line", line: <n>}` — insert before line `<n>` (1-indexed). +- `project_root`: absolute path to the consumer project's root. Used to locate `ubproject.toml` (for `codelinks` mode config) and `pharaoh.toml` (for `backref` mode config and mode selection). 
+- `parent_links` (optional): list of parent IDs (e.g. `["FEAT_csv_export"]`). In `codelinks` mode used for the `links` field verbatim. In `backref` mode used only if the tailored `format` template includes `{parent_links}`. +- `mode_override` (optional): `"codelinks"` or `"backref"`. Forces mode for this call. If omitted, resolution follows the order in Tailoring awareness. +- `codelinks_project_name` (optional): which entry under `[codelinks.projects.*]` in `ubproject.toml` this file belongs to. Required when multiple projects are defined and their `source_discover.src_dir` values would both match `file_path`. Ignored in `backref` mode. +- `on_missing_config` (optional): `"fail" | "prompt" | "use_default"`. Default `"prompt"`. Determines behavior when tailoring is missing (see Tailoring awareness). +- `dry_run` (optional): if `true`, skill returns what WOULD be written without touching the file. Default `false`. + +## Output + +A single JSON object, one of three shapes: + +**Success shape (file modified):** + +```json +{ + "mode_used": "backref", + "inserted_line": 15, + "inserted_text": "# @req CREQ_csv_export_01: Write CSV header row", + "file_modified": true +} +``` + +**Idempotent re-run (comment already present):** + +```json +{ + "mode_used": "backref", + "inserted_line": 15, + "inserted_text": "# @req CREQ_csv_export_01: Write CSV header row", + "file_modified": false +} +``` + +**Needs-confirmation (no tailoring, `on_missing_config="prompt"`):** + +```json +{ + "status": "needs_confirmation", + "proposal": { + "mode": "backref", + "rationale": "No [codelinks.projects.*] table in ubproject.toml — proposing minimal backref mode.", + "tailoring_patch": { + "target_file": "pharaoh.toml", + "section": "[pharaoh.codelink_comments]", + "patch": {"mode": "backref", "prefix": "@req", "format": "{prefix} {id}: {title}"} + } + } +} +``` + +On `dry_run=true`, `file_modified` is always `false`. + +## Output schema + +Output must parse as JSON via `json.loads`. 
Validator checks one of two shapes: + +**Success shape:** +- Required keys: `mode_used` (one of `"codelinks"`, `"backref"`), `inserted_line` (int ≥ 1), `inserted_text` (non-empty str), `file_modified` (bool). +- Unknown keys are permitted and surface as a warning, not a rejection, to allow forward-compatible evolution. + +**Needs-confirmation shape:** +- Required keys: `status == "needs_confirmation"`, `proposal` (mapping). See `## Tailoring awareness` for proposal details. +- Mutually exclusive with the success shape (a response has one or the other, never both). + +## Process + +### Step 1: Resolve mode + +Resolve `mode` per the Tailoring awareness order: +1. `mode_override` → use directly. +2. `pharaoh.toml [pharaoh.codelink_comments].mode` → use. +3. `ubproject.toml` has any `[codelinks.projects.*]` table → `"codelinks"`. +4. Fallback → `"backref"`. + +If resolution steps 2 and 3 both yield nothing AND `on_missing_config == "prompt"` → emit the `needs_confirmation` proposal object described in Tailoring awareness and return without modifying the file. + +If `on_missing_config == "fail"` and no config → FAIL. + +If `on_missing_config == "use_default"` and no config → proceed with `"backref"` silently. + +### Step 2a: Format the comment — `codelinks` mode + +1. Determine the codelinks project: use `codelinks_project_name` if provided; else infer by matching `file_path` against each `[codelinks.projects.*].source_discover.src_dir`. If exactly one matches, use that project. If zero or multiple → FAIL. +2. Read `[codelinks.projects.<name>.analyse.oneline_comment_style]` from `ubproject.toml`: + - `start_sequence` (e.g. `"@"`) + - `end_sequence` (optional; default empty) + - `field_split_char` (e.g. `","`) + - `needs_fields`: ordered list of `{name, type?, default?}` entries. +3.
Build the field values in declared order: + - For each field, pick the value from the mapping: `title`=`req_title`, `id`=`req_id`, `type`=`req_type`, `links`=`parent_links` (rendered as `[a, b, c]` per sphinx-codelinks list-of-strings syntax). + - If a declared field has no matching value AND has a `default` → omit (sphinx-codelinks will fill it). Else FAIL naming the missing field. + - Escape `field_split_char` and `[`/`]` characters in string values per sphinx-codelinks escaping rules (backslash prefix). +4. Join fields with `" " + field_split_char + " "` (a space, separator, space — matching sphinx-codelinks' own formatting). +5. Prepend `start_sequence + " "` (e.g. `"@ "`). +6. Append `end_sequence` if non-empty. +7. The result is the comment text body. Example: `"@ Write CSV header row, CREQ_csv_export_01, comp_req, [FEAT_csv_export]"`. + +### Step 2b: Format the comment — `backref` mode + +Read `<project_root>/pharaoh.toml` if present. Extract `[pharaoh.codelink_comments]`: +- `prefix` (default `"@req"`) +- `format` (default `"{prefix} {id}: {title}"`) + +Substitute placeholders: +- `{prefix}` → the prefix value +- `{id}` → `req_id` +- `{title}` → `req_title` +- `{type}` → `req_type` +- `{parent_links}` → `", ".join(parent_links)` or empty string if not provided + +### Step 3: Resolve comment syntax from file extension + +Map file extension → comment prefix: + +| Extension | Prefix | +|---|---| +| `.py`, `.rb`, `.sh`, `.toml`, `.yaml`, `.yml` | `#` | +| `.c`, `.cpp`, `.cxx`, `.cc`, `.h`, `.hpp`, `.hxx`, `.ts`, `.tsx`, `.js`, `.jsx`, `.rs`, `.go`, `.java`, `.kt`, `.swift`, `.scala`, `.groovy`, `.dart` | `//` | +| `.sql`, `.hs`, `.lua`, `.ada` | `--` | + +Unknown extension → FAIL: `"Cannot determine comment syntax for extension <ext>. Add the extension to the mapping or pass a file with a known extension."`. + +The inserted line is: `<comment_prefix> <formatted_text>`. + +### Step 4: Read the file + +Read `file_path`. 
Split into lines (preserve trailing newline state for round-trip). + +### Step 5: Idempotency check + +Scan every line for `req_id` as a substring. If any line contains it, the comment is already present (or a different reference to the same req exists). Return without modifying the file: + +```json +{"mode_used": "<resolved_mode>", "inserted_line": <matched_line_index>, "inserted_text": "<matched_line>", "file_modified": false} +``` + +This is the idempotency guarantee from reward check #6. Do NOT re-insert, do NOT duplicate, do NOT update a stale title (if the title changed, the human is responsible for deleting the old line; auto-update risks corrupting hand-edited comments). + +### Step 6: Resolve anchor to a line index + +- `top_of_file`: skip shebang (`#!`) and encoding lines (`# -*- coding: ... -*-`, `# coding: ...`), stop at the first content line. Insert immediately before it. +- `before_symbol`: regex-scan for the symbol's declaration line. Language-specific patterns: + - Python: `^\s*(def|class|async\s+def)\s+<name>\b` + - JS/TS: `^\s*(function|class|const|let|var)\s+<name>\b`, also `^\s*<name>\s*(?::|=)` for object-method shorthand + - Rust: `^\s*(pub\s+)?(fn|struct|enum|trait|impl)\s+<name>\b` + - Go: `^\s*func\s+(\([^)]*\)\s+)?<name>\b`, also `^\s*(type\s+<name>\b|var\s+<name>\b)` + - C/C++: `^\s*[\w:*&<>\s]+\s+<name>\s*\(` (function) or `^\s*(class|struct|enum)\s+<name>\b` (type) + + If multiple matches, warn and use the first. If zero matches, FAIL: `"Symbol <name> not found in <file_path>."`. +- `before_line`: validate `1 <= line <= len(lines) + 1`. FAIL if out of range. + +### Step 7: Insert the comment + +If `dry_run=true`, return without writing: + +```json +{"mode_used": "<resolved_mode>", "inserted_line": <resolved_line>, "inserted_text": "<formatted_comment>", "file_modified": false} +``` + +Otherwise, insert the comment line at the resolved index. Preserve indentation — the comment inherits the indentation of the line it precedes (so it stays at the correct nesting level for `before_symbol` anchors).
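
As a rough illustration of Steps 6 and 7 for a Python source file, the sketch below resolves a `before_symbol` anchor with the Python pattern listed above and inserts the comment with inherited indentation. It assumes the file has already been split into lines without their line endings; the helper names `resolve_before_symbol` and `insert_comment` are hypothetical and not part of the skill.

```python
import re


def resolve_before_symbol(lines: list[str], symbol: str) -> int:
    """Return the 1-indexed line of the first Python declaration of `symbol`."""
    pattern = re.compile(r"^\s*(def|class|async\s+def)\s+" + re.escape(symbol) + r"\b")
    for number, line in enumerate(lines, start=1):
        if pattern.match(line):
            return number  # first match wins, as the step describes
    raise LookupError(f"Symbol {symbol} not found")


def insert_comment(lines: list[str], line_no: int, comment: str) -> list[str]:
    """Insert `comment` before 1-indexed `line_no`, inheriting that line's indentation."""
    if not 1 <= line_no <= len(lines) + 1:
        raise ValueError(f"line {line_no} out of range")
    # Appending past the last line has no following line to copy indentation from.
    target = lines[line_no - 1] if line_no <= len(lines) else ""
    indent = target[: len(target) - len(target.lstrip(" \t"))]
    return lines[: line_no - 1] + [indent + comment] + lines[line_no - 1 :]


source = ["class Exporter:", "    def write_header(self):", "        pass"]
anchor_line = resolve_before_symbol(source, "write_header")
annotated = insert_comment(source, anchor_line, "# @req CREQ_csv_export_01: Write CSV header row")
```

Returning a new list instead of mutating the input keeps a dry run trivial in this sketch: the caller simply does not write the result back.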
+ +Write the file back. Preserve original EOL convention (LF vs CRLF) and final-newline state. + +### Step 8: Return + +Return the JSON object per the Output shape with `file_modified=true`. + +## Last step + +No dedicated `*-review` atom exists for codelink annotation; the operation is a one-line insert whose correctness is structural rather than prose-judgement. This skill therefore performs an inline self-verification in Step 8 before returning `file_modified=true`: + +1. Re-read `file_path` and confirm the line at `inserted_line` is byte-for-byte equal to `inserted_text`. +2. Confirm the file has at most one line starting with the tailored `start_sequence + <separator>` bearing `req_id` (idempotence — a subsequent run with the same inputs must be a no-op, not a duplicate insert). +3. If either check fails, roll back the write (restore the original file) and return `status: "failed"` with evidence. + +Coverage is mechanically enforced at plan level by `pharaoh-quality-gate`'s `link_types_covered` invariant (verifies every required link type referenced by the project's artefact catalog has at least one non-empty value across the emitted corpus). See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale. + +## Failure modes + +- `file_path` not readable → FAIL: `"file not readable: <path>"`. +- Unknown extension → FAIL per Step 3. +- Symbol not found for `before_symbol` anchor → FAIL per Step 6. +- Line out of range for `before_line` anchor → FAIL per Step 6. +- File is in `.git/`, `node_modules/`, `__pycache__/`, or a build output directory (detected by path segment) → FAIL: `"Refusing to annotate generated/vendored file: <path>"`. This protects against accidental writes into machine-generated code. + +## Non-goals + +- No AST-level insertion — regex is deliberately simple to keep the skill language-agnostic.
Callers who need AST-precise placement should use a language-specific tool and pass `before_line` with the exact line number. +- No multi-comment insertion in one call — this skill is atomic per (req, file, anchor). Callers who need N comments make N calls. +- No cross-file traceability validation — a separate skill (`pharaoh-codelink-validate`, not yet implemented) will scan for orphan back-references whose req no longer exists. +- No removal — if a req is deleted, the comment stays until a human or a dedicated cleanup skill removes it. This skill only inserts. + +## Composition + +The typical flow: + +1. A plan emitted by `pharaoh-write-plan` includes a foreach task over req-from-code outputs; `pharaoh-execute-plan` dispatches N `pharaoh-req-from-code` instances that generate `comp_req` blocks. +2. Caller (human or the same plan) reviews and accepts the reqs. +3. A downstream foreach task in the same plan (typically `id: codelink_annotate`) dispatches this skill per accepted req with: + - `req_id`, `req_title`, `req_type` from the RST block + - `file_path` = the source file the req was derived from (available from the `req-from-code:<filename>` `reporter_id` used by the upstream task) + - `anchor` = `{type: "top_of_file"}` for coarse placement, or `{type: "before_symbol", symbol: <primary_symbol>}` when the req is clearly about a specific function/class. diff --git a/.github/agents/pharaoh.req-draft.agent.md b/.github/agents/pharaoh.req-draft.agent.md index 6021bfd..edc81e0 100644 --- a/.github/agents/pharaoh.req-draft.agent.md +++ b/.github/agents/pharaoh.req-draft.agent.md @@ -7,4 +7,527 @@ handoffs: [] Use when drafting a single sphinx-needs requirement-shaped artefact (req, comp_req, sysreq, swreq, hazard, safety_goal, fsr, etc.) from a feature description. The artefact type is parameterised via target_level (any catalog-declared requirement-shaped type, including ISO 26262 safety-V types). 
Produces a new RST directive block with ID, status=draft, and either a shall-clause body or a hazard/goal-shaped body, linking to a parent per the project's artefact-catalog. -See [`skills/pharaoh-req-draft/SKILL.md`](../../skills/pharaoh-req-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-draft + +## When to use + +Invoke when the user provides a short feature, hazard, or safety-goal description (1-5 sentences) and wants a single requirement-shaped artefact authored at a specific catalog-declared level. Do NOT decompose multiple levels; do NOT review existing artefacts; do NOT draft architecture — those are separate skills. + +This skill produces exactly one artefact per invocation. If the user appears to want multiple artefacts from a single description, draft only the most direct one and tell the user to re-invoke for any additional ones. + +This skill is the canonical drafter for any requirement-shaped type the project's +`artefact-catalog.yaml` declares — classical levels (`req`, `comp_req`, `sysreq`, `swreq`, +`gd_req`) as well as ISO 26262 safety-V types (`hazard` for HARA outputs, `safety_goal` +for Part 3 goals, `fsr` for Part 4 functional safety requirements, `tsr` for Part 4 +technical safety requirements). The skill drives every type from the catalog's +`required_fields` and `required_metadata_fields`; nothing in the skill is hardcoded to a +fixed allow-list of types. + +### ISO 26262 framing + +Safety-V types follow ISO 26262 Part 3 (HARA → safety goals) and Part 4 (FSR / TSR) +flow. This skill is not a safety expert system: it does not decide ASIL ratings or +hazard classifications. It surfaces the ISO-26262-relevant fields (`asil`, `severity`, +`exposure`, `controllability`, `safe_state`, etc.) as placeholders when the project's +catalog declares them required, and prompts the user to fill them. 
Projects that do +not declare these fields in their catalog get a plain requirement; projects that do +(e.g. `useblocks/sphinx-needs-demo` with a HARA tailoring) get the safety-V shape. + +## Inputs + +- **feature_context** (from user): short prose describing the feature, hazard, or safety + goal — plus the safety relevance if any +- **target_level** (from user): the artefact-catalog type name to emit. Any type declared + in `.pharaoh/project/artefact-catalog.yaml` is accepted, including: + - classical requirement levels — `req`, `comp_req`, `sysreq`, `swreq`, `gd_req`, … + - ISO 26262 safety-V types — `hazard`, `safety_goal`, `fsr`, `tsr`, … + The emitted directive uses `target_level` verbatim as the directive name; the ID prefix, + required fields, and required metadata fields are resolved from the catalog and + `id-conventions.yaml`. If `target_level` is absent the skill falls back to the project's + primary requirement type (`gd_req` in Score, `req` in the bundled defaults) — pass it + explicitly when drafting safety-V artefacts. +- **parent_link** (from user or inferred): ID of the parent the new artefact links to + via the catalog-declared link relation (`:satisfies:` for classical reqs and FSRs, + `:safety_goal_for:` / `:derives_from:` / similar for safety-V — read from the catalog). +- **safety_classification** (optional, from user): metadata block for ASIL-related fields + on safety-V types. Recommended (but not required) when `target_level` is one of + `hazard`, `safety_goal`, `fsr`, or `tsr`. Common shape: + `{asil: "B", severity: "S2", exposure: "E3", controllability: "C2", safe_state: "..."}`. + Each field is emitted only if the catalog declares it as `required_fields` or + `required_metadata_fields`; surplus fields are dropped silently. Missing values for + catalog-required fields are emitted as `<TBD>` placeholders with a `[FLAG]` line so the + user knows to fill them. 
+- **tailoring** (from `.pharaoh/project/` files): + - `id-conventions.yaml` — prefix, separator, and ID regex for each artefact type + - `artefact-catalog.yaml` — `required_fields`, `optional_fields`, `required_metadata_fields`, + `required_links`, `lifecycle` for the target type + - `checklists/requirement.md` — ISO 26262-8 §6 axes used in self-check +- **needs.json** (built artefact index): used for parent resolution and ID uniqueness + +> Note: A `shared/tailoring-access.md` helper module is planned. Until it exists, Steps 1-2 below +> inline the tailoring-access logic directly. When that file is created, this skill should be +> updated to delegate to it. + +## Outputs + +A single RST directive block matching the project's requirement prefix (e.g. `gd_req::` for Score), containing: + +- Unique ID per id-conventions +- `:status: draft` +- `:satisfies:` link to parent_link (validated present in needs.json) +- `:verification:` link stub — use `tc__TBD` if no test ID exists yet; this is flagged in the output +- Single-sentence body with exactly one `shall`, no coordinating conjunctions within the shall clause +- No additional conjectural content beyond the single shall statement + +--- + +## Process + +### Step 1: Read tailoring + +Read three files from `.pharaoh/project/`: + +**1a. `artefact-catalog.yaml`** + +Resolve `target_level` against the catalog. Look up the entry whose top-level key equals +`target_level`. If found, record: + +- `required_fields` — every field that must appear in the directive (e.g. `id`, `status`, + `satisfies` for a classical req; plus `asil`, `severity`, `exposure`, + `controllability`, `safe_state` for safety-V types when the catalog declares them) +- `optional_fields` — fields that may appear +- `required_metadata_fields` — option keys that must be set with non-empty values before + release. Treated as required at draft time too — any missing value is emitted as `<TBD>` + with a `[FLAG]` line. 
+- `required_links` — link-relation names that every artefact of this type must declare + with a non-empty target list (e.g. `satisfies` for a comp_req, `safety_goal_for` for + an `fsr`, `derives_from` for a `safety_goal`). Use this to pick the right link option + in Step 7 — never hardcode `:satisfies:`. +- `lifecycle` — valid values for `:status:` + +If the entry is absent, FAIL: + +``` +FAIL: target_level "<value>" is not declared in .pharaoh/project/artefact-catalog.yaml. +Add an entry for "<value>" (with required_fields, optional_fields, lifecycle, and any +required_metadata_fields / required_links) before drafting, or pass a target_level that +is already declared. +``` + +If `target_level` was not provided by the caller, fall back to the project's primary +requirement type — the first catalog key whose suffix is `req` (e.g. `gd_req` in Score, +`req` in the bundled defaults) — and note the fallback in the output. Always pass +`target_level` explicitly when drafting safety-V types (`hazard`, `safety_goal`, `fsr`, +`tsr`); the fallback is only safe for classical requirements. + +Built-in default profile (bundled example, used when no catalog is present): `req` with +required = `[id, status, satisfies]`; optional = `[complies, tags, rationale, verification]`; +lifecycle = `[draft, valid, inspected]`. + +**1b. `id-conventions.yaml`** + +Extract: +- `prefixes` — map of artefact-type key to its identifier prefix string. Read the value + for `target_level`. +- `separator` — string used between prefix and local-ID part (e.g. `__`) +- `id_regex` — regex pattern all generated IDs must match (e.g. `^[a-z][a-z_]*__[a-z0-9_]+$`) +- `id_regex_exceptions` — per-type overrides (note: `std_req` is exempt for Score) + +If `prefixes` does not declare `target_level`, FAIL: + +``` +FAIL: id-conventions.yaml prefixes map has no entry for "<target_level>". +Declare a prefix for "<target_level>" before drafting. +``` + +**1c. 
`checklists/requirement.md`** + +Read the Individual checklist axes. These will be used in Step 6 self-check. You do not need to apply the Set-level axes at draft time. Record which axes are mechanically checkable at single-artefact level: +- `unambiguity` — one `shall`, no coordinating conjunctions in shall clause (applies to + shall-clause types — `req`, `fsr`, `tsr`, `safety_goal`) +- `atomicity` — body is a single shall statement (or a single hazard statement for + `hazard` types — see Step 5) +- `verifiability` — `:verification:` link present and non-empty (where the catalog declares it) + +--- + +### Step 2: Locate and parse needs.json + +Find `needs.json` in the project build directory. Common locations (check in order): +1. `docs/_build/needs/needs.json` +2. `_build/needs/needs.json` +3. Any `needs.json` under a `_build` directory + +If not found, FAIL: + +``` +FAIL: needs.json not found. Build the Sphinx project first (`sphinx-build docs/ docs/_build/`), +then re-run this skill. needs.json is required for parent validation and ID uniqueness. +``` + +Parse the JSON. Extract: +- A flat map of `id → {id, type, status, ...}` across all needs +- The set of all existing IDs (for uniqueness check in Step 3) + +--- + +### Step 3: Validate parent_link + +The user must supply a `parent_link` — an ID of an existing requirement or workflow that the new requirement will satisfy. + +1. Look up `parent_link` in the needs.json ID map. +2. If not found, FAIL: + +``` +FAIL: parent "<parent_link>" not found in needs.json. +Specify an existing parent ID. Available IDs starting with that prefix: <up to 5 examples> +``` + +3. If the parent is of an incompatible type (e.g. a `wp` artefact cannot be a `satisfies` target for `gd_req` in Score), warn but do not block — the user may be modeling a cross-type link intentionally. + +4. If `parent_link` was not provided at all and cannot be inferred from context, FAIL: + +``` +FAIL: parent_link required. 
Provide the ID of the parent requirement or workflow +this new requirement satisfies. +``` + +--- + +### Step 4: Assign a unique ID + +Generate a unique ID for the new requirement. + +**4a. Determine the local-ID part** + +The ID format is `<prefix><separator><local>`, e.g. `gd_req__brake_activation_threshold`. + +Derive the local part from the feature_context: +- Lowercase, words separated by underscores +- Maximum 5 words; trim articles and prepositions +- Must satisfy `id_regex` after combining with prefix and separator +- Example: feature "Brake activation threshold at low speed" → local `brake_activation_threshold` + +**4b. Check uniqueness** + +Look up `<prefix><separator><local>` in the needs.json ID set. If it already exists, append a numeric suffix starting at `2`: +- `gd_req__brake_activation_threshold` taken → try `gd_req__brake_activation_threshold_2`, then `_3`, etc. + +**4c. Validate against id_regex** + +Confirm the final candidate matches `id_regex` (or the applicable `id_regex_exceptions` entry). +If it does not match, FAIL: + +``` +FAIL: generated ID "<id>" does not match id_regex "<regex>". +Revise the feature_context to use lowercase ASCII words. +``` + +--- + +### Step 5: Draft the body + +Body shape depends on `target_level`: + +- **Shall-clause types** — `req`, `comp_req`, `sysreq`, `swreq`, `gd_req`, `fsr`, `tsr`, + `safety_goal` (and any catalog type whose checklist declares the unambiguity / atomicity + shall axes): write a single shall sentence per the rules below. +- **Hazard types** — `hazard` (and any catalog type that documents an event/situation rather + than a behaviour): write a single declarative sentence describing the hazardous event, + its trigger condition, and the affected vehicle/actor — NO `shall`. 
Example: + `Unintended ABS pump activation while driving on dry asphalt at >80 km/h causes loss of + braking force on the front axle.` Then rely on the catalog's `required_metadata_fields` + (severity / exposure / controllability / asil) to carry the HARA classification — do + NOT bake those values into the body prose. + +For a **shall-clause** body, write a single sentence that: + +1. Uses exactly one `shall` +2. Names a subject (the system, component, or actor) +3. Specifies a condition or measurable criterion where the feature_context provides one +4. Contains no coordinating conjunctions (`and`, `or`, `but`) within the `shall` clause +5. Does not interpret or expand the feature_context beyond what is stated — if the context is too vague to write a specific shall clause, see Guardrails +6. Describes **observable behavior at the component boundary**, not internal mechanism. Do NOT name internal methods, classes, private variables, field names, or module-local symbols inside the shall body. External API names (published HTTP routes, CLI flags, pypi packages, protocol names, algorithm names) ARE observable and are fine. Rationale: the prior dogfooding audit showed ~7% (3/40) of LLM-drafted shall clauses named internal symbols AND got the described mechanism wrong — internal-name mentions rot on rename and are a primary accuracy-failure class. Keep traceability to internal symbols in `pharaoh-req-codelink-annotate` output, not in the shall body. + +Good patterns: +- `The <system> shall <action> when <condition>.` +- `The <system> shall <action> within <measurable criterion>.` +- `The <component> shall <provide/reject/signal> <object> <constraint>.` +- `The exporter shall use HMAC-SHA256 to sign each outgoing request.` ← algorithm is observable at the boundary, fine to name. 
+ +Bad patterns (reject these in Step 6): +- Two verbs joined by `and`: `The system shall detect and report...` → FAIL +- Implicit plural: `The system shall check all sensors...` → acceptable only if "all" is intentional scope +- Vague quantity: `The system shall respond quickly` → too vague; note in output +- Internal symbol in the shall body: `The system shall use global_id to drive create-vs-update.` → internal variable name, will rot; rewrite as `The exporter shall decide create-vs-update based on whether a tracked identifier is already known for the need.` + +--- + +### Step 6: Self-check + +Before emitting, run these checks. If a check fails, attempt to re-draft (up to 2 retries). If still failing after 2 retries, emit the directive with a `[DIAGNOSTIC]` annotation explaining the issue. + +**Check A — single shall (shall-clause types only)** + +For shall-clause types, count occurrences of `shall` in the body. Must be exactly 1. + +```python +assert body.count("shall") == 1 +``` + +If `> 1`: split into the first shall clause and discard the rest. Re-draft a clean single-shall body. + +For hazard types, skip this check — body must NOT contain `shall`. If `shall` appears in +a hazard body, re-draft the hazard as a declarative event statement. + +**Check B — no conjunction in shall clause (shall-clause types only)** + +For shall-clause types, extract the shall clause (text from `shall` to end of sentence). +Check for `, and `, `, or `, ` and `, ` or ` within it. + +If found: split into the primary action only. Re-draft. + +For hazard types, skip this check — a single hazard sentence may legitimately combine a +trigger condition and an effect (`while driving on dry asphalt and braking…`). 
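Checks A and B can be approximated mechanically. The following is a minimal Python sketch, not the skill's actual implementation: the helper names `count_shall` and `has_conjunction_in_shall_clause` are hypothetical, and matching `shall` on word boundaries (rather than the substring count shown above) is an assumption made here so that words merely containing `shall` do not inflate the count. A caller would skip both helpers for hazard types.

```python
import re

# Hypothetical helpers; the skill itself does not define these names.
_SHALL = re.compile(r"\bshall\b", re.IGNORECASE)
# Conjunctions named in Check B: " and " / " or ", with or without a leading comma.
_CONJUNCTION = re.compile(r",?\s+(?:and|or)\s+", re.IGNORECASE)


def count_shall(body: str) -> int:
    """Count whole-word occurrences of 'shall' (Check A expects exactly 1)."""
    return len(_SHALL.findall(body))


def has_conjunction_in_shall_clause(body: str) -> bool:
    """Check B: look for and/or between the first 'shall' and the end of its sentence."""
    match = _SHALL.search(body)
    if match is None:
        return False
    clause = body[match.end():]
    # Assumption: the sentence ends at the first full stop followed by whitespace or end of text.
    end = re.search(r"\.(?:\s|$)", clause)
    if end is not None:
        clause = clause[: end.start()]
    return _CONJUNCTION.search(clause) is not None
```

A body that passes both helpers still needs Check B.bis and the remaining checks; the sketch only covers the two textual ones.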
+ +**Check B.bis — no internal symbol in shall clause** + +Flag any of the following patterns inside the shall body: +- An identifier in backticks (`` `foo_bar` ``) that is NOT one of the external-surface classes whitelisted for backticks: CLI flags (``--host``), env vars (``APP_LICENSE_KEY``), TOML config keys / section headers (``[myapp.export_config]``, ``links_delimiter``), protocol tokens (``HMAC-SHA256``), HTTP routes (``/itemtypes``). Internal function / method names, private variables, and implementation identifiers (`lower_snake` / `camelCase` symbols not in the whitelist) stay bare. +- A function-call shape like `some_func(...)` in the shall body. +- A private-looking variable reference (`self.x`, `obj._y`). + +The whitelist mirrors `pharaoh-req-from-code` Rule 7. Project-internal TOML keys that never appear in public docs are still acceptable in backticks — a tester must copy-paste them verbatim — so do NOT require public-doc evidence on the whitelisted classes. + +If found, re-draft to describe the observable behavior without naming the internal symbol (see Step 5 guideline 6). After 2 retries, emit with `[DIAGNOSTIC] shall body names internal symbol — post-emit revision required.` + +**Check C — parent resolves** + +Confirm the parent ID under the catalog-declared link relation (`:satisfies:` for classical +reqs, `:safety_goal_for:` / `:derives_from:` / similar per the catalog's `required_links` +for safety-V types) is present in needs.json (already checked in Step 3, re-confirm before +emit). + +**Check D — ID unique** + +Confirm chosen ID does not appear in needs.json (already checked in Step 4, re-confirm before emit). + +**Check E — required fields present** + +Verify the directive block includes every field from `required_fields` in +`artefact-catalog.yaml`. For the bundled default profile: `id`, `status`, `satisfies` must +all be present. For a safety-V type whose catalog entry declares e.g. 
+`required_fields: [id, status, safety_goal_for, asil, severity, exposure, controllability]`, +every one of those options must appear (use `<TBD>` placeholder when the user did not +supply a value, and emit a `[FLAG]` line per placeholder). + +**Check F — required metadata fields present** + +For each entry in `required_metadata_fields` from the catalog, confirm the directive +emits the option with a non-empty value or a `<TBD>` placeholder. Emit a `[FLAG]` line +for every `<TBD>` so the user knows release will block until it is filled. + +--- + +### Step 7: Emit the directive block + +Produce the final RST directive. Use `target_level` verbatim as the directive name. +Emit every option from `required_fields` (with values supplied by the user or `<TBD>` +placeholders), every link in `required_links` (with the user-supplied parent ID), and +every option in `required_metadata_fields` (value or `<TBD>`): + +```rst +.. <target_level>:: <title> + :id: <id> + :status: draft + :<required_link_relation>: <parent_link> + <... every other field from required_fields ...> + <... every option from required_metadata_fields ...> + + <body per Step 5 — single shall for shall-clause types, declarative event for hazards> +``` + +Formatting rules: +- Directive line: `.. <target_level>:: <title>` with exactly one space after `..` and one space after `::`. +- Options: indented by 3 spaces. Format: `:<option>: <value>`. +- One blank line between options block and content body. +- Content body: indented by 3 spaces. +- One blank line after the content body. + +If `:verification:` (or any other field) is set to `<TBD>`, append a flagged note per +placeholder after the block: + +``` +[FLAG] :verification: set to tc__TBD — link to a real test case before promoting to status=valid. +[FLAG] :asil: set to <TBD> — assign an ASIL level (A/B/C/D/QM) before release. +``` + +Include optional fields only if the user explicitly provided values for them (e.g. rationale, tags). 
Do not invent optional field values. + +--- + +## Guardrails + +**G1 — Ambiguous level** + +If feature_context describes behaviour at two distinct abstraction levels (e.g. component-level and unit-level) without the user specifying a target level, FAIL before drafting: + +``` +FAIL: context is ambiguous between level <X> and level <Y>. +Re-run with target_level specified (e.g. "component-level" or "unit-level"). +``` + +**G2 — Parent not in needs.json** + +If parent_link does not resolve in needs.json (as detected in Step 3), FAIL: + +``` +FAIL: parent "<parent_link>" not found in needs.json. +Specify an existing parent ID or build the project first. +``` + +**G3 — Multiple shall after 2 retries** + +If the drafted body still contains more than one `shall` or a conjunction in the shall clause after 2 self-correction attempts, emit the directive as-is with a diagnostic: + +``` +[DIAGNOSTIC] Body failed shall-atomicity check after 2 retries. +Issue: <check A or B description> +Action required: manually edit the body before review. +``` + +Do not block the user — emit with the diagnostic so they can proceed and fix manually. + +**G4 — Feature context too vague** + +If feature_context does not contain enough specifics to write a measurable shall clause (no actor, no action, no condition), FAIL and ask for more detail: + +``` +FAIL: feature_context is too vague to draft a verifiable requirement. +Please provide: (1) the system/component subject, (2) the required action or property, +(3) any measurable condition or threshold. +``` + +--- + +## Advisory chain + +After successfully emitting the directive, advise: + +``` +Consider running `pharaoh-req-review <new_id>` to audit against ISO 26262-8 §6 axes. +``` + +Do not show this if the emit included a `[DIAGNOSTIC]` (the user has a more urgent issue to fix first).
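The advisory rule reduces to a single condition. As a small Python sketch (the function name `advisory_line` and the idea of detecting a diagnostic by scanning the emitted text for the literal `[DIAGNOSTIC]` marker are assumptions for illustration, not part of the skill):

```python
from typing import Optional


def advisory_line(new_id: str, emitted_text: str) -> Optional[str]:
    """Return the advisory sentence, or None when the emit carries a diagnostic."""
    # Assumption: a diagnostic is recognisable by the literal "[DIAGNOSTIC]" marker.
    if "[DIAGNOSTIC]" in emitted_text:
        return None
    return (
        f"Consider running `pharaoh-req-review {new_id}` "
        "to audit against ISO 26262-8 §6 axes."
    )
```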
+ +--- + +## Worked examples + +### Example 1 — classical guide-level requirement (Score) + +**Feature context (user input):** +> "The brake controller shall engage the ABS pump when wheel slip exceeds a calibrated +> threshold. target_level: gd_req. Parent: gd_req__brake_system_safety" + +**Step 1 result:** catalog entry `gd_req` has `required_fields: [id, status, satisfies]`, +`required_links: [satisfies]`, optional includes `verification`. id-conventions prefix +`gd_req`, separator `__`, id_regex `^[a-z][a-z_]*__[a-z0-9_]+$`. + +**Step 2 result:** needs.json found at `docs/_build/needs/needs.json`; 185 IDs loaded. + +**Step 3 result:** `gd_req__brake_system_safety` found in needs.json. Parent valid. + +**Step 4 result:** local = `abs_pump_activation`; candidate = `gd_req__abs_pump_activation`; not in needs.json; passes id_regex. ID assigned. + +**Step 5 draft body** (shall-clause type): +> The brake controller shall engage the ABS pump when measured wheel slip exceeds the calibrated activation threshold. + +**Step 6 checks:** `shall` count = 1 ✓; no conjunction in shall clause ✓; parent resolves ✓; ID unique ✓; required fields all present ✓; no `required_metadata_fields` declared. + +**Step 7 output:** + +```rst +.. gd_req:: ABS pump activation on wheel slip threshold + :id: gd_req__abs_pump_activation + :status: draft + :satisfies: gd_req__brake_system_safety + :verification: tc__TBD + + The brake controller shall engage the ABS pump when measured wheel slip exceeds + the calibrated activation threshold. +``` + +``` +[FLAG] :verification: set to tc__TBD — link to a real test case before promoting to status=valid. + +Consider running `pharaoh-req-review gd_req__abs_pump_activation` to audit against ISO 26262-8 §6 axes. +``` + +### Example 2 — ISO 26262 functional safety requirement (FSR) + +**Feature context (user input):** +> "The braking ECU shall command vehicle deceleration to a defined safe state within +> 100 ms of detecting unintended ABS pump activation. 
target_level: fsr. +> Parent (the safety goal it implements): sg__no_unintended_abs_pump_activation." +> +> safety_classification: `{asil: "B", safe_state: "wheels_unlocked_no_abs_modulation"}` + +**Step 1 result:** catalog entry for `fsr` declares +`required_fields: [id, status, safety_goal_for, asil, safe_state]`, +`required_links: [safety_goal_for]`, `required_metadata_fields: [asil, safe_state]`, +`lifecycle: [draft, reviewed, approved]`. id-conventions prefix `fsr`, separator `__`. + +**Step 2 result:** needs.json found at `docs/_build/needs/needs.json`; 240 IDs loaded. + +**Step 3 result:** `sg__no_unintended_abs_pump_activation` found in needs.json (type +`safety_goal`). Catalog-declared link relation for `fsr` is `safety_goal_for` (not +`satisfies`) — the skill uses that key, not a hardcoded `:satisfies:`. + +**Step 4 result:** local = `safe_deceleration_on_unintended_pump`; candidate = +`fsr__safe_deceleration_on_unintended_pump`; passes id_regex. ID assigned. + +**Step 5 draft body** (shall-clause type — `fsr` is a requirement-shaped safety-V type): +> The braking ECU shall command vehicle deceleration to the +> `wheels_unlocked_no_abs_modulation` safe state within 100 ms of detecting unintended +> ABS pump activation. + +**Step 6 checks:** `shall` count = 1 ✓; no conjunction ✓; parent resolves ✓; ID unique ✓; +required fields `[id, status, safety_goal_for, asil, safe_state]` all present (asil and +safe_state from `safety_classification`) ✓; required_metadata_fields `[asil, safe_state]` +both non-empty ✓. + +**Step 7 output:** + +```rst +.. fsr:: Safe deceleration on unintended ABS pump activation + :id: fsr__safe_deceleration_on_unintended_pump + :status: draft + :safety_goal_for: sg__no_unintended_abs_pump_activation + :asil: B + :safe_state: wheels_unlocked_no_abs_modulation + + The braking ECU shall command vehicle deceleration to the + wheels_unlocked_no_abs_modulation safe state within 100 ms of detecting + unintended ABS pump activation. 
+``` + +``` +Consider running `pharaoh-req-review fsr__safe_deceleration_on_unintended_pump` to audit +against ISO 26262-8 §6 axes (the same checklist applies to FSRs as to classical reqs). +``` + +For a hazard-typed draft (`target_level: hazard`), the catalog typically declares +`required_metadata_fields: [severity, exposure, controllability, asil]`. The skill emits +those options as `<TBD>` placeholders when `safety_classification` does not include +them, and the body is a single declarative event sentence (no `shall`) — see Step 5. + +## Last step + +After emitting the artefact, invoke `pharaoh-req-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-req-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. diff --git a/.github/agents/pharaoh.req-from-code.agent.md b/.github/agents/pharaoh.req-from-code.agent.md index 9e7bb93..597151d 100644 --- a/.github/agents/pharaoh.req-from-code.agent.md +++ b/.github/agents/pharaoh.req-from-code.agent.md @@ -7,4 +7,295 @@ handoffs: [] Use when reading one source file and emitting one or more requirement RST directives (typed by `target_level`) describing the observable behavior in that file. Queries shared Papyrus for canonical terms before naming concepts; writes newly surfaced concepts back. Does not draft architecture, plans, or FMEA. 
-See [`skills/pharaoh-req-from-code/SKILL.md`](../../skills/pharaoh-req-from-code/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-from-code + +## Shall-clause rules + +The seven rules below govern what a CREQ's body looks like. All seven apply to every emission; violations of any rule mean the emission is a draft, not a valid CREQ. + +### Rule 1 — Subject is the component (or an external actor) + +The grammatical subject of the shall clause is either: + +1. the component / capability (e.g. "The CSV Importer", "The export CLI", "The API client"). The component name comes from the parent feat title, from a named role inside the feat scope, or from the project's `artefact-catalog.yaml`. +2. an external actor (user, operator, caller, third-party service) whose action the component shall respond to — acceptable when the feat is an interactive CLI, an API, or any user-facing interface. Example: "The authenticated user shall receive a non-zero exit code on malformed input." + +Never a Python function, class, private method, or file. 
+ +Code-narration subjects (❌) vs component subjects (✅): + +| Bad | Good | +|---|---| +| ``from_source_a`` shall call ``check_license`` | The Source A Connector shall reject unlicensed use | +| ``_apply_cli_overrides`` shall override credentials | The Source A CLI shall accept server credentials on the command line | +| ``FooClient._classify_connection_error`` shall raise ``FooAuthenticationError`` on HTTP 401 | The Source A Connector shall signal an authentication failure when the server rejects the configured credentials | +| ``ItemLoader`` shall load items from ``ImporterConfig.input_path`` via ``load_items`` | The Importer shall read items from the configured input path | + +Tests: the component form is falsifiable by a tester who can't read the source (black-box), stable across refactors (renaming `_apply_cli_overrides` does not invalidate the CREQ), and traceable to the feat via `:satisfies:`. + +### Rule 2 — No internal implementation details in the body + +No internal / private function names, no leading-`_` methods, no class-dot-method references, no file paths, no line numbers ever (`around line 165`, `in commands/foo.py`, `at the top of the module` — all banned). Traceability to code lives in `:source_doc:`; the shall clause carries behavior. These two jobs stay separate. + +Known prior failures this rule catches: + +- `"record_key drives the create-vs-update decision"` — names an internal field AND gets the mechanism wrong (actual decision variable is a different id attribute on the same record). +- `"parse_timestamp raises on unparseable input"` — names an internal function AND misstates the behavior (the function returns `None`; the caller raises). + +A clean behavioral shall with zero backticks and one `:source_doc:` is preferred over a code-narration shall with ten backticks. 
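Some of Rule 2's bans are mechanical enough to lint for. The sketch below is an illustration in Python under stated assumptions, not the grounding check the project ships: the function name `internal_detail_findings` and every regular expression in it are hypothetical, and the file-path pattern assumes the extensions listed under Rule 3 are the ones worth looking for. It cannot catch bare internal names such as `record_key`; those still need a reviewer.

```python
import re

# Hypothetical patterns; each one approximates a single ban from Rule 2.
_PATTERNS = {
    # e.g. _apply_cli_overrides (ignores the double underscore inside need IDs)
    "leading-underscore name": re.compile(r"(?<!\w)_[A-Za-z]\w*"),
    # e.g. ImporterConfig.input_path or FooClient._classify_connection_error
    "class-dot-method reference": re.compile(r"\b[A-Z]\w*\.[A-Za-z_]\w*"),
    # e.g. commands/foo.py
    "file path": re.compile(r"[\w./-]+\.(?:py|rs|ts|go|cpp|c|java)\b"),
    # e.g. "around line 165"
    "line number": re.compile(r"\bline\s+\d+\b", re.IGNORECASE),
}


def internal_detail_findings(body: str) -> list[str]:
    """Return the label of every Rule 2 pattern that occurs in a shall body."""
    return [label for label, pattern in _PATTERNS.items() if pattern.search(body)]
```

An empty list does not prove the body is clean; it only means none of these four surface patterns matched.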
+ +### Rule 3 — `:source_doc:` must point at the implementing source code file + +Every emitted CREQ carries `:source_doc:` pointing at a real source file — typically `.py`, `.rs`, `.ts`, `.go`, `.c`, `.cpp`, `.java` under the project's source tree (e.g. `src/<project>/csv/csv2needs.py`). Pointing `:source_doc:` at the spec RST file itself or at a prose feature doc is a validation failure — the spec RST is where the requirement lives, not where the behavior is implemented. + +When a CREQ's behavior spans multiple source files, pick the file that owns the primary observable (usually the converter module, not a CLI dispatcher). `pharaoh-req-code-grounding-check` axis #8 (`source_doc_resolves`) fails if the cited file is the spec RST or missing entirely. + +### Rule 4 — CREQ adds constraint beyond the parent feat + +A CREQ whose shall clause paraphrases the feat capability with the same subject, verb, and scope is tautological and MUST NOT be emitted. + +Test before every emission: *what constraint does this CREQ impose that the feat body alone does not?* Answers that count: a concrete pre-condition, a post-condition, an error contract, a default value, an ordering guarantee, a quantitative bound, a specific field / flag / command name. Answers that don't: just naming a sub-capability in the imperative. + +Bad (tautology) — feat says "The CSV Connector enables bidirectional exchange between Sphinx-Needs and CSV files"; CREQ says "The CSV Importer shall convert a user-supplied CSV file into a needs.json file when the user invokes `<cli> <subcmd> from-csv`" — zero added constraint. + +Good — "The CSV Importer shall fail with a non-zero exit code and a single-line error message on the first row whose mapped `id` column is missing or empty, without writing a partial `needs.json`" — specific precondition, specific post-condition, specific boundary observable. 
+ +### Rule 5 — Enumerate boundary-observable code structures exhaustively + +For each `:source_doc:` file, enumerate and emit one CREQ per boundary-observable structure: + +1. **Every raised exception class that escapes the module's public surface.** "The component emits `FooError` when <condition>." +2. **Every published config key the module reads.** TOML keys, env vars, dataclass fields — one CREQ per key, naming the key and the default. +3. **Every public function or CLI subcommand exposed by the module.** CLI subcommand = one CREQ; exported library function = one CREQ. + +Expected floor per typical connector module (200-500 LOC, 3-8 exception classes, 5-10 config keys, 1-3 public functions): **12-20 CREQs per module**. Under-decomposition below this floor means structures got bundled into compound shalls — split them. + +Each emitted block's body has exactly one `shall` clause. Zero intra-clause conjunctions joining modal-verb phrases (`, and shall` / ` and shall` / ` or raise` / `, or ` — all splits). Multiple observable behaviors = multiple CREQs. Intra-clause conjunctions are a hard fail regardless of behavioral quality; split the block before returning. + +### Rule 6 — `:verification:` field is required + +Every emitted CREQ carries `:verification:` with at minimum the placeholder `tc__TBD`. Absence is a schema failure. If the project uses a different link name for the req→test relation (`verifies`, `covered_by`), declare it in `[[needs.extra_links]]`; the default placeholder stays required. + +### Rule 7 — Backticks are for code / protocol tokens only + +Backticks signal "copy this string verbatim — it is a code symbol, config key, or protocol token". NOT for format acronyms (``CSV``, ``JSON``, ``XML``, ``TOML``, ``HTML``), document-type nouns (``document``, ``file``, ``row``), or emphasis (``default``, ``required``). + +Test: would a tester copy-paste this string into test code or configuration? If yes, backtick it. If not, leave it bare. 
+ +Backticks ARE acceptable on external-surface identifiers: CLI flags (``--host``), env vars (``APP_LICENSE_KEY``), TOML config keys (``[myapp.export_config]``, ``links_delimiter``), HTTP routes (``/itemtypes``), protocol tokens (``HMAC-SHA256``). + +### Config-value citation (see Rule 3 + Rule 7) + +Example: a consumer module that reads `self.config.output_format` cites ``output_format`` (its own reference form), not the default-value literal (``default_format``) which lives in the config module. Citing the default-value form creates a false paper-trail — the grounding-check axis will fail because the shall clause names a symbol the cited file does not contain. + +## When to use + +Invoke with a single source file (any language) assigned to this agent and (optionally) a shared Papyrus workspace for cross-agent terminology coordination. Emit one requirement (of type `target_level`) per distinct boundary-observable behavior expressed in the file. Do NOT emit reqs for behavior not grounded in the file (that is drafting, not reverse-engineering). Do NOT attempt architecture, verification plans, or FMEA — those are separate skills. + +## Tailoring awareness + +Two axes are tailored, both read at runtime from the consumer project's `ubproject.toml` / `pharaoh.toml`: + +**Type axis** — need types and ID conventions are project-specific. Read `[[needs.types]]` entries from `ubproject.toml` (or `.pharaoh/project/id-conventions.yaml` if present) — each has `directive` and `prefix`. Do NOT hardcode `comp_req` as the only acceptable type. The caller passes `target_level` — use it verbatim as the directive name (in `rst` emit) or as the `type` field (in `codelinks_comment` emit). + +**Emit axis** — whether to emit RST directive blocks or sphinx-codelinks-compatible one-line comments. Resolution order: + +1. `emit_override` input (per-call). +2. `pharaoh.toml [pharaoh.codelink_comments].mode` — `"codelinks"` → `codelinks_comment`; `"backref"` or absent → `rst`. +3. 
Auto-detect: `ubproject.toml` contains `[codelinks.projects.*]` → `"codelinks_comment"`; otherwise `"rst"`. +4. Fallback: `"rst"`. + +If `on_missing_config == "prompt"` (default) AND tailoring is missing (no `target_level` in `[[needs.types]]`, or emit mode unresolvable), the skill returns `{status: "needs_confirmation", proposal: {...}}` with a tailoring patch the caller can confirm. Caller confirms → tailoring gets written (typically via `pharaoh-tailor-fill`) → re-invoke with `on_missing_config="use_default"` for silent proceed. + +## Atomicity + +- (a) Indivisible — one file in → N reqs out. No I/O beyond file read + optional Papyrus query/write + req emit. Emits in exactly one representation per call (`rst` OR `codelinks_comment`). +- (b) Input: `{file_path, target_level, shared_context_path?, papyrus_workspace?, reporter_id, parent_feat_ids?, emit_override?, codelinks_project_name?, on_missing_config?, allowed_ids?, split_strategy?}`. Output: single JSON object `{"reqs": [{"id", "title", "type", "body", "source_doc", "satisfies", "verification", "raw_rst"}, ...]}` for `emit=rst`, or `{"codelinks": [str, ...]}` for `emit=codelinks_comment`. On missing tailoring with `on_missing_config=prompt`: single JSON object `{status: "needs_confirmation", proposal}`. +- (c) Reward: language-parametric fixture — given `test_fixture.<ext>` (`.py` / `.cpp` / `.rs` / `.ts`) containing exactly 3 named symbols (`FooBar`, `BazQux`, `Quux`), emitted reqs must mention all 3 by canonical name. Directive name must equal `target_level`. If `parent_feat_ids` is non-empty, every emitted block MUST contain `:satisfies: <id1>, <id2>, ...` with all parents comma-joined. +- (d) Reusable across reverse-engineering workflows, spec drafting, standalone CI "are there reqs for this code?" gates. +- (e) Composable — strictly one phase. Never invokes `pharaoh-arch-draft`, `pharaoh-fmea`, `pharaoh-plan`. + +## Input + +- `file_path`: absolute path to the source file (any language). 
+- `target_level`: requirement artefact directive name as declared in the consumer project's `ubproject.toml` (e.g. `"comp_req"`, `"impl"`, `"spec"`). ID prefix is `target_level` + `__` unless `[[needs.types]].prefix` overrides. +- `shared_context_path` (optional): companion source file read by all agents in the fan-out (e.g. `common.cpp`). Read but NOT reverse-engineered. +- `papyrus_workspace` (optional): path to `.papyrus/` for canonical-term coordination. Absent → no-memory mode (skip Steps 1 and 3). +- `reporter_id`: short identifier for this agent (e.g. `req-from-code:csv2needs.py`). +- `parent_feat_ids` (optional): list of parent feature IDs. When non-empty, every emitted block gets `:satisfies: <id1>, <id2>, ...` comma-joined. +- `allowed_ids` (optional): pre-allocated ID list. When provided, emitter MUST NOT invent IDs outside this list; emits only `len(allowed_ids)` reqs max; overflow logged as a warning comment. +- `split_strategy` (optional): `"single"` (default, whole file as one scope, target 1-5 reqs), `"top_level_symbols"` (per top-level symbol, target 1-3 reqs/symbol), or `"sections"` (per `# ---` / `// ===` horizontal-rule marker, target 1-3 reqs/section). Plans supply this via `${heuristics.split_strategy(...)}`. + +## Output + +A single JSON object. The top-level key names the emit mode: `reqs` for `emit=rst`, `codelinks` for `emit=codelinks_comment`. Downstream skills key off the presence of one or the other. + +### `emit=rst` + +```json +{ + "reqs": [ + { + "id": "<id_prefix><snake_case_id>", + "title": "<short_title>", + "type": "<target_level>", + "body": "The <Component subject> shall <observable behavior>.", + "source_doc": "<path to implementing source file>", + "satisfies": ["<parent_1>", "<parent_2>"], + "verification": "tc__TBD", + "raw_rst": ".. 
<target_level>:: <short_title>\n :id: ...\n :status: draft\n :satisfies: ...\n :source_doc: ...\n :verification: tc__TBD\n\n <body>\n" + } + ] +} +``` + +Field semantics: + +- `id` — `<id_prefix><snake_case_id>`. `<id_prefix>` defaults to `target_level` (`comp_req` → `comp_req__foo_01`). If `[[needs.types]].prefix` declares `"CREQ_"`, use `CREQ_foo_01`. +- `type` — equals input `target_level`. +- `satisfies` — list of parent feat ids. Empty list when `parent_feat_ids` was empty. Always present (use `[]`). +- `raw_rst` — exactly the RST directive block as it would appear in an `.rst` file. Downstream review / annotation skills read `raw_rst` when they need the directive text; helpers that consume `reqs` (e.g. by-stem grouping) read `id` / `source_doc`. + +### `emit=codelinks_comment` + +```json +{ + "codelinks": [ + "@ <title>, <id>, <target_level>, [<parent_1>, <parent_2>, ...]" + ] +} +``` + +Each `codelinks[i]` is one comment line matching the project's `[codelinks.projects.<name>.analyse.oneline_comment_style]`: + +- Tailored `start_sequence` (default `@`). +- Tailored `field_split_char` (default `,`) with surrounding spaces. +- Field order matches tailored `needs_fields`. +- Values escaped per sphinx-codelinks rules. +- No language comment prefix (that is `pharaoh-req-codelink-annotate`'s job). + +### `status == "needs_confirmation"` + +When tailoring is missing and `on_missing_config == "prompt"`, output is a single JSON object `{"status": "needs_confirmation", "proposal": {...}}`. Downstream consumers check for this shape before parsing as `reqs` / `codelinks`. + +## Output schema + +Validated as `json_obj` by `pharaoh-output-validate`. The validator checks the top-level shape, then per-item shape against the regexes below. + +**Stage 1 — block recognizer (Python regex, `re.MULTILINE`):** + +```regex +^\.\. (?P<directive>[a-z_]+)::\s+(?P<title>.+)$ +(?P<options>(?:^ :[a-z_]+:.*$\n?)+) +(?:^\n (?P<body>[\s\S]+?))? 
+(?=^\.\.\s|\Z) +``` + +Identifies one directive block bounded by the next `.. ` at column 0 or end of input. `re.MULTILINE` without `re.DOTALL` keeps `.` line-bounded; options cannot leak into adjacent blocks. + +**Stage 2 — option enumeration (on the recognizer's `options` capture):** + +```regex +^ :(?P<option>[a-z_]+):\s*(?P<value>.*)$ +``` + +`re.finditer` with `re.MULTILINE` enumerates every option/value pair. + +**Validator checks (per `reqs[*]`):** + +1. `raw_rst` matches Stage 1 + Stage 2 — block is well-formed. +2. `raw_rst` directive name equals `type` and equals input `target_level`. +3. Stage 2 on `raw_rst` yields at least `id`, `status`, `source_doc`, and `verification`; values match the corresponding top-level fields. +4. If `parent_feat_ids` was provided: `satisfies` field is non-empty and lists every parent id; `raw_rst` `:satisfies:` (or tailored child→parent link name) value matches. +5. Every option in `raw_rst` is either declared in `ubproject.toml` `[[needs.types]]`, a built-in sphinx-needs option, or a Pharaoh convention option. Reject unknown names (catches typos like `subsatisfies`). +6. If `allowed_ids` was provided: every `reqs[*].id` is a member of `allowed_ids`. + +**`emit=codelinks_comment`** — each `codelinks[*]` string must parse via sphinx-codelinks `oneline_parser.parse_line()` against the tailored `oneline_comment_style`. + +## Process + +### Step 1: Query Papyrus for canonical terms BEFORE naming + +Only applies if `papyrus_workspace` is provided. For each type / function / concept you may name in a req: + +1. Form a short semantic query ("what do we call the subsystem that supervises other monitors"). +2. Invoke `pharaoh-context-gather` with `mode="semantic"`. Semantic mode is required — substring recall silently misses morphological synonyms. +3. If a canonical appears in the top-3 results, use its exact spelling (preserve case). +4. If no match, plan to introduce a new canonical in Step 3. 
+ +### Step 2: Read the source file + +Read `file_path` and, if provided, `shared_context_path`. Identify boundary-observable behaviors grounded in control flow and data flow. Ignore internal helpers, log messages, assertion text. + +Apply `split_strategy`: + +- `"single"` (default): whole file as one scope. Target 1-5 reqs. +- `"top_level_symbols"`: enumerate top-level symbols via the patterns in [`../shared/public-symbol-patterns.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/public-symbol-patterns.md). Emit per symbol. Target 1-3 reqs per symbol. +- `"sections"`: split at `^#\s*={3,}` / `^//\s*={3,}` markers. Target 1-3 reqs per section. + +In plan-driven runs the `${heuristics.split_strategy(...)}` helper picks per-file by LOC (≤500 → single; 500-2000 with markers → sections; else → top_level_symbols). + +### Step 3: Record newly surfaced concepts in Papyrus + +Only applies if `papyrus_workspace` is provided. For each concept you will mention and that Step 1 did not return, invoke `pharaoh-decision-record`: + +- `type`: `"fact"` +- `canonical_name`: idiomatic casing for the source language (CamelCase for types; snake_case for functions/fields in Python/Rust/C; camelCase in TypeScript/Java). Preserve what the source uses. +- `body`: one sentence. +- `reporter_id`: your `reporter_id` input. +- `tags`: `["origin:req-from-code", "file:<basename>"]`. + +On `"duplicate"`: a concurrent agent raced you; re-query via `pharaoh-context-gather`, adopt the existing spelling, rewrite your draft to match. + +### Step 4: Resolve tailoring (type + emit mode) + +Read `<project_root>/ubproject.toml` and `<project_root>/pharaoh.toml`. + +**Type resolution:** find `[[needs.types]]` entry where `directive == target_level`. Extract `prefix`. If not declared: +- `on_missing_config == "fail"` → FAIL. +- `on_missing_config == "prompt"` → emit `{status: "needs_confirmation", proposal: ...}` and return without emitting reqs. 
+- `on_missing_config == "use_default"` → use `<target_level>__` silently. + +**Emit mode:** per the Tailoring awareness order. Log the resolved mode on the header line. + +**Codelinks format** (only if `emit == "codelinks_comment"`): resolve `[codelinks.projects.<name>.analyse.oneline_comment_style]` via `codelinks_project_name` or by matching `file_path` against each project's `source_discover.src_dir`. Zero or multiple matches with `on_missing_config != "fail"` → `needs_confirmation`. `on_missing_config == "fail"` → FAIL. + +### Step 5a: Emit — `rst` mode + +For each boundary-observable behavior (per Rule 5 enumeration): + +- `<short_title>` — 3-6 word summary. +- `:id: <id_prefix><filename_stem>_<n>` — `<id_prefix>` resolved in Step 4. File basename (stem, snake_case) as disambiguator. Examples: `comp_req__csv2needs_01`, `CREQ_csv2needs_01`. +- `:status: draft`. +- `:satisfies: <parent_1>, ...` — iff `parent_feat_ids` non-empty. All parents comma-joined. If `[[needs.extra_links]]` declares a different outgoing name (e.g. `realizes`), use that instead. +- `:source_doc: <path to implementing source file>` — per Rule 3. +- `:verification: tc__TBD` — per Rule 6. +- Body — single shall clause, component subject (Rule 1), no internals (Rule 2), adds constraint (Rule 4), atomicity + no conjunctions (Atomicity rule above). Canonical names from Steps 1/3. + +### Step 5b: Emit — `codelinks_comment` mode + +For each behavior, emit one line that sphinx-codelinks' oneline parser would read back into a need equivalent to what `rst` mode would produce. Follow tailored `needs_fields` order and escape rules. Do NOT include the language comment prefix — that is `pharaoh-req-codelink-annotate`'s concern. + +The `links` field renders as `[<parent_1>, ...]` when `parent_feat_ids` non-empty, else `[]` (or omitted if tailored `default = []`). The body shall-clause does NOT fit on a one-line comment — implied by the title and lost in this mode. 
For full shall-clause text use `emit="rst"`. + +Target: 1-5 reqs per file (per split_strategy). Fewer than 1 only if the file has no observable behavior; more than 5 suggests over-decomposition. + +### Step 6: Return + +Emit one JSON object per the Output shape (`{"reqs": [...]}` for `emit=rst`, `{"codelinks": [...]}` for `emit=codelinks_comment`). Build each `reqs[i]` by populating `id`, `title`, `type`, `body`, `source_doc`, `satisfies` (use `[]` when empty), `verification`, and `raw_rst` (the literal RST block that would render the directive). Nothing else on stdout — no `# emit=...` header line, no prose wrapper, no fenced code block. + +## Failure modes + +- `file_path` not readable → return empty output (no reqs). +- `pharaoh-context-gather` errors → log and proceed as if no match found. +- `pharaoh-decision-record` returns `"error"` (not `"duplicate"`) → log and proceed. Do not retry. + +## Composition + +Under `pharaoh-execute-plan`, a plan emitted by `pharaoh-write-plan` dispatches N instances of this skill via a `foreach` task over the file list. Per-CREQ review is scheduled as explicit top-level `review_comp_reqs` + `grounding_check_comp_reqs` + `api_coverage_comp_reqs` plan tasks (see `pharaoh-write-plan` templates); the plan DAG enforces them as dependencies of `quality_gate`. Direct out-of-plan invocation by a human auditor is acceptable; the caller is responsible for running the sibling reviews if coverage matters. + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` reading the explicit plan-task output files. 
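The LOC-based choice of `split_strategy` described in Step 2 could look roughly like the sketch below. It is illustrative only: `pick_split_strategy` is a hypothetical name, not the actual `heuristics.split_strategy` helper, and treating a file of exactly 500 lines as `single` follows the "≤500" bound stated in that step.

```python
import re

# Section markers from Step 2: "# ===" or "// ===" at the start of a line.
SECTION_MARKER = re.compile(r"^(?:#|//)\s*={3,}", re.MULTILINE)


def pick_split_strategy(source: str) -> str:
    """Hypothetical per-file choice of split_strategy by line count."""
    loc = len(source.splitlines())
    if loc <= 500:
        return "single"
    if loc <= 2000 and SECTION_MARKER.search(source):
        return "sections"
    # Large files, or mid-sized files without section markers.
    return "top_level_symbols"
```

A mid-sized file without markers falls through to `top_level_symbols`, which matches the "else" branch of the rule.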
diff --git a/.github/agents/pharaoh.req-regenerate.agent.md b/.github/agents/pharaoh.req-regenerate.agent.md index 28f7407..29fec1f 100644 --- a/.github/agents/pharaoh.req-regenerate.agent.md +++ b/.github/agents/pharaoh.req-regenerate.agent.md @@ -7,4 +7,317 @@ handoffs: [] Use when regenerating a single sphinx-needs requirement to address findings from pharaoh-req-review. Consumes the original RST + findings JSON, emits a revised RST directive that passes the flagged axes. -See [`skills/pharaoh-req-regenerate/SKILL.md`](../../skills/pharaoh-req-regenerate/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-regenerate + +## When to use + +Invoke when `pharaoh-req-review` has produced a findings JSON for a single requirement and +`overall` is `"needs_work"` or `"fail"`. Provide the original RST directive block and the +findings JSON together. + +Do NOT use to re-author a requirement from scratch — use `pharaoh-req-draft` for new requirements. +Do NOT invoke when `overall` is `"pass"` and no subjective axis scored below 2 (see Guardrail G2). +One invocation addresses one requirement. If multiple requirements need regeneration, run this +skill once per requirement. 
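As a rough illustration of the invocation rule above, a caller could gate regeneration on the findings JSON as sketched here. The function name `should_regenerate` is hypothetical; the axis names and the `< 2` threshold are the ones this specification uses for its subjective axes.

```python
SUBJECTIVE_AXES = ("unambiguity_prose", "comprehensibility", "feasibility")


def should_regenerate(findings: dict) -> bool:
    """Return True when pharaoh-req-regenerate is worth invoking (hypothetical helper)."""
    if findings.get("overall") in ("needs_work", "fail"):
        return True
    # overall == "pass": still regenerate if a subjective axis scored below 2.
    axes = findings.get("axes", {})
    for name in SUBJECTIVE_AXES:
        score = axes.get(name, {}).get("score")
        if isinstance(score, (int, float)) and score < 2:
            return True
    return False
```

Deferred (`"deferred"`) and null scores are ignored by the numeric check, so set-level and chain-level axes never trigger a regeneration on their own.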
+ +--- + +## Inputs + +- **original_rst** (from user): the RST directive block produced by `pharaoh-req-draft` or + retrieved from needs.json — must include the `:id:` option +- **findings_json** (from `pharaoh-req-review`): the full structured findings JSON for that + requirement — must contain the `axes` and `action_items` fields +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — required/optional fields for the target artefact type + - `id-conventions.yaml` — id_regex for output validation + - `checklists/requirement.md` — axis definitions for self-check +- **needs.json**: used to confirm the re-issued need-id is still unique and parent links resolve + +> Note: A `shared/tailoring-access.md` helper module is planned. Until it exists, Steps 1-2 below +> inline the tailoring-access logic directly. When that file is created, this skill should be +> updated to delegate to it. + +--- + +## Outputs + +A single revised RST directive block where every failing binary axis now passes and every +subjective axis that scored < 2 has been reworded to score ≥ 2. + +Preserved fields (must not change unless the review explicitly flagged them): +- `:id:` — never change the need-id +- `:status:` — preserve unless the `action_items` list contains an explicit status demotion +- `:satisfies:` — preserve the parent link + +If the review flagged a missing required field (schema axis), add the field with a best-effort +value and append a `[FLAG]` note identifying it. + +--- + +## Process + +### Step 1: Read tailoring + +Read `.pharaoh/project/artefact-catalog.yaml` and `.pharaoh/project/id-conventions.yaml`. + +Extract for the artefact type matching the directive prefix (e.g. 
`gd_req`): +- `required_fields` — every field that must be present +- `id_regex` — regex the output id must match + +If tailoring files are missing, fall back to built-in defaults (bundled example +requirement profile — `req` required: `[id, status, satisfies]`, id_regex: +`^[a-z][a-z_]*__[a-z0-9_]+$`). + +--- + +### Step 2: Parse findings_json + +Parse the findings JSON. If malformed (missing `axes` key, invalid JSON syntax, axis count < 11), +FAIL immediately — do not attempt partial regeneration: + +``` +FAIL: findings_json is malformed — <specific parse error>. +Re-run `pharaoh-req-review` on the requirement to obtain a valid findings JSON, +then re-invoke pharaoh-req-regenerate with the corrected output. +``` + +Extract: +- `need_id` — must match the `:id:` in original_rst; if mismatched, FAIL: + ``` + FAIL: findings_json.need_id "<a>" does not match original_rst :id: "<b>". + Supply matching RST and findings for the same requirement. + ``` +- `axes` — a map of axis → `{score, reason}` entries +- `action_items` — the list of strings describing required changes + +--- + +### Step 3: Identify axes to fix + +For each axis in `axes`: + +- Binary axis (score is 0 or 1): mark for fix if `score == 0` +- Subjective axis (score is 0–3): mark for fix if `score < 2` +- Deferred axis (score == "deferred") or null axis: skip — not fixable at single-req scope + +Build a prioritised fix list: +1. Binary failures first (blocking for `overall = "fail"`) +2. Subjective axes scoring 0 or 1 second + +--- + +### Step 4: Apply fixes + +Work through the fix list. For each axis: + +**atomicity / unambiguity_prose** (body contains > 1 `shall` or a conjunction in the shall clause): + +Read the original body. Identify the primary action. Discard secondary clauses. Rewrite as a +single `shall` sentence with no `, and`/`, or`/` and `/` or ` within the shall clause. +Preserve subject and measurable condition. 
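A rewritten body can be sanity-checked against this axis with a few lines of code. The sketch below is an assumption about how the textual rule might be mechanised, not the scorer Pharaoh actually ships; `is_atomic` is a hypothetical name, and it treats everything after the single `shall` up to the end of that sentence as the shall clause.

```python
import re


def is_atomic(body: str) -> bool:
    """Exactly one 'shall' and no 'and'/'or' inside the shall clause."""
    text = " ".join(body.split())  # collapse line breaks and repeated spaces
    if len(re.findall(r"\bshall\b", text, flags=re.IGNORECASE)) != 1:
        return False
    # Shall clause: from 'shall' to the end of that sentence.
    clause = re.split(r"\bshall\b", text, maxsplit=1, flags=re.IGNORECASE)[1]
    clause = clause.split(".", 1)[0]
    return re.search(r"\b(?:and|or)\b", clause, flags=re.IGNORECASE) is None
```

The word-boundary match covers both the `, and` and the bare ` and ` forms, so a single pattern is enough for the conjunction test.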
+ +**verifiability** (`:verification:` absent or does not resolve): + +If `:verification:` is absent: add `:verification: tc__TBD`. +If it is present but does not resolve: replace with `:verification: tc__TBD`. +Append `[FLAG] :verification: set to tc__TBD — link to a real test case before promoting to status=valid.` + +**schema** (required field missing): + +Add the missing field. If the value cannot be inferred from context, use a placeholder: +- `:satisfies:` missing: set to `TBD` and append `[FLAG] :satisfies: requires a real parent ID.` +- Other missing fields: add empty with `[FLAG]`. + +**unambiguity_prose** (score < 2, no shall-atomicity failure): + +Replace vague quantifiers (e.g. "sufficient", "quickly", "within a reasonable time") with +either a concrete criterion (if the original_rst or user context supplies one) or a clearly +labelled parameter placeholder: `<THRESHOLD_TO_BE_DEFINED>`. + +**comprehensibility** (score < 2): + +Expand the subject if missing or ambiguous. Remove undefined acronyms or expand them inline +on first use. Do not change the measurable claim. + +**feasibility** (score < 2): + +Add or tighten the constraint. If the claim is infeasible as stated, insert a +`[DIAGNOSTIC]` note asking the user to verify the feasibility constraint before re-reviewing. +Do not silently drop the constraint. + +--- + +### Step 5: Self-check + +After rewriting, run the same checks as `pharaoh-req-review` Step 3 (binary axes only): + +- Atomicity: exactly one `shall`, no conjunction in shall clause +- Internal consistency: no self-contradiction +- Schema: all required fields present and non-empty +- Verifiability: `:verification:` present. On a `status: draft` requirement, a placeholder value matching `^(tc|test_case)__TBD$` (or the project's `tailoring.verification_placeholder_regex`) scores 0.5 on review's verifiability axis — passing the binary gate and terminating the regen loop. 
Once status advances past draft, the placeholder stops passing and a real test-case id is required. + +If any binary check still fails after one rewrite attempt, attempt one further rewrite targeting +only the still-failing axis. If it still fails after two total attempts, emit the directive with: + +``` +[DIAGNOSTIC] axis "<name>" still failing after 2 rewrite attempts. +Manual correction required before re-running pharaoh-req-review. +``` + +--- + +### Step 6: Fixed-point protection + +If the revised directive is character-for-character identical to `original_rst`, the findings +indicated no change was needed or the rewrite is stuck. Stop and emit: + +``` +[DIAGNOSTIC] Regenerated output is identical to input after applying all fixes. +This may indicate the action_items are non-actionable at single-requirement scope +or that the findings_json and original_rst describe different requirements. +Review action_items manually. +``` + +Do not loop. Emit the original directive unchanged alongside the diagnostic. + +--- + +### Step 7: Emit the revised directive + +Emit the revised RST block using the same formatting rules as `pharaoh-req-draft`: + +```rst +.. <prefix>:: <title> + :id: <id> + :status: <status> + :satisfies: <parent_link> + :verification: <test_id> + + <revised single-sentence body> +``` + +Followed by any `[FLAG]` or `[DIAGNOSTIC]` notes as plain text after the block. + +--- + +## Guardrails + +**G1 — Malformed findings_json** + +If findings_json cannot be parsed or is missing required keys (`axes`, `action_items`): + +``` +FAIL: findings_json is malformed — <error>. +Re-run `pharaoh-req-review` to regenerate valid findings, then retry. +``` + +**G2 — No action required** + +If the findings_json reports `overall = "pass"` AND no subjective axis scores below 2: + +``` +NO-OP: findings_json overall = "pass" with all subjective axes ≥ 2. +This requirement already passes all evaluated axes. No regeneration needed. 
+``` + +Return this message only, without emitting a directive. + +**G3 — Fixed-point loop protection** + +If two sequential invocations produce the same output (caller detects this externally and passes +the same `original_rst` and `findings_json` again without change), emit: + +``` +[DIAGNOSTIC] Fixed-point detected: same input and same output on consecutive invocations. +Axes that cannot be resolved at single-requirement scope: <list>. +Escalate to manual authoring or review with a human expert. +``` + +--- + +## Advisory chain + +After successfully emitting the revised directive: + +``` +Consider running `pharaoh-req-review <need_id>` to confirm all axes now pass. +``` + +Do not show this if the emit included a `[DIAGNOSTIC]`. + +--- + +## Worked example + +**Original RST (from pharaoh-req-draft):** + +```rst +.. gd_req:: Brake pressure limit check + :id: gd_req__brake_pressure_limit + :status: draft + :satisfies: gd_req__brake_system_safety + + The brake controller shall detect excessive pressure and activate the pressure relief + valve and log the event. 
+``` + +**Findings JSON (from pharaoh-req-review):** + +```json +{ + "need_id": "gd_req__brake_pressure_limit", + "axes": { + "atomicity": {"score": 0, "reason": "conjunction 'and' joins two actions within the shall clause"}, + "internal_consistency": {"score": 1, "reason": "no self-contradiction"}, + "verifiability": {"score": 0, "reason": ":verification: field absent"}, + "schema": {"score": 1, "reason": "id, status, satisfies present"}, + "completeness": {"score": "deferred", "reason": "set-level axis"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis"}, + "maintainability": {"score": null, "reason": "chain-level axis"}, + "unambiguity_prose": {"score": 2, "reason": "mostly clear; slight scope ambiguity from compound action"}, + "comprehensibility": {"score": 3, "reason": "subject, action, condition all explicit"}, + "feasibility": {"score": 3, "reason": "standard automotive function"} + }, + "action_items": [ + "atomicity: split into one requirement — retain only the pressure relief valve activation; draft a separate requirement for event logging", + "verifiability: add :verification: field" + ], + "overall": "fail" +} +``` + +**Step 3:** Axes to fix — atomicity (binary, score 0), verifiability (binary, score 0). + +**Step 4 fixes:** +- atomicity: retain primary action (activate pressure relief valve); drop secondary clause (log the event). + Body: "The brake controller shall activate the pressure relief valve when detected brake line pressure exceeds the maximum safe operating limit." +- verifiability: add `:verification: tc__TBD`. + +**Step 5 self-check:** `shall` count = 1, no conjunction, required fields present, `:verification:` present. All binary axes pass. + +**Step 6:** Output differs from input. No fixed-point. + +**Step 7 output:** + +```rst +.. 
gd_req:: Brake pressure limit check + :id: gd_req__brake_pressure_limit + :status: draft + :satisfies: gd_req__brake_system_safety + :verification: tc__TBD + + The brake controller shall activate the pressure relief valve when detected brake + line pressure exceeds the maximum safe operating limit. +``` + +``` +[FLAG] :verification: set to tc__TBD — link to a real test case before promoting to status=valid. + +Consider running `pharaoh-req-review gd_req__brake_pressure_limit` to confirm all axes now pass. +``` diff --git a/.github/agents/pharaoh.req-review.agent.md b/.github/agents/pharaoh.req-review.agent.md index dabf554..dea9749 100644 --- a/.github/agents/pharaoh.req-review.agent.md +++ b/.github/agents/pharaoh.req-review.agent.md @@ -7,4 +7,337 @@ handoffs: [] Use when auditing a single sphinx-needs requirement against the 11 ISO 26262 Part 8 §6 axes. Emits structured findings JSON — per-axis pass/fail for mechanized axes, 0-3 score for subjective axes, with action items for any failure. -See [`skills/pharaoh-req-review/SKILL.md`](../../skills/pharaoh-req-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-req-review + +## When to use + +Invoke when the user has a single requirement (either just drafted by `pharaoh-req-draft`, or an +existing need-id present in needs.json) and wants per-axis inspection against ISO 26262-8 §6. + +Do NOT review sets of requirements — use `pharaoh-req-set-review` (planned, not in Phase 1). +Do NOT re-author or fix — invoke `pharaoh-req-regenerate` after reviewing. 
+ +--- + +## Inputs + +- **target**: either an RST directive block (from `pharaoh-req-draft`) OR a need-id present in + needs.json +- **tailoring** (from `.pharaoh/project/`): + - `checklists/requirement.md` — 11 ISO 26262-8 §6 axes + - `artefact-catalog.yaml` — required/optional fields per artefact type + - `id-conventions.yaml` — ID regex and prefix map +- **needs.json**: required for link resolution on the verifiability axis + +--- + +## Outputs + +A single JSON document with **no prose wrapper**. Shape: + +```json +{ + "need_id": "gd_req__example", + "axes": { + "atomicity": {"score": 0, "reason": "..."}, + "internal_consistency": {"score": 0, "reason": "..."}, + "verifiability": {"score": 0, "reason": "..."}, + "completeness": {"score": "deferred", "reason": "set-level axis — see note"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — see note"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — see note"}, + "schema": {"score": 0, "reason": "..."}, + "maintainability": {"score": null, "reason": "chain-level axis — see note"}, + "unambiguity_prose": {"score": 0, "reason": "..."}, + "comprehensibility": {"score": 0, "reason": "..."}, + "feasibility": {"score": 0, "reason": "..."} + }, + "action_items": ["..."], + "overall": "pass" +} +``` + +### Score scales — two distinct scales, never mixed + +**Binary (0 or 1) — mechanized axes** (exec-based scorers define the rule; skill applies the rule +description and records its verdict; the harness compares skill verdict to scorer verdict): + +| Axis | Score 0 = FAIL | Score 1 = PASS | +|---|---|---| +| `atomicity` | body contains more than one `shall`, or a coordinating conjunction joins modal verbs within the shall clause | body contains exactly one `shall`; no `, and`/`, or`/` and `/` or ` within the shall clause | +| `internal_consistency` | body contains a self-contradictory statement (e.g.
"shall always … unless required not to") | no self-contradiction detectable within this requirement | +| `verifiability` | `:verification:` field absent, empty, or link does not resolve in needs.json (and does not match a recognised placeholder) | `:verification:` present and resolves to a real need-id in needs.json | +| `schema` | any field listed under `required_fields` in artefact-catalog.yaml is missing from the directive | all required fields present and non-empty | + +**Verifiability placeholder pathway (score 0.5):** a drafted req with `status: draft` AND `:verification:` set to a recognised placeholder (matching `^(tc|test_case)__TBD$` by default, or the pattern declared under `tailoring.verification_placeholder_regex` in `checklists/requirement.md`) scores 0.5, not 0. This lets a regenerate loop terminate on an iteratively improved draft that still lacks a concrete test-case id. The placeholder pathway does not apply once status has advanced past `draft`. For `overall`, treat 0.5 as passing the binary gate but append `"verifiability: placeholder-only"` to `action_items`. + +**Ordinal (0–3) — subjective LLM-judge axes:** + +| Axis | 0 | 1 | 2 | 3 | +|---|---|---|---|---| +| `unambiguity_prose` | multiple conflicting interpretations possible | single interpretation but phrasing is awkward | single interpretation, minor phrasing issues | unambiguous and precise | +| `comprehensibility` | reader at adjacent abstraction level cannot follow | mostly unclear without extra context | mostly clear; minor jargon or ellipsis | fully self-contained and clear | +| `feasibility` | obviously infeasible given item-development constraints | feasible but significant unknowns | feasible with known engineering effort | clearly feasible, well-constrained | + +### Deferred set-level axes + +`completeness`, `external_consistency`, and `no_duplication` require the full set of sibling +requirements to compute meaningful signal. Scoring them against a single req out of 185 is noise. 
+These three axes are **deferred to `pharaoh-req-set-review`** (YAGNI in Phase 1). + +In the output JSON, record each as `{"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}`. + +### Chain-level axis + +`maintainability` (set survives regeneration fixed-point within 2 iterations) cannot be evaluated +at single-requirement or single-invocation scope — it requires running `pharaoh-req-regenerate` +and observing convergence. Record as `{"score": null, "reason": "chain-level axis — assess after pharaoh-req-regenerate runs"}`. + +### `overall` field + +Computed from the non-deferred, non-null axes only (atomicity, internal_consistency, verifiability, +schema, unambiguity_prose, comprehensibility, feasibility): + +- `"pass"` — all binary axes score 1 (or `verifiability` scores 0.5 via the placeholder pathway), all subjective axes score ≥ 2 +- `"needs_work"` — no binary axis fails, but ≥ 1 subjective axis scores < 2 +- `"fail"` — ≥ 1 binary axis scores 0 + +--- + +## Process + +### Step 1: Read tailoring + +Read `.pharaoh/project/checklists/requirement.md`, `.pharaoh/project/artefact-catalog.yaml`, and +`.pharaoh/project/id-conventions.yaml`. Extract: + +- Axis definitions (confirm the 11 axes match the expected set) +- `required_fields` for the target artefact type (used in schema axis) +- `id_regex` (used to verify the need-id format if target is an RST block) + +If any tailoring file is missing, proceed with built-in defaults (bundled example +profile — generic `req` required fields: `[id, status, satisfies]`). Note the +fallback in the output. + +### Step 2: Resolve target + +**If target is a need-id:** + +1. Find needs.json (check `docs/_build/needs/needs.json`, then `_build/needs/needs.json`). +2. Look up the need-id in the needs map. +3. If not found: FAIL immediately (see Guardrails G1). +4. Extract the full directive content: title, all option fields, body text. + +**If target is an RST directive block:** + +1. 
Parse the block inline — extract id from `:id:` option, all other options, and body text. +2. Determine artefact type from the directive name (e.g. `.. gd_req::` → type `gd_req`). +3. If needs.json is available, check the id does not already exist (new draft). If it does exist, + warn but continue — the user may be re-reviewing an existing need. +4. For the verifiability axis, attempt to resolve the `:verification:` link in needs.json. + If needs.json is not available, record verifiability as `{"score": 0, "reason": "needs.json not available — cannot resolve :verification: link"}`. + +### Step 3: Evaluate binary axes + +Evaluate each binary axis using the rule from the score-scale table above. Apply the rule textually +to the extracted directive content. Record `score` (0 or 1) and a one-sentence `reason`. + +**Atomicity:** +Count occurrences of `shall` in the body text. Check for `, and `, `, or `, ` and `, ` or ` within +the shall clause (from `shall` to end of sentence). Score 1 if exactly one `shall` and no +conjunction; score 0 otherwise. + +**Internal consistency:** +Read the body text for contradictory statements (e.g. simultaneous "always" and "unless not +required"). If none detectable, score 1. Score 0 if a self-contradiction is identifiable. + +**Verifiability:** +Check whether `:verification:` option is present and non-empty. If present, look up the value as a +need-id in needs.json. Score 1 if present and resolves; score 0 otherwise. + +**Schema:** +Check that every field in `required_fields` from artefact-catalog.yaml is present and non-empty in +the directive. For the built-in default profile: `id`, `status`, `satisfies` must all be present. +Score 1 if all present; score 0 with reason listing the missing field(s). + +### Step 4: Evaluate subjective axes + +Evaluate using the 0–3 ordinal scale. + +**Unambiguity (prose):** +Read the body text. Assess whether a reader could derive more than one meaning from the shall +clause. A single vague term (e.g. 
"sufficient") is score 1; an unambiguous measurable criterion is +score 3. + +**Comprehensibility:** +Assess whether a reader at the adjacent abstraction level (a software architect reading a +component-level req, or a test engineer reading a system-level req) could understand the +requirement without reading any other document. Check for undefined acronyms, missing subject, +and missing context. + +**Feasibility:** +Assess whether the requirement as stated could be implemented within typical automotive item +development constraints. Flag physically impossible claims (score 0), heavily under-constrained +targets (score 1), normal engineering effort (score 2), or tightly and clearly bounded +requirements (score 3). + +### Step 5: Record deferred and null axes + +Set `completeness`, `external_consistency`, `no_duplication` to `{"score": "deferred", ...}` and +`maintainability` to `{"score": null, ...}` per the policy above. Do not attempt to evaluate them. + +### Step 6: Compute overall and action items + +Compute `overall` from the non-deferred, non-null axes per the rule in the Outputs section. + +For each binary axis scoring 0, and each subjective axis scoring 0 or 1, add a concrete action +item to `action_items`. Each action item must name the axis and state what must change. + +If all 7 evaluated axes pass/score ≥ 2, `action_items` is an empty array. + +### Step 7: Emit JSON + +Emit only the JSON document. Do not prepend or append prose. + +--- + +## Guardrails + +**G1 — Unresolved target** + +If target is a need-id and it does not appear in needs.json: + +``` +FAIL: need-id "<id>" not found in needs.json. +Verify the ID is correct or build the Sphinx project first (`sphinx-build docs/ docs/_build/`). +``` + +Do not emit partial JSON. Return only the FAIL message. + +**G2 — Malformed JSON output** + +If the emitted JSON is syntactically invalid or missing any of the 11 axis keys, self-correct once: +re-emit the full JSON document. 
If still malformed after one self-correction attempt, emit: + +```json +{ + "need_id": "<id>", + "diagnostic": "JSON self-correction failed. Raw findings follow.", + "raw": "<free-text findings>" +} +``` + +**G3 — Insufficient context for subjective axis** + +If you cannot meaningfully assess a subjective axis (e.g. body is empty or only a title stub), +record `{"score": 0, "reason": "insufficient context — body is empty or too short to assess"}`. +Do not skip the axis. Continue evaluating the remaining axes. + +--- + +## Advisory chain + +If `overall` is `"needs_work"` or `"fail"`, append — after the JSON — a single line: + +``` +Consider `pharaoh-req-regenerate <need_id>` after addressing action items. +``` + +This is the only prose permitted after the JSON. + +--- + +## Worked example + +**Target (RST block from pharaoh-req-draft):** + +```rst +.. gd_req:: ABS pump activation on wheel slip threshold + :id: gd_req__abs_pump_activation + :status: draft + :satisfies: gd_req__brake_system_safety + :verification: tc__abs_pump_001 + + The brake controller shall engage the ABS pump when measured wheel slip exceeds + the calibrated activation threshold. +``` + +**Step 2:** RST block parsed. needs.json found; `tc__abs_pump_001` resolves. Parent +`gd_req__brake_system_safety` noted (not evaluated here — verifiability checks `:verification:`). 
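The link resolution performed in this step can be approximated as follows. The sketch assumes `needs` is the id-to-need mapping already extracted from needs.json (the file's exact nesting is not shown in this specification), and `verifiability_score` is a hypothetical helper; the placeholder pattern and the 0.5 score are the defaults stated in the score-scale section.

```python
import re

# Default placeholder pattern from the verifiability placeholder pathway.
PLACEHOLDER = re.compile(r"^(tc|test_case)__TBD$")


def verifiability_score(options: dict, needs: dict) -> float:
    """Score the verifiability axis for one requirement (hypothetical helper)."""
    link = (options.get("verification") or "").strip()
    if not link:
        return 0  # absent or empty
    if link in needs:
        return 1  # resolves to a real need-id
    if options.get("status") == "draft" and PLACEHOLDER.match(link):
        return 0.5  # placeholder pathway, draft only
    return 0
```

For the requirement in this example, `tc__abs_pump_001` is present in the mapping, so the first successful branch applies and the placeholder pathway is never consulted.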
+ +**Step 3 — binary axes:** +- atomicity: `shall` count = 1; no conjunction in shall clause → score 1 +- internal_consistency: no self-contradiction → score 1 +- verifiability: `:verification: tc__abs_pump_001` present and resolves → score 1 +- schema: `id`, `status`, `satisfies` all present → score 1 + +**Step 4 — subjective axes:** +- unambiguity_prose: "calibrated activation threshold" is a defined term; single interpretation → score 3 +- comprehensibility: subject (brake controller), action (engage ABS pump), condition (wheel slip + exceeds threshold) all stated; adjacent-level reader can follow → score 3 +- feasibility: standard automotive function; well-constrained → score 3 + +**Step 6:** all 7 evaluated axes pass or score ≥ 2 → `overall = "pass"`, `action_items = []`. + +**Step 7 output:** + +```json +{ + "need_id": "gd_req__abs_pump_activation", + "axes": { + "atomicity": {"score": 1, "reason": "exactly one shall; no coordinating conjunction in shall clause"}, + "internal_consistency": {"score": 1, "reason": "no self-contradictory statement detected"}, + "verifiability": {"score": 1, "reason": ":verification: tc__abs_pump_001 resolves in needs.json"}, + "schema": {"score": 1, "reason": "id, status, satisfies all present"}, + "completeness": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "maintainability": {"score": null, "reason": "chain-level axis — assess after pharaoh-req-regenerate runs"}, + "unambiguity_prose": {"score": 3, "reason": "calibrated activation threshold is a defined term; single interpretation"}, + "comprehensibility": {"score": 3, "reason": "subject, action, and condition all explicit; no undefined acronyms"}, + "feasibility": {"score": 3, "reason": "standard automotive ABS 
function; well-constrained threshold trigger"} + }, + "action_items": [], + "overall": "pass" +} +``` + +--- + +**Variant: two axes fail** + +Same requirement but `:verification:` is absent and body reads "The brake controller shall detect +wheel slip and engage the ABS pump when the threshold is exceeded." + +Binary axis failures: +- verifiability: `:verification:` absent → score 0 +- atomicity: "detect … and engage" — conjunction joins two actions within the shall clause → score 0 + +```json +{ + "need_id": "gd_req__abs_pump_activation", + "axes": { + "atomicity": {"score": 0, "reason": "conjunction 'and' joins two actions (detect, engage) within the shall clause"}, + "internal_consistency": {"score": 1, "reason": "no self-contradictory statement detected"}, + "verifiability": {"score": 0, "reason": ":verification: field absent"}, + "schema": {"score": 1, "reason": "id, status, satisfies all present"}, + "completeness": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-req-set-review"}, + "maintainability": {"score": null, "reason": "chain-level axis — assess after pharaoh-req-regenerate runs"}, + "unambiguity_prose": {"score": 2, "reason": "single interpretation but two-action body creates scope ambiguity"}, + "comprehensibility": {"score": 3, "reason": "subject, action, and condition clear despite atomicity issue"}, + "feasibility": {"score": 3, "reason": "standard automotive function; well-constrained"} + }, + "action_items": [ + "atomicity: split into two requirements — one for slip detection, one for ABS pump engagement", + "verifiability: add :verification: field linking to a test case; use tc__TBD as placeholder before a real test case exists" + ], + "overall": "fail" +} +``` + +Consider `pharaoh-req-regenerate 
gd_req__abs_pump_activation` after addressing action items. diff --git a/.github/agents/pharaoh.review-completeness.agent.md b/.github/agents/pharaoh.review-completeness.agent.md index df8be0e..af0f8bf 100644 --- a/.github/agents/pharaoh.review-completeness.agent.md +++ b/.github/agents/pharaoh.review-completeness.agent.md @@ -7,4 +7,75 @@ handoffs: [] Use when inspecting one or more needs for review / approval-chain completeness. Flags needs missing required :reviewer: or :approved_by: fields per the project's artefact catalog. Emits one finding per incomplete need via pharaoh-finding-record. -See [`skills/pharaoh-review-completeness/SKILL.md`](../../skills/pharaoh-review-completeness/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-review-completeness + +## When to use + +Invoke during audit workflows to check that every need with a required review/approval chain has the corresponding fields populated. Works at single-need granularity (inspect one ID) or over the whole graph (iterate all needs of relevant types). + +Do NOT invoke for artefact types whose project tailoring does NOT list `:reviewer:` / `:approved_by:` as required — the skill is a no-op in that case. + +## Atomicity + +- (a) Indivisible — one lookup + one field-presence check per need. +- (b) Input: `{project_dir, need_ids?: list[str]}` (if `need_ids` omitted, iterate all). Output: `[{need_id, missing_roles: [str]}]` (empty list if all complete). +- (c) Reward: deterministic — field-present vs field-absent. 100% target on fixture. +- (d) Reusable: audit orchestrators, standalone CI gate, pre-merge check. +- (e) Composable: read-only over needs.json + artefact-catalog.yaml. + +## Process + +### Step 1: Load artefact catalog + +Read `<project_dir>/.pharaoh/project/artefact-catalog.yaml`. 
For each artefact type, extract `required_roles` — which may include `reviewer`, `approved_by`, or be absent (no review required). + +### Step 2: Load needs + +Read `<project_dir>/needs.json` (or the pre-built bazel artefact). For each need matching `need_ids` (or all needs if none given), determine its artefact type from its ID prefix. + +### Step 3: Check required roles + +For each need whose type has `required_roles`, verify each required role field is present in the need's options and non-empty. Collect missing roles. + +### Step 4: Emit findings + +For each need with at least one missing role, emit a finding record: + +```json +{ + "need_id": "<id>", + "missing_roles": ["reviewer", "approved_by"] +} +``` + +Output is a JSON list of such objects, wrapped in a single fenced `json` block. Empty list → `[]`. + +No surrounding prose. + +## Input / output example + +Input call (as audit subagent): +``` +pharaoh-review-completeness on project <Score dir> for all component-level reqs +``` + +Output: +```json +[ + {"need_id": "gd_req__timestamp_recording", "missing_roles": ["approved_by"]} +] +``` + +## Failure modes + +- `artefact-catalog.yaml` absent → emit `[]` with stderr warning; skill is a no-op without tailoring. +- `needs.json` absent → emit `[]` with stderr warning. +- Malformed artefact catalog → emit `[]` with stderr warning `"artefact-catalog malformed"`. + +## Composition + +Consumed by `pharaoh-audit-fanout` as sub-area 4. Each returned finding is passed to `pharaoh-finding-record` with `category: missing_reviewer` or `category: missing_approval`. diff --git a/.github/agents/pharaoh.self-review-coverage-check.agent.md b/.github/agents/pharaoh.self-review-coverage-check.agent.md index e93ca48..6073482 100644 --- a/.github/agents/pharaoh.self-review-coverage-check.agent.md +++ b/.github/agents/pharaoh.self-review-coverage-check.agent.md @@ -7,4 +7,78 @@ handoffs: [] Use when verifying that every artefact emitted during a plan run received a matching review. 
For every drafted artefact in `runs/`, confirms a matching `<id>_review.json` exists and is non-empty. Closes the "draft emitted but review was skipped" failure class. -See [`skills/pharaoh-self-review-coverage-check/SKILL.md`](../../skills/pharaoh-self-review-coverage-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-self-review-coverage-check + +## When to use + +Invoke from `pharaoh-quality-gate.required_checks` on any plan that emitted drafts (reqs, feats, archs, vplans, fmeas, decisions, diagrams). Uses the draft↔review mapping in `shared/self-review-map.yaml` to determine which review skill was supposed to be invoked. + +Do NOT use to re-invoke missing reviews — this skill only observes. Remediation is up to the plan's `on_fail` policy. + +## Atomicity + +- (a) Indivisible: one runs directory + one self-review map in → pass/fail + uncovered list out. +- (b) Input: `{runs_path: str, self_review_map_path: str}`. Output: JSON `{passed: bool, uncovered: list[{artefact_id, draft_skill, expected_review_skill, reason}]}`. +- (c) Reward: fixtures in `pharaoh/skills/pharaoh-self-review-coverage-check/fixtures/`: + 1. `fully-covered/`: 2 `*_draft` return.json files + 2 matching `*_review.json` → `expected-fully-covered-pass.json` (`passed: true, uncovered: []`). + 2. `missing-review/`: 2 draft files + only 1 review file → `expected-missing-review-fail.json` (`passed: false`, `uncovered` names the missing pair). + 3. Empty review file (`{}`) counts as missing → failure with `reason: "review JSON is empty"`. + 4. Idempotent. + 5. `scalar-mapped/`: emission skill maps to a single scalar review skill; that review is invoked in the emission's `## Last step`. Expected: pass (backward-compatibility check — scalar mappings still work). + 6. `list-mapped-complete/`: emission skill maps to a list `[A, B]`; both A and B are invoked in the emission's `## Last step`.
Expected: pass. + 7. `list-mapped-partial/`: emission skill maps to a list `[A, B]`; only A is invoked. Expected: fail with `uncovered` naming B as the missing `expected_review_skill`. + + Pass = all 7. +- (d) Reusable by any plan. +- (e) Read-only. + +## Input + +- `runs_path`: absolute path to runs directory. Must contain `*_draft.json` files (draft outputs) and `*_review.json` files (review outputs). Files may be under per-task subdirectories. +- `self_review_map_path`: absolute path to `shared/self-review-map.yaml`. Maps each draft skill to its review skill. + +## Output + +```json +{ + "passed": false, + "uncovered": [ + { + "artefact_id": "REQ_example_02", + "draft_skill": "pharaoh-req-draft", + "expected_review_skill": "pharaoh-req-review", + "reason": "no matching *_review.json found" + } + ] +} +``` + +## Detection rule + +For every `<run_dir>/**/<id>_draft.json` OR every entry in `<run_dir>/**/return.json` with `emitted: [...]`: + +1. Identify the emission skill for the artefact (from the `draft_skill` field in the run record or the emission task name in the plan). +2. Look up the emission skill in `self_review_map.map`. The mapped value is either: + - a **scalar** (string): the name of the single review skill expected to be invoked from the emission skill's `## Last step`. + - a **list** of strings: every review skill in the list is expected to be invoked. + + Branch on type: `isinstance(value, list)` vs scalar. Lists iterate; scalars are treated as a single-element check. (Dict values are out of scope; treat as a schema error.) + +3. For each expected review skill name: + - **Only source**: a matching `<id>_<review_skill_short>.json` under `<run_dir>/**`, produced by an explicit plan task (e.g. `review_comp_reqs`, `grounding_check_comp_reqs`) that ran the review skill. 
Expected filename shapes: `<id>_review.json` for `pharaoh-req-review`, `<id>_code_grounding.json` for `pharaoh-req-code-grounding-check`, `<id>_diagram_review.json` for `pharaoh-diagram-review`, etc. + - Load the file. If missing, empty object `{}`, or unparseable → record as uncovered with the specific `expected_review_skill` name. + - **Never accept "inlined" / "covered in emission skill's Last step" / "semantically satisfied" as completion evidence.** The only evidence is a non-empty JSON file on disk at the expected path. The emission skill's `## Last step` clause is explicitly deferred under plan execution (see `pharaoh-req-from-code` SKILL.md); coverage is determined exclusively by the presence of explicit plan-task output files. An uncovered finding indicates the plan did not schedule the review task, the executor skipped it, or the executor claimed "completed" without producing output (which `pharaoh-execute-plan` Step 4.10 should have already caught as a `reporting_error`, but this check provides a second independent signal). + +4. The emission is covered only when **every** expected review skill (all list members, or the single scalar) is invoked AND its produced JSON is non-empty. Missing one out of N list members fails, with the missing entry named in `uncovered[].expected_review_skill`. + +Backward compatibility: existing scalar mappings (e.g. `pharaoh-req-draft: pharaoh-req-review`) continue to pass under the same check — the scalar is treated as a one-element list. + +Use `self_review_map` to label `expected_review_skill` in the uncovered entries. When multiple review skills are expected, emit one `uncovered` entry per missing review skill so the caller sees every missing pair separately. + +## Composition + +Called by `pharaoh-quality-gate` when `required_checks` contains `self_review_coverage: true`. 
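
The detection rule can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the skill's implementation: the `REVIEW_SUFFIX` table, the function names, and the `(artefact_id, draft_skill)` input shape are hypothetical and exist only for this example. What it takes from the specification is the scalar-or-list branching and the rule that the only accepted evidence is a non-empty JSON file on disk.

```python
import json
from pathlib import Path

# Hypothetical lookup from review skill name to the filename suffix it writes.
# The three entries mirror the filename shapes listed in the detection rule.
REVIEW_SUFFIX = {
    "pharaoh-req-review": "review",
    "pharaoh-req-code-grounding-check": "code_grounding",
    "pharaoh-diagram-review": "diagram_review",
}


def expected_reviews(mapped):
    """Normalise a self-review-map value: a scalar becomes a one-element list."""
    if isinstance(mapped, list):
        return list(mapped)
    if isinstance(mapped, str):
        return [mapped]
    # Dict values are out of scope per the rule; surface them as a schema error.
    raise TypeError(f"unsupported self-review-map value: {mapped!r}")


def review_problem(run_dir, artefact_id, review_skill):
    """Return None when a non-empty review JSON exists, else a reason string."""
    suffix = REVIEW_SUFFIX[review_skill]
    candidates = list(Path(run_dir).rglob(f"{artefact_id}_{suffix}.json"))
    if not candidates:
        return "no matching review JSON found"
    for path in candidates:
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or unparseable counts as missing
        if data:  # an empty object {} is falsy and therefore not evidence
            return None
    return "review JSON is empty or unparseable"


def check_coverage(run_dir, emitted, review_map):
    """emitted: iterable of (artefact_id, draft_skill) pairs (assumed shape)."""
    uncovered = []
    for artefact_id, draft_skill in emitted:
        for review_skill in expected_reviews(review_map[draft_skill]):
            reason = review_problem(run_dir, artefact_id, review_skill)
            if reason is not None:
                # One entry per missing review skill, as the rule requires.
                uncovered.append({
                    "artefact_id": artefact_id,
                    "draft_skill": draft_skill,
                    "expected_review_skill": review_skill,
                    "reason": reason,
                })
    return {"passed": not uncovered, "uncovered": uncovered}
```

Treating a scalar as a one-element list keeps a single code path for both mapping shapes, which is the backward-compatibility behaviour the rule describes.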
diff --git a/.github/agents/pharaoh.sequence-diagram-draft.agent.md b/.github/agents/pharaoh.sequence-diagram-draft.agent.md index 666e39c..b9bb864 100644 --- a/.github/agents/pharaoh.sequence-diagram-draft.agent.md +++ b/.github/agents/pharaoh.sequence-diagram-draft.agent.md @@ -7,4 +7,101 @@ handoffs: [] Use when drafting one sequence diagram showing ordered interactions between participants (components, actors, external systems) over time. Renderer tailored via `pharaoh.toml`. Does NOT emit component, class, or state diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). -See [`skills/pharaoh-sequence-diagram-draft/SKILL.md`](../../skills/pharaoh-sequence-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-sequence-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-sequence-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.sequence]` for per-type overrides. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / edge label / message text MUST be sanitised per that rule set before the block leaves this skill. **Extra sharp for sequence diagrams:** Mermaid 11 treats `;` inside a message label as a statement terminator — prior dogfooding shipped a `J->>J: filter by type; skip SET/Folder` that parsed cleanly under `sphinx-build -nW` but rendered as `Syntax error` in the browser. Always replace `;` with `,` in message labels. + +## Purpose + +One invocation → one sequence diagram. 
Captures **ordered interactions over time** between a bounded set of participants. Typical inputs: a feature's "happy path" flow, an interface's request/response trace, an incident timeline reconstructed from logs. + +Does NOT capture static containment (→ `pharaoh-component-diagram-draft`). Does NOT capture type relationships (→ `pharaoh-class-diagram-draft`). Does NOT capture state transitions (→ `pharaoh-state-diagram-draft`). + +## Atomicity + +- (a) One interaction in → one diagram out. No multi-scenario bundling (alt paths in one diagram are OK — they are part of one scenario; but two independent scenarios are two skill invocations). +- (b) Input: `{view_title: str, participants: list[ParticipantSpec], messages: list[MessageSpec], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `ParticipantSpec = {id: str, label: str, kind?: "actor"|"component"|"boundary"|"database"|"external"}` and `MessageSpec = {from: str, to: str, label: str, kind?: "sync"|"async"|"return"|"self", fragment?: FragmentSpec}`. Output: one RST directive block. +- (c) Reward: fixture with 3 participants (User, API, DB) and 4 messages (User→API: request, API→DB: query, DB→API: result, API→User: response). Scorer: + 1. Output starts with renderer-specific directive. + 2. Every participant id in `participants` is declared in the diagram body. + 3. Every message appears in order (renderer syntax: `User->>API: request` for Mermaid). + 4. Message count in output = `len(messages)`. + 5. Sync vs async arrow differs syntactically (`->>` vs `-)` in Mermaid; `->` vs `->>` in PlantUML). + 6. Self-message (kind=`self`) renders as a self-loop on the participant. + + Pass = all 6. +- (d) Reusable: any interaction diagram. Especially valuable for interface/API specs. +- (e) One diagram kind per skill. + +## Input + +- `view_title`: diagram caption. +- `participants`: ordered list; order = left-to-right placement in the diagram. 
+- `messages`: ordered list; order = top-to-bottom time axis. Each message references participants by id. +- `project_root`: for tailoring lookup. +- `renderer_override`, `on_missing_config`, `papyrus_workspace`, `reporter_id`: as in shared doc. + +### FragmentSpec (optional per message) + +```json +{"type": "alt"|"opt"|"loop"|"par"|"critical"|"break", "condition": "<string>"} +``` + +Groups consecutive messages under a fragment (e.g. `alt`: alternative paths; `loop`: repeated block). If `messages[i].fragment` is non-null, it opens a fragment that stays open until a later message with `fragment = null` or a different fragment type. + +This is the one piece of sequence-diagram structure that has no analogue in component diagrams — hence sequence gets its own skill. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + sequenceDiagram + participant User + participant API + participant DB + User->>API: request + API->>DB: query + DB-->>API: result + API-->>User: response +``` + +**PlantUML:** +```rst +.. uml:: + :caption: <view_title> + + @startuml + actor User + participant API + database DB + User -> API : request + API -> DB : query + DB --> API : result + API --> User : response + @enduml +``` + +## Process (sketch) + +1. Resolve tailoring per shared doc. +2. Emit participant declarations in order (Mermaid: `participant X`; PlantUML: `actor X`/`participant X`/`database X` keyed on `ParticipantSpec.kind`). +3. Emit messages in order. Map `kind` → renderer syntax (sync/async/return/self). +4. Handle fragments: open `alt`/`opt`/`loop` as messages are emitted; close at end of fragment. +5. Wrap in RST directive. + +## Non-goals + +- No auto-extraction of sequences from code/logs — the caller provides `participants` and `messages` explicitly. A separate future skill (`pharaoh-sequence-from-trace`) could infer these from runtime logs, but that is a different concern. 
+- No return-arrow inference — if the caller wants a return, they include it as a message with `kind="return"`. +- No activation-bar auto-insertion (PlantUML activates/deactivates) — caller can add via `fragment` or future extension. diff --git a/.github/agents/pharaoh.setup.agent.md b/.github/agents/pharaoh.setup.agent.md index c0b857c..750aca2 100644 --- a/.github/agents/pharaoh.setup.agent.md +++ b/.github/agents/pharaoh.setup.agent.md @@ -131,3 +131,930 @@ Recommend running `@pharaoh.mece` next. 2. `pharaoh.toml` controls only Pharaoh's behavior. Never re-define need types or link types from `ubproject.toml`. 3. Degrade gracefully when tools are missing. 4. This agent has no workflow gates and runs freely in any mode. + +--- + +## Full atomic specification + +# pharaoh-setup + +Scaffold Pharaoh into a sphinx-needs project. This skill detects the project structure, generates a `pharaoh.toml` configuration file, optionally installs GitHub Copilot agents, and recommends tooling for the best experience. + +## When to Use + +- First-time setup of Pharaoh in a sphinx-needs project. +- Adding GitHub Copilot agent support to an existing Pharaoh project. +- Reconfiguring project detection after structural changes (new need types, link types, or project layout changes). +- Migrating from `conf.py`-only configuration to `ubproject.toml`. + +## Prerequisites + +- The workspace must contain at least one sphinx-needs project (a directory with `ubproject.toml` or a `conf.py` that loads `sphinx_needs`). +- No other Pharaoh skills are required before running this one. `pharaoh:setup` has no workflow gates and runs freely in both advisory and enforcing modes. + +--- + +## Process + +Execute the following steps in order. Present results to the user at each major step and ask for confirmation before writing any files. 
+ +--- + +### Step 1: Detect Project Structure + +Follow the full detection algorithm defined in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md). The subsections below summarize what to detect and how to present it. + +#### 1a. Find Sphinx project roots + +Search for `ubproject.toml` files in the workspace root and up to two levels of subdirectories using Glob with pattern `**/ubproject.toml`. Each location is a candidate root. + +For each candidate root, verify sphinx-needs is actually configured by checking either (a) a `[needs]` section or `[[needs.types]]` tables in `ubproject.toml`, or (b) `sphinx_needs` in the `extensions` list of a co-located `conf.py`. Candidates that fail this check are classified as **plain-Sphinx candidates** (no sphinx-needs), not sphinx-needs project roots. + +If no `ubproject.toml` match is a true sphinx-needs root, search for `conf.py` files containing sphinx-needs configuration using Grep with pattern `sphinx_needs|needs_types|needs_from_toml` in `**/conf.py`. Each matching `conf.py` location is a sphinx-needs project root. + +If no sphinx-needs roots are found at all, do a final pass: Glob `**/conf.py` and record every match as a **plain-Sphinx candidate** (these exist but do not load sphinx-needs). + +Record every sphinx-needs root path and every plain-Sphinx candidate separately. + +#### 1b. Read need types + +For each project root, read the configured need types. + +From `ubproject.toml`, read the `[needs]` section and extract the `types` array. Each entry has `directive`, `title`, `prefix`, `color`, and `style`. Build a list of directive names (e.g., `req`, `spec`, `impl`, `test`). + +From `conf.py` (fallback), read `needs_types` or follow `needs_from_toml` to the referenced TOML file. + +Record the list of need types per project root. + +#### 1c. Read extra link types + +From `ubproject.toml`, read `[needs.extra_links]`. 
Each key is a link option name with `incoming` and `outgoing` descriptions. Example: `implements = {incoming = "is implemented by", outgoing = "implements"}`. + +From `conf.py` (fallback), read `needs_extra_links`. + +Record the list of extra link types per project root. + +#### 1d. Read ID settings + +From `ubproject.toml`, read `id_required` and `id_length` from the `[needs]` section. + +From `conf.py` (fallback), read `needs_id_required` and `needs_id_length`. + +Record the ID settings per project root. + +#### 1e. Detect sphinx-codelinks + +Follow Step 4 of [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md): + +- Check `ubproject.toml` for `sphinx_codelinks` or `sphinx-codelinks` in extensions or configuration sections. +- Check `conf.py` for `"sphinx_codelinks"` in the `extensions` list or `codelinks_*` configuration variables. + +Record whether codelinks are configured per project root. + +#### 1f. Check ubc CLI availability + +Run `ubc --version` in a shell. If the command succeeds (exit code 0), record the version string and mark ubc CLI as available. If it fails, mark ubc CLI as unavailable. + +#### 1g. Check ubCode MCP availability + +Check the available tool list for MCP tools with names containing `ubcode` or `useblocks`. If found, record ubCode MCP as available. If not found, record it as unavailable. + +#### 1h. Identify documentation source tree + +For each project root, locate the documentation source files: + +1. Check for a `docs/` subdirectory containing `.rst` or `.md` files. +2. Check for a `source/` subdirectory. +3. Check `conf.py` for `master_doc` or source directory configuration. +4. If none found, assume RST/MD files are in the project root itself. + +Record the source directory per project root. + +#### 1i. Present detection summary + +Present a summary of everything detected. 
Format it as follows: + +``` +Pharaoh Project Detection +========================= + +Project roots found: <count> + +Project: <project name from ubproject.toml [project] name, or directory name> + Root: <path> + Source dir: <path> + Config: ubproject.toml | conf.py + Types: <comma-separated directive names> + Extra links: <comma-separated link option names, or "none"> + ID required: <yes/no> + ID length: <number or "not set"> + Codelinks: <detected/not detected> + +<repeat for each project root> + +Data access: + ubc CLI: <available (version) | not available> + ubCode MCP: <available | not available> + Fallback: raw file parsing (always available) +``` + +If no sphinx-needs project roots were found, branch on whether plain-Sphinx candidates exist: + +**Case A — No Sphinx project at all (no `conf.py` anywhere):** + +``` +No Sphinx project detected in this workspace. + +Run `sphinx-quickstart` to create a Sphinx project, or provide the path +to an existing one. +``` + +**Case B — Plain-Sphinx candidates exist but none loads sphinx-needs:** + +``` +Sphinx project(s) detected at: + - <path> + ... + +sphinx-needs is not configured in any of them. + +Pharaoh requires sphinx-needs to be loaded as an extension and at least +one need type to be declared. + +Run `pharaoh-bootstrap` first to inject the minimum sphinx-needs +configuration into the chosen project, then re-run this skill to author +pharaoh.toml. +``` + +In either case, ask the user how to proceed before writing any files. + +--- + +### Step 2: Generate pharaoh.toml + +#### 2a. Ask about strictness preference + +Ask the user which strictness mode they prefer: + +``` +Strictness mode controls whether Pharaoh enforces workflow order. + + advisory (default) - Pharaoh suggests the recommended workflow + but never blocks you from proceeding. + enforcing - Pharaoh checks prerequisites before each + skill and blocks if they are not met. 
+ (e.g., pharaoh:change must run before + any authoring skill) + +Which mode would you like? [advisory/enforcing] +``` + +If the user does not specify, default to `"advisory"`. + +#### 2a.bis. Detect and confirm project mode + +Pharaoh's workflow gates (`require_change_analysis`, `require_verification`, `require_mece_on_release`) have different natural defaults depending on where the project sits in its lifecycle. Hardcoding the example's values is what produced the pilot feedback: a reverse-engineering project had `require_change_analysis = true` on day one, alarming every newly-drafted need because there was no Pharaoh change issue yet. + +Classify the project into one of three modes by inspecting **declared types in `ubproject.toml`** and **existing RST content under the source tree** — not by `needs.json` existence. `needs.json` is a gitignored build artefact; using it as a signal misclassifies every fresh clone as `reverse-eng` until `sphinx-build` runs. + +Apply rules in order; the first matching branch wins: + +| Signal | Inferred mode | +| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------- | +| `[[needs.types]]` declared **and** the source tree has at least one `.. <directive>::` block in any `.rst` file under the source dir **and** ≥10% of those needs carry a status from a "matured" set (`approved`, `closed`, `reviewed`, `passed`). | `steady-state` | +| `[[needs.types]]` declared **and** the source tree has at least one `.. <directive>::` block in any `.rst` file. (Active drafting; not enough matured-status needs to qualify as steady-state yet.) | `reverse-eng` | +| `[[needs.types]]` declared **but** no `.. <directive>::` blocks found in any `.rst` file. (Types declared, no needs authored.) | `greenfield` | +| `[[needs.types]]` **not** declared. 
Step 1 already routed this case to `pharaoh-bootstrap`; should not reach here. If it does, FAIL. | (n/a) | + +The classifier reads RST files directly; it does NOT depend on `needs.json` and does NOT depend on prose-feature heuristics. The heuristic "`docs/` has prose files with imperative verbs" was previously used to disambiguate `reverse-eng` vs `greenfield` — it is replaced by the cleaner test "are there sphinx-needs directives in the RST tree". + +Present the detected mode and ask the user to confirm or override: + +``` +Detected project mode: <reverse-eng | greenfield | steady-state> + + reverse-eng - Codebase exists and has feature-level documentation, but + sphinx-needs artefacts are being created now. Workflow + gates start permissive; tighten them once the catalogue + stabilises. + greenfield - Minimal scaffolding. Verification matters from day one + (every new need should have a verification path), but + change-analysis and MECE gates are noise until the + catalogue grows. + steady-state - Mature catalogue (≥10% of needs in a matured status). + Full gating: change analysis before edits, verification + required, MECE at release. + +Confirm detected mode, or choose a different one +[reverse-eng/greenfield/steady-state]? +``` + +Record the chosen mode. Per-mode `[pharaoh.workflow]` defaults (applied in Step 2b): + +| Mode | `require_change_analysis` | `require_verification` | `require_mece_on_release` | +| -------------- | ------------------------- | ---------------------- | ------------------------- | +| `reverse-eng` | `false` | `true` | `false` | +| `greenfield` | `false` | `true` | `false` | +| `steady-state` | `true` | `true` | `true` | + +`require_verification = true` is uniform across all three modes — step 1 of the gate-enablement ladder (see [`skills/shared/gate-enablement.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/gate-enablement.md)) is safe to enable out of the box because the review skills are ship-ready and read-only.
A project that runs `pharaoh-setup` → `pharaoh-gate-advisor` immediately lands on step 2 as its next recommendation, not step 1. Mode still differentiates `require_change_analysis` and `require_mece_on_release` because those gates have pre-work that is not safe to assume on every project. + +A caller running this skill non-interactively MAY pass `mode` as an explicit override input. When present, Step 2a.bis uses that value and skips the confirmation prompt. + +#### 2b. Build pharaoh.toml content + +Generate the `pharaoh.toml` content using the detected project data. Use `pharaoh.toml.example` as the structural template, but populate values from detection results. + +**`[pharaoh]` section:** +- Set `strictness` to the user's choice from Step 2a. + +**`[pharaoh.id_scheme]` section:** + +Detect the ID pattern descriptively from existing IDs in the RST tree. Do NOT default to `{TYPE}_{NUMBER}` without first checking whether observed IDs conform to it. + +1. Sample existing need IDs: glob `<source-dir>/**/*.rst`, extract the `:id:` value from every sphinx-needs directive (`.. <directive>:: <ID>` or `:id: <ID>`). Take up to 20 samples (or all if fewer exist). +2. Classify the dominant shape: + + | Observed shape | Pattern token | Example | + | ----------------------------------------------------------- | ------------------------- | ---------------------- | + | `<TYPE-PREFIX-FROM-needs.types>_<digits>` | `{TYPE}_{NUMBER}` | `REQ_001`, `FEAT_42` | + | `<TYPE-PREFIX>-<UPPER-MODULE>-<digits>` | `{TYPE}-{MODULE}-{NUMBER}`| `REQ-BRAKE-001` | + | `<UPPER-DOMAIN>_<digits>` where the leading token is **not** any declared type prefix in `[[needs.types]]` | `{DOMAIN}_{NUMBER}` | `BRAKE_CTRL_01`, `FSR_POWER_01` | + | Heterogeneous / no clear shape | (no pattern) | (mixed) | + + The `{DOMAIN}_{NUMBER}` test is what catches the `useblocks/sphinx-needs-demo` case: IDs lead with a domain name (e.g. `BRAKE_CTRL`) rather than the directive's declared type prefix. + +3. 
Emit the detected pattern as `pattern = "{TYPE}_{NUMBER}"` / `pattern = "{DOMAIN}_{NUMBER}"` / etc. and a comment recording the sample size: + + ```toml + [pharaoh.id_scheme] + # Inferred from 20 sampled IDs in source-dir/**/*.rst. + # Observed shape: {DOMAIN}_{NUMBER} (e.g. BRAKE_CTRL_01). + pattern = "{DOMAIN}_{NUMBER}" + auto_increment = true + ``` + +4. **No-evidence fallback.** If zero IDs are found in the RST tree (greenfield), fall back to `pattern = "{TYPE}_{NUMBER}"` with a comment marking the value as a default not derived from observation: + + ```toml + pattern = "{TYPE}_{NUMBER}" # default; no IDs observed in RST tree + ``` + +5. **Heterogeneous fallback.** If observed IDs do not fit any single shape, emit `pattern = "{ANY}"` with a TODO comment asking the user to declare the project's convention manually. Do NOT silently force `{TYPE}_{NUMBER}`. + +6. `auto_increment = true` is unchanged. + +The matching `id_regex` for `.pharaoh/project/id-conventions.yaml` (Step 5b) is derived from the same observation: for `{DOMAIN}_{NUMBER}` emit `^[A-Z][A-Z0-9_]*_[0-9]+$`; for `{TYPE}_{NUMBER}` emit the union of declared `[[needs.types]] prefix` values + digits anchor; etc. + +**`[pharaoh.workflow]` section:** +- Persist the chosen mode as a real key (`mode = "reverse-eng" | "greenfield" | "steady-state"`). The key lives in `pharaoh.toml` alongside the gate flags so a later reader (and any skill that wants to reason about lifecycle stage) can parse it directly. Do NOT persist it as a comment-only line — comments are not parseable. +- Populate the three gate flags from the mode table in Step 2a.bis based on the mode the user confirmed. Do NOT blindly copy values from `pharaoh.toml.example` — that file documents the steady-state shape, not the day-one defaults for every mode. 
+- Emit a short rationale comment above the gate flags naming the assumption that produced these values: + ```toml + [pharaoh.workflow] + mode = "reverse-eng" + # Gates tuned for reverse-eng — tighten as the catalogue stabilises. + require_change_analysis = false + require_verification = true + require_mece_on_release = false + ``` + +**`[pharaoh.traceability]` section:** + +`required_links` declares chains in the form `"source-type -> target-type"`. The semantics enforced by `pharaoh:mece` are: every need of `source-type` must have at least one outgoing link to a need of `target-type` (see `skills/pharaoh-mece/SKILL.md` Step 2). The chain direction is therefore the direction the link option resolves, **not** the direction of the conceptual type hierarchy. Both conventions exist in the wild — some projects put `:implements:` on the `impl` directive (child references parent, chain `impl -> spec`); others put `:specifies:` on the `spec` directive (parent references child, chain `spec -> impl`). Inferring direction from the link option name picks one convention and emits inverted chains for projects on the other; `pharaoh:mece` then reports 100% gaps for the source type. Resolve direction from ground truth instead. + +Apply the following sources in priority order. For each link option, stop at the first source that resolves a direction. + +**Source 1 — built `needs.json` (preferred, when available).** If the project has a built `needs.json` (typical paths: `<source-dir>/_build/needs/needs.json`, `<source-dir>/_build/html/needs.json`, or any `needs.json` under `_build/`), parse it and inspect the actual edges: + +For each declared link option `L` (including the standard `:links:`) and each ordered pair of declared types `(X, Y)`: + +1. Let `n_X` = count of needs of type `X` whose `:L:` value is non-empty. +2. Let `n_X_to_Y` = count of those needs whose `:L:` resolves to at least one need of type `Y`. +3. Emit `"X -> Y"` only if `n_X >= 3` and `n_X_to_Y / n_X >= 0.9`. 
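
The three counting steps above can be sketched in Python as follows. This is an illustrative sketch, not Pharaoh's implementation: the function name, the constants, and the input shape are assumptions. In particular, `needs` is assumed to be a flat `{id: need}` dict already extracted from `needs.json`, with each link option stored as a list of target IDs.

```python
from collections import defaultdict

MIN_INSTANCES = 3    # n_X threshold from step 3
MIN_COVERAGE = 0.9   # n_X_to_Y / n_X threshold from step 3


def infer_required_links(needs, link_options, declared_types):
    """Return (chain, option, n_X_to_Y, n_X) tuples that pass both thresholds.

    needs:          {need_id: {"type": str, "<option>": [target_id, ...]}}
    link_options:   link option names to evaluate, e.g. ["links", "implements"]
    declared_types: set of type names declared in [[needs.types]]
    """
    chains = []
    for option in link_options:
        n_x = defaultdict(int)        # needs of type X with a non-empty :L:
        n_x_to_y = defaultdict(int)   # ...of which at least one target has type Y
        for need in needs.values():
            targets = need.get(option) or []
            if not targets:
                continue
            src = need["type"]
            n_x[src] += 1
            # Count each target type once per source need.
            target_types = {needs[t]["type"] for t in targets if t in needs}
            for tgt in target_types:
                n_x_to_y[(src, tgt)] += 1
        for (src, tgt), hits in sorted(n_x_to_y.items()):
            # Only ordered pairs of declared types are considered.
            if src not in declared_types or tgt not in declared_types:
                continue
            total = n_x[src]
            if total >= MIN_INSTANCES and hits / total >= MIN_COVERAGE:
                chains.append((f"{src} -> {tgt}", option, hits, total))
    return chains
```

Returning the counts alongside each chain makes it straightforward to emit the sample-size comment next to the corresponding `required_links` entry.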
+ +The thresholds (`>= 3` instances, `>= 90%` coverage) suppress chains inferred from a single accidental edge while still emitting chains the project consistently produces. Include a comment recording the sample size: + +```toml +required_links = [ + "spec -> req", # needs.json: 18/18 spec needs link to req via :reqs: + "impl -> spec", # needs.json: 35/35 impl needs link to spec via :implements: +] +``` + +**Source 2 — declared semantics from `[needs.links.<name>]` (greenfield, no `needs.json`).** The `outgoing` and `incoming` descriptions identify the verb and which side bears the link option, but they do not, on their own, identify the type pair: any type can carry any link option. Without empirical edges or an explicit hint, the source and target types are unknown. + +Use this source only when `ubproject.toml` carries an explicit hint (e.g., a future `[needs.links.<name>] from = "<type>", to = "<type>"` extension). Do not invent the type pair from the link option name. + +**Source 3 — refuse to guess.** When neither source resolves a link option to a `(source-type, target-type)` pair, do **not** emit a chain for that link. Emit a TODO comment in its place so the user sees what was skipped and why: + +```toml +required_links = [ + "spec -> req", # needs.json: 18/18 spec needs link to req via :reqs: + # TODO: link option `implements` is declared but no `needs.json` was + # found. Build the docs once and re-run `pharaoh:setup`, or add an + # explicit chain manually in the form "source-type -> target-type". +] +``` + +The previous heuristic-name table (`implements -> "spec -> impl"`, `tests -> "impl -> test"`, etc.) is removed: it encoded one project convention as universal and produced inverted chains on every project that used the opposite convention. + +**Coverage note.** Source 1 iterates over **every** link option declared in `[needs.links.<name>]` (and the built-in `:links:`). There is no per-link-name allow-list. 
A `useblocks/sphinx-needs-demo`-style project declaring `verifies`, `satisfies`, `implements`, `triggers`, `derives_from`, etc. has every option evaluated against the same coverage threshold. This closes the PR #14 follow-up: the direction-inference rule is uniform across the full declared set, not specific to a fixed list of relations. + +**Type-pair filter (applied to every source).** Emit a chain only when both the source type and the target type are declared in `ubproject.toml` `[[needs.types]]`. When Source 1 resolves an edge whose target type is not declared, drop it with a comment naming the dropped target — the chain is dead config that would alarm on every source need from day one: + +```toml +required_links = [ + "req -> spec", + "spec -> impl", + # "impl -> test", # SKIPPED: 'test' is not declared in [[needs.types]] +] +``` + +**Empty-array fallback.** If no link option resolves to a chain by any source, emit: + +```toml +required_links = [ + # No traceability chains inferred. Add entries of the form + # "source-type -> target-type" once the link conventions stabilise, + # or build the docs and re-run `pharaoh:setup` to infer from `needs.json`. +] +``` + +**`[pharaoh.codelinks]` section:** +- Set `enabled = true` if sphinx-codelinks was detected in Step 1e. +- Set `enabled = false` if not detected. + +#### 2c. Check for existing pharaoh.toml + +Before writing, check if `pharaoh.toml` already exists in the workspace root. + +If it exists: +1. Read the existing file. +2. Show a diff between the existing content and the newly generated content. +3. Ask the user: + ``` + pharaoh.toml already exists. What would you like to do? + 1. Overwrite with the new configuration + 2. Keep the existing file + 3. Show both side by side so I can choose specific settings + ``` +4. Proceed according to the user's choice. + +If it does not exist, proceed to write. + +#### 2d. 
Present and write pharaoh.toml + +Show the user the complete `pharaoh.toml` content that will be written: + +``` +The following pharaoh.toml will be created at <workspace root>/pharaoh.toml: + +--- +<file content> +--- + +Write this file? [yes/no] +``` + +After the user confirms, write the file to the workspace root (the same directory as `ubproject.toml` or `conf.py`). If there are multiple project roots, write to the top-level workspace root. + +--- + +### Step 3: Scaffold Copilot Agents (if requested) + +#### 3a. Ask if user wants Copilot support + +``` +Would you like to set up GitHub Copilot agent support? + +This will create agent and prompt files in your .github/ directory, +enabling @pharaoh.change, @pharaoh.trace, and other agents in +VS Code Copilot Chat. + +Set up Copilot agents? [yes/no] +``` + +If the user declines, skip to Step 4. + +#### 3b. Locate Copilot templates + +The Copilot templates live in the Pharaoh plugin directory under `.github/`. Pharaoh dogfoods its own agents — the same `.github/` tree it copies out is the one it uses on itself. Locate this directory relative to the plugin installation path. + +The expected template structure is: + +``` +.github/ + agents/ + pharaoh.*.agent.md (discovered via glob, not hardcoded) + prompts/ + pharaoh.*.prompt.md (discovered via glob, not hardcoded) + copilot-instructions.md +``` + +Do NOT hardcode the agent or prompt file list in the skill — enumerate them at runtime with Glob on `.github/agents/pharaoh.*.agent.md` and `.github/prompts/pharaoh.*.prompt.md`. The set grows as new atomic skills land; a hardcoded list rots on every release. + +If the `.github/agents/` directory is not found in the plugin dir, inform the user: + +``` +Copilot templates not found in the Pharaoh plugin directory +(expected .github/agents/ and .github/prompts/). +This may indicate an incomplete installation. Skipping Copilot setup. 
+ +You can manually create Copilot agents later by running pharaoh:setup again +after reinstalling the plugin. +``` + +Then skip to Step 4. + +#### 3c. Check for existing .github/ files + +Before copying, check if any of the target files already exist in the user's project: + +- `.github/agents/` -- any `pharaoh.*.agent.md` files +- `.github/prompts/` -- any `pharaoh.*.prompt.md` files +- `.github/copilot-instructions.md` + +For each existing file: +1. Read both the existing file and the template. +2. Show the diff. +3. Ask the user whether to overwrite, skip, or merge. + +For files that do not exist, list them as new files to be created. + +#### 3d. Present file list and copy + +Enumerate the actual template files via Glob (see Step 3b) and show a summary. Example shape (exact list depends on the current plugin version): + +``` +The following files will be created in your project: + + New files (N agents, M prompts): + .github/agents/pharaoh.<name>.agent.md × N + .github/prompts/pharaoh.<name>.prompt.md × M + .github/copilot-instructions.md + +Proceed? [yes/no] +``` + +Show the full enumerated list to the user — do not print the `× N` shorthand. The shorthand above is just for this skill spec; the runtime output must list every file by name so the user can review before confirming. + +After user confirms, create the necessary directories (`.github/agents/`, `.github/prompts/`) and copy each template file to the user's project. + +--- + +### Step 4: Configure .gitignore + +#### 4a. Check for .gitignore + +Look for a `.gitignore` file in the workspace root. + +#### 4b. Add Pharaoh ephemeral paths (narrow, not wholesale) + +`.pharaoh/` contains a mix of committed tailoring and ephemeral run state. Ignoring the whole tree is wrong — it hides `.pharaoh/project/` tailoring which IS shared across the team. The skill ignores only the ephemeral subpaths: + +| Path | Purpose | Commit? 
| +| ----------------------- | -------------------------------------------------------- | ------- | +| `.pharaoh/project/` | Tailoring: workflows, id-conventions, artefact-catalog, checklists | **yes** | +| `.pharaoh/runs/` | `pharaoh-execute-plan` run artefacts (report.yaml, staged RST) | no | +| `.pharaoh/plans/` | plan.yaml files emitted by `pharaoh-write-plan` | no | +| `.pharaoh/session.json` | Session / gate state | no | +| `.pharaoh/cache/` | Derived caches | no | + +Emitted entries: + +``` +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ +``` + +If `.gitignore` exists, read its contents and branch: + +1. **Wide form already present.** If the file contains a bare `.pharaoh/` or `.pharaoh` line (no trailing path segment), emit a warning and leave it alone — do not auto-migrate, respect user control: + > `.pharaoh/ is ignored as a whole — this hides .pharaoh/project/ tailoring which should be committed. Consider narrowing to: .pharaoh/runs/, .pharaoh/plans/, .pharaoh/session.json, .pharaoh/cache/.` + Report: `".pharaoh/" entry is too wide; left in place with a warning.` +2. **All four narrow entries already present.** Do nothing. Report: `".pharaoh/ ephemeral paths already ignored -- no changes needed."` +3. **Some narrow entries missing.** Append the missing entries on new lines. If the file does not end with a newline, add one first. Report: `"Added <count> Pharaoh ephemeral-path entries to .gitignore."` + +If `.gitignore` does not exist, create it with: + +``` +# Pharaoh ephemeral state (do not commit). Project tailoring at .pharaoh/project/ IS committed. +.pharaoh/runs/ +.pharaoh/plans/ +.pharaoh/session.json +.pharaoh/cache/ +``` + +Report: `Created .gitignore with Pharaoh ephemeral-path entries.` + +--- + +### Step 5: Recommend Tooling + +#### 5a. ubc CLI recommendation + +If ubc CLI was not found in Step 1f, present: + +``` +Recommendation: Install the ubc CLI for faster, more accurate data access. 
+ +ubc provides deterministic JSON output for needs indexing, validation, +and impact analysis. It is the fastest data source Pharaoh can use. + +Install: https://ubcode.useblocks.com/ubc/installation.html + +Without ubc, Pharaoh falls back to reading RST/MD files directly. +This works but is slower on large projects. +``` + +If ubc CLI was found, present: + +``` +ubc CLI detected (version <version>). Pharaoh will use it for +fast, deterministic data access. +``` + +#### 5b. ubCode extension recommendation + +If ubCode MCP was not found in Step 1g, present: + +``` +Recommendation: Install the ubCode VS Code extension for the best experience. + +ubCode provides real-time indexing, MCP integration, and live +validation directly in your editor. Combined with ubc CLI, it +gives Pharaoh instant access to pre-indexed project data. + +Install from the VS Code marketplace: search for "ubCode". +``` + +If ubCode MCP was found, present: + +``` +ubCode MCP detected. Pharaoh will use it for real-time indexed +data access when available. +``` + +#### 5c. Present experience tiers + +``` +Pharaoh Experience Tiers +======================== + +Tier | What you have | Experience +---------|------------------------|--------------------------------------------- +Basic | Pharaoh only | AI reads files directly. Works everywhere, + | | slower on large projects. +Good | + ubc CLI | Fast deterministic indexing, JSON output, + | | CI/CD compatible. +Best | + ubc CLI + ubCode | Real-time indexing, MCP integration, live + | | validation, full schema checks. + +Your current tier: <Basic|Good|Best> +``` + +Determine the current tier: +- **Best**: Both ubc CLI and ubCode MCP are available. +- **Good**: ubc CLI is available but ubCode MCP is not. +- **Basic**: Neither ubc CLI nor ubCode MCP is available. 
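
The tier rules above reduce to a small decision function. The sketch below is illustrative only; `current_tier` is a hypothetical name. The list does not say what to report when ubCode MCP is present without ubc CLI, so the sketch falls back to `Basic` for that case, which is an assumption rather than documented behaviour.

```python
def current_tier(has_ubc_cli: bool, has_ubcode_mcp: bool) -> str:
    """Map the Step 1f / 1g detection results to an experience tier."""
    if has_ubc_cli and has_ubcode_mcp:
        return "Best"
    if has_ubc_cli:
        return "Good"
    # Covers "neither" and, by assumption, ubCode MCP without ubc CLI.
    return "Basic"
```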
+ +--- + +### Step 5b: Bootstrap tailoring from declared types and observed RST content + +After `pharaoh.toml` is written, generate `.pharaoh/project/{workflows,id-conventions,artefact-catalog}.yaml` plus `checklists/<type>.md` per declared type. The bootstrap is **descriptive**: it captures what the project already declares and what existing RST content already uses, falling back to Pharaoh-internal defaults only when no project signal is available. + +The base shapes and fallbacks are documented in `pharaoh-tailor-bootstrap` — invoke it for the structural emission. Before invoking it, gather the project-state inputs below and pass them as overrides so the emitted tailoring matches the project's reality, not a Pharaoh-internal placeholder set. + +#### 5b.1. Read `[needs.fields.X]` from `ubproject.toml` (artefact-catalog `optional_fields` and `required_metadata_fields`) + +For each declared `[needs.fields.<name>]` table in `ubproject.toml`: + +- The `<name>` is a sphinx-needs option key (e.g. `asil`, `severity`, `exposure`, `controllability`, `safe_state`). +- If the table declares `required = true` (or any explicit-required marker the project uses), add `<name>` to that type's `required_metadata_fields`. +- Otherwise, add `<name>` to that type's `optional_fields`. +- Scope: if `[needs.fields.<name>]` declares `applies_to = ["<type1>", "<type2>"]`, restrict to those types. Without an `applies_to`, treat as global and add to every declared type's `optional_fields`. + +When `[needs.fields.X]` is **declared** in `ubproject.toml`, the Pharaoh-internal placeholder set (`reviewer`, `approved_by`, `source_doc`) is appended only for types that do not already have at least one project-declared field — i.e., we add Pharaoh defaults on top of the project's own fields, never as a replacement. + +When `[needs.fields.X]` is **absent**, fall back to `pharaoh-tailor-bootstrap`'s built-in default (`optional_fields: [reviewer, approved_by, source_doc]`). + +#### 5b.2. 
Compute lifecycle from RST status histogram (`workflows.yaml lifecycle_states`) + +Glob `<source-dir>/**/*.rst` and parse `:status: <value>` from every sphinx-needs directive. Build a histogram of observed values. + +- If the histogram is non-empty and at least two distinct values appear, set `lifecycle_states` to the observed values, ordered by frequency descending. Emit a comment recording the histogram counts. +- If only one distinct value appears (e.g. every need has `:status: open`), still emit it as the first lifecycle state but append the Pharaoh defaults (`draft`, `reviewed`, `approved`) so transitions are at least defined. +- If no `:status:` fields are found anywhere, fall back to `pharaoh-tailor-bootstrap`'s default `[draft, reviewed, approved]`. + +Worked example for `useblocks/sphinx-needs-demo`-style histogram (`open: 145, closed: 16, passed: 7, approved: 2`): + +```yaml +# workflows.yaml — generated by pharaoh-setup with histogram override +# Observed status counts in <source-dir>/**/*.rst: +# open: 145, closed: 16, passed: 7, approved: 2 +lifecycle_states: + - open + - closed + - passed + - approved + +transitions: + - {from: open, to: passed, requires: []} + - {from: open, to: closed, requires: []} + - {from: passed, to: approved, requires: []} + # Add the inverse (passed -> open, approved -> open) only if the histogram or + # explicit project policy suggests they are reachable. The default is to + # leave the state machine as observed. +``` + +The transition graph is **not** inferred from the histogram (the histogram does not record transitions). The skill emits a permissive forward-only chain `state[i] -> state[i+1]`. The user is expected to edit transitions to match project policy; emit a comment naming this expectation. + +#### 5b.3. Detect ID-prefix collisions (`id-conventions.yaml prefixes`) + +Read `[[needs.types]]` from `ubproject.toml`. Build a map `prefix -> [directive...]`. Any prefix mapping to ≥2 directives is a collision. 
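
A minimal sketch of the collision map, assuming `[[needs.types]]` has already been parsed into a list of dicts with `directive` and `prefix` keys (the helper name `find_prefix_collisions` is hypothetical; a missing `prefix` is treated as the empty prefix):

```python
from collections import defaultdict


def find_prefix_collisions(need_types):
    """Return {prefix: [directive, ...]} for every prefix used by >= 2 directives."""
    by_prefix = defaultdict(list)
    for entry in need_types:
        # A type without a prefix shares the empty prefix "" with other such types.
        by_prefix[entry.get("prefix", "")].append(entry["directive"])
    return {prefix: directives
            for prefix, directives in by_prefix.items()
            if len(directives) >= 2}
```

Keeping the directives in declaration order makes the resulting warning text stable between runs.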
+ +Real-world example from `useblocks/sphinx-needs-demo`: +- `R_` declared on both `req` and `release` +- `T_` declared on both `test` and `team` +- (empty prefix `""`) declared on both `arch` and `need` + +Behaviour by strictness mode (the value chosen in Step 2a): + +- **`advisory`** — emit a WARN to the user listing each collision with a remediation hint, and proceed with the prefixes as-declared (`pharaoh-id-convention-check` will then surface ambiguous IDs at runtime). The warning text: + + ``` + WARNING: ID-prefix collisions detected in [[needs.types]]: + - R_ used for: req, release + - T_ used for: test, team + - "" (empty) used for: arch, need + + Disambiguate by giving each declared type a unique prefix in + ubproject.toml, e.g. release -> REL_, team -> TEAM_, need -> NEED_. + Until disambiguated, pharaoh-id-convention-check cannot tell a release + ID from a requirement ID and pharaoh-id-allocate may emit colliding IDs. + ``` + +- **`enforcing`** — FAIL with the same message and refuse to write `id-conventions.yaml`. The user must fix `[[needs.types]]` first. + +When no collisions are detected, emit `prefixes` directly from `[[needs.types]]` as today. + +#### 5b.4. Detect ID regex from observed IDs (`id-conventions.yaml id_regex`) + +Use the same sample collected in Step 2b's `[pharaoh.id_scheme]` detection: + +- If the dominant observed shape is `{TYPE}_{NUMBER}` and observed IDs match the union of declared prefixes + digits, emit the union-of-prefixes regex (current default). +- If the dominant shape is `{DOMAIN}_{NUMBER}` (leading token does not match any declared type prefix), emit a regex matching the observation: + + ```yaml + id_regex: "^[A-Z][A-Z0-9_]*_[0-9]+$" + ``` + + with a comment naming the sampled IDs. + +- If the observed shape is `{TYPE}-{MODULE}-{NUMBER}`, emit: + + ```yaml + id_regex: "^(REQ|SPEC|IMPL|TEST)-[A-Z]+-[0-9]+$" + ``` + + (substituting actually-declared prefixes). 
+ +- If observed IDs don't conform to a single shape, emit `id_regex: ".+"` with a TODO comment asking the user to declare the convention manually. + +Reject the heuristic union-of-prefixes regex when observed IDs do not match it — the regex would fail validation on every existing need. + +#### 5b.5. Emit Phase-5 release-gate fields per type (artefact-catalog) + +For each declared type, emit `required_links`, `optional_links`, `required_metadata_fields`, `required_roles` per the canonical schema (see `schemas/artefact-catalog.schema.json`): + +- **`required_links`:** for each `[needs.links.<name>]` (also written `[needs.extra_links]` in older sphinx-needs configs) declared with `required = true` — or, when the project has a built `needs.json`, for each link option that 100% of existing needs of this type carry (per Source 1 in `[pharaoh.traceability]` direction inference) — include the option name. +- **`optional_links`:** every other declared link option that is legal on this type (per `[needs.links.<name>] applies_to`, or default to "any declared option not in `required_links`"). Drop overlap with `required_links`. +- **`required_metadata_fields`:** every `[needs.fields.<name>]` declared with `required = true` for this type, plus `status` (every governed type has a lifecycle, so `status` is always required). When the project declares no required fields, emit `[status]`. +- **`required_roles`:** if the project declares any field whose name implies a role (`reviewer`, `approver`, `approved_by`, `responsible`, `assignee`), include the matching options. Otherwise emit `[]` — explicit "no policy", surfaced by `pharaoh-tailor-review` C6 if the user later wants to enforce a review gate. + +`pharaoh-tailor-bootstrap` handles the structural emission; `pharaoh-setup` supplies the derived inputs above. + +#### 5b.6. Invoke `pharaoh-tailor-bootstrap` + +After gathering 5b.1 through 5b.5, invoke `pharaoh-tailor-bootstrap` with: +- `project_root` = the workspace root. 
+- `on_missing_config` = `"prompt"` (so the user confirms the generated content). +- An overrides bundle carrying the descriptive values from 5b.1–5b.5. When `pharaoh-tailor-bootstrap` does not yet support an explicit overrides input, the caller is responsible for editing the emitted YAML in place to apply the overrides before showing the user the final form. Document this gap; the structural shape is unchanged. + +If the user rejects the proposal, skip — the caller may run `pharaoh-tailor-fill` later (after needs exist) as the alternative path. The `pharaoh-tailor-fill` skill is in fact the canonical descriptive author for matured projects (≥10 needs); `pharaoh-setup` here only seeds the file with what's available at setup time so the project doesn't sit with placeholder defaults. + +#### 5b.7. Worked example — `useblocks/sphinx-needs-demo` + +Concrete walk-through showing how Steps 2 through 5b together emit descriptive tailoring on a project that exposes every defect listed in issue #13 §8. + +**Input — `ubproject.toml` excerpt (paraphrased):** + +```toml +[[needs.types]] +directive = "req" +prefix = "R_" + +[[needs.types]] +directive = "release" +prefix = "R_" # collision with req + +[[needs.types]] +directive = "test" +prefix = "T_" + +[[needs.types]] +directive = "team" +prefix = "T_" # collision with test + +[[needs.types]] +directive = "fsr" +prefix = "FSR_" + +[needs.fields.asil] +applies_to = ["fsr", "safety_goal", "hazard"] +required = true + +[needs.fields.severity] +applies_to = ["hazard"] + +[needs.fields.scenario] +[needs.fields.safe_state] +[needs.fields.customer] + +[needs.links.satisfies] +[needs.links.verifies] +[needs.links.derives_from] +``` + +**Input — observed RST IDs and statuses:** + +- `BRAKE_CTRL_01`, `BRAKE_CTRL_02`, `FSR_POWER_01`, `FSR_POWER_02`, ... (20 sampled, all matching `^[A-Z][A-Z0-9_]*_[0-9]+$`, none matching `^(R_|T_|FSR_)[0-9]+$`) +- Status histogram: `open: 145, closed: 16, passed: 7, approved: 2`. 
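
The mismatch between the sampled IDs and the declared prefixes can be checked mechanically. The sketch below is a partial, hypothetical rendering of the Step 5b.4 decision (`choose_id_regex` is not an existing Pharaoh function, and the `{TYPE}-{MODULE}-{NUMBER}` branch is omitted for brevity). Returning `".+"` for an empty sample is an assumption made here.

```python
import re

# Generic {DOMAIN}_{NUMBER} shape, as quoted in Step 5b.4.
GENERIC_ID = r"^[A-Z][A-Z0-9_]*_[0-9]+$"


def choose_id_regex(sample_ids, declared_prefixes):
    """Pick the narrowest regex that every sampled ID actually matches."""
    union = "^(" + "|".join(re.escape(p) for p in declared_prefixes) + ")[0-9]+$"
    if sample_ids and all(re.match(union, i) for i in sample_ids):
        return union          # IDs lead with a declared type prefix
    if sample_ids and all(re.match(GENERIC_ID, i) for i in sample_ids):
        return GENERIC_ID     # IDs lead with some other uppercase token
    return ".+"               # no single shape; the user must declare it
```

With the sample from this walk-through, `FSR_POWER_01` fails the union regex because `POWER_01` is not purely numeric, so the generic shape is selected.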
+ +**Output — `pharaoh.toml`:** + +```toml +[pharaoh] +strictness = "advisory" + +[pharaoh.id_scheme] +# Inferred from 20 sampled IDs in source-dir/**/*.rst. +# Observed shape: {DOMAIN}_{NUMBER} (e.g. BRAKE_CTRL_01, FSR_POWER_01). +# Note: declared type prefixes (R_, T_, FSR_) do not match the leading +# token of observed IDs — IDs lead with a domain name, not a type prefix. +pattern = "{DOMAIN}_{NUMBER}" +auto_increment = true + +[pharaoh.workflow] +mode = "reverse-eng" +# Gates tuned for reverse-eng — tighten as the catalogue stabilises. +require_change_analysis = false +require_verification = true +require_mece_on_release = false + +[pharaoh.traceability] +# Direction inferred from needs.json edges (Source 1). +required_links = [ + "spec -> req", # 100% of spec needs link to req via :satisfies: + "fsr -> safety_goal", # 100% of fsr needs link to safety_goal via :derives_from: +] + +[pharaoh.codelinks] +enabled = false +``` + +**Output — pre-bootstrap WARNINGS (advisory mode):** + +``` +WARNING: ID-prefix collisions detected in [[needs.types]]: + - R_ used for: req, release + - T_ used for: test, team + +Disambiguate by giving each declared type a unique prefix in +ubproject.toml, e.g. release -> REL_, team -> TEAM_. 
+``` + +**Output — `.pharaoh/project/workflows.yaml`:** + +```yaml +# Observed status counts in <source-dir>/**/*.rst: +# open: 145, closed: 16, passed: 7, approved: 2 +lifecycle_states: + - open + - closed + - passed + - approved + +transitions: + - {from: open, to: passed, requires: []} + - {from: open, to: closed, requires: []} + - {from: passed, to: approved, requires: []} +``` + +**Output — `.pharaoh/project/id-conventions.yaml`:** + +```yaml +prefixes: + req: R_ + release: R_ # COLLISION — flagged, not silently merged + test: T_ + team: T_ # COLLISION — flagged + fsr: FSR_ +id_regex: "^[A-Z][A-Z0-9_]*_[0-9]+$" +separator: "_" +``` + +**Output — `.pharaoh/project/artefact-catalog.yaml` excerpt for `fsr`:** + +```yaml +fsr: + required_fields: [id, status, title, asil] + optional_fields: [scenario, safe_state, customer, reviewer, approved_by, source_doc] + lifecycle: [open, closed, passed, approved] + required_links: [derives_from] + optional_links: [satisfies, verifies] + required_metadata_fields: [status, asil] + required_roles: [] +``` + +Compare to the prescriptive default this skill emitted before the rewrite, which would have produced `optional_fields: [reviewer, approved_by, source_doc]` (Pharaoh-internal placeholder set), `lifecycle: [draft, reviewed, approved]` (Pharaoh-internal default), `required_metadata_fields: [status]` (no `asil` despite the project declaring it required), `id_regex: "^(R_|T_|FSR_)[0-9]+$"` (which fails on every actual ID in the project), and `pattern = "{TYPE}_{NUMBER}"` (which assumes the project's IDs lead with a declared type prefix). + +The descriptive emission captures what the project already declares and uses; the prescriptive emission imposed a Pharaoh-internal world-view onto a project that had never agreed to it. 
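
For reference, the lifecycle ordering used in this walk-through (Step 5b.2) can be reproduced with a short sketch. It assumes the RST sources have already been read into strings and that every `:status:` option sits on its own line; `lifecycle_states` is a hypothetical helper name, and skipping a default that duplicates the single observed value is an assumption of the sketch.

```python
import re
from collections import Counter

DEFAULT_STATES = ["draft", "reviewed", "approved"]
STATUS_RE = re.compile(r"^[ \t]*:status:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def lifecycle_states(rst_texts):
    """Return (ordered states, histogram) following the three Step 5b.2 rules."""
    histogram = Counter()
    for text in rst_texts:
        histogram.update(STATUS_RE.findall(text))
    # most_common() sorts by count descending; ties keep first-seen order.
    observed = [state for state, _ in histogram.most_common()]
    if len(observed) >= 2:
        return observed, histogram
    if len(observed) == 1:
        extra = [s for s in DEFAULT_STATES if s not in observed]
        return observed + extra, histogram
    return list(DEFAULT_STATES), histogram
```

The histogram is returned as well so the caller can emit the counts as a comment at the top of `workflows.yaml`.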
+ +--- + +### Step 6: Summary + +Present a final summary of everything that was configured: + +``` +Pharaoh Setup Complete +====================== + +Configuration: + pharaoh.toml: <created | updated | skipped> (<path>) + Strictness: <advisory | enforcing> + Mode: <reverse-eng | greenfield | steady-state> + Workflow: change=<on|off>, verification=<on|off>, mece=<on|off> + Codelinks: <enabled | disabled> + Traceability: <N required link chains | no required links> + +Copilot agents: <installed (<count> agents, <count> prompts) | skipped> + +.gitignore: <updated | already configured | created> + +Data access tier: <Basic | Good | Best> + +Detected projects: + <project name> (<path>) + Types: <comma-separated> + Links: <comma-separated> + +Available skills (Claude Code): + <enumerate from `skills/pharaoh-*/SKILL.md` frontmatter at runtime — + do not hardcode. The skill list has grown beyond the original 8 happy-path + agents to include atomic skills like pharaoh:req-draft, pharaoh:req-review, + pharaoh:arch-draft, pharaoh:arch-review, pharaoh:vplan-draft, + pharaoh:vplan-review, pharaoh:fmea, pharaoh:tailor-detect, + pharaoh:tailor-fill, pharaoh:audit-fanout, and others.> +``` + +If Copilot agents were installed, also show: + +``` +Available agents (GitHub Copilot): + <enumerate from the copied .github/agents/pharaoh.*.agent.md files — + do not hardcode. One entry per installed agent, formatted as @pharaoh.<name>.> + +Orchestration agents (coordinate atomic agents for end-to-end flows): + @pharaoh.flow, @pharaoh.process-audit, @pharaoh.write-plan, @pharaoh.execute-plan, ... + (again, discover from installed agents rather than hardcoding) + +For reverse-engineering requirements or architecture from code, use + @pharaoh.write-plan to generate a plan.yaml (choose a template such as + reverse-engineer-project or reverse-engineer-module) and @pharaoh.execute-plan + to run it. The deleted @pharaoh.reqs-from-module skill has been replaced by + this plan-based flow. 
+``` + +End with a recommendation to run the MECE check: + +``` +Next step: Run pharaoh:mece to get an overview of your project's +requirements health -- gaps, orphans, and traceability coverage. +``` + +--- + +## Key Constraints + +1. **Never overwrite files without asking.** Always check if a target file exists before writing. If it exists, show a diff and ask the user what to do. +2. **Always show what will be created or modified before doing it.** Present file contents or file lists and get explicit confirmation. +3. **Work with any sphinx-needs project structure.** Handle single-project and multi-project setups. Handle `ubproject.toml`, `conf.py`, or both. Handle projects with or without sphinx-codelinks. +4. **Do not duplicate sphinx-needs configuration.** `pharaoh.toml` controls only Pharaoh's own behavior. Need types, link types, and ID settings are read from `ubproject.toml` or `conf.py` -- never re-defined in `pharaoh.toml`. +5. **Degrade gracefully.** If ubc CLI is not available, do not fail. If Copilot templates are missing, skip Copilot setup with a clear message. If no project is detected, ask the user for guidance. +6. **This skill has no workflow gates.** It runs freely regardless of strictness mode. It does not read or write `.pharaoh/session.json`. diff --git a/.github/agents/pharaoh.spec.agent.md b/.github/agents/pharaoh.spec.agent.md index a3b87b5..e0dfe13 100644 --- a/.github/agents/pharaoh.spec.agent.md +++ b/.github/agents/pharaoh.spec.agent.md @@ -122,3 +122,526 @@ Both modes perform identical analysis depth. Strictness only affects the `Requir 4. **Never auto-execute.** Present the complete spec and wait for approval before invoking downstream skills. 5. **Single combined spec for multiple requirements.** Do not produce separate documents. 6. **No session state changes from spec generation.** Only @pharaoh.decide and @pharaoh.plan update session state. 
+ +--- + +## Full atomic specification + +# pharaoh-spec + +Generate a self-contained specification and plan document from one or more sphinx-needs +requirements. The spec bridges the gap between requirements (the "what") and +implementation (the "how") by pulling full requirement text, mapping existing downstream +coverage, identifying gaps, recording design decisions via `pharaoh:decide`, and +producing a plan table that feeds directly into `pharaoh:plan`. + +The output is a markdown document in `docs/superpowers/specs/` containing requirements, +coverage analysis, gap list, decisions, and an actionable plan table. + +## When to Use + +- You have one or more requirement IDs and need a structured spec before implementation begins. +- You want to understand what downstream coverage already exists for a set of requirements (specs, impls, tests) and what gaps remain. +- You need to record design decisions for uncovered areas before authoring new needs. +- You want a single document that a team can review before executing changes via `pharaoh:plan`. +- A new feature has been captured as requirements and you need to decompose it into specifications, implementations, and test cases with full traceability. + +## Prerequisites + +- The workspace must contain at least one sphinx-needs project. +- No workflow gates. This skill runs freely in both advisory and enforcing modes. +- If decisions need to be recorded, the project must have a `decision` type configured (see `pharaoh:decide` Step 1). + +--- + +## Process + +Execute the following steps in order. + +--- + +### Step 1: Get Project Data + +Follow the full detection and data access algorithm defined in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md). + +1. Detect project structure (project roots, source directories, configuration). +2. Read project configuration (need types, link types, ID settings). +3. 
Build the needs index using the best available data tier (ubc CLI, ubCode MCP, or raw file parsing). +4. Build the link graph with all relationships in both directions (outgoing and incoming for every link type including extra_links). +5. Read `pharaoh.toml` for strictness level, workflow gates, traceability requirements, and `required_links` chains. + +Present the detection summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Strictness: <advisory|enforcing> +``` + +If detection fails (no project found, no needs in source files), report the issue and ask the user for guidance. Do not proceed with empty data. + +--- + +### Step 2: Parse Input + +Accept one or more requirement IDs from the user's request. + +#### When the user provides IDs directly + +Validate each ID against the needs index. If an ID does not exist: + +1. Report that the ID was not found. +2. Suggest possible matches (typo correction, similar IDs). +3. Ask the user to confirm or provide the correct ID. + +#### When the user provides natural language + +Resolve to IDs using the needs index. Match by: + +1. Exact title match (case-insensitive). +2. Substring match in title. +3. Substring match in content. +4. Tag match. + +If multiple needs match, present candidates and ask the user to choose: + +``` +Multiple matches found: +1. REQ_001 (Requirement: Brake response time) [open] +2. REQ_007 (Requirement: Brake pedal response) [approved] +Which requirement(s) should be included in the spec? Enter numbers or IDs. +``` + +If exactly one matches, proceed with it and inform the user of the resolved ID. + +#### Multiple requirements + +When called with multiple requirement IDs, produce a single combined spec document covering all of them. Do not produce separate documents per requirement. 
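
The natural-language matching order described earlier in this step can be sketched as a tiered lookup. This is an illustration under assumptions: the needs index is taken to be a dict of `id -> need` with `title`, `content`, and `tags` keys, the four criteria are read as a priority order (the first tier with any hit wins), and `resolve_requirement` is a hypothetical name.

```python
def resolve_requirement(query, needs):
    """Return the candidate IDs from the first matching tier, or [] if none match."""
    q = query.strip().lower()
    if not q:
        return []
    tiers = [
        lambda n: n.get("title", "").lower() == q,            # 1. exact title
        lambda n: q in n.get("title", "").lower(),             # 2. substring in title
        lambda n: q in n.get("content", "").lower(),           # 3. substring in content
        lambda n: q in [t.lower() for t in n.get("tags", [])], # 4. tag match
    ]
    for matches in tiers:
        hits = sorted(need_id for need_id, need in needs.items() if matches(need))
        if hits:
            return hits
    return []
```

A single returned ID can be used directly; several IDs correspond to the "Multiple matches found" prompt shown above.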
+ +--- + +### Step 3: Resolve Requirements Scope + +For each input requirement, build a complete scope tree. + +#### 3a: Pull full requirement text + +For each input requirement, retrieve all available fields: + +- ID, title, type, status +- Full content text +- Tags +- All link fields (links, implements, tests, and any extra_links) +- Any custom fields defined in the project configuration + +This is the **full text** -- requirements are the source of truth and must appear verbatim in the spec document. + +#### 3b: Trace downstream coverage + +Starting from each input requirement, follow the link graph recursively to find all downstream needs. Follow all link types (links, implements, tests, and all extra_links) in both directions. Continue until the full downstream tree is mapped. + +For each downstream need, collect **references only** (ID, type, title, status, link type to parent). Do NOT pull full content for downstream needs -- they are resolvable via `ubc` or source files if needed later. + +#### 3c: Build the scope tree + +Assemble a tree showing the requirement at the root with all downstream coverage: + +``` +REQ_042 (full text) ++-- SPEC_010 (ref) -- exists, status: open ++-- SPEC_011 (ref) -- exists, status: approved +| +-- IMPL_005 (ref) -- exists, status: open ++-- [gap] -- no spec covers subsystem X + +-- [gap] -- no impl for subsystem X + +-- [gap] -- no test for subsystem X +``` + +#### 3d: Identify gaps + +Determine what downstream coverage is missing using `pharaoh.toml` `required_links` chains. If `required_links` is configured, follow the chain (e.g., `req -> spec -> impl -> test`). If not configured, infer expected chains from configured types and link types. + +Gaps include: requirements with no spec, specs with no impl, impls with no test, and requirements whose content suggests multiple subsystems with only partial spec coverage. 
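
A minimal sketch of the chain-based gap check, assuming the chain has been split into an ordered list of type names and the link graph is available as an adjacency map that already merges outgoing and incoming links. The names `find_gaps` and `neighbours` are hypothetical. The sketch reports only the first missing level under each need; content-based gaps (for example, partially covered subsystems) are out of its scope.

```python
def find_gaps(needs, neighbours, chain):
    """Return (parent_id, missing_type) pairs for every broken step in the chain.

    needs:      {"REQ_1": {"type": "req"}, ...}
    neighbours: {"REQ_1": ["SPEC_1"], ...}  links in both directions
    chain:      ["req", "spec", "impl", "test"]
    """
    gaps = []
    for parent_type, child_type in zip(chain, chain[1:]):
        for need_id in sorted(needs):
            if needs[need_id]["type"] != parent_type:
                continue
            linked_types = {needs[n]["type"]
                            for n in neighbours.get(need_id, ())
                            if n in needs}
            if child_type not in linked_types:
                gaps.append((need_id, child_type))
    return gaps
```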
+ +For each gap, record the parent need ID, what is missing, and whether it represents a decision point (ambiguity in decomposition or approach). + +--- + +### Step 4: Present Scope Summary + +Before generating the spec document, present a summary for the user to review: + +``` +Scope for REQ_042: + Requirements: 1 (full text included) + Specifications: 2 (references only) + Implementations: 1 (reference only) + Test cases: 0 + Gaps: 2 (no spec for subsystem X, no test for IMPL_005) + Decisions needed: 2 +``` + +For multiple requirements, show a combined summary with the same format. + +If the scope is unexpectedly large (more than 30 downstream needs), warn the user and suggest splitting into separate specs per requirement. + +Wait for user confirmation before proceeding to Step 5. + +--- + +### Step 5: Make Design Decisions + +For each gap or ambiguity identified in Step 3d, determine whether a design decision is needed. + +#### When decisions are needed + +Decisions are needed when: + +- **Missing spec coverage**: How should the requirement be decomposed into specifications? What design approach should be taken? +- **Multiple implementation approaches**: Which technology, algorithm, or architecture should be used? +- **Missing test coverage**: What verification method is appropriate (unit test, integration test, manual review)? +- **Conflicting constraints**: Two requirements impose contradictory constraints on a shared specification. + +#### Recording decisions + +For each decision, invoke `pharaoh:decide` programmatically with all context: + +- **Title**: A clear statement of the decision (e.g., "Decompose REQ_042 into timing and protocol specifications"). +- **decides**: The need IDs affected by this decision. +- **decided_by**: `claude` (since the AI is generating the spec). +- **alternatives**: At least two alternatives considered, semicolon-separated. +- **rationale**: Why this option was chosen. 
+- **status**: `accepted` (decisions made during spec generation are accepted by default). + +`pharaoh:decide` will generate the decision ID, write the RST directive, and return the ID. Collect all decision IDs for use in Step 6. + +**Important**: Write all decisions BEFORE generating the spec document. The spec must reference stable decision IDs, not placeholders. + +#### When no decisions are needed + +If all gaps are straightforward (e.g., a missing test case for an existing implementation where the test strategy is obvious), skip decision recording. Note in the spec that no design decisions were required. + +--- + +### Step 6: Generate the Spec Document + +Write the spec document to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`. + +- `YYYY-MM-DD` is the current date. +- `<topic>` is a short kebab-case slug derived from the requirement title(s) (e.g., `brake-response-time`). +- The user may override the file path. If they specify a different location, use it. + +Create the `docs/superpowers/specs/` directory if it does not exist. + +#### Document structure + +The spec document MUST contain these sections in this order: + +```markdown +# Spec: <Requirement title(s)> + +Generated from sphinx-needs on YYYY-MM-DD. 
+Source requirements: REQ_042, REQ_043 + +## Requirements (source of truth) + +### REQ_042: <title> +**Status:** <status> | **Tags:** <tag1>; <tag2> + +<Full requirement content text pulled verbatim from sphinx-needs.> + +### REQ_043: <title> +**Status:** <status> | **Tags:** <tag1>; <tag2> + +<Full requirement content text pulled verbatim from sphinx-needs.> + +## Existing coverage + +| Need | Type | Title | Status | Links | +|------|------|-------|--------|-------| +| SPEC_010 | spec | Signal timing | open | REQ_042 | +| SPEC_011 | spec | Protocol design | approved | REQ_042 | +| IMPL_005 | impl | CAN driver | open | SPEC_011 | + +## Gaps + +- [ ] No specification covers subsystem X of REQ_042 +- [ ] No test case for IMPL_005 +- [ ] No implementation for SPEC_010 + +## Decisions + +- DEC_001: Decompose REQ_042 into timing and protocol specifications +- DEC_002: Use CAN bus for sensor communication + +> If no decisions were needed, write: "No design decisions required. All gaps are +> covered by straightforward additions." + +## Implementation scope + +### Needs to create +| Type | Purpose | Links to | File | +|------|---------|----------|------| +| spec | Subsystem X timing | REQ_042 | specifications.rst | +| test | CAN driver verification | IMPL_005 | test_cases.rst | + +### Needs to modify +| Need | Change | Reason | +|------|--------|--------| +| SPEC_010 | Update timing constraints | REQ_042 timing budget changed | + +> If no needs to create or modify, write "None" for the respective subsection. 
+ +## Plan table + +| # | Task | Skill | Target | Detail | File | Required | +|---|------|-------|--------|--------|------|----------| +| 1 | Analyze impact | pharaoh:change | REQ_042 | Trace downstream effects | docs/requirements.rst | yes | +| 2 | Author spec | pharaoh:arch-draft | (new) | Subsystem X timing | docs/specifications.rst | yes | +| 3 | Author test | pharaoh:vplan-draft | (new) | CAN driver verification | docs/test_cases.rst | yes | +| 4 | Update spec | pharaoh:arch-draft | SPEC_010 | Timing constraints | docs/specifications.rst | yes | +| 5 | Verify | pharaoh:arch-review, pharaoh:vplan-review | (all) | Check traceability and per-type axes | -- | yes | +``` + +#### Section rules + +1. **Requirements**: Full verbatim text. The spec must be self-contained for requirement content. +2. **Existing coverage**: Reference table only, sorted by type (specs, impls, tests). +3. **Gaps**: Checkbox list (unchecked). One item per gap from Step 3d. +4. **Decisions**: List each by ID and title, referencing the decision need written in Step 5. +5. **Implementation scope**: "Needs to create" (with suggested target files) and "Needs to modify" (with change description and reason). Write "None" if a subsection is empty. +6. **Plan table**: Built in Step 7. Same columns and format as `pharaoh:plan` Step 5. + +--- + +### Step 7: Build the Plan Table + +Construct the plan table following `pharaoh:plan` task sequencing rules. + +#### Task ordering + +1. **Change analysis first** (if modifying existing needs): One `pharaoh:change` task per modified need, or a single task covering all modifications. +2. **Author needs top-down**: Requirements before specifications, specifications before implementations, implementations before test cases. New needs before modifications at each level. +3. **Verify after all authoring**: One review skill task (e.g. `pharaoh:req-review`) covering all created and modified needs. +4. 
**MECE check if configured**: Include a `pharaoh:mece` task if `require_mece_on_release = true` in `pharaoh.toml`, or if the scope involves creating needs at multiple hierarchy levels. + +#### Task format + +Each task row must specify: + +- **#**: Sequential number starting from 1. +- **Task**: Concise description of what the task does. +- **Skill**: The exact Pharaoh skill to invoke (e.g., `pharaoh:change`, `pharaoh:req-draft`, `pharaoh:req-review`, `pharaoh:mece`). +- **Target**: The need ID being acted on, or `(new)` for needs to create, or `(all)` for verification tasks. +- **Detail**: A specific description of the change or action. Not vague -- name the exact property or content being changed. +- **File**: The target file path for the task (e.g., `docs/requirements.rst`, `docs/specifications.rst`), or `--` if not applicable. +- **Required**: `yes`, `no`, or `recommended` based on strictness mode. + +#### Required field rules + +- **Enforcing mode**: Tasks mandated by workflow gates are marked `yes`. Optional tasks (like MECE check when not required) are marked `recommended`. +- **Advisory mode**: All tasks are marked `recommended`. No task is strictly required. + +#### Plan table constraints + +- The plan table MUST use the exact same format and column names as `pharaoh:plan` Step 5. This ensures the plan can be handed off to `pharaoh:plan` for execution without reformatting. +- One skill invocation per task. Do not combine multiple skill calls into a single row. +- Every task must have a concrete target. Vague tasks like "update related specs" are not acceptable. + +--- + +### Step 8: Handoff + +After generating the spec document, present the file path and offer next steps: + +``` +Spec document written to: docs/superpowers/specs/2026-04-07-brake-response-time-design.md + +Options: + 1. Execute the plan via pharaoh:plan + 2. Review or modify the spec first + 3. 
Execute later (plan is saved in the spec document) +``` + +- **Option 1**: Invoke `pharaoh:plan` with the plan table from the spec document. +- **Option 2**: Allow edits to any section, regenerate affected parts, then re-offer options. +- **Option 3**: Confirm the spec is saved. No further action. + +**Never auto-execute.** Always present the spec and wait for explicit user approval. + +--- + +## Strictness Behavior + +Follow the instructions in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md). + +### Advisory mode + +- Execute freely. No gates. This skill has no prerequisites. +- All plan table tasks are marked `recommended` instead of `yes`. +- Decisions are still recorded (they provide traceability regardless of strictness). +- After generating the spec, show a tip if the user skips review: + ``` + Tip: Consider reviewing the spec before executing the plan. + The spec captures design decisions that affect downstream authoring. + ``` + +### Enforcing mode + +- Execute freely. No gates. This skill has no prerequisites. +- Plan table tasks mandated by workflow gates are marked `yes`: + - `pharaoh:change` tasks are required if `require_change_analysis = true`. + - review skill tasks are required if `require_verification = true`. + - `pharaoh:mece` tasks are required if `require_mece_on_release = true`. +- Decisions are recorded with status `accepted` (same as advisory mode). +- The spec document clearly marks which plan tasks are mandatory. + +### Strictness has no effect on analysis depth + +Both advisory and enforcing modes perform the same scope resolution, gap analysis, and decision recording. Strictness only affects the `Required` column in the plan table and whether downstream skills gate on prerequisites. + +--- + +## Key Constraints + +1. **Requirements get full text, downstream needs get references only.** The spec is self-contained for requirements but does not duplicate downstream content. 
Downstream details are resolvable via `ubc` or by reading source files. + +2. **Decisions are written as sphinx-needs objects before the spec references them.** Never reference a decision ID that has not been written. Always invoke `pharaoh:decide` first, collect the ID, then use it in the spec. + +3. **Plan table format must match `pharaoh:plan` exactly.** Same column names, same task granularity, same required-field semantics. The plan table in the spec must be directly executable by `pharaoh:plan`. + +4. **Spec doc location defaults to `docs/superpowers/specs/` but is overridable.** If the user specifies a different path, use it without question. + +5. **Never auto-execute.** Always present the complete spec document and wait for explicit user approval before invoking any downstream skill. This applies even if the plan has only one task. + +6. **When called with multiple requirement IDs, produce a single combined spec.** Do not generate separate documents per requirement. The scope tree, gap analysis, and plan table cover all input requirements together. + +7. **Keep the spec document focused.** Do not include implementation details, code snippets, or design elaborations beyond what the decisions capture. The spec is a bridge document -- it connects requirements to a plan, not a detailed design document. + +8. **No session state changes from spec generation alone.** Generating and writing the spec document does not modify `.pharaoh/session.json`. Only decision recording (via `pharaoh:decide`) and plan execution (via `pharaoh:plan`) update session state. + +--- + +## Examples + +### Example 1: Single requirement with gaps + +**User request**: `pharaoh:spec REQ_001` + +**Step 1** -- Data access detects: + +``` +Project: Brake System (ubproject.toml) +Types: req, spec, impl, test, decision +Links: links, implements, tests, decides +Data source: Tier 3 (raw file parsing) +Needs found: 12 +Strictness: advisory +``` + +**Step 2** -- Input: `REQ_001`. 
Validated against needs index. Found: `REQ_001` (Brake response time). + +**Step 3** -- Scope resolution: + +Full text retrieved for REQ_001: +- Title: "Brake response time" +- Status: approved +- Tags: safety; braking +- Content: "The brake system shall respond within 100ms of pedal input under all operating conditions." + +Downstream trace: +``` +REQ_001 (full text) ++-- SPEC_001 (ref) -- Signal timing, status: open +| +-- IMPL_001 (ref) -- CAN driver, status: open ++-- [gap] -- no spec for subsystem X (pedal sensor interface) + +-- [gap] -- no impl + +-- [gap] -- no test ++-- [gap] -- no test for IMPL_001 +``` + +Gaps identified: +1. No specification covers the pedal sensor interface subsystem. +2. No test case for IMPL_001 (CAN driver). + +**Step 4** -- Scope summary presented: + +``` +Scope for REQ_001: + Requirements: 1 (full text included) + Specifications: 1 (reference only) + Implementations: 1 (reference only) + Test cases: 0 + Gaps: 2 (no spec for pedal sensor interface, no test for IMPL_001) + Decisions needed: 2 +``` + +User confirms: proceed. + +**Step 5** -- Decisions recorded via `pharaoh:decide`: + +1. `DEC_003`: "Decompose pedal sensor interface into separate specification" + - decides: REQ_001 + - alternatives: Include in SPEC_001; Create standalone spec + - rationale: Pedal sensor interface is safety-critical and warrants independent review + - Result: DEC_003 written to decisions.rst + +2. 
`DEC_004`: "Use hardware-in-the-loop testing for CAN driver verification" + - decides: IMPL_001 + - alternatives: Unit test with mock CAN; HIL testing; Manual bench test + - rationale: Safety-critical braking path requires realistic signal conditions + - Result: DEC_004 written to decisions.rst + +**Step 6** -- Spec document generated at `docs/superpowers/specs/2026-04-07-brake-response-time-design.md` with: +- Full text of REQ_001 in Requirements section +- Coverage table: SPEC_001 (open), IMPL_001 (open) +- Gaps: no spec for pedal sensor interface, no test for IMPL_001 +- Decisions: DEC_003, DEC_004 +- Implementation scope: create 1 spec (pedal sensor timing) and 1 test (CAN driver HIL) +- Plan table (advisory mode, all `recommended`): + +| # | Task | Skill | Target | Detail | File | Required | +|---|------|-------|--------|--------|------|----------| +| 1 | Analyze impact | pharaoh:change | REQ_001 | Trace downstream effects of new spec | docs/requirements.rst | recommended | +| 2 | Author spec | pharaoh:req-draft | (new) | Pedal sensor interface timing spec | docs/specifications.rst | recommended | +| 3 | Author test | pharaoh:req-draft | (new) | CAN driver HIL test case | docs/test_cases.rst | recommended | +| 4 | Verify coverage | pharaoh:req-review | (all) | Check REQ_001 traceability chain | -- | recommended | + +**Step 8** -- Handoff: + +``` +Spec document written to: docs/superpowers/specs/2026-04-07-brake-response-time-design.md + +Options: + 1. Execute the plan via pharaoh:plan + 2. Review or modify the spec first + 3. Execute later (plan is saved in the spec document) +``` + +--- + +### Example 2: Multiple requirements, full coverage + +**User request**: `pharaoh:spec REQ_001 REQ_002` + +**Step 1** -- Same detection as Example 1. + +**Steps 2-3** -- Both IDs validated. Scope resolution finds full downstream coverage for both requirements (each has spec, impl, and test in approved status). No gaps, no decisions needed. 
+ +**Step 4** -- Scope summary: + +``` +Scope for REQ_001, REQ_002: + Requirements: 2 (full text included) + Specifications: 2 (references only) + Implementations: 2 (references only) + Test cases: 2 (references only) + Gaps: 0 + Decisions needed: 0 +``` + +**Steps 5-6** -- Decisions skipped. Spec document generated at `docs/superpowers/specs/2026-04-07-brake-system-design.md` with both requirements in full text, a complete coverage table (6 downstream needs), empty gaps section ("No gaps identified"), empty decisions section, and no plan table ("No tasks required. All requirements have complete traceability chains."). + +**Step 8** -- Handoff offers: review the spec, run `pharaoh:req-review` to confirm traceability, or done. diff --git a/.github/agents/pharaoh.sphinx-extension-add.agent.md b/.github/agents/pharaoh.sphinx-extension-add.agent.md index 9f9e20c..a8d2bbe 100644 --- a/.github/agents/pharaoh.sphinx-extension-add.agent.md +++ b/.github/agents/pharaoh.sphinx-extension-add.agent.md @@ -7,4 +7,158 @@ handoffs: [] Use when you need to idempotently add one or more sphinx extension modules to a project's `conf.py` extensions list, optionally installing the corresponding pypi packages via the detected package manager. Invoked by plans produced by pharaoh-write-plan when a diagram-emitting task requires a renderer extension that `conf.py` does not yet load. Does NOT emit RST. Does NOT build. -See [`skills/pharaoh-sphinx-extension-add/SKILL.md`](../../skills/pharaoh-sphinx-extension-add/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-sphinx-extension-add + +## When to use + +Invoke when a plan requires a Sphinx extension that `conf.py` does not currently load (e.g. `sphinxcontrib.mermaid` for Mermaid diagrams, `sphinxcontrib.plantuml` for PlantUML, `sphinx_needs` for sphinx-needs itself). 
Typical caller: `pharaoh-execute-plan` executing a task that `pharaoh-write-plan` inserted as a prerequisite to a diagram-emitting task when Step 3.5's dep probe found a missing extension. + +Do NOT invoke to set arbitrary `conf.py` variables — this skill only touches the `extensions` list (and optionally triggers a pypi install). Do NOT invoke to load `sphinx_needs` on a project that never had sphinx-needs — that is `pharaoh-bootstrap`'s indivisible concern (which already includes extension injection as part of the bootstrap transaction). + +## Atomicity + +- (a) **Indivisible.** One `conf.py` + one extension list in → one updated `conf.py` (and optionally one package-manager install) out. No other `conf.py` mutation. No RST edits. No downstream skill invocation. +- (b) **Typed I/O.** + - Input: `{conf_py: str, extensions: list[str], install_if_missing: bool, on_package_manager_missing?: "fail"|"warn"|"skip", reporter_id: str}`. + - Output: `{files_modified: list[str], extensions_added: list[str], extensions_already_present: list[str], install_command_used: str | null, packages_installed: list[str], warnings: list[str]}`. Idempotent: when the extensions are already present AND (installed OR `install_if_missing == false`), `files_modified` is `[]` and `install_command_used` is `null`. +- (c) **Execution-based reward.** Fixture `pharaoh-validation/fixtures/pharaoh-sphinx-extension-add/`: + - `case_fresh/conf.py` — has `extensions = ['sphinx_needs']`. Call with `extensions: ['sphinxcontrib.mermaid']`, `install_if_missing: true`. Scorer asserts (1) `conf.py` now has both entries, (2) the module `sphinxcontrib.mermaid` is importable, (3) `extensions_added == ['sphinxcontrib.mermaid']`, (4) `packages_installed == ['sphinxcontrib-mermaid']`. + - `case_already_present/conf.py` — has `['sphinx_needs', 'sphinxcontrib.mermaid']`. Same call. 
Scorer asserts (1) `conf.py` unchanged (byte-identical), (2) `extensions_added == []`, `extensions_already_present == ['sphinxcontrib.mermaid']`, (3) `install_command_used is null`. + - `case_no_install/conf.py` — has `['sphinx_needs']`, extension `sphinxcontrib.plantuml` NOT installed. Call with `install_if_missing: false`. Scorer asserts (1) `conf.py` now has the entry, (2) `packages_installed == []`, (3) `warnings` contains one entry naming the missing package. + - Idempotence: re-running any case returns `files_modified == []`, `extensions_added == []`. +- (d) **Reusable.** Any Sphinx project, any extension. Not tied to diagrams — a future use case might be adding `sphinxcontrib.bibtex` or `myst_parser`. +- (e) **Composable.** Invoked inline (by `pharaoh-execute-plan` per plan task) or by humans via the CLI. Does not call other skills. + +## Input + +- `conf_py` (required): absolute path to a Sphinx `conf.py`. Must exist and be parseable Python. +- `extensions` (required): list of extension module paths (the strings that go inside `extensions = [...]`). Example: `["sphinxcontrib.mermaid"]`, `["sphinxcontrib.plantuml", "myst_parser"]`. +- `install_if_missing` (required): bool. If `true` and an extension module is not importable, attempt a package install before editing `conf.py` (order: install first, then edit, so a failed install does not leave `conf.py` referencing a missing module). If `false`, edit `conf.py` regardless of importability; record a warning per missing module. +- `on_package_manager_missing` (optional): `"fail"` | `"warn"` | `"skip"`. Default `"warn"`. Applies only when `install_if_missing` is `true` and no package manager is detectable (see package-manager detection table below). + - `"fail"`: abort before any edit. + - `"warn"`: log warning, proceed to edit `conf.py` anyway (user will install manually). + - `"skip"`: silently proceed to edit (no warning). 
Used by callers that intentionally edit `conf.py` in environments where pypi installation is handled elsewhere (e.g. CI build image pre-baked). +- `reporter_id` (required): short agent id, for audit logs. + +## Output + +```json +{ + "files_modified": ["docs/conf.py"], + "extensions_added": ["sphinxcontrib.mermaid"], + "extensions_already_present": [], + "install_command_used": "uv pip install sphinxcontrib-mermaid", + "packages_installed": ["sphinxcontrib-mermaid"], + "warnings": [] +} +``` + +`install_command_used` is `null` when nothing was installed. + +`packages_installed` lists the pypi package names (not the extension module paths — those differ: module `sphinxcontrib.mermaid` ships in pypi package `sphinxcontrib-mermaid`; see the extension → package resolution table). + +## Process + +### Step 0: Parse `conf.py`'s current extensions list + +Read `conf_py`. Locate the `extensions = [...]` assignment. Three cases: + +1. **Assignment present and parseable.** Extract the current list as a Python list of strings. +2. **Assignment missing.** Record empty list; the Edit step will append a new assignment. +3. **Parse error on the assignment** (e.g. `extensions = get_extensions()`). Abort: + ``` + FAIL: extensions = ... in conf.py is not a literal list. This skill cannot safely mutate computed extension lists. Edit manually. + ``` + +### Step 1: Classify each requested extension + +For each entry in input `extensions`: + +- **Already present** in the parsed current list → add to `extensions_already_present`, skip. +- **Missing AND importable** (`python -c "import <module_path>"` exits zero) → target for edit only, no install. +- **Missing AND not importable** → target for install + edit (if `install_if_missing`), or edit + warn (if not). + +### Step 2: Install (conditional) + +Only if `install_if_missing == true` AND the target set from Step 1 includes one or more non-importable modules. + +**2a. 
Resolve pypi package names.** Use the extension → pypi resolution table: + +| Extension module | Pypi package | +| ------------------------- | ------------------------ | +| `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` | +| `sphinxcontrib.plantuml` | `sphinxcontrib-plantuml` | +| `sphinxcontrib.bibtex` | `sphinxcontrib-bibtex` | +| `myst_parser` | `myst-parser` | +| `sphinx_copybutton` | `sphinx-copybutton` | +| `sphinx_design` | `sphinx-design` | +| `sphinx_needs` | `sphinx-needs` | +| `sphinx_codelinks` | `sphinx-codelinks` | +| `sphinxcontrib.<name>` | `sphinxcontrib-<name>` (default rule when not otherwise listed) | +| `<other>` | `<other>` with `_` → `-` (default rule) | + +Unknown extensions use the default rule. If the caller is certain about the pypi name, they can pass it as the module path anyway — the skill treats the input as authoritative and derives the install target via the rule above; the install-or-fail outcome is self-correcting. + +**2b. Detect package manager.** Same six-row table as `pharaoh-bootstrap` Step 0c (rye / uv / poetry / pipenv / pdm / pip-venv). Closer indicator wins. + +If no package manager is detected, branch on `on_package_manager_missing`: +- `"fail"` → abort before editing `conf.py`. +- `"warn"` → emit warning; go to Step 3 (edit `conf.py`); `packages_installed` stays empty. +- `"skip"` → go to Step 3 silently. + +**2c. Run install.** For each pypi package not yet installed, run the add/install command (e.g. `rye add sphinxcontrib-mermaid`, `uv pip install sphinxcontrib-mermaid`). Capture exit code per package. If any install fails: + +- If other packages in the batch succeeded, record the failure in `warnings` but proceed to edit `conf.py` for the successful ones; skip `conf.py` entry for the failed ones. +- If ALL installs failed, abort without editing. Record all failures in `warnings`. + +### Step 3: Edit `conf.py` + +For each target in Step 1's "missing" set that passed Step 2 (installed or skipped by design): + +1. 
**Extensions assignment exists.** Insert the extension string as the last entry, preserving indentation and trailing-comma conventions. If the existing list is on one line, append inline; if multi-line, append as a new line matching the indent of the last existing entry. +2. **Extensions assignment missing.** Append a new line `extensions = ["<ext>"]` after the last existing top-level assignment. Add a blank line before for readability. + +Preserve comments and blank lines around the assignment. Do NOT reorder existing entries. + +### Step 4: Verify the edit + +Re-read `conf_py`. Parse the `extensions = [...]` assignment again. Confirm every requested extension is present. If any is missing (edit did not take effect), abort with `FAIL: edit verification failed for <ext>; conf.py may be in an inconsistent state`. + +### Step 5: Return + +Emit the output JSON. Populate: + +- `files_modified`: `[conf_py]` if any edit happened; `[]` otherwise. +- `extensions_added`: extensions the edit introduced. +- `extensions_already_present`: extensions that were already in the list. +- `install_command_used`: the package-manager-specific command (e.g. `uv pip install sphinxcontrib-mermaid`) if any install ran; `null` otherwise. If multiple packages installed in separate commands, this is the last one (kept simple — callers who want the full history read `packages_installed`). +- `packages_installed`: pypi names of packages actually installed. +- `warnings`: any warning surfaced along the way. + +## Failure modes + +| Condition | Response | +| ------------------------------------------------------- | ----------------------------------------------------------- | +| `conf_py` missing | FAIL naming the path. | +| `extensions` empty list | FAIL: `"extensions input must contain at least one entry"`. | +| `extensions = ...` in `conf.py` is not a literal list | FAIL per Step 0. | +| All installs fail | FAIL without editing. Record failures in warnings. 
| +| Partial install failure | Edit for the successes; warn for the failures; no edit for failures. | +| Package manager not detected AND `on_package_manager_missing == "fail"` | FAIL before editing. | + +## Non-goals + +- **No `conf.py` mutation outside the `extensions` list.** Related settings (`mermaid_output_format`, `plantuml` path config) are deliberately not touched. Callers that need those set should invoke a different skill (or author a future `pharaoh-sphinx-option-set`). +- **No multi-file edits.** Only the named `conf_py` file. Multi-project Sphinx trees with multiple `conf.py` files need one invocation per file. +- **No `pyproject.toml` pinning.** The install command may or may not persist the dependency to `pyproject.toml` depending on the package manager (rye/uv/poetry/pdm persist; raw `pip install` does not). The skill does not second-guess the caller's pinning strategy. +- **No dry-run mode.** If the caller wants to preview changes, they can diff `conf.py` after the call — the skill is fast and idempotent, so a "run, review, revert" loop is cheaper than a separate dry-run code path. + +## Composition + +- `pharaoh-write-plan` Step 3.5 (dep probe) transitions from warn-only to task-insertion: when `conf.py` is missing a renderer extension required by a diagram-emitting task, the plan emits a `pharaoh-sphinx-extension-add` task as a dependency of the diagram task (or group of diagram tasks). The probe's warnings still include the install command as a human-readable handoff. +- `pharaoh-bootstrap` remains the authoritative entry for `sphinx_needs` itself (the bootstrap transaction covers extension + types + `needs_from_toml` as one atomic step). This skill is for post-bootstrap additions. +- `pharaoh-quality-gate` does NOT run this skill. Gate is read-only; extension adds are plan tasks. 
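As an illustration of Step 0 and Step 2a of this skill, the literal-list parse and the extension-to-package default rule might be sketched in Python as below. The function names are hypothetical and this is not the skill's actual implementation; it only shows that the standard-library `ast` module is enough to tell a literal `extensions` list apart from a computed one. The sketch reads the first top-level assignment it finds, which is an assumption about how repeated assignments would be treated.

```python
import ast


def current_extensions(conf_source):
    """Return the literal `extensions` list from conf.py source text.

    Returns [] when no assignment exists and raises ValueError when the
    assignment is computed (for example `extensions = get_extensions()`).
    """
    for node in ast.parse(conf_source).body:
        if not isinstance(node, ast.Assign):
            continue
        if not any(isinstance(t, ast.Name) and t.id == "extensions"
                   for t in node.targets):
            continue
        try:
            value = ast.literal_eval(node.value)  # rejects calls and names
        except ValueError:
            raise ValueError("extensions = ... in conf.py is not a literal list")
        if not isinstance(value, list):
            raise ValueError("extensions = ... in conf.py is not a literal list")
        return value
    return []  # assignment missing: Step 3 would append a new one


def pypi_name(module):
    """Apply the two default rules from the resolution table."""
    prefix = "sphinxcontrib."
    if module.startswith(prefix):
        return "sphinxcontrib-" + module[len(prefix):]
    return module.replace("_", "-")


present = current_extensions("project = 'demo'\nextensions = ['sphinx_needs']\n")
to_add = [m for m in ["sphinxcontrib.mermaid"] if m not in present]
packages = [pypi_name(m) for m in to_add]
```

Working on the syntax tree rather than importing `conf.py` avoids executing project code during the probe; the actual edit in Step 3 would still have to be a text-level change so that comments and formatting survive.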
diff --git a/.github/agents/pharaoh.standard-conformance.agent.md b/.github/agents/pharaoh.standard-conformance.agent.md index ae0163c..364f6f9 100644 --- a/.github/agents/pharaoh.standard-conformance.agent.md +++ b/.github/agents/pharaoh.standard-conformance.agent.md @@ -7,4 +7,351 @@ handoffs: [] Use when evaluating a single sphinx-needs artefact against one regulatory standard (ISO 26262-8 §6, ASPICE 4.0, ISO/SAE 21434). Emits per-indicator findings JSON with pass/fail on mechanizable indicators and 0-3 scores on subjective ones. -See [`skills/pharaoh-standard-conformance/SKILL.md`](../../skills/pharaoh-standard-conformance/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-standard-conformance + +## When to use + +Invoke when you want to check whether a single artefact (requirement, architecture element, or +verification plan) meets the mandatory indicators of one named regulatory standard. + +**One artefact, one standard per invocation.** For a full corpus audit across multiple artefacts +and gap categories, invoke `pharaoh-process-audit` instead. + +Do NOT use to author or fix the artefact — run the relevant `*-review` skill for quality +feedback, then `pharaoh-req-regenerate` or re-draft to address findings. + +--- + +## Inputs + +- **artefact**: either an RST directive block (inline) or a need-id present in `needs.json` +- **standard**: one of `iso26262` | `aspice40` | `iso21434` +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — required/optional fields per artefact type + - `id-conventions.yaml` — ID regex and prefix map +- **needs.json**: required for link resolution (field references to other needs) + +--- + +## Outputs + +A single JSON document — no prose wrapper. 
Top-level shape: + +```json +{ + "standard": "iso26262", + "artefact_id": "gd_req__example", + "artefact_type": "gd_req", + "indicators": [ + { + "id": "SC-ISO-01", + "name": "schema_completeness", + "type": "mechanized", + "result": 1, + "evidence": "All required fields (id, status, satisfies) present." + } + ], + "overall": "pass" +} +``` + +### `indicators` array + +Each entry: + +| Field | Type | Description | +|---|---|---| +| `id` | string | Indicator identifier, e.g. `SC-ISO-01` | +| `name` | string | Short snake_case name | +| `type` | `"mechanized"` or `"subjective"` | Determines result scale | +| `result` | integer or `null` | `0` or `1` for mechanized; `0–3` for subjective; `null` when the indicator does not apply to the artefact type | +| `evidence` | string | One-sentence justification | + +### `overall` field + +Derived from all indicators: + +- `"pass"` — all mechanized indicators score 1, all subjective score ≥ 2 +- `"partial"` — no mechanized indicator scores 0, but ≥ 1 subjective indicator scores < 2 +- `"fail"` — ≥ 1 mechanized indicator scores 0 + +--- + +## Indicator sets per standard + +### ISO 26262-8 §6 (standard: `iso26262`) + +Cross-references the 10 ISO axes from `pharaoh-req-review` — this skill applies the same axes +but maps them to ISO 26262 Part 8 §6 indicator language. 
+ +| ID | Name | Type | Pass rule | +|---|---|---|---| +| SC-ISO-01 | schema_completeness | mechanized | All `required_fields` from artefact-catalog.yaml are present and non-empty | +| SC-ISO-02 | unique_identification | mechanized | `id` matches `id_regex` from id-conventions.yaml for this artefact type | +| SC-ISO-03 | verifiability | mechanized | `:verification:` (for req) or `:verifies:` (for tc) present and resolves in needs.json | +| SC-ISO-04 | traceability_upward | mechanized | `:satisfies:` or `:verifies:` link present and resolves to a parent in needs.json | +| SC-ISO-05 | atomicity | mechanized | Body contains exactly one `shall`; no coordinating conjunction joins actions within the shall clause | +| SC-ISO-06 | unambiguity | subjective | 0–3 scale: 3 = measurable and unambiguous; 0 = multiple valid interpretations | +| SC-ISO-07 | comprehensibility | subjective | 0–3 scale: 3 = self-contained, no undefined terms; 0 = requires external context to interpret | +| SC-ISO-08 | feasibility | subjective | 0–3 scale: 3 = clearly achievable within known engineering constraints; 0 = infeasible or contradictory | + +For non-requirement artefacts (arch, tc): SC-ISO-05 atomicity is recorded as `{"result": null, +"evidence": "atomicity indicator applies to requirements only"}`. + +### ASPICE 4.0 (standard: `aspice40`) + +| ID | Name | Type | Pass rule | +|---|---|---|---| +| SC-APC-01 | sup10_traceability | mechanized | Every requirement has a bi-directional link: upward (`:satisfies:`) and downward (`:verification:` or linked tc `:verifies:`); both must resolve in needs.json | +| SC-APC-02 | swe1_elicitation_completeness | mechanized | Artefact carries `status` field with a value declared in workflows.yaml lifecycle_states | +| SC-APC-03 | swe2_design_linkage | mechanized | For arch artefacts: `:satisfies:` resolves to a gd_req in needs.json. 
For non-arch artefacts: recorded as `null` (not applicable) | +| SC-APC-04 | sup8_config_id | mechanized | `id` matches id-conventions.yaml id_regex for its type | +| SC-APC-05 | swe1_rationale_quality | subjective | 0–3: 3 = stakeholder intent and justification explicitly stated; 0 = no rationale whatsoever | +| SC-APC-06 | swe2_design_adequacy | subjective | 0–3: 3 = design element clearly addresses the stated requirement; 0 = no discernible connection | + +### ISO/SAE 21434 (standard: `iso21434`) + +| ID | Name | Type | Pass rule | +|---|---|---|---| +| SC-CS-01 | cs_rm01_risk_management | mechanized | Artefact references a TARA or threat ID in a `complies` or `tags` field (value matches regex `tara__.*` or `threat__.*`) | +| SC-CS-02 | cs_traceability | mechanized | Upward link (`:satisfies:`) present and resolves; for tc: `:verifies:` present and resolves | +| SC-CS-03 | cs_unique_id | mechanized | `id` matches id-conventions.yaml id_regex | +| SC-CS-04 | cs_threat_analysis_adequacy | subjective | 0–3: 3 = cyber threat scenario described with attack vector, impact, and likelihood; 0 = no threat context mentioned | +| SC-CS-05 | cs_risk_treatment_rationale | subjective | 0–3: 3 = chosen risk treatment (accept/mitigate/avoid/transfer) stated with explicit justification; 0 = treatment absent | + +--- + +## Process + +### Step 1: Validate inputs + +Confirm `artefact` and `standard` are provided. If `standard` is not one of `iso26262`, +`aspice40`, `iso21434`, FAIL immediately: + +``` +FAIL: unknown standard "<value>". +Supported standards: iso26262, aspice40, iso21434. +``` + +If `artefact` is absent, FAIL: + +``` +FAIL: no artefact provided. +Supply either a need-id or an RST directive block. +``` + +--- + +### Step 2: Read tailoring + +Read `.pharaoh/project/artefact-catalog.yaml` and `.pharaoh/project/id-conventions.yaml`. +Extract `required_fields` for the artefact type and `id_regex` for the type prefix. 
+ +If tailoring files are missing, apply the built-in defaults (bundled example profile): +- `req` required fields: `[id, status, satisfies]` +- `arch` required fields: `[id, status, satisfies, type]` +- `tc` required fields: `[id, status, verifies]` + +Note the fallback in each affected indicator's `evidence`. + +--- + +### Step 3: Resolve artefact + +**If artefact is a need-id:** Look up in needs.json. If not found, FAIL (G1). +Extract all fields and body text. + +**If artefact is an RST block:** Parse inline — extract id from `:id:` option, all other +options, and body text. Determine type from the directive name (e.g. `.. gd_req::` → `gd_req`). +For link-resolution indicators, use needs.json if available; record +`"needs.json unavailable — link unresolvable"` in evidence and score 0 if not. + +--- + +### Step 4: Evaluate indicators + +Apply the indicator set for the selected `standard` (see tables above). + +For each **mechanized** indicator: +- Apply the stated pass rule deterministically. +- Record `result: 1` (pass) or `result: 0` (fail) and one-sentence `evidence`. + +For each **subjective** indicator: +- Apply the 0–3 scale description. +- Record integer score and one-sentence `evidence`. + +For not-applicable indicators (e.g. SC-ISO-05 atomicity on an arch artefact), record +`result: null` with a brief `evidence` string. Null indicators do not affect `overall`. + +--- + +### Step 5: Compute overall + +Inspect all non-null indicators: +- Any mechanized `result: 0` → `overall: "fail"` +- No mechanized failures but any subjective `result < 2` → `overall: "partial"` +- All mechanized `result: 1` and all subjective `result ≥ 2` → `overall: "pass"` + +--- + +### Step 6: Emit JSON + +Emit the single JSON document. No prose before or after. + +--- + +## Guardrails + +**G1 — Artefact not found** + +If artefact is a need-id and it does not appear in needs.json: + +``` +FAIL: need-id "<id>" not found in needs.json. 
+Verify the ID or rebuild the project first (sphinx-build docs/ docs/_build/). +``` + +Do not emit partial JSON. + +**G2 — Unknown standard** + +Unknown `standard` value → FAIL (Step 1) with list of supported standards. Do not attempt +indicator evaluation. + +**G3 — Unparseable RST block** + +If the provided RST block cannot be parsed (no `.. <type>::` opener, no `:id:` option): + +``` +FAIL: cannot parse artefact RST block. +Expected format: ".. <type>:: <title>" followed by indented :id: option. +Check indentation and directive syntax. +``` + +**G4 — Malformed JSON self-correction** + +If emitted JSON is syntactically invalid, self-correct once. On second failure: + +```json +{ + "standard": "<standard>", + "artefact_id": "<id>", + "diagnostic": "JSON self-correction failed. Raw findings follow.", + "raw": "<free-text findings>" +} +``` + +--- + +## Advisory chain + +`chains_to: []` — this skill is terminal. If `overall` is `"partial"` or `"fail"`, append a +single advisory line after the JSON: + +For `iso26262` findings: suggest `pharaoh-req-review` for detailed axis-level action items. +For `aspice40` / `iso21434` findings: suggest re-authoring the artefact to address failing +indicators directly. + +--- + +## Worked example + +### Run 1: ISO 26262 on a `gd_req` + +**Input:** +- `standard`: `iso26262` +- `artefact` (RST block): + +```rst +.. gd_req:: ABS pump activation on wheel slip threshold + :id: gd_req__abs_pump_activation + :status: draft + :satisfies: gd_req__brake_system_safety + :verification: tc__abs_pump_001 + + The brake controller shall engage the ABS pump when measured wheel slip + exceeds the calibrated activation threshold. +``` + +**Step 1:** standard `iso26262` valid; artefact provided. + +**Step 2:** tailoring loaded; `gd_req` required fields: `[id, status, satisfies]`. + +**Step 3:** RST parsed. `id = gd_req__abs_pump_activation`, `type = gd_req`. +needs.json available; `tc__abs_pump_001` and `gd_req__brake_system_safety` both resolve. 
+ +**Step 4 — mechanized indicators:** +- SC-ISO-01: `id`, `status`, `satisfies` all present → result 1 +- SC-ISO-02: `gd_req__abs_pump_activation` matches `gd_req__[a-z0-9_]+` → result 1 +- SC-ISO-03: `:verification: tc__abs_pump_001` resolves → result 1 +- SC-ISO-04: `:satisfies: gd_req__brake_system_safety` resolves → result 1 +- SC-ISO-05: one `shall`, no coordinating conjunction → result 1 + +**Step 4 — subjective indicators:** +- SC-ISO-06: "calibrated activation threshold" is a defined term; single interpretation → score 3 +- SC-ISO-07: subject/action/condition all explicit; no undefined acronyms → score 3 +- SC-ISO-08: standard automotive ABS function; well-constrained → score 3 + +**Step 5:** all mechanized pass, all subjective ≥ 2 → `overall: "pass"`. + +```json +{ + "standard": "iso26262", + "artefact_id": "gd_req__abs_pump_activation", + "artefact_type": "gd_req", + "indicators": [ + {"id": "SC-ISO-01", "name": "schema_completeness", "type": "mechanized", "result": 1, "evidence": "id, status, satisfies all present and non-empty"}, + {"id": "SC-ISO-02", "name": "unique_identification", "type": "mechanized", "result": 1, "evidence": "gd_req__abs_pump_activation matches gd_req__[a-z0-9_]+ regex"}, + {"id": "SC-ISO-03", "name": "verifiability", "type": "mechanized", "result": 1, "evidence": ":verification: tc__abs_pump_001 resolves in needs.json"}, + {"id": "SC-ISO-04", "name": "traceability_upward", "type": "mechanized", "result": 1, "evidence": ":satisfies: gd_req__brake_system_safety resolves in needs.json"}, + {"id": "SC-ISO-05", "name": "atomicity", "type": "mechanized", "result": 1, "evidence": "exactly one shall; no coordinating conjunction in shall clause"}, + {"id": "SC-ISO-06", "name": "unambiguity", "type": "subjective", "result": 3, "evidence": "calibrated activation threshold is a defined term; single interpretation"}, + {"id": "SC-ISO-07", "name": "comprehensibility", "type": "subjective", "result": 3, "evidence": "subject, action, and 
condition explicit; no undefined acronyms"}, + {"id": "SC-ISO-08", "name": "feasibility", "type": "subjective", "result": 3, "evidence": "standard automotive ABS function; well-constrained threshold trigger"} + ], + "overall": "pass" +} +``` + +--- + +### Run 2: ASPICE 4.0 on the same `gd_req` + +Same artefact as Run 1. Standard changed to `aspice40` — different indicator set, same artefact. + +**Step 4 — mechanized indicators:** +- SC-APC-01: `:satisfies:` resolves upward AND `:verification: tc__abs_pump_001` resolves + downward → SUP.10 traceability satisfied → result 1 +- SC-APC-02: `status: draft` is declared in workflows.yaml lifecycle_states → result 1 +- SC-APC-03: `gd_req` is not an arch artefact → result null (not applicable) +- SC-APC-04: id matches regex → result 1 + +**Step 4 — subjective indicators:** +- SC-APC-05: no `rationale:` field present; stakeholder intent not stated → score 1 +- SC-APC-06: not applicable to req type → result null + +**Step 5:** SC-APC-01, -02, -04 pass; SC-APC-03 and SC-APC-06 null; SC-APC-05 scores 1 (< 2) → +`overall: "partial"`. 
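The Step 5 rule used in both runs can be sketched in Python. This is a minimal illustration under stated assumptions, not the skill's implementation: `compute_overall` is a hypothetical helper name, and the indicator dicts carry only the `type` and `result` keys from the output shape.

```python
def compute_overall(indicators):
    """Derive the overall verdict from per-indicator results."""
    # Null results mark not-applicable indicators and do not affect the verdict.
    scored = [i for i in indicators if i["result"] is not None]
    mechanized = [i["result"] for i in scored if i["type"] == "mechanized"]
    subjective = [i["result"] for i in scored if i["type"] == "subjective"]

    if any(r == 0 for r in mechanized):
        return "fail"      # any mechanized failure dominates
    if any(r < 2 for r in subjective):
        return "partial"   # mechanized clean, but a weak subjective score
    return "pass"


# Indicator results from Run 2 (ids kept for readability only).
run2 = [
    {"id": "SC-APC-01", "type": "mechanized", "result": 1},
    {"id": "SC-APC-02", "type": "mechanized", "result": 1},
    {"id": "SC-APC-03", "type": "mechanized", "result": None},
    {"id": "SC-APC-04", "type": "mechanized", "result": 1},
    {"id": "SC-APC-05", "type": "subjective", "result": 1},
    {"id": "SC-APC-06", "type": "subjective", "result": None},
]
print(compute_overall(run2))
```

Applied to the Run 2 indicators it yields `partial`, matching the findings document that follows.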
+ +```json +{ + "standard": "aspice40", + "artefact_id": "gd_req__abs_pump_activation", + "artefact_type": "gd_req", + "indicators": [ + {"id": "SC-APC-01", "name": "sup10_traceability", "type": "mechanized", "result": 1, "evidence": "upward :satisfies: and downward :verification: both resolve"}, + {"id": "SC-APC-02", "name": "swe1_elicitation_completeness","type": "mechanized", "result": 1, "evidence": "status 'draft' declared in workflows.yaml lifecycle_states"}, + {"id": "SC-APC-03", "name": "swe2_design_linkage", "type": "mechanized", "result": null, "evidence": "not applicable to gd_req artefact type"}, + {"id": "SC-APC-04", "name": "sup8_config_id", "type": "mechanized", "result": 1, "evidence": "id matches id_regex for gd_req prefix"}, + {"id": "SC-APC-05", "name": "swe1_rationale_quality", "type": "subjective", "result": 1, "evidence": "no :rationale: field; stakeholder intent not explicitly stated"}, + {"id": "SC-APC-06", "name": "swe2_design_adequacy", "type": "subjective", "result": null, "evidence": "not applicable to gd_req artefact type"} + ], + "overall": "partial" +} +``` + +Consider re-authoring the artefact to add a `:rationale:` field addressing SC-APC-05. diff --git a/.github/agents/pharaoh.state-diagram-draft.agent.md b/.github/agents/pharaoh.state-diagram-draft.agent.md index ec261c4..5f2864d 100644 --- a/.github/agents/pharaoh.state-diagram-draft.agent.md +++ b/.github/agents/pharaoh.state-diagram-draft.agent.md @@ -7,4 +7,97 @@ handoffs: [] Use when drafting one state-machine diagram showing lifecycle or behavioral states of a component/entity, with labeled transitions. Renderer tailored via `pharaoh.toml`. Does NOT emit component, sequence, or class diagrams. Status — PLANNED (design-only scaffold; invoking returns sentinel FAIL until implemented). 
-See [`skills/pharaoh-state-diagram-draft/SKILL.md`](../../skills/pharaoh-state-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-state-diagram-draft (PLANNED) + +> **Status:** DESIGN ONLY. Implementation sentinel FAIL: `"pharaoh-state-diagram-draft is planned but not implemented; see SKILL.md"`. + +Shared tailoring rules: see [`shared/diagram-tailoring.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-tailoring.md). Reads `[pharaoh.diagrams.state]`. + +Safe-label rules: see [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md). Every emitted label / node id / transition label MUST be sanitised per that rule set before the block leaves this skill. `sphinx-build` does not validate diagram bodies — a parse failure becomes visible only at browser render time. Sanitisation is the first line of defence; the second is `pharaoh-diagram-lint` run as part of `pharaoh-quality-gate`. + +## Purpose + +One invocation → one state-machine diagram. Captures **discrete states** of a component/entity and **labeled transitions** between them, with optional events, guards, and actions per transition. + +Does NOT capture static structure (→ `pharaoh-component-diagram-draft`, `pharaoh-class-diagram-draft`). Does NOT capture ordered multi-participant interactions (→ `pharaoh-sequence-diagram-draft`). + +## Atomicity + +- (a) One state machine in → one diagram out. Nested state machines (composite states) are one machine; two independent machines = two skill invocations. 
+- (b) Input: `{view_title: str, states: list[StateSpec], transitions: list[TransitionSpec], initial_state: str, terminal_states?: list[str], project_root: str, renderer_override?, on_missing_config?, papyrus_workspace?, reporter_id: str}` where `StateSpec = {id: str, label: str, kind?: "simple"|"composite"|"choice"|"junction", sub_states?: list[StateSpec], entry?: str, exit?: str}`, `TransitionSpec = {from: str, to: str, event?: str, guard?: str, action?: str}`. Output: one RST directive block. +- (c) Reward: fixture — lifecycle `draft → in_review → approved → published`, plus `rejected` terminal off `in_review`. Scorer: + 1. Output starts with renderer-specific directive. + 2. Exactly one initial-state marker (`[*] -->` Mermaid, `[*] -->` PlantUML). + 3. `initial_state` is the target of the initial-state arrow. + 4. Every ID in `states` appears as a state node. + 5. Every transition renders with correct arrow and label (`event [guard] / action`). + 6. Every ID in `terminal_states` has a transition `→ [*]`. + 7. With a composite state containing sub_states, the sub-states are nested inside the composite (Mermaid: `state Foo { ... }`; PlantUML: `state Foo { ... }`). + + Pass = all 7. +- (d) Reusable: any lifecycle (workflow states, device modes, protocol states, order status machine). +- (e) One machine per call. + +## Input highlights + +- `states`: all states, possibly nested via `sub_states`. Composite states declare `kind = "composite"`. +- `transitions`: `from`/`to` reference state IDs, including sub-state IDs (cross-boundary transitions supported). +- `initial_state`: REQUIRED. Must be an ID in `states`. There is exactly one initial state; if the machine has multiple "entry points" from outer context, model them via transitions from `[*]` at the composite level. +- `terminal_states` (optional): list of IDs that have implicit transition to `[*]` (final pseudo-state). A machine may have zero terminal states (infinite loop) — valid. 
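The input constraints above (one initial state that must be a declared ID, transitions that may reference sub-state IDs) can be checked before any rendering happens. The following is a minimal Python sketch, not part of the skill: `collect_state_ids` and `validate_machine` are hypothetical helper names, the problem messages are invented for illustration, and accepting nested sub-state IDs for `initial_state` and `terminal_states` is an assumption.

```python
def collect_state_ids(states):
    """Return every state id, descending into sub_states."""
    ids = []
    for state in states:
        ids.append(state["id"])
        ids.extend(collect_state_ids(state.get("sub_states", [])))
    return ids


def validate_machine(states, transitions, initial_state, terminal_states=()):
    """Return a list of problems; an empty list means the input is consistent."""
    known = set(collect_state_ids(states))
    problems = []
    if initial_state not in known:
        problems.append(f"initial_state {initial_state!r} is not a declared state")
    for transition in transitions:
        for end in ("from", "to"):
            if transition[end] not in known:
                problems.append(f"transition {end} {transition[end]!r} is not a declared state")
    for state_id in terminal_states:
        if state_id not in known:
            problems.append(f"terminal state {state_id!r} is not a declared state")
    return problems


# The lifecycle fixture from the atomicity contract.
states = [{"id": s, "label": s} for s in
          ("draft", "in_review", "approved", "published", "rejected")]
transitions = [
    {"from": "draft", "to": "in_review"},
    {"from": "in_review", "to": "approved"},
    {"from": "in_review", "to": "rejected"},
    {"from": "approved", "to": "published"},
]
problems = validate_machine(states, transitions, "draft", ["rejected", "published"])
print(problems)
```

Returning a list of problems rather than raising on the first one lets a caller report every inconsistency in a single pass.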
+ +## Transition label format + +Renderer-independent format: `event [guard] / action`. +- `event` optional (unlabeled transition = auto). +- `guard` in square brackets, optional. +- `action` after slash, optional. + +If all three are absent, render an unlabeled arrow. + +## Output + +**Mermaid:** +```rst +.. mermaid:: + :caption: <view_title> + + stateDiagram-v2 + [*] --> draft + draft --> in_review : submit + in_review --> approved : approve [reviewer_count >= 2] + in_review --> rejected : reject / notify_author + approved --> published : publish + rejected --> [*] + published --> [*] +``` + +**PlantUML:** +```rst +.. uml:: + :caption: <view_title> + + @startuml + [*] --> draft + draft --> in_review : submit + in_review --> approved : approve [reviewer_count >= 2] + in_review --> rejected : reject / notify_author + approved --> published : publish + rejected --> [*] + published --> [*] + @enduml +``` + +## Non-goals + +- No state-from-code extraction — callers supply states and transitions explicitly. A future `pharaoh-states-from-source` skill could infer from match statements / switch / FSM libraries, but is a separate concern. +- No timing annotations (real-time deadlines, timer events) — sequence diagrams are a better fit for temporal constraints. +- No concurrency regions by default — a future extension may add orthogonal regions; for now, sub_states are strictly hierarchical. +- No auto-detection of terminal states — caller provides them. + +## Interaction with tailoring + +Some projects (e.g. workflow-heavy ubproject.toml with lifecycle state enums) already declare state machines implicitly — sphinx-needs `status` enum is a two-line state machine. A caller might want to derive the diagram from the project's `workflows.yaml` (when present). That derivation is NOT this skill's concern; the caller invokes `pharaoh-state-diagram-draft` with explicit `states` and `transitions`. 
A wrapper that reads `workflows.yaml` and calls this skill is orchestration, not atomic. diff --git a/.github/agents/pharaoh.status-lifecycle-check.agent.md b/.github/agents/pharaoh.status-lifecycle-check.agent.md index f04f403..c1a40dd 100644 --- a/.github/agents/pharaoh.status-lifecycle-check.agent.md +++ b/.github/agents/pharaoh.status-lifecycle-check.agent.md @@ -10,4 +10,115 @@ handoffs: Use when running a release-gate check over a full sphinx-needs corpus to confirm that zero needs remain in the initial `draft` status. Single mechanical binary gate — aggregates `status` across every need in `needs.json`, compares against the initial-state declaration in `workflows.yaml`, and returns pass/fail plus per-status counts. Advisory by default (pre-release development passes); release pipelines override `enforce=true` so any draft blocks the gate. -See [`skills/pharaoh-status-lifecycle-check/SKILL.md`](../../skills/pharaoh-status-lifecycle-check/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-status-lifecycle-check + +## When to use + +Invoke from a release pipeline, from `pharaoh-quality-gate` as the delegated check for the `status-lifecycle-healthy` invariant, or standalone when auditing whether a corpus is past the draft stage. Reads the project's `workflows.yaml` (for the initial-state name) plus `needs.json` (for each need's current `status`), buckets the needs, and emits a single findings JSON with `draft_count` and an `overall` verdict. + +Do NOT invoke to check whether a single transition `from → to` is legal — that is the per-need state-machine question answered by `pharaoh-lifecycle-check`. That skill checks one need, one proposed transition, consults the `transitions` list with `requires:` prerequisites, and answers "is this move allowed right now?". 
This skill answers a different question over a different input: given the whole corpus, how many needs are still in the initial `draft` bucket, and does the release policy tolerate any? No per-need transition walk, no prerequisite resolution — only the binary "current status bucket" aggregation. + +Do NOT invoke to score percentage thresholds like "≥50% past draft". The plan that commissioned this atom explicitly rejects fuzzy thresholds for release gates. Under `enforce=true` the gate is binary: zero drafts pass, one draft fails. Under `enforce=false` the output reports counts without failing, so callers still see the distribution. + +Do NOT invoke to transition needs. Read-only audit. + +## Atomicity + +- (a) Indivisible: one `workflows.yaml` + one `needs.json` in → one findings JSON out. No per-need transition checks, no set-level re-authoring, no dispatch of other skills. +- (b) Input: `{workflow_path: <str>, needs_json_path: <str>, enforce: bool (default false)}`. Output: findings JSON per the shape in `## Output` below. +- (c) Reward: fixtures under `skills/pharaoh-status-lifecycle-check/fixtures/` — one per outcome branch: + 1. `all-draft-enforcing/` — every need `status: draft`, `enforce: true`. Expected: `overall: "fail"`, `draft_count` equals total, `blockers` lists the draft need ids. + 2. `all-draft-advisory/` — every need `status: draft`, `enforce: false`. Expected: `overall: "pass"` with an advisory `blockers` entry describing the drafts without failing the gate. + 3. `mixed-enforcing/` — some drafts, some past draft, `enforce: true`. Expected: `overall: "fail"`, `blockers` lists only the draft need ids. + 4. `fully-reviewed-enforcing/` — every need past draft, `enforce: true`. Expected: `overall: "pass"`, empty `blockers`. + + Pass = each fixture's actual output matches `expected-output.json` modulo ordering of need ids inside the `blockers` list. 
+- (d) Reusable across projects — lifecycle state names come from `workflows.yaml` (the project declares them); only the bucket named by `initial_state` (or the literal `draft` fallback) triggers the gate. No project-specific vocabulary in the base. +- (e) Read-only. Does not modify `workflows.yaml`, `needs.json`, or any need status. + +## Input + +- `workflow_path`: absolute path to the project's `workflows.yaml` (typically `.pharaoh/project/workflows.yaml`). The skill reads two keys: + - `initial_state` (optional): the state name that signals "not yet reviewed". If absent, the skill falls back to the literal string `"draft"` and records a note. + - `lifecycle_states` (optional): map of declared state names. Used only to validate that every observed status is declared; unknown statuses surface in `notes` but do not change the verdict. +- `needs_json_path`: absolute path to the project's `needs.json` (typically `docs/_build/needs/needs.json`). The skill reads the flat ID map and inspects each entry's `status` field. +- `enforce`: boolean, default `false`. When `false`, the skill runs in advisory mode — counts and lists drafts but always emits `overall: "pass"`. When `true`, any draft flips `overall` to `"fail"`. + +Edge cases: +- `workflow_path` missing or unparseable → emit `overall: "fail"` with blocker `"workflows.yaml unresolved: <path>"` regardless of `enforce` (cannot decide without the initial-state name). +- `needs_json_path` missing or unparseable → emit `overall: "fail"` with blocker `"needs.json unresolved: <path>"` regardless of `enforce`. +- `needs.json` contains zero needs → emit `overall: "pass"`, `draft_count: 0`, `notes: ["needs.json empty — nothing to gate"]`. +- Need lacks a `status` field → bucket it under the literal key `"<missing>"`, count it as past-draft for gate purposes, and surface it in `notes`. This avoids crashing on malformed corpora while keeping the gate focused on `draft`. 
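The missing-`status` edge case can be sketched as a small bucketing routine. This is a hedged Python illustration that assumes `needs.json` has already been loaded into a flat `id -> need` mapping, as the input description states; `bucket_by_status` is a hypothetical name and the note wording is invented.

```python
from collections import Counter

MISSING_KEY = "<missing>"  # literal bucket name from the edge-case list


def bucket_by_status(needs, initial_state="draft"):
    """Return (needs_by_status, draft_count, notes) for a flat id -> need map."""
    buckets = Counter()
    notes = []
    for need_id, need in needs.items():
        status = need.get("status")
        if status is None:
            buckets[MISSING_KEY] += 1
            notes.append(f"need {need_id} has no status field")  # hypothetical wording
        else:
            buckets[status] += 1
    # Only the initial-state bucket counts as draft; "<missing>" is treated as past-draft.
    draft_count = buckets[initial_state]
    return dict(buckets), draft_count, notes


sample = {
    "comp_req__a": {"status": "draft"},
    "comp_req__b": {"status": "approved"},
    "comp_req__c": {},  # malformed need without a status
}
print(bucket_by_status(sample))
```

An empty mapping produces empty buckets and a draft count of zero, which corresponds to the empty-corpus case in the list above.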
+ +## Output + +```json +{ + "needs_by_status": {"draft": 40, "reviewed": 0, "approved": 0, "released": 0}, + "draft_count": 40, + "enforce": true, + "overall": "fail", + "blockers": [ + "40 needs still in draft status; release gate requires zero drafts", + "comp_req__example_a", + "comp_req__example_b", + "..." + ], + "notes": [] +} +``` + +Fields: +- `needs_by_status`: bucket counts keyed by every status value observed in `needs.json`. Entries with zero are included for states declared in `workflows.yaml.lifecycle_states` so downstream dashboards see a stable shape; observed-but-undeclared statuses are included with their actual count and added to `notes`. +- `draft_count`: count of needs whose status equals the `initial_state` from `workflows.yaml` (fallback literal `"draft"`). +- `enforce`: echo of the input flag so downstream callers can distinguish advisory from release runs without re-reading their own config. +- `overall`: `"pass"` when `enforce=false` OR `draft_count == 0`. `"fail"` otherwise, or when preconditions (workflow/needs files) failed to resolve. +- `blockers`: in `enforce=true` mode with `draft_count > 0`, one summary line plus one entry per draft need id (capped at the first 500 ids to bound output size; overflow surfaces as a single `"... and N more"` line). In advisory mode with `draft_count > 0`, a single informational entry like `"advisory: 40 needs in draft; release gate not enforced"` — `overall` stays `"pass"`. In pass cases, empty list. +- `notes`: informational observations — fallback `initial_state` used, undeclared statuses observed, missing `status` fields, empty corpus. + +## Detection rule + +One mechanical check. No LLM judgement. + +### 1. `draft_count_against_enforce` + +**Check:** Parse `workflows.yaml` for `initial_state` (fallback `"draft"`). Iterate `needs.json`, count needs whose `status` equals the initial-state name. If `enforce=true` and the count is non-zero, fail. Otherwise pass. 
+ +**Detection:** +```bash +# Initial state name (fallback to literal "draft"): +initial=$(python -c "import yaml; d=yaml.safe_load(open('$workflow_path')); print(d.get('initial_state','draft'))") + +# Count drafts: +python -c " +import json, sys +needs = json.load(open('$needs_json_path')) +items = needs if isinstance(needs, list) else needs.get('needs', needs) +if isinstance(items, dict): + items = items.values() +count = sum(1 for n in items if n.get('status') == '$initial') +print(count) +" + +# Gate: +# enforce == true AND count > 0 → fail +# else → pass +``` + +The skill performs the same extraction in whatever runtime the caller invokes (direct tool use, subagent shell); the pseudocode above is the reference implementation. + +## Tailoring extension point + +The `initial_state` name is read from `workflows.yaml` — the project tailors what "draft" means by declaring the state there. No other knobs are exposed on this skill; a project that wants percentage-based reporting should wire a separate metric collector rather than weaken this binary gate. + +## Composition + +Role: `atom-check`. + +Called from `pharaoh-quality-gate` as the delegated check for the `status-lifecycle-healthy` invariant (pass requirement: `overall == "pass"` with `enforce=true` set by the release pipeline). Also callable standalone from any release workflow that wants the binary gate without the full quality-gate pipeline. Never dispatches other skills. Never modifies tailoring, needs.json, or need status. + +Distinct from `pharaoh-lifecycle-check`: that skill is per-need and consults the `transitions` list (`from → to` legality with `requires:` prerequisites); this skill is corpus-wide and consults only the initial-state bucket. 
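The `overall` and `blockers` rules from the Output section can be sketched in Python. This is an illustration under stated assumptions rather than the skill's code: `evaluate_gate` is a hypothetical name, the draft need ids are assumed to be collected already, and the message strings follow the examples given in the field descriptions.

```python
def evaluate_gate(draft_ids, enforce, cap=500):
    """Return (overall, blockers) for the list of need ids still in the initial state."""
    count = len(draft_ids)
    if count == 0:
        return "pass", []
    if not enforce:
        # Advisory mode reports the drafts but never fails the gate.
        return "pass", [f"advisory: {count} needs in draft; release gate not enforced"]
    blockers = [f"{count} needs still in draft status; release gate requires zero drafts"]
    blockers.extend(draft_ids[:cap])  # bound the output size
    if count > cap:
        blockers.append(f"... and {count - cap} more")
    return "fail", blockers


overall, blockers = evaluate_gate(["comp_req__example_a", "comp_req__example_b"], enforce=True)
print(overall, len(blockers))
```

Precondition failures (unresolved `workflows.yaml` or `needs.json`) are outside this sketch; per the edge cases they fail the gate regardless of `enforce`.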
diff --git a/.github/agents/pharaoh.tailor-bootstrap.agent.md b/.github/agents/pharaoh.tailor-bootstrap.agent.md index 2e37a48..fe446af 100644 --- a/.github/agents/pharaoh.tailor-bootstrap.agent.md +++ b/.github/agents/pharaoh.tailor-bootstrap.agent.md @@ -7,4 +7,183 @@ handoffs: [] Use when a sphinx-needs project has just been bootstrapped (post pharaoh-bootstrap, pre any needs authoring) and you need to generate minimal tailoring files from declared types — workflows.yaml, id-conventions.yaml, artefact-catalog.yaml, and per-type checklists — without requiring any needs to exist. Complements pharaoh-tailor-detect which requires ≥10 needs. -See [`skills/pharaoh-tailor-bootstrap/SKILL.md`](../../skills/pharaoh-tailor-bootstrap/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-tailor-bootstrap + +## When to use + +Invoke immediately after `pharaoh-bootstrap` + `pharaoh-setup` on a greenfield sphinx-needs project. The project has declared need types in `ubproject.toml` but has zero needs yet, so `pharaoh-tailor-detect` (which requires ≥10 needs to infer conventions) fails by design. This skill fills the "tailoring donut hole" — it emits minimal but valid tailoring derived from the bootstrap inputs, so: + +- Every need gets a defined lifecycle (`:status: draft` can transition to `reviewed` and `approved`). +- ID patterns are machine-checkable before the first need lands. +- `pharaoh-quality-gate` has a gate spec to evaluate against. +- Review checklists exist per type for `pharaoh-req-review` to consume. + +Do NOT use on a project that already has `.pharaoh/project/*.yaml` files — this skill never overwrites without explicit `overwrite=true`. Run `pharaoh-tailor-detect` / `pharaoh-tailor-fill` on matured projects instead. + +## Atomicity + +- (a) Indivisible — one `project_root` in → up to 5 files out in `.pharaoh/project/`. 
No needs authoring, no RST mutation, no setup-level edits. One tailoring phase × one project. +- (b) Input: `{project_root: str, overwrite?: bool}`. Output: JSON `{files_created: list[str], files_skipped: list[str], warnings: list[str]}`. Default `overwrite=false`. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-tailor-bootstrap/input_ubproject.toml` declares two types (`feat` FEAT_, `comp_req` CREQ_) and one extra_link (`satisfies`). Run skill against it. Scorer checks: + 1. Five files created under `.pharaoh/project/`: `workflows.yaml`, `id-conventions.yaml`, `artefact-catalog.yaml`, `checklists/feat.md`, `checklists/comp_req.md`. + 2. Each emitted file's content matches the corresponding `expected_output/` fixture byte-exact (YAML canonical-sorted for the YAML files; md exact for the checklists). + 3. Re-running the skill with `overwrite=false` on the now-populated `.pharaoh/project/` is idempotent: `files_created=[]`, `files_skipped` lists all five, `warnings=[]`. + 4. Running with `overwrite=true` on populated target regenerates all five (byte-exact equality with fixture preserved). + + Pass = all 4 checks. +- (d) Reusable on any first-time Pharaoh project. +- (e) Composable: callable by `pharaoh-setup` post-bootstrap. Never calls other skills. + +## Input + +- `project_root`: absolute path. Must contain `ubproject.toml` with at least one `[[needs.types]]` entry. +- `overwrite` (optional): if `true`, regenerate existing tailoring files. Default `false` (skip + warn on collision). + +## Output + +```json +{ + "files_created": [ + ".pharaoh/project/workflows.yaml", + ".pharaoh/project/id-conventions.yaml", + ".pharaoh/project/artefact-catalog.yaml", + ".pharaoh/project/checklists/feat.md", + ".pharaoh/project/checklists/comp_req.md" + ], + "files_skipped": [], + "warnings": [] +} +``` + +Paths are relative to `project_root`.
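The `overwrite` semantics behind `files_created` and `files_skipped` can be sketched as a planning step that touches no files. This is a minimal Python illustration, assuming the two fixture directives (`feat`, `comp_req`); `plan_tailoring_files` is a hypothetical name, and `warnings` is left empty because the source does not give the warning wording.

```python
import tempfile
from pathlib import Path


def plan_tailoring_files(project_root, directives, overwrite=False):
    """Decide which tailoring files would be created or skipped (no writes)."""
    base = Path(".pharaoh/project")
    targets = [
        base / "workflows.yaml",
        base / "id-conventions.yaml",
        base / "artefact-catalog.yaml",
    ]
    targets += [base / "checklists" / f"{d}.md" for d in directives]

    created, skipped = [], []
    for rel in targets:
        exists = (Path(project_root) / rel).exists()
        if exists and not overwrite:
            skipped.append(rel.as_posix())  # never clobber without overwrite=True
        else:
            created.append(rel.as_posix())
    # Warning wording is not specified in the source, so the list stays empty here.
    return {"files_created": created, "files_skipped": skipped, "warnings": []}


with tempfile.TemporaryDirectory() as root:
    first = plan_tailoring_files(root, ["feat", "comp_req"])
    # Simulate a populated target by touching every planned file.
    for rel in first["files_created"]:
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    second = plan_tailoring_files(root, ["feat", "comp_req"])
    forced = plan_tailoring_files(root, ["feat", "comp_req"], overwrite=True)

print(len(first["files_created"]), len(second["files_skipped"]), len(forced["files_created"]))
```

Returned paths stay relative to `project_root`, so the same result can be compared across checkouts.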
+ +## Process + +### Step 1: Read ubproject.toml + +Read `<project_root>/ubproject.toml`. Extract every `[[needs.types]]` entry's `directive` and `prefix`. Extract every `[[needs.extra_links]]` entry's `option`, `incoming`, `outgoing`. + +If zero types declared, FAIL: `"no [[needs.types]] in ubproject.toml; run pharaoh-bootstrap first"`. + +### Step 2: Emit workflows.yaml + +Emit a single project-wide state machine matching `schemas/workflows.schema.json`: a flat `lifecycle_states` list and a `transitions` array where each entry has `from`, `to`, and a `requires` list (always a list, never a scalar `gate` field). + +Default content for a greenfield bootstrap: + +```yaml +# workflows.yaml — generated by pharaoh-tailor-bootstrap +# Validates against schemas/workflows.schema.json. +# Edit transitions[*].requires to declare per-transition prerequisites. + +lifecycle_states: + - draft + - reviewed + - approved + +transitions: + - from: draft + to: reviewed + requires: [] + - from: reviewed + to: approved + requires: [] + - from: reviewed + to: draft + requires: [] +``` + +The state machine is project-wide, not per-type — `artefact-catalog.yaml` declares which subset of `lifecycle_states` applies to each type via its own `lifecycle` field (Step 4). `requires` is always a list of gate-name strings (empty list permitted); it replaces any older single-string `gate:` field. + +See `expected_output/workflows.yaml` in the fixture for exact format. + +### Step 3: Emit id-conventions.yaml + +`prefixes`: mapping from each directive to its prefix. `id_regex`: OR-join of all prefixes as a regex anchored to start + snake_case tail. `separator`: `"_"`. + +See fixture for exact format. + +### Step 4: Emit artefact-catalog.yaml + +For each declared type, emit: +- `required_fields`: at minimum `id`, `status`. 
+- `optional_fields`: `reviewer`, `approved_by`, plus `source_doc` for types that typically carry provenance (heuristic: top-level types like `feat`, `story`, `use_case` — if unsure, include it). +- `lifecycle`: list of states from `workflows.yaml` that apply to this type (typically the full state list). +- `required_body_sections`: optional list of body-prose section headings (e.g. `Inputs`, `Steps`, `Expected` for `tc`); omit when no body sections are required. +- `required_links`, `optional_links`, `required_metadata_fields`, `required_roles`: release-gate fields read by `pharaoh-link-completeness-check`, `pharaoh-output-validate`, and `pharaoh-review-completeness` (and aggregated by `pharaoh-quality-gate`). Emit all four as arrays of strings — empty arrays are valid and declare an explicit "no requirement" choice; absent keys are flagged by `pharaoh-tailor-review` rule C6 as an unmade decision. + +Inference rules for the four release-gate fields: + +- `required_links`: include the `option` of every `[[needs.extra_links]]` entry that resolves to this type as the source side of the link. Concretely: for an extra_link with `outgoing: <this_type>` (the link is declared on directives of this type), include its `option`. If the project does not declare the direction explicitly, default to including link options whose semantic name implies a parent reference (`satisfies`, `verifies`, `refines`, `derives_from`) for downstream artefact types (anything with `_req`, `arch`, `tc`, `impl`, `spec` in its directive). Other types default to `[]`. +- `optional_links`: any other extra_link `option` that may legally appear on this type but is not in `required_links`. Default to `[]` when no other links are declared. +- `required_metadata_fields`: include `status` always (every governed type has a lifecycle). Add any field that the project's `[needs.fields.X]` table declares as required for this directive. If `ubproject.toml` carries no such hints, emit `[status]`. 
+- `required_roles`: emit `[]` by default — bootstrap runs on greenfield projects with no observed review process. Tailoring authors are expected to fill in `[reviewer]` (or `[reviewer, approved_by]`) once a review gate is established. `pharaoh-tailor-review` rule C6 surfaces a finding if the key is absent so the project explicitly declares the choice. + +Worked example for a `comp_req` directive with `[[needs.extra_links]] option = "satisfies", outgoing = "comp_req"`: + +```yaml +comp_req: + required_fields: [id, status, title, satisfies] + optional_fields: [reviewer, approved_by, source_doc] + lifecycle: [draft, reviewed, approved] + required_links: [satisfies] + optional_links: [] + required_metadata_fields: [status] + required_roles: [] +``` + +A type with no observed link declarations emits empty arrays: + +```yaml +feat: + required_fields: [id, status, title] + optional_fields: [reviewer, approved_by, source_doc] + lifecycle: [draft, reviewed, approved] + required_links: [] + optional_links: [] + required_metadata_fields: [status] + required_roles: [] +``` + +Empty arrays are deliberate: they say "the project considered this and chose no requirement". Omitting the key entirely is what `pharaoh-tailor-review` flags as an unmade decision (rule C6). + +The emitted `artefact-catalog.yaml` validates against +`schemas/artefact-catalog.schema.json` shipped at the Pharaoh package root +(see `schemas/README.md`). `pharaoh-tailor-review` enforces this on the +output of every bootstrap run. + +### Step 5: Emit per-type checklists + +For each declared type, write `checklists/<directive>.md` with frontmatter `applies_to: <directive>`, `required_before: [reviewed]`, and a short review checklist body. The content is type-generic for `comp_req` and `feat`; see fixtures for exact content. 
+ +For types not covered by built-in templates (anything beyond `feat`, `comp_req`, `story`, `use_case`, `spec`, `impl`, `test`), emit a minimal checklist with `- [ ] Review this <type> for clarity, correctness, and traceability.` + +Additionally, emit `checklists/requirement.md` as a canonical alias for the primary requirement-type checklist (the `comp_req` checklist if declared, otherwise whichever declared type has prefix `REQ_` / role `requirement` per artefact-catalog.yaml). The alias is a one-line redirect `# See [<directive>.md](<directive>.md) — canonical requirement checklist` plus the frontmatter block. Downstream skills (`pharaoh-tailor-review`, `pharaoh-req-review`) reference `checklists/requirement.md` as the well-known filename, so the alias keeps the interop contract stable regardless of the project's directive naming. + +### Step 6: Check overwrite + write + +For each target file, check existence. If exists and `overwrite=false`, add to `files_skipped`, emit warning `"<path> already exists; skipping (use overwrite=true to regenerate)"`. Otherwise write the file and add to `files_created`. + +Create intermediate directories (`.pharaoh/project/`, `.pharaoh/project/checklists/`) as needed. + +### Step 7: Return + +Return the JSON report. + +## Failure modes + +- `ubproject.toml` missing → FAIL. +- Zero types declared → FAIL per Step 1. +- `.pharaoh/project/` unwritable → FAIL. + +## Non-goals + +- No tailoring inference from corpus statistics. For that, use `pharaoh-tailor-detect` + `pharaoh-tailor-fill` on matured projects. +- No pharaoh.toml generation. That is `pharaoh-setup`. +- No sphinx-needs config generation. That is `pharaoh-bootstrap`. +- No checklist customization per project — checklists are built-in templates. Caller edits after generation. 
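As an illustration of Step 6 of the `pharaoh-tailor-bootstrap` specification above, the following Python sketch shows the skip-or-write decision and the shape of the returned report. It is a minimal sketch, not the skill's implementation (which this patch does not show): the function name `write_tailoring_files` and the `files` mapping are hypothetical, and it follows the Step 6 wording of one warning per skipped file.

```python
from pathlib import Path


def write_tailoring_files(project_root, files, overwrite=False):
    """Write tailoring files, skipping existing ones unless overwrite is set.

    `files` maps a path relative to project_root to its text content
    (a hypothetical input shape chosen for this sketch).
    """
    root = Path(project_root)
    report = {"files_created": [], "files_skipped": [], "warnings": []}
    for rel_path, content in files.items():
        target = root / rel_path
        if target.exists() and not overwrite:
            # Collision with overwrite=false: record it and leave the file alone.
            report["files_skipped"].append(rel_path)
            report["warnings"].append(
                f"{rel_path} already exists; skipping "
                "(use overwrite=true to regenerate)"
            )
            continue
        # Create .pharaoh/project/ and .pharaoh/project/checklists/ on demand.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        report["files_created"].append(rel_path)
    return report
```

Paths in the report stay relative to `project_root`, matching the Output section. A second call with `overwrite=False` on a populated directory creates nothing and lists every target under `files_skipped`.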
diff --git a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md index f2f5bb6..09fd22f 100644 --- a/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md +++ b/.github/agents/pharaoh.tailor-code-grounding-filters.agent.md @@ -7,4 +7,189 @@ handoffs: [] Use when authoring a project's `code-grounding-filters.yaml` from observed stack conventions. Detects language + CLI framework + config-object style in the project source tree and emits a tailoring YAML populated with the four parameterised filter strategies. Does not invoke `pharaoh-req-code-grounding-check`; purely produces tailoring. -See [`skills/pharaoh-tailor-code-grounding-filters/SKILL.md`](../../skills/pharaoh-tailor-code-grounding-filters/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-tailor-code-grounding-filters + +## When to use + +Invoke once per project, before running `pharaoh-req-code-grounding-check` at scale, to populate `.pharaoh/project/code-grounding-filters.yaml`. Inspects a project source tree, detects which CLI framework / import syntax / config-default idiom the code uses, and emits a filter YAML whose `filters:` entries wire up the four strategies in [`../shared/code-grounding-filters.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/code-grounding-filters.md) to the detected stack. + +Do NOT invoke to validate an existing filter YAML — that is part of `pharaoh-tailor-review`. Do NOT invoke to apply filters to a CREQ — that is `pharaoh-req-code-grounding-check`. This skill only reads the codebase and writes one YAML. + +## Atomicity + +- (a) Indivisible: one source-tree in → one filter-YAML + detection report out. No CREQ scoring, no plan dispatch, no coupling to other tailor skills beyond emitting YAML in the format the target skill reads. 
+- (b) Input: `{project_root: str, output_path?: str, on_existing?: "fail"|"overwrite"|"skip"}`. Output: JSON `{detected: {language, cli_framework, config_style, import_style}, emitted_filters: [...], yaml_path: str, warnings: [...]}` plus the YAML written to `output_path`. +- (c) Reward: fixtures under `skills/pharaoh-tailor-code-grounding-filters/fixtures/`: + 1. `python-typer/` — source tree with `import typer`, `def from_csv(...)`, `TypeAlias = Annotated[..., typer.Option(...)]` markers → emits `typer_kebab` (with `Opt` morphology), `python_import`, `python_dataclass_default`, and `env_var_glob` (if `os.environ` or `envvar=` usages detected). + 2. `python-click-click/` — source tree with `import click`, `@click.command()`, `click.option(...)` markers → emits `click_kebab` (no `Opt` morphology because Click projects do not use the `Opt` TypeAlias convention), `python_import`, `python_dataclass_default`. + 3. `rust-clap/` — source tree with `use clap::...`, `#[derive(Parser)]`, `#[arg(long)]` markers → emits `clap_kebab`, `rust_use_clause`, `rust_serde_default` (when `serde(default="...")` is detected). + + Pass = each fixture's emitted YAML matches `expected-filters.yaml` by substring inclusion of every declared filter's `strategy` + `name`, and the detection report matches `expected-report.json` on `detected.*` fields. +- (d) Reusable: any project regardless of language — the detection matrix is a bounded table (roughly: Python, Rust, TypeScript, Go), each detection slot independently optional. +- (e) Read-only wrt source code. Writes one YAML file at `output_path`. Running twice with `on_existing="skip"` is a no-op. + +## Input + +- `project_root`: absolute path to the project root. The skill walks up to 3 levels deep looking for source files; skips `node_modules`, `target`, `.venv`, `dist`, `build`, `__pycache__`. +- `output_path`: absolute path to write the YAML. Default: `<project_root>/.pharaoh/project/code-grounding-filters.yaml`. 
+- `on_existing`: `"fail"` (default — refuse to overwrite), `"overwrite"` (replace the file), `"skip"` (if file exists, return without writing; still returns the detection report for review). + +## Detection matrix + +The skill walks the source tree and greps for language + framework markers. Each detection is boolean (present / absent). Multiple languages can coexist; the emitted YAML gets filters for every detected stack. + +### Language detection + +| language | markers (any-of) | +|---|---| +| python | file extension `.py` AND (`^import\s` OR `^from\s.*\simport\s` OR `^def\s` OR `^class\s`) | +| rust | file extension `.rs` AND (`^use\s` OR `^fn\s` OR `^struct\s` OR `^enum\s`) | +| typescript | file extension `.ts` or `.tsx` AND (`^import\s` OR `^export\s`) | +| go | file extension `.go` AND (`^package\s` OR `^import\s`) | + +### CLI framework detection (Python) + +| framework | markers (any-of) | +|---|---| +| typer | `import typer` OR `typer.Typer()` OR `typer.Option` OR `@.*_app\.command` | +| click | `import click` OR `@click\.command` OR `click\.option` | +| argparse | `argparse\.ArgumentParser` | +| none | no matches | + +### CLI framework detection (Rust) + +| framework | markers (any-of) | +|---|---| +| clap | `use clap::` OR `#\[derive\(.*Parser.*\)\]` OR `#\[arg\(` OR `#\[command\(` | +| structopt | `use structopt::` OR `#\[derive\(.*StructOpt.*\)\]` | +| none | no matches | + +### CLI framework detection (Go) + +| framework | markers (any-of) | +|---|---| +| cobra | `github.com/spf13/cobra` OR `cobra.Command` | +| urfave-cli | `github.com/urfave/cli` | +| flag | `"flag"` import | +| none | no matches | + +### Config-default idiom detection + +| idiom | markers (any-of) | +|---|---| +| python_dataclass | `@dataclass` AND `field\(default=` | +| python_pydantic | `from pydantic` AND `Field\(default=` | +| python_attrs | `import attr` OR `@attr.s` | +| rust_serde | `#\[derive\(.*Deserialize.*\)\]` AND `#\[serde\(default` | +| go_struct_tag | `` `json:"` 
`` with `default:` stanza | +| none | no matches | + +### Env-var convention detection + +| style | markers | +|---|---| +| uppercase-prefix | ≥3 distinct `[A-Z][A-Z0-9_]+_\w+` identifiers declared as env var strings, sharing a common prefix of ≥3 chars | +| none | fewer than 3 | + +If uppercase-prefix style IS detected, also detect the **prefix** — the longest common uppercase run across observed env-var identifiers (e.g. `JAMA_` from `JAMA_URL_ENV`, `JAMA_USERNAME_ENV`, ...). The prefix is NOT encoded into the filter (prefix is per-call from the CREQ token), only the strategy itself is enabled. + +## Emission rules + +After detection, build the `filters:` list in this order (deterministic for fixture comparison): + +1. **Kebab filter** — one entry if any CLI framework was detected. Name the filter after the framework (`typer_kebab`, `click_kebab`, `clap_kebab`, `cobra_kebab`). Parameters: + - `token_regex`: `"^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$"` + - `strip_leading`: `["--"]` + - `morphology_prefixes`: `["Opt"]` only for `typer` (that convention is Typer-specific); `[]` otherwise. + +2. **Env-var glob** — one entry if uppercase-prefix style was detected. Name `env_var_glob`. Parameters: + - `token_regex`: `"^[A-Z][A-Z0-9_]*_?\\*$"` + - `separator_character`: `"_"` + +3. **Dotted import resolution** — one entry per language with a known import syntax. Parameters differ by language: + + **Python:** + ```yaml + - name: python_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w.]*\\.[A-Z]\\w+$" + separator: "." 
+ import_patterns: + - "from\\s+${mod}\\s+import\\s+${attr}" + - "import\\s+${mod}\\b" + - "${tok}" + ``` + + **Rust:** + ```yaml + - name: rust_use_clause + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w]*(::[\\w]+)+$" + separator: "::" + import_patterns: + - "use\\s+${tok}" + - "use\\s+${mod}::\\{[^}]*\\b${attr}\\b[^}]*\\}" + - "${tok}" + ``` + + **TypeScript:** + ```yaml + - name: ts_named_import + strategy: dotted_import_resolution + token_regex: "^@?[a-z][\\w/.-]*:[A-Z]\\w+$" + separator: ":" + import_patterns: + - "import\\s*\\{[^}]*\\b${attr}\\b[^}]*\\}\\s*from\\s*['\"]${mod}['\"]" + ``` + + **Go:** + ```yaml + - name: go_import + strategy: dotted_import_resolution + token_regex: "^[a-z][\\w/.-]*\\.[A-Z]\\w+$" + separator: "." + import_patterns: + - "\"${mod}\"\\s*$" + - "${tok}" + ``` + +4. **Literal-default** — one entry per detected config-default idiom: + + - `python_dataclass` → `python_dataclass_default` with `field\\(default=[\"']${tok}[\"']\\)` and `hint_dir_pattern: "config/"`. + - `python_pydantic` → `python_pydantic_default` with `Field\\(default=[\"']${tok}[\"']\\)` and `hint_dir_pattern: "config/|models/"`. + - `rust_serde` → `rust_serde_default` with `#\\[serde\\(default\\s*=\\s*\"${tok}\"\\)\\]` and `hint_dir_pattern: "config/|src/config/"`. + - Absent → omit this filter (the skill's other axes still catch the CREQ; the actionable evidence is just missing). 
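To make the two `token_regex` values in the emission rules above concrete, here is a small Python sketch that checks which of them accepts a given token. The regexes are copied from rules 1 and 2 with the YAML escaping removed. The helper name `matching_filters` and the short labels it returns are hypothetical, and the sketch does not model `strip_leading`, `morphology_prefixes`, or how `pharaoh-req-code-grounding-check` applies the filters.

```python
import re

# token_regex for the kebab filter (rule 1), unescaped from the YAML string.
KEBAB_TOKEN = re.compile(r"^(--)?[a-z][a-z0-9]*(-[a-z0-9]+)+$")
# token_regex for env_var_glob (rule 2): an uppercase name ending in a literal "*".
ENV_VAR_GLOB_TOKEN = re.compile(r"^[A-Z][A-Z0-9_]*_?\*$")


def matching_filters(token):
    """Return labels of the sketched filters whose token_regex accepts token."""
    labels = []
    if KEBAB_TOKEN.match(token):
        labels.append("kebab")
    if ENV_VAR_GLOB_TOKEN.match(token):
        labels.append("env_var_glob")
    return labels
```

The kebab pattern requires at least one hyphen-separated segment, so a single-word flag such as `verbose` is not claimed by the kebab filter. The env-var pattern only accepts glob tokens such as `JAMA_*`, not full variable names.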
+ +## Output + +```json +{ + "detected": { + "languages": ["python"], + "cli_framework": "typer", + "config_default_idiom": "python_dataclass", + "env_var_style": "uppercase-prefix", + "detected_env_prefix": "JAMA_" + }, + "emitted_filters": [ + "typer_kebab", + "env_var_glob", + "python_import", + "python_dataclass_default" + ], + "yaml_path": "/abs/path/to/.pharaoh/project/code-grounding-filters.yaml", + "warnings": [] +} +``` + +`warnings` surfaces any ambiguity: two CLI frameworks co-detected (emits filters for both and warns), no language detected (emits empty `filters:` and warns), config idiom detected without matching dirs on disk (emits filter, warns that `hint_dir_pattern` may not match anything at review time). + +## Composition + +Role: `tailor-author`. + +Runs once per project during bootstrap, typically chained after `pharaoh-tailor-detect` and before `pharaoh-tailor-review`. Its output feeds `pharaoh-req-code-grounding-check` via the `tailoring_path` input. Never invokes emission or review skills; produces YAML that other skills consume. diff --git a/.github/agents/pharaoh.tailor-detect.agent.md b/.github/agents/pharaoh.tailor-detect.agent.md index 680f063..c1620a1 100644 --- a/.github/agents/pharaoh.tailor-detect.agent.md +++ b/.github/agents/pharaoh.tailor-detect.agent.md @@ -7,4 +7,327 @@ handoffs: [] Use when inspecting a sphinx-needs project to emit a structured report of detected conventions — prefixes, ID regex candidates, separator, lifecycle states, artefact types with observed required/optional fields. Does NOT author tailoring files (see pharaoh-tailor-fill). -See [`skills/pharaoh-tailor-detect/SKILL.md`](../../skills/pharaoh-tailor-detect/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-tailor-detect + +## When to use + +Invoke when you have a built sphinx-needs project and want to bootstrap tailoring configuration +automatically. 
This skill reads the `needs.json` index and derives what conventions the project +actually uses — it does not guess, it observes. + +Do NOT invoke this skill to author tailoring files — that is `pharaoh-tailor-fill`. Do NOT +invoke this to validate existing tailoring — that is `pharaoh-tailor-review`. This skill only +observes and reports. + +--- + +## Inputs + +- **project_dir** (from user): path to the sphinx-needs project root. Must contain a built + `needs.json` at a discoverable path (see Step 1 for search order). +- No tailoring files required — this skill bootstraps from raw data. + +--- + +## Outputs + +A single JSON document — no prose wrapper. Shape: + +```json +{ + "prefixes": { + "gd_req": { + "count": 185, + "example_ids": ["gd_req__brake_activation", "gd_req__safety_goal_1"] + }, + "std_req": { + "count": 580, + "example_ids": ["std_req__iso26262__4_3_1", "std_req__aspice__SWE2_BP1"] + } + }, + "separator": "__", + "id_regex_candidate": "^[a-z][a-z_]*__[a-z0-9_]+$", + "id_regex_exceptions": { + "std_req": "^std_req__<source>__<UPSTREAM-ID>$ (pattern inferred; verify manually)" + }, + "lifecycle_states_observed": ["draft", "valid", "inspected"], + "artefact_types": { + "gd_req": { + "observed_fields_required": [ + ["id", 185], ["status", 185], ["satisfies", 184] + ], + "observed_fields_optional": [ + ["complies", 173], ["rationale", 42], ["tags", 31], ["verification", 29] + ], + "required_threshold_note": "Field present in >= 95% of instances considered required" + }, + "std_req": { + "observed_fields_required": [ + ["id", 580], ["status", 580] + ], + "observed_fields_optional": [ + ["complies", 12], ["tags", 8] + ], + "required_threshold_note": "Field present in >= 95% of instances considered required" + } + }, + "warnings": [ + "std_req IDs do not match the common regex pattern — id_regex_exception recorded" + ] +} +``` + +--- + +## Process + +### Step 1: Locate needs.json + +Search in order: +1. `<project_dir>/build/needs/needs.json` +2. 
`<project_dir>/docs/_build/needs/needs.json` +3. `<project_dir>/_build/needs/needs.json` +4. `<project_dir>/bazel-bin/needs_json/_build/needs/needs.json` +5. Any `needs.json` under a `_build` or `build` directory (recursive, first match) + +If not found in any location, FAIL: + +``` +FAIL: needs.json not found under <project_dir>. +Build the project first: + sphinx-build docs/ docs/_build/ (Sphinx) + bazel build //...:needs_json (Bazel) +Then re-run pharaoh-tailor-detect. +``` + +--- + +### Step 2: Parse needs.json and extract all needs + +Load the JSON file. The top-level structure may be: + +```json +{"versions": {"<version>": {"needs": {"<id>": {...}}}}} +``` + +or a flat `{"needs": {"<id>": {...}}}`. Handle both. Extract the flat map +`id → {id, type, status, <all other fields>}` from the most recent version if versioned. + +If the total need count is < 10, FAIL: + +``` +FAIL: needs.json contains only <N> needs — corpus too small for confident detection. +Pharaoh-tailor-detect requires at least 10 needs to infer reliable conventions. +Populate the project with representative content and rebuild before detecting. +``` + +--- + +### Step 3: Detect separator and prefixes + +**3a. Determine separator** + +Check all IDs for common separator patterns. Test candidates `__`, `_`, `-` in order. The +separator is the string that divides the type prefix from the local-ID part in the majority +of IDs. + +Algorithm: +- Split each ID on the candidate separator (max 2 parts). +- The separator is the one for which ≥ 70% of IDs split cleanly into exactly 2 non-empty parts. +- If no candidate achieves 70%, record `"separator": null` and emit a warning. + +**3b. Group by prefix** + +Using the detected separator, split each ID into `(prefix, local_part)`. Group all IDs by +prefix. 
For each prefix record:
+- `count`: number of needs with this prefix
+- `example_ids`: up to 3 representative IDs (first alphabetically)
+
+---
+
+### Step 4: Detect ID regex candidates
+
+**Common regex pattern:** `^[a-z][a-z_]*__[a-z0-9_]+$` (using `__` separator).
+
+For each prefix, test all its IDs against the common pattern.
+
+- If ≥ 95% of a prefix's IDs match the common pattern → record `id_regex_candidate` as the
+  common pattern.
+- If a prefix has < 95% match rate, inspect the non-matching IDs and construct a
+  prefix-specific regex candidate. Record under `id_regex_exceptions`.
+- Note the exception with a `(pattern inferred; verify manually)` annotation — do not claim
+  certainty about exceptions.
+
+---
+
+### Step 5: Detect lifecycle states
+
+Collect all unique non-empty values of the `status` field across all needs. Record as
+`lifecycle_states_observed` sorted alphabetically.
+
+If no need has a `status` field, record `"lifecycle_states_observed": []` and add a warning.
+
+---
+
+### Step 6: Compute field frequencies per prefix
+
+For each prefix-grouped set of needs:
+
+1. Collect all field keys that appear on at least one need in the group (excluding `id` and
+   `type` — those are always present structurally).
+2. For each field key, count the number of needs in the group that carry a non-empty value.
+3. Express as `[field_name, count]` tuples.
+4. Split into:
+   - `observed_fields_required`: field count / group_size ≥ 0.95
+   - `observed_fields_optional`: field count / group_size ≥ 0.20 and < 0.95
+5. Include `id` and `status` in `observed_fields_required` always (structural).
+6. Sort each list descending by count.
+
+Add `"required_threshold_note": "Field present in >= 95% of instances considered required"` to
+each artefact type entry.
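The Step 6 split can be sketched in Python as follows. This is an illustration under stated assumptions, not the skill's code: the function name `split_field_frequencies` is hypothetical, "non-empty" is taken to mean anything other than `None`, an empty string, or an empty list/dict, ties in count are broken alphabetically for determinism, and the thresholds are compared with integer arithmetic to avoid floating-point edge cases at exactly 95%.

```python
from collections import Counter


def split_field_frequencies(needs):
    """Split observed fields of one prefix group into required and optional.

    `needs` is a list of dicts, one per need in the group.
    Returns (required, optional) as lists of [field_name, count].
    """
    size = len(needs)
    counts = Counter()
    for need in needs:
        for key, value in need.items():
            if key in ("id", "type"):
                continue  # structural keys, excluded from counting
            if value not in (None, "", [], {}):
                counts[key] += 1

    # id and status are always reported as required (structural).
    required = [["id", size], ["status", counts.get("status", 0)]]
    optional = []
    for field, count in counts.items():
        if field == "status":
            continue
        if count * 100 >= 95 * size:
            required.append([field, count])
        elif count * 100 >= 20 * size:
            optional.append([field, count])
        # Fields below 20% are dropped from the report.

    def by_count(item):
        return (-item[1], item[0])

    return sorted(required, key=by_count), sorted(optional, key=by_count)
```

For a group of 20 needs where `satisfies` appears on 19 and `rationale` on 5, `satisfies` lands in the required list (19/20 = 95%) and `rationale` in the optional list (25%), while a field present on only 2 needs (10%) is omitted.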
+
+---
+
+### Step 7: Compile warnings
+
+Collect any anomalies encountered:
+- Prefixes with non-matching ID regex (note which prefix and what fraction matched)
+- `status` field absent from entire corpus
+- Separator detection fell below 70% threshold
+- Any prefix with count == 1 (singleton — may be an outlier, not a real type)
+
+Add all warnings to the top-level `"warnings": [...]` array. If no anomalies, emit `"warnings": []`.
+
+---
+
+### Step 8: Emit JSON
+
+Emit the single JSON document per the Output shape. No prose before or after.
+
+---
+
+## Guardrails
+
+**G1 — needs.json not found**
+
+FAIL with build hint (Step 1). Do not attempt inference without data.
+
+**G2 — Corpus too small**
+
+< 10 total needs → FAIL (Step 2). Field frequencies from tiny samples are not meaningful.
+
+**G3 — No separator detected**
+
+If no separator candidate achieves 70% coverage, record `separator: null` and
+`id_regex_candidate: null`, add a warning, and continue — the output is still useful for
+`lifecycle_states_observed` and raw prefix counts.
+
+**G4 — Malformed needs.json**
+
+If the JSON is syntactically invalid or the `needs` key is missing, FAIL:
+
+```
+FAIL: needs.json is malformed or missing the 'needs' key.
+Rebuild the project and retry.
+```
+
+---
+
+## Advisory chain
+
+After successfully emitting the report, always advise:
+
+```
+Pass this report to `pharaoh-tailor-fill` to author the .pharaoh/project/ tailoring files.
+Review the id_regex_exceptions entries manually before proceeding.
+```
+
+---
+
+## Worked example
+
+**User input:** `project_dir = /work/eclipse-score`
+
+**Step 1:** `needs.json` found at `/work/eclipse-score/docs/_build/needs/needs.json`.
+
+**Step 2:** 1281 needs loaded. Corpus size check passed.
+
+**Step 3:** Separator `__` achieves 98% clean split. Prefixes detected: `gd_req` (185),
+`std_req` (580), `wp` (74), `wf` (79), `gd_chklst` (15), `tc` (348).
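The separator check and prefix grouping behind Step 3 can be sketched in Python as below. This is a hedged illustration: the function names are hypothetical, and the sketch assumes that the first candidate, in the documented order `__`, `_`, `-`, to reach the 70% threshold wins, which is one reading of "test candidates in order".

```python
def detect_separator(ids, candidates=("__", "_", "-"), threshold_pct=70):
    """Return the first candidate that cleanly splits enough IDs, else None."""
    if not ids:
        return None
    for sep in candidates:
        clean = 0
        for need_id in ids:
            parts = need_id.split(sep, 1)  # at most 2 parts
            if len(parts) == 2 and all(parts):
                clean += 1
        if clean * 100 >= threshold_pct * len(ids):
            return sep
    return None


def group_by_prefix(ids, sep):
    """Group IDs by the part before the first separator occurrence."""
    groups = {}
    for need_id in ids:
        prefix = need_id.split(sep, 1)[0]
        groups.setdefault(prefix, []).append(need_id)
    # Keep up to 3 example IDs per prefix, first alphabetically.
    return {
        prefix: {"count": len(members), "example_ids": sorted(members)[:3]}
        for prefix, members in groups.items()
    }
```

Because `__` is tried before `_`, an ID such as `gd_req__brake_activation` yields the prefix `gd_req` rather than `gd`. A `None` result corresponds to the G3 guardrail: record `separator: null`, add a warning, and continue.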
+ +**Step 4:** `gd_req`, `wp`, `wf`, `gd_chklst`, `tc` all pass ≥ 95% match on +`^[a-z][a-z_]*__[a-z0-9_]+$`. `std_req` fails — IDs contain uppercase and an extra segment +(e.g. `std_req__iso26262__4_3_1`). Exception recorded. + +**Step 5:** Unique status values: `draft`, `inspected`, `valid`. + +**Step 6:** For `gd_req` (185 needs): `id` 185/185, `status` 185/185, `satisfies` 184/185 +(≥ 95% → required). `complies` 173/185 (93.5% → optional by threshold), `rationale` 42/185, +`tags` 31/185, `verification` 29/185 (all < 95% → optional). + +**Step 7:** One warning: `std_req IDs do not match the common regex pattern — id_regex_exception recorded`. + +**Step 8 output:** + +```json +{ + "prefixes": { + "gd_req": {"count": 185, "example_ids": ["gd_req__abs_pump_activation", "gd_req__brake_system_safety"]}, + "gd_chklst": {"count": 15, "example_ids": ["gd_chklst__functional_safety_req"]}, + "std_req": {"count": 580, "example_ids": ["std_req__aspice__SWE2_BP1", "std_req__iso26262__4_3_1"]}, + "tc": {"count": 348, "example_ids": ["tc__abs_pump_001", "tc__brake_coast_002"]}, + "wf": {"count": 79, "example_ids": ["wf__brake_system_design"]}, + "wp": {"count": 74, "example_ids": ["wp__brake_release_1_0"]} + }, + "separator": "__", + "id_regex_candidate": "^[a-z][a-z_]*__[a-z0-9_]+$", + "id_regex_exceptions": { + "std_req": "^std_req__<source>__<UPSTREAM-ID>$ (pattern inferred; verify manually)" + }, + "lifecycle_states_observed": ["draft", "inspected", "valid"], + "artefact_types": { + "gd_req": { + "observed_fields_required": [["id", 185], ["status", 185], ["satisfies", 184]], + "observed_fields_optional": [["complies", 173], ["rationale", 42], ["tags", 31], ["verification", 29]], + "required_threshold_note": "Field present in >= 95% of instances considered required" + }, + "std_req": { + "observed_fields_required": [["id", 580], ["status", 580]], + "observed_fields_optional": [["complies", 12], ["tags", 8]], + "required_threshold_note": "Field present in >= 95% of instances 
considered required" + }, + "tc": { + "observed_fields_required": [["id", 348], ["status", 348], ["verifies", 345]], + "observed_fields_optional": [["tags", 87], ["level", 63], ["rationale", 41]], + "required_threshold_note": "Field present in >= 95% of instances considered required" + }, + "wf": { + "observed_fields_required": [["id", 79], ["status", 79], ["input", 79], ["output", 79], ["responsible", 77]], + "observed_fields_optional": [["approved_by", 31], ["supported_by", 28], ["has", 14]], + "required_threshold_note": "Field present in >= 95% of instances considered required" + }, + "wp": { + "observed_fields_required": [["id", 74], ["status", 74]], + "observed_fields_optional": [["tags", 18], ["complies", 11]], + "required_threshold_note": "Field present in >= 95% of instances considered required" + }, + "gd_chklst": { + "observed_fields_required": [["id", 15], ["status", 15]], + "observed_fields_optional": [["tags", 4]], + "required_threshold_note": "Field present in >= 95% of instances considered required" + } + }, + "warnings": [ + "std_req IDs do not match the common regex pattern — id_regex_exception recorded" + ] +} +``` + +``` +Pass this report to `pharaoh-tailor-fill` to author the .pharaoh/project/ tailoring files. +Review the id_regex_exceptions entries manually before proceeding. +``` diff --git a/.github/agents/pharaoh.tailor-fill.agent.md b/.github/agents/pharaoh.tailor-fill.agent.md index 11afcd1..3c79cb6 100644 --- a/.github/agents/pharaoh.tailor-fill.agent.md +++ b/.github/agents/pharaoh.tailor-fill.agent.md @@ -7,4 +7,429 @@ handoffs: [] Use when authoring the .pharaoh/project/ tailoring files (id-conventions.yaml, workflows.yaml, artefact-catalog.yaml, checklists/requirement.md) from detected-conventions JSON produced by pharaoh-tailor-detect. -See [`skills/pharaoh-tailor-fill/SKILL.md`](../../skills/pharaoh-tailor-fill/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-tailor-fill + +## When to use + +Invoke after `pharaoh-tailor-detect` has produced a detected-conventions JSON report. This +skill converts that report into the four files that Pharaoh reads at runtime from +`.pharaoh/project/`. + +Do NOT invoke without a detect report — guessing conventions without observation is +unreliable. Do NOT invoke if tailoring files already exist unless the user explicitly confirms +overwrite (see Guardrail G2). + +> **(a) Debt logged** — this skill fills 4 files in one invocation. If Phase-3 validation +> shows `pharaoh-tailor-review` systematically flags one file type more than others, consider +> splitting into `tailor-fill-id-conventions`, `tailor-fill-workflows`, `tailor-fill-catalog`, +> and `tailor-fill-checklist`. Until that evidence exists, the combined skill is correct +> scope: all four files form one logical tailoring unit and are authored together or not at all. + +--- + +## Inputs + +- **detected_conventions** (from `pharaoh-tailor-detect`): the JSON report emitted by that + skill +- **target_dir** (from user): path to `.pharaoh/project/` where the four files will be written +- **standard** (from user, optional): `iso26262-8.6` (default) or `aspice-4.0` or `iso21434` + — controls which checklist is generated for `checklists/requirement.md` +- **overwrite_ok** (from user, optional, default `false`): if `true`, existing tailoring files + are replaced; if `false`, fail if any target file already exists + +--- + +## Outputs + +Four files written under `<target_dir>`: + +1. `id-conventions.yaml` +2. `workflows.yaml` +3. `artefact-catalog.yaml` +4. 
`checklists/requirement.md` + +Emit a summary after writing: + +``` +Written: + <target_dir>/id-conventions.yaml (<N> prefix entries) + <target_dir>/workflows.yaml (<N> states, <N> transitions) + <target_dir>/artefact-catalog.yaml (<N> artefact types) + <target_dir>/checklists/requirement.md (<standard> axes) + +Run `pharaoh-tailor-review` to validate the generated files before using them. +``` + +--- + +## Process + +### Step 1: Validate detected_conventions + +Confirm the JSON has the required top-level keys: `prefixes`, `separator`, `artefact_types`, +`lifecycle_states_observed`. + +If any required key is missing, FAIL: + +``` +FAIL: detected_conventions JSON is malformed — missing key(s): <list>. +Re-run pharaoh-tailor-detect and pass its output directly to this skill. +Expected shape: {"prefixes": {...}, "separator": "...", "artefact_types": {...}, + "lifecycle_states_observed": [...]} +``` + +--- + +### Step 2: Check for existing files + +If `overwrite_ok` is `false` (default), check whether any of the four target files exist: +- `<target_dir>/id-conventions.yaml` +- `<target_dir>/workflows.yaml` +- `<target_dir>/artefact-catalog.yaml` +- `<target_dir>/checklists/requirement.md` + +If any exist, FAIL: + +``` +FAIL: Tailoring files already exist at <target_dir>: + <list of existing files> +To replace them, re-run with overwrite_ok: true. +``` + +Create `<target_dir>/checklists/` if it does not exist. + +--- + +### Step 3: Write id-conventions.yaml + +Use `prefixes`, `separator`, `id_regex_candidate`, and `id_regex_exceptions` from the detect +report. + +Format: + +```yaml +# id-conventions.yaml — generated by pharaoh-tailor-fill from pharaoh-tailor-detect output +# Review id_regex_exceptions manually before committing. + +separator: "__" + +id_regex: "^[a-z][a-z_]*__[a-z0-9_]+$" + +# Per-type overrides for IDs that do not match id_regex above. +# Empty if none detected. 
+id_regex_exceptions: {} + +prefixes: + gd_req: GD_REQ_ + wp: WP_ + wf: WF_ + tc: TC_ + # ... one entry per detected prefix +``` + +Rules: +- `separator`: value from detect report; if `null`, write `separator: null # WARNING: not detected` +- `id_regex`: value from `id_regex_candidate`; if `null`, write placeholder with warning comment +- `id_regex_exceptions`: map from detect report `id_regex_exceptions`; empty map if none +- `prefixes`: one key per detected prefix from `detected_conventions.prefixes`. The value is the + identifier-prefix string that gets prepended to the local-part to form an id (e.g. `REQ_`, + `FEAT_`, `BRAKE_CTRL_`, or just `tc` if the project uses the directive name itself as the + prefix). Take the value from the detect report's `prefixes[<key>]`; if the detect report + reports the prefix as the directive name itself (e.g. `tc__`), use the directive name as + the value. Do not author human descriptions in the value position — `prefixes` is consumed + by `pharaoh-id-allocate` and must be a literal prefix token, not prose. + +--- + +### Step 4: Write workflows.yaml + +Use `lifecycle_states_observed` from the detect report. + +**4a. States** + +List all observed states. Add a short description using common conventions: + +| State | Description | +|---|---| +| `draft` | Initial authoring state — not yet reviewed | +| `valid` | Review completed — approved for use | +| `inspected` | Independent inspection passed | +| `open` | Item identified — not yet analysed | +| `closed` | Resolution confirmed — no action outstanding | +| `active` | In progress | +| `deprecated` | Superseded — do not use for new work | + +For any state not in this table, write `"<state> — description not inferred; update manually"`. + +**4b. 
Default transitions** + +If the project does not encode transitions explicitly (none are present in needs.json), apply +these sensible defaults based on the observed state set: + +- `[draft, valid, inspected]` → standard V-model review chain: + - `draft → valid`: requires `["independent_review_complete"]` + - `valid → inspected`: requires `["inspection_record_present"]` + - `inspected → draft`: requires `[]` (regression path, no prerequisite) + +- `[open, closed]` → simple open/close: + - `open → closed`: requires `["resolution_confirmed"]` + - `closed → open`: requires `[]` (reopen path) + +- Any other combination → emit transitions for each adjacent pair with empty `requires: []` + and add a `# TODO: fill in transition prerequisites` comment. + +Format: + +```yaml +# workflows.yaml — generated by pharaoh-tailor-fill +# Validates against schemas/workflows.schema.json. +# Review default transitions before committing. + +lifecycle_states: + - draft + - valid + - inspected + +transitions: + - from: draft + to: valid + requires: [independent_review_complete] + - from: valid + to: inspected + requires: [inspection_record_present] + - from: inspected + to: draft + requires: [] +``` + +State descriptions live in code review notes or per-project docs, not in `workflows.yaml` — `lifecycle_states` is a flat list of state-name strings per the schema. The optional `note:` field is not part of the schema and must be dropped from emitted output. + +--- + +### Step 5: Write artefact-catalog.yaml + +Use `artefact_types` from the detect report. 
+ +For each artefact type, emit a YAML block: + +```yaml +<prefix>: + required_fields: [<fields from observed_fields_required, excluding id — always implicit>] + optional_fields: [<fields from observed_fields_optional>] + lifecycle: [<lifecycle_states_observed>] # omit if prefix has no status field + required_links: [<link options observed on every need of this type>] + optional_links: [<link options observed on some but not all needs of this type>] + required_metadata_fields: [<option keys observed non-empty on every need of this type>] + required_roles: [<reviewer/approval options observed non-empty on every need of this type>] +``` + +Rules for the basic shape: +- Include `id` and `status` in `required_fields` always (structural). +- Copy remaining `observed_fields_required` field names (drop counts — counts are detect output, + not catalog schema). +- Copy `observed_fields_optional` field names similarly. +- Include `lifecycle` key only for types that carry a `status` field. +- Add a YAML comment noting frequency for any field that was close to the 95% threshold + (between 90–95%): `# <field>: present in <N>% of corpus — considered required by threshold`. + +Rules for the four release-gate fields. These are read by +`pharaoh-link-completeness-check`, `pharaoh-output-validate`, and +`pharaoh-review-completeness` and aggregated by `pharaoh-quality-gate`. Empty +arrays are valid and explicit; absent keys are flagged by +`pharaoh-tailor-review` rule C6: + +- `required_links`: every link option that appears non-empty on ≥95% of needs of this type + (same threshold as `required_fields`). +- `optional_links`: link options that appear on ≥20% but <95% of needs. +- `required_metadata_fields`: option keys (excluding link options and the structural `id`) + that appear non-empty on ≥95% of needs of this type. Always include `status` for any type + with a lifecycle. 
+- `required_roles`: include `reviewer` if observed non-empty on ≥80% of needs of this type;
+  include `approved_by` if observed non-empty on ≥80%. The fill path is more opinionated
+  than the bootstrap path: detected presence in the corpus is the signal that the project
+  treats the role as required, even if the threshold for `required_fields` (95%) was not met.
+
+Always emit all four keys, even as empty arrays. Empty array means "the project's data shows
+no requirement for this type"; omitting the key means "we never decided" — and rule C6 flags
+the latter at review time.
+
+Worked example for a `comp_req` type whose corpus shows `satisfies` on every need,
+`verifies` on 60%, `priority` non-empty on every need, `reviewer` on 90%, and
+`approved_by` on 30%:
+
+```yaml
+comp_req:
+  required_fields: [id, status, title, satisfies, priority]
+  optional_fields: [reviewer, approved_by, verifies]
+  lifecycle: [draft, reviewed, approved]
+  required_links: [satisfies]
+  optional_links: [verifies]
+  required_metadata_fields: [status, priority]
+  required_roles: [reviewer]
+```
+
+`priority` is present on every need, so it is a required field as well as a required
+metadata field. `reviewer` (90%) stays an optional field because it misses the 95% field
+threshold, but it clears the 80% role threshold; `approved_by` (30%) clears neither.
+
+Format header:
+
+```yaml
+# artefact-catalog.yaml — generated by pharaoh-tailor-fill from pharaoh-tailor-detect output
+# Threshold: required = present in >= 95% of instances; optional = >= 20%.
+# Release-gate fields (required_links, optional_links, required_metadata_fields,
+# required_roles) read by pharaoh-link-completeness-check, pharaoh-output-validate,
+# pharaoh-review-completeness; aggregated by pharaoh-quality-gate. Empty array means
+# "explicit no requirement"; absent key is flagged by pharaoh-tailor-review C6.
+```
+
+---
+
+### Step 6: Write checklists/requirement.md
+
+Use the standard specified in the `standard` input (default `iso26262-8.6`).
+ +**ISO 26262-8 §6 axes (default):** + +```markdown +--- +title: "Requirement Quality Checklist — ISO 26262-8 §6" +standard: iso26262-8.6 +axes: + individual: + - atomicity + - internal_consistency + - verifiability + - unambiguity_prose + - comprehensibility + - feasibility + - schema + set_level: + - completeness + - external_consistency + - no_duplication + chain_level: + - maintainability +--- + +# Requirement Quality Checklist + +This checklist is used by `pharaoh-req-review` and `pharaoh-req-set-review`. + +## Individual-requirement axes + +| Axis | Type | Scale | Rule | +|---|---|---|---| +| atomicity | binary | 0/1 | Body contains exactly one `shall`; no coordinating conjunction joins modal verbs within the shall clause | +| internal_consistency | binary | 0/1 | Body contains no self-contradictory statement | +| verifiability | binary | 0/1 | `:verification:` present, non-empty, and resolves to a real need-id | +| schema | binary | 0/1 | All `required_fields` from artefact-catalog present and non-empty | +| unambiguity_prose | ordinal | 0–3 | 3 = unambiguous and precise; 0 = multiple conflicting interpretations | +| comprehensibility | ordinal | 0–3 | 3 = fully self-contained; 0 = incomprehensible without external context | +| feasibility | ordinal | 0–3 | 3 = clearly feasible and well-constrained; 0 = obviously infeasible | + +## Set-level axes (deferred to pharaoh-req-set-review) + +| Axis | Type | Notes | +|---|---|---| +| completeness | set | All behaviours of the parent element are captured | +| external_consistency | set | No conflict with sibling requirements or parent element | +| no_duplication | set | No requirement duplicates another in the set | + +## Chain-level axes + +| Axis | Type | Notes | +|---|---|---| +| maintainability | chain | Requirement set survives pharaoh-req-regenerate fixed-point within 2 iterations | +``` + +**ASPICE 4.0 (SWE.2):** + +Replace the axes table with ASPICE SWE.2 base practice indicators. 
Emit a note: +`# ASPICE 4.0 SWE.2 BP coverage — use pharaoh-standard-conformance for per-indicator scoring`. + +**ISO 21434 (security):** + +Replace with TARA/security requirement axes. Emit a note referencing +`pharaoh-standard-conformance` for per-indicator scoring. + +--- + +## Guardrails + +**G1 — Malformed detect report** + +`detected_conventions` missing required keys → FAIL (Step 1) with explicit key list. + +**G2 — Existing tailoring, no overwrite permission** + +Any of the four target files already exists and `overwrite_ok` is `false` → FAIL (Step 2) +listing the conflicting files. Never silently overwrite existing tailoring. + +**G3 — Null separator in detect report** + +If `separator: null`, write `workflows.yaml` and `artefact-catalog.yaml` as usual but +set `id_regex: null` in `id-conventions.yaml` with a warning comment. Do not block the other +three files. + +**G4 — Unknown standard** + +If `standard` is not one of `iso26262-8.6`, `aspice-4.0`, `iso21434`, FAIL: + +``` +FAIL: unknown standard "<standard>". +Supported values: iso26262-8.6 (default), aspice-4.0, iso21434. +``` + +--- + +## Advisory chain + +After writing all four files: + +``` +Run `pharaoh-tailor-review` to validate the generated files before using them. +``` + +--- + +## Worked example + +**User input:** +- `detected_conventions`: the report from the pharaoh-tailor-detect worked example (Score project) +- `target_dir`: `/work/eclipse-score/.pharaoh/project/` +- `standard`: `iso26262-8.6` (default) +- `overwrite_ok`: `false` + +**Step 1:** All required keys present. Validation passes. + +**Step 2:** No existing files at target path. Continue. + +**Step 3 — id-conventions.yaml written:** +6 prefix entries (`gd_req`, `gd_chklst`, `std_req`, `tc`, `wf`, `wp`); +`id_regex_exceptions` includes `std_req` exception. 
+ +**Step 4 — workflows.yaml written:** +States `draft`, `inspected`, `valid` with standard V-model default transitions: +`draft → valid` (requires `independent_review_complete`), `valid → inspected` +(requires `inspection_record_present`), `inspected → draft` (no prerequisite). + +**Step 5 — artefact-catalog.yaml written:** +6 type entries. `gd_req`: required `[id, status, satisfies]`, optional +`[complies, rationale, tags, verification]`; `complies` gets a comment noting it was +93.5% (just below the 95% threshold — borderline, kept as optional). + +**Step 6 — checklists/requirement.md written:** +ISO 26262-8 §6 axes: 4 binary individual, 3 ordinal individual, 3 set-deferred, 1 chain. + +**Summary output:** + +``` +Written: + /work/eclipse-score/.pharaoh/project/id-conventions.yaml (6 prefix entries) + /work/eclipse-score/.pharaoh/project/workflows.yaml (3 states, 3 transitions) + /work/eclipse-score/.pharaoh/project/artefact-catalog.yaml (6 artefact types) + /work/eclipse-score/.pharaoh/project/checklists/requirement.md (iso26262-8.6 axes) + +Run `pharaoh-tailor-review` to validate the generated files before using them. +``` diff --git a/.github/agents/pharaoh.tailor-review.agent.md b/.github/agents/pharaoh.tailor-review.agent.md index 30a5d4e..80beb47 100644 --- a/.github/agents/pharaoh.tailor-review.agent.md +++ b/.github/agents/pharaoh.tailor-review.agent.md @@ -7,4 +7,410 @@ handoffs: [] Use when auditing .pharaoh/project/ tailoring files against JSON schemas (id-conventions, workflows, artefact-catalog, checklists frontmatter) plus cross-file consistency checks (every lifecycle state referenced in artefact-catalog exists in workflows.yaml, every prefix in artefact-catalog is declared in id-conventions, etc.). -See [`skills/pharaoh-tailor-review/SKILL.md`](../../skills/pharaoh-tailor-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. 
+--- + +## Full atomic specification + +# pharaoh-tailor-review + +## When to use + +Invoke after `pharaoh-tailor-fill` has written the four tailoring files, or whenever the files +are manually edited and need re-validation. + +This skill validates structure and cross-file consistency. It does NOT judge whether the +conventions are sensible — it checks whether they are internally coherent and well-formed. + +Do NOT invoke to author or repair tailoring files — use `pharaoh-tailor-fill` for that. + +--- + +## Inputs + +- **tailoring_dir** (from user): path to `.pharaoh/project/` containing the four tailoring + files +- **schemas_dir** (from user, optional): path to JSON schema files. Resolved in + this order: + 1. The explicit value if provided. + 2. `<tailoring_dir>/schemas/` if that directory exists (per-project overrides). + 3. The Pharaoh-shipped `schemas/` directory at the package root + (`<pharaoh_repo>/schemas/`). This is the default and ships with the four + canonical schemas (`artefact-catalog.schema.json`, `workflows.schema.json`, + `id-conventions.schema.json`, `checklists-frontmatter.schema.json`). + 4. If none of the above resolve, falls back to built-in structural rules and + emits a `degraded_mode: true` flag in the output. + +The four expected files: +- `<tailoring_dir>/id-conventions.yaml` +- `<tailoring_dir>/workflows.yaml` +- `<tailoring_dir>/artefact-catalog.yaml` +- `<tailoring_dir>/checklists/requirement.md` + +--- + +## Outputs + +A single JSON document — no prose wrapper. 
Shape: + +```json +{ + "tailoring_dir": "/path/to/.pharaoh/project", + "files_checked": 4, + "schema_violations": { + "id-conventions.yaml": [], + "workflows.yaml": [], + "artefact-catalog.yaml": [], + "checklists/requirement.md": [] + }, + "cross_file_violations": [], + "overall": "pass" +} +``` + +Each violation entry: + +```json +{ + "file": "artefact-catalog.yaml", + "rule": "prefix_declared_in_id_conventions", + "detail": "Prefix 'tc' appears in artefact-catalog.yaml but is not declared in id-conventions.yaml prefixes map", + "severity": "error" +} +``` + +`overall` values: +- `"pass"` — zero violations across all checks +- `"warnings_only"` — violations with `"severity": "warning"` only +- `"fail"` — at least one `"severity": "error"` violation + +--- + +## Process + +### Step 1: Load and parse all four files + +Read each file. If any required file is missing, record a `severity: error` violation and +continue checking the remaining files: + +```json +{ + "file": "<filename>", + "rule": "file_present", + "detail": "File not found at <path>", + "severity": "error" +} +``` + +Parse YAML files with a strict parser. If a file is syntactically invalid YAML, record: + +```json +{ + "file": "<filename>", + "rule": "yaml_syntax", + "detail": "YAML parse error: <error message>", + "severity": "error" +} +``` + +Do not attempt cross-file checks for a file that failed to parse. + +--- + +### Step 2: Schema validation per file + +Apply the following structural rules. These are built-in; external JSON schemas supplement +but do not replace them. 
+ +**id-conventions.yaml required keys:** + +| Key | Type | Rule | +|---|---|---| +| `separator` | string | Must be present | +| `id_regex` | string | Must be present; non-empty | +| `prefixes` | map | Must be present; must contain at least one entry | +| `id_regex_exceptions` | map | Optional; if present must be a map of `<prefix>: <regex>` where `<prefix>` is declared in the `prefixes` map | + +For each entry in `prefixes`, the key must be a non-empty string and the value must be +a literal identifier-prefix token (matches schema pattern `^[A-Za-z][A-Za-z0-9_]*$`); +not a human description. See +`<pharaoh_repo>/schemas/id-conventions.schema.json` for the authoritative +JSON Schema. + +**workflows.yaml required keys:** + +| Key | Type | Rule | +|---|---|---| +| `lifecycle_states` | array of strings | Must be present; at least two unique, non-empty state names | +| `transitions` | list | Must be present; may be empty | + +For each transition in `transitions`: +- Must have `from`, `to`, `requires` keys. +- `from` and `to` must be non-empty strings. +- `requires` must be a list (may be empty). + +See `<pharaoh_repo>/schemas/workflows.schema.json` for the authoritative +JSON Schema. + +**artefact-catalog.yaml required structure:** + +Top level must be a map of artefact-type keys. For each artefact type: + +| Key | Type | Rule | +|---|---|---| +| `required_fields` | list | Must be present; must include at least `id` and `status`. Entries are sphinx-needs *option* keys (`:key: value`). | +| `optional_fields` | list | Optional; may be empty. Entries are sphinx-needs *option* keys. | +| `required_body_sections` | list | Optional; entries are top-level heading names that must appear inside the directive body prose (e.g. `Inputs`, `Steps`, `Expected` for `tc`). Validated as body prose, not as `:key:` options. | +| `lifecycle` | list | Optional; if present must be non-empty | +| `required_links` | list | Optional; entries are link-relation option names (e.g. `satisfies`). 
Empty list disables the release-gate check for this type; absent key is treated as empty by `pharaoh-link-completeness-check` but flagged by C6 below. | +| `optional_links` | list | Optional; entries are link-relation option names. Must not overlap with `required_links` (enforced by C6). | +| `required_metadata_fields` | list | Optional; entries are sphinx-needs option keys. Empty list disables the release-gate check; absent key is flagged by C6. | +| `required_roles` | list | Optional; entries are role-bearing option keys (e.g. `reviewer`, `approved_by`). Empty list explicitly declares no review/approval gate; absent key is flagged by C6. | + +See `<pharaoh_repo>/schemas/artefact-catalog.schema.json` for the +authoritative JSON Schema. + +**checklists/*.md frontmatter:** + +YAML frontmatter (delimited by `---`) at the top of a checklist file is **optional**. When +present, it is validated against +`<pharaoh_repo>/schemas/checklists-frontmatter.schema.json`: + +| Key | Rule | +|---|---| +| `name` | Optional; non-empty string if present | +| `applies_to` | Optional; artefact-type key from `artefact-catalog.yaml`, or `"*"` | +| `axes` | Optional; list of one or more non-empty axis labels | + +Additional fields are allowed (`additionalProperties: true`). Do NOT error on a missing +frontmatter block or on missing individual keys; only error on fields that are *present* but +violate the type rule above. + +--- + +### Step 3: Cross-file consistency checks + +Run these checks after all four files are parsed successfully. + +**C1 — Prefix declared in id-conventions for every type in artefact-catalog** + +For every key in `artefact-catalog.yaml`, that key must also appear in +`id-conventions.yaml.prefixes`. + +Violation (error) if not: +``` +Prefix '<key>' appears in artefact-catalog.yaml but is not declared in id-conventions.yaml prefixes map. 
+``` + +**C2 — Lifecycle states in artefact-catalog exist in workflows.yaml** + +For every artefact type in `artefact-catalog.yaml` that carries a `lifecycle` list, every +state value must appear as an entry in the `workflows.yaml.lifecycle_states` array. + +Violation (error) if not: +``` +Lifecycle state '<state>' referenced in artefact-catalog.yaml (<type>.lifecycle) is not declared in workflows.yaml lifecycle_states. +``` + +**C3 — Transition states exist in lifecycle_states** + +For every transition in `workflows.yaml.transitions`, both `from` and `to` must appear as +entries in the `workflows.yaml.lifecycle_states` array. + +Violation (error) if not: +``` +Transition from='<from>' to='<to>' references state '<state>' which is not declared in workflows.yaml lifecycle_states. +``` + +**C4 — id-conventions prefix superset of artefact-catalog** + +Every prefix declared in `id-conventions.yaml.prefixes` should appear in +`artefact-catalog.yaml`. A prefix with no catalog entry is not an error but should be flagged +as a warning — the tailoring is incomplete for that type. + +Violation (warning): +``` +Prefix '<key>' declared in id-conventions.yaml but has no entry in artefact-catalog.yaml. Add a catalog entry or remove the prefix declaration. +``` + +**C5 — required_fields does not overlap optional_fields** + +For each artefact type, no field name should appear in both `required_fields` and +`optional_fields`. + +Violation (error): +``` +Field '<field>' appears in both required_fields and optional_fields for artefact type '<type>' in artefact-catalog.yaml. +``` + +**C6 — Release-gate fields declared explicitly** + +The four release-gate fields on each per-type entry of `artefact-catalog.yaml` are +consumed by `pharaoh-link-completeness-check` (`required_links`, `optional_links`), +`pharaoh-output-validate` (`required_metadata_fields`), and `pharaoh-review-completeness` +(`required_roles`); all four are aggregated by `pharaoh-quality-gate`. 
Each consumer +treats an absent key as an empty list, so a project that omits all four fields ships a +release gate that silently does nothing. C6 makes the choice explicit: + +For every artefact type entry in `artefact-catalog.yaml`, three of the four fields must +be **present as keys** (the value may be an empty array). The three keys are +`required_links`, `required_metadata_fields`, `required_roles`. `optional_links` is +not part of C6 — it is purely informational and absent-equals-empty is fine. + +Violation (warning) for each missing key: +``` +Release-gate key '<field>' is absent from artefact-catalog.yaml for type '<type>'. Empty array declares an explicit "no requirement"; missing key is treated as empty by consumers but means the project never made the choice. Add the key with an empty list `[]` if no requirement applies. +``` + +Where `<field>` is one of `required_links`, `required_metadata_fields`, `required_roles`. + +Additionally — when both `required_links` and `optional_links` are present, no link +name may appear in both lists. + +Violation (error): +``` +Link name '<link>' appears in both required_links and optional_links for artefact type '<type>' in artefact-catalog.yaml. +``` + +The missing-key part of C6 is a `severity: warning` rather than `error` so that legacy +tailoring continues to validate while the project decides; the overlap-check part is +`severity: error` because consumers cannot reconcile a link declared as both required +and optional. + +--- + +### Step 4: Compute overall and emit + +Determine `overall`: +- `"fail"` if any violation has `severity: error` +- `"warnings_only"` if all violations have `severity: warning` +- `"pass"` if `schema_violations` and `cross_file_violations` are both empty + +Emit the JSON document. No prose before or after. 
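The three-way decision in Step 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the skill's implementation: the function name `compute_overall` is hypothetical, and the input shapes are assumed to match the `schema_violations` map and `cross_file_violations` list shown in the Outputs section.

```python
def compute_overall(schema_violations, cross_file_violations):
    """Derive the `overall` value from all recorded violations.

    Assumed shapes (taken from the Outputs section):
      schema_violations     -- dict mapping file name -> list of violation dicts
      cross_file_violations -- list of violation dicts
    Each violation dict carries a "severity" of "error" or "warning".
    """
    # Flatten the per-file schema findings and the cross-file findings.
    all_violations = [v for findings in schema_violations.values() for v in findings]
    all_violations.extend(cross_file_violations)

    # Any error wins over warnings; no violations at all means pass.
    if any(v["severity"] == "error" for v in all_violations):
        return "fail"
    if all_violations:
        return "warnings_only"
    return "pass"
```

Checking for errors first keeps the precedence explicit: a single `error` yields `"fail"` no matter how many warnings accompany it.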
+ +--- + +## Schema validation + +Four JSON Schema (draft 2020-12) files ship with Pharaoh and make structural +validation deterministic: + +| Tailoring file | Schema | +|---|---| +| `id-conventions.yaml` | `<schemas_dir>/id-conventions.schema.json` | +| `workflows.yaml` | `<schemas_dir>/workflows.schema.json` | +| `artefact-catalog.yaml` | `<schemas_dir>/artefact-catalog.schema.json` | +| `checklists/*.md` (frontmatter) | `<schemas_dir>/checklists-frontmatter.schema.json` | + +`<schemas_dir>` is resolved per the order documented in Inputs above; the +default is the Pharaoh-shipped `schemas/` directory at the package root. See +`schemas/README.md` for the full resolution order and per-file responsibilities. + +Schema `$id` values are anchored under `https://pharaoh.useblocks.com/schemas/` and do not +need to resolve at runtime. + +These checks can be executed by any JSON Schema validator against the canonical +schemas in `schemas/`. Cross-file consistency rules C1–C6 are **not** expressible in +JSON Schema and remain implemented in Step 3 of this skill. + +--- + +## Guardrails + +**G1 — All four files missing** + +If none of the four files are found, the tailoring_dir may be wrong. FAIL before attempting +any checks: + +``` +FAIL: No tailoring files found at <tailoring_dir>. +Expected: id-conventions.yaml, workflows.yaml, artefact-catalog.yaml, checklists/requirement.md. +Verify the path or run pharaoh-tailor-fill first. +``` + +**G2 — Partial file set** + +If some but not all files exist, proceed with the available files and record file_present +errors for the missing ones. Do not FAIL outright. + +**G3 — Malformed JSON output** + +If the emitted JSON is malformed, self-correct once. If still malformed, emit raw findings as +free text with a `[DIAGNOSTIC]` prefix. 
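Because the cross-file rules are not expressible in JSON Schema, they have to be coded by hand. The sketch below shows how C1, C4, and C5 from Step 3 reduce to set comparisons over the parsed YAML. It assumes both files are already loaded into plain dicts. The function name and the rule identifiers for C4 and C5 (`prefix_has_catalog_entry`, `required_optional_disjoint`) are hypothetical; only `prefix_declared_in_id_conventions` appears in the specification above.

```python
def check_prefixes_and_fields(id_conventions, catalog):
    """Sketch of cross-file checks C1, C4 and C5 over already-parsed YAML dicts."""
    violations = []
    declared = set(id_conventions.get("prefixes") or {})
    catalogued = set(catalog)

    # C1: every catalog type must be declared as a prefix (error).
    for key in sorted(catalogued - declared):
        violations.append({
            "file": "artefact-catalog.yaml",
            "rule": "prefix_declared_in_id_conventions",
            "detail": f"Prefix '{key}' appears in artefact-catalog.yaml but is not "
                      "declared in id-conventions.yaml prefixes map",
            "severity": "error",
        })

    # C4: a declared prefix without a catalog entry is only a warning.
    for key in sorted(declared - catalogued):
        violations.append({
            "file": "id-conventions.yaml",
            "rule": "prefix_has_catalog_entry",  # hypothetical rule id
            "detail": f"Prefix '{key}' declared in id-conventions.yaml but has no "
                      "entry in artefact-catalog.yaml",
            "severity": "warning",
        })

    # C5: required_fields and optional_fields must be disjoint per type (error).
    for type_name, entry in catalog.items():
        required = set(entry.get("required_fields") or [])
        optional = set(entry.get("optional_fields") or [])
        for field in sorted(required & optional):
            violations.append({
                "file": "artefact-catalog.yaml",
                "rule": "required_optional_disjoint",  # hypothetical rule id
                "detail": f"Field '{field}' appears in both required_fields and "
                          f"optional_fields for artefact type '{type_name}'",
                "severity": "error",
            })
    return violations
```

C2, C3, and C6 follow the same pattern: collect the declared names into a set, then report every reference or key that falls outside it.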
+
+---
+
+## Advisory chain
+
+If `overall` is `"fail"` or `"warnings_only"`, append after the JSON:
+
+```
+Run `pharaoh-tailor-fill` with `overwrite_ok: true` to regenerate the affected files,
+or edit them manually and re-run pharaoh-tailor-review.
+```
+
+---
+
+## Worked example
+
+**User input:** `tailoring_dir = /work/eclipse-score/.pharaoh/project/`
+
+All four files present and well-formed. Cross-file check results:
+- C1: all artefact-catalog types (`gd_req`, `gd_chklst`, `std_req`, `tc`, `wf`, `wp`) are
+  declared in id-conventions. Pass.
+- C2: lifecycle `[draft, valid, inspected]` on `gd_req` and `tc`; all three states are
+  entries in the `workflows.yaml` `lifecycle_states` array. Pass.
+- C3: all transitions reference only `draft`, `valid`, `inspected`. Pass.
+- C4: all six prefixes in id-conventions also appear in artefact-catalog. No orphan prefixes.
+  Pass.
+- C5: no field appears in both required and optional for any type. Pass.
+- C6: every artefact type declares `required_links`, `required_metadata_fields`, and
+  `required_roles` keys (empty arrays permitted). Pass.
+
+**Output:**
+
+```json
+{
+  "tailoring_dir": "/work/eclipse-score/.pharaoh/project",
+  "files_checked": 4,
+  "schema_violations": {
+    "id-conventions.yaml": [],
+    "workflows.yaml": [],
+    "artefact-catalog.yaml": [],
+    "checklists/requirement.md": []
+  },
+  "cross_file_violations": [],
+  "overall": "pass"
+}
+```
+
+**Variant — one cross-file error:**
+
+`artefact-catalog.yaml` contains an `arch` type entry that is not declared in
+`id-conventions.yaml` prefixes (the fill step missed it).
+ +```json +{ + "tailoring_dir": "/work/eclipse-score/.pharaoh/project", + "files_checked": 4, + "schema_violations": { + "id-conventions.yaml": [], + "workflows.yaml": [], + "artefact-catalog.yaml": [], + "checklists/requirement.md": [] + }, + "cross_file_violations": [ + { + "file": "artefact-catalog.yaml", + "rule": "prefix_declared_in_id_conventions", + "detail": "Prefix 'arch' appears in artefact-catalog.yaml but is not declared in id-conventions.yaml prefixes map", + "severity": "error" + } + ], + "overall": "fail" +} +``` + +``` +Run `pharaoh-tailor-fill` with `overwrite_ok: true` to regenerate the affected files, +or edit them manually and re-run pharaoh-tailor-review. +``` diff --git a/.github/agents/pharaoh.toctree-emit.agent.md b/.github/agents/pharaoh.toctree-emit.agent.md index 1532bdb..603dd4a 100644 --- a/.github/agents/pharaoh.toctree-emit.agent.md +++ b/.github/agents/pharaoh.toctree-emit.agent.md @@ -7,4 +7,114 @@ handoffs: [] Use when a composition skill has just emitted a set of RST files into a directory and needs to add (or regenerate) an `index.rst` with a Sphinx toctree over them. Prevents orphan-file warnings under `sphinx-build -W`. Does NOT modify the emitted RST files. Does NOT wire the emitted directory into any parent toctree — that is a caller concern. -See [`skills/pharaoh-toctree-emit/SKILL.md`](../../skills/pharaoh-toctree-emit/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-toctree-emit + +## When to use + +Invoke at the end of a plan (emitted by `pharaoh-write-plan`, executed by `pharaoh-execute-plan`) that writes N RST files into a directory (e.g. `docs/source/features/`). Without an `index.rst` with a matching toctree, Sphinx treats every generated file as orphan and `sphinx-build -W` fails. This skill writes that index, and only that index. + +Do NOT use to wire the emitted directory into its parent (e.g. 
updating the project-root `index.rst` to reference `features/index`). That is a separate caller concern — changes to the parent toctree are outside this skill's scope. + +Do NOT use to patch an existing `index.rst` with different content — if an existing index disagrees with what this skill would write, the skill warns and does not overwrite. Caller decides whether to delete the existing file or merge manually. + +## Atomicity + +- (a) Indivisible — one directory + one glob + one caption in → one `index.rst` file written (or no-op if already matches). No modifications to other RST files. No parent-toctree edits. +- (b) Input: `{target_dir: str, file_glob: str, parent_caption: str, maxdepth?: int, exclude?: list[str]}`. Output: JSON `{toctree_path: str, files_included: list[str], files_modified: bool}`. +- (c) Reward: fixture `pharaoh-validation/fixtures/pharaoh-toctree-emit/input_dir/` containing three files (`csv_export.rst`, `jama_pull.rst`, `reqif_import.rst`). With `target_dir` = that path, `file_glob` = `"*.rst"`, `parent_caption` = `"Features"`, `maxdepth` = `1` → skill writes `<target_dir>/index.rst` whose content exactly matches the `expected_index.rst` fixture (same alphabetical ordering of stems, same caption, same maxdepth, same blank lines). +- (d) Reusable for any sphinx-needs project with dynamically emitted RST sets (features, modules, decisions, releases). +- (e) Composable: one directory per call. A plan emitted by `pharaoh-write-plan` may include N toctree-emit tasks (one per target_dir) dispatched by `pharaoh-execute-plan`, but this skill itself handles exactly one. + +## Input + +- `target_dir`: absolute path to the directory whose RST files should be indexed. +- `file_glob`: glob pattern applied within `target_dir` (e.g. `"*.rst"`, `"*.md"`). Non-recursive — toctrees are one level deep by convention. +- `parent_caption`: human-readable heading shown above the toctree (`Features`, `Modules`, `Decisions`, etc.). 
+- `maxdepth` (optional): `:maxdepth:` option for the toctree. Default `1`.
+- `exclude` (optional): list of filename globs to exclude from the toctree. Default `["index.rst"]` (never self-reference).
+
+## Output
+
+```json
+{
+  "toctree_path": "/abs/path/to/target_dir/index.rst",
+  "files_included": ["csv_export", "jama_pull", "reqif_import"],
+  "files_modified": true
+}
+```
+
+`files_included` contains stems (filename without `.rst` extension), in the order they appear in the emitted toctree (alphabetical).
+
+`files_modified`:
+- `true` if the skill wrote a new `index.rst`.
+- `false` if `index.rst` already existed with content matching what this skill would have written — no-op.
+- `false` PLUS a warning in the return if `index.rst` exists with different content — skill did not overwrite; caller must handle merge.
+
+## Process
+
+### Step 1: Enumerate files
+
+Glob `target_dir/<file_glob>` non-recursively. Subtract `exclude` matches. Sort alphabetically by filename. Strip file extensions to produce toctree entries (Sphinx toctree entries omit `.rst`).
+
+If zero files remain, FAIL: `"no files matched <file_glob> under <target_dir>; nothing to index"`.
+
+### Step 2: Build toctree content
+
+Emit content in this exact shape:
+
+```rst
+<parent_caption>
+<underline of = characters, exact length of parent_caption>
+
+.. toctree::
+   :maxdepth: <maxdepth>
+
+   <stem_1>
+   <stem_2>
+   <stem_3>
+```
+
+- Single blank line between the caption underline and `.. toctree::`.
+- Single blank line between `:maxdepth:` line and the first stem.
+- Stems indented with 3 spaces (Sphinx toctree convention).
+- No trailing blank lines after the last stem.
+
+### Step 3: Check existing index.rst
+
+If `target_dir/index.rst` does not exist → write the new content, return `files_modified=true`.
+ +If it exists and its content (after normalizing line endings and trailing whitespace) exactly equals what would be written → no-op, return `files_modified=false`, `warnings=[]`. + +If it exists with different content → no-op, return `files_modified=false`, warnings include `"index.rst exists with different content — not overwriting; delete manually or merge to regenerate"`. + +### Step 4: Return + +Return the JSON shape per Output section. + +## Last step + +No dedicated `*-review` atom exists for toctree emission; the operation is structural (write one `index.rst` listing N files) and its correctness is checked mechanically rather than via a prose-judgement review atom. This skill therefore performs an inline self-verification in Step 4 before returning: + +1. Every entry in the emitted `toctree` block resolves to an existing file under `target_dir` (no dangling references). +2. The emitted `index.rst` contains exactly one `.. toctree::` directive (no accidental duplication). +3. No entry appears twice in the toctree body. + +If any check fails, do not write `index.rst`; return `status: "failed"` with evidence. + +Coverage is mechanically enforced at plan level by `pharaoh-quality-gate`'s orphan / link-completeness invariants plus `sphinx-build -W` itself (which fails on orphan RST files). See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale. + +## Failure modes + +- `target_dir` does not exist → FAIL. +- `target_dir` is not a directory → FAIL. +- Zero files matched glob (after exclude) → FAIL per Step 1. + +## Non-goals + +- No parent-toctree updates. The caller (human or a future composition skill) wires `<target_dir>/index.rst` into the project root's `index.rst` manually. +- No inter-file toctree. This skill assumes a flat glob over one directory — nested toctrees require separate invocations. +- No .md support beyond passing `*.md` as `file_glob`. 
Sphinx's myst_parser must already be configured in the project for Markdown files to resolve. diff --git a/.github/agents/pharaoh.trace.agent.md b/.github/agents/pharaoh.trace.agent.md index 095c023..4887a71 100644 --- a/.github/agents/pharaoh.trace.agent.md +++ b/.github/agents/pharaoh.trace.agent.md @@ -105,3 +105,416 @@ If multiple project roots exist, search all indexes when a link target is not fo 6. This agent is read-only. No session state changes, no file modifications. 7. No workflow gates. Runs freely in any mode. 8. For large projects (>500 needs), suggest `--depth` or `--type` filters. + +--- + +## Full atomic specification + +# pharaoh-trace + +Navigate traceability in any direction from any need in a sphinx-needs project. +Given a need ID, trace upstream to find what it satisfies and downstream to find +what satisfies it. Follow all link types: standard `links`, extra_links +(`implements`, `tests`, and any project-specific types), and sphinx-codelinks +(code-to-requirement references). Present results as a clear traceability tree +with link type labels, statuses, and gap highlights. + +--- + +## Process + +### Step 1: Get project data + +Follow the instructions in [`skills/shared/data-access.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/data-access.md) to: + +1. Detect the project structure (find `ubproject.toml`, `conf.py`, source directories). +2. Read the project configuration (need types, extra_links, ID settings). +3. Determine the data access tier (ubc CLI > ubCode MCP > raw file parsing). +4. Build the needs index with all needs and their attributes. +5. Build the link graph with all relationships in both directions. +6. Detect sphinx-codelinks configuration. +7. Read `pharaoh.toml` for traceability requirements (`required_links`). 
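Items 4 and 5 of the list above (needs index and bidirectional link graph) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Pharaoh's actual implementation: it assumes the needs index is a plain dict keyed by need ID, that every link field holds a list of target IDs, and the helper name `build_link_graph` is hypothetical.

```python
from collections import defaultdict


def build_link_graph(needs_index, link_types):
    """Return (outgoing, incoming) adjacency maps for the given link types.

    Hypothetical helper: `needs_index` maps need ID -> dict of fields, and each
    link field (e.g. "links", "implements") holds a list of target IDs.
    """
    outgoing = defaultdict(list)  # source ID -> [(link_type, target ID)]
    incoming = defaultdict(list)  # target ID -> [(link_type, source ID)]
    for need_id, need in needs_index.items():
        for link_type in link_types:
            for target_id in need.get(link_type, []):
                outgoing[need_id].append((link_type, target_id))
                incoming[target_id].append((link_type, need_id))
    return dict(outgoing), dict(incoming)


# Illustrative data only; IDs and titles are borrowed from the examples in this spec.
needs_index = {
    "REQ_001": {"title": "Brake response time"},
    "SPEC_001": {"title": "Brake pedal sensor interface", "links": ["REQ_001"]},
    "IMPL_001": {"title": "Brake pedal driver", "implements": ["SPEC_001"]},
}
outgoing, incoming = build_link_graph(needs_index, ["links", "implements"])
print(incoming["SPEC_001"])
```

Keeping both maps means a trace in either direction is a dictionary lookup rather than a scan over every need.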
+ +After completing data access, present the detection summary before proceeding: + +``` +Project: <name> (<config source>) +Types: <list of directive names> +Links: <list of link type names> +Data source: <tier used> +Needs found: <count> +Codelinks: <enabled|not configured> +``` + +If detection fails (no project found, no needs in source files), report the issue +and ask the user for guidance. Do not proceed with empty data. + +### Step 2: Identify the target need + +The user provides either a need ID or a description. + +**If the user provides a need ID** (e.g., `REQ_001`, `SPEC_002`): + +1. Look up the ID in the needs index. +2. If found, confirm the target silently and proceed. +3. If not found, report the error: + ``` + Need ID "REQ_999" not found in the project. + Available needs with prefix REQ_: REQ_001, REQ_002, REQ_003 + ``` + Ask the user to provide a valid ID. + +**If the user provides a description** (e.g., "brake response time"): + +1. Search the needs index by title and content for matching terms. +2. If exactly one match is found, confirm it with the user: + ``` + Found: REQ_001 (Requirement: Brake response time) [open] + Proceed with tracing this need? + ``` +3. If multiple matches are found, list them and ask the user to choose: + ``` + Multiple matches found: + 1. REQ_001 (Requirement: Brake response time) [open] + 2. SPEC_001 (Specification: Brake pedal sensor interface) [open] + Which need should I trace? Enter the number or ID. + ``` +4. If no matches are found, report it and suggest the user check the ID or description. + +**Parse optional flags from the user's input:** + +- `--upstream` or `--up`: Trace only upstream (parents, what this need satisfies). +- `--downstream` or `--down`: Trace only downstream (children, what satisfies this need). +- `--depth N`: Limit recursion to N levels from the target. Default: unlimited. +- `--type <type>`: Filter the tree to show only needs of a specific type (e.g., `--type req` or `--type test`). 
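The flags listed above could be parsed along these lines. This is a hedged sketch using Python's standard `argparse`; the function name `parse_trace_request` and the attribute `need_type` are assumptions made for illustration, and an agent may equally well interpret the flags directly from the user's message.

```python
import argparse
import shlex


def parse_trace_request(user_input):
    """Split a trace request into its target (ID or description) and flags."""
    parser = argparse.ArgumentParser(prog="trace", add_help=False)
    parser.add_argument("target", nargs="*")
    parser.add_argument("--upstream", "--up", action="store_true")
    parser.add_argument("--downstream", "--down", action="store_true")
    parser.add_argument("--depth", type=int, default=None)  # None means unlimited
    parser.add_argument("--type", dest="need_type", default=None)
    # Intermixed parsing lets flags appear before or after the target words.
    # argparse exits with an error message on unknown flags or a non-integer depth.
    args = parser.parse_intermixed_args(shlex.split(user_input))
    # With no direction flag, trace both directions (full trace).
    if not args.upstream and not args.downstream:
        args.upstream = args.downstream = True
    args.target = " ".join(args.target)
    return args


request = parse_trace_request("REQ_001 --up --depth 2")
print(request.target, request.upstream, request.downstream, request.depth)
```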
+ +If no direction flag is given, trace both upstream and downstream (full trace). + +### Step 3: Trace upstream (what does this need satisfy?) + +Starting from the target need, follow all incoming links to find parent needs. +"Upstream" means moving toward higher-level, more abstract needs (e.g., from +a test case up to an implementation, specification, and requirement). + +**Algorithm:** + +``` +function trace_upstream(need_id, visited, depth, max_depth): + if need_id in visited: + return CYCLE_DETECTED + if max_depth is set and depth > max_depth: + return DEPTH_LIMIT + visited.add(need_id) + + parents = [] + # 1. Check standard links (incoming) + for other_need in needs_index: + if need_id in other_need.links: + parents.append({id: other_need.id, link_type: "links"}) + + # 2. Check extra_links (incoming direction) + for link_type in project_extra_links: + for other_need in needs_index: + if need_id in other_need[link_type]: + parents.append({id: other_need.id, link_type: link_type}) + + # 3. Recurse for each parent + for parent in parents: + parent.children = trace_upstream(parent.id, visited, depth + 1, max_depth) + + return parents +``` + +**Key rules:** + +- Follow ALL configured link types, not just the defaults. Read the project's + `extra_links` configuration to know every valid link type name. +- For extra_links, the "incoming" direction means: find needs that have + `:<link_type>: <target_id>` in their options. For example, if IMPL_001 has + `:implements: SPEC_001`, then tracing upstream from IMPL_001 finds SPEC_001 + via the `implements` link type. +- For standard `links`, the direction is symmetric. If A has `:links: B`, then + B also links to A. When tracing upstream, consider both sides. +- Track visited need IDs to detect and break cycles (see Step 3/4 cycle handling). +- Stop at the top level: needs with no incoming links (typically user stories + or top-level requirements). + +### Step 4: Trace downstream (what satisfies this need?) 
+ +Starting from the target need, follow all outgoing links to find child needs. +"Downstream" means moving toward more concrete, detailed needs (e.g., from +a requirement down to specifications, implementations, and test cases). + +**Algorithm:** + +``` +function trace_downstream(need_id, visited, depth, max_depth): + if need_id in visited: + return CYCLE_DETECTED + if max_depth is set and depth > max_depth: + return DEPTH_LIMIT + visited.add(need_id) + + children = [] + need = needs_index[need_id] + + # 1. Check standard links (outgoing) + for target_id in need.links: + children.append({id: target_id, link_type: "links"}) + + # 2. Check extra_links (outgoing direction) + for link_type in project_extra_links: + for target_id in need[link_type]: + children.append({id: target_id, link_type: link_type}) + + # 3. Check codelinks (if enabled) + if codelinks_enabled: + code_refs = find_codelinks_for(need_id) + for ref in code_refs: + children.append({id: ref.file + ":" + ref.symbol, link_type: "codelink"}) + + # 4. Recurse for each child (except codelinks, which are leaf nodes) + for child in children: + if child.link_type != "codelink": + child.children = trace_downstream(child.id, visited, depth + 1, max_depth) + + return children +``` + +**Key rules:** + +- For extra_links, the "outgoing" direction means: the target need has the link + option set. For example, if IMPL_001 has `:implements: SPEC_001`, then tracing + downstream from SPEC_001 finds IMPL_001 via the `implements` link (because + IMPL_001 "implements" SPEC_001, so SPEC_001 is implemented by IMPL_001). +- Include sphinx-codelinks if enabled. Search code files for codelink annotations + referencing the current need ID (e.g., `# codelink: REQ_001` or patterns + defined in the codelinks configuration). Codelinks are always leaf nodes. +- Stop at leaf nodes: needs with no outgoing links (typically test cases) and + code artifacts. 
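The downstream algorithm above can be sketched in runnable Python. The sketch is illustrative only: it assumes a precomputed adjacency map `children_of` (need ID to a list of `(link_type, child ID)` pairs, with codelinks already merged in), and the sentinel strings and node layout are hypothetical choices rather than part of Pharaoh.

```python
CYCLE = "CYCLE_DETECTED"
DEPTH_LIMIT = "DEPTH_LIMIT"


def trace_downstream(need_id, children_of, visited=None, depth=0, max_depth=None):
    """Return the nested downstream trace of `need_id`.

    `children_of` is an assumed adjacency map: need ID -> [(link_type, child ID)].
    """
    if visited is None:
        visited = set()
    if need_id in visited:
        return CYCLE
    if max_depth is not None and depth > max_depth:
        return DEPTH_LIMIT
    visited.add(need_id)

    result = []
    for link_type, child_id in children_of.get(need_id, []):
        node = {"id": child_id, "link_type": link_type, "children": []}
        # Codelinks are leaf nodes, so they are never expanded further.
        if link_type != "codelink":
            node["children"] = trace_downstream(
                child_id, children_of, visited, depth + 1, max_depth
            )
        result.append(node)
    return result


# Illustrative graph with a deliberate cycle back to REQ_001.
children_of = {
    "REQ_001": [("links", "SPEC_001")],
    "SPEC_001": [("implements", "IMPL_001")],
    "IMPL_001": [
        ("codelink", "src/brake_controller.c:brake_check()"),
        ("links", "REQ_001"),
    ],
}
tree = trace_downstream("REQ_001", children_of)
shallow = trace_downstream("REQ_001", children_of, max_depth=0)
```

As in the pseudocode, the `visited` set is shared across the whole traversal, so a need reachable by two different paths is reported as already visited the second time it is reached, even when no true cycle exists.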
+ +**Finding codelinks:** + +If sphinx-codelinks is configured, search for code references: + +1. Use Grep to search source code directories for the need ID in codelink patterns: + ``` + Grep pattern: codelink.*<need_id>|@codelink\(.*<need_id>.*\) + ``` +2. Also check the codelinks configuration for custom patterns. +3. Each match becomes a leaf node in the downstream tree labeled `[codelink]`. + +### Step 5: Handle cycles + +If the traversal revisits a need ID already in the `visited` set, a cycle exists. + +1. Do not recurse further into the cycle. +2. Mark the node in the output tree: + ``` + REQ_001 (Requirement: Brake response time) [open] + └── SPEC_001 (Specification: ...) [open] --implements--> + └── REQ_001 [CYCLE - already visited] + ``` +3. After presenting the tree, add a warning: + ``` + Warning: Circular link detected: SPEC_001 -> REQ_001 -> SPEC_001 + Circular links may indicate a modeling error. Review the link structure. + ``` + +### Step 6: Present the traceability tree + +Combine the upstream and downstream traces into a single visual tree, with the +target need as the root. 
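One possible way to render a nested trace with box-drawing characters is sketched below. It is a simplified illustration: the node layout (`label`, `link_type`, `children`) and the function name `render_tree` are assumptions, and codelink, cycle, and depth-limit markers are left out for brevity.

```python
def render_tree(nodes, prefix=""):
    """Render nested trace nodes as text lines using box-drawing characters."""
    lines = []
    for index, node in enumerate(nodes):
        last = index == len(nodes) - 1
        branch = "└── " if last else "├── "
        lines.append(f"{prefix}{branch}{node['label']} --{node['link_type']}-->")
        # Keep the vertical rule under non-final siblings; each level adds 4 characters.
        child_prefix = prefix + ("    " if last else "│   ")
        lines.extend(render_tree(node.get("children", []), child_prefix))
    return lines


# Illustrative trace; labels are shortened versions of the examples in this spec.
trace = [
    {
        "label": "SPEC_001 [open]",
        "link_type": "links",
        "children": [
            {"label": "IMPL_001 [open]", "link_type": "implements", "children": []}
        ],
    },
    {"label": "REQ_002 [open]", "link_type": "links", "children": []},
]
lines = render_tree(trace)
print("REQ_001 [open]")
print("\n".join(lines))
```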
+ +**Full trace output format (both directions):** + +``` +=== Traceability: REQ_001 (Requirement: Brake response time) [open] === + +--- Upstream (satisfies) --- +(no upstream links - REQ_001 is a top-level requirement) + +--- Downstream (satisfied by) --- +REQ_001 (Requirement: Brake response time) [open] +├── SPEC_001 (Specification: Brake pedal sensor interface) [open] --links--> +│ ├── IMPL_001 (Implementation: Brake pedal driver) [open] --implements--> +│ │ ├── TEST_001 (Test Case: Brake response time test) [open] --tests--> +│ │ └── src/brake_controller.c:brake_check() [codelink] +│ └── [GAP] No test case directly linked to SPEC_001 (expected: spec -> impl -> test) +└── REQ_002 (Requirement: Brake force distribution) [open] --links--> + └── SPEC_002 (Specification: Force distribution algorithm) [open] --links--> + └── IMPL_002 (Implementation: EBD module) [open] --implements--> + └── TEST_002 (Test Case: Force distribution test) [open] --tests--> +``` + +**Upstream-only output format** (when `--upstream` is used): + +``` +=== Upstream trace from IMPL_001 (Implementation: Brake pedal driver) [open] === + +IMPL_001 (Implementation: Brake pedal driver) [open] +└── SPEC_001 (Specification: Brake pedal sensor interface) [open] --implements--> + └── REQ_001 (Requirement: Brake response time) [open] --links--> + (top-level requirement - no further upstream links) +``` + +**Downstream-only output format** (when `--downstream` is used): + +``` +=== Downstream trace from REQ_001 (Requirement: Brake response time) [open] === + +REQ_001 (Requirement: Brake response time) [open] +├── SPEC_001 (Specification: Brake pedal sensor interface) [open] --links--> +│ └── IMPL_001 (Implementation: Brake pedal driver) [open] --implements--> +│ └── TEST_001 (Test Case: Brake response time test) [open] --tests--> +└── REQ_002 (Requirement: Brake force distribution) [open] --links--> + └── ... 
+``` + +**Tree formatting rules:** + +- Each node shows: `ID (Type: Title) [status]` +- After each node, show the link type label that connects it to its parent: `--<link_type>-->` +- Use box-drawing characters for the tree structure: `├──`, `└──`, `│` +- Indent each level by 4 characters. +- For codelinks, show: `file:symbol [codelink]` +- For cycle nodes, show: `ID [CYCLE - already visited]` +- For depth-limited nodes, show: `... (depth limit reached)` + +### Step 7: Highlight gaps + +After building the tree, check for traceability gaps using the `required_links` +from `pharaoh.toml` (if configured). + +**Gap detection logic:** + +For each `required_links` entry (e.g., `"req -> spec"`): + +1. Find all needs of the source type (e.g., all `req` needs). +2. For each source need, check if it has at least one downstream link to a need + of the target type (e.g., at least one `spec`). +3. If no link exists, this is a gap. + +Only report gaps that are visible in the current trace (needs that appear in the +tree but are missing expected children). Do not report gaps for needs outside the +trace scope. + +**Gap display:** + +Insert gap markers directly into the tree where expected children are missing: + +``` +REQ_003 (Requirement: Emergency brake activation) [open] +└── SPEC_003 (Specification: Emergency detection logic) [open] --links--> + └── [GAP] No implementation linked (expected: spec -> impl) +``` + +After the tree, provide a gap summary: + +``` +Traceability gaps found: 2 +- SPEC_003: Missing implementation (required: spec -> impl) +- REQ_003: No test coverage through full chain (req -> spec -> impl -> test) + +Tip: Run the appropriate authoring skill (e.g. pharaoh:req-draft) to create the missing needs, or pharaoh:mece for a full gap analysis. +``` + +If no `required_links` are configured in `pharaoh.toml`, skip gap detection and +do not show gap markers. Mention this: + +``` +Note: No required_links configured in pharaoh.toml. Gap detection is disabled. 
+Configure [pharaoh.traceability] required_links to enable gap highlighting. +``` + +### Step 8: Filtered views + +Apply filters based on the user's flags before presenting the tree. + +**Type filter** (`--type <type>`): + +When `--type` is specified, prune the tree to show only the path from the target +need to needs of the specified type. Intermediate needs on the path are shown +but visually de-emphasized: + +``` +=== Trace REQ_001 -> test (filtered) === + +REQ_001 (Requirement: Brake response time) [open] + via SPEC_001 -> IMPL_001 + └── TEST_001 (Test Case: Brake response time test) [open] + via SPEC_002 -> IMPL_002 + └── TEST_002 (Test Case: Force distribution test) [open] +``` + +**Depth limit** (`--depth N`): + +When `--depth` is specified, stop recursion at N levels. Show a truncation +marker at the depth boundary: + +``` +REQ_001 (Requirement: Brake response time) [open] +├── SPEC_001 (Specification: Brake pedal sensor interface) [open] --links--> +│ └── ... (depth limit: 1) +└── REQ_002 (Requirement: Brake force distribution) [open] --links--> + └── ... (depth limit: 1) +``` + +### Step 9: Multi-project support + +In multi-project setups (multiple `ubproject.toml` files detected): + +1. Build a needs index for each project separately. +2. When tracing, if a link target ID is not found in the current project, search + other project indexes. +3. When a cross-project link is found, annotate it in the tree: + ``` + SPEC_EXT_001 (Specification: External interface) [open] [project: sensor-system] + ``` +4. If a linked ID is not found in any project, mark it: + ``` + SPEC_UNKNOWN (not found in any project) + ``` + +Also check for external needs imported via `needs_external_needs` or +`needs_from_toml` external references. These are needs defined outside the +current source tree but imported into the build. + +--- + +## Constraints + +1. **Handle circular links gracefully.** Always use a visited set during traversal. + Never enter infinite recursion. 
Report cycles clearly to the user. + +2. **Support all configured link types.** Read the project's `extra_links` + configuration and follow every defined link type. Do not hardcode link type + names like `implements` or `tests`. The project may define custom link types + such as `satisfies`, `derives_from`, `blocks`, `validates`, or any other name. + +3. **Always show link type labels.** Every edge in the tree must show how the + two needs are connected (e.g., `--implements-->`, `--tests-->`, `--links-->`). + This is critical for understanding the nature of each relationship. + +4. **Show status on every node.** The status value (e.g., `open`, `in_progress`, + `closed`, `approved`) helps the user understand the maturity of the trace chain. + +5. **Work with incomplete data.** If a linked ID does not exist in the needs index, + show it as a broken reference rather than silently dropping it: + ``` + IMPL_999 [BROKEN LINK - need not found] + ``` + +6. **No session state changes.** This skill is read-only. It does not modify + `.pharaoh/session.json` or any project files. It only reads and presents data. + +7. **No workflow gates.** As noted in [`skills/shared/strictness.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/strictness.md), `pharaoh:trace` + has no prerequisites and executes freely in both advisory and enforcing modes. + +8. **Performance on large projects.** If the needs index contains more than 500 + needs, warn the user that a full bidirectional trace may produce a large tree + and suggest using `--depth` or `--type` filters: + ``` + Note: Project contains 1,247 needs. Consider using --depth or --type to limit output. + Proceeding with full trace... 
+ ``` diff --git a/.github/agents/pharaoh.use-case-diagram-draft.agent.md b/.github/agents/pharaoh.use-case-diagram-draft.agent.md index a32fd11..b206bd3 100644 --- a/.github/agents/pharaoh.use-case-diagram-draft.agent.md +++ b/.github/agents/pharaoh.use-case-diagram-draft.agent.md @@ -7,4 +7,120 @@ handoffs: [pharaoh.diagram-review] Use when drafting one use-case diagram for a single feat — actors (primary, secondary, external systems), use cases (one per user-facing capability), and system boundary. Renderer-aware (mermaid or plantuml per `.pharaoh/project/diagram-conventions.yaml`). First concrete `*-diagram-draft` skill — others follow the same shape. -See [`skills/pharaoh-use-case-diagram-draft/SKILL.md`](../../skills/pharaoh-use-case-diagram-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-use-case-diagram-draft + +## When to use + +Invoke when a feat (user-facing capability) needs a use-case view. Typical caller: a plan emitted by `pharaoh-write-plan` that selects this skill via the view_map (see `shared/diagram-view-selection.md`). + +Do NOT use for component decomposition — that's `pharaoh-component-diagram-draft`. Do NOT use for interaction flow — that's `pharaoh-sequence-diagram-draft`. One skill per diagram kind — see the atomic-skill criteria. + +## Atomicity + +- (a) One feat context + one actor list + one renderer in → one use-case diagram block out. +- (b) Input: `{feat_id: str, actors: list[{name, role, kind}], use_cases: list[str], external_systems: list[str], renderer: "mermaid" | "plantuml", tailoring_path: str}`. Output: JSON `{diagram_block: <rst_directive_str>, element_count: int}`. +- (c) Reward: fixtures in `pharaoh-validation/fixtures/pharaoh-use-case-diagram-draft/` — canonical inputs + expected outputs for both renderers. Parser gate: emitted block passes `mmdc` (mermaid) or `plantuml -checkonly` (plantuml). 
Required-elements gate: ≥1 actor, 1 system boundary, ≥1 use case. Element count ≤ `element_count_max` from tailoring. +- (d) Reusable for every feat that needs a use-case view. +- (e) Emits only the directive block. Does not touch conf.py, does not mutate tailoring, does not dispatch other skills (except the mandated review as last step). + +## Input + +- `feat_id`: the need_id of the parent feat. Used as the diagram's `:caption:` hook for `trace_to_parent`. +- `actors`: list of actor specs. Each: + - `name`: short label (e.g. "User", "CSV file", "Jama REST API"). + - `role`: "primary" or "secondary". + - `kind`: "human" or "system". +- `use_cases`: list of user-facing capability strings. Each becomes a use case node inside the system boundary. Derived from the feat's body (shall-clauses) or supplied by the caller. +- `external_systems`: list of labels for external participants shown outside the system boundary (databases, third-party APIs, file formats). +- `renderer`: "mermaid" or "plantuml". If unspecified, read from `tailoring_path/diagram-conventions.yaml > renderer`; if that is also unspecified, default to "mermaid". +- `tailoring_path`: absolute path to `.pharaoh/project/`. Reads `diagram-conventions.yaml` for renderer + `element_count_max` + `stereotype_aliases`. + +## Output + +JSON document, no prose wrapper: + +```json +{ + "diagram_block": ".. mermaid::\n :caption: FEAT_example — use case diagram\n\n flowchart TB\n ...", + "element_count": 5, + "renderer": "mermaid" +} +``` + +## How to emit — mermaid + +Mermaid does not have a first-class use-case diagram type. Use `flowchart TB` with stereotype-labelled nodes. 
Shape template: + +```mermaid +flowchart TB + %% Actors + actor_user(("User")) + actor_jama[("Jama REST API")] + + %% System boundary + subgraph SYS["<<system>> <project>"] + uc1["Fetch Jama items"] + uc2["Convert to Sphinx-Needs"] + uc3["Export needs.json"] + end + + actor_user --> uc1 + uc1 --> actor_jama + uc1 --> uc2 + uc2 --> uc3 +``` + +Conventions: +- `(( ))` shape = human actor. +- `[( )]` cylinder shape = external data/system actor. +- `subgraph` with a `<<system>>` prefix in its label = system boundary. +- Arrows connect actors to use cases (primary: actor → uc; secondary: uc → external). + +## How to emit — plantuml + +PlantUML has a first-class use-case syntax. Use it: + +```plantuml +@startuml +left to right direction +actor "User" as user +actor "Jama REST API" as jama <<external>> + +rectangle "<project>" { + usecase "Fetch Jama items" as uc1 + usecase "Convert to Sphinx-Needs" as uc2 + usecase "Export needs.json" as uc3 +} + +user --> uc1 +uc1 --> jama +uc1 --> uc2 +uc2 --> uc3 +@enduml +``` + +Conventions: +- `actor "..." as <alias>` for human actors. +- `actor "..." as <alias> <<external>>` for system actors. +- `rectangle "SystemName" { ... }` for the system boundary. +- `usecase "..." as <alias>` inside the rectangle. + +## Safe labels + +Every emitted label obeys [`shared/diagram-safe-labels.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/diagram-safe-labels.md) — no `;`, no `|`, no unescaped `"` in labels, etc. The draft output runs through `pharaoh-diagram-lint` before success. + +## Relationship semantics + +Use-case diagrams use association arrows (`-->`) from actor to use case, and include / extend between use cases if the scope requires. See [`shared/uml-relationship-semantics.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/uml-relationship-semantics.md) for the full decision matrix if include / extend are needed (rare at feat level). 
+ +## Last step + +After emitting the diagram block, invoke `pharaoh-diagram-review` with `diagram_type: use_case` and `parent_need_id: <feat_id>`. Attach the returned review JSON to this skill's output under the key `review`. If review emits any critical finding, return non-success with the findings verbatim. See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md). + +## Composition + +Invoked as a task in plans emitted by `pharaoh-write-plan` when the plan includes a feat that selects use-case as its primary view (per `shared/diagram-view-selection.md`). diff --git a/.github/agents/pharaoh.verify.agent.md b/.github/agents/pharaoh.verify.agent.md index f02b890..5f1d12f 100644 --- a/.github/agents/pharaoh.verify.agent.md +++ b/.github/agents/pharaoh.verify.agent.md @@ -16,4 +16,330 @@ handoffs: Use when checking whether one sphinx-needs artefact actually addresses the substance of every parent it links to via :satisfies: or :verifies:. Cross-need content check — distinct from structural MECE, schema-level tailor-review, and per-axis req-review/arch-review. -See [`skills/pharaoh-verify/SKILL.md`](../../skills/pharaoh-verify/SKILL.md) for the full atomic specification — inputs, scoring scale, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-verify + +## When to use + +Invoke when the user has one need-id (drafted, modified, or already-published) and wants to +know whether its body actually addresses the substance of its parent — does the architecture +element discharge the requirement, does the test case exercise the requirement's claim, does +the decision close the question its `:decides:` target raised. + +This is a **cross-need content check**. It is the only Pharaoh skill that compares two needs' +bodies for substantive coverage. 
+ +How it differs from the neighbouring review skills: + +| Skill | Scope | Question answered | +|---|---|---| +| `pharaoh-mece` | full corpus | Are required links present? Are there orphans / gaps? (structural) | +| `pharaoh-tailor-review` | tailoring files | Does `.pharaoh/project/` validate against the schemas? (schema-level) | +| `pharaoh-req-review` | one requirement | Does this requirement pass the 11 ISO 26262 §6 axes? (per-axis prose rubric) | +| `pharaoh-arch-review` | one architecture element | Same axes plus arch-specific axes (per-axis prose rubric) | +| `pharaoh-vplan-review` | one test case | Same axes plus vplan-specific axes (per-axis prose rubric) | +| **`pharaoh-verify`** | one child + its parents | Does the child's body actually address the parent's substance? (content-level satisfaction) | + +Do NOT use when: + +- The user wants per-axis prose grading — use `pharaoh-req-review` / `pharaoh-arch-review` / + `pharaoh-vplan-review`. +- The user wants gap or orphan analysis across the corpus — use `pharaoh-mece`. +- The user wants to know if the tailoring files are well-formed — use `pharaoh-tailor-review`. + +--- + +## Inputs + +- **need_id** (from user): the child need-id to verify. Must exist in `needs.json`. +- **transitive** (from user, optional, default `false`): if `true`, walk every `:satisfies:` + / `:verifies:` link transitively and verify each ancestor pair (child↔direct-parent, + direct-parent↔grandparent, …). If `false`, verify only the direct parents. +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — read the link-relationship map: which link types carry the + "satisfies its parent" semantics for each artefact type. Defaults: `:satisfies:` for + requirements and architecture; `:verifies:` for test cases; `:decides:` for decisions. 
+ - `id-conventions.yaml` — only used when reporting parent IDs that fail the regex +- **needs.json**: required for body lookup on both child and every parent + +--- + +## Outputs + +A single JSON document. Shape: + +```json +{ + "need_id": "<child-id>", + "child_type": "<type>", + "parents": [ + { + "parent_id": "<id>", + "parent_type": "<type>", + "link_field": "satisfies|verifies|decides", + "verdict": "addresses|partial|absent|unresolved", + "score": 3, + "reason": "one-sentence justification", + "missing_aspects": ["..."] + } + ], + "overall": "addresses|partial|absent", + "action_items": ["..."] +} +``` + +### Verdict scale + +Per parent, score the child body on a 0-3 ordinal: + +| Score | Verdict | Definition | +|---|---|---| +| 3 | `addresses` | Child body explicitly covers every claim, condition, or actor named in the parent. No substantive aspect of the parent is missing. | +| 2 | `addresses` | Child covers all major claims; minor wording or a marginal sub-claim is paraphrased rather than restated. | +| 1 | `partial` | Child covers some but not all of the parent's substantive claims. Concrete missing aspects are listed in `missing_aspects`. | +| 0 | `absent` | Child body is generic, off-topic, or names the parent ID without substantively addressing the parent's claim. | +| n/a | `unresolved` | Parent ID does not resolve in `needs.json` — record as `unresolved` and skip scoring this pair. | + +### Overall + +Computed across all parent pairs (excluding `unresolved`): + +- `addresses` — every pair scores ≥ 2 +- `partial` — at least one pair scores 1, no pair scores 0 +- `absent` — at least one pair scores 0 + +If every parent is `unresolved`, set `overall` to `"absent"` and add an action item naming the +unresolved parents. + +--- + +## Process + +### Step 1: Resolve child + +Find `needs.json` (check `docs/_build/needs/needs.json`, then `_build/needs/needs.json`, then +any `needs.json` under a `_build` directory). 
If not found, FAIL: + +``` +FAIL: needs.json not found. Build the Sphinx project first +(`sphinx-build docs/ docs/_build/`), then re-run this skill. +``` + +Look up `need_id`. If not present, FAIL: + +``` +FAIL: need_id "<id>" not found in needs.json. +Verify the ID or build the project. +``` + +Extract the child's type, body, and the values of every parent-link field that applies to its +type per `artefact-catalog.yaml` (default mapping: requirements → `:satisfies:`; architecture +→ `:satisfies:`; test cases → `:verifies:`; decisions → `:decides:`). + +If the child has no parent-link values at all, emit: + +```json +{ + "need_id": "<id>", + "child_type": "<type>", + "parents": [], + "overall": "absent", + "action_items": [ + "Child has no :satisfies:/:verifies:/:decides: link. Add a parent before verifying." + ] +} +``` + +and stop. + +--- + +### Step 2: Resolve parents + +For each parent ID listed in the child's link fields, look it up in `needs.json`: + +- If found, capture the parent's type and body. +- If not found, mark the pair `verdict: "unresolved"` and continue. + +If `transitive = true`, push each resolved parent onto a queue and repeat Step 2 for its own +parents. Maintain a visited set to avoid cycles. The output `parents` list is flattened — +each parent appears once with its direct child noted in `link_field`. + +--- + +### Step 3: Score each pair + +For each (child, parent) pair, read both bodies and answer the satisfaction question for the +relevant link semantics: + +| Link field | Question to answer | +|---|---| +| `satisfies` | Does the child's prose discharge the parent requirement's claim — i.e. would building only what the child describes satisfy what the parent demands? | +| `verifies` | Does the test case's inputs / steps / expected outcome exercise the specific behaviour the parent requires? Would a passing run of this test give evidence the parent is met? 
| +| `decides` | Does the decision body resolve the design question that the affected need raises — alternatives weighed, choice stated, rationale matches the parent's concern? | + +Score on the 0-3 scale above. Be concrete in `reason` — name the parent claim that is or is +not covered. When `score < 2`, populate `missing_aspects` with the specific claims, actors, +or conditions from the parent that the child fails to address. + +For test cases, also flag `partial` when the test only exercises a happy path while the +parent requires negative or boundary cases. + +--- + +### Step 4: Compute overall and action items + +Compute `overall` per the table above. For every pair with `score < 2` or +`verdict == "unresolved"`, add a concrete action item naming the parent and the missing +aspect: + +``` +"satisfies <parent_id>: child does not cover <missing aspect>; rewrite the body to address it" +``` + +If every pair scores ≥ 2, `action_items` is an empty array. + +--- + +### Step 5: Emit JSON + +Emit the JSON document only. No prose wrapper. + +If `overall != "addresses"`, append below the JSON a single advisory line: + +``` +Consider `pharaoh-author <need_id>` to revise the body, or `pharaoh-req-regenerate <need_id>` +for a per-axis driven rewrite. Re-run `pharaoh-verify <need_id>` after the edit. +``` + +This is the only prose permitted after the JSON. + +--- + +## Guardrails + +**G1 — Child not found** + +`need_id` absent from `needs.json` → FAIL (Step 1). + +**G2 — Child has no parent links** + +Emit a JSON document with empty `parents` and a clear action item, then stop (Step 1). Do not +guess a parent. + +**G3 — All parents unresolved** + +Set `overall = "absent"`; do not crash. Action items must list every unresolved parent ID so +the user can fix the link or build the project first. 
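The `overall` rules, including the all-unresolved case in G3, can be expressed compactly. The following is a sketch under assumptions: scores are passed as a list with `None` standing for an `unresolved` pair, and the function name `compute_overall` is hypothetical.

```python
def compute_overall(scores):
    """Map per-parent scores to the overall verdict.

    `scores` holds one entry per parent pair: an int from 0 to 3, or None when
    the parent is unresolved (assumed encoding for this sketch).
    """
    resolved = [score for score in scores if score is not None]
    if not resolved:
        # G3: every parent unresolved (an empty parent list also lands here).
        return "absent"
    lowest = min(resolved)
    if lowest == 0:
        return "absent"   # at least one pair scores 0
    if lowest == 1:
        return "partial"  # at least one pair scores 1, none scores 0
    return "addresses"    # every resolved pair scores 2 or 3


print(compute_overall([3, 2]), compute_overall([3, 1]), compute_overall([None, None]))
```

Because only the lowest resolved score matters, the verdict is independent of the order in which parents are listed.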
+ +**G4 — Empty bodies** + +If either the child's body or a parent's body is empty (only the title is set), score that +pair `0` with `reason: "<role> body is empty — substantive verification not possible"` and +add an action item to populate the body before re-running. + +--- + +## Advisory chain + +This skill has `chains_to: [pharaoh-mece]` because content satisfaction is the natural sibling +of structural MECE. After verifying one need, the user often wants the corpus-wide structural +view next: + +``` +Consider running `pharaoh-mece` to confirm the surrounding link structure +(orphans, gaps, status inconsistencies) is also healthy. +``` + +Show this line only when `overall == "addresses"`. + +--- + +## Worked example + +**User input:** + +> need_id: `arch__abs_pump_driver` +> transitive: false + +**Step 1:** child resolves; type `arch`. Parent-link field for `arch` is `:satisfies:`. Value: +`gd_req__abs_pump_activation`. + +**Step 2:** parent resolves. Body: "The brake controller shall engage the ABS pump when +measured wheel slip exceeds the calibrated activation threshold." + +**Step 3:** child body: "The ABS pump driver component manages the pump drive circuit, +controlling output PWM duty cycle and providing over-current protection for the pump motor." + +The child names the pump drive circuit and the actuation mechanism (PWM, over-current) but +does not state how the wheel-slip-threshold trigger reaches the driver. Score 2 — addresses +the actuation claim but the trigger linkage is implicit. + +**Step 4:** `overall = "addresses"`; one minor `missing_aspect` recorded. 
+ +**Step 5 output:** + +```json +{ + "need_id": "arch__abs_pump_driver", + "child_type": "arch", + "parents": [ + { + "parent_id": "gd_req__abs_pump_activation", + "parent_type": "gd_req", + "link_field": "satisfies", + "verdict": "addresses", + "score": 2, + "reason": "child covers pump actuation and protection; the wheel-slip trigger linkage is implicit, not restated", + "missing_aspects": ["wheel-slip-threshold trigger pathway from sensor to driver"] + } + ], + "overall": "addresses", + "action_items": [] +} +``` + +``` +Consider running `pharaoh-mece` to confirm the surrounding link structure is also healthy. +``` + +--- + +**Variant: child does not address the parent** + +Same parent, but the child body reads: "The ABS pump driver component logs telemetry every +100 ms to the CAN bus." Logging is unrelated to the parent's actuation claim. + +```json +{ + "need_id": "arch__abs_pump_driver", + "child_type": "arch", + "parents": [ + { + "parent_id": "gd_req__abs_pump_activation", + "parent_type": "gd_req", + "link_field": "satisfies", + "verdict": "absent", + "score": 0, + "reason": "child describes telemetry logging only; does not address pump engagement on slip threshold", + "missing_aspects": [ + "ABS pump engagement actuation", + "wheel-slip-threshold trigger pathway" + ] + } + ], + "overall": "absent", + "action_items": [ + "satisfies gd_req__abs_pump_activation: child does not cover pump engagement; rewrite the body to describe the actuation pathway and the slip-threshold trigger" + ] +} +``` + +``` +Consider `pharaoh-author arch__abs_pump_driver` to revise the body, or `pharaoh-req-regenerate +arch__abs_pump_driver` for a per-axis driven rewrite. Re-run `pharaoh-verify +arch__abs_pump_driver` after the edit. 
+``` diff --git a/.github/agents/pharaoh.vplan-draft.agent.md b/.github/agents/pharaoh.vplan-draft.agent.md index 24c2d1f..ec026b0 100644 --- a/.github/agents/pharaoh.vplan-draft.agent.md +++ b/.github/agents/pharaoh.vplan-draft.agent.md @@ -7,4 +7,415 @@ handoffs: [] Use when drafting a single sphinx-needs test-case (verification plan item) for one requirement. The artefact type is parameterised via `target_level` (any catalog-declared verification-plan / test-case type — e.g. `tc`, `test`, `vplan`). Emits an RST directive with inputs, steps, and expected outcome, linking to the parent req via `:verifies:`. -See [`skills/pharaoh-vplan-draft/SKILL.md`](../../skills/pharaoh-vplan-draft/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-vplan-draft + +## When to use + +Invoke when the user has a validated requirement (or architecture element) and wants to derive a +single test case that verifies it. Each invocation produces exactly one directive of the +catalog-declared verification-plan type. The directive name and ID prefix are resolved from +the project's tailoring; this skill is type-agnostic and supports any catalog-declared +verification-plan type (typical names: `tc`, `test`, `vplan`). + +Do NOT draft multiple test cases in one invocation — one test case per call. +Do NOT draft test cases for requirements that are not verifiable (see Guardrail G3). +Do NOT review — use `pharaoh-vplan-review` after drafting. + +--- + +## Inputs + +- **parent_id** (from user): need-id of the parent requirement or architecture element to verify + — must exist in needs.json +- **target_level** (from user, default `tc`): the artefact-catalog type name to emit. Any + verification-plan / test-case type declared in `.pharaoh/project/artefact-catalog.yaml` + is accepted. 
The emitted directive uses `target_level` verbatim as the directive name; the + ID prefix is resolved from `id-conventions.yaml`'s `prefixes` map. +- **verification_level** (from user, optional, default `system`): one of `unit`, `integration`, or `system`. When the dispatching caller (e.g. `pharaoh-author`) does not propagate this input, the default applies — `system` is the broadest scope and the safest default for a top-level test case. +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — look up the entry for `target_level` to read `required_fields`, + `optional_fields`, `lifecycle`, and `required_body_sections` + - `id-conventions.yaml` — `prefixes` map (key = type name → value = identifier prefix + string), `separator`, `id_regex` +- **needs.json**: required for parent resolution and ID uniqueness + +> Note: A `shared/tailoring-access.md` helper module is planned. Until it exists, Steps 1-2 below +> inline the tailoring-access logic directly. When that file is created, this skill should be +> updated to delegate to it. + +--- + +## Outputs + +A single RST directive block of type `target_level` containing: + +- Unique ID using the prefix resolved for `target_level` from `id-conventions.yaml`'s + `prefixes` map +- `:status: draft` +- `:verifies:` pointing to parent_id (validated in needs.json) +- `:level:` set to the requested verification_level (`unit` / `integration` / `system`) +- Body with three labelled sections: **Inputs**, **Steps**, **Expected** + +The body must be self-contained — a test engineer should be able to execute this test case +without reading any other document beyond the referenced parent requirement. + +--- + +## Process + +### Step 1: Read tailoring + +**1a. `artefact-catalog.yaml`** + +Look up the entry whose key equals `target_level`. If found, read `required_fields`, +`optional_fields`, `lifecycle`, and `required_body_sections`. 
+ +**`required_fields` / `optional_fields`** are directive option names (sphinx-needs `:key: value` +options). **`required_body_sections`** are top-level Markdown/RST section headings that must +appear in the directive body prose (e.g. `Inputs`, `Steps`, `Expected`). + +If the entry is absent, FAIL: + +``` +FAIL: target_level "<value>" is not declared in .pharaoh/project/artefact-catalog.yaml. +Add an entry for "<value>" (with required_fields, required_body_sections, lifecycle) before +drafting, or pass a target_level that is already declared. +``` + +**1b. `id-conventions.yaml`** + +Read the `prefixes:` map and look up the prefix for `target_level`. Also extract +`separator` and `id_regex`. + +If `prefixes` does not declare `target_level`, FAIL: + +``` +FAIL: id-conventions.yaml prefixes map has no entry for "<value>". +Declare a prefix for "<value>" (e.g. T_ for a type named "test", or TC_ for "tc") +before drafting. +``` + +The resolved prefix is the value of `prefixes[target_level]` — e.g. `tc__` for +`target_level: tc` on a project that uses the double-underscore convention, `T_` for +`target_level: test` on a project that uses underscore separators. + +**1c. Validate verification_level** + +Accepted values: `unit`, `integration`, `system`. If `verification_level` is absent, +default to `system` (the broadest scope; the override is documented in the Inputs +section above). If a different non-empty value is supplied, FAIL: + +``` +FAIL: verification_level "<value>" is not recognised. +Accepted values: unit, integration, system. +``` + +--- + +### Step 2: Locate and parse needs.json + +Find `needs.json` (check `docs/_build/needs/needs.json`, then `_build/needs/needs.json`, then any +`needs.json` under a `_build` directory). If not found, FAIL: + +``` +FAIL: needs.json not found. Build the Sphinx project first (`sphinx-build docs/ docs/_build/`), +then re-run this skill. 
+``` + +Extract the flat map of `id → {id, type, status, body}` and the set of all existing IDs. + +--- + +### Step 3: Validate parent_id + +1. Look up `parent_id` in needs.json. If not found, FAIL: + +``` +FAIL: parent_id "<id>" not found in needs.json. +Specify an existing requirement or architecture element ID. +``` + +2. Extract the parent body to understand what testable claim must be verified. + +3. Check whether the parent's type is testable: + - Requirement types (type name ends in `req`, e.g. `gd_req`) — always valid + - Architecture elements (any catalog-declared architecture type) — valid at + integration/system level only + - Workflow/work-product types (`wf`, `wp`) — warn but do not block: + ``` + [WARNING] parent_id "<id>" has type "<type>". Verification plans are usually written + against requirements or arch elements. Proceeding at user's discretion. + ``` + +--- + +### Step 4: Testability check + +Read the parent body. Confirm that the parent contains a testable claim: +- A measurable outcome or threshold (e.g. "within X ms", "exceeds Y threshold") +- A discrete pass/fail condition (e.g. "shall activate the valve", "shall reject the input") + +If the parent body is too vague to derive a verifiable procedure (e.g. body is a stub, no +condition or outcome stated), FAIL: + +``` +FAIL: parent "<parent_id>" does not contain a testable claim. +Improve the parent requirement first (e.g. run pharaoh-req-regenerate) before drafting a +test case. +``` + +--- + +### Step 5: Assign a unique ID + +**5a. Derive local-ID part** + +Format: `<prefix><tail>` where `<prefix>` is the value resolved in Step 1b. The tail is +derived from the parent_id local part plus a level suffix: + +- Strip the parent's prefix from `parent_id`: `gd_req__abs_pump_activation` → `abs_pump_activation` +- Append `_<verification_level>` → `abs_pump_activation_system` +- Compose the full ID by concatenating `<prefix>` + `<tail>`. 
If `id-conventions.yaml` + declares an explicit `separator` distinct from any trailing punctuation in the prefix, + insert it between prefix and tail. + +Examples: + +- `prefixes: {tc: tc__}` → `tc__abs_pump_activation_system` +- `prefixes: {test: T_}` → `T_abs_pump_activation_system` + +Check uniqueness. If taken, append `_2`, `_3`, etc. + +**5b. Validate against id_regex** + +Confirm the candidate matches the `id_regex` declared in `id-conventions.yaml`. If it does +not, FAIL: + +``` +FAIL: generated ID "<id>" does not match id_regex "<regex>". +``` + +This is the gate that catches a hardcoded prefix mismatch — e.g. emitting `tc__foo_unit` +on a project whose `test` type uses prefix `T_` and a regex `^T_[A-Za-z0-9_]+$`. Because +the prefix is read from `prefixes[target_level]`, this case is now caught at draft time. + +--- + +### Step 6: Draft the test case body + +Structure the body using three labelled sections. Use a Given/When/Then framing where natural, +or a step-by-step enumeration for procedural tests. + +**Inputs section** — list all preconditions and input stimuli: + +``` +Inputs: +- <precondition or stimulus 1> +- <precondition or stimulus 2> +``` + +**Steps section** — ordered procedure: + +``` +Steps: +1. <action> +2. <action> +3. Observe <observable outcome> +``` + +**Expected section** — concrete pass criterion: + +``` +Expected: +<Observable result that proves the parent claim is satisfied. Must be checkable without +ambiguity — state exact value, range, or behaviour.> +``` + +The expected outcome must directly trace to the testable claim extracted from the parent body +in Step 4. Do not invent pass criteria that are not implied by the parent. + +--- + +### Step 7: Self-check + +Before emitting: + +**Check A — required fields present** +Every field in `required_fields` from Step 1 must appear. + +**Check B — parent resolves** +`:verifies:` value is present in needs.json (confirmed in Step 3). 
+ +**Check C — ID unique** +Chosen ID not in needs.json. + +**Check D — testable expected outcome** +Expected section must contain a concrete, unambiguous pass criterion. Vague criteria like +"the system works correctly" or "no errors occur" are not acceptable — rewrite with a specific +observable. + +If any check fails after one rewrite attempt, emit with `[DIAGNOSTIC]`. + +--- + +### Step 8: Emit the directive block + +```rst +.. <target_level>:: <test case title> + :id: <id> + :status: draft + :verifies: <parent_id> + :level: <verification_level> + + Inputs: + - <input 1> + - <input 2> + + Steps: + 1. <step> + 2. <step> + + Expected: + <pass criterion> +``` + +The directive name is exactly `target_level`; the `:id:` value uses the prefix resolved from +`id-conventions.yaml`'s `prefixes` map. Both come from tailoring — neither is hardcoded. + +--- + +## Guardrails + +**G1 — Parent not found** + +parent_id absent from needs.json → FAIL (Step 3). + +**G2 — verification_level not recognised** + +Unrecognised level value → FAIL (Step 1c). + +**G3 — Parent not testable** + +Parent body too vague to derive a verifiable procedure → FAIL (Step 4). Do not draft a +placeholder test case — improve the parent first. + +**G4 — needs.json unavailable** + +Cannot find needs.json → FAIL (Step 2). + +**G5 — target_level not declared** + +If `target_level` is not declared in `artefact-catalog.yaml` or has no entry in +`id-conventions.yaml`'s `prefixes` map, FAIL (Step 1 handles this). The catalog is the +contract — never silently default to a hardcoded prefix. + +--- + +## Advisory chain + +After successfully emitting the directive: + +``` +Consider running `pharaoh-vplan-review <new_id>` to audit against per-axis criteria. +``` + +Do not show this if the emit included a `[DIAGNOSTIC]`. + +--- + +## Worked example + +### Example A — default `target_level: tc` + +**User input:** +> Parent: `gd_req__abs_pump_activation`; level: `system`. (`target_level` defaults to `tc`.) 
+ +**Parent body (from needs.json):** +> "The brake controller shall engage the ABS pump when measured wheel slip exceeds the calibrated +> activation threshold." + +**Step 1:** Catalog has a `tc` entry with `required_fields: [id, status, verifies]` and +`required_body_sections: [Inputs, Steps, Expected]`. `id-conventions.yaml` `prefixes` map +has `tc: tc__`. Level `system` is valid. + +**Step 2:** needs.json found; 185 IDs loaded. + +**Step 3:** `gd_req__abs_pump_activation` found; type `gd_req`. Valid. + +**Step 4:** Testable claim — "engage the ABS pump when wheel slip exceeds the calibrated +activation threshold" — discrete activation event, verifiable by observing pump output signal. + +**Step 5:** prefix = `tc__`; tail = `abs_pump_activation_system`; candidate = +`tc__abs_pump_activation_system`. Not in needs.json. Passes id_regex. + +**Step 6 body drafted** (see output below). + +**Step 7 self-checks:** required fields present; parent resolves; ID unique; expected outcome +concrete ("ABS pump output signal activates within 50 ms"). All pass. + +**Step 8 output:** + +```rst +.. tc:: ABS pump activation on wheel slip threshold — system test + :id: tc__abs_pump_activation_system + :status: draft + :verifies: gd_req__abs_pump_activation + :level: system + + Inputs: + - Vehicle moving at 30 km/h on low-friction surface (µ ≤ 0.3) + - Brake controller in normal operating mode (no active faults) + - Calibrated wheel slip activation threshold loaded (default factory value) + + Steps: + 1. Apply full brake pedal force to induce wheel lock-up condition. + 2. Monitor measured wheel slip signal via diagnostic interface. + 3. Confirm slip measurement exceeds the calibrated activation threshold. + 4. Observe ABS pump output signal state. + + Expected: + ABS pump output signal transitions from inactive to active within 50 ms of wheel slip + measurement exceeding the calibrated activation threshold. 
+``` + +``` +Consider running `pharaoh-vplan-review tc__abs_pump_activation_system` to audit against per-axis criteria. +``` + +### Example B — project that uses `test` with prefix `T_` + +**User input:** +> Parent: `req__login_lockout`; target_level: `test`; level: `unit`. + +**Step 1:** Catalog has a `test` entry with `required_fields: [id, status, verifies]` and +`required_body_sections: [Inputs, Steps, Expected]`. `id-conventions.yaml` `prefixes` map +has `test: T_`. Level `unit` is valid. + +**Step 5:** prefix = `T_`; tail = `login_lockout_unit`; candidate = `T_login_lockout_unit`. +Passes id_regex `^T_[A-Za-z0-9_]+$`. + +**Step 8 output (header only):** + +```rst +.. test:: Login lockout — unit test + :id: T_login_lockout_unit + :status: draft + :verifies: req__login_lockout + :level: unit + + ... +``` + +A skill that hardcoded `tc__` would have emitted `tc__login_lockout_unit`, which fails the +project's `T_…` `id_regex` at Step 5b. By deriving prefix and directive name from tailoring +the same skill serves both projects. + +## Last step + +After emitting the artefact, invoke `pharaoh-vplan-review` on it. Pass the emitted artefact (or its `need_id`) as `target`. Attach the returned review JSON to the skill's output under the key `review`. If the review emits any axis with `score: 0` or `severity: critical`, return a non-success status with the review findings verbatim and do NOT finalize the artefact — the caller must regenerate (via `pharaoh-vplan-regenerate` if available, or by re-invoking this skill with the findings as input). + +See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md) for the rationale and enforcement mechanism. Coverage is mechanically enforced by `pharaoh-self-review-coverage-check` in `pharaoh-quality-gate`. 
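A minimal sketch of the Step 5 ID derivation from the `pharaoh-vplan-draft` specification above, under stated assumptions. The function name `derive_test_case_id` and its parameters are hypothetical and are not part of Pharaoh. The sketch assumes the caller already knows the parent's ID prefix (for example from `prefixes[<parent type>]`) and reads "insert the separator if it is distinct from the prefix's trailing punctuation" as "insert it only when the prefix does not already end with it".

```python
import re


def derive_test_case_id(parent_id, parent_prefix, prefix, verification_level,
                        existing_ids, id_regex, separator=""):
    """Sketch of Step 5: build a unique test-case ID from the parent ID.

    All names here are illustrative; the skill itself is prose, not code.
    """
    # 5a: strip the parent's prefix to obtain the local part of the ID.
    if not parent_id.startswith(parent_prefix):
        raise ValueError(f'parent_id "{parent_id}" does not start with "{parent_prefix}"')
    local = parent_id[len(parent_prefix):]

    # Append the level suffix, e.g. abs_pump_activation -> abs_pump_activation_system.
    tail = f"{local}_{verification_level}"

    # Assumption: add the separator only when the prefix does not already end with it.
    joiner = separator if separator and not prefix.endswith(separator) else ""
    base = f"{prefix}{joiner}{tail}"

    # Uniqueness: append _2, _3, ... until the candidate is not already taken.
    candidate = base
    counter = 2
    while candidate in existing_ids:
        candidate = f"{base}_{counter}"
        counter += 1

    # 5b: the candidate must match the project's id_regex, otherwise FAIL.
    if not re.fullmatch(id_regex, candidate):
        raise ValueError(f'generated ID "{candidate}" does not match id_regex "{id_regex}"')
    return candidate
```

With the values from Example A, `derive_test_case_id("gd_req__abs_pump_activation", "gd_req__", "tc__", "system", set(), r"^tc__[a-z0-9_]+$")` returns `"tc__abs_pump_activation_system"`. Passing the hardcoded prefix `tc__` together with the `^T_[A-Za-z0-9_]+$` regex from Example B raises the 5b error, which is the mismatch that step is meant to catch.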
diff --git a/.github/agents/pharaoh.vplan-review.agent.md b/.github/agents/pharaoh.vplan-review.agent.md index e01ddbf..aed3ad7 100644 --- a/.github/agents/pharaoh.vplan-review.agent.md +++ b/.github/agents/pharaoh.vplan-review.agent.md @@ -7,4 +7,362 @@ handoffs: [] Use when auditing a single test case against ISO 26262-8 §6 axes plus vplan-specific axes (coverage of parent req, completeness of steps, clarity of expected outcome). Emits structured findings JSON. -See [`skills/pharaoh-vplan-review/SKILL.md`](../../skills/pharaoh-vplan-review/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-vplan-review + +## When to use + +Invoke when the user has a single test case (either just drafted by `pharaoh-vplan-draft` or +retrieved from needs.json by ID) and wants per-axis inspection. + +Do NOT review sets of test cases — each invocation audits exactly one `tc__` element. +Do NOT re-author or fix — address findings manually or via a planned `pharaoh-vplan-regenerate` +skill, then re-invoke this review. +Do NOT audit requirements or arch elements — use `pharaoh-req-review` or `pharaoh-arch-review`. + +--- + +## Inputs + +- **target**: either an RST directive block (from `pharaoh-vplan-draft`) OR a need-id present + in needs.json +- **tailoring** (from `.pharaoh/project/`): + - `artefact-catalog.yaml` — `tc` entry; required/optional fields + - `id-conventions.yaml` — id_regex +- **needs.json**: required for coverage axis (verifying `:verifies:` resolves to parent req) and + for reading the parent body to check coverage + +--- + +## Outputs + +A single JSON document with **no prose wrapper**. 
Shape mirrors `pharaoh-req-review`, with +`coverage` replacing `verifiability` as the primary traceability axis for test cases: + +```json +{ + "need_id": "tc__example", + "axes": { + "atomicity": {"score": 0, "reason": "..."}, + "internal_consistency": {"score": 0, "reason": "..."}, + "coverage": {"score": 0, "reason": "..."}, + "completeness": {"score": "deferred", "reason": "set-level axis — see note"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — see note"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — see note"}, + "schema": {"score": 0, "reason": "..."}, + "maintainability": {"score": null, "reason": "chain-level axis — see note"}, + "unambiguity_prose": {"score": 0, "reason": "..."}, + "comprehensibility": {"score": 0, "reason": "..."}, + "feasibility": {"score": 0, "reason": "..."} + }, + "action_items": ["..."], + "overall": "pass" +} +``` + +Note: `coverage` replaces `verifiability` from req-review. It checks both that the parent link +resolves AND that the test case steps address the testable claim in the parent body. + +### Score scales + +**Binary (0 or 1) — mechanized axes:** + +| Axis | Score 0 = FAIL | Score 1 = PASS | +|---|---|---| +| `atomicity` | test case covers more than one independent testable claim (steps verify two unrelated parent conditions) | test case addresses exactly one testable claim | +| `internal_consistency` | steps or expected outcome contradict each other (e.g. 
a step asserts X while expected denies X) | no internal contradiction | +| `coverage` | `:verifies:` absent or does not resolve in needs.json, OR the test steps do not address the parent's testable claim | `:verifies:` resolves and test steps address the parent's testable claim | +| `schema` | any field in `required_fields` for the `tc` artefact type is missing | all required fields present and non-empty | + +**Ordinal (0–3) — subjective LLM-judge axes:** + +| Axis | 0 | 1 | 2 | 3 | +|---|---|---|---|---| +| `unambiguity_prose` | expected outcome has multiple interpretations or is completely vague ("system works") | single interpretation but phrasing is imprecise | single interpretation; minor imprecision | unambiguous pass/fail criterion | +| `comprehensibility` | test engineer cannot execute this test without reading additional documents | requires significant extra context | mostly self-contained; minor references needed | fully self-contained; executable as-is | +| `feasibility` | test procedure is physically impossible or requires unavailable equipment | feasible but significant setup unknowns | feasible with normal lab equipment | clearly feasible; setup requirements fully specified | + +### Deferred set-level axes + +`completeness`, `external_consistency`, `no_duplication` require the full set of test cases. +Defer to a planned `pharaoh-vplan-set-review`. + +Record each as `{"score": "deferred", "reason": "set-level axis — assess with pharaoh-vplan-set-review"}`. + +### Chain-level axis + +`maintainability` requires observing regeneration convergence. +Record as `{"score": null, "reason": "chain-level axis — assess after vplan regeneration runs"}`. 
+ +### `overall` field + +Computed from non-deferred, non-null axes (atomicity, internal_consistency, coverage, schema, +unambiguity_prose, comprehensibility, feasibility): + +- `"pass"` — all binary axes score 1, all subjective axes score ≥ 2 +- `"needs_work"` — no binary fails, ≥ 1 subjective < 2 +- `"fail"` — ≥ 1 binary scores 0 + +--- + +## Process + +### Step 1: Read tailoring + +Read `.pharaoh/project/artefact-catalog.yaml`. Extract the `tc` entry: +- `required_fields` — directive-option keys, expected: `[id, status, verifies]` +- `required_body_sections` — top-level body headings, expected: `[Inputs, Steps, Expected]` + +`required_fields` are sphinx-needs directive options (`:id:`, `:status:`, `:verifies:`). +`required_body_sections` are named sections *inside* the directive body prose (`Inputs`, +`Steps`, `Expected`). The two are distinct and are validated as separate sub-checks of the +`schema` axis below. + +If the `tc` entry is absent, apply defaults: +`required_fields = [id, status, verifies]` and +`required_body_sections = [Inputs, Steps, Expected]`. Note the fallback in output. + +Read `id-conventions.yaml` for `id_regex`. + +--- + +### Step 2: Resolve target + +**If target is a need-id:** + +1. Find needs.json (`docs/_build/needs/needs.json` first, then `_build/needs/needs.json`). +2. Look up the need-id. If not found, FAIL (see Guardrails G1). +3. Extract all fields and body (Inputs, Steps, Expected sections). + +**If target is an RST directive block:** + +1. Parse the block — extract `:id:`, `:status:`, `:verifies:`, `:level:`, and body sections. +2. Identify the directive name (e.g. `.. tc::` → type `tc`). +3. If needs.json is available, check whether the ID already exists. + +--- + +### Step 3: Evaluate binary axes + +**Atomicity:** + +Read the Steps and Expected sections. Check whether the test case addresses more than one +independent testable claim. Signs of bundled test cases: +- Steps cover two clearly distinct system behaviours (e.g. 
activation of pump AND logging, with + no causal relationship between them) +- Expected outcome has two independent pass criteria ("X shall occur AND Y shall occur") + +Score 1 if exactly one testable claim is addressed. Score 0 if two or more independent claims +are verified. + +**Internal consistency:** + +Check for contradictions within the test case: +- A step asserts that a condition is true while the Expected section requires the opposite +- Two steps mutually exclude each other's preconditions + +Score 1 if no contradiction; score 0 if one is identifiable. + +**Coverage:** + +Evaluate two sub-conditions, both must hold for score 1: + +*Sub-condition A — link resolution:* +`:verifies:` is present and the ID resolves in needs.json. If absent or unresolved, score 0. + +*Sub-condition B — content coverage:* +Read the parent body from needs.json (or from context if target is an RST block and needs.json +is unavailable). Extract the parent's testable claim (the observable outcome or threshold). +Compare against the test Steps and Expected: +- Steps must include an action that stimulates the parent's condition +- Expected must cite the parent's claimed observable + +If the test steps and expected do not address the parent's testable claim, score 0 and +describe the gap in `reason`. + +Score 1 only if both sub-conditions hold. + +**Schema:** + +Two independent sub-checks — both must pass to score 1: + +1. Every key in `required_fields` is present as a sphinx-needs directive option and + non-empty. For default tc: `:id:`, `:status:`, `:verifies:` must be set. +2. Every entry in `required_body_sections` is present as a top-level section heading in + the directive body prose and non-empty. For default tc: `Inputs`, `Steps`, `Expected` + sections must appear with content. + +Score 0 with a `reason` listing the missing fields and/or missing body sections (prefix +option failures with `:field:` and body-section failures with the section name). 
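The two schema sub-checks above can be sketched as follows. This is a hedged illustration and not the skill's implementation: the function name `check_schema` and the dictionary-shaped inputs (`options` for directive options, `body_sections` for parsed body headings) are assumptions introduced here, and the `reason` wording is only one way to follow the "prefix option failures with `:field:`" rule.

```python
def check_schema(options, body_sections, required_fields, required_body_sections):
    """Return the binary `schema` axis as {"score": 0 or 1, "reason": str}.

    options:        directive options of the test case, e.g. {"id": "...", "status": "draft"}
    body_sections:  body text per section heading, e.g. {"Inputs": "...", "Steps": "..."}
    """
    # Sub-check 1: every required directive option is present and non-empty.
    missing_fields = [
        f":{name}:" for name in required_fields
        if not str(options.get(name, "")).strip()
    ]
    # Sub-check 2: every required body section is present and non-empty.
    missing_sections = [
        name for name in required_body_sections
        if not str(body_sections.get(name, "")).strip()
    ]
    if missing_fields or missing_sections:
        # Option failures carry the :field: form; section failures use the heading name.
        return {"score": 0, "reason": "missing: " + ", ".join(missing_fields + missing_sections)}
    return {"score": 1, "reason": "all required fields and body sections present"}
```

Both sub-checks must pass for score 1, so a test case with every option set but an empty `Expected` section still scores 0.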
+ +--- + +### Step 4: Evaluate subjective axes + +**Unambiguity (prose):** + +Read the Expected section. Assess whether a test engineer could judge pass/fail without +interpretation: +- "Pump activates within 50 ms" → score 3 +- "Pump activates promptly" → score 1 +- "System works correctly" → score 0 + +**Comprehensibility:** + +Assess whether a test engineer could execute this test case end-to-end without reading any +document other than the parent requirement: +- All preconditions in Inputs are explicit +- All steps are unambiguous actions +- No implicit knowledge assumed + +**Feasibility:** + +Assess whether the test procedure can be executed with normal automotive lab equipment and +environment. Consider: +- Required signals available via CAN/diagnostic interface? +- Fault conditions injectable without destructive testing? +- Timing requirements measurable with standard tools? + +--- + +### Step 5: Record deferred and null axes + +Set `completeness`, `external_consistency`, `no_duplication` to deferred. +Set `maintainability` to null. Per policy above. + +--- + +### Step 6: Compute overall and action items + +Compute `overall` from non-deferred, non-null axes per the rule above. + +For each binary axis scoring 0 and each subjective axis scoring 0 or 1, add a concrete action +item naming the axis and stating what must change. + +--- + +### Step 7: Emit JSON + +Emit only the JSON. No prose before or after (except the advisory chain). + +--- + +## Guardrails + +**G1 — Unresolved target** + +If target is a need-id not found in needs.json: + +``` +FAIL: need-id "<id>" not found in needs.json. +Verify the ID is correct or build the Sphinx project first. +``` + +**G2 — Malformed JSON output** + +Self-correct once. If still invalid after correction: + +```json +{ + "need_id": "<id>", + "diagnostic": "JSON self-correction failed. 
Raw findings follow.", + "raw": "<free-text findings>" +} +``` + +**G3 — Parent body unavailable for coverage check** + +If needs.json is unavailable and target is an RST block, the coverage sub-condition B cannot +be evaluated. Record: + +```json +"coverage": {"score": 0, "reason": "parent body unavailable — needs.json not found; link presence checked only"} +``` + +--- + +## Advisory chain + +If `overall` is `"needs_work"` or `"fail"`, append — after the JSON — a single line: + +``` +Consider addressing action items and re-running `pharaoh-vplan-review <need_id>`. +``` + +This is the only prose permitted after the JSON. + +--- + +## Worked example + +**Target (RST block from pharaoh-vplan-draft):** + +```rst +.. tc:: ABS pump activation on wheel slip threshold — system test + :id: tc__abs_pump_activation_system + :status: draft + :verifies: gd_req__abs_pump_activation + :level: system + + Inputs: + - Vehicle moving at 30 km/h on low-friction surface (µ ≤ 0.3) + - Brake controller in normal operating mode (no active faults) + - Calibrated wheel slip activation threshold loaded (default factory value) + + Steps: + 1. Apply full brake pedal force to induce wheel lock-up condition. + 2. Monitor measured wheel slip signal via diagnostic interface. + 3. Confirm slip measurement exceeds the calibrated activation threshold. + 4. Observe ABS pump output signal state. + + Expected: + ABS pump output signal transitions from inactive to active within 50 ms of wheel slip + measurement exceeding the calibrated activation threshold. +``` + +**Parent body (from needs.json for `gd_req__abs_pump_activation`):** +> "The brake controller shall engage the ABS pump when measured wheel slip exceeds the +> calibrated activation threshold." 
+ +**Step 3 — binary axes:** +- atomicity: one testable claim (pump activation on slip threshold); no second independent claim → score 1 +- internal_consistency: steps flow logically; no step contradicts expected outcome → score 1 +- coverage: `:verifies: gd_req__abs_pump_activation` resolves; test steps 1-3 induce and confirm the + slip-threshold condition; Expected cites pump activation within 50 ms → score 1 +- schema: `id`, `status`, `verifies` present; `Inputs`, `Steps`, and `Expected` sections + present and non-empty → score 1 + +**Step 4 — subjective axes:** +- unambiguity_prose: "transitions from inactive to active within 50 ms" — concrete measurable + criterion → score 3 +- comprehensibility: preconditions, stimuli, and expected outcome fully stated → score 3 +- feasibility: slip signal readable via CAN diagnostic interface; 50 ms timing measurable with + standard tools → score 3 + +**Step 6:** all 7 evaluated axes pass → `overall = "pass"`, `action_items = []`. + +**Step 7 output:** + +```json +{ + "need_id": "tc__abs_pump_activation_system", + "axes": { + "atomicity": {"score": 1, "reason": "single testable claim: pump activation on wheel slip threshold"}, + "internal_consistency": {"score": 1, "reason": "no contradiction between steps and expected outcome"}, + "coverage": {"score": 1, "reason": ":verifies: resolves; steps stimulate slip threshold; expected cites pump activation timing"}, + "schema": {"score": 1, "reason": "id, status, verifies, inputs, steps, expected all present"}, + "completeness": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-vplan-set-review"}, + "external_consistency": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-vplan-set-review"}, + "no_duplication": {"score": "deferred", "reason": "set-level axis — assess with pharaoh-vplan-set-review"}, + "maintainability": {"score": null, "reason": "chain-level axis — assess after vplan regeneration runs"}, + "unambiguity_prose": {"score": 3, "reason": "concrete 
measurable pass criterion: activation within 50 ms"}, + "comprehensibility": {"score": 3, "reason": "preconditions, stimuli, and expected outcome fully stated"}, + "feasibility": {"score": 3, "reason": "standard automotive lab setup; slip signal accessible via CAN interface"} + }, + "action_items": [], + "overall": "pass" +} +``` diff --git a/.github/agents/pharaoh.write-plan.agent.md b/.github/agents/pharaoh.write-plan.agent.md index a9584a4..3d4a544 100644 --- a/.github/agents/pharaoh.write-plan.agent.md +++ b/.github/agents/pharaoh.write-plan.agent.md @@ -7,4 +7,242 @@ handoffs: [] Use when you have an intent (e.g. "reverse-engineer features and reqs from this module") and need a concrete plan.yaml that pharaoh-execute-plan can run. Picks a plan template by intent, fills project-specific values, emits a plan that validates against schema.md. Does NOT execute anything. -See [`skills/pharaoh-write-plan/SKILL.md`](../../skills/pharaoh-write-plan/SKILL.md) for the full atomic specification — inputs, outputs, atomicity contract, and composition patterns. +--- + +## Full atomic specification + +# pharaoh-write-plan + +## When to use + +Invoke when you need a plan.yaml and do not already have one. Typical inputs: a short natural-language intent plus a project root and its tailoring files. Typical output: a plan ready to hand to `pharaoh-execute-plan`. + +Do NOT use to execute plans (that is `pharaoh-execute-plan`). Do NOT use to review emitted artefacts (that is `pharaoh-quality-gate` or `pharaoh-req-review`). Do NOT use to discover feats or files at scale — discovery is expressed as tasks in the plan itself. + +## Why this skill exists + +The deleted composition skills (`pharaoh-feats-from-project`, `pharaoh-reqs-from-module`) attempted to orchestrate 6-12 atomic skills via prose. In practice an LLM executing them flattened the process and dropped steps. This skill replaces that pattern by emitting the orchestration as data (plan.yaml), consumed by a generic executor. 
The domain heuristics those skills carried — split_strategy choice, preseed ordering, quality-gate terminal placement, id-allocate positioning — live here, but they decide plan content, not runtime behaviour. + +## Atomicity + +- (a) **Indivisible.** One intent → one plan.yaml. Does not execute tasks. Does not write artefacts. Does not mutate `.papyrus/` or `.pharaoh/`. Pure transformation: intent + project state → plan text. +- (b) **Typed I/O.** + - Input: `{intent: str, project_root: str, tailoring: {ubproject_toml_path?: str, pharaoh_toml_path?: str}, template_name?: str, vars?: dict[str, any]}`. + - Output: `{plan_yaml: str, template_used: str, warnings: list[str]}`. +- (c) **Execution-based reward.** Fixture in `pharaoh-validation/fixtures/write-plan-smoke/` contains a minimal project (docs/ with 1 RST file, src/ with 3 Python files, `ubproject.toml` declaring `feat` + `comp_req` types). Scorer runs write-plan with intent `"reverse-engineer features and reqs"`. Assertions: + 1. Output is valid YAML parseable by PyYAML. + 2. Output passes `pharaoh-execute-plan/schema.md` static validation (ref parsing, skill existence, cycle detection). + 3. `preseed_papyrus` task appears before any task invoking `pharaoh-req-from-code`. + 4. `pharaoh-quality-gate` is the last task (no downstream deps). + 5. `pharaoh-id-allocate` appears before any `pharaoh-req-from-code`. + 5a. Every plan that schedules `pharaoh-req-from-code` also schedules a `review_comp_reqs` task invoking `pharaoh-req-review` AND a `grounding_check_comp_reqs` task invoking `pharaoh-req-code-grounding-check`, both with `foreach: ${reqs_from_code.emitted_ids}` and `depends_on: [reqs_from_code]` (or equivalent per-file dependency set). Both tasks must appear in `quality_gate.depends_on`. These are **explicit** plan tasks — they are NOT replaced by the skill's in-body `## Last step` self-invocation, which the LLM-executor drops under foreach fan-out. + 6. 
Every `skill:` references a directory present under `<pharaoh>/skills/` or `<papyrus>/skills/`. + 7. **Dep-probe enrichment with prerequisite insertion.** Fixture's `conf.py` declares only `['sphinx_needs']` (no mermaid extension). Scorer checks: (a) the emitted plan still contains every diagram-emitting task (`pharaoh-feat-component-extract`, `pharaoh-feat-flow-extract`) — tasks are NOT stripped; (b) the plan contains exactly one new task with `skill: pharaoh-sphinx-extension-add` whose `extensions` input lists `sphinxcontrib.mermaid`; (c) every diagram-emitting task's `depends_on` list includes the new prerequisite task id; (d) `warnings` contains at least one human-readable entry naming the missing `sphinxcontrib.mermaid` module and pointing to the prerequisite task. +- (d) **Reusable.** Any intent matching an available template. Adding a new workflow = adding a new template, not modifying this skill. +- (e) **Composable.** Called by humans or by future wrapper skills (`pharaoh-reverse-engineer` could chain write-plan → execute-plan → quality-gate interpretation). + +## Input + +- `intent`: short phrase, normalised against the template index. Accepted phrasings map to templates: + - `"reverse-engineer project"`, `"reverse-engineer features and reqs"`, `"rev-eng full project"` → `templates/reverse-engineer-project.yaml.j2` + - `"reverse-engineer module"`, `"reqs from module"`, `"extract reqs from files"` → `templates/reverse-engineer-module.yaml.j2` + - When the intent does not match any template, emit a `warnings` entry and return `template_used: none`; do not fabricate a plan. +- `project_root`: absolute path to the target project. +- `tailoring.ubproject_toml_path`: path to `ubproject.toml` for type/prefix lookup. Optional if the project has no tailoring (use baseline defaults from `pharaoh-bootstrap`). +- `tailoring.pharaoh_toml_path`: path to `pharaoh.toml` for source-layout discovery. 
+- `template_name` (optional): overrides intent-based dispatch; names a template file under `templates/` without the `.yaml.j2` suffix. +- `vars` (optional): dict of additional template variables (e.g. `{"docs_root": "docs/source", "src_root": "src/<project>"}`). Caller-provided values win over skill-inferred ones. Notable optional vars consumed by the reverse-engineer-project template: + - `target_docs_path`: where emitted artefacts finally live (`toctree-emit` and `quality-gate` read from this path). Default `${workspace}/artefacts`. Set this when the caller wants reverse-engineered spec to land directly under the project's docs tree (e.g. `docs/source/spec/feature/`) without having to override `workspace_dir`. + +## Output + +```yaml +plan_yaml: | + name: ... + version: 1 + ... +template_used: reverse-engineer-project +warnings: + - "inferred docs_root as docs/source (no explicit value in pharaoh.toml)" +``` + +`plan_yaml` is the full text to hand to `pharaoh-execute-plan`. `template_used` records provenance for audit. `warnings` surfaces inference decisions (e.g., guessed paths, missing optional inputs). + +## Templates + +Templates live under `templates/` with filename `<name>.yaml.j2`. Each template: + +1. Begins with a YAML front-matter block (actual YAML, not Jinja) declaring `required_vars` and `optional_vars`. +2. Is a Jinja2-style text template with `{{ var }}` placeholders and `{% for %}` loops only. +3. Produces a plan.yaml body below the front-matter. + +Supported Jinja constructs: +- `{{ var }}` simple substitution. +- `{% for item in list %}` ... `{% endfor %}` iteration (rare — most iteration should be expressed as `foreach:` in the emitted plan, not unrolled at write time). +- `{% if cond %}` ... `{% endif %}` for optional tasks (e.g., include diagram tasks only when tailoring declares a diagram renderer). + +No arbitrary Python expressions, no filters beyond `| default(...)`. If a template needs richer logic, split it into two templates. 
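The restriction on Jinja constructs described above could be checked mechanically before a template is accepted. The following is a minimal sketch of such a check, not part of Pharaoh itself: the function name `lint_template` is hypothetical, and treating every tag outside `for`/`endfor`/`if`/`endif` (including `else`) as unsupported is an assumption drawn from the list of supported constructs.

```python
import re

# Assumption: only these block tags are allowed, per "Supported Jinja constructs".
ALLOWED_TAGS = {"for", "endfor", "if", "endif"}

# Matches `name` or `name.attr`, optionally followed by a `| default(...)` filter.
EXPR_RE = re.compile(r"^[A-Za-z_][\w.]*(\s*\|\s*default\(.*\))?$")


def lint_template(body: str) -> list[str]:
    """Return human-readable problems; an empty list means the body is acceptable."""
    problems = []
    # Block tags: {% keyword ... %}
    for match in re.finditer(r"\{%-?\s*(\w+).*?%\}", body):
        tag = match.group(1)
        if tag not in ALLOWED_TAGS:
            problems.append(f"unsupported block tag: {tag}")
    # Substitutions: {{ expression }}
    for match in re.finditer(r"\{\{(.*?)\}\}", body):
        expr = match.group(1).strip()
        if not EXPR_RE.match(expr):
            problems.append(f"unsupported expression: {expr}")
    return problems
```

A template that fails this check is a candidate for splitting into two templates, as the rule above suggests.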
+ +## Process + +### Step 1: Resolve template + +1. If `template_name` is provided, use it directly. +2. Else normalise `intent` (lowercase, strip punctuation, collapse whitespace) and look up in the intent→template map above. +3. If no match, return `template_used: "none"`, `plan_yaml: ""`, and add a warning `"no template matched intent '<intent>'; valid intents: <list>"`. + +### Step 2: Gather variables + +Combine variables in this precedence (higher wins): + +1. Defaults baked into the template's front-matter. +2. Inferred from `tailoring.pharaoh_toml_path` (if present): + - `src_root` from `[pharaoh.codelinks].src_dir` or `[source_discover].src_dir`. + - `docs_root` from sphinx conf lookup (`docs/source/conf.py`, `docs/conf.py`, `conf.py`). +3. Inferred from `tailoring.ubproject_toml_path` (if present): + - `feat_directive`, `feat_prefix`, `comp_req_directive`, `comp_req_prefix` from the `[[needs.types]]` array. +4. Inferred from `docs_root` (after it resolves): + - `docs`: list of relative paths (relative to `project_root`) produced by globbing `<project_root>/<docs_root>/**/*.rst` and `**/*.md`, sorted alphabetically. Excludes `index.rst` / `index.md` (toctree parent is not a feature doc). Empty list if the directory is absent. This satisfies the `doc_files` shape that `pharaoh-feat-draft-from-docs` expects; templates iterate `docs` at write time rather than passing `docs_root` through to the skill. +5. Caller-supplied `vars`. + +Any required_var missing after this merge → add a warning, leave placeholder intact in the emitted plan (caller must fill before executing), do not fabricate a value. + +### Step 3: Render template + +Substitute `{{ var }}` tokens. Evaluate `{% if %}`/`{% for %}` blocks. Emit the rendered body (the part below the template's front-matter) as the plan. 
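Steps 1 to 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's actual implementation: the helper names (`normalise`, `resolve_template`, `merge_vars`, `render`) are hypothetical, punctuation is replaced by a space rather than deleted so that hyphenated and unhyphenated phrasings compare equal, and "higher wins" is read as "later layers win", which agrees with the Input section's statement that caller-provided values win over inferred ones. Only plain `{{ var }}` substitution is shown; `{% if %}` and `{% for %}` blocks are left out.

```python
import re

# Intent phrasings from the Input section; template names omit the .yaml.j2 suffix.
INTENT_MAP = {
    "reverse-engineer project": "reverse-engineer-project",
    "reverse-engineer features and reqs": "reverse-engineer-project",
    "rev-eng full project": "reverse-engineer-project",
    "reverse-engineer module": "reverse-engineer-module",
    "reqs from module": "reverse-engineer-module",
    "extract reqs from files": "reverse-engineer-module",
}


def normalise(intent: str) -> str:
    # Lowercase, turn punctuation into spaces (assumption), collapse whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", intent.lower())
    return " ".join(cleaned.split())


def resolve_template(intent, template_name=None):
    """Step 1: return (template_used, warnings)."""
    if template_name:
        return template_name, []
    lookup = {normalise(key): value for key, value in INTENT_MAP.items()}
    hit = lookup.get(normalise(intent))
    if hit is None:
        valid = ", ".join(sorted(INTENT_MAP))
        return "none", [f"no template matched intent '{intent}'; valid intents: {valid}"]
    return hit, []


def merge_vars(*layers):
    """Step 2: later layers win; pass template defaults first and caller vars last."""
    merged = {}
    for layer in layers:
        merged.update(layer or {})
    return merged


def render(body, variables):
    """Step 3, substitution only: unresolved placeholders stay intact and are reported."""
    warnings = []

    def substitute(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        warnings.append(f"required var '{name}' missing; placeholder left intact")
        return match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, body), warnings
```

Keeping the missing placeholder in the output, rather than substituting a guess, mirrors the rule that the caller must fill it before executing the plan.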
+ +### Step 3.5: Probe required sphinx extensions and insert prerequisite tasks + +Before validating the rendered plan, probe `conf.py` to verify that the renderers required by diagram-emitting tasks are loaded. When a required extension is missing, this step enriches the plan with a `pharaoh-sphinx-extension-add` prerequisite task — it does NOT strip the diagram task. The plan body gains a task; no task is removed. This preserves the B1 invariant ("enrich, never strip") while also giving the executor an actionable step instead of a human handoff. + +**Probe procedure:** + +1. Resolve `conf.py` using the same lookup chain as Step 2 (`<docs_root>/conf.py` → `docs/source/conf.py` → `docs/conf.py` → `<project_root>/conf.py`). If absent, emit a warning and skip this step. +2. Parse `extensions = [...]` from the resolved `conf.py`. Flatten to a set of imported extension module paths. +3. Scan the rendered plan for diagram-emitting skills. Each has a fixed renderer surface: + + | Skill | Default renderer | Required extension module | pypi package | + | ------------------------------ | ---------------- | ------------------------- | ---------------------- | + | `pharaoh-feat-component-extract` | mermaid (or uml) | `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` | + | `pharaoh-feat-flow-extract` | mermaid (or uml) | `sphinxcontrib.mermaid` | `sphinxcontrib-mermaid` | + | `pharaoh-diagram-lint` | both | `sphinxcontrib.mermaid` AND/OR `sphinxcontrib.plantuml` | same, by renderer | + + If a task's inputs include `renderer_override: "plantuml"`, the required extension becomes `sphinxcontrib.plantuml` (pypi: `sphinxcontrib-plantuml`). If the template does not set `renderer_override`, fall back to `pharaoh.toml [pharaoh.diagrams].renderer` when readable, else `mermaid`. +4. Collect the set of missing extensions (present in the "required" column for some diagram task but absent from `conf.py` extensions list). If empty, skip to Step 4. +5. 
**Insert a prerequisite task into the plan body.** For all missing extensions (batched into one task invocation — `pharaoh-sphinx-extension-add` accepts a list), append a task with deterministic id `sphinx_extension_add`: + + ```yaml + - id: sphinx_extension_add + skill: pharaoh-sphinx-extension-add + inputs: + conf_py: <resolved conf_py path> + extensions: [<list of missing extension modules>] + install_if_missing: true + on_package_manager_missing: warn + reporter_id: "write-plan:sphinx-extension-add" + depends_on: [] + ``` + + Place this task before any diagram-emitting task in the plan's task list. + +6. **Rewrite `depends_on` of every diagram-emitting task** so it includes `sphinx_extension_add` as a dependency. This preserves the diagram task's existing dependencies and adds `sphinx_extension_add` to the list. +7. For each missing extension, ALSO append a warning entry (human-readable handoff in case someone inspects the plan without running it): + + ``` + diagram task '<task_id>' emits <renderer> blocks but conf.py does not load '<ext_module>'. Plan includes prerequisite task 'sphinx_extension_add' that will install '<pypi_pkg>' and update conf.py before diagram tasks run. Requires a resolvable package manager in the execution environment. + ``` + +**Design notes:** + +- No task is ever REMOVED. This step is plan-body enrichment: one new task + `depends_on` additions, no deletions. This preserves the B1 invariant. +- If `pharaoh-sphinx-extension-add` is itself missing from the skills tree (e.g. an old installation), log a warning and fall back to warn-only mode (do not insert the task). +- If probing fails (e.g. `conf.py` unparseable), emit one warning naming the parse failure and proceed without insertion. Do not abort. +- The prerequisite task is always batched (one task per plan, not one per missing extension), keeping the plan's task count bounded. 
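The probe and enrichment in Step 3.5 can be illustrated with the standard-library `ast` module. The sketch below is a simplified assumption-laden version: it treats the plan as a list of task dicts, handles only the default mermaid renderer (no `renderer_override`, no `pharaoh-diagram-lint`), emits one abbreviated warning per diagram task, and uses hypothetical helper names `loaded_extensions` and `enrich_plan`. The prerequisite task fields are copied from the YAML in this step.

```python
import ast

DIAGRAM_SKILLS = {"pharaoh-feat-component-extract", "pharaoh-feat-flow-extract"}


def loaded_extensions(conf_py_source: str) -> set[str]:
    """Collect string literals assigned to a module-level `extensions` list in conf.py."""
    found = set()
    for node in ast.parse(conf_py_source).body:
        if isinstance(node, ast.Assign) and any(
            isinstance(target, ast.Name) and target.id == "extensions"
            for target in node.targets
        ):
            if isinstance(node.value, (ast.List, ast.Tuple)):
                for elt in node.value.elts:
                    if isinstance(elt, ast.Constant) and isinstance(elt.value, str):
                        found.add(elt.value)
    return found


def enrich_plan(tasks, conf_py_path, conf_py_source, required="sphinxcontrib.mermaid"):
    """Return (tasks, warnings). Tasks are never removed, only added or extended."""
    diagram_tasks = [task for task in tasks if task["skill"] in DIAGRAM_SKILLS]
    if not diagram_tasks:
        return tasks, []
    try:
        loaded = loaded_extensions(conf_py_source)
    except SyntaxError as exc:
        # Probing failed: warn and proceed without insertion, do not abort.
        return tasks, [f"conf.py unparseable ({exc.msg}); skipping extension probe"]
    if required in loaded:
        return tasks, []
    prerequisite = {
        "id": "sphinx_extension_add",
        "skill": "pharaoh-sphinx-extension-add",
        "inputs": {
            "conf_py": conf_py_path,
            "extensions": [required],
            "install_if_missing": True,
            "on_package_manager_missing": "warn",
            "reporter_id": "write-plan:sphinx-extension-add",
        },
        "depends_on": [],
    }
    warnings = []
    for task in diagram_tasks:
        # Existing dependencies are kept; the prerequisite is appended in place.
        deps = task.setdefault("depends_on", [])
        if "sphinx_extension_add" not in deps:
            deps.append("sphinx_extension_add")
        warnings.append(
            f"diagram task '{task['id']}' emits mermaid blocks but conf.py does not "
            f"load '{required}'; see prerequisite task 'sphinx_extension_add'"
        )
    # Place the prerequisite before the first diagram-emitting task.
    first = tasks.index(diagram_tasks[0])
    return tasks[:first] + [prerequisite] + tasks[first:], warnings
```

Parsing `conf.py` with `ast` rather than importing it avoids executing project code during planning, which fits the skill's "pure transformation" contract; whether the real skill does this is not stated in the specification.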
+ +### Step 4: Validate against schema + +Invoke the static-validation portion of `pharaoh-execute-plan/schema.md` mentally: + +1. Parse rendered text as YAML. +2. Confirm required top-level fields present. +3. Confirm every `skill:` references an existing skill directory under `<pharaoh>/skills/` or `<papyrus>/skills/`. +4. **Terminal quality-gate invariant (unconditional).** Compute the set of tasks with no downstream dependents (no other task lists them in `depends_on`). That set must be non-empty and every task in it must have `skill: pharaoh-quality-gate`. Rationale: when quality gate is absent or non-terminal, executors skip it under cost pressure and ID/body/satisfies checks go unenforced. Every reverse-engineering template ships with a terminal `quality_gate`; this check prevents custom templates from drifting away from that invariant. +5. Confirm the ordering invariants specific to reverse-engineering intents: + - preseed_papyrus before any req-emission task. + - id-allocate before any req-emission task referring to allocated IDs. +6. On any violation: abort, return empty `plan_yaml`, add the violation to `warnings`. + +### Step 5: Return + +Return `{plan_yaml, template_used, warnings}`. + +## Heuristics carried from deleted composition skills + +These are the domain bits that used to live in prose inside `pharaoh-feats-from-project` and `pharaoh-reqs-from-module`. They now live in template content and in the variable-inference logic above. Enumerated for auditability: + +1. **split_strategy per file.** Templates emit `split_strategy: ${heuristics.split_strategy(${item.file})}` on every `pharaoh-req-from-code` task, so the executor's helper evaluates it at dispatch time (no LOC counting at write time). The helper's rule: LOC ≤ 500 → single; 500 < LOC ≤ 2000 and section markers → sections; else → top_level_symbols. +2. 
**preseed Papyrus before reqs (unconditional).** Templates always emit a `preseed_papyrus` task (using `pharaoh-decision-record` or a dedicated future skill) with `depends_on: []` and a `depends_on: [preseed_papyrus]` on every req-emission task. Preseed registers canonical feat names in Papyrus, not file-level associations. Every feat-emission agent queries the same canonical-name namespace regardless of which files it touches, so preseed is always useful even when concurrent agents work on disjoint file sets. +3. **id-allocate before reqs.** Templates emit `pharaoh-id-allocate` after file discovery (feat-file-map or module-file-enum), producing `allowed_ids` that req-emission tasks consume via ref. +4. **Multi-parent reqs.** When a file maps to multiple feats (feat-file-map's `shared_with`), the template emits a single req-from-code task with `parent_feat_ids: ${item.parents}` (a list), not one task per parent. +5. **Quality-gate terminal.** Templates always include `pharaoh-quality-gate` as the last task, taking `artefacts_dir: ${workspace}/artefacts`. +6. **file-per-feature layout.** Templates produce one artefact per feat via the `pharaoh-toctree-emit` task's inputs, matching the layout committed in `pharaoh-feats-from-project` Task 6 of the prior plan. +7. **Diagram dep probe (enrich, never strip).** Write-plan reads `conf.py` extensions and, when a plan includes diagram-emitting tasks whose renderer extension is absent, inserts a `pharaoh-sphinx-extension-add` prerequisite task into the plan body (see Step 3.5). The diagram tasks remain in place and gain the new task as a dependency. A warning is still emitted alongside for human inspection. The contract is: write-plan informs AND enriches, never drops; the diagram deliverable is both visible and actually runnable end-to-end. + +## Review invariant (self-review) + +Every emission task in a plan is followed by its matching review task in the DAG. 
Mapping in [`shared/self-review-map.yaml`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-map.yaml). The template handles this automatically — user does not need to request review tasks. + +The terminal `pharaoh-quality-gate` task lists all review tasks in its `depends_on` and configures `gate_spec.invariants.self_review_coverage: true` so that missing reviews fail the gate. See [`shared/self-review-invariant.md`](https://github.com/useblocks/pharaoh/blob/v1.2.0/skills/shared/self-review-invariant.md). + +The two companion invariants are also unconditional: + +- `papyrus_non_empty` — enabled when the plan ran `preseed_papyrus`. Catches the LLM-executor-skipped-Papyrus-writes failure class. +- `dispatch_signal_matches_plan` — always enabled. Catches the LLM-executor-collapsed-parallel-to-inline failure class. + +Both delegate to atomic check skills; the gate itself stays a pure aggregator. + +## Failure modes + +| Condition | Response | +| ------------------------------------------- | -------------------------------------------------------- | +| Intent matches no template | Return empty plan; warning enumerating valid intents. | +| Required var missing after merge | Leave placeholder; warning; caller must fill. | +| Template references a non-existent skill | Abort with warning; return empty plan. | +| Rendered YAML fails schema validation | Abort with warning; return empty plan. | +| tailoring paths unreadable | Proceed with defaults; warning per missing file. | + +## Framework for remaining *-diagram-draft skills + +`pharaoh-use-case-diagram-draft` is the first concrete implementation of the `*-diagram-draft` family. It demonstrates the pattern every future draft skill must follow: + +1. Frontmatter `name`, `description` starting "Use when", `chains_from` / `chains_to: [pharaoh-diagram-review]`. +2. Atomicity section showing (a)-(e). +3. Input section naming `renderer` + `tailoring_path` + per-diagram-type scope inputs. +4. 
Output section with `{diagram_block, element_count, renderer}` shape. +5. Two "How to emit" subsections — one per renderer (mermaid, plantuml). +6. "Safe labels" subsection linking to `shared/diagram-safe-labels.md`. +7. "Relationship semantics" subsection linking to `shared/uml-relationship-semantics.md` if the diagram kind uses structural relationships. +8. "Last step" subsection invoking `pharaoh-diagram-review` per the self-review invariant. + +Agent cross-ref in `.github/agents/pharaoh.<name>.agent.md` required for CI. + +Diagram-draft skill catalogue (one per UML / SysML view the emitter supports): + +- `pharaoh-use-case-diagram-draft` — **shipped**, runnable end-to-end. +- `pharaoh-sequence-diagram-draft` — design-only scaffold. +- `pharaoh-component-diagram-draft` — design-only scaffold. +- `pharaoh-class-diagram-draft` — design-only scaffold. +- `pharaoh-state-diagram-draft` — design-only scaffold. +- `pharaoh-activity-diagram-draft` — design-only scaffold. +- `pharaoh-block-diagram-draft` — design-only scaffold. +- `pharaoh-deployment-diagram-draft` — design-only scaffold. +- `pharaoh-fault-tree-diagram-draft` — design-only scaffold. + +Shipped skills follow the canonical skeleton (frontmatter → input → output → process → review invocation) and have a matching agent under `.github/agents/`. Design-only scaffolds declare the same frontmatter + agent pair for plan-authoring and CI validation, but their process body is a `DESIGN ONLY` placeholder with a sentinel FAIL — implementation lands per-kind when a flow actually needs that view. Use-case extraction is handled end-to-end by `pharaoh-use-case-diagram-draft`; component and flow views are currently extracted directly from code by `pharaoh-feat-component-extract` and `pharaoh-feat-flow-extract` (not by the draft skills). + +## Non-goals + +- No skill discovery beyond checking directory existence. If a template references `pharaoh-foo` and that directory exists, write-plan trusts its contents are valid. 
+- No file enumeration. File lists come from tasks in the plan at execute time (e.g., `pharaoh-feat-file-map`). +- No artefact emission. This skill emits a plan, not artefacts. +- No branching at runtime. If an intent has two variants ("with diagrams" vs "without"), add two templates.
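Returning to the terminal quality-gate invariant from Step 4, the check amounts to finding every task that no other task depends on and confirming each one invokes `pharaoh-quality-gate`. The sketch below assumes the plan is already parsed into a list of task dicts with `id`, `skill`, and optional `depends_on` keys; the function name `terminal_gate_violations` is hypothetical.

```python
def terminal_gate_violations(tasks):
    """Return violations of the terminal quality-gate invariant (empty list = pass)."""
    # Every task id that appears in some other task's depends_on list.
    depended_on = {dep for task in tasks for dep in task.get("depends_on", [])}
    # Terminal tasks have no downstream dependents.
    terminals = [task for task in tasks if task["id"] not in depended_on]
    if not terminals:
        return ["no terminal task found; the terminal set must be non-empty"]
    return [
        f"terminal task '{task['id']}' uses {task['skill']}, expected pharaoh-quality-gate"
        for task in terminals
        if task["skill"] != "pharaoh-quality-gate"
    ]
```

A plan where a review task is left dangling (not listed in the gate's `depends_on`) shows up here as a second terminal task, which is the drift the invariant is meant to catch.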