From cf70fca5748434eff149455c5984c151cea429cd Mon Sep 17 00:00:00 2001 From: Brett Date: Fri, 1 May 2026 04:55:19 -0500 Subject: [PATCH 01/15] docs(skill): refresh anc cross-references for v0.3.0 + RELEASES drift sweep + SYNCS map (#14) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Three docs landings batched together since they all share the v0.3.0 release-prep arc: - **Skill bundle refresh** — closes the v0.3.0 gap that left the "fix → re-run → claim badge" loop incomplete in the agent-facing guide. - **RELEASES drift sweep** — backports the triple-diff verification block from `agentnative-cli` so the `dev → release/main` flow catches drift in both directions. - **SYNCS map** — new `docs/SYNCS.md` routing map for how spec content flows in and the bundle flows out. ## Changelog ### Added - `references/update-check.md` — pulled-out operational detail for the consumer-side update-check script (prompt copy, snooze ladder, state-dir layout). - New "The anc loop" section in `SKILL.md` documenting scorecard schema 0.5 fields (`coverage_summary.must.verified`, `badge.eligible`, `badge.score_pct`, `badge.embed_markdown`), the 80% badge eligibility floor, and the four `--audit-profile` categories (`human-tui`, `file-traversal`, `posix-utility`, `diagnostic-only`). - `anc skill install ` documented in `getting-started.md` § "Installing anc and this skill bundle" with `--dry-run`, `eval $(...)` capture, and `--output json` envelope. - `docs/SYNCS.md` — cross-repo sync map covering inbound (`agentnative` spec → this repo via `scripts/sync-spec.sh`) and outbound (this repo → consumer hosts; `agentnative-site` daily probe) edges, with manifest-vs-bundle ownership diagrams. ### Changed - Vendored-spec prose reference in `SKILL.md` bumped `v0.2.0 → v0.3.0` to match `spec/VERSION`. - `SKILL.md` description expanded with Rust/clap, scorecard, audit-profile, agent-native badge, and `anc skill install` keywords plus a SKIP clause that routes TUI builders to `--audit-profile human-tui` instead of this skill. - `SKILL.md` "Update check" block compressed from 35 lines (which buried the first-action intent) to a 6-line "First action: update check" stub; details moved to `references/update-check.md`. - `RELEASES.md` § "Releasing dev to main" step 4 — single guarded-paths grep replaced with a triple-diff verification block (A: main→release, B: release→dev, C: dev→main) plus a `git cherry HEAD origin/dev` patch-id check with squash-merge triage guidance. Mirrors the same step that landed on `agentnative-cli` during v0.3.0 prep. ## Type of Change - [x] `docs`: Documentation update ## Related Issues/Stories - Story: n/a - Issue: n/a - Architecture: n/a - Related PRs: agentnative-cli #45 (RELEASES triple-diff source), agentnative-cli #41 (SYNCS.md template), agentnative-cli #40 (badge schema 0.5 docs) ## Testing - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing **Test Summary:** - markdownlint passes on all four touched/created files (auto-fix hook ran on each Write). - Verified `spec/VERSION` reads `0.3.0` so the prose bump aligns with the vendored snapshot. - Verified `--audit-profile` category list matches `cargo run -- check --help` output from `~/dev/agentnative-cli` HEAD. - Verified scorecard schema 0.5 keys (`coverage_summary`, `badge.*`, `audit_profile`, `tool/anc/run/target`) against a live `cargo run -- check --output json .` invocation. ## Files Modified **Modified:** - `SKILL.md` — compressed update-check block, added "The anc loop" section, expanded description with trigger keywords + SKIP clause, bumped spec ref to v0.3.0. - `getting-started.md` — added `anc skill install` examples, badge claim step in the existing-CLI loop, schema-0.5 stop conditions, `--audit-profile` category list. - `RELEASES.md` — step 4 triple-diff verification + `git cherry` patch-id check with squash-merge triage guidance. **Created:** - `references/update-check.md` — operational details pulled out of SKILL.md (prompt copy, snooze ladder, state-dir layout). - `docs/SYNCS.md` — cross-repo sync routing map. **Renamed:** - None. **Deleted:** - None. ## Breaking Changes - [x] No breaking changes - [ ] Breaking changes described below: ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] No new warnings or errors introduced - [x] Changes are backward compatible --- RELEASES.md | 56 ++++++++++++++++- SKILL.md | 100 +++++++++++++++++------------ docs/SYNCS.md | 125 +++++++++++++++++++++++++++++++++++++ getting-started.md | 24 +++++-- references/update-check.md | 33 ++++++++++ 5 files changed, 291 insertions(+), 47 deletions(-) create mode 100644 docs/SYNCS.md create mode 100644 references/update-check.md diff --git a/RELEASES.md b/RELEASES.md index e9f7616..aff8cf3 100644 --- a/RELEASES.md +++ b/RELEASES.md @@ -66,9 +66,59 @@ git log --oneline dev --not origin/main # docs/brainstorms/, or docs/reviews/) stay on dev. git cherry-pick ... -# 4. Verify no guarded paths leaked through: -git diff origin/main --name-only | grep -E '^docs/(plans|solutions|brainstorms|reviews)/' && \ - echo 'FAIL: guarded docs in release branch' || echo 'OK: no guarded docs' +# 4. Triple-diff verification — belt-and-suspenders sweep that catches both +# directions of drift before the release tag goes out: +# +# A. main → release (what users will see; the intended ship surface) +# B. release → dev (should be empty for non-doc paths until the +# bump/CHANGELOG commits land, and even then should +# only list those release-prep files — anything else +# is a missed cherry-pick) +# C. dev → main (sanity: phantom commits dev "appears ahead" on +# because cherry-pick rewrites SHAs post-squash) +git diff origin/main..HEAD --stat # A +git diff HEAD..origin/dev --name-only | grep -v '^docs/' || echo "(none)" # B +git diff origin/dev..origin/main --stat | tail -5 # C +# +# Re-confirm no guarded paths leaked (this caught the original miss class): +git diff origin/main..HEAD --name-only \ + | grep -E '^(docs/plans|docs/brainstorms|docs/ideation|docs/reviews|docs/solutions|\.context)' \ + && echo "LEAKED — reset and redo" || echo "(clean — no guarded paths)" +# +# Patch-id cherry check — catches commits on dev that have NO patch-id +# equivalent on release. The file-level diff in B misses this class when +# the same content happens to land via a different commit. +# +# IMPORTANT: in a squash-merge workflow this output is noisy. Every '+' +# line needs human triage — it does NOT auto-block the release. Expected +# sources of '+' lines that are NOT real misses: +# +# 1. Historical commits squash-merged in prior releases. The squash +# commit on main has a different patch-id than the dev commits it +# consolidates, so old commits show as '+' forever. Anything older +# than the previous release tag is almost always this. +# 2. Cherry-picks where conflict resolution stripped guarded paths +# (docs/plans, docs/brainstorms, etc.) or otherwise altered the +# tree. Same source-code intent, different patch-id. +# 3. Intentionally skipped commits — docs-only commits, release-prep +# backports, revert-and-redo prep steps. +# +# A real miss looks like: a recent feat/fix/chore commit on dev whose +# *file content* is not yet on main. To triage a '+' line: +# +# git show --stat # what did it touch? +# git diff origin/main..HEAD -- # already on release? +# +# If every touched file is guarded (docs/plans/, docs/brainstorms/, etc.) +# OR the content is already on main via a prior squash, it's a false +# positive — no action. Otherwise cherry-pick the commit and re-run the +# triple-diff. +git cherry HEAD origin/dev | grep '^+' || echo "(none — release is patch-equivalent through dev)" +# +# If B lists any non-docs path you didn't expect, fetch dev, identify the +# commit (`git log dev --not origin/main`), cherry-pick it, re-run the +# triple-diff. Missed cherry-picks have shipped to main on this and sibling +# repos before — this step is the cheap way to catch them. # 5. Bump VERSION on the release branch. echo '' > VERSION diff --git a/SKILL.md b/SKILL.md index cd5340a..66f26cc 100644 --- a/SKILL.md +++ b/SKILL.md @@ -4,10 +4,15 @@ description: >- Guide to designing, building, and auditing CLI tools for use by AI agents. Pairs with [`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical compliance checker) and [`agentnative-spec`](https://github.com/brettdavies/agentnative) (the canonical principle text, vendored at - `spec/`). Provides starter templates, language-specific implementation idioms, and a short - getting-started guide that points agents at `anc check --output json`. Use when designing a new CLI tool, - reviewing one for agent-readiness, or remediating findings from `anc`. Triggers on agentic CLI, agent-native, - CLI design, CLI standard, agent-first, CLI for agents, agent-friendly CLI, CLI compliance, anc. + `spec/`). Provides starter templates, language-specific implementation idioms (Rust/clap, Python Click & argparse, + Go Cobra, JS Commander/yargs/oclif, Ruby Thor), and a getting-started guide that points agents at + `anc check --output json` and `anc skill install `. Use when designing a new CLI tool, building a Rust/clap + binary intended for agents, reviewing one for agent-readiness, claiming the agent-native badge, or remediating + findings from `anc`. Triggers on agentic CLI, agent-native, CLI design, CLI standard, agent-first, CLI for agents, + agent-friendly CLI, CLI compliance, agent-readiness, anc, anc check, anc skill install, agent-native badge, + scorecard, audit-profile, Rust CLI, clap derive. SKIP when the user is building a TUI app meant for humans (use + `--audit-profile human-tui` rather than this skill), writing a non-CLI library, or asking unrelated Rust questions + not specifically about agent-readiness of a CLI. --- # Agent-Native CLI @@ -23,51 +28,27 @@ The standard for CLI tools designed to be operated by AI agents. Three artifacts The skill does **not** implement principles checking. `anc` does. The skill teaches agents to use `anc` and supplies the surrounding context (spec, idioms, templates) that `anc`'s findings reference. -## Update check +## First action: update check -On first invocation per session, run `bin/check-update`. It compares this bundle's `VERSION` against `main` on GitHub -and prints one of: - -| Output | Meaning | -| ------------------------------------ | ------------------------------------------------------------------------------- | -| (empty) | Up to date, snoozed, disabled, or check skipped (broken install, no network). | -| `UPGRADE_AVAILABLE ` | A newer release is on `main`. Surface the upgrade flow below before continuing. | +Run once per session before doing real work: ```bash bash "$(dirname "$0")/bin/check-update" ``` -Exit code is always 0; failures degrade silently. - -### Inline upgrade flow - -When stdout is `UPGRADE_AVAILABLE `, ask the user via `AskUserQuestion`: - -> `agent-native-cli` **v{remote}** is available (you're on v{local}). Upgrade now? - -Three options: - -- **"Yes, upgrade now"** — run `git -C pull --ff-only`. Report the new HEAD and the upgrade outcome. - The bundle root is the parent of ``; `git -C ../.. pull --ff-only` from `bin/` works for the default install layout - (`~//skills/agent-native-cli/`). If `--ff-only` rejects (uncommitted edits or divergent history), surface git's - error verbatim and stop — do not auto-stash. -- **"Not now"** — write `$HOME/.cache/agent-native-cli/update-snoozed` in the format ` `, where - `` is `1` (24h reminder), `2` (48h), or `3` (7 days), escalating each time the user defers. Tell the user the - next reminder window. -- **"Never ask again"** — `touch $HOME/.cache/agent-native-cli/disabled` and tell the user how to re-enable (`rm - $HOME/.cache/agent-native-cli/disabled`). - -State directory: `$HOME/.cache/agent-native-cli/`. All three files (`last-update-check`, `update-snoozed`, `disabled`) -live there; the script auto-creates the directory on first slow-path fetch. +Empty output → continue. `UPGRADE_AVAILABLE ` → prompt the user via `AskUserQuestion` (Yes / Not now / +Never). Full prompt text, snooze ladder, and state-file layout are in +[`references/update-check.md`](./references/update-check.md). Exit code is always `0`; failures degrade silently. ## Start here → **[`getting-started.md`](./getting-started.md)** — the three working loops (existing CLI / new Rust CLI / other -language), the canonical `anc check` invocations, and a "where things live" map. +language), the canonical `anc check` invocations, the `anc skill install ` installer, and a "where things live" +map. ## The seven principles -Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.2.0`; see +Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.3.0`; see [`spec/README.md`](./spec/README.md) for resync instructions). One file per principle, each with machine-readable `requirements[]` frontmatter: @@ -83,6 +64,38 @@ Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative- Do not paraphrase the principles inside this skill — read the spec files directly. They are the source of truth. +## The anc loop: check → fix → re-check → claim badge + +Once `anc` is installed (one-line install in [`getting-started.md`](./getting-started.md)), the work is a four-step +loop: + +**1. Check.** `anc check --output json . > scorecard.json`. The JSON envelope is schema `0.5` and contains: + +- `summary` — `total / pass / warn / fail / skip / error` count. +- `coverage_summary` — `must / should / may`, each with `total` + `verified`. `must.verified == must.total` is the bar + for "no MUST violations". +- `badge.eligible` (bool), `badge.score_pct` (int), `badge.embed_markdown` (string or `null`), `badge.scorecard_url`, + `badge.badge_url`, `badge.convention_url`. **80%** is the eligibility floor; below it, `embed_markdown` is `null` and + the convention says do not advertise a badge. +- `results[]` — per-check entries citing `requirement_id`, `status`, and `evidence`. +- `audit_profile` — the exemption category in effect (or `null`). +- `tool / anc / run / target` metadata — identifies the scored tool, the `anc` build, the invocation, and the resolved + target. + +**2. Fix.** For each `fail`, look up the cited `requirement_id` (e.g. `p1-must-no-interactive`) in +`spec/principles/p-*.md`'s `requirements[]` frontmatter. Apply the fix using the implementation references below. +Re-run with `--principle ` to focus on one principle while iterating. + +**3. Re-check.** Re-run `anc check --output json .` until `summary.fail == 0` and `coverage_summary.must.verified == +coverage_summary.must.total`. Use `--audit-profile ` to suppress checks that don't apply to the tool class — +`human-tui` (TUIs that legitimately intercept the TTY), `file-traversal` (reserved), `posix-utility` (cat / sed / awk +style), `diagnostic-only` (read-only tools). Suppressed checks emit `Skip` with structured evidence so readers see what +was excluded. + +**4. Claim the badge.** Once `badge.eligible == true` (≥80%), copy `badge.embed_markdown` into the project's README. The +`text` output appends an embed hint after the summary line whenever the floor is cleared; below the floor, nothing +badge-related is printed (the convention's "do not nag" rule). + ## Implementation guidance (when fixing findings) Once `anc check` reports a failure, the agent has the cited `requirement_id` and the spec text. The next question is @@ -108,16 +121,23 @@ Drop-in starting points for greenfield Rust CLIs. Each encodes the relevant prin ## Compliance checking -Use `anc`. Install once: +Use `anc`. Install once via Homebrew, cargo, or self-install: ```bash brew install brettdavies/tap/agentnative # binary is `anc` cargo install agentnative ``` -Recommended invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not write shell -scripts to grep for principle violations — `anc` already implements (and supersedes) every check that approach could -produce. +To install **this skill bundle** into a host's canonical skills directory, use `anc`'s built-in installer: + +```bash +anc skill install claude_code # also: codex, cursor, factory, kiro, opencode +anc skill install --dry-run claude_code # print resolved git command without spawning +``` + +Recommended `anc check` invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not +write shell scripts to grep for principle violations — `anc` already implements (and supersedes) every check that +approach could produce. ## Sources diff --git a/docs/SYNCS.md b/docs/SYNCS.md new file mode 100644 index 0000000..efd90c5 --- /dev/null +++ b/docs/SYNCS.md @@ -0,0 +1,125 @@ +# Cross-repo sync map + +How spec content flows in and how the skill bundle flows out. Source of truth for the sync mechanisms that connect this +repo to its three sibling repos (`agentnative` spec, `agentnative-site`, `agentnative-cli`) and to consumer agent hosts. + +> This document is a routing map. Per-script details (env vars, fallback behavior, exit codes) live in the script +> headers. Re-vendor procedure during a release lives in [`RELEASES.md`](../RELEASES.md). Vendored-spec mechanics live +> in [`spec/README.md`](../spec/README.md). + +## Upstream — data flowing INTO this repo + +| Source | Mechanism | What's synced | Trigger / cadence | Drift check | +| -------------------------------------------------- | ------------------------------- | ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `brettdavies/agentnative` (spec) @ latest `v*` tag | `scripts/sync-spec.sh` (manual) | `principles/p*-*.md` + `VERSION` + `CHANGELOG.md` → `spec/` | Re-run on the `release/v` branch as part of every release (see `RELEASES.md` §"Spec re-vendoring"). | None automated — relies on the release-branch checklist. The script itself is idempotent; `git status` after a re-run surfaces orphan files from spec renames. | + +**Mechanism notes:** + +- Resolution is **remote-first**: queries `https://github.com/brettdavies/agentnative.git` for the latest `v*` tag, + shallow-clones it into a temp dir, extracts files via `git show` so neither working tree is perturbed. +- Falls back to a local checkout (`$HOME/dev/agentnative-spec`, override with `SPEC_ROOT`) only when the remote query + fails. +- Mirror of `~/dev/agentnative-cli/scripts/sync-spec.sh` — only `DEST_DIR` differs. Keep them aligned when changing + either. + +## Inbound + outbound data map + +```mermaid +flowchart LR + subgraph Inbound + SPEC[brettdavies/agentnative
spec repo] + end + + SKILL[brettdavies/agentnative-skill
this repo
bundle + git tags] + + subgraph Outbound + CC[Claude Code
git clone] + CUR[Cursor
git clone] + CDX[Codex
git clone] + end + + SITE[brettdavies/agentnative-site] + + SPEC -- "scripts/sync-spec.sh
(MANUAL — NOT on spec's
repository_dispatch list)" --> SKILL + SKILL -- "release tag v<X.Y.Z>
= bundle pin" --> CC + SKILL -- "release tag v<X.Y.Z>
= bundle pin" --> CUR + SKILL -- "release tag v<X.Y.Z>
= bundle pin" --> CDX + SITE -. "git ls-remote (daily 13:00 UTC)
skill-availability.yml
visibility probe" .-> SKILL +``` + +This repo is intentionally **off** the spec's `repository_dispatch` list — re-vendoring is a deliberate release-time +act, not a webhook reflex. The site's daily probe is the only automated edge that touches this repo from outside, and +it's read-only (`git ls-remote`) — it never mutates state here. + +## Downstream — data flowing OUT of this repo + +This repo is **not** the source of truth for `skill.json`. The `agentnative-site` repo holds the canonical manifest at +`src/data/skill.json`; this repo's role downstream is (a) to be the git target that `skill.json`'s `source.url` points +at, and (b) to be the bundle that `install.` commands clone. + +### Manifest source-of-truth vs bundle content + +```mermaid +flowchart LR + subgraph ManifestPath[Manifest text — owned by agentnative-site] + SJSON[src/data/skill.json
SoT for manifest fields
incl. source.commit] + EMITTER[src/build/skill.mjs
validator + emitter] + ENDPOINT[/skill.json endpoint/] + SJSON --> EMITTER --> ENDPOINT + end + + subgraph BundlePath[Bundle content — owned by this repo] + BUNDLE[agentnative-skill
SKILL.md + spec/ + scripts/
+ git tags v<X.Y.Z>] + end + + SJSON -. "source.commit pins
a SHA from this repo
(hand-edited at release)" .-> BUNDLE +``` + +The two paths are independent artifacts that meet only at the `source.commit` pin. The site decides what the manifest +*says*; this repo decides what the bundle *is*. Editing `skill.json` here would be a category error — the field +ownership lives across the boundary in `agentnative-site/src/data/skill.json`. + +| Consumer | Mechanism | What's distributed | Trigger / cadence | Drift check | +| ----------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Agent hosts (Claude Code, Codex, Cursor, Factory, Kiro, OpenCode) | Plain `git clone --depth 1 https://github.com/brettdavies/agentnative-skill.git /agent-native-cli`. Update path is `git -C pull --ff-only` triggered inline by `bin/check-update`'s `UPGRADE_AVAILABLE`. | The full repo as a flat tree. Host auto-discovers `SKILL.md` at the install root and ignores producer-side files. | New release lands on `main` + tag `v`. Consumers pick it up on next `bin/check-update` invocation (first invocation per session, with snooze/disable state in `~/.cache/agent-native-cli/`). | None on the producer side — no telemetry. `bin/check-update` is the consumer-side mechanism; it compares local `VERSION` to `main` and prompts the user. | +| `brettdavies/agentnative-site` `/skill` + `/skill.json` endpoints | The site holds the source-of-truth manifest at `src/data/skill.json`. `src/build/skill.mjs` validates and emits `dist/skill.json` byte-stably during the site build. The manifest's `source.commit` is hand-co-edited at release time to pin a specific `agentnative-skill` SHA. | The `/skill.json` machine surface, `/skill.html` human page, and `/skill.md` markdown twin — all derived from the site's own `src/data/skill.json`. | On every site deploy. The site's release procedure includes hand-bumping `source.commit` (and `version`) to match the `agentnative-skill` release being announced. | Synthetic probe — `agentnative-site/.github/workflows/skill-availability.yml` runs `git ls-remote` against this repo daily at 13:00 UTC and fails the run if the repo 404s, gets renamed, or flips private. | +| `brettdavies/agentnative-cli` `src/skill_install/skill.json` | `~/dev/agentnative-cli/scripts/sync-skill-fixture.sh` pulls `src/data/skill.json` from `agentnative-site` (default ref: `dev`). The fixture is also the build-time input for `build.rs`'s host-map codegen — `cargo build` regenerates `SkillHost`, `KNOWN_HOSTS`, and `resolve_host` from it. | Whatever ships in the site's `src/data/skill.json` — the `install.` map, in particular. | Manual re-run when the site updates `src/data/skill.json`. Pre-release checklist in agentnative-cli's `RELEASES.md` captures the cadence. | `agentnative-cli/.github/workflows/skill-fixture-drift.yml` runs `scripts/sync-skill-fixture.sh --check` on every PR; non-zero exit on drift. | + +**Distribution chain summary** — `agentnative-skill` (this repo, the bundle) → `agentnative-site` `src/data/skill.json` +(the manifest, hand-edits `source.commit` to pin a SHA from this repo) → `agentnative-cli` +`src/skill_install/skill.json` (the fixture, synced from the site, drives the Rust `SkillHost` codegen). + +## Release / dispatch chain + +There is **no automated `repository_dispatch` chain** between these three repos. The producer-side workflows are CI-only +(`ci.yml`: markdownlint + shellcheck; `guard-main-docs.yml`: blocks engineering docs from `main`). Cross-repo +propagation is **manual + checklist-driven**. + +A skill release flows like this: + +1. **`agentnative-spec`** ships a new `v*` tag (independent cadence). +2. **`agentnative-skill`** (this repo) opens a `release/v` branch from `main`, cherry-picks non-docs commits from + `dev`, runs `scripts/sync-spec.sh` to re-vendor `spec/`, bumps `VERSION`, generates CHANGELOG, merges to `main`, tags + `v`, creates a GitHub Release. Consumers see the new version on next `bin/check-update`. +3. **`agentnative-site`** updates `src/data/skill.json` (`version` and `source.commit`) to point at the new + `agentnative-skill` SHA, deploys. `/skill.json` now serves the new manifest. The daily `skill-availability` probe + continues to verify the producer repo is reachable. +4. **`agentnative-cli`** runs `scripts/sync-skill-fixture.sh` to pull the updated `src/data/skill.json` from the site, + `cargo build` regenerates the host map, ships in the next `anc` release. The `skill-fixture-drift` workflow ensures + no PR lands while the fixture is stale. + +Each step is gated by a human reading the previous repo's release notes and deciding to advance the chain. There is no +webhook, no scheduled sync, no `repository_dispatch` payload between any pair. + +## Reference + +- `scripts/sync-spec.sh` — header comment has full env-var matrix and fallback behavior. +- `spec/README.md` — vendored-spec layout, attribution, and resync invocation. +- `RELEASES.md` §"Spec re-vendoring" and §"Releasing dev to main" — release-time procedure. +- `~/dev/agentnative-cli/scripts/sync-spec.sh` — parallel implementation; keep aligned. +- `~/dev/agentnative-cli/scripts/sync-skill-fixture.sh` — downstream sync from site → cli; header documents `--check` + mode and CI integration. +- `~/dev/agentnative-site/src/build/skill.mjs` — the validator/emitter that treats `src/data/skill.json` as source of + truth for the `/skill.json` endpoint. +- `~/dev/agentnative-site/.github/workflows/skill-availability.yml` — daily synthetic probe of this repo. +- `~/dev/agentnative-cli/.github/workflows/skill-fixture-drift.yml` — per-PR drift gate on the cli fixture. diff --git a/getting-started.md b/getting-started.md index 5cab6d8..d876693 100644 --- a/getting-started.md +++ b/getting-started.md @@ -18,11 +18,16 @@ anc check --output json . > scorecard.json # references/rust-clap-patterns.md (Rust/clap) # references/framework-idioms.md (Rust idioms) # references/framework-idioms-other-languages.md (Click, argparse, Cobra, Commander, yargs, oclif, Thor) -# Re-run `anc check` until the scorecard is clean. +# Re-run `anc check` until `summary.fail == 0` and +# `coverage_summary.must.verified == coverage_summary.must.total`. + +# 4. Claim the badge once `badge.eligible == true` (≥80%). +# Copy `badge.embed_markdown` from the JSON scorecard into the project's README. ``` Useful flags: `--principle N` to focus on one principle, `--audit-profile ` to suppress checks that don't -apply (e.g. `human-tui` for tools that legitimately intercept the TTY), `--binary` / `--source` to scope. +apply (`human-tui` for TUIs that legitimately intercept the TTY, `posix-utility` for cat/sed/awk-style stdin-primary +tools, `diagnostic-only` for read-only tools, `file-traversal` reserved), `--binary` / `--source` to scope. ## You're building from scratch (Rust) @@ -47,14 +52,24 @@ anc check --output json # run continuously as you build compiled binary on `PATH`. Read `spec/principles/p1-*.md` through `p7-*.md` for the language-agnostic requirements, and `references/framework-idioms-other-languages.md` for per-framework idioms. -## Installing anc +## Installing anc and this skill bundle ```bash +# 1. Install the `anc` binary. brew install brettdavies/tap/agentnative # macOS / Linux cargo install agentnative + +# 2. Install this skill bundle into your host's canonical skills directory. +# Six hosts supported: claude_code, codex, cursor, factory, kiro, opencode. +anc skill install claude_code +anc skill install --dry-run claude_code # preview the git clone +eval $(anc skill install --dry-run claude_code) # captures cleanly for scripted installs +anc skill install --output json claude_code # uniform JSON envelope (success + error) ``` -Binary name: `anc`. Prebuilt releases at . +Binary name: `anc`. Prebuilt releases at . If your host is not +yet in the binary's map, `anc skill install --dry-run ` prints a manual `git clone` you can adapt: `git +clone --depth 1 https://github.com/brettdavies/agentnative-skill.git `. ## Where things live @@ -64,6 +79,7 @@ Binary name: `anc`. Prebuilt releases at ` in Rust/clap? | `references/rust-clap-patterns.md` | | How do I implement `` in Python/Go/JS? | `references/framework-idioms-other-languages.md` | +| How does the badge eligibility threshold work? | `SKILL.md` § "The anc loop" (80% floor) | | File a spec question or proposal | | | File an `anc` bug | | | File a skill-bundle issue | | diff --git a/references/update-check.md b/references/update-check.md new file mode 100644 index 0000000..d4d5e61 --- /dev/null +++ b/references/update-check.md @@ -0,0 +1,33 @@ +# Update check — operational details + +The `bin/check-update` script compares this bundle's `VERSION` against `main` on GitHub. Exit code is always `0`; +network failures, broken installs, and disabled checks all degrade silently. SKILL.md only documents the agent-visible +surface; this file documents the script's contract for anyone debugging it. + +## Output contract + +| Stdout | Meaning | +| ------------------------------------ | ----------------------------------------------------------------------------- | +| (empty) | Up to date, snoozed, disabled, or check skipped (broken install, no network). | +| `UPGRADE_AVAILABLE ` | A newer release is on `main`. Trigger the upgrade prompt below. | + +## Prompt the user via `AskUserQuestion` + +> `agent-native-cli` **v{remote}** is available (you're on v{local}). Upgrade now? + +Three options: + +- **"Yes, upgrade now"** — run `git -C pull --ff-only`. The bundle root is the parent of `bin/`; + `git -C ../.. pull --ff-only` from `bin/` works for the default install layout (`~//skills/agent-native-cli/`). + If `--ff-only` rejects (uncommitted edits or divergent history), surface git's error verbatim and stop — do not + auto-stash. +- **"Not now"** — write `$HOME/.cache/agent-native-cli/update-snoozed` in the format ` `, where + `` is `1` (24h reminder), `2` (48h), or `3` (7 days), escalating each time the user defers. Tell the user the + next reminder window. +- **"Never ask again"** — `touch $HOME/.cache/agent-native-cli/disabled` and tell the user how to re-enable (`rm + $HOME/.cache/agent-native-cli/disabled`). + +## State directory + +`$HOME/.cache/agent-native-cli/` holds three files: `last-update-check`, `update-snoozed`, `disabled`. The script +auto-creates the directory on first slow-path fetch. From d95bc6d6f98c261c5b3ebaed96260fed18f9cf87 Mon Sep 17 00:00:00 2001 From: Brett Date: Tue, 26 May 2026 23:27:38 -0500 Subject: [PATCH 02/15] feat(scripts): sync-spec.sh `--ref` flag + gh api transport (#15) ## Summary Adds a `--ref ` flag (and matching `SPEC_REF` env var) to `scripts/sync-spec.sh` so the skill can vendor `agentnative-spec` from an explicit branch, tag, or commit SHA rather than only the latest `v*` tag. Default behavior (no `--ref`) is unchanged: still resolves the latest `v*` tag. Refactored from `git clone` to `gh api` so all ref types (latest tag, branch, tag, SHA) flow through the same code path (`repos/{owner}/{repo}/contents/{path}?ref=` with `application/vnd.github.raw`). Mirrors the same refactor in `agentnative-cli` and `agentnative-site`. Only `DEST_DIR` differs (skill vendors into `spec/`, cli into `src/principles/spec/`). Motivation: cross-repo coordination of in-flight spec work that has landed on `dev` but is not yet tagged. The release-branch flow needs a way to pin spec content to a specific commit without waiting for spec to cut. ## Changelog ### Added - `--ref ` flag and matching `SPEC_REF` environment variable on `scripts/sync-spec.sh` for vendoring `agentnative-spec` from an explicit branch, tag, or commit SHA. Default behavior (no `--ref`) still resolves the latest `v*` tag. ### Changed - `scripts/sync-spec.sh` now uses `gh api` (raw content endpoint) instead of `git clone` for the primary fetch path. All ref types share one code path; the local-fallback path against `SPEC_ROOT` is preserved for offline runs. ### Documentation - `docs/SYNCS.md` spec-row mechanism column updated to describe `--ref` / `SPEC_REF`, the cross-repo coordination workflow, and the `gh api` resolution semantics. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) - [ ] `fix`: Bug fix (non-breaking change which fixes an issue) - [ ] `refactor`: Code refactoring (no functional changes) - [ ] `perf`: Performance improvement - [ ] `docs`: Documentation update - [ ] `test`: Adding or updating tests - [ ] `chore`: Maintenance tasks (dependencies, config, etc.) - [ ] `ci`: CI/CD configuration changes - [ ] `style`: Code style/formatting changes - [ ] `build`: Build system changes - [ ] `BREAKING CHANGE`: Breaking API change (requires major version bump) ## Related Issues/Stories - Story: n/a - Issue: n/a - Architecture: n/a - Related PRs: mirrors the corresponding `feat/sync-spec-ref-flag` PRs in `brettdavies/agentnative-cli` and `brettdavies/agentnative-site` ## Testing - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing **Test Summary:** - Validated against `--ref dev` (resolves to `b4f4d02`, picks up U1 conditional schema in p2 + p8). - Validated against `--ref v0.4.0` (resolves to `90dd48b`, pre-U1). - Default invocation (no `--ref`) still resolves the latest `v*` tag and produces the same vendored tree as before the refactor. ## Files Modified **Modified:** - `scripts/sync-spec.sh`: added `--ref` / `SPEC_REF`, switched primary fetch to `gh api`, kept local fallback, prints resolved short SHA every run. - `docs/SYNCS.md`: spec-row mechanism column + notes updated. **Created:** - None. **Renamed:** - None. **Deleted:** - None. ## Key Features - `--ref` flag and `SPEC_REF` env var (flag wins over env). - Single code path for all ref types via `gh api` raw content endpoint. - Resolved short SHA printed every run, so the release-branch checklist can record the exact pin. - Local-fallback path against `SPEC_ROOT` still works when `gh api` is unreachable. ## Benefits - Unblocks release-branch flows that need to consume spec content from `dev` (or a specific SHA) before the spec repo cuts a tag. - Removes the shallow-vs-full clone distinction; one transport for every ref type. - Aligned surface across `agentnative-cli`, `agentnative-site`, and `agentnative-skill` makes the cross-repo sync workflow legible. ## Breaking Changes - [x] No breaking changes - [ ] Breaking changes described below: Default behavior (no `--ref`) is unchanged. ## Deployment Notes - [x] No special deployment steps required - [ ] Deployment steps documented below: ## Screenshots/Recordings n/a. Script + docs change. ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible (or breaking changes documented) ## Additional Context This is the third of three coordinated PRs adding `--ref` support across the agent-native repos. The flag, env var, and behavior are identical across all three; only `DEST_DIR` differs per repo. --- docs/SYNCS.md | 17 +-- scripts/sync-spec.sh | 263 +++++++++++++++++++++++++++++++------------ 2 files changed, 201 insertions(+), 79 deletions(-) diff --git a/docs/SYNCS.md b/docs/SYNCS.md index efd90c5..f43c124 100644 --- a/docs/SYNCS.md +++ b/docs/SYNCS.md @@ -9,16 +9,19 @@ repo to its three sibling repos (`agentnative` spec, `agentnative-site`, `agentn ## Upstream — data flowing INTO this repo -| Source | Mechanism | What's synced | Trigger / cadence | Drift check | -| -------------------------------------------------- | ------------------------------- | ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `brettdavies/agentnative` (spec) @ latest `v*` tag | `scripts/sync-spec.sh` (manual) | `principles/p*-*.md` + `VERSION` + `CHANGELOG.md` → `spec/` | Re-run on the `release/v` branch as part of every release (see `RELEASES.md` §"Spec re-vendoring"). | None automated — relies on the release-branch checklist. The script itself is idempotent; `git status` after a re-run surfaces orphan files from spec renames. | +| Source | Mechanism | What's synced | Trigger / cadence | Drift check | +| -------------------------------------------------------------------------------------------- | ------------------------------- | ----------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `brettdavies/agentnative` (spec); default latest `v*` tag, or any branch/tag/SHA via `--ref` | `scripts/sync-spec.sh` (manual) | `principles/p*-*.md` + `VERSION` + `CHANGELOG.md` → `spec/` | Re-run on the `release/v` branch as part of every release (see `RELEASES.md` §"Spec re-vendoring"). Cross-repo coordination: rerun with `--ref dev` (or a SHA) to consume in-flight spec work before a release cuts. | None automated — relies on the release-branch checklist. The script itself is idempotent; `git status` after a re-run surfaces orphan files from spec renames. The script prints the resolved short SHA on every run regardless of ref type so the release branch can record the exact pin. | **Mechanism notes:** -- Resolution is **remote-first**: queries `https://github.com/brettdavies/agentnative.git` for the latest `v*` tag, - shallow-clones it into a temp dir, extracts files via `git show` so neither working tree is perturbed. -- Falls back to a local checkout (`$HOME/dev/agentnative-spec`, override with `SPEC_ROOT`) only when the remote query - fails. +- Resolution uses `gh api` against `https://github.com/brettdavies/agentnative.git` (override with `SPEC_REMOTE_URL`). + Pulls VERSION, CHANGELOG.md, and `principles/p*-*.md` individually via the contents endpoint at the resolved ref. No + clone, no tarball — branches, tags, and SHAs all take the same code path via `?ref=`. +- Default ref is the latest `v*` tag, resolved via the same API. `--ref ` (or `SPEC_REF=` env var) + vendors any explicit ref for cross-repo coordination of in-flight spec work that hasn't released yet. +- Falls back to a local checkout (`$HOME/dev/agentnative-spec`, override with `SPEC_ROOT`) only when `gh api` is + unreachable (offline / unauthenticated). The local-fallback path uses `git` against `SPEC_ROOT`. - Mirror of `~/dev/agentnative-cli/scripts/sync-spec.sh` — only `DEST_DIR` differs. Keep them aligned when changing either. diff --git a/scripts/sync-spec.sh b/scripts/sync-spec.sh index bd1a949..29fe72b 100755 --- a/scripts/sync-spec.sh +++ b/scripts/sync-spec.sh @@ -1,121 +1,240 @@ #!/usr/bin/env bash # Vendor agentnative-spec into spec/. # -# Resolves the latest v* tag of agentnative-spec, preferring the remote -# repository, and falls back to a local checkout if the remote is -# unreachable. Extracts files via `git show :` so neither -# checkout's working tree is perturbed. The vendored tree ships as part -# of the skill bundle so consumers carry the canonical principle text -# alongside the skill metadata. +# Default behavior: resolves the latest v* tag of agentnative-spec via the +# GitHub API and pulls VERSION, CHANGELOG.md, and principles/p*-*.md at +# that tag. The vendored tree ships as part of the skill bundle so +# consuming agents carry the canonical principle text alongside the skill +# metadata. +# +# Override behavior (--ref / SPEC_REF): vendors an explicit branch HEAD, +# tag, or commit SHA instead of the latest v* tag. Use for cross-repo +# coordination of in-flight spec work that hasn't released yet (e.g., a +# CLI/site change that depends on a spec PR landed on `dev` but not yet +# tagged). The resolved short SHA is always printed alongside the ref so +# the user knows exactly what landed; record that SHA in any consumer PR +# body so the vendoring is traceable post-merge. +# +# Transport: `gh api` against the GitHub REST contents endpoint. Pulls +# files individually (no clone, no tarball) so branches, tags, and SHAs +# take the same code path — `?ref=` accepts all three. Requires `gh` +# authenticated against github.com. When the API path fails (network +# down, gh unauthenticated, repo unreachable), the script falls back to +# a local checkout for offline development. # # Usage: -# scripts/sync-spec.sh +# scripts/sync-spec.sh # latest v* tag (default) +# scripts/sync-spec.sh --ref dev # HEAD of dev branch +# scripts/sync-spec.sh --ref v0.4.0 # explicit tag +# scripts/sync-spec.sh --ref b4f4d02 # specific commit SHA +# SPEC_REF=dev scripts/sync-spec.sh # env-var form of --ref # SPEC_ROOT=/path/to/agentnative-spec scripts/sync-spec.sh # SPEC_REMOTE_URL=git@github.com:brettdavies/agentnative.git scripts/sync-spec.sh # +# Flags: +# --ref Branch name, tag, or commit SHA to vendor. Wins over +# SPEC_REF env var. When unset, the script resolves the +# latest v* tag. +# # Env vars: -# SPEC_REMOTE_URL Remote URL to query first. +# SPEC_REF Same as --ref but via env. CLI flag wins on conflict. +# SPEC_REMOTE_URL Remote URL identifying the repo. The script parses +# `/` out of it for the `gh api` calls and +# out of it for the local-fallback's remote-name lookup. # Default: https://github.com/brettdavies/agentnative.git -# SPEC_ROOT Local checkout to fall back to when the remote is +# SPEC_ROOT Local checkout to fall back to when the API is # unreachable. Default: $HOME/dev/agentnative-spec # -# Resync cadence: rerun after every new agentnative-spec tag. The remote -# query picks up new tags automatically; a local fallback only sees what -# the local checkout already has fetched. -# -# Stale orphan files in spec/principles/ (e.g., from a spec rename) are -# accepted; `git status` surfaces them at commit time. +# Resync cadence: rerun after every new agentnative-spec tag. The default +# API query picks up new tags automatically. Spec's +# `repository_dispatch:spec-release` event already fires to this repo on +# tag publish — a consumer-side handler that auto-PRs the resync is +# tracked as follow-up work. # -# Mirror of agentnative-cli/scripts/sync-spec.sh; only DEST_DIR differs. +# Stale orphan files in spec/principles/ (e.g., from a spec +# rename) are accepted; `git status` surfaces them at commit time. set -euo pipefail SPEC_REMOTE_URL="${SPEC_REMOTE_URL:-https://github.com/brettdavies/agentnative.git}" SPEC_ROOT="${SPEC_ROOT:-$HOME/dev/agentnative-spec}" +SPEC_REF="${SPEC_REF:-}" + +# --- Argument parsing --------------------------------------------------- +# CLI --ref wins over SPEC_REF env. Other flags reserved for future use. +while [[ $# -gt 0 ]]; do + case "$1" in + --ref) + if [[ $# -lt 2 || -z "$2" ]]; then + echo "error: --ref requires a value (branch, tag, or SHA)" >&2 + exit 2 + fi + SPEC_REF="$2" + shift 2 + ;; + --ref=*) + SPEC_REF="${1#--ref=}" + if [[ -z "$SPEC_REF" ]]; then + echo "error: --ref= requires a value (branch, tag, or SHA)" >&2 + exit 2 + fi + shift + ;; + -h|--help) + sed -n '2,55p' "$0" | sed 's/^# \{0,1\}//' + exit 0 + ;; + *) + echo "error: unknown argument: $1" >&2 + echo " run \`$0 --help\` for usage" >&2 + exit 2 + ;; + esac +done REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" DEST_DIR="$REPO_ROOT/spec" DEST_PRINCIPLES="$DEST_DIR/principles" -# Cleanup hook for the temp clone (set only after mktemp succeeds). -tmp_root="" -cleanup() { - if [[ -n "$tmp_root" && -d "$tmp_root" ]]; then - rm -rf "$tmp_root" +# Parse `/` out of SPEC_REMOTE_URL for `gh api` calls. +# Handles both URL shapes: +# https://github.com//.git +# git@github.com:/.git +spec_repo="${SPEC_REMOTE_URL%.git}" +spec_repo="${spec_repo#*github.com[/:]}" +spec_repo="${spec_repo%/}" +if [[ -z "$spec_repo" || "$spec_repo" == "$SPEC_REMOTE_URL" || "$spec_repo" != */* ]]; then + echo "error: could not parse owner/repo from SPEC_REMOTE_URL: $SPEC_REMOTE_URL" >&2 + exit 1 +fi + +# === Resolution ========================================================= +# resolved_ref: what we actually vendor (e.g., "v0.4.0" or a SHA) +# resolved_sha: 7-char SHA for display (always set after resolution) +# source_label: human-readable origin string for the "vendoring" line +resolved_ref="" +resolved_sha="" +source_label="" + +# Try the API path first. Captures both branches: explicit user ref OR +# auto-resolve latest v* tag. +api_ok=false + +if [[ -n "$SPEC_REF" ]]; then + # User-specified ref. gh api accepts branches, tags, and SHAs at the + # same endpoint via `?ref=`. + if full_sha="$(gh api "repos/$spec_repo/commits/$SPEC_REF" --jq '.sha' 2>/dev/null)"; then + resolved_ref="$SPEC_REF" + resolved_sha="${full_sha:0:7}" + source_label="github.com:$spec_repo via gh api" + api_ok=true fi -} -trap cleanup EXIT - -# === Remote-first resolution =========================================== -spec_source="" -spec_tag="" - -echo "querying $SPEC_REMOTE_URL for latest v* tag..." -remote_tag="$(git ls-remote --tags --sort='-version:refname' \ - "$SPEC_REMOTE_URL" 'refs/tags/v*' 2>/dev/null \ - | awk '{print $2}' \ - | sed 's|refs/tags/||' \ - | grep -v '\^{}$' \ - | head -n 1 || true)" - -if [[ -n "$remote_tag" ]]; then - tmp_root="$(mktemp -d -t agentnative-spec-XXXXXX)" - if git clone --depth 1 --branch "$remote_tag" --quiet \ - "$SPEC_REMOTE_URL" "$tmp_root" 2>/dev/null; then - spec_source="$tmp_root" - spec_tag="$remote_tag" - resolved_sha="$(git -C "$spec_source" rev-parse --short=7 "$spec_tag^{commit}")" - echo "vendoring $spec_tag ($resolved_sha) from remote $SPEC_REMOTE_URL" +else + # Default: latest v* tag. Query all tags, filter to semver-shape v*, + # sort -V (version sort, descending), take the first. + latest_tag="$(gh api "repos/$spec_repo/tags?per_page=100" --jq '.[].name' 2>/dev/null \ + | grep -E '^v[0-9]+\.[0-9]+\.[0-9]+' \ + | sort -V -r \ + | head -n 1 || true)" + if [[ -n "$latest_tag" ]]; then + if full_sha="$(gh api "repos/$spec_repo/commits/$latest_tag" --jq '.sha' 2>/dev/null)"; then + resolved_ref="$latest_tag" + resolved_sha="${full_sha:0:7}" + source_label="github.com:$spec_repo via gh api" + api_ok=true + fi fi fi -# === Local fallback ==================================================== -if [[ -z "$spec_source" ]]; then +# === Local fallback ===================================================== +# When the API path fails (offline, gh not authenticated, repo not +# reachable), use a local checkout instead. The fallback uses `git` +# against SPEC_ROOT — `gh` is not involved in this path so it works +# without network or auth. +if ! $api_ok; then if [[ ! -d "$SPEC_ROOT/.git" ]]; then - echo "error: remote unreachable and SPEC_ROOT is not a git repository: $SPEC_ROOT" >&2 + echo "error: API path failed and SPEC_ROOT is not a git repository: $SPEC_ROOT" >&2 echo " remote: $SPEC_REMOTE_URL" >&2 - echo " set SPEC_ROOT to your agentnative-spec checkout, or check network access." >&2 + if [[ -n "$SPEC_REF" ]]; then + echo " requested ref: $SPEC_REF" >&2 + fi + echo " check \`gh auth status\`, network access, or point SPEC_ROOT at a local checkout." >&2 exit 1 fi - echo "warning: remote query failed; falling back to local $SPEC_ROOT" >&2 + echo "warning: gh api unreachable; falling back to local $SPEC_ROOT" >&2 - spec_source="$SPEC_ROOT" - spec_tag="$(git -C "$spec_source" tag --list 'v*' --sort='-version:refname' | head -n 1)" - if [[ -z "$spec_tag" ]]; then - echo "error: no v* tags found in $SPEC_ROOT" >&2 - echo " try \`git -C $SPEC_ROOT fetch --tags\` to pick up upstream tags" >&2 - exit 1 + if [[ -n "$SPEC_REF" ]]; then + # User-provided ref must be resolvable in the local checkout. + if ! git -C "$SPEC_ROOT" rev-parse --verify --quiet "$SPEC_REF^{commit}" >/dev/null; then + echo "error: ref \`$SPEC_REF\` not found in $SPEC_ROOT" >&2 + echo " try \`git -C $SPEC_ROOT fetch --all --tags\` to pick up upstream refs," >&2 + echo " or pass a SHA the local checkout already contains." >&2 + exit 1 + fi + resolved_ref="$SPEC_REF" + else + # Latest v* tag in the local checkout. + resolved_ref="$(git -C "$SPEC_ROOT" tag --list 'v*' --sort='-version:refname' | head -n 1)" + if [[ -z "$resolved_ref" ]]; then + echo "error: no v* tags found in $SPEC_ROOT" >&2 + echo " try \`git -C $SPEC_ROOT fetch --tags\` to pick up upstream tags" >&2 + exit 1 + fi fi - resolved_sha="$(git -C "$spec_source" rev-parse --short=7 "$spec_tag^{commit}")" - echo "vendoring $spec_tag ($resolved_sha) from local $spec_source" + resolved_sha="$(git -C "$SPEC_ROOT" rev-parse --short=7 "$resolved_ref^{commit}")" + source_label="local $SPEC_ROOT" fi -# === Verify + extract (works identically for remote and local sources) = -if ! git -C "$spec_source" cat-file -e "$spec_tag:principles" 2>/dev/null; then - echo "error: $spec_tag has no principles/ directory in $spec_source" >&2 - exit 1 -fi +echo "vendoring $resolved_ref ($resolved_sha) from $source_label" + +# === Extract ============================================================= +# Fetcher: API path uses `gh api` with raw accept header; local path uses +# `git show`. Both write a single file to $2 from path $1. +fetch_file() { + local path="$1" + local dest="$2" + if $api_ok; then + gh api -H "Accept: application/vnd.github.raw" \ + "repos/$spec_repo/contents/$path?ref=$resolved_ref" >"$dest" + else + git -C "$SPEC_ROOT" show "$resolved_ref:$path" >"$dest" + fi +} + +# Lister: returns names (not paths) of files in a directory at the ref. +# API path uses the contents endpoint; local path uses ls-tree. +list_dir() { + local dir="$1" + if $api_ok; then + gh api "repos/$spec_repo/contents/$dir?ref=$resolved_ref" --jq '.[].name' + else + git -C "$SPEC_ROOT" ls-tree --name-only "$resolved_ref" "$dir/" \ + | sed "s|^$dir/||" + fi +} mkdir -p "$DEST_PRINCIPLES" # VERSION and CHANGELOG.md are top-level in the spec repo. -git -C "$spec_source" show "$spec_tag:VERSION" >"$DEST_DIR/VERSION" -git -C "$spec_source" show "$spec_tag:CHANGELOG.md" >"$DEST_DIR/CHANGELOG.md" +fetch_file "VERSION" "$DEST_DIR/VERSION" +fetch_file "CHANGELOG.md" "$DEST_DIR/CHANGELOG.md" -# Enumerate principle files at the tag and extract each one. +# Enumerate principle files at the ref and extract each one. Filter to +# p*-*.md so principles/AGENTS.md (spec-side design context, not consumed +# by the site) is skipped. copied=0 -while IFS= read -r path; do - case "$path" in - principles/p*-*.md) - dest_name="${path#principles/}" - git -C "$spec_source" show "$spec_tag:$path" >"$DEST_PRINCIPLES/$dest_name" +while IFS= read -r name; do + [[ -z "$name" ]] && continue + case "$name" in + p[0-9]-*.md|p[0-9][0-9]-*.md) + fetch_file "principles/$name" "$DEST_PRINCIPLES/$name" copied=$((copied + 1)) ;; esac -done < <(git -C "$spec_source" ls-tree --name-only "$spec_tag" principles/) +done < <(list_dir "principles") if [[ "$copied" -eq 0 ]]; then - echo "error: no principles/p*-*.md files found at $spec_tag" >&2 + echo "error: no principles/p*-*.md files found at ref \`$resolved_ref\`" >&2 exit 1 fi From 731918cc0d606982d464d68849a3e5aa398e4086 Mon Sep 17 00:00:00 2001 From: Brett Date: Wed, 27 May 2026 00:12:03 -0500 Subject: [PATCH 03/15] feat(scripts): add pre-push hook mirroring CI (#16) ## Summary Adds `scripts/hooks/pre-push`, a local CI mirror that runs the same two checks the GitHub Actions pipeline runs, so a maintainer running `git push` gets the same gate before the push hits GitHub. Modeled on the canonical `agentnative-cli/scripts/hooks/pre-push` pattern: numbered steps, `pass()` / `fail()` helpers, ANSI red/green output, `set -euo pipefail`, exit-code header. Tools that may not be on every dev machine are skipped with a one-line note rather than failing. ## Changelog ### Added - `scripts/hooks/pre-push`: local CI mirror that runs markdownlint-cli2 and shellcheck against the same surfaces CI checks, gating pushes before they reach GitHub. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) ## Related Issues/Stories - Story: n/a - Issue: n/a - Architecture: n/a - Related PRs: brettdavies/agentnative-cli pre-push hook (the canonical pattern this mirrors) ## Testing - [x] Manual testing completed - [x] All tests passing **Test Summary:** Ran `bash scripts/hooks/pre-push` on the current tree end-to-end. Output: ```text Running local CI checks... markdownlint-cli2 v0.22.1 (markdownlint v0.40.0) Finding: **/*.md !node_modules !node_modules/** !**/node_modules/** !vendor/** !target/** !.git/** !*.min.md !spec/CHANGELOG.md !CHANGELOG.md Linting: 28 file(s) Summary: 0 error(s) markdownlint shellcheck All checks passed. ``` Also self-linted the hook: `shellcheck --severity=style scripts/hooks/pre-push` passes clean. ## Files Modified **Modified:** **Created:** - `scripts/hooks/pre-push` (executable) **Renamed:** **Deleted:** ## Key Features Inventory of CI checks ported into the hook, with step numbering matching the file: 1. **markdownlint** mirrors the `markdownlint` job in `ci.yml`. The CI workflow runs `DavidAnson/markdownlint-cli2-action` with `globs: **/*.md` and reads the committed `.markdownlint-cli2.yaml`. The local CLI honors the same config file, so passing the same glob reproduces CI behavior. Skipped with a one-line note when `markdownlint-cli2` is not installed. 2. **shellcheck** mirrors the `shellcheck` job in `ci.yml`, which runs `ludeeus/action-shellcheck` against `./scripts/` with `SHELLCHECK_OPTS=--severity=style`. Locally the hook walks `git ls-files 'scripts/*'`, filters to actual shell scripts (by `.sh` extension or `#!.*\b(bash|sh)\b` shebang) so `bin/` stays out, and includes `scripts/hooks/*` so the hook lints itself. Skipped with a one-line note when `shellcheck` is not installed. No other CI jobs exist to port. `.github/workflows/guard-main-docs.yml` is a `pull_request`-only guard that gates `dev->main` merges via an org-level reusable workflow and has no local-runnable equivalent. ## Benefits - Fail fast: catches markdownlint and shellcheck regressions before they hit GitHub Actions. - Same surface as CI: the hook reads the same `.markdownlint-cli2.yaml` and uses the same severity flag (`--severity=style`) the CI workflow sets, so a green hook is a green CI signal. - Optional, not load-bearing: every check skips silently when its tool is absent. CI is still the authoritative backstop. ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required Activation is one-time per checkout, matching the `agentnative-cli` repo convention: ```bash git config core.hooksPath scripts/hooks ``` This is a local git-config flag, not a tracked file, so each maintainer opts in once after cloning. Mention this in `CONTRIBUTING.md` as a follow-up if desired (out of scope for this PR). ## Screenshots/Recordings n/a ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible ## Additional Context Hook ran clean end-to-end on the current tree. No findings were surfaced and no in-tree files were modified beyond the hook itself, so no scope creep. --- scripts/hooks/pre-push | 76 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100755 scripts/hooks/pre-push diff --git a/scripts/hooks/pre-push b/scripts/hooks/pre-push new file mode 100755 index 0000000..95fd579 --- /dev/null +++ b/scripts/hooks/pre-push @@ -0,0 +1,76 @@ +#!/usr/bin/env bash +# Local CI mirror — runs the same checks as the GitHub Actions CI pipeline. +# Use as a pre-push hook or run manually before pushing. +# +# Activation: `git config core.hooksPath scripts/hooks` (run once after clone). +# +# Mirrors .github/workflows/ci.yml: +# - markdownlint: markdownlint-cli2 over **/*.md +# - shellcheck: shellcheck over ./scripts/ at --severity=style +# +# Tools install (one-time): +# brew install shellcheck markdownlint-cli2 +# # apt: shellcheck via `apt install shellcheck`; markdownlint-cli2 via `bun add -g markdownlint-cli2` +# +# Tools that aren't installed are skipped with a one-line note rather than +# failing — CI is the authoritative backstop. Install them locally to close +# the gap and catch issues before the push hits GitHub. +# +# Exit codes: 0 = all pass, 1 = failure +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +BOLD='\033[1m' +RESET='\033[0m' + +pass() { echo -e " ${GREEN}✓${RESET} $1"; } +fail() { echo -e " ${RED}✗${RESET} $1"; exit 1; } + +echo -e "${BOLD}Running local CI checks...${RESET}" + +# 1. markdownlint — mirrors the `markdownlint` job in ci.yml. The CI workflow +# uses DavidAnson/markdownlint-cli2-action with `globs: **/*.md` and the +# committed .markdownlint-cli2.yaml config. The local CLI honors the same +# config file, so passing the same glob reproduces CI behavior. Skip silently +# if markdownlint-cli2 isn't installed — install via Homebrew +# (`brew install markdownlint-cli2`) or Bun (`bun add -g markdownlint-cli2`). +if command -v markdownlint-cli2 &>/dev/null; then + markdownlint-cli2 '**/*.md' '#node_modules' 2>&1 || fail "markdownlint-cli2" + pass "markdownlint" +else + echo " - markdownlint (skipped: install via \`brew install markdownlint-cli2\` or \`bun add -g markdownlint-cli2\`)" +fi + +# 2. Shellcheck — mirrors the `shellcheck` job in ci.yml, which runs +# ludeeus/action-shellcheck against `./scripts/` with SHELLCHECK_OPTS set to +# `--severity=style`. CI uses style severity to surface all suggestions and +# still fails on error+warning by default. The bundle is markdown-only; +# only producer-ops scripts under ./scripts/ are subject to shellcheck. +# Include scripts/hooks/* so this hook lints itself. Skip silently when +# the linter isn't installed. +if command -v shellcheck &>/dev/null; then + mapfile -t shell_files < <({ + git ls-files 'scripts/*' 2>/dev/null + } | sort -u) + # Filter to actual shell scripts (by extension or shebang). action-shellcheck + # auto-detects via shebang; mirror that locally so bin/ stays out and + # non-shell files in scripts/ don't trip the linter. + shell_targets=() + for f in "${shell_files[@]}"; do + [ -f "$f" ] || continue + if [[ "$f" == *.sh ]]; then + shell_targets+=("$f") + elif head -c 80 "$f" 2>/dev/null | grep -qE '^#!.*\b(bash|sh)\b'; then + shell_targets+=("$f") + fi + done + if [ ${#shell_targets[@]} -gt 0 ]; then + shellcheck --severity=style "${shell_targets[@]}" || fail "shellcheck --severity=style" + fi + pass "shellcheck" +else + echo " - shellcheck (skipped: install via \`brew install shellcheck\` or \`apt install shellcheck\`)" +fi + +echo -e "${BOLD}${GREEN}All checks passed.${RESET}" From e0df14587d38e7f49c3ffdae57cc189a118f0c78 Mon Sep 17 00:00:00 2001 From: Brett Date: Wed, 27 May 2026 15:48:32 -0500 Subject: [PATCH 04/15] =?UTF-8?q?docs:=20impeccable=E2=86=92product=20chan?= =?UTF-8?q?nel=20migration=20+=20tighten=20PR-body=20rules=20(#17)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary **Channel migration: `.impeccable.md` → `PRODUCT.md`.** The skill bundle carries `PRODUCT.md` at the repo root as the channel-specific design-context file, inheriting from a vendored `BRAND.md` (universal voice and identity, source of truth in `agentnative-spec`). `scripts/sync-prose-tooling.sh` vendors `BRAND.md` from spec's `main` HEAD on a separate cadence from `scripts/sync-spec.sh`. `AGENTS.md` points authors at both files before touching skill-bundle prose. Aligns with `agentnative-spec`, `agentnative-site`, and `agentnative-cli`, which already migrated. **Tighter PR-body conventions.** The PR template's Summary placeholder reserves the section for the net diff (what merging produces vs. the base branch) and lists the verification artifacts to exclude (triple-diff stats, leak-check output, patch-id cherry-check counts, pre-push gate results, CI status, prose-scrub findings). `RELEASES.md` and `RELEASES-RATIONALE.md` codify the same rule in operational and rationale form. **Supporting docs:** - `.github/ISSUE_TEMPLATE/*.md` replaced by YAML forms (`bug-report.yml`, `bundle-proposal.yml`, `00-blank.yml`) plus `config.yml` routing visitors to the spec, cli, and site repos. - `CONTRIBUTING.md` rewritten: sibling-repo list widened to four (adds `agentnative-site`), Contribution Tiers table (Signal / Proposal / Code with intake + effort), AI-disclosure pointer at the spec's policy. - `README.md` repo-layout block adds `BRAND.md`, `PRODUCT.md`, `RELEASES-RATIONALE.md`, and `scripts/sync-prose-tooling.sh`; `anc.dev` principle-range link now covers `/p1` through `/p8` (spec added P8 on discoverability). - `RELEASES.md` "Apply" section for branch-protection rulesets is past-tensed (all three rulesets are installed; commands read as re-runnable for new repos or after a ruleset reset, not as gated on a future public-flip), with a `gh api repos/.../rulesets` verify recipe. - `RELEASES-RATIONALE.md` "Private-repo ruleset gap" section is rewritten as "Why the apply step is re-runnable" so the doc reads forward. ## Changelog ### Added - New skill-bundle channel-context layer: `PRODUCT.md` (channel design context), `BRAND.md` (universal voice, vendored from `agentnative-spec`), and `scripts/sync-prose-tooling.sh` (vendoring vehicle, decoupled from `scripts/sync-spec.sh`). - `RELEASES-RATIONALE.md` companion to `RELEASES.md` documents the rationale behind branching, PR conventions, CHANGELOG generation, spec-vendor pipeline, and branch protection. - GitHub issue forms: `bug-report.yml`, `bundle-proposal.yml`, `00-blank.yml`, and `config.yml`. ### Changed - PR template, `RELEASES.md`, and `RELEASES-RATIONALE.md` codify the net-diff PR-body rule: Summary describes the merged-state diff and excludes verification artifacts. - `RELEASES.md` "Apply" section for branch-protection rulesets past-tensed (all three rulesets installed; apply commands re-runnable). - `CONTRIBUTING.md` widens the sibling-repo list to four, adds a Contribution Tiers table (Signal / Proposal / Code), and points at the spec's AI-disclosure policy. - `README.md` repo-layout block lists `BRAND.md`, `PRODUCT.md`, `RELEASES-RATIONALE.md`, and `scripts/sync-prose-tooling.sh`; principle-range link covers `/p1` through `/p8`. - `AGENTS.md` adds a "Voice and prose rules" pointer to `PRODUCT.md` and `BRAND.md`. ### Removed - Legacy markdown issue templates (`bug_report.md`, `bundle_proposal.md`), replaced by YAML forms. ## Type of Change - [ ] `feat`: New feature (non-breaking change which adds functionality) - [ ] `fix`: Bug fix (non-breaking change which fixes an issue) - [ ] `refactor`: Code refactoring (no functional changes) - [ ] `perf`: Performance improvement - [x] `docs`: Documentation update - [ ] `test`: Adding or updating tests - [ ] `chore`: Maintenance tasks (dependencies, config, etc.) - [ ] `ci`: CI/CD configuration changes - [ ] `style`: Code style/formatting changes - [ ] `build`: Build system changes - [ ] `BREAKING CHANGE`: Breaking API change (requires major version bump) ## Related Issues/Stories - Story: n/a - Issue: n/a - Architecture: aligns with the `.impeccable.md` → `PRODUCT.md` channel migration already shipped in `agentnative-spec`, `agentnative-site`, and `agentnative-cli`. - Related PRs: n/a ## Testing - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing ## Files Modified **Modified:** - `.github/pull_request_template.md`: Summary placeholder adds the SCOPE rule and the verification-artifact EXCLUDE list. - `AGENTS.md`: new "Voice and prose rules" section pointing at `PRODUCT.md` and `BRAND.md`. - `CONTRIBUTING.md`: sibling-repo list widened to include `agentnative-site`; new Contribution Tiers table; AI-disclosure pointer; prose touch-up. - `README.md`: repo-layout block adds the four files this branch creates; principle-range link corrected to `/p1` through `/p8`; prose touch-up. - `RELEASES.md`: PR-body section codifies the net-diff and zero-verification-artifacts rules; "Apply" section past-tensed with a verify-installed-rulesets recipe. **Created:** - `BRAND.md`: vendored from `agentnative-spec/BRAND.md`. Edits land upstream, not here. - `PRODUCT.md`: skill-bundle channel design context, inheriting from `BRAND.md`. - `RELEASES-RATIONALE.md`: rationale companion to `RELEASES.md` (branching, PR conventions, CHANGELOG generation, spec-vendor pipeline, branch protection). - `scripts/sync-prose-tooling.sh`: vendors `BRAND.md` from `agentnative-spec`'s `main` HEAD. - `.github/ISSUE_TEMPLATE/00-blank.yml`: structured blank-issue template. - `.github/ISSUE_TEMPLATE/bug-report.yml`: structured bundle-bug form. - `.github/ISSUE_TEMPLATE/bundle-proposal.yml`: structured bundle-proposal form. - `.github/ISSUE_TEMPLATE/config.yml`: cross-repo routing links. **Renamed:** - None. **Deleted:** - `.github/ISSUE_TEMPLATE/bug_report.md`: superseded by `bug-report.yml`. - `.github/ISSUE_TEMPLATE/bundle_proposal.md`: superseded by `bundle-proposal.yml`. ## Key Features - Three-tier prose inheritance for the skill bundle: universal (`BRAND.md`) → channel (`PRODUCT.md`) → bundle artifacts (`SKILL.md`, `getting-started.md`, `references/`, `templates/`). - PR-body rule that excludes verification artifacts so the body reads as what shipped, not how it was assembled. - Structured GitHub issue forms with required AI-disclosure fields and route-check banners pointing at the right sibling repos. ## Benefits - Cross-repo legibility: spec, site, cli, and skill now share the same `BRAND.md` + `PRODUCT.md` shape. - Cleaner PR history: reviewers see substance, not workflow narration. - Lower friction for contributors: structured forms guide bug reports and proposals; the Contribution Tiers table sets expectations. ## Breaking Changes - [x] No breaking changes - [ ] Breaking changes described below: The `.impeccable.md` → `PRODUCT.md` rename is producer-side. The `/impeccable` skill loader resolves the legacy filename via auto-migration. ## Deployment Notes - [x] No special deployment steps required - [ ] Deployment steps documented below: ## Screenshots/Recordings n/a. Docs-only change. ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible (or breaking changes documented) --- .github/ISSUE_TEMPLATE/00-blank.yml | 35 +++ .github/ISSUE_TEMPLATE/bug-report.yml | 95 ++++++++ .github/ISSUE_TEMPLATE/bug_report.md | 47 ---- .github/ISSUE_TEMPLATE/bundle-proposal.yml | 108 +++++++++ .github/ISSUE_TEMPLATE/bundle_proposal.md | 66 ----- .github/ISSUE_TEMPLATE/config.yml | 11 + .github/pull_request_template.md | 16 +- AGENTS.md | 6 + BRAND.md | 105 ++++++++ CONTRIBUTING.md | 66 +++-- PRODUCT.md | 82 +++++++ README.md | 39 +-- RELEASES-RATIONALE.md | 203 ++++++++++++++++ RELEASES.md | 267 ++++++++++----------- scripts/sync-prose-tooling.sh | 128 ++++++++++ 15 files changed, 988 insertions(+), 286 deletions(-) create mode 100644 .github/ISSUE_TEMPLATE/00-blank.yml create mode 100644 .github/ISSUE_TEMPLATE/bug-report.yml delete mode 100644 .github/ISSUE_TEMPLATE/bug_report.md create mode 100644 .github/ISSUE_TEMPLATE/bundle-proposal.yml delete mode 100644 .github/ISSUE_TEMPLATE/bundle_proposal.md create mode 100644 .github/ISSUE_TEMPLATE/config.yml create mode 100644 BRAND.md create mode 100644 PRODUCT.md create mode 100644 RELEASES-RATIONALE.md create mode 100755 scripts/sync-prose-tooling.sh diff --git a/.github/ISSUE_TEMPLATE/00-blank.yml b/.github/ISSUE_TEMPLATE/00-blank.yml new file mode 100644 index 0000000..36479dc --- /dev/null +++ b/.github/ISSUE_TEMPLATE/00-blank.yml @@ -0,0 +1,35 @@ +name: "Blank issue" +description: "Open an issue that doesn't fit any of the structured templates below." +body: + - type: textarea + id: body + attributes: + label: Issue + description: "Describe what's going on. Include the bundle file (`SKILL.md`, `getting-started.md`, `templates/`, `references/`), the host runtime (Claude Code / Cursor / Codex / OpenCode), and any specific invocation or content that the issue touches." + validations: + required: true + - type: textarea + id: ai-disclosure + attributes: + label: AI disclosure + description: "One sentence: what part of this issue was AI-written, what was human-written?" + placeholder: "Entirely human-written." + validations: + required: true + - type: textarea + id: agent-instructions + attributes: + label: For agents + description: "Agents: expand this section before filing." + value: | +
+ Agent filing instructions + + 1. **Pick a structured template first.** Bundle bugs and bundle proposals have dedicated forms — use them when they fit. + 2. **Search first.** Run `gh search issues --repo brettdavies/agentnative-skill ""` to check for duplicates. + 3. **AI disclosure is required.** Fill the field above honestly. + 4. **Wrong repo?** Spec questions, principle edits, and grading findings live on [brettdavies/agentnative](https://github.com/brettdavies/agentnative). `anc` checker bugs and tool-registry submissions live on [brettdavies/agentnative-cli](https://github.com/brettdavies/agentnative-cli). Site bugs live on [brettdavies/agentnative-site](https://github.com/brettdavies/agentnative-site). + 5. See [CONTRIBUTING.md](https://github.com/brettdavies/agentnative-skill/blob/main/CONTRIBUTING.md) for full guidelines. +
+ validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/bug-report.yml b/.github/ISSUE_TEMPLATE/bug-report.yml new file mode 100644 index 0000000..fb5936a --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug-report.yml @@ -0,0 +1,95 @@ +name: "Bundle bug" +description: "Report a bug in the skill bundle (stale example, broken link, contradiction with the spec, drift in `getting-started.md`)." +title: "bug: " +labels: ["bug"] +body: + - type: markdown + attributes: + value: | + **Route check before filing:** + + - For bugs in `anc` (the checker itself — wrong scorecard, missing check, false positive), file on the + [linter repo](https://github.com/brettdavies/agentnative-cli/issues/new/choose). + - For substantive principle changes (new principles, MUST/SHOULD/MAY tier changes), file on the + [spec repo](https://github.com/brettdavies/agentnative/issues/new/choose). + - This tracker is for skill-bundle bugs: stale templates, broken links, wrong invocations in + `getting-started.md`, drift between vendored spec and other bundle docs, etc. + - type: textarea + id: what-happened + attributes: + label: What happened + description: "One or two sentences. Include the file, the cited example, and the unexpected behavior." + validations: + required: true + - type: textarea + id: what-expected + attributes: + label: What you expected + description: "One sentence — what should the bundle have said or done instead?" + validations: + required: true + - type: textarea + id: reproduction + attributes: + label: How to reproduce + description: "Numbered steps + exact commands or quoted bundle content. Redact paths and credentials." + placeholder: | + 1. + 2. + 3. + + ```bash + # exact commands + ``` + validations: + required: true + - type: textarea + id: environment + attributes: + label: Environment + description: "Bundle version, pinned spec version, host runtime, `anc --version` if relevant, OS and shell." + placeholder: | + - Bundle version (`cat VERSION` if cloned, or the tag you installed): vX.Y.Z + - Pinned spec version (`cat spec/VERSION`): vX.Y.Z + - Host (Claude Code / Cursor / Codex / OpenCode / other): + - `anc --version` (if a workflow involving anc is at issue): + - OS and shell: + validations: + required: true + - type: textarea + id: why-bundle-bug + attributes: + label: Why this is a bundle bug (not spec or anc) + description: | + Optional but useful. If a getting-started flow doesn't work, explain whether the breakage is in this bundle's + prose (fixable here) vs. in `anc`'s behavior (file in agentnative-cli) vs. in the principle text itself + (file in agentnative-spec). + validations: + required: false + - type: textarea + id: ai-disclosure + attributes: + label: AI disclosure + description: "One sentence: what part of this report was AI-written, what was human-written?" + placeholder: "Reproduction observed by hand; analysis section was AI-assisted." + validations: + required: true + - type: textarea + id: agent-instructions + attributes: + label: For agents + description: "Agents: expand this section before filing." + value: | +
+ Agent filing instructions + + 1. **Search first.** Run `gh search issues --repo brettdavies/agentnative-skill ""` to check for + duplicates. + 2. **Route check.** Verify this is a skill-bundle bug, not an `anc` checker bug (file in agentnative-cli) or + a spec issue (file in agentnative). + 3. **AI disclosure is required.** Fill the field above honestly. + 4. See [CONTRIBUTING.md](https://github.com/brettdavies/agentnative-skill/blob/main/CONTRIBUTING.md) for full + guidelines. +
+ validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md deleted file mode 100644 index 58b33b1..0000000 --- a/.github/ISSUE_TEMPLATE/bug_report.md +++ /dev/null @@ -1,47 +0,0 @@ ---- -name: Bug report -about: A bundle file has a wrong example, a stale path, a broken cross-reference, or contradicts the spec. -title: "bug: " -labels: bug ---- - - - -## What happened - - - -## What you expected - - - -## How to reproduce - -1. 1. 1. - -```bash -# exact commands or quoted bundle content; redact paths/credentials -``` - -## Environment - -- Bundle version (`cat VERSION` if cloned, or the tag you installed): vX.Y.Z -- Pinned spec version (`cat spec/VERSION`): vX.Y.Z -- Host (Claude Code / Cursor / Codex / other): -- `anc --version` (if a workflow involving anc is at issue): -- OS and shell: - -## Why this is a bundle bug, not a spec or anc bug - - diff --git a/.github/ISSUE_TEMPLATE/bundle-proposal.yml b/.github/ISSUE_TEMPLATE/bundle-proposal.yml new file mode 100644 index 0000000..4d7b391 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bundle-proposal.yml @@ -0,0 +1,108 @@ +name: "Bundle proposal" +description: "Propose a new template, reference doc, getting-started flow, or other change to the skill bundle." +title: "proposal: " +labels: ["proposal"] +body: + - type: markdown + attributes: + value: | + **Route check before filing:** + + - New principle, MUST/SHOULD/MAY tier change, or any substantive change to the standard itself → file on + the [spec repo](https://github.com/brettdavies/agentnative/issues/new/choose). This skill bundle vendors + the spec; principle changes happen there first, then arrive here via `scripts/sync-spec.sh`. + - New compliance check, change to scorecard semantics, or anything `anc check` does → file on the + [linter repo](https://github.com/brettdavies/agentnative-cli/issues/new/choose). + - New starter template, reference doc, getting-started flow, idiom for a new language/framework, or any + change to how this bundle teaches the existing principles → keep filing here. + - type: textarea + id: problem + attributes: + label: Problem statement + description: | + What is the problem this proposal addresses? Be specific about which agent's workflow it would improve. + Cite real tools, real PRs, or real `anc` findings where relevant. + validations: + required: true + - type: textarea + id: proposal + attributes: + label: Proposal + description: | + One paragraph: what should the bundle do that it does not do today? Frame as a concrete change to one or + more of: `SKILL.md`, `getting-started.md`, `references/`, `templates/`. + validations: + required: true + - type: checkboxes + id: change-type + attributes: + label: Type of change + options: + - label: "New starter template under `templates/`" + - label: "New reference doc under `references/`" + - label: "Update to `SKILL.md` (entry-point structure or routing)" + - label: "Update to `getting-started.md` (new flow, new invocation)" + - label: "Idioms for a new language/framework in `references/framework-idioms-other-languages.md`" + - label: "Other (describe in the Proposal section)" + - type: textarea + id: prior-art + attributes: + label: Prior art + description: "Existing tools, docs, or articles that demonstrate the problem or the proposed solution. Two or three is fine." + validations: + required: false + - type: textarea + id: draft + attributes: + label: Draft of the change + description: | + Sketch what the relevant bundle file(s) would look like after the change. A diff against the current state + is ideal; an outline is acceptable for early-stage proposals. + placeholder: | + ```diff + + ``` + validations: + required: false + - type: checkboxes + id: compatibility + attributes: + label: Compatibility + options: + - label: "Additive — no existing bundle content needs to change" + - label: "Replaces existing content (list what gets removed/superseded in the Open questions section)" + - label: "Coordinated with a spec or anc change (link the upstream issue/PR in the Open questions section)" + - type: textarea + id: open-questions + attributes: + label: Open questions + description: "Anything you're unsure about. Decisions to make in the issue thread before any PR." + validations: + required: false + - type: textarea + id: ai-disclosure + attributes: + label: AI disclosure + description: "One sentence: what part of this proposal was AI-written, what was human-written?" + placeholder: "Problem statement drafted by hand; example diff sketch was AI-assisted." + validations: + required: true + - type: textarea + id: agent-instructions + attributes: + label: For agents + description: "Agents: expand this section before filing." + value: | +
+ Agent filing instructions + + 1. **Search first.** Run `gh search issues --repo brettdavies/agentnative-skill ""` to check for + duplicates. + 2. **Route check.** Verify this is a bundle proposal, not a spec change (file in agentnative) or an + `anc` feature request (file in agentnative-cli). + 3. **AI disclosure is required.** Fill the field above honestly. + 4. See [CONTRIBUTING.md](https://github.com/brettdavies/agentnative-skill/blob/main/CONTRIBUTING.md) for full + guidelines. +
+ validations: + required: false diff --git a/.github/ISSUE_TEMPLATE/bundle_proposal.md b/.github/ISSUE_TEMPLATE/bundle_proposal.md deleted file mode 100644 index 2fa2eaa..0000000 --- a/.github/ISSUE_TEMPLATE/bundle_proposal.md +++ /dev/null @@ -1,66 +0,0 @@ ---- -name: Bundle proposal -about: Propose a new template, reference doc, getting-started flow, or other change to the skill bundle. -title: "proposal: " -labels: proposal ---- - - - -## Problem statement - - - -## Proposal - - - -## Type of change - -- [ ] New starter template under `templates/` -- [ ] New reference doc under `references/` -- [ ] Update to `SKILL.md` (entry-point structure or routing) -- [ ] Update to `getting-started.md` (new flow, new invocation) -- [ ] Idioms for a new language/framework in `references/framework-idioms-other-languages.md` -- [ ] Other (describe) - -## Prior art - - - -- - - -## Draft of the change - - - -```diff - -``` - -## Compatibility - -- [ ] Additive — no existing bundle content needs to change -- [ ] Replaces existing content — list what gets removed/superseded -- [ ] Coordinated with a spec or anc change — link the upstream issue/PR - -## Open questions - - - -- diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 0000000..8b7c3b2 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,11 @@ +blank_issues_enabled: false +contact_links: + - name: "Spec questions, principle edits, or grading findings" + url: "https://github.com/brettdavies/agentnative/issues/new/choose" + about: "For anything about the standard itself — propose changes, submit a grading finding, ask questions — file on the spec repo." + - name: "Checker bugs, features, or tool-registry submissions" + url: "https://github.com/brettdavies/agentnative-cli/issues/new/choose" + about: "For bugs in the `anc` checker, feature requests, or proposing a tool for the leaderboard, file on the linter repo." + - name: "Site bugs (rendering, performance, deployment)" + url: "https://github.com/brettdavies/agentnative-site/issues/new/choose" + about: "For bugs on anc.dev, file on the site repo." diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index fdead33..4f42d08 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -1,6 +1,20 @@ ## Summary - + ## Changelog diff --git a/AGENTS.md b/AGENTS.md index 3361655..d7b9ce4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -43,6 +43,12 @@ actionlint .github/workflows/*.yml The repo ships a local `.markdownlint-cli2.yaml` (canonical 120-char line length) and `.shellcheckrc` so CI and local tooling agree. +## Voice and prose rules + +Channel-specific design context lives in [`PRODUCT.md`](PRODUCT.md). It inherits from [`BRAND.md`](BRAND.md) (vendored +from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh)). Read both before +authoring skill-bundle prose (`SKILL.md`, `getting-started.md`, `references/`, `templates/`). + ## Spec sync The canonical principle text lives in [`brettdavies/agentnative`](https://github.com/brettdavies/agentnative). This repo diff --git a/BRAND.md b/BRAND.md new file mode 100644 index 0000000..5844d8f --- /dev/null +++ b/BRAND.md @@ -0,0 +1,105 @@ +# BRAND.md: agentnative voice and identity + +Source of truth for the voice and identity of the agentnative standard. Shared across the spec, the website, the linter, +the skill bundle, and any future channel. Each channel inherits from this document and adds channel-specific register +and artifacts in its own `PRODUCT.md`. + +> **Source of truth: `agentnative-spec/BRAND.md`.** This file is vendored into each channel repo (`agentnative-site`, +> `agentnative-cli`, `agentnative-skill`) via `scripts/sync-prose-tooling.sh`. Edits in +> a consumer repo will be overwritten on the next sync. File issues and PRs against this repo. + +## Brand identity + +**Three words: opinionated, precise, inviting.** + +- **Opinionated.** The standard has a point of view. It does not enumerate tradeoffs and shrug; it states "MUST do X, + here is the failure mode if you don't, here is the canonical fix." The point of view is what makes the standard worth + citing. +- **Precise.** RFC 2119 language. Anchors stable and citable. Numbers measured, not asserted. Where a contract has a + canonical realization (a flag spelling, an exit code, a path), it is named explicitly. +- **Inviting.** The reader (or agent handler) keeps reading by design. That comes from details: typography that rewards + a slow read, prose that rewards a fast scan, code blocks that read like reference material a reader can trust. + Inviting is not "friendly" and it is not "marketing." It rewards engagement. + +## Voice anchor + +Concrete before abstract. Show then tell. No filler adjectives. The standard speaks as a standard, not a person ( +first-person singular is out), but every channel inherits the same sequence: state the contract, show the failure mode, +name the canonical fix. + +## Audiences + +Two first-class consumers across all channels: + +- **Humans** evaluating, adopting, implementing, or extending the standard. Spec-channel readers are technically deep + and arrive with skepticism; site-channel readers are time-pressured and decide in 60 seconds whether to take the + standard seriously; linter users invoke at the terminal. Each channel narrows further in its own `PRODUCT.md`. +- **AI agents** consuming the standard programmatically: markdown via `Accept: text/markdown`, requirement IDs via + frontmatter parsing, skill bundles via `SKILL.md`/`AGENTS.md` discovery, linter findings via JSON. Their UX is "do + anchors stay stable, do IDs survive reorganizations, does the channel render cleanly across versions." This is not a + nice-to-have. The agent audience is first-class. Decisions that improve a channel for humans at the cost of agent + legibility are regressions. + +## Universal anti-patterns + +These bans apply across every channel. The narrative below explains *why* each category is banned; the executable +contract for *what* is banned lives in [`styles/brand/README.md`](styles/brand/README.md), generated from the Vale rule +pack at `styles/brand/*.yml`. + +- **No marketing register.** First-person belief and recommendation framings are out. The standard speaks in the third + person about contracts, not in the first person about beliefs. +- **No hedge words.** Probabilistic softeners undercut MUST and SHOULD. The contract is the contract. +- **No filler adjectives.** Marketing modifiers do no work. Concrete before abstract; the noun carries the meaning. +- **No verbatim quotation from any single source.** Where multiple sources converge on a claim, the standard's wording + sounds like triangulation, not citation. Lineage belongs in the README's `Acknowledgements` section, not in the + contract. + +## Voice anchors: concrete examples + +The ✓ column shows the contract voice. The ✗ column names the category of failure rather than reproducing literal banned +phrases. Those live in [`styles/brand/README.md`](styles/brand/README.md). The category labels describe the shape of the +failure each ✓ phrasing replaces. + +| ✓ | ✗ | +| ------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | +| "CLI tools MUST run without human input." | First-person belief framing with lowercase RFC keyword and audience speculation. | +| "Authentication failed: token expired (`expires_at: 2026-03-25T00:00:00Z`). Run `tool auth refresh` or set `TOOL_TOKEN`." | Apologetic register with vague remediation and no actionable diagnostic. | +| "Numeric output is locale-independent: `.` decimal, no thousands grouping, regardless of `LC_NUMERIC`." | First-person recommendation with hedge word and unspecific "potential issues" framing. | + +## Channels + +The shared identity above applies to every channel. Each channel adds register and artifacts in its own `PRODUCT.md`: + +- **Spec** (`agentnative-spec/PRODUCT.md`): RFC 2119 register, third-person standards voice, present tense, no + first-person plural, no implementation leakage in MUSTs. +- **Site** (`agentnative-site/PRODUCT.md`): visual system (palette, typography, code-block treatment, OG image), + tech-stack decisions (SSG, Worker, content negotiation), JS budget, dark-mode design. +- **Skill bundle**: instructional voice, second-person imperative is allowed, agent-loadable. +- **Linter (`anc`)**: terse error messages, ≤80-column help text, four-part error rubric (offending value, constraint, + valid example, remediation). + +## Channel artifacts + +Each channel's repo carries its own narrow stack on top of this universal `BRAND.md`. The canonical layout: + +| Channel | `PRODUCT.md` location | Deep tier-3 | Vale rule pack | How `BRAND.md` arrives | +| ------------ | ----------------------------------------------- | ------------------------------------------------------ | -------------- | ------------------------------- | +| Spec | `agentnative-spec/PRODUCT.md` | `principles/`, `docs/architecture/`, `docs/decisions/` | `styles/spec/` | (origin — this repo) | +| Site | `agentnative-site/PRODUCT.md` | `DESIGN.md` (root) | (none yet) | `scripts/sync-prose-tooling.sh` | +| CLI (`anc`) | `agentnative-cli/PRODUCT.md` (when warranted) | `src/` (Rust source IS the artifact) | (planned) | `scripts/sync-prose-tooling.sh` | +| Skill bundle | `agentnative-skill/PRODUCT.md` (when warranted) | `bundle/` | (planned) | `scripts/sync-prose-tooling.sh` | + +A channel earns its `PRODUCT.md` when channel-specific decisions (visual system, error rubric, instructional voice, +etc.) accumulate enough that the universal `BRAND.md` cannot carry them. The spec and site channels have crossed that +threshold today. + +**Convention: deep tier-3 artifacts live at the repo root, not in `docs/`.** The site channel's `DESIGN.md` sits at +`agentnative-site/DESIGN.md` (not `docs/DESIGN.md`) so the `/impeccable` skill loader and human readers find it without +traversal. Future deep companions (e.g., a hypothetical `GOVERNANCE.md`) follow the same pattern. Only research +artifacts and historical plans live under `docs/`. + +## Sync + +This document is the source of truth. The site syncs it via `scripts/sync-spec.sh` alongside `principles/*.md`, +`VERSION`, and `CHANGELOG.md`. The skill bundle and linter sync similarly when they grow brand-aware artifacts. A PR +that changes `BRAND.md` flags whether channel sync is needed; channel repos pick up the change in a follow-on PR. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 4bb642e..e491424 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,14 +1,37 @@ # Contributing to `agentnative-skill` -Thanks for your interest. This repo is the agent-facing skill that pairs with two siblings: +Thanks for your interest. This repo is the agent-facing skill that pairs with three siblings: -- [`agentnative`](https://github.com/brettdavies/agentnative) (the spec) — canonical principle text. Vendored here at +- [`agentnative`](https://github.com/brettdavies/agentnative) (the spec): canonical principle text. Vendored here at `spec/`. -- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`) — the canonical compliance checker. +- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`): the canonical compliance checker. +- [`agentnative-site`](https://github.com/brettdavies/agentnative-site) (`anc.dev`): the public site, leaderboard + renderer, live-scoring loop, and skill-distribution endpoint. This skill does **not** define principles (the spec does) and does **not** check compliance (`anc` does). It teaches agents how to use them and supplies surrounding context (idioms, templates, getting-started). Route contributions -accordingly. +accordingly. For cross-repo visitor-facing navigation, see [`anc.dev/contribute`](https://anc.dev/contribute). + +## Contribution tiers + +The skill bundle accepts three shapes of contribution. All three are welcome; none is required. Skill work skews toward +Tier 3 because most improvements are concrete bundle changes; Tier 2 proposals matter when the change spans bundle +structure or host-runtime support. + +| Tier | Shape | Intake | Effort | +| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | +| **1. Signal** | Bundle content issue, install path bug, host-runtime detection failure, instructional content critique | [`bug-report`](https://github.com/brettdavies/agentnative-skill/issues/new?template=bug-report.yml) / [`bundle-proposal`](https://github.com/brettdavies/agentnative-skill/issues/new?template=bundle-proposal.yml) | ~5 min | +| **2. Proposal** | A new host runtime to support in `anc skill install`, bundle reorganization, instructional content rework | Issue with the design before opening a PR | ~1-2 hrs | +| **3. Code** | Bundle content improvements, host-runtime additions, `bin/check-update` work, `SKILL.md` / `getting-started.md` prose, template additions | PR against `dev` (branch model below) | Variable | + +For principle-level discussion (the spec's MUST/SHOULD/MAY tiers, including P8 on discoverability), file a +`pressure-test` issue in the +[spec repo](https://github.com/brettdavies/agentnative/issues/new?template=pressure-test.yml). For scoring engine or +`anc check` work, file in the [CLI repo](https://github.com/brettdavies/agentnative-cli). Those discussions don't belong +here. + +**Response expectations:** Tier 1 and Tier 2 are welcome and get a substantive reply when time allows. Tier 3 PRs are +reviewed when scope and time permit. Real PRs land; no merge-window promise. ## Where to file what @@ -16,8 +39,9 @@ accordingly. | --------------------------------------------------------------- | ----------------------------------------------------------------------------------- | | Propose a new principle, change MUST/SHOULD/MAY tiers, etc. | [`brettdavies/agentnative`](https://github.com/brettdavies/agentnative) (the spec) | | Report an `anc check` bug, or propose a new checker feature | [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli) | -| Improve a starter template, add a language idiom, fix the guide | This repo — issue + PR. Templates: **Bug report** or **Bundle proposal**. | -| Bump the vendored spec to a newer tag | This repo — run `scripts/sync-spec.sh` and PR the diff. | +| Report a site bug (rendering, deployment, anc.dev surface) | [`brettdavies/agentnative-site`](https://github.com/brettdavies/agentnative-site) | +| Improve a starter template, add a language idiom, fix the guide | This repo. Issue + PR. Templates: **Bug report** or **Bundle proposal**. | +| Bump the vendored spec to a newer tag | This repo. Run `scripts/sync-spec.sh` and PR the diff. | | Operate as an agent in this repo | Read [`AGENTS.md`](./AGENTS.md) first (lint commands, hard rules, common pitfalls). | ## Branch model (TL;DR) @@ -35,9 +59,9 @@ procedure in [`RELEASES.md`](./RELEASES.md). ## Pull requests -- **Title format**: [Conventional Commits](https://www.conventionalcommits.org/) — `type(scope): description`. +- **Title format**: [Conventional Commits](https://www.conventionalcommits.org/) (`type(scope): description`). - **Body**: follow [`.github/pull_request_template.md`](.github/pull_request_template.md). The `## Changelog` section is - the source of truth for `CHANGELOG.md` entries — write for users, not implementers. Never hand-edit `CHANGELOG.md`; + the source of truth for `CHANGELOG.md` entries. Write for users, not implementers. Never hand-edit `CHANGELOG.md`; it's generated by `scripts/generate-changelog.sh` from PR bodies at release time. - **Scope**: keep PRs small and single-purpose where possible. - **Tests**: there is no test runner in this repo, but every PR must pass `markdownlint`, `shellcheck`, and (when @@ -65,33 +89,47 @@ auto-discovers `SKILL.md` at the install root and ignores everything else. Produ - **`spec/`** is vendored. Do not edit by hand. Substantive principle changes happen in `brettdavies/agentnative`; bring them here by re-running `scripts/sync-spec.sh` after a new upstream tag lands. - **`SKILL.md`** is the host-discovered entry point. Changes to its `name` or `description` frontmatter affect skill - discovery on every host — coordinate before changing. + discovery on every host. Coordinate before changing. - **`getting-started.md`** is the agent's first read after `SKILL.md`. Keep it short and concrete; cite spec paths and `anc` invocations rather than restating the principles. - **`bin/check-update`** is the consumer-side update-check script. It compares local `VERSION` to GitHub `main` and - emits `UPGRADE_AVAILABLE` for the SKILL.md preamble. Treat as load-bearing — agents rely on it to detect staleness. + emits `UPGRADE_AVAILABLE` for the SKILL.md preamble. Treat as load-bearing. Agents rely on it to detect staleness. - **`references/`** holds implementation guidance (Rust/clap patterns, framework idioms, project structure). When `anc - --fix` lands upstream, these may shrink — they exist today because the agent has to apply remediations by hand. + --fix` lands upstream, these may shrink. They exist today because the agent has to apply remediations by hand. - **`templates/`** are starter files. They encode principles by construction. Changes here should be informed by `agentnative-cli`'s reference patterns to avoid drift; the cross-repo alignment story is documented in the spec repo's `AGENTS.md`. +## AI disclosure + +Inherits from the spec's AI disclosure policy. See +[agentnative/CONTRIBUTING.md § AI disclosure policy](https://github.com/brettdavies/agentnative/blob/main/CONTRIBUTING.md#ai-disclosure-policy). + ## Security -See [`SECURITY.md`](./SECURITY.md). Do not file security issues in the public tracker — use the GitHub private security +See [`SECURITY.md`](./SECURITY.md). Do not file security issues in the public tracker. Use the GitHub private security advisories channel. ## License By contributing, you agree your contributions are dual-licensed under the same terms as the rest of this repository: -- MIT — see [`LICENSE-MIT`](./LICENSE-MIT) -- Apache License, Version 2.0 — see [`LICENSE-APACHE`](./LICENSE-APACHE) +- MIT: see [`LICENSE-MIT`](./LICENSE-MIT) +- Apache License, Version 2.0: see [`LICENSE-APACHE`](./LICENSE-APACHE) at the consumer's option. No CLA. The Apache-2.0 side carries the standard contributor patent grant under §3 of the license. Vendored content under `spec/` is CC BY 4.0 (upstream); contributions to that directory should happen upstream in [`agentnative-spec`](https://github.com/brettdavies/agentnative). + +## Cross-repo navigation + +The full visitor-facing menu lives at [`anc.dev/contribute`](https://anc.dev/contribute). Per-repo intakes: + +- [Spec](https://github.com/brettdavies/agentnative): principle text, pressure-tests, versioning policy +- [Linter](https://github.com/brettdavies/agentnative-cli): `anc`, the scoring engine, the registry +- [Site](https://github.com/brettdavies/agentnative-site): anc.dev source, leaderboard renderer, live-scoring +- This repo: the agent-facing skill bundle, install paths, host-runtime detection diff --git a/PRODUCT.md b/PRODUCT.md new file mode 100644 index 0000000..f45dbe2 --- /dev/null +++ b/PRODUCT.md @@ -0,0 +1,82 @@ +# PRODUCT.md: skill bundle channel design context + +Channel-specific design context for the **skill bundle channel** of agentnative. Inherits the shared identity, voice +anchor, audiences, and universal anti-patterns from [`BRAND.md`](BRAND.md), vendored from +[`agentnative-spec`](https://github.com/brettdavies/agentnative) via +[`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh). Read that first. + +## Inheritance + +The skill bundle channel sits in a three-tier waterfall. Each tier owns a different concern; nothing duplicates. + +1. **Universal — [`BRAND.md`](BRAND.md).** Shared identity, voice anchor, audiences, universal anti-patterns. Vendored + from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh) (parallel to + [`scripts/sync-spec.sh`](scripts/sync-spec.sh), which vendors `spec/principles/`). Read that first. +2. **Channel delta — this file (`PRODUCT.md`).** Instructional voice, second-person imperative allowed (the bundle + teaches agents how to invoke `anc`), terse and agent-loadable. The narrative companion to the shipped bundle content + at the repo root. +3. **Bundle — repo root.** The shipped skill: [`SKILL.md`](SKILL.md), [`getting-started.md`](getting-started.md), + [`references/`](references/), [`templates/`](templates/), [`bin/`](bin/), and the vendored [`spec/`](spec/). What an + agent loads into runtime via `~/.claude/skills/agent-native-cli/`, `~/.cursor/skills/agent-native-cli/`, and + equivalent paths under Codex / OpenCode / Factory / Kiro. Built from the same commit as `PRODUCT.md`. + +## Channel — skill bundle + +The skill bundle teaches agents how to use `anc` and the spec. The spec describes the contract; the linter verifies it; +the skill bundle teaches the workflow that connects the two. + +## Audience (narrowed) + +- **AI agents** loading the skill via Claude Code's skill discovery (frontmatter triggers), Cursor's skill registry, + Codex's `~/.codex/skills/`, OpenCode's `.opencode/skills/`, and equivalent runtimes. The agent reads `SKILL.md` first, + drills into [`getting-started.md`](./getting-started.md) for one of three loops (existing CLI / new Rust CLI / other + language), and consults `references/` only when remediating a specific `anc` finding. +- **Humans** running `anc skill install ` for the first time, or maintaining the bundle. They expect commands to + be runnable as written, paths to resolve, and the bundle to track what `anc` produces. + +## Register + +- **Instructional voice.** Second-person imperative is the default. "Run `anc check . --output json` to score the repo." + Not "the skill recommends running `anc check`." +- **Every action is a runnable command.** Code blocks contain commands the reader can paste. Pseudo-commands and + prose-only "how to think about it" are out. +- **Reference-shaped tables.** Where the artifact is a list of options, files, or trade-offs, render it as a table with + stable column headers. The three-artifact table at the top of `SKILL.md` and the useful-flags paragraph in + `getting-started.md` are the patterns. +- **Triggers and keywords are first-class.** The skill's `description` frontmatter determines which agents discover the + skill. Keep it dense in the canonical vocabulary (`agentic CLI`, `agent-native`, `anc check`, `audit-profile`, …). The + opposite of marketing prose; closer to a search-index line. +- **Cross-link to the spec, the linter, and the templates.** A skill that paraphrases what the spec says drifts. Link + instead, and quote the requirement ID, not the prose. + +## Skill-specific anti-patterns + +These extend the universal bans in `BRAND.md`: + +- **No "in this guide we will…" preamble.** The agent skips it. Open with the action. +- **No narrative scaffolding.** "First, we'll cover X. Then, Y. Finally, Z." Out. The TOC and section headings carry the + structure; prose that recapitulates them is noise. +- **No "consider doing X" hedges.** State the action. "Run `anc check`." Not "you might want to consider running `anc + check`." +- **No paraphrased spec content.** When the contract matters, link to `spec/principles/p-*.md` and quote the + requirement ID (`p1-must-no-interactive`), not the prose. Spec drift is silent and expensive when paraphrased. +- **No screenshots, no images.** The skill is consumed by agents; images are tokens-for-no-signal. +- **No version numbers in prose.** "Currently `v0.3.0`" goes stale. Reference [`VERSION`](./VERSION) and + [`spec/VERSION`](./spec/VERSION) directly, or link to the canonical source. Frontmatter aside. + +## Voice anchor application + +The pattern in `BRAND.md`, specialized for the skill channel: + +- **`SKILL.md`** opens with the three-artifact table (spec / linter / skill), the first action (update check), then the + launch link to `getting-started.md`. No preamble. +- **`getting-started.md`** opens with the three loops, each as a runnable command sequence with inline annotation. +- **`references/*.md`** are exhaustive references for one technical area each (Rust/clap patterns, framework idioms, + project structure, update check). Each section stands alone; readers arrive via deep links from `anc` findings. +- **`templates/*`** are the smallest functional starting points, not "examples." A copied template runs. + +## Status + +This file is the skill-channel `PRODUCT.md`. The spec channel's equivalent lives at +[`agentnative-spec/PRODUCT.md`](https://github.com/brettdavies/agentnative/blob/main/PRODUCT.md); the site channel's at +`agentnative-site/PRODUCT.md`. Cross-channel content is in `BRAND.md`, synced from the spec repo. diff --git a/README.md b/README.md index fa72072..1255e63 100644 --- a/README.md +++ b/README.md @@ -1,14 +1,15 @@ # agentnative-skill -The producer repo for the [`agent-native-cli`](./SKILL.md) skill — an agent-facing guide to designing, building, and +The producer repo for the [`agent-native-cli`](./SKILL.md) skill: an agent-facing guide to designing, building, and auditing CLI tools for use by AI agents. -This skill is the third artifact in a three-repo ecosystem: +This skill is the fourth artifact in a four-repo ecosystem: | Repo | Role | | --------------------------------------------------------------------------- | -------------------------------------------------------------- | -| [`agentnative`](https://github.com/brettdavies/agentnative) (the spec) | Canonical text of the seven principles. CC BY 4.0. | +| [`agentnative`](https://github.com/brettdavies/agentnative) (the spec) | Canonical text of the eight principles. CC BY 4.0. | | [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`) | The compliance checker. MIT / Apache-2.0. | +| [`agentnative-site`](https://github.com/brettdavies/agentnative-site) | `anc.dev`: leaderboard, scorecards, live scoring. | | **This repo** (`agentnative-skill`) | The agent-facing guide. Vendors the spec; teaches `anc` usage. | ## Repository layout @@ -24,13 +25,17 @@ agentnative-skill/ ├── templates/ drop-in starter files (clap-main, error-types, output-format, agents-md-template) ├── VERSION single-line current version (read by bin/check-update) ├── scripts/ -│ ├── sync-spec.sh vendor the latest agentnative-spec v* tag into spec/ -│ └── generate-changelog.sh release-time CHANGELOG generator (git-cliff + PR-body extraction) +│ ├── sync-spec.sh vendor the latest agentnative-spec v* tag into spec/ +│ ├── sync-prose-tooling.sh vendor BRAND.md from agentnative-spec main HEAD +│ └── generate-changelog.sh release-time CHANGELOG generator (git-cliff + PR-body extraction) ├── docs/plans/ engineering plans (dev-only — guarded out of main) ├── .github/ workflows, rulesets, issue templates, PR template ├── AGENTS.md project-level agent instructions FOR THIS REPO (producer-side) +├── BRAND.md universal voice and identity (vendored from agentnative-spec) +├── PRODUCT.md skill-bundle channel design context (inherits from BRAND.md) ├── CONTRIBUTING.md how to propose changes ├── RELEASES.md release procedure (cherry-pick from dev → release/* → main) +├── RELEASES-RATIONALE.md rationale companion to RELEASES.md (the WHY) ├── SECURITY.md vulnerability disclosure ├── CHANGELOG.md released versions (generated, never hand-edited) ├── cliff.toml git-cliff configuration @@ -41,33 +46,33 @@ agentnative-skill/ Consumer-facing files (`SKILL.md`, `getting-started.md`, `bin/`, `spec/`, `references/`, `templates/`, `VERSION`, `LICENSE-*`) are read by the agent at runtime. Producer-side files (`scripts/`, `docs/`, `.github/`, `AGENTS.md`, -`CONTRIBUTING.md`, `RELEASES.md`, `cliff.toml`) ship to consumers via `git clone` but are inert at runtime — the host +`CONTRIBUTING.md`, `RELEASES.md`, `cliff.toml`) ship to consumers via `git clone` but are inert at runtime. The host discovers `SKILL.md` and ignores everything else. ## Install See [anc.dev/skill](https://anc.dev/skill) for the supported hosts (Claude Code, Cursor, Codex, etc.) and the exact -install commands. The install model is plain `git clone --depth 1` into the host's skills directory — for example +install commands. The install model is plain `git clone --depth 1` into the host's skills directory: for example `~/.claude/skills/agent-native-cli/`. The host auto-discovers `SKILL.md` at the install root; `SKILL.md` then points the agent at `getting-started.md` for progressive disclosure. Updates are `git pull --ff-only` from inside the install dir, prompted by `bin/check-update`. ## Skill contents -- [`SKILL.md`](./SKILL.md) — skill metadata + entry-point pointer. -- [`getting-started.md`](./getting-started.md) — three working loops (existing CLI / new Rust / other language); +- [`SKILL.md`](./SKILL.md): skill metadata + entry-point pointer. +- [`getting-started.md`](./getting-started.md): three working loops (existing CLI / new Rust / other language); canonical `anc check` invocations; "where things live" map. -- [`bin/check-update`](./bin/check-update) — periodic version check. Compares local `VERSION` to GitHub `main`, emits +- [`bin/check-update`](./bin/check-update): periodic version check. Compares local `VERSION` to GitHub `main`, emits `UPGRADE_AVAILABLE` so the agent can offer to `git pull`. -- [`spec/`](./spec/) — vendored canonical principle text from +- [`spec/`](./spec/): vendored canonical principle text from [`agentnative-spec`](https://github.com/brettdavies/agentnative). See [`spec/README.md`](./spec/README.md) for the resync procedure. **Do not edit by hand.** -- [`references/`](./references/) — implementation guidance: framework idioms (Rust + others), project structure, +- [`references/`](./references/): implementation guidance: framework idioms (Rust + others), project structure, Rust/clap patterns. Used when remediating `anc` findings. -- [`templates/`](./templates/) — drop-in starting points for greenfield Rust CLIs (`clap-main.rs`, `error-types.rs`, +- [`templates/`](./templates/): drop-in starting points for greenfield Rust CLIs (`clap-main.rs`, `error-types.rs`, `output-format.rs`, `agents-md-template.md`). -The principles are also published as a stable web reference at [anc.dev/p1](https://anc.dev/p1) through `/p7`. +The principles are also published as a stable web reference at [anc.dev/p1](https://anc.dev/p1) through `/p8`. ## Versioning @@ -79,7 +84,7 @@ The skill's own version is independent of the spec it vendors. The currently-ven ## Contributing -Issues and PRs welcome — see [`CONTRIBUTING.md`](./CONTRIBUTING.md). Routing: +Issues and PRs welcome. See [`CONTRIBUTING.md`](./CONTRIBUTING.md). Routing: - **Spec questions or principle proposals** → file in [`brettdavies/agentnative`](https://github.com/brettdavies/agentnative) (the spec repo). This skill vendors the spec; @@ -99,8 +104,8 @@ See [`SECURITY.md`](./SECURITY.md) for vulnerability disclosure. Dual-licensed under either of: -- MIT — see [`LICENSE-MIT`](./LICENSE-MIT) -- Apache License, Version 2.0 — see [`LICENSE-APACHE`](./LICENSE-APACHE) +- MIT: see [`LICENSE-MIT`](./LICENSE-MIT) +- Apache License, Version 2.0: see [`LICENSE-APACHE`](./LICENSE-APACHE) at your option. Matches the licensing on `agentnative-cli` so producers can adapt the skill's content into their own tooling without re-licensing friction. diff --git a/RELEASES-RATIONALE.md b/RELEASES-RATIONALE.md new file mode 100644 index 0000000..c9c873f --- /dev/null +++ b/RELEASES-RATIONALE.md @@ -0,0 +1,203 @@ +# Releases rationale + +Companion to [`RELEASES.md`](./RELEASES.md). RELEASES.md is the runbook (commands, paths, decision tables). This file +holds the WHY behind those rules: branching model, PR conventions, CHANGELOG generation, spec-vendor pipeline, +flat-bundle convention, `bin/check-update` semantics, cross-host install considerations, branch-protection pitfalls. + +Read this when: + +- A rule in RELEASES.md doesn't make sense and you're tempted to change it. +- A new contributor asks "why do we do X this way". +- You're adding a new release-flow rule and need to know where it fits the existing model. + +## Branching model + +### Forever `dev`, ephemeral release branches + +`dev` is never deleted, even after a release. The next release cycle reuses the same `dev`. The repo's +`delete_branch_on_merge: true` setting doesn't touch `dev` because `dev` is never the head of a PR. Using a short-lived +`release/*` head is what keeps the setting compatible with a forever integration branch. + +Engineering docs (`docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`) live on `dev` only. They never +reach `main`. `guard-main-docs.yml` blocks any `added` or `modified` engineering-doc files in PRs targeting `main`. The +release-branch cherry-pick handles this cleanly: docs commits stay on `dev`, only feature/fix/chore commits go onto +`release/*`. + +### Why cherry-pick from `main`, not branch from `dev` + +Branching from `dev` and then `gio trash`-ing the guarded paths seems simpler but produces `add/add` merge conflicts +whenever `dev` and `main` have diverged. The file appears as "added" on both sides with different content. Always branch +from `origin/main` and cherry-pick onto it. + +### Why `main` is the default branch + release pointer + +Consumers install the bundle via `git clone --depth 1` (see `SKILL.md` install commands), which lands on the default +branch. `main` is therefore the published-release pointer; `dev` is the integration branch. Each release requires the +skill maintainer to fast-forward `main` to the new tag (the `release/* → main` PR squash-merge does this). + +## PR body conventions + +### No explainer prose in the body + +Every section of a PR body is user-facing substance only: what is changing for the consumer that was not already there — +the **net diff**, not the commit history or intermediate state that produced it. Workflow mechanics (cherry-pick, +regenerate, pre-push gate, CI behavior) is documented in RELEASES.md and `.github/`, NOT in the PR body. Triple-diff +output ("A: 12 files, B: none, C: clean"), leak-check narration ("`guard-main-docs` runs clean", "no guarded paths +leaked"), patch-id cherry-check counts, pre-push gate results, CI check status, exclusion rationale, and other +verification artifacts stay local; anomalies get fixed before push, not audit-trailed in the body. + +The PR body is read by humans reviewing what shipped. Workflow mechanics and tool-fix provenance are noise from that +perspective; they belong in this file, the script outputs, and the commit history respectively. + +## Triple-diff verification + +The release-PR procedure runs three diffs (A: main→release, B: release→dev for non-doc paths, C: dev→main) plus a +patch-id cherry check. This is belt-and-suspenders because missed cherry-picks have shipped to `main` on this and +sibling repos before, and the file-level diff in B alone doesn't catch the patch-id false-negative class. + +### Why patch-id cherry-check output is noisy + +In a squash-merge workflow, `git cherry HEAD origin/dev` produces many `+` lines that need human triage. They do NOT +auto-block the release. Expected sources of false positives: + +1. **Historical commits squash-merged in prior releases.** The squash commit on main has a different patch-id than the + dev commits it consolidates, so old commits show as `+` forever. Anything older than the previous release tag is + almost always this. +2. **Cherry-picks where conflict resolution stripped guarded paths** (`docs/plans/`, `docs/brainstorms/`, etc.) or + otherwise altered the tree. Same source-code intent, different patch-id. +3. **Intentionally skipped commits** (docs-only commits, release-prep backports, revert-and-redo prep steps). + +A real miss looks like: a recent feat/fix/chore commit on dev whose *file content* is not yet on main. To triage a `+` +line: + +```bash +git show --stat # what did it touch? +git diff origin/main..HEAD -- # already on release? +``` + +If every touched file is guarded (`docs/plans/`, `docs/brainstorms/`, etc.) OR the content is already on main via a +prior squash, it's a false positive (no action). Otherwise cherry-pick the commit and re-run the triple-diff. + +## CHANGELOG generation + +### Generated, never hand-written + +`scripts/generate-changelog.sh` (with `cliff.toml`) is the only sanctioned way to update `CHANGELOG.md`. The script: + +- Runs `git-cliff` to prepend a versioned entry for commits since the last tag. +- Walks each squash-merged PR's body, extracts the `## Changelog → ### Added / Changed / Fixed / Documentation` + subsections, and replaces the auto-generated bullets with the curated PR-body content (with author and PR-link + attribution). + +If a PR's `## Changelog` section is empty, that PR's entry is omitted from the changelog (empty section = no user-facing +change). To fix a wrong CHANGELOG entry, fix the input: edit the squash-merged PR body, then re-run the script. Do +**not** edit `CHANGELOG.md` directly. + +`scripts/generate-changelog.sh --check` verifies that `CHANGELOG.md` has a versioned section (not just `[Unreleased]`): +wire this into the release-branch CI if/when one is added. + +### Why `cliff.toml` skips chore/style/test/ci/build + +These commit types do not produce user-facing content. If a cherry-picked PR has user-facing `## Changelog` content but +its commit subject starts with one of those types, its bullets get silently dropped. After running the script, +cross-check the generated section against `gh pr view --json body` for each cherry-picked PR; correct mistyped PR +titles (e.g. `chore` → `feat`) and re-amend the cherry-pick subject before re-running. See "Prefer `feat`/`fix` over +`chore`" in global CLAUDE.md for prevention. + +## Spec-vendor pipeline + +The bundle vendors a snapshot of [`agentnative-spec`](https://github.com/brettdavies/agentnative) under `spec/`. When +the spec ships a new tag (e.g., `v0.3.0`), this skill re-vendors via `scripts/sync-spec.sh` on the `release/v` +branch (same commit as the version bump, message `chore(spec): re-vendor spec to `). The script auto-resolves +the latest upstream tag from the remote, so no manual version selection is needed. + +Without re-vendoring, the bundle ships stale spec content while consumers see the new version on `anc.dev`. Re-vendoring +on the release branch keeps the on-disk snapshot in lockstep with the published version that consumers will detect via +`bin/check-update`. + +### Skill version is independent of spec version + +The skill's version is independent of the spec it vendors. A spec bump that doesn't affect the skill's surface (e.g., +prose-only edits) can ship as a patch even when the spec went minor. SemVer guidance applies to the *skill's* observable +behaviour, not the spec's: + +- **Patch** (doc updates, internal cleanups, non-substantive template edits, vendoring a patch-level spec bump). +- **Minor** (new templates, new reference docs, new bundle files (backward-compatible additions), vendoring a + minor-level spec bump that adds requirements without tightening existing tiers). +- **Major** (breaking changes to the bundle's contract: renaming `SKILL.md` frontmatter fields, restructuring directory + layout in ways that break existing skill installations, moving content between subdirectories and the producer-ops + root, or vendoring a major-level spec bump (renamed/removed principles or tightened MUSTs that would regress existing + consumers)). + +## `bin/check-update` semantics + +Update detection at install sites is delegated to the bundle's `bin/check-update`, which compares the local bundle's +`VERSION` against `main` on GitHub. This is a pull-side mechanism: there is no push or notification. Consumers detect +the new release on their next `bin/check-update` run. + +This is why `main` (not `dev`) must be the published-release pointer: a `git clone --depth 1` lands on `main`, and +`bin/check-update` compares against `main`. Cutting a release without fast-forwarding `main` would mean consumers never +see the new VERSION. + +## Prose scrubbing scope + +Three release-flow artifacts live outside any automated prose check and need a manual scrub before they ship: + +- **PR bodies.** `gh pr create` and `gh pr edit` send body text directly to GitHub; no automated prose check has reach + there. +- **`CHANGELOG.md`.** A generated artifact built from upstream PR bodies. Findings inherit whatever prose those PR + bodies carry. +- **Release-PR bodies.** The `release/* → main` PR carries contributor-authored wrap-up text composed after + `CHANGELOG.md` has been generated, and the same out-of-repo gap applies. + +The canonical Vale + LanguageTool rule packs and orchestrator behaviour live in the spec repo at +[`~/dev/agentnative-spec/docs/architecture/voice-enforcement.md`](https://github.com/brettdavies/agentnative/blob/dev/docs/architecture/voice-enforcement.md). +Until those packs are vendored into this repo via a `scripts/sync-spec.sh` extension (a deferred follow-up tracked in +the spec plan), the scrub commands point at the spec checkout directly. + +Scrub-before-submit (author in `/tmp/`, scrub there, submit via `--body-file`) avoids the round-trip of "submit, scrub, +edit, scrub again". Every fix lands locally and the public PR sees only clean text. The auto-format hook skips `/tmp/` +paths so the body keeps its authored shape and no soft-wrapping is injected. + +For a `CHANGELOG.md` finding, fix the upstream PR body (which `generate-changelog.sh` re-fetches every run) and +regenerate. Hand-editing `CHANGELOG.md` directly produces drift the next regeneration overwrites. + +## Branch protection + +### Why three rulesets + +This repo ships three rulesets (`protect-main.json`, `protect-dev.json`, `protect-tags.json`) where peer repos ship two. +The third (`protect-tags.json`) treats `v*` tags as immutable historical anchors for released versions: deletion, +force-push (re-tag), and updates are all blocked. The bundle's `bin/check-update` and the `git clone --depth 1` install +path both rely on `main` and on tag identity; a re-tagged release would lie to consumers about what they're installing. + +### Why the apply step is re-runnable + +The three rulesets ship in `.github/rulesets/` and are applied via the GitHub API. The apply commands in RELEASES.md are +deliberately idempotent so they survive: (a) the original public-flip from the bootstrap window when the repo was +private (GitHub's free tier does not allow rulesets on private repos, so rulesets could not be applied until visibility +flipped); (b) any future ruleset reset; (c) the same procedure being copied into a new repo's bootstrap. + +### Status-check context strings + +The `required_status_checks[].context` strings in `protect-main.json` MUST match exactly what GitHub publishes for each +check: + +- **Inline job** (with `name:` field): published as just `` (no workflow-name prefix). +- **Reusable-workflow caller** (`uses: .../foo.yml@ref`): published as ` / `. + +Mixing these produces a stuck-but-green PR: all actual checks report green, but the ruleset waits forever on a context +that will never appear. Confirm the real contexts after a first CI run with: + +```bash +gh api repos/brettdavies/agentnative-skill/commits//check-runs --jq '.check_runs[].name' +``` + +## Related docs + +- [`RELEASES.md`](./RELEASES.md) (operational runbook: commands, paths, decision tables) +- [`AGENTS.md`](./AGENTS.md) (repo layout, lint commands, what agents must not do) +- [`CONTRIBUTING.md`](./CONTRIBUTING.md) (how to propose changes) +- [`.github/pull_request_template.md`](.github/pull_request_template.md) (PR body structure with changelog sections) +- [`.github/rulesets/README.md`](.github/rulesets/README.md) (ruleset apply + verify procedure) +- [`CHANGELOG.md`](./CHANGELOG.md) (released versions and their notes) diff --git a/RELEASES.md b/RELEASES.md index aff8cf3..1470c3c 100644 --- a/RELEASES.md +++ b/RELEASES.md @@ -1,7 +1,6 @@ # Releasing `agentnative-skill` -Every change reaches `main` via this pipeline. Direct commits to `dev` or `main` are not permitted — every change has a -PR number in its squash commit message, which keeps the history scannable, attributable, and changelog-ready. +Operational runbook. Rationale lives in [`RELEASES-RATIONALE.md`](./RELEASES-RATIONALE.md). ```text feature branch (feat/*, fix/*, chore/*, docs/*) → PR to dev (squash merge) @@ -10,9 +9,7 @@ feature branch (feat/*, fix/*, chore/*, docs/*) → PR to dev (squash merge) → tag v* on main → GitHub Release ``` -This is the canonical brettdavies release pattern with `release/*` cherry-pick branches. Plans live on `dev` forever and -`guard-main-docs.yml` blocks any `added` or `modified` engineering-doc files in PRs targeting `main`. The release-branch -cherry-pick handles this cleanly: docs commits stay on `dev`, only feature/fix/chore commits go onto `release/*`. +Direct commits to `dev` or `main` are not permitted: every change has a PR number in its squash commit message. ## Branches @@ -20,13 +17,10 @@ cherry-pick handles this cleanly: docs commits stay on `dev`, only feature/fix/c | -------------------------------------- | ------------------------------------------------------- | ------------------------------------------- | ------------------------------------ | | `main` | Released bundle. Only release-merged commits. | Forever. | `.github/rulesets/protect-main.json` | | `dev` | Integration. All feature PRs land here. Default branch. | Forever. Never delete. | `.github/rulesets/protect-dev.json` | -| `feat/*`, `fix/*`, `chore/*`, `docs/*` | Feature work. | One PR's worth. Auto-deleted on merge. | None — squash into `dev` freely. | +| `feat/*`, `fix/*`, `chore/*`, `docs/*` | Feature work. | One PR's worth. Auto-deleted on merge. | None. Squash into `dev` freely. | | `release/*` | Head of a `release/* → main` PR. | One release's worth. Auto-deleted on merge. | None. | -`dev` is a **forever branch**. Never delete it locally or remotely, even after a `release/* → main` merge. The next -release cycle reuses the same `dev`. The repo's `delete_branch_on_merge: true` setting doesn't touch `dev` because `dev` -is never the head of a PR — using a short-lived `release/*` head is what keeps the setting compatible with a forever -integration branch. +→ Rationale: [`RELEASES-RATIONALE.md` § Branching model](./RELEASES-RATIONALE.md#branching-model). ## Daily development (feature → dev) @@ -41,109 +35,92 @@ gh pr create --base dev --title "feat(scope): what changed" - **Commit style**: [Conventional Commits](https://www.conventionalcommits.org/). - **PR body**: follow `.github/pull_request_template.md`. The `## Changelog` section is the source of truth for - user-facing release notes — `CHANGELOG.md` entries derive from it directly. + user-facing release notes; `CHANGELOG.md` entries derive from it directly. See [§ PR body](#pr-body). +- **PR body prose scrub**: see [§ Prose scrubbing](#prose-scrubbing). + +## PR body + +Every PR (feature, fix, docs, release) uses `.github/pull_request_template.md` verbatim. + +- **No explainer prose anywhere in the body.** User-facing substance only. +- **Summary describes the net diff only** — what merged `main` looks like vs the base branch. Not commit history, + intermediate state, or cherry-pick mechanics. +- **Zero verification artifacts in the body.** No triple-diff stats, leak-check output ("`guard-main-docs` runs clean"), + patch-id cherry-check counts, pre-push gate results, CI status, or prose-scrub findings. Anomalies get fixed before + push, not audit-trailed. +- **Changelog** subsections (`### Added` / `### Changed` / `### Fixed` / `### Removed` / `### Security`): 1-5 bullets + each, delete empty subsections, each bullet starts with a verb. +- A PR with no user-facing impact (pure refactor, test-only, CI-only) leaves `## Changelog` empty or omits it. + +→ Rationale: [`RELEASES-RATIONALE.md` § PR body conventions](./RELEASES-RATIONALE.md#pr-body-conventions). ## Releasing dev to main Engineering docs (`docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`) live on `dev` only. -`guard-main-docs.yml` blocks any `added` or `modified` files under those paths from reaching `main`. Branching from -`dev` and deleting docs on the way produces `add/add` merge conflicts whenever `dev` and `main` have diverged (the norm -after the first squash merge). The cherry-pick pattern avoids this. +`guard-main-docs.yml` blocks any `added` or `modified` files under those paths from reaching `main`. **Branch naming**: `release/v` (preferred) or `release/-`. Keep the slug short and descriptive. ```bash -# 1. Cut release/* from main, NOT dev. Branching from dev causes add/add -# conflicts when dev and main have divergent histories. +# 1. Cut release/* from main, NOT dev. git fetch origin git checkout -b release/v origin/main -# 2. List the dev commits not yet on main: +# 2. List the dev commits not yet on main. git log --oneline dev --not origin/main -# 3. Cherry-pick non-docs commits onto release/v. Docs commits -# (anything that touched only docs/plans/, docs/solutions/, -# docs/brainstorms/, or docs/reviews/) stay on dev. +# 3. Cherry-pick non-docs commits onto release/v. Docs commits stay on dev. git cherry-pick ... -# 4. Triple-diff verification — belt-and-suspenders sweep that catches both -# directions of drift before the release tag goes out: -# -# A. main → release (what users will see; the intended ship surface) -# B. release → dev (should be empty for non-doc paths until the -# bump/CHANGELOG commits land, and even then should -# only list those release-prep files — anything else -# is a missed cherry-pick) -# C. dev → main (sanity: phantom commits dev "appears ahead" on -# because cherry-pick rewrites SHAs post-squash) -git diff origin/main..HEAD --stat # A -git diff HEAD..origin/dev --name-only | grep -v '^docs/' || echo "(none)" # B -git diff origin/dev..origin/main --stat | tail -5 # C -# -# Re-confirm no guarded paths leaked (this caught the original miss class): +# 4. Triple-diff verification. +git diff origin/main..HEAD --stat # A: ship surface +git diff HEAD..origin/dev --name-only | grep -v '^docs/' || echo "(none)" # B: no missed picks +git diff origin/dev..origin/main --stat | tail -5 # C: phantom-commits sanity + +# Re-confirm no guarded paths leaked. git diff origin/main..HEAD --name-only \ | grep -E '^(docs/plans|docs/brainstorms|docs/ideation|docs/reviews|docs/solutions|\.context)' \ - && echo "LEAKED — reset and redo" || echo "(clean — no guarded paths)" -# -# Patch-id cherry check — catches commits on dev that have NO patch-id -# equivalent on release. The file-level diff in B misses this class when -# the same content happens to land via a different commit. -# -# IMPORTANT: in a squash-merge workflow this output is noisy. Every '+' -# line needs human triage — it does NOT auto-block the release. Expected -# sources of '+' lines that are NOT real misses: -# -# 1. Historical commits squash-merged in prior releases. The squash -# commit on main has a different patch-id than the dev commits it -# consolidates, so old commits show as '+' forever. Anything older -# than the previous release tag is almost always this. -# 2. Cherry-picks where conflict resolution stripped guarded paths -# (docs/plans, docs/brainstorms, etc.) or otherwise altered the -# tree. Same source-code intent, different patch-id. -# 3. Intentionally skipped commits — docs-only commits, release-prep -# backports, revert-and-redo prep steps. -# -# A real miss looks like: a recent feat/fix/chore commit on dev whose -# *file content* is not yet on main. To triage a '+' line: -# -# git show --stat # what did it touch? -# git diff origin/main..HEAD -- # already on release? -# -# If every touched file is guarded (docs/plans/, docs/brainstorms/, etc.) -# OR the content is already on main via a prior squash, it's a false -# positive — no action. Otherwise cherry-pick the commit and re-run the -# triple-diff. -git cherry HEAD origin/dev | grep '^+' || echo "(none — release is patch-equivalent through dev)" -# -# If B lists any non-docs path you didn't expect, fetch dev, identify the -# commit (`git log dev --not origin/main`), cherry-pick it, re-run the -# triple-diff. Missed cherry-picks have shipped to main on this and sibling -# repos before — this step is the cheap way to catch them. + && echo "LEAKED — reset and redo" || echo "(clean)" + +# Patch-id cherry check (noisy in squash-merge workflow; triage per-line). +git cherry HEAD origin/dev | grep '^+' || echo "(none)" # 5. Bump VERSION on the release branch. echo '' > VERSION -# 6. Generate CHANGELOG entries from PR bodies. NEVER hand-edit CHANGELOG.md — -# the script is authoritative. It reads cliff.toml + each cherry-picked PR's -# ## Changelog section and prepends a versioned [] entry. +# 6. Re-vendor the spec if a new tag has shipped upstream. +scripts/sync-spec.sh +git add spec/ && git commit -m "chore(spec): re-vendor spec to " || true + +# 7. Generate CHANGELOG entries from PR bodies. scripts/generate-changelog.sh # (the script extracts from the branch name release/v) -# 7. Commit the version bump and generated changelog. +# 8. Scrub CHANGELOG.md via Vale + LanguageTool + unslop. See § Prose scrubbing. +# Fix findings on upstream PR bodies, never by hand-editing CHANGELOG.md. + +# 9. Commit the version bump and generated changelog. git add VERSION CHANGELOG.md git commit -m "chore(release): v" -# 8. Push and open the PR: +# 10. Push and open the PR. Scrub body in /tmp/ first. git push -u origin release/v -gh pr create --base main --head release/v --title "release: v" +gh pr create --base main --head release/v \ + --title "release: v" --body-file /tmp/body.md ``` When the PR merges: 1. The squash commit lands on `main` with the PR body as its message. 2. `release/v` is auto-deleted. -3. Tag the new `main` HEAD: `git checkout main && git pull && git tag -a v -m "v" && git push origin - v`. +3. Tag the new `main` HEAD: + + ```bash + git checkout main && git pull + git tag -a v -m "v" + git push origin v + ``` + 4. Create the GitHub Release using the generated CHANGELOG section: ```bash @@ -153,87 +130,86 @@ When the PR merges: Consumers detect the new release on their next `bin/check-update` run; nothing else to do here. -`dev` keeps moving forward. Never reset or rebase `dev` after a release — it is forever. +`dev` keeps moving forward. Never reset or rebase `dev` after a release: it is forever. -### CHANGELOG is generated, never hand-written +→ Rationale + triple-diff false-positive triage: +[`RELEASES-RATIONALE.md` § Triple-diff verification](./RELEASES-RATIONALE.md#triple-diff-verification). CHANGELOG +mechanics: [`RELEASES-RATIONALE.md` § CHANGELOG generation](./RELEASES-RATIONALE.md#changelog-generation). Spec +re-vendoring: [`RELEASES-RATIONALE.md` § Spec-vendor pipeline](./RELEASES-RATIONALE.md#spec-vendor-pipeline). -`scripts/generate-changelog.sh` (with `cliff.toml`) is the only sanctioned way to update `CHANGELOG.md`. The script: +## Version bump procedure -- Runs `git-cliff` to prepend a versioned entry for commits since the last tag. -- Walks each squash-merged PR's body, extracts the `## Changelog` section's `### Added` / `### Changed` / `### Fixed` / - `### Documentation` subsections, and replaces the auto-generated bullets with the curated PR-body content (with author - and PR-link attribution). +The version bump and CHANGELOG generation both happen on the `release/v` branch (steps 5-7 of the cherry-pick +flow above). There is no separate version-bump PR to `dev`. Picking the version is the only manual decision: -If a PR's `## Changelog` section is empty, that PR's entry is omitted from the changelog (the convention in -[`.github/pull_request_template.md`](.github/pull_request_template.md): empty section = no user-facing change). To fix a -wrong CHANGELOG entry, fix the input — edit the squash-merged PR body, then re-run the script. Do **not** edit -`CHANGELOG.md` directly. +- **Patch**: doc updates, internal cleanups, non-substantive template edits, vendoring a patch-level spec bump. +- **Minor**: new templates, new reference docs, new bundle files (backward-compatible additions), vendoring a + minor-level spec bump that adds requirements without tightening existing tiers. +- **Major**: breaking changes to the bundle's contract (renaming `SKILL.md` frontmatter fields, restructuring directory + layout in ways that break existing skill installations, moving content between subdirectories and the producer-ops + root, or vendoring a major-level spec bump). -`scripts/generate-changelog.sh --check` verifies that `CHANGELOG.md` has a versioned section (not just `[Unreleased]`) — -wire this into the release-branch CI if/when one is added. +→ Rationale: [`RELEASES-RATIONALE.md` § Spec-vendor pipeline](./RELEASES-RATIONALE.md#spec-vendor-pipeline) (skill +version is independent of spec version). -### Why branch from main, not dev +## Prose scrubbing -Branching from `dev` and then `gio trash`-ing the guarded paths seems simpler but produces `add/add` merge conflicts -whenever `dev` and `main` have diverged. The file appears as "added" on both sides with different content. Always branch -from `origin/main` and cherry-pick onto it. +Three release-flow artifacts live outside any automated prose check and need a manual scrub before they ship: -## Spec re-vendoring +- PR bodies (`gh pr create` / `gh pr edit` send body text directly to GitHub). +- `CHANGELOG.md` (a generated artifact built from upstream PR bodies). +- Release-PR bodies (composed after `CHANGELOG.md` has been generated). -The bundle vendors a snapshot of [`agentnative-spec`](https://github.com/brettdavies/agentnative) under `spec/`. When -the spec ships a new tag (e.g., `v0.3.0`), this skill re-vendors via `scripts/sync-spec.sh` on the `release/v` -branch — same commit as the version bump, message `chore(spec): re-vendor spec to `. The script auto-resolves -the latest upstream tag from the remote, so no manual version selection is needed. Without re-vendoring, the bundle -ships stale spec content while consumers see the new version on `anc.dev`. +The canonical Vale + LanguageTool rule packs and orchestrator behaviour live in the spec repo at +[`~/dev/agentnative-spec/docs/architecture/voice-enforcement.md`](https://github.com/brettdavies/agentnative/blob/dev/docs/architecture/voice-enforcement.md). +Until those packs are vendored into this repo via a `scripts/sync-spec.sh` extension (a deferred follow-up), the scrub +commands point at the spec checkout directly. -## Version bump procedure +```bash +# 1. Save the artifact to /tmp/. +gh pr view --json body --jq .body > /tmp/body.md # for PR body edits +# cp CHANGELOG.md /tmp/body.md # for changelog scrub -The version bump and CHANGELOG generation both happen on the `release/v` branch (steps 5–6 of the cherry-pick -flow above). There is no separate version-bump PR to `dev`. Picking the version is the only manual decision: +# 2. Vale (against the spec's rule packs). +vale --no-global --config ~/dev/agentnative-spec/.vale.ini --output=line --minAlertLevel=error /tmp/body.md -- **Patch** — doc updates, internal cleanups, non-substantive template edits, vendoring a patch-level spec bump. -- **Minor** — new templates, new reference docs, new bundle files (backward-compatible additions), vendoring a - minor-level spec bump that adds requirements without tightening existing tiers. -- **Major** — breaking changes to the bundle's contract: renaming `SKILL.md` frontmatter fields, restructuring directory - layout in ways that break existing skill installations, moving content between `` and the producer-ops root, or - vendoring a major-level spec bump (renamed/removed principles or tightened MUSTs that would regress existing - consumers). +# 3. LanguageTool grammar check via lt_check (~/dotfiles/config/shell/languagetool.sh). +# Skips cleanly if LT is unreachable. Inspect: `lt_rules`, `lt_info`. See +# ~/dev/agentnative-spec/CONTRIBUTING.md § Voice enforcement for the +# install-vs-required nuance. +lt_check /tmp/body.md -The skill's version is independent of the spec it vendors. A spec bump that doesn't affect the skill's surface (e.g., -prose-only edits) can ship as a patch even when the spec went minor. Use the SemVer guidance above against the *skill's* -observable behaviour, not the spec's. +# 4. unslop (em-dash density and AI-unique structural patterns). +~/.claude/skills/unslop/scripts/score.py /tmp/body.md -## PRs and changelog generation +# 5. Apply fixes per finding. Re-run until 0 blocking and unslop score is 0. -Every PR **must** follow `.github/pull_request_template.md`. The template's `## Changelog` section has these -subsections: +# 6. Apply the cleaned version. +gh pr edit --body-file /tmp/body.md # for PR body edits +# scripts/generate-changelog.sh # for CHANGELOG.md (re-runs the PR-body fetch from GitHub) +``` -- `### Added` — new user-visible features or capabilities (new principles, new checks, new templates). -- `### Changed` — changes to existing behavior (e.g., a check's pass/fail criteria tightens). -- `### Fixed` — bug fixes (e.g., a check produces false positives). -- `### Removed` — removed features or APIs. -- `### Security` — security-relevant changes (e.g., a script that ran on user machines now refuses an unsafe path). +For a `CHANGELOG.md` finding, fix the upstream PR body and regenerate. Hand-editing `CHANGELOG.md` directly produces +drift the next regeneration overwrites. -A PR that lands with an empty or missing `## Changelog` section silently drops its user-facing notes from the next -release. If a PR truly has no user-facing impact (pure refactor, test-only, CI-only), leave the section empty — the PR -still appears in git history. +→ Rationale + which artifacts need this: +[`RELEASES-RATIONALE.md` § Prose scrubbing scope](./RELEASES-RATIONALE.md#prose-scrubbing-scope). ## Branch protection Three rulesets are committed under `.github/rulesets/` and applied to the repo via the GitHub API: -- **`protect-main.json`** — required signatures, linear history, squash-only merges via PR with CODEOWNERS review, +- **`protect-main.json`**: required signatures, linear history, squash-only merges via PR with CODEOWNERS review, required status checks (`markdownlint`, `shellcheck`, `guard-docs / check-forbidden-docs`), creation/deletion blocked, non-fast-forward blocked. -- **`protect-dev.json`** — required signatures, deletion blocked, non-fast-forward blocked. No PR-requirement at the - ruleset level; the PR-only norm is enforced by convention. -- **`protect-tags.json`** — `v*` tags: deletion, force-push (re-tag), and updates all blocked. Tags are immutable +- **`protect-dev.json`**: required signatures, deletion blocked, non-fast-forward blocked. PR-only norm is enforced by + convention. +- **`protect-tags.json`**: `v*` tags. Deletion, force-push (re-tag), and updates all blocked. Tags are immutable historical anchors for released versions. -### Apply (post-public-flip) +### Apply -The repo ships **PRIVATE** through the bootstrap window. GitHub's free tier does not allow rulesets on private repos. -After visibility flips to public from the agentnative-site session: +All three rulesets are already installed on this repo. Re-runnable for new repos or after a ruleset reset: ```bash gh api repos/brettdavies/agentnative-skill/rulesets -X POST --input .github/rulesets/protect-main.json @@ -241,6 +217,12 @@ gh api repos/brettdavies/agentnative-skill/rulesets -X POST --input .github/rule gh api repos/brettdavies/agentnative-skill/rulesets -X POST --input .github/rulesets/protect-tags.json ``` +Verify installed rulesets: + +```bash +gh api repos/brettdavies/agentnative-skill/rulesets --jq '.[] | "\(.id)\t\(.name)\t\(.target)"' +``` + See [`.github/rulesets/README.md`](.github/rulesets/README.md) for verification + negative tests. ### Updating a ruleset @@ -255,9 +237,7 @@ gh api repos/brettdavies/agentnative-skill/rulesets --jq '.[] | "\(.id)\t\(.name gh api -X PUT repos/brettdavies/agentnative-skill/rulesets/ --input .github/rulesets/protect-main.json ``` -### Status-check context pitfall - -`required_status_checks[].context` strings must match exactly what GitHub publishes for each check. For this repo: +### Status-check contexts (verified) | Check | Source | Context (verified) | | ------------------ | ---------------------------------------------- | ----------------------------------- | @@ -271,10 +251,15 @@ Confirm post-CI with: gh api repos/brettdavies/agentnative-skill/commits//check-runs --jq '.check_runs[].name' ``` +→ Rationale (inline vs reusable, three-ruleset shape): +[`RELEASES-RATIONALE.md` § Branch protection](./RELEASES-RATIONALE.md#branch-protection). + ## Related docs -- [`AGENTS.md`](./AGENTS.md) — repo layout, lint commands, what agents must not do. -- [`CONTRIBUTING.md`](./CONTRIBUTING.md) — how to propose changes. -- [`.github/pull_request_template.md`](.github/pull_request_template.md) — PR body structure with changelog sections. -- [`.github/rulesets/README.md`](.github/rulesets/README.md) — ruleset apply + verify procedure. -- [`CHANGELOG.md`](./CHANGELOG.md) — released versions and their notes. +- [`RELEASES-RATIONALE.md`](./RELEASES-RATIONALE.md) (release flow rationale, CHANGELOG pipeline, branch-protection + pitfalls) +- [`AGENTS.md`](./AGENTS.md) (repo layout, lint commands, what agents must not do) +- [`CONTRIBUTING.md`](./CONTRIBUTING.md) (how to propose changes) +- [`.github/pull_request_template.md`](.github/pull_request_template.md) (PR body structure with changelog sections) +- [`.github/rulesets/README.md`](.github/rulesets/README.md) (ruleset apply + verify procedure) +- [`CHANGELOG.md`](./CHANGELOG.md) (released versions and their notes) diff --git a/scripts/sync-prose-tooling.sh b/scripts/sync-prose-tooling.sh new file mode 100755 index 0000000..905a476 --- /dev/null +++ b/scripts/sync-prose-tooling.sh @@ -0,0 +1,128 @@ +#!/usr/bin/env bash +# Vendor the cross-channel prose source-of-truth shipped on agentnative-spec. +# +# Parallel sync vehicle to scripts/sync-spec.sh, decoupled because the brand +# narrative (BRAND.md) and the contract artifacts (principles/, VERSION, +# CHANGELOG.md) release on different cadences. sync-spec.sh covers the +# contract anc lints against; this script covers the universal voice anchor +# that PRODUCT.md inherits from. +# +# Vendored manifest (paths at spec main HEAD, mirrored verbatim into this +# repo at the same paths): +# +# BRAND.md (universal voice SoT) +# +# BRAND.md only, by design. The skill repo does not run prose-check today +# (no .vale.ini, no styles/ tree), so the broader prose-tooling stack the +# spec ships (Vale rule packs, vocabulary, prose-check.sh orchestrator, +# generate-pack-readme.mjs) is omitted here. When the skill activates Vale, +# extend the vendored manifest below to mirror the site's +# scripts/sync-prose-tooling.sh: add styles/brand/*.yml + README.md, +# styles/config/vocabularies/brand/{accept,reject}.txt, scripts/prose-check.sh, +# and scripts/generate-pack-readme.mjs to required_paths and the extract block. +# +# Tracks `main` HEAD by design: the prose-tooling stack (BRAND.md and, +# once expanded, Vale packs / vocabularies / prose-check.sh) iterates +# faster than the principle contract and does not need release-tag +# ceremony. Tag-pinning is reserved for the principle contract via +# `sync-spec.sh`. Resolves the current commit on `main`, preferring the +# remote repository, and falls back to a local checkout if the remote is +# unreachable. Extracts files via `git show :` so neither +# checkout's working tree is perturbed. +# +# Usage: +# scripts/sync-prose-tooling.sh +# SPEC_ROOT=/path/to/agentnative-spec scripts/sync-prose-tooling.sh +# SPEC_REMOTE_URL=git@github.com:brettdavies/agentnative.git scripts/sync-prose-tooling.sh +# +# Env vars (shared with sync-spec.sh): +# SPEC_REMOTE_URL Remote URL to query first. +# Default: https://github.com/brettdavies/agentnative.git +# SPEC_ROOT Local checkout to fall back to when the remote is +# unreachable. Default: $HOME/dev/agentnative-spec +# +# Resync cadence: rerun whenever the spec's `main` advances with changes +# under the vendored manifest (today: BRAND.md). Tracks `main` HEAD by +# design; tag-pinning is for the principle contract via `sync-spec.sh`. +# Spec's `repository_dispatch:spec-release` event fires on tag publish; a +# consumer-side handler that auto-PRs the resync is tracked as deferred +# follow-up alongside the same handler for sync-spec.sh. +# +# Idempotent at a fixed upstream sha: re-running with no upstream change +# produces no `git diff`. + +set -euo pipefail + +SPEC_REMOTE_URL="${SPEC_REMOTE_URL:-https://github.com/brettdavies/agentnative.git}" +SPEC_ROOT="${SPEC_ROOT:-$HOME/dev/agentnative-spec}" + +REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" + +# Cleanup hook for the temp clone (set only after mktemp succeeds). +tmp_root="" +cleanup() { + if [[ -n "$tmp_root" && -d "$tmp_root" ]]; then + rm -rf "$tmp_root" + fi +} +trap cleanup EXIT + +# === Remote-first resolution =========================================== +spec_source="" +spec_ref="" +resolved_sha="" + +echo "querying $SPEC_REMOTE_URL for main HEAD..." +remote_sha="$(git ls-remote "$SPEC_REMOTE_URL" 'refs/heads/main' 2>/dev/null | awk '{print $1}')" + +if [[ -n "$remote_sha" ]]; then + tmp_root="$(mktemp -d -t agentnative-prose-XXXXXX)" + if git clone --depth 1 --branch main --quiet \ + "$SPEC_REMOTE_URL" "$tmp_root" 2>/dev/null; then + spec_source="$tmp_root" + spec_ref="main" + resolved_sha="$(git -C "$spec_source" rev-parse --short=7 main)" + echo "pulling from main @ $resolved_sha (remote $SPEC_REMOTE_URL)" + fi +fi + +# === Local fallback ==================================================== +if [[ -z "$spec_source" ]]; then + if [[ ! -d "$SPEC_ROOT/.git" ]]; then + echo "error: remote unreachable and SPEC_ROOT is not a git repository: $SPEC_ROOT" >&2 + echo " remote: $SPEC_REMOTE_URL" >&2 + echo " set SPEC_ROOT to your agentnative-spec checkout, or check network access." >&2 + exit 1 + fi + echo "warning: remote query failed; falling back to local $SPEC_ROOT" >&2 + + spec_source="$SPEC_ROOT" + if ! git -C "$spec_source" rev-parse --verify --quiet main >/dev/null; then + echo "error: no local main branch found in $SPEC_ROOT" >&2 + echo " try \`git -C $SPEC_ROOT fetch origin main:main\` to track upstream" >&2 + exit 1 + fi + spec_ref="main" + resolved_sha="$(git -C "$spec_source" rev-parse --short=7 main)" + echo "pulling from main @ $resolved_sha (local $spec_source)" +fi + +# === Verify expected paths exist at main =============================== +required_paths=( + "BRAND.md" +) +for path in "${required_paths[@]}"; do + if ! git -C "$spec_source" cat-file -e "$spec_ref:$path" 2>/dev/null; then + echo "error: main @ $resolved_sha is missing required path: $path" >&2 + echo " (BRAND.md may not have landed on main yet)" >&2 + exit 1 + fi +done + +# === Extract: top-level singletons ===================================== +git -C "$spec_source" show "$spec_ref:BRAND.md" >"$REPO_ROOT/BRAND.md" + +# === Report ============================================================ +echo "wrote BRAND.md to repo root (pulled from main @ $resolved_sha)" +echo +echo "next: review \`git diff\` for unexpected changes, then commit." From 3d61954ed7677b9cd5288ac1c03d6bb73faa750d Mon Sep 17 00:00:00 2001 From: Brett Date: Wed, 27 May 2026 17:21:52 -0500 Subject: [PATCH 05/15] feat(scripts): add sync-dev-after-release.sh release-backport tool (#18) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Adds `scripts/sync-dev-after-release.sh` to backport release-bookkeeping (`VERSION` + `CHANGELOG.md`) from `main` to `dev` after a release tag publishes. Mirrors `~/dev/agentnative-cli/scripts/sync-dev-after-release.sh`; this variant drops the Cargo.toml/Cargo.lock steps since the skill bundle is markdown-only. `RELEASES.md` gains an "After publish: sync `dev` with the release" subsection documenting the invocation; `RELEASES-RATIONALE.md` gains a matching "Why backport `main` → `dev` after publish" section explaining the direct-to-dev exception (one signed commit, no PR) and why `dev` needs the bookkeeping current. ## Changelog ### Added - `scripts/sync-dev-after-release.sh`: release-backport tool that overwrites `VERSION` with the released number and copies `CHANGELOG.md` verbatim from `origin/main` as one signed commit on `dev`. Idempotent on re-run. ### Changed - `RELEASES.md` documents the post-publish backport step under "Releasing dev to main." - `RELEASES-RATIONALE.md` documents the rationale for landing the backport as a direct-to-dev commit (rather than through a PR) and the load-bearing consequences of skipping it. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) - [ ] `fix`: Bug fix (non-breaking change which fixes an issue) - [ ] `refactor`: Code refactoring (no functional changes) - [ ] `perf`: Performance improvement - [ ] `docs`: Documentation update - [ ] `test`: Adding or updating tests - [ ] `chore`: Maintenance tasks (dependencies, config, etc.) - [ ] `ci`: CI/CD configuration changes - [ ] `style`: Code style/formatting changes - [ ] `build`: Build system changes - [ ] `BREAKING CHANGE`: Breaking API change (requires major version bump) ## Related Issues/Stories - Story: n/a - Issue: n/a - Architecture: mirrors the release-backport pattern already shipped in `agentnative-cli` (`scripts/sync-dev-after-release.sh` + corresponding RELEASES.md / RELEASES-RATIONALE.md sections). - Related PRs: n/a ## Testing - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing ## Files Modified **Modified:** - `RELEASES.md`: new "After publish: sync `dev` with the release" subsection at the end of "Releasing dev to main." - `RELEASES-RATIONALE.md`: new "Why backport `main` → `dev` after publish" section between `bin/check-update` semantics and Prose scrubbing scope. **Created:** - `scripts/sync-dev-after-release.sh`: vX.Y.Z arg, verifies tag is on `origin/main`, fast-forward-pulls `dev`, overwrites `VERSION`, copies `CHANGELOG.md` from `origin/main`, commits with a `chore(release): backport vX.Y.Z artifacts to dev` message. Idempotent. **Renamed:** - None. **Deleted:** - None. ## Key Features - Direct-to-dev backport commit (no PR) makes the procedure fast and matches the cli's established convention. - Idempotent re-runs: safe to invoke from automation that doesn't track prior backport state. - Argument validation refuses anything not matching `vMAJOR.MINOR.PATCH` and refuses to run on a dirty working tree or a tag not reachable from `origin/main`. ## Benefits - `dev`'s `VERSION` stops drifting from the released number across release cycles. Feature branches cut from `dev` inherit the right baseline. - `bin/check-update` no longer reports false `UPGRADE_AVAILABLE` on consumer clones whose local `VERSION` came from a `dev` checkout. ## Breaking Changes - [x] No breaking changes - [ ] Breaking changes described below: ## Deployment Notes - [x] No special deployment steps required - [ ] Deployment steps documented below: After this PR merges, run `./scripts/sync-dev-after-release.sh v0.2.0` to backport the v0.2.0 release-bookkeeping that has been outstanding since 2026-04-29. ## Screenshots/Recordings n/a. Script + docs change. ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible (or breaking changes documented) --- RELEASES-RATIONALE.md | 25 ++++++++++ RELEASES.md | 18 +++++++ scripts/sync-dev-after-release.sh | 83 +++++++++++++++++++++++++++++++ 3 files changed, 126 insertions(+) create mode 100755 scripts/sync-dev-after-release.sh diff --git a/RELEASES-RATIONALE.md b/RELEASES-RATIONALE.md index c9c873f..6bee33c 100644 --- a/RELEASES-RATIONALE.md +++ b/RELEASES-RATIONALE.md @@ -139,6 +139,31 @@ This is why `main` (not `dev`) must be the published-release pointer: a `git clo `bin/check-update` compares against `main`. Cutting a release without fast-forwarding `main` would mean consumers never see the new VERSION. +## Why backport `main` → `dev` after publish + +Once a release tag publishes, `scripts/sync-dev-after-release.sh` backports the release-bookkeeping files from `main` to +`dev` so future feature branches inherit the correct baseline. Two files move: `VERSION` (overwritten with the released +number) and `CHANGELOG.md` (copied verbatim from `origin/main`, which is authoritative for the changelog). + +Without the backport, `dev` keeps the pre-release `VERSION` indefinitely (the release bump lives only on the `release/*` +branch that was squash-merged to `main` and never touched `dev`). Feature branches cut from `dev` then carry a stale +baseline — confusing during review, and load-bearing in two places: (a) `bin/check-update` compares the caller's local +`VERSION` against the producer repo's `main`, so a stale local `VERSION` from a `dev` clone would falsely report +`UPGRADE_AVAILABLE` on a current main; (b) the `chore(spec)` re-vendor commit message references the current bundle +version. + +The backport lands directly on `dev` (one signed commit, no PR). This is a deliberate exception to the PR-only norm on +`dev` documented in [`RELEASES.md`](./RELEASES.md) — the change is mechanical (no design content) and the script's +idempotency makes it safe to re-run. The exception holds only for this script's output; everything else still lands on +`dev` via PR. + +The script is idempotent: it exits 0 with no commit when `VERSION` and `CHANGELOG.md` already match `main`. Safe to +re-run, safe to invoke from automation that doesn't track whether the last release was already backported. + +Mirror of `~/dev/agentnative-cli/scripts/sync-dev-after-release.sh`. The cli variant additionally regenerates +`Cargo.lock` via `cargo build --release` after surgically updating `Cargo.toml`'s `[package].version`; the skill bundle +is markdown-only and ships no lock file, so those steps drop. + ## Prose scrubbing scope Three release-flow artifacts live outside any automated prose check and need a manual scrub before they ship: diff --git a/RELEASES.md b/RELEASES.md index 1470c3c..a043f33 100644 --- a/RELEASES.md +++ b/RELEASES.md @@ -137,6 +137,24 @@ Consumers detect the new release on their next `bin/check-update` run; nothing e mechanics: [`RELEASES-RATIONALE.md` § CHANGELOG generation](./RELEASES-RATIONALE.md#changelog-generation). Spec re-vendoring: [`RELEASES-RATIONALE.md` § Spec-vendor pipeline](./RELEASES-RATIONALE.md#spec-vendor-pipeline). +### After publish — sync `dev` with the release + +Once the release tag is published, backport the release-bookkeeping files from `main` to `dev`: + +```bash +./scripts/sync-dev-after-release.sh v +git push origin dev +``` + +The script overwrites `VERSION` with the released number and copies `CHANGELOG.md` verbatim from `origin/main`, then +commits the result directly to `dev` as one signed commit (no PR). Without this step `dev`'s `VERSION` and +`CHANGELOG.md` stay frozen at the pre-release state, and future feature branches inherit the wrong baseline. + +The backport is idempotent: re-running on a `dev` already in sync exits 0 with no commit. + +→ Rationale: +[`RELEASES-RATIONALE.md` § Why backport `main` → `dev` after publish](./RELEASES-RATIONALE.md#why-backport-main--dev-after-publish). + ## Version bump procedure The version bump and CHANGELOG generation both happen on the `release/v` branch (steps 5-7 of the cherry-pick diff --git a/scripts/sync-dev-after-release.sh b/scripts/sync-dev-after-release.sh new file mode 100755 index 0000000..ac1adce --- /dev/null +++ b/scripts/sync-dev-after-release.sh @@ -0,0 +1,83 @@ +#!/usr/bin/env bash +# Backport release artifacts from main to dev after a release tag publishes. +# +# Pulls two files from main and lands them as a single signed commit on dev: +# - VERSION — overwritten with the released version (plain text, no leading "v"). +# - CHANGELOG.md — copied verbatim from origin/main. Main is fully +# authoritative for CHANGELOG; dev never edits it directly. +# +# Run AFTER: +# 1. The release/v* -> main PR has merged. +# 2. `git tag -a vX.Y.Z` has been pushed to origin. +# 3. The GitHub Release has been created. +# +# Usage: +# ./scripts/sync-dev-after-release.sh v0.2.0 +# +# Idempotent: safe to re-run. If dev already matches main on these two +# files, the script exits 0 with no commit. +# +# Mirror of ~/dev/agentnative-cli/scripts/sync-dev-after-release.sh; this +# variant drops the Cargo.toml/Cargo.lock steps because the skill bundle +# is markdown-only (single plain-text VERSION file). + +set -euo pipefail + +if [[ $# -ne 1 ]]; then + echo "usage: $0 vX.Y.Z" >&2 + exit 64 +fi + +VERSION="$1" +if [[ ! "$VERSION" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then + echo "error: version must match vMAJOR.MINOR.PATCH (got: $VERSION)" >&2 + exit 64 +fi +VERSION_NO_V="${VERSION#v}" + +REPO_ROOT="$(git rev-parse --show-toplevel)" +cd "$REPO_ROOT" + +if [[ -n "$(git status --porcelain)" ]]; then + echo "error: working tree not clean -- commit or stash first" >&2 + git status --short >&2 + exit 65 +fi + +git fetch origin --tags --quiet + +# Verify the release tag exists locally. +if ! git rev-parse --verify --quiet "refs/tags/$VERSION" >/dev/null; then + echo "error: tag $VERSION not found locally -- run 'git fetch origin --tags' or verify the release published" >&2 + exit 66 +fi + +# Verify main is at or past the tag (i.e. release/* actually merged). +TAG_SHA="$(git rev-parse "$VERSION")" +if ! git merge-base --is-ancestor "$TAG_SHA" origin/main; then + echo "error: tag $VERSION is not reachable from origin/main -- wait for release/v* to merge" >&2 + exit 66 +fi + +git switch dev +git pull --ff-only origin dev + +# VERSION is plain text; overwrite with the released version (no leading "v"). +printf '%s\n' "$VERSION_NO_V" > VERSION + +# CHANGELOG.md from main (authoritative). +git checkout origin/main -- CHANGELOG.md + +if git diff --quiet VERSION CHANGELOG.md; then + echo "no changes -- dev already in sync with $VERSION" + exit 0 +fi + +git add VERSION CHANGELOG.md +git commit -m "chore(release): backport $VERSION artifacts to dev + +Brings dev's release-bookkeeping current with the $VERSION release on +main: VERSION bumped to ${VERSION_NO_V} and CHANGELOG.md copied verbatim +from origin/main." + +echo "committed; push with: git push origin dev" From d6944c256dbb5348cd4618af6200dde7835c35e5 Mon Sep 17 00:00:00 2001 From: Brett Date: Fri, 29 May 2026 17:14:44 -0500 Subject: [PATCH 06/15] refactor: rename anc check to anc audit and re-vendor spec v0.4.0 (#19) Migrates the skill's compliance-audit terminology from "check" to "audit" so it matches the renamed `anc` subcommand. `anc check` becomes `anc audit`, the "checker" noun becomes "auditor", and the compliance-sense prose (the four-step loop, "compliance auditing", "audit IDs") follows. The vendored `spec/` is re-synced from agentnative-spec's `refactor/check-to-audit` branch, which applies the same rename upstream and adds the new P8 principle, bumping the bundled spec from 0.3.0 to 0.4.0. The same re-sync also starts vendoring `principles/scoring.md` (leaderboard formula and badge eligibility) and teaches `sync-spec.sh` to pull it. The rename is coupled to the unreleased `anc` 0.4.0, which removes the `check` subcommand with no back-compat alias. The installed `anc` 0.3.1 still requires `anc check`, so this branch must not merge until `anc` 0.4.0 ships. The unrelated update-check feature (`bin/check-update`, `references/update-check.md`) and incidental tokens (`git checkout`, `shellcheck`, checklists, CI status checks) keep their spelling: they are a different sense of "check". - P8 (Discoverable Through Agent Skill Bundles) principle, vendored from agentnative-spec v0.4.0. - `principles/scoring.md` (leaderboard formula, badge eligibility floor, color bands) is now vendored into `spec/`; `scripts/sync-spec.sh` fetches it alongside the principle files. - The canonical audit command is now `anc audit` (was `anc check`), matching the renamed `anc` subcommand. Skill docs, the four-step loop, and all `anc`-compliance prose now read "audit" and "auditor". - Bundled spec bumped 0.3.0 to 0.4.0; the skill now teaches eight principles. - `spec/README.md` now links to the upstream spec landing page (leaderboard, badge convention, acknowledgements) and documents `scoring.md` in the layout table. - [x] `feat`: New feature (non-breaking change which adds functionality) - [ ] `fix`: Bug fix (non-breaking change which fixes an issue) - [x] `refactor`: Code refactoring (no functional changes) - [ ] `perf`: Performance improvement - [ ] `docs`: Documentation update - [ ] `test`: Adding or updating tests - [ ] `chore`: Maintenance tasks (dependencies, config, etc.) - [ ] `ci`: CI/CD configuration changes - [ ] `style`: Code style/formatting changes - [ ] `build`: Build system changes - [ ] `BREAKING CHANGE`: Breaking API change (requires major version bump) - Story: n/a - Issue: n/a - Architecture: Coupled-release with agentnative-spec `refactor/check-to-audit` (commit a0771a7) and the unreleased `anc` 0.4.0 subcommand rename. - Related PRs: n/a - [ ] Unit tests added/updated - [ ] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing **Test Summary:** - `git grep 'anc check'` returns zero matches; `git grep 'checker'` returns only the byte-faithful historical entry in `spec/CHANGELOG.md` (vendored, fixed upstream rather than here). - Corruption scan (`auditout`, `shellaudit`, `is a AUDIT`) returns zero matches; the update-check feature and incidental tokens are untouched. - `spec/` re-vendored via `scripts/sync-spec.sh --ref refactor/check-to-audit`; `spec/VERSION` is 0.4.0, P8 is present, and `scoring.md` is vendored. - `shellcheck` clean on `scripts/sync-spec.sh`; markdownlint clean on edited markdown. **Modified:** - Skill content: `SKILL.md`, `getting-started.md`, `README.md`, `AGENTS.md`, `CONTRIBUTING.md`, `PRODUCT.md`, `CHANGELOG.md`, `references/project-structure.md`. - Issue templates: `.github/ISSUE_TEMPLATE/00-blank.yml`, `bug-report.yml`, `bundle-proposal.yml`, `config.yml`. - Planning docs: `docs/brainstorms/2026-05-01-001`, `docs/brainstorms/2026-05-01-002`, `docs/plans/2026-04-27-001`, `docs/plans/2026-05-01-001`. - Sync tooling: `scripts/sync-spec.sh` (now vendors `scoring.md`). - Vendored spec (re-sync): `spec/VERSION`, `spec/CHANGELOG.md`, `spec/README.md`, `spec/principles/p1` through `p7`. **Created:** - `spec/principles/p8-discoverable-skill-bundle.md` (vendored). - `spec/principles/scoring.md` (vendored). **Renamed:** - None. **Deleted:** - None. - [ ] No breaking changes - [x] Breaking changes described below: The canonical audit command changes from `anc check` to `anc audit` with no transitional alias. Agents running `anc` 0.3.1 will break against the new docs; do not merge until `anc` 0.4.0 (with the `audit` subcommand) is released. - [x] No special deployment steps required - [ ] Deployment steps documented below: - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [ ] Changes are backward compatible (or breaking changes documented) --- .github/ISSUE_TEMPLATE/00-blank.yml | 2 +- .github/ISSUE_TEMPLATE/bug-report.yml | 4 +- .github/ISSUE_TEMPLATE/bundle-proposal.yml | 2 +- .github/ISSUE_TEMPLATE/config.yml | 4 +- AGENTS.md | 8 +- CHANGELOG.md | 6 +- CONTRIBUTING.md | 6 +- PRODUCT.md | 8 +- README.md | 6 +- SKILL.md | 41 +- ...5-01-001-skill-determinism-requirements.md | 447 +++++ ...c-determinism-feature-asks-requirements.md | 199 ++ ...27-001-bootstrap-agentnative-skill-plan.md | 329 +++ ...1-feat-skill-determinism-hardening-plan.md | 1786 +++++++++++++++++ getting-started.md | 14 +- references/project-structure.md | 2 +- scripts/sync-spec.sh | 14 +- spec/CHANGELOG.md | 63 +- spec/README.md | 16 +- spec/VERSION | 2 +- .../p1-non-interactive-by-default.md | 52 +- .../p2-structured-parseable-output.md | 75 +- .../p3-progressive-help-discovery.md | 38 +- .../p4-fail-fast-actionable-errors.md | 54 +- .../p5-safe-retries-mutation-boundaries.md | 52 +- ...omposable-predictable-command-structure.md | 61 +- .../p7-bounded-high-signal-responses.md | 44 +- .../p8-discoverable-skill-bundle.md | 94 + spec/principles/scoring.md | 129 ++ 29 files changed, 3352 insertions(+), 206 deletions(-) create mode 100644 docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md create mode 100644 docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md create mode 100644 docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md create mode 100644 docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md create mode 100644 spec/principles/p8-discoverable-skill-bundle.md create mode 100644 spec/principles/scoring.md diff --git a/.github/ISSUE_TEMPLATE/00-blank.yml b/.github/ISSUE_TEMPLATE/00-blank.yml index 36479dc..a2b939c 100644 --- a/.github/ISSUE_TEMPLATE/00-blank.yml +++ b/.github/ISSUE_TEMPLATE/00-blank.yml @@ -28,7 +28,7 @@ body: 1. **Pick a structured template first.** Bundle bugs and bundle proposals have dedicated forms — use them when they fit. 2. **Search first.** Run `gh search issues --repo brettdavies/agentnative-skill ""` to check for duplicates. 3. **AI disclosure is required.** Fill the field above honestly. - 4. **Wrong repo?** Spec questions, principle edits, and grading findings live on [brettdavies/agentnative](https://github.com/brettdavies/agentnative). `anc` checker bugs and tool-registry submissions live on [brettdavies/agentnative-cli](https://github.com/brettdavies/agentnative-cli). Site bugs live on [brettdavies/agentnative-site](https://github.com/brettdavies/agentnative-site). + 4. **Wrong repo?** Spec questions, principle edits, and grading findings live on [brettdavies/agentnative](https://github.com/brettdavies/agentnative). `anc` auditor bugs and tool-registry submissions live on [brettdavies/agentnative-cli](https://github.com/brettdavies/agentnative-cli). Site bugs live on [brettdavies/agentnative-site](https://github.com/brettdavies/agentnative-site). 5. See [CONTRIBUTING.md](https://github.com/brettdavies/agentnative-skill/blob/main/CONTRIBUTING.md) for full guidelines. validations: diff --git a/.github/ISSUE_TEMPLATE/bug-report.yml b/.github/ISSUE_TEMPLATE/bug-report.yml index fb5936a..f517865 100644 --- a/.github/ISSUE_TEMPLATE/bug-report.yml +++ b/.github/ISSUE_TEMPLATE/bug-report.yml @@ -8,7 +8,7 @@ body: value: | **Route check before filing:** - - For bugs in `anc` (the checker itself — wrong scorecard, missing check, false positive), file on the + - For bugs in `anc` (the auditor itself — wrong scorecard, missing check, false positive), file on the [linter repo](https://github.com/brettdavies/agentnative-cli/issues/new/choose). - For substantive principle changes (new principles, MUST/SHOULD/MAY tier changes), file on the [spec repo](https://github.com/brettdavies/agentnative/issues/new/choose). @@ -85,7 +85,7 @@ body: 1. **Search first.** Run `gh search issues --repo brettdavies/agentnative-skill ""` to check for duplicates. - 2. **Route check.** Verify this is a skill-bundle bug, not an `anc` checker bug (file in agentnative-cli) or + 2. **Route check.** Verify this is a skill-bundle bug, not an `anc` auditor bug (file in agentnative-cli) or a spec issue (file in agentnative). 3. **AI disclosure is required.** Fill the field above honestly. 4. See [CONTRIBUTING.md](https://github.com/brettdavies/agentnative-skill/blob/main/CONTRIBUTING.md) for full diff --git a/.github/ISSUE_TEMPLATE/bundle-proposal.yml b/.github/ISSUE_TEMPLATE/bundle-proposal.yml index 4d7b391..90fd630 100644 --- a/.github/ISSUE_TEMPLATE/bundle-proposal.yml +++ b/.github/ISSUE_TEMPLATE/bundle-proposal.yml @@ -11,7 +11,7 @@ body: - New principle, MUST/SHOULD/MAY tier change, or any substantive change to the standard itself → file on the [spec repo](https://github.com/brettdavies/agentnative/issues/new/choose). This skill bundle vendors the spec; principle changes happen there first, then arrive here via `scripts/sync-spec.sh`. - - New compliance check, change to scorecard semantics, or anything `anc check` does → file on the + - New compliance audit, change to scorecard semantics, or anything `anc audit` does → file on the [linter repo](https://github.com/brettdavies/agentnative-cli/issues/new/choose). - New starter template, reference doc, getting-started flow, idiom for a new language/framework, or any change to how this bundle teaches the existing principles → keep filing here. diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml index 8b7c3b2..6db7b9f 100644 --- a/.github/ISSUE_TEMPLATE/config.yml +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -3,9 +3,9 @@ contact_links: - name: "Spec questions, principle edits, or grading findings" url: "https://github.com/brettdavies/agentnative/issues/new/choose" about: "For anything about the standard itself — propose changes, submit a grading finding, ask questions — file on the spec repo." - - name: "Checker bugs, features, or tool-registry submissions" + - name: "Auditor bugs, features, or tool-registry submissions" url: "https://github.com/brettdavies/agentnative-cli/issues/new/choose" - about: "For bugs in the `anc` checker, feature requests, or proposing a tool for the leaderboard, file on the linter repo." + about: "For bugs in the `anc` auditor, feature requests, or proposing a tool for the leaderboard, file on the linter repo." - name: "Site bugs (rendering, performance, deployment)" url: "https://github.com/brettdavies/agentnative-site/issues/new/choose" about: "For bugs on anc.dev, file on the site repo." diff --git a/AGENTS.md b/AGENTS.md index d7b9ce4..2a8adde 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -2,8 +2,8 @@ Project-level agent instructions for `agentnative-skill` — the producer repo for the `agent-native-cli` skill. -This repo is **not** a Rust CLI tool and **not** a compliance checker. It is the agent-facing guide that pairs with -[`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical checker) and +This repo is **not** a Rust CLI tool and **not** a compliance auditor. It is the agent-facing guide that pairs with +[`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical auditor) and [`agentnative-spec`](https://github.com/brettdavies/agentnative) (the canonical principle text, vendored at `spec/`). The skill teaches agents how to use `anc` and supplies the surrounding context — spec, idioms, templates — that `anc` findings reference. @@ -17,7 +17,7 @@ auto-discovers `SKILL.md` at the install root and ignores everything else. Produ | Path | Read at runtime by host? | Purpose | | ---------------------------------------------------------------------------------------- | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | | `SKILL.md` | ✓ | Skill metadata + entry-point pointer to `getting-started.md`. The host's first read. | -| `getting-started.md` | ✓ | Three working loops (existing CLI / new Rust / other language); canonical `anc check` invocations. | +| `getting-started.md` | ✓ | Three working loops (existing CLI / new Rust / other language); canonical `anc audit` invocations. | | `bin/check-update` | ✓ | Consumer-side update-check script. Compares local `VERSION` to GitHub `main`; emits `UPGRADE_AVAILABLE` for the SKILL.md preamble. | | `spec/` | ✓ | Vendored copy of `agentnative-spec`. Canonical principle text + machine-readable `requirements[]`. | | `references/` | ✓ | Implementation guidance: framework idioms (Rust + others), project structure, Rust/clap patterns. | @@ -81,7 +81,7 @@ table. - Edit anything under `spec/` by hand. It is vendored from `agentnative-spec`. Any required change is a PR against the spec repo, then a `scripts/sync-spec.sh` bump here. - Reimplement `anc`. The skill does not contain shell-script duplicates of `anc`'s checks. If you find yourself writing - `rg`-based grep checks, you're rebuilding what `anc` already does — use `anc check --output json` instead. + `rg`-based grep checks, you're rebuilding what `anc` already does — use `anc audit --output json` instead. - Commit anything under `docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, or `docs/reviews/` directly to a `release/*` branch — those paths are filtered by the cherry-pick pattern. Add to `dev` instead. - Modify `SKILL.md`'s `name` or `description` frontmatter without coordinating with consumers — those fields drive skill diff --git a/CHANGELOG.md b/CHANGELOG.md index fe87cfe..a87ec0a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,7 +18,7 @@ All notable changes to this project will be documented in this file. - `.github/ISSUE_TEMPLATE/principle_proposal.md` — substantive standards-change template. - `**Renamed:**` subsection in `.github/pull_request_template.md` (sync of the canonical update at `~/dotfiles/stow/github/dot-config/github/pull_request_template.md`). Sister sync PRs landing in agentnative-cli (#30 there) and agentnative-site (already on dev as commit 4437435). - Add vendored `bundle/spec/` tree (agentnative-spec @ v0.2.0): `VERSION`, `CHANGELOG.md`, `README.md`, and seven `principles/p*.md` files with machine-readable `requirements[]` frontmatter — canonical principle text the skill now points at instead of paraphrasing. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) -- Add `bundle/getting-started.md` covering three working agent loops (existing CLI / new Rust / other language), canonical `anc check --output json` invocations, and a "where things live" map. +- Add `bundle/getting-started.md` covering three working agent loops (existing CLI / new Rust / other language), canonical `anc audit --output json` invocations, and a "where things live" map. - Add `scripts/sync-spec.sh` so the bundle can re-vendor agentnative-spec on demand. - `LICENSE-APACHE` — Apache 2.0 boilerplate, identical to the file in `agentnative-cli`. by @brettdavies in [#6](https://github.com/brettdavies/agentnative-skill/pull/6) - `bundle/bin/check-update` — script that compares the consumer's local `VERSION` against the producer repo's `main` and emits `UPGRADE_AVAILABLE ` (or empty when up-to-date / snoozed / disabled). Adapts the gstack pattern with cache TTL (60min UP_TO_DATE / 720min UPGRADE_AVAILABLE) and a 3-level snooze (24h / 48h / 7d). State directory: `$HOME/.cache/agent-native-cli/`. by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) @@ -61,7 +61,7 @@ All notable changes to this project will be documented in this file. ### Removed -- Remove `bundle/scripts/check-compliance.sh` and 24 `bundle/scripts/checks/check-*.sh` files (plus `_helpers.sh`). `anc check --output json` is the canonical replacement. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) +- Remove `bundle/scripts/check-compliance.sh` and 24 `bundle/scripts/checks/check-*.sh` files (plus `_helpers.sh`). `anc audit --output json` is the canonical replacement. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) - Remove `bundle/references/principles-deep-dive.md` (419-line hand-typed paraphrase of the spec; canonical text now lives at `bundle/spec/principles/`). - Remove `bundle/checklists/new-tool.md` (pre-anc manual checklist; replaced by `bundle/getting-started.md`). - All SHA-pin claims from public-facing markdown (`RELEASES.md`, `AGENTS.md`, `README.md`, `spec/README.md`, `CONTRIBUTING.md`): pipeline diagram's "site re-pins to commit SHA" step, the post-merge "site re-pins via its own PR" step, the `protect-tags.json` / `install endpoints` claims that tags are pinned to install endpoints, and the spec-vendor "pinned ref" / "pinned `SPEC_REF`" / "current pin is recorded" vocabulary across all docs. by @brettdavies in [#11](https://github.com/brettdavies/agentnative-skill/pull/11) @@ -78,7 +78,7 @@ All notable changes to this project will be documented in this file. - `checklists/new-tool.md` — task checklist for starting a new agent-native CLI. - `references/` — five deep-dive references: principle specifications, framework idioms (Rust/clap and other languages), project structure, Rust/clap patterns. -- `scripts/check-compliance.sh` — automated compliance checker that produces deterministic pass/warn/fail scorecards +- `scripts/check-compliance.sh` — automated compliance auditor that produces deterministic pass/warn/fail scorecards across 24 checks in 9 groups. - `scripts/checks/` — individual check scripts plus shared `_helpers.sh`. - `templates/` — starter files: `AGENTS.md`, `clap-main.rs`, `error-types.rs`, `output-format.rs`. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index e491424..46b286b 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -4,7 +4,7 @@ Thanks for your interest. This repo is the agent-facing skill that pairs with th - [`agentnative`](https://github.com/brettdavies/agentnative) (the spec): canonical principle text. Vendored here at `spec/`. -- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`): the canonical compliance checker. +- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`): the canonical compliance auditor. - [`agentnative-site`](https://github.com/brettdavies/agentnative-site) (`anc.dev`): the public site, leaderboard renderer, live-scoring loop, and skill-distribution endpoint. @@ -27,7 +27,7 @@ structure or host-runtime support. For principle-level discussion (the spec's MUST/SHOULD/MAY tiers, including P8 on discoverability), file a `pressure-test` issue in the [spec repo](https://github.com/brettdavies/agentnative/issues/new?template=pressure-test.yml). For scoring engine or -`anc check` work, file in the [CLI repo](https://github.com/brettdavies/agentnative-cli). Those discussions don't belong +`anc audit` work, file in the [CLI repo](https://github.com/brettdavies/agentnative-cli). Those discussions don't belong here. **Response expectations:** Tier 1 and Tier 2 are welcome and get a substantive reply when time allows. Tier 3 PRs are @@ -38,7 +38,7 @@ reviewed when scope and time permit. Real PRs land; no merge-window promise. | You want to… | Where | | --------------------------------------------------------------- | ----------------------------------------------------------------------------------- | | Propose a new principle, change MUST/SHOULD/MAY tiers, etc. | [`brettdavies/agentnative`](https://github.com/brettdavies/agentnative) (the spec) | -| Report an `anc check` bug, or propose a new checker feature | [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli) | +| Report an `anc audit` bug, or propose a new auditor feature | [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli) | | Report a site bug (rendering, deployment, anc.dev surface) | [`brettdavies/agentnative-site`](https://github.com/brettdavies/agentnative-site) | | Improve a starter template, add a language idiom, fix the guide | This repo. Issue + PR. Templates: **Bug report** or **Bundle proposal**. | | Bump the vendored spec to a newer tag | This repo. Run `scripts/sync-spec.sh` and PR the diff. | diff --git a/PRODUCT.md b/PRODUCT.md index f45dbe2..ae6c94f 100644 --- a/PRODUCT.md +++ b/PRODUCT.md @@ -36,15 +36,15 @@ the skill bundle teaches the workflow that connects the two. ## Register -- **Instructional voice.** Second-person imperative is the default. "Run `anc check . --output json` to score the repo." - Not "the skill recommends running `anc check`." +- **Instructional voice.** Second-person imperative is the default. "Run `anc audit . --output json` to score the repo." + Not "the skill recommends running `anc audit`." - **Every action is a runnable command.** Code blocks contain commands the reader can paste. Pseudo-commands and prose-only "how to think about it" are out. - **Reference-shaped tables.** Where the artifact is a list of options, files, or trade-offs, render it as a table with stable column headers. The three-artifact table at the top of `SKILL.md` and the useful-flags paragraph in `getting-started.md` are the patterns. - **Triggers and keywords are first-class.** The skill's `description` frontmatter determines which agents discover the - skill. Keep it dense in the canonical vocabulary (`agentic CLI`, `agent-native`, `anc check`, `audit-profile`, …). The + skill. Keep it dense in the canonical vocabulary (`agentic CLI`, `agent-native`, `anc audit`, `audit-profile`, …). The opposite of marketing prose; closer to a search-index line. - **Cross-link to the spec, the linter, and the templates.** A skill that paraphrases what the spec says drifts. Link instead, and quote the requirement ID, not the prose. @@ -56,7 +56,7 @@ These extend the universal bans in `BRAND.md`: - **No "in this guide we will…" preamble.** The agent skips it. Open with the action. - **No narrative scaffolding.** "First, we'll cover X. Then, Y. Finally, Z." Out. The TOC and section headings carry the structure; prose that recapitulates them is noise. -- **No "consider doing X" hedges.** State the action. "Run `anc check`." Not "you might want to consider running `anc +- **No "consider doing X" hedges.** State the action. "Run `anc audit`." Not "you might want to consider running `anc check`." - **No paraphrased spec content.** When the contract matters, link to `spec/principles/p-*.md` and quote the requirement ID (`p1-must-no-interactive`), not the prose. Spec drift is silent and expensive when paraphrased. diff --git a/README.md b/README.md index 1255e63..619c9f2 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ This skill is the fourth artifact in a four-repo ecosystem: | Repo | Role | | --------------------------------------------------------------------------- | -------------------------------------------------------------- | | [`agentnative`](https://github.com/brettdavies/agentnative) (the spec) | Canonical text of the eight principles. CC BY 4.0. | -| [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`) | The compliance checker. MIT / Apache-2.0. | +| [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) (`anc`) | The compliance auditor. MIT / Apache-2.0. | | [`agentnative-site`](https://github.com/brettdavies/agentnative-site) | `anc.dev`: leaderboard, scorecards, live scoring. | | **This repo** (`agentnative-skill`) | The agent-facing guide. Vendors the spec; teaches `anc` usage. | @@ -61,7 +61,7 @@ prompted by `bin/check-update`. - [`SKILL.md`](./SKILL.md): skill metadata + entry-point pointer. - [`getting-started.md`](./getting-started.md): three working loops (existing CLI / new Rust / other language); - canonical `anc check` invocations; "where things live" map. + canonical `anc audit` invocations; "where things live" map. - [`bin/check-update`](./bin/check-update): periodic version check. Compares local `VERSION` to GitHub `main`, emits `UPGRADE_AVAILABLE` so the agent can offer to `git pull`. - [`spec/`](./spec/): vendored canonical principle text from @@ -91,7 +91,7 @@ Issues and PRs welcome. See [`CONTRIBUTING.md`](./CONTRIBUTING.md). Routing: substantive principle changes happen there first. - **`anc` bugs or feature requests** → file in [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli). The skill teaches `anc` usage but - doesn't implement the checker. + doesn't implement the auditor. - **Skill issues** (templates, references, getting-started, layout) → file here. Branch + release model documented in [`RELEASES.md`](./RELEASES.md). diff --git a/SKILL.md b/SKILL.md index 66f26cc..9c7b0dc 100644 --- a/SKILL.md +++ b/SKILL.md @@ -2,14 +2,14 @@ name: agent-native-cli description: >- Guide to designing, building, and auditing CLI tools for use by AI agents. Pairs with - [`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical compliance checker) and + [`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical compliance auditor) and [`agentnative-spec`](https://github.com/brettdavies/agentnative) (the canonical principle text, vendored at `spec/`). Provides starter templates, language-specific implementation idioms (Rust/clap, Python Click & argparse, Go Cobra, JS Commander/yargs/oclif, Ruby Thor), and a getting-started guide that points agents at - `anc check --output json` and `anc skill install `. Use when designing a new CLI tool, building a Rust/clap + `anc audit --output json` and `anc skill install `. Use when designing a new CLI tool, building a Rust/clap binary intended for agents, reviewing one for agent-readiness, claiming the agent-native badge, or remediating findings from `anc`. Triggers on agentic CLI, agent-native, CLI design, CLI standard, agent-first, CLI for agents, - agent-friendly CLI, CLI compliance, agent-readiness, anc, anc check, anc skill install, agent-native badge, + agent-friendly CLI, CLI compliance, agent-readiness, anc, anc audit, anc skill install, agent-native badge, scorecard, audit-profile, Rust CLI, clap derive. SKIP when the user is building a TUI app meant for humans (use `--audit-profile human-tui` rather than this skill), writing a non-CLI library, or asking unrelated Rust questions not specifically about agent-readiness of a CLI. @@ -21,11 +21,11 @@ The standard for CLI tools designed to be operated by AI agents. Three artifacts | Artifact | Role | | ---------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the seven principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/) — snapshot refreshed each release. | -| [`anc`](https://github.com/brettdavies/agentnative-cli) | The compliance checker. Reads target source/binary, emits a JSON scorecard whose entries cite spec `requirement_id`s. The runtime authority. | +| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the eight principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/) — snapshot refreshed each release. | +| [`anc`](https://github.com/brettdavies/agentnative-cli) | The compliance auditor. Reads target source/binary, emits a JSON scorecard whose entries cite spec `requirement_id`s. The runtime authority. | | **This skill** (`agent-native-cli`) | The agent-facing guide. Tells the agent how to invoke `anc`, how to navigate the spec when remediating findings, and where the implementation patterns and starter templates live. | -The skill does **not** implement principles checking. `anc` does. The skill teaches agents to use `anc` and supplies the +The skill does **not** implement principles auditing. `anc` does. The skill teaches agents to use `anc` and supplies the surrounding context (spec, idioms, templates) that `anc`'s findings reference. ## First action: update check @@ -43,12 +43,12 @@ Never). Full prompt text, snooze ladder, and state-file layout are in ## Start here → **[`getting-started.md`](./getting-started.md)** — the three working loops (existing CLI / new Rust CLI / other -language), the canonical `anc check` invocations, the `anc skill install ` installer, and a "where things live" +language), the canonical `anc audit` invocations, the `anc skill install ` installer, and a "where things live" map. -## The seven principles +## The eight principles -Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.3.0`; see +Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.4.0`; see [`spec/README.md`](./spec/README.md) for resync instructions). One file per principle, each with machine-readable `requirements[]` frontmatter: @@ -61,15 +61,16 @@ Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative- | P5 | [`p5-safe-retries-mutation-boundaries.md`](./spec/principles/p5-safe-retries-mutation-boundaries.md) | Safe Retries, Mutation Boundaries | | P6 | [`p6-composable-predictable-command-structure.md`](./spec/principles/p6-composable-predictable-command-structure.md) | Composable, Predictable Structure | | P7 | [`p7-bounded-high-signal-responses.md`](./spec/principles/p7-bounded-high-signal-responses.md) | Bounded, High-Signal Responses | +| P8 | [`p8-discoverable-skill-bundle.md`](./spec/principles/p8-discoverable-skill-bundle.md) | Discoverable Skill Bundles | Do not paraphrase the principles inside this skill — read the spec files directly. They are the source of truth. -## The anc loop: check → fix → re-check → claim badge +## The anc loop: audit → fix → re-audit → claim badge Once `anc` is installed (one-line install in [`getting-started.md`](./getting-started.md)), the work is a four-step loop: -**1. Check.** `anc check --output json . > scorecard.json`. The JSON envelope is schema `0.5` and contains: +**1. Audit.** `anc audit --output json . > scorecard.json`. The JSON envelope is schema `0.5` and contains: - `summary` — `total / pass / warn / fail / skip / error` count. - `coverage_summary` — `must / should / may`, each with `total` + `verified`. `must.verified == must.total` is the bar @@ -77,7 +78,7 @@ loop: - `badge.eligible` (bool), `badge.score_pct` (int), `badge.embed_markdown` (string or `null`), `badge.scorecard_url`, `badge.badge_url`, `badge.convention_url`. **80%** is the eligibility floor; below it, `embed_markdown` is `null` and the convention says do not advertise a badge. -- `results[]` — per-check entries citing `requirement_id`, `status`, and `evidence`. +- `results[]` — per-audit entries citing `requirement_id`, `status`, and `evidence`. - `audit_profile` — the exemption category in effect (or `null`). - `tool / anc / run / target` metadata — identifies the scored tool, the `anc` build, the invocation, and the resolved target. @@ -86,10 +87,10 @@ loop: `spec/principles/p-*.md`'s `requirements[]` frontmatter. Apply the fix using the implementation references below. Re-run with `--principle ` to focus on one principle while iterating. -**3. Re-check.** Re-run `anc check --output json .` until `summary.fail == 0` and `coverage_summary.must.verified == -coverage_summary.must.total`. Use `--audit-profile ` to suppress checks that don't apply to the tool class — +**3. Re-audit.** Re-run `anc audit --output json .` until `summary.fail == 0` and `coverage_summary.must.verified == +coverage_summary.must.total`. Use `--audit-profile ` to suppress audits that don't apply to the tool class — `human-tui` (TUIs that legitimately intercept the TTY), `file-traversal` (reserved), `posix-utility` (cat / sed / awk -style), `diagnostic-only` (read-only tools). Suppressed checks emit `Skip` with structured evidence so readers see what +style), `diagnostic-only` (read-only tools). Suppressed audits emit `Skip` with structured evidence so readers see what was excluded. **4. Claim the badge.** Once `badge.eligible == true` (≥80%), copy `badge.embed_markdown` into the project's README. The @@ -98,7 +99,7 @@ badge-related is printed (the convention's "do not nag" rule). ## Implementation guidance (when fixing findings) -Once `anc check` reports a failure, the agent has the cited `requirement_id` and the spec text. The next question is +Once `anc audit` reports a failure, the agent has the cited `requirement_id` and the spec text. The next question is "how do I write code that satisfies this requirement?" — answered by: | Need | File | @@ -119,7 +120,7 @@ Drop-in starting points for greenfield Rust CLIs. Each encodes the relevant prin | [`templates/output-format.rs`](./templates/output-format.rs) | `OutputConfig`, `OutputFormat`, `diag!`, NO_COLOR / IsTerminal | | [`templates/agents-md-template.md`](./templates/agents-md-template.md) | Project-level AGENTS.md scaffold | -## Compliance checking +## Compliance auditing Use `anc`. Install once via Homebrew, cargo, or self-install: @@ -135,12 +136,12 @@ anc skill install claude_code # also: codex, cursor, factory, kiro, anc skill install --dry-run claude_code # print resolved git command without spawning ``` -Recommended `anc check` invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not -write shell scripts to grep for principle violations — `anc` already implements (and supersedes) every check that +Recommended `anc audit` invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not +write shell scripts to grep for principle violations — `anc` already implements (and supersedes) every audit that approach could produce. ## Sources - [`agentnative-spec`](https://github.com/brettdavies/agentnative) — canonical principle text (CC BY 4.0) -- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) — `anc`, the canonical checker (MIT / Apache-2.0) +- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) — `anc`, the canonical auditor (MIT / Apache-2.0) - [`agentnative-skill`](https://github.com/brettdavies/agentnative-skill) — this repo (MIT) diff --git a/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md b/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md new file mode 100644 index 0000000..76c290f --- /dev/null +++ b/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md @@ -0,0 +1,447 @@ +--- +status: draft +created: 2026-05-01 +owner: brett +target_repo: agentnative-skill +related: docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md +--- + +# `agent-native-cli` skill — determinism & runbook hardening + +Closes the gaps surfaced by the 2026-05-01 SKILL.md self-review against `create-agent-skills`. Two-agent variance is the +headline problem: two competent agents reading SKILL.md today can produce two different correct-looking-but-wrong +handlers for the same `anc audit` output. This document scopes the skill-side fixes that close that variance. Cross-repo +`anc` feature requests are tracked separately in +[`2026-05-01-002-anc-determinism-feature-asks-requirements.md`](./2026-05-01-002-anc-determinism-feature-asks-requirements.md). + +## Problem + +Today's `SKILL.md`: + +- Describes the `anc audit` JSON envelope in prose only (no sample, no shape). +- Names the `results[].status` enum once in passing without enumerating values. +- Does not document `anc audit`'s exit-code contract. +- Mentions "schema 0.5" once with no pinning guidance and no behaviour-on-bump. +- Names four `--audit-profile` categories without a selection rule. +- Stops the iteration loop on `summary.fail == 0` + `must.verified == must.total`, with no termination rule for `warn`s + once the badge floor is cleared. +- Gives no fallback for spec-version skew between bundle (`spec/VERSION` = 0.3.0) and `anc` releases. +- Provides no first-action precondition for "is `anc` even installed?". +- Leaves `badge.embed_markdown` placement, scorecard diffability, `--binary`/`--source` scoping, hybrid-language + audit-profile selection, `json` vs `text` choice, and scorecard commit policy unresolved. + +The mechanical rubric items pass (frontmatter, line count, headings, references one level deep). The gaps are content, +not structure. + +## Goals + +1. Make `anc`'s observable contract explicit in the skill so agents stop inferring it from prose. +2. Eliminate the highest-frequency situational dead-ends that don't have answers in `SKILL.md` or `getting-started.md`. +3. Stay under the 500-line SKILL.md ceiling. New material that doesn't pull weight in the entry-point file lives in + `references/`. +4. Tighten the SKILL.md / `getting-started.md` / `references/` boundary so each file owns one job and there is no + cross-file content duplication. +5. Broaden starter coverage so non-Rust paths and JSON-envelope authoring have first-class scaffolding. + +## Non-goals + +- Rewriting prose for Haiku-tier readability (separate brainstorm if pursued). +- Spec-text changes in the `agentnative` repo. +- Anything that requires `anc` to ship a new flag or output field — those are tracked in the sibling brief. +- Renaming the skill itself (`agent-native-cli`) — discoverability cost outweighs gerund-form rubric preference; see R10 + for which frontmatter changes ARE in scope. + +## Requirements + +### R1. Worked scorecard samples — three placements, three roles + +**What.** Three coordinated samples, each sized for its location: + +1. **`SKILL.md` inline** — minimal snippet (~12–15 lines) right after the "1. Check." paragraph. Shows the envelope's + top-level shape (`summary`, `coverage_summary`, `badge`, `results[]`). Always-loaded; pays for ~150 tokens. +2. **`getting-started.md`** — even shorter snippet (3–5 lines) right after the `anc audit --output json . > + scorecard.json` recipe in the "existing CLI" loop. Shows just `coverage_summary` + `badge` so an agent reading the + recipe sees invocation→output side by side without scrolling. Pattern-match level only; not a parser spec. +3. **`references/scorecard-shape.md`** — exhaustive sample (~50–60 lines) covering every top-level field, one + `results[]` entry per status value, every `audit_profile` category, full `tool` / `anc` / `run` / `target` metadata. + Loaded only when an agent writes a parser. + +**Must include (exhaustive sample):** one entry per top-level field — `summary`, `coverage_summary`, `badge`, one +`results[]` entry per status value, `audit_profile`, `tool`, `anc`, `run`, `target`. Real field names taken from a live +`anc audit --output json` run. + +**Must show (exhaustive sample):** `summary.fail` (int), `coverage_summary.must.verified` (int), `badge.eligible` +(bool), `badge.embed_markdown` (string|null), `results[].requirement_id` (kebab-case string), `results[].status` (one of +pass / warn / fail / skip / error), `results[].evidence` (object). + +**Acceptance.** + +- An agent that reads only the exhaustive sample can write a parser handling all five status values and all four + audit-profile categories without reading `anc` source. +- An agent skimming `getting-started.md` sees an output snippet without leaving the file. +- An agent skimming `SKILL.md` sees the envelope shape without leaving the file. + +### R2. `## Anc contract` section in SKILL.md + +**What.** A new short section between "First action: update check" and "Start here", or as a sub-section of "The anc +loop". Documents the observable contract. + +**Must include:** + +- **Output flag policy.** Agents always pass `--output json`; `text` is for humans. +- **Exit codes.** Table — what `0`, non-zero values, and "couldn't run" return. Pulled from `anc`'s actual behaviour; if + `anc` doesn't have a stable exit-code policy, this R is blocked behind an anc-side change in the sibling brief. +- **`results[].status` enum.** Explicit list: `pass`, `warn`, `fail`, `skip`, `error`. +- **Schema-version pin.** Tell agents to assert `.` matches a pinned value before parsing; + spec out which path that is once anc surfaces it. If the envelope doesn't currently carry a version, the assertion + guidance falls back to "pin `anc` binary version" and the durable fix moves to the sibling brief. +- **Stable vs noise.** Tell agents which subtree to diff for CI gating (`summary.*`, `coverage_summary.*`, + `results[*].{requirement_id, status}`) and which is timestamp-noise (`run.*`). + +**Acceptance.** An agent writing a CI gate that fails on regression can do so without guessing. + +### R3. `--audit-profile` decision table + +**What.** A 4-row table in SKILL.md (or `references/audit-profile-selection.md` if it needs more than 4 rows). One row +per profile — `human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` (reserved). Columns: when-to-pick +(one-line rule), example tool, what gets suppressed. + +**Must include:** a "hybrid project" rule — what to do when a binary mixes a TUI-rendering mode with a stdin-piped batch +mode, or a mostly-Rust binary with shell-helper subcommands. Either profiles compose, or the rule is "scope to the +primary entry-point and leave secondary surfaces uncovered." + +**Acceptance.** Three different agents reading the table for the same tool pick the same profile. + +### R4. Loop termination rule for warn / should + +**What.** One paragraph in "The anc loop". States explicitly: + +- `must` violations gate the loop. Continue until `must.verified == must.total`. +- `warn`s are advisory. Once `badge.eligible == true` (≥80%), stop iterating. Do not spend agent time pushing `warn`s to + `pass` unless the user explicitly asks. +- `error` (the run-failure status, distinct from `fail`) means re-run — see troubleshooting. + +**Acceptance.** An agent in "fix mode" stops when the badge clears, not when every `warn` clears. + +### R5. `anc` install precondition + +**What.** Replace or augment the current "First action: update check" so the very first action a session takes is +verifying `anc` is on `PATH`. Pseudocode: + +```text +if ! command -v anc; then + prompt user with install options (brew / cargo) and exit +fi +run bin/check-update for the skill bundle +``` + +The current update-check is for the skill bundle, not `anc`; the skill should not silently assume `anc` exists. + +**Acceptance.** Cold-start sessions never reach `anc audit` without first verifying or installing the binary. + +### R6. Spec-skew fallback + +**What.** Add a single paragraph or table row to "The anc loop" step 2 ("Fix.") covering: what to do when a finding +cites a `requirement_id` that does NOT exist in the bundle's vendored `spec/principles/`. + +**Must include:** + +- The likely cause (anc shipped against newer spec than this bundle vendors). +- The durable fix (`anc audit --explain `, **once shipped** — see sibling brief). +- The interim fix (resync via `scripts/sync-spec.sh`, or fetch `spec/principles/p-*.md` from `agentnative` `main`). +- The non-fix (do not hallucinate a definition). + +**Acceptance.** An agent hitting the skew case has a deterministic next step that isn't "read source". + +### R7. Situational runbook (the dead-ends in C) + +**What.** A new `## Common situations` section near the bottom of SKILL.md or a new `references/runbook.md` linked from +there. One paragraph per situation. Cover: + +- `badge.embed_markdown` placement — top of README, replacing existing badges, ordering relative to CI badges. Adopt + whatever convention `anc.dev/badge` recommends; if it doesn't recommend one, propose one here and push it upstream. +- Should I commit `scorecard.json`? — default no (artifact, regenerable). Override only for CI gating snapshots. +- `anc audit .` vs `--binary` vs `--source` — when each applies. +- `anc` panics or returns malformed JSON — pointer to issue tracker, do not retry blind. +- `anc skill install` host not in registry — fallback to manual `git clone` (already in `getting-started.md`; cross-link + from here). + +Each entry: ≤4 lines. The goal is a fast index, not deep prose. + +**Acceptance.** Agents stop generating issues that ask FAQs already covered. + +### R8. Frontmatter — `allowed-tools` + +**What.** Add `allowed-tools: Bash(anc *), Read` to SKILL.md frontmatter so the skill doesn't trigger permission prompts +for the canonical commands. + +**Acceptance.** Mechanical — no user prompt on `anc audit` invocations once the skill loads. + +### R9. SKILL.md / getting-started.md / references/ boundary cleanup + +**What.** Eliminate cross-file duplication, give each file one job, and reorder SKILL.md so the loop the skill teaches +is the first thing an agent reads — not a session-once meta-update side task. + +**File ownership:** + +- **SKILL.md owns:** discovery (frontmatter), preamble (what the skill is + where the three artifacts live), Quick Start + (one runnable `anc audit` command), the `anc` contract (R2), the four-step loop overview, the principles index, + pointer to `getting-started.md` for procedural detail, pointer to runbook (R7), sources. +- **`getting-started.md` owns:** the three working loops (existing CLI / new Rust / other language), full install steps + for `anc` and the skill bundle, the "where things live" pointer table. +- **`references/` owns:** depth — patterns, idioms, project structure, scorecard shape, audit-profile selection, + runbook, update-check operational details. One topic per file, no cross-file overlap. + +**Reorder SKILL.md.** Today the order is: preamble → `## First action: update check` → `## Start here` → principles → +anc loop → implementation guidance → starter code → compliance auditing → sources. The update-check section is a +session-once side task; making it section #2 trains agents to read meta-update flow before they read the work flow. + +New order: + +1. Preamble (unchanged). +2. **Quick Start** — one runnable `anc audit --output json . > scorecard.json` line + the 12–15-line inline scorecard + sample from R1. +3. **The `anc` contract** (R2) — exit codes, status enum, schema-version pin, stable-vs-noise. +4. **The anc loop** — four steps, conceptual (R4 termination rule folded in). +5. **The seven principles** index (unchanged). +6. **Pointer block** — links to `getting-started.md` (install, three loops), `references/` (depth), `templates/` + (starters), `references/runbook.md` (R7). +7. **Update-check** — demoted to a one-paragraph footnote near the bottom referencing `references/update-check.md`. The + session-once `bash bin/check-update` invocation moves to that reference file (or to `bin/README.md` if that fits the + repo convention better). +8. Sources. + +This reorder also resolves the "first runnable thing in SKILL.md is `bash bin/check-update`" issue surfaced in external +review: after the reorder, the first runnable thing is `anc audit`. + +**Must remove:** + +- The duplicated install block in `SKILL.md` § "Compliance auditing" (lives in `getting-started.md` already). +- The duplicated four-step loop where `SKILL.md` § "The anc loop" and `getting-started.md` "existing CLI" section both + walk it. Canonical homes: SKILL.md for the conceptual loop; `getting-started.md` for the runnable bash. +- The two parallel index tables — SKILL.md "Implementation guidance" pointer table and `getting-started.md` "Where + things live". Merge into one, owned by `getting-started.md`. SKILL.md retains a brief pointer paragraph. + +**Must add:** + +- SKILL.md `## Quick Start` section at position #2 above (after the preamble, before everything else). +- Inline scorecard sample from R1 lands inside Quick Start. + +**Acceptance.** + +- A grep for any of `brew install brettdavies/tap/agentnative`, `cargo install agentnative`, or `anc skill install + claude_code` returns hits in exactly one file. +- The four-step loop appears once in SKILL.md (conceptual) and the bash recipe form appears once in + `getting-started.md`. +- A first-time agent reading SKILL.md sees `anc audit` as the first runnable command, not `bash bin/check-update`. +- The update-check section is no longer the second top-level heading. +- SKILL.md stays under 200 lines after the cleanup. + +### R10. Frontmatter audit beyond `allowed-tools` + +**What.** A small structured pass over SKILL.md frontmatter, separate from R8. + +**In-scope changes:** + +- Confirm `description` lead clause answers "what is this" before "what does it pair with". Today: `Guide to designing, + building, and auditing CLI tools for use by AI agents.` — passes the test, leave alone. +- Audit the `Triggers on` keyword list in the description against the rubric's "include trigger keywords" rule. Drop any + keyword that hasn't fired discovery in practice; add ones that have come up in skill-side questions but aren't there. + Stay under 1024 chars. +- Audit the `SKIP when` clause for collisions with adjacent skills (e.g. anything that overlaps `compound-engineering` + or `create-agent-skills` should be sharper). +- **No** `argument-hint` — this is a background-knowledge skill, not a slash command. +- **No** `model:` pin — let the host decide. +- **No** `disable-model-invocation` — auto-load is correct here. +- **No** rename of `name`. The discoverability cost (existing references in `getting-started.md`, `AGENTS.md`, + templates, and external badges) outweighs the rubric's gerund-form preference. Document this decision in the doc + itself. + +**Acceptance.** Frontmatter passes a fresh `create-agent-skills` audit at the "well-tuned" tier, not just "valid", and +the kept-as-is decisions have inline rationale (in this doc, not in the SKILL.md frontmatter). + +### R11. New starter templates and framework idioms + +**What.** Broaden `templates/` and `references/framework-idioms-other-languages.md` so non-Rust paths have first-class +scaffolding instead of "read the principles and figure it out." + +**Templates to add (priority order):** + +1. **`templates/cargo-toml.md`** — drop-in `[dependencies]` block for the Rust starter. Today an agent copying + `clap-main.rs` has to know to add `clap` (with `derive` + `env`), `serde`, `serde_json`, `thiserror`, `libc` (for + SIGPIPE), `clap_complete`. AGENTS.md template documents this in prose; a copy-paste TOML block is faster. +2. **`templates/cli-tests.rs`** — `assert_cmd` patterns covering P5 mutation boundaries (idempotency, dry-run, + `--yes`/`--force` distinction). Encodes P5 by construction. +3. **`templates/scorecard-envelope.schema.json`** — JSON Schema for the canonical structured-output envelope authors + should emit when implementing P2. Compounds with R1: agents write code that conforms, and `anc` (or any other linter) + can validate against the schema directly. +4. **`templates/python-click/`** — minimal Python/Click starter (one main file + `pyproject.toml` snippet) encoding P1 + SIGPIPE handling, P2 stdout/stderr separation, P4 exit codes. Mirror of `clap-main.rs` for the Click world. +5. **`templates/go-cobra/`** — same idea, Go/Cobra. Mirrors `clap-main.rs`. +6. **`templates/agents-md-template.python.md`** and **`.go.md`** — language-flavoured AGENTS.md scaffolds, since the + current AGENTS.md template embeds Rust-specific conventions. + +**Framework idioms to add:** + +- A "JSON envelope" sub-section in `references/framework-idioms-other-languages.md` showing the canonical envelope shape + in Python/Go/JS/Ruby. Pairs with `templates/scorecard-envelope.schema.json`. +- A "testing P5 mutation boundaries" section, parallel to the Rust `assert_cmd` template, with idiomatic patterns in + `pytest`, Go `testing`, `vitest`, and `rspec`. + +**Out of scope for R11 (file kept, deferred to future work):** + +- Adding new principles or new audit profiles (those originate in `agentnative-spec`, not here). +- Adding starters for languages not already covered in `framework-idioms-other-languages.md` (Rust, Python, Go, JS, + Ruby). Adding e.g. a Swift starter is its own brainstorm. + +**Acceptance.** + +- A new Rust CLI can be bootstrapped with three `cp` commands and a `cargo add` that's just copy-paste from the new + Cargo TOML template. +- A new Python CLI has a starter that encodes P1/P2/P4 by construction. +- An author implementing P2 can validate their JSON output against the schema template without writing one from scratch. + +### R12. Consolidate the two Rust-idiom reference files + +**What.** Merge `references/framework-idioms.md` (137 lines, "Free from clap / Must implement / Anti-patterns" framing, +Rust-only) and `references/rust-clap-patterns.md` (209 lines, denser per-principle prose) into one canonical Rust +reference — proposed name `references/rust-clap.md`. Cross-checked against both files: P1, P3, P5, P7 are heavily +overlapping; P2, P4, P6 each have one file-unique nugget worth preserving. + +**File-unique content to preserve (must not be lost in the merge):** + +- From `framework-idioms.md`: the **Free / Must / Anti-patterns** three-bucket organizational framing — keep this as the + canonical structure. Also the closing pointer table to `framework-idioms-other-languages.md`. +- From `rust-clap-patterns.md`: the **subcommand-vs-flag taxonomy** in P6 (subcommands for operations, nested + subcommands for namespaces, global flags for cross-cutting modifiers, local flags for command-specific modifiers, both + flag and subcommand for meta-commands like `--help` / `--version`). This is real content, not duplication. Also: the + **`kind()` method + main-only `process::exit()`** rule in P4. Also: the **Jsonl variant** of the output format enum + (Text/Json/Jsonl) in P2. Also: the explicit references to xurl-rs / bird as exemplar codebases — keep these as + concrete-example anchors. + +**Merge strategy:** + +1. Adopt `framework-idioms.md`'s Free / Must / Anti-patterns scaffold per principle. +2. For each principle's Must-implement bucket, fold in the deeper detail from `rust-clap-patterns.md` — patterns, crate + names (FalseyValueParser, clap_complete), exit codes (sysexits 77/78), exemplar refs. +3. P2 Must-implement: explicitly enumerate Text/Json/Jsonl as the three OutputFormat variants. +4. P4 Must-implement: include both `exit_code()` and `kind()` methods, plus the "restrict `process::exit()` to `main()`" + rule. +5. P6 gets a new sub-section before the Free/Must/Anti-patterns triplet: **Flags vs subcommands taxonomy** — five + bullets covering the subcommand-vs-flag rules from `rust-clap-patterns.md`. +6. Delete `references/framework-idioms.md` and `references/rust-clap-patterns.md` once `rust-clap.md` is reviewed. +7. Update SKILL.md's "Implementation guidance" pointer table (or its consolidated R9 equivalent in `getting-started.md`) + to reference the single new file. + +**Estimated size:** ~250 lines for the merged file, replacing 346 lines of overlap. Net repo savings ~100 lines, plus +the cognitive savings of "one file per language family, organized identically." + +**Out of scope:** + +- Re-organizing `framework-idioms-other-languages.md`. That file is multi-framework by design and has a different axis + (one section per framework, not one file per language). +- Splitting the consolidated file by clap version. If clap 5 ships with breaking changes, that's a future concern. + +**Acceptance.** + +- One canonical Rust idioms reference at `references/rust-clap.md`. +- Both `framework-idioms.md` and `rust-clap-patterns.md` are deleted from the repo. +- All file-unique content listed above appears in the merged file. +- SKILL.md and `getting-started.md` reference the new file (no broken links). +- A diff between "what's in the merged file" and "union of the two source files" shows no information loss other than + prose duplication. + +## Implementation surface + +| Requirement | Lands in | Estimated size | +| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- | +| R1 | SKILL.md inline (minimal) + `references/scorecard-shape.md` (exhaustive) | 15 + 60 lines | +| R2 | SKILL.md, new section | 25–40 lines | +| R3 | SKILL.md table + `references/audit-profile-selection.md` (extended) | 10 + 40 lines | +| R4 | SKILL.md, edit existing "anc loop" | 5–10 lines | +| R5 | SKILL.md, edit "First action" | 5–10 lines | +| R6 | SKILL.md, edit "Fix." step | 5–10 lines | +| R7 | `references/runbook.md` (linked from SKILL.md) | 50–80 lines | +| R8 | SKILL.md frontmatter | 1 line | +| R9 | SKILL.md restructure + `getting-started.md` cleanup; merge index tables | net ≈0 lines | +| R10 | SKILL.md frontmatter (description + skip clause audit) | ≤5 lines net | +| R11 | `templates/cargo-toml.md`, `templates/cli-tests.rs`, `templates/scorecard-envelope.schema.json`, `templates/python-click/`, `templates/go-cobra/`, two AGENTS.md variants, sub-sections in `references/framework-idioms-other-languages.md` | ~400 lines new files | +| R12 | New `references/rust-clap.md`; delete `references/framework-idioms.md` + `references/rust-clap-patterns.md` | ~250 new, 346 deleted (net −100) | + +**SKILL.md ceiling check.** R9 deletes duplicated install + loop content (~25 lines saved), R1 + R2 + R3 + R4 + R5 + R6 +add ~70 lines, R7 + R11 + R12 land outside SKILL.md, R8 + R10 are frontmatter. Net SKILL.md delta is roughly +45 lines, +taking it from 146 to ~190 — well under the 200-line target in R9 and the 500-line rubric ceiling. + +**Repo-wide delta.** ~400 lines in new template files, ~150 lines in new reference files (R7 runbook + R3 +audit-profile-selection + R1 scorecard-shape), ~250 lines in the merged R12 file, 346 lines deleted in the R12 merge, +~45 lines net into SKILL.md, ~20 lines net into `getting-started.md` after de-duplication. Net repo: ~520 lines added, +~25 lines deleted in SKILL.md/getting-started.md de-dup, ~346 lines deleted in R12 merge. + +## Dependencies / assumptions + +- **Soft dependency on sibling brief.** R2's schema-pin guidance and R6's `--explain` reference both work better once + `anc` ships matching features. They're not blocked — interim guidance is documentable today, but the docs become + stable when anc lands the features. See sibling brief for which features. +- **`scripts/sync-spec.sh` works.** R6's interim fix relies on the existing sync script. If it's broken or stale, fix + that as a precondition. +- **`anc.dev/badge` convention.** R7's badge-placement guidance defers to the live convention. If the convention page + doesn't yet specify placement, this requirement triggers a content addition there too. + +## Success criteria + +- Two independently-prompted agents asked "parse this scorecard and report failures" produce structurally identical + handlers. +- An agent given a tool to audit and asked to pick `--audit-profile` arrives at the same answer as the human reviewer. +- Cold-start sessions never call `anc audit` against a missing binary. +- An agent finishing a remediation loop stops when `badge.eligible == true`, not when every warn clears. +- A finding citing a `requirement_id` not in the local vendored spec resolves deterministically. +- A first-time agent reading SKILL.md sees a runnable `anc audit` command before scrolling, and can find install + instructions, the three working loops, and reference depth without re-reading SKILL.md (R9). +- A new Python or Go CLI author has a starter that encodes P1/P2/P4 by construction (R11). +- A grep across the repo for any single canonical command (e.g. `anc skill install claude_code`) returns hits in exactly + one canonical location (R9). +- One canonical Rust idioms reference exists (R12). `framework-idioms.md` and `rust-clap-patterns.md` are gone; no links + in SKILL.md or `getting-started.md` are broken. +- The first runnable command an agent encounters in SKILL.md is `anc audit`, not `bash bin/check-update` (R9 reorder). + +## Open questions + +1. **R1** — inline in SKILL.md or in `references/scorecard-shape.md`? Resolved: inline a minimal sample (~15 lines) AND + link `references/scorecard-shape.md` for the exhaustive one. Recorded in the implementation table. +2. **R2** — does `anc` currently have a stable exit-code policy worth documenting, or does this R block on the sibling + brief? Verify against `agentnative-cli` source before drafting R2 prose. +3. **R7** — `## Common situations` in SKILL.md vs `references/runbook.md`? Resolved: `references/runbook.md`. SKILL.md + gets a one-paragraph pointer. +4. **R9 sequencing.** Run R9 first (de-duplicate, restructure) so R1–R7 land in their final homes the first time and + don't have to be moved during R9. Alternative: run R9 last as a cleanup pass once R1–R7 reveal where content actually + wants to live. Recommendation: R9 first; the boundary is clear enough today and re-homing later is more work. +5. **R10** — does the description's `Triggers on` keyword list match the queries skills actually fire on in real + traffic? Need a short audit (sample of recent skill-discovery hits or qmd queries) before pruning/adding. +6. **R11 scope** — ship all six template additions in one PR or split per language? Recommendation: split. Rust + additions (cargo-toml, cli-tests, scorecard-envelope.schema.json) ship together; Python and Go each ship as their own + PR so each can be reviewed by someone fluent in that ecosystem. +7. **R12 sequencing relative to R11.** The R11 Rust starter additions (cargo-toml, cli-tests, scorecard-envelope) and + R12 (consolidate Rust idioms) both touch the Rust documentation surface. Ship R12 first so the merged + `references/rust-clap.md` is the link target for the new starter docs. Recommended. + +## Handoff + +Next step: `/ce-plan` against this document. Implementation breakdown: + +- **PR 1 (skill-side determinism).** R1–R8 plus R9 (do R9 first to avoid re-homing). One commit per requirement for + review legibility. Targets `dev`. Estimated total ~250 lines added/changed across SKILL.md, getting-started.md, and + three new reference files. +- **PR 2 (frontmatter polish).** R10 only. Tiny PR; can land in PR 1 if it stays under ~10 lines. Split if R10's + description audit takes more than that. +- **PR 3 (Rust idioms consolidation).** R12 only. Merge `framework-idioms.md` + `rust-clap-patterns.md` into + `references/rust-clap.md`; delete the originals; update SKILL.md / `getting-started.md` link targets. Land before PR 4 + so the new starter docs reference the merged file. +- **PR 4 (Rust starter completion).** R11 items 1–3 — `cargo-toml.md`, `cli-tests.rs`, `scorecard-envelope.schema.json`. + Self-contained, no skill prose changes. +- **PR 5 (Python starter).** R11 items 4 + 6a — `python-click/` template, `agents-md-template.python.md`, plus the + Python sections in framework-idioms. +- **PR 6 (Go starter).** R11 items 5 + 6b — `go-cobra/` template, `agents-md-template.go.md`, plus the Go sections in + framework-idioms. + +Sibling brief (`2026-05-01-002-anc-determinism-feature-asks-requirements.md`) is filed separately against +`agentnative-cli`. R2 and R6 reference items there as soft dependencies; interim guidance ships in PR 1 regardless. diff --git a/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md b/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md new file mode 100644 index 0000000..02f28db --- /dev/null +++ b/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md @@ -0,0 +1,199 @@ +--- +status: draft +created: 2026-05-01 +owner: brett +target_repo: agentnative-cli +related: docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md +--- + +# `anc` — feature asks for deterministic agent consumption + +Cross-repo brief. Lives in `agentnative-skill` next to its sibling because the asks were surfaced by the skill-side +review on 2026-05-01, but the durable fixes ship in `anc` itself. This document is the source for an issue (or set of +issues) to be filed against [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli). The +skill-side requirements document +([`2026-05-01-001-skill-determinism-requirements.md`](./2026-05-01-001-skill-determinism-requirements.md)) references +several items here as soft dependencies — interim skill-side guidance is shippable today, but stabilises once these +land. + +## Problem + +Agents consuming `anc audit` output today must infer details that should be self-describing in the binary's contract: + +- The JSON envelope's schema version is internal documentation, not a field agents can assert against. A bump from 0.5 + to 0.6 with renamed fields breaks consumers silently. +- A finding cites a `requirement_id` (e.g. `p1-must-no-interactive`) but resolving its text requires the consumer to + read a vendored copy of the spec, which can drift from the version `anc` was built against. +- The `text` mode appends a badge embed hint, but the embed string itself doesn't declare *where* in a README it should + land — every consumer reinvents that. +- `audit_profile` exemptions are documented at four named categories, but nothing encodes how to pick one for a hybrid + tool (TUI rendering + stdin batch mode + a shell wrapper). +- Scorecard envelopes contain timestamp / run-id metadata mixed in with the gating payload, so CI gates that diff the + whole envelope see false churn. +- `anc audit`'s exit-code semantics aren't part of the documented contract, so agents can't `&&`-chain it reliably. + +Each of these is fixable in the skill via prose, but the durable fix is for `anc` to make the contract self-describing. + +## Goals + +1. Make `anc`'s observable contract robust against version drift. +2. Eliminate the runbook items where the skill is currently papering over a missing `anc` feature. +3. Keep changes additive — no breaking changes to schema 0.5 consumers. + +## Non-goals + +- Restructuring `anc`'s internals or check pipeline. +- Spec-text changes (those go to `agentnative`). +- Anything that would prevent JSON envelopes from being consumed by an existing schema-0.5 parser (i.e. additions are + tolerated, renames are not — gate any rename behind a schema bump). + +## Asks + +Each ask is sized small enough to ship independently. Priorities reflect the skill-side benefit; `anc` may have other +priorities. + +### A1. Self-describing schema version in the envelope (highest priority) + +**What.** Surface the schema version on every `anc audit --output json` envelope at a stable, top-level path — proposed +`anc.schema_version` (sits alongside `anc.version`) or `run.schema_version`. String, semver-shaped (`"0.5"`, `"0.6.0"`). + +**Why.** Lets every agent assert `envelope.anc.schema_version == "0.5"` (or whatever they pinned against) before +parsing. Today the only fingerprint is the binary version. Schema and binary versions diverge; one is the contract, the +other is the implementation. + +**Acceptance.** + +- Field present in every JSON envelope from `anc audit`. +- Documented in the README and the `anc` man page (or equivalent). +- Bumping the schema bumps this field. Adding fields without rename does not. + +### A2. `anc explain ` (closes spec-skew gap) + +**What.** A new subcommand or flag that prints the spec text + machine-readable metadata for one `requirement_id` *from +the version of the spec `anc` was built against*. Output: `--output text` (default, human-readable) and `--output json` +(envelope with the full requirement record from spec frontmatter). + +**Why.** Today, an agent sees a finding citing `p1-must-no-interactive` and goes to `spec/principles/p1-*.md` — but the +bundle vendors a snapshot, and that snapshot can lag the `anc` build. `anc explain` is the only authoritative source: +it's whatever this specific binary was compiled against. Eliminates the spec-skew dead-end (R6 in the skill-side doc) at +its root. + +**Acceptance.** + +- `anc explain p1-must-no-interactive` returns spec text and structured metadata for a valid id. +- `anc explain bogus-id` returns a non-zero exit and a structured error envelope. +- `anc explain --list` (or `anc list-requirements`) enumerates every id the binary knows about. + +### A3. Stable exit-code policy + +**What.** Document and stabilise the exit-code contract for `anc audit` and other subcommands. Proposed: + +| Exit | Meaning | +| ---- | ------------------------------------------------------------------------- | +| 0 | Ran successfully; `summary.fail == 0` and the run is gating-clean. | +| 1 | Ran successfully; one or more `fail`s present (gating violation). | +| 2 | Invocation error — bad flags, target not found, profile name unknown. | +| 3 | Internal error — panic, malformed input, schema version mismatch on read. | + +**Why.** Lets agents and CI scripts use `anc audit && next-step` reliably, and lets agents distinguish "tool said no" +from "tool couldn't run." Closes R2's exit-code row in the skill-side doc. + +**Acceptance.** + +- Behaviour documented in README and CLI `--help`. +- Tested in CI for each exit class. +- The skill-side doc R2 references this table directly. + +### A4. Embed-position metadata on the badge block + +**What.** Add an optional field — proposed `badge.embed_position` — that names the recommended placement convention. +String enum, e.g. `"top-of-readme"`, `"under-title"`, `"in-badges-row"`, `null` if no convention applies. Defer to +whatever [`anc.dev/badge`](https://anc.dev/badge) publishes; anc just surfaces it. + +**Why.** Today four agents pasting `badge.embed_markdown` into the same README put it in four places. Convention belongs +at the source, not in every consumer's head. + +**Acceptance.** + +- Field present whenever `badge.eligible == true`. +- Skill-side R7 (badge placement runbook entry) defers to this field instead of duplicating the convention. +- Convention page at `anc.dev/badge` enumerates the values. + +### A5. Stable-output mode for CI diffability + +**What.** Either: + +- **(a)** Split the envelope into a stable subtree (gating-relevant: `summary`, `coverage_summary`, `results`, + `audit_profile`, `badge`) and a metadata subtree (`tool`, `anc`, `run`, `target`) that consumers can ignore for diff + purposes. This may already be the layout — confirm and document it. +- **(b)** Add `--stable` (or `--no-timestamps`) that elides volatile fields entirely, producing a byte-stable output for + identical input. + +**Why.** CI gates that diff scorecards across runs see false churn from `run.*` timestamps. Diffing only `summary` is +the workaround; native support eliminates it. + +**Acceptance.** + +- Document the stable subtree in the README. OR +- `anc audit --stable .` produces byte-identical output for byte-identical input. + +### A6. Profile composition for hybrid tools + +**What.** Either allow `--audit-profile` to take a comma-separated list, OR add a `--audit-profile-for-subcommand +=` repeatable flag, so a tool with a TUI mode + a batch mode can scope profiles per surface. + +**Why.** Today the four categories (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` reserved) don't +compose. Real tools mix shapes — e.g. an agent-native CLI with one rendering subcommand. Forcing a single profile either +suppresses real findings (if the user picks `human-tui`) or surfaces irrelevant ones (if they don't). + +**Acceptance.** + +- Hybrid tool case has a deterministic incantation. +- Skill-side R3 ("hybrid project rule") points to this rather than inventing a workaround. + +### A7. `anc` self-audit / `anc doctor` + +**What.** A subcommand that prints `anc.version`, `anc.schema_version`, the spec version it was built against, the +registered audit-profile names, and the registered hosts for `anc skill install`. JSON output supported. + +**Why.** Replaces a half-dozen "what does my install actually know about" questions that today require reading source. +Also supports the skill-side R5 install precondition — the agent can run `anc doctor --output json` to confirm health, +not just `command -v anc`. + +**Acceptance.** + +- One subcommand returns everything needed to validate an install. +- Output stable enough for an agent to assert against. + +## Priority and sequencing + +Ranked by skill-side benefit: + +1. **A1** (schema version) — unblocks future schema bumps without breaking consumers. Smallest change, largest leverage. +2. **A3** (exit codes) — unblocks CI gating and `&&` chaining. +3. **A2** (`anc explain`) — closes the spec-skew dead-end at its root. +4. **A5** (stable output) — closes CI diff churn. +5. **A6** (profile composition) — closes the hybrid-tool dead-end. +6. **A4** (embed position) — closes the badge placement question. Lowest urgency because the skill can document a + convention as a stop-gap. +7. **A7** (`anc doctor`) — convenience; not blocking anything. + +A1 and A3 are the two that the skill-side document treats as soft dependencies. The rest are independent improvements. + +## Filing plan + +Each ask becomes its own GitHub issue against `agentnative-cli`, labelled `agent-determinism` and cross-referencing this +brief and the skill-side sibling. The issue body is the corresponding `### Aₙ` block from this document (roughly +verbatim) plus a "Surfaced by" link to the skill-side review thread. + +A1 and A3 should land in the same `anc` release if possible — they pair naturally and together unblock the +highest-impact skill-side requirements. + +## Open questions + +1. A1 — does `tool.schema_version` already exist somewhere in the envelope? If yes, this ask collapses to "document and + stabilise"; if no, it's a small additive change. +2. A5 — confirm whether the current envelope already segregates stable from volatile fields cleanly. If so, the ask is + purely documentation. +3. A2 — is the spec already embedded in the binary at compile time, or is it loaded from a vendored snapshot at runtime? + The implementation strategy depends on the answer. diff --git a/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md b/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md new file mode 100644 index 0000000..5338b45 --- /dev/null +++ b/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md @@ -0,0 +1,329 @@ +--- +title: "feat: Bootstrap agentnative-skill producer repo (Unit 1 of skill-distribution master plan)" +type: feat +status: complete +date: 2026-04-27 +completed_date: 2026-04-27 +public_flip_completed_date: 2026-04-28 +v0_2_0_completed_date: 2026-04-29 +v0.1.0_commit: 47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e +v0.1.0_tag_object: 5e2b676369a952c80da4bf133ce5ac03d34406a5 +v0.2.0_commit: 2b10c845760becf3de8d66aafbb7b57820385d45 +v0.2.0_tag_object: 054c249c36e92d5fc08603f701c999ac9ad187b6 +master_plan: ../../../agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md +related_session: ../../../agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md +origin: master plan Unit 1, executed in dedicated session +--- + +# feat: Bootstrap agentnative-skill producer repo + +## Session Context + +This plan is executed by a **dedicated session whose cwd is `~/dev/agentnative-skill/`**. A parallel session in +`~/dev/agentnative-site/` runs the matching site-side plan (Units 2–5). The two sessions exchange exactly **one +artifact**: the v0.1.0 commit SHA produced here, consumed there as `source.commit` in `src/data/install.json`. + +**Master plan (canonical reference):** +`~/dev/agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md` — Unit 1. **Sibling site +plan:** `~/dev/agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md` — Units 2–5 plus the +public-flip cutover. + +This document defines execution shape; the master plan defines architectural rationale. Read the master plan's Unit 1 +section first if any decision below feels under-justified. + +## State at Session Start + +- **Repo created (this current session):** `github.com/brettdavies/agentnative-skill` (private, empty, topics applied). +- **Local clone present:** `~/dev/agentnative-skill/` containing only `.git/`. Origin = + `git@github.com:brettdavies/agentnative-skill.git`. +- **Bundle source (private, author's machine):** `~/dev/agent-skills/agent-native-cli/` — pre-audited; no private path + references found via `grep -rln -E '(agent-skills|~/dev|/home/brett)'`. +- **Bundle inventory (verified at planning time):** +- `SKILL.md` (root) +- `checklists/new-tool.md` +- + +`references/{framework-idioms,framework-idioms-other-languages,principles-deep-dive,project-structure,rust-clap-patterns}.md` + +- `scripts/check-compliance.sh` +- `scripts/checks/_helpers.sh` + 21 `check-*.sh` scripts (broader than the master plan's illustrative list — copy the + actual filesystem, do not enforce the master plan's incomplete enumeration) +- `templates/{agents-md-template.md,clap-main.rs,error-types.rs,output-format.rs}` + +## Goals + +1. Stand up the public producer repo with the bundle at the root, governance files, CI gates, and v0.1.0 tag. +2. Migrate via clean-room re-commit (no history import from the private source). +3. Apply `github-repo-setup` skill defaults (branch protection, tag protection, CODEOWNERS). +4. **Stop before public flip.** The visibility cutover is coordinated with the site PR in the sibling session. +5. Hand off the v0.1.0 commit SHA to the site session via a one-line note. + +## Non-Goals + +- **No `gh repo edit --visibility public`.** That is the final cutover step in the site plan, not here. +- **No site repo edits.** Cross-repo coordination is via SHA handoff only. +- **No content rewriting of SKILL.md or scripts.** Migration is byte-faithful copy. Content evolution is post-bootstrap. +- **No git history import** from the private source. The private repo's branches, authorship, and paths must not leak. +- **No signed tags in v1.** Documented in the master plan as deferred to v2. + +## Branch Discipline (initial commit special case) + +The master plan and global CLAUDE.md mandate `dev` off `main`, features off `dev`, PR'd back. An empty repo has no +`main` to PR against — the first commit is a bootstrap exception: + +1. Bootstrap commit lands on `main` directly (no PR — there is nothing to review against). +2. Immediately after, create `dev` tracking `main`. Push it. **Leave `main` as the repo default branch** — this matches + the rest of the agentnative ecosystem (`agentnative`, `agentnative-cli`, `agentnative-site` all default to `main`). A + naive `git clone` should land visitors on the released bundle, not on `dev`'s WIP. +3. Tag `v0.1.0` on the bootstrap commit (pre-`dev` creation is fine; tag refers to commit SHA, not branch). +4. Any subsequent work in this repo (post-v0.1.0) follows full discipline: feat/*off `dev`, squash-PR to `dev`, + release/* cherry-picked from `main` and PR'd to `main`. + +The bootstrap exception is one commit only. Repeat-edits during this session must NOT pile onto `main`; if you discover +a fix mid-session, force yourself onto a `feat/*` branch. + +## Implementation Steps + +### Step 1: Verify session preconditions + +```bash +pwd # → /home/brett/dev/agentnative-skill +git remote -v # → origin git@github.com:brettdavies/agentnative-skill.git (fetch+push) +git status # → clean, on no branch yet (empty repo) +ls ~/dev/agent-skills/agent-native-cli/ # → bundle source visible +gh auth status # → authenticated as brettdavies +``` + +If any check fails, stop and ask the orchestrator. The repo creation in the parent session is the only external +prerequisite; everything else from here is local work. + +### Step 2: Author governance files + +Create at repo root (do **not** copy from master plan verbatim — write fresh content suited to a single-skill repo): + +- **`README.md`** — short. What this repo is (the `agent-native-cli` skill bundle, cloned-in-place install model), + pointer to `https://anc.dev/install` for install instructions, pointer to `https://anc.dev/p1` for principles, + license, contribution model. ~50 lines max. +- **`LICENSE`** — MIT, year 2026, copyright Brett Davies. Match the form used in `~/dev/agentnative-site/LICENSE` if + present (read it for exact wording); otherwise standard SPDX MIT text. +- **`CHANGELOG.md`** — keepachangelog.com format. Initial section: `## [0.1.0] - 2026-04-27 — Initial release`. Bullets + describe the bundle (7 principles, 24 compliance audits, templates). +- **`VERSION`** — single line `0.1.0\n`. No frontmatter, no metadata. This is the source of truth for the version string + the site reads at release time. +- **`SECURITY.md`** — vulnerability disclosure policy. Channel: GitHub private security advisories + (`https://github.com/brettdavies/agentnative-skill/security/advisories/new`). 90-day disclosure window. Reference + `SECURITY.md` patterns from `~/dev/agentnative-cli/SECURITY.md` if it exists. +- **`.gitignore`** — minimal: editor scratch files (`.DS_Store`, `*.swp`, `.idea/`, `.vscode/`), local todos + (`TODO*.md`, `.context/`), build artifacts if any (`target/`, `node_modules/`). +- **`.gitattributes`** — `* text=auto eol=lf` and `*.sh text eol=lf` to enforce line endings on shell scripts (Windows + hosts will checkout LF). This is in the master plan's risk table; the file IS the mitigation. + +### Step 3: Migrate the bundle (clean-room copy) + +```bash +# Copy each top-level item from the source. Use cp -a to preserve mode bits (scripts must remain +x). +cp -a ~/dev/agent-skills/agent-native-cli/SKILL.md ./ +cp -a ~/dev/agent-skills/agent-native-cli/checklists ./ +cp -a ~/dev/agent-skills/agent-native-cli/references ./ +cp -a ~/dev/agent-skills/agent-native-cli/scripts ./ +cp -a ~/dev/agent-skills/agent-native-cli/templates ./ + +# Verify scripts kept executable bits. +ls -l scripts/*.sh scripts/checks/*.sh | awk '{print $1, $NF}' | grep -v 'rwx' && \ + echo 'FAIL: non-executable scripts found' || echo 'OK: scripts executable' + +# Final frontmatter audit — should output nothing (matched at planning time, re-verify after copy). +rg -n --no-heading '(agent-skills|~/dev|/home/brett)' . || echo 'OK: no private paths' +``` + +**Frontmatter audit beyond paths:** `SKILL.md`'s top-level `description` field and `name` should match the deployed +skill. The frontmatter at planning time was clean — re-grep after the copy and pause if anything new surfaces. + +### Step 4: CI workflow + +Create `.github/workflows/ci.yml`. Jobs: + +1. `markdownlint` — runs `markdownlint-cli2` against `**/*.md`. Use the action pinned to a commit SHA per global + CLAUDE.md SHA-pinning rule (resolve via `gh api repos///commits/ --jq .sha`). +2. `shellcheck` — runs against `scripts/*.sh` and `scripts/checks/*.sh` with severity `style`. Use a pinned-by-SHA + action. + +Triggers: `push` on `main`/`dev`/`feat/**`, `pull_request` against `main`/`dev`. Required for branch protection in Step +7. + +The site session has a `~/dev/agentnative-site/.github/workflows/ci.yml` reference; you may consult it for action SHA +patterns but do NOT introduce any other site-specific CI logic — this repo's CI is intentionally minimal. + +### Step 5: CODEOWNERS + +Create `.github/CODEOWNERS`: + +```text +# Mandatory review on anything that runs on user machines at install time +scripts/** @brettdavies +.github/workflows/ @brettdavies +``` + +Per master plan risk row "Shell scripts in producer repo execute on user machines" — this is the review gate. + +### Step 6: Initial commit + push + +```bash +git checkout -b main # create main locally +git add -A +git status --short # sanity scan; no surprises +git commit -m "feat: initial bundle for agent-native-cli v0.1.0 + +Migrate the agent-native-cli skill bundle into a public producer repo. +Bundle ships at the repo root so 'git clone' directly into a host's +skills directory IS install. + +Includes: +- SKILL.md (north-star standard, 7 principles) +- checklists/, references/, scripts/, templates/ +- 24 compliance audits across 9 groups (scripts/checks/*) +- governance: LICENSE (MIT), CHANGELOG, VERSION, SECURITY.md, CODEOWNERS +- CI: markdownlint + shellcheck + +Source: clean-room re-commit from a private bundle that lived on the +author's machine since March 2026. No history imported. + +Companion endpoints (anc.dev/install, anc.dev/install.json) ship via +the agentnative-site repo, pinning to this commit SHA." +git push -u origin main + +# Now create dev tracking main and push it. +# Leave `main` as the repo default branch (matches the rest of the +# ecosystem: agentnative, agentnative-cli, agentnative-site all +# default to main). A naive `git clone` should land on the released +# bundle, not on dev's WIP. +git checkout -b dev +git push -u origin dev +``` + +### Step 7: Tag v0.1.0 + +Tag on `main` (the bootstrap commit), then push: + +```bash +git checkout main +git tag -a v0.1.0 -m "v0.1.0 — initial release" +git push origin v0.1.0 + +# Capture SHA for the site session handoff. +git rev-parse v0.1.0 # → 40-char SHA — copy this verbatim +``` + +**Hand off the SHA to the orchestrator.** The site session needs this exact value as `source.commit` in +`src/data/install.json`. + +### Step 8: Apply repo settings (branch + tag protection) + +Use `gh api` against the rulesets endpoint (modern replacement for branch-protection). The `github-repo-setup` skill is +the authoritative source for the rules — invoke it via `Skill(skill="github-repo-setup", args="audit + apply")` if +present; otherwise apply the following manually. + +**Branch ruleset (covers `main` AND `dev`):** + +- Force-push: blocked. +- Branch deletion: blocked. +- Required status checks: `markdownlint`, `shellcheck` (the two CI jobs from Step 4). +- Require linear history: optional (nice-to-have for a small repo, not load-bearing). +- Bypass: repo admins permitted (so `gh repo edit --visibility public` and emergency hotfixes still work). + +**Tag ruleset (covers `v*`):** + +- Force-push (re-tag): blocked. +- Tag deletion: blocked. +- Bypass: repo admins permitted. + +Verify both rulesets actually take effect: + +```bash +gh api repos/brettdavies/agentnative-skill/rulesets --jq '.[].name' # should list both +git checkout main +git commit --allow-empty -m "test: should be rejected" && git push origin main +# expected: refused by ruleset; if accepted, the protection isn't applied +``` + +(Reset / discard the test commit after verifying refusal.) + +### Step 9: Verification gates + +All Step 9 gates pass: + +- [x] `gh repo view brettdavies/agentnative-skill --json visibility -q .visibility` → `PUBLIC` (flipped 2026-04-28). +- [x] `gh repo view brettdavies/agentnative-skill --json defaultBranchRef -q .defaultBranchRef.name` → `main`. +- [x] `gh run list --branch main --limit 1` → CI job for the initial commit completed `success`. +- [x] `gh release list` → both `v0.1.0` and `v0.2.0` published. +- [x] `gh api repos/brettdavies/agentnative-skill/git/ref/tags/v0.1.0 --jq '.object.sha'` → + `5e2b676369a952c80da4bf133ce5ac03d34406a5` (tag object SHA, recorded in frontmatter alongside the commit SHA + `47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e`). +- [x] Force-push to `main` fails for non-admin: `protect-main` rules array blocks force-push; the `actor_type: + RepositoryRole, actor_id: 5, bypass_mode: always` bypass allows admin overrides. +- [x] `markdownlint-cli2 '**/*.md'` exits zero locally. +- [x] `shellcheck scripts/*.sh scripts/checks/*.sh` exits zero with three narrow `.shellcheckrc` disables (`SC1091`, + `SC2034`, `SC2125`) tolerating style-only findings in the migrated bundle scripts. +- [x] **Local install smoke test (Claude Code):** + ```bash + mkdir -p /tmp/anc-test-home/.claude/skills + HOME=/tmp/anc-test-home git clone --depth 1 git@github.com:brettdavies/agentnative-skill.git \ + /tmp/anc-test-home/.claude/skills/agent-native-cli + ls /tmp/anc-test-home/.claude/skills/agent-native-cli/SKILL.md # exists + rm -rf /tmp/anc-test-home + ``` + (SSH clone was verified during the bootstrap session against the then-private repo. After the 2026-04-28 public + flip, both `git clone https://github.com/brettdavies/agentnative-skill.git` and the SSH form should work; the + site session's e2e relies on the HTTPS path.) + +### Step 10: Hand-off note + +Handoff was delivered to the orchestrator chat with the bootstrap SHAs (commit +`47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e`, tag object `5e2b676369a952c80da4bf133ce5ac03d34406a5`) and repo visibility +(`PRIVATE` during bootstrap, flipped `PUBLIC` 2026-04-28). No separate handoff doc was committed. + +## Risks & Mitigations (this session only) + +| Risk | Mitigation | +| --------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Accidental include of bundle source's parent dir or hidden files | `git status --short` between copy and commit; explicit per-item `cp -a` (Step 3) rather than wildcard from source root | +| Scripts lose +x bits on copy | `cp -a` preserves modes; verification in Step 3 fails loud if any script is non-executable | +| `markdownlint` flags existing skill content not previously CI-checked | Resolve in-session: either fix the markdown to pass, or add a narrowly-scoped `.markdownlint.jsonc` exception with a comment explaining why. Do NOT lower the bar globally | +| Branch protection blocks the very `git push` that creates `dev` | Push `main` BEFORE creating the ruleset (Step 6 precedes Step 8); rulesets apply on next push | +| Frontmatter audit false-negative — private path slips through | Re-run audit in Step 3 AFTER copy, not just at planning time; the planning-time grep is supporting evidence, not a substitute | +| GitHub Release object expected but not created | Master plan does not require a Release object for v1 — tag suffices. Don't create one speculatively | +| Tag created on `dev` instead of `main` | Step 7 explicitly checks out `main` before tagging. The bootstrap commit is on `main`; `dev` was created from it so they point at the same SHA at this moment, but the canonical tag-bearing branch is `main` | +| Site session starts before this is done | The user is orchestrating both sessions and will not start the site session before this session reports the SHA | + +## Files Touched (in this repo only) + +```text +.gitattributes +.gitignore +.github/CODEOWNERS +.github/workflows/ci.yml +CHANGELOG.md +LICENSE +README.md +SECURITY.md +SKILL.md # migrated from ~/dev/agent-skills/agent-native-cli/ +VERSION +checklists/new-tool.md # migrated +docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md # this plan, committed alongside bootstrap +references/ # migrated (5 files) +scripts/ # migrated (1 top-level + 1 helper + 21 check scripts) +templates/ # migrated (4 files) +``` + +## Sources & References + +- **Master plan:** `~/dev/agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md` (Unit 1) +- **Sibling site plan (parallel session):** + `~/dev/agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md` +- **Bundle source (private, author's machine):** `~/dev/agent-skills/agent-native-cli/` +- **Skill on Claude Code (live):** `~/.claude/skills/agent-native-cli/` — same content as the bundle source; do NOT use + this path as the migration source (it may diverge from the master copy in `~/dev/agent-skills/`) +- **External:** [agentskills.io specification](https://agentskills.io/specification), + [garrytan/gstack](https://github.com/garrytan/gstack) +- **Global rules consulted:** `~/.claude/CLAUDE.md` (SHA-pinning, branch discipline, no AI attribution, dev-flow + squash-merge) diff --git a/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md b/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md new file mode 100644 index 0000000..b9dc0cc --- /dev/null +++ b/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md @@ -0,0 +1,1786 @@ +--- +title: "feat: SKILL.md determinism + runbook hardening (R1–R12)" +type: feat +status: active +date: 2026-05-01 +origin: docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md +sibling_brief: docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md +--- + +# feat: SKILL.md determinism + runbook hardening (R1–R12) + +## Summary + +Rewrite `SKILL.md`, `getting-started.md`, and four reference files against `anc 0.3.0`'s live JSON envelope +(`schema_version: "0.5"`, top-level `tool/anc/run/target/badge/audience/audit_profile`, `results[].id`); add three new +reference files (`runbook.md`, `scorecard-shape.md`, `audit-profile-selection.md`) and a consolidated `rust-clap.md`; +ship six new templates (Cargo TOML, CLI tests, scorecard JSON Schema, Python/Click + Go/Cobra starters with paired +AGENTS.md scaffolds). Sequenced as seven landable PRs: R9 boundary cleanup first, then R1–R8 determinism additions, then +R10 frontmatter, then R12 Rust idioms consolidation, then three language-starter PRs, then R14 deterministic-script +hardening (the dogfood gate that makes `bin/check-update` and `scripts/sync-spec.sh` pass the audit the skill teaches). +All seven PRs ship in one consolidated `v0.4.0` release. + +--- + +## Problem Frame + +`SKILL.md` today describes the `anc` envelope in prose without samples, names `results[].status` once without +enumerating values, has no exit-code contract, no schema-version pin, no `--audit-profile` selection rule, no +termination rule for `warn`s, no fallback for spec-skew, no install precondition, no runbook, and no allowed-tools +frontmatter. Two competent agents reading it can produce two different correct-looking-but-wrong scorecard handlers — +that variance is the problem this plan closes. Full pain narrative + all twelve requirements live in +`docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md`. A sibling cross-repo brief +(`docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md`) tracks the `anc`-side feature requests +this plan soft-depends on; interim guidance ships here regardless. + +--- + +## Requirements + +- R1. Worked scorecard samples in three placements (SKILL.md inline ~15 lines, getting-started.md inline ~5 lines, + `references/scorecard-shape.md` exhaustive). +- R2. `## Anc contract` section in SKILL.md — output flag policy, exit codes, `results[].status` enum, schema-version + pin path, stable-vs-noise subtree. +- R3. `--audit-profile` decision table in SKILL.md + extended `references/audit-profile-selection.md`. +- R4. Loop termination rule for `warn` / `should` (must gates the loop; warn is advisory; stop on `badge.eligible`). +- R5. `anc` install precondition replaces "First action: update check" as the very first action. +- R6. Spec-skew fallback paragraph in "The anc loop" step 2 (Fix.). +- R7. `references/runbook.md` covering badge placement, scorecard commit policy, `--binary` / `--source` scoping, anc + panics, skill-host fallback. +- R8. `allowed-tools: Bash(anc *), Read` added to SKILL.md frontmatter. +- R9. SKILL.md / `getting-started.md` / `references/` boundary cleanup + reorder so the first runnable command in + SKILL.md is `anc audit`, not `bash bin/check-update`. SKILL.md kept under 200 lines. +- R10. SKILL.md frontmatter audit beyond `allowed-tools` — description triggers, skip clause, kept-as-is decisions + documented. +- R11. New starter templates: `cargo-toml.md`, `cli-tests.rs`, `python-click/`, `go-cobra/`, + `agents-md-template.{python,go}.md`; new sub-sections in `references/framework-idioms-other-languages.md` for JSON + envelope authoring and P5 mutation-boundary testing. **The `templates/scorecard-envelope.schema.json` artifact origin + R11 enumerated is dropped:** the canonical schema is shipped by upstream `agentnative-cli` (plan + `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md`) embedded in the binary and exposed via + `anc generate scorecard-schema`; vendoring a copy here would duplicate and drift. U14 is repurposed to document the + extraction path. The "P2 author-output template" intent of origin R11.3 (a generic envelope template tool authors emit + *their tool's* output against, distinct from anc's audit envelope) is deferred to follow-up work. +- R12. Consolidate `references/framework-idioms.md` + `references/rust-clap-patterns.md` into a single + `references/rust-clap.md`; delete the originals; preserve every file-unique nugget the origin enumerates. +- R13. Subcommand index + target-resolution runbook entry (**plan-introduced, not origin-traced**). A short SKILL.md + table covering the full `anc` subcommand surface (`check`, `generate`, `skill install`, `completions`, `help`, plus + the bare-`anc`-defaults-to-`check` shortcut), so agents stop punting to `anc --help` for basic discovery. Plus an + expanded runbook entry covering all four target modes — `anc audit .` (project mode), `--binary`, `--source`, + `--command ` — with resolution rules for each. Closes the two gaps a 2026-05-01 peer review of the current + SKILL.md surfaced ("the full subcommand surface" and "target resolution rules for `--binary` vs `--source` vs + `--command` vs project mode"). +- R14. Deterministic-script hardening (dogfood gate, **plan-introduced 2026-05-04**). `bin/check-update` and + `scripts/sync-spec.sh` must pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true` — + the skill audits CLIs against the agent-native spec, so the skill's own runtime scripts must pass that audit. Add + `--output text|json`, `--quiet`, `--no-interactive`, `--timeout ` global flags (P1, P2, P7) to both. Add + `--dry-run` (P5) and `--force` (P5) to `scripts/sync-spec.sh`. Move success-path status echoes from stdout to stderr + in `sync-spec.sh` so `--output json` produces a clean envelope. Distinguish network-fail from up-to-date in + `bin/check-update`'s output (today both branches exit 0 silently). Document the text-mode token grammar, the JSON + envelope per status, and the env-var contract (`AGENTNATIVE_SKILL_DIR`, `AGENTNATIVE_SKILL_REMOTE_URL`, + `AGENTNATIVE_SKILL_STATE_DIR`, `SPEC_REMOTE_URL`, `SPEC_ROOT`) in `references/update-check.md` and + `references/runbook.md`. + +(R1–R12 cited verbatim from origin §Requirements; R13 added by this plan in response to a peer review of the live +SKILL.md that surfaced two recurrent gaps the origin brainstorm did not anticipate; R14 added 2026-05-04 to make the +skill's own scripts pass the audit the skill teaches — the dogfood gate. Success criteria from origin §Success Criteria; +R13's success criterion is "an agent driving `anc` cold can enumerate the subcommand surface and pick the right target +mode without reading `anc --help` source"; R14's is "`anc audit --binary --audit-profile posix-utility` against +`bin/check-update` and `scripts/sync-spec.sh` produces `badge.eligible == true` for both.") + +--- + +## Scope Boundaries + +- No `anc` binary changes (sibling brief `2026-05-01-002` covers those). +- No spec-text edits — `spec/` is vendored from `agentnative` via `scripts/sync-spec.sh`. +- No skill rename — `name: agent-native-cli` stays per origin R10's documented "no" decision. +- No Haiku-tier prose rewrite (separate brainstorm if pursued). +- No new principles, new audit profiles, or new starter languages beyond Python/Go (Rust + JS + Ruby idioms exist; + Swift, Kotlin, etc. are their own brainstorm). + +### Deferred to Follow-Up Work + +- JS and Ruby rows in `references/framework-idioms-other-languages.md`'s new "JSON envelope authoring" and "Testing P5 + mutation boundaries" sub-sections — section scaffolds land in PR 5; JS / Ruby rows defer to a follow-up PR matched to + a JS/Ruby starter ecosystem expert. +- Updating origin brainstorm's R1 "Must show" field-list to use `id` instead of `requirement_id` — origin doc is closed; + correction lives only in this plan and in the shipped docs. +- **P2 author-output envelope template** (the half of origin R11.3 that's distinct from anc's audit envelope). A generic + envelope authors emit *their tool's* `--output json` against — not the anc audit envelope U14 now points at. Deferred + to a follow-up plan because (a) origin R11.3 conflated the two artifacts, and (b) "what envelope shape should a P2 + tool emit?" is its own open question that compounds with the existing `references/framework-idioms-other-languages.md` + JSON-envelope sub-section but isn't a one-file deliverable. + +--- + +## Context & Research + +### Relevant Code and Patterns + +- `SKILL.md` (146 lines today) — entry point; frontmatter + preamble + four loop steps + principles index + + implementation guidance + starter code + compliance auditing + sources. R9 reorders this. +- `getting-started.md` (85 lines) — three working loops + install + "Where things live". R9 dedupes against SKILL.md. +- `references/framework-idioms.md` (137 lines, Free / Must / Anti-patterns scaffold, Rust-only) and + `references/rust-clap-patterns.md` (209 lines, denser per-principle prose). R12 merges. +- `references/framework-idioms-other-languages.md` (338 lines; Click, argparse, Cobra, Commander, yargs, oclif, Thor) — + R11 adds two new cross-language sub-sections. +- `references/project-structure.md` (165 lines) — Rust-prescriptive; not touched by this plan but linked from new + `rust-clap.md`. +- `references/update-check.md` (33 lines) — already separated; R5 + R9 demote SKILL.md's update-check section here. +- `templates/clap-main.rs`, `templates/error-types.rs`, `templates/output-format.rs`, `templates/agents-md-template.md` + — canonical Rust patterns; R11 adds Cargo TOML + tests + JSON Schema + Python/Go siblings. +- `bin/check-update` — bash, exits 0 always, emits `UPGRADE_AVAILABLE` on divergence. Used by R5's revised first action + (after `anc` is verified on PATH). +- `spec/principles/p1-*.md` … `p7-*.md` — vendored at `spec/VERSION = 0.3.0`. Cited by R6 spec-skew fallback. + +### Live `anc 0.3.0` envelope (verified 2026-05-01 against `anc audit --output json --command rg`) + +Top-level keys: `schema_version` (`"0.5"`), `spec_version` (`"0.3.0"`), `summary`, `coverage_summary`, `results`, +`audit_profile`, `audience` (`"agent-optimized"` for rg), `badge`, `tool`, `anc`, `run`, `target`. Result entries carry +`id` (kebab-case, e.g. `p3-help`), `label`, `group` (`P1`–`P7`), `layer` (`behavioral` / `project` / `source`), `status` +(one of `pass` / `warn` / `fail` / `skip` / `error`), `evidence`, `confidence`. Badge block populated when `eligible: +true`: `score_pct`, `embed_markdown`, `scorecard_url`, `badge_url`, `convention_url`. Run block carries `invocation`, +`started_at` (ISO-8601), `duration_ms`, `platform.{os,arch}`. Target block: `kind` (`command` / `path`), `path`, +`command`. Tool block: `name`, `binary`, `version`. Anc block: `version`. + +Observed exit codes (verified 2026-05-01 against `anc 0.3.0` for echo / true / false / cat / ls / grep / rg): `0` when +summary is fully clean (no warn / fail / skip); `1` when warn or skip are present and `summary.fail == 0` (cat, ls, +grep, rg); `2` when `summary.fail > 0` (echo, true, false — each had `fail >= 1`) **OR** for invocation errors (missing +path, bad flag, unknown command). The `2` slot is overloaded today: it means "tool gated correctly with at least one +fail" *and* "tool failed to run." Agents writing CI gates **must** switch on `summary.fail`, not on `$?`, since `$? == +2` cannot distinguish a real fail from a missing-path. Sibling brief A3 will split these (`1` for `fail > 0`, `2` +reserved for invocation errors only); until then, R2 documents this observed-but-overloaded behavior. + +### Institutional Learnings + +- `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — SKILL.md is a metadata +- branch-router; depth lives in `references/`; templates must be self-contained text, not shell-callable. Reinforces R9 + boundary cleanup. +- `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` — document the envelope + contract once in `references/` and link from SKILL.md; do not duplicate field names inline. Drives the three-placement + strategy in R1. +- `~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md` + — agents must switch on typed fields, never grep human-mode display strings. Drives R2's stable-vs-noise guidance. +- `~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` — error + envelopes must carry the same context fields as success envelopes. Relevant to R11.3 schema template. +- `~/dev/solutions-docs/best-practices/agentnative-version-model-2026-05-01.md` — six version concepts across four + repos; every version mention in SKILL.md must name *which* version (skill bundle / spec / anc). Drives R2 schema-pin + wording. + +### External References + +- `anc.dev/badge` — convention page referenced by R7 badge-placement entry. R7 defers placement to whatever the + convention publishes; if the page doesn't enumerate placement values, R7 proposes one and the sibling brief A4 + surfaces it as `badge.embed_position` metadata in a future anc release. + +--- + +## Key Technical Decisions + +- **Envelope reality alignment.** Every envelope-shape claim in `SKILL.md` and `getting-started.md` is rewritten against + `anc 0.3.0` live output. The brainstorm's R1 "Must show" field list is treated as advisory: where it says + `requirement_id`, the docs ship `id` (the actual envelope field). **The two are different namespaces, not synonyms.** + Envelope `results[].id` is an AUDIT id (`p3-help`, `p1-flag-existence`, `p6-sigpipe`); spec frontmatter + `requirements[].id` is a REQUIREMENT id (`p1-must-no-interactive`, `p2-must-output-flag`). One check verifies one or + more requirements; the mapping lives in each principle file's body prose (e.g. p6's "Measured by audit IDs + `p6-sigpipe`, `p6-no-color`, `p6-completions`, `p6-timeout`, `p6-agents-md`"). The previously-claimed "1:1 mapping" + was wrong and is retracted across U4 / U5 / U8. The brainstorm's `tool/anc/run/target/badge.*` claims now match + reality so they ship verbatim. New top-level fields the brainstorm did not enumerate (`audience`, `audience_reason`) + are documented in `references/scorecard-shape.md` with an explicit value enumeration captured at U5 implementation + time. +- **Schema-pin path is concrete.** R2 pins top-level `schema_version` (which exists today as `"0.5"`). No anc-side + change required for the pin to ship usable. Sibling brief A1 proposed `anc.schema_version` for additivity; current + top-level placement is documented as the canonical path. +- **Exit-code documentation is interim AND overloaded.** R2's exit-code table documents observed `anc 0.3.0` behavior + empirically: `0` = fully clean, `1` = warn/skip-only with `summary.fail == 0`, `2` = `summary.fail > 0` **or** + invocation error (the slot is overloaded today). Verified across 7 commands. The skill **must** instruct agents to + switch on `summary.fail`, never `$?`, since `$? == 2` cannot distinguish a real fail from a missing-path. Sibling + brief A3 will split the slots; until then, the interim guidance is `gate on summary.fail, treat $? as advisory`. The + earlier draft of this decision (`1` = any non-pass present, `2` = invocation error only) was empirically wrong and is + retracted. +- **Badge runbook works against the JSON envelope.** R7's badge-placement entry references `badge.embed_markdown` + directly (it exists in JSON now); placement convention defers to `anc.dev/badge`; soft-deps on sibling brief A4 only + for machine-readable position metadata. +- **R9 first, R12 before R11 Rust starters.** Origin Open Question #4 chose R9 first to avoid re-homing R1–R7 content. + Origin Open Question #7 chose R12 before the Rust starter additions (PR 4) so the merged `references/rust-clap.md` is + the link target the new starter docs reference. This plan adopts both. +- **R12 merge scaffold.** Adopt `framework-idioms.md`'s Free / Must / Anti-patterns three-bucket structure as canonical + (per origin's "merge strategy"); fold every file-unique nugget enumerated in origin (subcommand-vs-flag taxonomy in + P6, `kind()` method + main-only `process::exit()` in P4, Jsonl variant in P2, exemplar references to xurl-rs / bird). +- **R10 trigger-keyword audit defers into U10.** Origin Open Question #5 asked for a recent-traffic audit before + pruning/adding triggers. The audit happens *inside* U10 (qmd query against the skills collection + manual review of + the existing trigger list) rather than as a planning prerequisite. Keeps the plan unblocked. +- **PR-shaped phasing.** The plan ships as six landable PRs matching origin's handoff section. Each phase = one PR. This + means the plan doubles as a multi-PR coordination doc; reviewers can land PR 1 without waiting on the rest. +- **Cross-language idiom sub-sections split across PR 5 + PR 6.** R11's "JSON envelope authoring" and "Testing P5 + mutation boundaries" sub-sections in `references/framework-idioms-other-languages.md` cover four languages (Python, + Go, JS, Ruby). Section scaffold + Python rows land in PR 5; Go rows append in PR 6; JS + Ruby rows defer to follow-up + work. +- **Schema-as-pointer, not schema-as-artifact (U14 pivot).** Origin R11.3 enumerated a hand-written + `templates/scorecard-envelope.schema.json` in this skill bundle. Upstream `agentnative-cli` plan + `2026-04-30-002-feat-scorecard-json-schema-plan.md` ships the canonical scorecard schema embedded in the `anc` binary + (derived from Rust types via `schemars`, exposed via `anc generate scorecard-schema`, archived at + `https://anc.dev/scorecard-v0.5.schema.json`). Vendoring a copy in this bundle would duplicate and drift the moment + upstream bumps. U14 is repurposed: the skill points at the upstream verb (`anc generate scorecard-schema --output -`) + rather than shipping its own copy. The "P2 author-output template" half of origin R11.3 (a generic envelope template + tool authors emit *their tool's* output against, not anc's audit envelope) is deferred to follow-up work — it's a + different artifact and origin conflated the two. +- **Verification posture.** PRs 1–6 are docs + templates. Most units verify via grep checks, line-count assertions, + cross-file link integrity, and byte-faithful comparison against `anc audit --output json` output. Three units have + executable verification: U13 (`cli-tests.rs` runs under `cargo test`), U15 + U18 (Python / Go starters compile or + import cleanly). U14 is now docs-only (it documents the upstream `anc generate scorecard-schema` extraction path), so + it carries no executable verification beyond grep checks and link integrity. **PR 7 is the executable-verification + exception:** U22 / U23 modify runtime bash scripts and U25 is the empirical dogfood gate (`anc audit --binary` against + both scripts). +- **R14 dogfood gate (PR 7).** Approach is additive: new flag surfaces (`--output text|json`, `--quiet`, + `--no-interactive`, `--timeout`, `--dry-run`, `--force`) extend the existing CLI; default behavior is preserved + byte-for-byte when no new flags are passed (back-compat for existing consumers and the cache file at + `~/.cache/agent-native-cli/last-update-check`). Cache file format stays in legacy `UP_TO_DATE ` / + `UPGRADE_AVAILABLE ` shape; JSON appears at output time only. One minor breaking change in `sync-spec.sh`: + success-path status echoes ("querying…", "vendoring…", "wrote N principle files") move from stdout to stderr so + `--output json` produces a clean stdout envelope. PR 7's changelog calls the breaking change out explicitly. The + dogfood claim — the skill audits CLIs against the agent-native spec and the skill's own runtime scripts pass that same + audit at the badge-eligible bar — is non-trivial credibility for the skill itself. + +--- + +## Open Questions + +### Resolved During Planning + +- **OQ-origin-#1** (R1 inline vs reference): inline minimal (~15 lines) **and** reference exhaustive (~60 lines). Both + ship in U5. +- **OQ-origin-#3** (R7 location): `references/runbook.md`, with a one-paragraph pointer in SKILL.md. U9. +- **OQ-origin-#4** (R9 sequencing): R9 first. PR 1 leads with U1. +- **OQ-origin-#6** (R11 PR split): split per language ecosystem. PR 4 Rust, PR 5 Python, PR 6 Go. +- **OQ-origin-#7** (R12 vs R11 Rust): R12 first. PR 3 lands before PR 4. +- **Field-name drift** (`results[].id` vs origin's `requirement_id`): use `id` (actual). The earlier "maps 1:1" framing + was wrong — envelope `id` and spec frontmatter `requirements[].id` are different namespaces (audit id vs requirement + id). U4 / U5 / U8 carry the corrected framing and the principle-prose mapping path. + +### Deferred to Implementation + +- **OQ-origin-#2** (R2 anc exit-code policy): empirically resolved by 2026-05-01 probing across 7 commands. Observed: + `0` clean, `1` warn/skip with `fail==0`, `2` overloaded (`fail>0` OR invocation error). The skill ships interim + guidance "**gate on `summary.fail`, not `$?`**" with cross-link to sibling brief A3 (which will split the overloaded + `2` slot). The earlier draft contract (`1` for any non-pass, `2` for invocation error only) is empirically wrong and + retracted. +- **OQ-origin-#5** (R10 trigger-keyword traffic audit): U10 runs the audit at implementation time using a qmd query + against the skills collection and manual review of `Triggers on:` entries. +- **`anc.dev/badge` placement convention** (R7 dependency): if the page does not enumerate placement values when U9 + lands, U9 proposes one (`top-of-readme` after the H1) and the proposal is mirrored to sibling brief A4. +- **R14 CI integration**: PR 7 ships the manual dogfood gate (U25) plus a `CONTRIBUTING.md` note pinning re-runs to + edits of either script. Wiring the audit into CI as a regression gate (`gh workflow` entry running `anc audit + --binary` on every PR touching `bin/` or `scripts/`) is **explicitly out of scope** for this plan and belongs to a + follow-up CI-hardening PR. The manual gate is sufficient for the v0.4.0 release; CI integration is the next + compounding step. +- **R14 audit-profile choice** (`posix-utility` vs alternatives): U25 audits both scripts under `posix-utility`. If + either script's primary surface turns out to be more idiomatic under a different profile (unlikely — both are + stdin-quiet, stdout-emitting, no TUI), the audit-profile selection note in `references/audit-profile-selection.md` + (U6) is the authoritative source. Pick is empirically resolved at U25 implementation time; if a different profile is + needed, the U25 verification section captures the rationale. + +--- + +## Implementation Units + +Units are grouped by phase (= PR). Each unit is a landable atomic commit within its PR. U-IDs are stable; reordering or +splitting preserves them. + +### Phase 1 — PR 1: Skill-side determinism (R1–R9) + +- U1. **R9: SKILL.md restructure + getting-started.md dedup** + +**Goal:** Reorder SKILL.md so the first runnable command is `anc audit` (not `bash bin/check-update`). Eliminate +cross-file duplication (install block, four-step loop, parallel index tables). Land the new section skeleton so U2–U9 +fill in their own slots without re-homing. + +**Requirements:** R9. + +**Dependencies:** None. (Runs first per OQ-origin-#4.) + +**Files:** + +- Modify: `SKILL.md` +- Modify: `getting-started.md` + +**Approach:** + +- New SKILL.md section order: Preamble → `## Quick Start` (placeholder for U5's inline scorecard sample + the canonical + `anc audit --output json . > scorecard.json` line) → `## Subcommand index` (placeholder for U21) → `## Anc contract` + (placeholder for U4) → `## The anc loop` (conceptual four steps; U7 folds in termination rule, U8 folds in spec-skew) + → `## The seven principles` index → `## Audit profile selection` (placeholder for U6) → `## Common situations` + (one-paragraph pointer to `references/runbook.md`, U9 + U21 fill) → Pointer block → `## Update-check` (demoted to + one-paragraph footnote pointing at `references/update-check.md`) → Sources. +- Delete the duplicated install block in SKILL.md `## Compliance auditing` (lives in `getting-started.md` already). +- Delete the duplicated four-step loop where SKILL.md and `getting-started.md` both walk it. Canonical homes: SKILL.md + for the conceptual loop; `getting-started.md` for the runnable bash recipe. +- Merge SKILL.md's `## Implementation guidance` pointer table and `getting-started.md`'s `## Where things live` table + into one canonical table in `getting-started.md`. SKILL.md keeps a short pointer paragraph. +- Keep SKILL.md under 200 lines after the cleanup (origin R9 acceptance). + +**Patterns to follow:** Existing SKILL.md tone and table formatting (kept consistent with `agentnative-spec` skill). +`getting-started.md`'s "Where things live" table format. + +**Test scenarios:** + +- Verification: `grep -c 'brew install brettdavies/tap/agentnative' SKILL.md getting-started.md` returns 0 in + `SKILL.md`, ≥1 in `getting-started.md` (origin R9 acceptance). +- Verification: `grep -c 'anc skill install claude_code' SKILL.md getting-started.md` returns hits in exactly one file + (`getting-started.md`). +- Verification: First runnable command in SKILL.md (first fenced block after the preamble) is `anc audit`, not `bash + bin/check-update`. +- Verification: `wc -l SKILL.md` ≤ 200 after the full PR 1 lands (re-checked at end of PR 1, not at U1 alone since U2–U9 + add ~70 lines). +- Verification: `## Update-check` is no longer the second top-level heading; appears near the bottom. +- Verification: `markdownlint-cli2 SKILL.md getting-started.md` passes. + +**Verification:** Reordered SKILL.md presents Quick Start as section #2; install + four-step loop appear in exactly one +canonical location each; markdownlint clean. + +--- + +- U2. **R8: SKILL.md `allowed-tools` frontmatter** + +**Goal:** Add `allowed-tools: Bash(anc *), Read` to SKILL.md frontmatter so canonical `anc audit` invocations don't +trigger permission prompts on hosts that honor the field. + +**Requirements:** R8. + +**Dependencies:** U1 (so U2's frontmatter edit doesn't conflict with U1's body restructure). + +**Files:** + +- Modify: `SKILL.md` (frontmatter only) + +**Approach:** Single-line addition to YAML frontmatter. Place after `description:` and before the closing `---`. Use the +exact string `Bash(anc *), Read` so Claude Code's matcher accepts both bare `anc` and any `anc ` invocation. + +**Patterns to follow:** Skills 2.0 frontmatter convention surfaced in +`~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — `allowed-tools` is +advisory in interactive mode, enforced under headless `claude -p`. + +**Test scenarios:** + +- Verification: `grep -A1 '^name: agent-native-cli$' SKILL.md` shows the description; `grep '^allowed-tools:' SKILL.md` + returns one match. +- Verification: YAML frontmatter parses (`python3 -c "import yaml,sys; + yaml.safe_load(open('SKILL.md').read().split('---')[1])"`). + +**Verification:** Frontmatter is well-formed; `allowed-tools` line present and exactly matches the canonical string. + +--- + +- U3. **R5: anc install precondition replaces "First action: update check"** + +**Goal:** First action in any session is verifying `anc` is on `PATH`. Update-check (which is for the skill bundle, not +`anc`) demotes to a referenced sub-flow. + +**Requirements:** R5. + +**Dependencies:** U1 (uses U1's restructured section layout — `## Update-check` as bottom footnote). + +**Files:** + +- Modify: `SKILL.md` (replace current `## First action: update check` content; add precondition pseudocode) +- Modify: `references/update-check.md` (only if U1 didn't already absorb the demoted bash; otherwise unchanged) + +**Approach:** + +- Replace SKILL.md's current `## First action: update check` body with: a one-paragraph "anc precondition" + the + pseudocode block from origin R5 (`if ! command -v anc; then prompt user; exit; fi`) + a one-line pointer to the + demoted update-check footnote. +- Pseudocode is markdown-fenced shell-pseudo (not literal bash), so agents read it as logic, not as a command to execute + verbatim. Origin's exact pseudocode shape preserved. +- The demoted update-check section runs the existing `bash bin/check-update` after the precondition gate clears. + +**Patterns to follow:** Existing `references/update-check.md` voice + structure for the demoted footnote. + +**Test scenarios:** + +- Verification: `grep -B2 -A8 '^## ' SKILL.md | head` shows the install precondition section appears before any other + procedural section. +- Verification: SKILL.md contains the `command -v anc` pseudocode and the brew + cargo install fallback options. +- Verification: Update-check guidance in SKILL.md is a single paragraph referencing `references/update-check.md` (no + full procedural body). + +**Verification:** Cold-start agent reading SKILL.md top-to-bottom encounters the anc precondition before any `anc audit` +invocation. + +--- + +- U4. **R2: `## Anc contract` section in SKILL.md** + +**Goal:** Document `anc`'s observable contract once, so agents stop inferring it from prose. Output flag policy, exit +codes (interim), `results[].status` enum, schema-version pin, stable-vs-noise subtree. + +**Requirements:** R2. + +**Dependencies:** U1 (placeholder section exists). Soft-dep on U5 (the inline scorecard sample lands in adjacent Quick +Start; cross-link). + +**Files:** + +- Modify: `SKILL.md` (add `## Anc contract` body, ~25–40 lines per origin estimate) + +**Approach:** + +- **Output flag policy.** One paragraph: agents always pass `--output json`; `text` is for humans (and appends a badge + embed hint after the summary line when `badge.eligible == true`). +- **Exit codes.** Three-row table covering observed `anc 0.3.0` behavior: `0` = fully clean (no warn/fail/skip), `1` = + warn/skip present with `summary.fail == 0`, `2` = `summary.fail > 0` **OR** invocation error (missing path, bad flag, + unknown command). The `2` slot is overloaded — same exit code for "tool gated correctly with at least one fail" and + "tool failed to run." Imperative gating rule: **agents writing CI gates must switch on `summary.fail`, not on `$?`**, + since `$? == 2` cannot distinguish a real fail from an invocation error today. Footnote: "Interim contract — sibling + brief A3 (`docs/brainstorms/2026-05-01-002`) will split the slots so `1` means `fail > 0` and `2` is reserved for + invocation errors. Until then, treat `$?` as advisory; the gating signal is `summary.fail`." +- **`results[].status` enum.** Bulleted list naming all five values: `pass`, `warn`, `fail`, `skip`, `error`. One-line + gloss per value. +- **Schema-version pin.** Imperative paragraph: assert `envelope.schema_version == "0.5"` before parsing. If the + assertion fails, do not fall back to silent parse — fail explicit and prompt the user to upgrade `anc` or this skill + bundle. Footnote naming sibling brief A1 as the durable additive replacement (`anc.schema_version` proposed). +- **Stable vs noise.** Two-list paragraph. Stable-for-CI-diffing: `summary.*`, `coverage_summary.*`, `results[*].{id, + status, evidence}`, `audience`, `audit_profile`, `badge.eligible`, `badge.score_pct`. Timestamp/run-noise (don't + diff): `run.started_at`, `run.duration_ms`, `run.invocation`, `tool.version`, `anc.version`. Sibling brief A5 will + eventually offer `--stable` flag. + +**Patterns to follow:** Stable-vs-noise framing from +`~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md`. + +**Test scenarios:** + +- Verification: `grep -c '^## Anc contract' SKILL.md` returns 1. +- Verification: Section enumerates all five status values (`pass`, `warn`, `fail`, `skip`, `error`). +- Verification: Section names `schema_version` as the pin path (not `tool.schema_version` or `anc.schema_version`). +- Verification: Exit-code table has three rows, footnoted as interim with cross-link to sibling brief. +- Verification: Exit-code table is empirically validated at unit time by running `anc audit --output json --command + ` against at least three commands covering `summary.fail == 0` and `summary.fail > 0` cases (e.g. `rg`, `cat`, + `echo`); the documented `$?` value matches observed behavior, the imperative `gate on summary.fail, not $?` line is + present. +- Verification: The schema-pin example (`envelope.schema_version == "0.5"`) matches the live `anc 0.3.0` value. + +**Verification:** An agent writing a CI gate that fails on regression can do so without guessing — exit codes +documented, status enum complete, schema-pin path concrete, stable subtree enumerated. + +--- + +- U5. **R1: Three-placement worked scorecard samples** + +**Goal:** Three coordinated samples sized to their location: SKILL.md inline (~12–15 lines top-level shape), +`getting-started.md` inline (~3–5 lines invocation→output), `references/scorecard-shape.md` (~50–60 lines exhaustive, +every top-level field, one `results[]` entry per status value, every audit-profile category, full metadata). All taken +from a live `anc 0.3.0 check --output json` run, byte-faithful where possible. + +**Requirements:** R1. + +**Dependencies:** U1 (Quick Start placeholder), U4 (links to anc contract section). Soft-dep on U6 (audit-profile +decision table is adjacent). + +**Files:** + +- Create: `references/scorecard-shape.md` +- Modify: `SKILL.md` (Quick Start fenced block) +- Modify: `getting-started.md` ("existing CLI" loop section, after the canonical `anc audit --output json . > + scorecard.json` line) + +**Approach:** + +- **`references/scorecard-shape.md` (exhaustive).** One H1 + short preamble + a single fenced JSON block of the full + envelope with every top-level field populated. Use a synthesized but realistic envelope (start from `anc audit + --output json --command rg`, then construct extra `results[]` entries to cover all five status values, all four + audit-profile categories, and an `audience: agent-optimized` example). Below the JSON block, a per-field gloss table: + `field path → type → semantics → stable for CI?`. Note that `audience` and `audience_reason` are top-level (not inside + `audit_profile`). +- **SKILL.md inline (~15 lines).** Truncated envelope showing only `summary`, `coverage_summary`, `badge`, and one + `results[]` entry. Placed immediately after the Quick Start `anc audit ...` line. Comment ellipses (`/* ... */`) where + fields are elided. +- **`getting-started.md` inline (~5 lines).** Even shorter — just `coverage_summary` + `badge.eligible` + + `badge.embed_markdown`. Placed right after the `anc audit --output json . > scorecard.json` recipe. +- **Field-name discipline.** All three samples use `results[].id` (the actual envelope field). The per-field gloss table + in `references/scorecard-shape.md` carries an explicit "**audit id vs requirement id**" note: envelope `results[].id` + is an AUDIT identifier (`p3-help`, `p1-flag-existence`, `p6-sigpipe`); spec frontmatter `requirements[].id` is a + REQUIREMENT identifier (`p1-must-no-interactive`, `p2-must-output-flag`). The two are different namespaces. Each + principle file's body prose maps audit ids to the requirements they verify (e.g. p6's "Measured by audit IDs + `p6-sigpipe`, …"). To resolve a finding's spec text, read principle prose, not frontmatter alone. Sibling brief A2 + (`anc explain `) is the durable resolver once it ships. +- **Live envelope capture at unit time.** Implementer runs `anc audit --output json --command rg` at U5 implementation + time and uses that output as the byte-faithful base for the exhaustive sample. Do not rely on a snapshot from this + plan — `anc` may have moved between plan-write and unit-implementation; live capture is the freshness guarantee. +- **`audience` semantic capture.** Before the per-field gloss table goes final, run `anc audit --output json` against at + least four targets covering the observed `audience` values (sampled today: `agent-optimized`; null cases also + observed). Enumerate the value set in the gloss row and document when `audience_reason` is present (today: when + `audience == null`). If the value space is open-ended or not stable enough to enumerate, the gloss row says so + explicitly rather than implying enumeration completeness. + +**Patterns to follow:** `references/update-check.md`'s preamble + fenced-block voice. Origin R1 sample-sizing rules. + +**Test scenarios:** + +- Verification: `references/scorecard-shape.md` JSON block parses (`python3 -c "import json; + json.load(open('references/scorecard-shape.md'))"` after extracting the block, or `python3 + scripts/extract-fenced-json.py references/scorecard-shape.md`). +- Verification: Exhaustive sample contains at least one `results[]` entry per `status` value (`pass`, `warn`, `fail`, + `skip`, `error`). +- Verification: `audit_profile` is a scalar field — only one value per envelope. The exhaustive sample shows ONE + `audit_profile` value plus a prose annotation (under the per-field gloss row) enumerating all four possible values + (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal`) and noting the field is set per-invocation by + `--audit-profile `. +- Verification: Per-field gloss row for `audience` enumerates the observed value set (sampled `agent-optimized`; null + cases) and notes when `audience_reason` is populated. If the value space is intentionally open, the row says so. +- Verification: All three samples use `id` (not `requirement_id`) for result entries. +- Verification: Field path table in `references/scorecard-shape.md` enumerates `summary, coverage_summary, badge, + results, audit_profile, audience, audience_reason, tool, anc, run, target, schema_version, spec_version` (13 top-level + fields). +- Verification: SKILL.md inline sample fits in ~15 lines. +- Verification: `getting-started.md` inline sample fits in ~5 lines. +- Covers AE: An agent reading only `references/scorecard-shape.md` can write a parser handling all five status values + and all four audit-profile categories without reading `anc` source (origin R1 acceptance). + +**Verification:** Three samples present at three locations; each sized for its job; all field names match `anc 0.3.0` +reality; JSON parses; exhaustive sample is the canonical one for parser authors. + +--- + +- U6. **R3: `--audit-profile` decision table + extended reference** + +**Goal:** A 4-row decision table in SKILL.md (one row per profile) + an extended `references/audit-profile-selection.md` +covering the hybrid-tool rule. + +**Requirements:** R3. + +**Dependencies:** U1 (placeholder section). + +**Files:** + +- Modify: `SKILL.md` (`## Audit profile selection` section, ~10 lines) +- Create: `references/audit-profile-selection.md` (~40 lines) + +**Approach:** + +- **SKILL.md table.** Four rows: `human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` (reserved). Columns: + when to pick (one-line rule), example tool, what gets suppressed. Glosses paraphrased from `anc audit --help`'s value + list — but agents are pointed at `--help` for the authoritative description. +- **SKILL.md `### Worked examples` sub-section** (added immediately under the 4-row table, ~10 lines). Three concrete + hybrid-tool worked examples with explicit profile picks, addressing the determinism-acceptance concern that one-line + rules in 4 rows won't converge agents on hybrid cases: +- *Example 1 — `lazygit` (pure TUI):* pick `human-tui`. Reason: the binary's primary entry-point is the interactive TTY + interface; no batch / stdin-piped mode exists. +- *Example 2 — `cat` with optional TUI mode:* pick `posix-utility`. Reason: stdin-primary is the documented main use; + any TUI rendering is a secondary surface, scope-out per the primary-entry-point rule. +- *Example 3 — A diagnostic CLI with one TUI rendering subcommand (e.g. `mytool dashboard`) and otherwise read-only + introspection:* pick `diagnostic-only`. Reason: the tool's documented main use is read-only diagnosis; the dashboard + subcommand is a secondary surface, scope-out and document the suppression in the README's Limitations section. +- **`references/audit-profile-selection.md` (extended).** Restate the four profiles with deeper one-paragraph + descriptions, then a "**Hybrid project rule**" sub-section: when a tool mixes a TUI-rendering subcommand with a + stdin-piped batch mode (or a Rust binary with shell-helper subcommands), the rule is: scope the audit profile to the + primary entry-point and leave secondary surfaces uncovered. Cross-link to sibling brief A6 (proposed `--audit-profile + =` repeatable flag) as the durable composition path. +- "Three different agents reading the table for the same tool pick the same profile" (origin R3 acceptance) — frame the + rule deterministically: "primary entry-point" defined as the subcommand documented as the tool's main use in its + README, or the bare invocation behavior if no README clarifies. The worked-examples sub-section is the empirical + determinism test — three different agents reading the table + the three worked examples for a fourth hybrid tool + should converge on the same pick. + +**Patterns to follow:** Existing `references/framework-idioms-other-languages.md` table-with-paragraph-context style. + +**Test scenarios:** + +- Verification: SKILL.md `## Audit profile selection` section contains exactly four rows (one per profile) plus a `### + Worked examples` sub-section with three hybrid-tool examples and explicit profile picks. +- Verification: `references/audit-profile-selection.md` contains a `### Hybrid project rule` (or equivalent) + sub-section. +- Verification: Cross-link to sibling brief A6 present in the extended file. +- Verification: The four profile names in the table match `anc audit --help`'s `--audit-profile` value list + (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal`). +- Verification (empirical determinism): give the SKILL.md `## Audit profile selection` section (table + worked examples) + to 3 agents on a fourth hybrid tool not in the worked examples (e.g. `nvtop` — TUI by default but supports stdin-piped + data); record their picks. If any disagreement, revise the worked examples or the rule before merge. + +**Verification:** Decision table renders cleanly; hybrid-tool rule is explicit and deterministic; cross-links hold. + +--- + +- U7. **R4: Loop termination rule for warn / should** + +**Goal:** Folded into `## The anc loop` (placed by U1): one paragraph explicitly stating that `must` violations gate the +loop, `warn`s are advisory, stop iterating when `badge.eligible == true`. + +**Requirements:** R4. + +**Dependencies:** U1. + +**Files:** + +- Modify: `SKILL.md` (`## The anc loop` step 3 / 4 boundary) + +**Approach:** + +- One paragraph (~5–10 lines) inserted at the end of step 3 (Re-audit) or as a short coda before step 4 (Claim the + badge). Three bullets: (1) `must` gates the loop — continue until `must.verified == must.total`. (2) `warn`s are + advisory — once `badge.eligible == true` (≥80%), stop iterating; do not push warns to pass unless the user explicitly + asks. (3) `error` (the run-failure status, distinct from `fail`) means re-run — see runbook (U9). + +**Patterns to follow:** Existing imperative voice in `## The anc loop`. + +**Test scenarios:** + +- Verification: `## The anc loop` section contains the words `advisory` and `badge.eligible` near step 3 / 4. +- Verification: The termination rule explicitly distinguishes `error` (run-failure) from `fail` (gating-violation). + +**Verification:** An agent in "fix mode" reading SKILL.md stops when the badge clears, not when every warn clears. + +--- + +- U8. **R6: Spec-skew fallback paragraph** + +**Goal:** Step 2 of `## The anc loop` (Fix.) carries a short paragraph or table row explaining the **two** failure modes +agents hit when looking up a finding's `id`: (a) the namespace mismatch (envelope `id` is an audit id, not a requirement +id), and (b) actual spec drift when the bundle's vendored `spec/principles/` is older than `anc`. + +**Requirements:** R6. + +**Dependencies:** U1, U4 (the check-id-vs-requirement-id namespace note from U4 is referenced). + +**Files:** + +- Modify: `SKILL.md` (`## The anc loop` step 2 — Fix.) + +**Approach:** + +- ~8–12 line paragraph inserted at the end of step 2. Two-part structure: +- **Part A — namespace, not skew (always check this first).** The envelope's `results[].id` is an AUDIT id (e.g. + `p3-help`, `p1-flag-existence`, `p6-sigpipe`). The spec's `requirements[].id` is a REQUIREMENT id (e.g. + `p1-must-no-interactive`, `p2-must-output-flag`). They are different namespaces. To resolve the spec text the check + references, read the matching principle file's body prose (e.g. `spec/principles/p6-*.md`'s "Measured by audit IDs + `p6-sigpipe`, …" mapping line) — NOT the principle's frontmatter `requirements[]` block alone. If `anc explain ` + is available (sibling brief A2), prefer it as the authoritative resolver. +- **Part B — actual spec skew (only if Part A's principle file doesn't reference the audit id).** (1) Likely cause — + `anc` shipped against a newer spec than this bundle vendors. (2) Interim fix — re-run `scripts/sync-spec.sh` to + refresh the vendored spec, or fetch the missing `spec/principles/p-*.md` from `agentnative` `main`. (3) Non-fix — + do not hallucinate a spec definition; better to surface the gap to the user. + +**Patterns to follow:** Existing `## The anc loop` step prose voice; `references/update-check.md` voice for the +cross-link footnote. + +**Test scenarios:** + +- Verification: SKILL.md step 2 contains both `scripts/sync-spec.sh` (interim fix) and the cross-link to sibling brief + A2 (`anc explain`). +- Verification: The paragraph explicitly says "do not hallucinate" or equivalent (origin R6 non-fix bullet). +- Verification: Part A (namespace) appears before Part B (skew) and uses concrete examples — `p3-help` as an audit id, + `p1-must-no-interactive` as a requirement id — to make the namespace distinction unambiguous. +- Verification: The paragraph names "principle file body prose" (NOT "frontmatter alone") as the resolution path. + +**Verification:** An agent hitting the spec-skew case has a deterministic next step that isn't "read source". + +--- + +- U9. **R7: `references/runbook.md` (common situations)** + +**Goal:** A new reference file with one-paragraph entries (≤4 lines each) per common situational dead-end. Linked from +SKILL.md `## Common situations` (the placeholder U1 created). + +**Requirements:** R7. + +**Dependencies:** U1 (SKILL.md pointer in place), U4 (anc contract referenced from runbook entries). + +**Files:** + +- Create: `references/runbook.md` +- Modify: `SKILL.md` (`## Common situations` paragraph, links to runbook) + +**Approach:** + +- Five entries from origin R7: + +1. **`badge.embed_markdown` placement.** Default convention: top of README, after the H1 title, alongside CI badges. + Override only if `anc.dev/badge` publishes a different convention (check the page; if it doesn't enumerate placement, + this entry's default applies and the proposal mirrors to sibling brief A4). +2. **Should I commit `scorecard.json`?** Default no (artifact, regenerable from `anc audit`). Override only for CI + gating snapshots. If you commit, gitignore `run.*` fields by post-processing or use the eventual `anc audit --stable` + (sibling brief A5). +3. **`anc audit .` vs `--binary` vs `--source`.** `anc audit .` runs both source and behavioral analysis; `--binary` + skips source (use when scoring a pre-built binary that isn't in this repo); `--source` skips behavioral (use when + source-only feedback is wanted, e.g. PR review on a code-only diff). +4. **`anc` panics or returns malformed JSON.** Pointer to `` with + the `anc --version` + invocation. Do not retry blind; retrying a panicking `anc` will produce the same panic with + timestamp churn and waste agent time. +5. **`anc skill install ` host not in registry.** Cross-link to `getting-started.md`'s manual `git clone --depth + 1` fallback. + +- Each entry: ≤4 lines body + a one-line "see also" pointer to the relevant origin requirement, sibling brief item, or + other reference file. Goal is fast index, not deep prose. + +**Patterns to follow:** `references/update-check.md` short-section voice; `getting-started.md` Q&A table for +cross-links. + +**Test scenarios:** + +- Verification: `references/runbook.md` contains all five entries; each entry body is ≤4 lines (excluding heading and + "see also" pointer). +- Verification: SKILL.md `## Common situations` section is one paragraph containing the link to `references/runbook.md`. +- Verification: Markdownlint clean; cross-links resolve (no broken `[text](path)` pointers). +- Verification: Entry 1's badge-placement default cross-references `anc.dev/badge` and sibling brief A4. + +**Verification:** Agents stop generating issues that ask FAQs already covered (origin R7 acceptance). + +--- + +- U21. **R13: Subcommand index + target-resolution runbook entry** + +**Goal:** Document the full `anc` subcommand surface in SKILL.md so agents stop punting to `anc --help` for basic +discovery; expand `references/runbook.md`'s target-resolution entry to cover all four target modes (`.`, `--binary`, +`--source`, `--command `) with resolution rules. + +**Requirements:** R13. + +**Dependencies:** U1 (placeholder `## Subcommand index` section in place), U9 (`references/runbook.md` exists; U21 +expands its target-resolution entry). + +**Files:** + +- Modify: `SKILL.md` (`## Subcommand index` section, ~10 lines) +- Modify: `references/runbook.md` (replace U9's entry 3 with an expanded version, ~10–15 lines) + +**Approach:** + +- **SKILL.md `## Subcommand index`.** A 5-row table placed between `## Quick Start` and `## Anc contract` (U1's section + order). Columns: subcommand, one-line description, when to use. Rows: + +1. `check` — audit a CLI for agent-readiness (the canonical workflow; default when bare `anc ` is invoked). +2. `generate` — produce build artifacts (e.g., coverage matrix); not part of the agent loop. +3. `skill install ` — install this skill bundle into a host's canonical skills directory (six hosts supported; see + `getting-started.md`). +4. `completions` — emit shell completions for bash / zsh / fish / PowerShell. +5. `help [subcommand]` — print help; equivalent to `--help`. + +- Footnote under the table: `anc ` (no subcommand) is shorthand for `anc audit ` — see `anc --help` for the + exact aliasing rules. Bare `anc` (no arguments) prints help and exits 2. +- Subcommand list **must** be verified at unit time by running `anc --help` and reconciling the table against the live + `Commands:` block. If `anc` ships a new subcommand between plan-write and U21 implementation, add it (or trim) so the + table stays current. +- **`references/runbook.md` target-resolution entry (replaces U9's entry 3).** Four-mode coverage: +- **Project mode** (`anc audit .`) — runs both source analysis (Rust-only today) and behavioral audits. Default when a + path argument is given. +- **`--binary`** — runs only behavioral audits against a pre-built binary at the given path. Use when scoring a binary + that isn't built from source you control, or when source analysis would surface noise irrelevant to runtime behavior. +- **`--source`** — runs only source analysis (Rust source-tree scanning). Use when source-only feedback is wanted (e.g., + PR review on a code-only diff before a binary is rebuilt). +- **`--command `** — resolves a binary from `PATH` and runs behavioral audits against it. The cross-repo audit + mode the skill's actual audience uses (auditing someone else's tool). Behavioral checks only — `anc` doesn't analyze + source it didn't find via path resolution. +- Cross-link the entry to `anc audit --help` for the authoritative flag list. + +**Patterns to follow:** Existing `getting-started.md` "Where things live" table format for the SKILL.md subcommand +index; existing `references/update-check.md` short-section voice for the runbook entry. + +**Test scenarios:** + +- Verification: SKILL.md `## Subcommand index` section contains a 5-row table covering at minimum `check`, `generate`, + `skill install`, `completions`, `help`. +- Verification: Table content matches `anc --help` `Commands:` block at unit time (`diff` between table subcommand names + and live `anc --help` output shows zero unexpected omissions). +- Verification: `references/runbook.md` target-resolution entry names all four modes (`.`, `--binary`, `--source`, + `--command `) and gives at least one concrete use case per mode. +- Verification: `--command ` is named explicitly as the cross-repo audit mode (the skill's primary audience per + origin §Problem framing). +- Verification: SKILL.md stays under 200 lines after U21 lands (R9 ceiling preserved). +- Verification: Markdownlint clean; cross-links resolve. + +**Verification:** A first-time agent reading SKILL.md cold can enumerate the subcommand surface without running `anc +--help` and pick the right `--binary` / `--source` / `--command` mode for the audit task at hand. + +--- + +### Phase 2 — PR 2: Frontmatter polish (R10) + +- U10. **R10: SKILL.md frontmatter audit** + +**Goal:** Audit `description`, `Triggers on` keyword list, and `SKIP when` clause; document kept-as-is decisions; ensure +no `argument-hint` / `model:` / `disable-model-invocation` fields are added. + +**Requirements:** R10. + +**Dependencies:** U2 (frontmatter already touched for `allowed-tools`; U10 edits adjacent fields without conflict). + +**Files:** + +- Modify: `SKILL.md` (frontmatter `description` only) + +**Approach:** + +- **Triggers audit (resolves OQ-origin-#5).** Run `qmd query "agent-native" --collection skills` and `qmd query "anc + CLI" --collection skills` to surface what queries the skill actually fires on in real traffic. Manual review of + existing `Triggers on:` list; drop any keyword that hasn't fired discovery in practice; add ones that have come up in + skill-side questions but aren't there. Stay under 1024 chars total. +- **SKIP clause sharpening.** Audit for collisions with `compound-engineering` and `create-agent-skills`. Today's clause + excludes TUI-app authoring and non-CLI library work; sharpen to also explicitly exclude general "skill authoring" + (route to `create-agent-skills`) and general "compound engineering workflows" (route to `compound-engineering`). +- **Kept-as-is decisions** (per origin R10 in-scope notes): document in this plan (not in SKILL.md frontmatter) that + `name` stays, `argument-hint` is not added (background-knowledge skill), `model:` is not pinned (let host decide), + `disable-model-invocation` is not added (auto-load is correct). + +**Patterns to follow:** `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — +frontmatter discoverability rules. + +**Test scenarios:** + +- Verification: `description` field is ≤1024 chars (`python3 -c "import yaml; + d=yaml.safe_load(open('SKILL.md').read().split('---')[1]); assert len(d['description']) <= 1024, + len(d['description'])"`). +- Verification: `Triggers on:` keyword list in description does not contain `Slack`, `email`, or other unrelated noise + (sample sanity check). +- Verification: `SKIP when` clause names `compound-engineering` and `create-agent-skills` as explicit redirects. +- Verification: `name`, `description`, `allowed-tools` are the only frontmatter keys; no `argument-hint`, `model`, or + `disable-model-invocation`. + +**Verification:** Frontmatter passes a fresh `create-agent-skills` audit at the "well-tuned" tier; kept-as-is decisions +documented in this plan's Key Technical Decisions or U10 commit message. + +--- + +### Phase 3 — PR 3: Rust idioms consolidation (R12) + +- U11. **R12: Merge `framework-idioms.md` + `rust-clap-patterns.md` → `references/rust-clap.md`** + +**Goal:** One canonical Rust idioms reference. Adopt `framework-idioms.md`'s Free / Must / Anti-patterns scaffold; fold +every file-unique nugget from `rust-clap-patterns.md` into the appropriate principle's bucket. + +**Requirements:** R12. + +**Dependencies:** U1 (SKILL.md's pointer table merged into `getting-started.md`'s Where things live; U11 updates that +single table). Independent of all PR 1 + PR 2 unit content otherwise. + +**Files:** + +- Create: `references/rust-clap.md` +- Delete: `references/framework-idioms.md` +- Delete: `references/rust-clap-patterns.md` +- Modify: `getting-started.md` (Where things live table — point at the new file) +- Modify: `SKILL.md` (any pointer paragraph referencing the old files — update or delete) + +**Approach:** + +- One H2 per principle (P1–P7). Within each, three H3 buckets in this order: **Free from clap**, **Must implement**, + **Anti-patterns**. Per-bucket prose folds in: +- **P1 Must-implement:** the FalseyValueParser detail, four-flag `--output / --quiet / --no-interactive / --timeout` + global pattern (xurl-rs / bird precedent), `cli.no_interactive || !std::io::stdin().is_terminal()` gate. +- **P2 Must-implement:** explicitly enumerate Text / Json / **Jsonl** as the three OutputFormat variants (preserves the + file-unique Jsonl nugget). OutputConfig threading; format-aware error printing. +- **P3 Must-implement:** `after_help` (not `about` or `long_about`); env vars surface automatically via `env` attribute; + per-subcommand `after_help`. +- **P4 Must-implement:** `try_parse()` not `parse()`; `thiserror` enum; `exit_code()` method **and** `kind()` method + (preserves file-unique nugget); main-only `process::exit()` (preserves file-unique rule); sysexits 77 / 78 / 74 + mappings. +- **P5 Must-implement:** `--dry-run` on every write subcommand; `--force` / `--yes` on destructive ops; idempotent + design; read/write categorization rule. +- **P6 Must-implement:** SIGPIPE fix; `IsTerminal`; `NO_COLOR` + `TERM=dumb`; clap_complete; three-tier dependency + gating. Plus a new H3 sub-section before the Free/Must/Anti-patterns triplet: **Flags vs subcommands taxonomy** — five + bullets: subcommands for operations, nested subcommands for namespaced operations, global flags for cross-cutting + modifiers, local flags for command-specific modifiers, both flag and subcommand for universal meta-commands like + `--help` / `--version`. (Preserves the largest file-unique nugget.) +- **P7 Must-implement:** `diag!` macro; `--quiet`; `--limit` / `--max-results` with `clamp()`; `--timeout`; output + clamping with truncation diagnostic. +- Closing pointer table: link to `references/framework-idioms-other-languages.md` (preserved from `framework-idioms.md`) +- new pointer to `templates/cargo-toml.md` (which lands in PR 4). +- Concrete-example anchors: explicit references to xurl-rs and bird as exemplar codebases (preserved from + `rust-clap-patterns.md`). +- Estimated final size ~250 lines, replacing 346 lines (origin R12 estimate). + +**Patterns to follow:** `framework-idioms.md` three-bucket scaffold; `rust-clap-patterns.md` per-principle paragraph +density. + +**Test scenarios:** + +- Verification: `git status` after the unit shows `references/framework-idioms.md` and + `references/rust-clap-patterns.md` as deleted, `references/rust-clap.md` as new. +- Verification: `wc -l references/rust-clap.md` is between 200 and 300 (origin estimate ~250). +- Verification: `grep -c '^## P[1-7]' references/rust-clap.md` returns 7 (one per principle). +- Verification: P6 section contains both the **Flags vs subcommands taxonomy** sub-section and the SIGPIPE / IsTerminal + / clap_complete coverage. +- Verification: P4 section contains both `exit_code()` and `kind()` method names. +- Verification: P2 section enumerates `Text`, `Json`, `Jsonl` (all three OutputFormat variants). +- Verification: `grep -rn 'framework-idioms\.md\|rust-clap-patterns\.md' SKILL.md getting-started.md references/ + templates/` returns 0 hits (no broken links). +- Verification: `grep -n 'rust-clap\.md' SKILL.md getting-started.md` returns ≥1 hit each (new pointers in place). +- Verification: `markdownlint-cli2 references/rust-clap.md` passes. +- **Verification (token-set diff — pre-deletion gate).** Before deleting the source files, compute the symmetric + difference between merged-file content tokens and source-file content tokens, and review every loss explicitly: + +```sh +# Extract H3 + H4 headings, named identifiers (function/method/crate names), +# and code-fenced spans from both source files; compare against merged file. +extract_tokens() { + rg -oE '^####? .+|`[A-Za-z_][A-Za-z0-9_:!]+`|`#\[[^]]+\]`' "$1" \ + | sort -u +} +extract_tokens references/framework-idioms.md > /tmp/src-a.tokens +extract_tokens references/rust-clap-patterns.md > /tmp/src-b.tokens +extract_tokens references/rust-clap.md > /tmp/merged.tokens +sort -u /tmp/src-a.tokens /tmp/src-b.tokens > /tmp/source-union.tokens +# Tokens present in source union but absent in merged (with paraphrase tolerance): +comm -23 /tmp/source-union.tokens /tmp/merged.tokens > /tmp/lost.tokens +wc -l /tmp/lost.tokens +``` + + Review every line in `/tmp/lost.tokens`. For each lost token: either confirm it was an intentional drop (prose + duplication, redundant phrasing) AND record the rationale in the PR description, OR add the missing content to the + merged file before merge. Token-set difference of zero is not the goal (paraphrase tolerance is real); reviewed-loss + is the goal — every loss is a deliberate choice with a recorded reason. + +**Verification:** A diff between "what's in the merged file" and "union of the two source files" shows no information +loss other than prose duplication; SKILL.md and `getting-started.md` link to the merged file; both originals are +deleted. + +--- + +### Phase 4 — PR 4: Rust starter completion (R11.1, R11.2, R11.3) + +- U12. **R11.1: `templates/cargo-toml.md`** + +**Goal:** Drop-in `[dependencies]` block for the Rust starter. Today an agent copying `clap-main.rs` has to know to add +clap (with `derive` + `env`), serde, serde_json, thiserror, libc, clap_complete. A copy-paste TOML is faster. + +**Requirements:** R11 (item 1). + +**Dependencies:** U11 (`rust-clap.md` cross-link from this template). + +**Files:** + +- Create: `templates/cargo-toml.md` + +**Approach:** + +- File starts with a one-paragraph preamble: "Drop-in `[dependencies]` and `[features]` for a greenfield Rust CLI built + from `templates/clap-main.rs`. Append to your `Cargo.toml` after `cargo init`." +- A single fenced TOML block with: clap (`features = ["derive", "env"]`), serde (`features = ["derive"]`), serde_json, + thiserror, libc (`#[cfg(unix)]` only — annotated), clap_complete. Pin version constraints loosely (`"4"` for clap, + `"1"` for serde) per Cargo convention. +- Followed by a "Why each crate" bulleted list — one line per crate naming the principle it serves (clap → P3, serde + + serde_json → P2, thiserror → P4, libc → P6 SIGPIPE, clap_complete → P6 completions). +- Closes with a one-line pointer to `references/rust-clap.md` and `references/project-structure.md`. + +**Patterns to follow:** Existing `templates/agents-md-template.md` voice (preamble + fenced block + "why" prose). + +**Test scenarios:** + +- Verification: TOML block parses (`python3 -c "import tomllib; tomllib.loads(open('templates/cargo-toml.md').read())"` + after extracting the fenced block, or just `cargo init /tmp/test-toml && (cd /tmp/test-toml && python3 ... extract + block ... append to Cargo.toml && cargo metadata --format-version 1)`). +- Verification: Pasting the block into a fresh `cargo init` repo and running `cargo metadata` resolves all dependencies + without errors. +- Verification: The "Why each crate" list names every principle (P1–P7) at least once across crates. + +**Verification:** A new Rust CLI can be bootstrapped with three `cp` commands and a copy-paste from this file (origin +R11 acceptance). + +--- + +- U13. **R11.2: `templates/cli-tests.rs`** + +**Goal:** `assert_cmd` patterns covering P5 mutation boundaries (idempotency, dry-run, `--yes` / `--force` distinction). +Encodes P5 by construction. + +**Requirements:** R11 (item 2). + +**Dependencies:** U11. + +**Files:** + +- Create: `templates/cli-tests.rs` + +**Approach:** + +- File header comment block (matching `templates/clap-main.rs` voice): what this template demonstrates, principle + mapping, where to drop it (`tests/cli.rs` in the consumer repo). +- Test functions covering at minimum: + +1. **`--dry-run` does not mutate.** Run a write subcommand twice with `--dry-run`; assert side-effect (file existence, + exit, JSON-reported state) is unchanged. +2. **`--dry-run` reports what it would do.** Same write subcommand with `--dry-run --output json`; assert the JSON + envelope contains a `would_*` field or equivalent indicating the mutation is described. +3. **Idempotent create.** Run a create subcommand twice without `--force`; second call succeeds without error and + reports "already exists" rather than failing. +4. **`--force` overrides confirmation gate.** Run a destructive subcommand with `--no-interactive` (no `--force`); + assert exit code != 0 and a clear error in JSON. Re-run with `--force --no-interactive`; assert exit 0. +5. **`--yes` accepts implicit confirmation.** Run a destructive subcommand with `--yes`; assert exit 0 and the + destructive op completed. + +- Use `assert_cmd::Command` with `.assert().success()` / `.failure()`. Use `tempfile::tempdir` for filesystem isolation. +- `// TODO: replace 'mytool' with your binary name` markers throughout. + +**Patterns to follow:** xurl-rs and bird `tests/` conventions (cited in `references/rust-clap.md`); `assert_cmd` +documentation idioms. + +**Test scenarios:** + +- Verification: `cargo init --name dummy /tmp/dummy-cli && cp templates/cli-tests.rs /tmp/dummy-cli/tests/cli.rs && (cd + /tmp/dummy-cli && cargo check --tests)` succeeds (compile-only smoke; tests will fail without a binary, but the file + should compile). +- Verification: File contains all five test functions named distinctly. +- Verification: File header comment names P5 explicitly. + +**Verification:** Compile-only smoke passes; an author can copy this file, replace `mytool` with their binary, and have +a working P5-covering test suite. + +--- + +- U14. **R11.3 (repurposed): Document the embedded-schema extraction path** + +**Goal:** Tell agents how to obtain the canonical scorecard JSON Schema from the `anc` binary itself (via `anc generate +scorecard-schema`), rather than vendoring a hand-written copy that would drift the moment upstream bumps. Cross-link to +the upstream plan so readers see where the canonical schema lives. + +**Requirements:** R11 item 3, repurposed away from "ship a vendored schema" to "document the canonical extraction path" +after upstream `agentnative-cli` plan +[`2026-04-30-002-feat-scorecard-json-schema-plan.md`](../../agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md) +established that the schema is shipped embedded in the binary, derived from Rust types via `schemars`, exposed via `anc +generate scorecard-schema`. + +**Dependencies:** U5 (`references/scorecard-shape.md` exists), U9 (`references/runbook.md` exists), U21 (subcommand +index calls out the `generate` family of verbs). + +**Files:** + +- Modify: `references/scorecard-shape.md` (new sub-section: "Validating against the canonical schema") +- Modify: `references/runbook.md` (new entry 6: "How do I validate a scorecard against the schema?") + +**Approach:** + +- **`references/scorecard-shape.md` sub-section** (~10 lines). Three paragraphs: +- (1) The canonical, authoritative JSON Schema for the scorecard envelope is **embedded in the `anc` binary** — derived + from Rust struct definitions via `schemars`, regenerated at compile time, exposed via `anc generate scorecard-schema`. + The skill bundle does **not** vendor a copy — vendoring would duplicate and drift. +- (2) Usage: `anc generate scorecard-schema --output -` writes the schema to stdout; `anc generate scorecard-schema + --output schema.json` writes to a file; `anc generate scorecard-schema --check --output schema.json` exits non-zero if + the file disagrees with the embedded copy (CI-friendly drift gate). +- (3) Public archival URL: `https://anc.dev/scorecard-v0.5.schema.json` (published by the `agentnative-site` archive, + cross-repo plumbing per upstream plan). Use the verb when validating against the binary actually installed; use the + archive URL when pinning to a specific schema version across consumers. +- **`references/runbook.md` entry 6.** ≤4-line entry: "How do I validate a scorecard against the schema? — run `anc + generate scorecard-schema --output -` to get the schema embedded in your installed `anc`. For a published archive, + fetch `https://anc.dev/scorecard-v{X.Y}.schema.json`. Do not vendor a copy in your repo — it will drift." +- **Soft-dep on upstream verb shipping.** When U14 lands, the `anc generate scorecard-schema` verb may not yet exist in + the `anc` release the consumer has installed (upstream plan is `status: active` as of 2026-05-01). U14's prose handles + this with a one-line caveat: "If your `anc` version doesn't yet ship `generate scorecard-schema`, upgrade via `brew + upgrade brettdavies/tap/agentnative` or pin to the published archive URL." + +**Patterns to follow:** `references/update-check.md`'s short-section voice; `getting-started.md` Q&A table for +cross-links. + +**Test scenarios:** + +- Verification: `references/scorecard-shape.md` contains a sub-section naming `anc generate scorecard-schema`. +- Verification: `references/runbook.md` contains a 6th entry on schema validation. +- Verification: Cross-link to upstream plan + `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md` is present. +- Verification: Cross-link to `https://anc.dev/scorecard-v0.5.schema.json` is present. +- Verification: No file is created at `templates/scorecard-envelope.schema.json` (the vendored artifact origin R11.3 + enumerated is explicitly NOT shipped). +- Verification: Markdownlint clean. + +**Verification:** An agent needing to validate a scorecard against the canonical schema runs the verb (no skill bundle +files updated when upstream bumps the schema); the skill teaches the path, the binary owns the artifact. + +--- + +### Phase 5 — PR 5: Python starter (R11.4, R11.6a, cross-language idiom scaffolds) + +- U15. **R11.4: `templates/python-click/`** + +**Goal:** Minimal Python/Click starter (one main file + `pyproject.toml` snippet) encoding P1 SIGPIPE handling, P2 +stdout/stderr separation, P4 exit codes. Mirror of `clap-main.rs` for the Click world. + +**Requirements:** R11 (item 4). + +**Dependencies:** None within PR 5. + +**Files:** + +- Create: `templates/python-click/main.py` +- Create: `templates/python-click/pyproject.toml.snippet` + +**Approach:** + +- **`main.py`** structure: +- Header comment block matching `templates/clap-main.rs` voice — what the template demonstrates, principle mapping, + copy-paste instructions. +- SIGPIPE fix: `import signal; signal.signal(signal.SIGPIPE, signal.SIG_DFL)` in `if __name__ == '__main__':` guard, + wrapped `try / except AttributeError` for Windows compatibility. (P6.) +- Click CLI scaffold with one read subcommand (`status`) and one write subcommand (`apply`) demonstrating `--dry-run` + (`@click.option('--dry-run', is_flag=True, ...)`). (P5.) +- `--output text|json|jsonl` global option using `click.Choice`. (P2.) +- `--quiet`, `--no-interactive`, `--timeout` global options. (P1, P7.) +- Custom exit codes: `EX_USAGE = 2`, `EX_AUTH = 77`, `EX_CONFIG = 78`, `EX_IOERR = 74` (sysexits). (P4.) +- JSON-aware error printing: when `--output json` is set and an error occurs, emit `{"error": true, "kind": ..., + "message": ..., "code": ...}` to stderr; exit with the right code. +- Diagnostic output gated behind `--quiet` and `--output != text`: `def diag(msg, ctx): if not ctx.quiet and ctx.format + == 'text': click.echo(msg, err=True)`. (P7.) +- **`pyproject.toml.snippet`** is a partial pyproject.toml fragment authors append: `[project] name = "mytool"`, + `[project.scripts] mytool = "mytool.main:cli"`, `[project.dependencies] click >= 8.1, < 9`. Comment header notes the + snippet structure (drop-in for `[project]` section). + +**Patterns to follow:** `templates/clap-main.rs` structural shape (header comment + tiered sections + TODO markers); +xurl-rs and bird global-flag conventions translated to Click idioms. + +**Test scenarios:** + +- Verification: `python3 -c "import ast; ast.parse(open('templates/python-click/main.py').read())"` parses cleanly + (syntax check; doesn't execute). +- Verification: With `click` installed in a temp venv, `python3 templates/python-click/main.py --help` prints help text + and exits 0. +- Verification: `python3 templates/python-click/main.py status --output json` emits valid JSON to stdout (parses with + `json.loads`). +- Verification: `python3 templates/python-click/main.py apply --dry-run --output json` reports the would-be action and + does not mutate. +- Verification: `pyproject.toml.snippet` is valid TOML when wrapped in a minimal pyproject (`tomllib.loads(...)` + parses). + +**Verification:** A new Python CLI has a starter that encodes P1/P2/P4 by construction (origin R11 acceptance). + +--- + +- U16. **R11.6a: `templates/agents-md-template.python.md`** + +**Goal:** Python-flavoured AGENTS.md scaffold paired with U15. Mirrors the existing `templates/agents-md-template.md` +(Rust-prescriptive) for the Python world. + +**Requirements:** R11 (item 6a). + +**Dependencies:** U15. + +**Files:** + +- Create: `templates/agents-md-template.python.md` + +**Approach:** + +- Direct adaptation of `templates/agents-md-template.md` (the Rust version): +- **Build & Run** uses `pip install -e .` / `python -m mytool` instead of `cargo`. +- **Test** uses `pytest` instead of `cargo test`; covers single-test (`pytest -k name`), output (`pytest -s`). +- **Lint & Format** uses `ruff` (`ruff check . && ruff format .`). +- **Architecture** module overview names `mytool/main.py`, `mytool/cli/` (Click commands), `mytool/errors.py` (exception + classes with exit-code mapping), `mytool/output.py` (output mode + diag logic). +- **Exit Codes** table identical to Rust template (sysexits 0/1/2/77/78). +- **Conventions** translated to Click idioms: "Output goes through `OutputContext` — never naked `print()` or + `click.echo()` outside the helper"; "Errors raise typed exceptions — never `sys.exit()` except in `main`"; "`--output + text|json|jsonl`, `--quiet`, `--no-interactive`, `--timeout` are global flags via Click context". +- **Common pitfalls** translated: forgetting `signal.SIGPIPE = SIG_DFL` causes broken-pipe noise; using `print()` + directly breaks `--quiet` and `--output json`; `sys.exit()` outside main skips Click's cleanup. + +**Patterns to follow:** `templates/agents-md-template.md` (Rust) structure exactly; placeholder format identical +(`[Binary name]`, `[add modules]`, etc.). + +**Test scenarios:** + +- Verification: File contains the same eight H2 sections as the Rust template (Build & Run, Test, Lint & Format, + Architecture, Quality Bar, Conventions, Common Pitfalls, Known Debt, References). +- Verification: All commands referenced are Python-ecosystem commands, not Rust (`grep -i 'cargo\|rustc' + templates/agents-md-template.python.md` returns 0 hits). +- Verification: Markdownlint clean. + +**Verification:** A Python CLI author copies this template and gets an AGENTS.md that mirrors the Rust version's +structure with idiomatic Python tooling. + +--- + +- U17. **Cross-language idiom sub-sections in `references/framework-idioms-other-languages.md` (scaffold + Python + rows)** + +**Goal:** Add two new sub-sections to the existing four-language idioms file: "JSON envelope authoring (P2)" and +"Testing P5 mutation boundaries". Section scaffolds + Python rows land here. Go rows append in U20. JS + Ruby rows +defer. + +**Requirements:** R11 ("framework idioms to add" — JSON envelope sub-section + P5 testing section). + +**Dependencies:** U14 (the JSON envelope schema is the canonical reference for shape; the Python row points at it). + +**Files:** + +- Modify: `references/framework-idioms-other-languages.md` + +**Approach:** + +- **`## JSON envelope authoring (P2)`** — new section near the bottom of the file, before any closing pointer table. + Intro paragraph: when implementing P2, your `--output json` mode should emit envelopes that follow the canonical shape + documented in `references/scorecard-shape.md` (and validated by `anc`'s embedded schema, extractable via `anc generate + scorecard-schema --output -`). Per-language rows show idiomatic envelope construction: +- **Python (Click + stdlib `json`):** `json.dumps({"data": result, "meta": {"version": __version__}})` → `click.echo` to + stdout. +- **Go (Cobra + `encoding/json`):** appended in PR 6 (U20). Until then, this row is **absent from the shipped doc** — + not a "deferred" stub. +- **JS / Ruby:** **not present in this section.** JS / Ruby authors continue to use the existing per-framework sections + (Click, Commander, yargs, oclif, Thor) already in this file. Cross-language coverage for JS / Ruby is tracked in this + plan's `### Deferred to Follow-Up Work` and lands in a follow-up PR matched to a JS / Ruby ecosystem expert. Absent + rows ship cleaner than visible "deferred" markers — readers interpret an absent row as "the existing per-framework + section covers this" rather than "I'm missing instructions." +- **`## Testing P5 mutation boundaries`** — new section parallel to `references/rust-clap.md`'s implicit P5 testing + coverage. Same row policy: +- **Python (pytest + `subprocess`/`click.testing.CliRunner`):** show a CliRunner-based dry-run-doesn't-mutate test, + idempotent-create test, force-flag-overrides-gate test. Reference `templates/cli-tests.rs` as the Rust analog. +- **Go (testing + `os/exec`):** appended in PR 6 (U20). Absent until then. +- **JS / Ruby:** not present (deferred per the rationale above; same row policy). + +**Patterns to follow:** Existing `references/framework-idioms-other-languages.md` per-language section structure (Click +section, argparse section, Cobra section…). + +**Test scenarios:** + +- Verification: File contains both new H2 sections (`## JSON envelope authoring (P2)`, `## Testing P5 mutation + boundaries`). +- Verification: Python rows present in both sections; Go rows absent (added in PR 6 / U20); JS + Ruby rows absent (no + visible "deferred" marker in the shipped doc). +- Verification: Envelope section closes with a pointer to `references/scorecard-shape.md` (canonical envelope shape) + + `anc generate scorecard-schema` (machine-readable schema extraction). Testing section closes with a pointer to + `templates/cli-tests.rs` (Rust analog). +- Verification: Markdownlint clean. + +**Verification:** Section scaffolds in place; Python rows complete and idiomatic; explicit deferral signals where Go / +JS / Ruby content lives or will live. + +--- + +### Phase 6 — PR 6: Go starter (R11.5, R11.6b, Go cross-language rows) + +- U18. **R11.5: `templates/go-cobra/`** + +**Goal:** Minimal Go/Cobra starter mirroring `clap-main.rs`. + +**Requirements:** R11 (item 5). + +**Dependencies:** None within PR 6. + +**Files:** + +- Create: `templates/go-cobra/main.go` +- Create: `templates/go-cobra/go.mod.snippet` + +**Approach:** + +- **`main.go`** structure (Cobra-idiomatic): +- Header comment block matching `templates/clap-main.rs` voice. +- `cobra.Command` root with one read sub (`status`) and one write sub (`apply`) demonstrating `--dry-run`. (P5.) +- Persistent flags (Cobra's "global" equivalent): `--output text|json|jsonl`, `--quiet`, `--no-interactive`, + `--timeout`. (P1, P2, P7.) +- Custom exit codes via `os.Exit` only in main; subcommand handlers return errors that `main` maps to exit codes. + Mapping: 0 success, 1 command error, 2 usage error, 77 auth, 78 config, 74 IO. (P4.) +- JSON-aware error printing: when `--output json` is set, marshal errors as `{"error": true, "kind": ..., "message": + ..., "code": ...}` to stderr. +- Diagnostic output via a `diag(ctx context.Context, format string, args ...any)` helper that gates on `quiet || output + != "text"`. (P7.) +- Note: Go does NOT install a custom SIGPIPE handler (the runtime's default is acceptable), so no SIGPIPE fix needed — + header comment notes this difference vs Rust/Python. +- **`go.mod.snippet`** is a partial go.mod fragment: `module github.com/example/mytool`, `go 1.22`, `require ( + github.com/spf13/cobra v1.8.0 )`. Header comment explains the snippet is appended to `go mod init` output. + +**Patterns to follow:** `templates/clap-main.rs` structural shape; Cobra's official examples for persistent flags + +exit-code patterns; `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` for +envelope shape. + +**Test scenarios:** + +- Verification: `go vet templates/go-cobra/main.go` passes (syntax + basic semantics). +- Verification: With Cobra in a temp module, `go run templates/go-cobra/main.go --help` exits 0 and prints help text. +- Verification: `go run templates/go-cobra/main.go status --output json` emits valid JSON. +- Verification: `go run templates/go-cobra/main.go apply --dry-run --output json` reports the would-be action and does + not mutate. +- Verification: `go.mod.snippet` is valid go.mod syntax (parses with `go mod download` after stitching to a minimal + module). + +**Verification:** A new Go CLI has a starter that encodes P1/P2/P4 by construction (origin R11 acceptance, mirrored from +Python). + +--- + +- U19. **R11.6b: `templates/agents-md-template.go.md`** + +**Goal:** Go-flavoured AGENTS.md scaffold paired with U18. + +**Requirements:** R11 (item 6b). + +**Dependencies:** U18. + +**Files:** + +- Create: `templates/agents-md-template.go.md` + +**Approach:** + +- Direct adaptation of `templates/agents-md-template.md` (Rust): +- **Build & Run** uses `go build ./...`, `go run ./cmd/mytool`, `go install ./cmd/mytool`. +- **Test** uses `go test ./...`; single-test `go test -run TestName ./...`; output `go test -v ./...`. +- **Lint & Format** uses `gofmt -w .` and `go vet ./...` (and optionally `golangci-lint run`). +- **Architecture** module overview names `cmd/mytool/main.go`, `internal/cli/` (Cobra commands), `internal/errors/` + (typed errors with exit-code mapping), `internal/output/` (output mode + diag logic). +- **Exit Codes** table identical to Python/Rust templates. +- **Conventions** translated to Go idioms: "Output goes through `OutputContext` — never naked `fmt.Println`"; "Errors + are typed and bubble up — never `os.Exit` except in `main`"; "`--output`, `--quiet`, `--no-interactive`, `--timeout` + are persistent flags". +- **Common pitfalls**: missing `--no-interactive` gate before stdin read; using `fmt.Println` directly breaks `--quiet`; + `os.Exit` outside main skips deferred cleanup; forgetting `--output json` errors must be JSON. + +**Patterns to follow:** `templates/agents-md-template.md` (Rust) structure exactly. + +**Test scenarios:** + +- Verification: Same eight H2 sections as the Rust template. +- Verification: All commands are Go-ecosystem (`grep -i 'cargo\|pip\|rustc' templates/agents-md-template.go.md` returns + 0 hits). +- Verification: Markdownlint clean. + +**Verification:** A Go CLI author copies this template and gets an AGENTS.md that mirrors the Rust version's structure +with idiomatic Go tooling. + +--- + +- U20. **Go rows in `references/framework-idioms-other-languages.md`** + +**Goal:** Append Go content to the two cross-language sub-sections U17 scaffolded. + +**Requirements:** R11 (Go portion of "framework idioms to add"). + +**Dependencies:** U17 (sections + scaffolds exist), U18 (Go starter is the canonical link target). + +**Files:** + +- Modify: `references/framework-idioms-other-languages.md` + +**Approach:** + +- **JSON envelope authoring (P2) → Go row.** Replace U17's stub paragraph with a real Go example: `encoding/json` + marshalling, write to `os.Stdout`, error envelope variant on `os.Stderr`, link to `references/scorecard-shape.md` for + the canonical envelope shape and `anc generate scorecard-schema` for machine-readable schema extraction. +- **Testing P5 mutation boundaries → Go row.** Replace U17's stub paragraph with a `go test` + `os/exec` pattern: + dry-run-doesn't-mutate test, idempotent-create test, force-flag-overrides-gate test. Reference + `templates/cli-tests.rs` as the Rust analog and `templates/go-cobra/main.go` as the implementation. + +**Patterns to follow:** U17's Python row structure (parallel composition). + +**Test scenarios:** + +- Verification: Both Go rows are now substantive paragraphs (not absent, not stub-marked). +- Verification: Both rows link to `templates/go-cobra/` (implementation) or `references/scorecard-shape.md` + `anc + generate scorecard-schema` (envelope shape). +- Verification: JS + Ruby rows remain absent from the new sub-sections (per U17's row policy — no visible "deferred" + markers in the shipped doc; cross-language JS / Ruby coverage continues to be tracked in this plan's `### Deferred to + Follow-Up Work`). +- Verification: Markdownlint clean. + +**Verification:** Cross-language sub-sections cover Python + Go fully; JS + Ruby coverage remains in their existing +per-framework sections (Click / Commander / yargs / oclif / Thor) elsewhere in the file; cross-language deferral tracked +in this plan only, not in the shipped doc. + +--- + +### Phase 7 — PR 7: Deterministic-script hardening (R14) + +- U22. **R14: `bin/check-update` agent-native upgrade** + +**Goal:** Make `bin/check-update` pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true`. +Add agent-native flag surface (`--output text|json`, `--quiet`, `--no-interactive`, `--timeout `); distinguish +network-fail from up-to-date in JSON mode; preserve text-mode output and cache-file format byte-for-byte. + +**Requirements:** R14. + +**Dependencies:** None within PR 7. + +**Files:** + +- Modify: `bin/check-update` + +**Approach:** + +- **Flag surface (additive).** `--output text|json` (default `text` — current single-line token grammar preserved); + `--quiet` (suppress all output; cache-file write side-effect still runs); `--no-interactive` (no-op — script is + already non-interactive; flag added for P1 conformance and explicit semantics); `--timeout ` (overrides the + hard-coded `--max-time 5` curl timeout; min 1, max 60). +- **JSON envelope shape.** Single-line JSON to stdout, one of: `{"status": "up_to_date", "version": ""}`, + `{"status": "upgrade_available", "local": "", "remote": ""}`, `{"status": "snoozed", "remote": "", + "expires_at": }`, `{"status": "disabled"}`, `{"status": "network_unavailable", "reason": ""}`, + `{"status": "cache_only", "cached": ""}`. Stderr stays empty in success paths; non-zero curl rc no longer + collapses to silent-exit-0 in JSON mode — agents get a typed signal. +- **Default behavior unchanged.** Without flags, output is byte-faithful to today: nothing on up-to-date, nothing on + snooze/disabled/network-fail, single line `UPGRADE_AVAILABLE ` on stale. Cache file format unchanged. +- **Exit code policy.** Always 0 (degrades silently per the existing contract — periodic update check must not break + user shells if curl fails). Failure semantics surface only in JSON mode via the `status` field. Documented in the + script header as "exit code is intentionally non-meaningful for this script — gate on `status` in JSON mode" and + cross-linked to `## Anc contract`'s "switch on typed fields, never `$?`" guidance from U4. + +**Patterns to follow:** Existing `bin/check-update` voice; `templates/output-format.rs` `OutputFormat` enum shape for +the JSON envelope (Text / Json / Jsonl with format-aware printing); +`~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md`. + +**Test scenarios:** + +- Verification: `bin/check-update --help` exits 0, prints flag list including `--output`, `--quiet`, `--no-interactive`, + `--timeout`. +- Verification: `bin/check-update` (no flags) byte-matches today's output across the six branches (up-to-date, stale, + snoozed, disabled, network-fail, cache-hit) — capture today's output to fixtures pre-edit, diff against post-edit + output. +- Verification: `bin/check-update --output json` emits parseable JSON for each of the six status cases (`status` field + enumerates `up_to_date | upgrade_available | snoozed | disabled | network_unavailable | cache_only`). +- Verification: `bin/check-update --quiet` produces zero output regardless of branch; cache file is still written. +- Verification: `bin/check-update --timeout 0` rejected with non-zero exit + clear error (text mode) or `{"status": + "error", "kind": "invalid_timeout", ...}` (JSON mode); `--timeout 60` accepted. +- Verification: Cache file at `~/.cache/agent-native-cli/last-update-check` retains legacy format (`UP_TO_DATE ` / + `UPGRADE_AVAILABLE `) — verified by parsing the file after a JSON-mode invocation. +- Verification: `shellcheck bin/check-update` clean. + +**Verification:** Output contract is documented (U24 covers docs); JSON mode produces typed envelopes; default text mode +is byte-faithful to today; cache file is back-compat. + +--- + +- U23. **R14: `scripts/sync-spec.sh` agent-native upgrade** + +**Goal:** Make `scripts/sync-spec.sh` pass `anc audit --binary --audit-profile posix-utility`. Add full P1 / P2 / P5 / +P7 flag surface; move success-path status prose from stdout to stderr; emit JSON envelope for the success path; report +idempotency when no work is needed; gate destructive overwrites of dirty `spec/`. + +**Requirements:** R14. + +**Dependencies:** None within PR 7. + +**Files:** + +- Modify: `scripts/sync-spec.sh` + +**Approach:** + +- **Flag surface.** `--output text|json` (default `text`); `--quiet` (suppress stderr status; final result still emitted + in selected output mode); `--no-interactive` (script is already non-interactive; flag added for conformance); + `--timeout ` (wraps `git ls-remote` and `git clone` invocations via `timeout ...` or `git -c + http.lowSpeedTime=...`); `--dry-run` (resolve the upstream tag, report what would be vendored, do not touch `spec/`); + `--force` (override the new dirty-spec gate). +- **Stderr discipline.** Move success-path echoes ("querying $SPEC_REMOTE_URL…", "vendoring $spec_tag…", "wrote $copied + principle files") from stdout to stderr. Errors already go to stderr; this just unifies the success path. Stdout in + text mode now contains only the final result line (e.g., `synced v0.3.0 abc1234 7-files` or `already_in_sync v0.3.0`). +- **JSON envelope.** Success: `{"status": "synced", "tag": "", "sha": "", "source": "remote|local", + "files_written": , "files": ["VERSION", "CHANGELOG.md", "principles/p1-…", …]}`. No-op: `{"status": + "already_in_sync", "tag": "", "sha": ""}`. Dry-run: `{"status": "would_sync", "tag": "", "sha": + "", "files": [...]}`. Error envelopes: `{"status": "error", "kind": "remote_unreachable | no_tags | dirty_spec + | invalid_timeout", "message": "", ...}`. +- **Idempotency report.** Before extracting, compare resolved `$spec_tag` to current `$DEST_DIR/VERSION`. If equal AND + `git -C "$REPO_ROOT" diff --quiet spec/` (no local edits), emit the `already_in_sync` envelope and exit 0 without + writing. +- **Dirty-spec gate.** If `git -C "$REPO_ROOT" status --porcelain spec/` reports non-empty, refuse to overwrite without + `--force`. Error envelope: `{"status": "error", "kind": "dirty_spec", "files": [...]}`. P5 (every write op gates + destructive overwrites) verbatim. +- **`--dry-run`.** Resolve tag + enumerate files via `git ls-tree --name-only $spec_tag principles/`; emit `would_sync` + envelope; do not call `git show`, do not write to `$DEST_DIR`. +- **Exit code policy.** 0 on success / no-op / dry-run; 2 on dirty-spec without `--force` (usage error); 74 on remote + unreachable + no local fallback (sysexits IOERR); 78 on missing local tags (sysexits CONFIG); 1 on any other failure. + Documented in the script header. + +**Patterns to follow:** `templates/error-types.rs` for the `kind` enum naming; existing `sync-spec.sh` cleanup-trap + +remote-first-then-local resolution pattern; +`~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` for envelope +shape parity across success / error paths. + +**Test scenarios:** + +- Verification: `scripts/sync-spec.sh --help` exits 0, prints all six new flags. +- Verification: `scripts/sync-spec.sh` (no flags) with clean `spec/` and remote reachable produces text-mode output + matching the new contract (single human-readable result line on stdout; status prose on stderr). +- Verification: `scripts/sync-spec.sh --output json` produces parseable JSON; `status` field is one of `synced | + already_in_sync | would_sync | error`. +- Verification: `scripts/sync-spec.sh --dry-run --output json` resolves the tag without writing; `git status spec/` + unchanged after invocation. +- Verification: With dirty `spec/` (touch a principle file), `scripts/sync-spec.sh` exits 2 and emits `{"status": + "error", "kind": "dirty_spec", …}`; `--force` overrides and proceeds. +- Verification: With `SPEC_REMOTE_URL=https://invalid.example.com` and no `SPEC_ROOT`, exits 74 with `{"status": + "error", "kind": "remote_unreachable", …}`. +- Verification: Re-running against an already-vendored tag emits `already_in_sync` and writes no files (`stat -c %Y` on + `spec/VERSION` unchanged across two consecutive runs). +- Verification: `shellcheck scripts/sync-spec.sh` clean. + +**Verification:** Script passes the dogfood audit (U25); idempotency reports correctly; dirty-spec is gated; JSON +envelope shape matches the documented contract. + +--- + +- U24. **R14: Output + env-var contract documentation** + +**Goal:** Document `bin/check-update`'s and `scripts/sync-spec.sh`'s output contract (text grammar + JSON envelope per +status) and env-var overrides in the references; cross-link from SKILL.md's update-check footnote and U8's spec-skew +fallback paragraph. + +**Requirements:** R14. + +**Dependencies:** U22, U23 (the contract is what U22 / U23 actually ship; docs follow the implementation). + +**Files:** + +- Modify: `references/update-check.md` (expand to cover text grammar + JSON envelope + env vars for `bin/check-update`) +- Modify: `references/runbook.md` (new entry 7: "When and how to run `scripts/sync-spec.sh`") + +**Approach:** + +- **`references/update-check.md` expansion.** Add three sub-sections under existing content: + +1. **Output contract — text mode.** The single-line token grammar (`UPGRADE_AVAILABLE ` on stale; nothing + otherwise). Documented as a stable interface; consumers may parse via `awk '{print $1, $2, $3}'`. +2. **Output contract — JSON mode.** Full envelope shape with one example per `status` value (six total). +3. **Env-var overrides.** Three-row table: `AGENTNATIVE_SKILL_DIR`, `AGENTNATIVE_SKILL_REMOTE_URL`, + `AGENTNATIVE_SKILL_STATE_DIR` — purpose + default + when to override (testing, mirrored install). + +- **`references/runbook.md` entry 7.** ~10-line entry: when to run `sync-spec.sh` (after every `agentnative-spec` v* tag + bump; SKILL.md U8 spec-skew fallback is the in-loop trigger); env-var overrides (`SPEC_REMOTE_URL`, `SPEC_ROOT`); + `--dry-run` for previewing; `--force` for dirty-spec override; the idempotency-report behavior; cross-link to + `bin/check-update` (sibling) and `references/update-check.md`. +- **SKILL.md cross-links.** Update the `## Update-check` footnote (created in U1, refined in U3) and U8's spec-skew + fallback paragraph to mention "`--output json` is available — see `references/update-check.md`" and "see + `references/runbook.md` entry 7" respectively. + +**Patterns to follow:** `references/update-check.md` existing voice; `references/runbook.md` short-section voice from +U9. + +**Test scenarios:** + +- Verification: `references/update-check.md` enumerates all six JSON `status` values matching U22's implementation. +- Verification: `references/update-check.md` env-var table covers all three `AGENTNATIVE_SKILL_*` vars. +- Verification: `references/runbook.md` entry 7 names `--dry-run`, `--force`, `SPEC_REMOTE_URL`, `SPEC_ROOT`. +- Verification: SKILL.md update-check footnote references `references/update-check.md`'s JSON envelope; U8 spec-skew + paragraph references `references/runbook.md` entry 7. +- Verification: Markdownlint clean; cross-links resolve. + +**Verification:** An agent reading `references/update-check.md` and `references/runbook.md` cold can parse both scripts' +output and pick the right invocation flags without reading the script source. + +--- + +- U25. **R14: Dogfood gate via `anc audit --binary`** + +**Goal:** Empirically verify both scripts pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible +== true`. Document the audit invocation so contributors can re-run the gate after future edits. + +**Requirements:** R14 (acceptance gate). + +**Dependencies:** U22, U23, U24 (everything else in PR 7). + +**Files:** + +- Modify: `references/runbook.md` (new entry 8: "How do I re-run the dogfood audit?") +- Modify: `CONTRIBUTING.md` (one-paragraph note pointing at the runbook entry) + +**Approach:** + +- **Empirical gate at unit time.** Run `anc audit --binary bin/check-update --audit-profile posix-utility --output json` + and `anc audit --binary scripts/sync-spec.sh --audit-profile posix-utility --output json`. Capture both envelopes. + Assert `badge.eligible == true` for both. If either fails, the failure modes from `results[]` are the punch list — fix + in U22 / U23 and re-run before PR 7 merges. +- **`references/runbook.md` entry 8.** Document the canonical audit invocation (the exact two commands above), the + expected `badge.eligible == true` outcome, and the punch-list workflow when one fails. Note that CI integration is + deferred to a follow-up PR. +- **`CONTRIBUTING.md` update.** Add one paragraph in the existing "Before merging" or equivalent section pointing at the + runbook entry, so a contributor editing either script can't merge without re-running the gate. +- **Out of scope for this unit (deferred to follow-up CI-hardening PR):** wiring the dogfood audit into CI as a + regression gate. PR 7 ships the manual gate + documentation; CI integration is the next compounding step. + +**Patterns to follow:** `references/runbook.md` short-section voice; `CONTRIBUTING.md` existing tone. + +**Test scenarios:** + +- Verification: `anc audit --binary bin/check-update --audit-profile posix-utility --output json | jq .badge.eligible` + returns `true`. +- Verification: `anc audit --binary scripts/sync-spec.sh --audit-profile posix-utility --output json | jq + .badge.eligible` returns `true`. +- Verification: `references/runbook.md` entry 8 contains both audit commands verbatim. +- Verification: `CONTRIBUTING.md` references the runbook entry. + +**Verification:** The dogfood claim is empirically true: the skill audits CLIs against the agent-native spec; under PR 7 +the skill's own runtime scripts pass that same audit at the badge-eligible bar. + +--- + +## System-Wide Impact + +- **Interaction graph.** `SKILL.md` is the entry-point host hosts read; `getting-started.md` is its first link; + `references/*.md` are loaded on triggered read. Reordering in U1 must keep this load order intact — agents reading + SKILL.md top-to-bottom must reach Quick Start before any deeper file is referenced. +- **Error propagation.** PRs 1–6 are docs + templates; the only "errors" are markdownlint failures, broken cross-links, + and JSON Schema parse failures; each unit's verification catches them locally. **PR 7 (R14) is the exception:** U22 / + U23 modify two runtime-callable bash scripts; new failure modes are surfaced as typed JSON statuses in the new + `--output json` mode (`network_unavailable`, `dirty_spec`, `remote_unreachable`, `invalid_timeout`) and gated by + `--dry-run` / `--force` for the destructive `sync-spec.sh` overwrite path. Default text-mode behavior is byte-faithful + to today; back-compat for existing consumers and the cache file at `~/.cache/agent-native-cli/last-update-check`. +- **State lifecycle risks.** PRs 1–6 only touch git history of docs / templates. PR 7 adds two new pieces of agent- + observable state: (1) the cache file format at `~/.cache/agent-native-cli/last-update-check` is preserved as legacy + (no migration needed); (2) `spec/` overwrite is now gated by a dirty-check (`--force` required to override). Plans + live on `dev`; SKILL.md / getting-started.md / references / templates / scripts land on `dev` via PR and propagate to + `main` via `release/*` cherry-pick (per `RELEASES.md`). +- **API surface parity.** The skill's "API" is its frontmatter (`name`, `description`, `allowed-tools`) read by hosts + during discovery. U2 + U10 are the surface-touching units; U2 adds `allowed-tools`, U10 audits `description`. Both are + additive — no rename, no removal of existing fields. +- **Integration coverage.** Live `anc 0.3.0` integration is exercised by U5 (samples), U14 (schema validates real + envelope), U6 (audit-profile values match `anc audit --help`). If `anc` ships a schema bump (sibling brief A1) before + this plan lands, U4's pin path may need to update from top-level `schema_version` to `anc.schema_version` — plan + revision required at that point. +- **Unchanged invariants.** `name: agent-native-cli` (R10's documented "no" — discoverability). The seven principle + files in `spec/principles/` (vendored, not edited here). The four-step `## The anc loop` shape (U1 reorders SKILL.md + but the four steps stay; U7 + U8 fold rules into existing steps). `bin/check-update`'s default text-mode output and + cache-file format (R5 demotes its SKILL.md position; R14 adds new flags but preserves default behavior byte-for-byte). +- **Cross-repo.** Sibling brief `2026-05-01-002` carries A1 / A2 / A3 / A4 / A5 / A6 / A7 against `agentnative-cli`. + This plan's R2 / R6 / R7 reference those items as soft-dependencies but ship interim guidance independently. PR 1's + commit message naming `[A1, A2, A3, A4]` cross-references is recommended for traceability. + +--- + +## Risks & Dependencies + +| Risk | Likelihood | Impact | Mitigation | +| ----------------------------------------------------------------------------------------------------------------- | ---------- | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `anc` ships schema 0.6 (or renames `id` to `requirement_id`) mid-plan, invalidating U4 / U5 pinned values. | Low | Med | U4 footnote names the interim contract explicitly; if a bump lands, U4 + U5 need a follow-up PR. Sibling brief A1 is additive, not breaking, so the most likely shape is a new `anc.schema_version` field alongside the top-level one — both pin paths stay valid. | +| `anc.dev/badge` convention page does not exist or doesn't enumerate placement, leaving U9 entry 1 underspecified. | Med | Low | U9 entry 1's open-question note: if convention page lacks placement, U9 proposes a default (`top-of-readme` after H1) and the proposal mirrors to sibling brief A4 as a feature ask. | +| R12 merge loses a file-unique nugget that nobody notices until post-merge. | Low | Med | U11 verification specifically checks for the four enumerated nuggets (Flags-vs-subcommands taxonomy, `kind()` method, main-only `process::exit()`, Jsonl variant) **plus a token-set diff pre-deletion gate** that surfaces every lost token for explicit reviewed-loss disposition. | +| Upstream `anc generate scorecard-schema` verb (the path U14 points at) hasn't shipped when PR 4 lands. | Med | Low | U14 prose carries a one-line caveat naming the upstream plan and the upgrade path (`brew upgrade brettdavies/tap/agentnative` or pin to `https://anc.dev/scorecard-v0.5.schema.json` archive). Until the verb ships, parser authors use U5's prose form. No skill-side rework needed when the verb lands — the path is verb-name-stable. | +| SKILL.md exceeds 200 lines after PR 1 lands (R9 ceiling violated). | Low | Low | U1 verification checks line count at end of PR 1. Origin estimate is +45 lines net; budget is +54 lines, so margin exists. If exceeded, deferred guidance moves to `references/runbook.md` (U9) or extends `references/scorecard-shape.md` (U5). | +| Trigger-keyword audit (U10 / OQ-origin-#5) finds the existing keyword list is fine and U10 has no work. | Low | Low | OK — U10 commits only the kept-as-is rationale documentation in that case. Frontmatter remains unchanged. The unit still ships (one-paragraph commit). | +| PR 5 / PR 6 reviewers (Python / Go specialists) push back on starter idioms. | Med | Low | Each PR is intentionally scoped to one ecosystem so review is fast and targeted. Iterate within PR; do not block the rest of the plan. | +| Live `anc audit` envelope used by U5 examples drifts before PR 1 lands. | Low | Low | U5 verification re-runs `anc audit --output json` at unit time and checks samples against current output. If drift detected, regenerate the samples — they're cheap. | +| `bin/check-update` cache file format compat breaks under PR 7 (cache consumed across an upgrade boundary). | Low | Low | U22 keeps the cache file format unchanged (legacy `UP_TO_DATE ` / `UPGRADE_AVAILABLE `); JSON appears at output time only. Verified by post-edit cache-file inspection in U22's test scenarios. | +| `scripts/sync-spec.sh` stderr-discipline change breaks an unknown consumer that captured the success-path stdout. | Low | Low | Searched for callers; only invocation today is direct human run + this plan's U8 spec-skew fallback (which doesn't capture stdout). PR 7 changelog calls out the breaking change explicitly so any external consumer sees the migration note. | +| `anc audit --binary` against bash scripts surfaces unexpected gaps (e.g., missing P3 `after_help`, P6 SIGPIPE). | Med | Low | U25 is the empirical gate; if either script fails, U22 / U23 absorb the punch list before merge. Acceptable to iterate within PR 7. If a finding requires an `anc`-side feature (e.g., bash-script-aware audit profile), defer with a sibling-brief item and document the suppression in U25's verification note. | + +--- + +## Phased Delivery + +Seven PRs, sequenced. Each phase = one PR. Origin's handoff section is the canonical sequencing source for PRs 1–6; PR 7 +is plan-introduced (R14, the dogfood gate). All seven PRs ship in one consolidated `v0.4.0` release. + +### Phase 1 — PR 1: Skill-side determinism (R1–R9, R13) + +Lands U1–U9 + U21 (R9 first; U21 last so it can absorb U1's section skeleton and expand U9's runbook entry). One commit +per unit for review legibility. Targets `dev`. Estimated total ~280 lines added/changed across `SKILL.md`, +`getting-started.md`, and three new reference files (`runbook.md`, `scorecard-shape.md`, `audit-profile-selection.md`). +Largest PR; contains the load-bearing reorder. + +Acceptance gate: SKILL.md ≤ 200 lines, first runnable command is `anc audit`, all five `results[].status` values +enumerated in `## Anc contract`, exhaustive scorecard sample parses as JSON, exit-code table empirically matches live +`anc 0.3.0` behavior across at least three commands (one with `summary.fail == 0`, one with `summary.fail > 0`, one +invocation error), R6 spec-skew paragraph distinguishes namespace-mismatch from actual-spec-drift, **subcommand index +matches live `anc --help` `Commands:` block** (U21), **target-resolution runbook entry names all four modes (`.`, +`--binary`, `--source`, `--command `)** (U21). U14 (formerly the hand-written JSON Schema; now docs-only — pivots +to documenting the upstream `anc generate scorecard-schema` extraction path) lands in PR 4; PR 1's gate has no +schema-related criterion. + +### Phase 2 — PR 2: Frontmatter polish (R10) + +Lands U10. Tiny PR; can be folded into PR 1 if R10 stays under ~10 lines after the trigger-keyword audit. + +Acceptance gate: `description` ≤ 1024 chars; SKIP clause names `compound-engineering` and `create-agent-skills` +redirects. + +### Phase 3 — PR 3: Rust idioms consolidation (R12) + +Lands U11. Self-contained; deletes two files, creates one, updates SKILL.md and `getting-started.md` link targets. Lands +before PR 4 so the new Rust starter docs reference `references/rust-clap.md`. + +Acceptance gate: `framework-idioms.md` and `rust-clap-patterns.md` deleted; `rust-clap.md` exists, ~250 lines, all four +file-unique nuggets present; no broken links anywhere in the repo. + +### Phase 4 — PR 4: Rust starter completion + scorecard-schema extraction docs (R11.1, R11.2, R11.3-repurposed) + +Lands U12 + U13 + U14. U12 + U13 are Rust-starter additions (cargo-toml.md, cli-tests.rs); U14 is docs-only (modifies +`references/scorecard-shape.md` + `references/runbook.md` to document the upstream `anc generate scorecard-schema` +extraction path). Cross-references `references/rust-clap.md` (PR 3). + +Acceptance gate: Cargo TOML resolves dependencies; `cli-tests.rs` compiles cleanly when copied into a fresh `cargo +init`; `references/scorecard-shape.md` contains the `anc generate scorecard-schema` sub-section with cross-link to +upstream plan; `references/runbook.md` contains the schema-validation entry; **no file is created at +`templates/scorecard-envelope.schema.json`** (the vendored artifact is explicitly NOT shipped — upstream binary is +canonical). + +### Phase 5 — PR 5: Python starter (R11.4, R11.6a, cross-language scaffolds) + +Lands U15 + U16 + U17. Cross-language idiom sub-section scaffolds + Python rows ride here. + +Acceptance gate: Python starter parses + runs `--help`; pyproject snippet is valid TOML; +framework-idioms-other-languages section scaffolds present; Python rows substantive; Go / JS / Ruby rows are explicitly +deferred markers. + +### Phase 6 — PR 6: Go starter (R11.5, R11.6b, Go cross-language rows) + +Lands U18 + U19 + U20. + +Acceptance gate: Go starter `go vet` clean, runs `--help`; AGENTS.md mirrors Rust template structure; Go rows in +cross-language sub-sections substantive (no longer stubs). + +### Phase 7 — PR 7: Deterministic-script hardening (R14) + +Lands U22 + U23 + U24 + U25. The dogfood gate: makes `bin/check-update` and `scripts/sync-spec.sh` pass `anc audit +--binary --audit-profile posix-utility` with `badge.eligible == true`. Additive flag surfaces (no removals); cache-file +format unchanged; one minor breaking change in `sync-spec.sh` (success-path status prose moves from stdout to stderr, +called out in PR 7's changelog). + +Acceptance gate: `bin/check-update --help` and `scripts/sync-spec.sh --help` print the new flag surfaces; both scripts +pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true` (the empirical gate, U25); +`references/update-check.md` documents both modes' output contract and env-var overrides; `references/runbook.md` +carries entries 7 (sync-spec runbook) and 8 (re-run-the-dogfood-audit); `CONTRIBUTING.md` references the audit gate; +`shellcheck` clean on both scripts. + +--- + +## Documentation / Operational Notes + +- **CHANGELOG.** Each PR's `## Changelog` section in the body is the source of truth (per repo convention; + `scripts/generate-changelog.sh` extracts it). PR 1's changelog is user-facing-large: "SKILL.md restructure with anc + contract, scorecard samples, audit-profile selection, loop termination rule, install precondition, spec-skew fallback, + and runbook." PR 2–6 changelogs scoped per-PR. PR 7's changelog calls out two user-facing additions and one minor + breaking change: (1) `bin/check-update` and `scripts/sync-spec.sh` gain `--output json|text`, `--quiet`, + `--no-interactive`, `--timeout` flags; `sync-spec.sh` additionally gains `--dry-run` and `--force`; (2) both scripts + pass the dogfood audit (`anc audit --binary --audit-profile posix-utility`) at the badge-eligible bar; (3) **breaking + change:** `sync-spec.sh` success-path status prose ("querying…", "vendoring…", "wrote N files") moves from stdout to + stderr — consumers capturing the success-path stdout will see only the new single-line result token instead. +- **VERSION bump.** All seven PRs ship in one consolidated release: `v0.4.0` from `v0.3.0`. The release covers the R9 + reorder + R2 anc contract + R10 frontmatter + R12 Rust idioms + R11 starter templates + R13 subcommand index + R14 + deterministic-script hardening. Minor bump (additive surface, no removals beyond R12's two deleted reference files + which had explicit redirects, and PR 7's stdout→stderr re-routing in `sync-spec.sh`'s success path which is called out + as breaking in the changelog). Cut the tag after PR 7 lands and the dogfood gate (U25) is green. Per-PR commit + messages remain as the audit trail; the consolidated `RELEASES.md` entry summarizes against the seven PRs. +- **`anc skill install` consumers.** When PR 1 ships, hosts that have already installed the skill via `anc skill + install` will see the restructured SKILL.md after re-running install or `git pull`. R5's install-precondition is the + new first action — agents that cached the old "First action: update check" wording will encounter the precondition on + next session. This is intentional and correct (the precondition is more important than the bundle update-check). +- **Cross-repo.** Sibling brief PRs land independently in `agentnative-cli`. This plan does NOT block on any of them. +- **`spec/` resync.** Not required for this plan — `spec/VERSION = 0.3.0` matches `anc 0.3.0`'s `spec_version: "0.3.0"`. + If the spec moves, run `scripts/sync-spec.sh` as a separate commit before PR 1 lands. +- **Markdownlint configuration.** `.markdownlint-cli2.yaml` enforces 120-char line length. Per global instructions, do + not manually wrap markdown lines — the auto-format hook handles it. + +--- + +## Sources & References + +- **Origin document:** + [`docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md`](../brainstorms/2026-05-01-001-skill-determinism-requirements.md) +- **Sibling brief (cross-repo, anc-side):** + [`docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md`](../brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md) +- **Upstream plan (anc-side schema, the canonical artifact U14 points at):** + `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md` — derives the scorecard JSON Schema + from Rust types via `schemars`, embeds it in the binary, exposes it via `anc generate scorecard-schema`, archives + versioned URLs under `https://anc.dev/scorecard-v{X.Y}.schema.json` via the `agentnative-site` repo. This skill's U14 + pivot from "ship a vendored schema" to "document the extraction path" is grounded in this plan being the + single-source-of-truth. +- **Repo files referenced:** `SKILL.md`, `getting-started.md`, `AGENTS.md`, `references/framework-idioms.md`, + `references/framework-idioms-other-languages.md`, `references/project-structure.md`, + `references/rust-clap-patterns.md`, `references/update-check.md`, `templates/clap-main.rs`, + `templates/error-types.rs`, `templates/output-format.rs`, `templates/agents-md-template.md`, `bin/check-update`, + `scripts/sync-spec.sh`, `spec/principles/p[1-7]-*.md`, `spec/VERSION`. +- **Live `anc` integration:** `anc 0.3.0` (verified 2026-05-01); envelope at top-level keys `anc, audience, + audit_profile, badge, coverage_summary, results, run, schema_version, spec_version, summary, target, tool`; result + keys `confidence, evidence, group, id, label, layer, status`; `schema_version: "0.5"`, `spec_version: "0.3.0"`. +- **Solutions:** +- `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` +- `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` +- `~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md` +- `~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` +- `~/dev/solutions-docs/best-practices/agentnative-version-model-2026-05-01.md` +- **External:** +- [`anc.dev/badge`](https://anc.dev/badge) — badge convention reference for U9 entry 1 +- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) — sibling repo for cross-repo brief +- [`agentnative`](https://github.com/brettdavies/agentnative) — spec repo (not edited here) +- **Prior plans (this repo, dev):** +- `docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md` (bootstrap) +- `docs/plans/2026-04-28-001-feat-update-check-mechanism-plan.md` (update-check, U3 demotes its SKILL.md placement) diff --git a/getting-started.md b/getting-started.md index d876693..daab522 100644 --- a/getting-started.md +++ b/getting-started.md @@ -1,15 +1,15 @@ # Getting started This skill teaches agents how to design, build, or audit a CLI for use by other agents. The work is one of three loops; -pick the one that matches your starting point. The canonical checker is +pick the one that matches your starting point. The canonical auditor is [`anc`](https://github.com/brettdavies/agentnative-cli); the canonical principle text is in [`spec/principles/`](./spec/principles/). ## You have an existing CLI ```bash -# 1. Run the checker. -anc check --output json . > scorecard.json +# 1. Run the auditor. +anc audit --output json . > scorecard.json # 2. For each FAIL, look up the cited requirement_id (e.g. `p1-must-no-interactive`) # in spec/principles/p-*.md — frontmatter `requirements[]`. @@ -18,14 +18,14 @@ anc check --output json . > scorecard.json # references/rust-clap-patterns.md (Rust/clap) # references/framework-idioms.md (Rust idioms) # references/framework-idioms-other-languages.md (Click, argparse, Cobra, Commander, yargs, oclif, Thor) -# Re-run `anc check` until `summary.fail == 0` and +# Re-run `anc audit` until `summary.fail == 0` and # `coverage_summary.must.verified == coverage_summary.must.total`. # 4. Claim the badge once `badge.eligible == true` (≥80%). # Copy `badge.embed_markdown` from the JSON scorecard into the project's README. ``` -Useful flags: `--principle N` to focus on one principle, `--audit-profile ` to suppress checks that don't +Useful flags: `--principle N` to focus on one principle, `--audit-profile ` to suppress audits that don't apply (`human-tui` for TUIs that legitimately intercept the TTY, `posix-utility` for cat/sed/awk-style stdin-primary tools, `diagnostic-only` for read-only tools, `file-traversal` reserved), `--binary` / `--source` to scope. @@ -43,12 +43,12 @@ cp /templates/agents-md-template.md AGENTS.md # fill placeholders # Add to Cargo.toml: clap (derive, env), serde, serde_json, thiserror, # libc (SIGPIPE), clap_complete. See references/project-structure.md. -anc check --output json # run continuously as you build +anc audit --output json # run continuously as you build ``` ## You're building in another language -`anc`'s source-analysis layer is Rust-only; its behavioral layer (`anc check --command `) runs against any +`anc`'s source-analysis layer is Rust-only; its behavioral layer (`anc audit --command `) runs against any compiled binary on `PATH`. Read `spec/principles/p1-*.md` through `p7-*.md` for the language-agnostic requirements, and `references/framework-idioms-other-languages.md` for per-framework idioms. diff --git a/references/project-structure.md b/references/project-structure.md index d958e97..6c62af9 100644 --- a/references/project-structure.md +++ b/references/project-structure.md @@ -54,7 +54,7 @@ variants as needed (e.g., RateLimit, NotFound, Timeout) with appropriate exit co Use `thiserror` for the error enum derive. The `#[error("...")]` attribute defines Display, and `#[source]` chains cause errors. For projects that prefer minimal dependencies, implement `std::fmt::Display` and `std::error::Error` manually -instead. The compliance checker accepts either approach. +instead. The compliance auditor accepts either approach. Implement a format-aware `print()` method on the error enum that accepts `&OutputConfig` and writes to stderr. When the output format is Json or Jsonl, serialize the error as a JSON object with error, kind, message, and exit_code fields. diff --git a/scripts/sync-spec.sh b/scripts/sync-spec.sh index 29fe72b..d129575 100755 --- a/scripts/sync-spec.sh +++ b/scripts/sync-spec.sh @@ -2,10 +2,10 @@ # Vendor agentnative-spec into spec/. # # Default behavior: resolves the latest v* tag of agentnative-spec via the -# GitHub API and pulls VERSION, CHANGELOG.md, and principles/p*-*.md at -# that tag. The vendored tree ships as part of the skill bundle so -# consuming agents carry the canonical principle text alongside the skill -# metadata. +# GitHub API and pulls VERSION, CHANGELOG.md, principles/p*-*.md, and +# principles/scoring.md at that tag. The vendored tree ships as part of +# the skill bundle so consuming agents carry the canonical principle text +# alongside the skill metadata. # # Override behavior (--ref / SPEC_REF): vendors an explicit branch HEAD, # tag, or commit SHA instead of the latest v* tag. Use for cross-repo @@ -238,7 +238,11 @@ if [[ "$copied" -eq 0 ]]; then exit 1 fi -echo "wrote $copied principle file(s) to $DEST_PRINCIPLES" +# scoring.md is a named spec doc (leaderboard ranking methodology), not a +# p*-*.md principle, so it's fetched explicitly rather than via the glob. +fetch_file "principles/scoring.md" "$DEST_PRINCIPLES/scoring.md" + +echo "wrote $copied principle file(s) + scoring.md to $DEST_PRINCIPLES" echo "wrote VERSION + CHANGELOG.md to $DEST_DIR" echo echo "next: review \`git diff\` for unexpected changes, then commit." diff --git a/spec/CHANGELOG.md b/spec/CHANGELOG.md index d091c4b..eb78d41 100644 --- a/spec/CHANGELOG.md +++ b/spec/CHANGELOG.md @@ -1,8 +1,67 @@ # Changelog -All notable changes to this repository are documented here — governance, validator, release infrastructure, README, decision records. +All notable changes to this repository are documented here: governance, validator, release infrastructure, README, decision records. -Changes to the standard itself — principle MUST/SHOULD/MAY tier moves, requirement IDs added/removed/renamed, applicability shifts — are tracked per-principle in `principles/p*-*.md` via the `last-revised:` calver frontmatter field and the `## Pressure test notes` section appended to each file. +Changes to the standard itself (principle MUST/SHOULD/MAY tier moves, requirement IDs added/removed/renamed, applicability shifts) are tracked per-principle in `principles/p*-*.md` via the `last-revised:` calver frontmatter field and the `## Pressure test notes` section appended to each file. + +## [0.4.0] - 2026-05-07 + +### Added + +- P1 MUST `p1-must-secret-non-leaky-path` (conditional on CLI accepting secret material): sensitive inputs are readable via stdin or a `--*-file` flag; flag-value and env-var inputs MAY exist for convenience but MUST NOT be the only path. by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) +- P2 MUST `p2-must-schema-print` (conditional on structured output): expose the output schema via a `schema` subcommand or `--schema` flag, runtime-discoverable, with a documented format identifier (canonical recommendation: JSON Schema 2020-12). +- P2 SHOULD `p2-should-schema-file` (conditional on structured output): also export the schema to a stable file path so CI and static-analysis consumers can pin without invoking the tool. +- P2 SHOULD `p2-should-json-aliases`: accept `--json` and `--jsonl` as aliases for `--output json` and `--output jsonl`. +- P4 SHOULD `p4-should-enumerate-valid-set` (conditional on closed-set rejection): when rejecting input against an enum or fixed-allowed-values set, the error message includes the valid set. +- P6 MUST `p6-must-sigterm` (conditional on long-running operations): flush or roll back partial writes, release locks, exit non-zero within a bounded shutdown window. Next invocation succeeds without manual cleanup. +- P6 MAY `p6-may-standard-names` (conditional on subcommands): follow community-standard verbs (`get` / `list` / `create` / `update` / `delete`) and flag spellings (`--force`, `--yes`, `--limit`, `--quiet`, `--verbose`). +- New principle **P8 Discoverable Through Agent Skill Bundles** (four requirements: `p8-must-bundle-install`, `p8-should-bundle-exists`, `p8-may-install-all`, `p8-may-bundle-update`). CLIs ship a top-level skill bundle (`AGENTS.md`, `SKILL.md`, or equivalent) and provide an install path that registers the bundle with installed agent runtimes (canonical form: `tool skill install []`). + +### Changed + +- `VERSION`: 0.3.1 → 0.4.0 (MINOR per `principles/AGENTS.md`'s versioning rules; new MUSTs added). by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) +- `.impeccable.md`: new spec-channel anti-pattern "No false canonicalization". When a bullet names an outcome the implementer can satisfy any way, prose uses indefinite articles and avoids language that canonicalizes one shape; when a bullet names a citable single-shape pattern, prose uses definite articles and cites the source. + +**Full Changelog**: [v0.4.0...v0.4.0](https://github.com/brettdavies/agentnative/compare/v0.4.0...v0.4.0) + +## [0.4.0] - 2026-05-07 + +### Added + +- P1 MUST `p1-must-secret-non-leaky-path` (conditional on CLI accepting secret material): sensitive inputs are readable via stdin or a `--*-file` flag; flag-value and env-var inputs MAY exist for convenience but MUST NOT be the only path. by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) +- P2 MUST `p2-must-schema-print` (conditional on structured output): expose the output schema via a `schema` subcommand or `--schema` flag, runtime-discoverable, with a documented format identifier (canonical recommendation: JSON Schema 2020-12). +- P2 SHOULD `p2-should-schema-file` (conditional on structured output): also export the schema to a stable file path so CI and static-analysis consumers can pin without invoking the tool. +- P2 SHOULD `p2-should-json-aliases`: accept `--json` and `--jsonl` as aliases for `--output json` and `--output jsonl`. +- P4 SHOULD `p4-should-enumerate-valid-set` (conditional on closed-set rejection): when rejecting input against an enum or fixed-allowed-values set, the error message includes the valid set. +- P6 MUST `p6-must-sigterm` (conditional on long-running operations): flush or roll back partial writes, release locks, exit non-zero within a bounded shutdown window. Next invocation succeeds without manual cleanup. +- P6 MAY `p6-may-standard-names` (conditional on subcommands): follow community-standard verbs (`get` / `list` / `create` / `update` / `delete`) and flag spellings (`--force`, `--yes`, `--limit`, `--quiet`, `--verbose`). +- New principle **P8 Discoverable Through Agent Skill Bundles** (four requirements: `p8-must-bundle-install`, `p8-should-bundle-exists`, `p8-may-install-all`, `p8-may-bundle-update`). CLIs ship a top-level skill bundle (`AGENTS.md`, `SKILL.md`, or equivalent) and provide an install path that registers the bundle with installed agent runtimes (canonical form: `tool skill install []`). + +### Changed + +- `VERSION`: 0.3.1 → 0.4.0 (MINOR per `principles/AGENTS.md`'s versioning rules; new MUSTs added). by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) +- `.impeccable.md`: new spec-channel anti-pattern "No false canonicalization". When a bullet names an outcome the implementer can satisfy any way, prose uses indefinite articles and avoids language that canonicalizes one shape; when a bullet names a citable single-shape pattern, prose uses definite articles and cites the source. + +**Full Changelog**: [v0.3.1...v0.4.0](https://github.com/brettdavies/agentnative/compare/v0.3.1...v0.4.0) + +## [0.3.1] - 2026-05-07 + +### Added + +- Badge claim convention (`docs/badge.md`) defines eligibility floor (≥80% pass-rate), embed shape, score-text format, color thresholds, and version-pinning posture for tool authors who self-host the agent-native badge linked to a live scorecard. by @brettdavies in [#20](https://github.com/brettdavies/agentnative/pull/20) +- README and CONTRIBUTING pointers to the badge convention so HN visitors and tool authors land on the convention from the two top-level entry points. +- Add `BRAND.md` at the repo root. Universal voice and identity SoT shared across the spec, site, linter, and skill bundle channels. Each channel inherits the shared identity and adds register and artifacts in its own `.impeccable.md`. by @brettdavies in [#22](https://github.com/brettdavies/agentnative/pull/22) +- Add spec-channel `.impeccable.md`: RFC 2119 register rules, third-person standards voice, no-implementation-leakage anti-patterns. Narrative identity layer; literal phrase enforcement lives in the `spec` Vale rule pack. +- Add `## Acknowledgements` to README. Names foundational CLI doctrine (12-factor, POSIX, clig.dev, NO_COLOR, XDG), parallel agent-CLI synthesis sources, the spec's proximate ancestors, and the anc.dev ecosystem's mechanism contribution. +- Add deterministic pre-push voice enforcement: Vale rule packs (`styles/brand/`, `styles/spec/`), LanguageTool grammar checks over the Tailnet (graceful skip when unreachable), and pack-README drift detection. One-time setup per contributor: `brew install vale jaq bun && vale sync` after activating `core.hooksPath scripts/hooks`. The layered SoT, orchestrator behavior, contributor flow, and deferred follow-ups live in the `dev`-only architecture docs. + +### Changed + +- Rename README "trifecta" to "four artifacts"; add `agentnative-skill` as a first-class artifact alongside the spec, the linter, and the leaderboard. by @brettdavies in [#22](https://github.com/brettdavies/agentnative/pull/22) +- Drop `docs/architecture/voice-enforcement.md` references from main-shipped files (`AGENTS.md`, `CONTRIBUTING.md`, `principles/AGENTS.md`, `.gitignore` comment). Replace the pointers with inline narrative that names the rule packs and the LT graceful-skip behavior. The architecture docs stay on `dev` as contributor-side reference and are not shipped to `main`. by @brettdavies in [#24](https://github.com/brettdavies/agentnative/pull/24) +- Update the `RELEASES.md` Prose scrubbing procedure to scrub-before-submit. Step 1 covers three entry points (scratch authoring for `gh pr create`, fetch-then-clean for `gh pr edit`, `cp CHANGELOG.md` for changelog scrub); step 6 submits the cleaned version once via `--body-file`. + +**Full Changelog**: [v0.3.0...v0.3.1](https://github.com/brettdavies/agentnative/compare/v0.3.0...v0.3.1) ## [0.3.0] - 2026-04-28 diff --git a/spec/README.md b/spec/README.md index 5a18373..1740361 100644 --- a/spec/README.md +++ b/spec/README.md @@ -6,6 +6,9 @@ latest upstream `v*` tag and ship inside the skill bundle so consumers carry the skill metadata. Each release of this bundle re-vendors against the latest spec tag. The currently vendored version is recorded in [`VERSION`](./VERSION). +For the full spec landing page — leaderboard, badge convention, and acknowledgements — see [anc.dev](https://anc.dev) or +the upstream [`README`](https://github.com/brettdavies/agentnative#readme). + ## Resync Run from the repo root: @@ -21,15 +24,16 @@ remote. The script extracts files via `git show`, so neither source's working tr ## Layout -| Path | Source in `agentnative-spec` | Purpose | -| ------------------ | ---------------------------- | ------------------------------------------------------------- | -| `VERSION` | `VERSION` | Spec version string; the skill's vendored `SPEC_VERSION` | -| `CHANGELOG.md` | `CHANGELOG.md` | Spec change history; informational | -| `principles/p*.md` | `principles/p*.md` | Frontmatter `requirements[]` is the machine-readable contract | +| Path | Source in `agentnative-spec` | Purpose | +| ----------------------- | ---------------------------- | ------------------------------------------------------------- | +| `VERSION` | `VERSION` | Spec version string; the skill's vendored `SPEC_VERSION` | +| `CHANGELOG.md` | `CHANGELOG.md` | Spec change history; informational | +| `principles/p*.md` | `principles/p*.md` | Frontmatter `requirements[]` is the machine-readable contract | +| `principles/scoring.md` | `principles/scoring.md` | Leaderboard formula, badge eligibility floor, and color bands | Each principle file has a YAML frontmatter block with `id`, `title`, `last-revised`, `status`, and `requirements[]`. Each `requirements[]` entry carries a stable `id` (e.g. `p1-must-no-interactive`), a `level` (`must`/`should`/`may`), an -`applicability` (`universal` or `{if: }`), and a one-sentence `summary`. The `anc` checker +`applicability` (`universal` or `{if: }`), and a one-sentence `summary`. The `anc` auditor ([brettdavies/agentnative-cli](https://github.com/brettdavies/agentnative-cli)) emits these IDs in its scorecard so agents can navigate from a finding directly to the requirement. diff --git a/spec/VERSION b/spec/VERSION index 0d91a54..1d0ba9e 100644 --- a/spec/VERSION +++ b/spec/VERSION @@ -1 +1 @@ -0.3.0 +0.4.0 diff --git a/spec/principles/p1-non-interactive-by-default.md b/spec/principles/p1-non-interactive-by-default.md index f93ad5a..7761a76 100644 --- a/spec/principles/p1-non-interactive-by-default.md +++ b/spec/principles/p1-non-interactive-by-default.md @@ -1,7 +1,7 @@ --- id: p1 title: Non-Interactive by Default -last-revised: 2026-04-22 +last-revised: 2026-05-07 status: active requirements: - id: p1-must-env-var @@ -11,12 +11,17 @@ requirements: - id: p1-must-no-interactive level: must applicability: universal - summary: "`--no-interactive` flag gates every prompt library call; when set or stdin is not a TTY, use defaults/stdin or exit with an actionable error." + summary: "When stdin is not a TTY or `--no-interactive` is set, every blocking-input surface (prompt libraries, read-line, TUI init) resolves from defaults/stdin or exits with an actionable error." - id: p1-must-no-browser level: must applicability: if: CLI authenticates against a remote service summary: Headless authentication path (`--no-browser` / OAuth Device Authorization Grant). + - id: p1-must-secret-non-leaky-path + level: must + applicability: + if: CLI accepts secret material (tokens, passwords, keys) as input + summary: "Sensitive inputs are readable via stdin or a `--*-file` flag; flag-value and env-var inputs MAY exist for convenience but MUST NOT be the only path." - id: p1-should-tty-detection level: should applicability: universal @@ -36,11 +41,11 @@ requirements: ## Definition Every automation path MUST run without human input. A CLI tool that blocks on an interactive prompt is invisible to an -agent — the agent hangs, the user sees nothing, and the operation times out silently. +agent: the agent hangs, the user sees nothing, and the operation times out silently. **Decision record:** this principle's MUST is worded in terms of observable behavior rather than enumerated APIs. [`docs/decisions/p1-behavioral-must.md`](../docs/decisions/p1-behavioral-must.md) records the reasoning and names the -verification boundary: automated checks verify behavior under non-TTY stdin; TTY-driving-agent scenarios are covered by +verification boundary: automated audits verify behavior under non-TTY stdin; TTY-driving-agent scenarios are covered by the MUST but are not PTY-probed at the current scale. ## Why Agents Need It @@ -63,26 +68,32 @@ agent-tool deadlock. quiet: bool, ``` -- A `--no-interactive` flag gating every prompt library call (`dialoguer`, `inquire`, `read_line`, `TTY::Prompt`, - `inquirer`, equivalents in other frameworks, or any TUI event loop that takes over the terminal). When the flag is - set, or when stdin is not a TTY, the tool uses defaults, reads from stdin, or exits with an actionable error. It never - blocks. +- When stdin is not a terminal, or when `--no-interactive` is set, every blocking-input surface (prompt libraries, + read-line calls, TUI session initialization) MUST resolve from defaults, read from stdin, or exit non-zero with an + actionable error. The CLI MUST NOT block waiting for input that cannot arrive. - A headless authentication path if the CLI authenticates. The canonical flag is `--no-browser`, which triggers the OAuth 2.0 Device Authorization Grant ([RFC 8628](https://www.rfc-editor.org/rfc/rfc8628)): the CLI prints a URL and a code; the user authorizes on another device. Agents cannot open browsers. Non-canonical alternatives (`--device-code`, - `--remote`, `--headless`) are acceptable but should migrate toward `--no-browser`. + `--remote`, `--headless`) are acceptable but SHOULD migrate toward `--no-browser`. +- CLIs that accept secret material (tokens, passwords, private keys) MUST provide at least one input path that does not + leak the value into process listings (`ps`), shell history, or the parent environment. The two leak-resistant paths + are stdin and a `--*-file` flag pointing to a credential file. Flag-value (`--token `) and environment-variable + (`TOOL_TOKEN`) paths MAY exist as convenience surfaces but MUST NOT be the only programmatic path. Cloud-CLI env-var + conventions (`AWS_ACCESS_KEY_ID`, `GH_TOKEN`) are accepted as convenience paths under this rule, not as substitutes + for it. **SHOULD:** -- Auto-detect non-interactive context via TTY detection (`std::io::IsTerminal` in Rust 1.70+, `process.stdin.isTTY` in - Node, `sys.stdout.isatty()` in Python) and suppress prompts when stderr is not a terminal, even without an explicit - `--no-interactive` flag. +- Auto-detect non-interactive context via TTY detection on stdin (and stderr, where prompts target it) and suppress + prompts when no terminal is present, even without an explicit `--no-interactive` flag. Language-specific entry points + (`std::io::IsTerminal` in Rust 1.70+, `process.stdin.isTTY` in Node, `sys.stdout.isatty()` in Python) appear in the + Evidence section. - Document default values for prompted inputs in `--help` output so agents can pass them explicitly instead of accepting whatever default ships. **MAY:** -- Offer rich interactive experiences — spinners, progress bars, multi-select menus — when a TTY is detected and +- Rich interactive experiences (spinners, progress bars, multi-select menus) MAY render when a TTY is detected and `--no-interactive` is not set, provided the non-interactive path remains fully functional. ## Evidence @@ -95,20 +106,21 @@ agent-tool deadlock. ## Anti-Patterns -- Bare `dialoguer::Confirm::new().interact()` with no TTY check and no `--no-interactive` override — agents hang +- Bare `dialoguer::Confirm::new().interact()` with no TTY check and no `--no-interactive` override: agents hang indefinitely. - Boolean environment variables parsed as plain strings, so `TOOL_QUIET=false` is truthy because the string is non-empty. - `stdin().read_line()` in a code path reached during normal operation without a TTY check first. - Hard-coded credentials prompts with no env-var or config-file alternative. - OAuth flow that unconditionally opens a browser with no headless escape hatch. +- A `--password ` flag with no stdin or file alternative: every invocation leaks the secret into `ps` output. -Measured by check IDs `p1-non-interactive` (behavioral) and `p1-non-interactive-source` (source). Run `agentnative check ---principle 1 .` against your CLI to see both. +Measured by audit IDs `p1-non-interactive` (behavioral) and `p1-non-interactive-source` (source). Run `anc audit +--principle 1 .` against the CLI under test to see both. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". @@ -116,13 +128,13 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[edit]** *Internal inconsistency.* "The `--no-interactive` MUST bullet says 'uses defaults, reads from stdin, or exits with an actionable error' but the principle's behavioral framing (per decision record) covers TUI session init too. The prose bullet only enumerates prompt libraries (`dialoguer`, `inquire`, `read_line`, `TTY::Prompt`, - `inquirer`), not TUI frameworks (`ratatui`, `bubbletea`) — readers will infer the MUST excludes TUIs, contradicting - the decision record's explicit 'blocking-interactive surface includes... TUI session initialization.'" Resolved: prose + `inquirer`), not TUI frameworks (`ratatui`, `bubbletea`). Readers will infer the MUST excludes TUIs, contradicting the + decision record's explicit 'blocking-interactive surface includes... TUI session initialization.'" Resolved: prose bullet's parenthetical now includes "or any TUI event loop that takes over the terminal." Mirrors the behavioral framing in [`docs/decisions/p1-behavioral-must.md`](../docs/decisions/p1-behavioral-must.md). No frontmatter change; the summary already says "gates every prompt library call" and stays. - **[wontfix]** *Prior art.* "RFC 8628 citation is correct in name but incomplete in framing. The prose says Device - Authorization Grant means 'the CLI prints a URL and a code; the user authorizes on another device' — this still + Authorization Grant means 'the CLI prints a URL and a code; the user authorizes on another device'. This still requires a human on another device, which an unattended agent does not have. An HN commenter will note this is a *human-assisted* headless path, not an agent-headless path; true unattended agents need service-account / API-token auth (which the principle doesn't mention)." Rationale: P1 scopes "headless" as "no local browser required," not "no diff --git a/spec/principles/p2-structured-parseable-output.md b/spec/principles/p2-structured-parseable-output.md index 4f1ef7b..db854c0 100644 --- a/spec/principles/p2-structured-parseable-output.md +++ b/spec/principles/p2-structured-parseable-output.md @@ -1,17 +1,17 @@ --- id: p2 title: Structured, Parseable Output -last-revised: 2026-04-22 +last-revised: 2026-05-29 status: active requirements: - id: p2-must-output-flag level: must applicability: universal - summary: "`--output text|json|jsonl` flag selects output format; `OutputFormat` enum threaded through output paths." + summary: "`--output` flag selects format with `json` and `jsonl` as canonical machine-readable values; `text` is the default human-facing form." - id: p2-must-stdout-stderr-split level: must applicability: universal - summary: Data goes to stdout; diagnostics/progress/warnings go to stderr — never interleaved. + summary: Data goes to stdout; diagnostics/progress/warnings go to stderr, never interleaved. - id: p2-must-exit-codes level: must applicability: universal @@ -20,10 +20,28 @@ requirements: level: must applicability: universal summary: When `--output json` is active, errors are emitted as JSON (to stderr) with at least `error`, `kind`, and `message` fields. + - id: p2-must-schema-print + level: must + applicability: + kind: conditional + antecedent: + audit_id: p2-json-output + summary: "CLIs that emit structured output expose the output schema via a `schema` subcommand or `--schema` flag: runtime-discoverable, with a documented format identifier." - id: p2-should-consistent-envelope level: should applicability: universal - summary: JSON output uses a consistent envelope — a top-level object with predictable keys — across every command. + summary: JSON output uses a consistent envelope (a top-level object with predictable keys) across every command. + - id: p2-should-schema-file + level: should + applicability: + kind: conditional + antecedent: + audit_id: p2-json-output + summary: "Output schemas are also exported to a stable file path (e.g., `schema/.json`) so CI/static-analysis consumers pin without invoking the tool." + - id: p2-should-json-aliases + level: should + applicability: universal + summary: "`--json` and `--jsonl` are accepted as aliases for `--output json` and `--output jsonl`; the short forms work alongside the canonical enum." - id: p2-may-more-formats level: may applicability: universal @@ -46,18 +64,19 @@ data forces agents into fragile regex extraction that breaks on any format chang An agent calling a CLI needs three things from each invocation: the data, the error (if any), and the exit code. When data goes to stdout, diagnostics go to stderr, and errors carry machine-readable fields, the agent parses the result reliably without heuristics. Mix these channels or ship human-formatted output only, and the agent falls back to -best-effort text parsing that fails unpredictably across versions, locales, and edge cases — silently at first, +best-effort text parsing that fails unpredictably across versions, locales, and edge cases: silently at first, catastrophically later. ## Requirements **MUST:** -- A `--output text|json|jsonl` flag selects the output format. Text is the default for humans; JSON and JSONL are the - agent-facing formats. Implementation surfaces an `OutputFormat` enum and an `OutputConfig` struct threaded through - every function that produces output. -- Data goes to stdout. Diagnostics, progress indicators, and warnings go to stderr. An agent consuming JSON from stdout - must never encounter an interleaved progress message. +- Structured-output CLIs MUST offer at least one machine-readable format selectable via `--output`, with `json` and + `jsonl` as canonical values; `text` is the default human-facing form. The format selection threads through every + output path, so a single invocation never mixes formats. +- Data goes to stdout. Diagnostics, progress indicators, and warnings go to stderr. The split is decades-old Unix + practice (POSIX, ESR's Rule of Repair, clig.dev's "Output" rules); for an agent it is load-bearing: a JSON consumer + reading stdout MUST NOT encounter an interleaved progress line. - Exit codes are structured and documented: | Code | Meaning | @@ -71,17 +90,29 @@ catastrophically later. These codes blend the bash 0/1/2 convention with BSD `sysexits.h` 77/78 (`EX_NOPERM`, `EX_CONFIG`); the result is the de-facto agent-facing dialect, not strict `sysexits.h` compliance. -- When `--output json` is active, errors are emitted as JSON (to stderr) with at least `error`, `kind`, and `message` - fields. Plain-text errors in a JSON run break the agent's parser on the only output it was told to expect. +- When `--output json` is active, errors MUST be emitted as JSON to stderr with at least `error`, `kind`, and `message` + fields. A plain-text error inside a JSON run breaks the consumer's parser on the only shape it was told to expect. +- CLIs that emit structured output (`--output json|jsonl`) MUST expose the output schema at runtime via a `schema` + subcommand (or a `--schema` flag on each data-emitting subcommand). The schema MUST identify its format (canonical + recommendation is JSON Schema 2020-12, the same dialect OpenAPI 3.1 uses), so an agent reading the schema loads the + right validator without parsing prose. A consumer asking "what shape am I about to receive?" gets a machine-readable + answer in one call. **SHOULD:** -- JSON output uses a consistent envelope — a top-level object with predictable keys — across every command so agents can +- JSON output uses a consistent envelope (a top-level object with predictable keys) across every command so agents can rely on the same shape. +- The schema SHOULD also be exported to a stable file path in the source repo (e.g., `schema/.json`) so + consumers can pin against it at install or CI time without invoking the tool. The print form is the runtime contract; + the file form is the build-time contract. +- CLIs SHOULD accept `--json` as an alias for `--output json` and `--jsonl` as an alias for `--output jsonl`. The + `--output` enum remains the canonical surface for the format MUST (`p2-must-output-flag`); a Cloudflare-style CLI + shipping only the short forms still satisfies the canonical MUST through the alias path. **MAY:** -- Additional output formats (CSV, TSV, YAML) beyond the core three. The core three remain mandatory. +- Additional `--output` values (CSV, TSV, YAML) MAY be offered beyond the canonical text/json/jsonl. The canonical three + remain mandatory. - A `--raw` flag for unformatted output suitable for piping to other tools. ## Evidence @@ -89,36 +120,36 @@ catastrophically later. - `OutputFormat` enum with `Text`, `Json`, `Jsonl` variants deriving `ValueEnum`. - `OutputConfig` struct with `format`, `use_color`, and `quiet` fields. - `serde_json` in `Cargo.toml`. -- No `println!` in `src/` outside the output module — every print goes through `OutputConfig`. +- No `println!` in `src/` outside the output module: every print goes through `OutputConfig`. - Exit-code constants or match arms mapping error variants to distinct numeric codes. - `eprintln!` (or an equivalent diagnostic macro) for every diagnostic line. ## Anti-Patterns - `println!` scattered across handlers instead of routing through the output config. -- A single exit code (1) for everything — agents cannot distinguish auth failures from config errors. +- A single exit code (1) for everything: agents cannot distinguish auth failures from config errors. - Status lines ("Fetching data…") printed to stdout where they contaminate JSON output. - `process::exit()` in library code, bypassing structured error propagation. - Human-formatted tables as the only output mode with no JSON alternative. -Measured by check IDs `p2-output-json`, `p2-output-format`, `p2-stderr-diagnostics`. Run `agentnative check --principle -2 .` against your CLI to see each. +Measured by audit IDs `p2-output-json`, `p2-output-format`, `p2-stderr-diagnostics`. Run `anc audit --principle 2 .` +against the CLI under test to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[edit]** *Prior art.* "The exit-code table conflicts with `sysexits.h`. `EX_NOPERM=77` is 'permission denied' (close), but `EX_CONFIG=78` is correct. However, `sysexits.h` reserves `EX_USAGE=64`, `EX_DATAERR=65`, - `EX_NOINPUT=66`, `EX_UNAVAILABLE=69`, `EX_SOFTWARE=70` — P2 puts 'usage error' at 2 (bash convention), not 64. HN will + `EX_NOINPUT=66`, `EX_UNAVAILABLE=69`, `EX_SOFTWARE=70`. P2 puts 'usage error' at 2 (bash convention), not 64. HN will note the principle straddles two conventions (bash 0/1/2 + sysexits 77/78) without naming the hybrid." Resolved: added one sentence under the exit-code table acknowledging the bash + `sysexits.h` blend. The same citation now appears in P4's exit-code table (per Row #13 of the same review pass) so both files agree. -- **[later]** *Must-vs-should.* "A single-number-emitting CLI (e.g., `epoch`, `uuidgen`) plausibly violates the +- **[later]** *MUST-vs-SHOULD.* "A single-number-emitting CLI (e.g., `epoch`, `uuidgen`) plausibly violates the `--output text|json|jsonl` MUST for a defensible reason. Universal applicability is a strong claim." Deferred: revisit - whether `applicability` should soften when the launch landscape clarifies actual single-number agent-facing CLIs. The + whether `applicability` SHOULD soften when the launch landscape clarifies actual single-number agent-facing CLIs. The applicability change would fire coupled-release (CLI registry impact), so it is held for a v0.4.0 cleanup PR rather than churned during launch week. diff --git a/spec/principles/p3-progressive-help-discovery.md b/spec/principles/p3-progressive-help-discovery.md index 55bf47b..078b0e3 100644 --- a/spec/principles/p3-progressive-help-discovery.md +++ b/spec/principles/p3-progressive-help-discovery.md @@ -1,7 +1,7 @@ --- id: p3 title: Progressive Help Discovery -last-revised: 2026-04-22 +last-revised: 2026-05-21 status: active requirements: - id: p3-must-subcommand-examples @@ -13,6 +13,14 @@ requirements: level: must applicability: universal summary: The top-level command ships 2–3 examples covering the primary use cases. + - id: p3-must-version + level: must + applicability: universal + summary: Top-level `--version` prints a non-empty version line and exits 0. + - id: p3-should-version-short + level: should + applicability: universal + summary: A short version alias (`-V`, `-v`, or `-version`) accompanies `--version` for fast version probes. - id: p3-should-paired-examples level: should applicability: universal @@ -46,16 +54,24 @@ trial-and-errors its way into a working call, burning tokens and sometimes landi **MUST:** -- Every subcommand ships at least one concrete invocation example showing the command with realistic arguments, rendered - in the section that appears after the flags list. In clap this is the `after_help` attribute. -- The top-level command ships 2–3 examples covering the primary use cases. +- Every subcommand MUST render at least one concrete invocation example with realistic arguments, in the section that + appears after the flags list. Clap's `after_help` attribute is the Rust realization; other frameworks have equivalents + (see Evidence section below). +- The top-level command MUST render 2–3 examples covering the primary use cases. +- The top-level command MUST respond to `--version` with a non-empty version line on stdout and exit 0. Agents pin + against tool versions to detect breaking changes; a `--version` that errors, exits non-zero, or prints nothing forces + every consumer to scrape the binary path or manifest for a clue. **SHOULD:** -- Examples show human and agent invocations side by side — a text-output example followed by its `--output json` +- Examples show human and agent invocations side by side: a text-output example followed by its `--output json` equivalent. Readers see the pair; agents see the JSON form. - Short `about` for command-list summaries; `long_about` reserved for detailed descriptions visible with `--help` but not `-h`. +- A short alias for `--version` SHOULD work: `-V` (clap default, `curl`, `wget`, `gzip`), `-v` (`npm`, `node`, `bun`, + `yarn`, `make`), or `-version` (Go's `flag` package). Any of the three forms is sufficient. Agents probing tool + versions across many CLIs save token cost when they can pin against a one- or two-character flag; the long-only path + forces an extra parse step. **MAY:** @@ -71,18 +87,18 @@ trial-and-errors its way into a working call, burning tokens and sometimes landi ## Anti-Patterns -- Relying solely on `///` doc comments — those populate `about` / `long_about`, not `after_help`, so no examples render +- Relying solely on `///` doc comments: those populate `about` / `long_about`, not `after_help`, so no examples render after the flags list. - A single `about` string serving as both summary and usage documentation. - Examples buried in a README or man page but absent from `--help` output. - `after_help` text that describes the flags in prose instead of demonstrating them in code. -Measured by check IDs `p3-help`, `p3-after-help`, `p3-version`. Run `agentnative check --principle 3 .` against your CLI -to see each. +Measured by audit IDs `p3-help`, `p3-after-help`, `p3-version`. Run `anc audit --principle 3 .` against the CLI under +test to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". @@ -93,7 +109,7 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". `universal` to conditional (`if: CLI exposes a structured-output mode`) fires the coupled-release norm (CLI registry parses `applicability`). Bundled with other applicability cleanups for a v0.4.0 PR with explicit registry coordination. -- **[later]** *Must-vs-should.* "'Top-level command ships 2–3 examples' as a universal MUST is too strong for genuinely +- **[later]** *MUST-vs-SHOULD.* "'Top-level command ships 2–3 examples' as a universal MUST is too strong for genuinely single-purpose CLIs (e.g., `cat`, `true`, a one-shot wrapper) where one canonical invocation is the entire surface. The '2–3' count baked into a MUST will draw HN fire as cargo-culted." Deferred: softening to "at least one example, and 2–3 when the tool has multiple primary use cases" is a MUST-content change that drifts the frontmatter summary. @@ -103,4 +119,4 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". epilog, `cobra` Example field, `gh`/`kubectl` Examples convention). HN will call this 'a clap style guide, not a CLI standard.'" Deferred: a cross-framework analog appendix is a meaningful addition. The Definition / Why-Agents-Need-It sections are framework-agnostic; the Evidence section is intentionally clap-keyed. Worth revisiting in v0.4.0 once the - standard's multi-language reach is clearer; site copy may also be a better home than the principle file itself. + standard's multi-language reach is clearer; site copy could also be a better home than the principle file itself. diff --git a/spec/principles/p4-fail-fast-actionable-errors.md b/spec/principles/p4-fail-fast-actionable-errors.md index 8e91deb..5acc7d6 100644 --- a/spec/principles/p4-fail-fast-actionable-errors.md +++ b/spec/principles/p4-fail-fast-actionable-errors.md @@ -1,7 +1,7 @@ --- id: p4 title: Fail Fast with Actionable Errors -last-revised: 2026-04-22 +last-revised: 2026-05-07 status: active requirements: - id: p4-must-try-parse @@ -15,7 +15,7 @@ requirements: - id: p4-must-actionable-errors level: must applicability: universal - summary: Every error message contains what failed, why, and what to do next. + summary: Every error message names the failure, the cause, and a concrete remediation (a command or a value, not a hint to consult docs). - id: p4-should-structured-enum level: should applicability: universal @@ -29,6 +29,11 @@ requirements: level: should applicability: universal summary: "Error output respects `--output json`: JSON-formatted errors go to stderr when JSON output is selected." + - id: p4-should-enumerate-valid-set + level: should + applicability: + if: CLI rejects input against a closed set + summary: "When rejecting input against an enum or fixed-allowed-values set, the error message includes the valid set." --- # P4: Fail Fast with Actionable Errors @@ -40,8 +45,8 @@ why, and what to do next. An error that says "operation failed" gives an agent n ## Why Agents Need It -Agents operate in a retry loop: attempt, observe, decide. When an error is vague or unstructured — a bare stack trace, a -one-word failure, a mixed-channel splurge — the agent cannot tell whether to retry, re-authenticate, fix configuration, +Agents operate in a retry loop: attempt, observe, decide. When an error is vague or unstructured (a bare stack trace, a +one-word failure, a mixed-channel splurge), the agent cannot tell whether to retry, re-authenticate, fix configuration, or escalate to the user. Distinct exit codes with actionable messages let the agent act correctly on the first read. The difference between exit code 77 (re-authenticate) and exit code 78 (fix config) determines whether the agent retries OAuth or asks the user to check their config file. Getting that wrong wastes entire conversation turns. @@ -51,8 +56,8 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent **MUST:** - Parse arguments with `try_parse()` instead of `parse()`. Clap's `parse()` calls `process::exit()` directly, bypassing - custom error handlers — which means `--output json` cannot emit JSON parse errors. `try_parse()` returns a `Result` - the tool can format: + custom error handlers, which means `--output json` cannot emit JSON parse errors. `try_parse()` returns a `Result` the + tool can format: ```rust let cli = Cli::try_parse()?; @@ -72,7 +77,8 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent These codes blend the bash 0/1/2 convention with BSD `sysexits.h` 77/78 (`EX_NOPERM`, `EX_CONFIG`); the result is the de-facto agent-facing dialect, not strict `sysexits.h` compliance. -- Every error message contains **what failed**, **why**, and **what to do next**. Example: +- Every error message MUST name the failure, the cause, and the remediation. The remediation is concrete: a command to + run or a value to set, not a hint to consult documentation. ```text Authentication failed: token expired (expires_at: 2026-03-25T00:00:00Z). @@ -87,6 +93,14 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent three-tier definition (meta-commands, local-only commands, network commands) lives in P6 (`p6-should-tier-gating`); this requirement specifies the network-call ordering consequence. - Error output respects `--output json`: JSON-formatted errors go to stderr when JSON output is selected. +- When the failure is "invalid value for X" against a known closed set (an enum field, a documented allow-list, a typed + parameter), the error SHOULD include the valid set. An agent reading `error: invalid visibility` guesses and retries; + an agent reading `error: --visibility must be one of: public, private, unlisted (got "secret")` self-corrects in one + round-trip. + + ```text + error: --visibility must be one of: public, private, unlisted (got "secret") + ``` ## Evidence @@ -99,43 +113,43 @@ OAuth or asks the user to check their config file. Getting that wrong wastes ent ## Anti-Patterns -- `Cli::parse()` anywhere in the codebase — it silently prevents JSON error output. -- `process::exit()` in library code or command handlers. Only `main()` may call it, after all error handling. +- `Cli::parse()` anywhere in the codebase, because it silently prevents JSON error output. +- `process::exit()` in library code or command handlers. Only `main()` MAY call it, after all error handling. - A single catch-all error variant that maps everything to exit code 1. - Error messages that state the symptom without the cause or fix ("Error: request failed"). - Panics (`unwrap()`, `expect()`) on recoverable errors in production code paths. -Measured by check IDs `p4-bad-args`, `p4-process-exit`, `p4-unwrap`, `p4-exit-codes`. Run `agentnative check --principle -4 .` against your CLI to see each. +Measured by audit IDs `p4-bad-args`, `p4-process-exit`, `p4-unwrap`, `p4-exit-codes`. Run `anc audit --principle 4 .` +against the CLI under test to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[edit]** *Internal inconsistency.* "Three-tier gating is labeled identically as a SHOULD in both P4 - (`p4-should-gating-before-network`) and P6 (`p6-should-tier-gating`) — same pattern, two homes, no cross-reference. + (`p4-should-gating-before-network`) and P6 (`p6-should-tier-gating`). Same pattern, two homes, no cross-reference. Readers can't tell which is canonical, and a CLI that satisfies one auto-satisfies the other." Resolved: P4's bullet now focuses on the network-call ordering consequence and points to P6 as the canonical home of the structural three-tier definition. Frontmatter summary tightened to match. Requirement ID is unchanged so CLI registry pinning is unaffected. -- **[edit]** *Must-vs-should.* "`p4-must-exit-code-mapping` is `applicability: universal` and the prose says 'At - minimum' 0/1/2/77/78 — but a CLI with no auth surface and no config file legitimately has nothing to assign to either +- **[edit]** *MUST-vs-SHOULD.* "`p4-must-exit-code-mapping` is `applicability: universal` and the prose says 'At + minimum' 0/1/2/77/78. But a CLI with no auth surface and no config file legitimately has nothing to assign to either 77 or 78, and the MUST forces empty-by-construction error variants. Same shape as P6, which correctly gates `p6-must-timeout-network` behind `if: CLI makes network calls`." Resolved: prose now reads "Use 77 when the CLI has an auth surface and 78 when it has a config surface; 0/1/2 are universal." Frontmatter summary stays universal because the *mapping discipline* is universal even if the specific 77/78 codes are conditional. The summary-prose drift is a known launch-week tradeoff; full alignment of the summary text is on the v0.4.0 punch list. -- **[edit]** *Prior art.* "77/78 align with BSD `sysexits.h` (`EX_NOPERM`, `EX_CONFIG`) — the alignment is a strength - but neither P2 nor P4 cites BSD sysexits, leaving an HN commenter to 'discover' it as a gotcha." Resolved: added a +- **[edit]** *Prior art.* "77/78 align with BSD `sysexits.h` (`EX_NOPERM`, `EX_CONFIG`). The alignment is a strength but + neither P2 nor P4 cites BSD sysexits, leaving an HN commenter to 'discover' it as a gotcha." Resolved: added a one-liner under the P4 exit-code table acknowledging the `sysexits.h` alignment. Same sentence added to P2's exit-code table for consistency. -- **[later]** *Must-vs-should.* "`p4-must-try-parse` names a clap-specific Rust API in a `applicability: universal` - MUST. A Go/Python/Node CLI has no `try_parse()`. The underlying requirement — 'argument-parse failures route through - the same error/output formatter as runtime errors, not a library-internal `process::exit()`' — is universal; the API +- **[later]** *MUST-vs-SHOULD.* "`p4-must-try-parse` names a clap-specific Rust API in a `applicability: universal` + MUST. A Go/Python/Node CLI has no `try_parse()`. The underlying requirement: 'argument-parse failures route through + the same error/output formatter as runtime errors, not a library-internal `process::exit()`' is universal; the API name is not." Deferred: language-neutralizing the bullet ("Argument parsing returns a structured error rather than calling `process::exit()` internally; in Rust+clap, this means `try_parse()` not `parse()`") drifts the frontmatter summary. Bundled with P6's SIGPIPE and `global = true` rewrites for a coordinated v0.4.0 language-neutralization PR. diff --git a/spec/principles/p5-safe-retries-mutation-boundaries.md b/spec/principles/p5-safe-retries-mutation-boundaries.md index 882769b..ad9bc7e 100644 --- a/spec/principles/p5-safe-retries-mutation-boundaries.md +++ b/spec/principles/p5-safe-retries-mutation-boundaries.md @@ -1,7 +1,7 @@ --- id: p5 title: Safe Retries and Explicit Mutation Boundaries -last-revised: 2026-04-22 +last-revised: 2026-05-07 status: active requirements: - id: p5-must-force-yes @@ -23,7 +23,7 @@ requirements: level: should applicability: if: CLI has write operations - summary: Write operations are idempotent where the domain allows it — running the same command twice produces the same result. + summary: "Write operations are idempotent where the domain allows it: running the same command twice produces the same result." --- # P5: Safe Retries and Explicit Mutation Boundaries @@ -33,31 +33,31 @@ requirements: Every CLI with write operations MUST support `--dry-run` so agents can preview a mutation before committing it. Commands MUST make the read-vs-write distinction visible from name and `--help` alone, and destructive writes MUST require explicit confirmation. An agent that cannot distinguish a safe read from a dangerous write will either avoid the tool or -execute mutations blindly — both are failure modes. +execute mutations blindly: both are failure modes. ## Why Agents Need It Agent harnesses commonly retry failed operations. If a write operation is not idempotent, a retry creates duplicates, corrupts data, or trips rate limits. When destructive operations require explicit confirmation (`--force`, `--yes`) and support preview (`--dry-run`), an agent can safely explore what a command would do before committing to it. Read-only -tools are inherently safe for retries, but they still benefit from help text that names the mutation contract — "this +tools are inherently safe for retries, but they still benefit from help text that names the mutation contract: "this does not modify state" is a better sentence to put in `--help` than to assume. ## Requirements **MUST:** -- Destructive operations (delete, overwrite, bulk modify) require an explicit `--force` or `--yes` flag. Without it, the - tool refuses the operation or enters dry-run mode — never mutates silently. -- The distinction between read and write commands is clear from the command name and help text alone. An agent reading - `--help` immediately knows whether a command mutates state. -- A `--dry-run` flag is present on every write command. When set, the command validates inputs and reports what it would - do without executing. Dry-run output respects `--output json` so agents can parse the preview programmatically. +- Destructive operations (delete, overwrite, bulk modify) MUST require an explicit `--force` or `--yes` flag. Without + it, the command refuses the operation or enters dry-run mode; it MUST NOT mutate silently. +- The read-vs-write distinction MUST be visible from the command name and `--help` text alone. A reader scanning the + help output immediately knows whether a command mutates state. +- Every write command MUST support `--dry-run`: validate inputs and report the intended effect without executing it. + Dry-run output respects `--output json`. **SHOULD:** -- Write operations are idempotent where the domain allows it — running the same command twice produces the same result - rather than doubling the effect. +- Write operations SHOULD be idempotent where the domain allows it. Running the same command twice produces the same end + state, not a doubled effect. ## Evidence @@ -72,20 +72,20 @@ does not modify state" is a better sentence to put in `--help` than to assume. - A `delete` command that executes immediately without `--force` or confirmation. - Write commands sharing a name pattern with read commands (e.g., a `sync` that silently overwrites local state). - No `--dry-run` option on bulk operations, where a preview prevents costly mistakes. -- Operations that fail on retry because the first attempt partially succeeded — non-idempotent writes without rollback. +- Operations that fail on retry because the first attempt partially succeeded: non-idempotent writes without rollback. -Measured by check IDs `p5-dry-run`, `p5-destructive-guard`. Run `agentnative check --principle 5 .` against your CLI to -see each. +Measured by audit IDs `p5-dry-run`, `p5-destructive-guard`. Run `anc audit --principle 5 .` against the CLI under test +to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[edit]** *Internal inconsistency.* "Definition opens 'Every CLI MUST support `--dry-run`' as universal, but - `p5-must-dry-run` is gated on 'CLI has write operations' — read-only CLIs would falsely fail this prose claim." + `p5-must-dry-run` is gated on 'CLI has write operations'. Read-only CLIs would falsely fail this prose claim." Resolved: Definition sentence 1 narrowed to "Every CLI with write operations MUST support `--dry-run`..." Read-only CLIs are no longer falsely accused by the prose. - **[edit]** *Internal inconsistency.* "Definition's 'Write operations MUST clearly separate destructive actions from @@ -93,13 +93,13 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". read-only' is a different axis (writes can be non-destructive, e.g., `create`)." Resolved: Definition sentence 2 rewritten to "Commands MUST make the read-vs-write distinction visible from name and `--help` alone, and destructive writes MUST require explicit confirmation." The two axes are now stated separately. -- **[later]** *Internal inconsistency.* "`--force`/`--yes` MUST + P1 `--no-interactive` MUST should compose (agent path - is `--force --no-interactive`); composition isn't called out, leaving the 'without it, the tool refuses or enters - dry-run' clause ambiguous when stdin is non-TTY." Deferred: tightening the MUST to specify error-vs-dry-run behavior - under `--no-interactive` modifies the bullet's contract semantics. Bundled with other MUST-content cleanups for a - v0.4.0 PR. -- **[later]** *Must-vs-should.* "`read-write-distinction` MUST hinges on 'clear from command name and help text alone' — - subjective and unverifiable by `anc`. The `sync` anti-pattern proves the bar is taste, not a checkable property." +- **[later]** *Internal inconsistency.* "`--force`/`--yes` MUST and P1 `--no-interactive` MUST need to compose + explicitly (agent path is `--force --no-interactive`); composition isn't called out, leaving the 'without it, the tool + refuses or enters dry-run' clause ambiguous when stdin is non-TTY." Deferred: tightening the MUST to specify + error-vs-dry-run behavior under `--no-interactive` modifies the bullet's contract semantics. Bundled with other + MUST-content cleanups for a v0.4.0 PR. +- **[later]** *MUST-vs-SHOULD.* "`read-write-distinction` MUST hinges on 'clear from command name and help text alone', + subjective and unverifiable by `anc`. The `sync` anti-pattern proves the bar is taste, not an auditable property." Deferred: rewriting to a verifiable form ("Help text for every write command MUST contain an explicit mutation statement; command names SHOULD signal intent") creates a new SHOULD-shape claim, which is a `requirements[]` change. Coupled-release fires firmly. Defer to v0.4.0 with explicit registry-coordination plan. @@ -108,11 +108,11 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". satisfy the contract under different surfaces." Deferred: worth revisiting whether to add a 'name-or-contract-equivalent' clause that names the contract first and treats canonical flag spelling as one realization. Hold for v0.4.0 alongside the verifiability rewrite above. -- **[wontfix]** *Must-vs-should.* "'Why Agents Need It' leans on retry-safety, then idempotency lands as SHOULD. If +- **[wontfix]** *MUST-vs-SHOULD.* "'Why Agents Need It' leans on retry-safety, then idempotency lands as SHOULD. If retries are the framing, idempotency-where-domain-allows is the load-bearing property; `--dry-run` is mitigation, not cure." Rationale: domain-gated idempotency genuinely cannot be a universal MUST (some domains forbid it: append-only logs, payment capture). The current SHOULD is correct; the prose framing in "Why Agents Need It" is fine because it explains *why* idempotency matters when it is available, not that it is universally required. -- **[edit]** *Vague agent-native.* "'Agents retry failed operations by default' — true for Claude Code/Cursor/Aider tool +- **[edit]** *Vague agent-native.* "'Agents retry failed operations by default'. True for Claude Code/Cursor/Aider tool loops; not universally true for one-shot harnesses or human-in-the-loop agents." Resolved: "Why Agents Need It" hedged to "Agent harnesses commonly retry failed operations." Same operational point; more accurate across harness shapes. diff --git a/spec/principles/p6-composable-predictable-command-structure.md b/spec/principles/p6-composable-predictable-command-structure.md index 0448d8e..e8aecd1 100644 --- a/spec/principles/p6-composable-predictable-command-structure.md +++ b/spec/principles/p6-composable-predictable-command-structure.md @@ -1,21 +1,26 @@ --- id: p6 title: Composable and Predictable Command Structure -last-revised: 2026-04-22 +last-revised: 2026-05-07 status: active requirements: - id: p6-must-sigpipe level: must applicability: universal summary: SIGPIPE is handled so piping to `head`/`tail` does not crash the process (Rust example below; Python/Go/Node have language-specific equivalents). + - id: p6-must-sigterm + level: must + applicability: + if: CLI has long-running operations + summary: "Long-running operations handle SIGTERM gracefully: flush or roll back partial writes, release locks, exit non-zero within a bounded window. Next invocation succeeds without manual cleanup." - id: p6-must-no-color level: must applicability: universal - summary: TTY detection plus support for `NO_COLOR` and `TERM=dumb` — color codes suppressed when stdout/stderr is not a terminal. + summary: "TTY detection plus support for `NO_COLOR` and `TERM=dumb`: color codes suppressed when stdout/stderr is not a terminal." - id: p6-must-completions level: must applicability: universal - summary: Shell completions available via a `completions` subcommand (Tier 1 meta-command — needs no config/auth/network). + summary: Shell completions available via a `completions` subcommand (Tier 1 meta-command, needs no config/auth/network). - id: p6-must-timeout-network level: must applicability: @@ -54,6 +59,11 @@ requirements: level: may applicability: universal summary: "`--color auto|always|never` flag for explicit color control beyond TTY auto-detection." + - id: p6-may-standard-names + level: may + applicability: + if: CLI uses subcommands + summary: "Subcommand verbs MAY follow community-standard names (`get`/`list`/`create`/`update`/`delete`); flag spellings MAY follow widely-used canonical forms (`--force`, `--yes`, `--limit`, `--quiet`, `--verbose`)." --- # P6: Composable and Predictable Command Structure @@ -88,14 +98,20 @@ tool a building block rather than a dead end. unsafe { libc::signal(libc::SIGPIPE, libc::SIG_DFL); } ``` - Equivalents in other languages: Python — restore the default `SIGPIPE` handler at startup - (`signal.signal(signal.SIGPIPE, signal.SIG_DFL)`); Go — the runtime's default handling already exits cleanly on - EPIPE writes; Node.js — handle `EPIPE` on `process.stdout`. + Equivalents in other languages: in Python, restore the default `SIGPIPE` handler at startup + (`signal.signal(signal.SIGPIPE, signal.SIG_DFL)`); in Go, the runtime's default handling already exits cleanly on + EPIPE writes; in Node.js, handle `EPIPE` on `process.stdout`. + +- Agent harnesses send SIGTERM when their own timeout fires. A CLI that exits abruptly leaving a half-written file, a + stale `*.tmp` artifact, or a held flock makes the next invocation fail with a confusing error the agent cannot + diagnose. Long-running operations MUST handle SIGTERM by flushing or rolling back partial writes, releasing acquired + locks, and exiting non-zero within a bounded shutdown window. The next invocation MUST succeed without manual cleanup + of the previous run's state. This complements the existing SIGPIPE MUST (`p6-must-sigpipe`). - TTY detection, plus support for `NO_COLOR` and `TERM=dumb`. When stdout or stderr is not a terminal, color codes are suppressed automatically. - Shell completions available via a `completions` subcommand (clap_complete in Rust; equivalents elsewhere). This is a - Tier 1 meta-command — it works without config, auth, or network. + Tier 1 meta-command: it works without config, auth, or network. - Network CLIs ship a `--timeout` flag with a sensible default (30 seconds). Agents operating under their own time budgets need to fail fast rather than block on a slow upstream. - If the CLI uses a pager (`less`, `more`, `$PAGER`), it supports `--no-pager` or respects `PAGER=""`. Pagers block @@ -105,18 +121,23 @@ tool a building block rather than a dead end. **SHOULD:** -- Commands that accept input read from stdin when no file argument is provided. Pipeline composition depends on it. -- Subcommand naming follows a consistent `noun verb` or `verb noun` convention throughout the tool. Mixing patterns - (e.g., `list-users` alongside `user show`) forces agents to learn exceptions. +- Commands that accept input data SHOULD read from stdin when no file argument is provided. Pipeline composition depends + on it. +- Subcommand naming SHOULD follow one consistent grammar (`noun verb` or `verb noun`) throughout the tool. Mixed + patterns (e.g., `list-users` alongside `user show`) force consumers to memorize exceptions instead of applying a rule. - A three-tier dependency gating pattern: Tier 1 (meta-commands like `completions`, `version`) needs nothing; Tier 2 (local commands) needs config; Tier 3 (network commands) needs config + auth. `completions` and `version` always work, even in broken environments. -- Operations are modeled as subcommands, not flags. `tool search "query"` is correct; `tool --search "query"` is wrong. - Flags modify behavior (`--quiet`, `--output json`); subcommands select operations. +- Operations SHOULD be modeled as subcommands, not flags. `tool search "query"` is correct; `tool --search "query"` + conflates two roles. Flags modify behavior (`--quiet`, `--output json`); subcommands select operations. **MAY:** - A `--color auto|always|never` flag for explicit color control beyond TTY auto-detection. +- Subcommand verbs MAY follow community-standard names (`get` / `list` / `create` / `update` / `delete`); flag spellings + MAY follow widely-used canonical forms (`--force` for confirmation bypass, `--yes` for prompt bypass, `--limit` for + pagination, `--quiet`/`--verbose` for volume control). Convergence reduces an agent's per-tool relearning cost: an + agent that has seen `kubectl get` and `gh repo list` recognizes `tool list` immediately, without re-reading `--help`. ## Evidence @@ -130,31 +151,31 @@ tool a building block rather than a dead end. ## Anti-Patterns -- Missing SIGPIPE handler — `cargo run -- list | head` panics with "broken pipe". +- Missing SIGPIPE handler: `cargo run -- list | head` panics with "broken pipe". - Hard-coded ANSI escape codes without TTY detection. -- Color output in JSON mode — ANSI codes inside JSON string values break downstream parsing. +- Color output in JSON mode: ANSI codes inside JSON string values break downstream parsing. - A `completions` command that requires auth or config to run. - No stdin support on commands where piped input is a natural use case. -Measured by check IDs `p6-sigpipe`, `p6-no-color`, `p6-completions`, `p6-timeout`, `p6-agents-md`. Run `agentnative -check --principle 6 .` against your CLI to see each. +Measured by audit IDs `p6-sigpipe`, `p6-no-color`, `p6-completions`, `p6-timeout`, `p6-agents-md`. Run `anc audit +--principle 6 .` against the CLI under test to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[edit]** *Prior art / vague agent-native.* "The SIGPIPE MUST prescribes `unsafe { libc::signal(libc::SIGPIPE, - libc::SIG_DFL); }` as the first `main()` statement — that is a Rust-specific remedy. Python raises `BrokenPipeError` - by default (different fix), Go's runtime already exits cleanly on EPIPE writes (no fix needed), Node.js needs + libc::SIG_DFL); }` as the first `main()` statement. That is a Rust-specific remedy. Python raises `BrokenPipeError` by + default (different fix), Go's runtime already exits cleanly on EPIPE writes (no fix needed), Node.js needs `process.stdout.on('error')`. The MUST as written is correct in spirit but the prescription leaks Rust into a universal-applicability rule." Resolved: prose bullet now leads with the language-neutral MUST ("SIGPIPE is handled so that piping to `head`, `tail`, or any tool that closes the pipe early does not crash the process"); the Rust snippet stays as the canonical example; per-language one-liners cover Python, Go, and Node. Frontmatter summary updated to match. -- **[edit]** *Must-vs-should.* "The `global = true` MUST is a clap-API artifact — the behavioral requirement is 'agentic +- **[edit]** *MUST-vs-SHOULD.* "The `global = true` MUST is a clap-API artifact. The behavioral requirement is 'agentic flags propagate to every subcommand,' which is what the prose actually says. The frontmatter summary baking `global = true` into a universal contract overfits to one library." Resolved: frontmatter summary and prose bullet now lead with the behavioral requirement ("propagate to every subcommand"), with `global = true` cited as the clap-specific example. diff --git a/spec/principles/p7-bounded-high-signal-responses.md b/spec/principles/p7-bounded-high-signal-responses.md index a77da84..ec0fba5 100644 --- a/spec/principles/p7-bounded-high-signal-responses.md +++ b/spec/principles/p7-bounded-high-signal-responses.md @@ -1,7 +1,7 @@ --- id: p7 title: Bounded, High-Signal Responses -last-revised: 2026-04-22 +last-revised: 2026-05-07 status: active requirements: - id: p7-must-quiet @@ -12,7 +12,7 @@ requirements: level: must applicability: if: CLI has list-style commands - summary: "List operations clamp to a sensible default maximum; when truncated, indicate it (`\"truncated\": true` in JSON, stderr note in text)." + summary: "List operations clamp to a documented default maximum; when truncated, indicate it (`\"truncated\": true` in JSON, stderr note in text)." - id: p7-should-verbose level: should applicability: universal @@ -41,13 +41,13 @@ requirements: ## Definition -CLI tools MUST provide mechanisms to control output volume. Agent context windows are finite and expensive — a tool that -dumps 10,000 lines of unfiltered output wastes tokens and may exceed the context limit entirely, breaking the +CLI tools MUST provide mechanisms to control output volume. Agent context windows are finite and expensive: a tool that +dumps 10,000 lines of unfiltered output wastes tokens and can exceed the context limit entirely, breaking the conversation that invoked it. ## Why Agents Need It -Unbounded CLI output is expensive for any agent — token cost and context-window capacity for LLM agents, parse cost and +Unbounded CLI output is expensive for any agent: token cost and context-window capacity for LLM agents, parse cost and memory pressure for scripts, schedulers, and other automation. Either way, the agent ends up truncating (losing potentially important data) or consuming the full response (wasting cycles on noise). Bounded output with `--quiet`, `--verbose`, and `--limit` flags gives the agent precise control over how much data arrives, keeping responses @@ -57,9 +57,9 @@ high-signal and inside budget. **MUST:** -- A `--quiet` flag suppresses non-essential output: progress indicators, informational messages, decorative formatting. - When `--quiet` is set, only requested data and errors appear. Implementations typically route diagnostics through a - macro that short-circuits when quiet is on: +- A `--quiet` flag MUST suppress non-essential output (progress indicators, informational messages, decorative + formatting). Under `--quiet`, only requested data and errors appear. The Rust realization gates diagnostics through a + macro: ```rust macro_rules! diag { @@ -69,19 +69,19 @@ high-signal and inside budget. } ``` -- List operations clamp to a sensible default maximum. A `list` without `--limit` does not return more than a - configurable ceiling (e.g., 100 items). If more items exist, the output indicates truncation — `"truncated": true` in - JSON, a stderr note in text mode. +- List operations MUST clamp to a documented default maximum. A `list` invoked without `--limit` returns no more than a + configurable ceiling (e.g., 100 items). When the underlying result set exceeds the ceiling, the output signals + truncation: `"truncated": true` in JSON, a stderr note in text mode. **SHOULD:** - A `--verbose` flag (or `-v` / `-vv`) escalates diagnostic detail when agents need to debug failures. -- A `--limit` or `--max-results` flag lets callers request exactly the number of items they want. +- A `--limit` (or `--max-results`) flag SHOULD let callers request exactly the number of items they want. - A `--timeout` flag bounds execution time. An agent waiting indefinitely on a hung network call cannot proceed. **MAY:** -- Cursor-based pagination flags (`--after`, `--before`) for efficient traversal of large result sets. +- Cursor-based pagination flags (`--after`, `--before`) MAY be offered for efficient traversal of large result sets. - Automatic verbosity reduction in non-TTY contexts (the same behavior `--quiet` explicitly requests). ## Evidence @@ -96,32 +96,32 @@ high-signal and inside budget. ## Anti-Patterns -- List commands that return all results with no default limit — an agent listing 50,000 items floods its context window. -- No `--quiet` flag — agents consuming JSON output still receive interleaved diagnostic text on stderr. +- List commands that return all results with no default limit. An agent listing 50,000 items floods its context window. +- No `--quiet` flag. Agents consuming JSON output still receive interleaved diagnostic text on stderr. - `--verbose` as the only output control. If there is no way to reduce output, bounded responses do not exist. - Progress bars or spinners that write to stderr in non-TTY contexts, adding noise to agent logs. - No `--timeout` on network operations. A stalled request blocks the agent indefinitely. -Measured by check IDs `p7-quiet`, `p7-limit`, `p7-timeout`. Run `agentnative check --principle 7 .` against your CLI to -see each. +Measured by audit IDs `p7-quiet`, `p7-limit`, `p7-timeout`. Run `anc audit --principle 7 .` against the CLI under test +to see each. ## Pressure test notes -### 2026-04-27 — Show HN launch red-team pass +### 2026-04-27: Red-team pass Adversarial review via `compound-engineering:ce-adversarial-document-reviewer` ahead of the v0.3.0 launch. Findings recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". - **[later]** *Internal inconsistency.* "`--timeout` is universal SHOULD in P7 but conditional MUST in P6 (`p6-must-timeout-network`). For network CLIs the two compose (MUST wins), but P7's prose ('An agent waiting - indefinitely on a hung network call cannot proceed') only motivates the network case — the universal scope is + indefinitely on a hung network call cannot proceed') only motivates the network case. The universal scope is unjustified by its own rationale." Deferred: narrowing P7's `applicability` from `universal` to non-network - long-running operations only — or to `if: CLI has long-running operations` — fires the coupled-release norm (CLI + long-running operations only (or to `if: CLI has long-running operations`) fires the coupled-release norm (CLI registry parses `applicability`). Bundled with other applicability cleanups for a v0.4.0 PR with explicit registry coordination. -- **[later]** *Must-vs-should.* "The list-clamping MUST fires on every CLI with 'list-style commands' regardless of +- **[later]** *MUST-vs-SHOULD.* "The list-clamping MUST fires on every CLI with 'list-style commands' regardless of natural cardinality. A tool whose list operation returns a bounded small set by construction (e.g., `anc principles - list` → exactly 7) gains nothing from a clamp + `\"truncated\": true` contract — the clamp is unreachable and the + list` → exactly 7) gains nothing from a clamp + `\"truncated\": true` contract: the clamp is unreachable and the truncation flag is dead schema." Deferred: narrowing the `if:` clause from "CLI has list-style commands" to "CLI has list-style commands whose result set is unbounded or user-data-driven" changes the registry-parsed applicability value. Bundled with the P3 / P7 applicability cleanups for v0.4.0. diff --git a/spec/principles/p8-discoverable-skill-bundle.md b/spec/principles/p8-discoverable-skill-bundle.md new file mode 100644 index 0000000..53206f9 --- /dev/null +++ b/spec/principles/p8-discoverable-skill-bundle.md @@ -0,0 +1,94 @@ +--- +id: p8 +title: Discoverable Through Agent Skill Bundles +last-revised: 2026-05-29 +status: active +requirements: + - id: p8-must-bundle-install + level: must + applicability: + kind: conditional + antecedent: + audit_id: p8-bundle-exists + summary: "When a skill bundle exists, the CLI provides an install path (`tool skill install []`) that registers the bundle with installed agent runtimes." + - id: p8-should-bundle-exists + level: should + applicability: universal + summary: "CLIs ship a top-level agent-discoverable markdown bundle (`AGENTS.md`, `SKILL.md`, or equivalent) with YAML frontmatter naming the tool and capability summary." + - id: p8-may-install-all + level: may + applicability: + kind: conditional + antecedent: + audit_id: p8-bundle-exists + summary: "An `--all` mode auto-detects installed runtimes (Claude Code, Cursor, Codex, OpenCode, etc.) and installs across all." + - id: p8-may-bundle-update + level: may + applicability: + kind: conditional + antecedent: + audit_id: p8-bundle-exists + summary: "An update/upgrade subcommand (`tool skill update`) pulls the latest bundle version." +--- + +# P8: Discoverable Through Agent Skill Bundles + +## Definition + +A skill bundle is a structured markdown file (canonical names: `AGENTS.md` or `SKILL.md`) with YAML frontmatter that +names the tool, describes its capabilities, and provides workflow guidance an agent can load into its runtime. The +bundle lives outside the CLI's flag space: agents discover it via filesystem convention, not via `--help`. + +## Why Agents Need It + +`--help` describes what is *possible* (the flag and subcommand surface); a skill bundle describes what to *do* (workflow +knowledge, common compositions, recovery patterns). Workflow knowledge does not fit in `after_help` examples. Without a +bundle, every invocation begins with a `--help` round-trip plus inference; with one, the agent loads `SKILL.md` once and +recognizes the tool's idioms across every subsequent invocation. + +## Requirements + +**MUST:** + +- If a CLI ships a skill bundle, then it MUST provide an install path that registers the bundle with installed agent + runtimes. The canonical form is a `tool skill install []` subcommand that writes into the runtime's filesystem + cascade (e.g., `~/.claude/skills/`, `~/.cursor/skills/`). Non-canonical alternatives (`tool init --skill`, `tool + skills add`, `tool agents add`) are acceptable but SHOULD migrate toward `tool skill install`. A bundle without an + install path sits unread until a human manually copies it; the install path is what turns the bundle from + documentation into discoverable runtime knowledge. + +**SHOULD:** + +- CLIs SHOULD ship a top-level agent-discoverable markdown bundle (canonical names are `AGENTS.md` or `SKILL.md`, both + recognized by major agent runtimes) with YAML frontmatter naming the tool and summarizing its capabilities. The + bundle's first job is to be findable by filesystem convention; its second is to teach the agent how to invoke the tool + well. + +**MAY:** + +- If a CLI ships a skill bundle, then an `--all` mode MAY auto-detect installed agent runtimes (Claude Code, Cursor, + Codex, OpenCode, and others as the ecosystem evolves) and install the bundle across each. A user setting up a new + machine with multiple coding agents installs once and gets coverage across every runtime. +- If a CLI ships a skill bundle, then an `update` (or `upgrade`) subcommand under `tool skill` MAY pull the latest + bundle version, so agents stay current with the CLI's evolving surface without a full reinstall. + +## Evidence + +- A top-level `AGENTS.md` or `SKILL.md` in the CLI's source tree (and shipped in the release artifact) with YAML + frontmatter declaring at least the tool name and a one-line capability summary. +- A `skill` subcommand group in the CLI enum (e.g., `tool skill install`, `tool skill update`, `tool skill list`). +- An installer that targets the runtime cascade directly (file writes to `~/.claude/skills//`, etc.) rather than + requiring the runtime to be running. +- Bundle content versioned alongside the CLI's release: the bundle ships from the same commit as the binary, not from a + separate doc tree that drifts. + +## Anti-Patterns + +- A CLI shipping a skill bundle with no install path: the bundle sits unread until a human manually copies it. +- An install path that requires the agent runtime to be running: `tool skill install` writes to the runtime's filesystem + cascade (e.g., `~/.claude/skills/`) rather than requiring an active session. +- A bundle whose contents drift from the CLI's actual surface: the bundle is part of the CLI's release artifact, not a + separate doc tree. + +The vendor census in the v0.4.0 source-mining sprint documents the shipped patterns across Firecrawl, CLI-Anything, gws, +Crush, and larksuite; the `agentnative-skill` repo's `bin/check-update` is a reference for an update-check pattern. diff --git a/spec/principles/scoring.md b/spec/principles/scoring.md new file mode 100644 index 0000000..0ef9929 --- /dev/null +++ b/spec/principles/scoring.md @@ -0,0 +1,129 @@ +--- +title: "Scoring — leaderboard formula and badge eligibility" +last-revised: 2026-05-28 +--- + +# Scoring — leaderboard formula and badge eligibility + +This document defines how a tool's per-requirement scorecard collapses into the single `score_pct` that drives the +`anc.dev` leaderboard, the badge color, and badge eligibility. The taxonomy of per-requirement statuses and how they are +produced is defined in [`AGENTS.md`](AGENTS.md); this document is the layer above it: how those rows aggregate into one +number. + +The formula is the spec-side contract. The `anc` CLI computes `badge.score_pct` to match it, and the +[`agentnative-site`](https://github.com/brettdavies/agentnative-site) renderer reads the eligibility floor and color +bands defined here. + +## Scope: shipped-binary behavior only + +A public score reflects how the **shipped binary behaves**, observed by running it (`anc audit --command `). +Source-code and repository audits (static analysis of a project's source tree, manifest inspection, bundle presence) do +**not** contribute to the public score. What a tool's source code looks like does not change how an agent experiences +the installed binary. The binary is what agents run. + +Concretely, only behavioral-layer requirement rows enter the formula. Source-layer and project-layer audits are out of +scope for the leaderboard; they belong to a future advisory mode (`anc` run against a source tree to help authors +improve before release), not to the published score. This holds uniformly: every tool on the leaderboard, including +`anc` itself, is scored from binary behavior alone, so the comparison is like-for-like. + +## Inputs: the seven statuses + +Each behavioral requirement row carries a `status` and a `tier` (`must`, `should`, or `may`). The formula treats the +seven statuses in three groups: + +| Status | In denominator? | Execution credit | Meaning | +| --------- | --------------- | ---------------- | ------------------------------------------------ | +| `pass` | yes | 1.0 | Behavior present and correct. | +| `warn` | yes | 0.5 | Behavior present, partially correct. | +| `fail` | yes | 0.0 | Behavior expected, absent or broken. | +| `opt_out` | yes | 0.0 | Behavior deliberately declined (counts against). | +| `n_a` | no | — | Inapplicable: a conditional antecedent is unmet. | +| `skip` | no | — | Unmeasurable: the probe could not determine. | +| `error` | no | — | The probe raised an exception. | + +`opt_out` is in the denominator on purpose: a deliberate decision not to adopt a behavior is a real signal worth +reflecting, distinct from a behavior that simply does not apply (`n_a`) or could not be measured (`skip`). The +distinction is the whole point of the seven-status taxonomy: it lets the score exclude genuinely inapplicable audits +without letting deliberate non-adoption hide. + +## Formula + +For a tool's set of behavioral rows, let `D` be the rows whose status is in `{pass, warn, fail, opt_out}` (the +denominator set). With per-tier weights `w(must)`, `w(should)`, `w(may)`: + +```text +score_pct = round( 100 × Σ_{i∈D} w(tier_i) · credit(status_i) + ─────────────────────────────────────── ) + Σ_{i∈D} w(tier_i) +``` + +where `credit(pass) = 1.0`, `credit(warn) = 0.5`, `credit(fail) = credit(opt_out) = 0.0`. A tool with an empty +denominator set scores 0. + +### Tier weights (tunable) + +The tier weights are a parameter, not a constant baked into the definition, so the standard can re-tune the balance +between MUST/SHOULD/MAY without redefining the formula. The current published weights are **flat**: + +```text +w(must) = w(should) = w(may) = 1 +``` + +Flat weights keep the score legible (every behavioral audit counts the same) and avoid over- or under-rewarding +optional-tier adoption relative to the mandatory baseline. Under flat weights the formula reduces to a simple +credit-weighted ratio: + +```text +score_pct = round( 100 × (n_pass + 0.5 · n_warn) / (n_pass + n_warn + n_fail + n_opt_out) ) +``` + +A future revision can move to non-flat weights (for example, weighting failures at the MUST tier more heavily). Such a +change is a re-tuning of a published parameter and ships through the normal versioned-release path, subject to the +stability commitment below. + +### Worked example + +A narrow filter that ships no structured output: 20 `pass`, 7 `warn`, 0 `fail`, 1 `opt_out`, 1 `n_a`, 14 `skip`. + +- Denominator set `D` = the 20 + 7 + 0 + 1 = 28 rows that are pass/warn/fail/opt_out. The `n_a` and `skip` rows are + excluded. +- Numerator (flat weights) = 20 × 1.0 + 7 × 0.5 + 0 + 1 × 0.0 = 23.5. +- `score_pct = round(100 × 23.5 / 28) = 84` → **Strong** band. + +## Eligibility floor + +**A tool is badge-eligible at `score_pct ≥ 70`.** + +The floor is deliberately low. The badge's job is to spread the standard: a tool that clears a reasonable bar can +display it and point readers at its scorecard. Exclusivity is carried by the cohort bands below and by the score shown +on the badge itself, not by a high gate. A tool below the floor still gets a rendered badge and scorecard. The color +reflects the lower score, and the scorecard page shifts to improvement hints rather than the embed snippet (see +[`docs/badge.md`](../docs/badge.md#regression-behavior)). + +## Cohort bands + +The score maps to one of four cohort bands above the floor, plus a below-floor state. The bands are the exclusivity +signal. The top band is intentionally rare. + +| Band | Score | Meaning | +| ------------- | ------- | ----------------------------------------------------- | +| **Exemplary** | `≥ 85` | Near-complete binary-behavior conformance. Rare. | +| **Strong** | `80–84` | Broad conformance with minor gaps. | +| **Solid** | `75–79` | Solidly above the floor. | +| **Qualified** | `70–74` | Meets the eligibility floor. | +| _below floor_ | `< 70` | Not yet eligible; badge renders muted improvement UX. | + +The band thresholds are the spec-side contract; [`docs/badge.md`](../docs/badge.md) maps each band to a rendered color. +Exact colors are a rendering detail owned by the site. + +## Stability commitment + +The formula, tier weights, eligibility floor, and band thresholds are held stable for at least six months from +publication so that authors who embed the badge can rely on it. Any change ships through a versioned spec release with +the rationale recorded, never as a silent re-tuning. + +## Cross-references + +- [`AGENTS.md`](AGENTS.md): the seven-status taxonomy, per-row scorecard model, and antecedent propagation. +- [`docs/badge.md`](../docs/badge.md): badge claim, embed shapes, color rendering per band, regression behavior. +- RFC 2119 / RFC 8174: MUST / SHOULD / MAY tier semantics. From 48ee07b318af7366ed0bd0a4b15b3e584b35d7ce Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 11:48:42 -0500 Subject: [PATCH 07/15] fix(docs): strip leaked tool-output trailers from top-level markdown (#20) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary `README.md`, `AGENTS.md`, and `CONTRIBUTING.md` each ended with literal `` and `` XML tags — leaked tool-output cruft from an earlier AI-assisted edit. These render as raw text in any markdown viewer (GitHub, Obsidian, mdformat) and degrade the first impression of the bundle. Strip them; the rest of each file is unchanged. `grep -rn '\|' --include='*.md' .` returns zero matches after the strip. ## Changelog ### Fixed - Strip leaked `` / `` XML trailers from `README.md`, `AGENTS.md`, and `CONTRIBUTING.md`. ## Type of Change - [x] `fix`: Bug fix (non-breaking change which fixes an issue) ## Files Modified **Modified:** `README.md`, `AGENTS.md`, `CONTRIBUTING.md` ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] No new warnings or errors introduced - [x] Changes are backward compatible --- AGENTS.md | 3 --- CONTRIBUTING.md | 2 -- README.md | 2 -- 3 files changed, 7 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 2a8adde..426f108 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -110,6 +110,3 @@ table. - [`SECURITY.md`](./SECURITY.md) — vulnerability disclosure - [`RELEASES.md`](./RELEASES.md) — release procedure - [`CONTRIBUTING.md`](./CONTRIBUTING.md) — how to propose changes - - - diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 46b286b..f0e0a9f 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -131,5 +131,3 @@ The full visitor-facing menu lives at [`anc.dev/contribute`](https://anc.dev/con - [Linter](https://github.com/brettdavies/agentnative-cli): `anc`, the scoring engine, the registry - [Site](https://github.com/brettdavies/agentnative-site): anc.dev source, leaderboard renderer, live-scoring - This repo: the agent-facing skill bundle, install paths, host-runtime detection - - diff --git a/README.md b/README.md index 619c9f2..00cf083 100644 --- a/README.md +++ b/README.md @@ -113,5 +113,3 @@ tooling without re-licensing friction. Vendored spec content under `spec/` is CC BY 4.0 (upstream from [`brettdavies/agentnative`](https://github.com/brettdavies/agentnative)); attribution is in [`spec/README.md`](./spec/README.md). - - From 2a925d2652a9481234ec067bdb38c90db6634cb0 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:07:29 -0500 Subject: [PATCH 08/15] feat(skill): align with anc v0.5.0 and add eval suite (#21) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Update the skill bundle to match `anc` v0.5.0's surface, re-vendor `spec/` to `agentnative-spec` v0.5.0, and add an `evals/` directory with three dispatchable prompts. The skill content was tracking `anc` v0.3.1. v0.5.0 shipped breaking renames (`check → audit`, `generate → emit`, `schema → emit schema`), a new scorecard JSON shape (schema 0.5 → 0.7, per-row `id` / `audit_id` / `tier` fields, two new statuses `opt_out` and `n_a`), a lowered badge floor (80% → 70%), and several new top-level flags (`--examples`, `--json`, `--raw`, `--color`, `--verbose`). `SKILL.md` and `getting-started.md` document that surface; the vendored spec at `spec/VERSION` is now 0.5.0. `evals/` ships three self-contained prompts that test the bundle via fresh-agent dispatch. Running them during this PR surfaced two doc gaps: the scoring formula's behavioral-layer-only scope, and the semantics of `coverage_summary.must.verified`. Both are fixed here. Each eval's "Anti-patterns to detect" section names the stale strings (schema 0.5, 80% floor, `requirement_id` field name) so re-runs surface drift. ## Changelog ### Added - Add `evals/` with three self-contained prompts covering greenfield Rust, remediate-existing-Rust, and multi-language Python (Click) workflows. - Document `anc skill install --all` and `anc skill update [host|--all]` in the install section. - Document `anc emit schema` for extracting the scorecard JSON Schema embedded in the binary. ### Changed - Re-vendor `spec/` to `agentnative-spec` v0.5.0. - Track `anc` v0.5.0 scorecard surface: schema 0.7, per-row `id` / `audit_id` / `tier` fields, `opt_out` and `n_a` statuses, 70% badge floor. - Surface new top-level flags: `--examples`, `--json`, `--raw`, `--color`, `--verbose`. ### Fixed - Correct the "no MUST violations" check: `coverage_summary.must.verified` counts any verdict (including `fail`), so the right bar is no `results[]` row where `tier == "must" && status == "fail"`. - Clarify that `badge.score_pct` is computed from behavioral-layer rows only. Source- and project-layer audits do not affect the score. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) - [x] `docs`: Documentation update ## Files Modified **Modified:** `SKILL.md`, `getting-started.md`, `spec/VERSION`, `spec/CHANGELOG.md`, four `spec/principles/p*.md` files **Created:** `evals/README.md`, `evals/01-greenfield-rust-cli.md`, `evals/02-remediate-existing-rust-cli.md`, `evals/03-multilang-python-cli.md` ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] No new warnings or errors introduced --- README.md | 11 +- SKILL.md | 110 ++++++++++---- evals/01-greenfield-rust-cli.md | 115 +++++++++++++++ evals/02-remediate-existing-rust-cli.md | 133 +++++++++++++++++ evals/03-multilang-python-cli.md | 136 ++++++++++++++++++ evals/README.md | 38 +++++ getting-started.md | 35 +++-- spec/CHANGELOG.md | 41 ++++-- spec/VERSION | 2 +- .../p3-progressive-help-discovery.md | 6 +- .../p4-fail-fast-actionable-errors.md | 4 +- .../p5-safe-retries-mutation-boundaries.md | 6 +- .../p7-bounded-high-signal-responses.md | 4 +- 13 files changed, 570 insertions(+), 71 deletions(-) create mode 100644 evals/01-greenfield-rust-cli.md create mode 100644 evals/02-remediate-existing-rust-cli.md create mode 100644 evals/03-multilang-python-cli.md create mode 100644 evals/README.md diff --git a/README.md b/README.md index 00cf083..1ee5de0 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,10 @@ agentnative-skill/ ├── scripts/ │ ├── sync-spec.sh vendor the latest agentnative-spec v* tag into spec/ │ ├── sync-prose-tooling.sh vendor BRAND.md from agentnative-spec main HEAD -│ └── generate-changelog.sh release-time CHANGELOG generator (git-cliff + PR-body extraction) +│ ├── sync-dev-after-release.sh post-release backport: replay release/* artifacts onto dev +│ ├── generate-changelog.sh release-time CHANGELOG generator (git-cliff + PR-body extraction) +│ └── hooks/pre-push local CI mirror (markdownlint + shellcheck), installed via core.hooksPath +├── evals/ self-contained eval prompts for fresh-agent dispatch (producer-side; not loaded by hosts) ├── docs/plans/ engineering plans (dev-only — guarded out of main) ├── .github/ workflows, rulesets, issue templates, PR template ├── AGENTS.md project-level agent instructions FOR THIS REPO (producer-side) @@ -45,9 +48,9 @@ agentnative-skill/ ``` Consumer-facing files (`SKILL.md`, `getting-started.md`, `bin/`, `spec/`, `references/`, `templates/`, `VERSION`, -`LICENSE-*`) are read by the agent at runtime. Producer-side files (`scripts/`, `docs/`, `.github/`, `AGENTS.md`, -`CONTRIBUTING.md`, `RELEASES.md`, `cliff.toml`) ship to consumers via `git clone` but are inert at runtime. The host -discovers `SKILL.md` and ignores everything else. +`LICENSE-*`) are read by the agent at runtime. Producer-side files (`scripts/`, `evals/`, `docs/`, `.github/`, +`AGENTS.md`, `BRAND.md`, `PRODUCT.md`, `CONTRIBUTING.md`, `RELEASES.md`, `RELEASES-RATIONALE.md`, `cliff.toml`) ship to +consumers via `git clone` but are inert at runtime. The host discovers `SKILL.md` and ignores everything else. ## Install diff --git a/SKILL.md b/SKILL.md index 9c7b0dc..a4e9a00 100644 --- a/SKILL.md +++ b/SKILL.md @@ -6,24 +6,25 @@ description: >- [`agentnative-spec`](https://github.com/brettdavies/agentnative) (the canonical principle text, vendored at `spec/`). Provides starter templates, language-specific implementation idioms (Rust/clap, Python Click & argparse, Go Cobra, JS Commander/yargs/oclif, Ruby Thor), and a getting-started guide that points agents at - `anc audit --output json` and `anc skill install `. Use when designing a new CLI tool, building a Rust/clap - binary intended for agents, reviewing one for agent-readiness, claiming the agent-native badge, or remediating - findings from `anc`. Triggers on agentic CLI, agent-native, CLI design, CLI standard, agent-first, CLI for agents, - agent-friendly CLI, CLI compliance, agent-readiness, anc, anc audit, anc skill install, agent-native badge, - scorecard, audit-profile, Rust CLI, clap derive. SKIP when the user is building a TUI app meant for humans (use - `--audit-profile human-tui` rather than this skill), writing a non-CLI library, or asking unrelated Rust questions - not specifically about agent-readiness of a CLI. + `anc audit --output json` (scorecard schema 0.7) and `anc skill install ` / `anc skill update --all`. Use + when designing a new CLI tool, building a Rust/clap binary intended for agents, reviewing one for agent-readiness, + claiming the agent-native badge (≥70% credit-weighted score), or remediating findings from `anc`. Triggers on + agentic CLI, agent-native, CLI design, CLI standard, agent-first, CLI for agents, agent-friendly CLI, CLI + compliance, agent-readiness, anc, anc audit, anc emit schema, anc skill install, anc skill update, agent-native + badge, scorecard, audit_id, audit-profile, opt_out, n_a, Rust CLI, clap derive. SKIP when the user is building a + TUI app meant for humans (use `--audit-profile human-tui` rather than this skill), writing a non-CLI library, or + asking unrelated Rust questions not specifically about agent-readiness of a CLI. --- # Agent-Native CLI The standard for CLI tools designed to be operated by AI agents. Three artifacts work together: -| Artifact | Role | -| ---------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the eight principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/) — snapshot refreshed each release. | -| [`anc`](https://github.com/brettdavies/agentnative-cli) | The compliance auditor. Reads target source/binary, emits a JSON scorecard whose entries cite spec `requirement_id`s. The runtime authority. | -| **This skill** (`agent-native-cli`) | The agent-facing guide. Tells the agent how to invoke `anc`, how to navigate the spec when remediating findings, and where the implementation patterns and starter templates live. | +| Artifact | Role | +| ---------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the eight principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/) — snapshot refreshed each release. | +| [`anc`](https://github.com/brettdavies/agentnative-cli) | The compliance auditor. Reads target source/binary, emits a JSON scorecard whose entries cite the spec requirement `id` (e.g. `p1-must-no-interactive`) and the probe `audit_id`. The runtime authority. | +| **This skill** (`agent-native-cli`) | The agent-facing guide. Tells the agent how to invoke `anc`, how to navigate the spec when remediating findings, and where the implementation patterns and starter templates live. | The skill does **not** implement principles auditing. `anc` does. The skill teaches agents to use `anc` and supplies the surrounding context (spec, idioms, templates) that `anc`'s findings reference. @@ -48,7 +49,7 @@ map. ## The eight principles -Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.4.0`; see +Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.5.0`; see [`spec/README.md`](./spec/README.md) for resync instructions). One file per principle, each with machine-readable `requirements[]` frontmatter: @@ -70,37 +71,81 @@ Do not paraphrase the principles inside this skill — read the spec files direc Once `anc` is installed (one-line install in [`getting-started.md`](./getting-started.md)), the work is a four-step loop: -**1. Audit.** `anc audit --output json . > scorecard.json`. The JSON envelope is schema `0.5` and contains: +**1. Audit.** `anc audit --output json . > scorecard.json`. The JSON envelope is schema `0.7` and contains: -- `summary` — `total / pass / warn / fail / skip / error` count. -- `coverage_summary` — `must / should / may`, each with `total` + `verified`. `must.verified == must.total` is the bar - for "no MUST violations". +- `summary` — counters across the full status set: `total / pass / warn / fail / skip / error / opt_out / n_a`. Spans + all three audit layers (behavioral, source, project). +- `coverage_summary` — `must / should / may`, each with `total` + `verified`. **`verified` counts any verdict — `pass`, + `warn`, `fail`, `skip` all increment it.** "Was this MUST audited at all?" not "was it satisfied." The actual bar for + "no MUST violations" is no `results[]` row where `tier == "must" && status == "fail"` (equivalently, every MUST row is + `pass` / `warn` / `opt_out` / `skip` / `n_a`). - `badge.eligible` (bool), `badge.score_pct` (int), `badge.embed_markdown` (string or `null`), `badge.scorecard_url`, - `badge.badge_url`, `badge.convention_url`. **80%** is the eligibility floor; below it, `embed_markdown` is `null` and - the convention says do not advertise a badge. -- `results[]` — per-audit entries citing `requirement_id`, `status`, and `evidence`. + `badge.badge_url`, `badge.convention_url`. **70%** is the eligibility floor; below it, `embed_markdown` is `null` and + the convention says do not advertise a badge. **The score is computed over behavioral-layer rows only** — source- and + project-layer audits do not affect `score_pct` or badge eligibility (`spec/principles/scoring.md` § "Scope: + shipped-binary behavior only"). Under the current flat tier weights the formula reduces to a credit-weighted ratio: + `pass = 1.0`, `warn = 0.5`, `fail` and `opt_out` = `0.0` (all in the denominator); `n_a`, `skip`, and `error` are + excluded. The general form is tier-weighted (`w(must) · w(should) · w(may)`) but currently flat — see + `spec/principles/scoring.md` for the formula and the cohort bands above the floor. +- `results[]` — one entry per requirement-row. Each carries `id` (the spec requirement id, e.g. `p1-must-no-interactive` + — match this against `spec/principles/p-*.md` frontmatter), `audit_id` (the probe that produced the row, e.g. + `p1-non-interactive`), `tier` (`must` / `should` / `may`), `status`, `evidence`, `group`, `layer`, `confidence`, and + `label`. A single probe like `p3-version` emits two rows — one tier-stamped `must`, one `should` — so you can + attribute the verdict to a specific RFC 2119 level without joining the coverage matrix. - `audit_profile` — the exemption category in effect (or `null`). - `tool / anc / run / target` metadata — identifies the scored tool, the `anc` build, the invocation, and the resolved target. -**2. Fix.** For each `fail`, look up the cited `requirement_id` (e.g. `p1-must-no-interactive`) in -`spec/principles/p-*.md`'s `requirements[]` frontmatter. Apply the fix using the implementation references below. -Re-run with `--principle ` to focus on one principle while iterating. +**Status set.** `pass` / `warn` / `fail` / `skip` / `error` are the live verdicts. `opt_out` marks a deliberate +non-adoption (e.g. no `--output` flag → P2 schema-discovery rows collapse). `n_a` propagates from a conditional whose +antecedent is `opt_out` or `n_a` — the `evidence` field names the antecedent so the chain is legible from JSON alone. +The process exit code reflects live verdicts only: `n_a` from an `opt_out` antecedent does not force a non-zero exit. -**3. Re-audit.** Re-run `anc audit --output json .` until `summary.fail == 0` and `coverage_summary.must.verified == -coverage_summary.must.total`. Use `--audit-profile ` to suppress audits that don't apply to the tool class — -`human-tui` (TUIs that legitimately intercept the TTY), `file-traversal` (reserved), `posix-utility` (cat / sed / awk -style), `diagnostic-only` (read-only tools). Suppressed audits emit `Skip` with structured evidence so readers see what -was excluded. +**2. Fix.** For each `fail`, look up the cited `id` (e.g. `p1-must-no-interactive`) in `spec/principles/p-*.md`'s +`requirements[]` frontmatter. Apply the fix using the implementation references below. Re-run with `--principle ` to +focus on one principle while iterating. -**4. Claim the badge.** Once `badge.eligible == true` (≥80%), copy `badge.embed_markdown` into the project's README. The +**3. Re-audit.** Re-run `anc audit --output json .` until no MUST row is in `fail` — query the JSON with `jq +'[.results[] | select(.tier == "must" and .status == "fail")] | length'` and confirm it's `0`. Do not rely on +`coverage_summary.must.verified == coverage_summary.must.total` as the satisfaction bar — `verified` increments on any +verdict, so it can equal `total` while MUST fails still exist. Use `--audit-profile ` to suppress audits that +don't apply to the tool class — `human-tui` (TUIs that legitimately intercept the TTY), `file-traversal` (reserved), +`posix-utility` (cat / sed / awk style), `diagnostic-only` (read-only tools). Suppressed audits emit `Skip` with +structured evidence so readers see what was excluded. + +**4. Claim the badge.** Once `badge.eligible == true` (≥70%), copy `badge.embed_markdown` into the project's README. The `text` output appends an embed hint after the summary line whenever the floor is cleared; below the floor, nothing badge-related is printed (the convention's "do not nag" rule). +### Useful flags + +- `--principle ` — filter the audit to one principle while iterating. +- `--binary` / `--source` — scope to one layer (skip the other). +- `--audit-profile ` — suppress audits that don't apply (`human-tui`, `posix-utility`, `diagnostic-only`, + `file-traversal`). +- `--examples` — print a curated invocation block and exit (pair with `--output json` or `--json` for structured). +- `--json` — short alias for `--output json` (the `p2-should-json-aliases` convention). +- `--raw` — strip headers and summary, emit one `idstatus` line per audit. Pipe-friendly for `grep` / `awk`. +- `--color ` (env `AGENTNATIVE_COLOR`) — control ANSI styling; honors `NO_COLOR` in `auto` mode. +- `-v` / `--verbose` (env `AGENTNATIVE_VERBOSE`) — escalate diagnostic detail when debugging unexpected results. + +### Get the scorecard JSON Schema + +The canonical JSON Schema for the scorecard envelope ships embedded in the binary. Extract it without a network round +trip: + +```bash +anc emit schema # print to stdout +anc emit schema | jq '.title' # inspect a field +``` + +The schema `$id` is `https://anc.dev/scorecard-v0.7.schema.json`. Pre-0.6 consumers treated `opt_out` / `n_a` as unknown +— feature-detect the status enum rather than pinning to an exact list. + ## Implementation guidance (when fixing findings) -Once `anc audit` reports a failure, the agent has the cited `requirement_id` and the spec text. The next question is -"how do I write code that satisfies this requirement?" — answered by: +Once `anc audit` reports a failure, the agent has the cited `id` (requirement-row id) and the spec text. The next +question is "how do I write code that satisfies this requirement?" — answered by: | Need | File | | ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | @@ -133,7 +178,10 @@ To install **this skill bundle** into a host's canonical skills directory, use ` ```bash anc skill install claude_code # also: codex, cursor, factory, kiro, opencode +anc skill install --all # install into every known host in one invocation anc skill install --dry-run claude_code # print resolved git command without spawning +anc skill update claude_code # refresh an existing install (guards on SKILL.md marker) +anc skill update --all # refresh every known host ``` Recommended `anc audit` invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not diff --git a/evals/01-greenfield-rust-cli.md b/evals/01-greenfield-rust-cli.md new file mode 100644 index 0000000..ce9e250 --- /dev/null +++ b/evals/01-greenfield-rust-cli.md @@ -0,0 +1,115 @@ +# Eval 01 — Greenfield Rust CLI for an AI agent + +## Task + +You need to design and build a brand-new CLI tool intended to be operated **primarily by AI agents** (not humans running +it interactively). The CLI should be a small but realistic shape: a `widgetctl` binary that has: + +- `widgetctl list` — list widgets (read-only, no network). +- `widgetctl get ` — get one widget (read-only, no network). +- `widgetctl create --name ` — create a widget (mutates state; this is the only mutating verb). + +You are free to mock the actual widget store with an in-memory map or a JSON file — the storage layer is not the point. +What matters is that the CLI is **agent-ready**: structured output, predictable errors, no hidden interactivity, no +surprise side effects, no color/pager fights when piped, etc. + +You should research what "agent-ready" means before designing the CLI rather than guessing. + +## Workdir + +`/tmp/eval-greenfield-rust-cli-$(date +%s)/` + +Treat this as a fresh sandbox. Do not edit any other directories. Do not assume any tooling beyond what is on `$PATH` +(check first with `which`); install via `brew` / `cargo` if the right tool is missing. + +## Required artifacts + +By the end of the eval, the workdir must contain: + +1. `widgetctl/` — a Rust Cargo crate that compiles cleanly (`cargo build` succeeds) and whose binary runs. +2. `widgetctl/AGENTS.md` — a per-project AGENTS.md aimed at agent operators. +3. `scorecard.json` — the JSON output of whatever auditor you found, run against the built binary. +4. `BADGE.md` — if the scorecard reports the project as badge-eligible, the badge embed markdown copied out of the + scorecard. If the project is **not** eligible, this file should explain in 2–3 sentences why (which scoring rule + blocks it) and what the next remediation would be. +5. `NOTES.md` — your investigation log: what you researched, dead-ends you abandoned (and why), and which decisions were + judgment calls rather than rules from the docs. + +Do not commit any of these — the workdir is throwaway. + +## Success criteria (score 0–10 each) + +1. **Discovery via description, not name.** You discovered the relevant standard / spec / auditor by following a + description from a skill listing or by web search — not by being told its name. If you guessed the name from training + data and went straight to its repo, that is a partial fail; record it under "Dead ends". +2. **Auditor invoked correctly.** You ran the canonical auditor against your built binary and captured its JSON output. + The JSON envelope matches the current published schema version; you did not pin against a stale schema. +3. **Spec lookup is real.** For at least one finding, you traced the cited requirement id (e.g. something resembling + `pN-must-…`) back to the canonical spec text and quoted the constraint in `NOTES.md`. No paraphrasing the rule from + training data without verifying. +4. **Templates used where they exist.** If the standard ships starter templates for the language you chose, you used + them rather than hand-rolling boilerplate. You should not have invented your own `--quiet` / `--output` flag + conventions when the templates already encode them. +5. **`opt_out` and `n_a` handled correctly.** When the auditor reports `opt_out` or `n_a` rows, you correctly classified + them as deliberate non-adoptions / propagated conditionals — not as bugs to chase. `NOTES.md` lists at least one such + row and explains why you did not "fix" it. +6. **Badge claim discipline.** If the scorecard is below the eligibility floor, you did NOT advertise a badge. + `BADGE.md` reflects the actual scorecard state. +7. **Pipe-safe output.** Running `widgetctl list | cat` produces no ANSI escapes; running with `--output json | jq .` + succeeds; `widgetctl create … | head` does not error out from SIGPIPE. +8. **No interactive prompts.** Running any verb in a non-TTY (e.g., `widgetctl create --name foo < /dev/null`) neither + hangs nor reads from stdin unexpectedly. + +## Anti-patterns to detect + +These are regression markers from prior eval runs. If your transcript shows any of these, flag them explicitly in +`NOTES.md`: + +- **Schema pinning to 0.5.** The current scorecard schema is **not** 0.5. Pinning your parser to that string will break + against current output. Use the schema the binary emits today; document the version in `NOTES.md`. +- **80% threshold mentioned.** The badge eligibility floor is not 80%. Anywhere you write down a percentage, source it + from the badge section of the actual scorecard, not from cached prose. +- **Chasing `n_a` rows.** A row whose status is `n_a` because its antecedent collapsed is a propagated non-finding, not + a failure. If your "fixes" include touching code to flip an `n_a` row, that is wrong work. +- **Hand-rolled `--output json` flag.** If you wrote the JSON serialization layer by hand instead of using the starter + template that already encodes it, the eval is testing that you found the template. +- **Bumping skill bundle VERSION on a feature branch.** The bundle's `VERSION` file is a release artifact, not a + per-commit version. Do not modify it. +- **"Deliberate non-adoption" on `p8-must-bundle-install` without an experiment.** This eval requires + `widgetctl/AGENTS.md` (required artifact #2), which makes `p8-bundle-exists` pass and triggers + `p8-must-bundle-install` on the binary. The probe expects the binary to advertise an install path for the bundle + (candidate shapes named in the probe's evidence string, e.g. `skill install`, `init --skill`). The probe is + permissive: it greps `--help` / ` --help` for the expected token shapes and does **not** semantically validate + that the install does anything — a ~30-line stub subcommand that takes a destination and writes `AGENTS.md` to it + satisfies the row. So the rationalization "I shouldn't add a fake feature just to pass an audit" is factually wrong on + this probe: the audit asks for the discoverable surface, not the feature. Before classifying the row as `opt_out` / + "deliberate non-adoption" / "this is a fixture, not a real bundle," you **must** either (a) add a working stub install + subcommand to `widgetctl` and verify the row's status changes, or (b) re-run with `--audit-profile ` if a + matching profile actually suppresses this audit. Framing your way out without running either experiment fails this + criterion. +- **Confusing `coverage_summary.must.verified` with "MUSTs satisfied".** `verified` increments on any verdict (including + `fail` and `warn`), not just `pass`. The bar for "no MUST violations" is no `results[]` row where `tier == "must"` and + `status == "fail"`, not `must.verified == must.total`. +- **Counting all layers in `score_pct` math.** `badge.score_pct` is computed from behavioral-layer rows only — source- + and project-layer rows do not enter the formula. If you try to reconcile the score by summing all rows in `results[]` + the numbers will not match. Filter to `layer == "behavioral"` before doing the credit-weighted math. + +## Escalation rule + +You will hit at least one ambiguity the docs do not resolve directly (likely around an edge case in `--audit-profile` +selection, or the exact meaning of a probe whose evidence is terse). When that happens, the right order is: + +1. The canonical spec files (vendored under `spec/principles/`). +2. The auditor's own help (` --help`, ` audit --help`). +3. The auditor's embedded JSON Schema (the binary exposes a way to print it). +4. The auditor's repo issues / discussions. +5. As a last resort, ask the user. + +Document the escalation you actually took in `NOTES.md` § "Escalations". + +## What "done" looks like + +When you think you are done, run the auditor against the built binary one more time and capture the final +`scorecard.json`. Score yourself against the eight criteria above in `NOTES.md` § "Self-score" with one sentence of +justification per criterion. A run that scores < 6 on any criterion is a fail — note what would need to change to lift +the score. diff --git a/evals/02-remediate-existing-rust-cli.md b/evals/02-remediate-existing-rust-cli.md new file mode 100644 index 0000000..94c4dd0 --- /dev/null +++ b/evals/02-remediate-existing-rust-cli.md @@ -0,0 +1,133 @@ +# Eval 02 — Remediate an existing Rust CLI against the agent-readiness standard + +## Task + +You have a small, existing Rust CLI checked into a workdir. It compiles. It runs. It even has tests. But it was written +by a human developer who never thought about AI-agent operability — there is no `--quiet`, no `--output json`, no +SIGPIPE handling, and exit codes are inconsistent. You have been asked to: + +1. Audit it against the canonical agent-readiness standard. +2. Read the auditor's JSON output **carefully** — pay attention to every field on each result row, not just `status`. +3. Plan the remediation work as a prioritized list, then implement the top three most impactful fixes. +4. Re-audit and confirm the changes moved the needle. + +You are explicitly **not** asked to make every row Pass. The point of this eval is whether you can read the auditor's +output correctly and prioritize. + +## Workdir + +`/tmp/eval-remediate-rust-cli-$(date +%s)/` + +Inside it, create a minimal seed CLI to audit: + +```bash +mkdir -p /tmp/eval-remediate-rust-cli-$(date +%s) +cd /tmp/eval-remediate-rust-cli-$(date +%s) +cargo init --bin seed-cli +cd seed-cli +cat > src/main.rs <<'EOF' +use std::env; + +fn main() { + let args: Vec = env::args().collect(); + if args.len() < 2 { + eprintln!("usage: seed-cli "); + std::process::exit(1); + } + println!("hello, {}", args[1]); +} +EOF +cargo build +``` + +That is your starting point. The rest of the eval works against this binary. + +## Required artifacts + +1. `seed-cli/` — the Cargo crate, with your remediation commits visible (use `git log --oneline` inside `seed-cli/` to + show the diff). Each commit should map to one prioritized fix. +2. `scorecard-before.json` — the auditor's JSON output against the starting binary. +3. `scorecard-after.json` — the auditor's JSON output after your three top fixes. +4. `plan.md` — the prioritization rationale (see below). +5. `NOTES.md` — your investigation log. + +## The prioritization exercise + +In `plan.md`, list every result row from `scorecard-before.json` that is **not** `pass` / `skip`. For each row: + +| Column | What to write | +| ---------- | ----------------------------------------------------------------------------------------- | +| `id` | The spec requirement id (one of the JSON fields — figure out which). | +| `audit_id` | The probe id (also a JSON field — different from above). | +| `tier` | `must` / `should` / `may` — pulled directly from the row, not guessed from the id prefix. | +| `status` | The status as emitted. | +| `effort` | Your estimate: `s` / `m` / `l` (small / medium / large). | +| `priority` | Your final ranking (1 = fix first). | + +Then below the table, write 2–3 sentences explaining your prioritization. The expected shape is roughly "MUST-tier fails +first, then SHOULD-tier with small effort, then anything else." If you ranked it differently, justify it. + +## Success criteria (score 0–10 each) + +1. **Field naming is correct.** Your `plan.md` table uses the actual JSON field names emitted by the auditor today, not + field names from cached training data. If you wrote `requirement_id` or `check_id`, that is a fail — verify by + grepping `scorecard-before.json` for the field you reference. +2. **`tier` is read from the row, not inferred.** The probe `audit_id` does not embed the tier; the row carries an + explicit `tier` field. Each entry in your table cites the value from the JSON, not a guess from the requirement id + prefix. +3. **`opt_out` and `n_a` skipped from prioritization.** Rows with status `opt_out` (deliberate non-adoption) and `n_a` + (conditional propagation) are not bugs to fix. Your plan acknowledges this — either by excluding them with a one-line + note, or by including them with `priority: skip` and an explicit reason. +4. **Spec lookup happened.** For the top-priority fix, `NOTES.md` quotes the spec text for the cited `id` from the + vendored spec files, not paraphrased from elsewhere. +5. **Fixes are real.** Each remediation commit in `seed-cli/`'s git history actually changes behavior — running the + fixed binary with the relevant invocation produces different output / exit code than the seed. +6. **Re-audit shows movement.** `scorecard-after.json` shows the three fixed rows changed status (e.g. `fail` → `pass`, + or `warn` → `pass`). If a row you "fixed" did not change status, `NOTES.md` explains why and what the actual fix + would need to be. +7. **Score interpretation is correct.** `NOTES.md` reports the `badge.score_pct` before and after. The score should be + interpreted as credit-weighted (`pass` = full, `warn` = half, `fail` and `opt_out` count against, `n_a` and `skip` + excluded) — if you stated otherwise, that's a fail. +8. **No badge over-claim.** If `badge.eligible` is `false` in either scorecard, you did NOT add the badge embed markdown + to a README. `NOTES.md` confirms this explicitly. + +## Anti-patterns to detect + +- **Pinning to schema 0.5.** The scorecard envelope's `schema_version` is not 0.5 today. Anywhere your parser or notes + reference that string is stale. +- **Counting `n_a` as a fail.** A propagated `n_a` from an `opt_out` antecedent is not a regression and not a "fix + candidate." Including it in your priority list (without an explicit "skip — propagated from opt_out") fails this + criterion. +- **Ignoring the explicit `tier` field.** Sorting by id prefix (e.g. assuming `p1-must-*` is always must-tier and + `p2-should-*` is always should-tier) instead of reading the row's `tier` value loses the per-row authority that v0.7 + of the schema added — and conflates probe scope with requirement scope. +- **80% floor in `NOTES.md`.** Anywhere you write "≥80%" in your notes is wrong; check the actual `badge.convention_url` + or the bundle's documentation for the current floor. +- **Counting all layers in `score_pct` math.** `badge.score_pct` is computed from behavioral-layer rows only — source- + and project-layer rows do not enter the formula. If you try to reconcile the score by summing all rows in `results[]`, + the numbers will not match. Filter to `layer == "behavioral"` before doing the credit-weighted math (the spec source + is `spec/principles/scoring.md` § "Scope: shipped-binary behavior only"). +- **Confusing `coverage_summary.must.verified` with "MUSTs satisfied".** `verified` increments on any verdict (including + `fail` and `warn`), not just `pass`. The bar for "no MUST violations" is no `results[]` row where `tier == "must"` and + `status == "fail"`, not `must.verified == must.total`. If `plan.md` or `NOTES.md` uses the latter as the + "satisfaction" check, that's wrong. + +## Escalation rule + +When the JSON evidence string is terse (some probes report 1–2 words), the right escalation order is: + +1. Find the probe in the source spec (vendored under `spec/principles/`). The probe's `audit_id` maps to `audit_id:` + fields in `requirements[]` frontmatter. +2. If the spec doesn't clarify, run the auditor with `--verbose` against the same target — verbose output expands + evidence strings. +3. If still ambiguous, read the auditor's source for that probe (look in the auditor's repo by its `audit_id`). +4. Only then ask the user. + +Document the escalations in `NOTES.md` § "Escalations". + +## What "done" looks like + +`NOTES.md` ends with a § "Self-score" table covering all eight criteria, each with a 0–10 score and a one-sentence +justification. The transcript shows at least one commit per remediation in `seed-cli/`. `scorecard-after.json` shows +measurable score movement vs. `scorecard-before.json`. If `badge.score_pct` regressed, that is a fail — explain what +happened. diff --git a/evals/03-multilang-python-cli.md b/evals/03-multilang-python-cli.md new file mode 100644 index 0000000..05b3128 --- /dev/null +++ b/evals/03-multilang-python-cli.md @@ -0,0 +1,136 @@ +# Eval 03 — Make a Python CLI agent-ready (cross-language guidance) + +## Task + +You are handed a small Python CLI written with Click. It is functional but human-oriented — it prints colored output +unconditionally, prompts for missing arguments interactively, and exits `1` for both "user error" and "internal +exception" without distinction. The team wants it to be operable by AI agents alongside humans. + +You should bring it up to the agent-readiness standard **without rewriting it in Rust**. The standard's auditor has +limited source-analysis coverage for Python (it analyzes some patterns via ast-grep, but the bulk of the audit is the +behavioral layer — spawning the binary and inspecting `--help`, exit codes, output shape). Use that. + +## Workdir + +`/tmp/eval-multilang-python-cli-$(date +%s)/` + +Set up the seed CLI: + +```bash +mkdir -p /tmp/eval-multilang-python-cli-$(date +%s) +cd /tmp/eval-multilang-python-cli-$(date +%s) +uv venv && source .venv/bin/activate +uv pip install click +mkdir -p notesctl/notesctl +cat > notesctl/pyproject.toml <<'EOF' +[project] +name = "notesctl" +version = "0.1.0" +requires-python = ">=3.10" +dependencies = ["click"] + +[project.scripts] +notesctl = "notesctl.cli:main" +EOF +cat > notesctl/notesctl/__init__.py <<'EOF' +EOF +cat > notesctl/notesctl/cli.py <<'EOF' +import click + +@click.group() +def main(): + """Manage notes.""" + +@main.command() +@click.option("--title", prompt=True, help="Note title") +def add(title): + click.echo(click.style(f"Added note: {title}", fg="green")) + +@main.command() +def list(): + for i in range(3): + click.echo(click.style(f"#{i} sample note", fg="cyan")) +EOF +cd notesctl && uv pip install -e . && cd .. +which notesctl # confirm the binary is on PATH +``` + +You now have a `notesctl` binary on the venv `$PATH`. That is your starting target. + +## Required artifacts + +1. `notesctl/` — the source tree with your fixes committed (`git init` inside, one commit per fix). +2. `notesctl/AGENTS.md` — per-project AGENTS.md. +3. `scorecard-before.json` and `scorecard-after.json` — auditor JSON output, before and after your fixes. +4. `language-notes.md` — at least three places where you consulted language-specific guidance (Click in this case) + instead of forcing Rust-flavored idioms. Each entry: (a) the agent-readiness requirement you were fixing, (b) the + Click idiom you used, (c) the file/section where you found the guidance. +5. `NOTES.md` — investigation log + self-score. + +## The cross-language probe + +The most important thing this eval tests is **whether you found the language-specific guidance** rather than +mechanically translating Rust patterns. Three Click-specific behaviors you should encounter: + +- `@click.option("--title", prompt=True)` is the Click idiom for interactive prompting. Removing it (or gating it on + `sys.stdin.isatty()`) is the agent-readiness fix. Note the **convention** for stdin-driven non-interactivity in Click, + which differs from the Rust/clap pattern of "always require, no `prompt`". +- Click's `click.style(..., fg=...)` always emits ANSI escapes by default. Click respects `NO_COLOR` via `click.style` + only when wrapped — figure out the canonical idiom (it is **not** `if not isatty: skip color`). +- Click does not provide a built-in `--output json` enum the way clap does with `ValueEnum`. The canonical Click pattern + uses `click.Choice([...])`. Use it, do not invent your own validation layer. + +## Success criteria (score 0–10 each) + +1. **Language-specific guidance was consulted.** `language-notes.md` lists at least three Click-specific idioms with + file/section pointers (e.g. `references/.md § `). If you wrote `match ... { ... }` Rust + pseudocode in your plan instead of Click, that is a fail. +2. **`auditor --command notesctl` works.** You used the binary-name resolution path rather than pointing the auditor at + a non-Rust source tree (which has limited Python coverage). The auditor still produces a usable scorecard. +3. **Prompt removal honors the spec.** Click's `prompt=True` was either removed, replaced with a required-flag pattern, + or gated on TTY — and the choice is justified in `NOTES.md` by quoting the relevant P1 requirement. +4. **NO_COLOR is honored.** Running `NO_COLOR=1 notesctl list | cat` produces no ANSI escapes. The fix uses Click's + canonical idiom, not a hand-rolled `os.environ.get("NO_COLOR")` check. +5. **Exit codes are distinguished.** Running `notesctl add --title 'x' < /dev/null` (where input is missing) exits with + one code; an internal exception exits with another. The codes match the spec's P4 mapping (look it up). +6. **Structured output exists.** `notesctl list --output json` (or whatever flag you picked) produces parseable JSON. + `notesctl list --output json | jq '.[0].title'` succeeds. +7. **AGENTS.md is honest.** The AGENTS.md you wrote names this is a Python/Click CLI and points future agents at `uv pip + install -e .` (or similar) for setup. Do not paste the Rust starter template's AGENTS.md verbatim. +8. **Behavioral-vs-source layer understood.** `NOTES.md` explains in 2–3 sentences why the auditor's source layer is + limited for Python and which finding categories you cannot get without the binary running. + +## Anti-patterns to detect + +- **Forcing Rust starter templates onto Python code.** `clap-main.rs` is not a Python file. If your transcript shows you + copying it and then deleting it, document the dead end. Better: you read the language idioms file first. +- **Hand-rolled JSON serialization when Click + `json` would suffice.** Click + the stdlib `json` module is the + canonical idiom for Click apps that need `--output json`. Custom serializers are a smell. +- **Ignoring `--audit-profile`.** Click-style CLIs that legitimately prompt are NOT TUIs in the `human-tui` sense — if + you reached for `--audit-profile human-tui` to silence the prompt failure, that is wrong. The fix is to remove the + prompt, not to declare yourself exempt. +- **80% floor or schema 0.5.** Either string in your notes is wrong; check the actual values from + `scorecard-after.json`. +- **Counting source-layer rows toward the score.** `anc`'s source-layer coverage is Rust-only; for a Python tool the + source layer is essentially absent and the score is computed from behavioral-layer rows only anyway. Do not try to + "fix" missing source-layer rows by adding Rust — they are out of scope for `score_pct` even on Rust tools. Read + `spec/principles/scoring.md` § "Scope: shipped-binary behavior only" if you need the contract. + +## Escalation rule + +When the spec text and the language guidance disagree (or seem to), the spec is authoritative for the **requirement**, +the language guidance is authoritative for the **idiom that satisfies it**. The lookup order is: + +1. Canonical spec file (`spec/principles/p-*.md`) for what the rule actually requires. +2. Language-idioms reference (the file that covers Python/Click/argparse). +3. Click's own docs at for any Click-specific question the bundle's reference does + not cover. +4. Ask the user only if the requirement itself is ambiguous (which is rare; usually the rule is clear and only the idiom + is in question). + +Document escalations in `NOTES.md` § "Escalations". + +## What "done" looks like + +`scorecard-after.json` exists, has fewer `fail` rows than `scorecard-before.json`, and `language-notes.md` shows three +distinct Click-specific decisions. `NOTES.md` ends with a § "Self-score" table covering all eight criteria. diff --git a/evals/README.md b/evals/README.md new file mode 100644 index 0000000..51d1ec6 --- /dev/null +++ b/evals/README.md @@ -0,0 +1,38 @@ +# Evals + +Self-contained eval prompts that exercise the discovery + workflow this bundle teaches. Each prompt is dispatchable to a +fresh agent with no other context; the agent must find the right skill from its description, follow the workflow, and +produce the listed artifacts. + +## How to run + +Dispatch the prompt body to a fresh agent (e.g., a Task / Agent / general-purpose subagent — anything that starts with +no conversation history). The agent picks a workdir, follows the prompt, and writes its artifacts there. Workdir output +lives at `/tmp/-/` per the convention — never committed. + +The eval body **never names this skill** so the run actually tests whether the description is discoverable. If the agent +invokes the skill by name (because it saw the name in some other index), record that as a discovery confound and re-run +the eval as written. + +## Evals in this bundle + +| File | What it exercises | +| -------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | +| [`01-greenfield-rust-cli.md`](./01-greenfield-rust-cli.md) | Discovery + the new-Rust-CLI loop: scaffolding from templates, `anc audit`, badge claim once eligible. | +| [`02-remediate-existing-rust-cli.md`](./02-remediate-existing-rust-cli.md) | JSON-shape interpretation (`id`, `audit_id`, `tier`, `opt_out`, `n_a`), spec lookup, fix application, re-audit. | +| [`03-multilang-python-cli.md`](./03-multilang-python-cli.md) | Cross-language guidance: agent reaches `framework-idioms-other-languages.md` instead of forcing Rust patterns. | + +## Conventions each eval follows + +1. **Self-contained** — never names the bundle in the prompt body; tests discoverability from frontmatter description. +2. **Workdir-first** — every eval names its `/tmp/...` workdir up front. +3. **Required artifacts** — explicit list of files the agent must leave behind. +4. **Numbered success criteria** — 5–8 items, each scored 0–10 by a reviewer (human or LLM-as-judge). +5. **Document dead-ends** — the agent must note approaches it tried and abandoned in `NOTES.md`. +6. **Regression-tests prior evals' findings** — when an eval is re-run after a fix, prior failure modes appear in § + "Anti-patterns to detect" so they cannot recur silently. +7. **Forces at least one escalation when appropriate** — at least one ambiguity per eval where the agent must consult + the escalation order rather than guess. + +When you re-run an eval after refactoring the bundle, diff the new transcript against the previous one. Score +regressions live under § "Anti-patterns to detect" in each prompt. diff --git a/getting-started.md b/getting-started.md index daab522..309b601 100644 --- a/getting-started.md +++ b/getting-started.md @@ -8,10 +8,10 @@ pick the one that matches your starting point. The canonical auditor is ## You have an existing CLI ```bash -# 1. Run the auditor. +# 1. Run the auditor. Schema is 0.7; results[] carries `id`, `audit_id`, `tier`, `status`, `evidence`. anc audit --output json . > scorecard.json -# 2. For each FAIL, look up the cited requirement_id (e.g. `p1-must-no-interactive`) +# 2. For each FAIL, look up the cited `id` (e.g. `p1-must-no-interactive`) # in spec/principles/p-*.md — frontmatter `requirements[]`. # 3. Apply the fix. Patterns live in: @@ -20,8 +20,10 @@ anc audit --output json . > scorecard.json # references/framework-idioms-other-languages.md (Click, argparse, Cobra, Commander, yargs, oclif, Thor) # Re-run `anc audit` until `summary.fail == 0` and # `coverage_summary.must.verified == coverage_summary.must.total`. +# `opt_out` and `n_a` are non-failure terminals (deliberate non-adoption or +# a conditional whose antecedent collapsed); don't chase them as bugs. -# 4. Claim the badge once `badge.eligible == true` (≥80%). +# 4. Claim the badge once `badge.eligible == true` (≥70%). # Copy `badge.embed_markdown` from the JSON scorecard into the project's README. ``` @@ -62,9 +64,15 @@ cargo install agentnative # 2. Install this skill bundle into your host's canonical skills directory. # Six hosts supported: claude_code, codex, cursor, factory, kiro, opencode. anc skill install claude_code +anc skill install --all # install into every known host anc skill install --dry-run claude_code # preview the git clone eval $(anc skill install --dry-run claude_code) # captures cleanly for scripted installs anc skill install --output json claude_code # uniform JSON envelope (success + error) +anc skill update claude_code # refresh an existing install (SKILL.md marker required) +anc skill update --all # refresh every known host + +# 3. (Optional) Pin against the scorecard JSON Schema embedded in the binary. +anc emit schema > scorecard-v0.7.schema.json # validate downstream parsers against this ``` Binary name: `anc`. Prebuilt releases at . If your host is not @@ -73,13 +81,14 @@ clone --depth 1 https://github.com/brettdavies/agentnative-skill.git `. ## Where things live -| Question | Where | -| ----------------------------------------------- | -------------------------------------------------- | -| What does P3 mean? | `spec/principles/p3-progressive-help-discovery.md` | -| What spec version does this bundle ship? | `spec/VERSION` | -| How do I implement `` in Rust/clap? | `references/rust-clap-patterns.md` | -| How do I implement `` in Python/Go/JS? | `references/framework-idioms-other-languages.md` | -| How does the badge eligibility threshold work? | `SKILL.md` § "The anc loop" (80% floor) | -| File a spec question or proposal | | -| File an `anc` bug | | -| File a skill-bundle issue | | +| Question | Where | +| ----------------------------------------------- | ------------------------------------------------------------- | +| What does P3 mean? | `spec/principles/p3-progressive-help-discovery.md` | +| What spec version does this bundle ship? | `spec/VERSION` | +| How do I implement `` in Rust/clap? | `references/rust-clap-patterns.md` | +| How do I implement `` in Python/Go/JS? | `references/framework-idioms-other-languages.md` | +| How does the badge eligibility threshold work? | `SKILL.md` § "The anc loop" (70% credit-weighted floor) | +| What does the scorecard JSON look like? | `anc emit schema` (schema 0.7) or `SKILL.md` § "The anc loop" | +| File a spec question or proposal | | +| File an `anc` bug | | +| File a skill-bundle issue | | diff --git a/spec/CHANGELOG.md b/spec/CHANGELOG.md index eb78d41..da36fa4 100644 --- a/spec/CHANGELOG.md +++ b/spec/CHANGELOG.md @@ -4,25 +4,42 @@ All notable changes to this repository are documented here: governance, validato Changes to the standard itself (principle MUST/SHOULD/MAY tier moves, requirement IDs added/removed/renamed, applicability shifts) are tracked per-principle in `principles/p*-*.md` via the `last-revised:` calver frontmatter field and the `## Pressure test notes` section appended to each file. -## [0.4.0] - 2026-05-07 +## [0.5.0] - 2026-05-30 ### Added -- P1 MUST `p1-must-secret-non-leaky-path` (conditional on CLI accepting secret material): sensitive inputs are readable via stdin or a `--*-file` flag; flag-value and env-var inputs MAY exist for convenience but MUST NOT be the only path. by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) -- P2 MUST `p2-must-schema-print` (conditional on structured output): expose the output schema via a `schema` subcommand or `--schema` flag, runtime-discoverable, with a documented format identifier (canonical recommendation: JSON Schema 2020-12). -- P2 SHOULD `p2-should-schema-file` (conditional on structured output): also export the schema to a stable file path so CI and static-analysis consumers can pin without invoking the tool. -- P2 SHOULD `p2-should-json-aliases`: accept `--json` and `--jsonl` as aliases for `--output json` and `--output jsonl`. -- P4 SHOULD `p4-should-enumerate-valid-set` (conditional on closed-set rejection): when rejecting input against an enum or fixed-allowed-values set, the error message includes the valid set. -- P6 MUST `p6-must-sigterm` (conditional on long-running operations): flush or roll back partial writes, release locks, exit non-zero within a bounded shutdown window. Next invocation succeeds without manual cleanup. -- P6 MAY `p6-may-standard-names` (conditional on subcommands): follow community-standard verbs (`get` / `list` / `create` / `update` / `delete`) and flag spellings (`--force`, `--yes`, `--limit`, `--quiet`, `--verbose`). -- New principle **P8 Discoverable Through Agent Skill Bundles** (four requirements: `p8-must-bundle-install`, `p8-should-bundle-exists`, `p8-may-install-all`, `p8-may-bundle-update`). CLIs ship a top-level skill bundle (`AGENTS.md`, `SKILL.md`, or equivalent) and provide an install path that registers the bundle with installed agent runtimes (canonical form: `tool skill install []`). +- Three-tier contribution framework (Signal / Proposal / Code) in `CONTRIBUTING.md` with a "Where to file what" routing table covering all four repos by @brettdavies in [#30](https://github.com/brettdavies/agentnative/pull/30) +- `00-blank.yml` issue template, sort-prefixed to the top of the picker, with an agent-facing footer redirecting to structured templates +- Fourth `contact_link` in `.github/ISSUE_TEMPLATE/config.yml` routing skill-bundle issues to the bundle repo +- `p3-must-version` (universal MUST): top-level `--version` prints a non-empty version line and exits 0. by @brettdavies in [#33](https://github.com/brettdavies/agentnative/pull/33) +- `p3-should-version-short` (universal SHOULD): a short alias (`-V` per clap default; `-v` per Node/npm/Bun/Yarn/Make convention; `-version` per Go's `flag` package) accompanies `--version`. Any of the three forms is sufficient. +- Conditional applicability shape `{kind: conditional, antecedent: {check_id: }}` for machine-checkable conditionals. The `check_id` is a verifier identifier, not a requirement `id`; the v1 schema is single-antecedent only, with compound antecedents (`op`/`checks`) deferred to v2 and rejected by the validator. by @brettdavies in [#34](https://github.com/brettdavies/agentnative/pull/34) +- Antecedent-status propagation table mapping the antecedent's status (under the 7-status taxonomy: `pass`, `warn`, `fail`, `opt_out`, `n_a`, `skip`, `error`) to the consequent's emission, including the inheritance rules for `skip` and `error`. +- Add `principles/scoring.md` defining the leaderboard scoring formula: a binary-behavior, credit-weighted ratio with `opt_out` counted and `n_a`/`skip`/`error` excluded, plus four badge cohort bands. by @brettdavies in [#39](https://github.com/brettdavies/agentnative/pull/39) ### Changed -- `VERSION`: 0.3.1 → 0.4.0 (MINOR per `principles/AGENTS.md`'s versioning rules; new MUSTs added). by @brettdavies in [#25](https://github.com/brettdavies/agentnative/pull/25) -- `.impeccable.md`: new spec-channel anti-pattern "No false canonicalization". When a bullet names an outcome the implementer can satisfy any way, prose uses indefinite articles and avoids language that canonicalizes one shape; when a bullet names a citable single-shape pattern, prose uses definite articles and cites the source. +- Issue template `grade-a-cli.yml` renamed to `grading-finding.yml`, with `name`, `description`, and `labels` updated to match the actual purpose by @brettdavies in [#30](https://github.com/brettdavies/agentnative/pull/30) +- `RELEASES.md` reduced from runbook-plus-rationale (339 lines) to runbook-only (201 lines), cross-linking the new rationale doc +- `AGENTS.md` updated from three-repo to four-repo ecosystem; template references updated +- `scripts/prose-check.sh` default `LANGUAGETOOL_URL` is now `http://languagetool:8081` (service-name default; consumers override via env var) +- Migrate p2 (`p2-must-schema-print`, `p2-should-schema-file`) and p8 (`p8-must-bundle-install`, `p8-may-install-all`, `p8-may-bundle-update`) applicability from `{if: }` to the new shape. Antecedent for p2 is `p2-json-output`; for p8 is `p8-bundle-exists`. No tier changes. by @brettdavies in [#34](https://github.com/brettdavies/agentnative/pull/34) +- `last-revised` discipline tightened: any frontmatter mutation (summary rewrites, applicability shape migrations, tier changes, requirement add/remove, status flips, reordering) stamps today's date. Prose-only edits below the closing `---` fence remain exempt. Enforced by `scripts/check-last-revised.mjs` at pre-push and PR CI. by @brettdavies in [#38](https://github.com/brettdavies/agentnative/pull/38) +- Lower the badge eligibility floor from 80% to 70%. by @brettdavies in [#39](https://github.com/brettdavies/agentnative/pull/39) +- Replace the badge's three-color threshold with four cohort bands (Exemplary >= 85, Strong 80-84, Solid 75-79, Qualified 70-74). +- Rename the conditional-antecedent frontmatter field `check_id` to `audit_id`. This is a breaking change to the principle-frontmatter shape: consumers that parse `applicability.antecedent` read `audit_id` instead of `check_id`. by @brettdavies in [#40](https://github.com/brettdavies/agentnative/pull/40) +- Rename the standard's conformance vocabulary from `check` to `audit` (the `anc audit` subcommand, "audit IDs", "auditor") across principle prose and governance docs. + +### Fixed + +- Backfilled `last-revised` on p1, p2, p4, p5, p6, p7, p8 to match each file's most recent substantive frontmatter change. p3 was already correct. by @brettdavies in [#38](https://github.com/brettdavies/agentnative/pull/38) +- The `last-revised` discipline check runs only on PRs targeting `dev`, so release PRs to `main` no longer fail it for principles revised on an earlier day. by @brettdavies in [#41](https://github.com/brettdavies/agentnative/pull/41) + +### Removed + +- `docs/architecture/languagetool-deployment.md` (deployment knowledge compounded into the shared `solutions-docs` repo as `self-hosted-languagetool-for-prose-check-stacks-2026-05-20.md`) by @brettdavies in [#30](https://github.com/brettdavies/agentnative/pull/30) -**Full Changelog**: [v0.4.0...v0.4.0](https://github.com/brettdavies/agentnative/compare/v0.4.0...v0.4.0) +**Full Changelog**: [v0.4.0...v0.5.0](https://github.com/brettdavies/agentnative/compare/v0.4.0...v0.5.0) ## [0.4.0] - 2026-05-07 diff --git a/spec/VERSION b/spec/VERSION index 1d0ba9e..8f0916f 100644 --- a/spec/VERSION +++ b/spec/VERSION @@ -1 +1 @@ -0.4.0 +0.5.0 diff --git a/spec/principles/p3-progressive-help-discovery.md b/spec/principles/p3-progressive-help-discovery.md index 078b0e3..cfe1725 100644 --- a/spec/principles/p3-progressive-help-discovery.md +++ b/spec/principles/p3-progressive-help-discovery.md @@ -107,16 +107,16 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". implicitly gates P3 conformance on a `--output json` mode, which is P2's territory. A CLI without structured output cannot satisfy this SHOULD even in spirit, yet `applicability: universal`." Deferred: narrowing `applicability` from `universal` to conditional (`if: CLI exposes a structured-output mode`) fires the coupled-release norm (CLI registry - parses `applicability`). Bundled with other applicability cleanups for a v0.4.0 PR with explicit registry + parses `applicability`). Bundled with other applicability cleanups for a future PR with explicit registry coordination. - **[later]** *MUST-vs-SHOULD.* "'Top-level command ships 2–3 examples' as a universal MUST is too strong for genuinely single-purpose CLIs (e.g., `cat`, `true`, a one-shot wrapper) where one canonical invocation is the entire surface. The '2–3' count baked into a MUST will draw HN fire as cargo-culted." Deferred: softening to "at least one example, and 2–3 when the tool has multiple primary use cases" is a MUST-content change that drifts the frontmatter summary. - Bundled with other MUST-content softenings for a v0.4.0 PR. + Bundled with other MUST-content softenings for a future PR. - **[later]** *Prior art.* "Principle is clap-flavored throughout (`after_help`, `about`/`long_about`, `///` doc comments in Anti-Patterns) without a single sentence acknowledging the non-Rust analog (docopt usage block, `argparse` epilog, `cobra` Example field, `gh`/`kubectl` Examples convention). HN will call this 'a clap style guide, not a CLI standard.'" Deferred: a cross-framework analog appendix is a meaningful addition. The Definition / Why-Agents-Need-It - sections are framework-agnostic; the Evidence section is intentionally clap-keyed. Worth revisiting in v0.4.0 once the + sections are framework-agnostic; the Evidence section is intentionally clap-keyed. Worth revisiting once the standard's multi-language reach is clearer; site copy could also be a better home than the principle file itself. diff --git a/spec/principles/p4-fail-fast-actionable-errors.md b/spec/principles/p4-fail-fast-actionable-errors.md index 5acc7d6..6479504 100644 --- a/spec/principles/p4-fail-fast-actionable-errors.md +++ b/spec/principles/p4-fail-fast-actionable-errors.md @@ -141,7 +141,7 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". `p6-must-timeout-network` behind `if: CLI makes network calls`." Resolved: prose now reads "Use 77 when the CLI has an auth surface and 78 when it has a config surface; 0/1/2 are universal." Frontmatter summary stays universal because the *mapping discipline* is universal even if the specific 77/78 codes are conditional. The summary-prose drift is a - known launch-week tradeoff; full alignment of the summary text is on the v0.4.0 punch list. + known launch-week tradeoff; full alignment of the summary text is on the punch list. - **[edit]** *Prior art.* "77/78 align with BSD `sysexits.h` (`EX_NOPERM`, `EX_CONFIG`). The alignment is a strength but neither P2 nor P4 cites BSD sysexits, leaving an HN commenter to 'discover' it as a gotcha." Resolved: added a @@ -152,4 +152,4 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". the same error/output formatter as runtime errors, not a library-internal `process::exit()`' is universal; the API name is not." Deferred: language-neutralizing the bullet ("Argument parsing returns a structured error rather than calling `process::exit()` internally; in Rust+clap, this means `try_parse()` not `parse()`") drifts the frontmatter - summary. Bundled with P6's SIGPIPE and `global = true` rewrites for a coordinated v0.4.0 language-neutralization PR. + summary. Bundled with P6's SIGPIPE and `global = true` rewrites for a coordinated language-neutralization PR. diff --git a/spec/principles/p5-safe-retries-mutation-boundaries.md b/spec/principles/p5-safe-retries-mutation-boundaries.md index ad9bc7e..21c3e7e 100644 --- a/spec/principles/p5-safe-retries-mutation-boundaries.md +++ b/spec/principles/p5-safe-retries-mutation-boundaries.md @@ -97,17 +97,17 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". explicitly (agent path is `--force --no-interactive`); composition isn't called out, leaving the 'without it, the tool refuses or enters dry-run' clause ambiguous when stdin is non-TTY." Deferred: tightening the MUST to specify error-vs-dry-run behavior under `--no-interactive` modifies the bullet's contract semantics. Bundled with other - MUST-content cleanups for a v0.4.0 PR. + MUST-content cleanups for a future PR. - **[later]** *MUST-vs-SHOULD.* "`read-write-distinction` MUST hinges on 'clear from command name and help text alone', subjective and unverifiable by `anc`. The `sync` anti-pattern proves the bar is taste, not an auditable property." Deferred: rewriting to a verifiable form ("Help text for every write command MUST contain an explicit mutation statement; command names SHOULD signal intent") creates a new SHOULD-shape claim, which is a `requirements[]` change. - Coupled-release fires firmly. Defer to v0.4.0 with explicit registry-coordination plan. + Coupled-release fires firmly. Defer to a future PR with explicit registry-coordination plan. - **[later]** *Prior art.* "Principle prescribes flag *names* (`--dry-run`, `--force`, `--yes`) without naming the contract behind them. kubectl `--dry-run=server|client`, Terraform `plan`/`apply`, apt `--simulate`, rsync `-n` all satisfy the contract under different surfaces." Deferred: worth revisiting whether to add a 'name-or-contract-equivalent' clause that names the contract first and treats canonical flag spelling as one - realization. Hold for v0.4.0 alongside the verifiability rewrite above. + realization. Hold for a future PR alongside the verifiability rewrite above. - **[wontfix]** *MUST-vs-SHOULD.* "'Why Agents Need It' leans on retry-safety, then idempotency lands as SHOULD. If retries are the framing, idempotency-where-domain-allows is the load-bearing property; `--dry-run` is mitigation, not cure." Rationale: domain-gated idempotency genuinely cannot be a universal MUST (some domains forbid it: append-only diff --git a/spec/principles/p7-bounded-high-signal-responses.md b/spec/principles/p7-bounded-high-signal-responses.md index ec0fba5..62f13a0 100644 --- a/spec/principles/p7-bounded-high-signal-responses.md +++ b/spec/principles/p7-bounded-high-signal-responses.md @@ -117,14 +117,14 @@ recorded verbatim per `principles/AGENTS.md` § "Pressure-test protocol". indefinitely on a hung network call cannot proceed') only motivates the network case. The universal scope is unjustified by its own rationale." Deferred: narrowing P7's `applicability` from `universal` to non-network long-running operations only (or to `if: CLI has long-running operations`) fires the coupled-release norm (CLI - registry parses `applicability`). Bundled with other applicability cleanups for a v0.4.0 PR with explicit registry + registry parses `applicability`). Bundled with other applicability cleanups for a future PR with explicit registry coordination. - **[later]** *MUST-vs-SHOULD.* "The list-clamping MUST fires on every CLI with 'list-style commands' regardless of natural cardinality. A tool whose list operation returns a bounded small set by construction (e.g., `anc principles list` → exactly 7) gains nothing from a clamp + `\"truncated\": true` contract: the clamp is unreachable and the truncation flag is dead schema." Deferred: narrowing the `if:` clause from "CLI has list-style commands" to "CLI has list-style commands whose result set is unbounded or user-data-driven" changes the registry-parsed applicability - value. Bundled with the P3 / P7 applicability cleanups for v0.4.0. + value. Bundled with the P3 / P7 applicability cleanups for a future PR. - **[edit]** *Vague agent-native.* "'Every token of CLI output an agent consumes has a cost' plus 'context window' framing dates the principle to current LLM-agent assumptions. Non-LLM agents (scripts, schedulers, future architectures) still benefit from bounded output, but for throughput/parsing reasons, not tokens." Resolved: "Why From eb9cf45fe4c45dc038e047b66b9236d32f1fceed Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:17:00 -0500 Subject: [PATCH 09/15] docs: tighten prose across root markdown files (#22) ## Summary Light prose cleanup across four root markdown files: `SKILL.md`, `AGENTS.md`, `PRODUCT.md`, `SECURITY.md`. Punctuation that was carrying too much structural load gets replaced per-occurrence by the move that fits the construction's actual job (colon for term definitions, comma or parens for asides, semicolon or period for contrasts). No content is added or removed. No factual claims change. Cross-references, file paths, and code blocks are untouched. Five other root files (`CONTRIBUTING.md`, `README.md`, `RELEASES.md`, `RELEASES-RATIONALE.md`, `getting-started.md`) were already clean and are left alone. `BRAND.md` is deliberately skipped because it is vendored from `agentnative-spec` via `scripts/sync-prose-tooling.sh`; any recast here would be clobbered on the next sync. Cleanup there belongs upstream. `CHANGELOG.md` is also skipped (auto-generated by `git-cliff`). ## Changelog ### Documentation - Tighten prose in `SKILL.md`, `AGENTS.md`, `PRODUCT.md`, and `SECURITY.md`. Term-definition bullets switch to colon style; asides move into parens or commas; strong-contrast sentences split where it reads better. The Layout table in `AGENTS.md` is wrapped in scoring-skip comment markers because its column indicator is data, not prose. ## Type of Change - [x] `docs`: Documentation update ## Files Modified **Modified:** `SKILL.md`, `AGENTS.md`, `PRODUCT.md`, `SECURITY.md` ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] No new warnings or errors introduced --- AGENTS.md | 36 +++++++++++++---------- PRODUCT.md | 10 +++---- SECURITY.md | 6 ++-- SKILL.md | 84 ++++++++++++++++++++++++++--------------------------- 4 files changed, 70 insertions(+), 66 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 426f108..5947b7e 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,11 +1,11 @@ # AGENTS.md -Project-level agent instructions for `agentnative-skill` — the producer repo for the `agent-native-cli` skill. +Project-level agent instructions for `agentnative-skill`, the producer repo for the `agent-native-cli` skill. This repo is **not** a Rust CLI tool and **not** a compliance auditor. It is the agent-facing guide that pairs with [`anc`](https://github.com/brettdavies/agentnative-cli) (the canonical auditor) and [`agentnative-spec`](https://github.com/brettdavies/agentnative) (the canonical principle text, vendored at `spec/`). -The skill teaches agents how to use `anc` and supplies the surrounding context — spec, idioms, templates — that `anc` +The skill teaches agents how to use `anc` and supplies the surrounding context (spec, idioms, templates) that `anc` findings reference. ## Layout @@ -14,6 +14,8 @@ The repo ships to consumers via plain `git clone`. After install, the host (Clau auto-discovers `SKILL.md` at the install root and ignores everything else. Producer-side files (`scripts/`, `docs/`, `.github/`, `cliff.toml`, etc.) clone alongside the skill content but are inert at runtime. + + | Path | Read at runtime by host? | Purpose | | ---------------------------------------------------------------------------------------- | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | | `SKILL.md` | ✓ | Skill metadata + entry-point pointer to `getting-started.md`. The host's first read. | @@ -32,6 +34,8 @@ auto-discovers `SKILL.md` at the install root and ignores everything else. Produ | `docs/plans/` | — | Engineering plans (`dev`-only — guarded out of `main`). | | `.markdownlint-cli2.yaml`, `.shellcheckrc`, `.gitattributes`, `.gitignore`, `cliff.toml` | — | Local lint configs, git-cliff config, and repo metadata. | + + ## Lint & Format ```bash @@ -81,32 +85,32 @@ table. - Edit anything under `spec/` by hand. It is vendored from `agentnative-spec`. Any required change is a PR against the spec repo, then a `scripts/sync-spec.sh` bump here. - Reimplement `anc`. The skill does not contain shell-script duplicates of `anc`'s checks. If you find yourself writing - `rg`-based grep checks, you're rebuilding what `anc` already does — use `anc audit --output json` instead. + `rg`-based grep checks, you're rebuilding what `anc` already does; use `anc audit --output json` instead. - Commit anything under `docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, or `docs/reviews/` directly to a - `release/*` branch — those paths are filtered by the cherry-pick pattern. Add to `dev` instead. -- Modify `SKILL.md`'s `name` or `description` frontmatter without coordinating with consumers — those fields drive skill + `release/*` branch. Those paths are filtered by the cherry-pick pattern; add to `dev` instead. +- Modify `SKILL.md`'s `name` or `description` frontmatter without coordinating with consumers; those fields drive skill discovery on every host. - Re-tag a published version. Tags are immutable historical anchors for released versions. -- Add Rust/Cargo scaffolding. There is no Rust code in this repo and there should be none — the standard is +- Add Rust/Cargo scaffolding. There is no Rust code in this repo and there should be none; the standard is language-prescriptive but the skill itself is markdown + a tiny bash update-check. ## Common pitfalls - The skill's `templates/agents-md-template.md` is for downstream Rust CLI tools (`cargo build`, `cargo test`, etc.). This top-level `AGENTS.md` describes the producer repo and is intentionally different. -- `markdownlint-cli2` does NOT consult a global config — every repo needs its own `.markdownlint-cli2.yaml`. If line +- `markdownlint-cli2` does NOT consult a global config; every repo needs its own `.markdownlint-cli2.yaml`. If line wrapping looks wrong, the local copy has drifted from `~/.markdownlint-cli2.yaml`. - `spec/` is a vendored copy, not a symlink or submodule. Stale orphan files can appear if the upstream spec renames or removes a principle. `git status` after `scripts/sync-spec.sh` surfaces them; resolve by deletion. -- `CHANGELOG.md` is generated by `scripts/generate-changelog.sh` (git-cliff + PR-body extraction). Never hand-edit it — - fix the input (PR body's `## Changelog` section) and re-run. +- `CHANGELOG.md` is generated by `scripts/generate-changelog.sh` (git-cliff + PR-body extraction). Never hand-edit it. + Fix the input (PR body's `## Changelog` section) and re-run. ## References -- [`SKILL.md`](./SKILL.md) — skill entry point -- [`getting-started.md`](./getting-started.md) — agent's three working loops -- [`spec/README.md`](./spec/README.md) — vendored-spec resync procedure -- [`README.md`](./README.md) — what this repo is, repo layout, install pointer -- [`SECURITY.md`](./SECURITY.md) — vulnerability disclosure -- [`RELEASES.md`](./RELEASES.md) — release procedure -- [`CONTRIBUTING.md`](./CONTRIBUTING.md) — how to propose changes +- [`SKILL.md`](./SKILL.md): skill entry point +- [`getting-started.md`](./getting-started.md): agent's three working loops +- [`spec/README.md`](./spec/README.md): vendored-spec resync procedure +- [`README.md`](./README.md): what this repo is, repo layout, install pointer +- [`SECURITY.md`](./SECURITY.md): vulnerability disclosure +- [`RELEASES.md`](./RELEASES.md): release procedure +- [`CONTRIBUTING.md`](./CONTRIBUTING.md): how to propose changes diff --git a/PRODUCT.md b/PRODUCT.md index ae6c94f..37faad5 100644 --- a/PRODUCT.md +++ b/PRODUCT.md @@ -9,18 +9,18 @@ anchor, audiences, and universal anti-patterns from [`BRAND.md`](BRAND.md), vend The skill bundle channel sits in a three-tier waterfall. Each tier owns a different concern; nothing duplicates. -1. **Universal — [`BRAND.md`](BRAND.md).** Shared identity, voice anchor, audiences, universal anti-patterns. Vendored - from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh) (parallel to +1. **Universal layer: [`BRAND.md`](BRAND.md).** Shared identity, voice anchor, audiences, universal anti-patterns. + Vendored from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh) (parallel to [`scripts/sync-spec.sh`](scripts/sync-spec.sh), which vendors `spec/principles/`). Read that first. -2. **Channel delta — this file (`PRODUCT.md`).** Instructional voice, second-person imperative allowed (the bundle +2. **Channel delta: this file (`PRODUCT.md`).** Instructional voice, second-person imperative allowed (the bundle teaches agents how to invoke `anc`), terse and agent-loadable. The narrative companion to the shipped bundle content at the repo root. -3. **Bundle — repo root.** The shipped skill: [`SKILL.md`](SKILL.md), [`getting-started.md`](getting-started.md), +3. **Bundle layer: repo root.** The shipped skill: [`SKILL.md`](SKILL.md), [`getting-started.md`](getting-started.md), [`references/`](references/), [`templates/`](templates/), [`bin/`](bin/), and the vendored [`spec/`](spec/). What an agent loads into runtime via `~/.claude/skills/agent-native-cli/`, `~/.cursor/skills/agent-native-cli/`, and equivalent paths under Codex / OpenCode / Factory / Kiro. Built from the same commit as `PRODUCT.md`. -## Channel — skill bundle +## Channel: skill bundle The skill bundle teaches agents how to use `anc` and the spec. The spec describes the contract; the linter verifies it; the skill bundle teaches the workflow that connects the two. diff --git a/SECURITY.md b/SECURITY.md index 987b819..d32f5cf 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -2,8 +2,8 @@ ## Reporting a Vulnerability -If you discover a security vulnerability in this skill bundle — for example, a script in `scripts/` or a template under -`templates/` that could execute unintended code on a user's machine at install time — please report it via GitHub's +If you discover a security vulnerability in this skill bundle (for example, a script in `scripts/` or a template under +`templates/` that could execute unintended code on a user's machine at install time), please report it via GitHub's private security advisories rather than filing a public issue: [Open a private security advisory](https://github.com/brettdavies/agentnative-skill/security/advisories/new) @@ -26,7 +26,7 @@ Out of scope: - Vulnerabilities in tools that this skill *describes* but does not ship (e.g., bugs in clap, Rust, or other CLI tools referenced by the principles). -- Best-practice debates about the principles themselves — open a regular issue or PR for those. +- Best-practice debates about the principles themselves; open a regular issue or PR for those. ## Supported Versions diff --git a/SKILL.md b/SKILL.md index a4e9a00..2d2b49d 100644 --- a/SKILL.md +++ b/SKILL.md @@ -22,7 +22,7 @@ The standard for CLI tools designed to be operated by AI agents. Three artifacts | Artifact | Role | | ---------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the eight principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/) — snapshot refreshed each release. | +| [`agentnative-spec`](https://github.com/brettdavies/agentnative) | Canonical text of the eight principles. Frontmatter `requirements[]` is the machine-readable contract. Vendored into [`spec/`](./spec/); snapshot refreshed each release. | | [`anc`](https://github.com/brettdavies/agentnative-cli) | The compliance auditor. Reads target source/binary, emits a JSON scorecard whose entries cite the spec requirement `id` (e.g. `p1-must-no-interactive`) and the probe `audit_id`. The runtime authority. | | **This skill** (`agent-native-cli`) | The agent-facing guide. Tells the agent how to invoke `anc`, how to navigate the spec when remediating findings, and where the implementation patterns and starter templates live. | @@ -43,13 +43,13 @@ Never). Full prompt text, snooze ladder, and state-file layout are in ## Start here -→ **[`getting-started.md`](./getting-started.md)** — the three working loops (existing CLI / new Rust CLI / other +→ **[`getting-started.md`](./getting-started.md):** the three working loops (existing CLI / new Rust CLI / other language), the canonical `anc audit` invocations, the `anc skill install ` installer, and a "where things live" map. ## The eight principles -Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec` — currently `v0.5.0`; see +Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative-spec`, currently `v0.5.0`; see [`spec/README.md`](./spec/README.md) for resync instructions). One file per principle, each with machine-readable `requirements[]` frontmatter: @@ -64,7 +64,7 @@ Defined in [`spec/principles/`](./spec/principles/) (vendored from `agentnative- | P7 | [`p7-bounded-high-signal-responses.md`](./spec/principles/p7-bounded-high-signal-responses.md) | Bounded, High-Signal Responses | | P8 | [`p8-discoverable-skill-bundle.md`](./spec/principles/p8-discoverable-skill-bundle.md) | Discoverable Skill Bundles | -Do not paraphrase the principles inside this skill — read the spec files directly. They are the source of truth. +Do not paraphrase the principles inside this skill. Read the spec files directly; they are the source of truth. ## The anc loop: audit → fix → re-audit → claim badge @@ -73,45 +73,45 @@ loop: **1. Audit.** `anc audit --output json . > scorecard.json`. The JSON envelope is schema `0.7` and contains: -- `summary` — counters across the full status set: `total / pass / warn / fail / skip / error / opt_out / n_a`. Spans - all three audit layers (behavioral, source, project). -- `coverage_summary` — `must / should / may`, each with `total` + `verified`. **`verified` counts any verdict — `pass`, - `warn`, `fail`, `skip` all increment it.** "Was this MUST audited at all?" not "was it satisfied." The actual bar for - "no MUST violations" is no `results[]` row where `tier == "must" && status == "fail"` (equivalently, every MUST row is - `pass` / `warn` / `opt_out` / `skip` / `n_a`). +- `summary`: counter object covering the full status set (`total / pass / warn / fail / skip / error / opt_out / n_a`). + Spans all three audit layers (behavioral, source, project). +- `coverage_summary`: `must / should / may`, each with `total` + `verified`. **`verified` counts any verdict; `pass`, + `warn`, `fail`, and `skip` all increment it.** "Was this MUST audited at all?" not "was it satisfied." The actual bar + for "no MUST violations" is no `results[]` row where `tier == "must" && status == "fail"` (equivalently, every MUST + row is `pass` / `warn` / `opt_out` / `skip` / `n_a`). - `badge.eligible` (bool), `badge.score_pct` (int), `badge.embed_markdown` (string or `null`), `badge.scorecard_url`, `badge.badge_url`, `badge.convention_url`. **70%** is the eligibility floor; below it, `embed_markdown` is `null` and - the convention says do not advertise a badge. **The score is computed over behavioral-layer rows only** — source- and + the convention says do not advertise a badge. **The score is computed over behavioral-layer rows only.** Source- and project-layer audits do not affect `score_pct` or badge eligibility (`spec/principles/scoring.md` § "Scope: shipped-binary behavior only"). Under the current flat tier weights the formula reduces to a credit-weighted ratio: `pass = 1.0`, `warn = 0.5`, `fail` and `opt_out` = `0.0` (all in the denominator); `n_a`, `skip`, and `error` are - excluded. The general form is tier-weighted (`w(must) · w(should) · w(may)`) but currently flat — see + excluded. The general form is tier-weighted (`w(must) · w(should) · w(may)`) but currently flat; see `spec/principles/scoring.md` for the formula and the cohort bands above the floor. -- `results[]` — one entry per requirement-row. Each carries `id` (the spec requirement id, e.g. `p1-must-no-interactive` - — match this against `spec/principles/p-*.md` frontmatter), `audit_id` (the probe that produced the row, e.g. - `p1-non-interactive`), `tier` (`must` / `should` / `may`), `status`, `evidence`, `group`, `layer`, `confidence`, and - `label`. A single probe like `p3-version` emits two rows — one tier-stamped `must`, one `should` — so you can +- `results[]`: one entry per requirement-row. Each carries `id` (the spec requirement id matching + `spec/principles/p-*.md` frontmatter, e.g. `p1-must-no-interactive`), `audit_id` (the probe that produced the row, + e.g. `p1-non-interactive`), `tier` (`must` / `should` / `may`), `status`, `evidence`, `group`, `layer`, `confidence`, + and `label`. A single probe like `p3-version` emits two rows (one tier-stamped `must`, one `should`), so you can attribute the verdict to a specific RFC 2119 level without joining the coverage matrix. -- `audit_profile` — the exemption category in effect (or `null`). -- `tool / anc / run / target` metadata — identifies the scored tool, the `anc` build, the invocation, and the resolved +- `audit_profile`: the exemption category in effect (or `null`). +- `tool / anc / run / target` metadata: identifies the scored tool, the `anc` build, the invocation, and the resolved target. **Status set.** `pass` / `warn` / `fail` / `skip` / `error` are the live verdicts. `opt_out` marks a deliberate non-adoption (e.g. no `--output` flag → P2 schema-discovery rows collapse). `n_a` propagates from a conditional whose -antecedent is `opt_out` or `n_a` — the `evidence` field names the antecedent so the chain is legible from JSON alone. -The process exit code reflects live verdicts only: `n_a` from an `opt_out` antecedent does not force a non-zero exit. +antecedent is `opt_out` or `n_a`; the `evidence` field names the antecedent so the chain is legible from JSON alone. The +process exit code reflects live verdicts only: `n_a` from an `opt_out` antecedent does not force a non-zero exit. **2. Fix.** For each `fail`, look up the cited `id` (e.g. `p1-must-no-interactive`) in `spec/principles/p-*.md`'s `requirements[]` frontmatter. Apply the fix using the implementation references below. Re-run with `--principle ` to focus on one principle while iterating. -**3. Re-audit.** Re-run `anc audit --output json .` until no MUST row is in `fail` — query the JSON with `jq -'[.results[] | select(.tier == "must" and .status == "fail")] | length'` and confirm it's `0`. Do not rely on -`coverage_summary.must.verified == coverage_summary.must.total` as the satisfaction bar — `verified` increments on any -verdict, so it can equal `total` while MUST fails still exist. Use `--audit-profile ` to suppress audits that -don't apply to the tool class — `human-tui` (TUIs that legitimately intercept the TTY), `file-traversal` (reserved), -`posix-utility` (cat / sed / awk style), `diagnostic-only` (read-only tools). Suppressed audits emit `Skip` with -structured evidence so readers see what was excluded. +**3. Re-audit.** Re-run `anc audit --output json .` until no MUST row is in `fail`. Query the JSON with `jq '[.results[] +| select(.tier == "must" and .status == "fail")] | length'` and confirm it's `0`. Do not rely on +`coverage_summary.must.verified == coverage_summary.must.total` as the satisfaction bar; the `verified` counter +increments on any verdict, so it can equal `total` while MUST fails still exist. Use `--audit-profile ` to +suppress audits that don't apply to the tool class: `human-tui` (TUIs that legitimately intercept the TTY), +`file-traversal` (reserved), `posix-utility` (cat / sed / awk style), `diagnostic-only` (read-only tools). Suppressed +audits emit `Skip` with structured evidence so readers see what was excluded. **4. Claim the badge.** Once `badge.eligible == true` (≥70%), copy `badge.embed_markdown` into the project's README. The `text` output appends an embed hint after the summary line whenever the floor is cleared; below the floor, nothing @@ -119,15 +119,15 @@ badge-related is printed (the convention's "do not nag" rule). ### Useful flags -- `--principle ` — filter the audit to one principle while iterating. -- `--binary` / `--source` — scope to one layer (skip the other). -- `--audit-profile ` — suppress audits that don't apply (`human-tui`, `posix-utility`, `diagnostic-only`, +- `--principle `: filter the audit to one principle while iterating. +- `--binary` / `--source`: scope to one layer (skip the other). +- `--audit-profile `: suppress audits that don't apply (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal`). -- `--examples` — print a curated invocation block and exit (pair with `--output json` or `--json` for structured). -- `--json` — short alias for `--output json` (the `p2-should-json-aliases` convention). -- `--raw` — strip headers and summary, emit one `idstatus` line per audit. Pipe-friendly for `grep` / `awk`. -- `--color ` (env `AGENTNATIVE_COLOR`) — control ANSI styling; honors `NO_COLOR` in `auto` mode. -- `-v` / `--verbose` (env `AGENTNATIVE_VERBOSE`) — escalate diagnostic detail when debugging unexpected results. +- `--examples`: print a curated invocation block and exit (pair with `--output json` or `--json` for structured). +- `--json`: short alias for `--output json` (the `p2-should-json-aliases` convention). +- `--raw`: strip headers and summary, emit one `idstatus` line per audit. Pipe-friendly for `grep` / `awk`. +- `--color ` (env `AGENTNATIVE_COLOR`): control ANSI styling; honors `NO_COLOR` in `auto` mode. +- `-v` / `--verbose` (env `AGENTNATIVE_VERBOSE`): escalate diagnostic detail when debugging unexpected results. ### Get the scorecard JSON Schema @@ -139,13 +139,13 @@ anc emit schema # print to stdout anc emit schema | jq '.title' # inspect a field ``` -The schema `$id` is `https://anc.dev/scorecard-v0.7.schema.json`. Pre-0.6 consumers treated `opt_out` / `n_a` as unknown -— feature-detect the status enum rather than pinning to an exact list. +The schema `$id` is `https://anc.dev/scorecard-v0.7.schema.json`. Pre-0.6 consumers treated `opt_out` / `n_a` as +unknown, so feature-detect the status enum rather than pinning to an exact list. ## Implementation guidance (when fixing findings) Once `anc audit` reports a failure, the agent has the cited `id` (requirement-row id) and the spec text. The next -question is "how do I write code that satisfies this requirement?" — answered by: +question is "how do I write code that satisfies this requirement?" The references below answer that: | Need | File | | ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | @@ -185,11 +185,11 @@ anc skill update --all # refresh every known host ``` Recommended `anc audit` invocations and the full agent loop are in [`getting-started.md`](./getting-started.md). Do not -write shell scripts to grep for principle violations — `anc` already implements (and supersedes) every audit that +write shell scripts to grep for principle violations; `anc` already implements (and supersedes) every audit that approach could produce. ## Sources -- [`agentnative-spec`](https://github.com/brettdavies/agentnative) — canonical principle text (CC BY 4.0) -- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) — `anc`, the canonical auditor (MIT / Apache-2.0) -- [`agentnative-skill`](https://github.com/brettdavies/agentnative-skill) — this repo (MIT) +- [`agentnative-spec`](https://github.com/brettdavies/agentnative): canonical principle text (CC BY 4.0) +- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli): `anc`, the canonical auditor (MIT / Apache-2.0) +- [`agentnative-skill`](https://github.com/brettdavies/agentnative-skill): this repo (MIT) From bda0672f450e80261c1d7ebc316202d4f7d92eb8 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:26:06 -0500 Subject: [PATCH 10/15] chore(release): drop guarded paths leaked via PR #19 cherry-pick PR #19's squash commit (refactor: rename anc check to anc audit) modified files under `docs/brainstorms/` and `docs/plans/`, which exist on `dev` but must never reach `main` per `guard-main-docs.yml`. The initial `git update-index --remove` during conflict resolution was a no-op because the files were still in the working tree at that moment, so the cherry-pick committed them anyway. Net effect on the PR's diff against main: zero. The guard workflow uses `pulls.listFiles` (net diff), so the cleanup removes the leaked paths from the file list entirely and the guard passes. --- ...5-01-001-skill-determinism-requirements.md | 447 ----- ...c-determinism-feature-asks-requirements.md | 199 -- ...27-001-bootstrap-agentnative-skill-plan.md | 329 --- ...1-feat-skill-determinism-hardening-plan.md | 1786 ----------------- 4 files changed, 2761 deletions(-) delete mode 100644 docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md delete mode 100644 docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md delete mode 100644 docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md delete mode 100644 docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md diff --git a/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md b/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md deleted file mode 100644 index 76c290f..0000000 --- a/docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md +++ /dev/null @@ -1,447 +0,0 @@ ---- -status: draft -created: 2026-05-01 -owner: brett -target_repo: agentnative-skill -related: docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md ---- - -# `agent-native-cli` skill — determinism & runbook hardening - -Closes the gaps surfaced by the 2026-05-01 SKILL.md self-review against `create-agent-skills`. Two-agent variance is the -headline problem: two competent agents reading SKILL.md today can produce two different correct-looking-but-wrong -handlers for the same `anc audit` output. This document scopes the skill-side fixes that close that variance. Cross-repo -`anc` feature requests are tracked separately in -[`2026-05-01-002-anc-determinism-feature-asks-requirements.md`](./2026-05-01-002-anc-determinism-feature-asks-requirements.md). - -## Problem - -Today's `SKILL.md`: - -- Describes the `anc audit` JSON envelope in prose only (no sample, no shape). -- Names the `results[].status` enum once in passing without enumerating values. -- Does not document `anc audit`'s exit-code contract. -- Mentions "schema 0.5" once with no pinning guidance and no behaviour-on-bump. -- Names four `--audit-profile` categories without a selection rule. -- Stops the iteration loop on `summary.fail == 0` + `must.verified == must.total`, with no termination rule for `warn`s - once the badge floor is cleared. -- Gives no fallback for spec-version skew between bundle (`spec/VERSION` = 0.3.0) and `anc` releases. -- Provides no first-action precondition for "is `anc` even installed?". -- Leaves `badge.embed_markdown` placement, scorecard diffability, `--binary`/`--source` scoping, hybrid-language - audit-profile selection, `json` vs `text` choice, and scorecard commit policy unresolved. - -The mechanical rubric items pass (frontmatter, line count, headings, references one level deep). The gaps are content, -not structure. - -## Goals - -1. Make `anc`'s observable contract explicit in the skill so agents stop inferring it from prose. -2. Eliminate the highest-frequency situational dead-ends that don't have answers in `SKILL.md` or `getting-started.md`. -3. Stay under the 500-line SKILL.md ceiling. New material that doesn't pull weight in the entry-point file lives in - `references/`. -4. Tighten the SKILL.md / `getting-started.md` / `references/` boundary so each file owns one job and there is no - cross-file content duplication. -5. Broaden starter coverage so non-Rust paths and JSON-envelope authoring have first-class scaffolding. - -## Non-goals - -- Rewriting prose for Haiku-tier readability (separate brainstorm if pursued). -- Spec-text changes in the `agentnative` repo. -- Anything that requires `anc` to ship a new flag or output field — those are tracked in the sibling brief. -- Renaming the skill itself (`agent-native-cli`) — discoverability cost outweighs gerund-form rubric preference; see R10 - for which frontmatter changes ARE in scope. - -## Requirements - -### R1. Worked scorecard samples — three placements, three roles - -**What.** Three coordinated samples, each sized for its location: - -1. **`SKILL.md` inline** — minimal snippet (~12–15 lines) right after the "1. Check." paragraph. Shows the envelope's - top-level shape (`summary`, `coverage_summary`, `badge`, `results[]`). Always-loaded; pays for ~150 tokens. -2. **`getting-started.md`** — even shorter snippet (3–5 lines) right after the `anc audit --output json . > - scorecard.json` recipe in the "existing CLI" loop. Shows just `coverage_summary` + `badge` so an agent reading the - recipe sees invocation→output side by side without scrolling. Pattern-match level only; not a parser spec. -3. **`references/scorecard-shape.md`** — exhaustive sample (~50–60 lines) covering every top-level field, one - `results[]` entry per status value, every `audit_profile` category, full `tool` / `anc` / `run` / `target` metadata. - Loaded only when an agent writes a parser. - -**Must include (exhaustive sample):** one entry per top-level field — `summary`, `coverage_summary`, `badge`, one -`results[]` entry per status value, `audit_profile`, `tool`, `anc`, `run`, `target`. Real field names taken from a live -`anc audit --output json` run. - -**Must show (exhaustive sample):** `summary.fail` (int), `coverage_summary.must.verified` (int), `badge.eligible` -(bool), `badge.embed_markdown` (string|null), `results[].requirement_id` (kebab-case string), `results[].status` (one of -pass / warn / fail / skip / error), `results[].evidence` (object). - -**Acceptance.** - -- An agent that reads only the exhaustive sample can write a parser handling all five status values and all four - audit-profile categories without reading `anc` source. -- An agent skimming `getting-started.md` sees an output snippet without leaving the file. -- An agent skimming `SKILL.md` sees the envelope shape without leaving the file. - -### R2. `## Anc contract` section in SKILL.md - -**What.** A new short section between "First action: update check" and "Start here", or as a sub-section of "The anc -loop". Documents the observable contract. - -**Must include:** - -- **Output flag policy.** Agents always pass `--output json`; `text` is for humans. -- **Exit codes.** Table — what `0`, non-zero values, and "couldn't run" return. Pulled from `anc`'s actual behaviour; if - `anc` doesn't have a stable exit-code policy, this R is blocked behind an anc-side change in the sibling brief. -- **`results[].status` enum.** Explicit list: `pass`, `warn`, `fail`, `skip`, `error`. -- **Schema-version pin.** Tell agents to assert `.` matches a pinned value before parsing; - spec out which path that is once anc surfaces it. If the envelope doesn't currently carry a version, the assertion - guidance falls back to "pin `anc` binary version" and the durable fix moves to the sibling brief. -- **Stable vs noise.** Tell agents which subtree to diff for CI gating (`summary.*`, `coverage_summary.*`, - `results[*].{requirement_id, status}`) and which is timestamp-noise (`run.*`). - -**Acceptance.** An agent writing a CI gate that fails on regression can do so without guessing. - -### R3. `--audit-profile` decision table - -**What.** A 4-row table in SKILL.md (or `references/audit-profile-selection.md` if it needs more than 4 rows). One row -per profile — `human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` (reserved). Columns: when-to-pick -(one-line rule), example tool, what gets suppressed. - -**Must include:** a "hybrid project" rule — what to do when a binary mixes a TUI-rendering mode with a stdin-piped batch -mode, or a mostly-Rust binary with shell-helper subcommands. Either profiles compose, or the rule is "scope to the -primary entry-point and leave secondary surfaces uncovered." - -**Acceptance.** Three different agents reading the table for the same tool pick the same profile. - -### R4. Loop termination rule for warn / should - -**What.** One paragraph in "The anc loop". States explicitly: - -- `must` violations gate the loop. Continue until `must.verified == must.total`. -- `warn`s are advisory. Once `badge.eligible == true` (≥80%), stop iterating. Do not spend agent time pushing `warn`s to - `pass` unless the user explicitly asks. -- `error` (the run-failure status, distinct from `fail`) means re-run — see troubleshooting. - -**Acceptance.** An agent in "fix mode" stops when the badge clears, not when every `warn` clears. - -### R5. `anc` install precondition - -**What.** Replace or augment the current "First action: update check" so the very first action a session takes is -verifying `anc` is on `PATH`. Pseudocode: - -```text -if ! command -v anc; then - prompt user with install options (brew / cargo) and exit -fi -run bin/check-update for the skill bundle -``` - -The current update-check is for the skill bundle, not `anc`; the skill should not silently assume `anc` exists. - -**Acceptance.** Cold-start sessions never reach `anc audit` without first verifying or installing the binary. - -### R6. Spec-skew fallback - -**What.** Add a single paragraph or table row to "The anc loop" step 2 ("Fix.") covering: what to do when a finding -cites a `requirement_id` that does NOT exist in the bundle's vendored `spec/principles/`. - -**Must include:** - -- The likely cause (anc shipped against newer spec than this bundle vendors). -- The durable fix (`anc audit --explain `, **once shipped** — see sibling brief). -- The interim fix (resync via `scripts/sync-spec.sh`, or fetch `spec/principles/p-*.md` from `agentnative` `main`). -- The non-fix (do not hallucinate a definition). - -**Acceptance.** An agent hitting the skew case has a deterministic next step that isn't "read source". - -### R7. Situational runbook (the dead-ends in C) - -**What.** A new `## Common situations` section near the bottom of SKILL.md or a new `references/runbook.md` linked from -there. One paragraph per situation. Cover: - -- `badge.embed_markdown` placement — top of README, replacing existing badges, ordering relative to CI badges. Adopt - whatever convention `anc.dev/badge` recommends; if it doesn't recommend one, propose one here and push it upstream. -- Should I commit `scorecard.json`? — default no (artifact, regenerable). Override only for CI gating snapshots. -- `anc audit .` vs `--binary` vs `--source` — when each applies. -- `anc` panics or returns malformed JSON — pointer to issue tracker, do not retry blind. -- `anc skill install` host not in registry — fallback to manual `git clone` (already in `getting-started.md`; cross-link - from here). - -Each entry: ≤4 lines. The goal is a fast index, not deep prose. - -**Acceptance.** Agents stop generating issues that ask FAQs already covered. - -### R8. Frontmatter — `allowed-tools` - -**What.** Add `allowed-tools: Bash(anc *), Read` to SKILL.md frontmatter so the skill doesn't trigger permission prompts -for the canonical commands. - -**Acceptance.** Mechanical — no user prompt on `anc audit` invocations once the skill loads. - -### R9. SKILL.md / getting-started.md / references/ boundary cleanup - -**What.** Eliminate cross-file duplication, give each file one job, and reorder SKILL.md so the loop the skill teaches -is the first thing an agent reads — not a session-once meta-update side task. - -**File ownership:** - -- **SKILL.md owns:** discovery (frontmatter), preamble (what the skill is + where the three artifacts live), Quick Start - (one runnable `anc audit` command), the `anc` contract (R2), the four-step loop overview, the principles index, - pointer to `getting-started.md` for procedural detail, pointer to runbook (R7), sources. -- **`getting-started.md` owns:** the three working loops (existing CLI / new Rust / other language), full install steps - for `anc` and the skill bundle, the "where things live" pointer table. -- **`references/` owns:** depth — patterns, idioms, project structure, scorecard shape, audit-profile selection, - runbook, update-check operational details. One topic per file, no cross-file overlap. - -**Reorder SKILL.md.** Today the order is: preamble → `## First action: update check` → `## Start here` → principles → -anc loop → implementation guidance → starter code → compliance auditing → sources. The update-check section is a -session-once side task; making it section #2 trains agents to read meta-update flow before they read the work flow. - -New order: - -1. Preamble (unchanged). -2. **Quick Start** — one runnable `anc audit --output json . > scorecard.json` line + the 12–15-line inline scorecard - sample from R1. -3. **The `anc` contract** (R2) — exit codes, status enum, schema-version pin, stable-vs-noise. -4. **The anc loop** — four steps, conceptual (R4 termination rule folded in). -5. **The seven principles** index (unchanged). -6. **Pointer block** — links to `getting-started.md` (install, three loops), `references/` (depth), `templates/` - (starters), `references/runbook.md` (R7). -7. **Update-check** — demoted to a one-paragraph footnote near the bottom referencing `references/update-check.md`. The - session-once `bash bin/check-update` invocation moves to that reference file (or to `bin/README.md` if that fits the - repo convention better). -8. Sources. - -This reorder also resolves the "first runnable thing in SKILL.md is `bash bin/check-update`" issue surfaced in external -review: after the reorder, the first runnable thing is `anc audit`. - -**Must remove:** - -- The duplicated install block in `SKILL.md` § "Compliance auditing" (lives in `getting-started.md` already). -- The duplicated four-step loop where `SKILL.md` § "The anc loop" and `getting-started.md` "existing CLI" section both - walk it. Canonical homes: SKILL.md for the conceptual loop; `getting-started.md` for the runnable bash. -- The two parallel index tables — SKILL.md "Implementation guidance" pointer table and `getting-started.md` "Where - things live". Merge into one, owned by `getting-started.md`. SKILL.md retains a brief pointer paragraph. - -**Must add:** - -- SKILL.md `## Quick Start` section at position #2 above (after the preamble, before everything else). -- Inline scorecard sample from R1 lands inside Quick Start. - -**Acceptance.** - -- A grep for any of `brew install brettdavies/tap/agentnative`, `cargo install agentnative`, or `anc skill install - claude_code` returns hits in exactly one file. -- The four-step loop appears once in SKILL.md (conceptual) and the bash recipe form appears once in - `getting-started.md`. -- A first-time agent reading SKILL.md sees `anc audit` as the first runnable command, not `bash bin/check-update`. -- The update-check section is no longer the second top-level heading. -- SKILL.md stays under 200 lines after the cleanup. - -### R10. Frontmatter audit beyond `allowed-tools` - -**What.** A small structured pass over SKILL.md frontmatter, separate from R8. - -**In-scope changes:** - -- Confirm `description` lead clause answers "what is this" before "what does it pair with". Today: `Guide to designing, - building, and auditing CLI tools for use by AI agents.` — passes the test, leave alone. -- Audit the `Triggers on` keyword list in the description against the rubric's "include trigger keywords" rule. Drop any - keyword that hasn't fired discovery in practice; add ones that have come up in skill-side questions but aren't there. - Stay under 1024 chars. -- Audit the `SKIP when` clause for collisions with adjacent skills (e.g. anything that overlaps `compound-engineering` - or `create-agent-skills` should be sharper). -- **No** `argument-hint` — this is a background-knowledge skill, not a slash command. -- **No** `model:` pin — let the host decide. -- **No** `disable-model-invocation` — auto-load is correct here. -- **No** rename of `name`. The discoverability cost (existing references in `getting-started.md`, `AGENTS.md`, - templates, and external badges) outweighs the rubric's gerund-form preference. Document this decision in the doc - itself. - -**Acceptance.** Frontmatter passes a fresh `create-agent-skills` audit at the "well-tuned" tier, not just "valid", and -the kept-as-is decisions have inline rationale (in this doc, not in the SKILL.md frontmatter). - -### R11. New starter templates and framework idioms - -**What.** Broaden `templates/` and `references/framework-idioms-other-languages.md` so non-Rust paths have first-class -scaffolding instead of "read the principles and figure it out." - -**Templates to add (priority order):** - -1. **`templates/cargo-toml.md`** — drop-in `[dependencies]` block for the Rust starter. Today an agent copying - `clap-main.rs` has to know to add `clap` (with `derive` + `env`), `serde`, `serde_json`, `thiserror`, `libc` (for - SIGPIPE), `clap_complete`. AGENTS.md template documents this in prose; a copy-paste TOML block is faster. -2. **`templates/cli-tests.rs`** — `assert_cmd` patterns covering P5 mutation boundaries (idempotency, dry-run, - `--yes`/`--force` distinction). Encodes P5 by construction. -3. **`templates/scorecard-envelope.schema.json`** — JSON Schema for the canonical structured-output envelope authors - should emit when implementing P2. Compounds with R1: agents write code that conforms, and `anc` (or any other linter) - can validate against the schema directly. -4. **`templates/python-click/`** — minimal Python/Click starter (one main file + `pyproject.toml` snippet) encoding P1 - SIGPIPE handling, P2 stdout/stderr separation, P4 exit codes. Mirror of `clap-main.rs` for the Click world. -5. **`templates/go-cobra/`** — same idea, Go/Cobra. Mirrors `clap-main.rs`. -6. **`templates/agents-md-template.python.md`** and **`.go.md`** — language-flavoured AGENTS.md scaffolds, since the - current AGENTS.md template embeds Rust-specific conventions. - -**Framework idioms to add:** - -- A "JSON envelope" sub-section in `references/framework-idioms-other-languages.md` showing the canonical envelope shape - in Python/Go/JS/Ruby. Pairs with `templates/scorecard-envelope.schema.json`. -- A "testing P5 mutation boundaries" section, parallel to the Rust `assert_cmd` template, with idiomatic patterns in - `pytest`, Go `testing`, `vitest`, and `rspec`. - -**Out of scope for R11 (file kept, deferred to future work):** - -- Adding new principles or new audit profiles (those originate in `agentnative-spec`, not here). -- Adding starters for languages not already covered in `framework-idioms-other-languages.md` (Rust, Python, Go, JS, - Ruby). Adding e.g. a Swift starter is its own brainstorm. - -**Acceptance.** - -- A new Rust CLI can be bootstrapped with three `cp` commands and a `cargo add` that's just copy-paste from the new - Cargo TOML template. -- A new Python CLI has a starter that encodes P1/P2/P4 by construction. -- An author implementing P2 can validate their JSON output against the schema template without writing one from scratch. - -### R12. Consolidate the two Rust-idiom reference files - -**What.** Merge `references/framework-idioms.md` (137 lines, "Free from clap / Must implement / Anti-patterns" framing, -Rust-only) and `references/rust-clap-patterns.md` (209 lines, denser per-principle prose) into one canonical Rust -reference — proposed name `references/rust-clap.md`. Cross-checked against both files: P1, P3, P5, P7 are heavily -overlapping; P2, P4, P6 each have one file-unique nugget worth preserving. - -**File-unique content to preserve (must not be lost in the merge):** - -- From `framework-idioms.md`: the **Free / Must / Anti-patterns** three-bucket organizational framing — keep this as the - canonical structure. Also the closing pointer table to `framework-idioms-other-languages.md`. -- From `rust-clap-patterns.md`: the **subcommand-vs-flag taxonomy** in P6 (subcommands for operations, nested - subcommands for namespaces, global flags for cross-cutting modifiers, local flags for command-specific modifiers, both - flag and subcommand for meta-commands like `--help` / `--version`). This is real content, not duplication. Also: the - **`kind()` method + main-only `process::exit()`** rule in P4. Also: the **Jsonl variant** of the output format enum - (Text/Json/Jsonl) in P2. Also: the explicit references to xurl-rs / bird as exemplar codebases — keep these as - concrete-example anchors. - -**Merge strategy:** - -1. Adopt `framework-idioms.md`'s Free / Must / Anti-patterns scaffold per principle. -2. For each principle's Must-implement bucket, fold in the deeper detail from `rust-clap-patterns.md` — patterns, crate - names (FalseyValueParser, clap_complete), exit codes (sysexits 77/78), exemplar refs. -3. P2 Must-implement: explicitly enumerate Text/Json/Jsonl as the three OutputFormat variants. -4. P4 Must-implement: include both `exit_code()` and `kind()` methods, plus the "restrict `process::exit()` to `main()`" - rule. -5. P6 gets a new sub-section before the Free/Must/Anti-patterns triplet: **Flags vs subcommands taxonomy** — five - bullets covering the subcommand-vs-flag rules from `rust-clap-patterns.md`. -6. Delete `references/framework-idioms.md` and `references/rust-clap-patterns.md` once `rust-clap.md` is reviewed. -7. Update SKILL.md's "Implementation guidance" pointer table (or its consolidated R9 equivalent in `getting-started.md`) - to reference the single new file. - -**Estimated size:** ~250 lines for the merged file, replacing 346 lines of overlap. Net repo savings ~100 lines, plus -the cognitive savings of "one file per language family, organized identically." - -**Out of scope:** - -- Re-organizing `framework-idioms-other-languages.md`. That file is multi-framework by design and has a different axis - (one section per framework, not one file per language). -- Splitting the consolidated file by clap version. If clap 5 ships with breaking changes, that's a future concern. - -**Acceptance.** - -- One canonical Rust idioms reference at `references/rust-clap.md`. -- Both `framework-idioms.md` and `rust-clap-patterns.md` are deleted from the repo. -- All file-unique content listed above appears in the merged file. -- SKILL.md and `getting-started.md` reference the new file (no broken links). -- A diff between "what's in the merged file" and "union of the two source files" shows no information loss other than - prose duplication. - -## Implementation surface - -| Requirement | Lands in | Estimated size | -| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- | -| R1 | SKILL.md inline (minimal) + `references/scorecard-shape.md` (exhaustive) | 15 + 60 lines | -| R2 | SKILL.md, new section | 25–40 lines | -| R3 | SKILL.md table + `references/audit-profile-selection.md` (extended) | 10 + 40 lines | -| R4 | SKILL.md, edit existing "anc loop" | 5–10 lines | -| R5 | SKILL.md, edit "First action" | 5–10 lines | -| R6 | SKILL.md, edit "Fix." step | 5–10 lines | -| R7 | `references/runbook.md` (linked from SKILL.md) | 50–80 lines | -| R8 | SKILL.md frontmatter | 1 line | -| R9 | SKILL.md restructure + `getting-started.md` cleanup; merge index tables | net ≈0 lines | -| R10 | SKILL.md frontmatter (description + skip clause audit) | ≤5 lines net | -| R11 | `templates/cargo-toml.md`, `templates/cli-tests.rs`, `templates/scorecard-envelope.schema.json`, `templates/python-click/`, `templates/go-cobra/`, two AGENTS.md variants, sub-sections in `references/framework-idioms-other-languages.md` | ~400 lines new files | -| R12 | New `references/rust-clap.md`; delete `references/framework-idioms.md` + `references/rust-clap-patterns.md` | ~250 new, 346 deleted (net −100) | - -**SKILL.md ceiling check.** R9 deletes duplicated install + loop content (~25 lines saved), R1 + R2 + R3 + R4 + R5 + R6 -add ~70 lines, R7 + R11 + R12 land outside SKILL.md, R8 + R10 are frontmatter. Net SKILL.md delta is roughly +45 lines, -taking it from 146 to ~190 — well under the 200-line target in R9 and the 500-line rubric ceiling. - -**Repo-wide delta.** ~400 lines in new template files, ~150 lines in new reference files (R7 runbook + R3 -audit-profile-selection + R1 scorecard-shape), ~250 lines in the merged R12 file, 346 lines deleted in the R12 merge, -~45 lines net into SKILL.md, ~20 lines net into `getting-started.md` after de-duplication. Net repo: ~520 lines added, -~25 lines deleted in SKILL.md/getting-started.md de-dup, ~346 lines deleted in R12 merge. - -## Dependencies / assumptions - -- **Soft dependency on sibling brief.** R2's schema-pin guidance and R6's `--explain` reference both work better once - `anc` ships matching features. They're not blocked — interim guidance is documentable today, but the docs become - stable when anc lands the features. See sibling brief for which features. -- **`scripts/sync-spec.sh` works.** R6's interim fix relies on the existing sync script. If it's broken or stale, fix - that as a precondition. -- **`anc.dev/badge` convention.** R7's badge-placement guidance defers to the live convention. If the convention page - doesn't yet specify placement, this requirement triggers a content addition there too. - -## Success criteria - -- Two independently-prompted agents asked "parse this scorecard and report failures" produce structurally identical - handlers. -- An agent given a tool to audit and asked to pick `--audit-profile` arrives at the same answer as the human reviewer. -- Cold-start sessions never call `anc audit` against a missing binary. -- An agent finishing a remediation loop stops when `badge.eligible == true`, not when every warn clears. -- A finding citing a `requirement_id` not in the local vendored spec resolves deterministically. -- A first-time agent reading SKILL.md sees a runnable `anc audit` command before scrolling, and can find install - instructions, the three working loops, and reference depth without re-reading SKILL.md (R9). -- A new Python or Go CLI author has a starter that encodes P1/P2/P4 by construction (R11). -- A grep across the repo for any single canonical command (e.g. `anc skill install claude_code`) returns hits in exactly - one canonical location (R9). -- One canonical Rust idioms reference exists (R12). `framework-idioms.md` and `rust-clap-patterns.md` are gone; no links - in SKILL.md or `getting-started.md` are broken. -- The first runnable command an agent encounters in SKILL.md is `anc audit`, not `bash bin/check-update` (R9 reorder). - -## Open questions - -1. **R1** — inline in SKILL.md or in `references/scorecard-shape.md`? Resolved: inline a minimal sample (~15 lines) AND - link `references/scorecard-shape.md` for the exhaustive one. Recorded in the implementation table. -2. **R2** — does `anc` currently have a stable exit-code policy worth documenting, or does this R block on the sibling - brief? Verify against `agentnative-cli` source before drafting R2 prose. -3. **R7** — `## Common situations` in SKILL.md vs `references/runbook.md`? Resolved: `references/runbook.md`. SKILL.md - gets a one-paragraph pointer. -4. **R9 sequencing.** Run R9 first (de-duplicate, restructure) so R1–R7 land in their final homes the first time and - don't have to be moved during R9. Alternative: run R9 last as a cleanup pass once R1–R7 reveal where content actually - wants to live. Recommendation: R9 first; the boundary is clear enough today and re-homing later is more work. -5. **R10** — does the description's `Triggers on` keyword list match the queries skills actually fire on in real - traffic? Need a short audit (sample of recent skill-discovery hits or qmd queries) before pruning/adding. -6. **R11 scope** — ship all six template additions in one PR or split per language? Recommendation: split. Rust - additions (cargo-toml, cli-tests, scorecard-envelope.schema.json) ship together; Python and Go each ship as their own - PR so each can be reviewed by someone fluent in that ecosystem. -7. **R12 sequencing relative to R11.** The R11 Rust starter additions (cargo-toml, cli-tests, scorecard-envelope) and - R12 (consolidate Rust idioms) both touch the Rust documentation surface. Ship R12 first so the merged - `references/rust-clap.md` is the link target for the new starter docs. Recommended. - -## Handoff - -Next step: `/ce-plan` against this document. Implementation breakdown: - -- **PR 1 (skill-side determinism).** R1–R8 plus R9 (do R9 first to avoid re-homing). One commit per requirement for - review legibility. Targets `dev`. Estimated total ~250 lines added/changed across SKILL.md, getting-started.md, and - three new reference files. -- **PR 2 (frontmatter polish).** R10 only. Tiny PR; can land in PR 1 if it stays under ~10 lines. Split if R10's - description audit takes more than that. -- **PR 3 (Rust idioms consolidation).** R12 only. Merge `framework-idioms.md` + `rust-clap-patterns.md` into - `references/rust-clap.md`; delete the originals; update SKILL.md / `getting-started.md` link targets. Land before PR 4 - so the new starter docs reference the merged file. -- **PR 4 (Rust starter completion).** R11 items 1–3 — `cargo-toml.md`, `cli-tests.rs`, `scorecard-envelope.schema.json`. - Self-contained, no skill prose changes. -- **PR 5 (Python starter).** R11 items 4 + 6a — `python-click/` template, `agents-md-template.python.md`, plus the - Python sections in framework-idioms. -- **PR 6 (Go starter).** R11 items 5 + 6b — `go-cobra/` template, `agents-md-template.go.md`, plus the Go sections in - framework-idioms. - -Sibling brief (`2026-05-01-002-anc-determinism-feature-asks-requirements.md`) is filed separately against -`agentnative-cli`. R2 and R6 reference items there as soft dependencies; interim guidance ships in PR 1 regardless. diff --git a/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md b/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md deleted file mode 100644 index 02f28db..0000000 --- a/docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md +++ /dev/null @@ -1,199 +0,0 @@ ---- -status: draft -created: 2026-05-01 -owner: brett -target_repo: agentnative-cli -related: docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md ---- - -# `anc` — feature asks for deterministic agent consumption - -Cross-repo brief. Lives in `agentnative-skill` next to its sibling because the asks were surfaced by the skill-side -review on 2026-05-01, but the durable fixes ship in `anc` itself. This document is the source for an issue (or set of -issues) to be filed against [`brettdavies/agentnative-cli`](https://github.com/brettdavies/agentnative-cli). The -skill-side requirements document -([`2026-05-01-001-skill-determinism-requirements.md`](./2026-05-01-001-skill-determinism-requirements.md)) references -several items here as soft dependencies — interim skill-side guidance is shippable today, but stabilises once these -land. - -## Problem - -Agents consuming `anc audit` output today must infer details that should be self-describing in the binary's contract: - -- The JSON envelope's schema version is internal documentation, not a field agents can assert against. A bump from 0.5 - to 0.6 with renamed fields breaks consumers silently. -- A finding cites a `requirement_id` (e.g. `p1-must-no-interactive`) but resolving its text requires the consumer to - read a vendored copy of the spec, which can drift from the version `anc` was built against. -- The `text` mode appends a badge embed hint, but the embed string itself doesn't declare *where* in a README it should - land — every consumer reinvents that. -- `audit_profile` exemptions are documented at four named categories, but nothing encodes how to pick one for a hybrid - tool (TUI rendering + stdin batch mode + a shell wrapper). -- Scorecard envelopes contain timestamp / run-id metadata mixed in with the gating payload, so CI gates that diff the - whole envelope see false churn. -- `anc audit`'s exit-code semantics aren't part of the documented contract, so agents can't `&&`-chain it reliably. - -Each of these is fixable in the skill via prose, but the durable fix is for `anc` to make the contract self-describing. - -## Goals - -1. Make `anc`'s observable contract robust against version drift. -2. Eliminate the runbook items where the skill is currently papering over a missing `anc` feature. -3. Keep changes additive — no breaking changes to schema 0.5 consumers. - -## Non-goals - -- Restructuring `anc`'s internals or check pipeline. -- Spec-text changes (those go to `agentnative`). -- Anything that would prevent JSON envelopes from being consumed by an existing schema-0.5 parser (i.e. additions are - tolerated, renames are not — gate any rename behind a schema bump). - -## Asks - -Each ask is sized small enough to ship independently. Priorities reflect the skill-side benefit; `anc` may have other -priorities. - -### A1. Self-describing schema version in the envelope (highest priority) - -**What.** Surface the schema version on every `anc audit --output json` envelope at a stable, top-level path — proposed -`anc.schema_version` (sits alongside `anc.version`) or `run.schema_version`. String, semver-shaped (`"0.5"`, `"0.6.0"`). - -**Why.** Lets every agent assert `envelope.anc.schema_version == "0.5"` (or whatever they pinned against) before -parsing. Today the only fingerprint is the binary version. Schema and binary versions diverge; one is the contract, the -other is the implementation. - -**Acceptance.** - -- Field present in every JSON envelope from `anc audit`. -- Documented in the README and the `anc` man page (or equivalent). -- Bumping the schema bumps this field. Adding fields without rename does not. - -### A2. `anc explain ` (closes spec-skew gap) - -**What.** A new subcommand or flag that prints the spec text + machine-readable metadata for one `requirement_id` *from -the version of the spec `anc` was built against*. Output: `--output text` (default, human-readable) and `--output json` -(envelope with the full requirement record from spec frontmatter). - -**Why.** Today, an agent sees a finding citing `p1-must-no-interactive` and goes to `spec/principles/p1-*.md` — but the -bundle vendors a snapshot, and that snapshot can lag the `anc` build. `anc explain` is the only authoritative source: -it's whatever this specific binary was compiled against. Eliminates the spec-skew dead-end (R6 in the skill-side doc) at -its root. - -**Acceptance.** - -- `anc explain p1-must-no-interactive` returns spec text and structured metadata for a valid id. -- `anc explain bogus-id` returns a non-zero exit and a structured error envelope. -- `anc explain --list` (or `anc list-requirements`) enumerates every id the binary knows about. - -### A3. Stable exit-code policy - -**What.** Document and stabilise the exit-code contract for `anc audit` and other subcommands. Proposed: - -| Exit | Meaning | -| ---- | ------------------------------------------------------------------------- | -| 0 | Ran successfully; `summary.fail == 0` and the run is gating-clean. | -| 1 | Ran successfully; one or more `fail`s present (gating violation). | -| 2 | Invocation error — bad flags, target not found, profile name unknown. | -| 3 | Internal error — panic, malformed input, schema version mismatch on read. | - -**Why.** Lets agents and CI scripts use `anc audit && next-step` reliably, and lets agents distinguish "tool said no" -from "tool couldn't run." Closes R2's exit-code row in the skill-side doc. - -**Acceptance.** - -- Behaviour documented in README and CLI `--help`. -- Tested in CI for each exit class. -- The skill-side doc R2 references this table directly. - -### A4. Embed-position metadata on the badge block - -**What.** Add an optional field — proposed `badge.embed_position` — that names the recommended placement convention. -String enum, e.g. `"top-of-readme"`, `"under-title"`, `"in-badges-row"`, `null` if no convention applies. Defer to -whatever [`anc.dev/badge`](https://anc.dev/badge) publishes; anc just surfaces it. - -**Why.** Today four agents pasting `badge.embed_markdown` into the same README put it in four places. Convention belongs -at the source, not in every consumer's head. - -**Acceptance.** - -- Field present whenever `badge.eligible == true`. -- Skill-side R7 (badge placement runbook entry) defers to this field instead of duplicating the convention. -- Convention page at `anc.dev/badge` enumerates the values. - -### A5. Stable-output mode for CI diffability - -**What.** Either: - -- **(a)** Split the envelope into a stable subtree (gating-relevant: `summary`, `coverage_summary`, `results`, - `audit_profile`, `badge`) and a metadata subtree (`tool`, `anc`, `run`, `target`) that consumers can ignore for diff - purposes. This may already be the layout — confirm and document it. -- **(b)** Add `--stable` (or `--no-timestamps`) that elides volatile fields entirely, producing a byte-stable output for - identical input. - -**Why.** CI gates that diff scorecards across runs see false churn from `run.*` timestamps. Diffing only `summary` is -the workaround; native support eliminates it. - -**Acceptance.** - -- Document the stable subtree in the README. OR -- `anc audit --stable .` produces byte-identical output for byte-identical input. - -### A6. Profile composition for hybrid tools - -**What.** Either allow `--audit-profile` to take a comma-separated list, OR add a `--audit-profile-for-subcommand -=` repeatable flag, so a tool with a TUI mode + a batch mode can scope profiles per surface. - -**Why.** Today the four categories (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` reserved) don't -compose. Real tools mix shapes — e.g. an agent-native CLI with one rendering subcommand. Forcing a single profile either -suppresses real findings (if the user picks `human-tui`) or surfaces irrelevant ones (if they don't). - -**Acceptance.** - -- Hybrid tool case has a deterministic incantation. -- Skill-side R3 ("hybrid project rule") points to this rather than inventing a workaround. - -### A7. `anc` self-audit / `anc doctor` - -**What.** A subcommand that prints `anc.version`, `anc.schema_version`, the spec version it was built against, the -registered audit-profile names, and the registered hosts for `anc skill install`. JSON output supported. - -**Why.** Replaces a half-dozen "what does my install actually know about" questions that today require reading source. -Also supports the skill-side R5 install precondition — the agent can run `anc doctor --output json` to confirm health, -not just `command -v anc`. - -**Acceptance.** - -- One subcommand returns everything needed to validate an install. -- Output stable enough for an agent to assert against. - -## Priority and sequencing - -Ranked by skill-side benefit: - -1. **A1** (schema version) — unblocks future schema bumps without breaking consumers. Smallest change, largest leverage. -2. **A3** (exit codes) — unblocks CI gating and `&&` chaining. -3. **A2** (`anc explain`) — closes the spec-skew dead-end at its root. -4. **A5** (stable output) — closes CI diff churn. -5. **A6** (profile composition) — closes the hybrid-tool dead-end. -6. **A4** (embed position) — closes the badge placement question. Lowest urgency because the skill can document a - convention as a stop-gap. -7. **A7** (`anc doctor`) — convenience; not blocking anything. - -A1 and A3 are the two that the skill-side document treats as soft dependencies. The rest are independent improvements. - -## Filing plan - -Each ask becomes its own GitHub issue against `agentnative-cli`, labelled `agent-determinism` and cross-referencing this -brief and the skill-side sibling. The issue body is the corresponding `### Aₙ` block from this document (roughly -verbatim) plus a "Surfaced by" link to the skill-side review thread. - -A1 and A3 should land in the same `anc` release if possible — they pair naturally and together unblock the -highest-impact skill-side requirements. - -## Open questions - -1. A1 — does `tool.schema_version` already exist somewhere in the envelope? If yes, this ask collapses to "document and - stabilise"; if no, it's a small additive change. -2. A5 — confirm whether the current envelope already segregates stable from volatile fields cleanly. If so, the ask is - purely documentation. -3. A2 — is the spec already embedded in the binary at compile time, or is it loaded from a vendored snapshot at runtime? - The implementation strategy depends on the answer. diff --git a/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md b/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md deleted file mode 100644 index 5338b45..0000000 --- a/docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md +++ /dev/null @@ -1,329 +0,0 @@ ---- -title: "feat: Bootstrap agentnative-skill producer repo (Unit 1 of skill-distribution master plan)" -type: feat -status: complete -date: 2026-04-27 -completed_date: 2026-04-27 -public_flip_completed_date: 2026-04-28 -v0_2_0_completed_date: 2026-04-29 -v0.1.0_commit: 47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e -v0.1.0_tag_object: 5e2b676369a952c80da4bf133ce5ac03d34406a5 -v0.2.0_commit: 2b10c845760becf3de8d66aafbb7b57820385d45 -v0.2.0_tag_object: 054c249c36e92d5fc08603f701c999ac9ad187b6 -master_plan: ../../../agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md -related_session: ../../../agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md -origin: master plan Unit 1, executed in dedicated session ---- - -# feat: Bootstrap agentnative-skill producer repo - -## Session Context - -This plan is executed by a **dedicated session whose cwd is `~/dev/agentnative-skill/`**. A parallel session in -`~/dev/agentnative-site/` runs the matching site-side plan (Units 2–5). The two sessions exchange exactly **one -artifact**: the v0.1.0 commit SHA produced here, consumed there as `source.commit` in `src/data/install.json`. - -**Master plan (canonical reference):** -`~/dev/agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md` — Unit 1. **Sibling site -plan:** `~/dev/agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md` — Units 2–5 plus the -public-flip cutover. - -This document defines execution shape; the master plan defines architectural rationale. Read the master plan's Unit 1 -section first if any decision below feels under-justified. - -## State at Session Start - -- **Repo created (this current session):** `github.com/brettdavies/agentnative-skill` (private, empty, topics applied). -- **Local clone present:** `~/dev/agentnative-skill/` containing only `.git/`. Origin = - `git@github.com:brettdavies/agentnative-skill.git`. -- **Bundle source (private, author's machine):** `~/dev/agent-skills/agent-native-cli/` — pre-audited; no private path - references found via `grep -rln -E '(agent-skills|~/dev|/home/brett)'`. -- **Bundle inventory (verified at planning time):** -- `SKILL.md` (root) -- `checklists/new-tool.md` -- - -`references/{framework-idioms,framework-idioms-other-languages,principles-deep-dive,project-structure,rust-clap-patterns}.md` - -- `scripts/check-compliance.sh` -- `scripts/checks/_helpers.sh` + 21 `check-*.sh` scripts (broader than the master plan's illustrative list — copy the - actual filesystem, do not enforce the master plan's incomplete enumeration) -- `templates/{agents-md-template.md,clap-main.rs,error-types.rs,output-format.rs}` - -## Goals - -1. Stand up the public producer repo with the bundle at the root, governance files, CI gates, and v0.1.0 tag. -2. Migrate via clean-room re-commit (no history import from the private source). -3. Apply `github-repo-setup` skill defaults (branch protection, tag protection, CODEOWNERS). -4. **Stop before public flip.** The visibility cutover is coordinated with the site PR in the sibling session. -5. Hand off the v0.1.0 commit SHA to the site session via a one-line note. - -## Non-Goals - -- **No `gh repo edit --visibility public`.** That is the final cutover step in the site plan, not here. -- **No site repo edits.** Cross-repo coordination is via SHA handoff only. -- **No content rewriting of SKILL.md or scripts.** Migration is byte-faithful copy. Content evolution is post-bootstrap. -- **No git history import** from the private source. The private repo's branches, authorship, and paths must not leak. -- **No signed tags in v1.** Documented in the master plan as deferred to v2. - -## Branch Discipline (initial commit special case) - -The master plan and global CLAUDE.md mandate `dev` off `main`, features off `dev`, PR'd back. An empty repo has no -`main` to PR against — the first commit is a bootstrap exception: - -1. Bootstrap commit lands on `main` directly (no PR — there is nothing to review against). -2. Immediately after, create `dev` tracking `main`. Push it. **Leave `main` as the repo default branch** — this matches - the rest of the agentnative ecosystem (`agentnative`, `agentnative-cli`, `agentnative-site` all default to `main`). A - naive `git clone` should land visitors on the released bundle, not on `dev`'s WIP. -3. Tag `v0.1.0` on the bootstrap commit (pre-`dev` creation is fine; tag refers to commit SHA, not branch). -4. Any subsequent work in this repo (post-v0.1.0) follows full discipline: feat/*off `dev`, squash-PR to `dev`, - release/* cherry-picked from `main` and PR'd to `main`. - -The bootstrap exception is one commit only. Repeat-edits during this session must NOT pile onto `main`; if you discover -a fix mid-session, force yourself onto a `feat/*` branch. - -## Implementation Steps - -### Step 1: Verify session preconditions - -```bash -pwd # → /home/brett/dev/agentnative-skill -git remote -v # → origin git@github.com:brettdavies/agentnative-skill.git (fetch+push) -git status # → clean, on no branch yet (empty repo) -ls ~/dev/agent-skills/agent-native-cli/ # → bundle source visible -gh auth status # → authenticated as brettdavies -``` - -If any check fails, stop and ask the orchestrator. The repo creation in the parent session is the only external -prerequisite; everything else from here is local work. - -### Step 2: Author governance files - -Create at repo root (do **not** copy from master plan verbatim — write fresh content suited to a single-skill repo): - -- **`README.md`** — short. What this repo is (the `agent-native-cli` skill bundle, cloned-in-place install model), - pointer to `https://anc.dev/install` for install instructions, pointer to `https://anc.dev/p1` for principles, - license, contribution model. ~50 lines max. -- **`LICENSE`** — MIT, year 2026, copyright Brett Davies. Match the form used in `~/dev/agentnative-site/LICENSE` if - present (read it for exact wording); otherwise standard SPDX MIT text. -- **`CHANGELOG.md`** — keepachangelog.com format. Initial section: `## [0.1.0] - 2026-04-27 — Initial release`. Bullets - describe the bundle (7 principles, 24 compliance audits, templates). -- **`VERSION`** — single line `0.1.0\n`. No frontmatter, no metadata. This is the source of truth for the version string - the site reads at release time. -- **`SECURITY.md`** — vulnerability disclosure policy. Channel: GitHub private security advisories - (`https://github.com/brettdavies/agentnative-skill/security/advisories/new`). 90-day disclosure window. Reference - `SECURITY.md` patterns from `~/dev/agentnative-cli/SECURITY.md` if it exists. -- **`.gitignore`** — minimal: editor scratch files (`.DS_Store`, `*.swp`, `.idea/`, `.vscode/`), local todos - (`TODO*.md`, `.context/`), build artifacts if any (`target/`, `node_modules/`). -- **`.gitattributes`** — `* text=auto eol=lf` and `*.sh text eol=lf` to enforce line endings on shell scripts (Windows - hosts will checkout LF). This is in the master plan's risk table; the file IS the mitigation. - -### Step 3: Migrate the bundle (clean-room copy) - -```bash -# Copy each top-level item from the source. Use cp -a to preserve mode bits (scripts must remain +x). -cp -a ~/dev/agent-skills/agent-native-cli/SKILL.md ./ -cp -a ~/dev/agent-skills/agent-native-cli/checklists ./ -cp -a ~/dev/agent-skills/agent-native-cli/references ./ -cp -a ~/dev/agent-skills/agent-native-cli/scripts ./ -cp -a ~/dev/agent-skills/agent-native-cli/templates ./ - -# Verify scripts kept executable bits. -ls -l scripts/*.sh scripts/checks/*.sh | awk '{print $1, $NF}' | grep -v 'rwx' && \ - echo 'FAIL: non-executable scripts found' || echo 'OK: scripts executable' - -# Final frontmatter audit — should output nothing (matched at planning time, re-verify after copy). -rg -n --no-heading '(agent-skills|~/dev|/home/brett)' . || echo 'OK: no private paths' -``` - -**Frontmatter audit beyond paths:** `SKILL.md`'s top-level `description` field and `name` should match the deployed -skill. The frontmatter at planning time was clean — re-grep after the copy and pause if anything new surfaces. - -### Step 4: CI workflow - -Create `.github/workflows/ci.yml`. Jobs: - -1. `markdownlint` — runs `markdownlint-cli2` against `**/*.md`. Use the action pinned to a commit SHA per global - CLAUDE.md SHA-pinning rule (resolve via `gh api repos///commits/ --jq .sha`). -2. `shellcheck` — runs against `scripts/*.sh` and `scripts/checks/*.sh` with severity `style`. Use a pinned-by-SHA - action. - -Triggers: `push` on `main`/`dev`/`feat/**`, `pull_request` against `main`/`dev`. Required for branch protection in Step -7. - -The site session has a `~/dev/agentnative-site/.github/workflows/ci.yml` reference; you may consult it for action SHA -patterns but do NOT introduce any other site-specific CI logic — this repo's CI is intentionally minimal. - -### Step 5: CODEOWNERS - -Create `.github/CODEOWNERS`: - -```text -# Mandatory review on anything that runs on user machines at install time -scripts/** @brettdavies -.github/workflows/ @brettdavies -``` - -Per master plan risk row "Shell scripts in producer repo execute on user machines" — this is the review gate. - -### Step 6: Initial commit + push - -```bash -git checkout -b main # create main locally -git add -A -git status --short # sanity scan; no surprises -git commit -m "feat: initial bundle for agent-native-cli v0.1.0 - -Migrate the agent-native-cli skill bundle into a public producer repo. -Bundle ships at the repo root so 'git clone' directly into a host's -skills directory IS install. - -Includes: -- SKILL.md (north-star standard, 7 principles) -- checklists/, references/, scripts/, templates/ -- 24 compliance audits across 9 groups (scripts/checks/*) -- governance: LICENSE (MIT), CHANGELOG, VERSION, SECURITY.md, CODEOWNERS -- CI: markdownlint + shellcheck - -Source: clean-room re-commit from a private bundle that lived on the -author's machine since March 2026. No history imported. - -Companion endpoints (anc.dev/install, anc.dev/install.json) ship via -the agentnative-site repo, pinning to this commit SHA." -git push -u origin main - -# Now create dev tracking main and push it. -# Leave `main` as the repo default branch (matches the rest of the -# ecosystem: agentnative, agentnative-cli, agentnative-site all -# default to main). A naive `git clone` should land on the released -# bundle, not on dev's WIP. -git checkout -b dev -git push -u origin dev -``` - -### Step 7: Tag v0.1.0 - -Tag on `main` (the bootstrap commit), then push: - -```bash -git checkout main -git tag -a v0.1.0 -m "v0.1.0 — initial release" -git push origin v0.1.0 - -# Capture SHA for the site session handoff. -git rev-parse v0.1.0 # → 40-char SHA — copy this verbatim -``` - -**Hand off the SHA to the orchestrator.** The site session needs this exact value as `source.commit` in -`src/data/install.json`. - -### Step 8: Apply repo settings (branch + tag protection) - -Use `gh api` against the rulesets endpoint (modern replacement for branch-protection). The `github-repo-setup` skill is -the authoritative source for the rules — invoke it via `Skill(skill="github-repo-setup", args="audit + apply")` if -present; otherwise apply the following manually. - -**Branch ruleset (covers `main` AND `dev`):** - -- Force-push: blocked. -- Branch deletion: blocked. -- Required status checks: `markdownlint`, `shellcheck` (the two CI jobs from Step 4). -- Require linear history: optional (nice-to-have for a small repo, not load-bearing). -- Bypass: repo admins permitted (so `gh repo edit --visibility public` and emergency hotfixes still work). - -**Tag ruleset (covers `v*`):** - -- Force-push (re-tag): blocked. -- Tag deletion: blocked. -- Bypass: repo admins permitted. - -Verify both rulesets actually take effect: - -```bash -gh api repos/brettdavies/agentnative-skill/rulesets --jq '.[].name' # should list both -git checkout main -git commit --allow-empty -m "test: should be rejected" && git push origin main -# expected: refused by ruleset; if accepted, the protection isn't applied -``` - -(Reset / discard the test commit after verifying refusal.) - -### Step 9: Verification gates - -All Step 9 gates pass: - -- [x] `gh repo view brettdavies/agentnative-skill --json visibility -q .visibility` → `PUBLIC` (flipped 2026-04-28). -- [x] `gh repo view brettdavies/agentnative-skill --json defaultBranchRef -q .defaultBranchRef.name` → `main`. -- [x] `gh run list --branch main --limit 1` → CI job for the initial commit completed `success`. -- [x] `gh release list` → both `v0.1.0` and `v0.2.0` published. -- [x] `gh api repos/brettdavies/agentnative-skill/git/ref/tags/v0.1.0 --jq '.object.sha'` → - `5e2b676369a952c80da4bf133ce5ac03d34406a5` (tag object SHA, recorded in frontmatter alongside the commit SHA - `47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e`). -- [x] Force-push to `main` fails for non-admin: `protect-main` rules array blocks force-push; the `actor_type: - RepositoryRole, actor_id: 5, bypass_mode: always` bypass allows admin overrides. -- [x] `markdownlint-cli2 '**/*.md'` exits zero locally. -- [x] `shellcheck scripts/*.sh scripts/checks/*.sh` exits zero with three narrow `.shellcheckrc` disables (`SC1091`, - `SC2034`, `SC2125`) tolerating style-only findings in the migrated bundle scripts. -- [x] **Local install smoke test (Claude Code):** - ```bash - mkdir -p /tmp/anc-test-home/.claude/skills - HOME=/tmp/anc-test-home git clone --depth 1 git@github.com:brettdavies/agentnative-skill.git \ - /tmp/anc-test-home/.claude/skills/agent-native-cli - ls /tmp/anc-test-home/.claude/skills/agent-native-cli/SKILL.md # exists - rm -rf /tmp/anc-test-home - ``` - (SSH clone was verified during the bootstrap session against the then-private repo. After the 2026-04-28 public - flip, both `git clone https://github.com/brettdavies/agentnative-skill.git` and the SSH form should work; the - site session's e2e relies on the HTTPS path.) - -### Step 10: Hand-off note - -Handoff was delivered to the orchestrator chat with the bootstrap SHAs (commit -`47a76cceb8b7b1bc013c19ee18a5e38179b1dd0e`, tag object `5e2b676369a952c80da4bf133ce5ac03d34406a5`) and repo visibility -(`PRIVATE` during bootstrap, flipped `PUBLIC` 2026-04-28). No separate handoff doc was committed. - -## Risks & Mitigations (this session only) - -| Risk | Mitigation | -| --------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Accidental include of bundle source's parent dir or hidden files | `git status --short` between copy and commit; explicit per-item `cp -a` (Step 3) rather than wildcard from source root | -| Scripts lose +x bits on copy | `cp -a` preserves modes; verification in Step 3 fails loud if any script is non-executable | -| `markdownlint` flags existing skill content not previously CI-checked | Resolve in-session: either fix the markdown to pass, or add a narrowly-scoped `.markdownlint.jsonc` exception with a comment explaining why. Do NOT lower the bar globally | -| Branch protection blocks the very `git push` that creates `dev` | Push `main` BEFORE creating the ruleset (Step 6 precedes Step 8); rulesets apply on next push | -| Frontmatter audit false-negative — private path slips through | Re-run audit in Step 3 AFTER copy, not just at planning time; the planning-time grep is supporting evidence, not a substitute | -| GitHub Release object expected but not created | Master plan does not require a Release object for v1 — tag suffices. Don't create one speculatively | -| Tag created on `dev` instead of `main` | Step 7 explicitly checks out `main` before tagging. The bootstrap commit is on `main`; `dev` was created from it so they point at the same SHA at this moment, but the canonical tag-bearing branch is `main` | -| Site session starts before this is done | The user is orchestrating both sessions and will not start the site session before this session reports the SHA | - -## Files Touched (in this repo only) - -```text -.gitattributes -.gitignore -.github/CODEOWNERS -.github/workflows/ci.yml -CHANGELOG.md -LICENSE -README.md -SECURITY.md -SKILL.md # migrated from ~/dev/agent-skills/agent-native-cli/ -VERSION -checklists/new-tool.md # migrated -docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md # this plan, committed alongside bootstrap -references/ # migrated (5 files) -scripts/ # migrated (1 top-level + 1 helper + 21 check scripts) -templates/ # migrated (4 files) -``` - -## Sources & References - -- **Master plan:** `~/dev/agentnative-site/docs/plans/2026-04-24-001-feat-skill-distribution-endpoint-plan.md` (Unit 1) -- **Sibling site plan (parallel session):** - `~/dev/agentnative-site/docs/plans/2026-04-27-001-feat-skill-distribution-site-plan.md` -- **Bundle source (private, author's machine):** `~/dev/agent-skills/agent-native-cli/` -- **Skill on Claude Code (live):** `~/.claude/skills/agent-native-cli/` — same content as the bundle source; do NOT use - this path as the migration source (it may diverge from the master copy in `~/dev/agent-skills/`) -- **External:** [agentskills.io specification](https://agentskills.io/specification), - [garrytan/gstack](https://github.com/garrytan/gstack) -- **Global rules consulted:** `~/.claude/CLAUDE.md` (SHA-pinning, branch discipline, no AI attribution, dev-flow - squash-merge) diff --git a/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md b/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md deleted file mode 100644 index b9dc0cc..0000000 --- a/docs/plans/2026-05-01-001-feat-skill-determinism-hardening-plan.md +++ /dev/null @@ -1,1786 +0,0 @@ ---- -title: "feat: SKILL.md determinism + runbook hardening (R1–R12)" -type: feat -status: active -date: 2026-05-01 -origin: docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md -sibling_brief: docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md ---- - -# feat: SKILL.md determinism + runbook hardening (R1–R12) - -## Summary - -Rewrite `SKILL.md`, `getting-started.md`, and four reference files against `anc 0.3.0`'s live JSON envelope -(`schema_version: "0.5"`, top-level `tool/anc/run/target/badge/audience/audit_profile`, `results[].id`); add three new -reference files (`runbook.md`, `scorecard-shape.md`, `audit-profile-selection.md`) and a consolidated `rust-clap.md`; -ship six new templates (Cargo TOML, CLI tests, scorecard JSON Schema, Python/Click + Go/Cobra starters with paired -AGENTS.md scaffolds). Sequenced as seven landable PRs: R9 boundary cleanup first, then R1–R8 determinism additions, then -R10 frontmatter, then R12 Rust idioms consolidation, then three language-starter PRs, then R14 deterministic-script -hardening (the dogfood gate that makes `bin/check-update` and `scripts/sync-spec.sh` pass the audit the skill teaches). -All seven PRs ship in one consolidated `v0.4.0` release. - ---- - -## Problem Frame - -`SKILL.md` today describes the `anc` envelope in prose without samples, names `results[].status` once without -enumerating values, has no exit-code contract, no schema-version pin, no `--audit-profile` selection rule, no -termination rule for `warn`s, no fallback for spec-skew, no install precondition, no runbook, and no allowed-tools -frontmatter. Two competent agents reading it can produce two different correct-looking-but-wrong scorecard handlers — -that variance is the problem this plan closes. Full pain narrative + all twelve requirements live in -`docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md`. A sibling cross-repo brief -(`docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md`) tracks the `anc`-side feature requests -this plan soft-depends on; interim guidance ships here regardless. - ---- - -## Requirements - -- R1. Worked scorecard samples in three placements (SKILL.md inline ~15 lines, getting-started.md inline ~5 lines, - `references/scorecard-shape.md` exhaustive). -- R2. `## Anc contract` section in SKILL.md — output flag policy, exit codes, `results[].status` enum, schema-version - pin path, stable-vs-noise subtree. -- R3. `--audit-profile` decision table in SKILL.md + extended `references/audit-profile-selection.md`. -- R4. Loop termination rule for `warn` / `should` (must gates the loop; warn is advisory; stop on `badge.eligible`). -- R5. `anc` install precondition replaces "First action: update check" as the very first action. -- R6. Spec-skew fallback paragraph in "The anc loop" step 2 (Fix.). -- R7. `references/runbook.md` covering badge placement, scorecard commit policy, `--binary` / `--source` scoping, anc - panics, skill-host fallback. -- R8. `allowed-tools: Bash(anc *), Read` added to SKILL.md frontmatter. -- R9. SKILL.md / `getting-started.md` / `references/` boundary cleanup + reorder so the first runnable command in - SKILL.md is `anc audit`, not `bash bin/check-update`. SKILL.md kept under 200 lines. -- R10. SKILL.md frontmatter audit beyond `allowed-tools` — description triggers, skip clause, kept-as-is decisions - documented. -- R11. New starter templates: `cargo-toml.md`, `cli-tests.rs`, `python-click/`, `go-cobra/`, - `agents-md-template.{python,go}.md`; new sub-sections in `references/framework-idioms-other-languages.md` for JSON - envelope authoring and P5 mutation-boundary testing. **The `templates/scorecard-envelope.schema.json` artifact origin - R11 enumerated is dropped:** the canonical schema is shipped by upstream `agentnative-cli` (plan - `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md`) embedded in the binary and exposed via - `anc generate scorecard-schema`; vendoring a copy here would duplicate and drift. U14 is repurposed to document the - extraction path. The "P2 author-output template" intent of origin R11.3 (a generic envelope template tool authors emit - *their tool's* output against, distinct from anc's audit envelope) is deferred to follow-up work. -- R12. Consolidate `references/framework-idioms.md` + `references/rust-clap-patterns.md` into a single - `references/rust-clap.md`; delete the originals; preserve every file-unique nugget the origin enumerates. -- R13. Subcommand index + target-resolution runbook entry (**plan-introduced, not origin-traced**). A short SKILL.md - table covering the full `anc` subcommand surface (`check`, `generate`, `skill install`, `completions`, `help`, plus - the bare-`anc`-defaults-to-`check` shortcut), so agents stop punting to `anc --help` for basic discovery. Plus an - expanded runbook entry covering all four target modes — `anc audit .` (project mode), `--binary`, `--source`, - `--command ` — with resolution rules for each. Closes the two gaps a 2026-05-01 peer review of the current - SKILL.md surfaced ("the full subcommand surface" and "target resolution rules for `--binary` vs `--source` vs - `--command` vs project mode"). -- R14. Deterministic-script hardening (dogfood gate, **plan-introduced 2026-05-04**). `bin/check-update` and - `scripts/sync-spec.sh` must pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true` — - the skill audits CLIs against the agent-native spec, so the skill's own runtime scripts must pass that audit. Add - `--output text|json`, `--quiet`, `--no-interactive`, `--timeout ` global flags (P1, P2, P7) to both. Add - `--dry-run` (P5) and `--force` (P5) to `scripts/sync-spec.sh`. Move success-path status echoes from stdout to stderr - in `sync-spec.sh` so `--output json` produces a clean envelope. Distinguish network-fail from up-to-date in - `bin/check-update`'s output (today both branches exit 0 silently). Document the text-mode token grammar, the JSON - envelope per status, and the env-var contract (`AGENTNATIVE_SKILL_DIR`, `AGENTNATIVE_SKILL_REMOTE_URL`, - `AGENTNATIVE_SKILL_STATE_DIR`, `SPEC_REMOTE_URL`, `SPEC_ROOT`) in `references/update-check.md` and - `references/runbook.md`. - -(R1–R12 cited verbatim from origin §Requirements; R13 added by this plan in response to a peer review of the live -SKILL.md that surfaced two recurrent gaps the origin brainstorm did not anticipate; R14 added 2026-05-04 to make the -skill's own scripts pass the audit the skill teaches — the dogfood gate. Success criteria from origin §Success Criteria; -R13's success criterion is "an agent driving `anc` cold can enumerate the subcommand surface and pick the right target -mode without reading `anc --help` source"; R14's is "`anc audit --binary --audit-profile posix-utility` against -`bin/check-update` and `scripts/sync-spec.sh` produces `badge.eligible == true` for both.") - ---- - -## Scope Boundaries - -- No `anc` binary changes (sibling brief `2026-05-01-002` covers those). -- No spec-text edits — `spec/` is vendored from `agentnative` via `scripts/sync-spec.sh`. -- No skill rename — `name: agent-native-cli` stays per origin R10's documented "no" decision. -- No Haiku-tier prose rewrite (separate brainstorm if pursued). -- No new principles, new audit profiles, or new starter languages beyond Python/Go (Rust + JS + Ruby idioms exist; - Swift, Kotlin, etc. are their own brainstorm). - -### Deferred to Follow-Up Work - -- JS and Ruby rows in `references/framework-idioms-other-languages.md`'s new "JSON envelope authoring" and "Testing P5 - mutation boundaries" sub-sections — section scaffolds land in PR 5; JS / Ruby rows defer to a follow-up PR matched to - a JS/Ruby starter ecosystem expert. -- Updating origin brainstorm's R1 "Must show" field-list to use `id` instead of `requirement_id` — origin doc is closed; - correction lives only in this plan and in the shipped docs. -- **P2 author-output envelope template** (the half of origin R11.3 that's distinct from anc's audit envelope). A generic - envelope authors emit *their tool's* `--output json` against — not the anc audit envelope U14 now points at. Deferred - to a follow-up plan because (a) origin R11.3 conflated the two artifacts, and (b) "what envelope shape should a P2 - tool emit?" is its own open question that compounds with the existing `references/framework-idioms-other-languages.md` - JSON-envelope sub-section but isn't a one-file deliverable. - ---- - -## Context & Research - -### Relevant Code and Patterns - -- `SKILL.md` (146 lines today) — entry point; frontmatter + preamble + four loop steps + principles index + - implementation guidance + starter code + compliance auditing + sources. R9 reorders this. -- `getting-started.md` (85 lines) — three working loops + install + "Where things live". R9 dedupes against SKILL.md. -- `references/framework-idioms.md` (137 lines, Free / Must / Anti-patterns scaffold, Rust-only) and - `references/rust-clap-patterns.md` (209 lines, denser per-principle prose). R12 merges. -- `references/framework-idioms-other-languages.md` (338 lines; Click, argparse, Cobra, Commander, yargs, oclif, Thor) — - R11 adds two new cross-language sub-sections. -- `references/project-structure.md` (165 lines) — Rust-prescriptive; not touched by this plan but linked from new - `rust-clap.md`. -- `references/update-check.md` (33 lines) — already separated; R5 + R9 demote SKILL.md's update-check section here. -- `templates/clap-main.rs`, `templates/error-types.rs`, `templates/output-format.rs`, `templates/agents-md-template.md` - — canonical Rust patterns; R11 adds Cargo TOML + tests + JSON Schema + Python/Go siblings. -- `bin/check-update` — bash, exits 0 always, emits `UPGRADE_AVAILABLE` on divergence. Used by R5's revised first action - (after `anc` is verified on PATH). -- `spec/principles/p1-*.md` … `p7-*.md` — vendored at `spec/VERSION = 0.3.0`. Cited by R6 spec-skew fallback. - -### Live `anc 0.3.0` envelope (verified 2026-05-01 against `anc audit --output json --command rg`) - -Top-level keys: `schema_version` (`"0.5"`), `spec_version` (`"0.3.0"`), `summary`, `coverage_summary`, `results`, -`audit_profile`, `audience` (`"agent-optimized"` for rg), `badge`, `tool`, `anc`, `run`, `target`. Result entries carry -`id` (kebab-case, e.g. `p3-help`), `label`, `group` (`P1`–`P7`), `layer` (`behavioral` / `project` / `source`), `status` -(one of `pass` / `warn` / `fail` / `skip` / `error`), `evidence`, `confidence`. Badge block populated when `eligible: -true`: `score_pct`, `embed_markdown`, `scorecard_url`, `badge_url`, `convention_url`. Run block carries `invocation`, -`started_at` (ISO-8601), `duration_ms`, `platform.{os,arch}`. Target block: `kind` (`command` / `path`), `path`, -`command`. Tool block: `name`, `binary`, `version`. Anc block: `version`. - -Observed exit codes (verified 2026-05-01 against `anc 0.3.0` for echo / true / false / cat / ls / grep / rg): `0` when -summary is fully clean (no warn / fail / skip); `1` when warn or skip are present and `summary.fail == 0` (cat, ls, -grep, rg); `2` when `summary.fail > 0` (echo, true, false — each had `fail >= 1`) **OR** for invocation errors (missing -path, bad flag, unknown command). The `2` slot is overloaded today: it means "tool gated correctly with at least one -fail" *and* "tool failed to run." Agents writing CI gates **must** switch on `summary.fail`, not on `$?`, since `$? == -2` cannot distinguish a real fail from a missing-path. Sibling brief A3 will split these (`1` for `fail > 0`, `2` -reserved for invocation errors only); until then, R2 documents this observed-but-overloaded behavior. - -### Institutional Learnings - -- `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — SKILL.md is a metadata -- branch-router; depth lives in `references/`; templates must be self-contained text, not shell-callable. Reinforces R9 - boundary cleanup. -- `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` — document the envelope - contract once in `references/` and link from SKILL.md; do not duplicate field names inline. Drives the three-placement - strategy in R1. -- `~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md` - — agents must switch on typed fields, never grep human-mode display strings. Drives R2's stable-vs-noise guidance. -- `~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` — error - envelopes must carry the same context fields as success envelopes. Relevant to R11.3 schema template. -- `~/dev/solutions-docs/best-practices/agentnative-version-model-2026-05-01.md` — six version concepts across four - repos; every version mention in SKILL.md must name *which* version (skill bundle / spec / anc). Drives R2 schema-pin - wording. - -### External References - -- `anc.dev/badge` — convention page referenced by R7 badge-placement entry. R7 defers placement to whatever the - convention publishes; if the page doesn't enumerate placement values, R7 proposes one and the sibling brief A4 - surfaces it as `badge.embed_position` metadata in a future anc release. - ---- - -## Key Technical Decisions - -- **Envelope reality alignment.** Every envelope-shape claim in `SKILL.md` and `getting-started.md` is rewritten against - `anc 0.3.0` live output. The brainstorm's R1 "Must show" field list is treated as advisory: where it says - `requirement_id`, the docs ship `id` (the actual envelope field). **The two are different namespaces, not synonyms.** - Envelope `results[].id` is an AUDIT id (`p3-help`, `p1-flag-existence`, `p6-sigpipe`); spec frontmatter - `requirements[].id` is a REQUIREMENT id (`p1-must-no-interactive`, `p2-must-output-flag`). One check verifies one or - more requirements; the mapping lives in each principle file's body prose (e.g. p6's "Measured by audit IDs - `p6-sigpipe`, `p6-no-color`, `p6-completions`, `p6-timeout`, `p6-agents-md`"). The previously-claimed "1:1 mapping" - was wrong and is retracted across U4 / U5 / U8. The brainstorm's `tool/anc/run/target/badge.*` claims now match - reality so they ship verbatim. New top-level fields the brainstorm did not enumerate (`audience`, `audience_reason`) - are documented in `references/scorecard-shape.md` with an explicit value enumeration captured at U5 implementation - time. -- **Schema-pin path is concrete.** R2 pins top-level `schema_version` (which exists today as `"0.5"`). No anc-side - change required for the pin to ship usable. Sibling brief A1 proposed `anc.schema_version` for additivity; current - top-level placement is documented as the canonical path. -- **Exit-code documentation is interim AND overloaded.** R2's exit-code table documents observed `anc 0.3.0` behavior - empirically: `0` = fully clean, `1` = warn/skip-only with `summary.fail == 0`, `2` = `summary.fail > 0` **or** - invocation error (the slot is overloaded today). Verified across 7 commands. The skill **must** instruct agents to - switch on `summary.fail`, never `$?`, since `$? == 2` cannot distinguish a real fail from a missing-path. Sibling - brief A3 will split the slots; until then, the interim guidance is `gate on summary.fail, treat $? as advisory`. The - earlier draft of this decision (`1` = any non-pass present, `2` = invocation error only) was empirically wrong and is - retracted. -- **Badge runbook works against the JSON envelope.** R7's badge-placement entry references `badge.embed_markdown` - directly (it exists in JSON now); placement convention defers to `anc.dev/badge`; soft-deps on sibling brief A4 only - for machine-readable position metadata. -- **R9 first, R12 before R11 Rust starters.** Origin Open Question #4 chose R9 first to avoid re-homing R1–R7 content. - Origin Open Question #7 chose R12 before the Rust starter additions (PR 4) so the merged `references/rust-clap.md` is - the link target the new starter docs reference. This plan adopts both. -- **R12 merge scaffold.** Adopt `framework-idioms.md`'s Free / Must / Anti-patterns three-bucket structure as canonical - (per origin's "merge strategy"); fold every file-unique nugget enumerated in origin (subcommand-vs-flag taxonomy in - P6, `kind()` method + main-only `process::exit()` in P4, Jsonl variant in P2, exemplar references to xurl-rs / bird). -- **R10 trigger-keyword audit defers into U10.** Origin Open Question #5 asked for a recent-traffic audit before - pruning/adding triggers. The audit happens *inside* U10 (qmd query against the skills collection + manual review of - the existing trigger list) rather than as a planning prerequisite. Keeps the plan unblocked. -- **PR-shaped phasing.** The plan ships as six landable PRs matching origin's handoff section. Each phase = one PR. This - means the plan doubles as a multi-PR coordination doc; reviewers can land PR 1 without waiting on the rest. -- **Cross-language idiom sub-sections split across PR 5 + PR 6.** R11's "JSON envelope authoring" and "Testing P5 - mutation boundaries" sub-sections in `references/framework-idioms-other-languages.md` cover four languages (Python, - Go, JS, Ruby). Section scaffold + Python rows land in PR 5; Go rows append in PR 6; JS + Ruby rows defer to follow-up - work. -- **Schema-as-pointer, not schema-as-artifact (U14 pivot).** Origin R11.3 enumerated a hand-written - `templates/scorecard-envelope.schema.json` in this skill bundle. Upstream `agentnative-cli` plan - `2026-04-30-002-feat-scorecard-json-schema-plan.md` ships the canonical scorecard schema embedded in the `anc` binary - (derived from Rust types via `schemars`, exposed via `anc generate scorecard-schema`, archived at - `https://anc.dev/scorecard-v0.5.schema.json`). Vendoring a copy in this bundle would duplicate and drift the moment - upstream bumps. U14 is repurposed: the skill points at the upstream verb (`anc generate scorecard-schema --output -`) - rather than shipping its own copy. The "P2 author-output template" half of origin R11.3 (a generic envelope template - tool authors emit *their tool's* output against, not anc's audit envelope) is deferred to follow-up work — it's a - different artifact and origin conflated the two. -- **Verification posture.** PRs 1–6 are docs + templates. Most units verify via grep checks, line-count assertions, - cross-file link integrity, and byte-faithful comparison against `anc audit --output json` output. Three units have - executable verification: U13 (`cli-tests.rs` runs under `cargo test`), U15 + U18 (Python / Go starters compile or - import cleanly). U14 is now docs-only (it documents the upstream `anc generate scorecard-schema` extraction path), so - it carries no executable verification beyond grep checks and link integrity. **PR 7 is the executable-verification - exception:** U22 / U23 modify runtime bash scripts and U25 is the empirical dogfood gate (`anc audit --binary` against - both scripts). -- **R14 dogfood gate (PR 7).** Approach is additive: new flag surfaces (`--output text|json`, `--quiet`, - `--no-interactive`, `--timeout`, `--dry-run`, `--force`) extend the existing CLI; default behavior is preserved - byte-for-byte when no new flags are passed (back-compat for existing consumers and the cache file at - `~/.cache/agent-native-cli/last-update-check`). Cache file format stays in legacy `UP_TO_DATE ` / - `UPGRADE_AVAILABLE ` shape; JSON appears at output time only. One minor breaking change in `sync-spec.sh`: - success-path status echoes ("querying…", "vendoring…", "wrote N principle files") move from stdout to stderr so - `--output json` produces a clean stdout envelope. PR 7's changelog calls the breaking change out explicitly. The - dogfood claim — the skill audits CLIs against the agent-native spec and the skill's own runtime scripts pass that same - audit at the badge-eligible bar — is non-trivial credibility for the skill itself. - ---- - -## Open Questions - -### Resolved During Planning - -- **OQ-origin-#1** (R1 inline vs reference): inline minimal (~15 lines) **and** reference exhaustive (~60 lines). Both - ship in U5. -- **OQ-origin-#3** (R7 location): `references/runbook.md`, with a one-paragraph pointer in SKILL.md. U9. -- **OQ-origin-#4** (R9 sequencing): R9 first. PR 1 leads with U1. -- **OQ-origin-#6** (R11 PR split): split per language ecosystem. PR 4 Rust, PR 5 Python, PR 6 Go. -- **OQ-origin-#7** (R12 vs R11 Rust): R12 first. PR 3 lands before PR 4. -- **Field-name drift** (`results[].id` vs origin's `requirement_id`): use `id` (actual). The earlier "maps 1:1" framing - was wrong — envelope `id` and spec frontmatter `requirements[].id` are different namespaces (audit id vs requirement - id). U4 / U5 / U8 carry the corrected framing and the principle-prose mapping path. - -### Deferred to Implementation - -- **OQ-origin-#2** (R2 anc exit-code policy): empirically resolved by 2026-05-01 probing across 7 commands. Observed: - `0` clean, `1` warn/skip with `fail==0`, `2` overloaded (`fail>0` OR invocation error). The skill ships interim - guidance "**gate on `summary.fail`, not `$?`**" with cross-link to sibling brief A3 (which will split the overloaded - `2` slot). The earlier draft contract (`1` for any non-pass, `2` for invocation error only) is empirically wrong and - retracted. -- **OQ-origin-#5** (R10 trigger-keyword traffic audit): U10 runs the audit at implementation time using a qmd query - against the skills collection and manual review of `Triggers on:` entries. -- **`anc.dev/badge` placement convention** (R7 dependency): if the page does not enumerate placement values when U9 - lands, U9 proposes one (`top-of-readme` after the H1) and the proposal is mirrored to sibling brief A4. -- **R14 CI integration**: PR 7 ships the manual dogfood gate (U25) plus a `CONTRIBUTING.md` note pinning re-runs to - edits of either script. Wiring the audit into CI as a regression gate (`gh workflow` entry running `anc audit - --binary` on every PR touching `bin/` or `scripts/`) is **explicitly out of scope** for this plan and belongs to a - follow-up CI-hardening PR. The manual gate is sufficient for the v0.4.0 release; CI integration is the next - compounding step. -- **R14 audit-profile choice** (`posix-utility` vs alternatives): U25 audits both scripts under `posix-utility`. If - either script's primary surface turns out to be more idiomatic under a different profile (unlikely — both are - stdin-quiet, stdout-emitting, no TUI), the audit-profile selection note in `references/audit-profile-selection.md` - (U6) is the authoritative source. Pick is empirically resolved at U25 implementation time; if a different profile is - needed, the U25 verification section captures the rationale. - ---- - -## Implementation Units - -Units are grouped by phase (= PR). Each unit is a landable atomic commit within its PR. U-IDs are stable; reordering or -splitting preserves them. - -### Phase 1 — PR 1: Skill-side determinism (R1–R9) - -- U1. **R9: SKILL.md restructure + getting-started.md dedup** - -**Goal:** Reorder SKILL.md so the first runnable command is `anc audit` (not `bash bin/check-update`). Eliminate -cross-file duplication (install block, four-step loop, parallel index tables). Land the new section skeleton so U2–U9 -fill in their own slots without re-homing. - -**Requirements:** R9. - -**Dependencies:** None. (Runs first per OQ-origin-#4.) - -**Files:** - -- Modify: `SKILL.md` -- Modify: `getting-started.md` - -**Approach:** - -- New SKILL.md section order: Preamble → `## Quick Start` (placeholder for U5's inline scorecard sample + the canonical - `anc audit --output json . > scorecard.json` line) → `## Subcommand index` (placeholder for U21) → `## Anc contract` - (placeholder for U4) → `## The anc loop` (conceptual four steps; U7 folds in termination rule, U8 folds in spec-skew) - → `## The seven principles` index → `## Audit profile selection` (placeholder for U6) → `## Common situations` - (one-paragraph pointer to `references/runbook.md`, U9 + U21 fill) → Pointer block → `## Update-check` (demoted to - one-paragraph footnote pointing at `references/update-check.md`) → Sources. -- Delete the duplicated install block in SKILL.md `## Compliance auditing` (lives in `getting-started.md` already). -- Delete the duplicated four-step loop where SKILL.md and `getting-started.md` both walk it. Canonical homes: SKILL.md - for the conceptual loop; `getting-started.md` for the runnable bash recipe. -- Merge SKILL.md's `## Implementation guidance` pointer table and `getting-started.md`'s `## Where things live` table - into one canonical table in `getting-started.md`. SKILL.md keeps a short pointer paragraph. -- Keep SKILL.md under 200 lines after the cleanup (origin R9 acceptance). - -**Patterns to follow:** Existing SKILL.md tone and table formatting (kept consistent with `agentnative-spec` skill). -`getting-started.md`'s "Where things live" table format. - -**Test scenarios:** - -- Verification: `grep -c 'brew install brettdavies/tap/agentnative' SKILL.md getting-started.md` returns 0 in - `SKILL.md`, ≥1 in `getting-started.md` (origin R9 acceptance). -- Verification: `grep -c 'anc skill install claude_code' SKILL.md getting-started.md` returns hits in exactly one file - (`getting-started.md`). -- Verification: First runnable command in SKILL.md (first fenced block after the preamble) is `anc audit`, not `bash - bin/check-update`. -- Verification: `wc -l SKILL.md` ≤ 200 after the full PR 1 lands (re-checked at end of PR 1, not at U1 alone since U2–U9 - add ~70 lines). -- Verification: `## Update-check` is no longer the second top-level heading; appears near the bottom. -- Verification: `markdownlint-cli2 SKILL.md getting-started.md` passes. - -**Verification:** Reordered SKILL.md presents Quick Start as section #2; install + four-step loop appear in exactly one -canonical location each; markdownlint clean. - ---- - -- U2. **R8: SKILL.md `allowed-tools` frontmatter** - -**Goal:** Add `allowed-tools: Bash(anc *), Read` to SKILL.md frontmatter so canonical `anc audit` invocations don't -trigger permission prompts on hosts that honor the field. - -**Requirements:** R8. - -**Dependencies:** U1 (so U2's frontmatter edit doesn't conflict with U1's body restructure). - -**Files:** - -- Modify: `SKILL.md` (frontmatter only) - -**Approach:** Single-line addition to YAML frontmatter. Place after `description:` and before the closing `---`. Use the -exact string `Bash(anc *), Read` so Claude Code's matcher accepts both bare `anc` and any `anc ` invocation. - -**Patterns to follow:** Skills 2.0 frontmatter convention surfaced in -`~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — `allowed-tools` is -advisory in interactive mode, enforced under headless `claude -p`. - -**Test scenarios:** - -- Verification: `grep -A1 '^name: agent-native-cli$' SKILL.md` shows the description; `grep '^allowed-tools:' SKILL.md` - returns one match. -- Verification: YAML frontmatter parses (`python3 -c "import yaml,sys; - yaml.safe_load(open('SKILL.md').read().split('---')[1])"`). - -**Verification:** Frontmatter is well-formed; `allowed-tools` line present and exactly matches the canonical string. - ---- - -- U3. **R5: anc install precondition replaces "First action: update check"** - -**Goal:** First action in any session is verifying `anc` is on `PATH`. Update-check (which is for the skill bundle, not -`anc`) demotes to a referenced sub-flow. - -**Requirements:** R5. - -**Dependencies:** U1 (uses U1's restructured section layout — `## Update-check` as bottom footnote). - -**Files:** - -- Modify: `SKILL.md` (replace current `## First action: update check` content; add precondition pseudocode) -- Modify: `references/update-check.md` (only if U1 didn't already absorb the demoted bash; otherwise unchanged) - -**Approach:** - -- Replace SKILL.md's current `## First action: update check` body with: a one-paragraph "anc precondition" + the - pseudocode block from origin R5 (`if ! command -v anc; then prompt user; exit; fi`) + a one-line pointer to the - demoted update-check footnote. -- Pseudocode is markdown-fenced shell-pseudo (not literal bash), so agents read it as logic, not as a command to execute - verbatim. Origin's exact pseudocode shape preserved. -- The demoted update-check section runs the existing `bash bin/check-update` after the precondition gate clears. - -**Patterns to follow:** Existing `references/update-check.md` voice + structure for the demoted footnote. - -**Test scenarios:** - -- Verification: `grep -B2 -A8 '^## ' SKILL.md | head` shows the install precondition section appears before any other - procedural section. -- Verification: SKILL.md contains the `command -v anc` pseudocode and the brew + cargo install fallback options. -- Verification: Update-check guidance in SKILL.md is a single paragraph referencing `references/update-check.md` (no - full procedural body). - -**Verification:** Cold-start agent reading SKILL.md top-to-bottom encounters the anc precondition before any `anc audit` -invocation. - ---- - -- U4. **R2: `## Anc contract` section in SKILL.md** - -**Goal:** Document `anc`'s observable contract once, so agents stop inferring it from prose. Output flag policy, exit -codes (interim), `results[].status` enum, schema-version pin, stable-vs-noise subtree. - -**Requirements:** R2. - -**Dependencies:** U1 (placeholder section exists). Soft-dep on U5 (the inline scorecard sample lands in adjacent Quick -Start; cross-link). - -**Files:** - -- Modify: `SKILL.md` (add `## Anc contract` body, ~25–40 lines per origin estimate) - -**Approach:** - -- **Output flag policy.** One paragraph: agents always pass `--output json`; `text` is for humans (and appends a badge - embed hint after the summary line when `badge.eligible == true`). -- **Exit codes.** Three-row table covering observed `anc 0.3.0` behavior: `0` = fully clean (no warn/fail/skip), `1` = - warn/skip present with `summary.fail == 0`, `2` = `summary.fail > 0` **OR** invocation error (missing path, bad flag, - unknown command). The `2` slot is overloaded — same exit code for "tool gated correctly with at least one fail" and - "tool failed to run." Imperative gating rule: **agents writing CI gates must switch on `summary.fail`, not on `$?`**, - since `$? == 2` cannot distinguish a real fail from an invocation error today. Footnote: "Interim contract — sibling - brief A3 (`docs/brainstorms/2026-05-01-002`) will split the slots so `1` means `fail > 0` and `2` is reserved for - invocation errors. Until then, treat `$?` as advisory; the gating signal is `summary.fail`." -- **`results[].status` enum.** Bulleted list naming all five values: `pass`, `warn`, `fail`, `skip`, `error`. One-line - gloss per value. -- **Schema-version pin.** Imperative paragraph: assert `envelope.schema_version == "0.5"` before parsing. If the - assertion fails, do not fall back to silent parse — fail explicit and prompt the user to upgrade `anc` or this skill - bundle. Footnote naming sibling brief A1 as the durable additive replacement (`anc.schema_version` proposed). -- **Stable vs noise.** Two-list paragraph. Stable-for-CI-diffing: `summary.*`, `coverage_summary.*`, `results[*].{id, - status, evidence}`, `audience`, `audit_profile`, `badge.eligible`, `badge.score_pct`. Timestamp/run-noise (don't - diff): `run.started_at`, `run.duration_ms`, `run.invocation`, `tool.version`, `anc.version`. Sibling brief A5 will - eventually offer `--stable` flag. - -**Patterns to follow:** Stable-vs-noise framing from -`~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md`. - -**Test scenarios:** - -- Verification: `grep -c '^## Anc contract' SKILL.md` returns 1. -- Verification: Section enumerates all five status values (`pass`, `warn`, `fail`, `skip`, `error`). -- Verification: Section names `schema_version` as the pin path (not `tool.schema_version` or `anc.schema_version`). -- Verification: Exit-code table has three rows, footnoted as interim with cross-link to sibling brief. -- Verification: Exit-code table is empirically validated at unit time by running `anc audit --output json --command - ` against at least three commands covering `summary.fail == 0` and `summary.fail > 0` cases (e.g. `rg`, `cat`, - `echo`); the documented `$?` value matches observed behavior, the imperative `gate on summary.fail, not $?` line is - present. -- Verification: The schema-pin example (`envelope.schema_version == "0.5"`) matches the live `anc 0.3.0` value. - -**Verification:** An agent writing a CI gate that fails on regression can do so without guessing — exit codes -documented, status enum complete, schema-pin path concrete, stable subtree enumerated. - ---- - -- U5. **R1: Three-placement worked scorecard samples** - -**Goal:** Three coordinated samples sized to their location: SKILL.md inline (~12–15 lines top-level shape), -`getting-started.md` inline (~3–5 lines invocation→output), `references/scorecard-shape.md` (~50–60 lines exhaustive, -every top-level field, one `results[]` entry per status value, every audit-profile category, full metadata). All taken -from a live `anc 0.3.0 check --output json` run, byte-faithful where possible. - -**Requirements:** R1. - -**Dependencies:** U1 (Quick Start placeholder), U4 (links to anc contract section). Soft-dep on U6 (audit-profile -decision table is adjacent). - -**Files:** - -- Create: `references/scorecard-shape.md` -- Modify: `SKILL.md` (Quick Start fenced block) -- Modify: `getting-started.md` ("existing CLI" loop section, after the canonical `anc audit --output json . > - scorecard.json` line) - -**Approach:** - -- **`references/scorecard-shape.md` (exhaustive).** One H1 + short preamble + a single fenced JSON block of the full - envelope with every top-level field populated. Use a synthesized but realistic envelope (start from `anc audit - --output json --command rg`, then construct extra `results[]` entries to cover all five status values, all four - audit-profile categories, and an `audience: agent-optimized` example). Below the JSON block, a per-field gloss table: - `field path → type → semantics → stable for CI?`. Note that `audience` and `audience_reason` are top-level (not inside - `audit_profile`). -- **SKILL.md inline (~15 lines).** Truncated envelope showing only `summary`, `coverage_summary`, `badge`, and one - `results[]` entry. Placed immediately after the Quick Start `anc audit ...` line. Comment ellipses (`/* ... */`) where - fields are elided. -- **`getting-started.md` inline (~5 lines).** Even shorter — just `coverage_summary` + `badge.eligible` + - `badge.embed_markdown`. Placed right after the `anc audit --output json . > scorecard.json` recipe. -- **Field-name discipline.** All three samples use `results[].id` (the actual envelope field). The per-field gloss table - in `references/scorecard-shape.md` carries an explicit "**audit id vs requirement id**" note: envelope `results[].id` - is an AUDIT identifier (`p3-help`, `p1-flag-existence`, `p6-sigpipe`); spec frontmatter `requirements[].id` is a - REQUIREMENT identifier (`p1-must-no-interactive`, `p2-must-output-flag`). The two are different namespaces. Each - principle file's body prose maps audit ids to the requirements they verify (e.g. p6's "Measured by audit IDs - `p6-sigpipe`, …"). To resolve a finding's spec text, read principle prose, not frontmatter alone. Sibling brief A2 - (`anc explain `) is the durable resolver once it ships. -- **Live envelope capture at unit time.** Implementer runs `anc audit --output json --command rg` at U5 implementation - time and uses that output as the byte-faithful base for the exhaustive sample. Do not rely on a snapshot from this - plan — `anc` may have moved between plan-write and unit-implementation; live capture is the freshness guarantee. -- **`audience` semantic capture.** Before the per-field gloss table goes final, run `anc audit --output json` against at - least four targets covering the observed `audience` values (sampled today: `agent-optimized`; null cases also - observed). Enumerate the value set in the gloss row and document when `audience_reason` is present (today: when - `audience == null`). If the value space is open-ended or not stable enough to enumerate, the gloss row says so - explicitly rather than implying enumeration completeness. - -**Patterns to follow:** `references/update-check.md`'s preamble + fenced-block voice. Origin R1 sample-sizing rules. - -**Test scenarios:** - -- Verification: `references/scorecard-shape.md` JSON block parses (`python3 -c "import json; - json.load(open('references/scorecard-shape.md'))"` after extracting the block, or `python3 - scripts/extract-fenced-json.py references/scorecard-shape.md`). -- Verification: Exhaustive sample contains at least one `results[]` entry per `status` value (`pass`, `warn`, `fail`, - `skip`, `error`). -- Verification: `audit_profile` is a scalar field — only one value per envelope. The exhaustive sample shows ONE - `audit_profile` value plus a prose annotation (under the per-field gloss row) enumerating all four possible values - (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal`) and noting the field is set per-invocation by - `--audit-profile `. -- Verification: Per-field gloss row for `audience` enumerates the observed value set (sampled `agent-optimized`; null - cases) and notes when `audience_reason` is populated. If the value space is intentionally open, the row says so. -- Verification: All three samples use `id` (not `requirement_id`) for result entries. -- Verification: Field path table in `references/scorecard-shape.md` enumerates `summary, coverage_summary, badge, - results, audit_profile, audience, audience_reason, tool, anc, run, target, schema_version, spec_version` (13 top-level - fields). -- Verification: SKILL.md inline sample fits in ~15 lines. -- Verification: `getting-started.md` inline sample fits in ~5 lines. -- Covers AE: An agent reading only `references/scorecard-shape.md` can write a parser handling all five status values - and all four audit-profile categories without reading `anc` source (origin R1 acceptance). - -**Verification:** Three samples present at three locations; each sized for its job; all field names match `anc 0.3.0` -reality; JSON parses; exhaustive sample is the canonical one for parser authors. - ---- - -- U6. **R3: `--audit-profile` decision table + extended reference** - -**Goal:** A 4-row decision table in SKILL.md (one row per profile) + an extended `references/audit-profile-selection.md` -covering the hybrid-tool rule. - -**Requirements:** R3. - -**Dependencies:** U1 (placeholder section). - -**Files:** - -- Modify: `SKILL.md` (`## Audit profile selection` section, ~10 lines) -- Create: `references/audit-profile-selection.md` (~40 lines) - -**Approach:** - -- **SKILL.md table.** Four rows: `human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal` (reserved). Columns: - when to pick (one-line rule), example tool, what gets suppressed. Glosses paraphrased from `anc audit --help`'s value - list — but agents are pointed at `--help` for the authoritative description. -- **SKILL.md `### Worked examples` sub-section** (added immediately under the 4-row table, ~10 lines). Three concrete - hybrid-tool worked examples with explicit profile picks, addressing the determinism-acceptance concern that one-line - rules in 4 rows won't converge agents on hybrid cases: -- *Example 1 — `lazygit` (pure TUI):* pick `human-tui`. Reason: the binary's primary entry-point is the interactive TTY - interface; no batch / stdin-piped mode exists. -- *Example 2 — `cat` with optional TUI mode:* pick `posix-utility`. Reason: stdin-primary is the documented main use; - any TUI rendering is a secondary surface, scope-out per the primary-entry-point rule. -- *Example 3 — A diagnostic CLI with one TUI rendering subcommand (e.g. `mytool dashboard`) and otherwise read-only - introspection:* pick `diagnostic-only`. Reason: the tool's documented main use is read-only diagnosis; the dashboard - subcommand is a secondary surface, scope-out and document the suppression in the README's Limitations section. -- **`references/audit-profile-selection.md` (extended).** Restate the four profiles with deeper one-paragraph - descriptions, then a "**Hybrid project rule**" sub-section: when a tool mixes a TUI-rendering subcommand with a - stdin-piped batch mode (or a Rust binary with shell-helper subcommands), the rule is: scope the audit profile to the - primary entry-point and leave secondary surfaces uncovered. Cross-link to sibling brief A6 (proposed `--audit-profile - =` repeatable flag) as the durable composition path. -- "Three different agents reading the table for the same tool pick the same profile" (origin R3 acceptance) — frame the - rule deterministically: "primary entry-point" defined as the subcommand documented as the tool's main use in its - README, or the bare invocation behavior if no README clarifies. The worked-examples sub-section is the empirical - determinism test — three different agents reading the table + the three worked examples for a fourth hybrid tool - should converge on the same pick. - -**Patterns to follow:** Existing `references/framework-idioms-other-languages.md` table-with-paragraph-context style. - -**Test scenarios:** - -- Verification: SKILL.md `## Audit profile selection` section contains exactly four rows (one per profile) plus a `### - Worked examples` sub-section with three hybrid-tool examples and explicit profile picks. -- Verification: `references/audit-profile-selection.md` contains a `### Hybrid project rule` (or equivalent) - sub-section. -- Verification: Cross-link to sibling brief A6 present in the extended file. -- Verification: The four profile names in the table match `anc audit --help`'s `--audit-profile` value list - (`human-tui`, `posix-utility`, `diagnostic-only`, `file-traversal`). -- Verification (empirical determinism): give the SKILL.md `## Audit profile selection` section (table + worked examples) - to 3 agents on a fourth hybrid tool not in the worked examples (e.g. `nvtop` — TUI by default but supports stdin-piped - data); record their picks. If any disagreement, revise the worked examples or the rule before merge. - -**Verification:** Decision table renders cleanly; hybrid-tool rule is explicit and deterministic; cross-links hold. - ---- - -- U7. **R4: Loop termination rule for warn / should** - -**Goal:** Folded into `## The anc loop` (placed by U1): one paragraph explicitly stating that `must` violations gate the -loop, `warn`s are advisory, stop iterating when `badge.eligible == true`. - -**Requirements:** R4. - -**Dependencies:** U1. - -**Files:** - -- Modify: `SKILL.md` (`## The anc loop` step 3 / 4 boundary) - -**Approach:** - -- One paragraph (~5–10 lines) inserted at the end of step 3 (Re-audit) or as a short coda before step 4 (Claim the - badge). Three bullets: (1) `must` gates the loop — continue until `must.verified == must.total`. (2) `warn`s are - advisory — once `badge.eligible == true` (≥80%), stop iterating; do not push warns to pass unless the user explicitly - asks. (3) `error` (the run-failure status, distinct from `fail`) means re-run — see runbook (U9). - -**Patterns to follow:** Existing imperative voice in `## The anc loop`. - -**Test scenarios:** - -- Verification: `## The anc loop` section contains the words `advisory` and `badge.eligible` near step 3 / 4. -- Verification: The termination rule explicitly distinguishes `error` (run-failure) from `fail` (gating-violation). - -**Verification:** An agent in "fix mode" reading SKILL.md stops when the badge clears, not when every warn clears. - ---- - -- U8. **R6: Spec-skew fallback paragraph** - -**Goal:** Step 2 of `## The anc loop` (Fix.) carries a short paragraph or table row explaining the **two** failure modes -agents hit when looking up a finding's `id`: (a) the namespace mismatch (envelope `id` is an audit id, not a requirement -id), and (b) actual spec drift when the bundle's vendored `spec/principles/` is older than `anc`. - -**Requirements:** R6. - -**Dependencies:** U1, U4 (the check-id-vs-requirement-id namespace note from U4 is referenced). - -**Files:** - -- Modify: `SKILL.md` (`## The anc loop` step 2 — Fix.) - -**Approach:** - -- ~8–12 line paragraph inserted at the end of step 2. Two-part structure: -- **Part A — namespace, not skew (always check this first).** The envelope's `results[].id` is an AUDIT id (e.g. - `p3-help`, `p1-flag-existence`, `p6-sigpipe`). The spec's `requirements[].id` is a REQUIREMENT id (e.g. - `p1-must-no-interactive`, `p2-must-output-flag`). They are different namespaces. To resolve the spec text the check - references, read the matching principle file's body prose (e.g. `spec/principles/p6-*.md`'s "Measured by audit IDs - `p6-sigpipe`, …" mapping line) — NOT the principle's frontmatter `requirements[]` block alone. If `anc explain ` - is available (sibling brief A2), prefer it as the authoritative resolver. -- **Part B — actual spec skew (only if Part A's principle file doesn't reference the audit id).** (1) Likely cause — - `anc` shipped against a newer spec than this bundle vendors. (2) Interim fix — re-run `scripts/sync-spec.sh` to - refresh the vendored spec, or fetch the missing `spec/principles/p-*.md` from `agentnative` `main`. (3) Non-fix — - do not hallucinate a spec definition; better to surface the gap to the user. - -**Patterns to follow:** Existing `## The anc loop` step prose voice; `references/update-check.md` voice for the -cross-link footnote. - -**Test scenarios:** - -- Verification: SKILL.md step 2 contains both `scripts/sync-spec.sh` (interim fix) and the cross-link to sibling brief - A2 (`anc explain`). -- Verification: The paragraph explicitly says "do not hallucinate" or equivalent (origin R6 non-fix bullet). -- Verification: Part A (namespace) appears before Part B (skew) and uses concrete examples — `p3-help` as an audit id, - `p1-must-no-interactive` as a requirement id — to make the namespace distinction unambiguous. -- Verification: The paragraph names "principle file body prose" (NOT "frontmatter alone") as the resolution path. - -**Verification:** An agent hitting the spec-skew case has a deterministic next step that isn't "read source". - ---- - -- U9. **R7: `references/runbook.md` (common situations)** - -**Goal:** A new reference file with one-paragraph entries (≤4 lines each) per common situational dead-end. Linked from -SKILL.md `## Common situations` (the placeholder U1 created). - -**Requirements:** R7. - -**Dependencies:** U1 (SKILL.md pointer in place), U4 (anc contract referenced from runbook entries). - -**Files:** - -- Create: `references/runbook.md` -- Modify: `SKILL.md` (`## Common situations` paragraph, links to runbook) - -**Approach:** - -- Five entries from origin R7: - -1. **`badge.embed_markdown` placement.** Default convention: top of README, after the H1 title, alongside CI badges. - Override only if `anc.dev/badge` publishes a different convention (check the page; if it doesn't enumerate placement, - this entry's default applies and the proposal mirrors to sibling brief A4). -2. **Should I commit `scorecard.json`?** Default no (artifact, regenerable from `anc audit`). Override only for CI - gating snapshots. If you commit, gitignore `run.*` fields by post-processing or use the eventual `anc audit --stable` - (sibling brief A5). -3. **`anc audit .` vs `--binary` vs `--source`.** `anc audit .` runs both source and behavioral analysis; `--binary` - skips source (use when scoring a pre-built binary that isn't in this repo); `--source` skips behavioral (use when - source-only feedback is wanted, e.g. PR review on a code-only diff). -4. **`anc` panics or returns malformed JSON.** Pointer to `` with - the `anc --version` + invocation. Do not retry blind; retrying a panicking `anc` will produce the same panic with - timestamp churn and waste agent time. -5. **`anc skill install ` host not in registry.** Cross-link to `getting-started.md`'s manual `git clone --depth - 1` fallback. - -- Each entry: ≤4 lines body + a one-line "see also" pointer to the relevant origin requirement, sibling brief item, or - other reference file. Goal is fast index, not deep prose. - -**Patterns to follow:** `references/update-check.md` short-section voice; `getting-started.md` Q&A table for -cross-links. - -**Test scenarios:** - -- Verification: `references/runbook.md` contains all five entries; each entry body is ≤4 lines (excluding heading and - "see also" pointer). -- Verification: SKILL.md `## Common situations` section is one paragraph containing the link to `references/runbook.md`. -- Verification: Markdownlint clean; cross-links resolve (no broken `[text](path)` pointers). -- Verification: Entry 1's badge-placement default cross-references `anc.dev/badge` and sibling brief A4. - -**Verification:** Agents stop generating issues that ask FAQs already covered (origin R7 acceptance). - ---- - -- U21. **R13: Subcommand index + target-resolution runbook entry** - -**Goal:** Document the full `anc` subcommand surface in SKILL.md so agents stop punting to `anc --help` for basic -discovery; expand `references/runbook.md`'s target-resolution entry to cover all four target modes (`.`, `--binary`, -`--source`, `--command `) with resolution rules. - -**Requirements:** R13. - -**Dependencies:** U1 (placeholder `## Subcommand index` section in place), U9 (`references/runbook.md` exists; U21 -expands its target-resolution entry). - -**Files:** - -- Modify: `SKILL.md` (`## Subcommand index` section, ~10 lines) -- Modify: `references/runbook.md` (replace U9's entry 3 with an expanded version, ~10–15 lines) - -**Approach:** - -- **SKILL.md `## Subcommand index`.** A 5-row table placed between `## Quick Start` and `## Anc contract` (U1's section - order). Columns: subcommand, one-line description, when to use. Rows: - -1. `check` — audit a CLI for agent-readiness (the canonical workflow; default when bare `anc ` is invoked). -2. `generate` — produce build artifacts (e.g., coverage matrix); not part of the agent loop. -3. `skill install ` — install this skill bundle into a host's canonical skills directory (six hosts supported; see - `getting-started.md`). -4. `completions` — emit shell completions for bash / zsh / fish / PowerShell. -5. `help [subcommand]` — print help; equivalent to `--help`. - -- Footnote under the table: `anc ` (no subcommand) is shorthand for `anc audit ` — see `anc --help` for the - exact aliasing rules. Bare `anc` (no arguments) prints help and exits 2. -- Subcommand list **must** be verified at unit time by running `anc --help` and reconciling the table against the live - `Commands:` block. If `anc` ships a new subcommand between plan-write and U21 implementation, add it (or trim) so the - table stays current. -- **`references/runbook.md` target-resolution entry (replaces U9's entry 3).** Four-mode coverage: -- **Project mode** (`anc audit .`) — runs both source analysis (Rust-only today) and behavioral audits. Default when a - path argument is given. -- **`--binary`** — runs only behavioral audits against a pre-built binary at the given path. Use when scoring a binary - that isn't built from source you control, or when source analysis would surface noise irrelevant to runtime behavior. -- **`--source`** — runs only source analysis (Rust source-tree scanning). Use when source-only feedback is wanted (e.g., - PR review on a code-only diff before a binary is rebuilt). -- **`--command `** — resolves a binary from `PATH` and runs behavioral audits against it. The cross-repo audit - mode the skill's actual audience uses (auditing someone else's tool). Behavioral checks only — `anc` doesn't analyze - source it didn't find via path resolution. -- Cross-link the entry to `anc audit --help` for the authoritative flag list. - -**Patterns to follow:** Existing `getting-started.md` "Where things live" table format for the SKILL.md subcommand -index; existing `references/update-check.md` short-section voice for the runbook entry. - -**Test scenarios:** - -- Verification: SKILL.md `## Subcommand index` section contains a 5-row table covering at minimum `check`, `generate`, - `skill install`, `completions`, `help`. -- Verification: Table content matches `anc --help` `Commands:` block at unit time (`diff` between table subcommand names - and live `anc --help` output shows zero unexpected omissions). -- Verification: `references/runbook.md` target-resolution entry names all four modes (`.`, `--binary`, `--source`, - `--command `) and gives at least one concrete use case per mode. -- Verification: `--command ` is named explicitly as the cross-repo audit mode (the skill's primary audience per - origin §Problem framing). -- Verification: SKILL.md stays under 200 lines after U21 lands (R9 ceiling preserved). -- Verification: Markdownlint clean; cross-links resolve. - -**Verification:** A first-time agent reading SKILL.md cold can enumerate the subcommand surface without running `anc ---help` and pick the right `--binary` / `--source` / `--command` mode for the audit task at hand. - ---- - -### Phase 2 — PR 2: Frontmatter polish (R10) - -- U10. **R10: SKILL.md frontmatter audit** - -**Goal:** Audit `description`, `Triggers on` keyword list, and `SKIP when` clause; document kept-as-is decisions; ensure -no `argument-hint` / `model:` / `disable-model-invocation` fields are added. - -**Requirements:** R10. - -**Dependencies:** U2 (frontmatter already touched for `allowed-tools`; U10 edits adjacent fields without conflict). - -**Files:** - -- Modify: `SKILL.md` (frontmatter `description` only) - -**Approach:** - -- **Triggers audit (resolves OQ-origin-#5).** Run `qmd query "agent-native" --collection skills` and `qmd query "anc - CLI" --collection skills` to surface what queries the skill actually fires on in real traffic. Manual review of - existing `Triggers on:` list; drop any keyword that hasn't fired discovery in practice; add ones that have come up in - skill-side questions but aren't there. Stay under 1024 chars total. -- **SKIP clause sharpening.** Audit for collisions with `compound-engineering` and `create-agent-skills`. Today's clause - excludes TUI-app authoring and non-CLI library work; sharpen to also explicitly exclude general "skill authoring" - (route to `create-agent-skills`) and general "compound engineering workflows" (route to `compound-engineering`). -- **Kept-as-is decisions** (per origin R10 in-scope notes): document in this plan (not in SKILL.md frontmatter) that - `name` stays, `argument-hint` is not added (background-knowledge skill), `model:` is not pinned (let host decide), - `disable-model-invocation` is not added (auto-load is correct). - -**Patterns to follow:** `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` — -frontmatter discoverability rules. - -**Test scenarios:** - -- Verification: `description` field is ≤1024 chars (`python3 -c "import yaml; - d=yaml.safe_load(open('SKILL.md').read().split('---')[1]); assert len(d['description']) <= 1024, - len(d['description'])"`). -- Verification: `Triggers on:` keyword list in description does not contain `Slack`, `email`, or other unrelated noise - (sample sanity check). -- Verification: `SKIP when` clause names `compound-engineering` and `create-agent-skills` as explicit redirects. -- Verification: `name`, `description`, `allowed-tools` are the only frontmatter keys; no `argument-hint`, `model`, or - `disable-model-invocation`. - -**Verification:** Frontmatter passes a fresh `create-agent-skills` audit at the "well-tuned" tier; kept-as-is decisions -documented in this plan's Key Technical Decisions or U10 commit message. - ---- - -### Phase 3 — PR 3: Rust idioms consolidation (R12) - -- U11. **R12: Merge `framework-idioms.md` + `rust-clap-patterns.md` → `references/rust-clap.md`** - -**Goal:** One canonical Rust idioms reference. Adopt `framework-idioms.md`'s Free / Must / Anti-patterns scaffold; fold -every file-unique nugget from `rust-clap-patterns.md` into the appropriate principle's bucket. - -**Requirements:** R12. - -**Dependencies:** U1 (SKILL.md's pointer table merged into `getting-started.md`'s Where things live; U11 updates that -single table). Independent of all PR 1 + PR 2 unit content otherwise. - -**Files:** - -- Create: `references/rust-clap.md` -- Delete: `references/framework-idioms.md` -- Delete: `references/rust-clap-patterns.md` -- Modify: `getting-started.md` (Where things live table — point at the new file) -- Modify: `SKILL.md` (any pointer paragraph referencing the old files — update or delete) - -**Approach:** - -- One H2 per principle (P1–P7). Within each, three H3 buckets in this order: **Free from clap**, **Must implement**, - **Anti-patterns**. Per-bucket prose folds in: -- **P1 Must-implement:** the FalseyValueParser detail, four-flag `--output / --quiet / --no-interactive / --timeout` - global pattern (xurl-rs / bird precedent), `cli.no_interactive || !std::io::stdin().is_terminal()` gate. -- **P2 Must-implement:** explicitly enumerate Text / Json / **Jsonl** as the three OutputFormat variants (preserves the - file-unique Jsonl nugget). OutputConfig threading; format-aware error printing. -- **P3 Must-implement:** `after_help` (not `about` or `long_about`); env vars surface automatically via `env` attribute; - per-subcommand `after_help`. -- **P4 Must-implement:** `try_parse()` not `parse()`; `thiserror` enum; `exit_code()` method **and** `kind()` method - (preserves file-unique nugget); main-only `process::exit()` (preserves file-unique rule); sysexits 77 / 78 / 74 - mappings. -- **P5 Must-implement:** `--dry-run` on every write subcommand; `--force` / `--yes` on destructive ops; idempotent - design; read/write categorization rule. -- **P6 Must-implement:** SIGPIPE fix; `IsTerminal`; `NO_COLOR` + `TERM=dumb`; clap_complete; three-tier dependency - gating. Plus a new H3 sub-section before the Free/Must/Anti-patterns triplet: **Flags vs subcommands taxonomy** — five - bullets: subcommands for operations, nested subcommands for namespaced operations, global flags for cross-cutting - modifiers, local flags for command-specific modifiers, both flag and subcommand for universal meta-commands like - `--help` / `--version`. (Preserves the largest file-unique nugget.) -- **P7 Must-implement:** `diag!` macro; `--quiet`; `--limit` / `--max-results` with `clamp()`; `--timeout`; output - clamping with truncation diagnostic. -- Closing pointer table: link to `references/framework-idioms-other-languages.md` (preserved from `framework-idioms.md`) -- new pointer to `templates/cargo-toml.md` (which lands in PR 4). -- Concrete-example anchors: explicit references to xurl-rs and bird as exemplar codebases (preserved from - `rust-clap-patterns.md`). -- Estimated final size ~250 lines, replacing 346 lines (origin R12 estimate). - -**Patterns to follow:** `framework-idioms.md` three-bucket scaffold; `rust-clap-patterns.md` per-principle paragraph -density. - -**Test scenarios:** - -- Verification: `git status` after the unit shows `references/framework-idioms.md` and - `references/rust-clap-patterns.md` as deleted, `references/rust-clap.md` as new. -- Verification: `wc -l references/rust-clap.md` is between 200 and 300 (origin estimate ~250). -- Verification: `grep -c '^## P[1-7]' references/rust-clap.md` returns 7 (one per principle). -- Verification: P6 section contains both the **Flags vs subcommands taxonomy** sub-section and the SIGPIPE / IsTerminal - / clap_complete coverage. -- Verification: P4 section contains both `exit_code()` and `kind()` method names. -- Verification: P2 section enumerates `Text`, `Json`, `Jsonl` (all three OutputFormat variants). -- Verification: `grep -rn 'framework-idioms\.md\|rust-clap-patterns\.md' SKILL.md getting-started.md references/ - templates/` returns 0 hits (no broken links). -- Verification: `grep -n 'rust-clap\.md' SKILL.md getting-started.md` returns ≥1 hit each (new pointers in place). -- Verification: `markdownlint-cli2 references/rust-clap.md` passes. -- **Verification (token-set diff — pre-deletion gate).** Before deleting the source files, compute the symmetric - difference between merged-file content tokens and source-file content tokens, and review every loss explicitly: - -```sh -# Extract H3 + H4 headings, named identifiers (function/method/crate names), -# and code-fenced spans from both source files; compare against merged file. -extract_tokens() { - rg -oE '^####? .+|`[A-Za-z_][A-Za-z0-9_:!]+`|`#\[[^]]+\]`' "$1" \ - | sort -u -} -extract_tokens references/framework-idioms.md > /tmp/src-a.tokens -extract_tokens references/rust-clap-patterns.md > /tmp/src-b.tokens -extract_tokens references/rust-clap.md > /tmp/merged.tokens -sort -u /tmp/src-a.tokens /tmp/src-b.tokens > /tmp/source-union.tokens -# Tokens present in source union but absent in merged (with paraphrase tolerance): -comm -23 /tmp/source-union.tokens /tmp/merged.tokens > /tmp/lost.tokens -wc -l /tmp/lost.tokens -``` - - Review every line in `/tmp/lost.tokens`. For each lost token: either confirm it was an intentional drop (prose - duplication, redundant phrasing) AND record the rationale in the PR description, OR add the missing content to the - merged file before merge. Token-set difference of zero is not the goal (paraphrase tolerance is real); reviewed-loss - is the goal — every loss is a deliberate choice with a recorded reason. - -**Verification:** A diff between "what's in the merged file" and "union of the two source files" shows no information -loss other than prose duplication; SKILL.md and `getting-started.md` link to the merged file; both originals are -deleted. - ---- - -### Phase 4 — PR 4: Rust starter completion (R11.1, R11.2, R11.3) - -- U12. **R11.1: `templates/cargo-toml.md`** - -**Goal:** Drop-in `[dependencies]` block for the Rust starter. Today an agent copying `clap-main.rs` has to know to add -clap (with `derive` + `env`), serde, serde_json, thiserror, libc, clap_complete. A copy-paste TOML is faster. - -**Requirements:** R11 (item 1). - -**Dependencies:** U11 (`rust-clap.md` cross-link from this template). - -**Files:** - -- Create: `templates/cargo-toml.md` - -**Approach:** - -- File starts with a one-paragraph preamble: "Drop-in `[dependencies]` and `[features]` for a greenfield Rust CLI built - from `templates/clap-main.rs`. Append to your `Cargo.toml` after `cargo init`." -- A single fenced TOML block with: clap (`features = ["derive", "env"]`), serde (`features = ["derive"]`), serde_json, - thiserror, libc (`#[cfg(unix)]` only — annotated), clap_complete. Pin version constraints loosely (`"4"` for clap, - `"1"` for serde) per Cargo convention. -- Followed by a "Why each crate" bulleted list — one line per crate naming the principle it serves (clap → P3, serde + - serde_json → P2, thiserror → P4, libc → P6 SIGPIPE, clap_complete → P6 completions). -- Closes with a one-line pointer to `references/rust-clap.md` and `references/project-structure.md`. - -**Patterns to follow:** Existing `templates/agents-md-template.md` voice (preamble + fenced block + "why" prose). - -**Test scenarios:** - -- Verification: TOML block parses (`python3 -c "import tomllib; tomllib.loads(open('templates/cargo-toml.md').read())"` - after extracting the fenced block, or just `cargo init /tmp/test-toml && (cd /tmp/test-toml && python3 ... extract - block ... append to Cargo.toml && cargo metadata --format-version 1)`). -- Verification: Pasting the block into a fresh `cargo init` repo and running `cargo metadata` resolves all dependencies - without errors. -- Verification: The "Why each crate" list names every principle (P1–P7) at least once across crates. - -**Verification:** A new Rust CLI can be bootstrapped with three `cp` commands and a copy-paste from this file (origin -R11 acceptance). - ---- - -- U13. **R11.2: `templates/cli-tests.rs`** - -**Goal:** `assert_cmd` patterns covering P5 mutation boundaries (idempotency, dry-run, `--yes` / `--force` distinction). -Encodes P5 by construction. - -**Requirements:** R11 (item 2). - -**Dependencies:** U11. - -**Files:** - -- Create: `templates/cli-tests.rs` - -**Approach:** - -- File header comment block (matching `templates/clap-main.rs` voice): what this template demonstrates, principle - mapping, where to drop it (`tests/cli.rs` in the consumer repo). -- Test functions covering at minimum: - -1. **`--dry-run` does not mutate.** Run a write subcommand twice with `--dry-run`; assert side-effect (file existence, - exit, JSON-reported state) is unchanged. -2. **`--dry-run` reports what it would do.** Same write subcommand with `--dry-run --output json`; assert the JSON - envelope contains a `would_*` field or equivalent indicating the mutation is described. -3. **Idempotent create.** Run a create subcommand twice without `--force`; second call succeeds without error and - reports "already exists" rather than failing. -4. **`--force` overrides confirmation gate.** Run a destructive subcommand with `--no-interactive` (no `--force`); - assert exit code != 0 and a clear error in JSON. Re-run with `--force --no-interactive`; assert exit 0. -5. **`--yes` accepts implicit confirmation.** Run a destructive subcommand with `--yes`; assert exit 0 and the - destructive op completed. - -- Use `assert_cmd::Command` with `.assert().success()` / `.failure()`. Use `tempfile::tempdir` for filesystem isolation. -- `// TODO: replace 'mytool' with your binary name` markers throughout. - -**Patterns to follow:** xurl-rs and bird `tests/` conventions (cited in `references/rust-clap.md`); `assert_cmd` -documentation idioms. - -**Test scenarios:** - -- Verification: `cargo init --name dummy /tmp/dummy-cli && cp templates/cli-tests.rs /tmp/dummy-cli/tests/cli.rs && (cd - /tmp/dummy-cli && cargo check --tests)` succeeds (compile-only smoke; tests will fail without a binary, but the file - should compile). -- Verification: File contains all five test functions named distinctly. -- Verification: File header comment names P5 explicitly. - -**Verification:** Compile-only smoke passes; an author can copy this file, replace `mytool` with their binary, and have -a working P5-covering test suite. - ---- - -- U14. **R11.3 (repurposed): Document the embedded-schema extraction path** - -**Goal:** Tell agents how to obtain the canonical scorecard JSON Schema from the `anc` binary itself (via `anc generate -scorecard-schema`), rather than vendoring a hand-written copy that would drift the moment upstream bumps. Cross-link to -the upstream plan so readers see where the canonical schema lives. - -**Requirements:** R11 item 3, repurposed away from "ship a vendored schema" to "document the canonical extraction path" -after upstream `agentnative-cli` plan -[`2026-04-30-002-feat-scorecard-json-schema-plan.md`](../../agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md) -established that the schema is shipped embedded in the binary, derived from Rust types via `schemars`, exposed via `anc -generate scorecard-schema`. - -**Dependencies:** U5 (`references/scorecard-shape.md` exists), U9 (`references/runbook.md` exists), U21 (subcommand -index calls out the `generate` family of verbs). - -**Files:** - -- Modify: `references/scorecard-shape.md` (new sub-section: "Validating against the canonical schema") -- Modify: `references/runbook.md` (new entry 6: "How do I validate a scorecard against the schema?") - -**Approach:** - -- **`references/scorecard-shape.md` sub-section** (~10 lines). Three paragraphs: -- (1) The canonical, authoritative JSON Schema for the scorecard envelope is **embedded in the `anc` binary** — derived - from Rust struct definitions via `schemars`, regenerated at compile time, exposed via `anc generate scorecard-schema`. - The skill bundle does **not** vendor a copy — vendoring would duplicate and drift. -- (2) Usage: `anc generate scorecard-schema --output -` writes the schema to stdout; `anc generate scorecard-schema - --output schema.json` writes to a file; `anc generate scorecard-schema --check --output schema.json` exits non-zero if - the file disagrees with the embedded copy (CI-friendly drift gate). -- (3) Public archival URL: `https://anc.dev/scorecard-v0.5.schema.json` (published by the `agentnative-site` archive, - cross-repo plumbing per upstream plan). Use the verb when validating against the binary actually installed; use the - archive URL when pinning to a specific schema version across consumers. -- **`references/runbook.md` entry 6.** ≤4-line entry: "How do I validate a scorecard against the schema? — run `anc - generate scorecard-schema --output -` to get the schema embedded in your installed `anc`. For a published archive, - fetch `https://anc.dev/scorecard-v{X.Y}.schema.json`. Do not vendor a copy in your repo — it will drift." -- **Soft-dep on upstream verb shipping.** When U14 lands, the `anc generate scorecard-schema` verb may not yet exist in - the `anc` release the consumer has installed (upstream plan is `status: active` as of 2026-05-01). U14's prose handles - this with a one-line caveat: "If your `anc` version doesn't yet ship `generate scorecard-schema`, upgrade via `brew - upgrade brettdavies/tap/agentnative` or pin to the published archive URL." - -**Patterns to follow:** `references/update-check.md`'s short-section voice; `getting-started.md` Q&A table for -cross-links. - -**Test scenarios:** - -- Verification: `references/scorecard-shape.md` contains a sub-section naming `anc generate scorecard-schema`. -- Verification: `references/runbook.md` contains a 6th entry on schema validation. -- Verification: Cross-link to upstream plan - `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md` is present. -- Verification: Cross-link to `https://anc.dev/scorecard-v0.5.schema.json` is present. -- Verification: No file is created at `templates/scorecard-envelope.schema.json` (the vendored artifact origin R11.3 - enumerated is explicitly NOT shipped). -- Verification: Markdownlint clean. - -**Verification:** An agent needing to validate a scorecard against the canonical schema runs the verb (no skill bundle -files updated when upstream bumps the schema); the skill teaches the path, the binary owns the artifact. - ---- - -### Phase 5 — PR 5: Python starter (R11.4, R11.6a, cross-language idiom scaffolds) - -- U15. **R11.4: `templates/python-click/`** - -**Goal:** Minimal Python/Click starter (one main file + `pyproject.toml` snippet) encoding P1 SIGPIPE handling, P2 -stdout/stderr separation, P4 exit codes. Mirror of `clap-main.rs` for the Click world. - -**Requirements:** R11 (item 4). - -**Dependencies:** None within PR 5. - -**Files:** - -- Create: `templates/python-click/main.py` -- Create: `templates/python-click/pyproject.toml.snippet` - -**Approach:** - -- **`main.py`** structure: -- Header comment block matching `templates/clap-main.rs` voice — what the template demonstrates, principle mapping, - copy-paste instructions. -- SIGPIPE fix: `import signal; signal.signal(signal.SIGPIPE, signal.SIG_DFL)` in `if __name__ == '__main__':` guard, - wrapped `try / except AttributeError` for Windows compatibility. (P6.) -- Click CLI scaffold with one read subcommand (`status`) and one write subcommand (`apply`) demonstrating `--dry-run` - (`@click.option('--dry-run', is_flag=True, ...)`). (P5.) -- `--output text|json|jsonl` global option using `click.Choice`. (P2.) -- `--quiet`, `--no-interactive`, `--timeout` global options. (P1, P7.) -- Custom exit codes: `EX_USAGE = 2`, `EX_AUTH = 77`, `EX_CONFIG = 78`, `EX_IOERR = 74` (sysexits). (P4.) -- JSON-aware error printing: when `--output json` is set and an error occurs, emit `{"error": true, "kind": ..., - "message": ..., "code": ...}` to stderr; exit with the right code. -- Diagnostic output gated behind `--quiet` and `--output != text`: `def diag(msg, ctx): if not ctx.quiet and ctx.format - == 'text': click.echo(msg, err=True)`. (P7.) -- **`pyproject.toml.snippet`** is a partial pyproject.toml fragment authors append: `[project] name = "mytool"`, - `[project.scripts] mytool = "mytool.main:cli"`, `[project.dependencies] click >= 8.1, < 9`. Comment header notes the - snippet structure (drop-in for `[project]` section). - -**Patterns to follow:** `templates/clap-main.rs` structural shape (header comment + tiered sections + TODO markers); -xurl-rs and bird global-flag conventions translated to Click idioms. - -**Test scenarios:** - -- Verification: `python3 -c "import ast; ast.parse(open('templates/python-click/main.py').read())"` parses cleanly - (syntax check; doesn't execute). -- Verification: With `click` installed in a temp venv, `python3 templates/python-click/main.py --help` prints help text - and exits 0. -- Verification: `python3 templates/python-click/main.py status --output json` emits valid JSON to stdout (parses with - `json.loads`). -- Verification: `python3 templates/python-click/main.py apply --dry-run --output json` reports the would-be action and - does not mutate. -- Verification: `pyproject.toml.snippet` is valid TOML when wrapped in a minimal pyproject (`tomllib.loads(...)` - parses). - -**Verification:** A new Python CLI has a starter that encodes P1/P2/P4 by construction (origin R11 acceptance). - ---- - -- U16. **R11.6a: `templates/agents-md-template.python.md`** - -**Goal:** Python-flavoured AGENTS.md scaffold paired with U15. Mirrors the existing `templates/agents-md-template.md` -(Rust-prescriptive) for the Python world. - -**Requirements:** R11 (item 6a). - -**Dependencies:** U15. - -**Files:** - -- Create: `templates/agents-md-template.python.md` - -**Approach:** - -- Direct adaptation of `templates/agents-md-template.md` (the Rust version): -- **Build & Run** uses `pip install -e .` / `python -m mytool` instead of `cargo`. -- **Test** uses `pytest` instead of `cargo test`; covers single-test (`pytest -k name`), output (`pytest -s`). -- **Lint & Format** uses `ruff` (`ruff check . && ruff format .`). -- **Architecture** module overview names `mytool/main.py`, `mytool/cli/` (Click commands), `mytool/errors.py` (exception - classes with exit-code mapping), `mytool/output.py` (output mode + diag logic). -- **Exit Codes** table identical to Rust template (sysexits 0/1/2/77/78). -- **Conventions** translated to Click idioms: "Output goes through `OutputContext` — never naked `print()` or - `click.echo()` outside the helper"; "Errors raise typed exceptions — never `sys.exit()` except in `main`"; "`--output - text|json|jsonl`, `--quiet`, `--no-interactive`, `--timeout` are global flags via Click context". -- **Common pitfalls** translated: forgetting `signal.SIGPIPE = SIG_DFL` causes broken-pipe noise; using `print()` - directly breaks `--quiet` and `--output json`; `sys.exit()` outside main skips Click's cleanup. - -**Patterns to follow:** `templates/agents-md-template.md` (Rust) structure exactly; placeholder format identical -(`[Binary name]`, `[add modules]`, etc.). - -**Test scenarios:** - -- Verification: File contains the same eight H2 sections as the Rust template (Build & Run, Test, Lint & Format, - Architecture, Quality Bar, Conventions, Common Pitfalls, Known Debt, References). -- Verification: All commands referenced are Python-ecosystem commands, not Rust (`grep -i 'cargo\|rustc' - templates/agents-md-template.python.md` returns 0 hits). -- Verification: Markdownlint clean. - -**Verification:** A Python CLI author copies this template and gets an AGENTS.md that mirrors the Rust version's -structure with idiomatic Python tooling. - ---- - -- U17. **Cross-language idiom sub-sections in `references/framework-idioms-other-languages.md` (scaffold + Python - rows)** - -**Goal:** Add two new sub-sections to the existing four-language idioms file: "JSON envelope authoring (P2)" and -"Testing P5 mutation boundaries". Section scaffolds + Python rows land here. Go rows append in U20. JS + Ruby rows -defer. - -**Requirements:** R11 ("framework idioms to add" — JSON envelope sub-section + P5 testing section). - -**Dependencies:** U14 (the JSON envelope schema is the canonical reference for shape; the Python row points at it). - -**Files:** - -- Modify: `references/framework-idioms-other-languages.md` - -**Approach:** - -- **`## JSON envelope authoring (P2)`** — new section near the bottom of the file, before any closing pointer table. - Intro paragraph: when implementing P2, your `--output json` mode should emit envelopes that follow the canonical shape - documented in `references/scorecard-shape.md` (and validated by `anc`'s embedded schema, extractable via `anc generate - scorecard-schema --output -`). Per-language rows show idiomatic envelope construction: -- **Python (Click + stdlib `json`):** `json.dumps({"data": result, "meta": {"version": __version__}})` → `click.echo` to - stdout. -- **Go (Cobra + `encoding/json`):** appended in PR 6 (U20). Until then, this row is **absent from the shipped doc** — - not a "deferred" stub. -- **JS / Ruby:** **not present in this section.** JS / Ruby authors continue to use the existing per-framework sections - (Click, Commander, yargs, oclif, Thor) already in this file. Cross-language coverage for JS / Ruby is tracked in this - plan's `### Deferred to Follow-Up Work` and lands in a follow-up PR matched to a JS / Ruby ecosystem expert. Absent - rows ship cleaner than visible "deferred" markers — readers interpret an absent row as "the existing per-framework - section covers this" rather than "I'm missing instructions." -- **`## Testing P5 mutation boundaries`** — new section parallel to `references/rust-clap.md`'s implicit P5 testing - coverage. Same row policy: -- **Python (pytest + `subprocess`/`click.testing.CliRunner`):** show a CliRunner-based dry-run-doesn't-mutate test, - idempotent-create test, force-flag-overrides-gate test. Reference `templates/cli-tests.rs` as the Rust analog. -- **Go (testing + `os/exec`):** appended in PR 6 (U20). Absent until then. -- **JS / Ruby:** not present (deferred per the rationale above; same row policy). - -**Patterns to follow:** Existing `references/framework-idioms-other-languages.md` per-language section structure (Click -section, argparse section, Cobra section…). - -**Test scenarios:** - -- Verification: File contains both new H2 sections (`## JSON envelope authoring (P2)`, `## Testing P5 mutation - boundaries`). -- Verification: Python rows present in both sections; Go rows absent (added in PR 6 / U20); JS + Ruby rows absent (no - visible "deferred" marker in the shipped doc). -- Verification: Envelope section closes with a pointer to `references/scorecard-shape.md` (canonical envelope shape) + - `anc generate scorecard-schema` (machine-readable schema extraction). Testing section closes with a pointer to - `templates/cli-tests.rs` (Rust analog). -- Verification: Markdownlint clean. - -**Verification:** Section scaffolds in place; Python rows complete and idiomatic; explicit deferral signals where Go / -JS / Ruby content lives or will live. - ---- - -### Phase 6 — PR 6: Go starter (R11.5, R11.6b, Go cross-language rows) - -- U18. **R11.5: `templates/go-cobra/`** - -**Goal:** Minimal Go/Cobra starter mirroring `clap-main.rs`. - -**Requirements:** R11 (item 5). - -**Dependencies:** None within PR 6. - -**Files:** - -- Create: `templates/go-cobra/main.go` -- Create: `templates/go-cobra/go.mod.snippet` - -**Approach:** - -- **`main.go`** structure (Cobra-idiomatic): -- Header comment block matching `templates/clap-main.rs` voice. -- `cobra.Command` root with one read sub (`status`) and one write sub (`apply`) demonstrating `--dry-run`. (P5.) -- Persistent flags (Cobra's "global" equivalent): `--output text|json|jsonl`, `--quiet`, `--no-interactive`, - `--timeout`. (P1, P2, P7.) -- Custom exit codes via `os.Exit` only in main; subcommand handlers return errors that `main` maps to exit codes. - Mapping: 0 success, 1 command error, 2 usage error, 77 auth, 78 config, 74 IO. (P4.) -- JSON-aware error printing: when `--output json` is set, marshal errors as `{"error": true, "kind": ..., "message": - ..., "code": ...}` to stderr. -- Diagnostic output via a `diag(ctx context.Context, format string, args ...any)` helper that gates on `quiet || output - != "text"`. (P7.) -- Note: Go does NOT install a custom SIGPIPE handler (the runtime's default is acceptable), so no SIGPIPE fix needed — - header comment notes this difference vs Rust/Python. -- **`go.mod.snippet`** is a partial go.mod fragment: `module github.com/example/mytool`, `go 1.22`, `require ( - github.com/spf13/cobra v1.8.0 )`. Header comment explains the snippet is appended to `go mod init` output. - -**Patterns to follow:** `templates/clap-main.rs` structural shape; Cobra's official examples for persistent flags + -exit-code patterns; `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` for -envelope shape. - -**Test scenarios:** - -- Verification: `go vet templates/go-cobra/main.go` passes (syntax + basic semantics). -- Verification: With Cobra in a temp module, `go run templates/go-cobra/main.go --help` exits 0 and prints help text. -- Verification: `go run templates/go-cobra/main.go status --output json` emits valid JSON. -- Verification: `go run templates/go-cobra/main.go apply --dry-run --output json` reports the would-be action and does - not mutate. -- Verification: `go.mod.snippet` is valid go.mod syntax (parses with `go mod download` after stitching to a minimal - module). - -**Verification:** A new Go CLI has a starter that encodes P1/P2/P4 by construction (origin R11 acceptance, mirrored from -Python). - ---- - -- U19. **R11.6b: `templates/agents-md-template.go.md`** - -**Goal:** Go-flavoured AGENTS.md scaffold paired with U18. - -**Requirements:** R11 (item 6b). - -**Dependencies:** U18. - -**Files:** - -- Create: `templates/agents-md-template.go.md` - -**Approach:** - -- Direct adaptation of `templates/agents-md-template.md` (Rust): -- **Build & Run** uses `go build ./...`, `go run ./cmd/mytool`, `go install ./cmd/mytool`. -- **Test** uses `go test ./...`; single-test `go test -run TestName ./...`; output `go test -v ./...`. -- **Lint & Format** uses `gofmt -w .` and `go vet ./...` (and optionally `golangci-lint run`). -- **Architecture** module overview names `cmd/mytool/main.go`, `internal/cli/` (Cobra commands), `internal/errors/` - (typed errors with exit-code mapping), `internal/output/` (output mode + diag logic). -- **Exit Codes** table identical to Python/Rust templates. -- **Conventions** translated to Go idioms: "Output goes through `OutputContext` — never naked `fmt.Println`"; "Errors - are typed and bubble up — never `os.Exit` except in `main`"; "`--output`, `--quiet`, `--no-interactive`, `--timeout` - are persistent flags". -- **Common pitfalls**: missing `--no-interactive` gate before stdin read; using `fmt.Println` directly breaks `--quiet`; - `os.Exit` outside main skips deferred cleanup; forgetting `--output json` errors must be JSON. - -**Patterns to follow:** `templates/agents-md-template.md` (Rust) structure exactly. - -**Test scenarios:** - -- Verification: Same eight H2 sections as the Rust template. -- Verification: All commands are Go-ecosystem (`grep -i 'cargo\|pip\|rustc' templates/agents-md-template.go.md` returns - 0 hits). -- Verification: Markdownlint clean. - -**Verification:** A Go CLI author copies this template and gets an AGENTS.md that mirrors the Rust version's structure -with idiomatic Go tooling. - ---- - -- U20. **Go rows in `references/framework-idioms-other-languages.md`** - -**Goal:** Append Go content to the two cross-language sub-sections U17 scaffolded. - -**Requirements:** R11 (Go portion of "framework idioms to add"). - -**Dependencies:** U17 (sections + scaffolds exist), U18 (Go starter is the canonical link target). - -**Files:** - -- Modify: `references/framework-idioms-other-languages.md` - -**Approach:** - -- **JSON envelope authoring (P2) → Go row.** Replace U17's stub paragraph with a real Go example: `encoding/json` - marshalling, write to `os.Stdout`, error envelope variant on `os.Stderr`, link to `references/scorecard-shape.md` for - the canonical envelope shape and `anc generate scorecard-schema` for machine-readable schema extraction. -- **Testing P5 mutation boundaries → Go row.** Replace U17's stub paragraph with a `go test` + `os/exec` pattern: - dry-run-doesn't-mutate test, idempotent-create test, force-flag-overrides-gate test. Reference - `templates/cli-tests.rs` as the Rust analog and `templates/go-cobra/main.go` as the implementation. - -**Patterns to follow:** U17's Python row structure (parallel composition). - -**Test scenarios:** - -- Verification: Both Go rows are now substantive paragraphs (not absent, not stub-marked). -- Verification: Both rows link to `templates/go-cobra/` (implementation) or `references/scorecard-shape.md` + `anc - generate scorecard-schema` (envelope shape). -- Verification: JS + Ruby rows remain absent from the new sub-sections (per U17's row policy — no visible "deferred" - markers in the shipped doc; cross-language JS / Ruby coverage continues to be tracked in this plan's `### Deferred to - Follow-Up Work`). -- Verification: Markdownlint clean. - -**Verification:** Cross-language sub-sections cover Python + Go fully; JS + Ruby coverage remains in their existing -per-framework sections (Click / Commander / yargs / oclif / Thor) elsewhere in the file; cross-language deferral tracked -in this plan only, not in the shipped doc. - ---- - -### Phase 7 — PR 7: Deterministic-script hardening (R14) - -- U22. **R14: `bin/check-update` agent-native upgrade** - -**Goal:** Make `bin/check-update` pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true`. -Add agent-native flag surface (`--output text|json`, `--quiet`, `--no-interactive`, `--timeout `); distinguish -network-fail from up-to-date in JSON mode; preserve text-mode output and cache-file format byte-for-byte. - -**Requirements:** R14. - -**Dependencies:** None within PR 7. - -**Files:** - -- Modify: `bin/check-update` - -**Approach:** - -- **Flag surface (additive).** `--output text|json` (default `text` — current single-line token grammar preserved); - `--quiet` (suppress all output; cache-file write side-effect still runs); `--no-interactive` (no-op — script is - already non-interactive; flag added for P1 conformance and explicit semantics); `--timeout ` (overrides the - hard-coded `--max-time 5` curl timeout; min 1, max 60). -- **JSON envelope shape.** Single-line JSON to stdout, one of: `{"status": "up_to_date", "version": ""}`, - `{"status": "upgrade_available", "local": "", "remote": ""}`, `{"status": "snoozed", "remote": "", - "expires_at": }`, `{"status": "disabled"}`, `{"status": "network_unavailable", "reason": ""}`, - `{"status": "cache_only", "cached": ""}`. Stderr stays empty in success paths; non-zero curl rc no longer - collapses to silent-exit-0 in JSON mode — agents get a typed signal. -- **Default behavior unchanged.** Without flags, output is byte-faithful to today: nothing on up-to-date, nothing on - snooze/disabled/network-fail, single line `UPGRADE_AVAILABLE ` on stale. Cache file format unchanged. -- **Exit code policy.** Always 0 (degrades silently per the existing contract — periodic update check must not break - user shells if curl fails). Failure semantics surface only in JSON mode via the `status` field. Documented in the - script header as "exit code is intentionally non-meaningful for this script — gate on `status` in JSON mode" and - cross-linked to `## Anc contract`'s "switch on typed fields, never `$?`" guidance from U4. - -**Patterns to follow:** Existing `bin/check-update` voice; `templates/output-format.rs` `OutputFormat` enum shape for -the JSON envelope (Text / Json / Jsonl with format-aware printing); -`~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md`. - -**Test scenarios:** - -- Verification: `bin/check-update --help` exits 0, prints flag list including `--output`, `--quiet`, `--no-interactive`, - `--timeout`. -- Verification: `bin/check-update` (no flags) byte-matches today's output across the six branches (up-to-date, stale, - snoozed, disabled, network-fail, cache-hit) — capture today's output to fixtures pre-edit, diff against post-edit - output. -- Verification: `bin/check-update --output json` emits parseable JSON for each of the six status cases (`status` field - enumerates `up_to_date | upgrade_available | snoozed | disabled | network_unavailable | cache_only`). -- Verification: `bin/check-update --quiet` produces zero output regardless of branch; cache file is still written. -- Verification: `bin/check-update --timeout 0` rejected with non-zero exit + clear error (text mode) or `{"status": - "error", "kind": "invalid_timeout", ...}` (JSON mode); `--timeout 60` accepted. -- Verification: Cache file at `~/.cache/agent-native-cli/last-update-check` retains legacy format (`UP_TO_DATE ` / - `UPGRADE_AVAILABLE `) — verified by parsing the file after a JSON-mode invocation. -- Verification: `shellcheck bin/check-update` clean. - -**Verification:** Output contract is documented (U24 covers docs); JSON mode produces typed envelopes; default text mode -is byte-faithful to today; cache file is back-compat. - ---- - -- U23. **R14: `scripts/sync-spec.sh` agent-native upgrade** - -**Goal:** Make `scripts/sync-spec.sh` pass `anc audit --binary --audit-profile posix-utility`. Add full P1 / P2 / P5 / -P7 flag surface; move success-path status prose from stdout to stderr; emit JSON envelope for the success path; report -idempotency when no work is needed; gate destructive overwrites of dirty `spec/`. - -**Requirements:** R14. - -**Dependencies:** None within PR 7. - -**Files:** - -- Modify: `scripts/sync-spec.sh` - -**Approach:** - -- **Flag surface.** `--output text|json` (default `text`); `--quiet` (suppress stderr status; final result still emitted - in selected output mode); `--no-interactive` (script is already non-interactive; flag added for conformance); - `--timeout ` (wraps `git ls-remote` and `git clone` invocations via `timeout ...` or `git -c - http.lowSpeedTime=...`); `--dry-run` (resolve the upstream tag, report what would be vendored, do not touch `spec/`); - `--force` (override the new dirty-spec gate). -- **Stderr discipline.** Move success-path echoes ("querying $SPEC_REMOTE_URL…", "vendoring $spec_tag…", "wrote $copied - principle files") from stdout to stderr. Errors already go to stderr; this just unifies the success path. Stdout in - text mode now contains only the final result line (e.g., `synced v0.3.0 abc1234 7-files` or `already_in_sync v0.3.0`). -- **JSON envelope.** Success: `{"status": "synced", "tag": "", "sha": "", "source": "remote|local", - "files_written": , "files": ["VERSION", "CHANGELOG.md", "principles/p1-…", …]}`. No-op: `{"status": - "already_in_sync", "tag": "", "sha": ""}`. Dry-run: `{"status": "would_sync", "tag": "", "sha": - "", "files": [...]}`. Error envelopes: `{"status": "error", "kind": "remote_unreachable | no_tags | dirty_spec - | invalid_timeout", "message": "", ...}`. -- **Idempotency report.** Before extracting, compare resolved `$spec_tag` to current `$DEST_DIR/VERSION`. If equal AND - `git -C "$REPO_ROOT" diff --quiet spec/` (no local edits), emit the `already_in_sync` envelope and exit 0 without - writing. -- **Dirty-spec gate.** If `git -C "$REPO_ROOT" status --porcelain spec/` reports non-empty, refuse to overwrite without - `--force`. Error envelope: `{"status": "error", "kind": "dirty_spec", "files": [...]}`. P5 (every write op gates - destructive overwrites) verbatim. -- **`--dry-run`.** Resolve tag + enumerate files via `git ls-tree --name-only $spec_tag principles/`; emit `would_sync` - envelope; do not call `git show`, do not write to `$DEST_DIR`. -- **Exit code policy.** 0 on success / no-op / dry-run; 2 on dirty-spec without `--force` (usage error); 74 on remote - unreachable + no local fallback (sysexits IOERR); 78 on missing local tags (sysexits CONFIG); 1 on any other failure. - Documented in the script header. - -**Patterns to follow:** `templates/error-types.rs` for the `kind` enum naming; existing `sync-spec.sh` cleanup-trap + -remote-first-then-local resolution pattern; -`~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` for envelope -shape parity across success / error paths. - -**Test scenarios:** - -- Verification: `scripts/sync-spec.sh --help` exits 0, prints all six new flags. -- Verification: `scripts/sync-spec.sh` (no flags) with clean `spec/` and remote reachable produces text-mode output - matching the new contract (single human-readable result line on stdout; status prose on stderr). -- Verification: `scripts/sync-spec.sh --output json` produces parseable JSON; `status` field is one of `synced | - already_in_sync | would_sync | error`. -- Verification: `scripts/sync-spec.sh --dry-run --output json` resolves the tag without writing; `git status spec/` - unchanged after invocation. -- Verification: With dirty `spec/` (touch a principle file), `scripts/sync-spec.sh` exits 2 and emits `{"status": - "error", "kind": "dirty_spec", …}`; `--force` overrides and proceeds. -- Verification: With `SPEC_REMOTE_URL=https://invalid.example.com` and no `SPEC_ROOT`, exits 74 with `{"status": - "error", "kind": "remote_unreachable", …}`. -- Verification: Re-running against an already-vendored tag emits `already_in_sync` and writes no files (`stat -c %Y` on - `spec/VERSION` unchanged across two consecutive runs). -- Verification: `shellcheck scripts/sync-spec.sh` clean. - -**Verification:** Script passes the dogfood audit (U25); idempotency reports correctly; dirty-spec is gated; JSON -envelope shape matches the documented contract. - ---- - -- U24. **R14: Output + env-var contract documentation** - -**Goal:** Document `bin/check-update`'s and `scripts/sync-spec.sh`'s output contract (text grammar + JSON envelope per -status) and env-var overrides in the references; cross-link from SKILL.md's update-check footnote and U8's spec-skew -fallback paragraph. - -**Requirements:** R14. - -**Dependencies:** U22, U23 (the contract is what U22 / U23 actually ship; docs follow the implementation). - -**Files:** - -- Modify: `references/update-check.md` (expand to cover text grammar + JSON envelope + env vars for `bin/check-update`) -- Modify: `references/runbook.md` (new entry 7: "When and how to run `scripts/sync-spec.sh`") - -**Approach:** - -- **`references/update-check.md` expansion.** Add three sub-sections under existing content: - -1. **Output contract — text mode.** The single-line token grammar (`UPGRADE_AVAILABLE ` on stale; nothing - otherwise). Documented as a stable interface; consumers may parse via `awk '{print $1, $2, $3}'`. -2. **Output contract — JSON mode.** Full envelope shape with one example per `status` value (six total). -3. **Env-var overrides.** Three-row table: `AGENTNATIVE_SKILL_DIR`, `AGENTNATIVE_SKILL_REMOTE_URL`, - `AGENTNATIVE_SKILL_STATE_DIR` — purpose + default + when to override (testing, mirrored install). - -- **`references/runbook.md` entry 7.** ~10-line entry: when to run `sync-spec.sh` (after every `agentnative-spec` v* tag - bump; SKILL.md U8 spec-skew fallback is the in-loop trigger); env-var overrides (`SPEC_REMOTE_URL`, `SPEC_ROOT`); - `--dry-run` for previewing; `--force` for dirty-spec override; the idempotency-report behavior; cross-link to - `bin/check-update` (sibling) and `references/update-check.md`. -- **SKILL.md cross-links.** Update the `## Update-check` footnote (created in U1, refined in U3) and U8's spec-skew - fallback paragraph to mention "`--output json` is available — see `references/update-check.md`" and "see - `references/runbook.md` entry 7" respectively. - -**Patterns to follow:** `references/update-check.md` existing voice; `references/runbook.md` short-section voice from -U9. - -**Test scenarios:** - -- Verification: `references/update-check.md` enumerates all six JSON `status` values matching U22's implementation. -- Verification: `references/update-check.md` env-var table covers all three `AGENTNATIVE_SKILL_*` vars. -- Verification: `references/runbook.md` entry 7 names `--dry-run`, `--force`, `SPEC_REMOTE_URL`, `SPEC_ROOT`. -- Verification: SKILL.md update-check footnote references `references/update-check.md`'s JSON envelope; U8 spec-skew - paragraph references `references/runbook.md` entry 7. -- Verification: Markdownlint clean; cross-links resolve. - -**Verification:** An agent reading `references/update-check.md` and `references/runbook.md` cold can parse both scripts' -output and pick the right invocation flags without reading the script source. - ---- - -- U25. **R14: Dogfood gate via `anc audit --binary`** - -**Goal:** Empirically verify both scripts pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible -== true`. Document the audit invocation so contributors can re-run the gate after future edits. - -**Requirements:** R14 (acceptance gate). - -**Dependencies:** U22, U23, U24 (everything else in PR 7). - -**Files:** - -- Modify: `references/runbook.md` (new entry 8: "How do I re-run the dogfood audit?") -- Modify: `CONTRIBUTING.md` (one-paragraph note pointing at the runbook entry) - -**Approach:** - -- **Empirical gate at unit time.** Run `anc audit --binary bin/check-update --audit-profile posix-utility --output json` - and `anc audit --binary scripts/sync-spec.sh --audit-profile posix-utility --output json`. Capture both envelopes. - Assert `badge.eligible == true` for both. If either fails, the failure modes from `results[]` are the punch list — fix - in U22 / U23 and re-run before PR 7 merges. -- **`references/runbook.md` entry 8.** Document the canonical audit invocation (the exact two commands above), the - expected `badge.eligible == true` outcome, and the punch-list workflow when one fails. Note that CI integration is - deferred to a follow-up PR. -- **`CONTRIBUTING.md` update.** Add one paragraph in the existing "Before merging" or equivalent section pointing at the - runbook entry, so a contributor editing either script can't merge without re-running the gate. -- **Out of scope for this unit (deferred to follow-up CI-hardening PR):** wiring the dogfood audit into CI as a - regression gate. PR 7 ships the manual gate + documentation; CI integration is the next compounding step. - -**Patterns to follow:** `references/runbook.md` short-section voice; `CONTRIBUTING.md` existing tone. - -**Test scenarios:** - -- Verification: `anc audit --binary bin/check-update --audit-profile posix-utility --output json | jq .badge.eligible` - returns `true`. -- Verification: `anc audit --binary scripts/sync-spec.sh --audit-profile posix-utility --output json | jq - .badge.eligible` returns `true`. -- Verification: `references/runbook.md` entry 8 contains both audit commands verbatim. -- Verification: `CONTRIBUTING.md` references the runbook entry. - -**Verification:** The dogfood claim is empirically true: the skill audits CLIs against the agent-native spec; under PR 7 -the skill's own runtime scripts pass that same audit at the badge-eligible bar. - ---- - -## System-Wide Impact - -- **Interaction graph.** `SKILL.md` is the entry-point host hosts read; `getting-started.md` is its first link; - `references/*.md` are loaded on triggered read. Reordering in U1 must keep this load order intact — agents reading - SKILL.md top-to-bottom must reach Quick Start before any deeper file is referenced. -- **Error propagation.** PRs 1–6 are docs + templates; the only "errors" are markdownlint failures, broken cross-links, - and JSON Schema parse failures; each unit's verification catches them locally. **PR 7 (R14) is the exception:** U22 / - U23 modify two runtime-callable bash scripts; new failure modes are surfaced as typed JSON statuses in the new - `--output json` mode (`network_unavailable`, `dirty_spec`, `remote_unreachable`, `invalid_timeout`) and gated by - `--dry-run` / `--force` for the destructive `sync-spec.sh` overwrite path. Default text-mode behavior is byte-faithful - to today; back-compat for existing consumers and the cache file at `~/.cache/agent-native-cli/last-update-check`. -- **State lifecycle risks.** PRs 1–6 only touch git history of docs / templates. PR 7 adds two new pieces of agent- - observable state: (1) the cache file format at `~/.cache/agent-native-cli/last-update-check` is preserved as legacy - (no migration needed); (2) `spec/` overwrite is now gated by a dirty-check (`--force` required to override). Plans - live on `dev`; SKILL.md / getting-started.md / references / templates / scripts land on `dev` via PR and propagate to - `main` via `release/*` cherry-pick (per `RELEASES.md`). -- **API surface parity.** The skill's "API" is its frontmatter (`name`, `description`, `allowed-tools`) read by hosts - during discovery. U2 + U10 are the surface-touching units; U2 adds `allowed-tools`, U10 audits `description`. Both are - additive — no rename, no removal of existing fields. -- **Integration coverage.** Live `anc 0.3.0` integration is exercised by U5 (samples), U14 (schema validates real - envelope), U6 (audit-profile values match `anc audit --help`). If `anc` ships a schema bump (sibling brief A1) before - this plan lands, U4's pin path may need to update from top-level `schema_version` to `anc.schema_version` — plan - revision required at that point. -- **Unchanged invariants.** `name: agent-native-cli` (R10's documented "no" — discoverability). The seven principle - files in `spec/principles/` (vendored, not edited here). The four-step `## The anc loop` shape (U1 reorders SKILL.md - but the four steps stay; U7 + U8 fold rules into existing steps). `bin/check-update`'s default text-mode output and - cache-file format (R5 demotes its SKILL.md position; R14 adds new flags but preserves default behavior byte-for-byte). -- **Cross-repo.** Sibling brief `2026-05-01-002` carries A1 / A2 / A3 / A4 / A5 / A6 / A7 against `agentnative-cli`. - This plan's R2 / R6 / R7 reference those items as soft-dependencies but ship interim guidance independently. PR 1's - commit message naming `[A1, A2, A3, A4]` cross-references is recommended for traceability. - ---- - -## Risks & Dependencies - -| Risk | Likelihood | Impact | Mitigation | -| ----------------------------------------------------------------------------------------------------------------- | ---------- | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `anc` ships schema 0.6 (or renames `id` to `requirement_id`) mid-plan, invalidating U4 / U5 pinned values. | Low | Med | U4 footnote names the interim contract explicitly; if a bump lands, U4 + U5 need a follow-up PR. Sibling brief A1 is additive, not breaking, so the most likely shape is a new `anc.schema_version` field alongside the top-level one — both pin paths stay valid. | -| `anc.dev/badge` convention page does not exist or doesn't enumerate placement, leaving U9 entry 1 underspecified. | Med | Low | U9 entry 1's open-question note: if convention page lacks placement, U9 proposes a default (`top-of-readme` after H1) and the proposal mirrors to sibling brief A4 as a feature ask. | -| R12 merge loses a file-unique nugget that nobody notices until post-merge. | Low | Med | U11 verification specifically checks for the four enumerated nuggets (Flags-vs-subcommands taxonomy, `kind()` method, main-only `process::exit()`, Jsonl variant) **plus a token-set diff pre-deletion gate** that surfaces every lost token for explicit reviewed-loss disposition. | -| Upstream `anc generate scorecard-schema` verb (the path U14 points at) hasn't shipped when PR 4 lands. | Med | Low | U14 prose carries a one-line caveat naming the upstream plan and the upgrade path (`brew upgrade brettdavies/tap/agentnative` or pin to `https://anc.dev/scorecard-v0.5.schema.json` archive). Until the verb ships, parser authors use U5's prose form. No skill-side rework needed when the verb lands — the path is verb-name-stable. | -| SKILL.md exceeds 200 lines after PR 1 lands (R9 ceiling violated). | Low | Low | U1 verification checks line count at end of PR 1. Origin estimate is +45 lines net; budget is +54 lines, so margin exists. If exceeded, deferred guidance moves to `references/runbook.md` (U9) or extends `references/scorecard-shape.md` (U5). | -| Trigger-keyword audit (U10 / OQ-origin-#5) finds the existing keyword list is fine and U10 has no work. | Low | Low | OK — U10 commits only the kept-as-is rationale documentation in that case. Frontmatter remains unchanged. The unit still ships (one-paragraph commit). | -| PR 5 / PR 6 reviewers (Python / Go specialists) push back on starter idioms. | Med | Low | Each PR is intentionally scoped to one ecosystem so review is fast and targeted. Iterate within PR; do not block the rest of the plan. | -| Live `anc audit` envelope used by U5 examples drifts before PR 1 lands. | Low | Low | U5 verification re-runs `anc audit --output json` at unit time and checks samples against current output. If drift detected, regenerate the samples — they're cheap. | -| `bin/check-update` cache file format compat breaks under PR 7 (cache consumed across an upgrade boundary). | Low | Low | U22 keeps the cache file format unchanged (legacy `UP_TO_DATE ` / `UPGRADE_AVAILABLE `); JSON appears at output time only. Verified by post-edit cache-file inspection in U22's test scenarios. | -| `scripts/sync-spec.sh` stderr-discipline change breaks an unknown consumer that captured the success-path stdout. | Low | Low | Searched for callers; only invocation today is direct human run + this plan's U8 spec-skew fallback (which doesn't capture stdout). PR 7 changelog calls out the breaking change explicitly so any external consumer sees the migration note. | -| `anc audit --binary` against bash scripts surfaces unexpected gaps (e.g., missing P3 `after_help`, P6 SIGPIPE). | Med | Low | U25 is the empirical gate; if either script fails, U22 / U23 absorb the punch list before merge. Acceptable to iterate within PR 7. If a finding requires an `anc`-side feature (e.g., bash-script-aware audit profile), defer with a sibling-brief item and document the suppression in U25's verification note. | - ---- - -## Phased Delivery - -Seven PRs, sequenced. Each phase = one PR. Origin's handoff section is the canonical sequencing source for PRs 1–6; PR 7 -is plan-introduced (R14, the dogfood gate). All seven PRs ship in one consolidated `v0.4.0` release. - -### Phase 1 — PR 1: Skill-side determinism (R1–R9, R13) - -Lands U1–U9 + U21 (R9 first; U21 last so it can absorb U1's section skeleton and expand U9's runbook entry). One commit -per unit for review legibility. Targets `dev`. Estimated total ~280 lines added/changed across `SKILL.md`, -`getting-started.md`, and three new reference files (`runbook.md`, `scorecard-shape.md`, `audit-profile-selection.md`). -Largest PR; contains the load-bearing reorder. - -Acceptance gate: SKILL.md ≤ 200 lines, first runnable command is `anc audit`, all five `results[].status` values -enumerated in `## Anc contract`, exhaustive scorecard sample parses as JSON, exit-code table empirically matches live -`anc 0.3.0` behavior across at least three commands (one with `summary.fail == 0`, one with `summary.fail > 0`, one -invocation error), R6 spec-skew paragraph distinguishes namespace-mismatch from actual-spec-drift, **subcommand index -matches live `anc --help` `Commands:` block** (U21), **target-resolution runbook entry names all four modes (`.`, -`--binary`, `--source`, `--command `)** (U21). U14 (formerly the hand-written JSON Schema; now docs-only — pivots -to documenting the upstream `anc generate scorecard-schema` extraction path) lands in PR 4; PR 1's gate has no -schema-related criterion. - -### Phase 2 — PR 2: Frontmatter polish (R10) - -Lands U10. Tiny PR; can be folded into PR 1 if R10 stays under ~10 lines after the trigger-keyword audit. - -Acceptance gate: `description` ≤ 1024 chars; SKIP clause names `compound-engineering` and `create-agent-skills` -redirects. - -### Phase 3 — PR 3: Rust idioms consolidation (R12) - -Lands U11. Self-contained; deletes two files, creates one, updates SKILL.md and `getting-started.md` link targets. Lands -before PR 4 so the new Rust starter docs reference `references/rust-clap.md`. - -Acceptance gate: `framework-idioms.md` and `rust-clap-patterns.md` deleted; `rust-clap.md` exists, ~250 lines, all four -file-unique nuggets present; no broken links anywhere in the repo. - -### Phase 4 — PR 4: Rust starter completion + scorecard-schema extraction docs (R11.1, R11.2, R11.3-repurposed) - -Lands U12 + U13 + U14. U12 + U13 are Rust-starter additions (cargo-toml.md, cli-tests.rs); U14 is docs-only (modifies -`references/scorecard-shape.md` + `references/runbook.md` to document the upstream `anc generate scorecard-schema` -extraction path). Cross-references `references/rust-clap.md` (PR 3). - -Acceptance gate: Cargo TOML resolves dependencies; `cli-tests.rs` compiles cleanly when copied into a fresh `cargo -init`; `references/scorecard-shape.md` contains the `anc generate scorecard-schema` sub-section with cross-link to -upstream plan; `references/runbook.md` contains the schema-validation entry; **no file is created at -`templates/scorecard-envelope.schema.json`** (the vendored artifact is explicitly NOT shipped — upstream binary is -canonical). - -### Phase 5 — PR 5: Python starter (R11.4, R11.6a, cross-language scaffolds) - -Lands U15 + U16 + U17. Cross-language idiom sub-section scaffolds + Python rows ride here. - -Acceptance gate: Python starter parses + runs `--help`; pyproject snippet is valid TOML; -framework-idioms-other-languages section scaffolds present; Python rows substantive; Go / JS / Ruby rows are explicitly -deferred markers. - -### Phase 6 — PR 6: Go starter (R11.5, R11.6b, Go cross-language rows) - -Lands U18 + U19 + U20. - -Acceptance gate: Go starter `go vet` clean, runs `--help`; AGENTS.md mirrors Rust template structure; Go rows in -cross-language sub-sections substantive (no longer stubs). - -### Phase 7 — PR 7: Deterministic-script hardening (R14) - -Lands U22 + U23 + U24 + U25. The dogfood gate: makes `bin/check-update` and `scripts/sync-spec.sh` pass `anc audit ---binary --audit-profile posix-utility` with `badge.eligible == true`. Additive flag surfaces (no removals); cache-file -format unchanged; one minor breaking change in `sync-spec.sh` (success-path status prose moves from stdout to stderr, -called out in PR 7's changelog). - -Acceptance gate: `bin/check-update --help` and `scripts/sync-spec.sh --help` print the new flag surfaces; both scripts -pass `anc audit --binary --audit-profile posix-utility` with `badge.eligible == true` (the empirical gate, U25); -`references/update-check.md` documents both modes' output contract and env-var overrides; `references/runbook.md` -carries entries 7 (sync-spec runbook) and 8 (re-run-the-dogfood-audit); `CONTRIBUTING.md` references the audit gate; -`shellcheck` clean on both scripts. - ---- - -## Documentation / Operational Notes - -- **CHANGELOG.** Each PR's `## Changelog` section in the body is the source of truth (per repo convention; - `scripts/generate-changelog.sh` extracts it). PR 1's changelog is user-facing-large: "SKILL.md restructure with anc - contract, scorecard samples, audit-profile selection, loop termination rule, install precondition, spec-skew fallback, - and runbook." PR 2–6 changelogs scoped per-PR. PR 7's changelog calls out two user-facing additions and one minor - breaking change: (1) `bin/check-update` and `scripts/sync-spec.sh` gain `--output json|text`, `--quiet`, - `--no-interactive`, `--timeout` flags; `sync-spec.sh` additionally gains `--dry-run` and `--force`; (2) both scripts - pass the dogfood audit (`anc audit --binary --audit-profile posix-utility`) at the badge-eligible bar; (3) **breaking - change:** `sync-spec.sh` success-path status prose ("querying…", "vendoring…", "wrote N files") moves from stdout to - stderr — consumers capturing the success-path stdout will see only the new single-line result token instead. -- **VERSION bump.** All seven PRs ship in one consolidated release: `v0.4.0` from `v0.3.0`. The release covers the R9 - reorder + R2 anc contract + R10 frontmatter + R12 Rust idioms + R11 starter templates + R13 subcommand index + R14 - deterministic-script hardening. Minor bump (additive surface, no removals beyond R12's two deleted reference files - which had explicit redirects, and PR 7's stdout→stderr re-routing in `sync-spec.sh`'s success path which is called out - as breaking in the changelog). Cut the tag after PR 7 lands and the dogfood gate (U25) is green. Per-PR commit - messages remain as the audit trail; the consolidated `RELEASES.md` entry summarizes against the seven PRs. -- **`anc skill install` consumers.** When PR 1 ships, hosts that have already installed the skill via `anc skill - install` will see the restructured SKILL.md after re-running install or `git pull`. R5's install-precondition is the - new first action — agents that cached the old "First action: update check" wording will encounter the precondition on - next session. This is intentional and correct (the precondition is more important than the bundle update-check). -- **Cross-repo.** Sibling brief PRs land independently in `agentnative-cli`. This plan does NOT block on any of them. -- **`spec/` resync.** Not required for this plan — `spec/VERSION = 0.3.0` matches `anc 0.3.0`'s `spec_version: "0.3.0"`. - If the spec moves, run `scripts/sync-spec.sh` as a separate commit before PR 1 lands. -- **Markdownlint configuration.** `.markdownlint-cli2.yaml` enforces 120-char line length. Per global instructions, do - not manually wrap markdown lines — the auto-format hook handles it. - ---- - -## Sources & References - -- **Origin document:** - [`docs/brainstorms/2026-05-01-001-skill-determinism-requirements.md`](../brainstorms/2026-05-01-001-skill-determinism-requirements.md) -- **Sibling brief (cross-repo, anc-side):** - [`docs/brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md`](../brainstorms/2026-05-01-002-anc-determinism-feature-asks-requirements.md) -- **Upstream plan (anc-side schema, the canonical artifact U14 points at):** - `agentnative-cli/docs/plans/2026-04-30-002-feat-scorecard-json-schema-plan.md` — derives the scorecard JSON Schema - from Rust types via `schemars`, embeds it in the binary, exposes it via `anc generate scorecard-schema`, archives - versioned URLs under `https://anc.dev/scorecard-v{X.Y}.schema.json` via the `agentnative-site` repo. This skill's U14 - pivot from "ship a vendored schema" to "document the extraction path" is grounded in this plan being the - single-source-of-truth. -- **Repo files referenced:** `SKILL.md`, `getting-started.md`, `AGENTS.md`, `references/framework-idioms.md`, - `references/framework-idioms-other-languages.md`, `references/project-structure.md`, - `references/rust-clap-patterns.md`, `references/update-check.md`, `templates/clap-main.rs`, - `templates/error-types.rs`, `templates/output-format.rs`, `templates/agents-md-template.md`, `bin/check-update`, - `scripts/sync-spec.sh`, `spec/principles/p[1-7]-*.md`, `spec/VERSION`. -- **Live `anc` integration:** `anc 0.3.0` (verified 2026-05-01); envelope at top-level keys `anc, audience, - audit_profile, badge, coverage_summary, results, run, schema_version, spec_version, summary, target, tool`; result - keys `confidence, evidence, group, id, label, layer, status`; `schema_version: "0.5"`, `spec_version: "0.3.0"`. -- **Solutions:** -- `~/dev/solutions-docs/best-practices/skills-2-0-structure-progressive-disclosure-20260402.md` -- `~/dev/solutions-docs/architecture-patterns/anc-cli-output-envelope-pattern-2026-04-29.md` -- `~/dev/solutions-docs/best-practices/cli-structure-for-machines-typed-json-fields-over-display-strings-2026-04-20.md` -- `~/dev/solutions-docs/best-practices/consistent-json-schema-across-success-and-error-paths-2026-04-20.md` -- `~/dev/solutions-docs/best-practices/agentnative-version-model-2026-05-01.md` -- **External:** -- [`anc.dev/badge`](https://anc.dev/badge) — badge convention reference for U9 entry 1 -- [`agentnative-cli`](https://github.com/brettdavies/agentnative-cli) — sibling repo for cross-repo brief -- [`agentnative`](https://github.com/brettdavies/agentnative) — spec repo (not edited here) -- **Prior plans (this repo, dev):** -- `docs/plans/2026-04-27-001-bootstrap-agentnative-skill-plan.md` (bootstrap) -- `docs/plans/2026-04-28-001-feat-update-check-mechanism-plan.md` (update-check, U3 demotes its SKILL.md placement) From 02a8a7af4c06ab3a6769e12685417b23b45cca75 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:39:24 -0500 Subject: [PATCH 11/15] chore(release): v0.5.0 Pin skill bundle version to anc CLI version. Bump VERSION 0.2.0 to 0.5.0 (versions 0.3.0 and 0.4.0 skipped so the skill tracks the canonical anc release number going forward). Re-render CHANGELOG with the cleaned v0.5.0 section sourced from PR bodies edited on GitHub, plus filler [0.3.0] and [0.4.0] sections documenting the version skip. --- CHANGELOG.md | 275 +++++++++++++++++++++++++++++++++++++++++++-------- VERSION | 2 +- 2 files changed, 234 insertions(+), 43 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index a87ec0a..f2c8c52 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,69 +2,260 @@ All notable changes to this project will be documented in this file. +## [0.5.0] - 2026-06-01 + +### Added + +- `references/update-check.md`: pulled-out operational detail for the consumer-side update-check script (prompt copy, + snooze ladder, state-dir layout). by @brettdavies in [#14](https://github.com/brettdavies/agentnative-skill/pull/14) +- New "The anc loop" section in `SKILL.md` documenting scorecard schema 0.5 fields (`coverage_summary.must.verified`, + `badge.eligible`, `badge.score_pct`, `badge.embed_markdown`), the 80% badge eligibility floor, and the four + `--audit-profile` categories (`human-tui`, `file-traversal`, `posix-utility`, `diagnostic-only`). +- `anc skill install ` documented in `getting-started.md` § "Installing anc and this skill bundle" with + `--dry-run`, `eval $(...)` capture, and `--output json` envelope. +- `docs/SYNCS.md`: cross-repo sync map covering inbound (`agentnative` spec → this repo via `scripts/sync-spec.sh`) and + outbound (this repo → consumer hosts; `agentnative-site` daily probe) edges, with manifest-vs-bundle ownership + diagrams. +- `--ref ` flag and matching `SPEC_REF` environment variable on `scripts/sync-spec.sh` for vendoring + `agentnative-spec` from an explicit branch, tag, or commit SHA. Default behavior (no `--ref`) still resolves the + latest `v*` tag. by @brettdavies in [#15](https://github.com/brettdavies/agentnative-skill/pull/15) +- `scripts/hooks/pre-push`: local CI mirror that runs markdownlint-cli2 and shellcheck against the same surfaces CI + checks, gating pushes before they reach GitHub. by @brettdavies in + [#16](https://github.com/brettdavies/agentnative-skill/pull/16) +- New skill-bundle channel-context layer: `PRODUCT.md` (channel design context), `BRAND.md` (universal voice, vendored + from `agentnative-spec`), and `scripts/sync-prose-tooling.sh` (vendoring vehicle, decoupled from + `scripts/sync-spec.sh`). by @brettdavies in [#17](https://github.com/brettdavies/agentnative-skill/pull/17) +- `RELEASES-RATIONALE.md` companion to `RELEASES.md` documents the rationale behind branching, PR conventions, CHANGELOG + generation, spec-vendor pipeline, and branch protection. +- GitHub issue forms: `bug-report.yml`, `bundle-proposal.yml`, `00-blank.yml`, and `config.yml`. +- `scripts/sync-dev-after-release.sh`: release-backport tool that overwrites `VERSION` with the released number and + copies `CHANGELOG.md` verbatim from `origin/main` as one signed commit on `dev`. Idempotent on re-run. by @brettdavies + in [#18](https://github.com/brettdavies/agentnative-skill/pull/18) +- P8 (Discoverable Through Agent Skill Bundles) principle, vendored from agentnative-spec v0.4.0. by @brettdavies in + [#19](https://github.com/brettdavies/agentnative-skill/pull/19) +- `principles/scoring.md` (leaderboard formula, badge eligibility floor, color bands) is now vendored into `spec/`; + `scripts/sync-spec.sh` fetches it alongside the principle files. +- Add `evals/` with three self-contained prompts covering greenfield Rust, remediate-existing-Rust, and multi-language + Python (Click) workflows. by @brettdavies in [#21](https://github.com/brettdavies/agentnative-skill/pull/21) +- Document `anc skill install --all` and `anc skill update [host|--all]` in the install section. +- Document `anc emit schema` for extracting the scorecard JSON Schema embedded in the binary. + +### Changed + +- Vendored-spec prose reference in `SKILL.md` bumped `v0.2.0 → v0.3.0` to match `spec/VERSION`. by @brettdavies in + [#14](https://github.com/brettdavies/agentnative-skill/pull/14) +- `SKILL.md` description expanded with Rust/clap, scorecard, audit-profile, agent-native badge, and `anc skill install` + keywords plus a SKIP clause that routes TUI builders to `--audit-profile human-tui` instead of this skill. +- `SKILL.md` "Update check" block compressed from 35 lines (which buried the first-action intent) to a 6-line "First + action: update check" stub; details moved to `references/update-check.md`. +- `RELEASES.md` § "Releasing dev to main" step 4: single guarded-paths grep replaced with a triple-diff verification + block (A: main→release, B: release→dev, C: dev→main) plus a `git cherry HEAD origin/dev` patch-id check with + squash-merge triage guidance. Mirrors the same step that landed on `agentnative-cli` during v0.3.0 prep. +- `scripts/sync-spec.sh` now uses `gh api` (raw content endpoint) instead of `git clone` for the primary fetch path. All + ref types share one code path; the local-fallback path against `SPEC_ROOT` is preserved for offline runs. by + @brettdavies in [#15](https://github.com/brettdavies/agentnative-skill/pull/15) +- PR template, `RELEASES.md`, and `RELEASES-RATIONALE.md` codify the net-diff PR-body rule: Summary describes the + merged-state diff and excludes verification artifacts. by @brettdavies in + [#17](https://github.com/brettdavies/agentnative-skill/pull/17) +- `RELEASES.md` "Apply" section for branch-protection rulesets past-tensed (all three rulesets installed; apply commands + re-runnable). +- `CONTRIBUTING.md` widens the sibling-repo list to four, adds a Contribution Tiers table (Signal / Proposal / Code), + and points at the spec's AI-disclosure policy. +- `README.md` repo-layout block lists `BRAND.md`, `PRODUCT.md`, `RELEASES-RATIONALE.md`, and + `scripts/sync-prose-tooling.sh`; principle-range link covers `/p1` through `/p8`. +- `AGENTS.md` adds a "Voice and prose rules" pointer to `PRODUCT.md` and `BRAND.md`. +- `RELEASES.md` documents the post-publish backport step under "Releasing dev to main." by @brettdavies in + [#18](https://github.com/brettdavies/agentnative-skill/pull/18) +- `RELEASES-RATIONALE.md` documents the rationale for landing the backport as a direct-to-dev commit (rather than + through a PR) and the load-bearing consequences of skipping it. +- The canonical audit command is now `anc audit` (was `anc check`), matching the renamed `anc` subcommand. Skill docs, + the four-step loop, and all `anc`-compliance prose now read "audit" and "auditor". by @brettdavies in + [#19](https://github.com/brettdavies/agentnative-skill/pull/19) +- Bundled spec bumped 0.3.0 to 0.4.0; the skill now teaches eight principles. +- Re-vendor `spec/` to `agentnative-spec` v0.5.0. by @brettdavies in + [#21](https://github.com/brettdavies/agentnative-skill/pull/21) +- Track `anc` v0.5.0 scorecard surface: schema 0.7, per-row `id` / `audit_id` / `tier` fields, `opt_out` and `n_a` + statuses, 70% badge floor. +- Surface new top-level flags: `--examples`, `--json`, `--raw`, `--color`, `--verbose`. + +### Fixed + +- Strip leaked `` / `` XML trailers from `README.md`, `AGENTS.md`, and `CONTRIBUTING.md`. by + @brettdavies in [#20](https://github.com/brettdavies/agentnative-skill/pull/20) +- Correct the "no MUST violations" check: `coverage_summary.must.verified` counts any verdict (including `fail`), so the + right bar is no `results[]` row where `tier == "must" && status == "fail"`. by @brettdavies in + [#21](https://github.com/brettdavies/agentnative-skill/pull/21) +- Clarify that `badge.score_pct` is computed from behavioral-layer rows only. Source- and project-layer audits do not + affect the score. + +### Documentation + +- `docs/SYNCS.md` spec-row mechanism column updated to describe `--ref` / `SPEC_REF`, the cross-repo coordination + workflow, and the `gh api` resolution semantics. by @brettdavies in + [#15](https://github.com/brettdavies/agentnative-skill/pull/15) +- `spec/README.md` now links to the upstream spec landing page (leaderboard, badge convention, acknowledgements) and + documents `scoring.md` in the layout table. by @brettdavies in + [#19](https://github.com/brettdavies/agentnative-skill/pull/19) +- Tighten prose in `SKILL.md`, `AGENTS.md`, `PRODUCT.md`, and `SECURITY.md`. Term-definition bullets switch to colon + style; asides move into parens or commas; strong-contrast sentences split where it reads better. The Layout table in + `AGENTS.md` is wrapped in scoring-skip comment markers because its column indicator is data, not prose. by + @brettdavies in [#22](https://github.com/brettdavies/agentnative-skill/pull/22) + +### Removed + +- Legacy markdown issue templates (`bug_report.md`, `bundle_proposal.md`), replaced by YAML forms. by @brettdavies in + [#17](https://github.com/brettdavies/agentnative-skill/pull/17) + +**Full Changelog**: [v0.2.0...v0.5.0](https://github.com/brettdavies/agentnative-skill/compare/v0.2.0...v0.5.0) + +## [0.4.0] - skipped + +Version skipped. The skill bundle version is now pinned to the `anc` CLI version; the previous skill release was v0.2.0, +and the next is v0.5.0 to match `anc` v0.5.0. See [0.5.0] for the changes that span this range. + +## [0.3.0] - skipped + +Version skipped. The skill bundle version is now pinned to the `anc` CLI version; the previous skill release was v0.2.0, +and the next is v0.5.0 to match `anc` v0.5.0. See [0.5.0] for the changes that span this range. + ## [0.2.0] - 2026-04-29 ### Added -- Version-controlled GitHub repository rulesets for `main`, `dev`, and release tags (`v*`). Apply procedure documented in `.github/rulesets/README.md`. by @brettdavies in [#1](https://github.com/brettdavies/agentnative-skill/pull/1) -- `AGENTS.md` (root) describing the bundle layout, lint commands, branch model, and hard rules for agents working in this producer repo. by @brettdavies in [#2](https://github.com/brettdavies/agentnative-skill/pull/2) -- `RELEASES.md` (root) documenting a release procedure for this repo (later rewritten in #3 to the canonical full `release/*` pattern). +- Version-controlled GitHub repository rulesets for `main`, `dev`, and release tags (`v*`). Apply procedure documented + in `.github/rulesets/README.md`. by @brettdavies in [#1](https://github.com/brettdavies/agentnative-skill/pull/1) +- `AGENTS.md` (root) describing the bundle layout, lint commands, branch model, and hard rules for agents working in + this producer repo. by @brettdavies in [#2](https://github.com/brettdavies/agentnative-skill/pull/2) +- `RELEASES.md` (root) documenting a release procedure for this repo (later rewritten in #3 to the canonical full + `release/*` pattern). - `.github/pull_request_template.md` (canonical PR template). -- `.github/workflows/guard-main-docs.yml` caller for the `brettdavies/.github` reusable workflow that blocks `docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/` from PRs targeting `main`. -- `cliff.toml` — git-cliff configuration mirroring sibling repos. by @brettdavies in [#3](https://github.com/brettdavies/agentnative-skill/pull/3) -- `scripts/generate-changelog.sh` — release-time CHANGELOG generator. Reads PR-body `## Changelog` sections and prepends a curated, attributed `[X.Y.Z]` section. Authoritative; never hand-edit `CHANGELOG.md`. -- `CONTRIBUTING.md` — how to propose changes, link to release procedure. -- `.github/ISSUE_TEMPLATE/bug_report.md` — bug report template. -- `.github/ISSUE_TEMPLATE/principle_proposal.md` — substantive standards-change template. -- `**Renamed:**` subsection in `.github/pull_request_template.md` (sync of the canonical update at `~/dotfiles/stow/github/dot-config/github/pull_request_template.md`). Sister sync PRs landing in agentnative-cli (#30 there) and agentnative-site (already on dev as commit 4437435). -- Add vendored `bundle/spec/` tree (agentnative-spec @ v0.2.0): `VERSION`, `CHANGELOG.md`, `README.md`, and seven `principles/p*.md` files with machine-readable `requirements[]` frontmatter — canonical principle text the skill now points at instead of paraphrasing. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) -- Add `bundle/getting-started.md` covering three working agent loops (existing CLI / new Rust / other language), canonical `anc audit --output json` invocations, and a "where things live" map. +- `.github/workflows/guard-main-docs.yml` caller for the `brettdavies/.github` reusable workflow that blocks + `docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/` from PRs targeting `main`. +- `cliff.toml`: git-cliff configuration mirroring sibling repos. by @brettdavies in + [#3](https://github.com/brettdavies/agentnative-skill/pull/3) +- `scripts/generate-changelog.sh`: release-time CHANGELOG generator. Reads PR-body `## Changelog` sections and prepends + a curated, attributed `[X.Y.Z]` section. Authoritative; never hand-edit `CHANGELOG.md`. +- `CONTRIBUTING.md`: how to propose changes, link to release procedure. +- `.github/ISSUE_TEMPLATE/bug_report.md`: bug report template. +- `.github/ISSUE_TEMPLATE/principle_proposal.md`: substantive standards-change template. +- `**Renamed:**` subsection in `.github/pull_request_template.md` (sync of the canonical update at + `~/dotfiles/stow/github/dot-config/github/pull_request_template.md`). Sister sync PRs landing in agentnative-cli (#30 + there) and agentnative-site (already on dev as commit 4437435). +- Add vendored `bundle/spec/` tree (agentnative-spec @ v0.2.0): `VERSION`, `CHANGELOG.md`, `README.md`, and seven + `principles/p*.md` files with machine-readable `requirements[]` frontmatter. This is the canonical principle text the + skill now points at instead of paraphrasing. by @brettdavies in + [#4](https://github.com/brettdavies/agentnative-skill/pull/4) +- Add `bundle/getting-started.md` covering three working agent loops (existing CLI / new Rust / other language), + canonical `anc check --output json` invocations, and a "where things live" map. - Add `scripts/sync-spec.sh` so the bundle can re-vendor agentnative-spec on demand. -- `LICENSE-APACHE` — Apache 2.0 boilerplate, identical to the file in `agentnative-cli`. by @brettdavies in [#6](https://github.com/brettdavies/agentnative-skill/pull/6) -- `bundle/bin/check-update` — script that compares the consumer's local `VERSION` against the producer repo's `main` and emits `UPGRADE_AVAILABLE ` (or empty when up-to-date / snoozed / disabled). Adapts the gstack pattern with cache TTL (60min UP_TO_DATE / 720min UPGRADE_AVAILABLE) and a 3-level snooze (24h / 48h / 7d). State directory: `$HOME/.cache/agent-native-cli/`. by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) -- `bundle/SKILL.md` `## Update check` section — first non-frontmatter section after the intro. Documents how to invoke the script and inlines the AskUserQuestion-driven upgrade flow with three options ("Yes, upgrade now" / "Not now" / "Never ask again"). +- `LICENSE-APACHE`: Apache 2.0 boilerplate, identical to the file in `agentnative-cli`. by @brettdavies in + [#6](https://github.com/brettdavies/agentnative-skill/pull/6) +- `bundle/bin/check-update`: script that compares the consumer's local `VERSION` against the producer repo's `main` and + emits `UPGRADE_AVAILABLE ` (or empty when up-to-date / snoozed / disabled). Adapts the gstack pattern + with cache TTL (60min UP_TO_DATE / 720min UPGRADE_AVAILABLE) and a 3-level snooze (24h / 48h / 7d). State directory: + `$HOME/.cache/agent-native-cli/`. by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) +- `bundle/SKILL.md` `## Update check` section, the first non-frontmatter section after the intro. Documents how to + invoke the script and inlines the AskUserQuestion-driven upgrade flow with three options ("Yes, upgrade now" / "Not + now" / "Never ask again"). ### Changed -- `.gitignore` adds `!AGENTS.md` to override the global `**/AGENTS.md` ignore for this repo only. Other repos remain unaffected. by @brettdavies in [#2](https://github.com/brettdavies/agentnative-skill/pull/2) -- **Breaking (install layout):** Skill bundle moved into `bundle/` subdirectory. Installers must fetch `bundle/` rather than the entire repo. Consumer's installed skill directory shape is unchanged (`SKILL.md` at the root). by @brettdavies in [#3](https://github.com/brettdavies/agentnative-skill/pull/3) -- Adopted the full `release/*` cherry-pick release pattern (was lightweight `dev → main`). Plans on `dev` no longer conflict with release PRs because release branches cherry-pick only non-docs commits. +- `.gitignore` adds `!AGENTS.md` to override the global `**/AGENTS.md` ignore for this repo only. Other repos remain + unaffected. by @brettdavies in [#2](https://github.com/brettdavies/agentnative-skill/pull/2) +- **Breaking (install layout):** Skill bundle moved into `bundle/` subdirectory. Installers must fetch `bundle/` rather + than the entire repo. Consumer's installed skill directory shape is unchanged (`SKILL.md` at the root). by + @brettdavies in [#3](https://github.com/brettdavies/agentnative-skill/pull/3) +- Adopted the full `release/*` cherry-pick release pattern (was lightweight `dev → main`). Plans on `dev` no longer + conflict with release PRs because release branches cherry-pick only non-docs commits. - `RELEASES.md` rewritten to the canonical pattern; broken `../../.claude/...` link removed. -- **Breaking (install layout):** Skill bundle no longer ships `bundle/scripts/` or `bundle/checklists/`. Installers and consumers should fetch only the surviving directories: `SKILL.md`, `getting-started.md`, `spec/`, `references/`, `templates/`. The consumer's installed skill-directory shape (`SKILL.md` at the root) is unchanged. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) -- Rewrite `bundle/SKILL.md` to drop inline principle prose, link `bundle/getting-started.md` and `bundle/spec/principles/` for progressive disclosure, and frame the spec / `anc` / skill three-artifact ecosystem. -- Reframe `RELEASES.md` SemVer guidance around the bundle's actual surface (markdown + templates + vendored spec) rather than deleted shell-script exit codes; document the spec-bump-vs-skill-version distinction. -- License changed from MIT-only to dual MIT or Apache-2.0 (consumer's choice). The skill bundle, top-level scripts, and all repo content are now dual-licensed; no MIT compatibility regression. by @brettdavies in [#6](https://github.com/brettdavies/agentnative-skill/pull/6) -- Documentation now points at `https://anc.dev/skill` instead of `https://anc.dev/install` for skill installation instructions, the cross-repo re-pin process, and the `bundle/` consumer description. by @brettdavies in [#7](https://github.com/brettdavies/agentnative-skill/pull/7) -- `bundle/SKILL.md`, `bundle/getting-started.md`, `bundle/spec/README.md` — drop "pinned ref" / "pinned upstream tag" / "pinned SPEC_VERSION" framing in favor of "vendored snapshot, refreshed each release". The bundle's behavior is unchanged; the language was misleading because the install command never actually pinned at the consumer side. by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) -- **BREAKING (install layout):** Skill content moved out of `bundle/` to the repo root. After install, hosts find `SKILL.md` at the skill root (where Claude Code expects it), not at `/bundle/SKILL.md`. Plain `git clone --depth 1` and `git pull --ff-only` are now the load-bearing install + update commands; no sparse-checkout magic, no post-install scripts. by @brettdavies in [#9](https://github.com/brettdavies/agentnative-skill/pull/9) +- **Breaking (install layout):** Skill bundle no longer ships `bundle/scripts/` or `bundle/checklists/`. Installers and + consumers should fetch only the surviving directories: `SKILL.md`, `getting-started.md`, `spec/`, `references/`, + `templates/`. The consumer's installed skill-directory shape (`SKILL.md` at the root) is unchanged. by @brettdavies in + [#4](https://github.com/brettdavies/agentnative-skill/pull/4) +- Rewrite `bundle/SKILL.md` to drop inline principle prose, link `bundle/getting-started.md` and + `bundle/spec/principles/` for progressive disclosure, and frame the spec / `anc` / skill three-artifact ecosystem. +- Reframe `RELEASES.md` SemVer guidance around the bundle's actual surface (markdown + templates + vendored spec) rather + than deleted shell-script exit codes; document the spec-bump-vs-skill-version distinction. +- License changed from MIT-only to dual MIT or Apache-2.0 (consumer's choice). The skill bundle, top-level scripts, and + all repo content are now dual-licensed; no MIT compatibility regression. by @brettdavies in + [#6](https://github.com/brettdavies/agentnative-skill/pull/6) +- Documentation now points at `https://anc.dev/skill` instead of `https://anc.dev/install` for skill installation + instructions, the cross-repo re-pin process, and the `bundle/` consumer description. by @brettdavies in + [#7](https://github.com/brettdavies/agentnative-skill/pull/7) +- `bundle/SKILL.md`, `bundle/getting-started.md`, `bundle/spec/README.md`: drop "pinned ref" / "pinned upstream tag" / + "pinned SPEC_VERSION" framing in favor of "vendored snapshot, refreshed each release". The bundle's behavior is + unchanged; the language was misleading because the install command never actually pinned at the consumer side. by + @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) +- **BREAKING (install layout):** Skill content moved out of `bundle/` to the repo root. After install, hosts find + `SKILL.md` at the skill root (where Claude Code expects it), not at `/bundle/SKILL.md`. Plain `git clone + --depth 1` and `git pull --ff-only` are now the load-bearing install + update commands; no sparse-checkout magic, no + post-install scripts. by @brettdavies in [#9](https://github.com/brettdavies/agentnative-skill/pull/9) - `bin/check-update`: `SKILL_DIR` is now one dir up from the script (was two), since there's no `bundle/` layer. - `scripts/sync-spec.sh` writes to `spec/` (was `bundle/spec/`). -- README, AGENTS, CONTRIBUTING reframe the consumer/producer split from a directory boundary (`bundle/` vs everything else) to an audience boundary (host reads `SKILL.md` + `bin/` + `spec/` + `references/` + `templates/` + `VERSION`; ignores everything else). -- Spec content vendored under `spec/` re-vendored from `agentnative-spec` v0.2.0 to v0.3.0. All 7 principles flip `status: draft` → `status: active` (P1–P7 are now the shipped baseline); prose tightened across P1 (TUI parenthetical), P2 (sysexits acknowledgment), P4 (dependency-gating cleanup), P5 (`--dry-run` write-gate + retry hedge), P6 (SIGPIPE language-neutral + global-flags behavioral lead), P7 (LLM-vs-non-LLM cost generalization). No requirement IDs added/removed/renamed; no level changes. Full upstream context: agentnative `v0.3.0` CHANGELOG. by @brettdavies in [#10](https://github.com/brettdavies/agentnative-skill/pull/10) -- `scripts/sync-spec.sh` no longer accepts `SPEC_REF`. The script always vendors the latest `v*` tag, queried from `SPEC_REMOTE_URL` (default `https://github.com/brettdavies/agentnative.git`) via `git ls-remote --tags --sort=-version:refname` and shallow-cloned for extraction. On any remote failure, falls back to the existing `SPEC_ROOT`-based logic (default `$HOME/dev/agentnative-spec`). New env var `SPEC_REMOTE_URL` overrides the remote; the temp clone is auto-cleaned on script exit via trap. by @brettdavies in [#11](https://github.com/brettdavies/agentnative-skill/pull/11) -- `.markdownlint-cli2.yaml` excludes `CHANGELOG.md` from linting. Aligns its treatment with `spec/CHANGELOG.md` and reflects that the file is regenerated by `scripts/generate-changelog.sh`, not hand-edited. Per-line content is governed by PR-body bullets in source PRs, not by this repo's MD013 line-length rule. by @brettdavies in [#13](https://github.com/brettdavies/agentnative-skill/pull/13) +- README, AGENTS, CONTRIBUTING reframe the consumer/producer split from a directory boundary (`bundle/` vs everything + else) to an audience boundary (host reads `SKILL.md` + `bin/` + `spec/` + `references/` + `templates/` + `VERSION`; + ignores everything else). +- Spec content vendored under `spec/` re-vendored from `agentnative-spec` v0.2.0 to v0.3.0. All 7 principles flip + `status: draft` → `status: active` (P1–P7 are now the shipped baseline); prose tightened across P1 (TUI + parenthetical), P2 (sysexits acknowledgment), P4 (dependency-gating cleanup), P5 (`--dry-run` write-gate + retry + hedge), P6 (SIGPIPE language-neutral + global-flags behavioral lead), P7 (LLM-vs-non-LLM cost generalization). No + requirement IDs added/removed/renamed; no level changes. Full upstream context: agentnative `v0.3.0` CHANGELOG. by + @brettdavies in [#10](https://github.com/brettdavies/agentnative-skill/pull/10) +- `scripts/sync-spec.sh` no longer accepts `SPEC_REF`. The script always vendors the latest `v*` tag, queried from + `SPEC_REMOTE_URL` (default `https://github.com/brettdavies/agentnative.git`) via `git ls-remote --tags + --sort=-version:refname` and shallow-cloned for extraction. On any remote failure, falls back to the existing + `SPEC_ROOT`-based logic (default `$HOME/dev/agentnative-spec`). New env var `SPEC_REMOTE_URL` overrides the remote; + the temp clone is auto-cleaned on script exit via trap. by @brettdavies in + [#11](https://github.com/brettdavies/agentnative-skill/pull/11) +- `.markdownlint-cli2.yaml` excludes `CHANGELOG.md` from linting. Aligns its treatment with `spec/CHANGELOG.md` and + reflects that the file is regenerated by `scripts/generate-changelog.sh`, not hand-edited. Per-line content is + governed by PR-body bullets in source PRs, not by this repo's MD013 line-length rule. by @brettdavies in + [#13](https://github.com/brettdavies/agentnative-skill/pull/13) ### Fixed -- Harden `bundle/bin/check-update` against malformed local `VERSION` (apply SemVer regex; malformed → silent exit) and against curl failure being cached as UP_TO_DATE (skip cache write on network failure so the next invocation retries). by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) -- Align table pipes in `SKILL.md` and `getting-started.md` after the `bundle/` path strip (markdownlint MD060). MD060 isn't auto-fixable, so violations slipped past the local PostToolUse hook and surfaced in CI. by @brettdavies in [#9](https://github.com/brettdavies/agentnative-skill/pull/9) +- Harden `bundle/bin/check-update` against malformed local `VERSION` (apply SemVer regex; malformed → silent exit) and + against curl failure being cached as UP_TO_DATE (skip cache write on network failure so the next invocation retries). + by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) +- Align table pipes in `SKILL.md` and `getting-started.md` after the `bundle/` path strip (markdownlint MD060). MD060 + isn't auto-fixable, so violations slipped past the local PostToolUse hook and surfaced in CI. by @brettdavies in + [#9](https://github.com/brettdavies/agentnative-skill/pull/9) ### Documentation -- `README.md` — License section rewritten to reflect dual licensing and link both LICENSE files; tree row updated. by @brettdavies in [#6](https://github.com/brettdavies/agentnative-skill/pull/6) -- `CONTRIBUTING.md` — License section rewritten: contributions are dual-licensed at the consumer's option, no CLA, with an explicit pointer to the Apache §3 patent grant. -- `bundle/spec/README.md` licensing reference catches drift from PR #6: was "MIT-licensed", now reflects the actual dual MIT/Apache-2.0 posture introduced in `18836d8`. by @brettdavies in [#8](https://github.com/brettdavies/agentnative-skill/pull/8) -- `RELEASES.md` gains a `## Spec re-vendoring` section between `## Why branch from main, not dev` and `## Version bump procedure`, documenting the `scripts/sync-spec.sh` re-vendor step. The script auto-resolves the latest upstream tag from the remote, so no manual version selection is needed at re-vendor time. by @brettdavies in [#10](https://github.com/brettdavies/agentnative-skill/pull/10) -- `AGENTS.md` `## Spec sync` section: rewritten — single-step recipe (`scripts/sync-spec.sh` then review); notes the remote-first / local-fallback behavior and the `SPEC_REMOTE_URL` / `SPEC_ROOT` overrides. Commit-message example uses `` placeholder instead of a hard-coded version. by @brettdavies in [#11](https://github.com/brettdavies/agentnative-skill/pull/11) -- `spec/README.md` `## Resync` section: rewritten similarly; drops the manually-maintained `**Current snapshot:**` line and points readers at `spec/VERSION` (which `sync-spec.sh` writes verbatim from upstream). -- `RELEASES.md` post-merge sequence ends at the GitHub Release; replaces deleted step 5 with a one-liner pointing consumers at `bin/check-update`. +- `README.md`: License section rewritten to reflect dual licensing and link both LICENSE files; tree row updated. by + @brettdavies in [#6](https://github.com/brettdavies/agentnative-skill/pull/6) +- `CONTRIBUTING.md`: License section rewritten. Contributions are dual-licensed at the consumer's option, no CLA, with + an explicit pointer to the Apache §3 patent grant. +- `bundle/spec/README.md` licensing reference catches drift from PR #6: was "MIT-licensed", now reflects the actual dual + MIT/Apache-2.0 posture introduced in `18836d8`. by @brettdavies in + [#8](https://github.com/brettdavies/agentnative-skill/pull/8) +- `RELEASES.md` gains a `## Spec re-vendoring` section between `## Why branch from main, not dev` and `## Version bump + procedure`, documenting the `scripts/sync-spec.sh` re-vendor step. The script auto-resolves the latest upstream tag + from the remote, so no manual version selection is needed at re-vendor time. by @brettdavies in + [#10](https://github.com/brettdavies/agentnative-skill/pull/10) +- `AGENTS.md` `## Spec sync` section: rewritten as a single-step recipe (`scripts/sync-spec.sh` then review). Notes the + remote-first / local-fallback behavior and the `SPEC_REMOTE_URL` / `SPEC_ROOT` overrides. Commit-message example uses + `` placeholder instead of a hard-coded version. by @brettdavies in + [#11](https://github.com/brettdavies/agentnative-skill/pull/11) +- `spec/README.md` `## Resync` section: rewritten similarly; drops the manually-maintained `**Current snapshot:**` line + and points readers at `spec/VERSION` (which `sync-spec.sh` writes verbatim from upstream). +- `RELEASES.md` post-merge sequence ends at the GitHub Release; replaces deleted step 5 with a one-liner pointing + consumers at `bin/check-update`. ### Removed -- Remove `bundle/scripts/check-compliance.sh` and 24 `bundle/scripts/checks/check-*.sh` files (plus `_helpers.sh`). `anc audit --output json` is the canonical replacement. by @brettdavies in [#4](https://github.com/brettdavies/agentnative-skill/pull/4) -- Remove `bundle/references/principles-deep-dive.md` (419-line hand-typed paraphrase of the spec; canonical text now lives at `bundle/spec/principles/`). +- Remove `bundle/scripts/check-compliance.sh` and 24 `bundle/scripts/checks/check-*.sh` files (plus `_helpers.sh`). `anc + check --output json` is the canonical replacement. by @brettdavies in + [#4](https://github.com/brettdavies/agentnative-skill/pull/4) +- Remove `bundle/references/principles-deep-dive.md` (419-line hand-typed paraphrase of the spec; canonical text now + lives at `bundle/spec/principles/`). - Remove `bundle/checklists/new-tool.md` (pre-anc manual checklist; replaced by `bundle/getting-started.md`). -- All SHA-pin claims from public-facing markdown (`RELEASES.md`, `AGENTS.md`, `README.md`, `spec/README.md`, `CONTRIBUTING.md`): pipeline diagram's "site re-pins to commit SHA" step, the post-merge "site re-pins via its own PR" step, the `protect-tags.json` / `install endpoints` claims that tags are pinned to install endpoints, and the spec-vendor "pinned ref" / "pinned `SPEC_REF`" / "current pin is recorded" vocabulary across all docs. by @brettdavies in [#11](https://github.com/brettdavies/agentnative-skill/pull/11) +- All SHA-pin claims from public-facing markdown (`RELEASES.md`, `AGENTS.md`, `README.md`, `spec/README.md`, + `CONTRIBUTING.md`): pipeline diagram's "site re-pins to commit SHA" step, the post-merge "site re-pins via its own PR" + step, the `protect-tags.json` / `install endpoints` claims that tags are pinned to install endpoints, and the + spec-vendor "pinned ref" / "pinned `SPEC_REF`" / "current pin is recorded" vocabulary across all docs. by @brettdavies + in [#11](https://github.com/brettdavies/agentnative-skill/pull/11) **Full Changelog**: [v0.1.0...v0.2.0](https://github.com/brettdavies/agentnative-skill/compare/v0.1.0...v0.2.0) @@ -78,7 +269,7 @@ All notable changes to this project will be documented in this file. - `checklists/new-tool.md` — task checklist for starting a new agent-native CLI. - `references/` — five deep-dive references: principle specifications, framework idioms (Rust/clap and other languages), project structure, Rust/clap patterns. -- `scripts/check-compliance.sh` — automated compliance auditor that produces deterministic pass/warn/fail scorecards +- `scripts/check-compliance.sh` — automated compliance checker that produces deterministic pass/warn/fail scorecards across 24 checks in 9 groups. - `scripts/checks/` — individual check scripts plus shared `_helpers.sh`. - `templates/` — starter files: `AGENTS.md`, `clap-main.rs`, `error-types.rs`, `output-format.rs`. diff --git a/VERSION b/VERSION index 0ea3a94..8f0916f 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.2.0 +0.5.0 From 0634316a2c8c98dd00687ec93a28dac89700588d Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:54:35 -0500 Subject: [PATCH 12/15] feat(prose-tooling): move sync script to dev-only (#24) ## Summary Move `scripts/sync-prose-tooling.sh` to dev-only. The script vendors `BRAND.md` from `agentnative-spec` and is a producer-side dev convenience, not part of the shipped bundle. Mirrors the agentnative-site PR #132 pattern. The merged result vs `dev` (before this PR): - The workflow guard's `extra_paths` now includes `scripts/sync-prose-tooling.sh`. Any future PR that adds or modifies the script in a release branch fails the guard. - `RELEASES.md` gains a `### Dev-direct exception` subsection that documents the two categories of dev-direct change: engineering docs and the prose-tooling vendoring vehicle. - `PRODUCT.md`, `AGENTS.md`, and `README.md` are reframed so the in-tree references to the script are accurate (named as a dev-only sync script rather than linked twice as if it shipped to consumers). `BRAND.md` itself still ships to `main`; consumers read it for skill voice. The script that vendors it does not. ## Changelog ### Changed - `.github/workflows/guard-main-docs.yml`: pass `extra_paths: 'scripts/sync-prose-tooling.sh'` to the reusable guard workflow. Future PRs to `main` that add or modify the script fail the check. - `RELEASES.md`: add a `### Dev-direct exception` subsection under `## Daily development` that names engineering docs and the prose-tooling vendoring vehicle as the two categories that commit directly to `dev` without the feature-branch + PR flow. - `PRODUCT.md`: reframe the `BRAND.md` inheritance text to name a "dev-only sync script" rather than linking the in-tree path twice. - `AGENTS.md`: align the Voice-and-prose-rules section with the same framing. - `README.md`: annotate the repo-layout entry for the script as `(dev-only; guarded off main)`. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) ## Files Modified **Modified:** `.github/workflows/guard-main-docs.yml`, `RELEASES.md`, `PRODUCT.md`, `AGENTS.md`, `README.md` ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Related PRs - agentnative-site #132 (the prototype pattern) ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] No new warnings or errors introduced --- .github/workflows/guard-main-docs.yml | 2 ++ AGENTS.md | 4 ++-- PRODUCT.md | 9 +++++---- README.md | 2 +- RELEASES.md | 16 ++++++++++++++++ 5 files changed, 26 insertions(+), 7 deletions(-) diff --git a/.github/workflows/guard-main-docs.yml b/.github/workflows/guard-main-docs.yml index eeb3b61..92c8872 100644 --- a/.github/workflows/guard-main-docs.yml +++ b/.github/workflows/guard-main-docs.yml @@ -15,3 +15,5 @@ permissions: jobs: guard-docs: uses: brettdavies/.github/.github/workflows/guard-main-docs.yml@main + with: + extra_paths: 'scripts/sync-prose-tooling.sh' diff --git a/AGENTS.md b/AGENTS.md index 5947b7e..43f087a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -49,8 +49,8 @@ tooling agree. ## Voice and prose rules -Channel-specific design context lives in [`PRODUCT.md`](PRODUCT.md). It inherits from [`BRAND.md`](BRAND.md) (vendored -from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh)). Read both before +Channel-specific design context lives in [`PRODUCT.md`](PRODUCT.md). It inherits from [`BRAND.md`](BRAND.md), which is +vendored from `agentnative-spec` by a dev-only sync script (kept off `main` by the workflow guard). Read both before authoring skill-bundle prose (`SKILL.md`, `getting-started.md`, `references/`, `templates/`). ## Spec sync diff --git a/PRODUCT.md b/PRODUCT.md index 37faad5..c2ffc9c 100644 --- a/PRODUCT.md +++ b/PRODUCT.md @@ -2,16 +2,17 @@ Channel-specific design context for the **skill bundle channel** of agentnative. Inherits the shared identity, voice anchor, audiences, and universal anti-patterns from [`BRAND.md`](BRAND.md), vendored from -[`agentnative-spec`](https://github.com/brettdavies/agentnative) via -[`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh). Read that first. +[`agentnative-spec`](https://github.com/brettdavies/agentnative) via a dev-only vendoring script (kept off `main` by the +workflow guard). Read `BRAND.md` first. ## Inheritance The skill bundle channel sits in a three-tier waterfall. Each tier owns a different concern; nothing duplicates. 1. **Universal layer: [`BRAND.md`](BRAND.md).** Shared identity, voice anchor, audiences, universal anti-patterns. - Vendored from `agentnative-spec` via [`scripts/sync-prose-tooling.sh`](scripts/sync-prose-tooling.sh) (parallel to - [`scripts/sync-spec.sh`](scripts/sync-spec.sh), which vendors `spec/principles/`). Read that first. + Vendored from `agentnative-spec` by a dev-only sync script (kept off `main` by the workflow guard; parallel to + [`scripts/sync-spec.sh`](scripts/sync-spec.sh), which vendors `spec/principles/` and does ship to `main`). Read + `BRAND.md` first. 2. **Channel delta: this file (`PRODUCT.md`).** Instructional voice, second-person imperative allowed (the bundle teaches agents how to invoke `anc`), terse and agent-loadable. The narrative companion to the shipped bundle content at the repo root. diff --git a/README.md b/README.md index 1ee5de0..e404444 100644 --- a/README.md +++ b/README.md @@ -26,7 +26,7 @@ agentnative-skill/ ├── VERSION single-line current version (read by bin/check-update) ├── scripts/ │ ├── sync-spec.sh vendor the latest agentnative-spec v* tag into spec/ -│ ├── sync-prose-tooling.sh vendor BRAND.md from agentnative-spec main HEAD +│ ├── sync-prose-tooling.sh vendor BRAND.md from agentnative-spec main HEAD (dev-only; guarded off main) │ ├── sync-dev-after-release.sh post-release backport: replay release/* artifacts onto dev │ ├── generate-changelog.sh release-time CHANGELOG generator (git-cliff + PR-body extraction) │ └── hooks/pre-push local CI mirror (markdownlint + shellcheck), installed via core.hooksPath diff --git a/RELEASES.md b/RELEASES.md index a043f33..bc320ee 100644 --- a/RELEASES.md +++ b/RELEASES.md @@ -38,6 +38,22 @@ gh pr create --base dev --title "feat(scope): what changed" user-facing release notes; `CHANGELOG.md` entries derive from it directly. See [§ PR body](#pr-body). - **PR body prose scrub**: see [§ Prose scrubbing](#prose-scrubbing). +### Dev-direct exception + +Two categories of change commit directly to `dev` without going through the feature-branch + PR flow: + +- **Engineering docs**: `docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`. These live on `dev` + only; `guard-main-docs.yml` blocks them from reaching `main` so a release branch's cherry-pick from `dev` naturally + excludes them. +- **Prose-tooling vendoring vehicle**: `scripts/sync-prose-tooling.sh`. The script vendors `BRAND.md` from + `agentnative-spec` and is a producer-side dev convenience, not part of the shipped bundle. The workflow guard's + `extra_paths` list keeps it off `main`; future cherry-picks that try to bring it onto a `release/*` branch will fail + the guard. `BRAND.md` itself still ships to `main` (consumers read it), but the script that vendors it does not. + +Everything else (consumer-facing markdown like `README`, `AGENTS`, `CONTRIBUTING`, `CHANGELOG`, the skill bundle content +under `SKILL.md` / `getting-started.md` / `spec/` / `references/` / `templates/`, and any in-repo runbook) goes through +the standard feature-branch + PR flow. + ## PR body Every PR (feature, fix, docs, release) uses `.github/pull_request_template.md` verbatim. From 65cc3c02c2ac86e3a1fda93ae56bbac7369d6827 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 12:59:52 -0500 Subject: [PATCH 13/15] feat(scripts): harden sync-dev-after-release.sh with two preconditions (#25) ## Summary Hardens `scripts/sync-dev-after-release.sh` with two preconditions that catch release-flow drift the existing checks miss. Adds a `--dry-run` flag plus duplicate-section guard to `scripts/generate-changelog.sh` to enable the regen check. The merged result vs `dev`: - `sync-dev-after-release.sh` verifies the GitHub Release for `$VERSION` is published (not draft, not missing) before backporting. Tag-reachable-from-main is necessary but not sufficient; consumers see the new version via `gh release`, so the release artifact must actually exist. - `sync-dev-after-release.sh` runs `generate-changelog.sh --dry-run --tag $VERSION` after the backport commit and warns when PR bodies have drifted from main's CHANGELOG.md. Drift is non-fatal (the backport is still correct against current `main`) but flagged so a follow-up release branch can regenerate cleanly. - `generate-changelog.sh` accepts `--dry-run`. It stashes CHANGELOG.md, runs the normal flow in place, prints a unified diff to stderr if the regenerated content differs, restores the original on EXIT trap, and exits 1 on drift. - `generate-changelog.sh` skips the prepend when CHANGELOG.md already has a section for the current tag. Mirrors `agentnative-cli` PR #68's duplicate-section guard. Previously, re-running on an already-published tag emitted a second copy of the section and an empty compare link. ## Changelog ### Added - `scripts/generate-changelog.sh`: `--dry-run` flag prints a unified diff of what regeneration would change without modifying `CHANGELOG.md`. Exits 0 when the file is idempotent vs current PR bodies, exits 1 on drift. - `scripts/sync-dev-after-release.sh`: GitHub Release published-state precondition via `gh release view --json isDraft`. Exits 67 when the release is missing or draft. - `scripts/sync-dev-after-release.sh`: post-sync regen-idempotency check via `generate-changelog.sh --dry-run`. Warns (does not fail) when PR bodies have drifted from main's `CHANGELOG.md`. ### Fixed - `scripts/generate-changelog.sh` no longer prepends a duplicate section when `CHANGELOG.md` already has one for the current tag. Mirrors `agentnative-cli` PR #68. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) ## Files Modified **Modified:** `scripts/generate-changelog.sh`, `scripts/sync-dev-after-release.sh` ## Breaking Changes - [x] No breaking changes ## Deployment Notes - [x] No special deployment steps required ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow Conventional Commits - [x] Self-review of code completed - [x] No new warnings or errors introduced --- scripts/generate-changelog.sh | 77 +++++++++++++++++++++++-------- scripts/sync-dev-after-release.sh | 41 ++++++++++++++++ 2 files changed, 99 insertions(+), 19 deletions(-) diff --git a/scripts/generate-changelog.sh b/scripts/generate-changelog.sh index b111442..0bcf8b0 100755 --- a/scripts/generate-changelog.sh +++ b/scripts/generate-changelog.sh @@ -4,10 +4,14 @@ # Usage: # generate-changelog.sh [--tag vX.Y.Z] [repo-path] # generate-changelog.sh --check [repo-path] +# generate-changelog.sh --dry-run [--tag vX.Y.Z] [repo-path] # # Options: # --tag vX.Y.Z Override version tag (default: extracted from branch name) # --check Verify CHANGELOG.md has a versioned section (exit 1 if only [Unreleased]) +# --dry-run Print what would change to stdout; do not modify CHANGELOG.md. +# Exit 0 if no changes (idempotent), exit 1 if regeneration would +# produce different content (drift between PR bodies and the file). # # The version tag is extracted from the branch name by matching the pattern # release/vN.N.N (with optional suffix like release/v1.0.5:ci-migration). @@ -25,6 +29,7 @@ set -euo pipefail CHECK_MODE=false +DRY_RUN=false REPO_PATH="." TAG="" @@ -34,6 +39,10 @@ while [[ $# -gt 0 ]]; do CHECK_MODE=true shift ;; + --dry-run) + DRY_RUN=true + shift + ;; --tag) TAG="$2" shift 2 @@ -99,14 +108,39 @@ if [[ -z "${GITHUB_TOKEN:-}" ]]; then fi fi +# Dry-run mode: stash CHANGELOG.md, restore on exit, diff at the end. +if $DRY_RUN; then + if [[ ! -f CHANGELOG.md ]]; then + echo "error: --dry-run requires an existing CHANGELOG.md to compare against" >&2 + exit 1 + fi + ORIG_CHANGELOG="$(mktemp -t changelog-orig.XXXXXX)" + cp CHANGELOG.md "$ORIG_CHANGELOG" + trap 'mv "$ORIG_CHANGELOG" CHANGELOG.md' EXIT +fi + +# Duplicate-section guard: skip prepend if CHANGELOG.md already has a section +# for this tag. Re-running with an already-published tag previously emitted a +# second copy of the same section. +DUPLICATE_SECTION=false +if [[ -f CHANGELOG.md ]] && grep -qE "^## \[${TAG#v}\]" CHANGELOG.md; then + DUPLICATE_SECTION=true + if ! $DRY_RUN; then + echo "CHANGELOG.md already has a [${TAG#v}] section; skipping prepend" + exit 0 + fi +fi + # Step 1: Run git-cliff to prepend entries tagged with the release version -CLIFF_ARGS=(--unreleased --tag "$TAG") -if [[ -f CHANGELOG.md ]]; then - CLIFF_ARGS+=(--prepend CHANGELOG.md) -else - CLIFF_ARGS+=(-o CHANGELOG.md) +if ! $DUPLICATE_SECTION; then + CLIFF_ARGS=(--unreleased --tag "$TAG") + if [[ -f CHANGELOG.md ]]; then + CLIFF_ARGS+=(--prepend CHANGELOG.md) + else + CLIFF_ARGS+=(-o CHANGELOG.md) + fi + git cliff "${CLIFF_ARGS[@]}" fi -git cliff "${CLIFF_ARGS[@]}" # Step 2: Expand squash commit entries using PR body changelog sections OWNER=$(awk -F'"' '/^\[remote\.github\]/{found=1} found && /^owner/{print $2; exit}' cliff.toml) @@ -115,13 +149,27 @@ REPO=$(awk -F'"' '/^\[remote\.github\]/{found=1} found && /^repo/{print $2; exit # Strip leading v for version matching in the changelog VERSION="${TAG#v}" -if [[ -z "$OWNER" || -z "$REPO" ]] || ! command -v gh &>/dev/null; then - echo "Updated CHANGELOG.md (skipping PR expansion — missing [remote.github] or gh CLI)" +summarize_and_exit() { + local msg="$1" + if $DRY_RUN; then + if cmp -s "$ORIG_CHANGELOG" CHANGELOG.md; then + echo "DRY RUN: CHANGELOG.md is current (no regen drift)" + exit 0 + fi + echo "DRY RUN: CHANGELOG.md would change (regen drift detected)" >&2 + diff -u "$ORIG_CHANGELOG" CHANGELOG.md || true + exit 1 + fi + echo "$msg" echo "" echo "Next steps:" echo " git add CHANGELOG.md" echo " git commit -m 'docs: update CHANGELOG.md'" exit 0 +} + +if [[ -z "$OWNER" || -z "$REPO" ]] || ! command -v gh &>/dev/null; then + summarize_and_exit "Updated CHANGELOG.md (skipping PR expansion — missing [remote.github] or gh CLI)" fi # Extract PR numbers from the new version section only @@ -135,12 +183,7 @@ VERSION_SECTION=$(awk -v ver="$VERSION" ' PR_NUMBERS=$(echo "$VERSION_SECTION" | grep -oP '\(#\K\d+' | sort -un) if [[ -z "$PR_NUMBERS" ]]; then - echo "Updated CHANGELOG.md" - echo "" - echo "Next steps:" - echo " git add CHANGELOG.md" - echo " git commit -m 'docs: update CHANGELOG.md'" - exit 0 + summarize_and_exit "Updated CHANGELOG.md" fi # Pass PR numbers as comma-separated arg to python @@ -311,8 +354,4 @@ with open(changelog_path, 'w') as f: f.write(new_content) PYEOF -echo "Updated CHANGELOG.md" -echo "" -echo "Next steps:" -echo " git add CHANGELOG.md" -echo " git commit -m 'docs: update CHANGELOG.md'" +summarize_and_exit "Updated CHANGELOG.md" diff --git a/scripts/sync-dev-after-release.sh b/scripts/sync-dev-after-release.sh index ac1adce..9221b5a 100755 --- a/scripts/sync-dev-after-release.sh +++ b/scripts/sync-dev-after-release.sh @@ -59,6 +59,31 @@ if ! git merge-base --is-ancestor "$TAG_SHA" origin/main; then exit 66 fi +# Verify the GitHub Release exists and is not still a draft. The tag can exist +# (above check) while the GitHub Release was never created (or stayed draft), +# in which case consumers won't see the new version via `gh release` and the +# backport is premature. +if command -v gh >/dev/null 2>&1; then + is_draft="$(gh release view "$VERSION" --json isDraft --jq .isDraft 2>/dev/null || true)" + case "$is_draft" in + false) + ;; + true) + echo "error: GitHub Release $VERSION is still draft -- publish it first" >&2 + exit 67 + ;; + "") + echo "error: no GitHub Release for $VERSION -- create it with 'gh release create $VERSION'" >&2 + exit 67 + ;; + *) + echo "warning: unexpected isDraft value '$is_draft' for $VERSION -- proceeding" >&2 + ;; + esac +else + echo "warning: gh not on PATH -- skipping GitHub Release published-state check" >&2 +fi + git switch dev git pull --ff-only origin dev @@ -80,4 +105,20 @@ Brings dev's release-bookkeeping current with the $VERSION release on main: VERSION bumped to ${VERSION_NO_V} and CHANGELOG.md copied verbatim from origin/main." +# Post-sync sanity check: re-running generate-changelog.sh against the current +# PR bodies should produce an identical CHANGELOG.md. Drift here means upstream +# PR bodies were edited after main's CHANGELOG.md was generated -- the +# backport brought the stale CHANGELOG over, and a future release-branch +# regen will surface unexpected diffs. Warn, do not fail; the backport is +# still correct against what main currently has. +if [[ -x scripts/generate-changelog.sh ]] && command -v git-cliff >/dev/null 2>&1; then + if scripts/generate-changelog.sh --dry-run --tag "$VERSION" >/dev/null 2>&1; then + echo "regen check: CHANGELOG.md matches what PR bodies would produce" + else + echo "warning: PR bodies have drifted from main's CHANGELOG.md for $VERSION" >&2 + echo " re-run 'scripts/generate-changelog.sh --dry-run --tag $VERSION' to see the diff" >&2 + echo " fix by regenerating CHANGELOG.md on a follow-up release branch" >&2 + fi +fi + echo "committed; push with: git push origin dev" From a9af2bc7d5fd0a284d23691157d667ce47a02c20 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 13:03:25 -0500 Subject: [PATCH 14/15] chore(release): regenerate v0.5.0 CHANGELOG with #24 + #25 from PR bodies The prose-tooling dev-only move (#24) and the sync-dev-after-release.sh hardening (#25) merged to dev after release/v0.5.0 was opened. Both cherry-pick cleanly. This commit regenerates the v0.5.0 section via `scripts/generate-changelog.sh` so the bullets come from the upstream PR bodies rather than being hand-edited. Procedure: delete `[0.5.0]` + filler `[0.4.0]` / `[0.3.0]` sections, re-run the script (which now finds the two newly cherry-picked PRs in git history), then re-add the filler sections. The `[0.2.0]` and `[0.1.0]` sections are untouched. --- CHANGELOG.md | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index f2c8c52..1eeaf38 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -39,6 +39,13 @@ All notable changes to this project will be documented in this file. Python (Click) workflows. by @brettdavies in [#21](https://github.com/brettdavies/agentnative-skill/pull/21) - Document `anc skill install --all` and `anc skill update [host|--all]` in the install section. - Document `anc emit schema` for extracting the scorecard JSON Schema embedded in the binary. +- `scripts/generate-changelog.sh`: `--dry-run` flag prints a unified diff of what regeneration would change without + modifying `CHANGELOG.md`. Exits 0 when the file is idempotent vs current PR bodies, exits 1 on drift. by @brettdavies + in [#25](https://github.com/brettdavies/agentnative-skill/pull/25) +- `scripts/sync-dev-after-release.sh`: GitHub Release published-state precondition via `gh release view --json isDraft`. + Exits 67 when the release is missing or draft. +- `scripts/sync-dev-after-release.sh`: post-sync regen-idempotency check via `generate-changelog.sh --dry-run`. Warns + (does not fail) when PR bodies have drifted from main's `CHANGELOG.md`. ### Changed @@ -77,6 +84,16 @@ All notable changes to this project will be documented in this file. - Track `anc` v0.5.0 scorecard surface: schema 0.7, per-row `id` / `audit_id` / `tier` fields, `opt_out` and `n_a` statuses, 70% badge floor. - Surface new top-level flags: `--examples`, `--json`, `--raw`, `--color`, `--verbose`. +- `.github/workflows/guard-main-docs.yml`: pass `extra_paths: 'scripts/sync-prose-tooling.sh'` to the reusable guard + workflow. Future PRs to `main` that add or modify the script fail the check. by @brettdavies in + [#24](https://github.com/brettdavies/agentnative-skill/pull/24) +- `RELEASES.md`: add a `### Dev-direct exception` subsection under `## Daily development` that names engineering docs + and the prose-tooling vendoring vehicle as the two categories that commit directly to `dev` without the feature-branch + + PR flow. +- `PRODUCT.md`: reframe the `BRAND.md` inheritance text to name a "dev-only sync script" rather than linking the in-tree + path twice. +- `AGENTS.md`: align the Voice-and-prose-rules section with the same framing. +- `README.md`: annotate the repo-layout entry for the script as `(dev-only; guarded off main)`. ### Fixed @@ -87,6 +104,9 @@ All notable changes to this project will be documented in this file. [#21](https://github.com/brettdavies/agentnative-skill/pull/21) - Clarify that `badge.score_pct` is computed from behavioral-layer rows only. Source- and project-layer audits do not affect the score. +- `scripts/generate-changelog.sh` no longer prepends a duplicate section when `CHANGELOG.md` already has one for the + current tag. Mirrors `agentnative-cli` PR #68. by @brettdavies in + [#25](https://github.com/brettdavies/agentnative-skill/pull/25) ### Documentation From cd50ca24a6c1ef0abccac1889bd43bfebc4b6339 Mon Sep 17 00:00:00 2001 From: Brett Date: Mon, 1 Jun 2026 13:08:45 -0500 Subject: [PATCH 15/15] chore(release): drop scripts/sync-prose-tooling.sh per the #24 guard PR #24 added `scripts/sync-prose-tooling.sh` to the workflow guard's `extra_paths`, making it a dev-only artifact. The file rode onto this release branch via PR #17's cherry-pick (which predates the guard). The guard now correctly fails the PR-to-main because the file is being added. Drop the file on the release branch only. It stays on `dev` as the vendoring vehicle for `BRAND.md`; `BRAND.md` itself still ships to `main` (consumers read it). --- scripts/sync-prose-tooling.sh | 128 ---------------------------------- 1 file changed, 128 deletions(-) delete mode 100755 scripts/sync-prose-tooling.sh diff --git a/scripts/sync-prose-tooling.sh b/scripts/sync-prose-tooling.sh deleted file mode 100755 index 905a476..0000000 --- a/scripts/sync-prose-tooling.sh +++ /dev/null @@ -1,128 +0,0 @@ -#!/usr/bin/env bash -# Vendor the cross-channel prose source-of-truth shipped on agentnative-spec. -# -# Parallel sync vehicle to scripts/sync-spec.sh, decoupled because the brand -# narrative (BRAND.md) and the contract artifacts (principles/, VERSION, -# CHANGELOG.md) release on different cadences. sync-spec.sh covers the -# contract anc lints against; this script covers the universal voice anchor -# that PRODUCT.md inherits from. -# -# Vendored manifest (paths at spec main HEAD, mirrored verbatim into this -# repo at the same paths): -# -# BRAND.md (universal voice SoT) -# -# BRAND.md only, by design. The skill repo does not run prose-check today -# (no .vale.ini, no styles/ tree), so the broader prose-tooling stack the -# spec ships (Vale rule packs, vocabulary, prose-check.sh orchestrator, -# generate-pack-readme.mjs) is omitted here. When the skill activates Vale, -# extend the vendored manifest below to mirror the site's -# scripts/sync-prose-tooling.sh: add styles/brand/*.yml + README.md, -# styles/config/vocabularies/brand/{accept,reject}.txt, scripts/prose-check.sh, -# and scripts/generate-pack-readme.mjs to required_paths and the extract block. -# -# Tracks `main` HEAD by design: the prose-tooling stack (BRAND.md and, -# once expanded, Vale packs / vocabularies / prose-check.sh) iterates -# faster than the principle contract and does not need release-tag -# ceremony. Tag-pinning is reserved for the principle contract via -# `sync-spec.sh`. Resolves the current commit on `main`, preferring the -# remote repository, and falls back to a local checkout if the remote is -# unreachable. Extracts files via `git show :` so neither -# checkout's working tree is perturbed. -# -# Usage: -# scripts/sync-prose-tooling.sh -# SPEC_ROOT=/path/to/agentnative-spec scripts/sync-prose-tooling.sh -# SPEC_REMOTE_URL=git@github.com:brettdavies/agentnative.git scripts/sync-prose-tooling.sh -# -# Env vars (shared with sync-spec.sh): -# SPEC_REMOTE_URL Remote URL to query first. -# Default: https://github.com/brettdavies/agentnative.git -# SPEC_ROOT Local checkout to fall back to when the remote is -# unreachable. Default: $HOME/dev/agentnative-spec -# -# Resync cadence: rerun whenever the spec's `main` advances with changes -# under the vendored manifest (today: BRAND.md). Tracks `main` HEAD by -# design; tag-pinning is for the principle contract via `sync-spec.sh`. -# Spec's `repository_dispatch:spec-release` event fires on tag publish; a -# consumer-side handler that auto-PRs the resync is tracked as deferred -# follow-up alongside the same handler for sync-spec.sh. -# -# Idempotent at a fixed upstream sha: re-running with no upstream change -# produces no `git diff`. - -set -euo pipefail - -SPEC_REMOTE_URL="${SPEC_REMOTE_URL:-https://github.com/brettdavies/agentnative.git}" -SPEC_ROOT="${SPEC_ROOT:-$HOME/dev/agentnative-spec}" - -REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)" - -# Cleanup hook for the temp clone (set only after mktemp succeeds). -tmp_root="" -cleanup() { - if [[ -n "$tmp_root" && -d "$tmp_root" ]]; then - rm -rf "$tmp_root" - fi -} -trap cleanup EXIT - -# === Remote-first resolution =========================================== -spec_source="" -spec_ref="" -resolved_sha="" - -echo "querying $SPEC_REMOTE_URL for main HEAD..." -remote_sha="$(git ls-remote "$SPEC_REMOTE_URL" 'refs/heads/main' 2>/dev/null | awk '{print $1}')" - -if [[ -n "$remote_sha" ]]; then - tmp_root="$(mktemp -d -t agentnative-prose-XXXXXX)" - if git clone --depth 1 --branch main --quiet \ - "$SPEC_REMOTE_URL" "$tmp_root" 2>/dev/null; then - spec_source="$tmp_root" - spec_ref="main" - resolved_sha="$(git -C "$spec_source" rev-parse --short=7 main)" - echo "pulling from main @ $resolved_sha (remote $SPEC_REMOTE_URL)" - fi -fi - -# === Local fallback ==================================================== -if [[ -z "$spec_source" ]]; then - if [[ ! -d "$SPEC_ROOT/.git" ]]; then - echo "error: remote unreachable and SPEC_ROOT is not a git repository: $SPEC_ROOT" >&2 - echo " remote: $SPEC_REMOTE_URL" >&2 - echo " set SPEC_ROOT to your agentnative-spec checkout, or check network access." >&2 - exit 1 - fi - echo "warning: remote query failed; falling back to local $SPEC_ROOT" >&2 - - spec_source="$SPEC_ROOT" - if ! git -C "$spec_source" rev-parse --verify --quiet main >/dev/null; then - echo "error: no local main branch found in $SPEC_ROOT" >&2 - echo " try \`git -C $SPEC_ROOT fetch origin main:main\` to track upstream" >&2 - exit 1 - fi - spec_ref="main" - resolved_sha="$(git -C "$spec_source" rev-parse --short=7 main)" - echo "pulling from main @ $resolved_sha (local $spec_source)" -fi - -# === Verify expected paths exist at main =============================== -required_paths=( - "BRAND.md" -) -for path in "${required_paths[@]}"; do - if ! git -C "$spec_source" cat-file -e "$spec_ref:$path" 2>/dev/null; then - echo "error: main @ $resolved_sha is missing required path: $path" >&2 - echo " (BRAND.md may not have landed on main yet)" >&2 - exit 1 - fi -done - -# === Extract: top-level singletons ===================================== -git -C "$spec_source" show "$spec_ref:BRAND.md" >"$REPO_ROOT/BRAND.md" - -# === Report ============================================================ -echo "wrote BRAND.md to repo root (pulled from main @ $resolved_sha)" -echo -echo "next: review \`git diff\` for unexpected changes, then commit."