A TOML + memory format that lets AI coding agents cite repo-specific rules and decisions — and detect when that knowledge has gone stale.
AI coding agents have a recurring failure mode: they re-derive the same project facts every session, miss system-level rules that don't grep cleanly, and trust documentation that has silently drifted from the code.
Anatomy is a small, machine-readable corpus you commit to the repo so agents stop guessing. It has two files:
.anatomy— repository identity along four pillars (Stack, Form, Domain, Function) plus the uncapturable knowledge an agent can't infer from source: rules, flows, and decisions..anatomy-memory— an append-only log of lived experience (gotchas, decisions, conventions, attempts) paired to the.anatomyby fingerprint.
Every read pins to a git commit, so consumers can tell when the knowledge no
longer matches HEAD instead of trusting it blindly.
- What it looks like
- What it buys you
- Install
- Quick start
- Format
- Versions & status
- Packages
- Conformance fixtures
- Documentation
- Contributing
- License
This repository describes itself with an .anatomy file. A trimmed excerpt:
anatomy_version = "0.13"
tagline = "TOML + memory format that lets AI agents cite repo-specific rules/decisions and detect their own staleness."
[identity]
stack = "javascript"
form = "monorepo"
domain = "repo-metadata"
function = "ai-context-format"
fingerprint = "jcevybzm4r897e6rhe11"
[[rules]]
rule = "Hand-roll TOML output when section order matters; do not use smol-toml.stringify"
why = "smol-toml does not preserve insertion order; section order is normative per spec section 5"
[[decisions]]
topic = "v0.3 is an ecosystem release, not a wire version"
reason = "v0.3 added cascading discovery + merge semantics for multi-.anatomy repos but did not change the per-file format…"
[generated]
at = 2026-05-17T04:53:13.000Z
commit = "948fe0b" # every read pins here; consumers detect drift vs HEAD
by = "anatomy-cli@0.13.0"An agent reading this can cite the exact rule and decision rather than
re-deriving them — and knows to flag the file if commit has fallen behind
HEAD.
Measured across two independent N=20 runs on a fixed, commit-pinned slate
(4 repos each of JS/TS, Python, Rust, Go, and C/Ruby/Shell/C++),
anatomy generate --ai vs. an /init-style Claude agent producing the
equivalent AGENTS.md: a 2026-05-17 canonical run and a 2026-05-18 Sonnet
replication. Figures below are the replication run (median, [p25–p75] where
it matters); the canonical run agrees on every large effect.
| anatomy | /init-style |
|
|---|---|---|
| Generation cost | $0.115 [0.11–0.14] | $0.448 [0.41–0.48] |
| Input tokens | 33k | 764k |
| Output tokens | 3.2k | 5.5k |
| Wall clock | 56 s | 172 s |
| Tool calls | 0 | 32 |
| Artifact size | 1 027 tok [919–1 134] | 1 964 tok [1 717–2 306] |
| Accuracy | 83% [83–100] | 67% [63–88] |
| Coverage of README facts | 33% | 38% |
| Specificity | 54% | 84% |
| Succeeded | 20 / 20 | 20 / 20 |
What each metric means (which direction is better, and the result here):
- Generation cost — USD for the single model call (anatomy) or agent
session (
/init) that produces the artifact. Lower better. anatomy ≈ ¼. - Input tokens — tokens fed to the model to build the artifact, cache included; drives cost and context pressure. Lower better. anatomy ≈ 23× fewer: one bounded prompt over a deterministic digest vs. ingesting the repo.
- Output tokens — tokens the model generates. Lower better at equal quality. anatomy ≈ 40% fewer (3.2k vs 5.5k).
- Wall clock — elapsed time to produce the artifact. Lower better. anatomy ≈ 3× faster — no exploration loop.
- Tool calls — file reads/greps the agent runs while exploring. Fewer
better (each is latency + tokens). anatomy 0 (Pass-1 is deterministic);
/inita median 32. - Artifact size — tokens in the produced
AGENTS.mda consumer reloads every session. Smaller better at equal information (recurring context cost). anatomy ≈ half. - Specificity — share of statements carrying a concrete identifier
(file / symbol / flag). Higher = denser.
/initwins — free-form prose name-drops more. - Coverage — share of the repo's own README facts the artifact restates.
Higher = broader.
/initwins; anatomy's fixed schema captures fewer onboarding facts by design. - Accuracy — share of sampled artifact claims a judge verifies true against the source. Higher better. Soft metric (3 claims/cell, one LLM judge): read as a direction, not a precise number — see below.
- Succeeded — cells that produced a valid artifact with no error. Higher better. 20 / 20 for both methods, in both runs.
The honest read, both ways. anatomy's artifact is ~3.9× cheaper, ~23×
fewer input tokens, ~2× smaller, ~3× faster, and needs zero exploration tool
calls — and these efficiency effects replicated across both N=20 runs.
Accuracy is the soft, run-sensitive metric: the methods tied (91.7 /
91.7) in the canonical run and anatomy led (83 / 67) here, with anatomy ≥
the baseline in 19 / 20 repos — so the defensible claim is "at least as
accurate as the exploration baseline, never worse," not a fixed margin. The
trade-off is real and also replicated: the /init-style agent covers more
of each repo's own README facts (38% vs 33%) and is far more identifier-dense
(84% vs 54%) — by design, since anatomy's fixed schema deliberately omits
metadata an agent can re-derive from source. anatomy degrades most on
non-mainstream stacks. The pitch is "a much cheaper, smaller,
at-least-as-accurate repo digest," not "a more complete one."
- Self-detected staleness. Every read pins to a git commit, so consumers
see drift between
.anatomy.generated.commitandHEADand can react — a structural property of the format, independent of the comparison above.
A separate, smaller N=3 consumer eval measured citation reliability: agents cited specific
.anatomy/memory entries in 24/27 treatment trials vs 0/27 baseline. It measures task-time citation behaviour — a different axis from the generation-cost comparison above, complementary to it, not part of it.
npm install -g @anatomytool/cli # provides `anatomy` (and `anatomy-cli`)
# or, without installing:
npx @anatomytool/cli --helpRequires Node.js ≥ 22.
To work on the spec and conformance fixtures, or build the CLI from source:
git clone https://github.com/0xHayd3n/anatomy
cd anatomy
npm install
npm run validate # full content-integrity check (see below)anatomy generate # Pass 1: starter .anatomy from manifest + README + dirs; also writes AGENTS.md
anatomy generate --ai # Pass 2: enrich the human-knowledge fields via an AI provider
anatomy validate # validate .anatomy (and a sibling .anatomy-memory if present)
anatomy mcp # serve it to agents over MCP (or: anatomy hook)A generated .anatomy is TOML you are expected to hand-edit — Pass 1
fills what it can deterministically and leaves # TODO markers for the
human-knowledge fields. The full command reference lives in
anatomy-cli/README.md, kept in sync with
anatomy --help and intentionally not duplicated here.
.anatomy files are TOML 1.0, UTF-8. The top level is grouped:
[identity] and [generated] are required; [operation] and [substance]
are optional groups for AI-grade per-repo context.
.anatomy-memory files are also TOML 1.0, UTF-8, with a two-line header
(anatomy_memory_version, repo_fingerprint) followed by [[entries]]
blocks. Append-only by design — entries are superseded, never rewritten.
npm run validate runs the full content-integrity check: every schema is
valid JSON Schema; every recommended-stacks file validates against its
meta-schema; every valid/* fixture parses and validates with correct
canonical-form hashes; every invalid/* fixture fails with the expected
errors (or is a documented schema_can_detect: false boundary case);
valid-with-warnings/* fixtures validate cleanly with their expected warning
surface; and canonicalization cases produce the documented strings and hashes.
The normative version index is spec/CURRENT.md. Current state:
| Surface | Latest | Notes |
|---|---|---|
.anatomy file format |
v1.0 | Stabilization of v0.15 — structurally identical; the 0→1 bump is a stability commitment, not a breaking change. v0.1–v0.15 remain valid, declared via anatomy_version. |
| Ecosystem | v0.3 | Cascading-aware multi-.anatomy repos. An ecosystem (validator + cascading) release — the per-file wire format is unchanged from v0.2. |
.anatomy-memory |
v0.2 | v0.1 still valid; v0.2 adds optional last_verified_at / verified_by for decay tracking. |
| AGENTS.md emission | v0.10 | Emits a derived AGENTS.md (read by Codex / Copilot / Cursor) alongside .anatomy. Token-budgeted; honors the optional [generate] config. |
| Rule verification | v0.12+ | Optional verify on each [[rules]] entry checks the rule against source. Two glob-based kinds (no dependency), one AST kind via optional @ast-grep/napi, and (v0.13+) kind = "semgrep" for pattern combinators and non-JS languages via an optional semgrep CLI. Surfaces drift between documented rules and actual code. |
| Package | Version | What it is |
|---|---|---|
@anatomytool/spec |
1.0.0 | The standard — schema, recommended-stacks reference, canonicalization rules, conformance fixtures. (This repo root.) |
@anatomytool/validate |
1.0.0 | Version-routed JSON-schema validator; fingerprint / hash / path checks; cascading tree discovery. |
@anatomytool/cli |
1.0.0 | The anatomy command — generate, validate, render, migrate, manage the memory log, and serve agents via a Claude Code SessionStart hook or an MCP server. |
fixtures/ is the conformance test set consumed by
validator implementations:
- Single-file: 34 valid, 3 valid-with-warnings, 43 invalid (covering versions 0.1 through 1.0).
- Cascading (multi-file): 2 valid, 1 valid-with-warnings, 2 invalid.
- Canonicalization: 16 cases (11 valid + 5 invalid) in
fixtures/canonicalization-cases.json, driving the ID → canonical-form transformation.
- Normative reference:
spec/CURRENT.md— maps each format version to its schema, canonicalization, prompt, versioning policy, and recommended-stacks docs. - CLI reference:
anatomy-cli/README.md.
Issues and pull requests welcome. Before opening a PR, run npm run validate
from the repo root — it is the same content-integrity gate CI enforces, and a
green run is required to merge. Commits follow the
Conventional Commits style used
throughout the history.