feat(skill): judgment layer — project decision & state pages by ngmeyer · Pull Request #1 · ngmeyer/librarian-mcp

ngmeyer · 2026-05-31T01:28:03Z

Summary

Operators can now retain project decisions in their vault and get agent-proposed next moves grounded in their own accumulated research. Three new /librarian subcommands — project init, project log, and propose — turn a per-project decision page into a falsifiable, outcome-anchored experiment log: declare an anchor metric (Sharpe, CTR, conversion, retention — whatever you actually measure), log experiment deltas as you ship them, and the agent reads the page plus a declared research scope to draft ranked, cited candidate next moves into a managed block on the page.

Voltron is the pilot. The feature itself is generic — substitute any project name, anchor metric, and research scope and the same shape works via frontmatter alone.
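To make "generic via frontmatter alone" concrete, here is a minimal Python sketch of what a scaffolded page could look like. The frontmatter keys (`type`, `anchor_outcome`, `research_scope`) and the four scope kinds come from this PR's commit message. The YAML nesting, the function names, and the example values are assumptions for illustration, not the skill's actual output, and the rest of the canonical section layout is left out.

```python
from pathlib import PurePosixPath

SCOPE_KINDS = ("folders", "communities", "tags", "wikilinks")


def render_decision_page(name, anchor_metric, scope):
    """Render a decision-state page; `scope` maps a scope kind to a list of values."""
    if not any(scope.get(kind) for kind in SCOPE_KINDS):
        # The PR says init re-prompts on empty scope; this sketch just raises.
        raise ValueError("research scope is empty")
    lines = [
        "---",
        "type: decision-state",
        f"anchor_outcome: {anchor_metric}",
        "research_scope:",
    ]
    for kind in SCOPE_KINDS:
        values = scope.get(kind, [])
        if values:
            lines.append(f"  {kind}:")
            lines.extend(f"    - {value}" for value in values)
    # Only the log section named in the PR is shown here.
    lines += ["---", "", f"# {name}", "", "## Experiment & outcome log", ""]
    return "\n".join(lines)


def default_page_path(name):
    """Default location by convention; the frontmatter `type` is the real signal."""
    return str(PurePosixPath("Projects") / f"{name}.md")
```

Swapping the project name, metric, or scope changes only the arguments, which is the sense in which nothing project-specific is baked in.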

Why this exists

Ships Track 1 of STRATEGY.md: the brain is a decision-reduction engine, not a search index over scraped content. Until now, project decisions lived scattered across repo docs, session memory, and the operator's head — so every new session re-derived "what did we decide, what's working, what should we try next." The two-layer model splits that: maintain judgment as canonical commitments in the vault (this PR), and recompute awareness (news, daily flux) on demand (separate Awareness track).

The MVP also widens beyond pure retention. The propose subcommand reads accumulated outcome history plus declared research and drafts steering candidates, compensating for thin domain experience on projects where the operator has lots of research but limited instinct.

How it works

```mermaid
flowchart LR
    Init["/librarian project init"] -->|scaffold page| Page[("Projects/&lt;name&gt;.md<br/>frontmatter: anchor + scope")]
    Log["/librarian project log"] -->|append outcome| Page
    Propose["/librarian propose"] -->|read scope| Corpus[(research corpus<br/>folders / communities<br/>tags / wikilinks)]
    Page -->|frontmatter + history| Propose
    Corpus --> Propose
    Propose -->|cited, ranked| Block[["## Candidate next moves auto<br/>managed fenced block"]]
    Block --> Page
```

The loop closes when the operator picks a candidate, ships the change, and runs project log with the observed metric delta — informing the next ranking.
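The logging step that closes the loop can be sketched as an append-only edit to the log section. This is a hedged Python illustration: the section heading matches the one named in this PR, but the entry format, the function name, and the choice to represent the delta as a float are assumptions.

```python
LOG_HEADING = "## Experiment & outcome log"


def append_outcome(page, date, change, delta):
    """Return `page` with one entry added at the end of the log section.

    Existing entries are never rewritten; the new entry goes after the last one.
    """
    lines = page.split("\n")
    if LOG_HEADING not in lines:
        raise ValueError("page has no experiment log section")
    start = lines.index(LOG_HEADING)
    # The section ends at the next level-2 heading, or at the end of the page.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    # Step back over trailing blank lines so entries stay contiguous.
    insert_at = end
    while insert_at > start + 1 and lines[insert_at - 1] == "":
        insert_at -= 1
    entry = f"- {date}: {change} (delta: {delta:+.2f})"  # hypothetical entry format
    return "\n".join(lines[:insert_at] + [entry] + lines[insert_at:])
```

Because the function only inserts a line, removing the newest entry reproduces the previous page exactly, which is what keeps the history falsifiable.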

Key design decisions

No new Rust primitive. The skill composes the research corpus from existing v0.1.2 librarian tools (library_search, library_cluster, library_traverse, library_tags, library_list, library_read, library_metadata). A library_corpus convenience primitive is the next step only if pilot UX disappoints.
Three subcommands, not one. init and log reinforce the load-bearing operator habit named in the brainstorm — logging the outcome delta when an experiment completes. If deltas aren't logged, the page decays into a narrative log and the falsifiability advantage evaporates.
Generic by design. type: decision-state frontmatter is the canonical signal; default location Projects/<name>.md is convention, not constraint. No project name, metric, or folder is baked into the subcommands — every vault-specific value comes from operator-declared frontmatter.
Fenced managed block. The ## Candidate next moves (auto) upsert mirrors the shipped ## Related (auto) pattern from v0.1.2's optimizer. The agent only writes inside its block; everything else on the page is byte-stable across re-runs.
.librarianisolate honored end-to-end. The propose subcommand reads the isolate list and excludes citations from isolated folders before writing the managed block — fiction projects (or anything else operators isolate) stay sealed.
Operator brings the metric delta. The feature never integrates a metric source (no backtest reader, no analytics API). That's the open-source contract — works for any operator with any metric source-of-truth.
Ranking is agent-side with explicit high|medium|low confidence. Sparse-history fallback caps confidence at medium when no prior outcomes inform the ranking.
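The managed-block upsert described above can be illustrated with a short Python sketch. The heading text is the one this PR names. The start and end markers are hypothetical, since the PR does not show how the block is fenced, and the skill itself is agent instructions rather than Python.

```python
HEADING = "## Candidate next moves (auto)"
START = "<!-- candidates:start -->"  # hypothetical marker, not from the PR
END = "<!-- candidates:end -->"      # hypothetical marker, not from the PR


def upsert_candidates(page, body):
    """Insert or replace the managed block, leaving all other bytes untouched."""
    block = f"{HEADING}\n{START}\n{body}\n{END}"
    start = page.find(HEADING)
    if start == -1:
        # No managed block yet: append one after the existing content.
        sep = "" if (not page or page.endswith("\n")) else "\n"
        return f"{page}{sep}\n{block}\n"
    end = page.find(END, start)
    if end == -1:
        raise ValueError("managed block heading found without an end marker")
    # Replace only the span from the heading through the end marker.
    return page[:start] + block + page[end + len(END):]
```

Running the function twice with the same body returns the same page, and text before or after the block is copied through unchanged.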

Test plan

cargo test passes (4 tests, all pre-existing — no Rust code changed in this diff).
cargo check clean except 3 pre-existing warnings unrelated to this PR.
Skill behavior is verified against the AE-style scenarios in the origin brainstorm (AE1–AE6) and the propose subcommand's documented invariants (idempotency on re-run, isolation filter, sparse-history fallback, empty-scope error).
Manual pilot: docs/walkthroughs/voltron-pilot.md is a step-by-step on-ramp using Voltron as the worked example, with a substitution table showing the same shape works for any project / anchor / scope.
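Two of the propose invariants listed above, the isolation filter and the sparse-history fallback, are simple enough to express as checks. The Python below is a sketch under stated assumptions: `.librarianisolate` is taken to hold one vault-relative folder per line with `#` comments, which this PR does not specify, and all function names are hypothetical.

```python
from pathlib import PurePosixPath

LEVELS = ["low", "medium", "high"]


def parse_isolate(text):
    """Parse an isolate list (assumed format: one folder per line, '#' comments)."""
    folders = []
    for line in text.splitlines():
        entry = line.strip().strip("/")
        if entry and not entry.startswith("#"):
            folders.append(entry)
    return folders


def is_isolated(note_path, isolated):
    """True if the note sits inside any isolated folder, compared per path component."""
    parts = PurePosixPath(note_path).parts
    for folder in isolated:
        prefix = PurePosixPath(folder).parts
        if parts[:len(prefix)] == prefix:
            return True
    return False


def filter_citations(citations, isolated):
    """Drop citations that point into isolated folders, preserving order."""
    return [c for c in citations if not is_isolated(c, isolated)]


def cap_confidence(confidence, prior_outcomes):
    """With no logged outcomes, confidence may not exceed 'medium'."""
    if prior_outcomes == 0 and LEVELS.index(confidence) > LEVELS.index("medium"):
        return "medium"
    return confidence
```

Comparing path components rather than raw string prefixes keeps a folder such as `FictionNotes` from being sealed just because `Fiction` is isolated.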

Post-Deploy Monitoring & Validation

Distribution: new skill content is embedded into the Rust binary via include_str!("../skill/SKILL.md") in src/setup.rs. Operators receive the subcommands by running librarian-mcp --setup <vault> after pulling.
Smoke test post-merge: in a fresh Claude Code session after re-running --setup, /librarian shows the three new commands; /librarian project init <test> scaffolds a valid page; the resulting page parses cleanly via library_metadata.
No production / runtime metrics — this is a CLI + skill tool with no service, no logs, no rollout pipeline.
Failure signal: if operators report that the new subcommands are not visible after re-running --setup, investigate the include_str! deploy path.
Owner / validation window: repo maintainer, within 24h of merge.

Sources

Origin brainstorm: docs/brainstorms/2026-05-30-judgment-layer-project-decision-state-pages-requirements.md
Plan: docs/plans/2026-05-30-001-feat-judgment-layer-mvp-plan.md
Strategy anchor: STRATEGY.md

The release pattern in this repo is tag-from-main for cargo-dist + brew tap distribution; whoever merges decides when to cut the v0.1.3 release tag to deploy these changes to existing operators.

Add Track 1 of the brain-as-decision-reduction strategy to the bundled /librarian skill, deployable to operators via the existing 'librarian-mcp --setup' flow.

Three new subcommands operating on a per-project decision & state page:

- project init <name>: scaffolds Projects/<name>.md with frontmatter (type/anchor_outcome/research_scope) and the canonical section layout. Refuses to overwrite existing pages; re-prompts on empty scope.
- project log <project>: appends a single experiment-outcome entry to the Experiment & outcome log section (append-only, never rewrites existing entries). Optional decision-ratification marker.
- propose <project>: load-bearing steering subcommand. Reads frontmatter, composes the research corpus from declared scope (folders/communities/tags/wikilinks) via existing librarian primitives, applies the .librarianisolate filter, ranks candidate next moves with cited research and explicit confidence, upserts into a fenced '## Candidate next moves (auto)' block (mirrors the existing '## Related (auto)' pattern). Idempotent on re-run; everything outside the managed block is byte-stable.

Adds a 'projects' concept section documenting the page convention, schema, and operator-contract habit (delta-logging is load-bearing).

Plus docs/walkthroughs/voltron-pilot.md: a generic on-ramp using Voltron as the worked example, with substitution table showing how the same shape works for any project / anchor / scope.

No Rust changes. No new MCP tools. Composes existing v0.1.2 primitives (search/cluster/traverse/tags/list/read/metadata/write).

Origin brainstorm: docs/brainstorms/2026-05-30-judgment-layer-project-decision-state-pages-requirements.md
Plan: docs/plans/2026-05-30-001-feat-judgment-layer-mvp-plan.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Anchor the product direction for the brain:

- STRATEGY.md (repo root): librarian-mcp as a decision-reduction engine, two-layer model (maintain judgment / recompute awareness), three tracks (Judgment, Corpus acquisition, Awareness), four key metrics. Track 1 wording widened from retention-only to "retention + agent-proposed steering" to honor the MVP scope landed in this commit.
- docs/brainstorms/2026-05-27 (dashboard / Project Command Center): later Awareness surface; deferred per strategy.
- docs/brainstorms/2026-05-28 (on-demand synthesis pipeline): superseded by STRATEGY.md's two-layer model; the on-demand-only stance is now scoped to the Awareness layer only.
- docs/brainstorms/2026-05-30 (Judgment layer — project decision & state pages): origin for the skill changes in the prior commit.
- docs/plans/2026-05-28-001 (library_changes recency primitive): an Awareness-track primitive, still valid, lower priority.
- docs/plans/2026-05-30-001 (Judgment-layer MVP): plan for the prior commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ngmeyer and others added 3 commits May 30, 2026 18:23

chore(plan): mark judgment-layer MVP plan completed

5009ae8

All 5 implementation units shipped in commits b2df1b4 (feature) and 9f51ed9 (supporting docs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ngmeyer wants to merge 3 commits into main from feat/judgment-layer-mvp
