feat: assess primitive — callable pre-flight risk check (CLI + MCP tool) (#149)#150

Merged
Ju571nK merged 9 commits into main from feat/assess-primitive-149
Jun 12, 2026

@Ju571nK Ju571nK commented Jun 12, 2026

Owner

Closes #149.

What

A callable assess primitive that scores a proposed shell command or a single MCP server definition against this host's loaded policy (the same rubric + rule-pack deny rules the agent enforces), returning { bucket, score, reasons, deny_match, decision }. Where the existing read surfaces report standing posture ("what is my risk now?"), assess answers "is this action risky / would Sigil block it?" before it runs.
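
A rough sketch of the verdict shape and how a decision could fall out of it. Only the five field names, the `AiGuardBucket` name, the allow/warn/deny outcomes, and `High` as the enforce bucket come from this PR; the field types, the `Low`/`Medium` variants, the warn threshold, and the `decide` helper are assumptions made for illustration.

```rust
/// Risk band. Only `High` is named in the PR; `Low` and `Medium` are assumed.
/// Variants are declared in ascending order so the derived `Ord` ranks them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum AiGuardBucket {
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Warn,
    Deny,
}

/// Field names follow the PR description; the types are guesses.
#[derive(Debug, Clone, PartialEq)]
pub struct AssessVerdict {
    pub bucket: AiGuardBucket,
    pub score: u32,
    pub reasons: Vec<String>,
    pub deny_match: Option<String>,
    pub decision: Decision,
}

/// Hypothetical helper: a matched deny rule always forces Deny; otherwise the
/// bucket is compared against the enforce bucket (High). Treating Medium as
/// Warn is an assumption, not something the PR states.
pub fn decide(bucket: AiGuardBucket, deny_match: Option<&str>) -> Decision {
    const ENFORCE: AiGuardBucket = AiGuardBucket::High;
    if deny_match.is_some() || bucket >= ENFORCE {
        Decision::Deny
    } else if bucket >= AiGuardBucket::Medium {
        Decision::Warn
    } else {
        Decision::Allow
    }
}
```

The point of the sketch is the precedence: a deny-rule match wins even when the scored bucket is low.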

Two surfaces over one pure engine (ai_guard::assess):

  • sigil assess CLI (cold-disk policy, no daemon): --command "<cmd>" or --mcp-config <file> / --mcp-stdin --mcp-name <name>. One-line JSON verdict; exit codes (allow/warn → 0, deny → 2, usage/policy-load error → 1; --fail-on-warn → warn = 2) for shell pre-flight gating.
  • assess MCP tool (local surface) → control IPC → daemon evaluates against its live loaded policy. Read-only.
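
The exit-code matrix above can be written as a small pure function. This is a minimal sketch: the `exit_code(Decision, fail_on_warn)` name and arguments come from the commit message further down, but the `Decision` variants and the `i32` return type are assumptions. Exit code 1 (usage or policy-load error) is not a decision, so it would be produced elsewhere.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Warn,
    Deny,
}

/// Maps a verdict to a process exit code:
/// allow -> 0, warn -> 0 (or 2 with --fail-on-warn), deny -> 2.
pub fn exit_code(decision: Decision, fail_on_warn: bool) -> i32 {
    match decision {
        Decision::Allow => 0,
        Decision::Warn if fail_on_warn => 2,
        Decision::Warn => 0,
        Decision::Deny => 2,
    }
}
```

Because errors exit 1 and denies exit 2, a shell gate that only checks for a non-zero status stays fail-closed in both cases.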

Design

Built via brainstorm → spec (rev2, codex adversarial review folded) → plan → subagent-driven implementation (8 tasks) → holistic review. Spec/plan are in docs/superpowers/ (gitignored).

Key correctness properties:

  • Pure engine, immutable snapshots injected — the IPC handler snapshot-clones the live rubric + deny evaluator before evaluating (no lock across eval, no mid-eval reload effect). No I/O in assess.
  • Deny parity with the hook — a new hook_deny::evaluate_bash_preview builds the same HookAction::Bash the hook enforces with, so the assess deny verdict matches runtime enforcement (shared canonical preview; parity tests).
  • Reuse, not duplication: command_scan extracts the destructive / attack-shape primitives (#127: "ai-guard: score MCP stdio launchers by attack shape (shell / transient-path) above the benign baseline") from mcp_scan so the MCP-args path and the new raw-command path share one source (raw rm -rf /tmp/x with no bash -c is now caught, a gap codex flagged). The 30 existing mcp_scan parity tests stay green.
  • Boot-equivalent cold loader: load_effective_policy merges defaults < policy.yaml < rule-packs.yaml bundle (not the doctor path that ignores the bundle), tolerating an empty bundle like the daemon does (#135: "rule-packs.yaml watcher: re-arm if the config dir is created/replaced after boot").
  • Fail-closed — malformed / oversize input and policy-load failure return errors (exit 1 / McpError), never a default-Allow.
  • Bucket/score via the existing AiGuardBucket + rubric::bucket (no invented band type).
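
The snapshot property in the first bullet can be illustrated with a short sketch. It assumes the live policy sits behind an `RwLock<Arc<_>>`; the `LivePolicy` type, the single-field `Rubric`, and the `score` function are hypothetical stand-ins, and the real handler may clone the values themselves rather than an `Arc`.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical, heavily simplified rubric.
#[derive(Debug, Clone, PartialEq)]
pub struct Rubric {
    pub destructive_weight: u32,
}

/// Hypothetical holder for the daemon's live policy.
pub struct LivePolicy {
    rubric: RwLock<Arc<Rubric>>,
}

impl LivePolicy {
    pub fn new(rubric: Rubric) -> Self {
        Self { rubric: RwLock::new(Arc::new(rubric)) }
    }

    /// A policy reload swaps the Arc under a short write guard.
    pub fn reload(&self, rubric: Rubric) {
        *self.rubric.write().unwrap() = Arc::new(rubric);
    }

    /// Takes a short read guard, clones the Arc, and releases the guard
    /// before returning, so no lock is held during evaluation.
    pub fn snapshot(&self) -> Arc<Rubric> {
        let guard = self.rubric.read().unwrap();
        Arc::clone(&*guard)
    }
}

/// Pure evaluation over an immutable snapshot: no I/O, no locks.
pub fn score(rubric: &Rubric, destructive_hits: u32) -> u32 {
    rubric.destructive_weight * destructive_hits
}
```

A snapshot taken before a reload keeps scoring against the old rubric, which is the "no mid-eval reload effect" behavior the bullet describes.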

codex / holistic review

codex adversarial review of the spec caught 13 issues (all folded into rev2), including the raw-destructive gap, the wrong band type, the deny-evaluator signature, and fail-open-on-malformed-input. The final holistic review found 0 blocking issues; 2 should-fix items (empty-bundle CLI tolerance, deny-parity docs) and 2 nice-to-haves were applied in the last commit.

Tests / gates

cargo fmt + cargo clippy --workspace --all-targets -D warnings + cargo test --workspace all green (84 test-result sections, 0 failures). New coverage: engine (command/MCP/deny/override/determinism), command_scan (raw-destructive regression guard + mcp_scan parity), evaluate_bash_preview hook parity, load_effective_policy (bundle layer + empty/absent tolerance), CLI exit-code matrix + fail-closed, IPC round-trip, MCP tool (verdict + fail-closed validation).

Scope (deferred)

config-snippet input; MCP-def stdio-command deny evaluation; fleet/central assess; an enforce_threshold policy field (enforce bucket is currently High in both paths with a TODO).

🤖 Generated with Claude Code

Ju571nK and others added 9 commits June 12, 2026 20:51
…ion (#149)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…cted from mcp_scan (#149)

Extracts shared structural-detection primitives (is_shell, is_inline_exec_flag,
launcher_basename, effective_shell_target, is_transient_path, is_env_assignment,
starts_with_seq, contains_seq, first_destructive_after_shell_flag) from mcp_scan
into a new ai_guard::command_scan module.  scan_command scans a raw (command, args)
pair for destructive content directly against the full preview string (C5: bare
rm -rf /tmp/x without bash -c MUST emit DestructiveInInlineCommand).  mcp_scan
delegates to command_scan — zero duplicated logic.  All 30 existing mcp_scan
parity tests green; 7 new command_scan tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
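
A toy illustration of the raw-command scan described in this commit. The names `scan_command`, `contains_seq`, and `DestructiveInInlineCommand` appear in the commit message, but the signatures, the `Finding` enum, and the single `rm -rf` pattern are assumptions; the real module presumably covers many more shapes.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Finding {
    DestructiveInInlineCommand,
}

/// Assumed semantics: true when `needle` occurs as a contiguous run of tokens.
fn contains_seq(tokens: &[&str], needle: &[&str]) -> bool {
    !needle.is_empty() && tokens.windows(needle.len()).any(|w| w == needle)
}

/// Scans the full preview string built from (command, args), so a destructive
/// sequence is caught whether or not it is wrapped in a shell launcher.
pub fn scan_command(command: &str, args: &[&str]) -> Vec<Finding> {
    let preview = std::iter::once(command)
        .chain(args.iter().copied())
        .collect::<Vec<_>>()
        .join(" ");
    let tokens: Vec<&str> = preview.split_whitespace().collect();

    let mut findings = Vec::new();
    // Toy pattern only: one hard-coded destructive sequence.
    if contains_seq(&tokens, &["rm", "-rf"]) {
        findings.push(Finding::DestructiveInInlineCommand);
    }
    findings
}
```

With this shape, `scan_command("rm", &["-rf", "/tmp/x"])` and `scan_command("bash", &["-c", "rm -rf /tmp/x"])` both yield the finding, which is the C5 requirement stated above.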
…#149)

Add public helper `DenyEvaluator::evaluate_bash_preview` that constructs a
`HookAction::Bash` with the canonical blake3 hash of the preview and delegates
to `evaluate()`, guaranteeing the assess primitive's deny verdict matches what
sigil-hook enforcement would produce. Three TDD tests cover: deny match,
no-match (benign), and parity with direct `evaluate()`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add `AiGuardBucket::{PartialOrd,Ord}` (variants already ascending), create
`ai_guard::assess::{AssessCtx, assess}` with 6 TDD tests covering destructive
deny, safe allow, MCP shell launcher, deny-rule-forces-deny, url+command parity,
and determinism. Register module in `ai_guard/mod.rs`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…. rule-pack bundle (#149)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…odes (#149)

Add `sigil assess` subcommand (operator-cli gated) that evaluates a proposed
command or MCP server definition against the host's loaded cold-disk policy
and exits 0 (allow/warn), 2 (deny), or 1 (usage/input/policy-load error).
Includes fail-closed input-size limits, XOR mode validation, and a pure
`exit_code(Decision, fail_on_warn)` function covered by unit tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add Request::Assess / Response.assess_verdict to the shared control
protocol, handle it in sigil-agent (snapshot-clone Rubric + DenyEvaluator
under short read guards, evaluate synchronously, return AssessVerdict),
and expose pub async fn assess() in sigil-mcp's LocalUpstream so the
MCP layer can ask the running daemon for a live-policy verdict.
TDD: assess_round_trips_against_canned_agent added to local.rs and
passes along with the full sigil-mcp + sigil-agent control test suites.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…#149)

Adds an `assess` MCP tool to SigilLocal in local_tools.rs. Accepts either a
shell command (command + optional args) or an MCP server definition (mcp_server
object + server_name) via an AssessParams struct, enforces XOR at runtime
(fail-closed: neither or both → invalid_params), validates mcp_server is a JSON
object, delegates to LocalUpstream::assess over IPC, and returns the full
AssessVerdict as JSON. Updates get_info instructions to mention the new tool.
TDD: three new tests (verdict round-trip, XOR enforcement, non-object rejection)
all green; 35/35 sigil-mcp tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
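
The XOR check described in this commit could look roughly like the following. `AssessParams` and its four field names come from the commit message; the `Mode` and `InvalidParams` types and the `validate` function are hypothetical, and `mcp_server` is modeled as a plain string here instead of a JSON object to keep the sketch dependency-free. The empty-`server_name` rejection reflects the follow-up commit.

```rust
#[derive(Debug, Default)]
pub struct AssessParams {
    pub command: Option<String>,
    pub args: Vec<String>,
    /// Stand-in for the JSON object the real tool accepts.
    pub mcp_server: Option<String>,
    pub server_name: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    Command { command: String, args: Vec<String> },
    Mcp { definition: String, server_name: String },
}

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidParams(pub &'static str);

/// Fail-closed: exactly one of `command` / `mcp_server` must be present.
pub fn validate(p: AssessParams) -> Result<Mode, InvalidParams> {
    match (p.command, p.mcp_server) {
        (Some(_), Some(_)) => Err(InvalidParams("provide command or mcp_server, not both")),
        (None, None) => Err(InvalidParams("provide command or mcp_server")),
        (Some(command), None) => Ok(Mode::Command { command, args: p.args }),
        (None, Some(definition)) => match p.server_name {
            Some(name) if !name.trim().is_empty() => {
                Ok(Mode::Mcp { definition, server_name: name })
            }
            _ => Err(InvalidParams("server_name is required and must be non-empty")),
        },
    }
}
```

Returning an error for every ambiguous input means there is no path where a malformed request falls through to a default verdict.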
…y-parity docs (#149)

Final holistic review follow-ups:
- effective_policy: treat an empty (zero-byte/whitespace) rule-packs.yaml like
  absent (no bundle) instead of erroring, matching the daemon's boot/reload
  tolerance (#135) — a benign empty file no longer makes `sigil assess` exit 1.
  Only a Corrupt (parse-failed) bundle is fail-loud on cold load.
- Document on the `--command` CLI help and the assess MCP tool description that
  the full command line should go in `command` for faithful deny-rule parity
  with the hook (splitting into args can re-join with different spacing).
- MCP assess tool now rejects an empty `server_name` (symmetry with the CLI).
- Drop the now-stale dead_code allow on LocalUpstream::assess (it is consumed by
  the MCP tool).
- CHANGELOG: document the assess primitive.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
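
The layering and empty-bundle tolerance from the first follow-up can be sketched as below. Only the `load_effective_policy` name, the `defaults < policy.yaml < rule-packs.yaml` precedence, and the absent / empty / Corrupt distinction come from this PR. The `Layer` map, the `Bundle` enum, `classify_bundle`, and the toy `key=value` parser are assumptions standing in for real YAML handling.

```rust
use std::collections::BTreeMap;

/// Hypothetical flattened policy layer.
pub type Layer = BTreeMap<String, String>;

#[derive(Debug, PartialEq, Eq)]
pub enum Bundle {
    Absent,
    Loaded(Layer),
    Corrupt(String),
}

/// A missing file and a zero-byte / whitespace-only file both count as Absent.
/// Anything the toy `key=value` parser cannot read is Corrupt.
pub fn classify_bundle(raw: Option<&str>) -> Bundle {
    let text = match raw {
        Some(t) if !t.trim().is_empty() => t,
        _ => return Bundle::Absent,
    };
    let mut layer = Layer::new();
    for line in text.lines().map(str::trim).filter(|l| !l.is_empty()) {
        match line.split_once('=') {
            Some((k, v)) => {
                layer.insert(k.trim().to_string(), v.trim().to_string());
            }
            None => return Bundle::Corrupt(format!("unparseable line: {line}")),
        }
    }
    Bundle::Loaded(layer)
}

/// Later layers override earlier ones; a Corrupt bundle fails loudly.
pub fn load_effective_policy(
    defaults: Layer,
    policy: Layer,
    bundle: Bundle,
) -> Result<Layer, String> {
    let mut effective = defaults;
    effective.extend(policy);
    match bundle {
        Bundle::Absent => {}
        Bundle::Loaded(layer) => effective.extend(layer),
        Bundle::Corrupt(why) => return Err(why),
    }
    Ok(effective)
}
```

Keeping classification separate from merging makes the tolerance rule explicit: only a parse failure can turn a cold load into an error.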
@Ju571nK Ju571nK merged commit 3272b43 into main Jun 12, 2026
5 checks passed
@Ju571nK Ju571nK deleted the feat/assess-primitive-149 branch June 12, 2026 13:10

Development

Successfully merging this pull request may close these issues.

assess primitive: pre-flight a proposed command / MCP server against the host's loaded policy (MCP tool + CLI)
