simplicio-tasks is a runtime-agnostic super-plugin — one autonomous looping
orchestrator (invoked as /simplicio-tasks) plus five satellite skills — that turns any
strong LLM (Claude, Codex, Copilot, Gemini, Cursor, local models) into a self-driving worker. You
point it at a body of work — "finish all the open issues", "clear the CI queue", "drain the Jira board" — and it
runs the whole lifecycle on its own:
discover → understand → decide → act → verify → correct → record → repeat
It discovers work from any source (GitHub Issues, Jira, Azure DevOps, agentsview sessions, and more), dedups, auto-scales an agent fleet to your machine, implements each item through a quality loop that runs the code (not just compiles it), opens PRs, resolves CI/review feedback, merges, and keeps watching 24/7 for new work — all behind safety gates and a hard cost kill-switch.
```text
/simplicio-tasks finish all open issues
  → identity + pre-flight (kill-switch, auth, watcher)
  → discover 50 issues · dedup · build dependency DAG
  → autoscale fleet = 14 · pipeline implement→review→merge
  → each item: read body+ACs → orient code → plan → edit → run → verify → PR
  → merge · close with evidence · rollback if main breaks
  → keep looping every ~2 min until the queue is dry (evidence-gated, never a false "done")
```
Three things make it different: it is a super-plugin of focused skills, it runs the same protocol on 11 runtimes, and it does all of this with aggressive, honest token economy.
Within the Simplicio product line, this repo is also the current reference task flow for
company work. simplicio-runtime is the unified entrypoint going forward, but it is expected to
reuse this loop's evidence-gated converge/drain discipline, durable attempt journal, and worker
coordination patterns rather than define separate task semantics of its own.
The complete, official roster of what simplicio-tasks ships — every capability below is real,
runnable, and tested (python3 scripts/check.py: claims-audit 5/5 + local test suite). Each links to its
deep section and its worker.
| Capability | What it does | Proof / worker | Details |
|---|---|---|---|
| 🎬 Video evidence (video_evidence) | Records the real browser session as moving proof a UI change works (Playwright, default); renders a deterministic captioned MP4 with hyperframes for an explicit explainer request (/simplicio-tasks make a video of screen X) | scripts/video_evidence.py · BLOCKED (never fake-pass) without the toolchain | § Video evidence |
| 🧠 Attempt memory + stall detector | A durable run-journal (.orchestrator/loop/journal.jsonl) + a stall detector so the loop changes strategy instead of oscillating; incremental triage (since) reads only the delta each turn, and optional stage lineage makes retries/governance explicit | scripts/loop_journal.py · selftest 13/13 | § Anti-oscillation |
| 🧭 Repo conventions (repo_conventions) | Learns the repo's own playbook: mines git history + merged PRs + static config into .orchestrator/conventions.json so every new branch/commit/PR mirrors the team's established style; worktree-per-item isolation is the default | scripts/repo_conventions.py · selftest 19/19 | § The full flow |
| 🧩 Scope reflection (dependency_graph) | Maps local dependencies, reverse dependents, and related tests from the planned touched files; blocks task plans that ignore callers, sibling files, or proof points before the edit starts | scripts/impact_audit.py · selftest | § Tests & local checks |
| 🕸️ Flow coverage (endpoint_compare) | Maps mixed front/back/service workspaces: UI actions → frontend HTTP calls → backend endpoints → service calls; blocks frontend calls with no backend endpoint and stubbed endpoints, and surfaces unclassified loose ends | scripts/flow_audit.py · selftest | § Tests & local checks |
| 🔒 Fail-closed safety gate (action_gate) | A PreToolUse/git-pre-push hook that mechanically blocks force-push, history rewrite, mass-delete, destructive DDL, infra teardown, and secret-laden commits/pushes (Step 5 made executable, not prose) | hooks/action_gate.py · selftest 15/15 | § Safety |
| 🔬 Local verification | A test suite (worker selftests + an e2e of the loop driver proving evidence-gated exit) + a claims-audit (referenced scripts exist · counts consistent · _bundle ≡ source); all local, no paid CI | scripts/check.py · scripts/claims_audit.py · tests/ | § Tests & local checks |
| ✅ Honest savings | The savings line is now evidence-gated, not mandatory: a number is shown only with a measured receipt (clamp/signatures/cache/deterministic_edit/ledger); never fabricated | token-economy contract | § Token economy |
Two loop modes make termination explicit: converge (a single hard task; ends on the
evidence-gated `<promise>` or a stall escalation) and drain (a queue; ends when the source
re-query stays empty for K consecutive rounds). Both still obey the universal exits
(promise+evidence, max_iterations, budget, STOP).
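To make the two mode-specific exits concrete, here is a minimal sketch. The `Turn` record, the `mode_exit` function, and the default `k=2` are hypothetical names and values chosen for illustration; the real loop driver may track this state differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    """Hypothetical per-turn summary; field names are illustrative only."""
    promise: bool = False     # the turn emitted <promise>
    evidence: bool = False    # the turn also carried concrete proof
    stalled: bool = False     # the stall detector escalated
    ready_items: int = 0      # items returned by the source re-query


def mode_exit(mode: str, history: list[Turn], k: int = 2) -> Optional[str]:
    """Return the mode-specific exit reason, or None to keep looping."""
    if not history:
        return None
    last = history[-1]
    if mode == "converge":
        if last.promise and last.evidence:
            return "promise+evidence"
        if last.stalled:
            return "stall-escalation"
        return None
    if mode == "drain":
        tail = history[-k:]
        # The queue counts as dry only after k consecutive empty re-queries.
        if len(tail) == k and all(t.ready_items == 0 for t in tail):
            return "queue-dry"
        return None
    raise ValueError(f"unknown mode: {mode!r}")
```

A bare promise (no evidence) returns `None` here, so the loop keeps going instead of reporting a false "done".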
Loop scoring across this line of work: 7.5 (strong design, unproven) → 9 (attempt memory + anti-oscillation) → 9.5 (reproducible local proof) → ~10 (enforced safety + complete loop semantics). The verification infra now catches the project's own regressions as it grows.
The orchestrator core + five satellites + five accelerators/integrations. Each satellite is optional — when loaded, the orchestrator delegates to it (richer + cheaper); when absent, the inline protocol covers 100%. Accelerators are auto-detected — present = used, absent = LLM fallback.
| # | Capability | Absorbs | What it does | Token impact |
|---|---|---|---|---|
| 1 | 🔁 simplicio-tasks | — | The orchestrator loop: 48 extension points, dual-path router, self-audit convergence | Core |
| 2 | ♾️ simplicio-loop | ralph-loop | Hardened Ralph loop: evidence-gated `<promise>` exit, max_iterations cap | Loop drive |
| 3 | 🧱 simplicio-orient | rtk + caveman | Terminal-first execution, output-reduction catalog, tee-cache, signatures-read | L0 deterministic |
| 4 | 🔥 simplicio-review | thermos | Parallel adversarial review on distinct rubrics → deduped verdict | Quality gate |
| 5 | 🗜️ simplicio-compress | caveman | Output + memory compression, fail-closed transform_guard | 40-60% fewer |
| 6 | 🎓 simplicio-learn | teaching | Post-run retrospective → durable, deduped lessons in memory | Smarter each run |
| 7 | 🧭 Understand Anything | Egonex-AI | Knowledge graph orient: semantic search, guided tours, dependency graph | L0 zero tokens |
| 8 | 📊 agentsview | kenn-io | Session analytics, cost tracking, stalled-session discovery | L1 SQL only |
| 9 | ⚡ LMCache | LMCache | KV cache between loop turns — 40-70% TTFT reduction on local models | GPU time ↓ |
| 10 | 🗜️ Simplicio capture engine | engine/simplicio_engine.py (native, stdlib-only) | Transparent capture proxy: forwards to the real provider, measures + deterministically compresses, writes proxy_savings.json | deterministic |
| 11 | 🎬 video_evidence | Playwright (default) · hyperframes (on request) | Records the real session as moving proof of a UI change (Playwright); renders a deterministic captioned MP4 explainer with hyperframes when the video IS the deliverable | Evidence producer |
Each skill lives under .claude/skills/; each accelerator has a reference doc
under .claude/skills/simplicio-tasks/references/ (the video producer:
video-evidence.md, worker
scripts/video_evidence.py).
The orchestrator discovers work from any source via pluggable adapters. Each exposes six verbs:
list_ready, get_details, claim, update_status, attach_evidence, close.
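The six verbs can be pictured as a small interface. The sketch below is an assumption about shape only: the method signatures, the `SourceAdapter` protocol name, and the `InMemoryAdapter` class are hypothetical and are not taken from the shipped adapters.

```python
from typing import Any, Protocol


class SourceAdapter(Protocol):
    """Hypothetical typing of the six verbs; argument names are illustrative."""
    def list_ready(self) -> list[dict[str, Any]]: ...
    def get_details(self, item_id: str) -> dict[str, Any]: ...
    def claim(self, item_id: str, worker: str) -> bool: ...
    def update_status(self, item_id: str, status: str) -> None: ...
    def attach_evidence(self, item_id: str, path: str) -> None: ...
    def close(self, item_id: str) -> None: ...


class InMemoryAdapter:
    """Toy adapter backed by a dict, useful for dry runs and tests."""

    def __init__(self, items: list[dict[str, Any]]) -> None:
        self.items = {
            i["id"]: dict(i, status="ready", owner=None, evidence=[]) for i in items
        }

    def list_ready(self) -> list[dict[str, Any]]:
        # Metadata only: id and title, no bodies.
        return [{"id": k, "title": v["title"]}
                for k, v in self.items.items() if v["status"] == "ready"]

    def get_details(self, item_id: str) -> dict[str, Any]:
        return self.items[item_id]

    def claim(self, item_id: str, worker: str) -> bool:
        item = self.items[item_id]
        if item["owner"] is not None:
            return False  # already claimed by another worker
        item["owner"], item["status"] = worker, "claimed"
        return True

    def update_status(self, item_id: str, status: str) -> None:
        self.items[item_id]["status"] = status

    def attach_evidence(self, item_id: str, path: str) -> None:
        self.items[item_id]["evidence"].append(path)

    def close(self, item_id: str) -> None:
        if not self.items[item_id]["evidence"]:
            raise ValueError("refusing to close without evidence")
        self.items[item_id]["status"] = "closed"
```

Refusing `close` without attached evidence mirrors the evidence-gated completion rule; whether the real adapters enforce it at this layer is not stated in this document.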
| Source | Adapter | Purpose |
|---|---|---|
| GitHub Issues/PRs | gh CLI (native) | Primary work-item source |
| Jira / Asana / ClickUp / Linear / Notion | host connector | Board/project management |
| Trello / Azure DevOps | az boards adapter | Azure work tracking |
| agentsview sessions | scripts/agentsview_adapter.py | Stalled session recovery + cost observability |
| Local files / CI queue | filesystem / CI API | Internal work tracking |
See each adapter's reference doc under .claude/skills/simplicio-tasks/references/.
One universal skill core + one set of hooks drives every runtime. An adapter is thin: it tells a runtime where to load the skills, how to arm the loop, and how to bind native speed. The skill names no runtime; the runtime detects the skill.
| Runtime | Skill load | Loop drive | Native bind |
|---|---|---|---|
| Claude Code | .claude/skills/ + plugin | Stop hook | MCP |
| Codex | AGENTS.md | self-paced | MCP / adapter |
| VS Code (Copilot) | copilot-instructions.md | tasks | MCP |
| Cursor | .cursor-plugin/ | stop+afterAgentResponse | MCP / rules |
| Antigravity | rules / AGENTS.md | self-paced | MCP |
| Kiro | .kiro/steering/ | specs | MCP |
| OpenCode | AGENTS.md | self-paced | MCP |
| Gemini | GEMINI.md | self-paced | MCP / adapter |
| Aider | CONVENTIONS.md | self-paced | none (LLM fallback) |
| Hermes | native recall | native loop | native |
| OpenClaw | plugin SDK | native scheduler | native |
The promise: same protocol, same gates, same safety on all 11 — only the speed differs.
orient_clamp.py (token economy) works on every runtime with zero wiring. See
adapters/MATRIX.md.
Every layer the orchestrator acts on, in order — from reading the demand (issues, tasks, assigns) to delivering merged, evidenced work, then looping 24/7 for more.
```mermaid
flowchart TD
  subgraph SRC["1 · Demand sources (any adapter)"]
    direction LR
    S1["GitHub Issues / PRs / CI"]
    S2["Jira · Azure DevOps · Linear · ClickUp · Notion · agentsview · Understand Anything (orient)"]
    S3["Assigns · TODO/FIXME · CVE · local files · LMCache (inference accelerator)"]
  end
  SRC --> PF
  subgraph PF["2 · Pre-flight gates"]
    direction LR
    P1["cost kill-switch budget · agentsview cost check"]
    P2["source auth + scopes"]
    P3["arm 24/7 watcher"]
  end
  PF --> DISC
  subgraph DISC["3 · Discover + normalize"]
    direction LR
    D1["source_adapter: list metadata only"]
    D2["normalize to canonical schema"]
    D3["dedup id+title+fingerprint+branch/PR"]
    D4["dependency DAG"]
  end
  DISC --> INTK
  subgraph INTK["4 · Deep intake (per item)"]
    direction LR
    I1["body + ALL comments"]
    I2["extract acceptance criteria"]
    I3["orient code · signatures-only reads or Understand Anything knowledge graph"]
    I4["plan + AC checklist + complexity"]
  end
  INTK --> RT{"5 · Route"}
  RT -->|"small and every item complexity at most 3"| FAST["Fast-path: solo, one targeted test"]
  RT -->|"large queue or any medium+"| POOL
  subgraph POOL["6 · Continuous worker pool (autoscaled, conflict-aware)"]
    direction LR
    W1["claim · branch · worktree if overlap"]
    W2["deterministic_edit"]
    W3["quality loop: edit-lint-test-fix"]
  end
  FAST --> QG
  POOL --> QG
  subgraph QG["7 · Quality gates"]
    direction LR
    Q1["AC gate + impact_audit = real DoD"]
    Q2["WORKS not just compiles · web_verify · video_evidence · flow_audit"]
    Q3["adversarial review · thermos rubrics"]
  end
  QG --> SG
  subgraph SG["8 · Safety gates (non-negotiable)"]
    direction LR
    G1["secret-scan"]
    G2["irreversible-op human gate"]
    G3["4-state verdict · attestation"]
  end
  SG --> DEL
  subgraph DEL["9 · Deliver"]
    direction LR
    L1["commit · push · Draft PR"]
    L2["close in-source + evidence"]
    L3["verify reality, not self-report"]
  end
  DEL --> FB
  subgraph FB["10 · Feedback loop to merge-ready"]
    direction LR
    F1["CI fail -> fix root cause"]
    F2["review comments -> adjust"]
    F3["branch behind main -> additive rebase"]
  end
  FB -->|"merged and closed"| DONE(["done + evidence + measured savings (only if a receipt exists)"])
  WATCH["11 · 24/7 watcher · simplicio-loop evidence-gated promise · max-iterations cap · cost kill-switch · LMCache KV cache warm"]
  FB -. "poll new work / comments / checks" .-> WATCH
  DONE -. "idle until new work" .-> WATCH
  WATCH -. "re-feed the goal" .-> DISC
```
The Evidence-Gated Loop is the core mechanism. It re-feeds the same goal each turn so the agent sees its own prior work. Exit is ONLY via:
- Evidence-gated `<promise>`: the turn that emits the promise MUST also carry concrete proof (passing test, merged PR, closed-item re-query). A promise with no evidence is ignored.
- `max_iterations` cap: hard safety backstop.
- Budget kill-switch: `daily_usd_ceiling` halts the loop when spent.
- STOP signal: `.orchestrator/STOP` or channel command.
Between turns, LMCache (when available) caches the KV state so re-feed costs near-zero prefill.
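The four exits listed above can be sketched as a single per-turn check. The function name `universal_exit`, its keyword arguments, and the order in which the exits are tested are assumptions made for illustration; only `max_iterations`, `daily_usd_ceiling`, and the `.orchestrator/STOP` path come from the text.

```python
from pathlib import Path
from typing import Optional


def universal_exit(
    *,
    iteration: int,
    max_iterations: int,
    spent_usd: float,
    daily_usd_ceiling: float,
    promise: bool,
    evidence: bool,
    stop_file: Path = Path(".orchestrator/STOP"),
) -> Optional[str]:
    """Return the exit reason for this turn, or None to re-feed the goal.

    Precedence (an assumption): operator STOP, then budget, then the
    iteration cap, then the evidence-gated promise.
    """
    if stop_file.exists():
        return "STOP"
    if spent_usd >= daily_usd_ceiling:
        return "budget"
    if iteration >= max_iterations:
        return "max_iterations"
    if promise and evidence:
        return "promise+evidence"
    # A bare promise without evidence is ignored: the loop keeps going.
    return None
```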
A re-feed loop that remembers nothing oscillates — try X, fail, try X again — until the cap burns.
simplicio-loop keeps a durable run-journal (.orchestrator/loop/journal.jsonl, append-only:
iteration · action · hypothesis · gate · error-fingerprint, plus optional lineage like
execution_state · stage_id · validator · decision · retry_count) and a stall detector
(scripts/loop_journal.py, deterministic + model-free):
- Error fingerprint — the failing gate output is reduced to a stable hash with line numbers, paths, hex/uuids, timestamps and durations normalized away, so the same bug is recognized across turns even when the incidental text differs.
- Stall = K identical-fingerprint failures in a row (default K=3). A changing fingerprint means the loop is moving (PROGRESS); the same one K times means it is spinning (STALLED).
- On STALLED the loop does not re-feed the same goal — it names the dead-end actions to avoid, then switches strategy or escalates to the human gate with the fingerprint.
- `loop_journal.py resume` is read at the top of every turn, so a fresh process continues without re-deriving prior attempts (real resume) and never retries a known dead-end.
- When the loop is doing extraction, validation, or governed retries, `record` can also stamp `--execution-state`, `--stage-id`, `--source-artifact`, `--chunk-id`, `--validator`, `--decision`, `--retry-count`, `--blocked-reason`, and `--next-action`, so the next turn knows not just what failed, but where in the flow it failed.
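The fingerprint and stall rules above can be illustrated with a short, model-free sketch. This is not the code in `scripts/loop_journal.py`: the regular expressions, the `fingerprint` and `stall_status` names, and the 16-character hash length are assumptions picked to show the idea.

```python
import hashlib
import re

# Order matters: timestamps must be normalized before the generic ":<digits>" rule.
_NORMALIZERS = [
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I), "<uuid>"),
    (re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?"), "<ts>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"\b\d+(?:\.\d+)?\s?(?:ms|s)\b"), "<dur>"),
    (re.compile(r"(?:/[\w.\-]+)+/([\w.\-]+)"), r"<path>/\1"),  # keep only the basename
    (re.compile(r"\bline \d+\b"), "line <n>"),
    (re.compile(r":\d+"), ":<n>"),
]


def fingerprint(gate_output: str) -> str:
    """Reduce failing gate output to a stable hash of its normalized text."""
    text = gate_output
    for pattern, replacement in _NORMALIZERS:
        text = pattern.sub(replacement, text)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def stall_status(fingerprints: list[str], k: int = 3) -> str:
    """STALLED when the last k failures share one fingerprint, else PROGRESS."""
    if len(fingerprints) >= k and len(set(fingerprints[-k:])) == 1:
        return "STALLED"
    return "PROGRESS"
```

Two runs of the same failing test from different checkouts, at different times, then hash to the same value, while a different exception produces a new fingerprint and resets the streak.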
```bash
loop_journal.py resume                       # what was tried + dead-ends to avoid
loop_journal.py record --iteration N --action "…" --gate fail --gate-output test.log \
    --execution-state planned --stage-id validate --validator pytest --decision retry
loop_journal.py stall --k 3 --exit-code      # PROGRESS → re-feed · STALLED → switch/escalate
```

The loop produces demo videos as proof a change works: two engines, one video_evidence
extension point (worker scripts/video_evidence.py, contract
references/video-evidence.md):
- Default: the normal evidence flow uses Playwright. After a UI change, `video_evidence` records the real browser session driving the screen (Playwright native video → `.webm`, → `.mp4` with FFmpeg), the strongest "works, not just compiles" receipt (Step 4b) and a valid evidence-gated `<promise>`.

  ```bash
  python3 scripts/video_evidence.py verify --url http://localhost:3000/login \
      --name login-demo --expect "Sign in" --issue 42 [--upload --pr 42]
  ```

- On request: a personalized explainer uses hyperframes. When the deliverable IS a video ("make an explainer video of screen X"), the orchestrator renders a deterministic, captioned slideshow of the `web_verify` screenshots with hyperframes (by HeyGen: "same input, same frames, same output", CI-reproducible, no API keys, local render via headless Chrome + FFmpeg).

  ```text
  /simplicio-tasks make an explainer video of the system login screen
    → detect: video-creation request → web_verify captures the screens
    → video_evidence verify --engine hyperframes → deterministic MP4 → attached to the PR
  ```
Either engine: a video that never recorded/rendered yields BLOCKED, never a fake pass. Evidence is always a file path + boolean verdict — never video bytes in context (token economy).
| Technique | Savings |
|---|---|
| deterministic_edit (L0) | 100% of edit tokens (file written mechanically, never by LLM) |
| Terminal-first execution | Facts from shell, not LLM hallucination |
| Output-reduction catalog | Caps per command type (CAP_ERRORS=20, CAP_WARNINGS=10, CAP_LIST=20) — orient_clamp.py |
| Tee+CCR cache on failure | Never re-run a failed command — read the cached output |
| Signatures-only reads | simplicio-cli signatures <file> — 870-line file → 65 lines (93% saved), bodies stripped |
| simplicio-compress | Terse prose + one-time memory compaction |
| orient_clamp.py | Clamp + tee on every shell command, zero wiring |
| Native response cache | repeated deterministic (temp=0) request → served from cache, skips the LLM call (100% on hit) — simplicio-cli cache, on by default (SIMPLICIO_CACHE=0 to disable) |
| Simplicio capture proxy + MCP | 60-95% fewer tokens on tool outputs via a transparent compression daemon |
Savings only count on a verified-correct outcome. Baseline = the cheapest sensible non-orchestrated
path to the same result. Savings reporting is evidence-gated, not mandatory: a savings figure is
shown only when a turn actually ran an economy-producing command and the number traces to a
measured receipt (clamp tee, signatures-read, cache hit, deterministic_edit, savings_ledger).
No measured economy → no savings line; the orchestrator never fabricates a baseline or a percentage.
See references/token-economy.md.
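The evidence-gated reporting rule can be sketched as a function that returns nothing unless a measured receipt exists. The receipt schema (`source`, `baseline_tokens`, `actual_tokens`) and the `savings_line` name are hypothetical; the real contract lives in references/token-economy.md.

```python
from typing import Optional

# Receipt kinds named in the text; the dictionary keys below are illustrative.
MEASURED_SOURCES = {"clamp", "signatures", "cache", "deterministic_edit", "ledger"}


def savings_line(receipts: list[dict], outcome_verified: bool) -> Optional[str]:
    """Return a savings line only for a verified outcome with measured receipts."""
    if not outcome_verified:
        return None  # savings only count on a verified-correct outcome
    measured = [
        r for r in receipts
        if r.get("source") in MEASURED_SOURCES
        and r.get("baseline_tokens", 0) > r.get("actual_tokens", 0) >= 0
    ]
    if not measured:
        return None  # no measured economy, so no savings line at all
    baseline = sum(r["baseline_tokens"] for r in measured)
    actual = sum(r["actual_tokens"] for r in measured)
    saved = baseline - actual
    return f"saved {saved} tokens ({saved / baseline:.0%}) from {len(measured)} receipt(s)"
```

Returning `None` rather than a zero or an estimate keeps the caller from printing a fabricated percentage.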
Two different things happen when you call simplicio-tasks, and they behave differently per runtime:
- Economy (compression, output clamps, signatures-only reads, `deterministic_edit`) applies every time the skill runs and loads `simplicio-orient` / `simplicio-compress`, on any runtime. It is the skill's behavior plus the hooks (strongest where hooks exist: `orient_clamp.py` auto-clamps on Claude and Cursor; elsewhere it is instruction-driven).
- Measurement (the Token Monitor's live numbers) only counts traffic that flows through the capture proxy.
| Runtime | Economy (skill) | Measurement (monitor) |
|---|---|---|
| Hermes | ✓ | ✓ automatic — already routed through the proxy (base_url → :8788) |
| Claude | ✓ (skill + hooks) | ✗ by default — Claude talks to api.anthropic.com directly; measured only once routed (simplicio-cli wrap claude, or ANTHROPIC_BASE_URL → http://127.0.0.1:8788) |
| Codex | ✓ (skill) | ✗ by default — simplicio-cli init codex adds the MCP tools but does not route LLM traffic; measured with simplicio-cli wrap codex or an OpenAI base-url pointing at the proxy |
So: the savings happen on every runtime; the monitor tallies them automatically on Hermes, and on
Claude/Codex after a one-time routing step (simplicio-cli wrap … / base-url → :8788). Without routing,
the economy still applies — the monitor just won't count those tokens. scripts/simplicio-economy.sh wire
does this routing for OpenAI-compatible clients at install time.
A view of the savings you open when you want — only the capture is always-on:
- Capture proxy: always-on (the one auto-started service; the wired clients need it reachable). It silently captures + measures Claude + Codex + Hermes in the background.
- Web dashboard: `http://127.0.0.1:9090` shows a real-time token chart, savings gauge, the LLMs/runtimes and 141/144 providers (98%) we intercept, and a live proxy log. It opens once on the first install so you see it works, then it's on-demand. Re-open it any of these ways:
  - `simplicio-loop dashboard` works from anywhere after the pip install (no repo path needed); `simplicio-loop dashboard --stop` to close, `--no-browser` to just start the server.
  - `bash scripts/simplicio-economy.sh monitor` (repo checkout) · `… monitor stop` to close.
  - just ask the agent: "open the token dashboard".
- Menu-bar / tray widget: live tokens saved in the system tray (macOS rumps · Windows/Linux pystray). On-demand: `bash scripts/simplicio-economy.sh tray` · `… tray stop`.
Install auto-starts only the capture proxy (macOS launchd · Linux systemd · Windows Startup). The
dashboard opens once on a fresh install (marker-guarded — a re-install/update never reopens it; opt
out with SIMPLICIO_NO_DASHBOARD=1), and the tray never opens by itself — nothing is forced to stay
open. Manage the stack: scripts/simplicio-economy.sh {status|up|monitor|tray|wire}. After install,
capture runs without invoking the loop — see references/token-capture.md.
engine/simplicio_engine.py is the native Simplicio capture engine (stdlib-only, fail-open):
a transparent capture proxy plus a deterministic compression engine with no external
dependencies. Run any command via the scripts/simplicio-engine wrapper
(e.g. simplicio-engine doctor):
| Command | What it does |
|---|---|
| proxy | the transparent capture proxy: routes each model to its real provider, compresses + measures + caches (no model swap) |
| doctor | proxy reachability + lifetime savings |
| cache | native response cache (stats/clear): a repeated deterministic request is served from cache, skipping the LLM call |
| signatures | signatures-only view of a source file (bodies stripped, ~93% fewer tokens to read code) |
| semantic | reversible extractive (semantic-lite) compression |
| detect | content-type detection + smart per-block routing |
| rag | TF-IDF (or --ml embedding) retrieval over the CCR memory store |
| memory | CCR compress-cache-retrieve store (remember/recall/forget/list/stats) |
| mcp | native stdio MCP server (compress / retrieve / stats tools) |
| init / wrap | register Simplicio into a client (Claude / Codex / Copilot / OpenClaw) · run a client with capture routing |
| report / audit / capture / evals | savings report · audit a tree for compression opportunity · dry-run a request · compression regression gate |
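As an illustration of what a signatures-only view means, the sketch below strips function and class bodies from Python source using the standard `ast` module. It is not the engine's implementation: `signatures_only` is a hypothetical name, it handles Python only, and it keeps just the first line of a multi-line signature.

```python
import ast


def signatures_only(source: str) -> str:
    """Return class/def header lines with all bodies stripped."""
    lines = source.splitlines()
    out: list[str] = []

    def walk(body: list[ast.stmt], depth: int) -> None:
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                # node.lineno is the 1-based line of the def/class keyword.
                out.append("    " * depth + lines[node.lineno - 1].strip())
                if isinstance(node, ast.ClassDef):
                    walk(node.body, depth + 1)  # descend to list methods

    walk(ast.parse(source).body, 0)
    return "\n".join(out)


if __name__ == "__main__":
    demo = "class A:\n    def f(self, x):\n        return x + 1\n\ndef g(a, b=2):\n    return a * b\n"
    print(signatures_only(demo))
```

The reader still sees every callable name and parameter list, which is usually enough to orient before opening a specific body.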
Four mechanisms sustain the orchestration power:
| Pillar | Focus | Lives in |
|---|---|---|
| DAG + pipeline | parallelism by dependency, staged per item | references/orchestration.md (Step 3 pool + pipeline) |
| Isolation by worktree | parallel edits without corrupting the tree, merge-gated | references/orchestration.md |
| Adversarial verify | panel of skeptics before "delivered" | references/quality-safety-delivery.md · skill simplicio-review |
| Loop budget cap | anti-infinite-loop, dual exit | references/standing-loop-247.md · skill simplicio-loop |
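The "parallelism by dependency" pillar can be sketched with the standard library's `graphlib` (Python 3.9+). The `parallel_batches` helper and the item ids are hypothetical; the orchestrator's real scheduler is described in references/orchestration.md and may differ.

```python
from graphlib import TopologicalSorter


def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group items into batches; each batch depends only on earlier batches.

    `deps` maps an item id to the set of item ids it depends on.
    Raises graphlib.CycleError if the dependency graph has a cycle.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        batches.append(ready)
        sorter.done(*ready)                 # unblock their dependents
    return batches


if __name__ == "__main__":
    # D needs B and C; B and C both need A.
    print(parallel_batches({"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}))
```

Items within one batch have no dependency on each other, so a worker pool could run them concurrently, each in its own worktree.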
```bash
git clone https://github.com/wesleysimplicio/simplicio-loop
cd simplicio-loop
# install for your runtime (omit <runtime> to auto-detect)
bash scripts/install.sh <runtime> [--global] [--minimal]   # macOS / Linux
pwsh scripts/install.ps1 <runtime> [-Global]               # Windows
# <runtime> ∈ claude codex vscode cursor antigravity kiro opencode gemini aider hermes openclaw
```

Install is complete by default: it installs everything. One command sets up the whole stack:
the two loop operators (simplicio-mapper + simplicio-cli, auto-handling PEP 668 / externally-managed
Python and symlinking the binaries onto PATH), the full Python stack (the package itself),
the 6 skills + hooks with the loop's Stop hook wired, and the always-on capture proxy
with Claude + Codex + Hermes routed and measured in the background. The dashboard opens once on a
fresh install, then it's on-demand (simplicio-loop dashboard / simplicio-economy.sh monitor); the
menu-bar tray never opens by itself — nothing is forced to stay open.
Pass --minimal only for headless/CI to skip the heavy deps + the machine services. Verify any time:
bash scripts/simplicio-economy.sh status.
```bash
bash scripts/update.sh [<runtime>]   # git pull → reinstall skills/hooks/operators → restart services
```

update.sh stashes local edits, fast-forwards main, reinstalls from the fresh source, restarts the
launchd/systemd services so they run the new code, and prints the live stack + savings.
```bash
python3 scripts/doctor.py            # report the whole stack (REQUIRED vs OPTIONAL)
python3 scripts/doctor.py --repair   # install/wire what's fixable; make everything operational
# also: bash scripts/simplicio-economy.sh doctor [--repair]
```

doctor separates REQUIRED (python3, the two loop operators, the 6 skills, the loop hooks, the
capture proxy — --repair installs/wires them) from OPTIONAL accelerators (the tray dep).
Missing an optional piece is never a failure and
never blocks — the Python engine + the deterministic path cover everything; the exit code is 0 as
long as every REQUIRED item is healthy.
Or, on Claude Code / Cursor, install it straight from the latest GitHub release (no marketplace):
```bash
gh release download --repo wesleysimplicio/simplicio-loop --archive tar.gz
tar xzf simplicio-loop-*.tar.gz && cd simplicio-loop-*/
bash scripts/install.sh claude       # or: bash scripts/install.sh cursor
```

Then:

```text
/simplicio-tasks finish all the open issues
```
The only hard requirement is python3 on PATH (skills, hooks, and installer are cross-platform
Python). GitHub sources additionally need git and an authenticated gh. See INSTALL.md and
adapters/MATRIX.md.
Before an unattended 24/7 run: set a cost ceiling in .orchestrator/loop-budget.json
(daily_usd_ceiling > 0), confirm source auth is persistent, and keep the irreversible-op human
gate + secret-scan on. With ceiling = 0 the watcher refuses to run unattended (fail-safe).
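The fail-safe ceiling rule can be sketched as follows. Only the file path and the `daily_usd_ceiling` key come from the text; the helper names and the choice to treat a missing or malformed file as a zero ceiling are assumptions.

```python
import json
from pathlib import Path


def load_daily_ceiling(path: str = ".orchestrator/loop-budget.json") -> float:
    """Read daily_usd_ceiling; any missing or invalid value counts as 0."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        ceiling = float(data.get("daily_usd_ceiling", 0))
    except (OSError, ValueError, TypeError, AttributeError):
        return 0.0  # unreadable budget file: assume no budget was granted
    return ceiling if ceiling > 0 else 0.0


def may_run_unattended(path: str = ".orchestrator/loop-budget.json") -> bool:
    """An unattended run is allowed only with a strictly positive ceiling."""
    return load_daily_ceiling(path) > 0
```

Treating every error as "no budget" means a typo in the JSON stops the watcher instead of removing the limit.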
- Secret-scan every diff; block on hit.
- Irreversible-op human gate — force-push, history rewrite, prod deploy, data/schema delete, mass-file delete → stop and ask. Headless + no approver → remove the destructive capability.
- Enforced, not just promised: `hooks/action_gate.py` is a fail-closed `PreToolUse` / git-pre-push hook that mechanically blocks the above (and secret-laden commits) before they run. The safety contract holds even if the model forgets it. `selftest` proves the ruleset (15/15).
- 4-state pre-execution verdict: optimization may never raise a command's risk tier.
- Trust-before-load — perception-shaping config (clamp profiles, suppression lists) is untrusted until a human reviews and hash-pins it.
- Prompt-injection hardening — item/PR/comment content can never override the contract.
- Hard $ kill-switch for unattended runs; evidence-gated completion (never a false "done"); fail-open hooks (never trap the agent in a loop).
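To show what "fail-closed" means for a command gate, here is a minimal sketch. It is not the ruleset in `hooks/action_gate.py`: the patterns, labels, and the `gate` function are hypothetical and cover only a few of the categories listed above.

```python
import re

# Illustrative patterns only; the real hook's rules are not reproduced here.
BLOCK_PATTERNS = [
    ("force-push", re.compile(r"\bgit\s+push\b.*\s(?:--force(?:-with-lease)?|-f)(?:\s|$)")),
    ("history rewrite", re.compile(r"\bgit\s+filter-branch\b")),
    ("mass-delete", re.compile(r"\brm\s+-\w*(?:rf|fr)\w*\b")),
    ("destructive DDL", re.compile(r"\bDROP\s+(?:TABLE|DATABASE|SCHEMA)\b", re.I)),
    ("infra teardown", re.compile(r"\bterraform\s+destroy\b")),
]


def gate(command) -> tuple[str, str]:
    """Return ("ALLOW", "") or ("BLOCK", reason) for a shell command string."""
    try:
        for label, pattern in BLOCK_PATTERNS:
            if pattern.search(command):
                return ("BLOCK", label)
        return ("ALLOW", "")
    except Exception as exc:
        # Fail closed: if the gate itself cannot decide, the action is blocked.
        return ("BLOCK", f"gate error: {exc}")
```

The `except` branch is the point of the design: malformed input is blocked rather than waved through.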
Claims are verified, not just asserted — and the gate runs locally, with zero CI cost:
```bash
python3 scripts/check.py   # the whole gate (audit + tests)
```

- Test suite (`tests/`): the workers' deterministic `selftest`s, plus an e2e of the loop driver (`hooks/loop_stop.py`). It proves the loop stops on evidence, ignores a bare `<promise>`, and stops on the cap as distinct exits, and that the evidence producers BLOCK (never fake-pass) when their toolchain is absent. Runs under `pytest` or, with no pip at all, self-runs on bare python3 (`python3 tests/test_*.py`).
- Claims audit (`scripts/claims_audit.py`, fail-closed): every `scripts/*.py` the docs reference exists · the extension-point count agrees across all files · each cited worker command actually runs · the shipped `simplicio_loop/_bundle/` skills are byte-identical to source.
- Impact audit (`scripts/impact_audit.py`): for any code task, proves the declared task surface covers the local blast radius: dependencies, reverse dependents, and related tests.

  ```bash
  python3 scripts/impact_audit.py audit . --file path/to/seed.py --cover path/to/seed.py --fail-on high
  ```

- Flow audit (`scripts/flow_audit.py`): for mixed front/back/service repos, produces the `endpoint_compare` evidence map and fails on objective integration gaps:

  ```bash
  python3 scripts/flow_audit.py audit . --fail-on high
  ```

- Wire it as a git pre-push hook to keep `main` honest for free:

  ```bash
  printf '#!/bin/sh\npython3 scripts/check.py\n' > .git/hooks/pre-push && chmod +x .git/hooks/pre-push
  ```
pip install "simplicio-loop[dev]" adds pytest for nicer output; it is never required.
MIT

