All notable changes to SOMA. Format follows Keep a Changelog.
SOMA uses CalVer in YYYY.M.D form, where each segment is the date of release (no zero-padding — PEP 440 canonicalizes leading zeros away anyway). Multiple releases on the same date append .N: 2026.4.30, 2026.4.30.1, 2026.4.30.2, ….
Versions before 2026.4.30 used a YYYY.M.PATCH counter where the middle segment was a sequential minor, not a month — confusing, and abandoned in favor of the date scheme. Historic CHANGELOG entries (2026.6.x, 2026.5.x, 2026.4.x pre-30) reflect that older numbering and are left as-is.
First release under the new date-based CalVer scheme. Aggregates everything done since the deleted 2026.6.2 artifacts.
tools/hook_bench.py— hook-latency benchmark scaffold.dashboard:SOMA_DASHBOARD_TOKENenv-gated bearer-token auth on every API endpoint. Off by default for backwards compatibility; banner on stderr at startup announces the state.calibration:schema_versionmigration framework. Older profiles run through registered migrators; newer-than-build profiles refuse to overwrite and log underSOMA_DEBUG=1.persistence:update_engine_stateone-shot helper around the existingengine_state_transaction(load → mutate → save underflock).
versioning: switch to date-based CalVer (this entry).
analytics: SQLite trigger rejectsNULL firing_idinserts intoab_outcomes(defense in depth for the firing-id contract).analytics: hot-path indexes onguidance_outcomes(pattern_key, ts)and partialab_outcomes(pattern, arm).ab: purge orphanedab_counters.json.tmp.*files onreset_counters(left behind by hook subprocesses that crashed betweenfsyncandos.replace).cli: friendly error on invalidmodeargument; restore wheel build path.errors:log_silent_failurehelper wired into ~14 swallow sites in data-integrity / write paths. Gated onSOMA_DEBUG=1, silent by default.config: movedconfig_loaderfromcli/to top-levelsoma.config(no public API change).audit: prune rotated audit logs pastretain_rotated.guidance: 1-hour TTL on the healing-transition cache, ontime.monotonic()so NTP corrections can't freeze it.dashboard: token comparison viahmac.compare_digest(closes timing oracle).persistence: prune stale agents fromengine_state.jsonon load.tunables: behavioral thresholds centralized insoma.tunables.
- Deprecated
packages/soma-ai/TypeScript stub. - All in-repo markdown docs (re-authored from scratch).
- All historic GitHub Releases and PyPI artifacts (
2026.4.x–2026.6.2) were withdrawn before this release. Tags and remote releases are gone; the codebase carries forward.
Reliability + performance hotfix on top of 6.1.
hooks: in-processcompile()instead ofpy_compilesubprocess; tighten lint timeouts to 500 ms. Cuts hook-tool RTT in the common path.
analytics: setbusy_timeout=5000+synchronous=NORMALon SQLite connections. WAL mode already on.ab: atomic write forab_counters.json(tmp + fsync + os.replaceunderflock). Prior non-atomic write could corrupt counters under concurrent hook subprocesses.guidance: short-circuit retired patterns incheck_followthrough(no-op for_stats,drift,entropy_drop,context).errors:SomaBlockedandSomaBudgetExhaustednow correctly inherit fromSOMAError.tools:phantom_smoke.pyresolves paths relative to script + reads env-overridable database location.
Stop-ship hotfix: four security + correctness fixes immediately after 6.0.
hooks: reject flag-shapedfile_path(rejects--evilmasquerading as a file path) and add--separator before the path argument in the hook subprocess invocation.cli: function-level allowlist for the--definitionSQL identifier invalidate-patterns(no SQL identifier injection through CLI args).
release-gate: excludeNULL firing_idrows from_arm_countsso the count matches the t-test filter — gates were previously letting through under-powered comparisons.ab: droph=2INSERT whenpressure_after_h1isNULL(closes theB1-class bias at horizon 1).
The "proof pipeline" release: multi-horizon A/B recording, multi-definition guidance outcomes, dropguard, and pattern retirement after audit.
ab: multi-horizon recording —INSERTath=2(orh=1),UPDATE-by-firing_idath=5andh=10. Outcomes can be analysed at any horizon without rerunning sessions.analytics:firing_idcolumn added toab_outcomes+ per-horizon pressure columns (pressure_after_h1,pressure_after_h5,pressure_after_h10).guidance:compute_multi_helped— three orthogonal "helped" definitions persisted side-by-side (helped_pressure_drop,helped_tool_switch,helped_error_resolved).analytics:guidance_outcomescolumns for the three helped definitions.cli:validate-patternsgains--horizonand--definitionflags.dashboard: pattern cards surface multi-helped breakdown.feat: P2.1—bash_error_streakpredictor inOBSERVEmode.feat: P2.2— A/B coverage release gate blocks under-sized ships.analytics: migration archives pre-firing_idlegacyab_outcomesrows into a quarantine table;validate-patternsexcludes them.tools:phantom_smokerunner — end-to-end pipeline verification without an LLM.
wrap+hooks: double-wrap guard + cross-IDEagent_idfamily routing.setup: refuse to overwrite a corruptsettings.json; atomic write with.bak.
ab: idempotency token onshould_inject— kills A/B counter rebias on retried hook invocations.budget: removed silent spend clamp that was suppressing thecost_spiralsignal.state: atomic writes for predictor / quality / fingerprint / task_tracker persistence.hooks:flockaroundcircuit_<aid>.jsonread-modify-write; wrap followthrough RMW block incircuit_transaction.hooks: surfacePostToolUseexceptions onstderrinstead of silentpass.baseline:from_dictdefaults match the constructor.- Multiple corrections from independent code-review (pre-firing slice, firing_id collision window, RMW rollback, timeout pressure).
statusline: mtime-keyed cache with 1-second TTL; cuts repeated reads when the user holds the prompt open.
- Patterns retired after audit:
entropy_drop— hardcodedavg_gap < 3.0panic threshold fired during fastRead/Globexploration loops; historical precision <20%. Underlying entropy signal stays.context—helpedwas structurally biased toward 0% because it required the agent to literally writenext.mdor rungit compact. Underlying context-usage signal stays.
Block-randomized A/B + data hygiene.
ab: block-randomized A/B harness — per-firing assignment totreatment/control, randomization keyed onfiring_id(notsession_id) to prevent intra-session bleed.
- Silence-cache reset that was blocking the A/B data pipeline.
- Silence refresh now filters pre-reset
guidance_outcomesso resets actually clear pattern-silence state. - Test-agent write guard prevents
test_*agents from polluting analytics; purge migration in analytics removes prior pollution.
A/B measurement bias + control-arm contamination fix.
- A/B measurement bias caused by control arm receiving the same context-injection as treatment under a specific code path.
- Control-arm cleanup migration to remove contaminated rows.
A/B proof pipeline + stricter followthrough.
- A/B proof pipeline:
ab_outcomestable, control-arm sham firings, paired-difference reporting, and the validation CLI to surface gate status per pattern.
- Followthrough scoring now requires stricter behavioural evidence to count as "helped".
Round-2 audit fixes — second-pass corrections after independent review of 2026.5.1.
- A series of correctness fixes from the round-2 audit (specific issue ids in commit messages).
Audit-fix release — first-pass corrections after self-review of 2026.5.0.
- Correctness regressions surfaced by the post-release audit.
Self-Calibration + Strict Mode. Per-agent warmup, adaptive thresholds, and a hard-block tier for destructive ops.
calibration:CalibrationProfile+ persistence — per-agent baselines persist across sessions.calibration: warmup gate — guidance disabled until enough actions have been observed for the per-agent baseline to be meaningful.calibration: personal thresholds in pattern checks — pattern triggers consult the per-agent profile, not a global constant.calibration: adaptive auto-silence refresh loop — patterns that consistently fail to help are silenced automatically; silence is re-evaluated on a refresh interval.blocks:BLOCKmode state +soma unblockCLI.strict-mode:PreToolUseenforcement + block lifecycle (issue, expire, clear).signal-pruning: dropped_statsuser-facing emission. Highest fatigue source in 5.x; underlying signal collection unchanged.visibility: surfaced SOMA's work to the user explicitly via the status line and dashboard.
Clean replacement for the yanked 2026.4.2.
- Drift pattern precision: removed false-positive triggers on intentional context shifts.
Pulled from PyPI immediately after release; replaced by 2026.4.3.
Switch to CalVer. Same wire format as 0.7.x; new versioning scheme.
- Version scheme: SemVer → CalVer (
YYYY.M.PATCH). Reset patch numbering.
Dashboard rebuild + Smart Guidance v2.
- Dashboard: full rebuild on FastAPI + SSE with new pattern-card surface, ROI panel, replay tab.
- Smart Guidance v2: pattern engine refactor — priority dict, persistent cooldowns, per-pattern severity, healing-suggestion dictionary.
- Audit logging: every guidance firing recorded with evidence + outcome.
- Cost-template guidance.
- Pressure system recalibrated: tighter thresholds, less false-positive guidance.
- Repo cleanup; docs overhaul.
- Dashboard v2 (predecessor to the 0.7.0 rebuild).
- Repo cleanup pass.
Mirror: proprioceptive behavioural feedback. First version where the agent reads its own state and changes course mid-session — the closed-loop primitive that everything since has built on.
- Mirror loop: state read → guidance → context injection → next action.
- Documentation overhaul.
- Async/streaming hardening.
0.4.0— initial guidance system release.0.4.6— workflow-aware severity.0.4.7— positive feedback (reinforce when the agent does the right thing).0.4.9— directive injections + audit fixes + coverage.0.4.11–0.4.12— core polish, multi-agent ready.
0.3.1— auto-install slash commands viasetup-claude.0.3.0— Claude Code plugin: slash commands and auto-registered hooks; PyPI publish workflow with trusted publishing.0.2.x— 9 new CLI commands + autonomy modes fully connected.0.1.x— project skeleton: types, ring buffer, first 11 tests.
Versioning policy: CalVer minor (YYYY.M.X) bumps for any user-visible change, patch for hotfixes, retroactive yanks for shipped regressions (see 2026.4.2).