---
title: System Invariant as Code
sidebar_label: System Invariant as Code
sidebar_position: 2
description: What system invariant as code means in Stave, and how it differs from OPA, IaC scanners, and CSPM tools.
---
System Invariant as Code means you define a small set of safety truths that must always hold for your system, then evaluate snapshots against those truths.
In Stave, a control is a YAML rule (for example: "PHI buckets are never public"). A finding is produced only when observed system state violates that rule.
Many teams have policy checks, scanners, and cloud dashboards, but still struggle to answer:
- Did this unsafe condition persist long enough to matter?
- Can we prove the same result from the same snapshot every time?
- Can we run this in air-gapped review environments with no cloud credentials?
Stave focuses on deterministic, offline proofs over local snapshots.
Let:

- `S_t` be an observation snapshot at time `t`
- `I` be a control predicate over asset properties
- `U(r, t) = 1` when asset `r` is unsafe in snapshot `S_t` under `I`
For `unsafe_state`, a violation exists if `U(r, t_now) = 1`.

For `unsafe_duration`, a violation exists when:

- `U(r, t_now) = 1`, and
- `(t_now - t_first_unsafe(r)) > threshold`
This is why Stave can express "unsafe now" and "unsafe for too long" as separate control types.
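Under these definitions, the two control types can be sketched in Python (a minimal illustration of the semantics, not Stave's implementation; the `history` list of `(timestamp, unsafe)` pairs is a hypothetical stand-in for a snapshot series):

```python
from datetime import datetime, timedelta

def first_unsafe(history):
    """Start of the current contiguous unsafe run, or None.

    history: list of (timestamp, unsafe_bool) pairs, oldest first.
    """
    start = None
    for ts, unsafe in history:
        if unsafe:
            if start is None:
                start = ts
        else:
            start = None  # a safe observation resets the run
    return start

def violates_unsafe_state(history):
    # unsafe_state: U(r, t_now) = 1
    return bool(history) and history[-1][1]

def violates_unsafe_duration(history, now, threshold):
    # unsafe_duration: unsafe now AND (t_now - t_first_unsafe(r)) > threshold
    start = first_unsafe(history)
    return start is not None and history[-1][1] and (now - start) > threshold
```

Note the reset behavior: one safe observation restarts the duration clock, which is why input quality and snapshot cadence matter for `unsafe_duration` controls.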
A simplified control:

```yaml
dsl_version: ctrl.v1
id: CTL.S3.PUBLIC.001
name: No Public S3 Buckets
description: S3 buckets with sensitive data must not be publicly readable or listable.
domain: exposure
scope_tags: [aws, s3]
type: unsafe_state
unsafe_predicate:
  any:
    - field: properties.storage.access.public_read
      op: eq
      value: true
    - field: properties.storage.access.public_list
      op: eq
      value: true
```

If either property is true in a snapshot, Stave emits a finding.
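The semantics of this `any` predicate can be sketched in a few lines of Python (a simplified re-implementation for illustration only, not Stave's code; the snapshot dict shape mirrors the field paths above):

```python
def get_field(snapshot, path):
    """Walk a dotted field path such as 'properties.storage.access.public_read'."""
    node = snapshot
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None  # absent fields yield no value
        node = node[part]
    return node

def clause_matches(snapshot, clause):
    """Evaluate one {field, op, value} clause; only the 'eq' operator is shown."""
    if clause["op"] == "eq":
        return get_field(snapshot, clause["field"]) == clause["value"]
    raise ValueError(f"unsupported op: {clause['op']}")

def any_violated(snapshot, predicate):
    """'any' semantics: a single matching clause is enough to emit a finding."""
    return any(clause_matches(snapshot, c) for c in predicate["any"])
```

For example, a bucket snapshot with `public_read: true` produces a finding even when `public_list` is false.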
| Approach | Primary input | Cloud credentials needed at evaluation time | Offline by default | What it proves | Typical lifecycle point |
|---|---|---|---|---|---|
| Policy-as-Code (OPA/Sentinel) | Config, admission requests, policy docs | Usually no for local checks; depends on integration | Often | Policy decision for a request/config | CI/CD gates, admission control |
| IaC scanners (tfsec/Checkov) | IaC source and plan artifacts | No | Yes | Static misconfiguration patterns in IaC | Pre-merge / CI scan |
| CSPM (Wiz/Prisma/etc.) | Live cloud APIs and graph inventory | Yes | No | Continuous posture and exposure in deployed cloud | Continuous monitoring |
| Stave | Local observation snapshots + control rules | No | Yes | Deterministic control violations over observed state and time windows | Offline preflight, audit evidence, reproducible investigations |
- Stave + OPA: Use OPA for request-time or pipeline gate policy decisions; use Stave to prove state controls over snapshots and time-based thresholds.
- Stave + CSPM: Use CSPM for continuous cloud detection; use Stave for offline, deterministic replay and preflight checks with no API access.
- OPA/Sentinel: general policy engines for decisions. Stave: control evaluation over snapshot history with compound risk scoring.
- tfsec/Checkov: static IaC analysis. Stave: evaluation of normalized observed state snapshots with duration tracking.
- CSPM: live cloud visibility with API calls. Stave: offline evaluation with local files, deterministic rankings.
- Evaluation-only — Stave evaluates observations; extractors and remediators are separate programs.
- Offline by design — all inputs are local files; cloud API access belongs to extractors.
- Deterministic — same inputs always produce the same findings, scores, and rankings.
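One way a consumer can check the determinism claim (a downstream sketch, not a Stave feature; the `id` key on findings is a hypothetical shape) is to hash a canonical serialization of the findings, so identical inputs must yield byte-identical digests regardless of emission order:

```python
import hashlib
import json

def findings_digest(findings):
    """Digest of findings under a canonical form: stable sort by an assumed
    'id' key, sorted JSON keys, fixed separators. Reordered but otherwise
    identical findings hash to the same value."""
    canonical = json.dumps(
        sorted(findings, key=lambda f: f["id"]),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing such a digest next to versioned snapshots and controls makes "same inputs, same findings" an auditable property rather than a trust assumption.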
This workflow is for contributors and product engineers using Stave without changing Stave internals.
- Prepare observations as `obs.v0.1` JSON snapshots in a local directory.
- Prepare controls as `ctrl.v1` YAML files (for example under `controls/s3`).
- Validate artifacts before evaluation:
  ```bash
  stave validate --controls <dir> --observations <dir>
  ```

- Run evaluation:

  ```bash
  stave apply --controls <dir> --observations <dir> --max-unsafe <duration>
  ```

- If results are unexpected, run diagnostics:

  ```bash
  stave diagnose --controls <dir> --observations <dir> [--previous-output <file>]
  ```
- Iterate on control definitions and input quality (not on Stave code) until outputs match expected safety intent.
- Define control intent in YAML (IDs, predicate logic, thresholds).
- Produce valid observation snapshots from approved sources.
- Choose runtime options (`--max-unsafe`, `--now`, allow-unknown input policy).
- Review and act on violations/diagnostics.
- Version-control control definitions and snapshots as evidence artifacts.
- Schema validation for control and observation contracts.
- Deterministic evaluation logic over snapshots and time windows.
- Built-in operator semantics (`eq`, `in`, `missing`, `any_match`, etc.).
- Standardized output structures (JSON/text flows with safety envelope validation where enabled).
- Diagnostics for common mismatch causes (threshold too high, insufficient span, reset behavior, clock skew).
- Offline operation (no cloud credentials required at evaluation time).
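For intuition, the `in` and `missing` operators named above might read roughly as follows (assumed semantics for illustration only, not taken from Stave's source):

```python
def op_in(actual, allowed):
    # 'in': the observed value is one of the allowed values
    return actual in allowed

def op_missing(snapshot, path):
    # 'missing': the dotted field path does not resolve in the snapshot
    node = snapshot
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return True
        node = node[part]
    return False
```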
For normal adoption, teams should only change:
- Control YAML files
- Observation snapshot inputs
- CLI runtime flags
Teams should not need to modify:
- `internal/core` evaluator logic
- app use-case orchestration
- input/output adapters
If you need those code changes, treat it as platform extension work, not normal control authoring.