Description
A meaningful fraction of vibehouse's implementation is agentic — diffs are generated, reviewed at a high level, and merged quickly. That workflow trades off the deep per-line scrutiny that normally catches subtle correctness bugs in consensus code. Formal proofs can close that gap mechanically: a proof assistant rejects the diff if a stated invariant no longer holds, regardless of how the change was authored.
The proposal is to add a ROCQ proof layer targeting the consensus-critical components where correctness is hardest to test exhaustively and most expensive to get wrong. The approach: hand-model the Rust semantics in ROCQ, prove soundness and completeness properties against the model, and add integration tests that validate the Rust implementation satisfies the proof assumptions. Axioms are tracked explicitly in a TRUST_ASSUMPTIONS.md.
Prior work
The Ethereum formal verification landscape is worth understanding before starting:
Runtime Verification — Gasper in Coq (2020): beacon-chain-verification. Three key theorems proven at the abstract protocol level: Accountable Safety, Plausible Liveness, and Slashable Bound. These are proven over an abstract validator set model — not tied to any concrete state transition implementation. The planned bridge from this abstract model to the concrete Python pyspec was listed as ongoing and was never completed. This is the most directly reusable prior work for vibehouse since it's already in Coq/ROCQ and covers the core safety properties we care about.
ConsenSys/EF — Dafny beacon chain (TACAS 2022): eth2.0-dafny, paper. Phase 0 only, covers runtime error absence (no array OOB, no integer overflow) and SSZ round-trip correctness. BLS signatures and hash functions are axiomatized. Inactive since ~2021. Nothing for Altair+.
Galois — blst BLS verification (Cryptol + SAW): BLST-Verification. Partial verification of blst assembly routines. Documented gaps: aggregate verification is incomplete, variable-length inputs proven only for specific sizes, elliptic curve group laws are axiomatized.
The common thread: all prior work targets Phase 0 and none of it has been updated for post-Merge forks. Nothing formally addresses ePBS, the fork choice extensions for payload timeliness, or any Gloas-specific logic.
Steps to resolve
What to prove, in priority order:
Tier 1 — Proto-array fork choice — head selection, pruning, and the should_extend_payload logic added for Gloas. Key invariants: the selected head is always a descendant of the justified checkpoint; pruning never removes any block on the canonical chain; should_extend_payload correctly reflects the (is_payload_timely && is_payload_data_available) || ... disjunction. The Runtime Verification Coq proofs provide abstract safety properties we can import as axioms and refine toward the concrete implementation.
Tier 2 — PTC quorum and builder payment arithmetic — the 60% quorum threshold, the builder_pending_payments slot indexing formula (SLOTS_PER_EPOCH + slot % SLOTS_PER_EPOCH), and the payment sweep logic. The arithmetic invariants are precise enough to state as soundness/completeness theorem pairs and small enough to prove without axioms. This is entirely novel — no prior formal work covers ePBS payment logic.
Tier 3 — process_execution_payload_envelope state transition — the Gloas envelope state transition validating block_hash, withdrawals_root, gas_limit, prev_randao against the bid commitment. This is the most novel code in the repo and was written entirely agentically. Soundness: if the proof passes, the envelope is consistent with the bid. Completeness: a valid envelope always satisfies the constraints.
CI integration
Proofs live in rocq/ alongside the implementation. CI runs rocq on every PR. A change to a Tier 1–3 file that breaks a stated invariant fails CI. This is the mechanical check the agentic workflow currently lacks.
Additional Info
The abstract Gasper Coq safety proofs (Runtime Verification) are the most natural foundation to build on — they're already in Coq/ROCQ and cover the protocol-level properties we want to refine. The planned concrete bridge that was never completed is precisely the gap this effort would fill, at least for the Gloas fork choice extensions.
BLS signatures should be axiomatized, consistent with how both the Dafny and Galois projects treat them. The Galois SAW work provides a partial specification that can be imported as a trust assumption.
Description
A meaningful fraction of vibehouse's implementation is agentic — diffs are generated, reviewed at a high level, and merged quickly. That workflow trades off the deep per-line scrutiny that normally catches subtle correctness bugs in consensus code. Formal proofs can close that gap mechanically: a proof assistant rejects the diff if a stated invariant no longer holds, regardless of how the change was authored.
The proposal is to add a ROCQ proof layer targeting the consensus-critical components where correctness is hardest to test exhaustively and most expensive to get wrong. The approach: hand-model the Rust semantics in ROCQ, prove soundness and completeness properties against the model, and add integration tests that validate the Rust implementation satisfies the proof assumptions. Axioms are tracked explicitly in a
TRUST_ASSUMPTIONS.md.Prior work
The Ethereum formal verification landscape is worth understanding before starting:
Runtime Verification — Gasper in Coq (2020): beacon-chain-verification. Three key theorems proven at the abstract protocol level: Accountable Safety, Plausible Liveness, and Slashable Bound. These are proven over an abstract validator set model — not tied to any concrete state transition implementation. The planned bridge from this abstract model to the concrete Python pyspec was listed as ongoing and was never completed. This is the most directly reusable prior work for vibehouse since it's already in Coq/ROCQ and covers the core safety properties we care about.
ConsenSys/EF — Dafny beacon chain (TACAS 2022): eth2.0-dafny, paper. Phase 0 only, covers runtime error absence (no array OOB, no integer overflow) and SSZ round-trip correctness. BLS signatures and hash functions are axiomatized. Inactive since ~2021. Nothing for Altair+.
Galois — blst BLS verification (Cryptol + SAW): BLST-Verification. Partial verification of blst assembly routines. Documented gaps: aggregate verification is incomplete, variable-length inputs proven only for specific sizes, elliptic curve group laws are axiomatized.
The common thread: all prior work targets Phase 0 and none of it has been updated for post-Merge forks. Nothing formally addresses ePBS, the fork choice extensions for payload timeliness, or any Gloas-specific logic.
Steps to resolve
What to prove, in priority order:
Tier 1 — Proto-array fork choice — head selection, pruning, and the
should_extend_payloadlogic added for Gloas. Key invariants: the selected head is always a descendant of the justified checkpoint; pruning never removes any block on the canonical chain;should_extend_payloadcorrectly reflects the(is_payload_timely && is_payload_data_available) || ...disjunction. The Runtime Verification Coq proofs provide abstract safety properties we can import as axioms and refine toward the concrete implementation.Tier 2 — PTC quorum and builder payment arithmetic — the 60% quorum threshold, the
builder_pending_paymentsslot indexing formula (SLOTS_PER_EPOCH + slot % SLOTS_PER_EPOCH), and the payment sweep logic. The arithmetic invariants are precise enough to state as soundness/completeness theorem pairs and small enough to prove without axioms. This is entirely novel — no prior formal work covers ePBS payment logic.Tier 3 —
process_execution_payload_envelopestate transition — the Gloas envelope state transition validatingblock_hash,withdrawals_root,gas_limit,prev_randaoagainst the bid commitment. This is the most novel code in the repo and was written entirely agentically. Soundness: if the proof passes, the envelope is consistent with the bid. Completeness: a valid envelope always satisfies the constraints.CI integration
Proofs live in
rocq/alongside the implementation. CI runsrocqon every PR. A change to a Tier 1–3 file that breaks a stated invariant fails CI. This is the mechanical check the agentic workflow currently lacks.Additional Info
The abstract Gasper Coq safety proofs (Runtime Verification) are the most natural foundation to build on — they're already in Coq/ROCQ and cover the protocol-level properties we want to refine. The planned concrete bridge that was never completed is precisely the gap this effort would fill, at least for the Gloas fork choice extensions.
BLS signatures should be axiomatized, consistent with how both the Dafny and Galois projects treat them. The Galois SAW work provides a partial specification that can be imported as a trust assumption.