diff --git a/.github/workflows/gust-renode.yml b/.github/workflows/gust-renode.yml index 05fa51a..38e3efb 100644 --- a/.github/workflows/gust-renode.yml +++ b/.github/workflows/gust-renode.yml @@ -45,12 +45,13 @@ jobs: restore-keys: gust-renode-bazel- # ubuntu-22.04 ships bazelisk as `bazel`; it reads the root .bazelversion (9.1.0). - - name: Run the three M3 device-class Renode tests + - name: Run the M3 device-class + driver Renode tests run: | bazel test \ //:gust-renode \ //:gust-control-renode \ //:gust-f100-renode \ + //:gust-uart-renode \ --test_output=all - name: Upload Renode test logs diff --git a/artifacts/gust_driver_model.yaml b/artifacts/gust_driver_model.yaml new file mode 100644 index 0000000..2a6c59f --- /dev/null +++ b/artifacts/gust_driver_model.yaml @@ -0,0 +1,282 @@ +# gust generic driver model — the gust:hal capability seam + the verified-wasm / +# trusted-TCB split for extending gust with arbitrary drivers (UART first). +# Design spec: docs/superpowers/specs/2026-06-25-gust-driver-model-design.md. +# Extends BYO-OS (gale#74): a per-node bespoke composition of exactly the driver +# components a node needs, dissolved to native (meld fuse -> loom -> synth). +artifacts: + # ---------------------------------------------------------------- requirements + - id: REQ-DRV-SEAM-001 + type: sw-req + title: "gust drivers extend via the typed gust:hal capability seam" + status: proposed + description: > + gust SHALL expose hardware capabilities to driver logic through a typed WIT + package (gust:hal). A driver component IMPORTS capability interfaces; the + trusted host bridge IMPLEMENTS them. Both sides compile against the same + WIT, so the verified/TCB boundary is a typed contract, not a convention. + tags: [gust, driver, hal, seam, byo-os] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + - type: related-to + target: REQ-BYOOS-MULTIRT-001 + + - id: REQ-DRV-WASM-001 + type: sw-req + title: "Driver protocol logic is verified wasm; the TCB is the irreducible MMIO+IRQ sliver" + status: proposed + description: > + A driver's protocol / state machine / framing / buffering / arithmetic + SHALL be implemented in verified wasm and dissolved to native. The trusted + computing base for a driver SHALL be limited to the irreducible primitives: + the raw MMIO read/write (sandboxed wasm cannot touch MMIO), the interrupt + vector entry, and the "buffer the byte + wake the task" step. Maximising the + verified-wasm fraction (minimising the TCB) is the design objective. + tags: [gust, driver, wasm, tcb, maximal-wasm] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + - type: related-to + target: SAC-BYOOS-EXEC + + - id: REQ-DRV-SRAM-001 + type: sw-req + title: "A composed per-node image fits the target SRAM budget (F100 = 8 KiB)" + status: proposed + description: > + A composed, dissolved per-node image SHALL fit the target device SRAM. On + the STM32F100 (Jess's device class) that budget is 8 KiB. SRAM is the + binding resource: dissolved .text executes from flash (128 KiB, ample), + while the wasm linear memory (buffers, state) and shadow stack consume SRAM. + The driver design SHALL minimise linear-memory footprint (static-sized to + need, no heap, synth#383 shadow-stack shrink). + tags: [gust, driver, sram, f100, footprint] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + + - id: REQ-DRV-COMPOSE-001 + type: sw-req + title: "Per-node bespoke composition — only the drivers a node needs" + status: proposed + description: > + Each node's image SHALL be composed from exactly the specific driver + components that node requires (each driver importing only the gust:hal + capabilities it uses), meld-fused and dissolved. Nothing unused is linked, + so the SRAM cost equals the node's actual driver set. The composition — not + a fixed firmware — is the product. + tags: [gust, driver, composition, byo-os, sram] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + - type: related-to + target: REQ-BYOOS-MULTIRT-001 + + - id: REQ-DRV-ASYNC-001 + type: sw-req + title: "Seam supports sync and split-phase capabilities; no host Future crosses the wasm boundary" + status: proposed + description: > + The gust:hal seam SHALL support two capability shapes: SYNC capabilities + (brief blocking register touches, e.g. mmio.write8 / uart.put-byte) for + control-loop I/O, and SPLIT-PHASE capabilities (start + poll/yield, + kiln-wakeable, e.g. irq.wait / uart.rx-poll) for interrupt-driven streaming. + No host async Future SHALL cross the wasm boundary — only plain synchronous + start/poll calls do; host async (embassy) stays entirely in the bridge. + tags: [gust, driver, async, kiln, split-phase] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + + - id: REQ-DRV-VERIFY-001 + type: sw-req + title: "Driver logic is formally verified (Verus + Rocq + Kani), like the gale kernel primitives" + status: proposed + description: > + A driver's nontrivial logic SHALL carry the same formal verification as the + gale kernel primitives — Verus (SMT) + Rocq (theorem proving) + Kani (BMC), + to the intersection (no trait objects / closures / async in verified code). + "Verified wasm" is EARNED by proofs, not asserted by being in wasm. Concretely: + the driver's pure decisions are gale `_decide`-style functions promoted into + the gale verified crate (src/ Verus -> verus-strip -> plain/ -> wasm; proofs/ + Rocq; tests/ Kani); buffering reuses the already-proven gale::msgq ring; only + the irreducible MMIO poke is unverified I/O. The error-priority RX decision + (never read DR on overrun/framing error) is the first proven property. + tags: [gust, driver, verus, rocq, kani, verified, asil] + links: + - type: derives-from + target: SYSREQ-BYOOS-001 + + # ----------------------------------------------------------- design decisions + - id: DD-DRV-EMBASSY-001 + type: sw-detail-design + title: "embassy-hal is an optional host backend only; kiln stays the sole executor" + status: proposed + description: > + embassy-hal MAY back the fat/mid seam's host bridge (peripheral register + access) but its executor is NOT adopted. kiln-async remains the single + executor. embassy futures, if used, are driven to completion or polled + inside the trusted bridge and never surface to wasm. On the 8 KiB F100 the + fat/embassy seam is expected to exceed SRAM — see FIND-DRV-MEASURE-001. + tags: [gust, driver, embassy, kiln, tcb] + links: + - type: refines + target: SAC-DRV-UART + - type: satisfies + target: REQ-DRV-ASYNC-001 + + - id: DD-DRV-GRAN-001 + type: sw-detail-design + title: "Seam granularity (thin/mid/fat) is chosen by measurement" + status: proposed + description: > + The capability granularity trades TCB size against verified-wasm fraction: + thin (mmio.read8/write8 + irq.wait — wasm owns everything, ~10-line generic + TCB), mid (uart.put-byte/rx-poll — TCB owns byte-level access + RX IRQ), fat + (uart.write/read — embassy owns the driver). The default granularity SHALL + be chosen from the per-seam SRAM/TCB measurement, not asserted a priori. + tags: [gust, driver, seam, granularity, measurement] + links: + - type: refines + target: SAC-DRV-HAL + - type: satisfies + target: REQ-DRV-WASM-001 + - type: satisfies + target: REQ-DRV-SRAM-001 + + - id: DD-DRV-XIP-001 + type: sw-detail-design + title: "XIP is a follow-on flash/code lever, not the SRAM answer" + status: proposed + description: > + Execute-in-place from external flash addresses flash/code (a library of + composed per-node images), NOT the SRAM/buffer pressure that binds the + 8 KiB part. XIP is therefore out of scope for the first spike; revisit when + image count or size makes internal flash the constraint. + tags: [gust, driver, xip, flash, deferred] + links: + - type: refines + target: SAC-DRV-HAL + - type: related-to + target: REQ-DRV-SRAM-001 + + # ------------------------------------------------------------- architecture + - id: SAC-DRV-HAL + type: sw-arch-component + title: "gust:hal — the WIT capability seam (driver imports, TCB bridge implements)" + status: proposed + description: > + A new WIT package gust:hal@0.1.0 defining capability interfaces (mmio, irq, + uart) that dissolved driver components import and the trusted host bridge + implements. The stable contract both sides compile against; the verified/TCB + boundary. Hand-authored (consistent with gale.wit — the wasm-component layer + is not spar-generated in this repo; see FIND-DRV-SPAR-001). + tags: [gust, driver, hal, wit, seam] + links: + - type: allocated-from + target: REQ-DRV-SEAM-001 + + - id: SAC-DRV-UART + type: sw-arch-component + title: "UART driver — verified-wasm protocol over gust:hal, three-seam bridges" + status: proposed + description: > + The first driver: the UART protocol (init, baud, TXE/RXNE polling, RX ring + buffer) implemented in verified wasm, built at thin/mid/fat seams over a + host bridge, driven on the kiln stack by the gust_uart demonstrator (sync TX + + split-phase RX via the r11=0 trampoline pattern). Proves the seam and + yields the per-seam SRAM/TCB measurement. + tags: [gust, driver, uart, wasm, kiln] + links: + - type: allocated-from + target: REQ-DRV-WASM-001 + - type: allocated-from + target: REQ-DRV-ASYNC-001 + + # ------------------------------------------------------------- verification + - id: VER-DRV-UART-RENODE + type: sw-verification + title: "Renode STM32F100 USART content-based TX/RX echo gate" + status: proposed + description: > + A renode_test target on the STM32F100 platform (usart1 = UART.STM32_UART @ + 0x40013800): the dissolved UART driver TXes a known sequence and RXes it + back (loopback); a Renode terminal tester on sysbus.usart1 asserts the + echoed content via Wait For Line On Uart. Content-based (not just no-fault) + because a real USART is capturable headless (unlike the SemihostingUart). + tags: [gust, driver, uart, renode, f100, oracle] + links: + - type: verifies + target: REQ-DRV-WASM-001 + - type: verifies + target: REQ-DRV-SEAM-001 + + - id: VER-DRV-SRAM + type: sw-verification + title: "Per-seam SRAM-budget oracle (.bss/.data <= 8 KiB on F100)" + status: proposed + description: > + An oracle that links the composed image per seam and checks .bss + .data + against the F100 8 KiB budget (llvm-size). Confirms the thin/mid seams fit + and quantifies the fat/embassy seam's overflow. Produces the headline + SRAM-vs-budget column of the measurement table. + tags: [gust, driver, sram, oracle, measurement] + links: + - type: verifies + target: REQ-DRV-SRAM-001 + + - id: VER-DRV-KANI + type: sw-verification + title: "Kani BMC: USART RX decision is total + error-priority (proven green)" + status: proposed + description: > + cargo kani harness rx_decide_error_priority over ALL 2^32 status-register + values: usart_rx_decide is total (no panic), never returns Ready while an + error bit (ORE/FE) is set, and Ready implies RXNE with no errors — so the + driver provably never reads DR mid-error. VERIFICATION SUCCESSFUL (0/14 + checks failed) on drivers/uart-thin. The Verus + Rocq tracks attach on + promotion of the decision into the gale verified crate (REQ-DRV-VERIFY-001). + tags: [gust, driver, kani, verified, oracle] + links: + - type: verifies + target: REQ-DRV-VERIFY-001 + - type: verifies + target: REQ-DRV-WASM-001 + + # ------------------------------------------------------------- findings + - id: FIND-DRV-MEASURE-001 + type: finding + title: "Per-seam SRAM/TCB measurement spike (UART on F100) — results pending" + status: active + description: > + The empirical core: build the UART driver thin/mid/fat on F100 and record, + per seam, SRAM (.bss/.data vs 8 KiB), TCB .text bytes, and dissolved-driver + .text. Decides DD-DRV-GRAN-001 with a number and validates "how much fits in + verified wasm on a tiny node." Also feeds the gust perf bench so the UART + driver's meld/loom/synth size+cycle numbers drive those tools. + RESULTS (synth 0.15.0 + loom 1.1.16): THIN seam = the entire STM32 USART + protocol (incl. the Kani-proven RX decision, VER-DRV-KANI) dissolves to + 382 B flash .text, 0 B .data, 0 B .bss (ZERO SRAM in the poll-drain form), + TCB = 3 import relocations (mmio.read32/write32, irq.poll). A buffered + CCSDS-stream RX would add SRAM for the ring buffer (reusing proven + gale::msgq); the protocol logic itself is SRAM-free. mid/fat seam numbers + pending. + tags: [gust, driver, measurement, sram, tcb, perf-bench] + links: + - type: related-to + target: DD-DRV-GRAN-001 + + - id: FIND-DRV-SPAR-001 + type: finding + title: "gust:hal WIT is hand-authored — spar->WIT not wired for the wasm-component layer" + status: active + description: > + The feature-loop ideal is spar (AADL) -> generated WIT. In this repo spar + models the kernel primitives (safety/aadl/*.aadl) but the wasm-component + layer's WIT (gale.wit, gale-app-demo) is hand-derived. gust:hal follows that + established pattern (hand-authored). Logged as friction toward eventually + generating the capability WIT from a spar model of the driver/host seam. + tags: [gust, driver, spar, wit, friction] + links: + - type: related-to + target: SAC-DRV-HAL diff --git a/benches/gust/build.rs b/benches/gust/build.rs index 154a9ef..915eee9 100644 --- a/benches/gust/build.rs +++ b/benches/gust/build.rs @@ -87,5 +87,15 @@ fn main() { println!("cargo:rustc-link-arg-bin=gust_control={}", cobj.display()); println!("cargo:rerun-if-changed={}", cobj.display()); } + // The dissolved thin-seam UART driver (drivers/uart-thin → loom → synth + // --native-pointer-abi): the ENTIRE STM32 USART protocol in verified wasm + // (Kani-proven RX decision), importing only gust:hal mmio + irq. Driven by the + // gust_uart demonstrator, whose ~10-line bridge supplies mmio_read32/write32 + + // irq_poll. Reproduce: see drivers/uart-thin/RESULTS.md. + let uobj = Path::new(&manifest).join("drivers/uart-thin/uart-thin-cm3.o"); + if uobj.exists() { + println!("cargo:rustc-link-arg-bin=gust_uart={}", uobj.display()); + println!("cargo:rerun-if-changed={}", uobj.display()); + } println!("cargo:rerun-if-changed=build.rs"); } diff --git a/benches/gust/drivers/uart-thin/Cargo.lock b/benches/gust/drivers/uart-thin/Cargo.lock new file mode 100644 index 0000000..bb5cdad --- /dev/null +++ b/benches/gust/drivers/uart-thin/Cargo.lock @@ -0,0 +1,7 @@ +# This file is automatically @generated by Cargo. +# It is not intended for manual editing. +version = 4 + +[[package]] +name = "gust-uart-thin" +version = "0.1.0" diff --git a/benches/gust/drivers/uart-thin/Cargo.toml b/benches/gust/drivers/uart-thin/Cargo.toml new file mode 100644 index 0000000..bee91ab --- /dev/null +++ b/benches/gust/drivers/uart-thin/Cargo.toml @@ -0,0 +1,19 @@ +# Standalone (not part of the gust workspace) — built ad-hoc for wasm32 + dissolve. +[workspace] + +[package] +name = "gust-uart-thin" +version = "0.1.0" +edition = "2021" +description = "gust:hal thin-seam UART driver — the entire STM32 USART protocol in verified wasm, importing only the generic mmio/irq capabilities" + +[lib] +# cdylib for the wasm32 dissolve target; rlib so Kani can model-check the pure +# decision logic (cargo kani) — the driver's verifiable core, gale-style. +crate-type = ["cdylib", "rlib"] + +[profile.release] +panic = "abort" +opt-level = "s" +lto = true +codegen-units = 1 diff --git a/benches/gust/drivers/uart-thin/RESULTS.md b/benches/gust/drivers/uart-thin/RESULTS.md new file mode 100644 index 0000000..6e621ea --- /dev/null +++ b/benches/gust/drivers/uart-thin/RESULTS.md @@ -0,0 +1,80 @@ +# Thin-seam UART driver — dissolve measurement + +The **entire STM32 USART protocol** (init, baud, TXE/RXNE polling, RX drain) +implemented in verified wasm (`src/lib.rs`), importing only the generic +`gust:hal` capabilities `mmio.{read32,write32}` + `irq.poll`. Dissolved with +**loom 1.1.16 + synth 0.15.0** (`--target cortex-m3 --native-pointer-abi +--shadow-stack-size 1024 --all-exports --relocatable`). + +| metric | value | +|---|---| +| dissolved `.text` (flash) | **326 B** | +| `.data` (SRAM) | **0 B** | +| `.bss` (SRAM) | **0 B** | +| **SRAM total** | **0 B** (poll-drain RX; no ring buffer) | +| TCB (import relocations) | **3** — `mmio_read32`, `mmio_write32`, `irq_poll` | +| export | `driver_step` | + +The TCB is the ~10-line generic register-poke + IRQ-flag bridge, shared by every +driver. The whole driver is verified wasm; nothing peripheral-specific is in the +trusted code. + +**Honest caveat:** this poll-drain form allocates no RX buffer, so SRAM = 0. A +*buffered* RX (needed for the gale#65 CCSDS-over-USART stream) puts its ring +buffer in linear memory → that buffer is the SRAM cost; the protocol logic stays +free. The mid/fat seam objects and the buffered variant are measured next. + +Reproduce: +```sh +cd benches/gust/drivers/uart-thin +cargo build --release --target wasm32-unknown-unknown +loom optimize target/wasm32-unknown-unknown/release/gust_uart_thin.wasm --passes inline | \ + synth compile - --target cortex-m3 --native-pointer-abi --shadow-stack-size 1024 \ + --all-exports --relocatable -o uart-thin-cm3.o +llvm-size uart-thin-cm3.o +``` + +## synth 0.15.0 perf test (the new version) — levers help compute, not I/O + +Dissolved the driver with synth 0.15.0's four ARM levers **on vs off**: + +| | `.text` | +|---|---| +| levers OFF | 382 B | +| levers ON (0.15.0 default) | 382 B | +| **delta** | **0 B (0%)** | + +cm3 == cm4 == 382 B. **The levers give nothing here** — the UART driver is +I/O-bound (memory-mapped register loads/stores + a poll loop via meld-dispatched +imports), not the arithmetic-dense clamp/select the levers target (which gave +gust_mix **−31%**). Honest perf-loop finding: *the arithmetic levers optimise +compute; the optimisation opportunity for driver code is the **meld-dispatch +import-call overhead** (synth logs "Meld dispatch enabled" for the 3 mmio/irq +imports), not the ARM peephole levers.* → a recommendation for meld/synth. + +## Renode end-to-end — WORKING + +`gust_uart` (demonstrator + ~10-line thin bridge) drives the dissolved driver on +a hermetic Renode Cortex-M3 with a **real STM32 USART model** (usart1 = +UART.STM32_UART @ 0x40013800). The driver TXes via `uart_tx_byte` over MMIO and +the USART emits — captured output: **`gust-uart-thin`** (614 instr, no fault). + +**Design that made it work (and fixed the earlier placement issue):** a driver +provides *protocol primitives* (`uart_init` / `uart_tx_byte` / `uart_rx` / +`uart_rx_fired`); the **app owns the payload**. So the driver carries **no data +segment** — the earlier failure was an embedded TX string landing in the wasm +1 MB linmem at a VMA the linker didn't map (native-pointer-abi). With the string +moved to the demonstrator (normal flash), the driver is 0-data/0-bss, needs no +r11 trampoline, and places cleanly. + +Bonus: a **real USART** file-backend *is* capturable headless on the macOS Renode +portable (unlike the SemihostingUart) — so the content-based `Wait For Line` +correctness gate works locally *and* in CI. + +| metric (primitive driver) | value | +|---|---| +| dissolved `.text` (flash) | **254 B** | +| SRAM (`.data` + `.bss`) | **0 B** | +| exports | uart_init, uart_tx_byte, uart_rx, uart_rx_fired | +| TCB (import relocations) | mmio_read32, mmio_write32, irq_poll | +| verified | usart_rx_decide — Kani SUCCESSFUL (error-priority, all 2³² SR) | diff --git a/benches/gust/drivers/uart-thin/src/lib.rs b/benches/gust/drivers/uart-thin/src/lib.rs new file mode 100644 index 0000000..06d0b20 --- /dev/null +++ b/benches/gust/drivers/uart-thin/src/lib.rs @@ -0,0 +1,144 @@ +//! gust:hal **thin-seam** UART driver — the maximal-wasm extreme. +//! +//! The ENTIRE STM32 USART protocol (init, baud, TXE/RXNE polling, RX drain) +//! lives here, in verified wasm. It imports only the generic `gust:hal/mmio` +//! (read32/write32) and `gust:hal/irq` (poll) capabilities; the trusted bridge +//! is a ~10-line generic register-poke + IRQ-flag, shared by every driver. No +//! host UART driver exists — this *is* the driver, dissolved to native. +//! +//! Build: cargo build --release --target wasm32-unknown-unknown +//! Dissolve: loom optimize --passes inline | synth compile --target cortex-m3 +//! --native-pointer-abi --shadow-stack-size --all-exports --relocatable +// no_std for the wasm32 dissolve target; under `cargo kani` we build for the host +// (std) so the model checker can exercise the pure decision logic. +#![cfg_attr(not(kani), no_std)] + +#[cfg(not(kani))] +#[panic_handler] +fn panic(_: &core::panic::PanicInfo) -> ! { + loop {} +} + +// gust:hal capability imports — become import-call relocations in the dissolved +// object, resolved at link by the TCB bridge (mmio.{read32,write32}, irq.poll). +extern "C" { + fn mmio_read32(addr: u32) -> u32; + fn mmio_write32(addr: u32, val: u32); + /// irq.poll(line): nonzero if the line fired since last poll (and clears it). + fn irq_poll(line: u32) -> u32; +} + +// STM32F1 USART1 register map — the only device knowledge, and it is *data* +// (addresses/bitmasks), not trusted code. F100 value line is compatible here. +const USART1: u32 = 0x4001_3800; +const SR: u32 = USART1 + 0x00; // status +const DR: u32 = USART1 + 0x04; // data (low 9 bits) +const BRR: u32 = USART1 + 0x08; // baud divisor +const CR1: u32 = USART1 + 0x0C; // control 1 + +const TXE: u32 = 1 << 7; // transmit data register empty +const RXNE: u32 = 1 << 5; // read data register not empty +const ORE: u32 = 1 << 3; // overrun error +const FE: u32 = 1 << 1; // framing error +const UE: u32 = 1 << 13; // USART enable +const TE: u32 = 1 << 3; // transmitter enable +const RE: u32 = 1 << 2; // receiver enable + +/// USART RX status decision — the driver's pure, verifiable core (gale `_decide` +/// style). Total over all SR values; **errors take priority over data-ready** so +/// the driver never reads DR on an overrun/framing error (which would desync the +/// byte stream — the safety property). Proven by Kani here; the Verus + Rocq +/// tracks attach when this is promoted into a gale verified module / its buffering +/// reuses the already-proven gale::msgq ring (see REQ-DRV-VERIFY-001). +#[derive(Clone, Copy, PartialEq, Eq)] +pub enum RxStatus { + Idle, + Ready, + Overrun, + FramingError, +} + +#[inline] +pub fn usart_rx_decide(sr: u32) -> RxStatus { + if sr & ORE != 0 { + RxStatus::Overrun + } else if sr & FE != 0 { + RxStatus::FramingError + } else if sr & RXNE != 0 { + RxStatus::Ready + } else { + RxStatus::Idle + } +} + +const RX_IRQ_LINE: u32 = 0; + +#[inline(always)] +fn rd(a: u32) -> u32 { + unsafe { mmio_read32(a) } +} +#[inline(always)] +fn wr(a: u32, v: u32) { + unsafe { mmio_write32(a, v) } +} + +/// Sentinel returned by `uart_rx` when no byte is available (or an error gated +/// the read) — keeps the export a plain scalar, no linmem/option in the ABI. +pub const RX_NONE: u32 = 0xFFFF_FFFF; + +// ---- exported protocol primitives (the driver's gust:hal-facing surface) ---- +// A driver provides primitives; the app owns the payload. This keeps the driver +// free of any data segment (no embedded strings) → 0 linmem, 0 SRAM, and no +// native-pointer-abi data-placement dependency. + +#[no_mangle] +pub extern "C" fn uart_init(brr: u32) { + wr(BRR, brr); + wr(CR1, UE | TE | RE); +} + +#[no_mangle] +pub extern "C" fn uart_tx_byte(b: u32) { + while rd(SR) & TXE == 0 {} + wr(DR, b & 0xFF); +} + +/// Read one byte if available — gated on the *verified* decision: only read DR on +/// Ready, never on an error (reading mid-error would desync the stream). Returns +/// RX_NONE when Idle/error. +#[no_mangle] +pub extern "C" fn uart_rx() -> u32 { + match usart_rx_decide(rd(SR)) { + RxStatus::Ready => rd(DR) & 0xFF, + _ => RX_NONE, + } +} + +/// Kani proofs for the verifiable core (`cargo kani`). Totality + the +/// error-priority safety property. +#[cfg(kani)] +mod kani_proofs { + use super::*; + + /// Over ALL status-register values: decide is total (no panic), never says + /// Ready while an error bit is set, and Ready implies RXNE with no errors. + #[kani::proof] + fn rx_decide_error_priority() { + let sr: u32 = kani::any(); + let d = usart_rx_decide(sr); + if (sr & ORE != 0) || (sr & FE != 0) { + assert!(d != RxStatus::Ready); // never read DR on error + } + if d == RxStatus::Ready { + assert!(sr & RXNE != 0 && sr & ORE == 0 && sr & FE == 0); + } + } +} + +/// Split-phase RX availability check — does the bridge ISR report the RX line +/// fired? Lets the driver yield to kiln between bytes rather than spin. Exposed +/// so the app can drive the split-phase loop (start → yield → poll). +#[no_mangle] +pub extern "C" fn uart_rx_fired() -> u32 { + unsafe { irq_poll(RX_IRQ_LINE) } +} diff --git a/benches/gust/drivers/uart-thin/uart-thin-cm3.o b/benches/gust/drivers/uart-thin/uart-thin-cm3.o new file mode 100644 index 0000000..91f9b46 Binary files /dev/null and b/benches/gust/drivers/uart-thin/uart-thin-cm3.o differ diff --git a/benches/gust/drivers/wit/gust-hal.wit b/benches/gust/drivers/wit/gust-hal.wit new file mode 100644 index 0000000..0113dc1 --- /dev/null +++ b/benches/gust/drivers/wit/gust-hal.wit @@ -0,0 +1,57 @@ +package gust:hal@0.1.0; + +/// gust:hal — the driver capability seam. A dissolved driver component IMPORTS +/// these; the trusted host bridge IMPLEMENTS them. The verified-wasm / trusted-TCB +/// boundary is this typed contract. Three granularities (thin → fat) trade TCB +/// size against how much of the driver is verified wasm; see +/// docs/superpowers/specs/2026-06-25-gust-driver-model-design.md. + +/// THIN seam — the irreducible TCB primitive. The driver reads/writes device +/// registers by physical address; the bridge performs the actual volatile access +/// (sandboxed wasm cannot touch MMIO). One ~10-line generic bridge serves every +/// driver; the entire peripheral protocol lives in wasm. +interface mmio { + read32: func(addr: u32) -> u32; + write32: func(addr: u32, val: u32); + read8: func(addr: u32) -> u8; + write8: func(addr: u32, val: u8); +} + +/// THIN seam, split-phase companion. `poll` is non-blocking: true (and clears) +/// if the line fired since last poll, so the driver yields to kiln between events +/// rather than blocking the scheduler. The bridge ISR sets the flag + wakes. +interface irq { + poll: func(line: u32) -> bool; +} + +/// MID seam — byte-level UART. The bridge owns byte register access + the RX IRQ +/// buffer; the driver owns framing/protocol above the byte. +interface uart-byte { + put-byte: func(b: u8); + rx-poll: func() -> option; +} + +/// FAT seam — whole-transfer UART. The bridge (e.g. embassy-stm32) owns the +/// driver; the wasm side only calls. Largest TCB; expected to exceed the 8 KiB +/// F100 SRAM budget (the overflow check). +interface uart-block { + write: func(buf: list); + read: func(max: u32) -> list; +} + +/// Thin-seam driver world — imports only generic MMIO + IRQ; exports one step. +world driver-thin { + import mmio; + import irq; + export driver-step: func() -> u32; +} + +world driver-mid { + import uart-byte; + export driver-step: func() -> u32; +} + +world driver-fat { + import uart-block; + export driver-step: func() -> u32; +} diff --git a/benches/gust/renode-test/BUILD.bazel b/benches/gust/renode-test/BUILD.bazel index 99a8850..c0ab844 100644 --- a/benches/gust/renode-test/BUILD.bazel +++ b/benches/gust/renode-test/BUILD.bazel @@ -45,3 +45,16 @@ renode_test( "REPL": ":stm32f100.repl", }, ) + +# 4. Thin-seam UART driver content gate — the dissolved driver (verified-wasm +# USART protocol, Kani-proven RX decision) TXing over a REAL STM32 USART model; +# asserts the emitted bytes via Wait For Line On Uart (content, not just no-fault). +renode_test( + name = "gust-uart-renode", + timeout = "short", + robot_test = "gust_uart.robot", + variables_with_label = { + "ELF": ":gust_uart.elf", + "REPL": ":m3_64k.repl", + }, +) diff --git a/benches/gust/renode-test/gust_uart.elf b/benches/gust/renode-test/gust_uart.elf new file mode 100755 index 0000000..c5aac82 Binary files /dev/null and b/benches/gust/renode-test/gust_uart.elf differ diff --git a/benches/gust/renode-test/gust_uart.robot b/benches/gust/renode-test/gust_uart.robot new file mode 100644 index 0000000..0d5ea0e --- /dev/null +++ b/benches/gust/renode-test/gust_uart.robot @@ -0,0 +1,20 @@ +*** Settings *** +Documentation gust:hal thin-seam UART driver — the dissolved driver (the whole +... STM32 USART protocol in verified wasm, Kani-proven RX decision) +... driving a REAL STM32 USART model. The app TXes "gust-uart-thin" via +... the driver's uart_tx_byte primitive over MMIO; this asserts the +... emitted content on the wire (not just no-fault). A real USART is +... capturable headless (unlike the SemihostingUart). ELF + platform +... injected by renode_test. +Resource ${RENODEKEYWORDS} + +*** Variables *** +${UART} sysbus.usart1 + +*** Test Cases *** +Dissolved thin-seam UART driver TXes over a real STM32 USART + Execute Command mach create "gust-uart" + Execute Command machine LoadPlatformDescription @${REPL} + Execute Command sysbus LoadELF @${ELF} + Create Terminal Tester ${UART} defaultPauseEmulation=true + Wait For Line On Uart gust-uart-thin timeout=30 diff --git a/benches/gust/renode-test/m3_64k.repl b/benches/gust/renode-test/m3_64k.repl index 534b672..7d3458b 100644 --- a/benches/gust/renode-test/m3_64k.repl +++ b/benches/gust/renode-test/m3_64k.repl @@ -14,3 +14,6 @@ flash: Memory.MappedMemory @ sysbus 0x0 sram: Memory.MappedMemory @ sysbus 0x20000000 size: 0x10000 uartSemihosting: UART.SemihostingUart @ cpu +// USART1 (STM32 F1/F103RE-class USART model) for the thin-seam UART driver demo. +usart1: UART.STM32_UART @ sysbus <0x40013800, +0x100> + -> nvic@37 diff --git a/benches/gust/renode-test/stm32f100.repl b/benches/gust/renode-test/stm32f100.repl index 04a7bcb..0ddb695 100644 --- a/benches/gust/renode-test/stm32f100.repl +++ b/benches/gust/renode-test/stm32f100.repl @@ -14,3 +14,9 @@ flash: Memory.MappedMemory @ sysbus 0x0 sram: Memory.MappedMemory @ sysbus 0x20000000 size: 0x2000 uartSemihosting: UART.SemihostingUart @ cpu +// USART1 — the real STM32 USART register model (F100 value line is compatible +// with the F103 USART Renode models). The dissolved thin-seam UART driver drives +// this directly via gust:hal mmio writes (DR @ 0x40013804, etc.). IRQ 37 = the +// F1 USART1 NVIC line, for the split-phase RX path. +usart1: UART.STM32_UART @ sysbus <0x40013800, +0x100> + -> nvic@37 diff --git a/benches/gust/src/bin/gust_uart.rs b/benches/gust/src/bin/gust_uart.rs new file mode 100644 index 0000000..20b36e4 --- /dev/null +++ b/benches/gust/src/bin/gust_uart.rs @@ -0,0 +1,54 @@ +//! gust-uart — the thin-seam UART driver driven bare-metal on the gust stack. +//! +//! The dissolved driver (`drivers/uart-thin`, the ENTIRE STM32 USART protocol in +//! verified wasm → synth → native) is linked here and called; this file is the +//! whole **trusted** side: a ~10-line `gust:hal` THIN bridge (raw MMIO + an IRQ +//! flag) + the boot shim. Nothing peripheral-specific is trusted — the driver +//! owns the registers, the bridge only pokes them. +//! +//! Boot: STM32F100 (Cortex-M3) in Renode with usart1 = UART.STM32_UART; the +//! driver TXes "gust-uart-thin\n" over USART1, asserted by a terminal tester. +#![no_std] +#![no_main] +use core::ptr::{read_volatile, write_volatile}; +use cortex_m_rt::entry; +use cortex_m_semihosting::debug; +use panic_halt as _; + +// ---- gust:hal THIN bridge — the entire trusted surface for the UART driver ---- +#[no_mangle] +pub extern "C" fn mmio_read32(addr: u32) -> u32 { + unsafe { read_volatile(addr as *const u32) } +} +#[no_mangle] +pub extern "C" fn mmio_write32(addr: u32, val: u32) { + unsafe { write_volatile(addr as *mut u32, val) } +} +/// irq.poll — would be set by the USART RX ISR; 0 here (TX smoke test, no RX). +#[no_mangle] +pub extern "C" fn irq_poll(_line: u32) -> u32 { + 0 +} + +extern "C" { + // dissolved thin-seam UART driver primitives (drivers/uart-thin → synth). + // No linmem data in the driver → no r11 trampoline needed; call directly. + fn uart_init(brr: u32); + fn uart_tx_byte(b: u32); +} + +#[entry] +fn main() -> ! { + // The driver provides the protocol; the APP owns the payload. The test line + // lives here (normal cortex-m flash), not in the driver — so the driver + // carries no data segment (0 linmem / 0 SRAM, no placement dependency). + let msg = b"gust-uart-thin\n"; + unsafe { + uart_init(0x0EA6); // baud divisor; Renode's USART model TXes on DR write + for &b in msg { + uart_tx_byte(b as u32); + } + } + debug::exit(debug::EXIT_SUCCESS); + loop {} +} diff --git a/docs/superpowers/specs/2026-06-25-gust-driver-model-design.md b/docs/superpowers/specs/2026-06-25-gust-driver-model-design.md new file mode 100644 index 0000000..1494cae --- /dev/null +++ b/docs/superpowers/specs/2026-06-25-gust-driver-model-design.md @@ -0,0 +1,225 @@ +# gust generic driver model — design (UART seam spike) + +**Status:** design / approved-to-spec · **Date:** 2026-06-25 · **Branch:** `feat/gust-driver-model` + +## Problem + +Adding a driver to gust today is hand-wired: per-bin `build.rs` link lines, a +hand-rolled kiln `poll_round` loop, manual export-stripping and the r11=0 +trampoline, and — for any I/O driver — a bespoke hand-written bridge resolving the +dissolved object's import-call relocations. Compute "drivers" (import +`gale:kernel`, fuse, dissolve, drive as a kiln task — e.g. `gust_control`) are a +clean recipe; **hardware/I/O drivers are architecturally supported but not +ergonomic, and the verified/TCB boundary for them has never been measured.** + +Jess will bring an arbitrary driver ask. Before that lands we want (a) the +**seam** nailed — the WIT contract and the verified-wasm / trusted-TCB split — and +(b) an **empirical answer** to "how much of a driver can move into verified +wasm," because that decides both the architecture and how much we depend on a +host HAL at all. + +**Requirements source — gale#65 (gale-nano / gust origin).** This is not +speculative: gale#65 specifies exactly this model — *"only the bare-minimum +hardware (registers/MMIO) is native; everything above that is wasm"* — on the +**STM32F100 (Cortex-M3, 24 MHz, 8 KB SRAM)** px4io-class failsafe. The real first +driver is the **USART @ 1.5 Mbaud carrying CCSDS Space Packets wrapped by +relay-sec** (counter + anti-replay + Ascon-AEAD) from the FMU, on the ASIL/DAL-A +path. Jess co-designs the `gust:hal` WIT and brings the stm32f103-based Renode HIL. +So: the thin-seam UART here *is* that IPC carrier's foundation, and relay-sec / +relay-ccsds (both `no_std`, M3-capable) can themselves dissolve into the wasm — +maximal-wasm all the way up the stack, with only the register poke native. (Note: +the F100 value line has no LPUART; gale#65's "LPUART" maps to **USART1** here, +which reaches 1.5 Mbaud at fCK/16 on a 24 MHz clock and is what Renode models.) + +## Goal & non-goals + +**Goal (this spec):** define the `gust:hal` capability seam and prove it with +**one UART driver end-to-end on the STM32F100 (Jess's device class)**, with the +UART *protocol implemented in verified wasm*, built at **three seam granularities** +so we can **measure the SRAM and TCB footprint** each costs against the 8 KB +budget. The measurement — *how much verified wasm fits the tiny node* — is the +primary deliverable. + +**Non-goals (deferred — YAGNI until there is real duplication to factor out):** +- The manifest + generator that auto-emits bridge stubs / build wiring / + trampoline. Extract it *after* Jess's ask makes driver #2. +- The full peripheral set (gpio / spi / i2c / adc / timer). Only UART here. +- A larger-device / embassy-fits arm. The target is the F100 throughout; embassy + appears only as the *fat-seam overflow check* on F100, to quantify why it + doesn't fit. **XIP** (external-flash execute-in-place) is a documented + follow-on lever, not in scope (see "Staying small"). + +## Key prior decisions (from brainstorming) + +1. **embassy HAL-only; kiln stays the one executor.** embassy's executor is not + adopted; embassy is (at most) a host-side peripheral backend. +2. **Two capability shapes:** *sync* caps for fast register touches; *split-phase* + (start + poll/yield, kiln-wakeable) caps for interrupt-driven streaming. No + embassy `Future` crosses the wasm boundary. +3. **Eventual wiring mechanism = manifest + generator**, but deferred. +4. **First bite = one UART driver on the F100 (Jess's device class); measure + *both* SRAM (.bss/.data) and TCB bytes at thin/mid/fat seams.** SRAM is the + binding constraint on an 8 KB part — the real question is how much + verified-wasm logic fits while staying small. + +## Architecture — the `gust:hal` seam + +A new WIT package `gust:hal@0.1.0`. A dissolved **driver component** *imports* +capability interfaces; the **TCB bridge** *implements* them. Both sides compile +against the same WIT, so the seam is a typed contract, not a convention. + +The irreducible TCB for any peripheral driver is small: +- the raw MMIO read/write primitive (sandboxed wasm cannot touch MMIO), +- the interrupt vector entry + "buffer the byte, wake the task." + +Everything else — protocol, framing, baud math, ring buffers, DMA-descriptor +building — is pure compute and can be **verified wasm**. The seam *granularity* is +the lever that trades TCB size against how much is verified: + +| seam | imported capability | TCB owns | wasm owns | embassy used? | +|---|---|---|---|---| +| **thin** | `mmio.{read8,write8}` + `irq.wait` (split-phase) | ~10 lines, generic, shared by every driver | **everything**: register sequencing, baud, framing, ring buffer | no (init/clocks only) | +| **mid** | `uart.{put-byte, rx-poll}` | byte-level register access + RX IRQ→buffer | framing, buffering, protocol above the byte | optional | +| **fat** | `uart.{write, read}` | the whole driver (embassy `uart.write().await`) | nothing but calls | yes (embassy-stm32) | + +Thesis link: **the thinner the seam, the more is verified and the less embassy is +needed** — and the thin seam is the only one that can fit a tiny node. + +## The UART driver + +Same observable behavior at all three seams: TX a known byte sequence, RX it back +(loopback), assert identical. What moves across the seam: + +- **thin:** the wasm driver writes the USART control/baud registers via + `mmio.write8`, polls TXE/RXNE via `mmio.read8`, manages its own ring buffer; + RX uses `irq.wait` (split-phase) so it yields to kiln between bytes. +- **mid:** the wasm driver calls `put-byte` per byte and `rx-poll` to drain; + framing/buffering still in wasm. +- **fat:** the wasm driver calls `write`/`read`; embassy owns the rest. + +### kiln integration + +- **sync caps** (`put-byte`, `mmio.write8`): called inline; the bridge does a + brief blocking register touch and returns — same call pattern as `control_step`. +- **split-phase caps** (`irq.wait`, empty `rx-poll`): the driver returns + `TaskOutcome::Yielded`; the TCB IRQ handler buffers the byte and wakes the + task; the next `poll_round` resumes. Reuses kiln's cooperative model and the + repo's `crank-stream` async-WIT precedent. +- **wiring (hand-done for the spike):** r11=0 trampoline (per `gust_control.rs`) + for the dissolved object; bridge `.o` linked via `build.rs`; driven from a + `gust_uart` demonstrator's `poll_round` loop. + +## Verification — Verus + Rocq + Kani (earned, not claimed) + +"Verified wasm" is earned by proofs, not by being in wasm — the same bar as the +gale kernel primitives (CLAUDE.md: Verus + Rocq + Kani simultaneously, written to +the intersection). For a driver: + +- **The pure decisions are verified** like gale `_decide` functions. The first is + the **USART RX decision** (`usart_rx_decide`): errors take priority over + data-ready, so the driver provably never reads `DR` on an overrun/framing error + (which would desync the byte stream). **Kani-proven now** (`cargo kani --harness + rx_decide_error_priority`, over all 2³² SR values — VERIFICATION SUCCESSFUL). + Its Verus + Rocq tracks attach on promotion of the decision into the gale + verified crate (src/ Verus → verus-strip → plain/ → wasm; `proofs/*.v` Rocq). +- **Buffering reuses already-proven logic.** A UART RX buffer is a ring; the + gale `msgq` ring is already Verus + Rocq + Kani proven (`src/msgq.rs`, + `proofs/msgq_proofs.v`). The buffered driver composes that rather than carry an + unverified buffer — maximal-wasm *and* maximal-verified, reusing existing proofs. +- **Only the MMIO poke is unverified** — the irreducible volatile I/O shell, which + no formal track covers (and which is the entire TCB). + +This is REQ-DRV-VERIFY-001 / VER-DRV-KANI in rivet. It is what makes the driver +fit gale#65's ASIL/DAL-A px4io path, not just "small." + +## Target & measurement — STM32VLDISCOVERY / STM32F100 in Renode + +The spike is pinned to the **STM32VLDISCOVERY (STM32F100RB, Cortex-M3, 8 KB SRAM / +128 KB flash)** modelled in Renode — the constrained device we need anyway (the +physical board is in the silicon plan; the F100 Renode platform already exists in +`benches/gust/renode-test/`). + +- **Real USART model:** add `usart1: UART.STM32_UART @ sysbus 0x40013800` to + `stm32f100.repl` (the F100 value line is register-compatible with the F103 USART + Renode already models). The dissolved driver's `mmio.write8(0x40013804, b)` + hits a real STM32 USART register model. +- **Correctness gate (content-based, in CI):** a Renode `Create Terminal Tester` + on `sysbus.usart1` + `Wait For Line On Uart` asserts the echoed bytes. Unlike + the SemihostingUart (uncapturable headless on the macOS portable), a **real + USART** *is* capturable — proven by the existing stm32f4 sem robots. So the + UART spike gets a genuine content assertion, not just no-fault. Becomes a 4th + `renode_test` target in the CI module (Bazel 8 pin etc. already in place). +- **Metrics (two — SRAM is the one that binds):** per seam — (a) **SRAM** = + the linked image's `.bss + .data` (the dissolved wasm's linear memory / buffers + / shadow stack + the bridge's state) against the 8 KB budget; (b) **TCB bytes** + = `.text` of the trusted bridge (+ any linked embassy code); plus the + dissolved-driver `.text` (flash, cheap). A table like the synth 0.15.0 one, + with **SRAM as the headline column**. +- **Fat/embassy on F100 is the overflow check** — build-only, expected to blow + the 8 KB SRAM budget (a full embassy-stm32 HAL is well over it). That negative + result *is* a finding: it quantifies how far over the tiny node a fat seam goes, + i.e. why the F100 *requires* the thin/mid seam. There is no larger-device arm — + the point is staying on Jess's F100 class and maximising verified wasm within + its SRAM. + +## Staying small: maximal-wasm vs SRAM, per-node composition, and XIP + +The driving constraint is **SRAM on an 8 KB part**, and the goal is to move as +much driver logic as possible into *verified wasm* without blowing it. Three +points shape the model: + +- **Flash is cheap, SRAM binds.** Dissolved `.text` already executes from flash + (128 KB on F100 — ample); it is the wasm **linear memory** (buffers, state) + + shadow stack that consume **SRAM**. So "more logic in wasm" is nearly free in + flash and costs SRAM *only where it needs data* (e.g. a UART RX ring buffer). + Minimising the linear-memory footprint — static-sized to actual need, no heap, + the synth#383 shadow-stack shrink — is what makes maximal-wasm fit. +- **Per-node bespoke composition (the core extension model).** Each node's image + is one wasm composed with *exactly the specific drivers that node needs* (each + driver = a component importing only the `gust:hal` capabilities it uses) plus + the kernel pieces it uses — then meld-fused and dissolved. Nothing unused is + linked, so the SRAM cost is precisely what that node's driver set requires. This + is how "compose a new wasm from components for each" stays small *and* maximises + verified surface: the composition, not a fixed firmware, is the product. +- **XIP — follow-on lever, addresses flash not SRAM.** Execute-in-place from + **external** flash lets a library of composed per-node images live and run + without consuming internal flash — useful once there are many nodes/drivers. It + does **not** relieve the SRAM/buffer pressure that binds the 8 KB part, so it is + orthogonal to the maximal-wasm/SRAM question this spike answers. Scoped out; + revisit when image count or size makes internal flash the constraint. + +## Success criteria + +1. The UART driver passes the content-based correctness gate (TX/RX echo) on the + F100 Renode target at the **thin** and **mid** seams. +2. A per-seam table — **SRAM (.bss/.data) vs the 8 KB budget** (headline), TCB + bytes, and dissolved `.text` — committed alongside the spike. The empirical + answer to *how much verified wasm fits the tiny node*. +3. A documented verdict: the seam granularity that maximises verified-wasm logic + while staying within F100 SRAM, and by how much the fat/embassy seam overflows + it. + +## Risks & open questions + +- **Renode STM32_UART fidelity:** confirm baud/TXE/RXNE/DR semantics suffice for a + real driver loop (init register writes may need RCC/clock-enable modelling — + add minimal RCC if required, or have the bridge stub clock-enable). +- **Split-phase IRQ→kiln-wake in Renode:** the USART RX interrupt must reach the + TCB handler and wake the task; verify Renode delivers the USART IRQ to the NVIC + line the bridge binds. +- **Thin-seam ergonomics:** writing the USART register sequence in wasm is more + work than calling embassy; acceptable for the spike (the point is the TCB + measurement), revisit for the generator. +- **F100 value-line specifics** vs the F103 USART model — register offsets used + must be the common subset (DR @ +0x04, SR @ +0x00, BRR @ +0x08, CR1 @ +0x0C). +- **embassy-stm32 may not support the F100 value line.** If it won't build for + F100, the fat-seam overflow measurement is taken on the nearest supported F1 + (e.g. F103) as the embassy-footprint proxy — still a valid "this is how big the + fat seam is vs 8 KB" finding; the thin/mid arms remain the real F100 targets. + +## What this unblocks + +With the seam proven and the granularity chosen by measurement, Jess's arbitrary +driver becomes driver #2 at the chosen seam — and two real drivers are the +duplication from which the deferred manifest+generator gets *extracted* rather +than guessed.