Deterministic I/O and simulation testing for Zig.
Long term, Marionette is aiming to be the deterministic std.Io for Zig:
production libraries accept std.Io, and tests swap in Marionette's
deterministic implementation. Today, Marionette ships the simulator, trace,
fault, disk, and network primitives that make that direction concrete.
Marionette already runs real, unmodified Zig code under deterministic
simulation: a storage engine (xit-vcs/xitdb) and a cooperative-concurrency
library (g41797/mailbox), both with seed-reproducible replay, and has
surfaced reproducible recovery and correctness counterexamples in the process.
Today the demonstrated tiers are:
- deterministic std.Io.File storage simulation with crash/torn-write faults;
- cooperative std.Io task scheduling for Mutex/Condition code;
- typed endpoint message passing with deterministic loss, latency, and partitions.
Write production-shaped code against std.Io plus any small Marionette handles
it actually needs, such as mar.Recorder or mar.Endpoint(Message). In tests,
drive control to inject faults. For the modeled file and local endpoint
surfaces, the same application logic can run on the simulator and on production
adapters.
fn writeAndRecover(io: std.Io, root: std.Io.Dir, recorder: mar.Recorder) !KVStore {
    var store = try KVStore.init(io, root, recorder);
    try store.put(1, 41, .sync);
    try store.put(2, 99, .no_sync);
    try store.recover(.strict);
    return store;
}
// In simulation: deterministic, fault-injectable, replayable from a seed.
const sim = try world.simulate(.{ .disk = .{ .sector_size = 16 } });
var sim_store = try writeAndRecover(sim.env.io(), std.Io.Dir.cwd(), sim.env.recorder());
// In production: real disk, same code path.
var production = try mar.Production.init(.{ .root_dir = tmp.dir, .io = std.testing.io });
const prod_env = production.env();
var prod_store = try writeAndRecover(prod_env.io(), tmp.dir, prod_env.recorder());
For file-backed code like this, that parity is the point. You don't write a "simulator version" of your code. You write your code behind Marionette-owned authorities, and Marionette gives you a deterministic environment to run it in.
Distributed and storage systems fail in ways that are hard to reproduce: a torn write under crash, a network partition during quorum, a race between two timers. By the time you have a stack trace, the conditions that caused the bug are gone.
Deterministic simulation testing turns those bugs into seeds. Every run is reproducible. Every failure is replayable. You compress weeks of fuzz-testing into seconds, and when something breaks in CI, the seed alone is enough to debug it.
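As a rough illustration of how a bug becomes a seed, here is a minimal sketch in plain Zig. It is not Marionette's implementation: the Fault enum, faultSchedule, and the fault proportions are hypothetical names and numbers chosen for this example. The only point is that every "random" decision comes from a seeded PRNG, so the seed fully determines the run.

```zig
const std = @import("std");

/// Hypothetical fault kinds for illustration; not Marionette's types.
const Fault = enum { none, drop_message, crash_disk };

/// Derive a fixed-length fault schedule from a seed. The same seed always
/// yields the same schedule, which is what makes a failing run replayable.
fn faultSchedule(seed: u64) [8]Fault {
    var prng = std.Random.DefaultPrng.init(seed);
    const random = prng.random();
    var schedule: [8]Fault = undefined;
    for (&schedule) |*slot| {
        // Assumed proportions: about 20% drops, 10% crashes, 70% no fault.
        const roll = random.uintLessThan(u8, 10);
        slot.* = switch (roll) {
            0, 1 => .drop_message,
            2 => .crash_disk,
            else => .none,
        };
    }
    return schedule;
}

test "same seed replays the same fault schedule" {
    const first = faultSchedule(0xC0FFEE);
    const second = faultSchedule(0xC0FFEE);
    try std.testing.expect(std.mem.eql(Fault, &first, &second));
}
```

Because nothing in the schedule depends on wall-clock time or OS scheduling, rerunning with the seed from a failed CI job reproduces the same sequence of faults.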
Marionette brings that approach to Zig. It's inspired by the techniques behind FoundationDB, TigerBeetle, and Antithesis, but designed to be a drop-in library, not a framework you build your system around.
Here's a WAL recovery test that crashes the disk mid-write, corrupts a sector, and asserts that committed records survive while unsynced ones don't.
pub fn scenario(harness: *Harness) !void {
    try harness.store.put(committed_key, committed_value, .sync);
    try harness.control.disk.setFaults(.{ .crash_lost_write_rate = .always() });
    try harness.store.put(volatile_key, volatile_value, .no_sync);
    try harness.control.disk.crash();
    try harness.control.disk.restart();
    try harness.control.disk.corruptSector(wal_path, record_size);
    try harness.store.recover(.strict);
}
pub const checks = [_]mar.StateCheck(Harness){
    .{ .name = "synced records recover, unsynced records are rejected", .check = recoveredStateIsSafe },
};
test "wal recovery" {
    try mar.expectPass(.{
        .allocator = std.testing.allocator,
        .seed = 0xC0FFEE,
        .init = Harness.init,
        .scenario = scenario,
        .checks = &checks,
    });
}
test "wal recovery fuzz" {
    try mar.expectFuzz(.{
        .allocator = std.testing.allocator,
        .seed = 0xC0FFEE,
        .seeds = 16,
        .init = Harness.init,
        .scenario = scenario,
        .checks = &checks,
}
Three pieces, every test:
- init sets up your harness: your code under test, plus the control handle for fault injection.
- scenario drives the action. It calls into your code through the handles created by env, and into the simulator via control.
- checks assert invariants on the final state.
expectPass runs once with a fixed seed. expectFuzz runs many seeds in parallel. expectFailure asserts that a deliberately buggy scenario gets caught, which is useful for proving your checker actually works.
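The WAL scenario in this section depends on a disk model in which synced writes survive a crash and unsynced writes may be lost. The following is a minimal sketch of that idea in plain Zig, under stated assumptions: ToyDisk and its methods are hypothetical, it always loses unsynced writes on crash, and it does not model sectors, torn writes, or corruption the way a real disk simulator would.

```zig
const std = @import("std");

/// Toy disk model (hypothetical, not Marionette's disk simulator): writes
/// land in a volatile cache and only become durable when sync is called.
const ToyDisk = struct {
    durable: [4]?u64 = .{ null, null, null, null },
    cache: [4]?u64 = .{ null, null, null, null },

    fn write(self: *ToyDisk, slot: usize, value: u64) void {
        self.cache[slot] = value;
    }

    /// Flush every cached write to durable storage.
    fn sync(self: *ToyDisk) void {
        for (&self.durable, &self.cache) |*d, *c| {
            if (c.*) |v| {
                d.* = v;
                c.* = null;
            }
        }
    }

    /// Crash: everything still in the cache is lost.
    fn crash(self: *ToyDisk) void {
        self.cache = .{ null, null, null, null };
    }

    /// Reads see the cached value first, then the durable one.
    fn read(self: *const ToyDisk, slot: usize) ?u64 {
        if (self.cache[slot]) |v| return v;
        return self.durable[slot];
    }
};

test "synced write survives a crash, unsynced write does not" {
    var disk = ToyDisk{};
    disk.write(0, 41);
    disk.sync();
    disk.write(1, 99); // never synced
    disk.crash();
    try std.testing.expectEqual(@as(?u64, 41), disk.read(0));
    try std.testing.expectEqual(@as(?u64, null), disk.read(1));
}
```

A check like "synced records recover, unsynced records are rejected" is an invariant over this kind of state: after crash and recovery, only data that was durable before the crash may be visible.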
Every Marionette test has two halves.
io is what production-shaped storage code should usually see. In
simulation, sim.env.io() returns Marionette's deterministic std.Io backend.
In production, production.env().io() returns the host std.Io supplied at
setup. Application code that wants trace events should accept a narrow
mar.Recorder, not all of mar.Env.
var store = try KVStore.init(io, root, recorder);
try store.put(1, 41, .sync);
control is what your test code uses to inject faults. It's only available in simulation. It mirrors env's structure: every resource has a control surface.
try control.disk.crash();
try control.disk.corruptSector(path, offset);
try control.network.partition(&side_a, &side_b);
try control.network.setLossiness(.{ .drop_rate = .percent(20) });
try control.network.heal();
Env is still the harness-owned bundle that supplies io(), recorder(),
clock/random helpers, and remaining Marionette capabilities. Code that only
needs file I/O should prefer std.Io so it stays ordinary Zig code.
Network simulation works the same way. Here's a partition test against a toy replicated register:
fn partitionScenario(harness: *Harness) !void {
    const isolated = [_]mar.NodeId{0};
    const majority = [_]mar.NodeId{ 1, 2, client_node_id };
    try harness.control.network.partition(&isolated, &majority);
    try harness.replicas.write(.{ .version = 1, .value = 41, .retry_limit = 2 });
    try harness.control.network.heal();
    try harness.replicas.write(.{ .version = 1, .value = 41, .retry_limit = 1 });
    try checkReplicaCommitted(&harness.replicas, 0, 1, 41);
}
Messages have configurable loss, latency, clogs, and partition dynamics through focused control.network calls such as setLossiness(...), setLatency(...), and setPartitionDynamics(...). Application code sends through a node-scoped endpoint with endpoint.send(to, message) and receives with while (try endpoint.receive()) |envelope|.
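To show how partitions and seeded loss can combine into a deterministic delivery decision, here is a small sketch in plain Zig. Everything in it is an assumption made for illustration: ToyNetwork, its fields, and its methods are hypothetical and do not reflect how Marionette's network model is implemented, and latency and clogs are left out.

```zig
const std = @import("std");

// Hypothetical stand-in for a node identifier; not Marionette's mar.NodeId.
const NodeId = u8;

fn contains(side: []const NodeId, node: NodeId) bool {
    for (side) |member| {
        if (member == node) return true;
    }
    return false;
}

/// Toy network: an optional two-sided partition plus a seeded drop roll.
const ToyNetwork = struct {
    prng: std.Random.DefaultPrng,
    drop_percent: u8 = 0,
    side_a: []const NodeId = &.{},
    side_b: []const NodeId = &.{},

    fn init(seed: u64) ToyNetwork {
        return .{ .prng = std.Random.DefaultPrng.init(seed) };
    }

    fn partition(self: *ToyNetwork, a: []const NodeId, b: []const NodeId) void {
        self.side_a = a;
        self.side_b = b;
    }

    fn heal(self: *ToyNetwork) void {
        self.side_a = &.{};
        self.side_b = &.{};
    }

    /// Returns true when a message from `from` to `to` would be delivered.
    fn delivers(self: *ToyNetwork, from: NodeId, to: NodeId) bool {
        const crosses = (contains(self.side_a, from) and contains(self.side_b, to)) or
            (contains(self.side_b, from) and contains(self.side_a, to));
        if (crosses) return false;
        // Seeded roll: the same seed drops the same messages on every run.
        return self.prng.random().uintLessThan(u8, 100) >= self.drop_percent;
    }
};

test "partition blocks cross-side traffic until healed" {
    var net = ToyNetwork.init(0xC0FFEE);
    const isolated = [_]NodeId{0};
    const majority = [_]NodeId{ 1, 2 };
    net.partition(&isolated, &majority);
    try std.testing.expect(!net.delivers(1, 0)); // crosses the partition
    try std.testing.expect(net.delivers(1, 2)); // same side, no loss configured
    net.heal();
    try std.testing.expect(net.delivers(1, 0));
}
```

With a non-zero drop_percent, two networks created from the same seed make identical drop decisions for the same sequence of sends, which is the property that keeps a partition test replayable.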
Marionette also has an experimental scheduler-backed std.Io futex path for
cooperative Mutex / Condition code. The pinned g41797/mailbox validation
target runs unmodified and exercises timed receive, send/wake, same-deadline
timeout ordering, and byte-identical same-seed replay. The internal
validate-bounded-queue target adds a canonical FIFO oracle and a planted
lost-wakeup deadlock to demonstrate concurrency bug detection without counting
it as an external SUT finding. This is cooperative std.Io concurrency, not
preemptive OS thread or memory-model testing; see
Std.Io Direction for the exact boundary.
Every run produces a structured trace. When a check fails, you get the full sequence of events that led to the violation, plus the seed to reproduce it.
register.write.start version=1 value=41 retry_limit=8
register.message kind=propose to=0 version=1 value=41
replica.accept replica=0 version=1 value=41 accepted=true
register.message kind=propose to=1 version=1 value=41
replica.accept replica=1 version=1 value=41 accepted=true
register.write.quorum version=1 value=41 acks=2
register.invariant_violation kind=committed_divergence replica=1 ...
You write trace records with mar.Recorder or, inside harness-shaped code,
env.record(...). Application code, scenario code, and checks can all record.
Failed runs print the trace automatically. Passing runs hand it back to you so
you can persist it, diff it, or feed it to whatever observability you already
have.
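One way to use a returned trace is to check that two runs with the same seed produce byte-identical output. The sketch below illustrates that comparison in plain Zig with a hypothetical ToyTrace recorder and a hypothetical run function; it does not use mar.Recorder, and the line format is only loosely modeled on the trace excerpt in this section.

```zig
const std = @import("std");

/// Toy recorder (hypothetical): appends "name value=N" lines to a fixed buffer.
const ToyTrace = struct {
    buf: [256]u8 = undefined,
    len: usize = 0,

    fn record(self: *ToyTrace, name: []const u8, value: u64) !void {
        const line = try std.fmt.bufPrint(self.buf[self.len..], "{s} value={d}\n", .{ name, value });
        self.len += line.len;
    }

    fn bytes(self: *const ToyTrace) []const u8 {
        return self.buf[0..self.len];
    }
};

/// A stand-in "simulation run": every recorded value comes from the seeded PRNG.
fn run(seed: u64, trace: *ToyTrace) !void {
    var prng = std.Random.DefaultPrng.init(seed);
    const random = prng.random();
    try trace.record("register.write.start", random.uintLessThan(u64, 100));
    try trace.record("register.write.quorum", random.uintLessThan(u64, 100));
}

test "same seed produces a byte-identical trace" {
    var first = ToyTrace{};
    var second = ToyTrace{};
    try run(0xC0FFEE, &first);
    try run(0xC0FFEE, &second);
    try std.testing.expectEqualStrings(first.bytes(), second.bytes());
}
```

If the two traces ever differ for the same seed, some source of nondeterminism has leaked into the run, and a textual diff of the traces points at the first event where the runs diverge.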
- Overview
- Architecture
- Trace Format
- Run
- Findings
- API Target Spec
- BUGGIFY
- Network Model
- Network API Direction
- Disk Fault Model
- API
- Determinism
- Examples
- Roadmap
- Prior art
- TigerBeetle Lessons
- Blog
Marionette is early. This is a 0.x release: there is no API stability
guarantee before 1.0. The intended-stable surface today is World, Env,
Control, runCase / expect*, Disk, SimDisk, RealDisk, Production,
Recorder, and the app-facing Endpoint(Message) shape. Everything else may
change as the simulator grows.
The simulator currently models clock, deterministic randomness, disk, a flat
std.Io.File subset, typed endpoint networking, a narrow scheduler-backed
std.Io.net stream subset, and experimental cooperative std.Io futex waits
for Mutex / Condition code, validated against the pinned g41797/mailbox
target and the internal bounded-queue capability demo. It does not model
arbitrary OS thread scheduling or memory-level concurrency; code that depends on
those needs separate testing. The production network path
is partial: local same-process endpoints and experimental framed loopback paths
exist, but cross-process production transport is still roadmap work. Allocator
simulation, async/cancel integration, and broader scheduler parity are planned.
If you're building something where determinism matters and you want to try it, the examples/ directory is the best place to start. Open issues and PRs welcome.
zig fetch --save https://github.com/sb2bg/marionette/archive/refs/tags/v0.2.0.tar.gz
Requires Zig 0.16.x.
Marionette stands on the shoulders of FoundationDB's simulation testing, TigerBeetle's VOPR, and the broader DST tradition. The bugs they catch are bugs everyone has; this library tries to make catching them easy in Zig.
MIT