smite-ir: add program minimizers and wire them into AFL++ custom trim #54

Open
erickcestari wants to merge 6 commits into morehouse:master from erickcestari:add-minimizers

@erickcestari
Contributor

@erickcestari erickcestari commented Apr 21, 2026

When AFL finds an interesting input, it hands the input to us for trimming before it goes back into the corpus. Trimming keeps the corpus tidy and makes crashes much easier to read.

There's a small Minimizer trait, same shape as Mutator, one method on a unit struct, and two implementations:

DeadCodeEliminator walks instructions in reverse with a refcount and drops anything that's both unreferenced and removable (i.e. not a network I/O op). The reverse direction lets chains collapse in one pass: once we drop a consumer its producer's count falls to zero and it goes too.
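A minimal sketch of the idea, assuming a simplified IR in which every input index refers to an earlier instruction. The `Operation` variants, the `Instruction` struct, and the `eliminate_dead_code` function are hypothetical stand-ins, not the types from this PR. It uses a liveness bitmap (as the commit message describes) instead of explicit refcounts; the effect on chains is the same.

```rust
// Hypothetical, simplified IR; the real smite-ir types are richer.
#[derive(Clone, Debug, PartialEq)]
enum Operation {
    LoadAmount(u64),
    DerivePoint,
    SendMessage,
}

impl Operation {
    fn has_side_effects(&self) -> bool {
        matches!(self, Operation::SendMessage)
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Instruction {
    operation: Operation,
    // Indices of earlier instructions (assumed SSA-style ordering).
    inputs: Vec<usize>,
}

/// Drops every instruction that neither has side effects nor feeds a kept
/// instruction. Returns true if anything was removed.
fn eliminate_dead_code(instructions: &mut Vec<Instruction>) -> bool {
    let n = instructions.len();

    // Reverse pass: a consumer is visited before its producers, so a dead
    // consumer never marks its producers live and whole chains collapse.
    let mut live = vec![false; n];
    for i in (0..n).rev() {
        if instructions[i].operation.has_side_effects() {
            live[i] = true;
        }
        if live[i] {
            for &input in &instructions[i].inputs {
                live[input] = true;
            }
        }
    }
    if live.iter().all(|&l| l) {
        return false;
    }

    // Forward pass: keep live instructions and renumber their inputs.
    let mut new_index = vec![usize::MAX; n];
    let mut kept = Vec::with_capacity(n);
    for (i, mut inst) in std::mem::take(instructions).into_iter().enumerate() {
        if !live[i] {
            continue;
        }
        for input in &mut inst.inputs {
            *input = new_index[*input];
        }
        new_index[i] = kept.len();
        kept.push(inst);
    }
    *instructions = kept;
    true
}
```

Both passes touch each instruction and each input edge once, so the sketch is linear in program size.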

CommonSubexpressionEliminator is keyed on (operation, canonicalized_inputs). Canonicalizing each instruction's inputs through the running remap before the lookup makes the merge transitive: two duplicate LoadAmounts collapse, then the DerivePoints consuming them collapse, and so on.
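The transitive merge can be sketched as a single forward pass. This is an illustration under assumptions: the `Operation` variants, `Instruction`, and `eliminate_common_subexpressions` are hypothetical names, and inputs are assumed to reference earlier instructions only.

```rust
use std::collections::HashMap;

// Hypothetical, simplified IR; `Hash` and `Eq` let the operation be a map key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Operation {
    LoadAmount(u64),
    DerivePoint,
    SendMessage,
}

impl Operation {
    fn has_side_effects(&self) -> bool {
        matches!(self, Operation::SendMessage)
    }
}

#[derive(Clone, Debug, PartialEq)]
struct Instruction {
    operation: Operation,
    inputs: Vec<usize>,
}

/// Merges pure instructions that compute the same expression.
/// Returns true if any instruction was removed.
fn eliminate_common_subexpressions(instructions: &mut Vec<Instruction>) -> bool {
    let n = instructions.len();
    // remap[old_index] = index of the canonical instruction in `kept`.
    let mut remap: Vec<usize> = Vec::with_capacity(n);
    let mut seen: HashMap<(Operation, Vec<usize>), usize> = HashMap::new();
    let mut kept: Vec<Instruction> = Vec::with_capacity(n);

    for mut inst in std::mem::take(instructions) {
        // Canonicalize inputs first, so duplicates of duplicates also match.
        for input in &mut inst.inputs {
            *input = remap[*input];
        }
        if !inst.operation.has_side_effects() {
            let key = (inst.operation.clone(), inst.inputs.clone());
            if let Some(&canonical) = seen.get(&key) {
                remap.push(canonical);
                continue; // duplicate: drop it and redirect later users
            }
            seen.insert(key, kept.len());
        }
        remap.push(kept.len());
        kept.push(inst);
    }

    let changed = kept.len() != n;
    *instructions = kept;
    changed
}
```

Side-effecting instructions skip the lookup entirely, so two identical sends are never merged.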

Both transforms are safe in IR semantics; the only ops we can't touch are SendMessage and RecvAcceptChannel, because they actually do network I/O. One Operation::has_side_effects predicate gates both passes (same set, same reason).
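For illustration, the predicate could look roughly like this. The variant list and payload types are assumptions made for the sketch; only the two side-effecting names and `has_side_effects` come from this PR.

```rust
// Hypothetical subset of the operation set; payloads are illustrative only.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Operation {
    LoadAmount(u64),
    LoadPrivateKey([u8; 32]),
    DerivePoint,
    SendMessage,
    RecvAcceptChannel,
}

impl Operation {
    /// True for operations that perform network I/O and therefore must not
    /// be dropped by DCE or merged by CSE.
    fn has_side_effects(&self) -> bool {
        matches!(self, Operation::SendMessage | Operation::RecvAcceptChannel)
    }
}
```

Keeping the set in one place means a newly added I/O operation only has to be listed once for both passes to respect it.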

smite-ir-mutator wires this into AFL's custom-trim ABI. Because both minimizers are deterministic and run to completion in-process, the shim composes them into a single candidate during init_trim, serializes it once, and hands it to AFL as a single round-trip — no phase machine, no per-phase fallback, no iterative feedback loop. If the trim is a no-op (program already minimal) init_trim returns 0 and AFL skips trim entirely.
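The single round-trip can be modeled in safe Rust without the C ABI. In this sketch `TrimShim`, `minimize_bytes`, and `run_trim` are hypothetical: `minimize_bytes` is a toy stand-in for decode, DCE+CSE, and re-encode, and `run_trim` is a simplified model of AFL's trim loop, not AFL's code.

```rust
/// Hypothetical shim state: the one pre-serialized candidate.
struct TrimShim {
    candidate: Vec<u8>,
}

/// Toy stand-in for "decode, run DCE+CSE, re-encode": collapses runs of
/// equal bytes. Returns None when the input is already minimal.
fn minimize_bytes(input: &[u8]) -> Option<Vec<u8>> {
    let mut out = input.to_vec();
    out.dedup();
    (out.len() < input.len()).then_some(out)
}

impl TrimShim {
    /// Mirrors init_trim: returns the number of trim steps (0 or 1).
    fn init_trim(&mut self, input: &[u8]) -> i32 {
        match minimize_bytes(input) {
            Some(smaller) => {
                self.candidate = smaller;
                1
            }
            None => {
                self.candidate.clear();
                0
            }
        }
    }

    /// Mirrors trim: hands back the pre-serialized candidate.
    fn trim(&self) -> &[u8] {
        &self.candidate
    }

    /// Mirrors post_trim: always reports step 1, which ends the loop.
    fn post_trim(&mut self, _success: bool) -> i32 {
        1
    }
}

/// Simplified model of the fuzzer side. `accepts` plays the role of the
/// coverage-checksum comparison.
fn run_trim(shim: &mut TrimShim, input: &[u8], accepts: impl Fn(&[u8]) -> bool) -> Vec<u8> {
    let stage_max = shim.init_trim(input);
    let mut stage_cur = 0;
    let mut best = input.to_vec();
    while stage_cur < stage_max {
        let candidate = shim.trim().to_vec();
        let ok = accepts(&candidate);
        if ok {
            best = candidate;
        }
        stage_cur = shim.post_trim(ok);
    }
    best
}
```

The shim never needs to react to `success`: with only one candidate there is nothing to fall back to, and the fuzzer keeps the original bytes on a mismatch.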

End-to-end coverage

A new smite-ir-e2e-test crate is a minimal AFL++ harness that decodes a postcard-encoded Program, validates it, and publishes coverage manually to __afl_area_ptr. A new #[ignore]d e2e test under smite-ir-mutator spawns afl-fuzz against this binary with our cdylib loaded as AFL_CUSTOM_MUTATOR_LIBRARY and asserts every hook we export is actually used in a real fuzzing run.

The bitmap has to be bit-identical across DCE/CSE-trimmed variants of the same program (so AFL's trim cksum accepts the shrunk candidate) yet vary under our mutators (so AFL queues new entries). Any compiler-inserted edge whose hit count tracks program.instructions.len() fails the first half: DCE/CSE move the count across AFL's hit-count buckets and the cksum mismatches. postcard::from_bytes and Program::validate both contain such loops.
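To make the bucket problem concrete, here is a sketch of AFL-style hit-count bucketing as I understand it (1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+). The function names are hypothetical, and the real lookup table lives in AFL++ itself.

```rust
/// Maps a raw edge hit count to its bucket value, following the classic
/// AFL bucketing scheme.
fn bucket(hits: u8) -> u8 {
    match hits {
        0 => 0,
        1 => 1,
        2 => 2,
        3 => 4,
        4..=7 => 8,
        8..=15 => 16,
        16..=31 => 32,
        32..=127 => 64,
        128..=255 => 128,
    }
}

/// True if an edge whose hit count moves from `before` to `after` would
/// change the classified bitmap, and with it the checksum.
fn bucket_changes(before: u8, after: u8) -> bool {
    bucket(before) != bucket(after)
}
```

A loop edge hit once per instruction goes from 9 hits to 7 when trimming removes two instructions, which crosses a bucket boundary; going from 6 to 5 would not.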

So the harness is built with RUSTFLAGS=-Cllvm-args=-sanitizer-coverage-level=0 and publishes coverage manually: for each instruction reachable (via inputs) from a side-effect root (SendMessage, RecvAcceptChannel), mark a slot derived from a content hash of (operation, hashes of inputs). Because the hash folds input content (not indices), DCE renumbering doesn't change it; CSE merges duplicates whose hashes were already equal; OperationParamMutator shifts an operation's hash; InputSwapMutator rewires an edge and shifts the consumer's hash.
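A sketch of such a content-hashed signal, under assumptions: the IR types, `MAP_SIZE`, and the use of `DefaultHasher` are illustrative choices, and the returned slot set stands in for writes to `__afl_area_ptr`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

const MAP_SIZE: usize = 1 << 16; // hypothetical map size

// Hypothetical, simplified IR.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Operation {
    LoadAmount(u64),
    SendMessage,
}

impl Operation {
    fn has_side_effects(&self) -> bool {
        matches!(self, Operation::SendMessage)
    }
}

struct Instruction {
    operation: Operation,
    inputs: Vec<usize>, // indices of earlier instructions
}

/// Hash of (operation, hashes of inputs): depends on content, not indices.
fn content_hashes(instructions: &[Instruction]) -> Vec<u64> {
    let mut hashes: Vec<u64> = Vec::with_capacity(instructions.len());
    for inst in instructions {
        let mut hasher = DefaultHasher::new();
        inst.operation.hash(&mut hasher);
        for &input in &inst.inputs {
            hashes[input].hash(&mut hasher);
        }
        hashes.push(hasher.finish());
    }
    hashes
}

/// Slots that would be marked: one per instruction reachable from a
/// side-effect root.
fn coverage_slots(instructions: &[Instruction]) -> BTreeSet<usize> {
    let hashes = content_hashes(instructions);
    let mut reachable = vec![false; instructions.len()];
    let mut slots = BTreeSet::new();
    for i in (0..instructions.len()).rev() {
        if instructions[i].operation.has_side_effects() {
            reachable[i] = true;
        }
        if reachable[i] {
            for &input in &instructions[i].inputs {
                reachable[input] = true;
            }
            slots.insert((hashes[i] % MAP_SIZE as u64) as usize);
        }
    }
    slots
}
```

A program with a dead load and a duplicated load yields the same slot set as its trimmed form, while changing a load's parameter changes the hash of the load and of everything downstream of it.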

This also encodes a broader smite design principle: coverage is driven only by side-effecting work. Pure setup instructions that never feed a Send/Recv produce zero coverage and AFL never queues them. The fuzzing signal lines up with the minimizer's notion of "useful work" so trimming can't change coverage.

The e2e test asserts five signals, all scraped from AFL's own AFL_DEBUG=1 output so the cdylib stays instrumentation-free:

  1. Found 'afl_custom_<name>' lines at startup for all six exported hooks.
  2. Queue filenames carry smite-ir:<last_sequence> from afl_custom_describe, with both branches of mutate_stacked surfacing (fresh and one of op-param / input-swap).
  3. [Custom Trimming] START confirms afl_custom_init_trim is invoked.
  4. START: Max 1 confirms the DCE+CSE pipeline shrank at least one input. The seed corpus mixes a DCE-reducible program (dead LoadAmount appended) and a CSE-reducible one (duplicate LoadPrivateKey injected) so both minimizer paths can fire.
  5. [Custom Trimming] SUCCESS confirms AFL persisted at least one trimmed candidate — the trimmed bytes' coverage cksum matched the original. Verifies DCE+CSE preserve coverage end-to-end, not just that we offered a smaller candidate.
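The log-based checks could be expressed as a small helper like the one below. This is a sketch: `missing_signals` is a hypothetical name, the marker strings are copied from the list above and not independently verified against AFL++ output, and signal 2 (queue filenames) is left out because it is read from the queue directory, not the log.

```rust
/// Hook names taken from the list of six exported hooks.
const EXPECTED_HOOKS: [&str; 6] = [
    "afl_custom_mutator",
    "afl_custom_describe",
    "afl_custom_init_trim",
    "afl_custom_trim",
    "afl_custom_post_trim",
    "afl_custom_splice_optout",
];

/// Returns the expected markers that do not appear anywhere in `log`.
/// An empty result means signals 1, 3, 4, and 5 were all observed.
fn missing_signals(log: &str) -> Vec<String> {
    let mut missing = Vec::new();

    // Signal 1: every exported hook was discovered at startup.
    for hook in EXPECTED_HOOKS.iter() {
        let needle = format!("Found '{}'", hook);
        if !log.contains(&needle) {
            missing.push(needle);
        }
    }

    // Signals 3-5: trimming started, offered one step, and succeeded.
    // Note: a plain substring match on "START: Max 1" would also accept
    // "Max 10"; that is fine here because the shim never offers more than one.
    let markers = [
        "[Custom Trimming] START",
        "START: Max 1",
        "[Custom Trimming] SUCCESS",
    ];
    for marker in markers.iter() {
        if !log.contains(marker) {
            missing.push(marker.to_string());
        }
    }
    missing
}
```

Returning the list of missing markers, instead of a bare boolean, makes a failing run report exactly which hook or trim stage never showed up.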

An afl-e2e.yml workflow runs this test on PRs that touch the AFL-relevant crates. cargo-afl is pinned via CARGO_AFL_VERSION and cached on that key, so bumping the env var invalidates the cache and selects the new version.

@Chand-ra

Maybe it's a good idea to draft this until #53 gets merged? That way we can avoid any accidental merges or merge conflicts in the meantime.

@erickcestari erickcestari marked this pull request as draft April 24, 2026 19:04
@erickcestari
Contributor Author

> Maybe it's a good idea to draft this until #53 gets merged? That way we can avoid any accidental merges or merge conflicts in the meantime.

Sure, I also need to rebase it. It will probably only be merged after the first milestone is finished.

Owner

@morehouse morehouse left a comment

I think the iterator pattern doesn't fit very well with the duplicate-load and nop minimizers -- both could be fully minimized in a single call, and we wouldn't ever expect the post_trim call to report failure for them. Both could also be easily optimized to linear algorithms.

Since the goal is to actually trim/reduce inputs, I think we should also consider deleting the nops we insert -- currently the nops continue to take up space in the input.

Comment thread smite-ir/src/operation.rs Outdated
Comment thread smite-ir/src/operation.rs Outdated
Comment thread smite-ir/src/operation.rs Outdated
Comment thread smite-ir/src/minimizers/suffix_cutting.rs Outdated
Comment thread smite-ir/src/minimizers/nopping.rs Outdated
Comment thread smite-ir/src/minimizers/nopping.rs Outdated
Comment thread smite-ir/src/minimizers/duplicate_load.rs Outdated
Comment thread smite-ir-mutator/src/lib.rs Outdated
Comment thread smite-ir-mutator/src/lib.rs Outdated
@erickcestari erickcestari force-pushed the add-minimizers branch 3 times, most recently from 80c2f9c to 6d5eee6 Compare May 3, 2026 15:13
@erickcestari erickcestari requested a review from morehouse May 3, 2026 15:17
@erickcestari erickcestari marked this pull request as ready for review May 3, 2026 16:05
@erickcestari
Contributor Author

> I think the iterator pattern doesn't fit very well with the duplicate-load and nop minimizers -- both could be fully minimized in a single call, and we wouldn't ever expect the post_trim call to report failure for them. Both could also be easily optimized to linear algorithms.
>
> Since the goal is to actually trim/reduce inputs, I think we should also consider deleting the nops we insert -- currently the nops continue to take up space in the input.

I've refactored almost all of the code. I've ensured that all pipeline trimming occurs once at afl_custom_init_trim. I've also made the algorithms linear O(N). I've completely removed the NOPs and the Iterator pattern.

Owner

@morehouse morehouse left a comment

Nice, this is much cleaner.

Comment thread smite-ir/src/tests.rs Outdated
Comment thread smite-ir/src/tests.rs
Comment thread smite-ir/src/tests.rs Outdated
Comment thread smite-ir/src/tests.rs
Comment thread smite-ir/src/tests.rs
Comment thread smite-ir/src/minimizers.rs Outdated
Comment thread smite-ir/src/tests.rs
Comment thread smite-ir-mutator/src/lib.rs Outdated
Comment thread smite-ir-mutator/src/lib.rs Outdated
Comment thread smite-ir-mutator/src/lib.rs Outdated
@erickcestari erickcestari force-pushed the add-minimizers branch 4 times, most recently from 2579ee6 to 872a31c Compare May 8, 2026 00:43
@erickcestari erickcestari requested a review from morehouse May 8, 2026 00:44
Owner

@morehouse morehouse left a comment

It would be good to do an e2e test with this to make sure our custom trim actually runs and is useful.

Comment thread smite-ir/src/tests.rs
Comment thread smite-ir-mutator/src/lib.rs
return 0;
}

1
Owner

We may also need to update last_sequence so we can tell if trim is actually being used by AFL.

Comment thread smite-ir/src/minimizers/cse.rs Outdated
@erickcestari erickcestari force-pushed the add-minimizers branch 11 times, most recently from 19a5aaa to 8217cd7 Compare May 9, 2026 19:15
Comment thread smite-ir-mutator/src/lib.rs Outdated
Comment thread smite-ir-harness/Cargo.toml Outdated
Comment thread smite-ir-mutator/tests/afl_custom_mutator_e2e.rs Outdated
Comment thread smite-ir-mutator/tests/afl_custom_mutator_e2e.rs Outdated
Comment thread smite-ir-mutator/tests/afl_custom_mutator_e2e.rs Outdated
Comment thread smite-ir-harness/Cargo.toml Outdated
Comment thread smite-ir-harness/src/main.rs Outdated
Comment thread .github/workflows/afl-e2e.yml Outdated
Comment thread .github/workflows/afl-e2e.yml
@erickcestari erickcestari force-pushed the add-minimizers branch 4 times, most recently from f127082 to dbeb4b7 Compare May 14, 2026 12:35
@erickcestari erickcestari requested a review from morehouse May 14, 2026 12:38
@morehouse
Owner

I'm starting to second-guess this new e2e test. It's become quite contrived, almost to the point that it will pass by construction. The way we define the custom coverage metric almost guarantees AFL will consider trimming a success.

We already have good unit tests that cover a significant portion of what the e2e test covers, so it seems like the e2e test brings in a lot of machinery for the marginal improvement to coverage.

What I really want to know is whether the minimizers are useful against a real target. I propose that we drop the e2e test and run a short LDK or LND fuzzing campaign with AFL_DEBUG=1. If we see "[Custom Trimming] SUCCESS" lines in the debug output, that's a good indicator that the minimizers are useful today, and let's get this merged. If not, a few things could be going on -- either our current mutators/generators don't really generate programs that are minimizable (in which case let's retest once we have more diversity in that area) or the instability of our fuzzing targets prevents AFL from accepting the minimizations (in which case trimming is broken for us anyway).

`has_side_effects` returns `true` for operations that have I/O side
effects (`SendMessage` and `RecvAcceptChannel`) and therefore cannot
be dropped by DCE or deduplicated by CSE. Used by both minimizers
introduced in the next commit.

Also derive `Hash` on `Operation` and `AcceptChannelField` so CSE can
key its canonical map on `(operation, canonicalized_inputs)`.

Introduces the `Minimizer` trait (mirroring the `Mutator` trait shape)
and two implementations that shrink an IR program in place:

    fn minimize(&self, program: &mut Program) -> bool;

The bool reports whether the program was modified, so callers can
skip an `==` walk over every instruction.
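A sketch of how the trait shape and its bool return compose, with toy stand-ins: `Program` here is a plain list of integers, and `DedupAdjacent`, `DropZeros`, and `run_pipeline` are hypothetical names, not part of this change.

```rust
// Hypothetical stand-in for the IR program type.
#[derive(Debug, Default, PartialEq)]
struct Program {
    instructions: Vec<u32>,
}

trait Minimizer {
    /// Shrinks `program` in place; returns true if it was modified.
    fn minimize(&self, program: &mut Program) -> bool;
}

// Toy minimizers on unit structs, used only to exercise the trait.
struct DedupAdjacent;
struct DropZeros;

impl Minimizer for DedupAdjacent {
    fn minimize(&self, program: &mut Program) -> bool {
        let before = program.instructions.len();
        program.instructions.dedup();
        program.instructions.len() != before
    }
}

impl Minimizer for DropZeros {
    fn minimize(&self, program: &mut Program) -> bool {
        let before = program.instructions.len();
        program.instructions.retain(|&x| x != 0);
        program.instructions.len() != before
    }
}

/// Runs every minimizer once, in order, and reports whether any of them
/// changed the program. `|=` avoids short-circuiting later passes.
fn run_pipeline(minimizers: &[&dyn Minimizer], program: &mut Program) -> bool {
    let mut changed = false;
    for minimizer in minimizers {
        changed |= minimizer.minimize(program);
    }
    changed
}
```

Because each pass reports its own change, a caller can decide whether to re-serialize without comparing the old and new programs.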

- `DeadCodeEliminator` keeps an instruction if it has side effects or
  is referenced by a later kept instruction. A reverse pass marks
  liveness; a forward pass consumes the program and rewrites the
  surviving instructions' inputs to their new indices.
- `CommonSubexpressionEliminator` merges instructions that compute
  the same expression. A single forward pass canonicalizes inputs as
  it goes and dedupes via a `HashMap` keyed on
  `(operation, canonicalized_inputs)`. SSA guarantees inputs are
  already canonicalized by the time we reach each instruction, so
  the merge is transitive: two compute ops whose inputs collapsed to
  the same canonical loads are themselves recognized as equivalent.

Both transforms are safe in IR semantics (don't change observable
behaviour modulo `SendMessage`/`RecvAcceptChannel` side-effects), so
they don't take an oracle.

Wires the `DeadCodeEliminator` and `CommonSubexpressionEliminator`
minimizers into AFL++'s custom-mutator trim ABI as a single composed
pass. Both are deterministic in-process transforms safe in IR
semantics, so we run them once during `afl_custom_init_trim`,
serialize the result into `out_buf`, and offer it to AFL as a single
candidate.

`afl_custom_init_trim` returns `1` if either minimizer reports a
change (or `0` if both no-op'd; AFL skips trim entirely).
`afl_custom_trim` hands back the pre-serialized buffer.
`afl_custom_post_trim` returns `1` unconditionally to terminate AFL's
`while (stage_cur < stage_max)` loop after the single iteration. AFL
itself decides whether to persist the trimmed bytes based on its
coverage-cksum check; we don't need to track partial state across
iterations because there's only one.

Minimal AFL++ harness binary that decodes a postcard-encoded `Program`,
validates it, and publishes coverage manually to `__afl_area_ptr`. The
e2e test for the custom mutator's trim pipeline drives `afl-fuzz`
against this binary.

The bitmap must be bit-identical across DCE/CSE-trimmed variants of
the same program (so AFL's trim cksum accepts shrunk candidates) yet
vary under our mutators (so AFL queues new entries). Any compiler-
inserted edge whose hit count tracks `program.instructions.len()`
fails the first half: DCE/CSE move the count across AFL's hit-count
buckets and the cksum mismatches. `postcard::from_bytes` and
`Program::validate` both contain such loops, and rustc doesn't expose
a SanitizerCoverage allowlist to exclude them.

So the harness is built with
`RUSTFLAGS=-Cllvm-args=-sanitizer-coverage-level=0` and publishes
coverage manually: for each instruction reachable (via `inputs`) from
a side-effect root (`SendMessage`, `RecvAcceptChannel`), mark a slot
derived from a content hash of `(operation, hashes of inputs)`.
Because the hash folds input content (not indices), DCE renumbering
doesn't change it; CSE merges duplicates whose hashes were already
equal; `OperationParamMutator` shifts an operation's hash;
`InputSwapMutator` rewires an edge and shifts the consumer's hash.

This also encodes a broader smite design principle: coverage is
driven only by side-effecting work. Pure setup instructions that
never feed a Send/Recv produce zero coverage and AFL never queues
them. The fuzzing signal lines up with the minimizer's notion of
"useful work", the same reachability DCE uses, so trimming can't
change coverage.

The crate is workspace-excluded so AFL's link-arg insertions don't
leak into the rest of the workspace. We deliberately don't use the
`afl` crate: its `fuzz!` macro forces persistent + shmem delivery,
which hangs during AFL's calibration when SanitizerCoverage is off.
The harness calls `__afl_manual_init` and reads stdin instead.

Drives the real `afl-fuzz` binary against the `smite-ir-e2e-test`
harness with our cdylib loaded as `AFL_CUSTOM_MUTATOR_LIBRARY`, then
asserts every hook we export is actually used in a real fuzzing run.
All signals come from AFL's own `AFL_DEBUG=1` output, so the cdylib
stays instrumentation-free.

Five signals checked:

1. `Found 'afl_custom_<name>'` lines at startup for all six hooks
   we export (init/fuzz/deinit bundled as `afl_custom_mutator`, plus
   describe, init_trim, trim, post_trim, splice_optout).
2. Queue filenames carry `smite-ir:<last_sequence>` from
   `afl_custom_describe`. Both branches of `mutate_stacked` must
   surface: `fresh` (regenerate) and one of `op-param` / `input-swap`
   (stacked mutation).
3. `[Custom Trimming] START` lines confirm `afl_custom_init_trim`
   is invoked.
4. `START: Max 1` confirms the DCE+CSE pipeline shrank at least one
   input. The seed corpus mixes a DCE-reducible program (dead
   `LoadAmount` appended) and a CSE-reducible one (duplicate
   `LoadPrivateKey` injected mid-program) so both minimizer paths
   can fire.
5. `[Custom Trimming] SUCCESS` confirms AFL persisted at least one
   trimmed candidate, i.e. the trimmed bytes' coverage cksum matched
   the original. Verifies DCE+CSE preserve coverage end-to-end --
   relies on the harness's DCE/CSE-invariant signal.

The harness is built with
`RUSTFLAGS=-Cllvm-args=-sanitizer-coverage-level=0` from this test
(cargo-afl appends user RUSTFLAGS to its own, and LLVM honors the
last `-Cllvm-args=` seen). `AFL_MAP_SIZE` + `AFL_SKIP_BIN_CHECK` are
set because sancov is off so `__afl_final_loc` is 0 and AFL wouldn't
otherwise know the binary is fuzzable.

Marked `#[ignore]` so `cargo test` skips it by default; spawns
afl-fuzz for ~30s. Skips cleanly if `cargo-afl` isn't on `PATH`.
Working files land in `/tmp/smite-e2e/` so they survive a panic for
post-mortem.

Runs the smite-ir-mutator e2e test on PRs and pushes to master that
touch the AFL-relevant crates (smite-ir, smite-ir-mutator,
smite-ir-e2e-test, workspace manifests, or the workflow itself).
Installs `cargo-afl` (cached across runs), then runs the `#[ignore]`
test with `--ignored`.

Kept as a separate workflow rather than a step in `rust.yml` because
the AFL toolchain install + harness build adds several minutes; the
fast Rust gate stays fast. On failure, tars `/tmp/smite-e2e/` (seeds,
queue, AFL stdout/stderr) and uploads it as an artifact -- AFL queue
filenames contain colons, which actions/upload-artifact rejects, so
the tarball is required.
3 participants