syz-verifier, executor, pkg/flatrpc: implement memory comparison #13

Open
natitati4 wants to merge 1 commit into dBransky:verifier-dev from natitati4:memory-comparison

@natitati4 natitati4 commented Apr 20, 2026

This PR introduces the memory comparison engine to detect memory divergences across different kernel versions in syz-verifier.

syz-verifier:
Isolate the memory policy engine into a dedicated memcmp.go module to track and evaluate mismatches. The verifier now sets the new MemCmp execution flag when requesting program execution to instruct the executor to provide memory data.
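
As a rough illustration of the kind of check such a policy engine might perform, the sketch below compares per-region hashes reported by two kernels and lists the regions that differ. The `VMAHash` and `Mismatch` types and the `diffVMAs` function are hypothetical names chosen for this example and are not taken from memcmp.go.

```go
package main

import (
	"fmt"
	"sort"
)

// VMAHash is a hypothetical per-region record: a label for the mapping
// and a hash of its contents at the end of execution.
type VMAHash struct {
	Name string
	Hash uint64
}

// Mismatch describes one region whose hash differs between two kernels.
type Mismatch struct {
	Name string
	A, B uint64
}

// diffVMAs matches regions by name and returns those whose hashes differ,
// sorted by name. Regions present on only one side are ignored in this sketch.
func diffVMAs(a, b []VMAHash) []Mismatch {
	byName := make(map[string]uint64, len(b))
	for _, v := range b {
		byName[v.Name] = v.Hash
	}
	var out []Mismatch
	for _, v := range a {
		if h, ok := byName[v.Name]; ok && h != v.Hash {
			out = append(out, Mismatch{Name: v.Name, A: v.Hash, B: h})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	kernelA := []VMAHash{{"scratch", 0x1111}, {"stack", 0x2222}}
	kernelB := []VMAHash{{"scratch", 0x1111}, {"stack", 0x3333}}
	for _, m := range diffVMAs(kernelA, kernelB) {
		fmt.Printf("mismatch in %s: %#x vs %#x\n", m.Name, m.A, m.B)
	}
}
```

Matching by name rather than by address is a design assumption here; it sidesteps address differences between kernels but would need a more careful key if several regions share a name.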

executor:
Introduce ptrace interception into the execution loop to capture baseline and final memory hashes. Add 2 hook points into the child that raise SIGSTOP, which the parent catches via ptrace - one when the child starts and one when it is about to exit. This allows the parent to collect memory information about the child to send back to the verifier safely. Gate memory comparison related operations behind flag_memcmp to avoid incurring overhead for other tools that rely on the executor.
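
The baseline/final idea can be sketched independently of ptrace: hash a region once at the first stop and once at the second, then compare. The example below is written in Go for illustration (the executor itself is C++), simulates the region with a byte slice, and assumes FNV-1a as the hash. The PR text does not say which hash function is used, and `hashRegion` and `Snapshot` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashRegion returns the 64-bit FNV-1a hash of a memory snapshot.
// FNV-1a is an assumption made for this sketch.
func hashRegion(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// Snapshot holds the hashes taken at the two stop points.
type Snapshot struct {
	Baseline uint64 // taken when the child starts
	Final    uint64 // taken when the child is about to exit
}

// Changed reports whether the region's contents changed between the stops.
func (s Snapshot) Changed() bool { return s.Baseline != s.Final }

func main() {
	mem := []byte{0, 0, 0, 0}
	s := Snapshot{Baseline: hashRegion(mem)}
	mem[2] = 0x41 // simulate the program writing to the region
	s.Final = hashRegion(mem)
	fmt.Println("changed:", s.Changed())
}
```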

pkg/flatrpc:
Integrate said MemCmp flag into ExecFlags via FlatBuffers.


Unsolved problems:

Bucketing memory mismatches in crash report

There is currently no unique "name" for a program, so all mismatch reports appear on the web page under a single title. That title is capped at 100 reports, so we are probably missing some. There are a couple of possible solutions to this:

  • Compute a memory hash after every syscall, so that we know exactly where the runs diverged, just like in the errno comparison
    • Pros
      • Lets us bucket mismatches like errno
      • Saves us work triaging to find the exact syscall where the memory contents diverged
    • Cons
      • Increases the program execution time a lot, since we now sweep the entire memory multiple times during the execution
      • Increases data saved (in OutputData, which is the next problem) and returned by the executor
  • Use some kind of semantic parsing (like having a list of "interesting" syscalls and bucketing by them)
    • Pros
      • Entirely on the verifier side, saving work for the executor
    • Cons
      • Harder to implement
      • Not 100% accurate
  • Open the PR upstream and ask the syzkaller maintainers what the best way to deal with this is
    • Pros
      • More likely to get merged once we do what they recommend
    • Cons
      • Will probably take time for them to respond
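
To make the first option concrete, here is a minimal sketch of how per-syscall hashes could drive bucketing: find the first call index at which two runs disagree and use that call's name in the report title. The function names, the title format, and the assumption that the executor returns one hash per call are all hypothetical.

```go
package main

import "fmt"

// firstDivergence returns the index of the first call whose post-call hash
// differs between the two runs, or -1 if no difference is found.
// If the lengths differ, only the common prefix is compared.
func firstDivergence(a, b []uint64) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	return -1
}

// bucketTitle builds a report title from the name of the diverging call,
// so that mismatches caused by the same syscall land in the same bucket.
func bucketTitle(calls []string, a, b []uint64) string {
	i := firstDivergence(a, b)
	if i < 0 || i >= len(calls) {
		return "memory mismatch: no diverging call found"
	}
	return fmt.Sprintf("memory mismatch after %s", calls[i])
}

func main() {
	calls := []string{"openat", "mmap", "write"}
	runA := []uint64{10, 20, 30}
	runB := []uint64{10, 21, 31}
	fmt.Println(bucketTitle(calls, runA, runB))
}
```

Only the first diverging call is reported because later hashes usually differ as a consequence of the first difference.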

IPC Data Bloat and OutputData Limits

The fork server parent (executor 'exec') has to add the VMA arrays to the shared OutputData struct in order to pass the memory comparison info to the orchestrator (executor 'runner'). OutputData has a limited size (256KB), probably for a good reason, and changing that size is very likely to be flagged by the syzkaller maintainers. Possible solutions:

  • Decrease the size of the arrays/VMA struct (what we do now)
    • Pros
      • Easy hack
    • Cons
      • Not scalable
  • Research how the executor transfers other high-volume data back to the verifier, and do something similar. The most notable example is the coverage signals, which (if I understand correctly) are not passed through the regular fbs route but via inter-VM shared memory.
    • Pros
      • More likely to be accepted as it does not bloat the hot inter-process shared memory
    • Cons
      • Requires research, and probably hard to implement
  • Open the PR upstream and ask the syzkaller maintainers what the best way to deal with this is
    • Pros & Cons as above
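
For the first option, the core constraint is simply how many fixed-size entries fit in whatever part of the buffer is set aside for them. The sketch below shows one way to cap the array and record that truncation happened, so the verifier can tell an incomplete comparison from a clean one. The entry layout, the 24-byte size, and all names are assumptions made for illustration and do not reflect the actual OutputData layout.

```go
package main

import "fmt"

// vmaEntrySize is a hypothetical serialized size: start, end and hash
// stored as three 8-byte fields.
const vmaEntrySize = 24

// VMAEntry is a hypothetical per-region record.
type VMAEntry struct {
	Start, End, Hash uint64
}

// fitToBudget returns the longest prefix of entries whose serialized size
// fits in budget bytes, and whether any entries were dropped.
func fitToBudget(entries []VMAEntry, budget int) ([]VMAEntry, bool) {
	limit := budget / vmaEntrySize
	if limit < 0 {
		limit = 0
	}
	if len(entries) <= limit {
		return entries, false
	}
	return entries[:limit], true
}

func main() {
	entries := make([]VMAEntry, 5)
	kept, truncated := fitToBudget(entries, 2*vmaEntrySize)
	fmt.Println("kept:", len(kept), "truncated:", truncated)
}
```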

Speed

syz-verifier is currently entirely sequential, which prevents us from using multiple VMs, and probably more CPUs/procs as well. We probably need to find a way to parallelize its work.
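
One common shape for this in Go is a fixed pool of workers, one per VM, pulling program indices from a channel. The sketch below is only a generic worker pool under that assumption. `verifyFunc` and `runParallel` are hypothetical names, and the real verifier would also need to handle VM failures and result ordering across kernels.

```go
package main

import (
	"fmt"
	"sync"
)

// verifyFunc stands in for running one program on one VM and returning
// a result summary.
type verifyFunc func(vm int, prog string) string

// runParallel distributes progs over vms workers and returns the results
// in input order.
func runParallel(progs []string, vms int, verify verifyFunc) []string {
	if vms < 1 {
		vms = 1
	}
	results := make([]string, len(progs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for vm := 0; vm < vms; vm++ {
		wg.Add(1)
		go func(vm int) {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker, so these
				// writes never touch the same element concurrently.
				results[i] = verify(vm, progs[i])
			}
		}(vm)
	}
	for i := range progs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	progs := []string{"prog0", "prog1", "prog2", "prog3"}
	out := runParallel(progs, 2, func(vm int, prog string) string {
		return prog + " verified"
	})
	for _, r := range out {
		fmt.Println(r)
	}
}
```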

And many other features/optimizations/rewrites/cleanups.

@natitati4 natitati4 force-pushed the memory-comparison branch 5 times, most recently from 2652f9b to 8feb144 on April 23, 2026 01:27
Comment thread executor/common.h Outdated
Comment thread executor/common.h Outdated
Comment thread executor/common.h Outdated
Comment thread executor/executor.cc
if (sscanf(line, "%llx-%llx %7s %*s %*s %*s %127[^\n]", &start, &end, perms, name_buf) < 3)
	continue;

if (perms[0] != 'r')
Collaborator

Which VMAs can't we read? Aren't those the VMAs we decided we don't want to read and compare anyway?

Author

No, those are handled separately a few lines above. This is just a defensive check: if a VMA is unreadable, process_vm_readv will not read anything and hashing will be useless.

Collaborator

For now, because we compare only the scratchpad, we have no problem reading straight from /proc/pid/mem using the virtual addresses in /proc/pid/maps. Later, if we want to read unreadable areas, it may be worth reading straight from /dev/mem or /proc/kcore (like https://github.com/jtsylve/LiME) so that we aren't limited by the maps permissions when comparing memory areas.
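
For readers following the thread, the check under discussion amounts to parsing each /proc/<pid>/maps line and skipping mappings without the read bit. Below is a small Go rendering of that idea (the executor code in the diff is C++ and uses sscanf). `Mapping` and `parseMapsLine` are hypothetical names, and unlike the snippet above this sketch requires all five leading fields to be present.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Mapping is one parsed line of /proc/<pid>/maps.
type Mapping struct {
	Start, End uint64
	Perms      string
	Name       string // empty for anonymous mappings
}

// parseMapsLine parses a line of the form
//   start-end perms offset dev inode [pathname]
// and reports whether the line was well-formed.
func parseMapsLine(line string) (Mapping, bool) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return Mapping{}, false
	}
	lo, hi, ok := strings.Cut(fields[0], "-")
	if !ok {
		return Mapping{}, false
	}
	start, err := strconv.ParseUint(lo, 16, 64)
	if err != nil {
		return Mapping{}, false
	}
	end, err := strconv.ParseUint(hi, 16, 64)
	if err != nil {
		return Mapping{}, false
	}
	m := Mapping{Start: start, End: end, Perms: fields[1]}
	if len(fields) > 5 {
		m.Name = strings.Join(fields[5:], " ")
	}
	return m, true
}

// Readable reports whether the mapping has the read permission bit set.
func (m Mapping) Readable() bool { return strings.HasPrefix(m.Perms, "r") }

func main() {
	lines := []string{
		"7f0000000000-7f0000001000 rw-p 00000000 00:00 0 [stack]",
		"7f0000002000-7f0000003000 ---p 00000000 00:00 0",
	}
	for _, l := range lines {
		if m, ok := parseMapsLine(l); ok && m.Readable() {
			fmt.Printf("would hash %#x-%#x %q\n", m.Start, m.End, m.Name)
		}
	}
}
```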

Comment thread syz-verifier/memcmp.go
@natitati4 natitati4 force-pushed the memory-comparison branch from 8feb144 to 806b2c7 on May 12, 2026 06:33
This commit introduces a differential memory comparison engine to detect
memory divergences across different kernel versions in syz-verifier.

syz-verifier:
Isolate the memory policy engine into a dedicated memcmp.go module to track
and evaluate mismatches. The verifier now sets the new MemCmp execution flag
when requesting program execution to instruct the executor to provide memory
data.

executor:
Introduce ptrace interception into the execution loop to capture baseline
and final memory hashes. Add 2 hook points into the child that raise SIGSTOP,
which the parent catches via ptrace - one when the child starts and one when
it is about to exit. This allows the parent to collect memory information about
the child to send back to the verifier safely. Gate memory comparison related
operations behind flag_memcmp to avoid incurring overhead for other tools that
rely on the executor.

pkg/flatrpc:
Integrate said MemCmp flag into ExecFlags via FlatBuffers.
@natitati4 natitati4 force-pushed the memory-comparison branch from 806b2c7 to 74e0017 on May 12, 2026 06:38

2 participants