Add backend regression benchmark harness #546

Draft
Byron wants to merge 2 commits into rust-lang:main from Byron:baseline-bench

Byron (Member) commented Apr 16, 2026

Add a backend regression benchmark harness that compares the current checkout against a known-good pre-#502 commit on the same runner.

This is intended as a regression guard for backend-sensitive performance paths, not as a fix for the underlying slowdown. The benchmark should reproduce the issue on main, establish a baseline from before the regression, and fail when the current code exceeds the allowed slowdown derived from measurement uncertainty.

Tasks

  • wait for the official fix from @folkertdev (needs additions to the public API of zlib-rs)
  • review the benchmark harness, or drop it if it's too messy (we can also just hope this doesn't happen again)

How it works

This adds an ignored integration test, tests/backend-regression-bench.rs, that:

  • benchmarks a small set of backend-sensitive compress/decompress cases in release mode
  • creates a detached worktree at KNOWN_GOOD_COMMIT
  • generates a temporary Cargo project whose manifest points flate2 at that known-good worktree
  • copies tests/support/backend-regression-driver.rs into that temporary project and runs it to collect baseline measurements
  • compares the current run against the baseline and fails if the observed slowdown exceeds the combined uncertainty of the current and baseline measurements
  • writes CSV artifacts to target/backend-bench/*.csv
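
The pass/fail rule in the comparison step could look roughly like the sketch below. It is not taken from the harness: the `Measurement` type, the `summarize` and `is_regression` names, the use of the standard error of the mean, and combining the two uncertainties in quadrature are all assumptions made for illustration.

```rust
/// Summary of repeated timings for one benchmark case, in nanoseconds.
/// Hypothetical type, not part of the harness in this PR.
#[derive(Debug, Clone, Copy)]
struct Measurement {
    mean_ns: f64,
    std_err_ns: f64,
}

/// Reduce raw samples to a mean and the standard error of that mean.
fn summarize(samples: &[f64]) -> Measurement {
    assert!(samples.len() >= 2, "need at least two samples");
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance with Bessel's correction (n - 1).
    let variance = samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / (n - 1.0);
    Measurement {
        mean_ns: mean,
        std_err_ns: (variance / n).sqrt(),
    }
}

/// Report a regression only when the slowdown is larger than what the noise
/// of both runs can explain. Assumption: the two uncertainties are
/// independent and are combined in quadrature, scaled by `sigmas`.
fn is_regression(baseline: Measurement, current: Measurement, sigmas: f64) -> bool {
    let combined = (baseline.std_err_ns.powi(2) + current.std_err_ns.powi(2)).sqrt();
    current.mean_ns - baseline.mean_ns > sigmas * combined
}
```

With a rule of this shape, a noisy runner widens the allowed slowdown instead of producing spurious failures, at the cost of missing small regressions on noisy hardware.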

The baseline driver is kept as a normal Rust source file instead of an embedded string so the setup remains easy to reuse and reason about across historical and current revisions.
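
The worktree and temporary-project steps described above might be sketched as follows. The helper names, the package name, and the manifest layout are hypothetical; only `git worktree add --detach` and Cargo path dependencies are existing features. The sketch also assumes the worktree path contains no single quote, since it is written as a TOML literal string.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the git invocation that checks out the known-good
/// commit into a detached worktree at `dest`.
fn worktree_add_command(repo: &Path, dest: &Path, commit: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.current_dir(repo)
        .args(["worktree", "add", "--detach"])
        .arg(dest)
        .arg(commit);
    cmd
}

/// Render a manifest for the temporary project whose `flate2` dependency
/// points at the worktree. `backend_feature` mirrors the
/// `--features <backend> --no-default-features` invocation; `None` keeps the
/// default backend.
fn baseline_manifest(worktree: &Path, backend_feature: Option<&str>) -> String {
    let dep = match backend_feature {
        Some(feature) => format!(
            "flate2 = {{ path = '{}', default-features = false, features = [\"{}\"] }}",
            worktree.display(),
            feature
        ),
        None => format!("flate2 = {{ path = '{}' }}", worktree.display()),
    };
    format!(
        "[package]\nname = \"backend-regression-baseline\"\nversion = \"0.0.0\"\nedition = \"2021\"\npublish = false\n\n[dependencies]\n{}\n",
        dep
    )
}
```

Building the `Command` separately from running it keeps the setup easy to inspect in logs before anything touches the repository.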

CI

This also wires the backend regression benchmark into CI via a dedicated workflow:

  • .github/workflows/backend-regression.yml

That workflow now runs on:

  • push
  • pull_request
  • workflow_dispatch
  • nightly schedule

Each matrix job runs the ignored benchmark in release mode for one backend and uploads the generated CSV artifacts.

How to use

Run the default backend locally with:

cargo test --release --test backend-regression-bench -- --ignored --exact backend_regression_bench --nocapture

Run another backend, for example zlib-rs, with:

cargo test --release --test backend-regression-bench --features zlib-rs --no-default-features -- --ignored --exact backend_regression_bench --nocapture

Inspect artifacts after a run in:

target/backend-bench/*.csv
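
For a quick look at the artifacts, a small parser along these lines may be enough. The column layout `case,baseline_ns,current_ns` is purely an assumption for illustration; the PR does not state the CSV schema, so the field names and order would need to be adjusted to the real files.

```rust
/// One row of a hypothetical artifact with the header
/// `case,baseline_ns,current_ns`. The real schema may differ.
#[derive(Debug, PartialEq)]
struct Row {
    case: String,
    baseline_ns: f64,
    current_ns: f64,
}

/// Parse the CSV text, skipping the header line and blank lines.
fn parse_rows(csv: &str) -> Result<Vec<Row>, String> {
    let mut rows = Vec::new();
    for (idx, line) in csv.lines().enumerate().skip(1) {
        if line.trim().is_empty() {
            continue;
        }
        let fields: Vec<&str> = line.split(',').map(str::trim).collect();
        if fields.len() != 3 {
            return Err(format!("line {}: expected 3 fields", idx + 1));
        }
        let parse = |s: &str| {
            s.parse::<f64>()
                .map_err(|e| format!("line {}: {}", idx + 1, e))
        };
        rows.push(Row {
            case: fields[0].to_string(),
            baseline_ns: parse(fields[1])?,
            current_ns: parse(fields[2])?,
        });
    }
    Ok(rows)
}

/// Ratio of current to baseline time; values above 1.0 mean a slowdown.
fn slowdown(row: &Row) -> f64 {
    row.current_ns / row.baseline_ns
}
```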

Maintenance

This is meant to be maintained as a long-term regression check:

  • update KNOWN_GOOD_COMMIT only when intentionally accepting a new performance baseline
  • keep the temporary baseline driver reusable and branch-independent
  • extend the benchmark cases if other low-level performance-sensitive APIs should also be covered
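
One way to keep the case list easy to extend is a plain table of named functions, as in this sketch. The `Case` struct, `time_case`, and the placeholder workload are hypothetical and do not reflect how the driver in this PR is organized; a real case would call into `flate2` instead of `count_nonzero`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// A named benchmark case; `run` stands in for a backend-sensitive operation.
struct Case {
    name: &'static str,
    run: fn(&[u8]) -> usize,
}

/// Time `iterations` runs of one case and return the per-run durations.
fn time_case(case: &Case, input: &[u8], iterations: usize) -> Vec<Duration> {
    (0..iterations)
        .map(|_| {
            let start = Instant::now();
            // black_box keeps the optimizer from discarding the work.
            black_box((case.run)(black_box(input)));
            start.elapsed()
        })
        .collect()
}

/// Placeholder workload so the sketch compiles without `flate2`.
fn count_nonzero(input: &[u8]) -> usize {
    input.iter().filter(|b| **b != 0).count()
}

/// Covering another API means appending one entry to this table.
fn cases() -> Vec<Case> {
    vec![Case {
        name: "count-nonzero",
        run: count_nonzero,
    }]
}
```

Because the same table would be compiled into both the current and the baseline driver, a new case should only use APIs that already exist at KNOWN_GOOD_COMMIT.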

Related

Byron force-pushed the baseline-bench branch 2 times, most recently from b8a8271 to fd5d3ea, on April 16, 2026 04:46
Byron force-pushed the baseline-bench branch 3 times, most recently from 897f9a8 to 580c2b4, on April 16, 2026 05:41
Add known-good backend regression benchmark harness

Introduce an ignored integration test that measures backend-sensitive
compress/decompress cases against a known-good commit.

- the test benchmarks the current checkout in release mode
- it creates a detached worktree at KNOWN_GOOD_COMMIT on the same runner
- it generates a tiny temporary Cargo project whose manifest points
  `flate2` at that worktree
- it copies `tests/support/backend-regression-driver.rs` into that
  project and runs it to produce baseline CSV data
- it compares the current results against those baseline measurements and
  fails when the observed slowdown exceeds the combined measurement
  uncertainty of the current run and the baseline run

- run a single backend locally with:
  `cargo test --release --test backend-regression-bench -- --ignored --exact backend_regression_bench --nocapture`
- switch backends the same way the rest of the test matrix does, for
  example:
  `cargo test --release --test backend-regression-bench --features zlib-rs --no-default-features -- --ignored --exact backend_regression_bench --nocapture`
- optionally, inspect `target/backend-bench/*.csv` for the raw baseline/current
  measurements after a run

This is intended to be maintained as a regression guard:
- update KNOWN_GOOD_COMMIT only when intentionally accepting a new
  performance baseline
- keep the driver source reusable and branch-independent so the baseline
  setup stays lightweight and easy to reason about

> Add this benchmark (rust-lang#544 (comment)) to the version of the library before rust-lang#502 was merged to get a baseline for all backends. Store that baseline, and add a CI job that checks against the baseline.
  Consider other uses of performance-critical low-level methods and see if it makes sense to add more kinds of tests to the baseline and the most recent version.
  Success means that the baseline as run on main reproduces issue rust-lang#544 as well; there is no need to attempt to fix it.

Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
… toggle.

We'd want to wait for a publicly available API though.
Byron marked this pull request as draft on April 16, 2026 05:55