Add backend regression benchmark harness #546
Draft
Byron wants to merge 2 commits into rust-lang:main from
Add known-good backend regression benchmark harness

Introduce an ignored integration test that measures backend-sensitive compress/decompress cases against a known-good commit.

- the test benchmarks the current checkout in release mode
- it creates a detached worktree at KNOWN_GOOD_COMMIT on the same runner
- it generates a tiny temporary Cargo project whose manifest points `flate2` at that worktree
- it copies `tests/support/backend-regression-driver.rs` into that project and runs it to produce baseline CSV data
- it compares the current results against those baseline measurements and fails when the observed slowdown exceeds the combined measurement uncertainty of the current run and the baseline run

- run a single backend locally with: `cargo test --release --test backend-regression-bench -- --ignored --exact backend_regression_bench --nocapture`
- switch backends the same way the rest of the test matrix does, for example: `cargo test --release --test backend-regression-bench --features zlib-rs --no-default-features -- --ignored --exact backend_regression_bench --nocapture`
- optionally, inspect `target/backend-bench/*.csv` for the raw baseline/current measurements after a run

This is intended to be maintained as a regression guard:

- update KNOWN_GOOD_COMMIT only when intentionally accepting a new performance baseline
- keep the driver source reusable and branch-independent so the baseline setup stays lightweight and easy to reason about

> Add this benchmark (rust-lang#544 (comment)) to version of the library before rust-lang#502 was merged to get a baseline for all backends. Store that baseline, and add a CI job that checks against the baseline. Consider other uses of performance-critical low-level methods and see if it makes sense to add more kinds of tests to the baseline the most recent version. Success means that the baseline as run on main reproduces issue rust-lang#544 as well, there is no need to attempt to fix it.
Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
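The pass/fail rule in the commit message above (fail when the observed slowdown exceeds the combined measurement uncertainty of both runs) could look roughly like the sketch below. This is not the harness's actual code: the `Measurement` type, the choice of nanoseconds, and combining the two uncertainties in quadrature are all assumptions made for illustration.

```rust
/// Hypothetical summary of one benchmark case.
/// The real harness may record different fields.
#[derive(Debug, Clone, Copy)]
struct Measurement {
    /// Mean time per iteration, in nanoseconds.
    mean_ns: f64,
    /// Estimated uncertainty of `mean_ns`, in nanoseconds (assumed to be
    /// something like a standard error; the source does not say).
    uncertainty_ns: f64,
}

/// Assumption: the two runs are independent, so their uncertainties are
/// combined in quadrature, i.e. sqrt(a^2 + b^2).
fn combined_uncertainty(baseline: Measurement, current: Measurement) -> f64 {
    baseline.uncertainty_ns.hypot(current.uncertainty_ns)
}

/// Returns true when `current` is slower than `baseline` by more than the
/// combined uncertainty of the two runs.
fn is_regression(baseline: Measurement, current: Measurement) -> bool {
    let slowdown_ns = current.mean_ns - baseline.mean_ns;
    slowdown_ns > combined_uncertainty(baseline, current)
}
```

With this rule a speedup never fails the check, and a slowdown smaller than the noise of the two runs is tolerated rather than reported.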
… toggle. We'd want to wait for a publicly available API though.
Add a backend regression benchmark harness that compares the current checkout against a known-good pre-#502 commit on the same runner.
This is intended as a regression guard for backend-sensitive performance paths, not as a fix for the underlying slowdown. The benchmark should reproduce the issue on `main`, establish a baseline from before the regression, and fail when the current code exceeds the allowed slowdown derived from measurement uncertainty.
Tasks
zlib-rs)
How it works
This adds an ignored integration test, `tests/backend-regression-bench.rs`, that:
- creates a detached worktree at `KNOWN_GOOD_COMMIT`
- generates a tiny temporary Cargo project whose manifest points `flate2` at that known-good worktree
- copies `tests/support/backend-regression-driver.rs` into that temporary project and runs it to collect baseline measurements
- leaves the raw baseline and current measurements in `target/backend-bench/*.csv`
The baseline driver is kept as a normal Rust source file instead of an embedded string so the setup remains easy to reuse and reason about across historical and current revisions.
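The worktree and temporary-project steps can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names, the `baseline-driver` package name, and the empty `[workspace]` table (which keeps Cargo from attaching the generated project to an enclosing workspace) are assumptions, and backend feature selection is left out entirely.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Build (but do not run) `git worktree add --detach <worktree> <commit>`.
fn worktree_command(repo: &Path, worktree: &Path, commit: &str) -> Command {
    let mut cmd = Command::new("git");
    cmd.current_dir(repo)
        .args(["worktree", "add", "--detach"])
        .arg(worktree)
        .arg(commit);
    cmd
}

/// Quote a string as a TOML basic string, escaping backslashes and quotes
/// (enough for ordinary filesystem paths; not a full TOML escaper).
fn toml_basic_string(s: &str) -> String {
    let escaped = s.replace('\\', "\\\\").replace('"', "\\\"");
    format!("\"{escaped}\"")
}

/// Render a manifest whose `flate2` dependency is a path dependency on the
/// known-good worktree. The package name is hypothetical.
fn render_manifest(flate2_worktree: &Path) -> String {
    let path = toml_basic_string(&flate2_worktree.to_string_lossy());
    format!(
        "[package]\n\
         name = \"baseline-driver\"\n\
         version = \"0.0.0\"\n\
         edition = \"2021\"\n\
         \n\
         [workspace]\n\
         \n\
         [dependencies]\n\
         flate2 = {{ path = {path} }}\n"
    )
}

/// Write `Cargo.toml` and copy the driver source to `src/main.rs`.
/// Returns the path of the generated manifest.
fn write_baseline_project(
    project_dir: &Path,
    flate2_worktree: &Path,
    driver_source: &Path,
) -> io::Result<PathBuf> {
    let src_dir = project_dir.join("src");
    fs::create_dir_all(&src_dir)?;
    let manifest = project_dir.join("Cargo.toml");
    fs::write(&manifest, render_manifest(flate2_worktree))?;
    fs::copy(driver_source, src_dir.join("main.rs"))?;
    Ok(manifest)
}
```

A caller would run the command returned by `worktree_command`, call `write_baseline_project` with the path to `tests/support/backend-regression-driver.rs`, and then invoke `cargo run --release --manifest-path <manifest>` to collect the baseline output.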
CI
This also wires the backend regression benchmark into CI via a dedicated workflow:
`.github/workflows/backend-regression.yml`
That workflow now runs on:
- `push`
- `pull_request`
- `workflow_dispatch`
Each matrix job runs the ignored benchmark in release mode for one backend and uploads the generated CSV artifacts.
How to use
Run the default backend locally with:
`cargo test --release --test backend-regression-bench -- --ignored --exact backend_regression_bench --nocapture`
Run another backend, for example `zlib-rs`, with:
`cargo test --release --test backend-regression-bench --features zlib-rs --no-default-features -- --ignored --exact backend_regression_bench --nocapture`
Inspect artifacts after a run in:
Maintenance
This is meant to be maintained as a long-term regression check:
- update `KNOWN_GOOD_COMMIT` only when intentionally accepting a new performance baseline
Related
- Add `(de)compress_uninit` that accepts `&[MaybeUninit<u8>]` #502
- Reproduce the issue on `main`, compare against the pre-#502 baseline (Add `(de)compress_uninit` that accepts `&[MaybeUninit<u8>]`), and check that in CI