compression: size decompression buffers from the input, not the limit#132
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
The buffered decompression paths (gzip, zstd, and the trait's default decompress_with_limit) pre-allocated max_message_size + 1 bytes for every message when the limit was below 64 MiB. Bytes::from(Vec) keeps the full allocation alive when the length is below the capacity, so every small decompressed message was backed by a limit-sized allocation for as long as the message was held. Size the initial buffer from the compressed input instead (capped at the limit), and let it grow on demand; the existing limit enforcement during growth and the Read::take bounds are unchanged. Adds tests asserting that a small decompressed message is not backed by a limit-sized allocation, for all three paths.
e141a6e to
cbd2f0e
Compare
rpb-ant
left a comment
There was a problem hiding this comment.
The problem is real and this is the right place to fix it: the decompressed Bytes escapes into the envelope/unary/streaming paths and ultimately user code (including prost zero-copy bytes fields), so sizing the backing allocation at its origin fixes every holder at once, and generalizing the existing input-based heuristic keeps it consistent rather than inventing a new one. Limit semantics at the boundary look unchanged (exactly-at-limit still passes, one-over still fails with resource_exhausted).
The one thing I'd like to settle before merge is the worst-case transient allocation under a decompression bomb (inline at the gzip sizing): eventual rejection is unchanged, but the peak before rejection roughly doubles now that growth happens on demand. Might well be acceptable — it just deserves a deliberate call rather than falling out of the change.
Smaller, non-blocking notes beyond the inline ones:
- Worth a quick before/after run of the
unary/large_gzip/unary/large_zstdbenches to confirm the realloc cost for highly-compressible large messages is in the noise, and a line in the PR description with the result. - Nice tests overall —
backing_capacityviatry_into_mutis a neat way to observe retention; a couple of robustness suggestions inline.
| // `Bytes`, so a limit-sized allocation would stay resident for the | ||
| // lifetime of every (possibly tiny) message. The loop below grows | ||
| // the buffer on demand and enforces the limit as it grows. | ||
| let mut capacity = data.len().saturating_mul(2).max(256); |
There was a problem hiding this comment.
With the initial capacity no longer pinned at limit + 1, the growth loop below becomes the enforcement point, and its over-limit check only runs when capacity == len and uses >. That roughly doubles the peak transient size for an over-limit payload: with the default 4 MiB limit and a small compressed input, the doubling trajectory lands exactly on 4 MiB, so at cap == len == limit the len > limit check doesn't fire, output.reserve(output.len().max(4096)) doubles capacity to 8 MiB, and decompress_vec can fill all of it before the next iteration errors. Previously the buffer was pre-sized to limit + 1 and the decompressor couldn't write past it, so a bomb peaked at ~1× limit; now it's ~2×. The read_to_end paths have a milder version of the same thing (amortized growth from a small base can overshoot limit + 1 in capacity, though take means no more than limit + 1 bytes are ever written).
If a 2× peak under attack is acceptable, it's probably worth saying so explicitly in this comment. Otherwise the growth can be capped so capacity never exceeds limit + 1 (e.g. reserve min(output.len().max(4096), limit + 1 - capacity) when a limit is set) — then once the buffer fills at limit + 1 the existing > check fires and the worst case stays identical to the old behavior, while keeping the small-message win this PR is after.
| // `Bytes`, so a limit-sized allocation would stay resident for the | ||
| // lifetime of every (possibly tiny) message. `read_to_end` grows the | ||
| // buffer on demand and the `take` below still bounds the total. | ||
| let capacity = data |
There was a problem hiding this comment.
This computation (input × multiplier, floor 256, clamp to limit + 1) plus essentially the same explanatory comment now appears here, in GzipProvider::decompress_inner, and in ZstdProvider::decompress_impl, with small accidental differences in shape. A tiny private helper, e.g.
fn initial_decompress_capacity(input_len: usize, multiplier: usize, max_size: Option<usize>) -> usizewould keep the three sites from drifting, give the five-line "why" comment a single home, and be the natural place to document why gzip/the trait default guess ×2 while zstd guesses ×4 (the old zstd comment that explained its heuristic went away in this change).
| /// `Bytes::try_into_mut` reuses the original allocation when the handle | ||
| /// is unique, so the resulting `BytesMut::capacity()` exposes how much | ||
| /// memory the decompressed message actually retains. | ||
| fn backing_capacity(bytes: Bytes) -> usize { |
There was a problem hiding this comment.
This helper implicitly relies on Bytes::from(Vec) keeping the original capacity — the very behavior the production change is working around. If a future bytes release ever shrank in From<Vec>, these three tests would keep passing regardless of whether the sizing fix is present, i.e. they'd silently stop guarding the regression. A small control assertion (build a Bytes from a deliberately over-allocated Vec and check backing_capacity reports the large value) would keep them honest.
| assert_eq!(&out[..], b"tiny payload"); | ||
| let capacity = backing_capacity(out); | ||
| assert!( | ||
| capacity < 64 * 1024, |
There was a problem hiding this comment.
With the new sizing the retained capacity for this 12-byte payload is 256, so 64 * 1024 is a fairly loose bound — a future change that retained, say, 32 KiB per small envelope (still a real cost on streaming RPCs) would pass all three of these tests. Tightening to something like < 4096, or naming the threshold as a const with a sentence on how it was chosen, would make the regression guard meaningfully stronger. Applies to the zstd and default-trait variants below as well.
| http-body = "1" | ||
| http-body-util = "0.1" | ||
| bytes = { version = "1.5", features = ["serde"] } | ||
| bytes = { version = "1.6", features = ["serde"] } |
There was a problem hiding this comment.
Just confirming this floor bump (1.5 → 1.6, for Bytes::try_into_mut in the new tests) is intentional — it raises the minimum version of a publicly re-exported dependency for the sake of a test helper. Seems fine given how old 1.6 is; only flagging so it's a conscious choice.
- share the initial-capacity heuristic in initial_decompress_capacity and document the gzip/zstd multiplier choice there - cap gzip buffer growth at limit + 1 so the peak allocation for an over-limit payload matches the previous pre-allocation behaviour - name the small-message retention bound used by the tests and add a control test that the capacity probe observes over-allocation
|
[claude code] Addressed in the latest commit:
|
rpb-ant
left a comment
There was a problem hiding this comment.
Thanks — the growth cap restores the old worst case (the reserve is clamped to limit + 1 − capacity, so the buffer fills at limit + 1 and the existing > check fires; no zero-reserve stall is reachable since that state errors first), the shared initial_decompress_capacity helper with the multiplier rationale reads well, and the control test plus the named retention bound make the allocation tests meaningfully stronger. LGTM.
Two tiny non-blocking notes: initial_decompress_capacity ended up just below the // Tests banner — worth nudging it up next time this file is touched; and looking forward to the bench numbers in the description.
# Conflicts: # connectrpc/src/compression.rs
0a78914 to
524fbba
Compare
Summary
The buffered decompression paths (gzip, zstd, and the trait's default
decompress_with_limit) pre-allocatedmax_message_size + 1bytes for every message when the limit was below 64 MiB.Bytes::from(Vec)keeps the full allocation alive whenlen < cap, so every small decompressed message was backed by a limit-sized (4 MiB by default) allocation for as long as the message was held — including per-envelope messages on streaming RPCs.This change sizes the initial buffer from the compressed input (capped at the limit) and lets it grow on demand. Limit enforcement during growth and the
Read::takebounds are unchanged, and all existing roundtrip/limit tests pass.Tests
New tests assert that a small decompressed message is not backed by a limit-sized allocation, for the gzip provider, the zstd provider, and the default trait implementation (via the existing mock provider). Before this change each reported a 4,194,305-byte backing buffer for a 12-byte message.
Benchmarks
decompress_with_limiton a 4 MiB decompressed payload (limit = 4 MiB), 100 iterations after warm-up, same machine, baseline =main, branch = this PR including the growth-cap follow-up (mean / best per call):The growth-based sizing is within noise of (or slightly ahead of) the old limit-sized pre-allocation for max-sized messages, including the worst case for reallocation (highly compressible input that expands to the full limit). There is no compression-specific Criterion bench in the repo today; these numbers come from a small standalone harness that times
decompress_with_limitdirectly on the two providers.