compression: size decompression buffers from the input, not the limit by iainmcgin · Pull Request #132 · anthropics/connect-rust

iainmcgin · 2026-05-27T01:54:37Z

Summary

The buffered decompression paths (gzip, zstd, and the trait's default decompress_with_limit) pre-allocated max_message_size + 1 bytes for every message when the limit was below 64 MiB. Bytes::from(Vec) keeps the full allocation alive when len < cap, so every small decompressed message was backed by a limit-sized (4 MiB by default) allocation for as long as the message was held — including per-envelope messages on streaming RPCs.

This change sizes the initial buffer from the compressed input (capped at the limit) and lets it grow on demand. Limit enforcement during growth and the Read::take bounds are unchanged, and all existing roundtrip/limit tests pass.
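The retention effect described above can be illustrated with plain `Vec` capacities. This is a minimal sketch, not the PR's code: `limit_sized`, `input_sized`, and the `LIMIT` constant are hypothetical names, it uses only `std` rather than the `bytes` crate, and the ×2 multiplier with a 256-byte floor is borrowed from the gzip hunk quoted in the review.

```rust
/// Default message limit discussed in this PR (4 MiB).
const LIMIT: usize = 4 * 1024 * 1024;

/// Old behaviour (sketch): pre-allocate `LIMIT + 1` bytes regardless of input.
fn limit_sized(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(LIMIT + 1);
    out.extend_from_slice(payload);
    out // capacity stays at least LIMIT + 1 even for a tiny payload
}

/// New behaviour (sketch): derive the initial capacity from the compressed
/// input length, floor it at 256 bytes, and never exceed `LIMIT + 1`.
fn input_sized(compressed_len: usize, payload: &[u8]) -> Vec<u8> {
    let capacity = compressed_len.saturating_mul(2).max(256).min(LIMIT + 1);
    let mut out = Vec::with_capacity(capacity);
    out.extend_from_slice(payload); // grows on demand if the guess was low
    out
}
```

For a 12-byte payload, `limit_sized(b"tiny payload").capacity()` is at least 4,194,305, while `input_sized(32, b"tiny payload").capacity()` stays in the hundreds of bytes. Since the converted `Bytes` keeps whatever capacity the `Vec` had, the second form is what keeps small messages cheap to hold.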

Tests

New tests assert that a small decompressed message is not backed by a limit-sized allocation, for the gzip provider, the zstd provider, and the default trait implementation (via the existing mock provider). Before this change each reported a 4,194,305-byte backing buffer for a 12-byte message.

Benchmarks

decompress_with_limit on a 4 MiB decompressed payload (limit = 4 MiB), 100 iterations after warm-up, same machine, baseline = main, branch = this PR including the growth-cap follow-up (mean / best per call):

Case	main	this PR
gzip, compressible (51 KiB compressed)	0.498 / 0.483 ms	0.451 / 0.438 ms
gzip, incompressible (4.2 MiB compressed)	9.04 / 8.83 ms	9.17 / 8.83 ms
zstd, compressible	0.743 / 0.723 ms	0.725 / 0.703 ms
zstd, incompressible	0.535 / 0.519 ms	0.509 / 0.497 ms

The growth-based sizing is within noise of (or slightly ahead of) the old limit-sized pre-allocation for max-sized messages, including the worst case for reallocation (highly compressible input that expands to the full limit). There is no compression-specific Criterion bench in the repo today; these numbers come from a small standalone harness that times decompress_with_limit directly on the two providers.

github-actions · 2026-05-27T01:54:46Z

All contributors have signed the CLA ✍️ ✅
(Posted by the CLA Assistant Lite bot.)

The buffered decompression paths (gzip, zstd, and the trait's default decompress_with_limit) pre-allocated max_message_size + 1 bytes for every message when the limit was below 64 MiB. Bytes::from(Vec) keeps the full allocation alive when the length is below the capacity, so every small decompressed message was backed by a limit-sized allocation for as long as the message was held. Size the initial buffer from the compressed input instead (capped at the limit), and let it grow on demand; the existing limit enforcement during growth and the Read::take bounds are unchanged. Adds tests asserting that a small decompressed message is not backed by a limit-sized allocation, for all three paths.

rpb-ant

The problem is real and this is the right place to fix it: the decompressed Bytes escapes into the envelope/unary/streaming paths and ultimately user code (including prost zero-copy bytes fields), so sizing the backing allocation at its origin fixes every holder at once, and generalizing the existing input-based heuristic keeps it consistent rather than inventing a new one. Limit semantics at the boundary look unchanged (exactly-at-limit still passes, one-over still fails with resource_exhausted).

The one thing I'd like to settle before merge is the worst-case transient allocation under a decompression bomb (inline at the gzip sizing): eventual rejection is unchanged, but the peak before rejection roughly doubles now that growth happens on demand. Might well be acceptable — it just deserves a deliberate call rather than falling out of the change.

Smaller, non-blocking notes beyond the inline ones:

Worth a quick before/after run of the unary/large_gzip / unary/large_zstd benches to confirm the realloc cost for highly-compressible large messages is in the noise, and a line in the PR description with the result.
Nice tests overall — backing_capacity via try_into_mut is a neat way to observe retention; a couple of robustness suggestions inline.

rpb-ant · 2026-05-27T02:50:19Z

+        // `Bytes`, so a limit-sized allocation would stay resident for the
+        // lifetime of every (possibly tiny) message. The loop below grows
+        // the buffer on demand and enforces the limit as it grows.
+        let mut capacity = data.len().saturating_mul(2).max(256);


With the initial capacity no longer pinned at limit + 1, the growth loop below becomes the enforcement point, and its over-limit check only runs when capacity == len and uses >. That roughly doubles the peak transient size for an over-limit payload: with the default 4 MiB limit and a small compressed input, the doubling trajectory lands exactly on 4 MiB, so at cap == len == limit the len > limit check doesn't fire, output.reserve(output.len().max(4096)) doubles capacity to 8 MiB, and decompress_vec can fill all of it before the next iteration errors. Previously the buffer was pre-sized to limit + 1 and the decompressor couldn't write past it, so a bomb peaked at ~1× limit; now it's ~2×. The read_to_end paths have a milder version of the same thing (amortized growth from a small base can overshoot limit + 1 in capacity, though take means no more than limit + 1 bytes are ever written).

If a 2× peak under attack is acceptable, it's probably worth saying so explicitly in this comment. Otherwise the growth can be capped so capacity never exceeds limit + 1 (e.g. reserve min(output.len().max(4096), limit + 1 - capacity) when a limit is set) — then once the buffer fills at limit + 1 the existing > check fires and the worst case stays identical to the old behavior, while keeping the small-message win this PR is after.
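The two growth policies can be compared with a small integer-only model. This is a sketch under stated assumptions, not the provider code: `simulate_bomb` and `Rejected` are hypothetical names, capacity is modelled as an exact number (a real `Vec::reserve` may over-allocate), the decompressor is assumed to fill all spare capacity on every pass, and the starting capacity is picked as a power of two so that doubling lands exactly on the limit.

```rust
/// Result of feeding an endless ("bomb") stream into the modelled loop.
#[derive(Debug, PartialEq)]
struct Rejected {
    /// Largest buffer capacity reached before the limit check fired.
    peak_capacity: usize,
}

/// Models the growth loop: the over-limit check only runs when the buffer
/// is full (`len == capacity`) and uses `>`, as described in the review.
fn simulate_bomb(initial_capacity: usize, limit: usize, cap_growth: bool) -> Rejected {
    let mut capacity = initial_capacity;
    let mut len = 0usize;
    loop {
        if len == capacity {
            if len > limit {
                return Rejected { peak_capacity: capacity };
            }
            // Uncapped policy: grow by max(len, 4096), i.e. roughly double.
            let mut additional = len.max(4096);
            if cap_growth {
                // Capped policy: never let capacity exceed limit + 1.
                // `capacity <= limit` holds here, so this cannot underflow.
                additional = additional.min(limit + 1 - capacity);
            }
            capacity += additional;
        }
        // Bomb input: the decompressor fills every spare byte.
        len = capacity;
    }
}
```

With a 4 MiB limit and a 4096-byte starting capacity, the uncapped model peaks at 8,388,608 bytes before rejecting, and the capped model peaks at 4,194,305 (limit + 1), which matches the old pre-allocation.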

rpb-ant · 2026-05-27T02:50:19Z

+        // `Bytes`, so a limit-sized allocation would stay resident for the
+        // lifetime of every (possibly tiny) message. `read_to_end` grows the
+        // buffer on demand and the `take` below still bounds the total.
+        let capacity = data


This computation (input × multiplier, floor 256, clamp to limit + 1) plus essentially the same explanatory comment now appears here, in GzipProvider::decompress_inner, and in ZstdProvider::decompress_impl, with small accidental differences in shape. A tiny private helper, e.g.

fn initial_decompress_capacity(input_len: usize, multiplier: usize, max_size: Option<usize>) -> usize

would keep the three sites from drifting, give the five-line "why" comment a single home, and be the natural place to document why gzip/the trait default guess ×2 while zstd guesses ×4 (the old zstd comment that explained its heuristic went away in this change).
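One possible shape for that helper, following the computation described above (input × multiplier, floor 256, clamp to limit + 1). The signature is the one suggested in this comment; the body is a sketch, and returning the unclamped guess when no limit is set is an assumption rather than something the PR confirms.

```rust
/// Initial output-buffer capacity for a buffered decompression call.
///
/// `multiplier` is the expansion guess for the codec (the review mentions
/// x2 for gzip and the trait default, x4 for zstd). `max_size` is the
/// message size limit, if any.
fn initial_decompress_capacity(
    input_len: usize,
    multiplier: usize,
    max_size: Option<usize>,
) -> usize {
    // Guess from the compressed size, with a small floor so tiny inputs
    // do not start from a near-empty buffer.
    let guess = input_len.saturating_mul(multiplier).max(256);
    match max_size {
        // Never start above limit + 1: one byte past the limit is enough
        // to detect an over-limit payload.
        Some(limit) => guess.min(limit.saturating_add(1)),
        // Assumption: without a limit, use the guess as-is.
        None => guess,
    }
}
```

Under this sketch a 12-byte input with a 4 MiB limit starts at 256 bytes, and a 3 MiB gzip input with the same limit is clamped to 4,194,305.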

rpb-ant · 2026-05-27T02:50:19Z

+    /// `Bytes::try_into_mut` reuses the original allocation when the handle
+    /// is unique, so the resulting `BytesMut::capacity()` exposes how much
+    /// memory the decompressed message actually retains.
+    fn backing_capacity(bytes: Bytes) -> usize {


This helper implicitly relies on Bytes::from(Vec) keeping the original capacity — the very behavior the production change is working around. If a future bytes release ever shrank in From<Vec>, these three tests would keep passing regardless of whether the sizing fix is present, i.e. they'd silently stop guarding the regression. A small control assertion (build a Bytes from a deliberately over-allocated Vec and check backing_capacity reports the large value) would keep them honest.

rpb-ant · 2026-05-27T02:50:19Z

+        assert_eq!(&out[..], b"tiny payload");
+        let capacity = backing_capacity(out);
+        assert!(
+            capacity < 64 * 1024,


With the new sizing the retained capacity for this 12-byte payload is 256, so 64 * 1024 is a fairly loose bound — a future change that retained, say, 32 KiB per small envelope (still a real cost on streaming RPCs) would pass all three of these tests. Tightening to something like < 4096, or naming the threshold as a const with a sentence on how it was chosen, would make the regression guard meaningfully stronger. Applies to the zstd and default-trait variants below as well.

rpb-ant · 2026-05-27T02:50:19Z

 http-body = "1"
 http-body-util = "0.1"
-bytes = { version = "1.5", features = ["serde"] }
+bytes = { version = "1.6", features = ["serde"] }


Just confirming this floor bump (1.5 → 1.6, for Bytes::try_into_mut in the new tests) is intentional — it raises the minimum version of a publicly re-exported dependency for the sake of a test helper. Seems fine given how old 1.6 is; only flagging so it's a conscious choice.

- share the initial-capacity heuristic in initial_decompress_capacity and document the gzip/zstd multiplier choice there
- cap gzip buffer growth at limit + 1 so the peak allocation for an over-limit payload matches the previous pre-allocation behaviour
- name the small-message retention bound used by the tests and add a control test that the capacity probe observes over-allocation

iainmcgin · 2026-05-27T03:01:14Z

[claude code] Addressed in the latest commit:

Bomb peak: the gzip growth path now reserves at most up to limit + 1, so the peak allocation for an over-limit payload is the same as the old pre-allocation behaviour; the small-message win is unchanged. The read_to_end paths keep amortized growth (transiently up to ~2× of the bytes actually written, still bounded by their take(limit + 1) readers) — that trade-off is now documented on the shared helper rather than left implicit.
Deduplication: the three sizing sites now call a shared initial_decompress_capacity(input_len, multiplier, max_size), which is also where the ×2 vs ×4 multiplier choice is documented.
Test honesty: added a control test that backing_capacity observes a deliberately over-allocated Vec, and the small-message bound is now a named SMALL_MESSAGE_RETENTION_BOUND = 4096 (heuristic yields 256 today) instead of the loose 64 KiB.
bytes 1.5 → 1.6: intentional — Bytes::try_into_mut (used only by the test probe) was added in 1.6.0 (Feb 2024), and the workspace already resolves to 1.11.x in practice.
Benches: will run unary/large_gzip / unary/large_zstd before/after and add the numbers to the PR description before undrafting. Expectation: for large highly-compressible messages the manual doubling adds ~13 reallocations totalling roughly one extra memcpy of the payload, which should be in the noise next to the decompression itself — but the numbers will confirm.
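The "roughly one extra memcpy" expectation follows from the geometric series behind doubling growth, which can be checked with a few lines of arithmetic. This is a sketch with assumptions: `doubling_cost` is a hypothetical helper, the 512-byte starting capacity is picked only because it yields 13 doublings to 4 MiB (the real starting point depends on the compressed size), and every reallocation is assumed to copy the whole buffer, which is the worst case.

```rust
/// Counts the reallocations needed to double from `start` up to at least
/// `target`, and the total bytes copied if each reallocation moves the
/// full buffer (worst case; an allocator may also extend in place).
fn doubling_cost(start: usize, target: usize) -> (u32, usize) {
    let mut capacity = start.max(1); // avoid looping forever from zero
    let mut reallocations = 0u32;
    let mut bytes_copied = 0usize;
    while capacity < target {
        bytes_copied += capacity; // old contents are copied to the new block
        capacity *= 2;
        reallocations += 1;
    }
    (reallocations, bytes_copied)
}
```

`doubling_cost(512, 4 * 1024 * 1024)` returns `(13, 4_193_792)`: thirteen reallocations that together copy 512 bytes less than one full 4 MiB payload.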

rpb-ant

Thanks — the growth cap restores the old worst case (the reserve is clamped to limit + 1 − capacity, so the buffer fills at limit + 1 and the existing > check fires; no zero-reserve stall is reachable since that state errors first), the shared initial_decompress_capacity helper with the multiplier rationale reads well, and the control test plus the named retention bound make the allocation tests meaningfully stronger. LGTM.

Two tiny non-blocking notes: initial_decompress_capacity ended up just below the // Tests banner — worth nudging it up next time this file is touched; and looking forward to the bench numbers in the description.

# Conflicts: # connectrpc/src/compression.rs

iainmcgin marked this pull request as ready for review May 27, 2026 01:57

iainmcgin force-pushed the decompress-output-sizing branch from e141a6e to cbd2f0e Compare May 27, 2026 02:05

Merge branch 'main' into decompress-output-sizing

bbc852b

iainmcgin enabled auto-merge (squash) May 27, 2026 02:26

rpb-ant reviewed May 27, 2026

rpb-ant previously approved these changes May 27, 2026

iainmcgin dismissed rpb-ant’s stale review via 0a78914 May 27, 2026 03:15

rpb-ant previously approved these changes May 27, 2026

Merge remote-tracking branch 'origin/main' into decompress-output-sizing

524fbba

# Conflicts: # connectrpc/src/compression.rs

iainmcgin dismissed rpb-ant’s stale review via 524fbba May 27, 2026 03:17

iainmcgin force-pushed the decompress-output-sizing branch from 0a78914 to 524fbba Compare May 27, 2026 03:17

rpb-ant approved these changes May 27, 2026

Merge branch 'main' into decompress-output-sizing

fa9c06d

iainmcgin merged commit 7c59759 into main May 27, 2026
12 checks passed

github-actions Bot locked and limited conversation to collaborators May 27, 2026

iainmcgin merged 5 commits into main from decompress-output-sizing
