
feat: raise MaxFrameSize 256 MiB → 1 GiB + env override #22

Open
TeoSlayer wants to merge 2 commits into main from feat/raise-frame-cap-1gib

Conversation

@TeoSlayer

Contributor

What

  • Default frame cap 256 MiB → 1 GiB. `MaxFrameSize` const → `var`, with `DefaultMaxFrameSize uint32 = 1 << 30`.
  • New env override `PILOT_DATAEXCHANGE_MAX_FRAME` (bytes), evaluated at package init, range-checked to `[64 KiB, 2 GiB)`. Garbage or unsafe values silently fall back to the default rather than crashing the daemon.
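The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `parseFrameCap`, `minFrameCap`, `maxFrameCap`, and `envMaxFrame` are hypothetical names, and the sketch assumes the value is a plain decimal byte count with no unit suffix and no surrounding whitespace.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	// DefaultMaxFrameSize is the 1 GiB default named in the PR description.
	DefaultMaxFrameSize uint32 = 1 << 30

	// Bounds from the PR description: [64 KiB, 2 GiB). Names are hypothetical.
	minFrameCap uint64 = 64 << 10 // inclusive lower bound
	maxFrameCap uint64 = 2 << 30  // exclusive upper bound

	envMaxFrame = "PILOT_DATAEXCHANGE_MAX_FRAME"
)

// parseFrameCap is a hypothetical helper. It returns the override when raw is
// a decimal byte count inside [minFrameCap, maxFrameCap); anything else
// (empty, non-numeric, negative, out of range) yields the default silently.
func parseFrameCap(raw string) uint32 {
	if raw == "" {
		return DefaultMaxFrameSize
	}
	n, err := strconv.ParseUint(raw, 10, 64)
	if err != nil || n < minFrameCap || n >= maxFrameCap {
		return DefaultMaxFrameSize
	}
	return uint32(n) // safe: n < 2 GiB fits in uint32
}

// MaxFrameSize is a var rather than a const so it can be set once, at
// package init, from the environment.
var MaxFrameSize = parseFrameCap(os.Getenv(envMaxFrame))

func main() {
	fmt.Println(parseFrameCap("134217728")) // 128 MiB, in range
	fmt.Println(parseFrameCap("banana"))    // not a number: default
	fmt.Println(parseFrameCap("4096"))      // below 64 KiB: default
	fmt.Println(MaxFrameSize)               // depends on the environment
}
```

Parsing into a `uint64` first and range-checking before narrowing means an oversized value can never wrap around into a small but valid-looking `uint32`.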

Why

The wire format's length field has always been `uint32` — the absolute ceiling is ~4 GiB. The 256 MiB cap was a memory-safety knob sized for "the test fleet's 100 MiB file payloads" (per the old comment). Operators on multi-GiB-RAM hosts are hitting the cap on real workloads (the "send-file fails on 50 MB / 10 MB files" report in `web4/docs/BUG-updater-version-skew.md` is a separate reliability issue, but raising the cap removes one ceiling on the same code path).

The chunked streaming protocol in `web4/docs/PROPOSAL-reliable-file-transfer.md` will eventually remove the need for a per-frame cap entirely — this is the smallest change that unblocks bigger transfers today without changing wire format.

Compatibility

Wire format unchanged. Both ends must agree on the cap:

  • New sender → old receiver: receiver still caps at 256 MiB, returns `frame too large`, drops the connection (current behavior).
  • Old sender → new receiver: receiver accepts up to 1 GiB; sender was self-capped at 256 MiB anyway.
  • Rolling out the higher cap requires restarting daemons on both sides; env-only operators don't need a redeploy.
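The asymmetry between the two rollout directions can be illustrated with a small length-prefixed framing sketch. It assumes a bare 4-byte big-endian length header followed by the payload; the real frame header may carry more fields (a frame type, for instance), and `writeFrame`, `readFrame`, and `exchange` here are hypothetical stand-ins rather than the project's functions. Tiny caps are used so the example runs instantly.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// ErrFrameTooLarge mirrors the "frame too large" error quoted above.
var ErrFrameTooLarge = errors.New("frame too large")

// writeFrame refuses payloads above the sender's own cap, then writes a
// 4-byte big-endian length followed by the payload (assumed layout).
func writeFrame(w io.Writer, payload []byte, limit uint32) error {
	if uint64(len(payload)) > uint64(limit) {
		return ErrFrameTooLarge
	}
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame checks the declared length against the receiver's cap before
// allocating, which is what makes the cap a memory-safety knob.
func readFrame(r io.Reader, limit uint32) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > limit {
		return nil, ErrFrameTooLarge
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// exchange sends one frame of payloadLen bytes from a sender with senderCap
// to a receiver with receiverCap over an in-memory buffer.
func exchange(payloadLen int, senderCap, receiverCap uint32) error {
	var wire bytes.Buffer
	if err := writeFrame(&wire, make([]byte, payloadLen), senderCap); err != nil {
		return err
	}
	_, err := readFrame(&wire, receiverCap)
	return err
}

func main() {
	// "New" sender (cap 64) to "old" receiver (cap 16): receiver rejects.
	fmt.Println(exchange(32, 64, 16))
	// "Old" sender (cap 16) to "new" receiver (cap 64): accepted.
	fmt.Println(exchange(8, 16, 64))
}
```

In this sketch the receiver's check runs before `make`, so a hostile or misconfigured peer cannot force a large allocation just by declaring a large length.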

Tests

  • `TestReadFrame_SizeCap` updated for the new default cap (was hard-coded 256 MiB).
  • New `TestMaxFrameSize_DefaultAndConstants` pins the default and verifies the env override is unset in CI (so a misconfigured runner doesn't silently warp the rest of the suite).
  • Full `go test ./...` green.

🤖 Generated with Claude Code

Bumps the per-frame cap from 256 MiB → 1 GiB (default) so single-frame
send-file works for the larger artefacts operators are actually shipping
on multi-GiB-RAM hosts. Adds PILOT_DATAEXCHANGE_MAX_FRAME env override
(read at package init, range-checked to [64 KiB, 2 GiB)) so the limit
can be raised further without recompiling.

Wire format is unchanged — the length field has always been uint32, so
the absolute ceiling has always been ~4 GiB; this cap is a memory-safety
knob, not a wire-format limit. Both ends must agree: a sender that
exceeds the receiver's cap still gets "frame too large" and dropped.

The chunked streaming protocol in docs/PROPOSAL-reliable-file-transfer.md
(web4) will eventually remove the need for a per-frame cap entirely.

Tests:
- TestReadFrame_SizeCap updated for the new default (was 256 MiB).
- New TestMaxFrameSize_DefaultAndConstants pins the default and verifies
  the env override is unset in CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 14, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


Adds TypeFileStream (frame type 7) — a chunked transfer protocol that
replaces the single giant TypeFile frame that stalled on any non-trivial
path (>64 KiB stalls over relay, and over a direct link that flips to
relay under one-way load).

- filestream.go: INIT/CHUNK/ACK/DONE/COMPLETE control protocol; sender
  state machine (SendFileStream) with a sliding window, end-to-end
  SHA-256, and resume keyed on a content-derived transfer_id; receiver
  (StreamReceiver) writing .partial contiguously, verifying the full
  hash, and atomically renaming into place. Chunk size 48 KiB —
  deliberately below the ~64 KiB tunnel-stall threshold, so each chunk
  rides the known-good path and the per-chunk ACK keeps the reverse
  direction busy (which also stops the blackhole heuristic flipping the
  link mid-transfer).
- service.go: dispatch TypeFileStream in handleConn via a per-connection
  StreamReceiver; final filename + file.received event match the legacy
  saveReceivedFile path.
- zz_filestream_test.go: loopback tests — happy path, resume-from-offset,
  corrupt-resume rejection, codec round-trip.
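The receiver-side flow in the list above (contiguous writes to a `.partial` file, full-hash verification, atomic rename) might look roughly like the sketch below. It is a simplified illustration under stated assumptions: the sliding window, ACKs, resume, and the INIT/DONE/COMPLETE messages are left out; `splitChunks`, `receive`, and `demo` are hypothetical names, not the functions in `filestream.go`; and discarding the `.partial` file on a hash mismatch is an assumption of this sketch.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// chunkSize is the 48 KiB chunk size stated in the commit message.
const chunkSize = 48 << 10

var errHashMismatch = errors.New("sha256 mismatch")

// splitChunks slices data into consecutive chunks of at most chunkSize bytes.
func splitChunks(data []byte) [][]byte {
	var chunks [][]byte
	for off := 0; off < len(data); off += chunkSize {
		end := off + chunkSize
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[off:end])
	}
	return chunks
}

// receive writes chunks in order to dst+".partial", hashes them as they
// arrive, and renames the file into place only if the digest matches want.
func receive(dst string, chunks [][]byte, want [sha256.Size]byte) error {
	partial := dst + ".partial"
	f, err := os.Create(partial)
	if err != nil {
		return err
	}
	h := sha256.New()
	for _, c := range chunks {
		if _, err := f.Write(c); err != nil {
			f.Close()
			return err
		}
		h.Write(c) // hash.Hash writes never return an error
	}
	if err := f.Close(); err != nil {
		return err
	}
	if !bytes.Equal(h.Sum(nil), want[:]) {
		os.Remove(partial) // assumption: drop the bad partial file
		return errHashMismatch
	}
	return os.Rename(partial, dst)
}

// demo transfers size bytes through splitChunks and receive inside a
// temporary directory, optionally flipping one byte in the first chunk.
func demo(size int, corrupt bool) error {
	dir, err := os.MkdirTemp("", "filestream-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	data := bytes.Repeat([]byte{0xAB}, size)
	want := sha256.Sum256(data)
	chunks := splitChunks(data)
	if corrupt && len(chunks) > 0 {
		bad := append([]byte(nil), chunks[0]...) // copy so data stays intact
		bad[0] ^= 0xFF
		chunks[0] = bad
	}
	dst := filepath.Join(dir, "out.bin")
	if err := receive(dst, chunks, want); err != nil {
		return err
	}
	got, err := os.ReadFile(dst)
	if err != nil {
		return err
	}
	if !bytes.Equal(got, data) {
		return errors.New("content mismatch")
	}
	return nil
}

func main() {
	fmt.Println(len(splitChunks(make([]byte, 100<<10)))) // 48 + 48 + 4 KiB
	fmt.Println(demo(100<<10, false))
	fmt.Println(demo(100<<10, true))
}
```

Renaming only after the hash check means a reader of the destination path never observes a half-written or corrupt file, provided `.partial` and the final name sit on the same filesystem.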

Verified on the Mac↔GCP-VM dual-NAT rig: 50 MB completes with matching
sha256 over a hole-punched direct path; resume from a mid-transfer kill
continues from the exact byte offset.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>