Skip to content

harden untrusted-input parsing: recursion caps, bounds, sanitized errors#172

Merged
dfa1 merged 7 commits into
mainfrom
harden/untrusted-input-bounds
Jun 26, 2026
Merged

harden untrusted-input parsing: recursion caps, bounds, sanitized errors#172
dfa1 merged 7 commits into
mainfrom
harden/untrusted-input-bounds

Conversation

@dfa1

@dfa1 dfa1 commented Jun 26, 2026

Copy link
Copy Markdown
Owner

Hardens the reader's untrusted-input parse paths. All findings share one root cause: a malformed-input path reaching the JDK without the IoBounds/VortexException discipline the rest of the parser follows. One fix per commit.

Fixes

Sev Commit Issue
HIGH fix(reader): cap DType-tree recursion depth convertDType recursed through Struct/List/FixedSizeList/Extension with no depth guard (unlike convertLayout). A crafted/self-referential FlatBuffer dtype → StackOverflowError — an Error, so it escaped VortexException sanitization and leaked the memory-mapped Arena (open's catch (Exception) never ran arena.close()).
HIGH fix(reader): cap array-node recursion depth convertArrayNode had the same unguarded recursion over the encoded array tree → same StackOverflowError escape.
HIGH fix(reader): validate segment specs in the HTTP reader VortexHttpReader.open called parseBlobs directly, skipping validateSegmentSpecs. A malformed remote footer's segmentSpecs could fall outside the file, turning into out-of-bounds Range requests.
MED fix(zstd): bound frame sizes and overflow-check total Untrusted per-frame uncompressed_size summed into a long with no validation, then arena.allocate'd — wrap-to-small (under-alloc) or huge-claim (alloc DoS); the (int) narrowing at the asSlice site could go negative → raw IOOBE. Now IoBounds.toIntSize + Math.addExact.
MED fix(reader): sanitize HTTP Content-Range and blob slicing fetchTail parsed the server-controlled Content-Range with bare Long.parseLong/substringNumberFormatException/StringIndexOutOfBoundsException; fetchBlob sliced with unchecked length → raw IOOBE. Now parsed defensively + IoBounds.slice.
LOW fix(zstd): bounds-check VarBin length prefixes buildVarBin/buildScatteredVarBin advanced the cursor by an unchecked length prefix → raw IOOBE on a crafted payload. Now routed through a range-checking readVarBinLen.

The two recursion caps mirror the existing MAX_LAYOUT_DEPTH guard (64). FFM already gave spatial safety on the off-heap segments — these fixes restore the "malformed input → VortexException" contract (ADR 0003) and close the one genuine resource leak (#1/#2).

Test

./mvnw test — BUILD SUCCESS, 0 failures. No behavior change on valid input; existing round-trip + interop suites green.

Negative-path unit tests for the recursion/bounds guards are not included — they need hand-built malformed FlatBuffers the writer can't emit. Happy to add fixture-driven ITs if wanted.

🤖 Generated with Claude Code

dfa1 and others added 7 commits June 26, 2026 20:30
convertDType recursed through Struct/List/FixedSizeList/Extension with no
depth guard, unlike convertLayout. A crafted or self-referential FlatBuffer
DType blob drove unbounded recursion into StackOverflowError — an Error, so
it escaped the parseBlobs/open VortexException sanitization and leaked the
reader's memory-mapped Arena. Add MAX_DTYPE_DEPTH (64) and thread depth, same
shape as the existing layout-tree guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
convertArrayNode recursed through child nodes with no depth guard. A crafted
or self-referential FlatBuffer array tree drove unbounded recursion into
StackOverflowError — an Error, bypassing the VortexException contract. Add
MAX_ARRAY_TREE_DEPTH (64) and thread depth, mirroring the layout/DType guards.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
VortexHttpReader.open called PostscriptParser.parseBlobs directly, skipping
the validateSegmentSpecs containment check that the local-file path runs
inside PostscriptParser.parse. A malformed remote footer could therefore
carry segmentSpecs whose offset/length fall outside the file, which
rawSegment() would turn into out-of-bounds Range requests. Validate against
fileSize before constructing the reader.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
decode summed untrusted per-frame uncompressed_size into a long with no
validation, then arena.allocate'd it; a crafted metadata could wrap the total
to a small positive (under-allocation) or claim a huge size (allocation DoS).
The (int) narrowing of uncompressed_size at the asSlice call site could also
go negative/truncated, throwing a raw IndexOutOfBoundsException instead of a
VortexException. Validate each frame via IoBounds.toIntSize and accumulate
with Math.addExact, surfacing overflow as VortexException.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fetchTail parsed the server-controlled Content-Range header with bare
Long.parseLong and substring calls: a missing '-'/'/' or non-numeric field
threw StringIndexOutOfBoundsException / NumberFormatException instead of a
VortexException. fetchBlob sliced the tail with an unchecked length, throwing
a raw IndexOutOfBoundsException. Parse defensively and slice via IoBounds so
malformed remote responses surface as VortexException.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
buildVarBin and buildScatteredVarBin read a 4-byte length prefix and advanced
the cursor by it with no validation. A crafted decompressed payload with a
negative or oversized length overran the segment, surfacing as a raw
IndexOutOfBoundsException instead of a VortexException. Route every prefix
read through a readVarBinLen helper that range-checks the prefix and its
trailing bytes against the source segment.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the malformed-input tests deferred in the hardening PR, plus two small
cleanups in the touched main files.

Tests:
- Content-Range parsing: missing prefix/'-'/'/', non-numeric fields, reversed
  separators, empty fields (MalformedHttpResponseTest).
- HTTP segmentSpec bounds: out-of-range remote footer surfaces as
  VortexException, mirroring the local-file path (HttpSegmentSpecBoundsSecurityTest).
- DType-tree and array-node depth caps: 65536-deep nesting rejected as
  VortexException, not StackOverflowError (DTypeDepthBombSecurityTest,
  ArrayNodeDepthBombSecurityTest).
- zstd decode guards: negative/oversized frame size (IoBounds.toIntSize) and
  oversized VarBin length prefix on both the contiguous and scattered paths
  (readVarBinLen) in ZstdEncodingEncoderTest.

Nits:
- FlatSegmentDecoder: import VortexException instead of inline FQN.
- ZstdEncodingDecoder: note the VarBin second pass relies on the first pass's
  readVarBinLen bounds check; keep both passes in lockstep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dfa1 dfa1 merged commit 111cb6b into main Jun 26, 2026
6 checks passed
@dfa1 dfa1 deleted the harden/untrusted-input-bounds branch June 26, 2026 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant