harden untrusted-input parsing: recursion caps, bounds, sanitized errors#172
Merged
Conversation
convertDType recursed through Struct/List/FixedSizeList/Extension with no depth guard, unlike convertLayout. A crafted or self-referential FlatBuffer DType blob drove unbounded recursion into StackOverflowError — an Error, so it escaped the parseBlobs/open VortexException sanitization and leaked the reader's memory-mapped Arena. Add MAX_DTYPE_DEPTH (64) and thread depth, same shape as the existing layout-tree guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
convertArrayNode recursed through child nodes with no depth guard. A crafted or self-referential FlatBuffer array tree drove unbounded recursion into StackOverflowError — an Error, bypassing the VortexException contract. Add MAX_ARRAY_TREE_DEPTH (64) and thread depth, mirroring the layout/DType guards. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
VortexHttpReader.open called PostscriptParser.parseBlobs directly, skipping the validateSegmentSpecs containment check that the local-file path runs inside PostscriptParser.parse. A malformed remote footer could therefore carry segmentSpecs whose offset/length fall outside the file, which rawSegment() would turn into out-of-bounds Range requests. Validate against fileSize before constructing the reader. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
decode summed untrusted per-frame uncompressed_size into a long with no validation, then arena.allocate'd it; a crafted metadata could wrap the total to a small positive (under-allocation) or claim a huge size (allocation DoS). The (int) narrowing of uncompressed_size at the asSlice call site could also go negative/truncated, throwing a raw IndexOutOfBoundsException instead of a VortexException. Validate each frame via IoBounds.toIntSize and accumulate with Math.addExact, surfacing overflow as VortexException. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fetchTail parsed the server-controlled Content-Range header with bare Long.parseLong and substring calls: a missing '-'/'/' or non-numeric field threw StringIndexOutOfBoundsException / NumberFormatException instead of a VortexException. fetchBlob sliced the tail with an unchecked length, throwing a raw IndexOutOfBoundsException. Parse defensively and slice via IoBounds so malformed remote responses surface as VortexException. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
buildVarBin and buildScatteredVarBin read a 4-byte length prefix and advanced the cursor by it with no validation. A crafted decompressed payload with a negative or oversized length overran the segment, surfacing as a raw IndexOutOfBoundsException instead of a VortexException. Route every prefix read through a readVarBinLen helper that range-checks the prefix and its trailing bytes against the source segment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the malformed-input tests deferred in the hardening PR, plus two small cleanups in the touched main files. Tests: - Content-Range parsing: missing prefix/'-'/'/', non-numeric fields, reversed separators, empty fields (MalformedHttpResponseTest). - HTTP segmentSpec bounds: out-of-range remote footer surfaces as VortexException, mirroring the local-file path (HttpSegmentSpecBoundsSecurityTest). - DType-tree and array-node depth caps: 65536-deep nesting rejected as VortexException, not StackOverflowError (DTypeDepthBombSecurityTest, ArrayNodeDepthBombSecurityTest). - zstd decode guards: negative/oversized frame size (IoBounds.toIntSize) and oversized VarBin length prefix on both the contiguous and scattered paths (readVarBinLen) in ZstdEncodingEncoderTest. Nits: - FlatSegmentDecoder: import VortexException instead of inline FQN. - ZstdEncodingDecoder: note the VarBin second pass relies on the first pass's readVarBinLen bounds check; keep both passes in lockstep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hardens the reader's untrusted-input parse paths. All findings share one root cause: a malformed-input path reaching the JDK without the
IoBounds/VortexExceptiondiscipline the rest of the parser follows. One fix per commit.Fixes
fix(reader): cap DType-tree recursion depthconvertDTyperecursed through Struct/List/FixedSizeList/Extension with no depth guard (unlikeconvertLayout). A crafted/self-referential FlatBuffer dtype →StackOverflowError— anError, so it escapedVortexExceptionsanitization and leaked the memory-mapped Arena (open'scatch (Exception)never ranarena.close()).fix(reader): cap array-node recursion depthconvertArrayNodehad the same unguarded recursion over the encoded array tree → sameStackOverflowErrorescape.fix(reader): validate segment specs in the HTTP readerVortexHttpReader.opencalledparseBlobsdirectly, skippingvalidateSegmentSpecs. A malformed remote footer's segmentSpecs could fall outside the file, turning into out-of-bounds Range requests.fix(zstd): bound frame sizes and overflow-check totaluncompressed_sizesummed into alongwith no validation, thenarena.allocate'd — wrap-to-small (under-alloc) or huge-claim (alloc DoS); the(int)narrowing at theasSlicesite could go negative → raw IOOBE. NowIoBounds.toIntSize+Math.addExact.fix(reader): sanitize HTTP Content-Range and blob slicingfetchTailparsed the server-controlledContent-Rangewith bareLong.parseLong/substring→NumberFormatException/StringIndexOutOfBoundsException;fetchBlobsliced with unchecked length → raw IOOBE. Now parsed defensively +IoBounds.slice.fix(zstd): bounds-check VarBin length prefixesbuildVarBin/buildScatteredVarBinadvanced the cursor by an unchecked length prefix → raw IOOBE on a crafted payload. Now routed through a range-checkingreadVarBinLen.The two recursion caps mirror the existing
MAX_LAYOUT_DEPTHguard (64). FFM already gave spatial safety on the off-heap segments — these fixes restore the "malformed input →VortexException" contract (ADR 0003) and close the one genuine resource leak (#1/#2).Test
./mvnw test— BUILD SUCCESS, 0 failures. No behavior change on valid input; existing round-trip + interop suites green.Negative-path unit tests for the recursion/bounds guards are not included — they need hand-built malformed FlatBuffers the writer can't emit. Happy to add fixture-driven ITs if wanted.
🤖 Generated with Claude Code