Fix invisible flaw in DecryptingIndexInput.seek() and extract DecryptionBuffer by antogruz · Pull Request #131 · apache/solr-sandbox

antogruz · 2026-05-11T12:37:09Z

This PR contains two well-separated commits:

1. `Show and fix the invisible flaw of DecryptingIndexInput`

A one-line fix in seek() plus a regression test.

When slice() produces a fresh DecryptingIndexInput, the constructor leaves the AES-CTR encrypter at counter 0 and relies on the immediate seek(0) to call setPosition() with the correct counter and padding for the slice's actual offset. But when the cloned delegate's file pointer already matches the target position - which happens whenever the slice is created at offset 0 - seek() was taking a buffered-output shortcut and skipping setPosition() entirely. With an empty buffer (outSize == 0 on a fresh slice), the shortcut condition targetPosition >= currentPosition && targetPosition <= delegatePosition trivially holds, leaving the encrypter at counter=0 with padding=0. Any subsequent read returned corrupted plaintext.

The fix is to guard the shortcut with outSize > 0, so that a fresh slice always goes through setPosition() on its seek(0).

The bug is exercised by IndexInput#randomAccessSlice(long, long), which internally calls slice("randomaccess", 0, length). A new test testRandomAccessSlice reproduces it with prefixSize=17 so both the AES counter (17/16=1) and the padding (17%16=1) are non-zero, ensuring both dimensions of the bug are caught.
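To make the arithmetic and the guard concrete, here is a minimal sketch, not the actual DecryptingIndexInput code. The class name SeekGuardSketch and its method names are hypothetical; only the shortcut condition and the outSize > 0 guard come from the description above.

```java
// Hypothetical model of the seek() shortcut decision; not the real class.
class SeekGuardSketch {
    static final int AES_BLOCK_SIZE = 16;

    // AES-CTR block counter for an absolute offset in the encrypted file.
    static long counterFor(long absoluteOffset) {
        return absoluteOffset / AES_BLOCK_SIZE;
    }

    // Number of leading bytes to discard inside the first decrypted block.
    static int paddingFor(long absoluteOffset) {
        return (int) (absoluteOffset % AES_BLOCK_SIZE);
    }

    // Shortcut condition before the fix, as quoted in the description.
    static boolean shortcutBefore(long targetPosition, long currentPosition,
                                  long delegatePosition, int outSize) {
        return targetPosition >= currentPosition && targetPosition <= delegatePosition;
    }

    // Shortcut condition after the fix: an empty buffer never qualifies.
    static boolean shortcutAfter(long targetPosition, long currentPosition,
                                 long delegatePosition, int outSize) {
        return outSize > 0
            && targetPosition >= currentPosition
            && targetPosition <= delegatePosition;
    }
}
```

With prefixSize=17, counterFor(17) is 1 and paddingFor(17) is 1. On a fresh slice (all positions 0, outSize 0), shortcutBefore returns true and setPosition() would be skipped, while shortcutAfter returns false and forces the full repositioning.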

2. `Extract DecryptionBuffer into private class`

Pure refactoring. Moves all the AES-CTR buffering state (encrypter, in/out byte buffers, positions, padding) and the methods that operate on it into a new private static inner class DecryptionBuffer. DecryptingIndexInput keeps only the IndexInput-level concerns (file pointer, slice bounds, clone/close lifecycle) and delegates byte-level operations to the buffer.

No behavior change. The seven existing DecryptingIndexInputTest tests still pass, and the guard introduced by commit 1 is preserved, now expressed as buffer.hasData() instead of outSize > 0.

When slice() creates a new DecryptingIndexInput, the constructor leaves the AES-CTR counter at 0 (encrypter.init(0)) and relies on the subsequent seek(0) call to invoke setPosition() with the correct counter and padding for the slice's actual offset. However, when the cloned delegate's file pointer already equals the target position - which happens when the slice is created at offset 0 - seek() takes a buffered-output shortcut and skips setPosition() altogether.

The shortcut relies on outSize/outPos to express the buffered range, but they are still 0 on a fresh DecryptingIndexInput, so the check (targetPosition >= currentPosition && targetPosition <= delegatePosition) trivially succeeds with an empty buffer. The encrypter is then left initialised at counter=0 with padding=0, producing corrupted decrypted bytes for any read that follows.

The fix is to guard the shortcut with outSize > 0 so that a fresh slice always goes through setPosition() on its seek(0).

This bug is exercised by IndexInput#randomAccessSlice(long, long), which internally calls slice("randomaccess", 0, length): when invoked on a slice that itself starts at a non-zero offset (e.g. a sub-file of a compound file), the nested slice silently returned wrong data.

Adds a regression test (testRandomAccessSlice) with prefixSize=17 so both the AES counter (17/16=1) and the padding (17%16=1) are non-zero, ensuring both dimensions of the bug are caught. The test fails on the previous code with a ComparisonFailure and passes with the fix.
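The corruption described above can be reproduced outside Lucene with the JDK's own AES/CTR cipher. This is a minimal sketch under stated assumptions: the class CtrOffsetSketch is hypothetical, and the counter-block layout (an 8-byte nonce followed by a 64-bit big-endian block counter) is chosen for illustration. The PR does not show the layout used by the project's encrypter.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo: decrypting at an offset needs the right counter AND padding.
class CtrOffsetSketch {
    static final int BLOCK = 16;

    // Assumed layout: 8-byte nonce, then the block counter as a big-endian long.
    static byte[] counterBlock(byte[] nonce8, long counter) {
        byte[] iv = new byte[BLOCK];
        System.arraycopy(nonce8, 0, iv, 0, 8);
        for (int i = 0; i < 8; i++) {
            iv[15 - i] = (byte) (counter >>> (8 * i));
        }
        return iv;
    }

    // Encrypts or decrypts (the same operation in CTR mode) bytes that live
    // at the given absolute offset of the stream.
    static byte[] crypt(byte[] key, byte[] nonce8, long offset, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(counterBlock(nonce8, offset / BLOCK)));
            int padding = (int) (offset % BLOCK);
            // Prepend `padding` dummy bytes so the keystream lines up with the offset.
            byte[] padded = new byte[padding + data.length];
            System.arraycopy(data, 0, padded, padding, data.length);
            byte[] out = cipher.doFinal(padded);
            return Arrays.copyOfRange(out, padding, out.length);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Decrypting the bytes stored from offset 17 onwards with crypt(key, nonce, 17, bytes) recovers the plaintext. Calling crypt(key, nonce, 0, bytes) on the same bytes, which corresponds to the counter=0, padding=0 state left behind by the skipped setPosition(), applies the wrong part of the keystream and returns garbage.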

Move all the AES-CTR buffering state (encrypter, in/out byte buffers, their positions and sizes, and the leading padding) and the methods that operate on it (readToFillBuffer, decryptBuffer, the previous setPosition()) into a new private static inner class DecryptionBuffer.

After this refactoring, DecryptingIndexInput keeps only the IndexInput-level concerns (file pointer, slice bounds, clone/close lifecycle) and delegates every byte-level encryption operation to the buffer:

- getPosition() -> buffer.bufferedAhead()
- seek() -> buffer.hasData() / skipBytes() / seek()
- slice() -> buffer.cloneEncrypter() / capacity()
- readBytes() -> buffer.readDecrypted()
- clone() -> buffer.clone() / seek()

The behavior is unchanged - in particular the seek() guard introduced in the previous commit is preserved, now expressed as buffer.hasData() instead of outSize > 0. The seven existing DecryptingIndexInputTest tests still pass.

This makes the buffering state self-contained (easier to reason about counter and padding initialization) and shrinks DecryptingIndexInput's own surface, which should make future evolutions of either layer easier to review.
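The split of responsibilities can be pictured with a toy model. This is a sketch only: ToyDecryptionBuffer and ToyDecryptingInput are hypothetical stand-ins that track positions without doing any I/O or decryption. The method names hasData(), bufferedAhead(), skipBytes() and seek() are taken from the list above, but their bodies and the simulateRead() helper are assumptions made for illustration.

```java
// Toy stand-in for the buffer layer: owns counter, padding and buffer positions.
class ToyDecryptionBuffer {
    private static final int AES_BLOCK_SIZE = 16;
    long counter;  // AES-CTR block counter the cipher would be positioned at
    int padding;   // leading bytes to discard in the first decrypted block
    int outPos;    // read position inside the decrypted output
    int outSize;   // number of decrypted bytes currently buffered

    boolean hasData() { return outSize > 0; }

    int bufferedAhead() { return outSize - outPos; }

    void skipBytes(int n) { outPos += n; }

    // Repositions the cipher and drops any buffered output.
    void seek(long absoluteOffset) {
        counter = absoluteOffset / AES_BLOCK_SIZE;
        padding = (int) (absoluteOffset % AES_BLOCK_SIZE);
        outPos = 0;
        outSize = 0;
    }
}

// Toy stand-in for the input layer: only file-pointer and slice-offset logic.
class ToyDecryptingInput {
    final ToyDecryptionBuffer buffer = new ToyDecryptionBuffer();
    private final long sliceOffset;  // where this slice starts in the file
    private long delegatePosition;   // how far the underlying input has been read

    ToyDecryptingInput(long sliceOffset) { this.sliceOffset = sliceOffset; }

    long getPosition() { return delegatePosition - buffer.bufferedAhead(); }

    // Hypothetical helper: pretends n bytes were read and decrypted into the buffer.
    void simulateRead(int n) {
        delegatePosition += n;
        buffer.outPos = 0;
        buffer.outSize = n;
    }

    void seek(long target) {
        long current = getPosition();
        // Guarded shortcut: only valid when the buffer actually holds data.
        if (buffer.hasData() && target >= current && target <= delegatePosition) {
            buffer.skipBytes((int) (target - current));
            return;
        }
        delegatePosition = target;
        buffer.seek(sliceOffset + target);
    }
}
```

In this model a fresh ToyDecryptingInput created with sliceOffset 17 ends up with counter 1 and padding 1 after seek(0), and a later seek inside the buffered range only advances the buffer position without touching the counter.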

antogruz added 2 commits May 11, 2026 14:21

antogruz wants to merge 2 commits into apache:main from antogruz:port-decrypting-index-input-fix-and-refactor
