Fix invisible flaw in DecryptingIndexInput.seek() and extract DecryptionBuffer #131

Open
antogruz wants to merge 2 commits into apache:main from antogruz:port-decrypting-index-input-fix-and-refactor

@antogruz

This PR contains two well-separated commits:

1. Show and fix the invisible flaw of DecryptingIndexInput

A one-line fix in seek() plus a regression test.

When slice() produces a fresh DecryptingIndexInput, the constructor leaves the AES-CTR encrypter at counter 0 and relies on the immediate seek(0) to call setPosition() with the correct counter and padding for the slice's actual offset. However, when the cloned delegate's file pointer already matches the target position (which happens whenever the slice is created at offset 0), seek() took the buffered-output shortcut and skipped setPosition() entirely. On a fresh slice the buffer is empty (outSize == 0), so the shortcut condition targetPosition >= currentPosition && targetPosition <= delegatePosition holds trivially, and the encrypter stays at counter=0 with padding=0. Every subsequent read then returned corrupted plaintext.

The fix is to guard the shortcut with outSize > 0, so that a fresh slice always goes through setPosition() on its seek(0).
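
The shortcut and its guard can be illustrated with a toy model. This is only a sketch, not the code from the PR: the class SeekShortcutSketch, the guardShortcut flag, and the body of setPosition() are hypothetical, and the counter/padding formulas are assumed from the arithmetic quoted for the regression test.

```java
/** Toy model of the seek() shortcut. Hypothetical; not the real DecryptingIndexInput. */
final class SeekShortcutSketch {
  static final int AES_BLOCK_SIZE = 16;

  long delegatePosition; // file pointer of the (cloned) delegate
  int outSize;           // decrypted bytes held in the output buffer; 0 on a fresh slice
  int outPos;            // read position inside the output buffer
  long counter;          // AES-CTR block counter; the constructor leaves it at 0
  int padding;           // bytes to skip at the start of the first decrypted block

  SeekShortcutSketch(long delegatePosition) {
    this.delegatePosition = delegatePosition;
  }

  /** guardShortcut=false mimics the old behavior, true mimics the fixed one. */
  void seek(long targetPosition, boolean guardShortcut) {
    long currentPosition = delegatePosition - (outSize - outPos);
    boolean inBufferedRange =
        targetPosition >= currentPosition && targetPosition <= delegatePosition;
    if ((!guardShortcut || outSize > 0) && inBufferedRange) {
      // Shortcut: only move inside the already decrypted bytes.
      outPos += (int) (targetPosition - currentPosition);
    } else {
      delegatePosition = targetPosition;
      setPosition(targetPosition);
    }
  }

  /** Assumed formulas: block index and offset inside that block. */
  void setPosition(long position) {
    counter = position / AES_BLOCK_SIZE;
    padding = (int) (position % AES_BLOCK_SIZE);
    outSize = 0;
    outPos = 0;
  }

  public static void main(String[] args) {
    // The delegate already sits at absolute offset 17 when the fresh slice seeks there.
    SeekShortcutSketch unguarded = new SeekShortcutSketch(17);
    unguarded.seek(17, false);
    System.out.println("unguarded: counter=" + unguarded.counter + " padding=" + unguarded.padding);

    SeekShortcutSketch guarded = new SeekShortcutSketch(17);
    guarded.seek(17, true);
    System.out.println("guarded:   counter=" + guarded.counter + " padding=" + guarded.padding);
  }
}
```

In this model the unguarded call leaves counter and padding at 0, while the guarded call falls through to setPosition() and ends up with counter=1 and padding=1.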

The bug is exercised by IndexInput#randomAccessSlice(long, long), which internally calls slice("randomaccess", 0, length). A new test testRandomAccessSlice reproduces it with prefixSize=17 so both the AES counter (17/16=1) and the padding (17%16=1) are non-zero, ensuring both dimensions of the bug are caught.
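
To see why a prefix of 17 bytes exercises both values, the following self-contained sketch decrypts an AES-CTR stream starting at byte 17, once with counter=17/16 and padding=17%16 and once with both left at 0. It uses the standard javax.crypto API; the class CtrOffsetSketch, its helpers, and the IV layout (8 zero bytes followed by a big-endian block counter) are assumptions made for the illustration and may not match how the encrypter in this project builds its IV.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical demo of positioning an AES-CTR stream at an arbitrary byte offset. */
final class CtrOffsetSketch {
  static final int AES_BLOCK_SIZE = 16;

  // Assumed IV layout for this sketch: 8 zero bytes, then a big-endian block counter.
  static byte[] ivForCounter(long counter) {
    return ByteBuffer.allocate(AES_BLOCK_SIZE).putLong(0L).putLong(counter).array();
  }

  static Cipher ctrCipher(int mode, byte[] key, long counter) {
    try {
      Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
      cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(ivForCounter(counter)));
      return cipher;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Encrypts the whole plaintext starting at block counter 0. */
  static byte[] encrypt(byte[] key, byte[] plaintext) {
    try {
      return ctrCipher(Cipher.ENCRYPT_MODE, key, 0).doFinal(plaintext);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Decrypts ciphertext[offset..] with the given starting counter and leading padding. */
  static byte[] decryptFrom(byte[] key, byte[] ciphertext, int offset, long counter, int padding) {
    int length = ciphertext.length - offset;
    // Prepend `padding` dummy bytes so the keystream lines up inside the first block.
    byte[] input = new byte[padding + length];
    System.arraycopy(ciphertext, offset, input, padding, length);
    try {
      byte[] output = ctrCipher(Cipher.DECRYPT_MODE, key, counter).doFinal(input);
      return Arrays.copyOfRange(output, padding, output.length);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] key = new byte[16]; // all-zero demo key, never use in practice
    byte[] plaintext = new byte[64];
    for (int i = 0; i < plaintext.length; i++) {
      plaintext[i] = (byte) i;
    }
    byte[] ciphertext = encrypt(key, plaintext);
    int offset = 17;
    byte[] expected = Arrays.copyOfRange(plaintext, offset, plaintext.length);

    byte[] positioned = decryptFrom(key, ciphertext, offset,
        offset / AES_BLOCK_SIZE, offset % AES_BLOCK_SIZE);
    byte[] unpositioned = decryptFrom(key, ciphertext, offset, 0, 0);

    System.out.println("counter=1, padding=1 matches: " + Arrays.equals(positioned, expected));
    System.out.println("counter=0, padding=0 matches: " + Arrays.equals(unpositioned, expected));
  }
}
```

An offset that is a multiple of 16 would leave the padding at 0, and an offset below 16 would leave the counter at 0, so either choice would test only one of the two values.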

2. Extract DecryptionBuffer into private class

Pure refactoring. Moves all the AES-CTR buffering state (encrypter, in/out byte buffers, positions, padding) and the methods that operate on it into a new private static inner class DecryptionBuffer. DecryptingIndexInput keeps only the IndexInput-level concerns (file pointer, slice bounds, clone/close lifecycle) and delegates byte-level operations to the buffer.

No behavior change. The seven existing DecryptingIndexInputTest tests still pass, including the guard introduced by commit 1, which is now expressed as buffer.hasData() instead of outSize > 0.
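
The split of responsibilities might look roughly like the sketch below. It is a simplified, hypothetical model: only the method names hasData(), bufferedAhead(), and skipBytes() come from the commit message, while their bodies, the outer class BufferDelegationSketch, and the simulateRead()/simulateFill() helpers are assumptions. The nested class is package-private here only so the demo can be exercised; the PR describes it as private.

```java
/** Hypothetical outline of the delegation; not the code from the PR. */
final class BufferDelegationSketch {

  /** Stand-in for the extracted buffer: it owns the output-buffer bookkeeping. */
  static final class DecryptionBuffer {
    private int outSize; // decrypted bytes currently held
    private int outPos;  // next decrypted byte to hand out

    boolean hasData() {
      return outSize > 0;
    }

    int bufferedAhead() {
      return outSize - outPos;
    }

    void skipBytes(int count) {
      outPos += count;
    }

    /** Sketch-only hook: pretend that count bytes were just decrypted. */
    void simulateFill(int count) {
      outSize = count;
      outPos = 0;
    }
  }

  private final DecryptionBuffer buffer = new DecryptionBuffer();
  private long delegatePosition;

  /** Logical position: the delegate pointer minus what is still buffered. */
  long getPosition() {
    return delegatePosition - buffer.bufferedAhead();
  }

  /** Returns true if the target was reached by moving inside the buffer. */
  boolean seekWithinBuffer(long target) {
    long current = getPosition();
    if (buffer.hasData() && target >= current && target <= delegatePosition) {
      buffer.skipBytes((int) (target - current));
      return true;
    }
    return false; // the caller would seek the delegate and reposition the cipher
  }

  /** Sketch-only hook: pretend the delegate read and decrypted count bytes. */
  void simulateRead(int count) {
    delegatePosition += count;
    buffer.simulateFill(count);
  }

  public static void main(String[] args) {
    BufferDelegationSketch input = new BufferDelegationSketch();
    System.out.println("fresh, shortcut taken: " + input.seekWithinBuffer(0));
    input.simulateRead(32);
    System.out.println("filled, shortcut taken: " + input.seekWithinBuffer(10));
    System.out.println("position: " + input.getPosition());
  }
}
```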

antogruz added 2 commits May 11, 2026 14:21
When slice() creates a new DecryptingIndexInput, the constructor leaves
the AES-CTR counter at 0 (encrypter.init(0)) and relies on the
subsequent seek(0) call to invoke setPosition() with the correct
counter and padding for the slice's actual offset.

However, when the cloned delegate's file pointer already equals the
target position - which happens when the slice is created at offset 0 -
seek() takes a buffered-output shortcut and skips setPosition()
altogether. The shortcut relies on outSize/outPos to express the
buffered range, but they are still 0 on a fresh DecryptingIndexInput,
so the check (targetPosition >= currentPosition &&
targetPosition <= delegatePosition) trivially succeeds with an empty
buffer. The encrypter is then left initialised at counter=0 with
padding=0, producing corrupted decrypted bytes for any read that
follows.

The fix is to guard the shortcut with outSize > 0 so that a fresh
slice always goes through setPosition() on its seek(0).

This bug is exercised by IndexInput#randomAccessSlice(long, long),
which internally calls slice("randomaccess", 0, length): when invoked
on a slice that itself starts at a non-zero offset (e.g. a sub-file of
a compound file), the nested slice silently returned wrong data.

Adds a regression test (testRandomAccessSlice) with prefixSize=17 so
both the AES counter (17/16=1) and the padding (17%16=1) are non-zero,
ensuring both dimensions of the bug are caught. The test fails on the
previous code with a ComparisonFailure and passes with the fix.

Move all the AES-CTR buffering state (encrypter, in/out byte buffers,
their positions and sizes, and the leading padding) and the methods
that operate on it (readToFillBuffer, decryptBuffer, the previous
setPosition()) into a new private static inner class DecryptionBuffer.

After this refactoring, DecryptingIndexInput keeps only the
IndexInput-level concerns (file pointer, slice bounds, clone/close
lifecycle) and delegates every byte-level encryption operation to the
buffer:

  - getPosition()        -> buffer.bufferedAhead()
  - seek()               -> buffer.hasData() / skipBytes() / seek()
  - slice()              -> buffer.cloneEncrypter() / capacity()
  - readBytes()          -> buffer.readDecrypted()
  - clone()              -> buffer.clone() / seek()

The behavior is unchanged - in particular the seek() guard introduced
in the previous commit is preserved, now expressed as buffer.hasData()
instead of outSize > 0. The seven existing DecryptingIndexInputTest
tests still pass.

This makes the buffering state self-contained (easier to reason about
counter and padding initialization) and shrinks DecryptingIndexInput's
own surface, which should make future evolutions of either layer
easier to review.