@jonded94
Contributor

@jonded94 jonded94 commented Feb 7, 2026

Which issue does this PR close?

Rationale for this change

The bug occurs when using RowSelection with nested types (like List) when:

  1. A column has multiple pages in a row group
  2. The selected rows span across page boundaries
  3. The first page is entirely consumed during skip operations

The issue was in arrow-rs/parquet/src/column/reader.rs:287-382 (skip_records function).

Root cause: When skip_records completed successfully after crossing page boundaries, the has_partial state in the RepetitionLevelDecoder could incorrectly remain true.

This happened when:

  • The skip operation exhausted a page where has_record_delimiter was false
  • The skip found the remaining records on the next page by counting a delimiter at index 0
  • When a subsequent read_records(1) was called, the stale has_partial=true state caused count_records to incorrectly interpret the first repetition level (0) at index 0 as ending a "phantom" partial record, returning (1 record, 0 levels, 0 values) instead of properly reading the actual record data.
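
The effect of a stale has_partial flag can be reproduced in isolation. The following is a minimal sketch, not the arrow-rs implementation: `RepLevelCounter` and `phantom_record_demo` are hypothetical names made up for this illustration, and the counting rule is only assumed to approximate what count_records does with repetition levels.

```rust
/// Toy stand-in for the repetition-level record counter (hypothetical type,
/// not the arrow-rs `RepetitionLevelDecoder`).
pub struct RepLevelCounter {
    /// True when a record was started but its closing delimiter was not seen yet.
    pub has_partial: bool,
}

impl RepLevelCounter {
    /// Counts up to `max_records` record boundaries in `levels`.
    /// Returns `(records_counted, levels_consumed)`.
    pub fn count_records(&mut self, levels: &[i16], max_records: usize) -> (usize, usize) {
        if max_records == 0 {
            return (0, 0);
        }
        let mut records = 0;
        for (idx, &level) in levels.iter().enumerate() {
            // A rep level of 0 starts a new record and so ends the previous one.
            // At index 0 it only ends a record if one is still pending.
            if level == 0 && (idx != 0 || self.has_partial) {
                records += 1;
                if records == max_records {
                    self.has_partial = false;
                    return (records, idx);
                }
            }
        }
        // Ran out of levels: the last record may continue on the next page.
        if !levels.is_empty() {
            self.has_partial = true;
        }
        (records, levels.len())
    }

    /// Clears the pending-record flag and reports whether it was set.
    pub fn flush_partial(&mut self) -> bool {
        std::mem::take(&mut self.has_partial)
    }
}

/// Counts one record from the same page, first with clean and then with stale state.
pub fn phantom_record_demo() -> ((usize, usize), (usize, usize)) {
    // Two records of two values each, e.g. [50, 60] and [70, 80].
    let levels = [0i16, 1, 0, 1];
    let mut clean = RepLevelCounter { has_partial: false };
    let mut stale = RepLevelCounter { has_partial: true };
    (clean.count_records(&levels, 1), stale.count_records(&levels, 1))
}
```

In this sketch the clean counter consumes two levels for its one record, while the counter with the stale flag reports one record after consuming zero levels, which corresponds to the (1 record, 0 levels, 0 values) symptom described above.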

For a more descriptive explanation, look here: #9370 (comment)

What changes are included in this PR?

Added code at the end of skip_records to reset the partial record state when all requested records have been successfully skipped.

This ensures that after skip_records completes, we're at a clean record boundary with no lingering partial record state, fixing the array length mismatch in StructArrayReader.

Are these changes tested?

Commit 365bd9a introduces a unit test that reproduces this issue using v2 data pages only. PR #9399 could be used to demonstrate the issue end to end.

My earlier, incorrect assumption was that this had to do with mixing v1 and v2 data pages:

In b52e043 I added a test that I validated to fail whenever I remove my fix.

  Bug Mechanism

  The bug requires three ingredients:

  1. Page 1 (DataPage v1): Contains a nested column (with rep levels). During skip_records, all levels on this page are consumed. count_records sees no following rep=0 delimiter, so it sets has_partial=true. Since has_record_delimiter is false (the default InMemoryPageReader returns false when more pages exist), flush_partial is not called.
  2. Page 2 (DataPage v2): Has num_rows available in its metadata. When num_rows <= remaining_records, the entire page is skipped via skip_next_page(); this does not touch the rep level decoder at all, so has_partial remains stale true from page 1.
  3. Page 3 (DataPage v1): When read_records loads this page, the stale has_partial=true causes the rep=0 at position 0 to be misinterpreted as completing a "phantom" partial record. This produces (1 record, 0 levels, 0 values) instead of reading the actual record data.

  Test Verification

  - With fix (flush_partial at end of skip_records): read_records(1) correctly returns (1, 2, 2) with values [70, 80]
  - Without fix: read_records(1) returns (1, 0, 0), a phantom record with no data, which is what causes the "Not all children array length are the same!" error when different sibling columns in a struct produce different record counts

@etseidl
Contributor

etseidl commented Feb 9, 2026

Thanks for tracking this down @jonded94! Rows spanning page boundaries have been the bane of my existence 😬. I want to give this a good look this afternoon.

@etseidl
Contributor

etseidl commented Feb 9, 2026

I've looked at the code and the test, and I'm not quite sure this is the correct fix. Given the data ([10, 20], [30, 40], [50, 60], [70, 80]), I would expect skip_records(2) to skip over [10, 20], [30, 40] and then return [50, 60] for the next row (which is the behavior if all three pages use v1 headers). I think the issue you've identified is that when mixing page header types, we can't simply trust num_rows just because we've found it on a single page header. I think we instead need to detect mixed v1/v2 page headers and not use the num_rows shortcut in a mixed environment.

Maybe something like

    pub fn skip_records(&mut self, num_records: usize) -> Result<usize> {
        let mut remaining_records = num_records;
        let mut all_v2_pages = true; // << add this boolean
        while remaining_records != 0 {
            if self.num_buffered_values == self.num_decoded_values {
                ...

                if let Some(rows) = rows {
                    // this short circuits decoding levels, but can be incorrect if V1 pages are
                    // also present
                    if rows <= remaining_records && all_v2_pages { // << add to this test
                        self.page_reader.skip_next_page()?;
                        remaining_records -= rows;
                        continue;
                    }
                } else {
                    all_v2_pages = false; // << set to false if we encounter V1 page
                }

@jonded94
Contributor Author

Hey @etseidl 👋

Thanks for your review! I'm not super well versed in data page decoding, but I agree that after skipping 2 records, [50, 60] should probably be the next decoded record. I'm unfortunately super busy with internal bugfixes right now, so I just plugged that back into the AI slop machine, and it came out with a fix (d738d35), slightly different from your suggestion, that does correctly return [50, 60]:

  The corrected fix (addressing reviewer @etseidl's feedback):

  - Before: flush_partial() was called at the end of skip_records after all records were skipped; this only cleared the stale has_partial state but didn't fix the over-counting (3 records skipped instead of 2).
  - After: flush_partial() is called before the num_rows whole-page-skip shortcut. This properly counts the pending partial record from the previous page, adjusts remaining_records (2→1), and prevents the v2 page from being entirely skipped (since rows=2 > remaining=1). The skip then falls through to level-based decoding, correctly skipping only 1 record from page 2.

  The corrected test asserts that after skip_records(2) over 4 records [10,20], [30,40], [50,60], [70,80], read_records(1) returns [50, 60] (the 3rd record), not [70, 80] (the 4th).
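
The before/after difference can be sketched with a toy model of the skip loop. Everything here is an assumption made for illustration: `Page`, `SkipOutcome`, `example_pages`, and the `flush_before_shortcut` flag are hypothetical and do not exist in arrow-rs, the page layout is assumed to mirror the test described above, and the end-of-column-chunk flush is left out.

```rust
/// Toy page: repetition levels plus an optional row count.
/// `num_rows` is `Some` for v2-style pages and `None` for v1-style pages.
pub struct Page {
    pub rep_levels: Vec<i16>,
    pub num_rows: Option<usize>,
}

/// Where reading resumes after the skip, and whether partial state leaked.
#[derive(Debug, PartialEq)]
pub struct SkipOutcome {
    pub resume_page: usize,
    pub resume_offset: usize,
    pub stale_partial: bool,
}

/// Simplified skip loop; `flush_before_shortcut` toggles the corrected behaviour.
pub fn skip_records(pages: &[Page], num_records: usize, flush_before_shortcut: bool) -> SkipOutcome {
    let mut remaining = num_records;
    let mut has_partial = false;
    let (mut page, mut offset) = (0, 0);

    while remaining != 0 && page < pages.len() {
        let current = &pages[page];
        if let Some(rows) = current.num_rows {
            // Corrected behaviour: a page with a row count starts on a record
            // boundary, so a record pending from the previous page is complete.
            if flush_before_shortcut && std::mem::take(&mut has_partial) {
                remaining -= 1;
                continue;
            }
            // Shortcut: skip the whole page without decoding its levels.
            if rows <= remaining {
                remaining -= rows;
                page += 1;
                continue;
            }
        }

        // Level-based skipping: a level of 0 ends the previous record, except
        // at index 0 when no record is pending.
        let mut found = 0;
        let mut stop = current.rep_levels.len();
        for (idx, &level) in current.rep_levels.iter().enumerate() {
            if level == 0 && (idx != 0 || has_partial) {
                found += 1;
                if found == remaining {
                    stop = idx;
                    break;
                }
            }
        }
        remaining -= found;
        if stop == current.rep_levels.len() {
            // Page exhausted; its last record may continue on the next page.
            has_partial = true;
            page += 1;
        } else {
            has_partial = false;
            offset = stop;
        }
    }
    SkipOutcome { resume_page: page, resume_offset: offset, stale_partial: has_partial }
}

/// Assumed layout: v1 page, v2 page with two rows, v1 page.
pub fn example_pages() -> Vec<Page> {
    vec![
        Page { rep_levels: vec![0, 1], num_rows: None },          // [10, 20]
        Page { rep_levels: vec![0, 1, 0, 1], num_rows: Some(2) }, // [30, 40], [50, 60]
        Page { rep_levels: vec![0, 1], num_rows: None },          // [70, 80]
    ]
}
```

Calling `skip_records(&example_pages(), 2, false)` in this model resumes at the start of the third page (the [70, 80] record) with the partial flag still set, whereas passing `true` resumes at level offset 2 of the second page (the [50, 60] record) with the flag cleared.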

I'm really sorry I can't invest more time into that right now, but I hope what I committed is now finally correct!

@jonded94 jonded94 changed the title from "Reset partial record state after skipping all requested records" to "Fix skip_records over-counting when partial record precedes num_rows page skip" on Feb 10, 2026
Contributor

@etseidl etseidl left a comment

Thanks @jonded94, I think the AI might have got it right this time 😄.

I wonder if @tustvold has a few moments to spare to take a look at this?

// offset index), its records are self-contained, so the
// partial from the previous page is complete at this boundary.
if let Some(decoder) = self.rep_level_decoder.as_mut() {
    if decoder.flush_partial() {
Contributor

I think this is correct. If all pages are V2, then has_partial will never be true, because V2 pages must start on a new record (R=0). If all pages are V1 this will never trigger because num_rows will be None. This case only applies when switching from V1 to V2, in which case it's appropriate to call flush_partial because, as said above, V2 pages must start at a new record.

Contributor

So that means that this bug requires a parquet file with a column chunk that has both V1 and V2 pages?

Contributor

@alamb alamb Feb 11, 2026

I made such a file (that triggers the error with a RowSelection): 🤯 #9395

Co-authored-by: Ed Seidl <etseidl@users.noreply.github.com>
@alamb
Contributor

alamb commented Feb 11, 2026

run benchmark arrow_reader

@alamb
Contributor

alamb commented Feb 11, 2026

I would just like to make sure this isn't causing some major regression in performance (I don't think it is) and then it looks good to me.

It would be nice to add an "end to end" reproducer with an actual parquet file -- I made one for your consideration in #9395

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing reset-partial-record-state-after-skipping-all-records (0fac73f) to 7dbe58a diff
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=reset-partial-record-state-after-skipping-all-records
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

group                                                                                                      main                                   reset-partial-record-state-after-skipping-all-records
-----                                                                                                      ----                                   -----------------------------------------------------
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.04   1270.2±4.40µs        ? ?/sec    1.00   1224.1±6.50µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.00   1267.6±2.61µs        ? ?/sec    1.00   1264.1±4.56µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.05  1290.7±65.08µs        ? ?/sec    1.00   1231.3±3.25µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.00    472.7±4.35µs        ? ?/sec    1.07    506.5±2.95µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00   637.4±14.68µs        ? ?/sec    1.03    655.4±1.59µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.00    477.3±3.78µs        ? ?/sec    1.04    497.6±3.65µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.02    557.4±4.32µs        ? ?/sec    1.00    544.5±6.09µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.00    713.4±4.62µs        ? ?/sec    1.00   711.2±15.32µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.01    564.6±3.37µs        ? ?/sec    1.00    560.6±8.97µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    154.2±1.44µs        ? ?/sec    1.00    154.1±1.59µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.00    176.5±2.24µs        ? ?/sec    1.17    207.3±0.91µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.01    157.4±1.31µs        ? ?/sec    1.00    155.9±1.70µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.00    211.4±3.48µs        ? ?/sec    1.06   223.9±11.61µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.01    196.1±1.87µs        ? ?/sec    1.00    193.3±0.48µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.00    220.9±3.37µs        ? ?/sec    1.11    245.8±4.35µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.00    219.9±7.35µs        ? ?/sec    1.04    229.5±6.50µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.07   1122.6±3.20µs        ? ?/sec    1.00   1046.3±5.51µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.00    926.6±3.11µs        ? ?/sec    1.02    940.6±3.60µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.07   1128.1±4.89µs        ? ?/sec    1.00   1051.2±3.98µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.09    450.9±2.86µs        ? ?/sec    1.00    414.6±3.93µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.00    584.2±2.84µs        ? ?/sec    1.07    625.4±2.43µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.08    452.1±3.03µs        ? ?/sec    1.00    419.3±2.99µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    160.8±3.05µs        ? ?/sec    1.21    194.7±1.71µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    301.6±1.26µs        ? ?/sec    1.12    338.0±1.03µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    164.4±1.23µs        ? ?/sec    1.21    198.5±0.58µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     76.7±0.50µs        ? ?/sec    1.53    117.5±0.94µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    258.7±2.19µs        ? ?/sec    1.16    299.2±0.59µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     81.2±0.37µs        ? ?/sec    1.50    121.6±0.83µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.07    734.7±1.85µs        ? ?/sec    1.00    689.7±1.95µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.00    536.6±1.17µs        ? ?/sec    1.08    578.8±8.01µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.06    739.4±6.82µs        ? ?/sec    1.00    697.1±7.55µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.00     61.7±4.90µs        ? ?/sec    1.08     66.6±7.53µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    200.4±1.09µs        ? ?/sec    1.35    270.4±1.37µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.00     68.4±5.23µs        ? ?/sec    1.07     73.4±6.12µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.08     93.9±0.63µs        ? ?/sec    1.00     86.6±1.94µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.01    231.1±1.22µs        ? ?/sec    1.00    228.9±0.91µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.08     98.0±0.77µs        ? ?/sec    1.00     90.7±0.28µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.00      9.3±0.37µs        ? ?/sec    1.00      9.3±0.09µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.00    188.7±0.55µs        ? ?/sec    1.03   194.0±20.15µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.00     13.0±0.41µs        ? ?/sec    1.06     13.8±0.28µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.08    183.8±0.73µs        ? ?/sec    1.00    170.4±0.64µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.00    366.4±1.17µs        ? ?/sec    1.02   375.1±10.71µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.08    188.9±5.52µs        ? ?/sec    1.00    174.4±0.50µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.00     14.0±0.30µs        ? ?/sec    1.04     14.6±0.44µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.00    281.6±4.06µs        ? ?/sec    1.05    296.8±4.37µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     19.0±0.41µs        ? ?/sec    1.00     19.0±0.43µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.07    367.5±8.81µs        ? ?/sec    1.00    342.8±1.37µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.00    340.8±6.00µs        ? ?/sec    1.17    400.1±5.67µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.06    371.5±4.42µs        ? ?/sec    1.00    349.3±6.73µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.00     25.3±0.27µs        ? ?/sec    1.07     27.0±0.51µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.00    172.5±1.96µs        ? ?/sec    1.39    240.6±2.11µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.00     29.8±0.36µs        ? ?/sec    1.10     32.7±0.49µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    108.4±0.15µs        ? ?/sec    1.02    110.8±0.48µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    129.9±0.19µs        ? ?/sec    1.01    131.1±0.56µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    110.9±0.44µs        ? ?/sec    1.02    113.2±0.45µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    158.5±1.88µs        ? ?/sec    1.02    161.9±3.35µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.00    222.3±1.28µs        ? ?/sec    1.01    224.2±1.14µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.00    163.4±4.38µs        ? ?/sec    1.01    165.6±1.47µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00     75.6±2.00µs        ? ?/sec    1.01     76.0±0.27µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.01    180.3±4.70µs        ? ?/sec    1.00    179.3±0.64µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00     78.4±1.30µs        ? ?/sec    1.04     81.4±0.20µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    133.7±0.59µs        ? ?/sec    1.06    142.2±0.35µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    211.2±1.79µs        ? ?/sec    1.03    216.9±0.71µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    137.3±0.30µs        ? ?/sec    1.08    148.5±4.23µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.01     72.3±1.73µs        ? ?/sec    1.00     71.3±1.05µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.01    178.1±5.43µs        ? ?/sec    1.00    175.8±2.35µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.02     76.8±2.01µs        ? ?/sec    1.00     75.2±0.49µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    107.7±0.30µs        ? ?/sec    1.02    109.8±0.52µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    118.1±0.24µs        ? ?/sec    1.01    119.4±0.60µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    108.4±0.79µs        ? ?/sec    1.02    110.6±0.39µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    160.2±1.31µs        ? ?/sec    1.01    161.6±0.39µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.01    209.8±2.86µs        ? ?/sec    1.00    208.5±0.58µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    164.4±3.35µs        ? ?/sec    1.00    165.2±0.30µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00    200.1±2.77µs        ? ?/sec    1.01    202.1±0.72µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    226.9±0.94µs        ? ?/sec    1.00    226.5±0.94µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00    204.7±0.81µs        ? ?/sec    1.01    207.3±0.96µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.05    149.4±0.37µs        ? ?/sec    1.00    142.1±0.50µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.02    197.7±5.75µs        ? ?/sec    1.00    194.0±2.05µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.05    152.9±0.28µs        ? ?/sec    1.00    146.1±0.40µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.06    104.8±0.78µs        ? ?/sec    1.00     98.6±1.01µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.02    175.3±1.15µs        ? ?/sec    1.00    172.4±0.52µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.04    111.6±1.64µs        ? ?/sec    1.00    107.5±0.55µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.00     76.6±0.52µs        ? ?/sec    1.03     79.0±0.55µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.00    103.2±1.14µs        ? ?/sec    1.00    103.4±0.34µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.00     78.3±0.78µs        ? ?/sec    1.03     80.6±1.62µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.00    106.1±0.60µs        ? ?/sec    1.02    108.5±0.68µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.00    174.2±1.47µs        ? ?/sec    1.01    176.2±2.18µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.00    109.5±1.19µs        ? ?/sec    1.02    111.3±0.34µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.01     43.0±0.24µs        ? ?/sec    1.00     42.8±0.07µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.02    142.8±0.33µs        ? ?/sec    1.00    139.6±0.39µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     45.7±0.13µs        ? ?/sec    1.00     45.6±0.06µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.00    101.1±1.33µs        ? ?/sec    1.08    109.4±0.27µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.00    175.9±4.06µs        ? ?/sec    1.01    177.6±0.55µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    104.2±1.05µs        ? ?/sec    1.08    112.7±0.29µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.00     36.7±0.53µs        ? ?/sec    1.01     37.0±0.29µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.01    139.6±0.89µs        ? ?/sec    1.00    138.6±1.35µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.00     40.1±0.21µs        ? ?/sec    1.01     40.3±0.43µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.00     82.0±0.19µs        ? ?/sec    1.03     84.1±0.35µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.00    101.3±0.32µs        ? ?/sec    1.01    102.4±0.77µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.00     83.9±0.37µs        ? ?/sec    1.03     86.1±1.46µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.00    106.8±1.90µs        ? ?/sec    1.02    108.8±0.76µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.00    166.0±0.55µs        ? ?/sec    1.01    167.4±1.20µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    109.2±0.22µs        ? ?/sec    1.02    111.7±1.29µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.06     23.0±0.22µs        ? ?/sec    1.00     21.8±0.32µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.01    122.4±0.54µs        ? ?/sec    1.00    121.7±3.86µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.03     26.1±0.23µs        ? ?/sec    1.00     25.2±0.26µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     82.3±2.59µs        ? ?/sec    1.10     90.6±1.18µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.00    155.1±0.50µs        ? ?/sec    1.03    159.1±1.30µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     84.3±0.17µs        ? ?/sec    1.11     93.8±0.62µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.02     15.5±0.30µs        ? ?/sec    1.00     15.3±0.25µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.00    119.6±0.28µs        ? ?/sec    1.00    119.2±4.33µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.01     20.1±0.22µs        ? ?/sec    1.00     20.0±0.46µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     79.1±0.20µs        ? ?/sec    1.02     80.3±0.43µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00     89.6±0.61µs        ? ?/sec    1.01     90.1±0.78µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     81.1±0.43µs        ? ?/sec    1.01     82.2±0.32µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    104.9±0.77µs        ? ?/sec    1.03    108.1±0.62µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    142.3±0.43µs        ? ?/sec    1.01    143.6±0.92µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.00    106.8±0.54µs        ? ?/sec    1.03    110.2±0.43µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.00    146.3±0.92µs        ? ?/sec    1.00    146.3±0.88µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.01    167.2±1.35µs        ? ?/sec    1.00    165.7±0.63µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.00    149.6±0.62µs        ? ?/sec    1.00    149.5±0.80µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.11     97.2±3.35µs        ? ?/sec    1.00     87.7±0.83µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.04    139.6±0.65µs        ? ?/sec    1.00    134.4±1.37µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.10     99.8±0.79µs        ? ?/sec    1.00     90.9±1.85µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.00     37.7±0.77µs        ? ?/sec    1.21     45.7±1.98µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.00    110.0±0.60µs        ? ?/sec    1.01    111.6±0.57µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.00     41.9±0.88µs        ? ?/sec    1.23     51.5±2.26µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.00     80.8±2.33µs        ? ?/sec    1.03     83.5±2.33µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.00    102.5±1.05µs        ? ?/sec    1.02    104.2±3.43µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.00     82.5±1.00µs        ? ?/sec    1.04     85.4±2.45µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.00    108.1±1.65µs        ? ?/sec    1.03    111.4±0.21µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.00    170.9±0.66µs        ? ?/sec    1.01    172.4±0.55µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    111.0±0.20µs        ? ?/sec    1.03    114.7±0.32µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.00     34.6±0.16µs        ? ?/sec    1.01     35.0±0.53µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.01    134.1±1.02µs        ? ?/sec    1.00    132.9±1.57µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.04     37.7±0.40µs        ? ?/sec    1.00     36.2±0.26µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.00     92.7±1.11µs        ? ?/sec    1.10    101.9±2.73µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.00    167.1±5.62µs        ? ?/sec    1.02    170.9±2.41µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.00     96.1±1.53µs        ? ?/sec    1.09    105.1±0.70µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.00     28.6±0.18µs        ? ?/sec    1.01     28.8±0.63µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.01    131.5±0.87µs        ? ?/sec    1.00    130.2±1.78µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.00     32.0±0.30µs        ? ?/sec    1.01     32.2±0.17µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.00      6.1±0.04ms        ? ?/sec    1.00      6.1±0.03ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.00     12.0±0.20ms        ? ?/sec    1.02     12.2±0.17ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.00    489.6±3.56µs        ? ?/sec    1.02    498.3±3.47µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00    634.1±2.76µs        ? ?/sec    1.04    657.1±5.29µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.00    476.8±3.54µs        ? ?/sec    1.06   505.3±22.68µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.00    671.1±7.18µs        ? ?/sec    1.02    686.6±4.22µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.00    764.4±7.36µs        ? ?/sec    1.03   789.3±24.18µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.00   677.3±14.75µs        ? ?/sec    1.03   695.7±10.60µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.00    330.2±2.91µs        ? ?/sec    1.00    330.9±1.79µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.00    384.7±5.59µs        ? ?/sec    1.10    421.8±2.51µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.00    335.2±7.25µs        ? ?/sec    1.01    338.2±5.78µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.02    152.7±3.42µs        ? ?/sec    1.00    150.4±1.70µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.00    161.6±3.37µs        ? ?/sec    1.18    190.2±1.38µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.00    137.1±2.98µs        ? ?/sec    1.02    140.4±2.18µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.00    398.7±8.17µs        ? ?/sec    1.03    411.2±3.24µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.00    305.9±2.14µs        ? ?/sec    1.11    338.1±3.55µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.00    403.7±4.45µs        ? ?/sec    1.04    420.0±3.52µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.00     92.4±1.73µs        ? ?/sec    1.02     94.2±0.51µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.00    111.5±0.39µs        ? ?/sec    1.01    113.0±2.35µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.00     93.8±0.68µs        ? ?/sec    1.03     96.3±0.36µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.00    125.9±3.68µs        ? ?/sec    1.02    128.4±0.49µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.00    185.6±0.32µs        ? ?/sec    1.01    187.9±0.40µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.00    128.6±0.20µs        ? ?/sec    1.02    131.8±1.82µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     40.8±0.20µs        ? ?/sec    1.01     41.1±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.00    141.1±0.60µs        ? ?/sec    1.00    141.1±4.10µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.00     43.9±0.10µs        ? ?/sec    1.01     44.2±0.66µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.00    100.0±0.25µs        ? ?/sec    1.10    109.7±1.19µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.00    174.3±0.84µs        ? ?/sec    1.02    178.4±3.58µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.00    103.8±0.45µs        ? ?/sec    1.09    113.4±3.15µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.00     36.7±0.18µs        ? ?/sec    1.00     36.6±0.17µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.01    140.2±4.20µs        ? ?/sec    1.00    138.9±1.63µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     40.1±0.14µs        ? ?/sec    1.00     39.9±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.00     81.8±0.27µs        ? ?/sec    1.03     84.0±1.37µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.01    102.1±3.51µs        ? ?/sec    1.00    101.5±0.47µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.00     83.6±0.18µs        ? ?/sec    1.02     85.6±0.14µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.00    106.4±2.83µs        ? ?/sec    1.02    109.0±0.30µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.00    167.4±3.21µs        ? ?/sec    1.01    169.3±8.04µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.00    110.2±2.51µs        ? ?/sec    1.02    112.8±3.44µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.03     24.2±0.31µs        ? ?/sec    1.00     23.5±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.01    122.7±2.41µs        ? ?/sec    1.00    122.0±0.23µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.02     27.0±0.45µs        ? ?/sec    1.00     26.6±0.17µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.00     82.2±0.71µs        ? ?/sec    1.11     91.0±1.12µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.00    156.4±5.62µs        ? ?/sec    1.02    159.3±0.97µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.00     85.5±0.67µs        ? ?/sec    1.10     94.0±0.84µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.00     18.0±0.62µs        ? ?/sec    1.05     18.9±0.91µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.01    120.9±0.99µs        ? ?/sec    1.00    119.5±3.99µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.00     21.5±0.79µs        ? ?/sec    1.00     21.5±0.79µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     79.4±0.32µs        ? ?/sec    1.02     80.8±0.84µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.00     89.2±0.35µs        ? ?/sec    1.01     89.9±0.41µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     81.0±0.24µs        ? ?/sec    1.02     82.6±0.55µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.00    105.3±0.30µs        ? ?/sec    1.03    108.3±0.64µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.00    143.3±0.43µs        ? ?/sec    1.01    144.3±0.64µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.00    107.8±0.38µs        ? ?/sec    1.02    110.1±0.88µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.00    146.1±0.50µs        ? ?/sec    1.01    146.9±0.47µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.01    168.2±4.75µs        ? ?/sec    1.00    166.4±1.73µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.00    150.4±1.13µs        ? ?/sec    1.00    150.0±0.48µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.09     96.5±0.51µs        ? ?/sec    1.00     88.3±1.74µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.04    140.2±1.44µs        ? ?/sec    1.00    134.5±2.65µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.09    100.0±0.43µs        ? ?/sec    1.00     91.7±3.95µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.00     37.6±0.60µs        ? ?/sec    1.22     45.8±2.69µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.00    110.1±0.63µs        ? ?/sec    1.00    110.5±0.52µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.00     43.2±1.01µs        ? ?/sec    1.21     52.5±2.19µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.00     87.8±1.08µs        ? ?/sec    1.05     92.1±1.31µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.00    106.3±0.17µs        ? ?/sec    1.02    108.5±1.17µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.00     89.5±0.17µs        ? ?/sec    1.05     93.9±0.73µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.00    117.9±3.56µs        ? ?/sec    1.04    122.9±1.38µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.00    176.2±0.50µs        ? ?/sec    1.03    181.0±4.25µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.00    120.5±0.18µs        ? ?/sec    1.05    126.1±1.46µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     34.6±0.31µs        ? ?/sec    1.01     34.9±0.10µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.01    134.5±1.96µs        ? ?/sec    1.00    133.0±1.20µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.00     37.6±0.40µs        ? ?/sec    1.00     37.6±0.12µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.00     92.7±0.17µs        ? ?/sec    1.10    101.6±2.98µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.00    167.4±2.80µs        ? ?/sec    1.02    170.7±1.09µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.00     96.1±0.49µs        ? ?/sec    1.09    104.6±0.72µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.00     28.6±0.10µs        ? ?/sec    1.02     29.1±0.06µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.01    132.4±4.74µs        ? ?/sec    1.00    131.5±3.77µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.00     32.2±0.11µs        ? ?/sec    1.00     32.3±0.74µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, mandatory struct, optional data, half NULLs            1.00    120.4±1.93µs        ? ?/sec    1.00    120.0±1.93µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, mandatory struct, optional data, no NULLs              1.00     20.9±0.83µs        ? ?/sec    1.05     21.9±0.81µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, optional struct, optional data, half NULLs             1.00    241.9±1.80µs        ? ?/sec    1.00    241.0±1.44µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, optional struct, optional data, no NULLs               1.00    118.8±2.28µs        ? ?/sec    1.00    119.2±1.89µs        ? ?/sec

@alamb
Contributor

alamb commented Feb 11, 2026

Some of the benchmarks look like they got slower, but it might be noise. I will run them again to check.

@alamb
Contributor

alamb commented Feb 11, 2026

run benchmark arrow_reader

@alamb-ghbot

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing reset-partial-record-state-after-skipping-all-records (0fac73f) to 7dbe58a diff
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=reset-partial-record-state-after-skipping-all-records
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed


group                                                                                                      main                                   reset-partial-record-state-after-skipping-all-records
-----                                                                                                      ----                                   -----------------------------------------------------
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.04   1274.2±6.70µs        ? ?/sec    1.00   1225.5±3.10µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.00   1267.8±5.50µs        ? ?/sec    1.00   1263.5±7.47µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.04   1279.8±4.61µs        ? ?/sec    1.00   1233.0±9.87µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.00    475.1±3.49µs        ? ?/sec    1.08    511.5±8.24µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00    651.3±3.99µs        ? ?/sec    1.01    656.7±3.14µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.00    488.6±3.82µs        ? ?/sec    1.03    503.5±5.15µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.00   553.5±15.81µs        ? ?/sec    1.00   555.0±12.94µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.01   723.1±38.54µs        ? ?/sec    1.00   714.0±18.95µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.00    558.8±6.36µs        ? ?/sec    1.00    559.0±7.27µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    151.2±1.90µs        ? ?/sec    1.02    153.8±1.92µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.00    175.9±0.75µs        ? ?/sec    1.18    207.9±0.90µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.00    157.1±6.10µs        ? ?/sec    1.00    156.9±1.68µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.00    207.6±4.46µs        ? ?/sec    1.07    221.6±3.07µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.00    193.8±0.85µs        ? ?/sec    1.00    194.1±2.32µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.00    213.9±5.35µs        ? ?/sec    1.14    244.6±2.22µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.00    217.0±5.25µs        ? ?/sec    1.05    228.5±3.46µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.07   1123.1±4.81µs        ? ?/sec    1.00  1050.2±13.71µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.00    918.4±4.42µs        ? ?/sec    1.03    949.3±9.92µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.07  1130.0±12.39µs        ? ?/sec    1.00  1055.2±17.34µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.06    447.5±3.05µs        ? ?/sec    1.00    422.9±5.36µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.00    589.6±3.12µs        ? ?/sec    1.06    624.6±5.90µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.06    455.3±4.38µs        ? ?/sec    1.00    428.6±8.18µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    160.5±0.72µs        ? ?/sec    1.22    195.6±2.97µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    300.9±4.42µs        ? ?/sec    1.12    338.1±6.72µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    164.7±2.33µs        ? ?/sec    1.21    199.1±3.93µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     76.8±1.39µs        ? ?/sec    1.54    118.2±4.05µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    257.3±1.94µs        ? ?/sec    1.17    300.9±5.17µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     81.0±0.54µs        ? ?/sec    1.49    121.1±0.99µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.07    736.6±3.31µs        ? ?/sec    1.00    691.5±6.06µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.00    541.4±6.15µs        ? ?/sec    1.06    575.3±1.45µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.07    742.1±2.84µs        ? ?/sec    1.00    696.1±2.69µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.00     63.7±5.18µs        ? ?/sec    1.02     65.3±4.07µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    203.9±1.61µs        ? ?/sec    1.34    273.8±3.18µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.00     69.2±5.98µs        ? ?/sec    1.07     74.1±5.25µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.09     94.2±0.32µs        ? ?/sec    1.00     86.1±0.45µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.02    232.8±1.43µs        ? ?/sec    1.00    229.2±3.92µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.08     98.0±0.54µs        ? ?/sec    1.00     90.5±1.30µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.03      9.4±0.25µs        ? ?/sec    1.00      9.1±0.17µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.00    189.0±0.33µs        ? ?/sec    1.00    189.8±0.50µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.00     13.0±0.22µs        ? ?/sec    1.03     13.3±0.29µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.08    184.0±0.67µs        ? ?/sec    1.00    170.0±0.38µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.00    366.3±1.43µs        ? ?/sec    1.01    371.4±1.15µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.08    188.4±0.68µs        ? ?/sec    1.00    174.3±0.61µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.00     14.3±0.40µs        ? ?/sec    1.03     14.7±0.54µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.00    282.8±1.35µs        ? ?/sec    1.05    295.7±5.19µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     18.5±0.35µs        ? ?/sec    1.03     19.0±0.48µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.07    367.2±5.44µs        ? ?/sec    1.00    342.6±3.59µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.00    341.9±3.54µs        ? ?/sec    1.16    396.9±9.07µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.07    371.7±4.54µs        ? ?/sec    1.00    347.9±1.15µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.00     27.4±0.44µs        ? ?/sec    1.00     27.5±0.52µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.00    173.5±4.05µs        ? ?/sec    1.38    238.9±4.66µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.05     34.3±0.94µs        ? ?/sec    1.00     32.6±0.59µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    109.0±0.26µs        ? ?/sec    1.02    111.0±0.49µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    129.7±1.73µs        ? ?/sec    1.01    131.3±0.90µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    111.2±2.08µs        ? ?/sec    1.03    114.2±4.26µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    160.4±0.94µs        ? ?/sec    1.00    161.1±0.52µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.00    223.5±3.25µs        ? ?/sec    1.00    224.3±1.77µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.00    163.8±1.94µs        ? ?/sec    1.01    165.5±0.55µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00     75.3±1.99µs        ? ?/sec    1.03     77.6±0.91µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    179.6±0.75µs        ? ?/sec    1.00    178.9±2.19µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00     78.7±1.10µs        ? ?/sec    1.01     79.7±0.61µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    134.2±0.48µs        ? ?/sec    1.07    143.1±1.60µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    211.2±2.19µs        ? ?/sec    1.02    216.3±0.94µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    137.9±0.59µs        ? ?/sec    1.07    147.4±1.64µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.05     73.2±1.03µs        ? ?/sec    1.00     70.0±0.33µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.02    178.5±2.68µs        ? ?/sec    1.00    175.7±0.38µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.05     77.6±0.28µs        ? ?/sec    1.00     74.1±0.31µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    107.0±0.97µs        ? ?/sec    1.03    109.7±0.24µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    117.9±0.34µs        ? ?/sec    1.04    122.6±0.38µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    109.1±1.17µs        ? ?/sec    1.03    112.3±0.43µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    159.3±0.90µs        ? ?/sec    1.02    162.9±0.72µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.02    207.9±0.93µs        ? ?/sec    1.00    204.8±0.81µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    163.7±1.39µs        ? ?/sec    1.01    165.7±2.59µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.01    203.7±0.67µs        ? ?/sec    1.00    201.9±0.75µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    224.8±3.54µs        ? ?/sec    1.00    224.9±1.42µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.02    210.0±3.80µs        ? ?/sec    1.00    206.7±1.25µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.06    150.5±0.52µs        ? ?/sec    1.00    142.7±0.67µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.02    198.7±0.85µs        ? ?/sec    1.00    194.7±1.55µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.06    154.4±0.53µs        ? ?/sec    1.00    146.3±0.35µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.05    107.4±1.39µs        ? ?/sec    1.00    102.3±1.14µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.00    174.9±2.52µs        ? ?/sec    1.01    176.5±1.19µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.03    114.9±2.25µs        ? ?/sec    1.00    112.1±1.69µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.00     76.7±1.80µs        ? ?/sec    1.03     78.6±1.03µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.00    103.6±0.74µs        ? ?/sec    1.00    103.4±0.66µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.00     78.7±1.27µs        ? ?/sec    1.03     81.1±3.66µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.00    106.4±0.97µs        ? ?/sec    1.03    109.2±3.67µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.00    175.6±1.19µs        ? ?/sec    1.01    176.7±3.55µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.00    109.9±0.73µs        ? ?/sec    1.02    111.6±3.25µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     40.8±0.16µs        ? ?/sec    1.05     42.8±0.12µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.03    143.4±0.60µs        ? ?/sec    1.00    139.9±1.79µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     44.2±0.34µs        ? ?/sec    1.03     45.6±0.18µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.00    100.5±0.92µs        ? ?/sec    1.09    109.2±0.63µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.00    175.1±2.59µs        ? ?/sec    1.02    178.1±1.03µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    104.0±0.39µs        ? ?/sec    1.08    112.4±0.48µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.00     36.8±0.18µs        ? ?/sec    1.00     36.7±0.12µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.01    139.8±2.74µs        ? ?/sec    1.00    138.1±1.42µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.01     40.2±0.26µs        ? ?/sec    1.00     39.9±0.13µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.00     82.3±0.68µs        ? ?/sec    1.02     84.2±0.42µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.00    101.8±1.64µs        ? ?/sec    1.00    102.0±0.14µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.00     84.1±1.46µs        ? ?/sec    1.02     85.8±0.14µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.00    106.7±2.73µs        ? ?/sec    1.02    108.9±1.07µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.00    166.4±0.82µs        ? ?/sec    1.00    166.9±0.52µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    109.7±0.47µs        ? ?/sec    1.02    112.3±3.24µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.05     23.0±0.22µs        ? ?/sec    1.00     22.0±0.27µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.01    122.7±0.86µs        ? ?/sec    1.00    121.5±1.33µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.05     26.3±0.24µs        ? ?/sec    1.00     25.1±0.74µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     82.4±0.50µs        ? ?/sec    1.11     91.1±2.64µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.00    155.9±1.68µs        ? ?/sec    1.02    159.0±0.83µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     84.4±0.59µs        ? ?/sec    1.12     94.2±2.80µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.02     15.5±0.65µs        ? ?/sec    1.00     15.2±0.50µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.01    120.0±0.96µs        ? ?/sec    1.00    118.3±0.31µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.01     20.1±0.62µs        ? ?/sec    1.00     19.9±0.42µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     78.9±0.44µs        ? ?/sec    1.03     81.0±0.87µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00     89.3±0.43µs        ? ?/sec    1.01     90.3±0.31µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     80.8±0.50µs        ? ?/sec    1.03     82.9±0.94µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    105.3±0.68µs        ? ?/sec    1.03    108.1±1.07µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    142.2±0.45µs        ? ?/sec    1.01    144.0±1.77µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.00    107.7±0.53µs        ? ?/sec    1.03    110.7±0.42µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.00    147.4±1.42µs        ? ?/sec    1.00    146.9±0.50µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.02    168.3±2.48µs        ? ?/sec    1.00    165.6±0.76µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.00    149.7±1.08µs        ? ?/sec    1.01    150.5±0.74µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.10     96.9±0.69µs        ? ?/sec    1.00     87.7±0.69µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.04    140.5±3.30µs        ? ?/sec    1.00    134.8±0.73µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.10    100.1±0.55µs        ? ?/sec    1.00     90.8±0.51µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.00     38.8±0.90µs        ? ?/sec    1.25     48.5±3.14µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.00    110.4±1.60µs        ? ?/sec    1.01    111.0±0.37µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.00     43.2±0.91µs        ? ?/sec    1.25     53.9±3.45µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.00     81.0±0.55µs        ? ?/sec    1.03     83.6±0.86µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.00    103.0±2.21µs        ? ?/sec    1.01    104.1±1.80µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.00     83.8±1.39µs        ? ?/sec    1.03     86.4±5.56µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.00    108.5±0.92µs        ? ?/sec    1.04    112.8±5.91µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.00    171.9±0.97µs        ? ?/sec    1.00    172.4±1.02µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    111.8±0.88µs        ? ?/sec    1.03    115.2±2.17µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.00     34.6±0.29µs        ? ?/sec    1.01     35.0±0.78µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.01    134.5±0.59µs        ? ?/sec    1.00    132.9±0.34µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.04     37.7±0.39µs        ? ?/sec    1.00     36.2±0.09µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.00     92.6±0.67µs        ? ?/sec    1.10    101.4±0.47µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.00    166.5±0.88µs        ? ?/sec    1.02    170.3±0.56µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.00     95.9±0.96µs        ? ?/sec    1.09    104.8±0.26µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.01     28.9±0.17µs        ? ?/sec    1.00     28.7±0.09µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.01    132.1±0.86µs        ? ?/sec    1.00    130.3±4.04µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.00     32.2±0.21µs        ? ?/sec    1.00     32.3±0.72µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.00      6.1±0.15ms        ? ?/sec    1.00      6.0±0.04ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.01     12.0±0.23ms        ? ?/sec    1.00     11.9±0.11ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.00    488.4±5.79µs        ? ?/sec    1.03   503.6±13.94µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00   641.8±10.76µs        ? ?/sec    1.02   657.4±13.75µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.00    480.4±3.70µs        ? ?/sec    1.04    501.6±8.31µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.00    660.0±4.57µs        ? ?/sec    1.05    692.6±7.05µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.00    775.3±5.97µs        ? ?/sec    1.01    787.0±5.33µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.00   669.9±10.25µs        ? ?/sec    1.04    696.7±9.20µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.00    328.4±1.18µs        ? ?/sec    1.00    329.4±0.98µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.00   382.6±16.42µs        ? ?/sec    1.10    421.8±1.79µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.00    333.6±2.08µs        ? ?/sec    1.00    334.7±2.52µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.01    152.0±2.55µs        ? ?/sec    1.00    150.8±1.57µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.00    167.3±3.23µs        ? ?/sec    1.14    190.0±0.57µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.02    144.8±3.74µs        ? ?/sec    1.00    141.6±4.95µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.00    394.5±2.19µs        ? ?/sec    1.04    411.2±1.70µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.00    300.5±3.22µs        ? ?/sec    1.13   339.4±11.24µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.00    405.9±7.89µs        ? ?/sec    1.03    419.1±2.22µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.00     92.2±0.37µs        ? ?/sec    1.03     94.7±0.70µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.00    111.5±0.45µs        ? ?/sec    1.01    112.8±0.38µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.00     94.3±0.38µs        ? ?/sec    1.02     96.5±1.26µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.00    125.7±0.35µs        ? ?/sec    1.02    128.5±0.26µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.00    186.6±7.36µs        ? ?/sec    1.02   189.8±10.13µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.00    129.4±1.14µs        ? ?/sec    1.02    132.3±3.54µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     40.9±0.17µs        ? ?/sec    1.00     41.0±0.19µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.01    141.5±0.49µs        ? ?/sec    1.00    140.6±1.22µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.00     44.1±0.15µs        ? ?/sec    1.00     44.1±0.10µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.00    101.0±0.64µs        ? ?/sec    1.09    109.6±0.35µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.00    174.5±0.75µs        ? ?/sec    1.02    178.0±0.58µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.00    104.3±3.39µs        ? ?/sec    1.08    112.5±0.31µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.01     36.9±0.20µs        ? ?/sec    1.00     36.5±0.12µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.01    140.2±0.49µs        ? ?/sec    1.00    138.9±0.52µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     40.2±0.17µs        ? ?/sec    1.00     40.0±0.70µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.00     82.0±0.17µs        ? ?/sec    1.02     83.8±0.15µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.00    101.9±0.30µs        ? ?/sec    1.00    101.5±0.25µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.00     84.1±0.28µs        ? ?/sec    1.03     86.4±0.71µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.00    106.6±0.49µs        ? ?/sec    1.03    109.3±0.71µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.00    167.6±0.39µs        ? ?/sec    1.00    167.2±0.56µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.00    110.1±0.35µs        ? ?/sec    1.02    112.3±0.37µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.02     24.2±0.52µs        ? ?/sec    1.00     23.6±0.30µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.00    122.8±1.20µs        ? ?/sec    1.00    122.2±0.75µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.00     26.8±0.26µs        ? ?/sec    1.00     26.8±0.70µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.00     82.0±0.88µs        ? ?/sec    1.10     90.2±0.27µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.00    155.9±1.87µs        ? ?/sec    1.02    159.7±4.96µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.00     85.5±0.44µs        ? ?/sec    1.10     93.9±1.04µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.00     18.8±1.22µs        ? ?/sec    1.03     19.4±1.59µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.01    120.3±0.46µs        ? ?/sec    1.00    119.1±0.73µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.00     22.2±0.88µs        ? ?/sec    1.01     22.4±1.44µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     80.3±0.35µs        ? ?/sec    1.00     80.6±0.58µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.01     90.5±2.76µs        ? ?/sec    1.00     89.4±0.19µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     82.2±0.30µs        ? ?/sec    1.00     82.4±0.37µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.00    107.4±0.82µs        ? ?/sec    1.01    108.3±0.48µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.00    143.8±0.84µs        ? ?/sec    1.00    144.4±1.94µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.00    110.0±1.03µs        ? ?/sec    1.00    110.5±0.61µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.00    147.4±0.86µs        ? ?/sec    1.00    147.2±0.75µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.01    168.4±1.53µs        ? ?/sec    1.00    166.9±0.67µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.00    150.5±0.48µs        ? ?/sec    1.00    150.6±0.59µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.10     96.7±0.53µs        ? ?/sec    1.00     87.8±0.47µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.04    140.2±0.74µs        ? ?/sec    1.00    134.4±0.35µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.10    100.1±0.59µs        ? ?/sec    1.00     90.9±1.39µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.00     39.8±0.87µs        ? ?/sec    1.17     46.7±2.82µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.00    110.2±1.12µs        ? ?/sec    1.00    109.6±0.80µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.00     44.8±0.81µs        ? ?/sec    1.16     52.0±2.33µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.00     88.5±3.70µs        ? ?/sec    1.04     91.6±0.17µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.00    107.3±2.04µs        ? ?/sec    1.01    108.4±0.51µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.00     89.8±0.96µs        ? ?/sec    1.04     93.9±1.79µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.00    118.5±1.56µs        ? ?/sec    1.03    122.5±0.28µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.00    177.2±3.70µs        ? ?/sec    1.02    180.1±0.44µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.00    121.4±1.04µs        ? ?/sec    1.04    125.9±0.57µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     34.7±0.18µs        ? ?/sec    1.00     34.7±0.05µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.01    134.1±0.82µs        ? ?/sec    1.00    132.7±0.48µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.01     37.7±0.25µs        ? ?/sec    1.00     37.4±0.15µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.00     93.4±0.81µs        ? ?/sec    1.09    101.5±0.41µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.00    167.5±2.65µs        ? ?/sec    1.02    170.4±0.51µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.00     96.8±0.42µs        ? ?/sec    1.08    104.8±0.33µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.00     28.9±0.20µs        ? ?/sec    1.00     28.9±0.44µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.02    132.6±1.44µs        ? ?/sec    1.00    130.4±0.31µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.01     32.4±0.20µs        ? ?/sec    1.00     32.1±0.16µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, mandatory struct, optional data, half NULLs            1.02    121.8±0.64µs        ? ?/sec    1.00    120.0±1.08µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, mandatory struct, optional data, no NULLs              1.01     22.1±1.14µs        ? ?/sec    1.00     21.9±1.25µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, optional struct, optional data, half NULLs             1.01    242.6±0.85µs        ? ?/sec    1.00    241.1±0.55µs        ? ?/sec
arrow_array_reader/struct/Int32Array/plain encoded, optional struct, optional data, no NULLs               1.00    118.8±1.72µs        ? ?/sec    1.01    120.0±3.90µs        ? ?/sec

@jonded94
Contributor Author

jonded94 commented Feb 12, 2026

Actually, what I stated (and tested for) here was wrong: this isn't about mixing v1 and v2 data pages; the bug can be reproduced with v2 data pages only. That's what my new commit 365bd9a is about.

Additionally, I verified that the original file I had this issue with also contains only v2 data pages. Please read this comment for a very detailed overview of all data pages in the affected row group: #9370 (comment). I also added a test in PR #9399 that currently panics, as expected, and that I validated no longer panics once the bugfix from this PR is applied.

Regarding the actual bugfix introduced in this PR: it seems to lead to some quite hefty performance regressions, especially in these benchmarks launched by @alamb:

arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    160.5±0.72µs        ? ?/sec    1.22    195.6±2.97µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    300.9±4.42µs        ? ?/sec    1.12    338.1±6.72µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    164.7±2.33µs        ? ?/sec    1.21    199.1±3.93µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     76.8±1.39µs        ? ?/sec    1.54    118.2±4.05µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    257.3±1.94µs        ? ?/sec    1.17    300.9±5.17µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     81.0±0.54µs        ? ?/sec    1.49    121.1±0.99µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    203.9±1.61µs        ? ?/sec    1.34    273.8±3.18µs        ? ?/sec
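For anyone reading the rows above, here is a minimal sketch of how the ratio columns relate to the timings. It assumes each row lists two builds, each with a ratio relative to the faster of the two followed by a mean time, and that the left pair is the baseline while the right pair is this PR's branch; the table header that would confirm this is not part of this excerpt. The helper names `slowdown_ratio` and `sample_rows` are made up for illustration.

```rust
/// Relative slowdown of `candidate` versus `baseline` (same time unit for both).
/// Hypothetical helper, not part of arrow-rs or the benchmark tooling.
fn slowdown_ratio(baseline_us: f64, candidate_us: f64) -> f64 {
    candidate_us / baseline_us
}

/// Mean timings copied from three rows of the excerpt above:
/// (label, left-column time in µs, right-column time in µs).
fn sample_rows() -> Vec<(&'static str, f64, f64)> {
    vec![
        ("Float16Array/plain encoded, mandatory, no NULLs", 76.8, 118.2),
        ("Float16Array/plain encoded, optional, no NULLs", 81.0, 121.1),
        ("FixedLenByteArray(16)/plain encoded, optional, half NULLs", 203.9, 273.8),
    ]
}

fn main() {
    for (name, left, right) in sample_rows() {
        let ratio = slowdown_ratio(left, right);
        // Express the ratio as a percentage change relative to the left column.
        let pct = (ratio - 1.0) * 100.0;
        println!("{name}: {ratio:.2}x ({pct:+.0}%)");
    }
}
```

The printed ratios can differ from the table in the last digit, since the table's timings are already rounded to one decimal place.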

@jonded94
Contributor Author

Hey, unfortunately this is a bit outside my expertise, but here is a slightly different fix:

different_bugfix.patch

diff --git a/parquet/src/column/page.rs b/parquet/src/column/page.rs
index f18b296c1..4cfc07a02 100644
--- a/parquet/src/column/page.rs
+++ b/parquet/src/column/page.rs
@@ -406,7 +406,14 @@ pub trait PageReader: Iterator<Item = Result<Page>> + Send {
     /// [(#4327)]: https://github.com/apache/arrow-rs/pull/4327
     /// [(#4943)]: https://github.com/apache/arrow-rs/pull/4943
     fn at_record_boundary(&mut self) -> Result<bool> {
-        Ok(self.peek_next_page()?.is_none())
+        match self.peek_next_page()? {
+            // Last page in the column chunk - always a record boundary
+            None => Ok(true),
+            // A V2 data page is required by the parquet spec to start at a
+            // record boundary, so the current page ends at one.  V2 pages
+            // are identified by having `num_rows` set in their header.
+            Some(metadata) => Ok(metadata.num_rows.is_some()),
+        }
     }
 }
 
diff --git a/parquet/src/column/reader.rs b/parquet/src/column/reader.rs
index eb2d53fb7..29cb50185 100644
--- a/parquet/src/column/reader.rs
+++ b/parquet/src/column/reader.rs
@@ -309,20 +309,6 @@ where
                 });
 
                 if let Some(rows) = rows {
-                    // If there is a pending partial record from a previous page,
-                    // count it before considering the whole-page skip. When the
-                    // next page provides num_rows (e.g. a V2 data page or via
-                    // offset index), its records are self-contained, so the
-                    // partial from the previous page is complete at this boundary.
-                    if let Some(decoder) = self.rep_level_decoder.as_mut() {
-                        if decoder.flush_partial() {
-                            remaining_records -= 1;
-                            if remaining_records == 0 {
-                                return Ok(num_records);
-                            }
-                        }
-                    }
-
                     if rows <= remaining_records {
                         self.page_reader.skip_next_page()?;
                         remaining_records -= rows;
diff --git a/parquet/src/file/serialized_reader.rs b/parquet/src/file/serialized_reader.rs
index b3b6383f7..254ccb779 100644
--- a/parquet/src/file/serialized_reader.rs
+++ b/parquet/src/file/serialized_reader.rs
@@ -1158,7 +1158,12 @@ impl<R: ChunkReader> PageReader for SerializedPageReader<R> {
 
     fn at_record_boundary(&mut self) -> Result<bool> {
         match &mut self.state {
-            SerializedPageReaderState::Values { .. } => Ok(self.peek_next_page()?.is_none()),
+            SerializedPageReaderState::Values { .. } => match self.peek_next_page()? {
+                None => Ok(true),
+                // V2 data pages must start at record boundaries per the parquet
+                // spec, so the current page ends at one.
+                Some(metadata) => Ok(metadata.num_rows.is_some()),
+            },
             SerializedPageReaderState::Pages { .. } => Ok(true),
         }
     }

I validated that with this bugfix both the test in this PR and my end-to-end regression tests in #9399 still pass (for the latter, I had to remove the should_panic attribute).

If you think the new bugfix would make more sense, I'll commit it.
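
For readers who want the rule from the patch in isolation, here is a minimal sketch. `NextPageMeta` is a hypothetical stand-in for whatever `peek_next_page` returns, and only the `num_rows` field is modelled; it is not the crate's actual type.

```rust
/// Hypothetical stand-in for the metadata returned by `peek_next_page`.
/// Only the field relevant to the boundary decision is modelled here.
#[derive(Debug, Clone, Copy)]
struct NextPageMeta {
    /// `Some(n)` when the page header carries a row count, `None` otherwise.
    num_rows: Option<usize>,
}

/// The rule proposed in the patch: the current page ends at a record
/// boundary if there is no next page, or if the next page reports `num_rows`.
fn at_record_boundary(next_page: Option<NextPageMeta>) -> bool {
    match next_page {
        // Last page in the column chunk: always a record boundary.
        None => true,
        // Assumption taken from the patch: a page that reports `num_rows`
        // starts at a record boundary, so the current page ends at one.
        Some(meta) => meta.num_rows.is_some(),
    }
}
```

Note that this only encodes the assumption that a page reporting `num_rows` starts on a record boundary; it does not check it against the actual repetition levels.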

Claude output about why this new bugfix is potentially faster:

The problem

The original bugfix in PR #9374 added a 14-line flush_partial check to skip_records() — a generic function monomorphized 8 times (once per parquet physical type). This increased compiled code size
across the entire compilation unit, causing instruction cache effects that slowed down read_records() benchmarks by up to 54% for FixedLenByteArray types, even though read operations never call
skip_records.

The optimization: fix the root cause instead

The bug's root cause is that at_record_boundary() incorrectly returns false for V2 data pages in sequential reading mode. Per the parquet spec, V2 pages must start at record boundaries. By fixing
at_record_boundary() to return true when the next page is V2, the existing flush_partial mechanism in both read_records and skip_records handles the bug correctly — no new code needed in those hot
functions.

Changes:

  1. parquet/src/column/page.rs — Default at_record_boundary() now returns true when the next page has num_rows (V2 indicator), not just when there's no next page.
  2. parquet/src/file/serialized_reader.rs — Same fix for the Values state (sequential reading without offset index).
  3. parquet/src/column/reader.rs — Removed the 14-line flush_partial block from skip_records(). The function is now identical to master.

Why this eliminates the performance regression

  • skip_records() returns to its original code size (actually slightly smaller due to net -4 lines)
  • No new code in any monomorphized generic function
  • The only change is in at_record_boundary(), called once per page load (not in the hot decoding loop)
  • The existing flush_partial mechanism at page exhaustion handles the partial record naturally

Why the existing mechanism works

When has_record_delimiter is true (now correctly set for V2 pages), the existing code at lines 335-340 of reader.rs fires:

if levels_read == remaining_levels && self.has_record_delimiter {
    records_read += decoder.flush_partial() as usize;
}

This flushes the pending partial record when a page is exhausted during skip, before the loop iterates back to the page-skip shortcut — preventing the over-count that caused the phantom record bug.
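
To make the role of has_partial concrete, here is a toy model of the mechanism described above. `ToyRepLevelDecoder` is hypothetical and heavily simplified; it only mimics the has_partial bookkeeping and is not the crate's RepetitionLevelDecoder.

```rust
/// Toy model of the `has_partial` bookkeeping (hypothetical, simplified).
#[derive(Debug, Default)]
struct ToyRepLevelDecoder {
    /// True while a record has been started but its end has not been seen.
    has_partial: bool,
}

impl ToyRepLevelDecoder {
    /// Scans `levels` and returns `(records_completed, levels_consumed)`.
    /// A repetition level of 0 starts a new record, which also completes
    /// the pending partial record, if there is one.
    fn count_records(&mut self, levels: &[i16], max_records: usize) -> (usize, usize) {
        if max_records == 0 {
            return (0, 0);
        }
        let mut records = 0;
        for (idx, &level) in levels.iter().enumerate() {
            if level == 0 && self.has_partial {
                records += 1;
                self.has_partial = false;
                if records == max_records {
                    // The level at `idx` belongs to the next record,
                    // so it is not counted as consumed.
                    return (records, idx);
                }
            }
            self.has_partial = true;
        }
        (records, levels.len())
    }

    /// Completes the pending partial record, if any, at a known boundary.
    fn flush_partial(&mut self) -> bool {
        std::mem::take(&mut self.has_partial)
    }
}
```

In this model, a decoder that was flushed at the page boundary reads one full record from `[0, 1, 0]` and consumes two levels. A decoder left with a stale `has_partial = true` instead reports one record and zero levels for the same input, which is the phantom-record shape described in the PR description.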

@etseidl
Contributor

etseidl commented Feb 12, 2026

Hmm, I had assumed at_record_boundary was more sophisticated. Looking at past discussions around this issue (#4943), I don't know what the correct fix is, beyond simply removing the shortcut altogether. The issue with relying on V2 page behavior is that files created by earlier versions of this crate (and possibly other implementations) did not always obey the pages-start-at-a-record-boundary rule. The only way I can think of to really know whether the has_partial flag can be cleared is to decode the first repetition level of the next page and see if it's 0. Beyond that there's really no way to know.

I think maybe we should only trust num_rows if there are no repetition levels (i.e. no lists present), in which case we know for sure there are no page-spanning rows.

Something like

                // If page has less rows than the remaining records to
                // be skipped, skip entire page
                // If no repetition levels, num_levels == num_rows
                // We cannot trust V2 pages to always respect record boundaries, so we'll only
                // perform this short-circuit when no lists are present.
                if self.rep_level_decoder.is_none() {
                    if let Some(rows) = metadata.num_levels {
                        if rows <= remaining_records {
                            self.page_reader.skip_next_page()?;
                            remaining_records -= rows;
                            continue;
                        }
                    }
                }
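
The guard in the snippet above can be exercised on its own with a small stand-alone sketch. Both function names are hypothetical, and pages are reduced to their optional `num_levels` value; this illustrates the decision only, not the reader's actual control flow.

```rust
/// Whole-page skip is allowed only when the column has no repetition
/// levels (so `num_levels == num_rows`) and the page fits entirely within
/// the records still to be skipped. Hypothetical helper for illustration.
fn can_skip_whole_page(
    has_rep_levels: bool,
    num_levels: Option<usize>,
    remaining_records: usize,
) -> bool {
    !has_rep_levels && matches!(num_levels, Some(rows) if rows <= remaining_records)
}

/// Walks the pages in order and skips whole pages while the guard holds.
/// Returns `(pages_skipped, records_still_to_skip)`.
fn plan_skip(
    pages: &[Option<usize>],
    has_rep_levels: bool,
    mut remaining_records: usize,
) -> (usize, usize) {
    let mut skipped_pages = 0;
    for &num_levels in pages {
        match num_levels {
            Some(rows) if can_skip_whole_page(has_rep_levels, num_levels, remaining_records) => {
                remaining_records -= rows;
                skipped_pages += 1;
            }
            // Unknown size, page too large, or lists present: stop and
            // fall back to decoding levels.
            _ => break,
        }
    }
    (skipped_pages, remaining_records)
}
```

With pages of 3, 2 and 4 levels and 6 records to skip, a flat column skips the first two pages and leaves 1 record to skip by decoding. A column with repetition levels skips no page at all under this rule.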

alamb pushed a commit that referenced this pull request Feb 12, 2026
# Which issue does this PR close?

Related to

#9374
#9370

# Rationale for this change

In #9374, I only came up with a unit test that did not actually
trigger the "Not all children array length are the same!" error that
issue #9370 is about. This PR introduces tests that reproduce the issue
end-to-end and are currently expected to panic. The test
`test_list_struct_page_boundary_desync_produces_length_mismatch`
produces exactly the error message from the original issue, while
`test_row_selection_list_column_v2_page_boundary_skip` shows a slightly
different error message.

# What changes are included in this PR?

Two tests are introduced, both of which are currently expected to panic:
- test_row_selection_list_column_v2_page_boundary_skip
- test_list_struct_page_boundary_desync_produces_length_mismatch

# Are there any user-facing changes?

No
@alamb
Contributor

alamb commented Feb 12, 2026

I merged up from main and un-"should_panic"d the tests

I need to devote more time to studying this PR (and trying to reproduce the benchmark differences locally) tomorrow

@alamb
Contributor

alamb commented Feb 12, 2026

This is great work @jonded94

@etseidl
Contributor

etseidl commented Feb 13, 2026

Something like

Hmm, never mind. This causes some async tests to fail (I suspect due to skipping the fetching/reading of pages via the page indexes). I'm testing a different approach in #9406.

Labels

parquet Changes to the parquet crate

Development

Successfully merging this pull request may close these issues.

Error "Not all children array length are the same!" when decoding rows spanning across page boundaries in parquet file when using RowSelection

4 participants