fix(reader): width-agnostic numeric compare in zone-map pruning (#159) #160

Merged: dfa1 merged 1 commit into main from fix/compare-values-numeric-widths on Jun 24, 2026

@dfa1 (Owner) commented Jun 24, 2026

Fixes #159.

ScanIterator's zone-map comparator caught ClassCastException and returned 0, which canPruneChunk treats as "cannot prune". Stats decode integers as Long and floats as Float/Double, so a filter value boxed at a different width (Integer on an I64 column, Float on F32) threw internally and silently disabled pruning. A valid, selective predicate degraded to a full scan with no signal.
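
To make the failure mode concrete, here is a minimal sketch of a comparator that swallows the cast failure. `SwallowedMismatchDemo` and `legacyCompare` are hypothetical names for illustration, not the actual ScanIterator code; the sketch only assumes the old path compared boxed values through `Comparable`.

```java
final class SwallowedMismatchDemo {
    // Hypothetical stand-in for the old comparator: compares through
    // Comparable and hides the ClassCastException behind a 0.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int legacyCompare(Object stat, Object filter) {
        try {
            return ((Comparable) stat).compareTo(filter);
        } catch (ClassCastException e) {
            return 0; // the caller reads 0 as "cannot prune"
        }
    }

    public static void main(String[] args) {
        Object stat = 100L; // stats decode integers as Long
        System.out.println(legacyCompare(stat, 5L)); // prints 1: Long vs Long works
        System.out.println(legacyCompare(stat, 5));  // prints 0: Integer vs Long throws, swallowed
    }
}
```

The second call is the problem case: 100 is clearly greater than 5, but `Long.compareTo` rejects an `Integer` argument, so the caller gets 0 and cannot tell it apart from a genuine "equal" result.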

Fix — key the comparison off the column type, not the boxed operand

  • floating column -> Double.compare
  • unsigned int column -> Long.compareUnsigned. U64 stats/values store raw bits, so a value >= 2^63 is a negative Long and a signed compare would keep/drop the wrong chunks (a correctness bug, not just perf). U8/U16/U32 zero-extend to a positive Long and are unaffected.
  • signed int column -> Long.compare
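
The dispatch above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `ColumnKind`, `asNumber`, and `compare` are hypothetical names, and `IllegalArgumentException` stands in for VortexException so the block stays self-contained. Widening a filter value with `longValue()` is also an assumption about how operands are normalised.

```java
final class ColumnKeyedCompare {
    // Hypothetical classification of the column, derived from its type.
    enum ColumnKind { FLOATING, UNSIGNED_INT, SIGNED_INT }

    // Any boxed numeric width is accepted; anything else is rejected loudly.
    static Number asNumber(Object v) {
        if (v instanceof Number n) {
            return n;
        }
        // Stand-in for VortexException in this sketch.
        throw new IllegalArgumentException("not comparable to a numeric column: " + v);
    }

    // The column kind, not the boxed operand type, selects the comparison.
    static int compare(ColumnKind kind, Object stat, Object filter) {
        Number a = asNumber(stat);
        Number b = asNumber(filter);
        return switch (kind) {
            case FLOATING -> Double.compare(a.doubleValue(), b.doubleValue());
            case UNSIGNED_INT -> Long.compareUnsigned(a.longValue(), b.longValue());
            case SIGNED_INT -> Long.compare(a.longValue(), b.longValue());
        };
    }

    public static void main(String[] args) {
        long big = Long.MIN_VALUE; // raw bits of 2^63 read as an unsigned 64-bit value
        System.out.println(compare(ColumnKind.SIGNED_INT, 100L, 5));   // prints 1
        System.out.println(compare(ColumnKind.UNSIGNED_INT, big, 1L)); // prints 1
        System.out.println(compare(ColumnKind.SIGNED_INT, big, 1L));   // prints -1
    }
}
```

The last two lines show why the unsigned branch matters: the same bit pattern orders above 1 when read as U64 and below 1 when read as a signed Long.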

Keying off the column also avoids ever routing an integer column through double-compare, which would lose precision past 2^53 and mis-prune (review point #1).
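
A small standalone check of the 2^53 claim, using only `java.lang` methods (the class and method names are illustrative): two distinct longs just past 2^53 collapse to the same double, so a double-compare reports them as equal.

```java
final class DoublePrecisionDemo {
    static int viaLong(long a, long b) {
        return Long.compare(a, b);
    }

    static int viaDouble(long a, long b) {
        // Both operands are rounded to the nearest representable double first.
        return Double.compare((double) a, (double) b);
    }

    public static void main(String[] args) {
        long a = 1L << 53; // 9007199254740992
        long b = a + 1;    // 9007199254740993, not representable as a double
        System.out.println(viaLong(a, b));   // prints -1
        System.out.println(viaDouble(a, b)); // prints 0
    }
}
```

A zone map whose max is `a` would wrongly be treated as possibly containing `b` (or the reverse for a strict bound) if the integer column were compared in the double domain.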

Eq/Neq had their own inline comparator with the same swallow; they now route through the shared one.

⚠️ Behaviour change (review point #2)

A filter value genuinely incomparable to its column (e.g. a String against a numeric column) now raises VortexException during the scan instead of silently disabling pruning. Callers relying on the old silent full scan will see an exception. Called out in CHANGELOG.md.

API

Adds DType.isUnsigned() (exhaustive switch over the sealed set) to classify the column.
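
As a rough illustration of the exhaustive-switch idea, the sketch below models the numeric types as an enum. This is an assumption made for brevity: the real DType is described as a sealed set whose shape is not shown here, and `NumericType` and `UnsignedClassifierSketch` are hypothetical names.

```java
final class UnsignedClassifierSketch {
    // Hypothetical stand-in for the numeric members of the column type set.
    enum NumericType { U8, U16, U32, U64, I8, I16, I32, I64, F32, F64 }

    // No default branch: the compiler rejects the switch expression if a
    // newly added type is left unclassified.
    static boolean isUnsigned(NumericType t) {
        return switch (t) {
            case U8, U16, U32, U64 -> true;
            case I8, I16, I32, I64, F32, F64 -> false;
        };
    }

    public static void main(String[] args) {
        System.out.println(isUnsigned(NumericType.U64)); // prints true
        System.out.println(isUnsigned(NumericType.I64)); // prints false
    }
}
```

Omitting `default` is the design point: adding a new type forces a compile-time decision about its signedness instead of silently falling into one bucket.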

Tests

  • ZoneMapPruningTest (27): BoxedWidth (Integer == Long, all six operators), Unsigned (U64 >= 2^63 keep/prune correctness), FloatWidths (F32 stat vs Double/Float filters), IntegerColumnFloatFilter (Double filter on I64 compares in the integer domain past 2^53), TypeMismatch (String throws). All groups parameterized.
  • DTypeIsUnsignedTest.

Full reader + writer suites green (incl. checkstyle/javadoc).

🤖 Generated with Claude Code

@dfa1 force-pushed the fix/compare-values-numeric-widths branch 3 times, most recently from 731dc81 to 60137a4 (June 24, 2026 21:40)

ScanIterator's comparator caught ClassCastException and returned 0, which
canPruneChunk reads as "cannot prune". Stats decode integers as Long and floats
as Float/Double, so a filter value boxed at a different width (Integer for I64,
Float for F32) threw internally and silently disabled pruning — a valid,
selective predicate degraded to a full scan with no signal.

The comparison now keys off the *column* type, not the boxed operand:
- floating column      -> Double.compare;
- unsigned int column  -> Long.compareUnsigned (U64 stats/values store raw bits,
  so a value >= 2^63 is a negative Long; signed compare keeps/drops the wrong
  chunks). U8/U16/U32 zero-extend to a positive Long, unaffected;
- signed int column    -> Long.compare.

Keying off the column also avoids routing an integer column through
double-compare, which would lose precision past 2^53 and mis-prune.

Eq/Neq previously had their own inline comparator with the same swallow; they now
route through the shared one. A genuinely incomparable filter value (e.g. a
String against a numeric column) now raises VortexException instead of a silent
no-prune — a behaviour change, noted in the changelog.

Adds DType.isUnsigned() (exhaustive over the sealed set) to classify the column.

Coverage — ZoneMapPruningTest (27): BoxedWidth (Integer == Long, all six
operators), Unsigned (U64 >= 2^63 keep/prune correctness), FloatWidths (F32 stat
vs Double/Float filters), IntegerColumnFloatFilter (Double filter on I64 compares
in the integer domain past 2^53), TypeMismatch (String throws). Plus
DTypeIsUnsignedTest.

Closes #159.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@dfa1 force-pushed the fix/compare-values-numeric-widths branch from 60137a4 to acbaa0b (June 24, 2026 21:43)
@dfa1 merged commit b6143f4 into main on Jun 24, 2026 (6 checks passed)
@dfa1 deleted the fix/compare-values-numeric-widths branch (June 24, 2026 21:46)

Development

Successfully merging this pull request may close these issues.

reader: ScanIterator.compareValues swallows ClassCastException, silently disabling zone-map pruning on type-mismatched filters