fix(registry): deterministic encoder order via TreeMap (fixes Windows build) #102

Merged
dfa1 merged 1 commit into main from fix-windows-writeregistry-order on Jun 20, 2026
dfa1 (Owner) commented Jun 20, 2026

The failure

Windows CI (run 27878549743) threw in VortexWriterTest.create_withWriteRegistry_roundTrips:

java.lang.ClassCastException: class [J cannot be cast to ChunkedData
  at ChunkedEncodingEncoder.encode
  at VortexWriter.writeSegment / writeChunk

Root cause (not the symptom)

VortexWriter.create(.., WriteRegistry) selects the first encoder whose accepts() matches the dtype, iterating WriteRegistry.encoderMap().values(). That map was built with Map.copyOf(HashMap), whose iteration order is unspecified and non-deterministic, so encoder selection varied across runs and platforms. On Windows the hash order put the composite ChunkedEncodingEncoder (accepts(Primitive) == true, but it needs a ChunkedData wrapper) ahead of PrimitiveEncodingEncoder for a plain long[]. Mac/Linux happened to order a working encoder first, so the build was green there and red on Windows.
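To make the failure mode concrete, here is a minimal sketch of first-match dispatch over a map's iteration order. It is not the project's code: the class `FirstMatchDemo`, the string dtype label, and the use of a `Predicate` in place of an encoder are assumptions made for illustration, and a `LinkedHashMap` stands in for the two hash orders so the difference is reproducible.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

final class FirstMatchDemo {
    // First-match dispatch: the first entry whose predicate accepts the dtype wins.
    static String select(Map<String, Predicate<String>> encoders, String dtype) {
        for (Map.Entry<String, Predicate<String>> e : encoders.entrySet()) {
            if (e.getValue().test(dtype)) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("no encoder accepts " + dtype);
    }

    // Builds a registry that iterates in the given order, simulating one
    // particular hash order. Every encoder here claims to accept primitives.
    static Map<String, Predicate<String>> inOrder(String... names) {
        Map<String, Predicate<String>> m = new LinkedHashMap<>();
        for (String n : names) {
            m.put(n, dtype -> dtype.equals("primitive"));
        }
        return m;
    }

    public static void main(String[] args) {
        // Same registered set, different iteration order, different winner.
        System.out.println(select(inOrder("primitive", "chunked"), "primitive")); // primitive
        System.out.println(select(inOrder("chunked", "primitive"), "primitive")); // chunked
    }
}
```

The registered set is identical in both calls; only the iteration order differs, which is enough to change the selected encoder.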

Fix

Back both WriteRegistry and ReadRegistry with a TreeMap, so iteration follows the id enum's natural ordering — a total order that is a pure function of the registered set, independent of:

  • registration sequence,
  • ServiceLoader's unspecified iteration order,
  • how register / registerServiceLoaded calls interleave (the "what if another is added after loadAll()?" case).

registerServiceLoaded no longer needs to sort its own batch — build() owns the order.
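The property claimed above can be sketched in isolation. The example assumes a hypothetical `EncodingId` enum (the real id type is not shown in this PR) and checks that a `TreeMap` keyed by it iterates in the same order no matter how the entries were registered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TreeMapOrderDemo {
    // Hypothetical encoding ids; natural enum order is declaration order.
    enum EncodingId { PRIMITIVE, CHUNKED, DICT }

    // Registers ids in the given sequence and returns the resulting iteration order.
    static List<EncodingId> iterationOrder(EncodingId... registrationSequence) {
        Map<EncodingId, String> encoders = new TreeMap<EncodingId, String>();
        for (EncodingId id : registrationSequence) {
            encoders.put(id, "encoder for " + id);
        }
        return new ArrayList<EncodingId>(encoders.keySet());
    }

    public static void main(String[] args) {
        // Both calls print [PRIMITIVE, CHUNKED, DICT].
        System.out.println(iterationOrder(EncodingId.DICT, EncodingId.PRIMITIVE, EncodingId.CHUNKED));
        System.out.println(iterationOrder(EncodingId.CHUNKED, EncodingId.DICT, EncodingId.PRIMITIVE));
    }
}
```

Because the order depends only on the key set, a late `register` call after a service-loaded batch lands in the same position it would have had if registered first.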

Tests

  • WriteRegistryTest: encoderMap iterates in natural key order regardless of registration sequence; loadAll() order is sorted.
  • ReadRegistryTest: a TreeMap-backed registry retains all decoders.
  • create_withWriteRegistry_roundTrips now registers just the encoder the column needs (not loadAll(), which pulls in composite encoders unsuited to first-match leaf dispatch).

Verified locally: ./mvnw verify -pl reader,writer -am — 0 checkstyle, javadoc clean, all tests pass.

🤖 Generated with Claude Code

…Windows build)

VortexWriter.create(.., WriteRegistry) with no cascade selects the first encoder
whose accepts() matches the dtype, iterating WriteRegistry.encoderMap().values().
Two coupled defects made that selection wrong and unstable:

1. The map was Map.copyOf(HashMap) — non-deterministic iteration order — so the
   chosen encoder varied across runs and platforms.
2. ChunkedEncodingEncoder.accepts(Primitive) returned true, but encode() requires
   a ChunkedData wrapper the dtype-based write path never produces. It is not in
   DEFAULT_CODECS or the cascade and nothing dtype-dispatches to it — it is only
   reachable here. On Windows the hash order put it first for a plain long[],
   throwing ClassCastException; other platforms happened to order a leaf encoder
   first.
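
The second defect can be reproduced in miniature. The sketch below is illustrative only: the `Encoder` interface, the string dtype label, and this stripped-down `ChunkedData` are hypothetical stand-ins, not the project's types. It shows an encoder whose accepts() claims a dtype while encode() requires a wrapper the caller never built.

```java
final class AcceptsContractDemo {
    // Hypothetical stand-in for the ChunkedData wrapper.
    static final class ChunkedData {
        final long[][] chunks;
        ChunkedData(long[][] chunks) { this.chunks = chunks; }
    }

    // Hypothetical encoder contract: dtype-based accepts(), untyped encode().
    interface Encoder {
        boolean accepts(String dtype);
        int encode(Object data);
    }

    // accepts() says yes to primitives, but encode() only works on the wrapper.
    static final class ChunkedEncoder implements Encoder {
        public boolean accepts(String dtype) { return "primitive".equals(dtype); }
        public int encode(Object data) { return ((ChunkedData) data).chunks.length; }
    }

    // Returns true if encoding a plain long[] fails with ClassCastException.
    static boolean throwsOnPlainArray(Encoder e) {
        try {
            e.encode(new long[] {1L, 2L, 3L});
            return false;
        } catch (ClassCastException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        Encoder chunked = new ChunkedEncoder();
        System.out.println(chunked.accepts("primitive"));   // true
        System.out.println(throwsOnPlainArray(chunked));    // true
    }
}
```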

Fixes (root, not symptom):
- Both WriteRegistry and ReadRegistry back their maps with a TreeMap ordered by
  encoding name (Comparator.comparing(id) — Enum.compareTo is final, so natural
  enum order is ordinal and a Comparator is needed). Order is now a pure function
  of the registered set, independent of registration sequence or ServiceLoader's
  unspecified iteration order.
- ChunkedEncodingEncoder.accepts() returns false: it is carrier-dispatched
  (selected explicitly when a caller holds ChunkedData), never dtype-dispatched,
  so it must not be a first-match candidate. Its explicit-encode path is
  unchanged and still covered by ChunkedEncodingEncoderTest.
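
A minimal sketch of the ordering point, assuming a hypothetical `EncodingId` enum with an `id()` accessor that returns the encoding name (names and constants are invented for illustration). A plain TreeMap on enum keys follows declaration order; passing a Comparator switches it to name order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class NameOrderDemo {
    // Hypothetical ids, declared out of alphabetical order on purpose.
    enum EncodingId {
        PRIMITIVE("primitive"), CHUNKED("chunked"), DICT("dict");

        private final String id;
        EncodingId(String id) { this.id = id; }
        String id() { return id; }
    }

    // Fills the map with every id and returns its key iteration order.
    static List<EncodingId> keysOf(Map<EncodingId, String> map) {
        for (EncodingId e : EncodingId.values()) {
            map.put(e, e.id());
        }
        return new ArrayList<EncodingId>(map.keySet());
    }

    // Natural enum order: by ordinal, i.e. declaration order.
    static List<EncodingId> ordinalOrder() {
        return keysOf(new TreeMap<EncodingId, String>());
    }

    // Name order: requires an explicit Comparator, since Enum.compareTo is final.
    static List<EncodingId> nameOrder() {
        Comparator<EncodingId> byName = Comparator.comparing(EncodingId::id);
        return keysOf(new TreeMap<EncodingId, String>(byName));
    }

    public static void main(String[] args) {
        System.out.println(ordinalOrder()); // [PRIMITIVE, CHUNKED, DICT]
        System.out.println(nameOrder());    // [CHUNKED, DICT, PRIMITIVE]
    }
}
```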

Tests:
- WriteRegistryTest pins name-ordered iteration regardless of registration
  sequence, and that loadAll() is sorted by name.
- ReadRegistryTest covers that a TreeMap-backed registry retains all decoders.
- create_withFullRegistry_roundTrips drives the exact failing combination —
  create(loadAll()) -> writeChunk -> read back — as an end-to-end regression
  guard, rather than dodging it with a hand-picked single-encoder registry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
dfa1 force-pushed the fix-windows-writeregistry-order branch from a7aa667 to f870c9c on June 20, 2026 18:03
dfa1 merged commit 9c4ebb1 into main on Jun 20, 2026 (6 checks passed)
dfa1 deleted the fix-windows-writeregistry-order branch on June 20, 2026 18:08