State sync: doc fixes, encoding integration coverage, boxed nullable map writes#90
Merged
Conversation
Review-driven doc cleanup against current code: - compatibility.md: vortex.patched Encode ✅ (PatchedEncodingEncoder is implemented + SPI-registered; "encode not yet implemented" was stale). - README.md + compatibility.md: dependency snippet 0.7.3 → 0.8.0 (released). - ADR-0003: status now reflects Phase E (bounds typing via IoBounds) shipped, Phases A–D (message sanitization) pending; ADR.md index title widened to "sanitization + bounds typing". - ADR-0007/0010/0012/0014: normalize "Implemented" → "Completed" to match the documented status vocabulary (Proposed → Accepted → Completed | …). - TODO.md: narrow the asSlice line (IoBounds.slice shipped; only the Checkstyle rule + final sweep remain) and the sanitization line (Phases A–D remain); add a Compute section tracking ADR-0013 (was untracked). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Close three integration-coverage gaps found in the state review:
- fastlanes.delta — Java→Rust round-trip (forced DeltaEncodingEncoder over a
monotonic I64 ramp; Rust trims the padded 1024-chunk back to n rows).
- vortex.masked — direct Masked-over-primitive Java→Rust round-trip (nullable
I64 with interleaved nulls via the boxed Long[] builder path). Previously only
covered indirectly through nullable_date (Masked → Ext → Primitive).
- vortex.patched — the bundled vortex-jni build rejects a standalone patched
array ("Unknown encoding: vortex.patched"), so this is a Java write → Java
read round-trip in a new JavaRoundTripIntegrationTest. Still exercises the
writer encode + file format + reader decode end to end; four outliers (< n/20)
force the patch path.
Full integration suite: 271 green, 2 @disabled (f16/uuid, JNI-blocked).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The map entry point writeChunk(Map) only accepted raw primitive arrays, so a
nullable column had to be supplied via the builder (writeChunk(c -> c.put(...,
new Long[]{...}))). Passing a boxed Long[]/Integer[]/… through the map threw
"unsupported data type". The two entry points now share ChunkImpl.validateAndAdapt:
- writeChunk(Map) adapts every column up front (boxed nullable array → NullableData,
raw arrays pass through) before the row-count check and encoding, so the map form
routes nullable columns through MaskedEncoding exactly like the builder.
- validateAndAdapt is now package-private and idempotent: an already-adapted
NullableData passes through, since the builder pre-adapts before delegating to
writeChunk(Map) (would otherwise double-adapt and reject NullableData).
Tests: VortexWriterTest gains map-path accept (boxed Long[] with null) + reject
(boxed on non-nullable) cases; the integration masked round-trip now drives the
map form, verifying null preservation end-to-end through the JNI reader.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Follow-up from a full state review of the repo. Three threads:
1. Doc sync (
f8e743fe)compatibility.md:vortex.patchedEncode ❌→✅ —PatchedEncodingEncoderis implemented + SPI-registered; "encode not yet implemented" was stale.0.7.3→0.8.0(README + compatibility.md).IoBounds) shipped, Phases A–D pending; ADR.md index title widened.Implemented→Completed(documented status vocabulary).asSliceand sanitization lines (IoBounds shipped); add a Compute section tracking ADR-0013 (was untracked).2. Encoding integration coverage (
29c81cca)Closes three gaps found in the review:
fastlanes.delta— Java→Rust round-trip (forced encoder, monotonic I64 ramp).vortex.masked— direct Masked-over-primitive Java→Rust round-trip (nullable I64 with interleaved nulls). Previously only covered indirectly vianullable_date(Masked → Ext → Primitive).vortex.patched— the bundledvortex-jnibuild rejects a standalone patched array (Unknown encoding: vortex.patched), so this is a Java write → Java read round-trip in a newJavaRoundTripIntegrationTest. Still exercises writer encode + file format + reader decode end to end.3. Writer: boxed nullable arrays on the map path (
2180e3e8)writeChunk(Map)rejected boxed arrays (Long[], …) — nullable columns could only go through the builder. Both entry points now shareChunkImpl.validateAndAdapt(made package-private + idempotent), so the map form routes nullable columns throughMaskedEncodingtoo.Findings worth keeping
vortex.patched— same JNI-limitation class as the@Disabledf16/uuid tests. That's why patched had no integration coverage.Verify
@Disabledunchanged).🤖 Generated with Claude Code