refactor(encoding): shared PrimitiveArrays + PType.isUnsigned#136
Merged
…to core

toLongs (integer array -> long[] widen) was copy-pasted in 4 writer encoders (Delta, Bitpacked, FrameOfReference, Patched), and fromLongs (long[] -> off-heap segment) in the Delta encoder and decoder. The copies were identical but for the EncodingId in the error throw.

Since fromLongs is needed by both reader and writer, the shared home is core (io.github.dfa1.vortex.encoding), the only module both depend on. New PrimitiveArrays holds the inverse pair: toLongs(data, ptype, encoding) and fromLongs(longs, ptype, arena). The EncodingId is now a parameter so each caller keeps its own error attribution.

Deliberately NOT folded in: RleEncodingEncoder.toLongs (a superset that also raw-bit-packs F32/F64/F16), Rle.fromLongs (takes a count prefix), and Patched.fromLongs (returns a boxed java array, opposite direction). Those are genuinely different methods, left alone.

Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip round-trips pass, so the extracted widen/narrow produces byte-identical output.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
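The widen/narrow pair described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the PR: `PrimitiveArraysSketch`, the reduced `PType` enum, the `String` standing in for `EncodingId`, and the direct little-endian `ByteBuffer` standing in for the arena-allocated off-heap segment are all assumptions made so the example is self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative stand-in for PrimitiveArrays; every name here is hypothetical. */
final class PrimitiveArraysSketch {

    /** Hypothetical ptype subset: byte width, signedness, integer-or-float. */
    enum PType {
        U8(1, false, true), I8(1, true, true),
        U16(2, false, true), I16(2, true, true),
        U32(4, false, true), I32(4, true, true),
        U64(8, false, true), I64(8, true, true),
        F32(4, true, false);

        final int byteWidth;
        final boolean signed;
        final boolean integer;

        PType(int byteWidth, boolean signed, boolean integer) {
            this.byteWidth = byteWidth;
            this.signed = signed;
            this.integer = integer;
        }
    }

    /** Widens a byte[]/short[]/int[]/long[] to long[]; `encoding` only labels the error. */
    static long[] toLongs(Object data, PType ptype, String encoding) {
        if (!ptype.integer) {
            throw new IllegalArgumentException(encoding + ": cannot widen " + ptype + " to long[]");
        }
        switch (ptype.byteWidth) {
            case 1: {
                byte[] src = (byte[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    // Signed types sign-extend; unsigned types are masked (zero-extended).
                    out[i] = ptype.signed ? src[i] : (src[i] & 0xFFL);
                }
                return out;
            }
            case 2: {
                short[] src = (short[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    out[i] = ptype.signed ? src[i] : (src[i] & 0xFFFFL);
                }
                return out;
            }
            case 4: {
                int[] src = (int[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    out[i] = ptype.signed ? src[i] : (src[i] & 0xFFFFFFFFL);
                }
                return out;
            }
            default:
                // 64-bit input: assumed to pass through as the same array instance.
                return (long[]) data;
        }
    }

    /** Narrows each long to the ptype's width (low bytes only), little-endian, off-heap. */
    static ByteBuffer fromLongs(long[] longs, PType ptype) {
        if (!ptype.integer) {
            // Assumption: the source does not say how fromLongs treats float ptypes.
            throw new IllegalArgumentException("cannot narrow long[] to " + ptype);
        }
        ByteBuffer out = ByteBuffer.allocateDirect(longs.length * ptype.byteWidth)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (long v : longs) {
            switch (ptype.byteWidth) {
                case 1: out.put((byte) v); break;
                case 2: out.putShort((short) v); break;
                case 4: out.putInt((int) v); break;
                default: out.putLong(v); break;
            }
        }
        out.flip();
        return out;
    }
}
```

Passing the encoding label as a parameter is what lets a single shared helper still report which encoder rejected a floating-point ptype, which is the error-attribution point the commit message makes.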
… helpers

isUnsigned(ptype) was copy-pasted in the Delta, Bitpacked and FrameOfReference encoders. It is exactly the complement of the existing PType.isSigned() (every non-unsigned ptype is a signed integer or floating-point), so add PType.isUnsigned() returning !isSigned() and route the call sites through it.

Tests:
- PTypeTest: isUnsigned true/false partitions + an exact-complement-of-isSigned property over every enum constant.
- PrimitiveArraysTest (new, for the prior extraction): toLongs sign/zero-extension per width, I64 passthrough identity, floating-ptype throw carrying the caller's EncodingId; fromLongs round-trip through toLongs, little-endian I64 layout, and low-byte-only narrowing.

Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
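A minimal sketch of the complement relationship, assuming a simplified enum: `PTypeSketch` and its constant list are hypothetical, and only the rule that `isSigned()` is true for signed integers and floating-point types (so `isUnsigned()` can be its plain negation) comes from the commit message.

```java
/** Hypothetical stand-in for PType; the constant list is assumed, not taken from the PR. */
enum PTypeSketch {
    U8(false), U16(false), U32(false), U64(false),
    I8(true), I16(true), I32(true), I64(true),
    F16(true), F32(true), F64(true);

    private final boolean signed;

    PTypeSketch(boolean signed) {
        this.signed = signed;
    }

    /** True for signed integers and floating-point types. */
    boolean isSigned() {
        return signed;
    }

    /** Exact complement of isSigned(), so the two can never disagree. */
    boolean isUnsigned() {
        return !isSigned();
    }
}
```

Defining one predicate as the negation of the other makes the exact-complement property hold by construction for every constant, including any added later.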
Two encoding-helper dedups, both ground-truth verified.
PrimitiveArrays (core)
toLongs (int array → long[] widen) was copy-pasted in 4 writer encoders (Delta, Bitpacked, FrameOfReference, Patched); fromLongs (long[] → off-heap segment) in the Delta encoder + decoder. Since fromLongs is needed by both reader and writer, the shared home is core (io.github.dfa1.vortex.encoding). New PrimitiveArrays holds the inverse pair; the EncodingId is now a parameter so each caller keeps its own error attribution.
Left alone (genuinely different methods):
- Rle.toLongs (superset: also raw-bit-packs F32/F64/F16)
- Rle.fromLongs (count-prefix param)
- Patched.fromLongs (returns a boxed java array: the narrow-to-array inverse, single-use)
PType.isUnsigned
isUnsigned was copy-pasted in 3 encoders and is exactly !isSigned(). Added PType.isUnsigned(), dropped the copies.
Tests
- PrimitiveArraysTest: per-width sign/zero-extension, I64 passthrough identity, floating-ptype throw carrying the caller's EncodingId; fromLongs round-trip, little-endian I64 layout, low-byte narrowing.
- PTypeTest: isUnsigned partitions + exact-complement-of-isSigned property over every constant.
Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip.
🤖 Generated with Claude Code