refactor(encoding): shared PrimitiveArrays + PType.isUnsigned#136

Merged: dfa1 merged 2 commits into main from refactor/primitive-arrays-dedup on Jun 22, 2026
dfa1 (Owner) commented Jun 22, 2026

Two encoding-helper dedups, both ground-truth verified.

PrimitiveArrays (core)

toLongs (integer array → long[] widening) was copy-pasted in 4 writer encoders (Delta, Bitpacked, FrameOfReference, Patched), and fromLongs (long[] → off-heap segment) in the Delta encoder and decoder. The copies were identical except for the EncodingId in the error throw. Since fromLongs is needed by both reader and writer, the shared home is core (io.github.dfa1.vortex.encoding), the only module both depend on. New PrimitiveArrays holds the inverse pair, toLongs(data, ptype, encoding) and fromLongs(longs, ptype, arena); the EncodingId is now a parameter so each caller keeps its own error attribution.

Left alone (genuinely different methods): Rle.toLongs (a superset that also raw-bit-packs F32/F64/F16), Rle.fromLongs (takes a count-prefix parameter), Patched.fromLongs (returns a boxed Java array, the narrow-to-array inverse, used in a single place).
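To illustrate the widening half of the pair, here is a minimal sketch, not the code from this PR. `DemoPType`, `ToLongsSketch`, the `Object` data parameter, and the `String` encoding id are all hypothetical stand-ins, and treating U64 as a passthrough like I64 is an assumption. It shows sign-extension for signed widths, zero-extension for unsigned widths, and an error that names the calling encoding.

```java
// Hypothetical stand-in for the project's PType; the real enum is not shown in this PR.
enum DemoPType { U8, U16, U32, U64, I8, I16, I32, I64, F32 }

final class ToLongsSketch {
    private ToLongsSketch() {}

    // Widens an integer array to long[]: signed widths are sign-extended,
    // unsigned widths are zero-extended by masking off the high bits.
    // encodingId is assumed to be a plain String; it only labels the error.
    static long[] toLongs(Object data, DemoPType ptype, String encodingId) {
        switch (ptype) {
            case I8:
            case U8: {
                byte[] src = (byte[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    out[i] = ptype == DemoPType.U8 ? (src[i] & 0xFFL) : src[i];
                }
                return out;
            }
            case I16:
            case U16: {
                short[] src = (short[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    out[i] = ptype == DemoPType.U16 ? (src[i] & 0xFFFFL) : src[i];
                }
                return out;
            }
            case I32:
            case U32: {
                int[] src = (int[]) data;
                long[] out = new long[src.length];
                for (int i = 0; i < src.length; i++) {
                    out[i] = ptype == DemoPType.U32 ? (src[i] & 0xFFFFFFFFL) : src[i];
                }
                return out;
            }
            case I64:
            case U64:
                // Already 64 bits wide: return the same array (passthrough identity).
                return (long[]) data;
            default:
                // Floating-point ptypes are rejected; the caller's id attributes the error.
                throw new IllegalArgumentException(encodingId + ": cannot widen ptype " + ptype);
        }
    }
}
```

Passing the encoding id as an argument is what lets one shared helper replace the per-encoder copies while each caller's failures still point back to that caller.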

PType.isUnsigned

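isUnsigned was copy-pasted in 3 encoders (Delta, Bitpacked, FrameOfReference) and is exactly !isSigned(), since every non-unsigned ptype is a signed integer or floating-point. Added PType.isUnsigned(), dropped the copies.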
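The complement relationship can be sketched as follows. `SignednessPType` is a hypothetical enum, and its constant list is an assumption; only the F16/F32/F64 and I64 names appear in this PR. What it shows is that isUnsigned() is defined purely in terms of isSigned(), with floating-point types counted as signed.

```java
// Hypothetical enum; the constant list is an assumption, not the project's actual PType.
enum SignednessPType {
    U8(false), U16(false), U32(false), U64(false),
    I8(true), I16(true), I32(true), I64(true),
    F16(true), F32(true), F64(true); // floating-point counts as signed

    private final boolean signed;

    SignednessPType(boolean signed) {
        this.signed = signed;
    }

    boolean isSigned() {
        return signed;
    }

    // Exact complement of isSigned(), so the two can never disagree.
    boolean isUnsigned() {
        return !isSigned();
    }
}
```

Deriving one predicate from the other means a newly added constant only needs its signedness declared once.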

Tests

  • PrimitiveArraysTest: per-width sign/zero-extension, I64 passthrough identity, floating-ptype throw carrying the caller's EncodingId; fromLongs round-trip, little-endian I64 layout, low-byte narrowing.
  • PTypeTest: isUnsigned partitions + exact-complement-of-isSigned property over every constant.
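The little-endian layout and low-byte narrowing checked by PrimitiveArraysTest can be illustrated with a small sketch. It makes two loudly stated assumptions: a direct `ByteBuffer` stands in for the arena-allocated off-heap segment, and a byte width replaces the ptype parameter. `FromLongsSketch` is a hypothetical name, not the PR's class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FromLongsSketch {
    private FromLongsSketch() {}

    // Narrows each long to byteWidth bytes (keeping only the low bytes) and
    // writes it little-endian into an off-heap buffer.
    // Assumption: a direct ByteBuffer stands in for the project's off-heap segment.
    static ByteBuffer fromLongs(long[] longs, int byteWidth) {
        if (byteWidth != 1 && byteWidth != 2 && byteWidth != 4 && byteWidth != 8) {
            throw new IllegalArgumentException("unsupported width: " + byteWidth);
        }
        int size = Math.multiplyExact(longs.length, byteWidth);
        ByteBuffer out = ByteBuffer.allocateDirect(size).order(ByteOrder.LITTLE_ENDIAN);
        for (long v : longs) {
            switch (byteWidth) {
                case 1: out.put((byte) v); break;       // keeps the low 8 bits
                case 2: out.putShort((short) v); break; // keeps the low 16 bits
                case 4: out.putInt((int) v); break;     // keeps the low 32 bits
                default: out.putLong(v);                // full 64 bits
            }
        }
        out.flip(); // make the written bytes readable from position 0
        return out;
    }
}
```

For example, narrowing `0x1FF` to one byte keeps only `0xFF`, and an 8-byte value `0x0102030405060708` is stored with `0x08` at offset 0.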

Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip.

🤖 Generated with Claude Code

dfa1 and others added 2 commits June 22, 2026 22:53
…to core

toLongs (integer array -> long[] widen) was copy-pasted in 4 writer encoders
(Delta, Bitpacked, FrameOfReference, Patched), and fromLongs (long[] -> off-heap
segment) in the Delta encoder and decoder — identical but for the EncodingId in
the error throw. Since fromLongs is needed by both reader and writer, the shared
home is core (io.github.dfa1.vortex.encoding), the only module both depend on.

New PrimitiveArrays holds the inverse pair: toLongs(data, ptype, encoding) and
fromLongs(longs, ptype, arena). The EncodingId is now a parameter so each caller
keeps its own error attribution.

Deliberately NOT folded in: RleEncodingEncoder.toLongs (a superset that also
raw-bit-packs F32/F64/F16), Rle.fromLongs (takes a count prefix), and
Patched.fromLongs (returns a boxed Java array, opposite direction). Those are
genuinely different methods, left alone.

Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip round-trips pass,
so the extracted widen/narrow produces byte-identical output.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… helpers

isUnsigned(ptype) was copy-pasted in the Delta, Bitpacked and FrameOfReference
encoders. It is exactly the complement of the existing PType.isSigned() (every
non-unsigned ptype is a signed integer or floating-point), so add PType.isUnsigned()
returning !isSigned() and route the call sites through it.

Tests:
- PTypeTest: isUnsigned true/false partitions + an exact-complement-of-isSigned
  property over every enum constant.
- PrimitiveArraysTest (new, for the prior extraction): toLongs sign/zero-extension
  per width, I64 passthrough identity, floating-ptype throw carrying the caller's
  EncodingId; fromLongs round-trip through toLongs, little-endian I64 layout, and
  low-byte-only narrowing.

Ground truth green: JavaWritesRustReads (213) + JavaRoundTrip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dfa1 dfa1 merged commit d8f8408 into main Jun 22, 2026
6 checks passed
@dfa1 dfa1 deleted the refactor/primitive-arrays-dedup branch June 22, 2026 21:03