spike(writer): FFM binding to system libzstd vs aircompressor#163

Closed
dfa1 wants to merge 1 commit into main from spike/zstd-ffm

dfa1 commented Jun 25, 2026

What

Throwaway spike (test scope only) probing whether core's vortex.zstd codec can move off the unmaintained aircompressor onto native libzstd via FFM downcalls — no JNI, no Unsafe.

Motivation: aircompressor is effectively unmaintained (PRs/issues ignored), yet it's load-bearing in core — both ZstdEncodingEncoder and ZstdEncodingDecoder ride it. zstd-jni was ruled out (its bundled blob statically links libzstd with the ZSTD_* C symbols hidden — only the JNI entry points are exported, so FFM can't borrow it; and it's JNI anyway).

How

  • ZstdNative — binds ZSTD_compressBound/compress/decompress/isError through Linker downcalls. Resolves the system libzstd from well-known paths (-Dzstd.lib.path=... override); available() reports load success so callers can fall back to the pure-Java path when no native lib is present.
  • ZstdFfmSpikeTest — measures ratio/speed on real OHLC column bytes and asserts frame compatibility both directions. Skips when no system libzstd is present.

Spike copies through heap byte[] for an apples-to-apples comparison; a real codec would work straight on arena segments.
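A minimal sketch of what such a binding could look like, assuming JDK 22 or newer (finalized FFM API) and a 64-bit platform where `size_t` maps to `JAVA_LONG`. The class name `ZstdNativeSketch`, the candidate library names, and the choice to pass the original size into `decompress` are illustrative assumptions, not the spike's actual code. Only the four `ZSTD_*` functions and the `zstd.lib.path` property come from the description above. Like the spike, it copies through heap `byte[]`.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.invoke.MethodHandle;

import static java.lang.foreign.ValueLayout.ADDRESS;
import static java.lang.foreign.ValueLayout.JAVA_BYTE;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

final class ZstdNativeSketch {
    // Assumed candidate names/paths; the spike's real list is not shown in the PR.
    private static final String[] CANDIDATES = {
        "libzstd.so.1", "libzstd.so", "libzstd.dylib", "/opt/homebrew/lib/libzstd.dylib"
    };
    private static final MethodHandle BOUND, COMPRESS, DECOMPRESS, IS_ERROR;

    static {
        MethodHandle bound = null, compress = null, decompress = null, isError = null;
        SymbolLookup lib = load();
        if (lib != null) {
            try {
                Linker linker = Linker.nativeLinker();
                // size_t ZSTD_compressBound(size_t srcSize)
                bound = linker.downcallHandle(lib.find("ZSTD_compressBound").orElseThrow(),
                        FunctionDescriptor.of(JAVA_LONG, JAVA_LONG));
                // size_t ZSTD_compress(void* dst, size_t dstCap, const void* src, size_t srcSize, int level)
                compress = linker.downcallHandle(lib.find("ZSTD_compress").orElseThrow(),
                        FunctionDescriptor.of(JAVA_LONG, ADDRESS, JAVA_LONG, ADDRESS, JAVA_LONG, JAVA_INT));
                // size_t ZSTD_decompress(void* dst, size_t dstCap, const void* src, size_t compressedSize)
                decompress = linker.downcallHandle(lib.find("ZSTD_decompress").orElseThrow(),
                        FunctionDescriptor.of(JAVA_LONG, ADDRESS, JAVA_LONG, ADDRESS, JAVA_LONG));
                // unsigned ZSTD_isError(size_t code)
                isError = linker.downcallHandle(lib.find("ZSTD_isError").orElseThrow(),
                        FunctionDescriptor.of(JAVA_INT, JAVA_LONG));
            } catch (RuntimeException e) {
                bound = compress = decompress = isError = null; // treat as "not available"
            }
        }
        BOUND = bound; COMPRESS = compress; DECOMPRESS = decompress; IS_ERROR = isError;
    }

    private static SymbolLookup load() {
        String override = System.getProperty("zstd.lib.path");
        if (override != null) {
            SymbolLookup lib = tryLoad(override);
            if (lib != null) return lib;
        }
        for (String candidate : CANDIDATES) {
            SymbolLookup lib = tryLoad(candidate);
            if (lib != null) return lib;
        }
        return null;
    }

    private static SymbolLookup tryLoad(String nameOrPath) {
        try {
            return SymbolLookup.libraryLookup(nameOrPath, Arena.global());
        } catch (RuntimeException e) {
            return null; // library missing or native access denied
        }
    }

    /** True when libzstd was loaded and all four symbols resolved. */
    static boolean available() { return IS_ERROR != null; }

    static byte[] compress(byte[] src, int level) {
        if (!available()) throw new IllegalStateException("libzstd not available");
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment in = arena.allocateFrom(JAVA_BYTE, src);   // heap -> native copy
            long cap = (long) BOUND.invokeExact((long) src.length);
            MemorySegment out = arena.allocate(cap);
            long n = (long) COMPRESS.invokeExact(out, cap, in, (long) src.length, level);
            return out.asSlice(0, check(n)).toArray(JAVA_BYTE);      // native -> heap copy
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static byte[] decompress(byte[] frame, int originalSize) {
        if (!available()) throw new IllegalStateException("libzstd not available");
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment in = arena.allocateFrom(JAVA_BYTE, frame);
            MemorySegment out = arena.allocate(originalSize);
            long n = (long) DECOMPRESS.invokeExact(out, (long) originalSize, in, (long) frame.length);
            return out.asSlice(0, check(n)).toArray(JAVA_BYTE);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Returns the code unchanged when it is a size, throws when libzstd flags it as an error.
    private static long check(long code) throws Throwable {
        if ((int) IS_ERROR.invokeExact(code) != 0) throw new IllegalStateException("zstd error code " + code);
        return code;
    }
}
```

Callers would guard every use with `ZstdNativeSketch.available()`. On recent JDKs the restricted FFM calls print a native-access warning unless the JVM is started with `--enable-native-access`.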

Findings (1M rows, libzstd 1.5.7)

Frame-compatible both ways (test passes): native and aircompressor decode each other's frames → the on-disk vortex.zstd format is unchanged regardless of which side writes it. This is the gate, and it's green.

| | aircompressor (Java) | native L3 | native L9 |
|---|---|---|---|
| compress F64 | 214 MB/s, 3.59× | 464 MB/s, 3.60× | 87 MB/s, 3.72× |
| compress I64 | 248 MB/s, 2.41× | 390 MB/s | 75 MB/s, 2.47× |
| decompress F64 | 962 MB/s | 1334 MB/s | |
| decompress I64 | 1000 MB/s | 1080 MB/s | |

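  • Compress: native L3 is roughly 1.6–2.2× faster (390 vs 248 MB/s on I64, 464 vs 214 MB/s on F64) at essentially equal ratio on F64 (3.59× vs 3.60×); higher levels give a real ratio knob aircompressor lacks.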
  • Decompress: native 1.1–1.4× faster — and understated, since the spike still bounces through heap byte[]. A real decoder writing into the scan arena would be faster and satisfy the off-heap rule (today's decoder violates it with toArray() + new byte[]).
  • Dictionaries: native supports them; the pure-Java decoder explicitly throws on dictionary_size != 0. The wire field already exists.
  • Cost: needs system libzstd present → keep aircompressor as graceful fallback when absent.
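
The graceful-fallback idea from the last bullet can be sketched as a small selection helper. This is an illustration under assumptions: `CodecSelection` and its parameter names are hypothetical, and the codec type is left generic because the real encoder/decoder interfaces are not shown here. Suppliers keep the native codec from being constructed (and its class from being initialized) when no libzstd is present.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class CodecSelection {
    private CodecSelection() {}

    /**
     * Picks the native codec when the availability probe succeeds,
     * otherwise the pure-Java one. Only the chosen supplier is invoked.
     */
    static <C> C select(BooleanSupplier nativeAvailable,
                        Supplier<C> nativeCodec,
                        Supplier<C> pureJavaCodec) {
        return nativeAvailable.getAsBoolean() ? nativeCodec.get() : pureJavaCodec.get();
    }
}
```

A promoted codec might call this once at startup, passing something like `ZstdNative::available` as the probe and the aircompressor-backed implementation as the fallback.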

Not in scope (follow-up if we promote)

Move ZstdNative to core/main, wire ZstdEncodingEncoder/Decoder to it with aircompressor fallback, decode straight into the arena, add dictionary support. Bundling libzstd (vs require-system-lib) is a separate packaging decision.

🤖 Generated with Claude Code

dfa1 commented Jun 26, 2026

This will be closed in favor of using this library: https://github.com/dfa1/zstd-java (JDK 25 + MemorySegment native)

Throwaway exploration (test scope) probing whether core's vortex.zstd
codec can move off the unmaintained aircompressor onto native libzstd
via FFM downcalls (no JNI, no Unsafe).

ZstdNative binds ZSTD_compressBound/compress/decompress/isError through
Linker, resolving the system libzstd (-Dzstd.lib.path override; graceful
available() check for fallback). ZstdFfmSpikeTest measures ratio/speed on
real OHLC column bytes and asserts frame compatibility both directions.

Findings (1M rows, libzstd 1.5.7): native and aircompressor decode each
other's frames, so the on-disk format is unchanged. Native compress ~2x
faster at equal ratio (level 3), with a real ratio knob at higher levels;
decompress 1.1-1.4x faster even through the spike's heap bounce. Native
also supports dictionaries, which the pure-Java decoder rejects.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

dfa1 force-pushed the spike/zstd-ffm branch from 41989fb to 066df0a June 26, 2026 12:25
dfa1 closed this Jun 26, 2026
dfa1 deleted the spike/zstd-ffm branch June 26, 2026 21:11