Skip to content

WIP: investigate compression flag propagation (issue #49)#50

Open
ZaGrayWolf wants to merge 2 commits into
compiler-research:developfrom
ZaGrayWolf:fix/compression-override
Open

WIP: investigate compression flag propagation (issue #49)#50
ZaGrayWolf wants to merge 2 commits into
compiler-research:developfrom
ZaGrayWolf:fix/compression-override

Conversation

@ZaGrayWolf
Copy link
Copy Markdown

Related to #49

Initial attempt to respect CLI compression flag by removing override in SamToNTuple.

However, further inspection shows compression is still overridden in RAMNTupleRecord.cxx (SetCompression(505)), so this PR is incomplete.

Opening as draft for visibility and discussion on correct propagation of compression settings across layers.

Replaces hardcoded 505 (ZSTD L5) with user-selectable algorithm and level.
--compression accepts: zstd (default), lz4, lzma, zlib
--level accepts 1-19 (default 5)
Compression setting threaded through as level*100 + algo_id per ROOT convention.

Tested on SRR062634 (309,689 reads, 85MB SAM):
  ZSTD L5 (default): 23MB, 6s
  LZ4:               23MB, 3s
  ZSTD L1:           25MB, 4s

Closes compiler-research#49
@ZaGrayWolf ZaGrayWolf marked this pull request as ready for review April 12, 2026 17:28
@ZaGrayWolf
Copy link
Copy Markdown
Author

Problem (issue #49)

samtoramntuple hardcoded SetCompression(505) (ZSTD level 5) in two places, so compression was always ZSTD regardless of user input. There was also no --compression CLI argument — the flag didn’t exist.

Fix

Added --compression zstd|lz4|lzma|zlib and --level 1-19 flags. Compression is now computed as level * 100 + algo_id (ROOT convention) and used in both the single-file and split-by-chromosome paths. Also fixes a build issue: ROOT::Experimental::RNTupleParallelWriter → ROOT::RNTupleParallelWriter for ROOT 6.38.

Tested on SRR062634 (309,689 reads, 85MB SAM)

ZSTD level 5 (default): 23MB in 6s. LZ4: 23MB in 3s (same ratio, faster). ZSTD level 1: 25MB in 4s (slightly larger, as expected). The flag now works — previously all outputs were identical due to the hardcode.

Closes #49

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant