WIP: investigate compression flag propagation (issue #49)#50
WIP: investigate compression flag propagation (issue #49)#50ZaGrayWolf wants to merge 2 commits into
Conversation
Replaces hardcoded 505 (ZSTD L5) with user-selectable algorithm and level. --compression accepts: zstd (default), lz4, lzma, zlib --level accepts 1-19 (default 5) Compression setting threaded through as level*100 + algo_id per ROOT convention. Tested on SRR062634 (309,689 reads, 85MB SAM): ZSTD L5 (default): 23MB, 6s LZ4: 23MB, 3s ZSTD L1: 25MB, 4s Closes compiler-research#49
|
Problem (issue #49) samtoramntuple hardcoded SetCompression(505) (ZSTD level 5) in two places, so compression was always ZSTD regardless of user input. There was also no --compression CLI argument — the flag didn’t exist. Fix Added --compression zstd|lz4|lzma|zlib and --level 1-19 flags. Compression is now computed as level * 100 + algo_id (ROOT convention) and used in both the single-file and split-by-chromosome paths. Also fixes a build issue: ROOT::Experimental::RNTupleParallelWriter → ROOT::RNTupleParallelWriter for ROOT 6.38. Tested on SRR062634 (309,689 reads, 85MB SAM) ZSTD level 5 (default): 23MB in 6s. LZ4: 23MB in 3s (same ratio, faster). ZSTD level 1: 25MB in 4s (slightly larger, as expected). The flag now works — previously all outputs were identical due to the hardcode. Closes #49 |
Related to #49
Initial attempt to respect CLI compression flag by removing override in SamToNTuple.
However, further inspection shows compression is still overridden in RAMNTupleRecord.cxx (SetCompression(505)), so this PR is incomplete.
Opening as draft for visibility and discussion on correct propagation of compression settings across layers.