Compression CLI flag ineffective - overridden to ZSTD across multiple layers (SamToNTuple + RAMNTupleRecord) #49

@ZaGrayWolf

Description

The samtoramntuple CLI accepts a compression algorithm argument (zstd, lz4, lzma), but the output size remains identical regardless of the selected option.

Reproduction

Commands:
./tools/samtoramntuple sample.sam zstd.ram zstd
./tools/samtoramntuple sample.sam lz4.ram lz4
./tools/samtoramntuple sample.sam lzma.ram lzma

Observed output sizes:
zstd.ram → 22 MB
lz4.ram → 22 MB
lzma.ram → 22 MB

Code Analysis

Compression parameters are initially passed correctly:

src/ramcore/SamToNTuple.cxx:
writeOptions.SetCompression(compression_algorithm);

However, compression is overridden in multiple places:

  1. In SamToNTuple.cxx:
    writeOptions.SetCompression(ROOT::RCompressionSetting::EAlgorithm::kZSTD, 1);

  2. In src/rntuple/RAMNTupleRecord.cxx:
    writeOptions.SetCompression(505); // ZSTD level 5

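As a result, the CLI-selected compression algorithm is ignored and every output is written with ZSTD at a hardcoded level (1 or 5, depending on which override takes effect last), which is consistent with the identical file sizes observed above.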
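For readers unfamiliar with the integer form used in the second override, here is a minimal sketch of how a ROOT-style compression setting unpacks into algorithm and level (the packing is algorithm * 100 + level). The names `DecodedSetting`, `DecodeSetting`, and `AlgorithmName` are hypothetical helpers for illustration and are not part of this repository or of ROOT; the numeric IDs mirror ROOT's algorithm enum so the snippet compiles without ROOT.

```cpp
#include <string>

// Algorithm and level unpacked from a single ROOT-style compression setting.
// Hypothetical type, for illustration only.
struct DecodedSetting {
    int algorithm;  // numeric algorithm ID
    int level;      // compression level
};

// A setting is packed as algorithm * 100 + level, so 505 is algorithm 5, level 5.
DecodedSetting DecodeSetting(int setting) {
    return DecodedSetting{setting / 100, setting % 100};
}

// Names for the algorithm IDs ROOT assigns (1 = zlib, 2 = lzma, 4 = lz4, 5 = zstd).
std::string AlgorithmName(int algorithm) {
    switch (algorithm) {
        case 1: return "zlib";
        case 2: return "lzma";
        case 4: return "lz4";
        case 5: return "zstd";
        default: return "unknown";
    }
}
```

With these helpers, `DecodeSetting(505)` gives algorithm 5 (zstd) at level 5, and the two-argument override with kZSTD and level 1 corresponds to the packed value 501. Neither value depends on the CLI argument.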

Expected Behavior

The compression algorithm provided via CLI should determine the compression used in the output file.

Actual Behavior

Compression is always ZSTD regardless of CLI input.

Environment

  • macOS (Apple Silicon ARM64)
  • Test dataset: ~85 MB SAM (~309k reads)

Suggested Fix

Remove or conditionally apply hardcoded compression overrides so that CLI arguments are respected consistently across layers.
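One possible shape for the fix, as a rough sketch rather than a patch against the actual code: resolve the CLI name to a single setting once, then have the lower layers fall back to their default only when the caller supplied nothing. `SettingFromCliName` and `ResolveSetting` are hypothetical names, the `level` parameter is an assumption (the CLI as described only takes an algorithm name), and the fallback of 505 simply mirrors the value currently hardcoded in RAMNTupleRecord.cxx.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: turn the CLI algorithm name into a ROOT-style setting
// (algorithm * 100 + level). Returns std::nullopt for names it does not know.
std::optional<int> SettingFromCliName(const std::string& name, int level) {
    int algorithm = 0;
    if (name == "zstd") {
        algorithm = 5;
    } else if (name == "lz4") {
        algorithm = 4;
    } else if (name == "lzma") {
        algorithm = 2;
    } else {
        return std::nullopt;
    }
    return algorithm * 100 + level;
}

// Hypothetical policy for the lower layers: use the caller's setting when one
// was provided, and apply the default (assumed here to be 505) only otherwise.
int ResolveSetting(std::optional<int> requested, int fallback = 505) {
    return requested.value_or(fallback);
}
```

Under this sketch, the resolved integer would be what gets passed to `writeOptions.SetCompression(...)` in both SamToNTuple.cxx and RAMNTupleRecord.cxx, so the hardcoded calls become a fallback instead of an unconditional override.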
