diff --git a/code/reference_data/rss_ld_sketch.ipynb b/code/reference_data/rss_ld_sketch.ipynb
index 5c266023..53b626c5 100644
--- a/code/reference_data/rss_ld_sketch.ipynb
+++ b/code/reference_data/rss_ld_sketch.ipynb
@@ -10,30 +10,30 @@
     "# RSS LD Sketch Pipeline\n",
     "\n",
     "## Overview\n",
-    "This pipeline generates a **stochastic genotype sample** U from whole-genome sequencing VCF files and stores it as a PLINK2 pgen file for use as an LD reference panel.\n",
     "\n",
-    "**Terminology:**\n",
-    "- **Stochastic genotype sample**: U = W<sup>T</sup> G, shape (B × p). B pseudo-individuals × p variants. This is what we store in the pgen file (after scaling to $[0,2]$).\n",
-    "- **LD sketch** (computed downstream, never stored here): R̂ = U<sup>T</sup>U / B, computed by PLINK2 or SuSiE-RSS when LD is requested from the pgen.\n",
+    "This pipeline generates a stochastic genotype sample **U = WᵀG** from whole-genome sequencing VCF files and stores it as a PLINK2 pgen file for use as an LD reference panel with SuSiE-RSS fine-mapping.\n",
     "\n",
-    "**Steps:**\n",
-    "1. Generate projection matrix W once from total cohort sample size n (saved as `.npy`, shared across all chromosomes)\n",
-    "2. Read LD block BED file and emit one coordinate file per block (`process_block_1`)\n",
-    "3. For each block in parallel: load VCF, compute U = W<sup>T</sup>G, scale to [0,2], write compressed dosage file + allele frequencies — no pgen yet (`process_block_2`)\n",
-    "4. For each chromosome: convert per-block dosage files to pgen, merge into one per-chromosome pgen, clean up all intermediate block files (`merge_chrom`)\n",
+    "**Key idea:** Rather than storing the full genotype matrix G (n × p, ~530 MB/block), we compute U = WᵀG (B × p, ~320 MB/block) using a random projection matrix W ~ N(0, 1/√n). The approximate LD matrix R̂ = UᵀU / B ≈ R = GᵀG / n by the Johnson-Lindenstrauss lemma. G is never stored.\n",
     "\n",
-    "**Matrix convention:**\n",
-    "- G : (n × p) — individuals × variants\n",
-    "- W : (n × B) — individuals × pseudo-individuals, W ~ N(0, 1/√n)\n",
-    "- U : (B × p) — stochastic genotype sample = W<sup>T</sup> G\n",
+    "**Matrix dimensions:**\n",
+    "- G : (n × p) — n individuals × p variants (~8,000 per block)\n",
+    "- W : (n × B) — projection matrix, W ~ N(0, 1/√n), generated once per cohort\n",
+    "- U : (B × p) — stochastic genotype sample = WᵀG, stored in pgen\n",
+    "- R̂ : (p × p) — approximate LD matrix, computed on-the-fly by SuSiE-RSS from U\n",
+    "\n",
+    "**Pipeline steps:**\n",
+    "1. `generate_W` — generate W once from cohort sample size n (saved as `.npy`, shared across all chromosomes)\n",
+    "2. `process_block_1` — read LD block BED file, emit one coordinate file per block\n",
+    "3. `process_block_2` — for each block in parallel: load VCF, compute U = WᵀG, scale to [0,2], write compressed dosage + allele frequencies\n",
+    "4. `merge_chrom` — for each chromosome: convert per-block dosage files to sorted pgen, merge into one per-chromosome pgen, clean up all intermediates\n",
     "\n",
     "## Input Files\n",
     "\n",
     "| File | Description |\n",
     "|------|-------------|\n",
-    "| VCF (.bgz) | Per-chromosome VCF files with tabix index. If the cohort contains mixed ancestries, provide `--sample-list` to restrict to one source population. |\n",
-    "| `sample_list.txt` | Optional. One sample ID per line. If provided, only these samples are loaded from the VCF. |\n",
-    "| `EUR_LD_blocks.bed` | LD block BED file (0-based half-open, e.g. Berisa & Pickrell EUR blocks). |\n",
+    "| VCF (.bgz) | Per-chromosome VCF files with tabix index |\n",
+    "| `sample_list.txt` | Optional. One sample ID per line. Leave empty to use all samples. |\n",
+    "| `EUR_LD_blocks.bed` | LD block BED file (0-based half-open, e.g. Berisa & Pickrell EUR blocks) |\n",
     "\n",
     "## Parameters\n",
     "\n",
@@ -41,64 +41,55 @@
     "\n",
     "| Parameter | Default | Description |\n",
     "|-----------|---------|-------------|\n",
-    "| `plink2_bin` | `plink2` | Path to plink2 binary, or just `plink2` if on `$PATH` |\n",
+    "| `plink2_bin` | `plink2` | Path to plink2 binary |\n",
+    "| `walltime` | `24:00:00` | Per-task walltime |\n",
+    "| `mem` | `32G` | Per-task memory |\n",
+    "| `numThreads` | `8` | Per-task CPU cores |\n",
     "\n",
-    "**`generate_W` workflow parameters:**\n",
+    "**`generate_W` parameters:**\n",
     "\n",
     "| Parameter | Default | Description |\n",
     "|-----------|---------|-------------|\n",
-    "| `n_samples` | required | Total number of individuals in the cohort (determines W shape) |\n",
+    "| `n_samples` | required | Total number of individuals in the cohort |\n",
     "| `output_dir` | required | Output directory for `W_B{B}.npy` |\n",
-    "| `B` | `10000` | Number of pseudo-individuals (sketch dimension) |\n",
-    "| `seed` | `123` | Random seed for W generation |\n",
+    "| `B` | `10000` | Sketch dimension (number of pseudo-individuals) |\n",
+    "| `seed` | `123` | Random seed |\n",
     "\n",
-    "**`process_block` workflow parameters:**\n",
+    "**`process_block` parameters:**\n",
     "\n",
     "| Parameter | Default | Description |\n",
     "|-----------|---------|-------------|\n",
-    "| `vcf_base` | required | Directory containing VCF (.bgz) files, one or more per chromosome |\n",
-    "| `vcf_prefix` | required | Shared filename prefix of all VCF files (everything before `chr{N}:` or `chr{N}.`) |\n",
-    "| `ld_block_file` | required | Path to LD block BED file (e.g. `EUR_LD_blocks.bed`) |\n",
-    "| `chrom` | `0` | Chromosome number (1–22), or 0 for all chromosomes |\n",
+    "| `vcf_base` | required | Directory containing VCF (.bgz) files |\n",
+    "| `vcf_prefix` | required | Shared filename prefix of all VCF files |\n",
+    "| `ld_block_file` | required | Path to LD block BED file |\n",
+    "| `chrom` | `0` | Chromosome 1–22, or 0 for all chromosomes |\n",
     "| `cohort_id` | `ADSP.R5.EUR` | Cohort prefix for output filenames |\n",
     "| `output_dir` | required | Base output directory |\n",
     "| `W_matrix` | required | Path to W `.npy` output from `generate_W` |\n",
-    "| `B` | `10000` | Number of pseudo-individuals — must match W |\n",
-    "| `sample_list` | `\"\"` | Optional path to a file of sample IDs (one per line) to subset from the VCF. Leave empty to use all samples. |\n",
+    "| `B` | `10000` | Must match W |\n",
+    "| `sample_list` | `\"\"` | Optional path to sample ID file |\n",
     "| `maf_min` | `0.0005` | Minimum minor allele frequency |\n",
     "| `mac_min` | `5` | Minimum minor allele count |\n",
     "| `msng_min` | `0.05` | Maximum missingness rate per variant |\n",
     "\n",
-    "**`merge_chrom` workflow parameters:**\n",
+    "**`merge_chrom` parameters:**\n",
     "\n",
     "| Parameter | Default | Description |\n",
     "|-----------|---------|-------------|\n",
-    "| `chrom` | `0` | Chromosome to merge (1–22), or 0 for all |\n",
+    "| `chrom` | `0` | Chromosome 1–22, or 0 for all chromosomes |\n",
     "| `output_dir` | required | Base output directory (same as `process_block`) |\n",
-    "| `cohort_id` | `ADSP.R5.EUR` | Must match the `process_block` cohort_id used |\n",
-    "\n",
-    "**Global parameters (shared across all workflows):**\n",
-    "\n",
-    "| Parameter | Default | Description |\n",
-    "|-----------|---------|-------------|\n",
-    "| `walltime` | `24:00:00` | Per-task walltime |\n",
-    "| `mem` | `32G` | Per-task memory |\n",
-    "| `numThreads` | `8` | Per-task CPU cores |\n",
+    "| `cohort_id` | `ADSP.R5.EUR` | Must match `process_block` cohort_id |\n",
     "\n",
     "## Output Files\n",
     "\n",
     "| File | Description |\n",
     "|------|-------------|\n",
-    "| `W_B{B}.npy` | Projection matrix W, shape (n × B). Output of `generate_W`, input to `process_block`. |\n",
-    "| `{cohort_id}.chr{N}_stocld_panel.pgen` | **Final output.** Per-chromosome stochastic genotype sample U as PLINK2 binary |\n",
-    "| `{cohort_id}.chr{N}_stocld_panel.pvar` | Per-chromosome variant information |\n",
-    "| `{cohort_id}.chr{N}_stocld_panel.psam` | Pseudo-individual IDs S1…S{B} |\n",
-    "| `{cohort_id}.chr{N}_stocld_panel.afreq` | Per-chromosome allele frequencies (OBS_CT = 2 × n) |\n",
-    "| *(cleaned up after merge)* | Per-block `.dosage.gz`, `.map`, `.afreq`, `.meta` intermediate files |\n",
-    "\n",
-    "**Timing:** ~7 hours for all 22 chromosomes with -J 32 (blocks parallelized), plus ~1 hour for merge_chrom\n",
-    "\n",
-    "## Minimal Working Example\n"
+    "| `W_B{B}.npy` | Projection matrix W, shape (n × B). Output of `generate_W`. |\n",
+    "| `{cohort_id}.chr{N}.pgen` | Final output — per-chromosome stochastic genotype sample U as PLINK2 binary |\n",
+    "| `{cohort_id}.chr{N}.pvar` | Per-chromosome variant information |\n",
+    "| `{cohort_id}.chr{N}.psam` | Pseudo-individual IDs S1…S{B} |\n",
+    "| `{cohort_id}.chr{N}.afreq` | Per-chromosome allele frequencies (OBS_CT = 2 × n) |\n",
+    "| *(cleaned up after merge)* | Per-block `.dosage.gz`, `.map`, `.afreq`, `.meta` intermediate files |\n"
    ]
   },
   {
@@ -106,7 +97,7 @@
    "id": "3547538e-ef0f-49d3-aa45-208a278f5d3c",
    "metadata": {},
    "source": [
-    "### Step 0: Generate W once — provide your VCF directory, prefix, and LD block file"
+    "### Step 0 — Generate W (run once per cohort)\n"
    ]
   },
   {
@@ -119,10 +110,10 @@
    "outputs": [],
    "source": [
     "sos run pipeline/rss_ld_sketch.ipynb generate_W \\\n",
-    "    --n-samples 2000 \\\n",
+    "    --n-samples 16571 \\\n",
     "    --output-dir output/rss_ld_sketch \\\n",
     "    --B 10000 \\\n",
-    "    --seed 123"
+    "    --seed 123\n"
    ]
   },
   {
@@ -130,7 +121,7 @@
    "id": "e2559a2e-8bdd-443b-980f-26fe7f728674",
    "metadata": {},
    "source": [
-    "### Step 1: Process all blocks  →  compressed dosage files\n"
+    "### Step 1+2 — Process all blocks → compressed dosage files\n"
    ]
   },
   {
@@ -142,14 +133,14 @@
    },
    "outputs": [],
    "source": [
-    "# Process all blocks for one chromosome\n",
+    "# Process one chromosome\n",
     "sos run pipeline/rss_ld_sketch.ipynb process_block \\\n",
     "    --vcf-base /path/to/vcfs/ \\\n",
     "    --vcf-prefix your.prefix. \\\n",
     "    --ld-block-file /path/to/EUR_LD_blocks.bed \\\n",
     "    --chrom 22 \\\n",
     "    --output-dir output/rss_ld_sketch \\\n",
-    "    --W-matrix output/rss_ld_sketch/W_B10000.npy\n"
+    "    --W-matrix output/rss_ld_sketch/W_B10000.npy "
    ]
   },
   {
@@ -166,7 +157,8 @@
     "    --ld-block-file /restricted/projectnb/casa/oaolayin/ROSMAP_NIA_geno/EUR_LD_blocks.bed \\\n",
     "    --chrom 0 \\\n",
     "    --output-dir output/rss_ld_sketch \\\n",
-    "    --W-matrix output/rss_ld_sketch/W_B10000.npy"
+    "    --W-matrix output/rss_ld_sketch/W_B10000.npy \n",
+    "\n"
    ]
   },
   {
@@ -174,7 +166,7 @@
    "id": "19fdaf69-ae0d-45fb-b7a1-dc5a03ff9a51",
    "metadata": {},
    "source": [
-    "#### Step 2: Merge per-block dosage files into per-chromosome pgen\n"
+    "### Step 3 — Merge per-block dosage files into per-chromosome pgen\n"
    ]
   },
   {
@@ -186,7 +178,8 @@
    "source": [
     "sos run pipeline/rss_ld_sketch.ipynb merge_chrom \\\n",
     "    --output-dir output/rss_ld_sketch \\\n",
-    "    --chrom 0"
+    "    --chrom 0\n",
+    "\n"
    ]
   },
   {
@@ -281,10 +274,10 @@
     "                        Must match cohort-id used in process_block.\n",
     "      --output-dir VAL (as str, required)\n",
     "    Output per chromosome (final, block intermediates cleaned up):\n",
-    "      {cohort_id}.chr{N}_stocld_panel.pgen\n",
-    "      {cohort_id}.chr{N}_stocld_panel.pvar\n",
-    "      {cohort_id}.chr{N}_stocld_panel.psam\n",
-    "      {cohort_id}.chr{N}_stocld_panel.afreq\n",
+    "      {cohort_id}.chr{N}.pgen\n",
+    "      {cohort_id}.chr{N}.pvar\n",
+    "      {cohort_id}.chr{N}.psam\n",
+    "      {cohort_id}.chr{N}.afreq\n",
     "```\n"
    ]
   },
@@ -295,7 +288,7 @@
     "kernel": "SoS"
    },
    "source": [
-    "# Setup and Global Parameters"
+    "# Pipeline Implementation\n"
    ]
   },
   {
@@ -310,14 +303,11 @@
     "[global]\n",
     "parameter: cwd        = path(\"output\")\n",
     "parameter: job_size   = 1\n",
-    "parameter: walltime   = \"4h\"\n",
-    "parameter: mem        = \"32G\"\n",
-    "parameter: numThreads = 8\n",
+    "parameter: walltime   = \"24:00:00\"\n",
+    "parameter: mem        = \"16G\"\n",
+    "parameter: numThreads = 16\n",
     "parameter: container  = \"\"\n",
-    "parameter: plink2_bin  = \"plink2\"\n",
-    "parameter: walltime    = \"24:00:00\"\n",
-    "parameter: mem         = \"32G\"\n",
-    "parameter: numThreads  = 8\n",
+    "parameter: plink2_bin = \"plink2\"\n",
     "\n",
     "import re\n",
     "from sos.utils import expand_size\n",
@@ -337,7 +327,9 @@
     "kernel": "SoS"
    },
    "source": [
-    "## `generate_W`"
+    "## `generate_W`\n",
+    "\n",
+    "Generates the random projection matrix W ~ N(0, 1/√n), shape (n × B). Run once per cohort. W is shared across all chromosomes.\n"
    ]
   },
   {
@@ -357,10 +349,10 @@
     "# n_samples is passed directly as a parameter; no VCF reading is needed.\n",
     "# All 22 chromosomes reuse the same W so that per-chromosome stochastic\n",
     "# genotype samples can be arithmetically merged for meta-analysis.\n",
-    "parameter: n_samples = int\n",
-    "parameter: output_dir    = str\n",
-    "parameter: B         = 10000\n",
-    "parameter: seed      = 123\n",
+    "parameter: n_samples  = int\n",
+    "parameter: output_dir = str\n",
+    "parameter: B          = 10000\n",
+    "parameter: seed       = 123\n",
     "\n",
     "import os\n",
     "input:  []\n",
@@ -396,7 +388,9 @@
     "kernel": "SoS"
    },
    "source": [
-    "## `process_block`\n"
+    "## `process_block`\n",
+    "\n",
+    "Two-step workflow. `process_block_1` reads the BED file and emits one coordinate file per LD block. `process_block_2` reads the VCF for each block, computes U = WᵀG, scales to [0,2], and writes compressed dosage + allele frequency files. G is never stored.\n"
    ]
   },
   {
@@ -458,7 +452,9 @@
    "id": "a1b2c3d4-0031-4031-a031-000000000031",
    "metadata": {},
    "source": [
-    "## `process_block_2`\n"
+    "## `process_block_2`\n",
+    "\n",
+    "Per-block parallel task. Reads VCF, computes U = WᵀG in chunks, scales to [0,2], writes `.dosage.gz`, `.map`, `.afreq`, `.meta`. Output is intermediate — cleaned up by `merge_chrom`.\n"
    ]
   },
   {
@@ -683,7 +679,9 @@
    "id": "a1b2c3d4-0040-4040-a040-000000000040",
    "metadata": {},
    "source": [
-    "## `merge_chrom`\n"
+    "## `merge_chrom`\n",
+    "\n",
+    "For each chromosome: converts per-block `.dosage.gz` files to sorted pgen (two-pass: import then `--sort-vars`), merges all blocks into one per-chromosome pgen using `plink2 --pmerge-list`, concatenates allele frequencies, and removes all per-block intermediate directories. Output: `{cohort_id}.chr{N}.{pgen,pvar,psam,afreq}`.\n"
    ]
   },
   {
@@ -698,98 +696,102 @@
     "[merge_chrom]\n",
     "# Merge all per-block dosage files for a chromosome into one per-chrom pgen.\n",
     "# Steps per chromosome:\n",
-    "#   1. For each block: plink2 --import-dosage → per-block pgen (temp)\n",
+    "#   1. For each block: plink2 --import-dosage → unsorted pgen, then --sort-vars → sorted pgen\n",
     "#   2. plink2 --pmerge-list → one per-chrom pgen\n",
-    "#   3. Clean up all per-block dirs\n",
+    "#   3. Clean up all intermediates including -merge.* files\n",
     "#\n",
     "# Run after process_block completes (or run together: sos run ... process_block merge_chrom)\n",
-    "parameter: chrom       = 0\n",
-    "parameter: output_dir  = str\n",
-    "parameter: cohort_id   = \"ADSP.R5.EUR\"\n",
+    "parameter: chrom      = 0\n",
+    "parameter: output_dir = str\n",
+    "parameter: cohort_id  = \"ADSP.R5.EUR\"\n",
     "\n",
     "import os, glob\n",
     "\n",
     "def _chroms_to_process(output_dir, chrom_filter):\n",
     "    if chrom_filter != 0:\n",
     "        return [f\"chr{chrom_filter}\"]\n",
-    "    found = sorted(set(\n",
+    "    return sorted(set(\n",
     "        os.path.basename(d)\n",
     "        for d in glob.glob(os.path.join(output_dir, \"chr*\"))\n",
     "        if os.path.isdir(d)\n",
     "    ))\n",
-    "    return found\n",
     "\n",
     "_chroms = _chroms_to_process(output_dir, chrom)\n",
-    "print(f\"  Merging {len(_chroms)} chromosome(s): {', '.join(_chroms)}\")\n",
     "\n",
-    "input:  []\n",
-    "for_each: {\"_chrom\": _chroms}\n",
-    "task: trunk_workers = 1, trunk_size = 1, walltime = walltime, mem = mem, cores = numThreads\n",
-    "bash: expand = \"${ }\"\n",
+    "input: []\n",
+    "output: [f\"{output_dir}/{c}/{cohort_id}.{c}.pgen\" for c in _chroms]\n",
     "\n",
+    "for _chrom in _chroms:\n",
+    "    bash(f\"\"\"\n",
     "    set -euo pipefail\n",
-    "    chrom=\"${_chrom}\"\n",
-    "    cohort_id=\"${cohort_id}\"\n",
-    "    output_dir=\"${output_dir}\"\n",
-    "    plink2_bin=\"${plink2_bin}\"\n",
-    "    chrom_dir=\"${output_dir}/${chrom}\"\n",
-    "    final_prefix=\"${chrom_dir}/${cohort_id}.${chrom}_stocld_panel\"\n",
-    "    merge_list=\"${chrom_dir}/${cohort_id}.${chrom}_pmerge_list.txt\"\n",
-    "    psam_path=\"${chrom_dir}/${cohort_id}.${chrom}_stocld_panel.psam\"\n",
-    "\n",
-    "    echo \"[merge_chrom] ${chrom}: starting\"\n",
-    "\n",
-    "    # ── Step 1: Convert each block dosage.gz → temp per-block pgen ──\n",
-    "    > \"${merge_list}\"\n",
-    "    for dosage_gz in \"${chrom_dir}\"/*/*.dosage.gz; do\n",
-    "        block_dir=$(dirname \"${dosage_gz}\")\n",
-    "        block_tag=$(basename \"${block_dir}\")\n",
-    "        prefix=\"${block_dir}/${cohort_id}.${block_tag}_tmp\"\n",
-    "        map_file=\"${block_dir}/${cohort_id}.${block_tag}.map\"\n",
-    "        psam_file=\"${block_dir}/${cohort_id}.${block_tag}.psam\"\n",
-    "\n",
-    "        # Write .psam if not present (B pseudo-individuals)\n",
-    "        B=$(grep \"^B=\" \"${block_dir}/${cohort_id}.${block_tag}.meta\" | cut -d= -f2)\n",
-    "        if [ ! -f \"${psam_file}\" ]; then\n",
-    "            echo \"#FID IID\" > \"${psam_file}\"\n",
-    "            for i in $(seq 1 ${B}); do echo \"S${i} S${i}\"; done >> \"${psam_file}\"\n",
-    "        fi\n",
-    "\n",
-    "        echo \"  importing ${block_tag} ($(zcat ${dosage_gz} | wc -l) variants)\"\n",
-    "        \"${plink2_bin}\"             --import-dosage \"${dosage_gz}\" format=1 noheader             --psam \"${psam_file}\"             --map  \"${map_file}\"             --make-pgen             --out  \"${prefix}\"             --silent\n",
-    "\n",
-    "        echo \"${prefix}\" >> \"${merge_list}\"\n",
+    "    shopt -s nullglob\n",
+    "\n",
+    "    chrom_dir=\"{output_dir}/{_chrom}\"\n",
+    "    final_prefix=\"${{chrom_dir}}/{cohort_id}.{_chrom}\"\n",
+    "    merge_list=\"${{chrom_dir}}/{cohort_id}.{_chrom}_pmerge_list.txt\"\n",
+    "\n",
+    "    # ── Step 1: Convert each block dosage.gz → sorted per-block pgen (two-pass) ──\n",
+    "    > \"${{merge_list}}\"\n",
+    "    files=(\"${{chrom_dir}}\"/*/*.dosage.gz)\n",
+    "    if [ ${{#files[@]}} -eq 0 ]; then\n",
+    "        exit 1\n",
+    "    fi\n",
+    "    for dosage_gz in \"${{files[@]}}\"; do\n",
+    "        block_dir=$(dirname \"${{dosage_gz}}\")\n",
+    "        block_tag=$(basename \"${{block_dir}}\")\n",
+    "        prefix=\"${{block_dir}}/{cohort_id}.${{block_tag}}_tmp\"\n",
+    "        map_file=\"${{block_dir}}/{cohort_id}.${{block_tag}}.map\"\n",
+    "        psam_file=\"${{block_dir}}/{cohort_id}.${{block_tag}}.psam\"\n",
+    "        meta_file=\"${{block_dir}}/{cohort_id}.${{block_tag}}.meta\"\n",
+    "        B=$(grep \"^B=\" \"${{meta_file}}\" | cut -d= -f2)\n",
+    "        printf '#FID\\tIID\\n' > \"${{psam_file}}\"\n",
+    "        for i in $(seq 1 ${{B}}); do\n",
+    "            printf 'S%d\\tS%d\\n' ${{i}} ${{i}} >> \"${{psam_file}}\"\n",
+    "        done\n",
+    "        # Pass 1: import dosage → unsorted pgen\n",
+    "        \"{plink2_bin}\" \\\n",
+    "            --import-dosage \"${{dosage_gz}}\" format=1 noheader \\\n",
+    "            --psam \"${{psam_file}}\" \\\n",
+    "            --map  \"${{map_file}}\" \\\n",
+    "            --make-pgen \\\n",
+    "            --out  \"${{prefix}}_unsorted\" \\\n",
+    "            --silent\n",
+    "        # Pass 2: sort variants\n",
+    "        \"{plink2_bin}\" \\\n",
+    "            --pfile \"${{prefix}}_unsorted\" \\\n",
+    "            --make-pgen \\\n",
+    "            --sort-vars \\\n",
+    "            --out  \"${{prefix}}\" \\\n",
+    "            --silent\n",
+    "        rm -f \"${{prefix}}_unsorted.pgen\" \"${{prefix}}_unsorted.pvar\" \"${{prefix}}_unsorted.psam\"\n",
+    "        echo \"${{prefix}}\" >> \"${{merge_list}}\"\n",
     "    done\n",
     "\n",
-    "    # ── Step 2: Write shared .psam for the merged file ──\n",
-    "    B=$(grep \"^B=\" $(ls \"${chrom_dir}\"/*/*.meta | head -1) | cut -d= -f2)\n",
-    "    echo \"#FID IID\" > \"${psam_path}\"\n",
-    "    for i in $(seq 1 ${B}); do echo \"S${i} S${i}\"; done >> \"${psam_path}\"\n",
-    "\n",
-    "    # ── Step 3: Merge all per-block pgens → one per-chrom pgen ──\n",
-    "    n_blocks=$(wc -l < \"${merge_list}\")\n",
-    "    echo \"  merging ${n_blocks} blocks → ${final_prefix}\"\n",
-    "    \"${plink2_bin}\"         --pmerge-list \"${merge_list}\" pfile         --psam \"${psam_path}\"         --make-pgen         --out  \"${final_prefix}\"\n",
+    "    # ── Step 2: Merge all per-block pgens → one per-chrom pgen ──\n",
+    "    \"{plink2_bin}\" \\\n",
+    "        --pmerge-list \"${{merge_list}}\" pfile \\\n",
+    "        --make-pgen \\\n",
+    "        --sort-vars \\\n",
+    "        --out  \"${{final_prefix}}\"\n",
     "\n",
-    "    # ── Step 4: Concatenate .afreq files ──\n",
-    "    afreq_out=\"${final_prefix}.afreq\"\n",
+    "    # ── Step 3: Concatenate .afreq ──\n",
     "    first=1\n",
-    "    for f in \"${chrom_dir}\"/*/*.afreq; do\n",
-    "        if [ \"${first}\" -eq 1 ]; then\n",
-    "            cat \"${f}\" > \"${afreq_out}\"\n",
+    "    for f in \"${{chrom_dir}}\"/*/*.afreq; do\n",
+    "        if [ \"${{first}}\" -eq 1 ]; then\n",
+    "            cat \"${{f}}\" > \"${{final_prefix}}.afreq\"\n",
     "            first=0\n",
     "        else\n",
-    "            tail -n +2 \"${f}\" >> \"${afreq_out}\"\n",
+    "            tail -n +2 \"${{f}}\" >> \"${{final_prefix}}.afreq\"\n",
     "        fi\n",
     "    done\n",
-    "    echo \"  afreq written: ${afreq_out}\"\n",
     "\n",
-    "    # ── Step 5: Clean up per-block intermediate files ──\n",
-    "    rm -f \"${merge_list}\" \"${psam_path}\"\n",
-    "    for block_dir in \"${chrom_dir}\"/*/; do\n",
-    "        rm -rf \"${block_dir}\"\n",
+    "    # ── Step 4: Cleanup — remove merge intermediates and per-block dirs ──\n",
+    "    rm -f \"${{merge_list}}\"\n",
+    "    rm -f \"${{final_prefix}}-merge.pgen\" \"${{final_prefix}}-merge.pvar\" \"${{final_prefix}}-merge.psam\"\n",
+    "    for block_dir in \"${{chrom_dir}}\"/*/; do\n",
+    "        rm -rf \"${{block_dir}}\"\n",
     "    done\n",
-    "    echo \"[merge_chrom] ${chrom}: done → ${final_prefix}.{pgen,pvar,psam,afreq}\"\n"
+    "    \"\"\")"
    ]
   }
  ],