From 8a83f8c1ba548fc421f3e46072d3f898bb90aa10 Mon Sep 17 00:00:00 2001
From: mtcicer026 <tommywtullius@gmail.com>
Date: Wed, 27 May 2026 13:49:24 -0400
Subject: [PATCH] Add FiberHMM page to Dry Lab section

Documents FiberHMM as a DAF-seq nucleosome, MSP, and TF footprint caller.
Adds the page to the SUMMARY sidebar and reframes the daf-qc downstream
analysis section so it points to FiberHMM as the DAF-native option, with
`ft add-nucleosomes` noted as a Fiber-seq nucleosome caller reachable for
DAF-seq via the `ft ddda-to-m6a` conversion bridge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 src/SUMMARY.md         |  1 +
 src/drylab/daf-qc.md   |  9 ++---
 src/drylab/fiberhmm.md | 87 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 6 deletions(-)
 create mode 100644 src/drylab/fiberhmm.md

diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index ccfc2b4..038bfa0 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -17,6 +17,7 @@
 # Dry Lab
 
 - [DAF-QC Pipeline](drylab/daf-qc.md)
+- [FiberHMM](drylab/fiberhmm.md)
 
 ---
 
diff --git a/src/drylab/daf-qc.md b/src/drylab/daf-qc.md
index 1c45879..90211ef 100644
--- a/src/drylab/daf-qc.md
+++ b/src/drylab/daf-qc.md
@@ -90,14 +90,11 @@ See `config/config.yaml` in the repository for the full list of options with des
 - **QC metrics**: Targeting efficiency, deamination rates (overall and by 2-bp sequence context), strand calling, enzyme bias, and mutation rates.
 - **HTML dashboard**: `results/{sample_name}/qc/{sample_name}.dashboard.html` with all QC plots. The dashboard is self-contained (plots are embedded), so you can copy a single file for sharing or local viewing.
 
-## Downstream analysis with fibertools
+## Downstream analysis
 
-After QC, DAF-seq data can be further processed with [fibertools](https://github.com/fiberseq/fibertools-rs) (`ft`) for chromatin fiber analysis:
+After QC, DAF-seq data can be processed for nucleosome, MSP, and transcription factor footprint calling with [FiberHMM](fiberhmm.md), a Hidden Markov Model toolkit that operates natively on deaminase data (DddA and DddB) and emits [fibertools](https://github.com/fiberseq/fibertools-rs)-compatible BAMs plus [Molecular-annotation spec](https://github.com/fiberseq/Molecular-annotation-spec) tags. See the [FiberHMM](fiberhmm.md) page for installation and usage.
 
-1. **`ft ddda-to-m6a`**: Converts DAF-seq deamination marks (C-to-T / G-to-A) into m6A-equivalent format, enabling compatibility with the Fiber-seq analysis ecosystem.
-2. **`ft add-nucleosomes`**: Infers nucleosome positions from the converted deamination data.
-
-These steps allow you to use the full suite of Fiber-seq visualization and analysis tools on DAF-seq data. See the [fibertools documentation](https://fiberseq.github.io) for details.
+Alternatively, the Fiber-seq nucleosome caller `ft add-nucleosomes` can be applied to DAF-seq data after first converting the deamination marks to m6A-equivalent format with `ft ddda-to-m6a`. This routes DAF-seq data through the Fiber-seq analysis stack; see the [fibertools documentation](https://fiberseq.github.io) for details.
 
 ## Further reading
 
diff --git a/src/drylab/fiberhmm.md b/src/drylab/fiberhmm.md
new file mode 100644
index 0000000..35269df
--- /dev/null
+++ b/src/drylab/fiberhmm.md
@@ -0,0 +1,87 @@
+# FiberHMM
+
+[FiberHMM](https://github.com/fiberseq/FiberHMM) is a Hidden Markov Model toolkit for calling chromatin footprints from single-molecule DNA modification data. It supports DAF-seq (DddA and DddB) as well as Fiber-seq (PacBio and Nanopore Hia5), and emits nucleosomes, methyltransferase-sensitive patches (MSPs), and sub-nucleosomal TF/Pol II footprints in [fibertools](https://github.com/fiberseq/fibertools-rs)-compatible BAMs.
+
+For DAF-seq, FiberHMM is a native nucleosome and footprint caller that runs directly on deaminase data with DAF-trained HMM emissions, adds a log-likelihood-ratio recaller for transcription factor footprints, and writes spec-compliant [Molecular-annotation](https://github.com/fiberseq/Molecular-annotation-spec) tags. This page covers installation, inputs, and the recommended one-command workflow.
+
+## Getting started
+
+Install from PyPI:
+
+```bash
+pip install fiberhmm
+```
+
+Optional dependencies that are worth installing:
+
+```bash
+pip install numba        # ~10x faster HMM computation
+pip install matplotlib   # --stats visualization
+pip install h5py         # HDF5 posteriors export
+```
+
+For bigBed output, install [`bedToBigBed`](https://hgdownload.soe.ucsc.edu/admin/exe/) from UCSC tools.
+
+Pre-trained models for DddA and DddB are bundled with the package; no separate download is required.
+
+## Inputs
+
+FiberHMM operates on aligned DAF-seq BAMs from the [DAF-QC pipeline](daf-qc.md) (or any equivalent alignment workflow). In DAF mode, the caller needs to know which positions on each read are C-to-T or G-to-A conversions. It auto-detects this per read from any of:
+
+- **R/Y IUPAC codes in the stored sequence**, written by `fiberhmm-daf-encode`.
+- **MD tag** on a raw aligned BAM (produced by `minimap2 --MD` or `samtools calmd`). Parsed on the fly; no preprocessing step needed.
+- **MM/ML tags** encoding deaminated C/G positions as base modifications.
+- **`--reference ref.fa`**, used as a fallback when none of the above are present.
+
+At least one of these must be available: the first three are read from the input BAM itself, while the reference FASTA is supplied on the command line.
+
+## Usage
+
+`fiberhmm-call` is the recommended entry point. It fuses the nucleosome/MSP HMM and the TF recaller into a single in-process pipeline.
+
+### DddA
+
+With `--enzyme ddda`, FiberHMM automatically selects the DddA two-model workflow: a nucleosome model plus a TF recall pass with an efficiency uplift to match DddA's higher per-position deamination rate.
+
+```bash
+fiberhmm-call -i aligned.bam -o recalled.bam \
+              --mode daf --enzyme ddda \
+              -c 8 --io-threads 16 \
+              --region-parallel
+```
+
+`--region-parallel` requires a coordinate-sorted, indexed input BAM and scales near-linearly with `--cores` (`-c` in the command above) up to the chromosome count. The output is written sorted and indexed, so no separate sort pass is needed.
+
+DddB samples use the same command with `--enzyme dddb` in place of `--enzyme ddda`.
+
+## Key outputs
+
+`fiberhmm-call` writes a tagged BAM that downstream tools such as FiberBrowser and fibertools can read directly.
+
+- **Legacy footprint tags** (`ns`/`nl`, `as`/`al`): nucleosome and MSP starts and lengths, compatible with any tool in the fibertools ecosystem.
+- **Molecular-annotation spec tags** (`MA`, `AQ`): per-spec `nuc+Q`, `msp+`, and `tf+QQQ` annotations with LLR-based confidence (`tq`) and edge-sharpness bytes (`el`, `er`) on TF calls. See the [spec](https://github.com/fiberseq/Molecular-annotation-spec) for the encoding.
+- **TF/Pol II footprints**: sub-nucleosomal calls (typically 15–80 bp) live in the `tf+QQQ` annotation of `MA`/`AQ`. For DAF-seq, the recaller uses a tuned `--min-llr` (5.0 for DddA, 4.0 for DddB) selected automatically by `--enzyme`.
+
+### Extract to bigBed
+
+For a more compact representation of the call set, suitable for downstream analysis and [FiberBrowser](https://github.com/fiberseq/FiberBrowser), convert the tagged BAM:
+
+```bash
+fiberhmm-extract -i recalled.bam --footprint --msp --tf --bigbed
+```
+
+## Choosing options
+
+| Situation | Command |
+|---|---|
+| Default: full pipeline on a sorted + indexed DAF-seq BAM | `fiberhmm-call --mode daf --enzyme ddda --region-parallel` |
+| Want FIRE element calls afterwards | Pipe `fiberhmm-call -o -` into `ft fire - final.bam` |
+| DddB samples | swap `--enzyme ddda` for `--enzyme dddb` in any of the above |
+
+For Hia5 Fiber-seq usage and the full CLI surface (training new models, exporting posteriors, model inspection), see the [FiberHMM README](https://github.com/fiberseq/FiberHMM).
+
+## Further reading
+
+- [FiberHMM repository and README](https://github.com/fiberseq/FiberHMM)
+- [Molecular-annotation spec](https://github.com/fiberseq/Molecular-annotation-spec) for the `MA`/`AQ` tag schema
+- [fibertools-rs](https://github.com/fiberseq/fibertools-rs) for downstream FIRE scoring and extraction