The Rapid-CNS² Nextflow pipeline is a bioinformatics workflow for comprehensive analysis of genomic and epigenomic data generated by adaptive-sampling-based sequencing of central nervous system (CNS) tumours. It performs alignment, SNV calling, structural variant calling, methylation analysis, and copy number variation calling, and produces a comprehensive molecular report.
Because the pipeline is implemented in Nextflow, it can be run and scaled on a range of compute environments, including local machines, clusters, and cloud platforms.
- Modular architecture for easy customization and extension
- Flexible input handling - supports aligned and unaligned BAM files with automatic alignment detection
- Accelerated variant calling with Clara Parabricks-supported DeepVariant and Sniffles2
- Comprehensive analysis including methylation analysis with Rapid-CNS² classifier and MGMT promoter methylation status
- Automated reporting with molecular diagnostic-ready reports
- MNP-Flex integration for additional classifier input preparation (optional)
- Nextflow: version 20.10.0 or later (recommended 22.10.0+)
- Container Engine: Docker (recommended) or Singularity
- Java: OpenJDK 8 or later (Nextflow 22.x and newer require Java 11 or later)
- System: Linux (Ubuntu 18.04+, CentOS 7+, or similar)
- Memory: Minimum 8 GB RAM, recommended 32 GB+ for large datasets
- Storage: At least 100 GB free space for reference genomes and databases
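The requirements above can be checked before installation with a short script. The following is a minimal sketch, not part of the pipeline: the function names (`version_ge`, `have`, `check_prereqs`) are hypothetical, the version comparison assumes GNU `sort -V`, and the free-space check looks only at the current directory.

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check; thresholds are taken from the list above.

# Succeed if version $1 is greater than or equal to version $2 (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed if the named command is on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

check_prereqs() {
    local nf_ver free_kb
    if have nextflow; then
        # The version banner contains a line such as "version 22.10.6 build 5843".
        nf_ver=$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
        if [ -n "$nf_ver" ] && version_ge "$nf_ver" "20.10.0"; then
            echo "OK    nextflow $nf_ver"
        else
            echo "WARN  nextflow ${nf_ver:-unknown} is older than 20.10.0"
        fi
    else
        echo "FAIL  nextflow not found on PATH"
    fi
    if have java; then echo "OK    java found"; else echo "FAIL  java not found on PATH"; fi
    if have docker || have singularity; then
        echo "OK    container engine found"
    else
        echo "FAIL  neither docker nor singularity found on PATH"
    fi
    # df -Pk reports available space in 1 KiB blocks in column 4.
    free_kb=$(df -Pk . | awk 'NR == 2 {print $4}')
    if [ "${free_kb:-0}" -ge $((100 * 1024 * 1024)) ]; then
        echo "OK    at least 100 GB free in $(pwd)"
    else
        echo "WARN  less than 100 GB free in $(pwd)"
    fi
}

check_prereqs
```

The script only reports; it never exits with an error, so it can be run safely on a machine that is still being set up.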
git clone https://github.com/areebapatel/Rapid-CNS2_nf.git
cd Rapid-CNS2_nf
# Using Conda (recommended)
conda create -n nextflow python=3.9
conda activate nextflow
conda install -c bioconda nextflow
# Or manual installation
curl -s https://get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/
# Docker (recommended)
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER
# Or Singularity
sudo apt-get update
sudo apt-get install -y singularity-container
# Create reference directory
mkdir -p /path/to/references/hg38
# Download UCSC hg38 reference genome
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
gunzip hg38.fa.gz
mv hg38.fa /path/to/references/hg38/hg38.fa
# Create index files
samtools faidx /path/to/references/hg38/hg38.fa

ANNOVAR is required for variant annotation:
- Register and download:
  - Visit ANNOVAR Download Form
  - Fill out the registration form with your institutional email
  - Download the package from the link sent to your email
- Install and setup:
# Extract ANNOVAR
tar -xzf annovar.latest.tar.gz
cd annovar
chmod +x *.pl
# Create humandb directory and download databases
mkdir humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar refGene humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar cytoBand humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar clinvar_20240917 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar avsnp151 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar 1000g2015aug humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar cosmic70 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar dbnsfp42c humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar allofus humandb/

Note: ANNOVAR is freely available for personal, academic, and non-profit use only. Commercial users must purchase a license from QIAGEN.
AnnotSV is required for structural variant annotation:
# Navigate to installation directory
cd /path/to/install/
# Clone AnnotSV repository
git clone https://github.com/lgmgeo/AnnotSV.git
# Navigate to AnnotSV directory
cd AnnotSV
# Install AnnotSV
make PREFIX=. install
# Install human annotations
make PREFIX=. install-human-annotation

Note: AnnotSV is freely available under the GNU GPL v3 license. See the AnnotSV repository for more information.
Edit the nextflow.config file with your system-specific paths:
params {
// Update these paths to match your system
ref = "/path/to/references/hg38/hg38.fa"
annovarPath = "/path/to/annovar/"
annovarDB = "/path/to/annovar/humandb/"
annotsvAnnot = "/path/to/AnnotSV/Annotations_Human/"
}

nextflow run main.nf \
--input /data/sample.bam \
--id SAMPLE001 \
--outDir ./results \
-profile lsf

nextflow run main.nf \
--input /data/sample.bam \
--id SAMPLE001 \
--patient "John Doe" \
--outDir ./results \
--minimumMgmtCov 10 \
--mnpFlex true \
--runHumanVariation true \
--containerEngine singularity \
-profile slurm

- Platform: Oxford Nanopore Technologies (MinION, GridION, PromethION)
- Sequencing Method: Adaptive sampling with targeted gene panel
- Gene Panel: NPHD panel (160 gene regions) as described in:
- Patel et al. (2022). Rapid-CNS²: rapid comprehensive adaptive nanopore-sequencing of CNS tumors. Acta Neuropathologica 143, 609–612
- Patel et al. (2025). Prospective, multicenter validation of a platform for rapid molecular profiling of central nervous system tumors. Nature Medicine 31, 1567–1577
Unsupported data types:
- Shallow whole genome sequencing (WGS)
- Other sequencing platforms (Illumina, PacBio, etc.)
- Non-targeted sequencing approaches

Running the pipeline on unsupported data may result in:
- Reduced sensitivity for variant detection
- Increased false negative rates
- Unreliable methylation analysis
- Poor MGMT promoter methylation assessment
Basecalling must be performed externally.
- Use epi2me-labs/wf-basecalling or Dorado directly
- Ensure you use a model that supports modified basecalling (see Dorado documentation)
- Provide the resulting BAM(s) as input to this pipeline
The pipeline accepts:
- (Preferred) Single aligned BAM file: Direct path to a single aligned and merged BAM file (e.g., /path/to/sample.bam)
- Directory with aligned BAM files: Path to directory containing multiple aligned BAM files (will be merged automatically)
- Directory with unaligned BAM files: Path to directory containing multiple unaligned BAM files (will be aligned and merged automatically)
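To see in advance which of the three input cases a BAM file falls into, you can inspect its header. The sketch below uses a common heuristic, not necessarily the check the pipeline itself performs: an aligned BAM declares its reference sequences in `@SQ` header lines, whereas an unaligned BAM has none. The function names are hypothetical, and `samtools` is assumed to be on `PATH`.

```shell
#!/usr/bin/env bash
# Heuristic alignment check based on the SAM header (assumption: not the
# pipeline's own detection logic, which is not described here).

# Read SAM header text on stdin; succeed if it declares reference sequences.
header_is_aligned() {
    grep -q '^@SQ'
}

# Print "aligned" or "unaligned" for the BAM file given as $1.
bam_alignment_state() {
    if samtools view -H "$1" | header_is_aligned; then
        echo "aligned"
    else
        echo "unaligned"
    fi
}

# Example: classify every BAM in a directory passed as the first argument.
if [ -n "${1:-}" ] && [ -d "$1" ]; then
    for bam in "$1"/*.bam; do
        [ -e "$bam" ] || continue
        printf '%s\t%s\n' "$bam" "$(bam_alignment_state "$bam")"
    done
fi
```

A directory that mixes aligned and unaligned files is worth sorting out before the run, since the documented input modes assume one kind per directory.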
graph TD
A[Input Data BAM] --> B[Alignment Check]
B --> C[Alignment if needed]
B --> D[Direct Processing]
C --> E[Merge BAMs]
D --> E
E --> F[SNV Calling]
E --> G[Methylation Analysis]
E --> H[Structural Variant Calling]
F --> I[Annotation & Filtering]
G --> J[MGMT Promoter Analysis]
G --> K[Methylation Classification]
H --> L[Structural Variant Annotation]
I --> M[Report Generation]
J --> M
K --> M
L --> M
| Parameter | Description | Example |
|---|---|---|
| `--input` | Required. Path to input BAM file(s). Can be:<br>• Single aligned BAM file: `/path/to/sample.bam`<br>• Directory with aligned BAMs: `/path/to/aligned_bams/`<br>• Directory with unaligned BAMs: `/path/to/unaligned_bams/` | `--input /data/sample.bam` |
| `--id` | Required. Unique sample identifier used for naming output files and reports. Should be alphanumeric with no spaces. | `--id SAMPLE001` |
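Because `--id` is used to name output files, it can help to validate it before launching a run. The following is a small sketch of such a check; `validate_sample_id` is a hypothetical helper, and it applies the strictest reading of "alphanumeric with no spaces" (letters and digits only). Loosen the pattern if your identifiers use other characters and the pipeline accepts them.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the --id value.

# Succeed only if $1 is non-empty and contains letters and digits only.
validate_sample_id() {
    case "$1" in
        '' | *[!A-Za-z0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

for candidate in "SAMPLE001" "SAMPLE 001"; do
    if validate_sample_id "$candidate"; then
        echo "valid id:   '$candidate'"
    else
        echo "invalid id: '$candidate'"
    fi
done
```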
These parameters must be configured for your specific system and installation paths in the nextflow.config file:
| Parameter | Description | Default | Example |
|---|---|---|---|
| `--ref` | Path to hg38 reference genome FASTA file. | System-specific | `--ref /refs/hg38.fa` |
| `--annovarPath` | Path to ANNOVAR installation directory. | System-specific | `--annovarPath /tools/annovar/` |
| `--annovarDB` | Path to ANNOVAR database directory (humandb/). | System-specific | `--annovarDB /tools/annovar/humandb/` |
| `--annotsvAnnot` | Path to AnnotSV annotations directory. | System-specific | `--annotsvAnnot /tools/AnnotSV/Annotations_Human/` |
| `--annotations` | Path to annotation file for IGV reports (refGene.txt.gz). | `data/refGene.txt` | `--annotations /refs/refGene.txt` |
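Misconfigured paths are a common cause of early failures, so it can be worth confirming that each configured location exists before starting a run. The sketch below is illustrative only: `check_path` and `check_system_paths` are hypothetical helpers, and the `.fai` check assumes the reference was indexed with `samtools faidx` as in the setup steps.

```shell
#!/usr/bin/env bash
# Hypothetical check that the system-specific paths point at real files.

# $1 = label, $2 = kind (f for file, d for directory), $3 = path
check_path() {
    if [ "$2" = "f" ] && [ -f "$3" ]; then return 0; fi
    if [ "$2" = "d" ] && [ -d "$3" ]; then return 0; fi
    echo "missing $1: $3" >&2
    return 1
}

# $1 = ref, $2 = annovarPath, $3 = annovarDB, $4 = annotsvAnnot
check_system_paths() {
    local status=0
    check_path "ref" f "$1" || status=1
    check_path "ref index" f "$1.fai" || status=1
    check_path "annovarPath" d "$2" || status=1
    check_path "annovarDB" d "$3" || status=1
    check_path "annotsvAnnot" d "$4" || status=1
    return "$status"
}

# Example with the placeholder paths used in this document.
check_system_paths \
    "/path/to/references/hg38/hg38.fa" \
    "/path/to/annovar/" \
    "/path/to/annovar/humandb/" \
    "/path/to/AnnotSV/Annotations_Human/" \
    || echo "fix the paths in nextflow.config before running" >&2
```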
| Parameter | Description | Default | Example |
|---|---|---|---|
| `--outDir` | Output directory for all pipeline results. Will be created if it doesn't exist. | `output` | `--outDir /results/analysis` |
| `--tmpDir` | Directory for temporary files. Auto-set to `${outDir}/tmp/` unless overridden. | `${outDir}/tmp/` | `--tmpDir /scratch/tmp` |
| `--logDir` | Directory for log files. | `logDir` | `--logDir /logs` |
| `--patient` | Patient name for reports. If not specified, uses the `--id` value. | `null` (uses `--id`) | `--patient "John Doe"` |
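The fallback rules in this table (`--tmpDir` derived from `--outDir`, `--patient` falling back to `--id`) can be mirrored in a wrapper script, for example to create the directories ahead of time. This is a sketch of the documented behaviour under hypothetical function names, not the pipeline's own code.

```shell
#!/usr/bin/env bash
# Mirror the documented defaults for --tmpDir and --patient.

# $1 = outDir, $2 = optional --tmpDir override
resolve_tmp_dir() {
    if [ -n "${2:-}" ]; then
        printf '%s\n' "$2"
    else
        printf '%s/tmp/\n' "${1%/}"
    fi
}

# $1 = sample id, $2 = optional --patient value
resolve_patient() {
    printf '%s\n' "${2:-$1}"
}

out_dir="./results"
echo "tmpDir:  $(resolve_tmp_dir "$out_dir")"
echo "patient: $(resolve_patient "SAMPLE001")"
```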
| Parameter | Description | Default | Example |
|---|---|---|---|
| `--maxThreads` | Maximum number of threads for general processes. | 64 | `--maxThreads 32` |
| `--modkitThreads` | Number of threads for modkit methylation calling. | 32 | `--modkitThreads 16` |
| `--cnvThreads` | Number of threads for CNVpytor copy number analysis. | 32 | `--cnvThreads 16` |
| `--snifflesThreads` | Number of threads for Sniffles2 structural variant calling. | 32 | `--snifflesThreads 16` |
| `--snpThreads` | Number of threads for SNV calling with DeepVariant. | 64 | `--snpThreads 32` |
| `--svThreads` | Number of threads for structural variant calling. | 64 | `--svThreads 32` |
| `--covThreads` | Number of threads for coverage calculation with mosdepth. | 8 | `--covThreads 4` |
| `--methThreads` | Number of threads for methylation classification. | 64 | `--methThreads 32` |
| `--mgmtThreads` | Number of threads for MGMT promoter analysis. | 8 | `--mgmtThreads 4` |
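Several defaults in this table exceed the core count of a typical workstation. One way to pick safer values is to cap each default at the number of available CPUs, as in the sketch below. The helper `cap_threads` is hypothetical, the defaults are copied from the table, and whether the pipeline copes with oversubscription on its own is not covered here.

```shell
#!/usr/bin/env bash
# Suggest thread flags capped at the number of CPUs on this machine.

# $1 = requested threads, $2 = available CPUs
cap_threads() {
    if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
}

# nproc is part of GNU coreutils; getconf is used as a fallback.
available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# name:default pairs taken from the table above
for pair in maxThreads:64 modkitThreads:32 cnvThreads:32 snifflesThreads:32 \
            snpThreads:64 svThreads:64 covThreads:8 methThreads:64 mgmtThreads:8; do
    name=${pair%%:*}
    default=${pair##*:}
    printf -- '--%s %s ' "$name" "$(cap_threads "$default" "$available")"
done
echo
```

The printed flags can be appended to a `nextflow run main.nf` command line.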
| Parameter | Description | Default | Example |
|---|---|---|---|
| `--minimumMgmtCov` | Minimum coverage threshold for MGMT promoter methylation analysis. If coverage is below this threshold, MGMT analysis will be skipped. | 5 | `--minimumMgmtCov 10` |
| `--bamMinCoverage` | Minimum coverage threshold for human variation workflow. | 10 | `--bamMinCoverage 15` |
| `--mnpFlex` | Enable MNP-Flex classifier input preparation. Creates files needed for external MNP-Flex analysis. | `false` | `--mnpFlex true` |
| `--runHumanVariation` | Enable wf-human-variation SNP and SV pipeline. Adds additional variant calling workflows. | `false` | `--runHumanVariation true` |
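The `--minimumMgmtCov` rule is a simple gate: analysis runs when coverage meets the threshold and is skipped when it falls below. The sketch below expresses that gate for a coverage value you have measured yourself. `mgmt_should_run` is a hypothetical helper, and how the pipeline measures MGMT promoter coverage internally is not described here.

```shell
#!/usr/bin/env bash
# Decide whether MGMT analysis would run for a given coverage value.

# $1 = observed coverage (may be fractional), $2 = threshold (default 5)
mgmt_should_run() {
    awk -v cov="$1" -v min="${2:-5}" \
        'BEGIN { exit ((cov + 0 >= min + 0) ? 0 : 1) }'
}

observed="7.5"
threshold="10"
if mgmt_should_run "$observed" "$threshold"; then
    echo "coverage $observed meets threshold $threshold: MGMT analysis runs"
else
    echo "coverage $observed is below threshold $threshold: MGMT analysis is skipped"
fi
```

The comparison is delegated to `awk` because shell integer tests cannot handle fractional coverage values.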
| Parameter | Description | Default | Example |
|---|---|---|---|
| `--containerEngine` | Container engine to use: 'docker' or 'singularity'. | `docker` | `--containerEngine singularity` |
| `--seq` | Sequencer platform identifier. Auto-detected from BAM headers, but can be manually set. | `P2` | `--seq F` |
The pipeline supports different compute infrastructure profiles:
- Executor: LSF
- Queue: normal
- Memory: 8 GB per process
- CPUs: 2 per process
- GPU: Available for variant calling and structural variant processes
- Executor: SLURM
- Queue: batch
- Memory: 8 GB per process
- CPUs: 2 per process
- GPU: Available for variant calling and structural variant processes
- Executor: Local
- Memory: 4 GB per process
- CPUs: 1 per process
- GPU: Not available
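If the same launch script is used on several systems, the profile can be chosen from the scheduler commands that are present. The sketch below is an assumption-heavy illustration: `pick_profile` is a hypothetical helper, the profile names `lsf` and `slurm` come from the example commands in this document, and `local` is an assumed name for the local-executor profile that should be checked against `nextflow.config`.

```shell
#!/usr/bin/env bash
# Pick a profile name from the scheduler commands found on PATH.
# ASSUMPTION: "local" is a guess at the local profile's name.
pick_profile() {
    if command -v bsub >/dev/null 2>&1; then
        echo "lsf"
    elif command -v sbatch >/dev/null 2>&1; then
        echo "slurm"
    else
        echo "local"
    fi
}

echo "nextflow run main.nf --input /data/sample.bam --id SAMPLE001 -profile $(pick_profile)"
```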
The pipeline generates comprehensive outputs in the specified output directory:
- SNV analysis: Variant calls, annotations, and filtered reports
- Structural variants: SV calls with annotations
- Copy number variations: CNV analysis with plots and annotations
- Methylation analysis: Methylation calls and classification results
- MGMT analysis: Promoter methylation status and predictions
- Coverage analysis: Depth of coverage summaries
- Reports: HTML reports with comprehensive molecular diagnostic information
MNP-Flex is a methylation-based classifier that provides detailed molecular classification of CNS tumours. The Rapid-CNS² pipeline can prepare the necessary input files for MNP-Flex analysis.
MNP-Flex is compatible with the latest version of the Heidelberg CNS tumour methylation classifier and can classify CNS tumours into 184 subclasses according to the 2021 WHO classification.
- Enable MNP-Flex in the pipeline:
  nextflow run main.nf \
  --input /data/sample.bam \
  --id SAMPLE001 \
  --mnpFlex true \
  -profile lsf
- Locate the output files: The pipeline creates MNP-Flex compatible files in:
  ${outDir}/mnpflex/
  └── ${id}.MNPFlex.subset.bed
- Upload to MNP-Flex:
  - Visit mnp-flex.org
  - Upload the `.bed` file generated by the pipeline
  - Submit for analysis
- Interpret results:
  - MNP-Flex will provide detailed classification results
  - Results include confidence scores
  - Reports are compatible with clinical interpretation guidelines
${id}.MNPFlex.subset.bed: Methylation data in MNP-Flex compatible format
Note: MNP-Flex analysis is performed externally on the mnp-flex.org platform. The pipeline only prepares the input files. For detailed information about MNP-Flex, visit mnp-flex.org.
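Before uploading, it can be useful to confirm that the run actually produced the expected file. The sketch below builds the path from the layout documented above and checks that the file exists and is not empty. The function names are hypothetical, and nothing here contacts mnp-flex.org.

```shell
#!/usr/bin/env bash
# Locate the MNP-Flex input file produced by a run with --mnpFlex true.

# $1 = outDir, $2 = sample id
mnpflex_bed_path() {
    printf '%s/mnpflex/%s.MNPFlex.subset.bed\n' "${1%/}" "$2"
}

# Succeed if the expected file exists and is not empty.
check_mnpflex_bed() {
    local bed
    bed=$(mnpflex_bed_path "$1" "$2")
    if [ -s "$bed" ]; then
        echo "ready for upload: $bed"
    else
        echo "missing or empty: $bed" >&2
        return 1
    fi
}

echo "expected file: $(mnpflex_bed_path ./results SAMPLE001)"
```

After a run, `check_mnpflex_bed ./results SAMPLE001` reports whether the file is ready for upload.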
- Check the Nextflow documentation: https://www.nextflow.io/docs/
- Review the pipeline logs in the `work/` directory
- Check the `pipeline_info/` directory for execution reports
- Run with `--help` for available options
If you use this pipeline, please cite our work:
Patel, A., Göbel, K., Ille, S. et al. Prospective, multicenter validation of a platform for rapid molecular profiling of central nervous system tumors. Nature Medicine 31, 1567–1577 (2025). https://doi.org/10.1038/s41591-025-03562-5
This project is licensed under the MIT License.
