Rapid-CNS² workflow

Overview

The Rapid-CNS² Nextflow pipeline is a bioinformatics workflow for comprehensive analysis of genomic and epigenomic data generated by adaptive-sampling-based nanopore sequencing of central nervous system (CNS) tumours. It performs alignment, SNV calling, structural variant calling, methylation analysis, and copy number variation calling, and produces a comprehensive molecular report.

This pipeline is implemented using Nextflow, allowing for easy execution and scalability on various compute environments, including local machines, clusters, and cloud platforms.

Features

Modular architecture for easy customization and extension
Flexible input handling - supports aligned and unaligned BAM files with automatic alignment detection
Accelerated variant calling with Clara Parabricks-supported DeepVariant, plus structural variant calling with Sniffles2
Comprehensive analysis including methylation analysis with Rapid-CNS² classifier and MGMT promoter methylation status
Automated reporting with molecular diagnostic-ready reports
MNP-Flex integration for additional classifier input preparation (optional)

Requirements

Nextflow: version 20.10.0 or later (recommended 22.10.0+)
Container Engine: Docker (recommended) or Singularity
Java: OpenJDK 8 or later (Nextflow 22.x and newer need Java 11 or later)
System: Linux (Ubuntu 18.04+, CentOS 7+, or similar)
Memory: Minimum 8GB RAM, recommended 32GB+ for large datasets
Storage: At least 100GB free space for reference genomes and databases
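
Before installing anything else, it can help to see which of these tools are already on the PATH. The following is a minimal sketch, not part of the pipeline; the function names `have_cmd` and `check_requirements` are made up for this example, and it only checks that the commands exist, not their versions.

```shell
# Return 0 if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per requirement; return non-zero if anything is missing.
check_requirements() {
    missing=0
    for tool in nextflow java; do
        if have_cmd "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Either container engine is sufficient.
    if have_cmd docker || have_cmd singularity; then
        echo "ok: container engine"
    else
        echo "missing: docker or singularity"
        missing=1
    fi
    return "$missing"
}

check_requirements || echo "Install the missing tools before running the pipeline."
```

Version checks (for example `nextflow -version` and `java -version`) still need to be done by hand.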

Quick start

1. Clone the repository

git clone https://github.com/areebapatel/Rapid-CNS2_nf.git
cd Rapid-CNS2_nf

2. Install dependencies

Install Nextflow

# Using Conda (recommended)
conda create -n nextflow python=3.9
conda activate nextflow
conda install -c bioconda nextflow

# Or manual installation
curl -s https://get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/

Install container engine

# Docker (recommended)
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER

# Or Singularity
sudo apt-get update
sudo apt-get install -y singularity-container

3. Download reference genome

# Create reference directory
mkdir -p /path/to/references/hg38

# Download UCSC hg38 reference genome
wget https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
gunzip hg38.fa.gz
mv hg38.fa /path/to/references/hg38/hg38.fa

# Create index files
samtools faidx /path/to/references/hg38/hg38.fa
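
A quick sanity check after this step is to confirm that both the FASTA and the `.fai` index written by `samtools faidx` are present and non-empty. This is a sketch only; `check_reference` is a hypothetical helper name and the path is the placeholder used above.

```shell
# Return 0 only if the FASTA and its samtools index (.fai) both exist and are non-empty.
check_reference() {
    fasta="$1"
    [ -s "$fasta" ] || { echo "missing or empty FASTA: $fasta" >&2; return 1; }
    [ -s "$fasta.fai" ] || { echo "missing index: $fasta.fai (run samtools faidx)" >&2; return 1; }
    echo "reference ok: $fasta"
}

# Example call with the placeholder path from this section.
check_reference /path/to/references/hg38/hg38.fa || echo "Fix the reference before running the pipeline."
```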

4. Install ANNOVAR

ANNOVAR is required for variant annotation:

Register and download:
- Visit ANNOVAR Download Form
- Fill out the registration form with your institutional email
- Download the package from the link sent to your email
Install and set up:

# Extract ANNOVAR
tar -xzf annovar.latest.tar.gz
cd annovar
chmod +x *.pl

# Create humandb directory and download databases
mkdir -p humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar refGene humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar cytoBand humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar clinvar_20240917 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar avsnp151 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar 1000g2015aug humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar cosmic70 humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar dbnsfp42c humandb/
./annotate_variation.pl -buildver hg38 -downdb -webfrom annovar allofus humandb/

Note: ANNOVAR is freely available for personal, academic, and non-profit use only. Commercial users must purchase a license from QIAGEN.

5. Install AnnotSV

AnnotSV is required for structural variant annotation:

# Navigate to installation directory
cd /path/to/install/

# Clone AnnotSV repository
git clone https://github.com/lgmgeo/AnnotSV.git

# Navigate to AnnotSV directory
cd AnnotSV

# Install AnnotSV
make PREFIX=. install

# Install human annotations
make PREFIX=. install-human-annotation

Note: AnnotSV is freely available as open-source software (GNU GPL v3 at the time of writing; check the LICENSE file in the AnnotSV repository). See the AnnotSV repository for more information.

6. Configure the pipeline

Edit the nextflow.config file with your system-specific paths:

params {
    // Update these paths to match your system
    ref = "/path/to/references/hg38/hg38.fa"
    annovarPath = "/path/to/annovar/"
    annovarDB = "/path/to/annovar/humandb/"
    annotsvAnnot = "/path/to/AnnotSV/Annotations_Human/"
}
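
As an alternative to editing the shipped file, the same four parameters can be kept in a separate file and layered on top with Nextflow's `-c` option. The sketch below assumes that approach suits your setup; `write_site_config` and the file name `site.config` are hypothetical, and the directory layout mirrors the placeholder paths above.

```shell
# Write a small site-specific config so the shipped nextflow.config stays untouched.
# Arguments: output file, reference FASTA, ANNOVAR directory, AnnotSV directory.
write_site_config() {
    out="$1"; ref="$2"; annovar="$3"; annotsv="$4"
    cat > "$out" <<EOF
params {
    ref = "$ref"
    annovarPath = "$annovar/"
    annovarDB = "$annovar/humandb/"
    annotsvAnnot = "$annotsv/Annotations_Human/"
}
EOF
}

write_site_config site.config \
    /path/to/references/hg38/hg38.fa \
    /path/to/annovar \
    /path/to/AnnotSV

# Then pass it alongside the defaults, for example:
#   nextflow run main.nf -c site.config --input /data/sample.bam --id SAMPLE001 -profile lsf
```

Settings from a file given with `-c` take priority over the project's `nextflow.config`, so only the values that differ need to be listed.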

7. Run the pipeline

Basic run

nextflow run main.nf \
    --input /data/sample.bam \
    --id SAMPLE001 \
    --outDir ./results \
    -profile lsf

Advanced run

nextflow run main.nf \
    --input /data/sample.bam \
    --id SAMPLE001 \
    --patient "John Doe" \
    --outDir ./results \
    --minimumMgmtCov 10 \
    --mnpFlex true \
    --runHumanVariation true \
    --containerEngine singularity \
    -profile slurm
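
For several samples, the basic command can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. It assumes each BAM in the directory is a separate, already merged sample and that the file name (without `.bam`) is an acceptable sample ID; both are assumptions of this example, and `print_batch_commands` is a hypothetical helper.

```shell
# Print (not run) one pipeline command per BAM file found in a directory.
# Arguments: directory with BAMs, root output directory, profile name.
print_batch_commands() {
    bam_dir="$1"; out_root="$2"; profile="$3"
    for bam in "$bam_dir"/*.bam; do
        [ -e "$bam" ] || continue    # skip the literal pattern when nothing matches
        id=$(basename "$bam" .bam)   # sample ID derived from the file name
        echo "nextflow run main.nf --input $bam --id $id --outDir $out_root/$id -profile $profile"
    done
}

# Example: review the commands, then pipe them to sh once they look right.
print_batch_commands /data ./results lsf
```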

Input requirements

Sequencing requirements

⚠️ IMPORTANT: This pipeline is specifically designed for Oxford Nanopore Technologies (ONT) data generated using adaptive sampling with the gene panel described in Patel et al. 2022 and Patel et al. 2025.

Platform: Oxford Nanopore Technologies (MinION, GridION, PromethION)
Sequencing Method: Adaptive sampling with targeted gene panel
Gene Panel: NPHD panel (160 gene regions) as described in:
- Patel et al. (2022). Rapid-CNS²: rapid comprehensive adaptive nanopore-sequencing of CNS tumors. Acta Neuropathologica 143, 609–612
- Patel et al. (2025). Prospective, multicenter validation of a platform for rapid molecular profiling of central nervous system tumors. Nature Medicine 31, 1567–1577

⚠️ NOT SUITABLE FOR:

Shallow whole genome sequencing (WGS)
Other sequencing platforms (Illumina, PacBio, etc.)
Non-targeted sequencing approaches

⚠️ WGS WARNING: If using whole genome sequencing data, the average coverage should be at least 10X for reliable reporting of variants. Lower coverage may result in:

Reduced sensitivity for variant detection
Increased false negative rates
Unreliable methylation analysis
Poor MGMT promoter methylation assessment
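
If a mosdepth summary file is available for the sample, the 10X guideline can be checked from its `total` row. This is a sketch under that assumption: it relies on the standard mosdepth summary layout (tab-separated columns `chrom`, `length`, `bases`, `mean`, `min`, `max`), the helper names are made up, and the example file name in the comment is hypothetical rather than a documented pipeline output.

```shell
# Read the overall mean depth from a mosdepth summary file (row "total", column "mean").
mean_coverage() {
    awk -F '\t' '$1 == "total" { print $4 }' "$1"
}

# Return 0 if the mean depth is at least the threshold (default 10).
coverage_ok() {
    summary="$1"; min="${2:-10}"
    mean=$(mean_coverage "$summary")
    [ -n "$mean" ] || return 1
    awk -v m="$mean" -v t="$min" 'BEGIN { exit !(m + 0 >= t + 0) }'
}

# Example (hypothetical file name):
#   coverage_ok SAMPLE001.mosdepth.summary.txt 10 || echo "coverage below 10X"
```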

Basecalling

Basecalling must be performed externally.

Use epi2me-labs/wf-basecalling or Dorado directly
Ensure you use a model that supports modified basecalling (see Dorado documentation)
Provide the resulting BAM(s) as input to this pipeline

Input options

The pipeline accepts:

(Preferred) Single aligned BAM file: Direct path to a single aligned and merged BAM file (e.g., /path/to/sample.bam)
Directory with aligned BAM files: Path to directory containing multiple aligned BAM files (will be merged automatically)
Directory with unaligned BAM files: Path to directory containing multiple unaligned BAM files (will be aligned and merged automatically)
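
To see in advance which of these cases a BAM falls into, one common heuristic is to look for `@SQ` (reference sequence) lines in its header: aligned BAMs have them, unaligned BAMs usually do not. The sketch below illustrates that heuristic only; it is not necessarily how the pipeline's own alignment detection works, and `classify_header` is a hypothetical helper.

```shell
# Classify a BAM header read from stdin by the presence of @SQ lines.
classify_header() {
    if grep -q '^@SQ'; then
        echo aligned
    else
        echo unaligned
    fi
}

# Typical use (requires samtools):
#   samtools view -H /path/to/sample.bam | classify_header
```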

Pipeline structure

graph TD
    A[Input Data BAM] --> B[Alignment Check]
    B --> C[Alignment if needed]
    B --> D[Direct Processing]
    C --> E[Merge BAMs]
    D --> E
    E --> F[SNV Calling]
    E --> G[Methylation Analysis]
    E --> H[Structural Variant Calling]
    F --> I[Annotation & Filtering]
    G --> J[MGMT Promoter Analysis]
    G --> K[Methylation Classification]
    H --> L[Structural Variant Annotation]
    I --> M[Report Generation]
    J --> M
    K --> M
    L --> M

Parameters

Required parameters

| Parameter | Description | Example |
| --- | --- | --- |
| `--input` | Required. Path to input BAM file(s). Can be a single aligned BAM file (`/path/to/sample.bam`), a directory with aligned BAMs (`/path/to/aligned_bams/`), or a directory with unaligned BAMs (`/path/to/unaligned_bams/`). | `--input /data/sample.bam` |
| `--id` | Required. Unique sample identifier used for naming output files and reports. Should be alphanumeric with no spaces. | `--id SAMPLE001` |
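
Because the sample ID ends up in output file names, it is worth rejecting bad values before launching a run. A minimal sketch of the "alphanumeric with no spaces" rule follows; `valid_id` is a hypothetical helper, and the strict letters-and-digits reading of the rule is an assumption.

```shell
# Return 0 if the ID is non-empty and contains only ASCII letters and digits.
valid_id() {
    case "$1" in
        ''|*[!A-Za-z0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

valid_id "SAMPLE001" && echo "SAMPLE001 is a valid ID"
valid_id "SAMPLE 001" || echo "'SAMPLE 001' is rejected (contains a space)"
```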

System-specific parameters

These parameters must be configured for your specific system and installation paths in the nextflow.config file:

| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| `--ref` | Path to hg38 reference genome FASTA file. | System-specific | `--ref /refs/hg38.fa` |
| `--annovarPath` | Path to ANNOVAR installation directory. | System-specific | `--annovarPath /tools/annovar/` |
| `--annovarDB` | Path to ANNOVAR database directory (humandb/). | System-specific | `--annovarDB /tools/annovar/humandb/` |
| `--annotsvAnnot` | Path to AnnotSV annotations directory. | System-specific | `--annotsvAnnot /tools/AnnotSV/Annotations_Human/` |
| `--annotations` | Path to annotation file for IGV reports (refGene.txt.gz). | `data/refGene.txt` | `--annotations /refs/refGene.txt` |

Optional parameters

Output parameters

| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| `--outDir` | Output directory for all pipeline results. Will be created if it doesn't exist. | `output` | `--outDir /results/analysis` |
| `--tmpDir` | Directory for temporary files. Auto-set to `${outDir}/tmp/` unless overridden. | `${outDir}/tmp/` | `--tmpDir /scratch/tmp` |
| `--logDir` | Directory for log files. | `logDir` | `--logDir /logs` |
| `--patient` | Patient name for reports. If not specified, uses the `--id` value. | `null` (uses `--id`) | `--patient "John Doe"` |

Resource parameters

| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| `--maxThreads` | Maximum number of threads for general processes. | `64` | `--maxThreads 32` |
| `--modkitThreads` | Number of threads for modkit methylation calling. | `32` | `--modkitThreads 16` |
| `--cnvThreads` | Number of threads for CNVpytor copy number analysis. | `32` | `--cnvThreads 16` |
| `--snifflesThreads` | Number of threads for Sniffles2 structural variant calling. | `32` | `--snifflesThreads 16` |
| `--snpThreads` | Number of threads for SNV calling with DeepVariant. | `64` | `--snpThreads 32` |
| `--svThreads` | Number of threads for structural variant calling. | `64` | `--svThreads 32` |
| `--covThreads` | Number of threads for coverage calculation with mosdepth. | `8` | `--covThreads 4` |
| `--methThreads` | Number of threads for methylation classification. | `64` | `--methThreads 32` |
| `--mgmtThreads` | Number of threads for MGMT promoter analysis. | `8` | `--mgmtThreads 4` |
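
Several of these defaults (up to 64 threads) exceed what a workstation offers, so it may be useful to cap them at the number of available CPUs. The sketch below shows one way to derive the values; `cap_threads` is a hypothetical helper, and whether capping is needed on your system is an assumption.

```shell
# Print the smaller of the requested thread count and the available CPU count.
cap_threads() {
    requested="$1"; available="$2"
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

cores=$(getconf _NPROCESSORS_ONLN)   # CPUs currently online
echo "--maxThreads $(cap_threads 64 "$cores") --modkitThreads $(cap_threads 32 "$cores")"
```

The printed flags can be appended to the `nextflow run` command line.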

Analysis parameters

| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| `--minimumMgmtCov` | Minimum coverage threshold for MGMT promoter methylation analysis. If coverage is below this threshold, MGMT analysis will be skipped. | `5` | `--minimumMgmtCov 10` |
| `--bamMinCoverage` | Minimum coverage threshold for human variation workflow. | `10` | `--bamMinCoverage 15` |
| `--mnpFlex` | Enable MNP-Flex classifier input preparation. Creates files needed for external MNP-Flex analysis. | `false` | `--mnpFlex true` |
| `--runHumanVariation` | Enable wf-human-variation SNP and SV pipeline. Adds additional variant calling workflows. | `false` | `--runHumanVariation true` |

Container and system parameters

| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| `--containerEngine` | Container engine to use: 'docker' or 'singularity'. | `docker` | `--containerEngine singularity` |
| `--seq` | Sequencer platform identifier. Auto-detected from BAM headers, but can be manually set. | `P2` | `--seq F` |

Profile-specific parameters

The pipeline supports different compute infrastructure profiles:

LSF profile (`-profile lsf`)

Executor: LSF
Queue: normal
Memory: 8 GB per process
CPUs: 2 per process
GPU: Available for variant calling and structural variant processes

SLURM profile (`-profile slurm`)

Executor: SLURM
Queue: batch
Memory: 8 GB per process
CPUs: 2 per process
GPU: Available for variant calling and structural variant processes

Local profile (`-profile local`)

Executor: Local
Memory: 4 GB per process
CPUs: 1 per process
GPU: Not available
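
When the same launch script is used on different machines, the profile can be chosen from the scheduler commands that are present. This is only a sketch: it assumes that finding `bsub` means LSF is usable and finding `sbatch` means SLURM is usable, and `pick_profile` is a hypothetical helper.

```shell
# Suggest a profile name based on which job submission command is on PATH.
pick_profile() {
    if command -v bsub >/dev/null 2>&1; then
        echo lsf
    elif command -v sbatch >/dev/null 2>&1; then
        echo slurm
    else
        echo local
    fi
}

echo "Suggested profile: $(pick_profile)"
# Example: nextflow run main.nf --input /data/sample.bam --id SAMPLE001 -profile "$(pick_profile)"
```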

Output

The pipeline generates comprehensive outputs in the specified output directory:

SNV analysis: Variant calls, annotations, and filtered reports
Structural variants: SV calls with annotations
Copy number variations: CNV analysis with plots and annotations
Methylation analysis: Methylation calls and classification results
MGMT analysis: Promoter methylation status and predictions
Coverage analysis: Depth of coverage summaries
Reports: HTML reports with comprehensive molecular diagnostic information

MNP-Flex integration

MNP-Flex is a methylation-based tumour classifier that provides detailed molecular classification of CNS tumours. The Rapid-CNS² pipeline can prepare the necessary input files for MNP-Flex analysis.

What is MNP-Flex?

MNP-Flex is a methylation classifier compatible with the latest version of the Heidelberg CNS tumour methylation classifier. It can classify CNS tumours into 184 subclasses according to the 2021 WHO classification.

How to use MNP-Flex

Enable MNP-Flex in the pipeline:

nextflow run main.nf \
    --input /data/sample.bam \
    --id SAMPLE001 \
    --mnpFlex true \
    -profile lsf

Locate the output files: The pipeline creates MNP-Flex compatible files in:
```
${outDir}/mnpflex/
└── ${id}.MNPFlex.subset.bed
```
Upload to MNP-Flex:
- Visit mnp-flex.org
- Upload the .bed file generated by the pipeline
- Submit for analysis
Interpret results:
- MNP-Flex will provide detailed classification results
- Results include confidence scores
- Reports are compatible with clinical interpretation guidelines

Output files

${id}.MNPFlex.subset.bed: Methylation data in MNP-Flex compatible format

Note: MNP-Flex analysis is performed externally on the mnp-flex.org platform. The pipeline only prepares the input files. For detailed information about MNP-Flex, visit mnp-flex.org.
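
Before uploading, it is easy to confirm that the file was written where expected. The sketch below builds the path from the layout shown above and checks that the file is non-empty; the helper names are hypothetical.

```shell
# Build the expected path of the MNP-Flex input file for a given outDir and sample ID.
mnpflex_bed_path() {
    echo "$1/mnpflex/$2.MNPFlex.subset.bed"
}

# Return 0 if the file exists and is non-empty.
check_mnpflex_output() {
    bed=$(mnpflex_bed_path "$1" "$2")
    if [ -s "$bed" ]; then
        echo "ready for upload: $bed"
    else
        echo "not found or empty: $bed" >&2
        return 1
    fi
}

# Example with the values from the commands above:
check_mnpflex_output ./results SAMPLE001 || echo "Run the pipeline with --mnpFlex true first."
```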

Getting Help

Check the Nextflow documentation: https://www.nextflow.io/docs/
Review `.nextflow.log` in the launch directory and the per-task logs (`.command.log`, `.command.err`) under the work/ directory
Check the pipeline_info/ directory for execution reports
Run `nextflow run main.nf --help` for available options

Citation

If you use this pipeline, please cite our work:

Patel, A., Göbel, K., Ille, S. et al. Prospective, multicenter validation of a platform for rapid molecular profiling of central nervous system tumors. Nature Medicine 31, 1567–1577 (2025). https://doi.org/10.1038/s41591-025-03562-5

License

This project is licensed under the MIT License.
