Nextflow-based pySCENIC Pipeline

An AnnData-native Nextflow pipeline that makes pySCENIC accessible to researchers with access to a high-performance computing (HPC) cluster. It was built as a component of my senior capstone project for the Bioinformatics concentration of the Biomolecular Engineering & Bioinformatics major at UC Santa Cruz.

Prerequisites

You must have access to an HPC cluster running the SLURM scheduler to use this pipeline; most universities offer such a resource. Local and cloud-based (e.g. AWS Batch) runtime environments are not currently supported, but may be implemented in the future.

The following software must exist on your cluster:

Nextflow with DSL2 support
Singularity or Apptainer

You must also build the Singularity image file (.sif) on your local machine from the Dockerfile provided in containers/, then transfer the resulting .sif file to your cluster. Run the following locally:

# ON A LOCAL MACHINE
cd containers
bash build.sh
scp [sif filepath] user@hostname:[expected sif filepath]  # copy sif file to cluster

On the cluster, set the io_container variable in the params block of nextflow.config to the [expected sif filepath] you passed to scp (the location of the .sif file on the cluster). Alternatively, pass the full filepath at pipeline runtime with the --io_container <sif filepath> flag.
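A wrong container path typically only shows up once jobs start, so it can help to check the file before launching. The following is a minimal sketch and not part of the pipeline; `check_sif` is a hypothetical helper name, and it only tests that the path is a readable, non-empty regular file with a .sif extension.

```shell
# Hypothetical pre-flight helper (not shipped with the pipeline).
# Succeeds only if the path ends in .sif and is a readable, non-empty regular file.
check_sif() (
    sif_path="$1"
    case "$sif_path" in
        *.sif) ;;  # extension looks right, keep checking
        *) echo "not a .sif file: $sif_path" >&2; return 1 ;;
    esac
    if [ ! -f "$sif_path" ] || [ ! -r "$sif_path" ] || [ ! -s "$sif_path" ]; then
        echo "missing, unreadable, or empty: $sif_path" >&2
        return 1
    fi
    return 0
)
```

It could be chained in front of the run command, for example `check_sif "$SIF" && nextflow run main.nf --h5ad <input> --outdir <output directory> --io_container "$SIF"`, where `SIF` is a variable you set to the .sif location on the cluster.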

Usage

# ASSUMED PREREQUISITES:
# - working on a SLURM-based cluster
# - nextflow available as a module
# - singularity / apptainer already installed and active
# - .sif file already built locally and copied to cluster
# - params.io_container in nextflow.config reflects correct .sif filepath on cluster

# 1. download required databases
cd databases
bash download.sh
cd ..

# 2. load required modules
module load nextflow

# 3. execute pipeline
nextflow run main.nf --h5ad <input> --outdir <output directory> [--replicates <number>]
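To illustrate how the optional flags combine with the required ones, here is a small dry-run sketch. `build_scenic_cmd` is a hypothetical helper, not something the repository provides; it only prints the command line, using the flags documented in this README, and leaves out any optional flag you do not supply so the pipeline defaults apply.

```shell
# Hypothetical dry-run helper: prints the nextflow command, does not execute it.
# Usage: build_scenic_cmd <h5ad> <outdir> [replicates] [partition]
build_scenic_cmd() (
    h5ad="$1"
    outdir="$2"
    replicates="${3:-}"
    partition="${4:-}"
    cmd="nextflow run main.nf --h5ad $h5ad --outdir $outdir"
    # Append optional flags only when a value was given.
    if [ -n "$replicates" ]; then
        cmd="$cmd --replicates $replicates"
    fi
    if [ -n "$partition" ]; then
        cmd="$cmd --partition $partition"
    fi
    printf '%s\n' "$cmd"
)
```

Because the result is a plain string, this form is only meant for inspecting the command; paths containing spaces would need quoting before being executed.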

Output Structure

output-directory/
├── aucell/
│   └── auc_mtx.csv         # regulon activation scores per cell
├── cistarget/              
│   └── regulons.csv        # regulons: pruned co-expression modules supported by cistarget databases
├── grn/                    
│   └── adj.tsv             # co-expression adjacencies averaged across the replicate runs (see --replicates)
└── annotated.h5ad          # Resulting h5ad object annotated with auc_mtx.csv in adata.obsm['X_scenic_auc'] and regulons in adata.uns['scenic_regulons']
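After a run finishes, a quick way to confirm that every file in the layout above was written is a check like the one below. This is a sketch under the assumption that the output directory matches the tree shown here; `check_outputs` is a hypothetical helper name and is not part of the pipeline.

```shell
# Hypothetical post-run check: verifies that each expected output file
# exists and is non-empty under the given output directory.
check_outputs() (
    outdir="$1"
    status=0
    for rel in aucell/auc_mtx.csv cistarget/regulons.csv grn/adj.tsv annotated.h5ad; do
        if [ ! -s "$outdir/$rel" ]; then
            echo "missing or empty: $outdir/$rel" >&2
            status=1
        fi
    done
    return "$status"
)
```

Calling `check_outputs <output directory>` returns a non-zero status and lists the affected paths on stderr if anything is absent.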

Pipeline Parameters

Flag	Type	Required	Default	Description
`--h5ad <path>`	Path	Yes	N/A	Path to an AnnData object (.h5ad) with raw counts in `adata.X`. Subsetting to highly variable genes beforehand is highly recommended.
`--outdir <path>`	Path	Yes	results/	Output directory to write files to. See Output Structure for expected output.
`--replicates <number>`	integer	No	10	Number of replicates to average in the GRN step. 10 or more is highly recommended because GRN inference is stochastic.
`--partition <cluster partition>`	string	No	null	Cluster partition to run the pipeline on. If left empty (null), the choice is left to the scheduler.
`--io_container <path>`	Path	Yes	`/home/aspandit/lab/scenic/containers/h5ad-to-loom.sif`	Path to the Singularity image file used for the I/O pipeline steps. Editing this value in `nextflow.config` is strongly recommended. A future update will use an image pushed to a container registry instead.
