flowchart TD
A["Input tables"] --> B["DESeq2 model"]
B --> C["DE tables: step2a"]
B --> D["Scaled matrix (VST + z-score)"]
D --> E["PCA"]
D --> F["Heatmap of top DE genes"]
B --> G["Volcano plots"]
B --> H["MA plots"]
D --> I["Boxplots: top 10 DE genes"]
B --> J["Top 10 up/down lists"]
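
The "Scaled matrix (VST + z-score)" node feeds the PCA, heatmap, and boxplots. As a rough illustration of the z-score half only, the Python sketch below scales each gene across samples. It is not the pipeline's R code: the function name `zscore_rows`, the per-gene (row-wise) direction, and the n - 1 standard deviation are assumptions, and the input is taken to be already variance-stabilised.

```python
import statistics

def zscore_rows(matrix):
    """Scale each gene (row) to mean 0 and sample standard deviation 1.

    `matrix` maps a gene ID to its list of already-transformed expression
    values, one per sample. Each gene needs at least two values.
    """
    scaled = {}
    for gene, values in matrix.items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # n - 1 denominator (assumed)
        # A gene with no variation carries no signal; map it to zeros
        # instead of dividing by zero.
        scaled[gene] = [(v - mean) / sd if sd > 0 else 0.0 for v in values]
    return scaled
```

Scaling per gene puts genes with very different expression levels on a common scale, which is what makes a heatmap of top DE genes readable.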
- Input expected under input/: em.csv (tab-delimited counts, first column ID), sample_sheet.csv (SAMPLE, SAMPLE_GROUP), annotations.csv (Gene ID, Associated Gene Name).
- Flags:
  - --pcol: which significance column to use (pvalue or padj)
  - --pthresh: significance threshold (default 0.01)
  - --lfc: absolute log2 fold-change cutoff (default 1)
- Run with p-values:
nextflow run main.nf --pcol pvalue --pthresh 0.01 --lfc 1
- Run with adjusted p-values:
nextflow run main.nf --pcol padj --pthresh 0.05 --lfc 1
- Output:
  - Tables in output_step2a
  - Plots in output_step2b (pvalue) or output_step2c (padj)
Outputs are written to output_step2a and either output_step2b (pvalue) or output_step2c (padj).
This pipeline runs DESeq2 on the provided experiment inputs, filters non-finite values, produces DE tables, PCA, heatmap, volcano/MA plots, boxplots, and top-10 up/down tables. It enforces the sample group order gut -> duct -> node.
Expected under input/:
- em.csv (tab-delimited expression matrix, first column ID)
- sample_sheet.csv (tab-delimited: SAMPLE, SAMPLE_GROUP)
- annotations.csv (tab-delimited: Gene ID, Associated Gene Name)
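
Because the files carry a .csv extension but are tab-delimited, a quick pre-flight check can catch a wrongly delimited file before the pipeline runs. The Python sketch below is a hypothetical helper and not part of the pipeline; `check_inputs` and `header_of` are made-up names, and it assumes that the sample columns in em.csv are named after the SAMPLE values in the sample sheet.

```python
import csv
import io

# Columns the README lists for each file.
REQUIRED = {
    "sample_sheet.csv": ["SAMPLE", "SAMPLE_GROUP"],
    "annotations.csv": ["Gene ID", "Associated Gene Name"],
}

def header_of(text):
    """Return the column names from the first line of tab-delimited text."""
    return next(csv.reader(io.StringIO(text), delimiter="\t"), [])

def check_inputs(em_text, sheet_text, ann_text):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    em_cols = header_of(em_text)
    if not em_cols or em_cols[0] != "ID":
        problems.append("em.csv: first column must be ID")
    for name, text in (("sample_sheet.csv", sheet_text),
                       ("annotations.csv", ann_text)):
        cols = header_of(text)
        for required in REQUIRED[name]:
            if required not in cols:
                problems.append(f"{name}: missing column {required}")
    # Assumption: each SAMPLE value names a column in em.csv.
    for row in csv.DictReader(io.StringIO(sheet_text), delimiter="\t"):
        sample = row.get("SAMPLE")
        if sample and sample not in em_cols[1:]:
            problems.append(f"em.csv: no column for sample {sample}")
    return problems
```

Reading each file's text with `open(path).read()` and passing it to `check_inputs` is enough to use it. A comma-delimited file shows up immediately, because its whole header parses as a single column.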
Always:
- output_step2a/de_gut_duct.tsv
- output_step2a/de_duct_node.tsv
- output_step2a/de_node_gut.tsv
When --pcol pvalue:
- output_step2b/ with PCA, heatmap, volcano, MA, boxplots, and top-10 tables
When --pcol padj:
- output_step2c/ with the same plots/tables based on adjusted p-values
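
The mapping from --pcol to output directory can be summarised in a few lines. The Python sketch below only mirrors the rules stated above. `output_dirs` is a hypothetical helper, and it assumes the output_step2* directories are created directly under the directory given by --outdir.

```python
from pathlib import PurePosixPath

# Plot/table directory chosen by the significance column.
PLOT_DIRS = {"pvalue": "output_step2b", "padj": "output_step2c"}

def output_dirs(pcol, outdir="."):
    """Return where DE tables and plots are expected for a given --pcol."""
    if pcol not in PLOT_DIRS:
        raise ValueError(f"--pcol must be one of {sorted(PLOT_DIRS)}, got {pcol!r}")
    base = PurePosixPath(outdir)  # assumed parent of the output_step2* dirs
    return {
        "tables": str(base / "output_step2a"),  # always written
        "plots": str(base / PLOT_DIRS[pcol]),
    }
```

For example, `output_dirs("padj", "results")["plots"]` gives `results/output_step2c`.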
- --pcol: pvalue or padj
- --pthresh: significance threshold (default 0.01)
- --lfc: absolute log2 fold-change threshold (default 1)
- --input_dir: input directory (default input)
- --outdir: output directory (default .)
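
To make the interaction of --pcol, --pthresh, and --lfc concrete, here is a minimal Python sketch of how one DE result row could be labelled. It is an illustration and not the pipeline's R code: `classify` is a hypothetical name, the strict inequalities are an assumption, and the row keys follow the standard DESeq2 result columns (log2FoldChange, pvalue, padj).

```python
import math

def classify(row, pcol="pvalue", pthresh=0.01, lfc=1.0):
    """Label one DE result row as 'up', 'down' or 'ns' (not significant)."""
    p = row.get(pcol)
    l2fc = row.get("log2FoldChange")
    # Missing or non-finite statistics can never pass the thresholds.
    if p is None or l2fc is None or not (math.isfinite(p) and math.isfinite(l2fc)):
        return "ns"
    # Assumed rule: significant if p is below the threshold AND the
    # absolute log2 fold change exceeds the cutoff.
    if p < pthresh and abs(l2fc) > lfc:
        return "up" if l2fc > 0 else "down"
    return "ns"

row = {"log2FoldChange": 2.3, "pvalue": 0.001, "padj": 0.03}
print(classify(row))                              # up
print(classify(row, pcol="padj"))                 # ns (0.03 is not below 0.01)
print(classify(row, pcol="padj", pthresh=0.05))   # up
```

The same gene can change label when switching from pvalue to padj, which is why the two example commands pair padj with a looser threshold of 0.05.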
nextflow run main.nf --pcol pvalue --pthresh 0.01 --lfc 1

If you prefer Docker, create a minimal container with R and the required packages. Then run:

nextflow run main.nf -profile docker --pcol pvalue --pthresh 0.01 --lfc 1

You will need a Docker-enabled nextflow.config profile that sets process.container to an image containing:
- R (>= 4.2 recommended)
- Bioconductor: DESeq2
- CRAN: ggplot2, ggrepel, pheatmap
Create a conda env with R + packages and use Nextflow's conda profile:
nextflow run main.nf -profile conda --pcol pvalue --pthresh 0.01 --lfc 1

Example nextflow.config snippet for conda:
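profiles {
    conda {
        // Recent Nextflow releases only use process.conda when conda is enabled
        conda.enabled = true
        process.conda = 'conda/renv.yml'
    }
    docker {
        // Docker must be enabled for process.container to take effect
        docker.enabled = true
        process.container = 'your-docker-image:tag'
    }
}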
- The pipeline filters out NaN/Inf values before writing outputs.
- Sample group order is enforced as gut, duct, node.
- Labels on volcano/MA plots use geom_label_repel.
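
The first two notes can be illustrated with a short Python sketch. This is not the pipeline's R implementation: `drop_non_finite` and `order_samples` are hypothetical names, and dropping whole rows is an assumption, since the notes do not say how non-finite values are handled beyond being filtered.

```python
import math

GROUP_ORDER = ["gut", "duct", "node"]

def drop_non_finite(rows, numeric_cols):
    """Keep only rows whose listed numeric columns are all finite numbers."""
    kept = []
    for row in rows:
        values = [row.get(col) for col in numeric_cols]
        # Missing values, NaN and +/-Inf all disqualify the row.
        if all(isinstance(v, (int, float)) and math.isfinite(v) for v in values):
            kept.append(row)
    return kept

def order_samples(sample_to_group):
    """Sort sample names by group (gut, duct, node).

    The sort is stable, so samples keep their input order within a group.
    An unknown group name raises ValueError.
    """
    return sorted(sample_to_group,
                  key=lambda sample: GROUP_ORDER.index(sample_to_group[sample]))
```

Fixing the group order in one place keeps the PCA legend, heatmap columns, and boxplot panels consistent across plots.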


