circdna_e2f5

This repository contains scripts and resources used to identify extrachromosomal DNA (ecDNA) in the MMTV-Cre E2F5-flox mouse model of breast cancer. The analysis leverages the nf-core/circdna pipeline and downstream tools including AmpliconArchitect, CycleViz, and AmpliconReconstructor.

Repository Structure

circdna_e2f5/
├── fetchngs_results/         # Raw data, metadata, and samplesheets
│   ├── fastq/
│   ├── metadata/
│   └── samplesheet/
├── mosek_license_dir/        # MOSEK license required for AmpliconArchitect
│   └── mosek.lic
├── results/                  # Output from pipeline (e.g., sv_view images)
│   └── ampliconsuite/
├── data_repo/                # Reference data required by AmpliconArchitect (for mm10)
├── run_circdna.sh            # Script to run nf-core/circdna
├── run_fetchngs.sh           # Script to fetch raw FASTQ files via nf-core/fetchngs
├── setup_data_repo.sh        # Script to set up data_repo for AmpliconArchitect
├── icer.config               # Custom config for running on MSU HPCC
├── ids.csv                   # List of SRA accessions to fetch (e.g., SRX IDs)
├── LICENSE
└── README.md

Requirements

Nextflow >= 22.10.1
nf-core/fetchngs and nf-core/circdna
Singularity (or Docker)
MOSEK license for AmpliconArchitect
Reference genome repository for AmpliconArchitect (see below)
Access to an HPC cluster with SLURM (configured in icer.config)

Installation

Clone this repo:

git clone https://github.com/johnvusich/circdna_e2f5.git
cd circdna_e2f5

Getting a MOSEK Academic License

AmpliconArchitect requires a MOSEK license. Academic users can request a free license as follows:

Visit the MOSEK license request page.
Fill out the form using your academic email address.
Once approved, download the license file (typically named mosek.lic).
Place the file in the following path in your local setup:

circdna_e2f5/mosek_license_dir/mosek.lic

Ensure that the directory containing the license file (mosek_license_dir/) is passed to the --mosek_license_dir parameter in the pipeline script.
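
Before submitting a job, it can help to confirm that the license file is actually in place. The following is a minimal sketch; check_mosek_license is a hypothetical helper and is not part of this repository. It only checks that mosek.lic exists and is non-empty, not that the license is valid.

```shell
# Hypothetical helper: confirm mosek.lic exists and is non-empty.
# Takes the license directory as its argument (default: mosek_license_dir).
check_mosek_license() {
    license_dir="${1:-mosek_license_dir}"
    if [ -s "${license_dir}/mosek.lic" ]; then
        echo "Found ${license_dir}/mosek.lic"
    else
        echo "Missing or empty: ${license_dir}/mosek.lic" >&2
        return 1
    fi
}
```

Run it from the repository root as: check_mosek_license mosek_license_dir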

Setting up `data_repo` for AmpliconArchitect (mm10)

AmpliconArchitect requires a structured reference data repository to function. In this analysis, the repository is set up at:

$SCRATCH/circdna_e2f5/data_repo

To set this up exactly as used in the pipeline, run the following script:

bash setup_data_repo.sh

This script will:

Create the expected data_repo directory
Download the mm10.tar.gz reference bundle
Unpack it with proper permissions
Set the AA_DATA_REPO environment variable (used by the pipeline)

Final Directory Structure Example

circdna_e2f5/data_repo
├── coverage.stats
├── mm10
│   ├── annotations
│   │   ├── gencode.vM10.basic.annotation_genes.gff
│   │   └── mm10GenomicSuperDup.tab
│   ├── cancer
│   │   ├── oncogene_list.txt
│   │   └── oncogenes
│   │       ├── AC_oncogene_set_mm10.gff
│   │       └── mm10_consensus_oncogenes_list_from_hg19.gff
│   ├── dummy_ploidy.vcf
│   ├── file_list.txt
│   ├── file_sources.txt
│   ├── last_updated.txt
│   ├── mm10-blacklist.v2.bed
│   ├── mm10_centromere.bed
│   ├── mm10_cnvkit_filtered_ref.cnn
│   ├── mm10_conserved_gain5.bed
│   ├── mm10_conserved_gain5_onco_subtract.bed
│   ├── mm10.fa
│   ├── mm10.fa.fai
│   ├── mm10.Hardison.Excludable.full.bed
│   ├── mm10_k35.mappability.bedgraph
│   ├── mm10_merged_centromeres_conserved_sorted.bed
│   ├── mm10_noAlt.fa.fai
│   └── onco_bed.bed
└── mm10.tar.gz

Make sure data_repo is readable by the pipeline and from inside the container environment. When running AmpliconArchitect manually, set AA_DATA_REPO to the data_repo path; otherwise, make sure the environment is configured so the pipeline can find it.
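
A quick sanity check of the unpacked repository might look like the sketch below. check_data_repo is a hypothetical helper, and the three files it looks for are only a sample taken from the directory listing above, not an authoritative list of what AmpliconArchitect requires.

```shell
# Hypothetical helper: check that a few expected mm10 files are present.
# Uses the first argument, else $AA_DATA_REPO, else ./data_repo.
check_data_repo() {
    repo="${1:-${AA_DATA_REPO:-data_repo}}"
    missing=0
    for f in mm10/mm10.fa mm10/mm10.fa.fai mm10/file_list.txt; do
        if [ ! -e "${repo}/${f}" ]; then
            echo "Missing: ${repo}/${f}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (assumes $SCRATCH is defined on your cluster):
#   export AA_DATA_REPO="$SCRATCH/circdna_e2f5/data_repo"
#   check_data_repo "$AA_DATA_REPO" && echo "data_repo looks complete"
```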

Step-by-Step Workflow

1. Fetch Raw Sequencing Data

Use run_fetchngs.sh to download data from SRA using the IDs listed in ids.csv.

sbatch run_fetchngs.sh

After the download finishes, edit samplesheet.csv and multiqc_config.yml in fetchngs_results/samplesheet/ as needed.
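
A malformed ids.csv is an easy way to lose a queued job, so a pre-flight check may be worthwhile. The sketch below assumes ids.csv holds one accession per line with no header. check_ids is a hypothetical helper, and its pattern is a simplification that only recognizes SRA/ENA/DDBJ run, experiment, sample, and study IDs (for example SRX18120904); nf-core/fetchngs itself accepts more ID types.

```shell
# Hypothetical helper: count accessions in ids.csv and flag odd-looking lines.
check_ids() {
    ids_file="${1:-ids.csv}"
    [ -r "$ids_file" ] || { echo "Cannot read ${ids_file}" >&2; return 1; }
    # Strip Windows line endings and skip blank lines.
    cleaned=$(tr -d '\r' < "$ids_file" | grep -v '^[[:space:]]*$' || true)
    # Simplified pattern: [SED]R + R/X/S/P + digits (e.g., SRR, SRX, ERS, DRP).
    bad=$(printf '%s\n' "$cleaned" | grep -Ev '^[SED]R[RXSP][0-9]+$' || true)
    if [ -n "$bad" ]; then
        echo "Unrecognized entries in ${ids_file}:" >&2
        printf '%s\n' "$bad" >&2
        return 1
    fi
    echo "$(printf '%s\n' "$cleaned" | grep -c .) accession(s) in ${ids_file}"
}
```

Running check_ids ids.csv before sbatch run_fetchngs.sh prints the number of accessions, or lists the lines that did not match.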

2. Run nf-core/circdna

Use run_circdna.sh to launch the circular DNA analysis pipeline with AmpliconArchitect.

sbatch run_circdna.sh

This script uses icer.config for cluster-specific settings on the MSU HPCC.

3. Downstream Analysis

CycleViz and AmpliconReconstructor can be run using output files from AmpliconArchitect.
Visual outputs (e.g., .png files) are stored in results/ampliconsuite/ampliconarchitect/sv_view.

Example Output

The following are example structural variant views of predicted circular DNA generated by AmpliconArchitect:

`SRX18120904_amplicon1.png`

`SRX18120906_amplicon1.png`

These figures can be found in: results/ampliconsuite/ampliconarchitect/sv_view/
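
To see which samples produced amplicon images, something like the following sketch could be used. list_sv_views is a hypothetical helper, and the *_amplicon*.png pattern is inferred from the two example file names above rather than from AmpliconArchitect documentation.

```shell
# Hypothetical helper: list amplicon PNGs in the sv_view directory, sorted.
list_sv_views() {
    sv_dir="${1:-results/ampliconsuite/ampliconarchitect/sv_view}"
    find "$sv_dir" -maxdepth 1 -type f -name '*_amplicon*.png' | sort
}
```

Called with no argument from the repository root, it lists the images under results/ampliconsuite/ampliconarchitect/sv_view/.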

Citation

If you use this code, please cite:

License

MIT License – see the LICENSE file for details.
