
Add subworkflow for ensemble species detection with 3 taxonomic profilers #107

@ppreshant

Description

Goal:

Implement Eddy's bakeoff project with unified DBs for consensus species detection across the 3 taxonomic profilers (Sylph, Ganon2, Kraken2).

The consensus will be determined by a custom script that combines the species-table calls from the 3 tools. Here's the idea (from Todd):

  • A species is called present when at least 2 out of 3 tools detect it
  • Same criterion for the parent species when looking at strain calls
  • Don't combine the abundance information for now; focus only on presence/absence.
  • Open question: should we include the relative abundance from all 3 tools in our final table, or will that be more confusing?
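The voting rule above can be sketched in a few lines of Python. This is only an illustration, not the planned script: the function names, the min_tools parameter, the example taxa, and in particular the assumption that a strain name starts with its "Genus species" binomial are all hypothetical. Real profiler output would more likely be collapsed via taxonomy IDs.

```python
from collections import Counter


def parent_species(taxon: str) -> str:
    """Collapse a strain-level name to 'Genus species'.

    ASSUMPTION: the binomial is the first two words of the name.
    """
    return " ".join(taxon.split()[:2])


def consensus_species(calls: dict[str, set[str]], min_tools: int = 2) -> list[str]:
    """Return species detected by at least `min_tools` tools (presence only)."""
    votes = Counter()
    for taxa in calls.values():
        # Each tool votes at most once per species, even if it reports
        # several strains of that species.
        for species in {parent_species(t) for t in taxa}:
            votes[species] += 1
    return sorted(s for s, n in votes.items() if n >= min_tools)


# Hypothetical calls from the 3 tools for one sample.
calls = {
    "sylph": {"Escherichia coli str. K-12", "Bacillus subtilis"},
    "ganon2": {"Escherichia coli", "Staphylococcus aureus"},
    "kraken2": {"Escherichia coli O157:H7", "Bacillus subtilis", "Listeria monocytogenes"},
}
print(consensus_species(calls))  # ['Bacillus subtilis', 'Escherichia coli']
```

Abundance is deliberately left out here, matching the presence/absence focus above.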

What do we do with multiple samples? Should they be combined into a single output?
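One possible answer, sketched here only to make the question concrete: write a single long-format table with one row per (sample, species) pair. The function name, column names, and sample IDs are hypothetical, and the question itself is still open.

```python
import csv
import io


def combined_table(per_sample: dict[str, list[str]]) -> str:
    """Render one tab-separated table with a row per (sample, species) pair."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "species"])
    # Sort samples and species so the output is deterministic.
    for sample in sorted(per_sample):
        for species in sorted(per_sample[sample]):
            writer.writerow([sample, species])
    return buf.getvalue()


# Hypothetical consensus calls for two samples.
table = combined_table({
    "sample_B": ["Escherichia coli"],
    "sample_A": ["Escherichia coli", "Bacillus subtilis"],
})
print(table)
```

A long table like this stays easy to filter per sample, while per-sample files would keep runs independent; either choice is compatible with the consensus rule.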

Plan

Prashant is laying the groundwork with:

  • Picking these 3 tools and downloading their nf-core modules with nf-core modules install <module> on branch add_taxprofilers:

    1. Sylph: rebase part of the older branch sylph_i73 at commit f6c45ef
    2. Ganon2
    3. Kraken2
  • Directing the pipeline to Eddy's bakeoff DB path:

unified_db_base_dir = "/home/Users/pacbio_bakeoff/data/ref_db/refseq03032025" // path to Eddy's unified databases (Ensemble analysis: species detection)

Continue from here on a new branch, named something like ensemble_species_detection_i107:

  • (fce89de) Test the 3 modules using the respective unified DBs assigned in nextflow.config. This unfinished script is a starting point: test-modules/ensemble-test.nf
  • Using the test script above, create a new subworkflow that brings together all 3 tools and connects them to the main workflow, activated with analysis_type = species_detection
    • (needs more thought) Implement a switch, e.g. mode = comprehensive, that turns on ensemble mode and activates all 3 tools in the species_detection branch. Alternatively, add a fast mode that runs Sylph only?
