Add subworkflow for ensemble species detection with 3 taxonomic profilers #107

# Goal: 
_Implementing Eddy's Bakeoff project with unified DBs for consensus species detection_

The consensus will be determined using a custom script that combines the species table calls from the three tools. Here's the idea (from Todd):
- A species counts as present when at least 2 out of 3 tools detect it
- Same criterion for the parent species if looking at strain calls
- Ignore the abundance information for now and focus only on presence/absence.
- _Should we include the relative abundance from all 3 tools in our final table or will that be more confusing?_
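
To make the voting rule above concrete, here is a minimal sketch of the 2-out-of-3 presence/absence consensus. It is not the actual custom script: the function names (`consensus_species`, `to_parent_species`), the input shape (one set of species names per tool), and the strain-to-species lookup are all assumptions for illustration. Parsing the real Sylph, Ganon2, and Kraken2 output tables is left out.

```python
from collections import Counter


def to_parent_species(calls, strain_to_species):
    """Collapse strain-level calls to their parent species.

    `strain_to_species` is a hypothetical lookup table; calls that are
    not in it are assumed to be species-level already and kept as-is.
    """
    return {strain_to_species.get(name, name) for name in calls}


def consensus_species(calls_by_tool, min_tools=2):
    """Return the species reported by at least `min_tools` tools.

    `calls_by_tool` maps a tool name to the set of species it detected.
    Abundance is ignored; only presence/absence is considered.
    """
    votes = Counter()
    for species in calls_by_tool.values():
        votes.update(set(species))  # each tool votes once per species
    return {name for name, n in votes.items() if n >= min_tools}


# Made-up example calls, for illustration only.
calls = {
    "sylph": {"Escherichia coli", "Bacillus subtilis"},
    "ganon2": {"Escherichia coli", "Listeria monocytogenes"},
    "kraken2": {"Escherichia coli", "Bacillus subtilis", "Salmonella enterica"},
}

print(sorted(consensus_species(calls)))
# ['Bacillus subtilis', 'Escherichia coli']
```

For strain-level tables, each tool's calls would first go through `to_parent_species` so that the same 2-out-of-3 rule applies to the parent species.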
 
What do we do with multiple samples? Should they be combined into a single output?
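
One possible answer, sketched only to help the discussion: combine the per-sample consensus calls into a single species-by-sample presence/absence table. The function name `presence_matrix`, the sample names, and the 0/1 TSV layout are assumptions, not a decided format.

```python
def presence_matrix(consensus_by_sample):
    """Build a species x sample table of 0/1 presence flags.

    `consensus_by_sample` maps a sample name to the set of species that
    passed the consensus rule for that sample.
    """
    samples = sorted(consensus_by_sample)
    all_species = sorted(set().union(*consensus_by_sample.values()))
    rows = [["species"] + samples]
    for species in all_species:
        flags = [1 if species in consensus_by_sample[s] else 0 for s in samples]
        rows.append([species] + flags)
    return rows


# Made-up per-sample consensus calls, for illustration only.
consensus_by_sample = {
    "sampleA": {"Escherichia coli", "Bacillus subtilis"},
    "sampleB": {"Escherichia coli"},
}

rows = presence_matrix(consensus_by_sample)
for row in rows:
    print("\t".join(str(cell) for cell in row))
```

A wide table like this keeps one row per species; per-sample output files would remain an alternative if the combined table gets too sparse.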

# Plan

Prashant is laying the groundwork by:
- Picking these 3 tools and downloading their nf-core modules using `nf-core modules install <module>` on branch `add_taxprofilers`:
  1. [x] Sylph: _rebase part of the older branch_ `sylph_i73` at commit f6c45ef
  2. [x] Ganon2
  3. [x] Kraken2


- [x] Direct the pipeline to Eddy's bakeoff DB path: 
```groovy
unified_db_base_dir = "/home/Users/pacbio_bakeoff/data/ref_db/refseq03032025" // path to Eddy's unified databases (Ensemble analysis: species detection)
``` 
- [x] Do a prelim test with the proper run command for sylph using `test-modules/sylph-test.nf` (same as subissue here: #108)

Continue from here on a new branch named something like `ensemble_species_detection_i107`:
- [x] (fce89deff6cc2b05c10cdddce4425a851d19d9a8) Test the 3 modules using the respective unified DBs assigned in `nextflow.config`. Can start with this unfinished script: `test-modules/ensemble-test.nf` 
- [x] Using the test script above, create a new subworkflow that brings together all 3 tools and connects to the main workflow; activated with `analysis_type = species_detection`
  - [ ] (_think/elaborate more_) Implement a switch such as `mode = comprehensive` to turn on `ensemble mode` and activate all 3 tools in the `species_detection` branch. Or use a `fast` mode here to run Sylph only?

