results.rep: plain report of the run, used to further generate tree-like reports
results.tre: tree-like report with cumulative abundances by taxonomic ranks (can be re-generated with ganon report)
kraken2: mock9.kraken2.report.txt: is this similar to ganon2's .tre, i.e. a hierarchical report?
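To make the comparison concrete, here is a minimal sketch of pulling species-level rows out of a kraken2 report. It assumes the standard six-column, tab-separated layout (percent, clade reads, direct reads, rank code, NCBI taxid, indented name); reports written with minimizer data have extra columns and would need adjusting. The function name `species_rows` and the sample rows are made up for illustration and are not taken from the mock9 run.

```python
def species_rows(report_text):
    """Return (taxid, name, clade_reads, percent) for rows with rank code 'S'."""
    rows = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            # Assumption: standard 6-column report; skip anything else.
            continue
        percent, clade_reads, _direct_reads, rank, taxid, name = fields
        if rank == "S":
            # The name column is indented to show the hierarchy; strip it.
            rows.append((int(taxid), name.strip(), int(clade_reads), float(percent)))
    return rows


# Illustrative rows only (not real output from the test datasets).
SAMPLE_REPORT = (
    "100.00\t1000\t0\tR\t1\troot\n"
    " 99.00\t990\t0\tD\t2\t  Bacteria\n"
    " 40.00\t400\t10\tG\t561\t    Escherichia\n"
    " 39.00\t390\t390\tS\t562\t      Escherichia coli\n"
    " 20.00\t200\t200\tS\t1280\t      Staphylococcus aureus\n"
)

species = species_rows(SAMPLE_REPORT)
```

The indentation of the name column is what encodes the hierarchy, so stripping it is only safe once the rank code has been used to pick the level of interest.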
Understanding taxpasta output: how do I match the taxonomy_id to the species/taxa name?
If you want to learn how to use taxpasta to add taxonomic names (rather than IDs) to your profiles, see here. // Need to supply NCBI (or other) taxonomy files (.dmp)
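As a rough illustration of what the .dmp lookup involves, the sketch below reads NCBI `names.dmp`-style lines (fields separated by tab, pipe, tab, with a trailing tab and pipe) and keeps only the `scientific name` entries, then annotates a two-column `taxonomy_id`/`count` profile. This is plain Python, not taxpasta's own implementation; the helper names and the sample lines are hypothetical.

```python
def load_scientific_names(lines):
    """Map taxid -> scientific name from names.dmp-style lines."""
    names = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\t|"):
            line = line[:-2]  # drop the trailing field terminator
        parts = line.split("\t|\t")
        # Expected fields: tax_id, name_txt, unique name, name class
        if len(parts) == 4 and parts[3] == "scientific name":
            names[int(parts[0])] = parts[1]
    return names


def annotate(profile, names):
    """Add a name to each (taxonomy_id, count) pair; unknown IDs get None."""
    return [(taxid, names.get(taxid), count) for taxid, count in profile]


# Illustrative lines in names.dmp layout (not a real taxonomy dump).
SAMPLE_NAMES = [
    "562\t|\tEscherichia coli\t|\t\t|\tscientific name\t|\n",
    "562\t|\tBacillus coli\t|\t\t|\tsynonym\t|\n",
    "1280\t|\tStaphylococcus aureus\t|\t\t|\tscientific name\t|\n",
]

names = load_scientific_names(SAMPLE_NAMES)
annotated = annotate([(562, 390), (1280, 200), (99999, 5)], names)
```

In practice the `lines` argument would be an open file handle on the real `names.dmp`; IDs missing from the dump are left as `None` so they can be reviewed rather than silently dropped.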
Information to retain
Species name
confidence metric:
adjusted_ANI (sylph)
abundance estimate
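One possible shape for the retained fields is sketched below as a small record type, filled here from a sylph profile TSV. The column names `Contig_name`, `Adjusted_ANI` and `Taxonomic_abundance` are assumptions about sylph's output and should be checked against the actual files in the work dirs; `Detection` and `from_sylph` are hypothetical names. Note that the contig name is only a stand-in for the species name, which would still need a taxonomy lookup.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Detection:
    tool: str               # which profiler produced the row
    name: str               # species name (here: contig name as a stand-in)
    confidence_metric: str  # e.g. "adjusted_ANI" for sylph
    confidence: float
    abundance: float        # abundance estimate as reported by the tool


def from_sylph(tsv_text):
    """Build Detection records from a sylph-style profile TSV (assumed columns)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return [
        Detection(
            tool="sylph",
            name=row["Contig_name"],
            confidence_metric="adjusted_ANI",
            confidence=float(row["Adjusted_ANI"]),
            abundance=float(row["Taxonomic_abundance"]),
        )
        for row in reader
    ]


# Illustrative rows only; a real file has more columns, which are ignored here.
SAMPLE_SYLPH = (
    "Sample_file\tGenome_file\tTaxonomic_abundance\tAdjusted_ANI\tContig_name\n"
    "mock9.fastq.gz\tgenomeA.fna.gz\t62.5\t99.1\tcontigA Escherichia coli\n"
    "mock9.fastq.gz\tgenomeB.fna.gz\t37.5\t97.4\tcontigB Staphylococcus aureus\n"
)

hits = from_sylph(SAMPLE_SYLPH)
```

Keeping `confidence_metric` as an explicit field leaves room for the other tools, whose confidence values are not directly comparable to an ANI.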
References
Best tool to use: polars (Python), which is fast and Rust-based. This is also a quick way to learn the library; Copilot/Seqera can help generate a base script and module.
(NO, skip for now) Is it relevant to use taxpasta here to standardize or merge the 3 tool outputs?
Does it support all 3 of our tools? No, sylph is not supported yet 😞. Brought up the polars library in their repo here.
Consider whether their minimalistic 2-column output format (taxonomy_id and count) is good enough for us.
Example output formats for the 3 tools for the test datasets (mock9 and mock20) in these work dirs; store somewhere or link here for reference?
[80/06b239] ORCHESTRATE_SOMATEM:SOMATEM:SPECIES_DETECTION:SYLPH_PROFILE (mock9) [100%] 2 of 2 ✔
[38/2e20cc] ORCHESTRATE_SOMATEM:SOMATEM:SPECIES_DETECTION:GANON_CLASSIFY (mock9) [100%] 2 of 2 ✔
[0e/f5823f] ORCHESTRATE_SOMATEM:SOMATEM:SPECIES_DETECTION:KRAKEN2_KRAKEN2 (mock9) [100%] 2 of 2 ✔