
chore: add benchmarks#110

Open
cademirch wants to merge 2 commits into main from benchmarks


@cademirch

This adds benchmarks for ft subcommands to understand the consequences, if any, of the upcoming transition to the MA library. A small helper script downloads the public FIRE test dataset on which the benchmarks run.

We use criterion to handle benchmark measurement, repetitions, etc.

We only benchmark subcommands that largely touch the "bam_ordered" code path, because MA handles this differently from the outgoing Ranges/FiberAnnotations:

On main, FiberAnnotations stores the per-annotation query starts/ends, ref starts/ends, lengths, quals, etc. as Vecs, computed once at parse time. Importantly, if the molecule is reverse-aligned, these Vecs are flipped at that point.

In MA, nothing is stored upfront in this way. Instead, we use iter_type on MolecularAnnotations to collect a vec of all annotations of a given type, collect the starts, ends, etc., and reverse them if needed. In the current implementation on the fibertools-ma branch this happens on every access, and these subcommands hit the accessors multiple times per fiber, so the O(N) reversal on reverse-aligned reads adds up fast.
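To make the difference concrete, here is a minimal, std-only sketch of the two access patterns. Everything in it is a hypothetical stand-in: `EagerAnnotations`, `LazyAnnotations`, the `(kind, start)` records, and the reversal counter are illustrative names, not the real FiberAnnotations or MolecularAnnotations APIs, and "flipping" is modeled as a plain order reversal, which may be simpler than what the real code does.

```rust
use std::cell::Cell;

/// Eager model (roughly the behaviour described for main): the starts are
/// oriented once, at construction time. Hypothetical type.
pub struct EagerAnnotations {
    starts: Vec<i64>,
}

impl EagerAnnotations {
    pub fn new(mut starts: Vec<i64>, is_reverse: bool) -> Self {
        if is_reverse {
            starts.reverse(); // O(N), paid once per fiber
        }
        Self { starts }
    }

    /// Accessor is a cheap borrow; no work is repeated.
    pub fn starts(&self) -> &[i64] {
        &self.starts
    }
}

/// Lazy model (roughly the behaviour described for the MA branch): raw
/// records are kept as-is and every accessor call re-collects and reverses.
/// Hypothetical type; `records` holds (annotation kind, start) pairs.
pub struct LazyAnnotations {
    records: Vec<(&'static str, i64)>,
    is_reverse: bool,
    reversals: Cell<usize>, // instrumentation only, to count repeated work
}

impl LazyAnnotations {
    pub fn new(records: Vec<(&'static str, i64)>, is_reverse: bool) -> Self {
        Self { records, is_reverse, reversals: Cell::new(0) }
    }

    /// Collects the starts of one annotation kind, reversing on every call.
    pub fn starts(&self, kind: &str) -> Vec<i64> {
        let mut starts: Vec<i64> = self
            .records
            .iter()
            .filter(|r| r.0 == kind)
            .map(|r| r.1)
            .collect();
        if self.is_reverse {
            starts.reverse(); // O(N), paid on each access
            self.reversals.set(self.reversals.get() + 1);
        }
        starts
    }

    /// Number of reversals performed so far.
    pub fn reversals(&self) -> usize {
        self.reversals.get()
    }
}
```

In this toy model, a subcommand that calls `starts("m6a")` three times on one reverse-aligned fiber pays for three allocations and three reversals in the lazy version, while the eager version pays for one reversal at construction and then hands out borrows. That repeated per-access cost is what the benchmarks are meant to quantify.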
