Skip to content

Is there a way to run subset-bam faster #34

@mattli7

Description

@mattli7

Hello,

I am trying to subset my bam file for each barcode. I have around 20k cells and each of their barcode is in a directory. I have been using the code below to execute subset-bam. It takes around 35 minutes per barcode. I was wondering if there is a way to make subset-bam run any faster, perhaps parallelization?

FILES="my directory containing every barcode"
for file in $FILES
do
filename=$(basename "$file")
filename_no_extension="${filename%%.*}"

subset-bam_linux --bam marked.duplicates.bam --bam-tag CB --cell-barcodes barcodes/$filename --out-bam barcode_bams/$filename_no_extension.bam
done

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions