Skip deduplication for libraries with UMI #188

@ChristopherBarrington

We are producing DNase Hi-C data that carries UMI information for identifying PCR duplicates. My intention is to deduplicate the libraries at the FastQ level and then submit the duplicate-free FastQ files to distiller.
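
To make the intended pre-processing concrete, here is a minimal sketch of UMI-based deduplication of paired FastQ records. It rests on assumptions not stated in this issue: that the UMI has already been moved to the end of each read ID after an underscore, and that two read pairs are duplicates when their UMI and both sequences match exactly. The names `umi_from_name` and `dedup_pairs` are hypothetical, and this is not part of distiller.

```python
from typing import Iterable, Iterator, Tuple

# A FastQ record reduced to (read name line, sequence, quality string).
FastqRecord = Tuple[str, str, str]
ReadPair = Tuple[FastqRecord, FastqRecord]


def umi_from_name(name: str) -> str:
    """Return the UMI, ASSUMED to follow the last underscore of the read ID."""
    read_id = name.split()[0]  # drop any comment after whitespace
    parts = read_id.rsplit("_", 1)
    if len(parts) != 2 or not parts[1]:
        raise ValueError(f"no UMI suffix found in read name: {name!r}")
    return parts[1]


def dedup_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield the first read pair seen for each (UMI, seq1, seq2) key."""
    seen = set()
    for read1, read2 in pairs:
        # ASSUMPTION: exact match on UMI and both sequences defines a duplicate.
        key = (umi_from_name(read1[0]), read1[1], read2[1])
        if key in seen:
            continue
        seen.add(key)
        yield read1, read2


pairs = [
    (("@r1_ACGT", "AAAA", "IIII"), ("@r1_ACGT", "CCCC", "IIII")),
    (("@r2_ACGT", "AAAA", "IIII"), ("@r2_ACGT", "CCCC", "IIII")),  # same UMI and sequences
    (("@r3_TTTT", "AAAA", "IIII"), ("@r3_TTTT", "CCCC", "IIII")),  # different UMI
]
kept = list(dedup_pairs(pairs))
print([read1[0] for read1, _ in kept])
```

Exact matching keeps the sketch short, but it does not tolerate sequencing errors in the UMI or the reads, and it holds every key in memory.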

Is there a method you would suggest for preprocessing these data for distiller? There doesn't appear to be a straightforward option, but I had a look at the DSL1 Nextflow script and thought the following might work: duplicate the merge_split process so that the copy skips the deduplication step and creates empty files in place of the expected duplicate-related outputs. The choice of process could then be controlled by --params.skip_dedup in a when directive.

I gave it a go and it seemed to work, but I am worried that I may have missed something.
