0-based VCF output #9

@hdashnow

Hi! I noticed that atarva accepts a 0-based BED regions file as input and also writes 0-based positions to its VCF output. The VCF format requires 1-based positions (see the spec excerpt below), so this creates off-by-one errors in our downstream analyses.

A nice summary of this common problem:
https://www.biostars.org/p/84686/

The VCF spec:

POS - position: The reference position, with the 1st base having position 1. Positions are sorted numerically, in increasing order, within each reference sequence CHROM. It is permitted to have multiple records with the same POS. Telomeres are indicated by using positions 0 or N+1, where N is the length of the corresponding chromosome or contig. (Integer, Required)

May I suggest that you produce 1-based VCFs?
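
To illustrate the conversion I have in mind, here is a minimal sketch in Python. It assumes the input follows the usual BED convention (0-based start, end-exclusive) and that the VCF record should point at the first base of the interval. The function name `bed_to_vcf_coords` is hypothetical and is not part of atarva; I have not looked at how atarva computes positions internally, and variant records that need an anchor base may require further adjustment.

```python
from typing import Tuple


def bed_to_vcf_coords(bed_start: int, bed_end: int) -> Tuple[int, int]:
    """Convert a BED interval to 1-based, inclusive coordinates.

    Hypothetical helper for illustration only (not atarva code).
    BED intervals are 0-based and half-open: [bed_start, bed_end).
    VCF POS is 1-based, so the first base of the interval is bed_start + 1.
    The 1-based inclusive end (e.g. for an INFO END field) equals bed_end.
    """
    if bed_start < 0 or bed_end <= bed_start:
        raise ValueError("expected 0 <= bed_start < bed_end")
    vcf_pos = bed_start + 1  # shift the start from 0-based to 1-based
    vcf_end = bed_end        # exclusive 0-based end == inclusive 1-based end
    return vcf_pos, vcf_end


# A BED interval covering the first ten bases of a contig.
pos, end = bed_to_vcf_coords(0, 10)
print(pos, end)
```

Only the start coordinate shifts by one; the end stays numerically the same because the switch from half-open to inclusive cancels the switch from 0-based to 1-based.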

Warm regards,
Harriet
