zebrafishCC/findConservedProteinVariant
Human variant information (clinvar_20181202.vcf) was downloaded from the ClinVar database. Human and zebrafish gene orthologue information (human_orthos_2018.12.06.txt) was obtained from ZFIN. The final output, the conserved variants that can be generated in zebrafish, is separated according to orientation (all_variants_positive.txt and all_variants_negative.txt).

head -40 clinvar_20181202.vcf

    ##fileformat=VCFv4.1
    ##fileDate=2018-12-02
    ##source=ClinVar
    ##reference=GRCh38
    ##ID=<Description="ClinVar Variation ID">
    ##INFO=<ID=AF_ESP,Number=1,Type=Float,Description="allele frequencies from GO-ESP">
    ##INFO=<ID=AF_EXAC,Number=1,Type=Float,Description="allele frequencies from ExAC">
    ##INFO=<ID=AF_TGP,Number=1,Type=Float,Description="allele frequencies from TGP">
    ##INFO=<ID=ALLELEID,Number=1,Type=Integer,Description="the ClinVar Allele ID">
    ##INFO=<ID=CLNDN,Number=.,Type=String,Description="ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB">
    ##INFO=<ID=CLNDNINCL,Number=.,Type=String,Description="For included Variant : ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB">
    ##INFO=<ID=CLNDISDB,Number=.,Type=String,Description="Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN">
    ##INFO=<ID=CLNDISDBINCL,Number=.,Type=String,Description="For included Variant: Tag-value pairs of disease database name and identifier, e.g. OMIM:NNNNNN">
    ##INFO=<ID=CLNHGVS,Number=.,Type=String,Description="Top-level (primary assembly, alt, or patch) HGVS expression.">
    ##INFO=<ID=CLNREVSTAT,Number=.,Type=String,Description="ClinVar review status for the Variation ID">
    ##INFO=<ID=CLNSIG,Number=.,Type=String,Description="Clinical significance for this single variant">
    ##INFO=<ID=CLNSIGCONF,Number=.,Type=String,Description="Conflicting clinical significance for this single variant">
    ##INFO=<ID=CLNSIGINCL,Number=.,Type=String,Description="Clinical significance for a haplotype or genotype that includes this variant. Reported as pairs of VariationID:clinical significance.">
    ##INFO=<ID=CLNVC,Number=1,Type=String,Description="Variant type">
    ##INFO=<ID=CLNVCSO,Number=1,Type=String,Description="Sequence Ontology id for variant type">
    ##INFO=<ID=CLNVI,Number=.,Type=String,Description="the variant's clinical sources reported as tag-value pairs of database and variant identifier">
    ##INFO=<ID=DBVARID,Number=.,Type=String,Description="nsv accessions from dbVar for the variant">
    ##INFO=<ID=GENEINFO,Number=1,Type=String,Description="Gene(s) for the variant reported as gene symbol:gene id. The gene symbol and id are delimited by a colon (:) and each pair is delimited by a vertical bar (|)">
    ##INFO=<ID=MC,Number=.,Type=String,Description="comma separated list of molecular consequence in the form of Sequence Ontology ID|molecular_consequence">
    ##INFO=<ID=ORIGIN,Number=.,Type=String,Description="Allele origin. One or more of the following values may be added: 0 - unknown; 1 - germline; 2 - somatic; 4 - inherited; 8 - paternal; 16 - maternal; 32 - de-novo; 64 - biparental; 128 - uniparental; 256 - not-tested; 512 - tested-inconclusive; 1073741824 - other">
    ##INFO=<ID=RS,Number=.,Type=String,Description="dbSNP ID (i.e. rs number)">
    ##INFO=<ID=SSR,Number=1,Type=Integer,Description="Variant Suspect Reason Codes. One or more of the following values may be added: 0 - unspecified, 1 - Paralog, 2 - byEST, 4 - oldAlign, 8 - Para_EST, 16 - 1kg_failed, 1024 - other">
    #CHROM POS ID REF ALT QUAL FILTER INFO
    1 1014042 475283 G A . . ALLELEID=446939;CLNDISDB=MedGen:C4015293,OMIM:616126,Orphanet:ORPHA319563;CLNDN=Immunodeficiency_38_with_basal_ganglia_calcification;CLNHGVS=NC_000001.11:g.1014042G>A;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=ISG15:9636;MC=SO:0001583|missense_variant;ORIGIN=1;RS=143888043
    1 1014122 542074 C T . . ALLELEID=514926;CLNDISDB=MedGen:C4015293,OMIM:616126,Orphanet:ORPHA319563;CLNDN=Immunodeficiency_38_with_basal_ganglia_calcification;CLNHGVS=NC_000001.11:g.1014122C>T;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=ISG15:9636;MC=SO:0001583|missense_variant;ORIGIN=1;RS=150861311
    1 1014143 183381 C T . . ALLELEID=181485;CLNDISDB=MedGen:C4015293,OMIM:616126,Orphanet:ORPHA319563;CLNDN=Immunodeficiency_38_with_basal_ganglia_calcification;CLNHGVS=NC_000001.11:g.1014143C>T;CLNREVSTAT=no_assertion_criteria_provided;CLNSIG=Pathogenic;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;CLNVI=OMIM_Allelic_Variant:147571.0003;GENEINFO=ISG15:9636;MC=SO:0001587|nonsense;ORIGIN=1;RS=786201005
    1 1014179 542075 C T . . ALLELEID=514896;CLNDISDB=MedGen:C4015293,OMIM:616126,Orphanet:ORPHA319563;CLNDN=Immunodeficiency_38_with_basal_ganglia_calcification;CLNHGVS=NC_000001.11:g.1014179C>T;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Uncertain_significance;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=ISG15:9636;MC=SO:0001583|missense_variant;ORIGIN=1

head human_orthos_2018.12.06.txt (the first line reads Date: 2018.12.06; the column header and rows follow)

| ZFIN ID | ZFIN Symbol | ZFIN Name | Human Symbol | Human Name | OMIM ID | Gene ID | HGNC ID | Evidence | Pub ID |
|---|---|---|---|---|---|---|---|---|---|
| ZDB-GENE-000112-47 | ppardb | peroxisome proliferator-activated receptor delta b | PPARD | peroxisome proliferator activated receptor delta | 600409 | 5467 | 9235 | AA | ZDB-PUB-060313-16 |
| ZDB-GENE-000112-47 | ppardb | peroxisome proliferator-activated receptor delta b | PPARD | peroxisome proliferator activated receptor delta | 600409 | 5467 | 9235 | AA | ZDB-PUB-070210-39 |
| ZDB-GENE-000112-47 | ppardb | peroxisome proliferator-activated receptor delta b | PPARD | peroxisome proliferator activated receptor delta | 600409 | 5467 | 9235 | AA | ZDB-PUB-071118-46 |
| ZDB-GENE-000112-47 | ppardb | peroxisome proliferator-activated receptor delta b | PPARD | peroxisome proliferator activated receptor delta | 600409 | 5467 | 9235 | AA | ZDB-PUB-150121-5 |

head all_variants_positive.txt

    MYO15A ENST00000205890 myo15aa ENSDART00000149546 2382 A/V AGAGGGATTGCTGAAATTTGGcCGCCACCGTTGAggCCTGAT
    DLD ENST00000205402 dldh ENSDART00000006709 60 A/V GTACCTTAAACCCAAGCTGAGcTGCTTTGATAGCggCAACAT
    DLD ENST00000205402 dldh ENSDART00000006709 239 A/V CATGGCCAAGAAATTCCACTGcTGTAACTTTggCACCCAACC
    ABCC6 ENST00000205557 abcc6a ENSDART00000189619 158 A/V ACTCTTCCTCAGCTGTTTTGcAGACCAggCACCCTTAggCAA
    TSC2 ENST00000219476 tsc2 ENSDART00000158948 122 A/V TAGGGTGTACTCCGAGGACGcCGCTCTCCTCCGCggTGCTGT
    TSC2 ENST00000219476 tsc2 ENSDART00000158948 264 A/V CACGCTGGTCTCCTACAGAGcTCAggCCATCCAGCCggCCAA
    TSC2 ENST00000219476 tsc2 ENSDART00000158948 454 A/V TTTCCTGCTGCTGATGAGGGcTGATTCTCTACATCGTCTCgg

head all_variants_negative.txt

    NEXMIF ENST00000055682 nexmifa ENSDART00000149893 347 G/D AccATGCAGTAGAAAGGAGGgCCCAAAAGAGAAACCAGACCA
    B4GALT7 ENST00000029410 b4galt7 ENSDART00000170614 215 C/Y AGCGGTTTGACATccCGTTgCACTTTAGGCAGAAATCACAAG
    B4GALT7 ENST00000029410 b4galt7 ENSDART00000170614 273 A/T ATATTAccTGTTTTTGTGCAgCGATCCGCTTCTGGTCTCTTT
    OTC ENST00000039007 otc ENSDART00000089526 88 S/N TGGACATGCGGGTccTAGTgCTCCTCTTCTCAAATATCATGG
    OTC ENST00000039007 otc ENSDART00000089526 172 A/T ccTGTAGTGTTAGCAGGTCAgCCAGAATCTGAATAGGGTGGT
    OTC ENST00000039007 otc ENSDART00000089526 251 A/T GAACACTGCTGTccCGCGCTgCCTCCACTGGATCTGACACAA
    SPG21 ENST00000204566 spg21 ENSDART00000130967 180 A/T TGTCAAccATGAAGTCAATTgCGTCTGCCATTTTTGGGTCCA
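
The two input files described above could be read with a short script along the following lines. This is a minimal sketch and not the repository's own code: the function names `parse_info`, `missense_snvs` and `load_orthologues` are hypothetical, the restriction to missense single-nucleotide variants is an assumption suggested by the amino-acid changes in the output files, and both inputs are assumed to be tab-separated.

```python
import csv
import io


def parse_info(info):
    """Split a VCF INFO string into a dict; keys without '=' map to ''."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, _sep, value = item.partition("=")
        fields[key] = value
    return fields


def missense_snvs(vcf_lines):
    """Yield (gene_symbol, chrom, pos, ref, alt) for missense SNVs.

    Assumption: only records with CLNVC=single_nucleotide_variant and a
    missense_variant entry in MC are of interest.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        columns = line.rstrip("\n").split("\t")
        chrom, pos, _var_id, ref, alt = columns[:5]
        fields = parse_info(columns[7])
        if fields.get("CLNVC") != "single_nucleotide_variant":
            continue
        if "missense_variant" not in fields.get("MC", ""):
            continue
        # GENEINFO is "symbol:id" pairs joined by "|", e.g. "ISG15:9636".
        for pair in fields.get("GENEINFO", "").split("|"):
            symbol = pair.split(":")[0]
            if symbol:
                yield symbol, chrom, int(pos), ref, alt


def load_orthologues(handle):
    """Map each human gene symbol to the set of zebrafish (ZFIN) symbols."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip the "Date:" line and the column header; data rows start
        # with a ZFIN gene ID.
        if len(row) < 4 or not row[0].startswith("ZDB-GENE"):
            continue
        mapping.setdefault(row[3], set()).add(row[1])
    return mapping


if __name__ == "__main__":
    vcf = [
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t1014042\t475283\tG\tA\t.\t.\tCLNVC=single_nucleotide_variant;"
        "GENEINFO=ISG15:9636;MC=SO:0001583|missense_variant\n",
    ]
    orthos = io.StringIO(
        "Date: 2018.12.06\n"
        "ZDB-GENE-000112-47\tppardb\tname\tPPARD\tname\t600409\t5467\t9235\tAA\tZDB-PUB-060313-16\n"
    )
    print(list(missense_snvs(vcf)))
    print(load_orthologues(orthos))
```

Joining the two results on the human gene symbol gives the candidate human variants whose gene has a zebrafish orthologue. Checking whether the affected residue is conserved in the zebrafish protein would be a separate step that this sketch does not attempt.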