LGOICOUR/My-Local-SNP-Checker

My-Local-SNP-Checker

Cross-reference your 23andMe raw data with ClinVar's SNP data locally.

This tool cross-references RSIDs in your direct-to-consumer (DTC) raw data file (e.g., 23andMe) against ClinVar. It runs entirely on your computer and never uploads your data.

  • Input: 23andMe/Ancestry/MyHeritage TXT or single-sample VCF
  • Reference: ClinVar variant_summary.txt.gz (and optionally ClinVar VCF) downloaded once, cached locally
  • Output: A human-readable report saved next to your genotype file
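
As a rough illustration of the input side, the sketch below reads a 23andMe-style TXT export into an rsID-to-genotype mapping. It assumes the common layout of `#`-prefixed comment lines followed by tab-separated `rsid`, `chromosome`, `position`, `genotype` columns. The function name `parse_23andme_txt` and the sample rows are hypothetical and are not taken from `rsid_clinvar_checker.py`; other vendors' files may need different handling.

```python
import io


def parse_23andme_txt(lines):
    """Map rsID -> genotype string from 23andMe-style tab-separated lines.

    Hypothetical helper for illustration; not the tool's actual parser.
    """
    genotypes = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the commented header block.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # malformed or truncated row
        rsid, _chrom, _pos, genotype = fields[:4]
        # Keep only dbSNP-style identifiers; vendor-internal IDs are skipped.
        if rsid.startswith("rs"):
            genotypes[rsid] = genotype
    return genotypes


# Made-up rows for demonstration only; not real genotype data.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "i0000002\t1\t2000\tCC\n"
    "rs0000003\t2\t3000\t--\n"
)
print(parse_23andme_txt(sample))
```

Because the function accepts any iterable of lines, it works the same on an open file handle, so the genotype file never has to leave the local machine.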

Educational use only; not a clinical test. Consult a clinical genetics professional for medical decisions.

Features

  • Gene filtering: Choose one or more genes to focus on.
  • ClinicalSignificance modes:
    • pathogenic (default) — includes "pathogenic" and "likely pathogenic" via substring match
    • risk_trait — includes risk/trait terms (risk factor, association, protective, affects, drug response, confers sensitivity/resistance)
    • broad — union of pathogenic + risk_trait
    • custom — supply your own terms with --clinsig-terms "term1,term2"
  • Allele-aware check (TXT inputs): Uses ClinVar VCF to test if your bases include the ALT allele(s) for matched RSIDs
  • Privacy: Your genotype file is read locally only
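
The ClinicalSignificance modes above amount to case-insensitive substring matching against a list of terms. The sketch below shows one way that could work. The names `terms_for_mode` and `clinsig_matches` are hypothetical, and the term lists are a reading of the feature list above (including treating "confers sensitivity/resistance" as two terms), not a copy of the tool's source.

```python
# Term lists as read from the feature list above; an assumption, not the tool's code.
PATHOGENIC_TERMS = ("pathogenic", "likely pathogenic")
RISK_TRAIT_TERMS = (
    "risk factor",
    "association",
    "protective",
    "affects",
    "drug response",
    "confers sensitivity",
    "confers resistance",
)


def terms_for_mode(mode, custom_terms=""):
    """Return the lowercase search terms for a mode (hypothetical helper)."""
    if mode == "pathogenic":
        return PATHOGENIC_TERMS
    if mode == "risk_trait":
        return RISK_TRAIT_TERMS
    if mode == "broad":
        return PATHOGENIC_TERMS + RISK_TRAIT_TERMS
    if mode == "custom":
        # Mirrors the comma-separated shape of --clinsig-terms "term1,term2".
        return tuple(t.strip().lower() for t in custom_terms.split(",") if t.strip())
    raise ValueError(f"unknown mode: {mode!r}")


def clinsig_matches(clinical_significance, terms):
    """True if any term occurs as a substring of the ClinVar field."""
    text = clinical_significance.lower()
    return any(term in text for term in terms)


for value in ("Likely pathogenic", "Benign", "Conflicting classifications of pathogenicity"):
    print(value, "->", clinsig_matches(value, terms_for_mode("pathogenic")))
```

One consequence of plain substring matching is worth knowing: "pathogenic" also occurs inside "pathogenicity", so a value such as "Conflicting classifications of pathogenicity" matches in this sketch. Whether the tool itself filters such values out is not stated here.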

Quick start

  1. Ensure Python 3 is installed (check with python3 --version; on macOS it is typically provided by the Xcode Command Line Tools).

  2. Run the reproductive genetic carrier panel (pathogenic):

python3 rsid_clinvar_checker.py \
  --genotype "/absolute/path/to/your_raw.txt" \
  --genes HBA1,HBA2,HBB,ASPA,CFTR,DLD,ELP1,ABCC8,FANCC,FMR1,GBA,G6PC,TMEM216,BCKDHB,MCOLN1,NEB,SMPD1,SMN1,SMN2,HEXA,PCDH15,CLRN1,FKTN \
  --mode pathogenic

  3. For allele-aware calls (TXT inputs only):

python3 rsid_clinvar_checker.py \
  --genotype "/absolute/path/to/your_raw.txt" \
  --genes HBA1,HBA2,HBB,ASPA,CFTR,DLD,ELP1,ABCC8,FANCC,FMR1,GBA,G6PC,TMEM216,BCKDHB,MCOLN1,NEB,SMPD1,SMN1,SMN2,HEXA,PCDH15,CLRN1,FKTN \
  --mode pathogenic \
  --allele-aware

  4. Custom ClinicalSignificance terms:

python3 rsid_clinvar_checker.py \
  --genotype "/absolute/path/to/your_raw.txt" \
  --genes APOE,LCT \
  --mode custom \
  --clinsig-terms "risk factor,association,drug response"

Interpreting results

  • The report prints to the terminal and saves as rsid_clinvar_report.txt next to your genotype file.
  • "Overlaps found" means your file contains the listed RSIDs that ClinVar classifies with the selected terms.
  • With --allele-aware (TXT inputs), the tool reports if your bases include the specific ALT allele(s) for those RSIDs.
  • Important: RSIDs can tag multiple alleles. An RSID match alone does not confirm the specific allele.
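
To make the difference between an RSID match and an allele match concrete, here is a minimal sketch of an allele-aware check. It assumes the genotype is a string of called bases such as "AT", that the ALT alleles come from the ClinVar VCF record for the same rsID, and that both are reported on the same strand. The function name `carries_alt` is hypothetical, and the sketch only handles single-base alleles; the tool's actual logic may differ.

```python
def carries_alt(genotype, alt_alleles):
    """Check whether a genotype call contains any single-base ALT allele.

    Returns True or False when the call can be compared, and None when it
    cannot (no-call, indel codes, or no single-base ALT to compare against).
    Hypothetical helper for illustration only.
    """
    calls = genotype.strip().upper()
    # No-calls ("--") and insertion/deletion codes ("I", "D") are not comparable here.
    if not calls or any(base not in "ACGT" for base in calls):
        return None
    # Restrict to single-nucleotide ALT alleles; longer ALTs need sequence context.
    snv_alts = {alt.upper() for alt in alt_alleles if len(alt) == 1}
    if not snv_alts:
        return None
    return any(base in snv_alts for base in calls)


print(carries_alt("AT", ["T"]))   # one copy of the ALT base is present
print(carries_alt("AA", ["T"]))   # rsID is covered, but the ALT base is absent
print(carries_alt("--", ["T"]))   # no-call: cannot be determined
```

The second call is the case the note above warns about: the rsID appears in the file and in ClinVar, yet the genotype does not carry the flagged allele.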

Common RSIDs for sanity checks (read-only)

grep -E '^(rs334|rs113993960|rs28929474|rs1050828|rs1800562)[[:space:]]' \
  "/absolute/path/to/your_raw.txt" | head
  • If lines appear, your chip likely covers some positive-control variants.

Notes and limitations

  • Many pathogenic variants lack rsIDs or are not on consumer SNP arrays; a negative screen does not rule out disease.
  • ClinVar classifications change over time; re-run periodically to use the latest reference.
  • For medical decisions, use clinical-grade testing and consult a clinician.
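
Since the ClinVar reference is cached locally and classifications change, one simple way to decide when to refresh is to check the age of the cached file. The sketch below is an assumption about how such a check could look; the function name `cache_is_stale`, the 30-day threshold, and the example path are all hypothetical, and how the tool itself manages its cache is not described here.

```python
import os
import time

SECONDS_PER_DAY = 86400


def cache_is_stale(path, max_age_days=30, now=None):
    """Return True if the cached file is missing or older than max_age_days.

    Hypothetical helper; the threshold is an arbitrary illustrative default.
    """
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    age_days = (now - os.path.getmtime(path)) / SECONDS_PER_DAY
    return age_days > max_age_days


# Example path is a placeholder; point it at wherever the reference is cached.
if cache_is_stale("variant_summary.txt.gz"):
    print("Cached ClinVar reference is missing or old; consider re-downloading.")
```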

License

MIT
