Cross-reference your 23andMe raw data with ClinVar's SNP data locally:
This tool cross-references RSIDs in your direct-to-consumer (DTC) raw data file (e.g., 23andMe) against ClinVar. It runs entirely on your computer and never uploads your data.
- Input: 23andMe/Ancestry/MyHeritage TXT or single-sample VCF
- Reference: ClinVar `variant_summary.txt.gz` (and optionally the ClinVar VCF), downloaded once and cached locally
- Output: A human-readable report saved next to your genotype file
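To make the TXT input format concrete, here is a minimal sketch of reading a 23andMe-style export (tab-separated, `#` comment lines, columns rsid, chromosome, position, genotype) into an RSID lookup. It is an illustration only, not the tool's actual parser: the function name `parse_genotype_txt` is hypothetical, the sample rows are made up, and the Ancestry and MyHeritage layouts differ and are not handled here.

```python
import io


def parse_genotype_txt(handle):
    """Return {rsid: genotype} from a 23andMe-style tab-separated export.

    Hypothetical helper for illustration; not part of rsid_clinvar_checker.py.
    """
    genotypes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split("\t")
        if len(fields) < 4 or not fields[0].startswith("rs"):
            continue  # skip malformed rows and internal (non-rs) probe IDs
        rsid, _chrom, _pos, genotype = fields[:4]
        genotypes[rsid] = genotype
    return genotypes


# Made-up sample rows, only to show the shape of the file.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs334\t11\t5248232\tAT\n"
    "i3000001\t7\t117199644\tII\n"
    "rs1800562\t6\t26093141\tGG\n"
)
calls = parse_genotype_txt(sample)
print(calls)
```

Rows whose ID does not start with `rs` are dropped in this sketch because only RSIDs can be matched against ClinVar by ID.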
Educational use only; not a clinical test. Consult a clinical genetics professional for medical decisions.
- Gene filtering: Choose one or more genes to focus on.
- ClinicalSignificance modes:
  - `pathogenic` (default): includes "pathogenic" and "likely pathogenic" via substring match
  - `risk_trait`: includes risk/trait terms (risk factor, association, protective, affects, drug response, confers sensitivity/resistance)
  - `broad`: union of `pathogenic` + `risk_trait`
  - `custom`: supply your own terms with `--clinsig-terms "term1,term2"`
- Allele-aware check (TXT inputs): Uses ClinVar VCF to test if your bases include the ALT allele(s) for matched RSIDs
- Privacy: Your genotype file is read locally only
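The ClinicalSignificance modes described above can be sketched as case-insensitive substring matching against a list of terms per mode. This is an assumption-laden illustration, not the tool's source: `MODE_TERMS`, `terms_for`, and `matches` are hypothetical names, and splitting "confers sensitivity/resistance" into two terms is one reading of the list above.

```python
# Hypothetical term lists taken from the mode descriptions in this README.
MODE_TERMS = {
    "pathogenic": ["pathogenic", "likely pathogenic"],
    "risk_trait": [
        "risk factor", "association", "protective", "affects",
        "drug response", "confers sensitivity", "confers resistance",
    ],
}
# "broad" is the union of the two lists above.
MODE_TERMS["broad"] = MODE_TERMS["pathogenic"] + MODE_TERMS["risk_trait"]


def terms_for(mode, custom_terms=""):
    """Return the search terms for a mode; 'custom' parses a comma-separated string."""
    if mode == "custom":
        return [t.strip().lower() for t in custom_terms.split(",") if t.strip()]
    return MODE_TERMS[mode]


def matches(clinsig, terms):
    """True if any term occurs as a substring of the ClinicalSignificance text."""
    text = clinsig.lower()
    return any(term in text for term in terms)


print(matches("Pathogenic/Likely pathogenic", terms_for("pathogenic")))
print(matches("drug response", terms_for("broad")))
print(matches("Conflicting classifications of pathogenicity", terms_for("pathogenic")))
```

One consequence of plain substring matching, visible in the last call, is that a string such as "Conflicting classifications of pathogenicity" also contains "pathogenic". Whether the tool filters such cases is not stated here, so treat matched entries as candidates to review rather than confirmed classifications.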
- Ensure Python 3 is installed (macOS usually has it).
- Run the reproductive genetic carrier panel (pathogenic):
    python3 rsid_clinvar_checker.py \
      --genotype "/absolute/path/to/your_raw.txt" \
      --genes HBA1,HBA2,HBB,ASPA,CFTR,DLD,ELP1,ABCC8,FANCC,FMR1,GBA,G6PC,TMEM216,BCKDHB,MCOLN1,NEB,SMPD1,SMN1,SMN2,HEXA,PCDH15,CLRN1,FKTN \
      --mode pathogenic
- For allele-aware calls (TXT inputs only):
    python3 rsid_clinvar_checker.py \
      --genotype "/absolute/path/to/your_raw.txt" \
      --genes HBA1,HBA2,HBB,ASPA,CFTR,DLD,ELP1,ABCC8,FANCC,FMR1,GBA,G6PC,TMEM216,BCKDHB,MCOLN1,NEB,SMPD1,SMN1,SMN2,HEXA,PCDH15,CLRN1,FKTN \
      --mode pathogenic \
      --allele-aware
- Custom ClinicalSignificance terms:
    python3 rsid_clinvar_checker.py \
      --genotype "/absolute/path/to/your_raw.txt" \
      --genes APOE,LCT \
      --mode custom \
      --clinsig-terms "risk factor,association,drug response"
- The report prints to the terminal and is saved as `rsid_clinvar_report.txt` next to your genotype file.
- "Overlaps found" means your file contains the listed RSIDs that ClinVar classifies with the selected terms.
- With `--allele-aware` (TXT inputs), the tool reports whether your bases include the specific ALT allele(s) for those RSIDs.
- Important: RSIDs can tag multiple alleles. An RSID match alone does not confirm the specific allele.
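To illustrate why an RSID match and an allele match are different things, here is a minimal sketch of an allele-aware comparison for single-base variants. It is not the tool's implementation: `carries_alt` is a hypothetical helper, the genotypes and ALT alleles are invented, and strand orientation, indels, and multi-base alleles are deliberately ignored.

```python
def carries_alt(genotype, alt_alleles):
    """Return the sorted single-base ALT alleles present in a genotype call.

    Hypothetical helper: `genotype` is a call such as "AG" from a TXT export,
    `alt_alleles` is an iterable of ALT alleles (e.g. read from a ClinVar VCF).
    No-calls ("--") and indel codes ("I", "D") never match in this sketch.
    """
    bases = {b for b in genotype.upper() if b in "ACGT"}
    return sorted(bases & {a.upper() for a in alt_alleles})


print(carries_alt("AG", ["G"]))       # ['G']: one copy of the ALT allele
print(carries_alt("AA", ["G", "T"]))  # []: RSID is present, but neither ALT allele is
print(carries_alt("--", ["G"]))       # []: no-call
```

The second call is the case the note above warns about: the RSID appears in the file, but the reported bases do not include any ALT allele.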
    grep -E '^(rs334|rs113993960|rs28929474|rs1050828|rs1800562)[[:space:]]' \
      "/absolute/path/to/your_raw.txt" | head
- If lines appear, your chip likely covers some positive-control variants.
- Many pathogenic variants lack rsIDs or are not on consumer SNP arrays; a negative screen does not rule out disease.
- ClinVar classifications change over time; re-run periodically to use the latest reference.
- For medical decisions, use clinical-grade testing and consult a clinician.
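Because ClinVar classifications change, one simple way to decide when to refresh the reference is to check the age of the cached file. The sketch below is an assumption, not a feature of the tool: `is_stale` is a hypothetical helper, the 30-day threshold is arbitrary, and the path is a placeholder because the cache location is not documented here.

```python
import os
import time


def is_stale(path, max_age_days=30, now=None):
    """True if the file is missing or was last modified more than max_age_days ago."""
    if not os.path.exists(path):
        return True  # nothing cached yet, so a download is needed
    now = time.time() if now is None else now
    age_days = (now - os.path.getmtime(path)) / 86400  # seconds per day
    return age_days > max_age_days


# Placeholder path; substitute wherever your cached ClinVar file lives.
print(is_stale("variant_summary.txt.gz"))
```

If this returns `True`, deleting the cached file (or otherwise forcing a fresh download) before the next run would pick up the latest classifications.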
License: MIT