GWAS and machine learning analysis to identify genomic predictors of SSRI treatment response in Major Depressive Disorder. Developed as an MS Applied Project at Arizona State University (May 2025).
A multi-stage workflow combining genotype QC, GWAS, pathway analysis, and ML prediction:
- QC & Filtering: `QC_Filtering_Analysis.sh`, `filterSubjects.sh`, `filterForML.sh`
- Genotype Processing: `plinkAndGeneFilter.sh`, `sortAndIndex.sh`, `calcLD.sh`
- Summary Statistics: `readSummStats.R`, `formatSummStats.ipynb`
- MAGMA Gene Analysis: `formatForMagma.ipynb`, `runAnalysis.sh`, `readMagmaResults.R`
- GSEA Pathway Analysis: `runGSEA.sh`, `reformatBED.ipynb`
- ML Data Preparation: `createMLData.py`, `createMLData.ipynb`, `formatML.R`
- ML Training & Tuning: `runML.ipynb`, `runHyperparam.sh`
- Response Analysis: `runResponseAnalysis.sh`
- Up to 70% accuracy predicting seasonal depression pattern
- 66% accuracy predicting citalopram treatment response
- PLINK, MAGMA, GSEA
- Python, R, Bash | Jupyter Notebooks
- Linux HPC | Conda
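
For readers new to genotype QC, the kind of variant filtering done in the QC & Filtering stage can be sketched in plain Python. This is an illustrative sketch only, not the code in the scripts listed above: the function names, the toy variant IDs, and the thresholds (call rate of at least 0.95, minor allele frequency of at least 0.01) are all assumptions chosen for the example, and the project itself relies on PLINK for this step.

```python
from typing import Dict, List, Optional

# Genotypes are coded as allele counts (0, 1, 2); None marks a missing call.
Genotypes = Dict[str, List[Optional[int]]]


def call_rate(calls: List[Optional[int]]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_frequency(calls: List[Optional[int]]) -> float:
    """Minor allele frequency computed over non-missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def filter_variants(
    genotypes: Genotypes,
    min_call_rate: float = 0.95,  # hypothetical threshold
    min_maf: float = 0.01,        # hypothetical threshold
) -> Genotypes:
    """Keep variants that pass both the call-rate and the MAF threshold."""
    return {
        snp: calls
        for snp, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_frequency(calls) >= min_maf
    }


# Toy data with made-up variant IDs, ten samples each.
toy: Genotypes = {
    "rs_common": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],
    "rs_rare": [0] * 10,
    "rs_missing": [0, 1, None, None, 1, 2, None, 0, 1, 1],
}

kept = filter_variants(toy)
print(sorted(kept))
```

In this toy data only `rs_common` passes: `rs_rare` is monomorphic, and `rs_missing` has a call rate of 0.7.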