This project applies Machine Learning (ML) techniques to detect and analyze Parkinson's disease.
It includes classification models for diagnosis and regression models for severity prediction, supporting early diagnosis and disease progression tracking.
Parkinson/
├── Model_Training/
│ ├── Classification.ipynb # Jupyter notebook for classification models
│ ├── Regression.ipynb # Jupyter notebook for regression models
│
├── Model_Pipeline/
│ ├── Script_classification.ipynb # Jupyter notebook for classification pipeline
│ ├── Script_regression.ipynb # Jupyter notebook for regression pipeline
│
├── Dataset/
│ ├── Dataset_Description.md # Detailed dataset documentation
│ ├── parkinsons_disease_data_cls.csv # Dataset used for training/testing
│
├── Parkinsons_Project_Documentation.pdf # Project documentation (PDF)
│
└── README.md # Project documentation (Markdown)
- Detect Parkinson's disease using patient health records (demographic, lifestyle, clinical, and symptom data).
- Compare and evaluate different machine learning algorithms.
- Perform both classification (disease vs. no disease) and regression (severity prediction).
- Provide interpretable metrics to support medical decision-making.
The dataset provides comprehensive health information for patients (IDs 3058–5162) who underwent examination for Parkinson’s disease diagnosis.
It includes demographic, lifestyle, medical, cognitive, functional, and symptom-related features.
- PatientID: Unique identifier (3058–5162)
- Demographics: Age (50–90), Gender, Ethnicity, Education level
- Lifestyle: BMI, Smoking status, Alcohol consumption (weekly units), Physical activity (hours/week), Diet quality score, Sleep quality score
- Family history of Parkinson’s
- Traumatic brain injury
- Hypertension, Diabetes, Depression, Stroke
- Blood Pressure: Systolic (90–180 mmHg), Diastolic (60–120 mmHg)
- Cholesterol: Total (150–300), LDL (50–200), HDL (20–100), Triglycerides (50–400)
- UPDRS: Unified Parkinson’s Disease Rating Scale (0–199; higher = worse)
- MoCA: Montreal Cognitive Assessment (0–30; lower = impairment)
- Functional Assessment: 0–10; lower = impairment
- Tremor, Rigidity, Bradykinesia (slowness of movement)
- Postural instability
- Speech problems
- Sleep disorders
- Constipation
- DoctorInCharge: Confidential field with anonymized placeholder (DrXXXConfid)
- Diagnosis: Binary outcome → 1 = Parkinson’s, 0 = No Parkinson’s
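As a minimal sketch of how these columns might be prepared for modeling, the snippet below reads rows with Python's standard `csv` module, drops the two fields that carry no predictive information (`PatientID` and `DoctorInCharge`), and separates the `Diagnosis` label from the remaining features. The helper names `split_features_and_label` and `load_dataset` are hypothetical, the three column names are taken from the list above, and the code assumes every remaining column is numeric; the notebooks may prepare the data differently.

```python
import csv

# Column names assumed from the feature list above; adjust if the CSV differs.
ID_COLUMNS = ("PatientID", "DoctorInCharge")
LABEL_COLUMN = "Diagnosis"


def split_features_and_label(rows):
    """Split dict-like rows into (feature_names, X, y).

    Drops the identifier columns and converts the remaining values to float.
    Assumes every non-identifier column holds a numeric value.
    """
    feature_names = None
    X, y = [], []
    for row in rows:
        if feature_names is None:
            # Keep the column order of the first row for all later rows.
            feature_names = [
                name for name in row
                if name not in ID_COLUMNS and name != LABEL_COLUMN
            ]
        X.append([float(row[name]) for name in feature_names])
        # int(float(...)) tolerates labels stored as "1" or "1.0".
        y.append(int(float(row[LABEL_COLUMN])))
    return feature_names or [], X, y


def load_dataset(path):
    """Read the CSV at `path` and return (feature_names, X, y)."""
    with open(path, newline="") as handle:
        return split_features_and_label(csv.DictReader(handle))
```

For example, `load_dataset("Dataset/parkinsons_disease_data_cls.csv")` would return the feature names, one list of floats per patient, and the 0/1 labels, assuming it is run from the repository root.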
Start Jupyter Lab or Notebook:
jupyter notebook
Open:
- Classification.ipynb → Train & evaluate classification models
- Regression.ipynb → Train & evaluate regression models
- Script_classification.ipynb → Run the trained classification model pipeline on new data
- Script_regression.ipynb → Run the trained regression model pipeline on new data
- Classification Models:
  - Accuracy ✅
  - Precision, Recall, F1-score 📈
  - ROC-AUC 🔵
- Regression Models:
  - Mean Squared Error (MSE)
  - Mean Absolute Error (MAE)
  - R² Score
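To make these definitions concrete, here is a minimal, dependency-free sketch of how the metrics are computed from true and predicted values. The function names are illustrative and not taken from the notebooks, and ROC-AUC is left out because it needs predicted scores rather than hard labels. In practice a library such as scikit-learn provides tested equivalents (for example `accuracy_score`, `f1_score`, `mean_squared_error`, and `r2_score` in `sklearn.metrics`).

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted/actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R² for numeric targets."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined for a constant target; report 0.0 in that case.
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return {"mse": mse, "mae": mae, "r2": r2}
```

With `y_true = [1, 0, 1, 1, 0]` and `y_pred = [1, 0, 0, 1, 1]`, the classifier gets three of five labels right (accuracy 0.6) with one false positive and one false negative, so precision, recall, and F1 all equal 2/3.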
- Achieved ~95% accuracy in classification.
- Regression models provided good estimates of disease progression severity.
(Detailed results are available inside the notebooks.)
- Incorporate deep learning models (e.g., LSTMs for time-series voice data).
- Enhance feature engineering with signal processing techniques.
- Deploy as a web-based diagnostic tool for clinicians.
Hossam Ahmed
Fatma Ahmed
Maria Kaiser
Nouran Haitham
Nora Ahmed
Christin Medhat