GitHub - sneMoDi/AI-Powered-Patient-Risk-Prediction-System-

README: Patient Risk Prediction System

Overview

The Patient Risk Prediction System uses machine learning models to predict patient risk for three conditions: diabetes, hypertension, and stroke. It trains several classification models on structured datasets and evaluates each one with accuracy, F1 score, precision, and recall.

Datasets

Diabetes Dataset:
- Predicts whether a patient is at risk of diabetes.
- Target Variable: Diabetes
Hypertension Dataset:
- Predicts the likelihood of a patient having hypertension.
- Target Variable: target
Strokes Dataset:
- Predicts the probability of a stroke occurrence.
- Target Variable: stroke

Models Evaluated

Logistic Regression
Decision Tree
Random Forest
K-Nearest Neighbors (KNN)
Multinomial Naive Bayes (for normalized datasets)
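
The five classifiers listed above correspond to standard scikit-learn estimators. The following is a minimal sketch of how they might be instantiated. The `build_models` helper name and the hyperparameters (`max_iter`, `random_state`) are assumptions for illustration, not settings taken from the notebook.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_models(random_state=42):
    """Return a fresh, unfitted instance of each classifier (hypothetical helper)."""
    return {
        # max_iter is raised from the default only to reduce convergence warnings.
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        "Random Forest": RandomForestClassifier(random_state=random_state),
        "KNN": KNeighborsClassifier(),
        # MultinomialNB rejects negative feature values, so it is only
        # usable on data scaled to a non-negative range.
        "Multinomial Naive Bayes": MultinomialNB(),
    }
```

Building the estimators inside a function gives each dataset its own unfitted copies, so a model trained on one dataset is never reused on another.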

Results Summary

Diabetes Dataset

Model	Training Accuracy	Testing Accuracy	F1 Score (Test)
Logistic Regression	74.78%	74.61%	74.60%
Decision Tree	97.17%	65.86%	65.85%
Random Forest	97.17%	72.57%	72.53%
KNN	82.89%	69.48%	69.46%

Hypertension Dataset

Model	Training Accuracy	Testing Accuracy	F1 Score (Test)
Logistic Regression	85.87%	85.74%	85.62%
Decision Tree	100%	100%	100%
Random Forest	100%	100%	100%
KNN	100%	100%	100%
Multinomial Naive Bayes	78.61%	78.55%	78.30%

Strokes Dataset

Model	Training Accuracy	Testing Accuracy	F1 Score (Test)
Logistic Regression	68.39%	68.25%	68.10%
Decision Tree	100%	99.94%	99.94%
Random Forest	100%	99.66%	99.66%
KNN	94.42%	89.64%	89.57%
Multinomial Naive Bayes	66.31%	66.94%	66.03%

Key Observations

Decision Tree and Random Forest:
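- Reached 100% test accuracy on the Hypertension dataset and over 99.6% on the Strokes dataset. Scores this high on held-out data should be verified: they may reflect memorization of duplicated or near-identical records shared by the training and test splits rather than genuine generalization.
- Overfit the Diabetes dataset: both models reached 97.17% training accuracy but only 65.86% (Decision Tree) and 72.57% (Random Forest) on the test set.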
Logistic Regression:
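- Provided the most stable results: training and testing accuracy differ by less than 0.2 percentage points on every dataset, and it had the highest test accuracy on the Diabetes dataset (74.61%).
- A good choice when the classes are roughly linearly separable and a simple, interpretable model is preferred.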
KNN:
- Showed strong performance on the Strokes dataset (89.64% test accuracy) and matched the tree-based models at 100% on the Hypertension dataset.
Multinomial Naive Bayes:
- Ran only on normalized data and had the lowest test accuracy on both datasets where it was evaluated (78.55% on Hypertension, 66.94% on Strokes).
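
One way to check whether very high test scores come from repeated records is to count how many test rows also appear verbatim in the training split. The sketch below is a suggested diagnostic, not part of the original notebook. The `count_shared_rows` name is hypothetical, and both frames are assumed to have the same columns (features plus target).

```python
import pandas as pd


def count_shared_rows(train_df, test_df):
    """Count test rows that also occur, value for value, in the training split."""
    # Deduplicate the training rows so each test row matches at most once.
    train_unique = train_df.drop_duplicates()
    merged = test_df.merge(
        train_unique,
        how="left",
        on=list(test_df.columns),
        indicator=True,  # adds a "_merge" column: "both" or "left_only"
    )
    return int((merged["_merge"] == "both").sum())


if __name__ == "__main__":
    train = pd.DataFrame({"age": [40, 52, 63], "target": [0, 1, 0]})
    test = pd.DataFrame({"age": [40, 40, 71], "target": [0, 0, 1]})
    print(count_shared_rows(train, test))
```

A count that is a large fraction of the test set would suggest that the reported test accuracy partly measures memorized rows.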

System Features

Normalization:
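- Applied MinMaxScaler to scale features to the range [0, 1]. This keeps all inputs non-negative, which Multinomial Naive Bayes requires, and puts features on a common scale for distance-based models such as KNN.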
Evaluation Metrics:
- Accuracy, F1 Score, Precision, and Recall are calculated for both training and testing datasets.
Heatmap Visualization:
- Correlation matrices provide insights into feature relationships and predictive power.
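
The normalization and evaluation steps above can be sketched as follows. This is an illustrative outline under several assumptions: the `evaluate` helper name, the 80/20 stratified split, fitting the scaler on the training split only, and the weighted averaging of F1, precision, and recall are all choices made here, not details confirmed by the notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def evaluate(model, X, y, test_size=0.2, random_state=42):
    """Scale features to [0, 1], fit the model, and score both splits."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    # Fit the scaler on the training split only (an assumption) so that
    # no information from the test split influences preprocessing.
    scaler = MinMaxScaler()
    X_train = scaler.fit_transform(X_train)
    # Test values outside the training range would fall outside [0, 1];
    # clipping keeps them non-negative for Multinomial Naive Bayes.
    X_test = np.clip(scaler.transform(X_test), 0.0, 1.0)

    model.fit(X_train, y_train)

    def score(X_part, y_part):
        pred = model.predict(X_part)
        return {
            "accuracy": accuracy_score(y_part, pred),
            "f1": f1_score(y_part, pred, average="weighted", zero_division=0),
            "precision": precision_score(y_part, pred, average="weighted", zero_division=0),
            "recall": recall_score(y_part, pred, average="weighted", zero_division=0),
        }

    return {"train": score(X_train, y_train), "test": score(X_test, y_test)}
```

Comparing the "train" and "test" entries for the same model gives the kind of gap shown in the Results Summary tables.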

How to Use

Pre-requisites:
- Python 3.7+
- Libraries: pandas, numpy, scikit-learn (imported as sklearn), seaborn, matplotlib
Run the Code:
- Update the dataset paths in the datasets dictionary.
- Execute the script to train models and visualize results.
View Outputs:
- Model metrics are printed for each dataset.
- Correlation matrices and insights are displayed as visualizations.
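
As a rough guide to the steps above, the datasets dictionary and the correlation view might look like the sketch below. The file paths, the dictionary layout, and the helper names (`target_correlations`, `plot_correlation_heatmap`) are hypothetical; only the target column names come from the Datasets section.

```python
import pandas as pd

# Hypothetical paths: replace them with the location of your extracted CSV files.
datasets = {
    "Diabetes": {"path": "data/diabetes.csv", "target": "Diabetes"},
    "Hypertension": {"path": "data/hypertension.csv", "target": "target"},
    "Strokes": {"path": "data/strokes.csv", "target": "stroke"},
}


def target_correlations(df, target):
    """Correlation of each numeric feature with the target, strongest first."""
    corr = df.select_dtypes("number").corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


def plot_correlation_heatmap(df, title):
    """Draw the full correlation matrix as an annotated heatmap."""
    # Imported here so the helper above works without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    corr = df.select_dtypes("number").corr()
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm")
    plt.title(title)
    plt.tight_layout()
    plt.show()
```

A typical loop would read each file with `pd.read_csv(info["path"])`, call `plot_correlation_heatmap`, and then pass the features and the `info["target"]` column to the training code.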

Conclusion

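This system demonstrates that standard machine learning classifiers can predict patient risk from structured health data, with test accuracy ranging from about 66% to 100% depending on the model and dataset.
Decision Tree and Random Forest achieve the highest scores on the Hypertension and Strokes datasets but overfit the Diabetes dataset.
Logistic Regression is reliable for datasets with linear relationships.
Scaling features to a non-negative range is required for Multinomial Naive Bayes, which cannot handle negative inputs.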

For further enhancements, consider hyperparameter tuning and additional data preprocessing techniques.
