Heart Disease Prediction Using Machine Learning
This project uses machine learning classification models to predict whether a patient is likely to have heart disease from clinical features such as age, chest pain type, cholesterol level, fasting blood sugar, resting ECG results, and maximum heart rate.
The project compares several models and selects the best-performing one for deployment.
Tech Stack
Python
Pandas, NumPy
Scikit-Learn
Random Forest, Logistic Regression, Decision Tree, MLPClassifier
Joblib (model saving)
Matplotlib/Seaborn (optional)
Excel dataset (heart.xlsx)
Project Structure
Heart-Disease-Prediction-Using-Machine-Learning/
│
├── heartdisease.py # Main ML training script
├── heart.xlsx # Heart disease dataset
├── heart_disease_rf_model.pkl # Saved Random Forest model
├── scaler.pkl # Saved StandardScaler
└── README.md # Project documentation
Objective
To develop and evaluate machine learning models that can accurately predict heart disease (0 = No, 1 = Yes) based on patient health data.
Models Used
The project evaluates the following models:
Logistic Regression
Decision Tree Classifier
Random Forest Classifier
Neural Network (MLPClassifier)
The Random Forest model achieved the best overall performance and is saved as:
heart_disease_rf_model.pkl
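A comparison like the one described above could be sketched with cross-validation. This is an illustrative sketch, not the code in heartdisease.py: the synthetic data from `make_classification` stands in for heart.xlsx, and the hyperparameters, the 5-fold setting, and the accuracy metric are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: 13 features, binary label).
X, y = make_classification(n_samples=200, n_features=13, random_state=0)

# Scale-sensitive models get a StandardScaler inside a pipeline so that
# scaling is fitted on each training fold only.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
    "Neural Network (MLP)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=42),
    ),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
best_name = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("Best:", best_name)
```

On synthetic data the ranking may differ from the result reported for the real dataset.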
How It Works
1. Load Dataset
heart.xlsx contains 303 patient records with 14 medical features.
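Loading the spreadsheet and separating the label might look like the sketch below. The helper names `load_heart_data` and `split_features_target` are hypothetical, and the label column name `target` is an assumption; check the actual header row in heart.xlsx.

```python
import pandas as pd


def load_heart_data(path="heart.xlsx"):
    """Read the Excel dataset into a DataFrame (requires openpyxl)."""
    return pd.read_excel(path)


def split_features_target(df, target_col="target"):
    """Separate the label column from the feature columns.

    The default column name "target" is an assumption, not taken from the file.
    """
    if target_col not in df.columns:
        raise KeyError(
            f"expected a '{target_col}' column, found: {list(df.columns)}"
        )
    X = df.drop(columns=[target_col])
    y = df[target_col]
    return X, y


# Typical use, once heart.xlsx is in the working directory:
# df = load_heart_data()
# X, y = split_features_target(df)
```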
2. Data Preprocessing
Handling categorical features
Scaling numeric values using StandardScaler
Train-test split (80–20)
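The three preprocessing steps above could be sketched as follows. The data is synthetic, the column names (`age`, `chol`, `cp`, `target`) are illustrative assumptions, and one-hot encoding is only one possible way to handle categorical features. Stratifying the split and fixing `random_state` are also assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for heart.xlsx; column names are assumptions.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "age": rng.integers(29, 78, n).astype(float),
    "chol": rng.integers(126, 565, n).astype(float),
    "cp": rng.integers(0, 4, n),          # chest pain type, categorical
    "target": rng.integers(0, 2, n),      # 0 = No, 1 = Yes
})

# 1) Categorical features: one-hot encode the chest pain type.
X = pd.get_dummies(df.drop(columns=["target"]), columns=["cp"])
y = df["target"]

# 2) 80-20 train-test split, stratified so both sets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train, X_test = X_train.copy(), X_test.copy()

# 3) Scale numeric columns; fit on the training set only to avoid leakage.
numeric_cols = ["age", "chol"]
scaler = StandardScaler()
X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
```

Fitting the scaler on the training rows only keeps information from the test set out of the preprocessing step.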
3. Model Training
Cross-validation & hyperparameter tuning using GridSearchCV.
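A minimal sketch of cross-validated tuning with GridSearchCV is shown below. The parameter grid, the 5-fold setting, and the ROC-AUC scoring are assumptions chosen for illustration; the grid used in heartdisease.py may differ. Synthetic data stands in for the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the preprocessed training data.
X, y = make_classification(n_samples=200, n_features=13, random_state=0)

# Small illustrative grid; every combination is evaluated with 5-fold CV.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="roc_auc",
)
search.fit(X, y)

# After fitting, the best model is refitted on all of X, y
# and exposed as search.best_estimator_.
print(search.best_params_)
print(round(search.best_score_, 3))
```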
4. Evaluation Metrics
Accuracy
Precision
Recall
F1-score
ROC-AUC
Confusion Matrix
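The metrics listed above are all available in `sklearn.metrics`. The sketch below computes them on a small hand-made example; the `evaluate` helper is hypothetical and the labels and probabilities are invented for illustration, not project results.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate(y_true, y_pred, y_prob):
    """Return the six metrics as a dict.

    y_pred holds hard 0/1 predictions; y_prob holds the predicted
    probability of class 1, which ROC-AUC needs instead of hard labels.
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_prob),
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


# Toy example: 8 patients, invented values.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
y_prob = [0.1, 0.6, 0.8, 0.9, 0.4, 0.2, 0.7, 0.3]

results = evaluate(y_true, y_pred, y_prob)
for name, value in results.items():
    print(name, value)
```

With a fitted classifier, `y_prob` would typically come from `model.predict_proba(X_test)[:, 1]`.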
5. Model Saving
Both the model and the scaler are saved for later use:
heart_disease_rf_model.pkl → Random Forest model
scaler.pkl → StandardScaler for preprocessing
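Saving and reloading both artifacts with Joblib could look like this sketch. It trains a small model on synthetic data and writes to a temporary directory, so it does not touch the real .pkl files; the training settings are assumptions.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data.
X, y = make_classification(n_samples=100, n_features=13, random_state=0)

scaler = StandardScaler().fit(X)
model = RandomForestClassifier(n_estimators=20, random_state=42)
model.fit(scaler.transform(X), y)

# Write to a temp directory so existing artifacts are not overwritten.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "heart_disease_rf_model.pkl")
scaler_path = os.path.join(out_dir, "scaler.pkl")
joblib.dump(model, model_path)
joblib.dump(scaler, scaler_path)

# Later (for example in a web app): reload both and predict.
loaded_model = joblib.load(model_path)
loaded_scaler = joblib.load(scaler_path)

new_patient = X[:1]  # one row with the same 13 features, unscaled
prediction = loaded_model.predict(loaded_scaler.transform(new_patient))
print(prediction)
```

The scaler has to be saved alongside the model because new inputs must be transformed with the same statistics the model was trained on.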
Run This Project Locally
- Clone the repository:
  git clone https://github.com/richachauhan15/Heart-Disease-Prediction-Using-Machine-Learning.git
- Navigate to the project folder:
  cd Heart-Disease-Prediction-Using-Machine-Learning
- Install the required libraries:
  pip install pandas numpy scikit-learn joblib openpyxl
- Run the ML script:
  python heartdisease.py
Future Enhancements
Streamlit Web App for live predictions
Model explainability (SHAP)
Hyperparameter optimization using RandomizedSearchCV
Deployment using Flask or FastAPI
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Author
Richa Chauhan
GitHub: richachauhan15