Heart Disease Analysis Using R
Description: This project analyzes heart disease risk factors using statistical and predictive modeling techniques in R. The dataset includes 303 patients, with variables such as age, cholesterol, chest pain type, and target (heart disease presence).
Objectives: 1. Explore relationships between health parameters and heart disease. 2. Compare logistic regression, decision trees, and random forests for prediction. 3. Identify key predictors of heart disease.
Methods:
- Statistical Tests: T-tests and Chi-Square tests to analyze relationships.
- Predictive Models: Logistic regression, decision trees, random forests, and SVM.
- Visualizations: Power BI and R-generated plots.
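As a rough illustration of the statistical tests listed above, the sketch below runs a t-test and a chi-square test in base R. It uses simulated stand-in data rather than the project's dataset, and the column names (age, chol, cp, target) are assumptions that may not match the actual spreadsheet.

```r
# Simulated stand-in data; NOT the project's dataset.
# Column names (age, chol, cp, target) are assumed for illustration.
set.seed(42)
n <- 303
heart <- data.frame(
  age    = round(rnorm(n, mean = 54, sd = 9)),
  chol   = round(rnorm(n, mean = 246, sd = 50)),
  cp     = factor(sample(0:3, n, replace = TRUE)),
  target = factor(sample(0:1, n, replace = TRUE))
)

# Welch two-sample t-test: does mean age differ between the target groups?
age_test <- t.test(age ~ target, data = heart)

# Chi-square test of independence: chest pain type vs. target
cp_table <- table(heart$cp, heart$target)
cp_test  <- chisq.test(cp_table)

print(age_test$p.value)
print(cp_test$p.value)
```

Because the data here are random, the p-values carry no meaning; with the real data, a small p-value would suggest an association between the variable and heart disease presence.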
Key Findings: 1. Age and chest pain type are significant predictors. 2. Random forest achieved the highest accuracy of the models compared (73.35%). 3. Decision trees provided high interpretability.
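To show how an accuracy figure like the one above could be computed, here is a minimal sketch that fits a logistic regression on a training split and scores it on held-out rows. The data are simulated, the column names and the 70/30 split are assumptions, and this is not necessarily the procedure used in the project's R script.

```r
# Simulated stand-in data; NOT the project's dataset.
# Column names and the 70/30 split are assumptions for illustration.
set.seed(1)
n <- 303
heart <- data.frame(
  age = round(rnorm(n, mean = 54, sd = 9)),
  cp  = factor(sample(0:3, n, replace = TRUE))
)
# Simulated outcome loosely tied to age and cp so the model has some signal
lin <- -3 + 0.04 * heart$age + 0.5 * as.integer(heart$cp)
heart$target <- rbinom(n, 1, plogis(lin))

# Hold out 30% of rows for testing
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train <- heart[train_idx, ]
test  <- heart[-train_idx, ]

# Logistic regression via glm with a binomial family
fit <- glm(target ~ age + cp, data = train, family = binomial)

# Classify test rows at a 0.5 probability threshold and compute accuracy
prob <- predict(fit, newdata = test, type = "response")
pred <- as.integer(prob > 0.5)
accuracy <- mean(pred == test$target)
print(accuracy)
```

Calling summary(fit) prints coefficient estimates and p-values, which is one common way to judge which predictors are significant. The same train/test split can be reused to score other models so that their accuracies are comparable.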
Files:
Heart Disease Patient Data obtained from Kaggle: https://github.com/Risanaparvin00/Heart-Disease-Analysis-UsingR/blob/main/Heart%20Disease%20Data.xlsx
R script to replicate the analysis: https://github.com/Risanaparvin00/Heart-Disease-Analysis-UsingR/blob/main/R%20code
View the thesis document for results: https://github.com/Risanaparvin00/Heart-Disease-Analysis-UsingR/blob/main/Heart_Disease_Analysis_Thesis.docx
Clone the repository: git clone https://github.com/Risanaparvin00/Heart-Disease-Analysis-UsingR.git
(Note: the plots are saved as .emf files and must be downloaded to be viewed.)