🫀 Heart Disease Data Analysis

📌 Overview

This project performs an in-depth Exploratory Data Analysis (EDA) on a heart disease dataset to identify key factors that influence the presence of heart disease.

The main objective is to understand how variables like age, cholesterol levels, chest pain type, resting blood pressure, and maximum heart rate affect the likelihood of heart disease.

This analysis helps uncover data-driven insights that can support early diagnosis and better healthcare decision-making.


📂 Dataset Information

  • File: heart.csv
  • Rows: 918
  • Columns: 12

Key Features:

  • Age
  • Sex
  • ChestPainType
  • RestingBP
  • Cholesterol
  • FastingBS
  • RestingECG
  • MaxHR
  • ExerciseAngina
  • Oldpeak
  • ST_Slope
  • HeartDisease (Target Variable: 1 = Yes, 0 = No)

🛠️ Technologies Used

  • Python 🐍
  • Pandas
  • Matplotlib
  • Seaborn
  • Jupyter Notebook

🔍 Project Workflow

1. Data Loading

  • Loaded dataset using Pandas
  • Checked dataset shape, structure, and data types
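A minimal sketch of the loading step. The rows below are made-up values in the column layout listed above (the category labels such as `ATA` or `Normal` are assumptions, not taken from this README), and `load_heart_data` is a hypothetical helper name. With the real file you would pass `"heart.csv"` instead of the in-memory sample.

```python
import io

import pandas as pd

# Made-up rows in the heart.csv column layout (illustrative only, not real data).
SAMPLE_CSV = """Age,Sex,ChestPainType,RestingBP,Cholesterol,FastingBS,RestingECG,MaxHR,ExerciseAngina,Oldpeak,ST_Slope,HeartDisease
41,M,ATA,128,240,0,Normal,165,N,0.0,Up,0
55,F,NAP,150,0,1,Normal,140,Y,1.5,Flat,1
62,M,ASY,0,275,0,ST,110,Y,2.0,Flat,1
"""


def load_heart_data(source):
    """Read the dataset from a file path or a file-like object."""
    return pd.read_csv(source)


# With the real dataset: df = load_heart_data("heart.csv")
df = load_heart_data(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # data type of each column
df.info()         # structure: non-null counts per column
```

On the full dataset, `df.shape` should report the 918 rows and 12 columns listed under Dataset Information.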

2. Data Cleaning

  • Checked for missing or inconsistent values
  • Handled zero/invalid values in:
    • Cholesterol
    • RestingBP
  • Ensured proper data types for categorical and numerical columns
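The README does not say how the zero values were handled, so the following is only one plausible approach: treat `0` in `Cholesterol` and `RestingBP` as a missing measurement and fill it with the median of the valid readings. The helper name `replace_zeros_with_median` and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; a 0 stands in for an invalid measurement.
df = pd.DataFrame({
    "Cholesterol": [210, 0, 250, 190],
    "RestingBP": [120, 130, 0, 140],
})


def replace_zeros_with_median(frame, columns):
    """Return a copy where 0 is treated as missing and filled with the column median."""
    cleaned = frame.copy()
    for col in columns:
        valid = cleaned[col].replace(0, np.nan)   # 0 -> NaN so it is ignored by median()
        cleaned[col] = valid.fillna(valid.median())
    return cleaned


print(df.isna().sum())                                 # missing values per column
print((df[["Cholesterol", "RestingBP"]] == 0).sum())   # zero counts per column

cleaned = replace_zeros_with_median(df, ["Cholesterol", "RestingBP"])
print(cleaned)
```

Dropping the affected rows is an equally reasonable choice when few rows contain zeros; the median fill simply keeps the row count unchanged.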

3. Feature Engineering

  • Converted HeartDisease from numeric (1 / 0) to categorical labels (Yes / No)
  • Encoded categorical variables where required
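A short sketch of both feature-engineering steps, using made-up rows. Writing the labels to a separate `HeartDiseaseLabel` column and using one-hot encoding via `pd.get_dummies` are assumptions for illustration; the notebook may convert the column in place or encode differently.

```python
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({
    "Sex": ["M", "F", "M"],
    "ExerciseAngina": ["N", "Y", "N"],
    "HeartDisease": [0, 1, 1],
})

# Numeric target -> readable labels, stored in a new (hypothetical) column.
df["HeartDiseaseLabel"] = (
    df["HeartDisease"].map({1: "Yes", 0: "No"}).astype("category")
)

# One-hot encode selected categorical columns (one possible encoding).
encoded = pd.get_dummies(df, columns=["Sex", "ExerciseAngina"], dtype=int)

print(df["HeartDiseaseLabel"].value_counts())
print(encoded.columns.tolist())
```

Keeping the numeric `HeartDisease` column alongside the labels makes it easy to compute rates with `.mean()` while still getting readable legends in plots.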

4. Exploratory Data Analysis (EDA)

Performed analysis using:

  • Value counts for categorical variables

  • Pie charts for:

    • Heart disease distribution
    • Chest pain types
    • Exercise-induced angina
  • Bar plots for:

    • Heart disease count
    • Gender-wise comparison
  • Comparative analysis:

    • Patients with vs without heart disease
    • Impact of chest pain type
    • Effect of exercise angina
  • Statistical summaries:

    • Average age
    • Average cholesterol
    • Average maximum heart rate
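The summaries listed above can be sketched with a few pandas calls. The rows and the chest pain labels (`ATA`, `ASY`, `NAP`) are made up for illustration and are not results from the project.

```python
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({
    "Age": [40, 60, 50, 70],
    "Cholesterol": [200, 260, 220, 240],
    "MaxHR": [170, 120, 160, 110],
    "ChestPainType": ["ATA", "ASY", "NAP", "ASY"],
    "HeartDisease": [0, 1, 0, 1],
})

# Value counts for a categorical variable.
pain_counts = df["ChestPainType"].value_counts()

# Average age, cholesterol and maximum heart rate, with vs without heart disease.
group_means = df.groupby("HeartDisease")[["Age", "Cholesterol", "MaxHR"]].mean()

# Share of patients with heart disease within each chest pain type.
rate_by_pain = df.groupby("ChestPainType")["HeartDisease"].mean()

print(pain_counts)
print(group_means)
print(rate_by_pain)
```

Because the target is coded 1 / 0, the mean of `HeartDisease` within a group is the proportion of patients in that group who have heart disease.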

📊 Key Insights

  • ❤️ Age Factor:
    Higher age groups show a greater likelihood of heart disease.

  • 🩺 Chest Pain Type is Crucial:
    Certain chest pain types are strongly associated with heart disease presence.

  • 🏃 Heart Rate Impact:
    Lower maximum heart rate is often observed in patients with heart disease.

  • ⚠️ Exercise-Induced Angina:
    Individuals experiencing angina during exercise are more likely to have heart disease.

  • 📉 Cholesterol & Blood Pressure:
    Abnormal cholesterol and resting blood pressure levels appear to be associated with higher risk, though the pattern is less consistent than for the factors above.
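One way to check an insight such as the age factor is to bin `Age` and compare the share of heart disease cases per bin. The bin edges, labels, and rows below are assumptions chosen for illustration, not the grouping used in the notebook.

```python
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({
    "Age": [35, 45, 52, 58, 63, 71],
    "HeartDisease": [0, 0, 1, 0, 1, 1],
})

# Hypothetical age bands; pd.cut uses right-closed intervals by default,
# so 40 falls in "<=40" and 41 falls in "41-50".
bins = [0, 40, 50, 60, 120]
labels = ["<=40", "41-50", "51-60", ">60"]
df["AgeGroup"] = pd.cut(df["Age"], bins=bins, labels=labels)

# Proportion of patients with heart disease in each age band.
rate_by_age = df.groupby("AgeGroup", observed=True)["HeartDisease"].mean()
print(rate_by_age)
```

The same pattern works for `MaxHR`, `Cholesterol`, or `RestingBP` by swapping the column and the bin edges.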


🧪 Hypothesis Testing

The project explores hypotheses such as:

  • Whether age significantly affects heart disease occurrence
  • The impact of cholesterol and blood pressure on heart health
  • Relationship between exercise-induced angina and heart disease
  • Differences in heart disease occurrence across gender
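The README does not name a specific statistical test. As one possible sketch, a hypothesis like the angina relationship can be checked with a chi-square test of independence on a 2x2 table. The version below computes the statistic by hand with pandas and NumPy, without a continuity correction, and gets the p-value from the 1-degree-of-freedom identity `p = erfc(sqrt(stat / 2))`. The function name `chi_square_2x2` and the data are hypothetical.

```python
import math

import pandas as pd

# Made-up data: 10 patients with exercise angina, 10 without.
df = pd.DataFrame({
    "ExerciseAngina": ["Y"] * 10 + ["N"] * 10,
    "HeartDisease": [1] * 8 + [0] * 2 + [1] * 3 + [0] * 7,
})


def chi_square_2x2(frame, row, col):
    """Chi-square test of independence for two binary columns (no continuity correction)."""
    observed = pd.crosstab(frame[row], frame[col]).to_numpy()
    if observed.shape != (2, 2):
        raise ValueError("both columns must have exactly two distinct values")
    total = observed.sum()
    row_totals = observed.sum(axis=1)
    col_totals = observed.sum(axis=0)
    # Expected counts if the two variables were independent.
    expected = row_totals[:, None] * col_totals[None, :] / total
    stat = float(((observed - expected) ** 2 / expected).sum())
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p_value = chi_square_2x2(df, "ExerciseAngina", "HeartDisease")
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

For numeric variables such as age or cholesterol, a two-sample comparison of the groups with and without heart disease would be the analogous check.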

📌 Visualizations

The notebook includes:

  • Pie charts 📊
  • Bar graphs 📈
  • Distribution plots 📉

These visualizations help in understanding relationships between health indicators and heart disease.
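A minimal sketch of the three chart types using Matplotlib only, on made-up rows. The `Agg` backend line is there so the snippet also runs as a plain script and can be dropped inside Jupyter. The notebook may well use Seaborn for some of these plots instead.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({
    "Sex": ["M", "M", "F", "M", "F", "M"],
    "MaxHR": [172, 150, 165, 118, 125, 110],
    "HeartDisease": ["No", "No", "No", "Yes", "Yes", "Yes"],
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Pie chart: heart disease distribution.
counts = df["HeartDisease"].value_counts()
axes[0].pie(counts.to_numpy(), labels=counts.index.tolist(), autopct="%1.1f%%")
axes[0].set_title("Heart disease distribution")

# Bar graph: heart disease count by sex.
by_sex = pd.crosstab(df["Sex"], df["HeartDisease"])
by_sex.plot(kind="bar", ax=axes[1])
axes[1].set_title("Heart disease count by sex")

# Distribution plot: maximum heart rate, split by heart disease status.
for label, group in df.groupby("HeartDisease"):
    axes[2].hist(group["MaxHR"], bins=5, alpha=0.6, label=label)
axes[2].set_title("MaxHR distribution")
axes[2].legend(title="HeartDisease")

fig.tight_layout()
```

In a notebook the figure renders inline; in a script, `fig.savefig(...)` with a path of your choice writes it to disk.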


🚀 How to Run

1. Clone the repository

git clone https://github.com/Dev1822/Heart-Disease-EDA
cd Heart-Disease-EDA

2. Install dependencies

pip install pandas matplotlib seaborn notebook

3. Run the notebook

Open the project notebook in Jupyter Notebook and run the cells from top to bottom.


Made by: https://github.com/Dev1822
