ReiBands/Healthcare-Outcome-Prediciton

Healthcare Outcome Prediction Using Logistic Regression

Project Overview

This project focuses on predicting a patient’s health outcome using medical diagnostic data. The goal is to explore how machine learning can assist in identifying patterns related to disease risk, while remaining transparent, interpretable, and aligned with academic best practices.

Rather than building a complex or black-box system, this project intentionally uses a simple and explainable classification model to understand how medical features relate to health outcomes.


Project Goals

  • Apply machine learning to a real healthcare-related dataset
  • Practice structured data cleaning and preprocessing
  • Use logistic regression for binary classification
  • Evaluate model performance using clear and interpretable metrics
  • Understand the limitations of predictive models in healthcare contexts

Dataset Description

The dataset contains medical and demographic information collected from patients. Each row represents one patient, and the target variable indicates whether the patient tested positive or negative for a specific medical condition.

Typical features include:

  • Glucose levels
  • Blood pressure
  • Body mass index (BMI)
  • Age
  • Other physiological measurements

The target variable is binary:

  • Positive diagnosis
  • Negative diagnosis

This dataset is well-suited for introductory healthcare classification tasks due to its structured format and clear outcome variable.


Data Preparation and Cleaning

Before modeling, the dataset was carefully prepared:

  • Invalid zero values in medical measurements (for example, a BMI or blood pressure of 0, which is not physiologically plausible) were identified and treated as missing data
  • Missing values were replaced using column mean imputation
  • Features were verified for consistency and correctness
  • The dataset was split into training and testing sets

These steps help ensure that the model is trained on realistic and meaningful medical data.
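
The preparation steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the column names, the toy records, the 80/20 split, and the helper names (`zeros_to_missing`, `impute_column_means`, `train_test_split`) are all assumptions made for the example.

```python
import random
from statistics import mean


def zeros_to_missing(rows, columns):
    """Return copies of rows where a 0 in the given columns becomes None."""
    return [
        {k: (None if k in columns and v == 0 else v) for k, v in row.items()}
        for row in rows
    ]


def impute_column_means(rows, columns):
    """Replace None in each listed column with the mean of its observed values."""
    filled = [dict(row) for row in rows]
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        col_mean = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_mean
    return filled


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy records; the column names are illustrative only.
patients = [
    {"glucose": 148, "bmi": 33.6, "outcome": 1},
    {"glucose": 0, "bmi": 26.6, "outcome": 0},
    {"glucose": 183, "bmi": 0, "outcome": 1},
    {"glucose": 89, "bmi": 28.1, "outcome": 0},
    {"glucose": 137, "bmi": 43.1, "outcome": 1},
]

# Only measurement columns are cleaned; a 0 in "outcome" is a valid label.
measurement_columns = ["glucose", "bmi"]
cleaned = impute_column_means(
    zeros_to_missing(patients, measurement_columns), measurement_columns
)
train_rows, test_rows = train_test_split(cleaned, test_fraction=0.2)
```

The target column is deliberately left out of `measurement_columns`, because a zero there is a legitimate negative label rather than a missing measurement.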


Model Used

Logistic Regression

Logistic Regression was selected because:

  • It is widely used in healthcare analytics
  • It produces interpretable probability-based outputs
  • It aligns with coursework concepts
  • It avoids unnecessary model complexity

The focus of this project is understanding how the model works, not maximizing performance at all costs.
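
To make the "how the model works" part concrete, here is a small from-scratch sketch of logistic regression: a linear score passed through a sigmoid, with weights fitted by batch gradient descent on the log loss. It is an illustration under assumptions, not the project's implementation (which may well rely on a library); the single standardized feature, the learning rate, and the epoch count are hypothetical choices.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in zip(X, y):
            # Gradient of the log loss w.r.t. the score is (prediction - label).
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias


# Hypothetical, already-standardized single feature (e.g. a glucose z-score).
X = [[-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5]]
y = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(X, y)
p_low = predict_proba(weights, bias, [-1.0])   # below-average measurement
p_high = predict_proba(weights, bias, [1.0])   # above-average measurement
```

A positive fitted weight means that higher values of the feature push the predicted probability toward the positive class, which is what makes the model's coefficients directly interpretable.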


Model Evaluation

The model was evaluated using:

  • Accuracy – to measure overall prediction correctness
  • Confusion Matrix – to examine true positives, false positives, true negatives, and false negatives

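Accuracy gives a single summary number, while the confusion matrix separates the kinds of error. That distinction is especially important in healthcare, where a false negative (a missed diagnosis) and a false positive (a false alarm) can have very different real-world implications.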
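
As a rough sketch of how these two metrics are computed, the example below counts the four confusion-matrix cells for binary labels and derives accuracy from them. The label lists are made-up illustration data, not results from this project, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["tn" if actual == 0 else "fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Made-up labels for illustration: 1 = positive diagnosis, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
acc = accuracy(counts)
```

Here the same accuracy could come from very different mixes of missed diagnoses (`fn`) and false alarms (`fp`), which is why the individual cells are worth reporting alongside it.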


Key Insights

  • Certain medical features show strong relationships with patient outcomes
  • Logistic regression provides clear insight into classification behavior
  • The confusion matrix highlights where the model succeeds and fails
  • Even simple models can provide meaningful signals when data is cleaned properly

Limitations

  • This model is trained on historical data and does not replace medical diagnosis
  • The dataset lacks contextual factors such as lifestyle or genetics
  • Accuracy alone is insufficient for real healthcare deployment, because it hides the balance between false negatives and false positives
  • Ethical and clinical validation would be required in real-world use

This project is strictly educational and exploratory.


Future Improvements

Possible extensions include:

  • Adding precision, recall, and F1-score analysis
  • Comparing logistic regression with other classifiers
  • Exploring feature importance more deeply
  • Using cross-validation for more robust evaluation
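
Two of these extensions are simple enough to sketch. The example below computes precision, recall, and F1 from confusion-matrix counts, and generates train/validation index splits for k-fold cross-validation. The counts, sample size, and function names are hypothetical, and the folds here are contiguous for clarity; in practice the rows would usually be shuffled first.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


# Hypothetical counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=3, fp=1, fn=2)

# Each sample appears in exactly one validation fold.
folds = list(k_fold_indices(n_samples=10, k=3))
```

Each model would then be trained on `train` and scored on `validation` once per fold, with the scores averaged to give a more stable estimate than a single train/test split.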

Author

Isaac Wanlemvo
Software Engineering & AI Student

About

A healthcare-focused machine learning project using logistic regression to predict patient outcomes based on medical diagnostic data, with an emphasis on interpretability and responsible evaluation.
