This research project implements a stroke risk prediction system using an Artificial Neural Network (ANN) model integrated with a user-friendly Streamlit web interface. The system analyzes personal health metrics to assess an individual's risk of experiencing a stroke, providing a valuable tool for preventive healthcare.
B.Sc. Research Project in Electrical and Electronic Engineering - Sri Lanka Institute of Information Technology (SLIIT)
- 8,600 health records collected from hospital data
- 2,500 stroke cases identified for analysis
- Comprehensive patient health profiles including medical history and lifestyle factors
- Implemented a deep neural network with seven hidden layers (1024→512→256→128→64→32→16) feeding a single sigmoid output unit
- Incorporated dropout layers (rate 0.2) to reduce overfitting
- Applied L2 regularization for model stability
- Utilized Adam optimizer with fine-tuned learning rate
- Preprocessed healthcare dataset (categorical encoding, outlier handling)
- Resolved data imbalance using SMOTE (Synthetic Minority Over-sampling Technique)
- Implemented feature scaling using StandardScaler
- Achieved 81% accuracy in stroke prediction
- Precision: ~81% - About four in five positive predictions correspond to actual stroke cases
- Recall: ~82% - About four in five actual stroke cases are identified
- F1-Score: ~81-82% - Balanced performance between precision and recall
- Developed a responsive Streamlit interface for user interaction
- Created an intuitive health information form with appropriate input constraints
- Implemented real-time risk assessment with visual feedback
- Displayed prediction results with probability scores
- Python - Core programming language
- TensorFlow/Keras - Neural network implementation
- Pandas - Data manipulation and preprocessing
- NumPy - Numerical operations
- Scikit-learn / imbalanced-learn - Machine learning utilities (StandardScaler from scikit-learn; SMOTE is provided by the companion imbalanced-learn package)
- Matplotlib/Seaborn - Data visualization
- Streamlit - Web application framework
- Pickle - Model serialization
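The SMOTE step listed above balances the classes by creating synthetic minority-class records rather than duplicating existing ones. The following is a minimal pure-Python sketch of that interpolation idea, not the project's actual code, which presumably calls a library implementation. The function name `smote_sketch`, the sample rows, and the choice of `k` are all illustrative assumptions.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to the base row
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        # The new row lies on the segment between the base row and its neighbour
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up minority-class rows: [age, avg_glucose_level, bmi]
minority_rows = [
    [67.0, 220.0, 36.0],
    [80.0, 105.0, 32.0],
    [49.0, 170.0, 34.0],
    [79.0, 175.0, 24.0],
]
new_rows = smote_sketch(minority_rows, n_new=6, k=2)
```

Because each synthetic row is a blend of two real minority rows, its values stay within the range already observed for that class. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains real records exclusively.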
The ANN model processes the following user health metrics to predict stroke risk:
| Feature | Type | Description | Significance |
|---|---|---|---|
| Gender | Categorical | Male/Female | Different risk profiles by gender |
| Age | Numerical | Age in years | Increasing risk with age |
| Hypertension | Binary | 0=No, 1=Yes | Major stroke risk factor |
| Heart Disease | Binary | 0=No, 1=Yes | Comorbidity affecting stroke risk |
| Avg. Glucose Level | Numerical | Average blood glucose level (mg/dL) | Diabetes-related risk indicator |
| BMI | Numerical | Body Mass Index (kg/m²) | Weight-related risk factor |
| Smoking Status | Categorical | Never smoked/Formerly smoked/Smokes | Lifestyle risk factor |
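Before the network can use these inputs, the categorical fields have to be turned into numbers. The sketch below shows one plausible encoding, assuming a binary flag for gender and a one-hot vector for smoking status. The README only states that categorical encoding is applied, so the exact scheme, the function name `encode_patient`, and the column order are assumptions.

```python
SMOKING_LEVELS = ["Never smoked", "Formerly smoked", "Smokes"]


def encode_patient(gender, age, hypertension, heart_disease,
                   avg_glucose_level, bmi, smoking_status):
    """Turn one patient's form values into a flat list of floats."""
    if gender not in ("Male", "Female"):
        raise ValueError(f"unknown gender: {gender!r}")
    if smoking_status not in SMOKING_LEVELS:
        raise ValueError(f"unknown smoking status: {smoking_status!r}")
    # One-hot encode smoking status: exactly one of the three slots is 1.0
    smoking_one_hot = [1.0 if smoking_status == s else 0.0 for s in SMOKING_LEVELS]
    return [
        1.0 if gender == "Male" else 0.0,  # assumed binary encoding
        float(age),
        float(hypertension),               # already 0/1 in the table
        float(heart_disease),              # already 0/1 in the table
        float(avg_glucose_level),
        float(bmi),
        *smoking_one_hot,
    ]


# Made-up example patient
features = encode_patient("Female", 61, 0, 1, 120.5, 28.4, "Formerly smoked")
```

With this scheme the seven form fields expand to nine numeric inputs. Whatever encoding is used at training time must be reproduced exactly at prediction time, including column order.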
The interactive Streamlit application provides an intuitive interface for users to input their health data and receive stroke risk predictions.
- Clone the repository: `git clone https://github.com/yourusername/stroke-prediction-system.git`, then `cd stroke-prediction-system`
- Install dependencies: `pip install -r requirements.txt`
- Run the Streamlit application: `streamlit run app.py`
- Access the web interface: open your browser and navigate to `http://localhost:8501`
| Metric | Value |
|---|---|
| Accuracy | 81% |
| Precision | ~81% |
| Recall | ~82% |
| F1-Score | ~81-82% |
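For readers unfamiliar with these metrics, the sketch below shows how all four follow from the counts in a binary confusion matrix. The counts are illustrative round numbers chosen for easy arithmetic, not the project's results, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted strokes that were real
    recall = tp / (tp + fn)     # share of real strokes that were caught
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

In a screening context recall usually deserves the most attention, since a false negative means a missed stroke case while a false positive typically leads only to further checks.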
The system workflow consists of three main components:
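- Data Preprocessing Pipeline
  - Categorical feature encoding
  - Missing value imputation
  - Feature scaling
  - Class imbalance handling via SMOTE
- Neural Network Model

      from tensorflow.keras.layers import Dense, Dropout
      from tensorflow.keras.models import Sequential
      from tensorflow.keras.regularizers import l2

      # Each hidden block: Dense + ReLU with an L2 weight penalty, then 20% dropout
      model = Sequential([
          Dense(1024, activation='relu', kernel_regularizer=l2(0.01)),
          Dropout(0.2),
          Dense(512, activation='relu', kernel_regularizer=l2(0.01)),
          Dropout(0.2),
          # Additional layers...
          Dense(1, activation='sigmoid')  # outputs a stroke probability in [0, 1]
      ])

- Web Application Integration
  - User data collection
  - Real-time prediction
  - Result visualization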
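To give a sense of the size of the network in the second component, the sketch below counts the trainable parameters of the stated layer stack (1024→512→256→128→64→32→16→1). The input width of 9 is an assumption, since it depends on how the categorical features are encoded, and the helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(input_dim, layer_units):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units                   # this layer's output feeds the next
    return total


LAYER_UNITS = [1024, 512, 256, 128, 64, 32, 16, 1]
INPUT_DIM = 9  # assumption: depends on the feature encoding used

total_params = dense_param_count(INPUT_DIM, LAYER_UNITS)
print(total_params)  # 710145
```

Dropout layers add no parameters, so under this assumption the model has roughly 0.7 million trainable weights. That is large relative to 8,600 records, which is consistent with the use of dropout and L2 regularization to limit overfitting.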
- Model Refinement: Further optimization of neural network architecture
- Hyperparameter Tuning: Systematic exploration of optimal parameters
- Clinical Deployment: Testing in real-world healthcare settings
- Feature Expansion: Incorporate additional health parameters for improved prediction
- Explainability: Implement feature importance visualization for clinical interpretation
- Personalized Recommendations: Add tailored health advice based on risk factors
- Mobile Application: Develop companion app for wider accessibility
- Longitudinal Analysis: Implement user accounts for tracking health metrics over time
This research contributes to AI-driven healthcare solutions by:
- Enhancing Early Detection: Identifying high-risk individuals before symptom onset
- Supporting Preventive Care: Enabling targeted interventions for at-risk populations
- Reducing Healthcare Burden: Potential reduction in emergency care costs through prevention
- Democratizing Healthcare: Providing accessible stroke risk assessment tools
