This project implements a deep learning solution for detecting various eye diseases from fundus images. The project includes comprehensive data analysis, preprocessing, and multi-label classification of eye conditions.
- Multi-label classification of 8 different eye conditions
- Comprehensive data preprocessing and analysis
- Custom data augmentation pipeline
- Implementation of various deep learning architectures
- Detailed visualization of results and model performance
The project uses the ODIR-5K (Ocular Disease Intelligent Recognition) dataset, which contains:
- 6,392 fundus images from both left and right eyes
- 8 disease categories including:
  - Normal (N)
  - Diabetes (D)
  - Glaucoma (G)
  - Cataract (C)
  - Age-related Macular Degeneration (A)
  - Hypertension (H)
  - Myopia (M)
  - Other diseases/abnormalities (O)
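
Because one fundus image can show several conditions at once, multi-label targets are commonly stored as multi-hot vectors. The sketch below illustrates that encoding with the eight letter codes listed above. The class order and the helper names `encode_labels` and `decode_labels` are assumptions made for illustration, not taken from the project code.

```python
# Letter codes from the category list; the ordering is an assumption.
CLASSES = ["N", "D", "G", "C", "A", "H", "M", "O"]


def encode_labels(codes):
    """Return a multi-hot vector with a 1 for each condition present."""
    unknown = set(codes) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown label codes: {sorted(unknown)}")
    return [1 if c in codes else 0 for c in CLASSES]


def decode_labels(vector):
    """Inverse of encode_labels: list the codes whose position is set."""
    return [c for c, flag in zip(CLASSES, vector) if flag]


print(encode_labels({"D", "C"}))  # [0, 1, 0, 1, 0, 0, 0, 0]
```
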
- Removal of low-quality images
- Handling duplicates and inconsistencies
- Standardization of image sizes
- Data cleaning and validation
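
One simple way to handle duplicates is to group files by a hash of their contents. The following is a minimal sketch using only the Python standard library. The function name `find_exact_duplicates` and the `*.jpg` pattern are assumptions, and the approach only catches byte-identical files, not resized or re-encoded copies.

```python
import hashlib
from pathlib import Path


def find_exact_duplicates(image_dir, pattern="*.jpg"):
    """Group files under image_dir whose bytes are identical."""
    by_digest = {}
    for path in sorted(Path(image_dir).glob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        by_digest.setdefault(digest, []).append(path.name)
    # Keep only the groups that contain more than one file.
    return [names for names in by_digest.values() if len(names) > 1]
```

Each returned group lists file names that share the same content, so all but one entry per group can be dropped.
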
- Distribution analysis of eye conditions
- Age and gender analysis across different conditions
- Correlation analysis between different eye diseases
- Visualization of disease patterns
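
For binary disease indicators, correlation between two conditions can be measured with the Pearson coefficient computed over the 0/1 label columns. The sketch below is a plain-Python illustration. The function name `pearson` and the toy columns are hypothetical, and the project may compute this differently (for example with pandas).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is constant
    return cov / sqrt(var_x * var_y)


# Toy indicator columns (made up): 1 = condition present for that patient.
diabetes = [1, 0, 1, 0]
hypertension = [1, 0, 1, 0]
print(pearson(diabetes, hypertension))  # 1.0
```
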
- Custom classification system for eye diseases
- Structured organization of images into disease categories
- Implementation of balanced sampling strategies
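
One common balancing strategy is to weight each class inversely to its frequency. The sketch below uses the "balanced" heuristic `total / (n_classes * count)`. It is not necessarily the strategy this project implements, and the counts are borrowed from the Support column of the results table in this README purely as example input. In practice they would come from the training split.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


# Example input only: per-class sample counts (assumed, see lead-in).
support = {
    "Normal": 210, "Diabetes": 205, "Glaucoma": 34, "Cataract": 35,
    "AMD": 28, "Hypertension": 8, "Myopia": 28, "Other": 90,
}
weights = balanced_class_weights(support)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

Rare classes such as Hypertension receive the largest weights, so their errors count more during training or their samples are drawn more often.
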
- numpy
- pandas
- tensorflow
- opencv-python
- scikit-learn
- matplotlib
- seaborn
- plotly

- Disease Distribution
  - Analysis of normal vs. abnormal fundus images
  - Distribution of diseases across age groups
  - Gender-based disease patterns
- Data Patterns
  - Correlation between left and right eye conditions
  - Age-related disease patterns
  - Gender-specific disease prevalence
- Initial analysis framework inspired by Kaggle community work
- Custom implementation of multi-label classification
- Original dataset: ODIR-5K (Ocular Disease Intelligent Recognition)
The model achieved 81% overall accuracy, with F1-scores of 0.88 or higher for the Normal, Myopia, Cataract, and AMD categories.
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Normal | 1.00 | 1.00 | 1.00 | 210 |
| Diabetes | 0.67 | 0.88 | 0.76 | 205 |
| Glaucoma | 0.83 | 0.85 | 0.84 | 34 |
| Cataract | 0.88 | 1.00 | 0.93 | 35 |
| AMD | 0.84 | 0.93 | 0.88 | 28 |
| Hypertension | 0.40 | 1.00 | 0.57 | 8 |
| Myopia | 0.90 | 1.00 | 0.95 | 28 |
| Other | 0.00 | 0.00 | 0.00 | 90 |
- Overall Accuracy: 81% across all classes
- Best Performing Categories:
  - Normal (F1: 1.00)
  - Myopia (F1: 0.95)
  - Cataract (F1: 0.93)
  - AMD (F1: 0.88)
- Challenges:
  - 'Other' category (F1: 0.00)
  - Hypertension (Precision: 0.40)
- Strong Performance:
  - Perfect classification for Normal cases
  - Excellent detection of Myopia and Cataract
  - High recall across most categories
- Areas for Improvement:
  - Poor performance in 'Other' category
  - Low precision in Hypertension detection
  - Moderate precision in Diabetes classification
- Class Imbalance:
  - Large variation in support sizes (8 to 210 samples)
  - May affect model performance on minority classes
- Macro Average F1-Score: 0.74
- Weighted Average F1-Score: 0.76
- Overall Accuracy: 0.81
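
The two averages can be reproduced from the per-class F1-scores and supports in the table above. This is a small sketch with hypothetical helper names (`macro_f1`, `weighted_f1`): the macro average treats every class equally, while the weighted average scales each class by its support.

```python
# (F1-score, support) per class, copied from the results table.
report = {
    "Normal": (1.00, 210), "Diabetes": (0.76, 205), "Glaucoma": (0.84, 34),
    "Cataract": (0.93, 35), "AMD": (0.88, 28), "Hypertension": (0.57, 8),
    "Myopia": (0.95, 28), "Other": (0.00, 90),
}


def macro_f1(rows):
    """Unweighted mean of the per-class F1-scores."""
    return sum(f1 for f1, _ in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of the per-class F1-scores weighted by support."""
    total = sum(n for _, n in rows.values())
    return sum(f1 * n for f1, n in rows.values()) / total


print(round(macro_f1(report), 2))     # 0.74
print(round(weighted_f1(report), 2))  # 0.76
```

The 'Other' class (F1: 0.00, 90 samples) pulls both averages down. The macro average is also sensitive to Hypertension, even though that class has only 8 samples.
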
This performance analysis suggests potential for clinical application in the well-separated categories while highlighting specific areas for future improvement, particularly the 'Other' category (F1: 0.00 on 90 samples) and precision for Hypertension detection (0.40 on 8 samples).
Distribution of eye conditions across the dataset:
- Both Abnormal: 45.7% of cases
- Both Normal: 30.0% of cases
- Left Normal, Right Abnormal: 12.9%
- Right Normal, Left Abnormal: 11.4%
This indicates a significant presence of bilateral conditions in the dataset.
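
A breakdown like the one above can be derived by bucketing each patient according to whether the left and right eyes are normal. The sketch below assumes each patient is represented as a `(left_normal, right_normal)` pair of booleans. The function names and the toy data are hypothetical.

```python
from collections import Counter


def laterality(left_normal, right_normal):
    """Name the bucket for one patient's pair of eyes."""
    if left_normal and right_normal:
        return "Both Normal"
    if not left_normal and not right_normal:
        return "Both Abnormal"
    if left_normal:
        return "Left Normal, Right Abnormal"
    return "Right Normal, Left Abnormal"


def laterality_shares(patients):
    """Percentage of patients per bucket."""
    counts = Counter(laterality(left, right) for left, right in patients)
    total = sum(counts.values())
    return {bucket: 100 * n / total for bucket, n in counts.items()}


# Toy data (made up), not the ODIR-5K records.
toy_patients = [(True, True), (False, False), (False, False), (True, False)]
print(laterality_shares(toy_patients))
```
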
Principal Component Analysis (PCA) visualization shows the relationship between different eye conditions:
- Clear separation between Normal (N) and Diabetic (D) cases
- Glaucoma (G) shows distinct clustering
- Age-related conditions (A) and Normal cases show some overlap
- Patient Age is centrally positioned, indicating its relevance across conditions
The confusion matrix demonstrates the model's classification performance across different eye conditions:
- Perfect prediction (210/210) for Normal class
- Strong performance for Diabetes (181 correct predictions)
- Excellent accuracy for specialized conditions (Cataract: 35/35, Myopia: 28/28)
- Some misclassifications between Diabetes and Other categories
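
Per-class precision and recall follow directly from a confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes (the scikit-learn convention). The 3x3 matrix is invented for illustration and is not the project's confusion matrix. It includes a class that is never predicted, to show how precision and recall of 0.00 can arise.

```python
def per_class_metrics(matrix, labels):
    """Precision and recall per class from a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        actual = sum(matrix[i])                     # row sum: true members
        predicted = sum(row[i] for row in matrix)   # column sum: predictions
        metrics[label] = {
            "precision": true_positive / predicted if predicted else 0.0,
            "recall": true_positive / actual if actual else 0.0,
        }
    return metrics


# Invented example: every sample of class "C" is predicted as "B".
toy_matrix = [
    [4, 0, 0],
    [0, 3, 0],
    [0, 2, 0],
]
toy_metrics = per_class_metrics(toy_matrix, ["A", "B", "C"])
print(toy_metrics)
```

In the toy case, class "B" keeps perfect recall but its precision drops to 0.6 because it absorbs the misclassified "C" samples.
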
Training metrics over epochs show:
- Accuracy:
  - Validation accuracy reaches ~80%
  - Training accuracy stabilizes around 65%
  - Good convergence without significant overfitting
- Loss:
  - Both training and validation loss decrease steadily
  - Model shows stable learning with minor fluctuations
  - Optimal convergence achieved around epoch 15
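
A convergence point such as "around epoch 15" is typically identified by tracking validation loss and stopping once it no longer improves. The sketch below shows that patience-based rule in plain Python. The function name `best_epoch`, the `patience` default, and the loss values are hypothetical and are not taken from the project's training logs.

```python
def best_epoch(val_losses, patience=3):
    """Return the 1-based epoch with the lowest validation loss,
    stopping once `patience` epochs pass without a new best."""
    if not val_losses:
        raise ValueError("val_losses is empty")
    best_idx = 0
    for idx in range(1, len(val_losses)):
        if val_losses[idx] < val_losses[best_idx]:
            best_idx = idx
        elif idx - best_idx >= patience:
            break  # no improvement for `patience` epochs
    return best_idx + 1


# Made-up validation losses for illustration.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
print(best_epoch(losses, patience=3))  # 3
print(best_epoch(losses, patience=5))  # 7
```

When training with TensorFlow, the built-in `tf.keras.callbacks.EarlyStopping` callback offers the same behavior. Whether this project uses it is not stated here.
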
These visualizations demonstrate:
- Strong classification performance across major disease categories
- Clear disease clustering patterns
- Balanced distribution of normal and abnormal cases
- Stable and effective model training process
- Arundhati Das & Pranay Ghuge
Note: This project combines both established analysis methods and custom implementations for multi-label classification of eye diseases.



