monicaduarteai/medical-image-classification

Medical Image Classification: Pulmonary Disease Detection

Independent Research Project | Python & PyTorch

This project focuses on developing a robust Computer Vision system to classify pulmonary diseases from chest X-ray images. By utilising Deep Learning architectures, the goal is to create an interpretable model that assists in medical diagnostics through automated image analysis.


📊 Research Results

The project followed an iterative research path to overcome significant data imbalance and overfitting:

| Model Configuration | Validation Accuracy | Test Accuracy | Status |
|---|---|---|---|
| Frozen ResNet50 (Baseline) | 87.5% (n=16) | 59.78% | Overfit / Discarded |
| Fine-Tuned ResNet50 (v2) | 98.29% (n≈800) | 80.61% | Current Best |

Key Breakthrough: Transitioning from a frozen backbone to full-model fine-tuning and expanding the validation set 50-fold (from 16 to roughly 800 samples) were the primary drivers of the 20.8 percentage-point increase in test accuracy (59.78% to 80.61%).
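The figures in the results table can be checked with a few lines of arithmetic. The sketch below uses only the reported numbers; the helper `accuracy_std_error` is a hypothetical name introduced here, and it applies the usual binomial approximation to show how much less certain an accuracy estimate is with 16 validation samples than with roughly 800.

```python
import math

# Figures reported in the results table.
baseline_test, finetuned_test = 59.78, 80.61   # test accuracy, %
baseline_val_n, finetuned_val_n = 16, 800      # validation set sizes (800 is approximate)

point_gain = finetuned_test - baseline_test        # absolute gain, percentage points
relative_gain = 100 * point_gain / baseline_test   # gain relative to the baseline, %
val_growth = finetuned_val_n / baseline_val_n      # validation set growth factor


def accuracy_std_error(accuracy_pct, n):
    """Binomial standard error of an accuracy estimate, in percentage points."""
    p = accuracy_pct / 100
    return 100 * math.sqrt(p * (1 - p) / n)


print(f"absolute gain: {point_gain:.2f} points")
print(f"relative gain: {relative_gain:.1f}%")
print(f"validation set growth: {val_growth:.0f}x")
print(f"std error at n=16:  {accuracy_std_error(87.5, baseline_val_n):.2f} points")
print(f"std error at n=800: {accuracy_std_error(98.29, finetuned_val_n):.2f} points")
```

With n=16, a single misclassified image moves validation accuracy by 6.25 points, which is one reason the baseline's 87.5% validation figure was a poor predictor of its test accuracy.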


🔬 Project Overview

  • Objective: Accurate classification of pulmonary conditions (e.g., Pneumonia) using convolutional neural networks (CNNs).
  • Architecture: ResNet50 backbone (frozen baseline, then fully fine-tuned), implemented in PyTorch within a Miniconda environment for reproducible research.
  • Dataset: Medical X-ray datasets processed via OpenCV and Torchvision.

🛠️ Technical Implementation

  • Image Processing & Augmentation: Implementing preprocessing pipelines to normalise medical datasets and improve model robustness through resizing, grayscale conversion, and data augmentation.
  • Performance Evaluation: Using Scikit-learn and Matplotlib in Jupyter Notebooks to assess model accuracy through confusion matrices and loss curves.
  • Development Environment: Managed through VS Code and Miniconda to ensure a clean, isolated dependency structure.
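To make the confusion-matrix evaluation concrete, here is a dependency-free sketch. The project itself uses Scikit-learn for this step; the version below mirrors the layout of `sklearn.metrics.confusion_matrix` (rows are true labels, columns are predictions) using only the standard library. The label names and the tiny prediction lists are made up for illustration and are not results from this project.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true label, columns index the predicted label."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Illustrative labels only, not project data.
labels = ["NORMAL", "PNEUMONIA"]
y_true = ["NORMAL", "NORMAL", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA"]
y_pred = ["NORMAL", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA", "PNEUMONIA", "NORMAL"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)
recall = per_class_recall(matrix)

for label, row in zip(labels, matrix):
    print(f"{label:>10}: {row}")
print(f"accuracy: {accuracy:.3f}")
print("recall per class:", dict(zip(labels, recall)))
```

Per-class recall is worth reading alongside overall accuracy on an imbalanced dataset, since a model can score well overall while missing most of the minority class.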

📂 Repository Structure

  • .gitignore: Standardised configuration to exclude local environments and large datasets from version control.
  • environment.yml: Miniconda configuration file for seamless environment replication.
  • data/: (Local Only) Directory for medical image datasets; excluded from Git to maintain privacy and repository performance.
  • med_ai_env/: (Local Only) The dedicated Conda environment containing PyTorch, Torchvision, and CUDA-optimised libraries for the RTX 4090.
  • models/: (Local Only) Stores trained PyTorch weights (.pth files). Note: These files are excluded from the repository due to size.
  • notebooks/: Contains detailed Jupyter Notebooks tracking experimental iterations and hyperparameter tuning.
  • scripts/: Python utilities for hardware verification and environment setup (e.g., check_gpu.py).
  • src/: Modular source code for model architecture, training loops, and evaluation pipelines.
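A hardware verification utility such as check_gpu.py could look roughly like the sketch below. The contents of the real script are not shown in this README, so the function name `gpu_report` and the returned keys are assumptions; only standard `torch.cuda` calls are used, and the import is guarded so the script still reports something useful when PyTorch is missing from the active environment.

```python
def gpu_report():
    """Return a small dict describing the visible CUDA device, if any (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"torch_installed": False, "cuda_available": False, "device_name": None}

    available = torch.cuda.is_available()
    # Only query the device name when a CUDA device is actually visible.
    name = torch.cuda.get_device_name(0) if available else None
    return {"torch_installed": True, "cuda_available": available, "device_name": name}


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

Running it inside the activated Conda environment gives a quick confirmation that the CUDA build of PyTorch can see the GPU before a long training run starts.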

📈 Methodology & Research Goals

I am currently maintaining detailed technical documentation to track experimental iterations and prepare findings for academic review at Queen Mary University of London. The project follows professional Data Governance standards mirrored from my industry experience at NovoPart to ensure data integrity and IP security.


🛡️ Research Ethics & Academic Integrity

  • This project is an independent research endeavour.
  • All methodologies follow standard research ethics regarding medical data handling and algorithmic transparency.

🌐 Connect with Me


This project is a core component of my development in Computer Vision and Artificial Intelligence.

About

Pulmonary disease detection using PyTorch and Computer Vision. Independent research project investigating CNN-based classification of chest X-rays.
