# 🩺 Thyroid Nodule AI: Deep Learning CAD Pipeline


*Hero interface of the Thyroid AI Assistant (screenshot).*

Computer-Aided Diagnosis (CAD) system for Thyroid Ultrasound Analysis.

This repository contains the official implementation of the Bachelor's Thesis: "Development of a Deep Learning pipeline for thyroid nodule diagnosis: Comparison between CNN and Vision Transformers" (Sapienza University of Rome, 2025).




πŸ₯ Project Overview

Thyroid nodules are a pervasive clinical issue, present in up to 60% of the adult population. While the majority are benign, distinguishing the 5-10% of malignant cases remains a challenge due to the subjective nature of ultrasound interpretation (high inter-observer variability).

This project proposes a two-stage Deep Learning pipeline, complemented by an explainability layer, to support radiologists:

  1. Detection: Localizing nodules in B-mode ultrasound images.
  2. Classification: Discriminating between Benign and Malignant nodules.
  3. Explainability: Visualizing the morphological features driving the AI's decision.
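As a minimal illustration of the two-stage flow, the sketch below uses hypothetical `detect` and `classify` callables standing in for the YOLO detector and the DINOv3 classifier; it is a structural outline, not the repository's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # nodule bounding box: x, y, w, h

@dataclass
class Finding:
    box: Box             # where the nodule was detected (stage 1)
    malignancy_p: float  # classifier's raw probability of malignancy (stage 2)

def run_pipeline(image,
                 detect: Callable[[object], List[Box]],
                 classify: Callable[[object, Box], float]) -> List[Finding]:
    """Stage 1 localises candidate nodules; stage 2 scores each crop."""
    return [Finding(box, classify(image, box)) for box in detect(image)]
```

In the repository these two roles are played by the detection and classification models compared below; the explainability maps are generated from the classifier's internals afterwards.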

### Research Goal

The core study compares traditional CNNs (EfficientNet, ConvNeXt) against modern Vision Transformers (DINOv3, Swin) to determine if Self-Supervised Learning (SSL) offers superior robustness in processing noisy medical imagery.


βš™οΈ Methodology & Pipeline

The system processes raw ultrasound images through a strict pipeline described in Chapter 3 of the thesis.

### 1. Preprocessing & Enhancement

Ultrasound images suffer from speckle noise and low contrast. Before inference, images undergo:

  • Perceptual Hashing (dHash): To remove duplicate frames and prevent data leakage.
  • CLAHE: Contrast Limited Adaptive Histogram Equalization.
  • Sharpening: To emphasize nodule margins (a critical TI-RADS feature).

*Figure: Effect of the enhancement pipeline. (Left) Original raw input. (Right) Enhanced image fed to the models.*
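The dHash deduplication step can be illustrated in pure Python. This is a sketch assuming the frame has already been downscaled to an 8×9 grayscale grid (e.g. with Pillow); it is not the repository's actual implementation.

```python
def dhash(pixels, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair.
    `pixels` is a 2D list of grayscale values, hash_size rows by
    (hash_size + 1) columns."""
    bits = []
    for row in pixels:
        for x in range(hash_size):
            bits.append("1" if row[x] > row[x + 1] else "0")
    return int("".join(bits), 2)

def hamming(h1, h2):
    """Number of differing bits; near-duplicate frames score close to 0."""
    return bin(h1 ^ h2).count("1")
```

Frames whose hashes differ by only a few bits are flagged as duplicates and dropped before the train/test split, which is what prevents the same nodule from leaking into both sets.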

### 2. Architecture Comparison

We benchmarked the following architectures:

  • Detection: YOLOv12 (Anchor-Free) vs. DINO-DETR vs. Faster R-CNN.
  • Classification:
    • Baselines: SVM (Radiomics), EfficientNetV2, ConvNeXt V2.
    • Our Approach: DINOv3 (ViT) pretrained with Self-Supervised Learning (SSL).

## 📊 Key Results

The models were evaluated on the TN5000 and AUITD datasets (7,000+ nodules).

πŸ† Classification Performance (Test Set)

DINOv3-Large achieved state-of-the-art results, significantly outperforming CNNs in Sensitivity (Recall), which is the most critical metric for cancer screening.

| Architecture | Type | AUC-ROC | Recall (Sensitivity) | F1-Score |
|---|---|---|---|---|
| EfficientNetV2 | CNN | 0.898 | 92.63% | 0.864 |
| ConvNeXt V2 | CNN | 0.906 | 91.71% | 0.865 |
| **DINOv3-Large** | ViT | **0.932** | **94.70%** | **0.887** |
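For reference, Recall and F1 can be computed from raw predictions in a few lines. This is a generic sketch of the metric definitions, not the thesis's evaluation code.

```python
def recall_f1(y_true, y_pred):
    """Recall and F1 for the positive (malignant = 1) class.
    In cancer screening Recall matters most: a false negative is a missed cancer."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, f1
```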

### Visual Analysis

  • Left (ROC Curve): The Foundation Model (Orange line) demonstrates superior separation capabilities ($AUC=0.93$).
  • Right (Confusion Matrix): High accuracy in identifying malignant cases, minimizing dangerous False Negatives.

## 🩺 Clinical Integration & Risk Stratification

The system is designed as a Decision Support Tool. While the model performs a binary classification (Benign vs. Malignant) using an optimized operational threshold ($p=0.38$), the raw probability score ($p$) provides a granular estimation of malignancy risk.

Validation on external datasets (Chapter 5.5.4) confirms that the model's confidence levels correlate strongly with the K-TIRADS risk categories, allowing for a professional interpretation of the AI output:

| AI Confidence ($p$) | Classification | Clinical Interpretation | Suggested Action |
|---|---|---|---|
| $p < 0.20$ | Benign | Low risk (correlates to TR1/TR2) | Routine follow-up |
| $0.20 \le p < 0.60$ | Indeterminate | Moderate risk (correlates to TR3/TR4) | Short-term follow-up / FNA |
| $p \ge 0.60$ | Malignant | High risk (correlates to TR5) | Strong biopsy recommendation |
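The thresholds above translate directly into a small triage helper. A minimal sketch follows: the function names are ours, while the cut-offs (0.20, 0.60) and the binary operating point (0.38) come from the thesis.

```python
OPERATING_POINT = 0.38  # optimised operational threshold for the binary decision

def binary_label(p):
    """Binary Benign/Malignant decision at the operational threshold."""
    return "Malignant" if p >= OPERATING_POINT else "Benign"

def triage(p):
    """Map the raw malignancy probability to a K-TIRADS-correlated band."""
    if p < 0.20:
        return "Benign", "Routine follow-up"
    if p < 0.60:
        return "Indeterminate", "Short-term follow-up / FNA"
    return "Malignant", "Strong biopsy recommendation"
```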

πŸ” Explainability (Heatmaps)

To move beyond "Black Box" AI, the pipeline integrates Attention Maps (for DINOv3) and Grad-CAM (for CNNs). This allows clinicians to verify if the AI is focusing on relevant radiological features, such as:

  • Irregular Margins: Precisely outlined by the Transformer's global attention.
  • Microcalcifications: Detected as high-frequency textures by the CNN backbones.

*Figure: DINOv3 attention maps focusing on the irregular borders of a malignant nodule, aligning with K-TIRADS criteria.*
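One common way to fold per-layer ViT attention into a single heatmap is attention rollout (Abnar & Zuidema, 2020), shown here as a dependency-free sketch on plain nested lists. It illustrates the general technique only and is not the repository's implementation.

```python
def rollout(attentions):
    """Fold per-layer attention matrices into one token-to-token map,
    accounting for residual connections by mixing in the identity.
    Each matrix in `attentions` must have rows summing to 1."""
    n = len(attentions[0])
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for A in attentions:
        # residual branch: 0.5 * attention + 0.5 * identity (rows still sum to 1)
        mixed = [[(A[i][j] + (i == j)) / 2.0 for j in range(n)] for i in range(n)]
        # accumulate: mixed @ result
        result = [[sum(mixed[i][k] * result[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)]
    return result  # row 0: how strongly the CLS token attends to each patch
```

Reshaping the CLS row back to the patch grid and upsampling it to the image size yields the heatmaps shown above.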


## 💻 Installation

### Prerequisites

  • OS: Linux (Recommended) or Windows 10/11
  • Python: 3.10+
  • Hardware: NVIDIA GPU with CUDA support (Recommended for DINOv3)

### Setup Steps

```bash
# 1. Clone the repository
git clone https://github.com/AlexThunder01/Thyroid-Nodule-AI.git
cd Thyroid-Nodule-AI

# 2. Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install core dependencies (PyTorch, Ultralytics, etc.)
pip install -r requirements.txt

# 4. (Optional) Install GUI dependencies
pip install -r requirements-gui.txt
```

> ⚠️ **Model Weights:** Due to file size limits and licensing, weights are not included in the repo. Please refer to `docs/WEIGHTS.md` for download links and placement instructions.


## 🚀 Usage (GUI & CLI)

### 1. Thyroid AI Assistant (GUI)

A user-friendly interface for simulating the clinical workflow (Drag & Drop, Real-time Analysis).

```bash
python src/gui/app.py
```

Features:

  • Visual toggle for Preprocessing (CLAHE).
  • Real-time Detection (YOLO) and Classification (DINO).
  • PDF Report generation (Experimental).

*GUI demo (screenshot).*

### 2. Command Line Interface (CLI)

Run the classifier on a single image via terminal.

```bash
python3 src/inference_dino.py \
  --image path/to/your/image.jpg \
  --weights path/to/model_weights.pt \
  --dino-repo dinov3 \
  --out-dir results
```

Arguments:

| Argument | Description | Example |
|---|---|---|
| `--image` | Path to the input image (supports `.jpg`, `.png`). | `assets/benign.jpg` |
| `--weights` | Path to the trained DINOv3 `.pt` file. | `models/dinov3_large.pt` |
| `--dino-repo` | Path to the local DINOv3 repository/folder. | `dinov3` |
| `--out-dir` | Directory where the result will be saved. | `results` |

Note: The script automatically applies the preprocessing pipeline (CLAHE + Sharpening) before inference.


## 📂 Repository Structure

```
.
├── assets/                 # Images, plots, and demo resources
│   ├── benign              # Images of benign nodules
│   └── malignant           # Images of malignant nodules
├── dinov3/                 # Submodule for Vision Transformer architecture
├── docs/                   # Documentation
│   ├── GUI.md              # User manual for the interface
│   └── WEIGHTS.md          # Links to pretrained models
├── src/                    # Source code
│   ├── gui/                # Application logic (CustomTkinter)
│   └── inference_dino.py   # CLI inference script
├── Thesis.pdf              # Full Bachelor's Thesis document
├── CITATION.bib            # BibTeX citation
├── LICENSE                 # MIT License
├── MODEL_CARD.md           # Model specifics and limitations
├── requirements.txt        # Python dependencies
└── requirements-gui.txt    # GUI dependencies
```

## 🎓 Citation

If you use this work in your research, please cite the thesis:

```bibtex
@bachelorsthesis{catania2025thyroid,
  author  = {Alessandro Catania},
  title   = {Development of a Deep Learning pipeline for thyroid nodule diagnosis: Comparison between CNN and Vision Transformers},
  school  = {Sapienza University of Rome},
  year    = {2025},
  type    = {Bachelor's Thesis}
}
```

## Acknowledgments

  • Sapienza University of Rome - Faculty of Information Engineering.
  • Ultralytics for the YOLOv12 framework.
  • Meta AI for the DINOv3 foundation model.

For questions or collaborations, please open an Issue.
