NoorMajdoub/Fire-Detection-and-Localization-using-CLIP-DINOv2
Fire Detection and Localization Using CLIP and DINOv2

Project Overview

This Jupyter notebook implements a two-stage pipeline for automatic fire detection and localization in forest imagery using state-of-the-art vision-language and self-supervised models:

  1. Fire Detection – Binary classification using OpenAI’s CLIP-ViT model.
  2. Fire Localization – Patch-level anomaly detection using Facebook’s DINOv2 model with cosine similarity for feature matching.

Methodology

1. Fire Detection with CLIP-ViT

  • Utilizes CLIP (Contrastive Language–Image Pretraining) for zero-shot classification.

  • Embeds image and prompt texts into a shared semantic space.

  • Prompts used:

    • "a normal forest scene"
    • "a fire or smoke in a forest"

  • Classification is based on cosine similarity between image and text embeddings.

  • Outputs a class label (fire / no fire) along with probability scores.
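The detection step above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the helper names `classify_from_logits` and `detect_fire` are assumptions (the repository exposes a function called `detectfile`), and the checkpoint name `openai/clip-vit-base-patch32` is a common choice that may differ from the one used.

```python
# Sketch of CLIP zero-shot fire detection (illustrative; names are assumptions).
import torch


def classify_from_logits(logits_per_image: torch.Tensor):
    """Turn CLIP's image-text logits into (is_fire, probabilities).

    Prompt index 0 = "a normal forest scene",
    prompt index 1 = "a fire or smoke in a forest".
    """
    probs = logits_per_image.softmax(dim=-1).squeeze(0)
    is_fire = bool(probs[1] > probs[0])
    return is_fire, probs.tolist()


def detect_fire(image_path: str):
    # Heavy imports are kept local so the pure helper above works offline.
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    prompts = ["a normal forest scene", "a fire or smoke in a forest"]
    inputs = processor(text=prompts, images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    return classify_from_logits(outputs.logits_per_image)
```

CLIP scores each prompt against the image, so the argmax over the two text embeddings directly yields the binary label.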

2. Fire Localization with DINOv2

  • Uses DINOv2 for self-supervised feature extraction.

  • Extracts patch-level features from both:

    • the target image
    • a reference normal forest image (used as a baseline)

  • Computes cosine similarity between patches to localize anomalous (fire-related) regions.

  • Visualizes results with anomaly heatmaps overlaid on the original image.
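The similarity computation can be sketched as below: each test patch is scored by how dissimilar it is from its best match among the reference patches, so fire regions (absent from the reference) receive high scores. The scoring rule and helper name are illustrative assumptions; in the Hugging Face DINOv2 models, patch features are typically taken from `last_hidden_state` with the leading CLS token dropped.

```python
# Sketch of patch-level anomaly scoring with cosine similarity
# (illustrative; the exact scoring rule in the notebook may differ).
import torch
import torch.nn.functional as F


def patch_anomaly_scores(test_feats: torch.Tensor,
                         ref_feats: torch.Tensor) -> torch.Tensor:
    """Score each test patch as 1 - (max cosine similarity to any
    reference patch).

    test_feats: (N, D) patch features of the target image.
    ref_feats:  (M, D) patch features of the fire-free reference image.
    Returns a (N,) tensor: ~0 for patches that look like the reference,
    approaching 1 for anomalous (fire-related) patches.
    """
    test_n = F.normalize(test_feats, dim=-1)
    ref_n = F.normalize(ref_feats, dim=-1)
    sim = test_n @ ref_n.T            # (N, M) pairwise cosine similarities
    return 1.0 - sim.max(dim=1).values
```

Taking the maximum over reference patches makes the score robust to patch position: a test patch counts as normal if it resembles *any* part of the reference scene.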


To Do: test a SAM (Segment Anything Model) approach for fire localization.


Required Libraries:

  • Python 3.8+
  • PyTorch (CUDA support recommended)
  • Transformers (CLIP, DINOv2)
  • OpenCV, PIL, NumPy, Matplotlib

Usage

Fire Detection

result, probs = detectfile("/path/to/image.jpg")
print(f"Fire Detected: {bool(result)} | Probabilities: {probs}")

Fire Localization

  • Load DINOv2 model and a reference image (non-fire scene).
  • Extract patch features from both test and reference images.
  • Compute per-patch cosine similarity.
  • Generate and display heatmap highlighting anomalous regions.
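The heatmap step above can be sketched with plain NumPy: reshape the per-patch scores into the patch grid, normalize to [0, 1], and upsample to image resolution before overlaying. The helper names and nearest-neighbour upsampling are assumptions; the notebook may use OpenCV or bilinear interpolation instead.

```python
# Sketch of turning per-patch anomaly scores into an overlay heatmap
# (illustrative; the notebook's exact visualization code may differ).
import numpy as np


def scores_to_heatmap(scores: np.ndarray, grid: int,
                      image_size: int) -> np.ndarray:
    """Reshape (grid*grid,) patch scores into a normalized, image-sized
    heatmap via nearest-neighbour upsampling."""
    hmap = scores.reshape(grid, grid)
    hmap = (hmap - hmap.min()) / (np.ptp(hmap) + 1e-8)
    scale = image_size // grid
    return np.kron(hmap, np.ones((scale, scale)))


def show_overlay(image: np.ndarray, heatmap: np.ndarray) -> None:
    """Overlay the anomaly heatmap on the original image."""
    import matplotlib.pyplot as plt

    plt.imshow(image)
    plt.imshow(heatmap, cmap="jet", alpha=0.5)  # semi-transparent overlay
    plt.axis("off")
    plt.show()
```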

Notes

  • Developed and tested in a Kaggle notebook environment.
  • Models are sourced from Hugging Face Hub.
  • Includes fallback logic for CUDA initialization issues.
  • Detection thresholding is static; performance can be optimized via calibration.
  • Localization assumes a clean reference image (AoF06726.jpg) for anomaly comparison.
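The CUDA fallback mentioned above typically looks like the sketch below: probe the GPU once and drop to CPU if initialization fails. This is an assumed pattern, not the notebook's verbatim code.

```python
# Sketch of a CUDA-with-CPU-fallback device picker (assumed pattern).
import torch


def pick_device() -> torch.device:
    """Prefer CUDA, but fall back to CPU if it is unavailable or
    fails to initialize (e.g., driver mismatch in a Kaggle session)."""
    try:
        if torch.cuda.is_available():
            torch.zeros(1).to("cuda")  # force CUDA init to surface errors early
            return torch.device("cuda")
    except RuntimeError:
        pass
    return torch.device("cpu")
```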

Future Work

  • Train or fine-tune models for improved robustness and generalization.

  • Extend pipeline to support:

    • video input
    • real-time inference

  • Replace static thresholding with adaptive scoring (e.g., via Gaussian models or learned thresholds).

  • Automate reference image selection or compute scene-level context clusters.
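The adaptive-scoring idea in the Future Work list could be as simple as the sketch below: fit a Gaussian to anomaly scores measured on fire-free reference imagery and flag patches beyond k standard deviations. This is a hypothetical illustration of the proposal, not implemented in the notebook.

```python
# Sketch of a Gaussian-model threshold replacing the static cut-off
# (hypothetical; this is proposed future work, not existing code).
import numpy as np


def gaussian_threshold(ref_scores: np.ndarray, k: float = 3.0) -> float:
    """Adaptive threshold: mean + k * std of anomaly scores collected
    on fire-free reference imagery."""
    return float(ref_scores.mean() + k * ref_scores.std())
```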


About

A two-stage pipeline for detecting and localizing forest fires: CLIP performs zero-shot classification (fire / no fire) and DINOv2 performs patch-level anomaly localization (where the fire is).
