balewgize/image-similarity-checker

Image Similarity Checker

Find duplicate and similar images using AI embeddings.

Goal: A learning project exploring computer vision - CLIP embeddings, semantic image understanding, and similarity matching.

How It Works

  • Understand: OpenCLIP turns each image into a compact numeric signature
  • Compare: the signatures are compared to get a similarity score (0–1)
  • Decide: scores at or above SIMILARITY_MATCH are labeled as matches

Think of the signature as a visual fingerprint for the image.

image -> OpenCLIP -> signature -> similarity score -> match/no match
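
The compare and decide steps above can be sketched in a few lines. This is a minimal illustration, assuming the comparison is cosine similarity (the README does not name the metric) and using short hand-written vectors in place of real OpenCLIP signatures. The function names are hypothetical and are not taken from core/pipeline.py.

```python
import math

# Default threshold, taken from the Configuration table.
SIMILARITY_MATCH = 0.90


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("signatures must be non-zero")
    return dot / (norm_a * norm_b)


def is_match(a, b, threshold=SIMILARITY_MATCH):
    """Return (score, matched); scores at or above the threshold are matches."""
    score = cosine_similarity(a, b)
    return score, score >= threshold
```

Cosine similarity can in general be negative; the 0–1 range quoted above is what image signatures typically produce in practice.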

Quick Start

pip install -r requirements.txt

# CLI compare
python cli.py compare image1.jpg image2.jpg

# API + web UI
uvicorn api:app --reload --host 0.0.0.0 --port 5000
# open http://localhost:5000 in your browser

CLI

Compare two images:

python cli.py compare image1.jpg image2.jpg

Find duplicates in a folder:

python cli.py duplicates ./photos
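
One plausible way to find duplicates is to score every pair of signatures in the folder and keep the pairs at or above the threshold. The sketch below assumes cosine similarity and a dict of precomputed signatures; `find_duplicates` is a hypothetical name, and the real logic in core/pipeline.py may differ.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_duplicates(signatures, threshold=0.90):
    """signatures maps file name -> vector.

    Returns (name_a, name_b, score) for each pair scoring at or above
    the threshold, highest score first.
    """
    pairs = []
    # combinations() visits each unordered pair exactly once.
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(signatures.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

Pairwise comparison grows quadratically with the number of images, which is fine for a small photo folder but slow for large collections.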

Search similar images:

python cli.py search query.jpg ./gallery --top 5
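
The search command ranks gallery images by similarity to the query and returns the best `--top` results. A minimal sketch of that ranking, again assuming cosine similarity over precomputed signatures; `search` here is an illustrative function, not the project's actual API.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, gallery, top=5):
    """gallery maps file name -> vector.

    Returns up to `top` (name, score) tuples, most similar first.
    """
    scored = ((cosine_similarity(query, vec), name) for name, vec in gallery.items())
    # nlargest avoids sorting the whole gallery when only a few results are needed.
    return [(name, score) for score, name in heapq.nlargest(top, scored)]
```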

REST API

Start the server:

uvicorn api:app --host 0.0.0.0 --port 5000

Endpoints:

  • POST /compare (multipart form: image1, image2)
  • GET /health

Example:

curl -X POST \
  -F "image1=@img1.jpg" \
  -F "image2=@img2.jpg" \
  http://localhost:5000/compare

Response:

{
  "similarity": 0.9423,
  "match": true,
  "threshold": 0.90
}
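
On the client side, the response can be parsed into a small typed object. This sketch assumes only the three fields shown in the example response; `CompareResult` and `parse_compare_response` are hypothetical helper names.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CompareResult:
    similarity: float
    match: bool
    threshold: float


def parse_compare_response(body: str) -> CompareResult:
    """Parse the JSON body returned by POST /compare."""
    data = json.loads(body)
    return CompareResult(
        similarity=float(data["similarity"]),
        match=bool(data["match"]),
        threshold=float(data["threshold"]),
    )


# The example response shown above.
example = '{"similarity": 0.9423, "match": true, "threshold": 0.90}'
result = parse_compare_response(example)
```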

Web UI

The web UI is served directly by FastAPI from the static/ directory (no separate web server needed). Just start the API server and visit http://localhost:5000.

Features:

  • drag-and-drop uploads for both images
  • interactive before/after slider

Configuration

Edit core/config.py:

Setting            Default   Description
CLIP_MODEL         RN50      OpenCLIP model architecture
CLIP_PRETRAINED    openai    Pretrained weights source
SIMILARITY_MATCH   0.90      Default match threshold
LOG_LEVEL          INFO      Logging verbosity
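
For orientation, the settings above could be expressed as plain module-level constants. This is only a guess at the shape of core/config.py, using the setting names and defaults from the table; the actual file may be organized differently.

```python
import logging

# Names and defaults come from the table above; the layout is an assumption.
CLIP_MODEL = "RN50"        # OpenCLIP model architecture
CLIP_PRETRAINED = "openai"  # pretrained weights source
SIMILARITY_MATCH = 0.90     # scores at or above this count as a match
LOG_LEVEL = "INFO"          # logging verbosity

# Map the level name to the logging module's numeric constant.
logging.basicConfig(level=getattr(logging, LOG_LEVEL))
```

Lowering SIMILARITY_MATCH labels more pairs as matches; raising it makes matching stricter.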

Project Structure

.
├── api.py              # FastAPI server (serves API + static files)
├── cli.py              # Command-line interface
├── static/             # Web UI assets
│   ├── index.html      # Main page (HTML + Tailwind CDN)
│   ├── styles.css      # Custom styles
│   └── app.js          # Frontend JavaScript
├── core/
│   ├── config.py       # Configuration & logging
│   ├── embedder.py     # OpenCLIP embedding generation
│   ├── errors.py       # Custom exceptions
│   └── pipeline.py     # Compare/duplicate/search logic
├── requirements.txt    # Python dependencies
└── logs/               # Application logs

License

MIT License
