SAM2 Universal Video Tracker

A desktop application for tracking any object in video using SAM 2 (Segment Anything Model 2), with two input modes:

Box prompt — draw a bounding box on any frame; SAM2 propagates the mask through the entire video.
Text prompt — type free-form class names ("helmet, vest, person"); GroundingDINO detects boxes automatically, which are passed to SAM2 for pixel-accurate tracking.

Features

Box-prompt and text-prompt tracking on video
Multi-object tracking with distinct colors per object
Add / remove / refine prompts on any frame — not just frame 0
Live preview with play / pause / seek / frame-step
Export options: annotated video (MP4), mask video, per-frame masks (PNG), COCO-style JSON, CSV of bounding boxes
Thread-safe — all heavy inference runs off the UI thread
Persistent settings (last-used paths, model checkpoints, device)

Why This Design

SAM 2 is promptable by points, boxes, and masks — it does not accept text natively. The open-vocabulary text entry point is added via GroundingDINO (the standard "Grounded SAM 2" pattern). This repo keeps the two models decoupled so either side can be swapped independently.

Stack

UI: PyQt6
Tracking: SAM2 (Meta) + GroundingDINO
Video I/O: decord (read), imageio-ffmpeg (write), opencv-python (draw)
Runtime: PyTorch ≥ 2.3 — CUDA strongly recommended, CPU works

Install

1. Clone and create environment

git clone https://github.com/hassan-hfk/sam2-video-tracker.git
cd sam2-video-tracker
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

2. Install dependencies

pip install -r requirements.txt

Note: ffmpeg must be on your system PATH for video export.

Windows: https://ffmpeg.org/download.html

Linux: sudo apt install ffmpeg

Mac: brew install ffmpeg

3. Download SAM2 checkpoint

Choose one and place it anywhere on your system — you'll set the path in Settings:

# Small — fast, 46M params (recommended for most use cases)
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_small.pt

# Base Plus — balanced
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_base_plus.pt

# Large — best quality
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt

Or download from the official repo: https://github.com/facebookresearch/sam2

4. Download GroundingDINO checkpoint (text-prompt mode only)

wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

Run

python -m app.main

On first launch, go to Settings to set your SAM2 checkpoint path and (optionally) GroundingDINO checkpoint path.

Project Structure

sam2_tracker/
├── app/              ← Entry point + application shell
├── core/             ← Model wrappers, tracker orchestrator, export
├── ui/               ← PyQt6 widgets
├── utils/            ← Video I/O, drawing, config
└── assets/           ← Icons, QSS stylesheet

Hardware Requirements

Mode	Min GPU VRAM	Notes
SAM2 Small (box prompt)	4 GB	CPU also works, slower
SAM2 Large (box prompt)	8 GB	Best mask quality
SAM2 + GroundingDINO (text prompt)	8 GB	Both models loaded simultaneously

CUDA 11.8+ recommended. CPU inference works but is significantly slower for long videos.

License

MIT — free to use, modify, and distribute.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SAM2 Universal Video Tracker

Features

Why This Design

Stack

Install

1. Clone and create environment

2. Install dependencies

3. Download SAM2 checkpoint

4. Download GroundingDINO checkpoint (text-prompt mode only)

Run

Project Structure

Hardware Requirements

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
app		app
assets		assets
core		core
ui		ui
utils		utils
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

SAM2 Universal Video Tracker

Features

Why This Design

Stack

Install

1. Clone and create environment

2. Install dependencies

3. Download SAM2 checkpoint

4. Download GroundingDINO checkpoint (text-prompt mode only)

Run

Project Structure

Hardware Requirements

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages