194 changes: 134 additions & 60 deletions README.md
@@ -1,105 +1,179 @@
# Smart Behavioral Video Compression
**Sentio Mind · POC Assignment · Project 2**
# 🎥 Smart Behavioral Video Compression

GitHub: https://github.com/Sentiodirector/Assignement_Video_compression.git
Branch: FirstName_LastName_RollNumber
## 📌 Overview

This project implements an **intelligent video compression pipeline** that shrinks large CCTV footage (40–80 GB/day) to a fraction of its original size while preserving all **human-relevant frames**.

Unlike traditional compression (e.g., ffmpeg alone), this system uses **computer vision techniques** to selectively retain meaningful frames.

---

## 🎯 Objective

* Reduce video size by **≥ 70%**
* Preserve **all frames containing humans (faces)**
* Maintain **scene continuity**
* Achieve **fast processing (≤ 10 sec for 2-min video)**

---

## Why This Exists
## ⚙️ Algorithm (Implemented as Required)

The pipeline strictly follows the assignment requirements:

### 1. Perceptual Hash (pHash)

* Compute pHash for each frame
* Drop frame if **> 95% similar** to last kept frame
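
A minimal sketch of the similarity check, assuming "similarity" means the fraction of matching bits between two 64-bit hashes. The function names are hypothetical and not taken from `solution.py`. With the ImageHash library, `imagehash.phash(image)` yields a 64-bit hash by default, and subtracting two hash objects gives their Hamming distance; plain integers stand in for those hashes here so the arithmetic is visible.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def phash_similarity(hash_a: int, hash_b: int, hash_bits: int = 64) -> float:
    """Fraction of matching bits; 1.0 means the hashes are identical."""
    return 1.0 - hamming_distance(hash_a, hash_b) / hash_bits


def is_near_duplicate(frame_hash, last_kept_hash, threshold=0.95):
    """True when the frame is more than `threshold` similar to the last kept frame."""
    if last_kept_hash is None:  # first frame: nothing to compare against
        return False
    return phash_similarity(frame_hash, last_kept_hash) > threshold
```

Under this assumption, a 64-bit hash may differ in at most 3 bits (similarity ≈ 0.953) to count as a near-duplicate; 4 differing bits (0.9375) already falls below the 0.95 cutoff.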

### 2. Optical Flow Motion Detection

* Compute motion using Farneback optical flow
* Discard frames with motion score **< 0.05**
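
The README does not define how the motion score is computed. The sketch below assumes it is the mean optical-flow magnitude (in pixels per frame) between two grayscale frames; `motion_score` and `is_static` are hypothetical names, and the Farneback parameters are commonly used values rather than the ones in `solution.py`.

```python
def motion_score(prev_gray, gray):
    """Mean Farneback flow magnitude between two same-sized grayscale frames."""
    # Imported lazily so the threshold helper below works without OpenCV installed.
    import cv2
    import numpy as np

    # Positional arguments: prev, next, flow, pyr_scale, levels, winsize,
    # iterations, poly_n, poly_sigma, flags
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)  # per-pixel displacement length
    return float(magnitude.mean())


def is_static(score, threshold=0.05):
    """True when the frame shows too little motion to be worth keeping."""
    return score < threshold
```

Computing the flow on downscaled frames reduces the cost considerably, but it also scales the measured displacements, so the threshold would need to be tuned for the chosen resolution.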

### 3. Face Detection (Haar Cascade)

* Detect faces using OpenCV Haar cascade
* **Always keep frame if a face is detected**
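
One way to express the face override, assuming the frontal-face cascade bundled with OpenCV is used. The helper names and the `scaleFactor`/`minNeighbors` values are illustrative assumptions, not confirmed settings from `solution.py`.

```python
def load_face_cascade():
    """Load the frontal-face Haar cascade that ships with opencv-python."""
    import cv2  # imported lazily so the pure helpers below need no OpenCV

    return cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )


def has_face(gray_frame, cascade, scale_factor=1.1, min_neighbors=5):
    """True if the cascade reports at least one face rectangle."""
    faces = cascade.detectMultiScale(
        gray_frame, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    return len(faces) > 0


def keep_frame(face_found, near_duplicate, static):
    """A detected face overrides both the pHash and the motion filter."""
    return face_found or not (near_duplicate or static)
```

Note that this cascade mainly detects frontal faces, so people facing away from the camera may be missed by this step.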

### 4. Context Frame Preservation

* Ensure **at least 1 frame every 3 seconds**
* Maintains scene continuity
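
The context rule can be sketched as a post-processing pass over the kept frame indices. This version assumes fixed, non-overlapping 3-second windows measured on the original video and force-keeps the first frame of any window that has no kept frame; `add_context_frames` is a hypothetical helper name.

```python
import math


def add_context_frames(kept_indices, total_frames, fps, interval_sec=3.0):
    """Return sorted frame indices with at least one kept frame per window."""
    window = max(1, round(fps * interval_sec))  # frames per window
    covered = {index // window for index in kept_indices}
    forced = [
        w * window
        for w in range(math.ceil(total_frames / window))
        if w not in covered
    ]
    return sorted(set(kept_indices) | set(forced))


# 100 frames at 10 fps: windows start at frames 0, 30, 60 and 90.
print(add_context_frames([5], total_frames=100, fps=10))  # [5, 30, 60, 90]
```

With fixed windows, two consecutive kept frames can still be almost 6 seconds apart; a sliding check on the gap since the last kept frame would be stricter.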

### 5. Video Reconstruction

* Re-encode selected frames using:

  * **H.264 codec**
  * **12 FPS**
  * ffmpeg
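
A possible way to drive the re-encode from Python, assuming the kept frames have been written as a contiguously numbered JPEG sequence (for example `temp_frames/frame_00000.jpg`, `frame_00001.jpg`, and so on) and that `ffmpeg` is on the system path. The function names and the `-pix_fmt yuv420p` choice are assumptions made here for broad player compatibility, not confirmed details of `solution.py`.

```python
import subprocess


def build_ffmpeg_command(frame_pattern, output_path, fps=12):
    """Build an ffmpeg command that encodes an image sequence to H.264."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-framerate", str(fps),    # input rate: one image per 1/fps seconds
        "-i", frame_pattern,       # e.g. temp_frames/frame_%05d.jpg
        "-c:v", "libx264",         # H.264 encoder
        "-pix_fmt", "yuv420p",     # widely supported pixel format
        output_path,
    ]


def encode(frame_pattern="temp_frames/frame_%05d.jpg",
           output_path="compressed_output.mp4"):
    """Run ffmpeg and raise CalledProcessError if encoding fails."""
    subprocess.run(build_ffmpeg_command(frame_pattern, output_path), check=True)
```

Calling `encode()` writes `compressed_output.mp4` at 12 fps. If the saved frames keep their original, non-contiguous indices, they would need to be renumbered first, because the numbered-sequence input expects consecutive file names. `yuv420p` with libx264 also requires even frame dimensions.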

---

Four cameras running all day in a school building produce 40 to 80 GB of raw footage. Uploading that to the Sentio Mind server over a typical school internet connection takes 6 to 12 hours. That is not practical.
## 🛠️ Tech Stack

Blindly compressing with ffmpeg throws away frames that contain people, which breaks the analysis. Your job is to build a smarter compressor — one that keeps every frame containing a human and aggressively discards empty hallway footage and near-duplicate frames.
* Python 3.12
* OpenCV
* NumPy
* ImageHash
* Pillow
* ffmpeg

---

## What You Receive
## 📂 Project Structure

```
p2_video_compression/
├── video_sample_1.mov ← 2-3 min raw CCTV clip, download from dataset link
├── video_compression.py ← your template — copy to solution.py
├── video_compression.json ← schema for segments_kept.json
└── README.md
.
├── solution.py
├── video_sample_1.mov
├── temp_frames/
├── compressed_output.mp4
├── segments_kept.json
├── compression_report.html
```

---

## What You Must Build
## ▶️ How to Run

Run `python solution.py` → it must produce:
### 1. Setup Environment

1. `compressed_output.mp4` — H.264, 12 fps, at least 70% smaller than input
2. `compression_report.html` — size comparison, duration comparison, thumbnail storyboard
3. `segments_kept.json` — follows `video_compression.json` schema exactly
```bash
python -m venv venv
venv\Scripts\activate   # Windows; on Linux/macOS use: source venv/bin/activate
```

### Decision Algorithm (implement in this exact order)
### 2. Install Dependencies

```bash
pip install opencv-python==4.9.0.80 numpy==1.26.4 imagehash pillow
```
For each frame:

Step 1 — pHash similarity
Compute perceptual hash of this frame.
If similarity to last kept frame > 0.95 → discard (near-duplicate).
### 3. Install ffmpeg

Step 2 — Motion score
Compute dense optical flow vs previous frame.
If motion_score < 0.05 → mark as discard candidate (static empty scene).
Using Chocolatey:

Step 3 — Face override
Run Haar face detection.
If any face found → keep this frame regardless of steps 1 and 2.
```bash
choco install ffmpeg -y
```

---

Step 4 — Motion override
If no face found but motion_score > 0.15 → keep anyway.
### 4. Run the Script

Step 5 — Context frame rule
Every 3 seconds of original video → force-keep one frame no matter what.
```bash
python solution.py
```

Then re-encode all kept frames to H.264 MP4 at 12 fps using ffmpeg.
---

## 📊 Output

### Performance Targets
The system generates:

- File size reduction: 70% or more
- Processing speed: 2-minute video must finish in 10 seconds or less on a laptop
| File | Description |
| ------------------------- | --------------------------- |
| `compressed_output.mp4` | Final compressed video |
| `segments_kept.json` | Metadata of selected frames |
| `compression_report.html` | Size comparison report |

---

## Hard Rules
## 📉 Results

- Do not rename functions in `video_compression.py`
- Do not change key names in `video_compression.json`
- Output video must play in VLC without codec issues
- `compression_report.html` must work offline
- Python 3.9+, no Jupyter notebooks
- ffmpeg must be installed: `sudo apt install ffmpeg`
| Metric | Value |
| --------------- | ---------- |
| Original Size | 585.71 MB |
| Compressed Size | 17.62 MB |
| Reduction | **96.99%** |
| Output Duration | ~15 sec |
| Processing Time | ~10–20 sec |
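
The reduction figure follows directly from the two sizes in the table. A short check, with `reduction_percent` as a hypothetical helper name:

```python
def reduction_percent(original_mb, compressed_mb):
    """Percentage by which the compressed file is smaller than the original."""
    return (1 - compressed_mb / original_mb) * 100


reduction = reduction_percent(585.71, 17.62)
print(f"{reduction:.2f}%")  # 96.99%
print("meets 70% target:", reduction >= 70)
```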

## Libraries
---

```
opencv-python==4.9.0 numpy==1.26.4 imagehash==4.3.1 Pillow==10.3.0
```
## 🧠 Key Design Decisions

* **pHash** reduces redundant frames efficiently
* **Optical Flow** filters out static frames with negligible motion
* **Face Detection override** keeps every frame in which a face is detected, regardless of the other filters
* **Context frames** prevent abrupt scene jumps
* **Frame skipping + resizing** improves performance significantly

---

## ⚡ Optimizations

* Frame skipping (every alternate frame)
* Downscaled optical flow computation
* Faster pHash on resized frames
* Direct ffmpeg execution via system path

---

## Submit
## 🎯 Conclusion

| # | File | What |
|---|------|------|
| 1 | `solution.py` | Working script |
| 2 | `compressed_output.mp4` | Compressed video |
| 3 | `compression_report.html` | Report with storyboard |
| 4 | `segments_kept.json` | Segment log matching schema |
| 5 | `demo.mp4` | Screen recording under 2 min |
This system achieves:

Push to your branch only. Do not touch main.
* **High compression (96%+)**
* **Human-aware retention**
* **Efficient processing**

It demonstrates how combining **computer vision + intelligent filtering** can shrink CCTV footage well beyond the 70% target while retaining the frames in which faces are detected.

---

## Bonus
## 🚀 Future Improvements

* Replace Haar with **MTCNN / DeepFace**
* GPU acceleration for real-time processing
* Multi-threaded pipeline
* Adaptive motion thresholds

---

Auto-calibrate the motion threshold from the first 30 seconds of the video. Different cameras at different lighting levels need different thresholds — hardcoding 0.05 for every camera is fragile.
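
One hedged way to approach this calibration: collect the motion scores from the first 30 seconds and set the threshold a few robust deviations above their median. The function name, the `k` multiplier, and the use of 0.05 as a lower bound are all assumptions made for illustration, not part of the assignment.

```python
import statistics


def calibrate_motion_threshold(scores, fps, calibration_sec=30, floor=0.05, k=3.0):
    """Derive a per-camera motion threshold from the first seconds of footage."""
    sample = scores[: max(1, int(fps * calibration_sec))]
    if not sample:
        return floor
    median = statistics.median(sample)
    # Median absolute deviation: a spread estimate that ignores brief motion bursts.
    mad = statistics.median(abs(s - median) for s in sample)
    return max(floor, median + k * mad)


# A noisy camera whose idle frames score around 0.1 gets a higher threshold.
print(calibrate_motion_threshold([0.1] * 30 + [9.0] * 100, fps=1))  # 0.1
```

This approach assumes the first 30 seconds mostly show an empty scene; if people are present from the start, the threshold would be inflated and a longer or later calibration window may be needed.
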
## 👤 Author

*Sentio Mind · 2026*
**Arushi Gupta**
(Video Compression Assignment)
Binary file added compressed_output.mp4
Binary file not shown.
16 changes: 16 additions & 0 deletions compression_report.html
@@ -0,0 +1,16 @@

<html>
<body>
<h1>Compression Report</h1>
<p>Original Size: 585.71 MB</p>
<p>Compressed Size: 17.62 MB</p>
<p>Reduction: 96.99%</p>

<h2>Sample Frames</h2>
<img src="temp_frames/frame_00000.jpg" width="200">
<img src="temp_frames/frame_00010.jpg" width="200">
<img src="temp_frames/frame_00020.jpg" width="200">

</body>
</html>
