DRISHTI · ITMS

Automated Photo Identification & Classification for Traffic Violations using Computer Vision

Flipkart GRiD 6.0 — PS-3 · Bengaluru Traffic Police / ASTraM-compatible

DRISHTI turns any camera — pole CCTV, 4G wireless, even a phone — into an AI traffic-enforcement node. The entire pipeline runs locally on a GPU (demoed on a GTX 1650): no cloud, no per-frame cost, no citizen plate data leaving the edge.

What it detects (6 violation classes)

Violation	How
Red-Light Violation	vehicle crosses the stop-line while the signal (classified in HSV colour space) is RED
Stop-Line Violation	stops past the line during RED (encroachment) — distinct from running it
Wrong-Way Driving	sustained motion opposite the dominant traffic flow
Illegal Parking	vehicle stationary inside a drawn no-parking polygon
Triple Riding	≥3 riders sharing one motorcycle box
No Helmet	finetuned helmet model on rider crops → `without_helmet`
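
Several of the rules above reduce to small tests on signal colour and tracked positions. The following is a minimal Python sketch of how such checks could look, not DRISHTI's actual code: the function names, the hue window, `min_fraction`, `min_frames`, and the use of the standard-library `colorsys` module are all assumptions made for illustration.

```python
import colorsys

def is_red(rgb_pixels, min_fraction=0.3):
    """Assumed rule: the lamp crop is RED when enough of its pixels are
    bright, saturated and close to hue 0 in HSV."""
    if not rgb_pixels:
        return False
    red = 0
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if (h <= 0.05 or h >= 0.95) and s >= 0.5 and v >= 0.5:
            red += 1
    return red / len(rgb_pixels) >= min_fraction

def _side(p, a, b):
    """Signed area: positive on one side of the line a->b, negative on the other."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossed_line(prev, curr, a, b):
    """True when the step prev->curr strictly crosses the segment a-b."""
    return (_side(prev, a, b) * _side(curr, a, b) < 0
            and _side(a, prev, curr) * _side(b, prev, curr) < 0)

def red_light_violation(prev, curr, stop_line, lamp_pixels):
    """Red-light rule: the vehicle crosses the stop-line while the lamp is RED."""
    return is_red(lamp_pixels) and crossed_line(prev, curr, *stop_line)

def point_in_polygon(p, poly):
    """Ray casting: count polygon edges crossed by a ray going right from p."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def is_wrong_way(track, flow, min_frames=5):
    """True when `min_frames` consecutive steps point against `flow`."""
    opposed = 0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        if (x1 - x0) * flow[0] + (y1 - y0) * flow[1] < 0:
            opposed += 1
            if opposed >= min_frames:
                return True
        else:
            opposed = 0
    return False

# Made-up example geometry in pixel coordinates
stop_line = ((0, 100), (200, 100))
red_lamp = [(220, 20, 20)] * 10
green_lamp = [(20, 220, 20)] * 10
no_parking = [(0, 0), (10, 0), (10, 10), (0, 10)]
```

In this sketch, `red_light_violation((100, 120), (100, 90), stop_line, red_lamp)` is true, while the same movement under `green_lamp` is not flagged. An illegal-parking check would additionally require the vehicle to stay stationary inside the polygon for some time, which is left out here.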

Each flagged vehicle gets super-resolved evidence crops + ANPR (plate read) + timestamp + ERS reliability score.

Forensic ANPR with custom super-resolution

Distant CCTV plates are low-res and unreadable. Our pipeline reconstructs them:

plate.pt (localize)
  → IBP-MFSR     (Iterative Back-Projection multi-frame super-resolution — fuses sub-pixel detail across the burst)
  → Real-ESRGAN  (GAN upscaling, general-x4v3)
  → PP-OCRv6     (PaddleOCR, reads multi-row Indian plates)
  → plate-format cleanup (strips the IND strip, validates the format)
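
The final cleanup step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: `clean_plate` and `PLATE_RE` are hypothetical names, and the pattern encodes only the common state / district / series / number layout (for example `KA01AB1234`); other formats, such as BH-series plates, are not handled.

```python
import re
from typing import Optional

# Assumed layout: 2-letter state code, 1-2 digit district code,
# 1-3 letter series, 4-digit number.
PLATE_RE = re.compile(r"[A-Z]{2}\d{1,2}[A-Z]{1,3}\d{4}")

def clean_plate(raw: str) -> Optional[str]:
    """Normalise an OCR string and return it only if it looks like a plate."""
    # Upper-case and drop spaces, hyphens, newlines from multi-row reads, etc.
    text = re.sub(r"[^A-Z0-9]", "", raw.upper())
    candidates = [text]
    if text.startswith("IND"):
        # Try the text without the "IND" strip first.
        candidates.insert(0, text[3:])
    for cand in candidates:
        if PLATE_RE.fullmatch(cand):
            return cand
    return None
```

For example, `clean_plate("IND KA 01 AB 1234")` returns `"KA01AB1234"`, and a string that does not match the assumed layout returns `None`, so it can be routed to manual review instead of being attached to a challan.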

The custom method recovers plates that are unreadable in any single frame: a heavily pixelated capture for which ordinary OCR returns nothing becomes legible after reconstruction.

More examples follow the same layout: left is the original low-res plate the camera captured, right is DRISHTI's super-resolved reconstruction with the OCR read. The pipeline reads plates that a single frame cannot.
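
To show why several frames can carry more detail than one, here is a toy one-dimensional version of iterative back-projection in plain Python. It is a sketch under strong assumptions and not the IBP-MFSR implementation used in DRISHTI: the signal is 1-D, the shifts between frames are known integers on the high-res grid, the blur is a simple box average, and boundaries wrap around. The real pipeline works on 2-D plate crops and has to estimate sub-pixel alignment.

```python
from math import sqrt

def observe(hr, shift, scale):
    """Simulate one low-res frame: shift the high-res signal, then box-average."""
    n = len(hr)
    return [sum(hr[(scale * i + shift + j) % n] for j in range(scale)) / scale
            for i in range(n // scale)]

def ibp_super_resolve(frames, shifts, scale, iters=50, step=1.0):
    """Refine a high-res estimate until it re-creates every low-res frame."""
    n = len(frames[0]) * scale
    # Initial guess: nearest-neighbour upsampling of the first frame.
    est = [frames[0][i // scale] for i in range(n)]
    for _ in range(iters):
        correction = [0.0] * n
        for lr, shift in zip(frames, shifts):
            simulated = observe(est, shift, scale)
            for i, (seen, sim) in enumerate(zip(lr, simulated)):
                err = seen - sim
                # Back-project: spread the residual over the high-res
                # samples this low-res sample was averaged from.
                for j in range(scale):
                    correction[(scale * i + shift + j) % n] += err
        est = [e + step * c / len(frames) for e, c in zip(est, correction)]
    return est

def rmse(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Made-up "plate edge" signal, observed twice at half resolution
truth = [0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
scale, shifts = 2, [0, 1]
frames = [observe(truth, d, scale) for d in shifts]
baseline = [frames[0][i // scale] for i in range(len(truth))]  # single frame
restored = ibp_super_resolve(frames, shifts, scale)
```

Comparing `rmse(restored, truth)` with `rmse(baseline, truth)` shows the fused estimate landing closer to the original signal than the upsampled single frame, because the second, shifted frame constrains samples that the first one only sees as averages.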

Models — SOTA architecture, trained by us

We didn't just bolt together off-the-shelf weights: we finetuned our own plate and helmet detectors on ~27,000 images / 70,000+ annotations of real CCTV traffic and license plates (incl. Indian plates), on Modal L40S GPUs.

Model	Classes	Trained on	Result
base (YOLO26-nano)	vehicles, riders, pedestrians, traffic lights	COCO	n/a
plate.pt	license plate	~14.7k images / 17k boxes	0.98 mAP@50
helmet.pt	helmet / without_helmet	~12k images / 55k boxes	real-CCTV helmet compliance
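
As an illustration of how detections from these models could feed the Triple Riding and No Helmet rules, the sketch below associates person boxes with a motorcycle box and then applies both checks. It is a hypothetical example: the function names, the `(x1, y1, x2, y2)` box format, the 0.3 overlap threshold, and the per-person label list are assumptions, not the project's actual interface.

```python
def overlap_fraction(inner, outer):
    """Share of `inner`'s area that lies inside `outer` (boxes: x1, y1, x2, y2)."""
    w = min(inner[2], outer[2]) - max(inner[0], outer[0])
    h = min(inner[3], outer[3]) - max(inner[1], outer[1])
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    if w <= 0 or h <= 0 or area <= 0:
        return 0.0
    return (w * h) / area

def rider_rule_flags(motorcycle, persons, helmet_labels, min_overlap=0.3):
    """Return the rider-related violations for one motorcycle.

    `helmet_labels[i]` is the helmet model's label for `persons[i]`,
    either "helmet" or "without_helmet".
    """
    on_bike = [i for i, p in enumerate(persons)
               if overlap_fraction(p, motorcycle) >= min_overlap]
    flags = []
    if len(on_bike) >= 3:
        flags.append("triple_riding")
    if any(helmet_labels[i] == "without_helmet" for i in on_bike):
        flags.append("no_helmet")
    return flags

# Made-up boxes: three riders on the bike, one pedestrian far away
bike = (100, 100, 200, 200)
persons = [(110, 60, 150, 160), (140, 60, 180, 160),
           (160, 70, 200, 170), (400, 100, 440, 200)]
labels = ["helmet", "without_helmet", "helmet", "without_helmet"]
flags = rider_rule_flags(bike, persons, labels)
```

Here the pedestrian's missing helmet is ignored because that box does not overlap the motorcycle, which is the reason for associating riders with a vehicle before applying the helmet label.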

The whole stack is state-of-the-art (2026):

YOLO26-nano — newest YOLO, NMS-free, real-time, edge-deployable (Jetson / CCTV NPUs)
PP-OCRv6 (PaddleOCR, ONNX) — 2026 SOTA OCR for the plate read
Real-ESRGAN + Iterative Back-Projection MFSR — the forensic super-resolution front-end
BoT-SORT multi-object tracking

Roadmap: with more compute, we plan to unify detection, helmet, and plate into a single multi-task YOLO26 (one forward pass per frame → even cheaper at the edge) and to expand the plate set with more Indian-specific data.

Base YOLO26-nano — vehicles, riders & pedestrians on a live CCTV junction:

Finetuned helmet & plate detectors:

Pages

Command Center — live KPIs + analytics (recharts)
Upload & Analyze — drop a clip, draw the stop-line + mark lights, watch live GPU detection + challans, with per-model on/off + confidence sliders
Violations — officer review queue, evidence gallery, approve/reject
Live Cameras — connect RTSP/HTTP sources, per-camera calibration
Model Playground — drop any image, test the 3 models live with adjustable thresholds + super-res toggle
Hotspot Map — Mappls (MapmyIndia) live violation heatmap of Bengaluru junctions, ranked by density — shows enforcement teams where to deploy
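
The ranking behind a hotspot view can be approximated by counting violations per junction. The snippet below is a minimal sketch that uses raw counts as a stand-in for density; `rank_hotspots`, the `junction` field, and the sample records are hypothetical and do not reflect the project's data model or any real measurements.

```python
from collections import Counter

def rank_hotspots(violations, top_n=3):
    """Return the `top_n` junctions with the most recorded violations."""
    counts = Counter(v["junction"] for v in violations)
    return counts.most_common(top_n)

# Made-up records with placeholder junction IDs
sample = [
    {"junction": "J-A", "type": "no_helmet"},
    {"junction": "J-B", "type": "red_light"},
    {"junction": "J-A", "type": "triple_riding"},
    {"junction": "J-C", "type": "illegal_parking"},
    {"junction": "J-A", "type": "red_light"},
    {"junction": "J-B", "type": "wrong_way"},
]
top = rank_hotspots(sample, top_n=2)
```

A production version would more likely normalise by traffic volume or time window so that busy junctions are not ranked high purely because more vehicles pass through them.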

Stack

Frontend: Next.js 16 · React 19 · Tailwind v4 · shadcn/ui · Recharts · true-black UI
Backend (local): FastAPI · YOLO26-nano (Ultralytics) · BoT-SORT tracking · Real-ESRGAN (spandrel) · PP-OCRv6 (RapidOCR) · OpenCV
Training: Modal (L40S) — finetune yolo26n on Roboflow datasets

Run

# backend (GPU)  —  the CV engine lives in ./backend
cd backend && pip install -r requirements.txt && python3 server.py   # :8000

# frontend (repo root)
npm install && npm run dev                                            # :3000

Open Upload & Analyze → drop a traffic clip → draw the stop-line + mark the signal → Run. For live cameras: Live Cameras → Connect source → calibrate once → Start live detection.

See DEPLOY.md for Vercel + cloud-GPU deployment. The backend needs no API keys; only MAPPLS_KEY (map) is used client-side — .env is gitignored, nothing secret is committed.
