Harp404/DRISHTI

DRISHTI · ITMS

Automated Photo Identification & Classification for Traffic Violations using Computer Vision

Flipkart GRiD 6.0 — PS-3 · Bengaluru Traffic Police / ASTraM-compatible

DRISHTI turns any camera — pole CCTV, 4G wireless, even a phone — into an AI traffic-enforcement node. The entire pipeline runs locally on a GPU (demoed on a GTX 1650): no cloud, no per-frame cost, no citizen plate data leaving the edge.


What it detects (6 violation classes)

| Violation | How |
|---|---|
| Red-Light Violation | vehicle crosses the stop-line while the signal (detected via HSV) is RED |
| Stop-Line Violation | vehicle stops past the line during RED (encroachment); distinct from running the light |
| Wrong-Way Driving | sustained motion opposite the dominant traffic flow |
| Illegal Parking | vehicle stationary inside a drawn no-parking polygon |
| Triple Riding | ≥3 riders sharing one motorcycle box |
| No Helmet | finetuned helmet model on rider crops → without_helmet |
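
As an illustration of the red-light rule in the table above, here is a minimal sketch, not the code in ./backend. It assumes the tracker supplies a vehicle's previous and current ground point, the stop-line is the two endpoints drawn by the operator, and the signal state comes from the mean color of the lamp region. All function names and the hue/saturation/value thresholds are hypothetical.

```python
import colorsys

def side_of_line(p, a, b):
    """Signed area test: the sign tells which side of the line a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossed_stop_line(prev_pt, curr_pt, line_a, line_b):
    """True if the tracked point changed sides between two frames.

    Simplification: the stop-line is treated as an infinite line; a fuller
    version would also check that the crossing falls between the endpoints.
    """
    return side_of_line(prev_pt, line_a, line_b) * side_of_line(curr_pt, line_a, line_b) < 0

def is_red(rgb):
    """Classify a mean lamp color (0-255 RGB) as red in HSV space.

    The thresholds below are illustrative assumptions, not tuned values.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    return (hue_deg <= 15 or hue_deg >= 345) and s >= 0.5 and v >= 0.5

def red_light_violation(prev_pt, curr_pt, line_a, line_b, lamp_rgb):
    """Flag a violation only when the crossing happens while the lamp reads red."""
    return is_red(lamp_rgb) and crossed_stop_line(prev_pt, curr_pt, line_a, line_b)
```

The same side-of-line test could also serve the stop-line rule: a vehicle whose point sits past the line and stays stationary during RED is encroaching rather than running the light.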

Each flagged vehicle gets super-resolved evidence crops + ANPR (plate read) + timestamp + ERS reliability score.


Forensic ANPR with custom super-resolution

Distant CCTV plates are low-res and unreadable. Our pipeline reconstructs them:

plate.pt (localize)
  → IBP-MFSR     (Iterative Back-Projection multi-frame super-resolution — fuses sub-pixel detail across the burst)
  → Real-ESRGAN  (GAN upscaling, general-x4v3)
  → PP-OCRv6     (PaddleOCR, reads multi-row Indian plates)
  → plate-format cleanup (strips the IND strip, validates the format)
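
The final cleanup stage above could look roughly like the sketch below. It assumes the OCR returns one string per plate row, that the IND strip shows up as a leading "IND" token, and that the common two-letter state / two-digit RTO / one-to-three-letter series / four-digit number layout is the format being validated. The function name and the regular expression are hypothetical and do not cover every Indian plate series.

```python
import re

# Assumed pattern for the common layout, e.g. UP50AS4535 or KL65L2203.
PLATE_RE = re.compile(r"^[A-Z]{2}\d{2}[A-Z]{1,3}\d{4}$")

def clean_plate(ocr_rows):
    """Join multi-row OCR output, drop the IND strip, and validate the format.

    Returns the normalized plate string, or None if it does not match.
    """
    text = "".join(ocr_rows).upper()
    text = re.sub(r"[^A-Z0-9]", "", text)   # remove spaces, dashes, dots
    if text.startswith("IND"):              # blue-strip text read as a prefix
        text = text[3:]
    return text if PLATE_RE.match(text) else None
```

For example, `clean_plate(["IND", "UP50 AS", "4535"])` normalizes a two-row read to `UP50AS4535`, while a read that does not fit the pattern yields `None` and can be left for officer review.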

The custom method recovers plates that are unreadable in any single frame. Here, a heavily pixelated capture for which standard OCR returns nothing becomes legible:

[Image: ANPR super-resolution rescue, plate read UP50AS4535]

More examples:

[Image: ANPR super-resolution, plate read GJ01MW7581]
[Image: ANPR super-resolution, plate read MH20DV2366]
[Image: ANPR super-resolution, plate read KL65L2203]

Left = the original low-res plate the camera captured. Right = DRISHTI's super-resolved reconstruction + the OCR read. The pipeline reads plates a single frame can't.
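
To show why several frames can carry more detail than one, here is a toy 1-D sketch of iterative back-projection. It is not DRISHTI's IBP-MFSR: it assumes each low-res frame is a box-average of the high-res signal, that the per-frame shifts are known whole high-res pixels, and that the signal wraps around at the edges. The function names and parameters are hypothetical.

```python
def simulate_frame(hr, shift, factor):
    """Assumed forward model: circular shift, then average `factor` neighbours."""
    n = len(hr)
    return [
        sum(hr[(factor * i + shift + j) % n] for j in range(factor)) / factor
        for i in range(n // factor)
    ]

def ibp_super_resolve(frames, shifts, factor, iterations=50, step=1.0):
    """Refine a high-res estimate until it re-renders to the observed frames."""
    n = len(frames[0]) * factor
    # Initial guess: nearest-neighbour upsample of the first frame.
    hr = [0.0] * n
    for i, value in enumerate(frames[0]):
        for j in range(factor):
            hr[(factor * i + shifts[0] + j) % n] = value
    for _ in range(iterations):
        correction = [0.0] * n
        for lr, shift in zip(frames, shifts):
            simulated = simulate_frame(hr, shift, factor)
            for i, (observed, predicted) in enumerate(zip(lr, simulated)):
                residual = observed - predicted
                # Back-project the residual onto the pixels that produced it.
                for j in range(factor):
                    correction[(factor * i + shift + j) % n] += residual
        hr = [h + step * c / len(frames) for h, c in zip(hr, correction)]
    return hr

# Two half-resolution frames, offset by one high-res pixel.
truth = [0, 8, 8, 8, 8, 0, 0, 0]
shifts = [0, 1]
frames = [simulate_frame(truth, s, 2) for s in shifts]
estimate = ibp_super_resolve(frames, shifts, factor=2)
```

In this toy case the first frame alone blurs both edges of the pulse, and the second, shifted frame supplies the missing sub-pixel information. Real plates additionally need motion estimation and a 2-D blur model, which this sketch leaves out.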


Models — SOTA architecture, trained by us

We didn't bolt together off-the-shelf weights — we finetuned our own detectors on ~27,000 images / 70,000+ annotations of real CCTV traffic and license plates (incl. Indian plates), on Modal L40S GPUs.

| Model | Classes | Trained on | Result |
|---|---|---|---|
| base YOLO26-nano | COCO: vehicles, riders, pedestrians, traffic lights | | |
| plate.pt | license plate | ~14.7k images / 17k boxes | 0.98 mAP@50 |
| helmet.pt | helmet / without_helmet | ~12k images / 55k boxes | real-CCTV helmet compliance |
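
For readers unfamiliar with the metric in the table, the "@50" in mAP@50 means a predicted box counts as correct only if its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of that matching test follows, assuming boxes are given as `(x1, y1, x2, y2)` corners; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_match_at_50(pred_box, gt_box):
    """A prediction is a true positive at the 0.5 IoU threshold."""
    return iou(pred_box, gt_box) >= 0.5
```

Mean average precision then averages the area under the precision-recall curve over classes, using this test to decide which detections are hits.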

The whole stack is state-of-the-art (2026):

  • YOLO26-nano — newest YOLO, NMS-free, real-time, edge-deployable (Jetson / CCTV NPUs)
  • PP-OCRv6 (PaddleOCR, ONNX) — 2026 SOTA OCR for the plate read
  • Real-ESRGAN + Iterative Back-Projection MFSR — the forensic super-resolution front-end
  • BoT-SORT multi-object tracking

Roadmap: with more compute, we plan to unify detection, helmet, and plate models into a single multi-task YOLO26 (one forward pass per frame, so even cheaper at the edge) and to expand the plate dataset with more India-specific data.

Base YOLO26-nano — vehicles, riders & pedestrians on a live CCTV junction:

[Image: YOLO26 vehicle & person detection]

Finetuned helmet & plate detectors:

[Image: Helmet detection]
[Image: Plate detection]


Pages

  • Command Center — live KPIs + analytics (recharts)
  • Upload & Analyze — drop a clip, draw the stop-line + mark lights, watch live GPU detection + challans, with per-model on/off + confidence sliders
  • Violations — officer review queue, evidence gallery, approve/reject
  • Live Cameras — connect RTSP/HTTP sources, per-camera calibration
  • Model Playground — drop any image, test the 3 models live with adjustable thresholds + super-res toggle
  • Hotspot Map — Mappls (MapmyIndia) live violation heatmap of Bengaluru junctions, ranked by density, showing enforcement teams where to deploy
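
The ranking behind the Hotspot Map can be pictured with a small sketch. It assumes each approved violation record carries a junction label and that "density" means the violation count per junction; the record shape, function name, and sample junction values below are made up for illustration.

```python
from collections import Counter

def rank_hotspots(violations, top_n=3):
    """Return (junction, count) pairs, busiest junction first."""
    counts = Counter(v["junction"] for v in violations)
    return counts.most_common(top_n)

# Made-up sample records, only to show the call shape.
sample = [
    {"junction": "Silk Board", "type": "Red-Light Violation"},
    {"junction": "Hebbal", "type": "No Helmet"},
    {"junction": "Silk Board", "type": "Triple Riding"},
    {"junction": "Hebbal", "type": "Wrong-Way Driving"},
    {"junction": "Silk Board", "type": "No Helmet"},
    {"junction": "KR Puram", "type": "Illegal Parking"},
]
top_two = rank_hotspots(sample, top_n=2)
```

A per-area or per-hour normalization would be a natural refinement if raw counts favor the largest junctions.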

Stack

  • Frontend: Next.js 16 · React 19 · Tailwind v4 · shadcn/ui · Recharts · true-black UI
  • Backend (local): FastAPI · YOLO26-nano (Ultralytics) · BoT-SORT tracking · Real-ESRGAN (spandrel) · PP-OCRv6 (RapidOCR) · OpenCV
  • Training: Modal (L40S) — finetune yolo26n on Roboflow datasets

Run

# backend (GPU)  —  the CV engine lives in ./backend
cd backend && pip install -r requirements.txt && python3 server.py   # :8000

# frontend (repo root)
npm install && npm run dev                                            # :3000

Open Upload & Analyze → drop a traffic clip → draw the stop-line + mark the signal → Run. For live cameras: Live Cameras → Connect source → calibrate once → Start live detection.

See DEPLOY.md for Vercel + cloud-GPU deployment. The backend needs no API keys; only MAPPLS_KEY (map) is used client-side — .env is gitignored, nothing secret is committed.
