richard-luc/ESP32-Camera-AI-Detection

ESP32 Camera AI Detection

AI Surveillance System is a multi-camera monitoring platform built around ESP32-CAM devices, a Python detection server, and a Qt desktop client. The system captures frames from lightweight edge cameras, performs centralized YOLO-based inference on the server, and presents annotated live feeds to an operator dashboard.

Overview

This repository combines three main parts:

  • esp32cam-firmware/: PlatformIO firmware for ESP32-CAM devices that capture JPEG frames and push them to one or more servers over HTTP.
  • server/: Flask-based backend that receives frames, runs object detection, tracks active cameras, and serves annotated images to clients.
  • client/Client-Eye-Desktop/: Qt desktop application that displays live camera feeds and server availability in a simple operator interface.

The current implementation focuses on centralized inference so the ESP32-CAM nodes remain inexpensive and lightweight while the server handles the heavier AI workload.

Key Features

  • Real-time multi-camera ingestion from ESP32-CAM devices
  • Centralized AI inference using YOLO models
  • Person detection and weapon detection on incoming frames
  • Annotated live feeds served to a desktop monitoring client
  • Online/offline camera visibility in the Qt dashboard
  • Telegram alert integration for critical detections

Architecture

  1. Each ESP32-CAM captures JPEG frames and sends them to the Flask server.
  2. The Flask server receives images through the /snap endpoint.
  3. YOLO models process frames for person and weapon detection.
  4. Annotated frames are stored per camera and exposed through HTTP endpoints.
  5. The Qt client polls the server for active camera IPs and live annotated images.
  6. Telegram notifications can be triggered when configured detection rules are met.
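Steps 4 and 5 imply some per-camera store that keeps the newest annotated frame and knows which cameras are still active. The following is a minimal sketch of one way to do that, not the repository's actual code: the `FrameStore` class, its method names, and the 5-second liveness timeout are all assumptions made for illustration.

```python
import threading
import time


class FrameStore:
    """Keeps the latest annotated JPEG per camera and tracks liveness.

    Hypothetical helper: the class name and the timeout are illustrative.
    """

    def __init__(self, alive_timeout_s=5.0, clock=time.monotonic):
        self._alive_timeout_s = alive_timeout_s
        self._clock = clock
        self._lock = threading.Lock()
        self._frames = {}  # camera IP -> (jpeg bytes, last-seen timestamp)

    def put(self, camera_ip, annotated_jpeg):
        """Store the newest annotated frame for a camera (step 4)."""
        with self._lock:
            self._frames[camera_ip] = (annotated_jpeg, self._clock())

    def get(self, camera_ip):
        """Return the latest frame for a camera, or None if never seen."""
        with self._lock:
            entry = self._frames.get(camera_ip)
        return entry[0] if entry else None

    def alive_ips(self):
        """Return IPs of cameras that posted within the timeout (step 5)."""
        now = self._clock()
        with self._lock:
            return sorted(
                ip for ip, (_, seen) in self._frames.items()
                if now - seen <= self._alive_timeout_s
            )
```

The lock matters because a web server may handle camera uploads and client polls on different threads. The injectable `clock` is only there to make the liveness logic easy to test.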

[Figure: System architecture]

Repository Structure

AI-surveillance-system/
|-- client/
|   `-- Client-Eye-Desktop/
|-- esp32cam-firmware/
`-- server/
    |-- notification/
    `-- weights/

Technology Stack

  • Firmware: C++ with PlatformIO for ESP32-CAM
  • Backend: Python, Flask, OpenCV, Ultralytics YOLO
  • Desktop client: C++ with Qt
  • Notifications: Telegram Bot API

Getting Started

Prerequisites

  • Python 3.11 or newer
  • PlatformIO for firmware builds
  • Qt Creator or a compatible Qt/CMake toolchain for the desktop client
  • ESP32-CAM hardware
  • YOLO model weights placed in server/weights

1. Configure the ESP32-CAM firmware

Update the following values in esp32cam-firmware/src/main.cpp:

  • Wi-Fi SSID and password
  • Target server addresses in the servers list

Then build and upload:

cd esp32cam-firmware
pio run --target upload

2. Configure and run the server

The Python server entry point is server/server.py.

Install the required Python packages in your environment, including:

pip install flask opencv-python numpy ultralytics python-telegram-bot

If you plan to use Telegram notifications, review the bot configuration in server/notification/notification.py before starting the bot service.

Run the notification bot:

cd server/notification
python notification.py

In a separate terminal, run the detection server:

cd server
python server.py

By default, the Flask service listens on 0.0.0.0:5000.
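Once the server is running, it can be useful to check it without real camera hardware. The sketch below uses only the Python standard library to probe /ping and to build a test upload for /snap. It assumes that /snap accepts the raw JPEG bytes as the request body with an `image/jpeg` content type, which this README does not specify. Check server/server.py for the actual request format. The function names are hypothetical.

```python
import urllib.request


def build_snap_request(server_base_url, jpeg_bytes):
    """Build (but do not send) a POST /snap request carrying one JPEG frame.

    Assumption: the server reads the raw request body as the image.
    """
    return urllib.request.Request(
        url=server_base_url.rstrip("/") + "/snap",
        data=jpeg_bytes,
        method="POST",
        headers={"Content-Type": "image/jpeg"},
    )


def ping(server_base_url, timeout_s=2.0):
    """Return True if GET /ping answers with HTTP 200, False on any network error."""
    url = server_base_url.rstrip("/") + "/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

To send a test frame, pass the result of `build_snap_request("http://<server-ip>:5000", jpeg_bytes)` to `urllib.request.urlopen`.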

3. Configure and run the desktop client

Update the backend address in client/Client-Eye-Desktop/clienteye.cpp if your server IP or port is different.

Then open the project in Qt Creator and build/run the desktop application from client/Client-Eye-Desktop/CMakeLists.txt.

Detection Models

The server currently loads multiple YOLO weights from server/weights:

  • yolov8s.pt
  • yolov8m.pt
  • yolov8l.pt
  • best.pt

In the current codebase:

  • person detection uses the larger YOLO model
  • weapon detection uses the custom best.pt model
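The role-to-weights mapping can be sketched as below. This is an illustration rather than the repository's detection layer: it assumes "the larger YOLO model" means yolov8l.pt, and the names `MODEL_ROLES`, `weight_paths`, `missing_weights`, and `load_models` are hypothetical. `YOLO(path)` is the standard Ultralytics constructor for loading a weights file.

```python
from pathlib import Path

# Assumption: "the larger YOLO model" is taken to mean yolov8l.pt here.
MODEL_ROLES = {
    "person": "yolov8l.pt",
    "weapon": "best.pt",
}


def weight_paths(weights_dir="server/weights"):
    """Map each detection role to its weight file under weights_dir."""
    base = Path(weights_dir)
    return {role: base / name for role, name in MODEL_ROLES.items()}


def missing_weights(weights_dir="server/weights"):
    """List roles whose weight file is not present on disk."""
    return sorted(
        role for role, path in weight_paths(weights_dir).items()
        if not path.is_file()
    )


def load_models(weights_dir="server/weights"):
    """Load one Ultralytics YOLO model per role (requires `ultralytics`)."""
    # Imported lazily so the path helpers above work without the package.
    from ultralytics import YOLO

    return {
        role: YOLO(str(path))
        for role, path in weight_paths(weights_dir).items()
    }
```

Calling `missing_weights()` before `load_models()` gives a clear error early if a weight file has not been copied into server/weights.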

[Figure: Detection example]

HTTP Endpoints

The backend currently exposes these primary routes:

  • GET /ping: health check endpoint
  • POST /snap: receives JPEG frames from cameras
  • GET /getaliveips: returns currently active camera IPs
  • GET /get?ip=<camera-ip>: returns the latest annotated frame for a camera
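A client can combine the last two routes to fetch the latest annotated frame from every active camera. The sketch below uses only the Python standard library. It assumes that /getaliveips returns a JSON array of IP strings and that /get returns raw JPEG bytes. Neither response format is documented in this README, so adjust the parsing to match server/server.py. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def frame_url(server_base_url, camera_ip):
    """Build the GET /get?ip=<camera-ip> URL for one camera."""
    query = urllib.parse.urlencode({"ip": camera_ip})
    return f"{server_base_url.rstrip('/')}/get?{query}"


def parse_alive_ips(body):
    """Decode a /getaliveips body; assumes a JSON array of IP strings."""
    ips = json.loads(body)
    if not isinstance(ips, list):
        raise ValueError("expected a JSON array of camera IPs")
    return [str(ip) for ip in ips]


def fetch_latest_frames(server_base_url, timeout_s=2.0):
    """Return {camera_ip: annotated JPEG bytes} for every active camera."""
    base = server_base_url.rstrip("/")
    with urllib.request.urlopen(f"{base}/getaliveips", timeout=timeout_s) as resp:
        ips = parse_alive_ips(resp.read())
    frames = {}
    for ip in ips:
        with urllib.request.urlopen(frame_url(base, ip), timeout=timeout_s) as resp:
            frames[ip] = resp.read()
    return frames
```

For example, `fetch_latest_frames("http://<server-ip>:5000")` performs one poll cycle in the same way the Qt client is described as doing.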

Sample Output

Desktop Client

[Figure: Desktop client view]

Telegram Notifications

Configuration Notes

  • Camera connectivity and server addresses are currently configured directly in source files.
  • Telegram bot behavior is implemented in the notification module.
  • Model paths are currently hard-coded in the detection layer.
  • The client polls the server aggressively for status and frames, which is suitable for a lab prototype but may need tuning for larger deployments.

Security Notice

This repository appears to contain environment-specific network values and notification credentials in source files. Before publishing or deploying this project, move secrets and private configuration into environment variables or a secure configuration mechanism.
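One possible shape for environment-based configuration on the server side is sketched below. The variable names (`SERVER_HOST`, `SERVER_PORT`, `TELEGRAM_BOT_TOKEN`, `TELEGRAM_CHAT_ID`) and the `ServerConfig` class are hypothetical. The project does not currently define them.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int
    telegram_token: str
    telegram_chat_id: str


def load_config(env=None):
    """Read settings from environment variables (names are hypothetical).

    Defaults mirror the documented 0.0.0.0:5000 listener; Telegram values
    default to empty strings so no secret ever lives in source code.
    """
    env = os.environ if env is None else env
    return ServerConfig(
        host=env.get("SERVER_HOST", "0.0.0.0"),
        port=int(env.get("SERVER_PORT", "5000")),
        telegram_token=env.get("TELEGRAM_BOT_TOKEN", ""),
        telegram_chat_id=env.get("TELEGRAM_CHAT_ID", ""),
    )
```

With this approach the token is exported in the shell or a service unit before starting the server, and the source files can be published without credentials.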

Roadmap Ideas

  • Replace hard-coded configuration with environment-based settings
  • Add authentication and HTTPS for device-to-server communication
  • Introduce containerized deployment for the backend
  • Add a web dashboard for browser-based monitoring
  • Improve alert rules, throttling, and audit logging
  • Optimize polling and streaming for larger camera fleets
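As an illustration of the alert-throttling idea, the sketch below suppresses repeated alerts for the same camera and detection label within a cooldown window. It is not part of the current codebase. The `AlertThrottle` class and the 30-second default are assumptions.

```python
import time


class AlertThrottle:
    """Allows at most one alert per (camera, label) within a cooldown window.

    Hypothetical helper: not present in the repository.
    """

    def __init__(self, cooldown_s=30.0, clock=time.monotonic):
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._last_sent = {}  # (camera IP, label) -> time of last alert

    def should_send(self, camera_ip, label):
        """Return True and record the time if an alert may be sent now."""
        key = (camera_ip, label)
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._cooldown_s:
            return False
        self._last_sent[key] = now
        return True
```

The notification code would call `should_send(camera_ip, "weapon")` before posting to Telegram, so that a detection persisting across many frames produces one message rather than one per frame.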

License

This project is licensed under the MIT License. See LICENSE for details.

Author

Richard Lucero
GitHub: @richard-luc

About

Intelligent surveillance system utilizing ESP32-CAM hardware and YOLOv8 object detection for real-time person and weapon identification. Includes a Python server for processing and a Qt-based C++ client application for simultaneous multi-camera monitoring and management.
