A production-ready, edge-optimized anti-spoofing (liveness detection) platform for face-based authentication systems.
The system classifies faces as REAL vs FAKE (printed images, phone screens, TV displays) using custom-trained YOLO models, supporting cloud, browser, and fully offline mobile inference.
This project is designed for real-world deployment, not academic benchmarks.
Face recognition systems are inherently vulnerable to spoofing attacks such as:
- Printed photographs
- Mobile phone displays
- TV or video replays
- Static image injection
This project adds a dedicated liveness verification layer that checks that a physically present human is in front of the camera before any identity verification step.
Unlike generic models trained on public datasets, this system is trained on data collected in the target environment, so the model sees at training time the same conditions it will face in deployment; the aim is better real-world accuracy and robustness.
- REAL vs FAKE classification using a single-stage YOLO detector
- Custom dataset collection in ~20 minutes using automated labeling
- Real-time inference (30+ FPS on GPU, optimized for edge; slower but usable on CPU/Docker)
- Multi-face detection per frame
- Blur-aware data filtering for high-quality training data
- Web (Next.js + Three.js) with premium UX overlays
- Backend API (FastAPI + Docker) for server or edge devices
- Mobile (React Native) with fully on-device inference
- Offline-first architecture for privacy-critical use cases
Web (Next.js + Tailwind + Three.js)
  │
  │ HTTP (REST)
  ▼
FastAPI Inference Service (Docker)
  │
  │ PyTorch / ONNX (configurable)
  ▼
YOLO Anti-Spoofing Model

React Native Mobile App
(Camera → Native Inference → UI)
  │
  ▼
ONNX Runtime (Android) / CoreML (iOS)
- Training is kept separate from inference (strict separation of concerns)
- No UI logic in backend
- No images leave device in mobile mode
- Hardware-aware optimization (CPU, GPU, NNAPI, Metal)
- Environment-specific data over generic datasets
- Automated face detection using MediaPipe (`cvzone.FaceDetectionModule`)
- Auto-labeling in YOLO format
- Blur filtering via Laplacian variance
- Confidence-based face filtering
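The blur filter above relies on the variance of the Laplacian, which with OpenCV is commonly computed as `cv2.Laplacian(gray, cv2.CV_64F).var()`. The dependency-free sketch below only illustrates the arithmetic behind that score; it is not the project's collection script. The function names, the nested-list image representation, and skipping border pixels are all assumptions made for illustration.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of rows of 0..255 intensities, at least 3x3.
    Border pixels are skipped (an assumption of this sketch).
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    return pvariance(responses)


def is_sharp(gray, blur_threshold=35):
    """Keep a frame only when its focus score exceeds the threshold."""
    return laplacian_variance(gray) > blur_threshold


# A flat patch has no edges; a checkerboard has an edge at every pixel.
flat = [[128] * 8 for _ in range(8)]
checker = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]
```

A sharp image has strong edges, so its Laplacian response varies widely; a blurred image yields a low variance and is dropped.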
Supports:
- Live webcam faces
- Phone screens
- Printed photos
- TV/video content
A typical dataset:
- ~7,000 images
- Balanced REAL / FAKE classes
- Collected in ~20 minutes
- YOLOv8 (Nano → Large supported)
- Exportable to ONNX for cross-platform deployment
- Quantization options:
  - INT8 (Android / CPU edge)
  - FP16 (GPU / iOS)
- Live camera capture
- Smooth face overlays
- Color-coded liveness status:
  - Green → REAL
  - Red → FAKE
- Confidence indicators (percentage)
- Three.js depth and ambient feedback for premium UX
- Fully offline inference
- Zero-copy camera frames (VisionCamera)
- Hardware-accelerated ML (NNAPI / Metal)
- Battery-aware execution
- Privacy-first (no network dependency for inference)
- No backend required on mobile
- No biometric data leaves device
- Deterministic, low-latency inference
- Works in low-connectivity or air-gapped environments
Suitable for:
- KYC & onboarding
- Attendance systems
- Secure access control
- Exam proctoring
- Workforce authentication
- FastAPI
- Docker
- YOLOv8 (Ultralytics)
- ONNX Runtime
- TensorRT / OpenVINO (optional, via custom builds)
- Next.js (App Router)
- Tailwind CSS
- shadcn-style UI components
- Three.js for ambient visuals
- React Native
- `react-native-vision-camera`
- ONNX Runtime (Android)
- CoreML + Metal (iOS)
It Is:
- A deployable anti-spoofing (liveness) layer
- Edge-ready and privacy-preserving
- Environment-specific and accurate
- Designed for real production systems
It Is Not:
- A face recognition / identification system
- A biometric identity store
- A generic one-model-fits-all solution
- Face-based login systems
- KYC verification flows
- Secure attendance tracking
- Access control gates / turnstiles
- Exam and remote proctoring
- Workforce identity verification
- Temporal liveness checks (blink, micro-motion, challenge-response)
- Face recognition pipeline integration
- Model drift monitoring in production
- Edge device benchmarking and profiles
- WebRTC streaming support
- Confidence-based policy enforcement (threshold-based actions)
This project demonstrates how production computer vision systems should be built:
- Data first
- Automation over manual labeling
- Environment-specific training
- Edge-optimized deployment
- Clean separation of concerns
A focused implementation produces a real, defensible, deployable anti-spoofing system that you can adapt to your own environment.
Watch Full Demo Video - Real-time face liveness detection (real vs spoof) with live camera overlay
- Backend (`app/`, FastAPI + YOLOv8)
  - Exposes `POST /v1/predict` for image-based liveness detection
  - Exposes `GET /v1/health` for model/device status
  - Loads a YOLOv8 model from `MODEL_PATH` and runs inference on CPU or GPU (`DEVICE`)
- Web frontend (`frontend/`, Next.js)
  - Uses the browser camera (`react-webcam`) to capture frames
  - Sends JPEG snapshots to the backend via `/v1/predict` at a throttled FPS
  - Renders bounding boxes and labels (`REAL`/`FAKE`) over the live video
  - Shows API connection status + latency in the header
- Mobile app (`mobile/`, React Native)
  - Uses `react-native-vision-camera` to access frames on-device
  - Designed to run inference locally via ONNX Runtime (Android) or CoreML (iOS)
  - Does not depend on the FastAPI backend for liveness decisions
- Training pipeline (`training/`)
  - `data_collection.py`: collects labeled face crops (fake/real) with blur-based filtering
  - `split_data.py`: builds YOLO-format train/val/test splits and `data.yaml`
  - `train.py`: trains a YOLOv8 model for `fake` vs `real` classification
- Camera capture (browser)
  - `CameraFeed` uses `react-webcam` to grab frames at a configured FPS.
  - Frames are passed to `FrameCapture` as base64 JPEGs.
- Frame → image file (frontend)
  - `FrameCapture` draws the frame to an off-screen `<canvas>`, converts it to a JPEG `Blob`, and wraps it in a `File`.
  - A single in-flight guard ensures only one `/v1/predict` request is active at a time.
- HTTP request (frontend → backend)
  - `predictImage` in `frontend/lib/api.ts` sends the `File` as `multipart/form-data` to `POST /v1/predict` using Axios.
  - The base URL is taken from `NEXT_PUBLIC_API_BASE_URL` (usually `http://localhost:8000`).
- Inference (backend)
  - FastAPI endpoint `app/api/v1/predict.py`:
    - Validates and reads the uploaded file
    - Decodes it to an OpenCV image (`decode_image`)
    - Runs `ModelWrapper.predict(...)` (YOLOv8) with `CONFIDENCE_THRESHOLD` and image size from settings
    - Post-processes YOLO outputs in `postprocess_results` (boxes, class IDs → `fake`/`real`)
    - Returns `faces[]` + `latency_ms` as JSON
- Rendering (frontend)
  - `FrameCapture` receives the `PredictionResponse` and calls `onDetection`.
  - `Home` page updates state: `detections` + `latency`.
  - `FaceBoxes` overlays scaled bounding boxes + labels over the live video.
  - `StatusBadge` shows overall status (`REAL`/`FAKE`/`MIXED`) with highest confidence.
  - Header pill shows API status (`Checking / Connected / Not available`) and latest `latency_ms`.
- Health checking
  - On page load and every 10s, the frontend calls `GET /v1/health`.
  - If `model_loaded` is true, the UI marks the API as Connected; otherwise it shows API not available.
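The overall status shown by `StatusBadge` can be pictured as a small reduction over the `faces[]` array. The real component lives in the TypeScript frontend; the Python sketch below only illustrates one plausible rule. It assumes that `MIXED` means both labels appear in the same frame, and the function name `overall_status` is hypothetical.

```python
def overall_status(faces):
    """Reduce per-face results to (status, highest_confidence).

    `faces` is a list of dicts with "label" ("real" or "fake") and
    "confidence" (0..1). Returns (None, 0.0) when no face was detected.
    """
    if not faces:
        return None, 0.0
    labels = {face["label"] for face in faces}
    if labels == {"real"}:
        status = "REAL"
    elif labels == {"fake"}:
        status = "FAKE"
    else:
        # Assumption: a frame containing both labels is reported as MIXED.
        status = "MIXED"
    highest = max(face["confidence"] for face in faces)
    return status, highest


example = [
    {"label": "real", "confidence": 0.91},
    {"label": "fake", "confidence": 0.64},
]
```

In an access-control setting a stricter policy, such as treating `MIXED` the same as `FAKE`, may be preferable.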
- Web: Camera → JPEG frame → FastAPI `/v1/predict` → YOLOv8 → JSON → bounding boxes & labels overlay.
- Mobile: Camera → native frame preprocessing → ONNX/CoreML model → labels → overlay (fully on-device).
- Training: Collected images + labels → YOLO training → `.pt` model → deployed as `MODEL_PATH` in backend or exported to ONNX/CoreML for mobile.
User
  │
  │ opens web UI
  ▼
Browser (Next.js / React)
  │
  │ camera frame (react-webcam)
  ▼
FrameCapture (canvas)
  │
  │ draw frame → JPEG → multipart/form-data
  ▼
Axios client (frontend/lib/api.ts)
  │ POST /v1/predict
  ▼
FastAPI backend (app/api/v1/predict.py)
  │ decode_image() → preprocess_image()
  │ ModelWrapper.predict() (YOLOv8)
  │ postprocess_results() → format_detections()
  ▼
JSON response (faces[], latency_ms)
  │
  │ onDetection() updates React state
  ▼
FaceBoxes / StatusBadge
  │
  └─> Draw boxes + REAL/FAKE labels over live video
From repo root:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
# Run API
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
Verify:
Invoke-RestMethod http://localhost:8000/v1/health
In a new terminal:
cd .\frontend
npm install
# Point frontend to backend
$env:NEXT_PUBLIC_API_BASE_URL="http://localhost:8000"
npm run dev
Open http://localhost:3000 and allow camera permissions.
From repo root:
docker compose up --build -d
docker compose ps
Health check:
Invoke-RestMethod http://localhost:8000/v1/health
Test /v1/predict with a local image:
$imgPath = "E:\path\to\test.jpg"
Invoke-RestMethod -Method Post `
  -Uri "http://localhost:8000/v1/predict" `
  -Form @{ file = Get-Item $imgPath }
Stop:
docker compose down
The health endpoint (GET /v1/health) returns model status + uptime.
Notes:
- The model detects liveness (real vs fake faces). To detect screen-based spoofs (faces shown on phones/tablets), the training data must include examples of such spoofs labeled as `fake`.
- The original training approach uses blur detection during data collection: screen-based faces tend to be blurrier, which helps the model learn.
- Use `CONFIDENCE_THRESHOLD` in `.env` to trade off recall vs precision (default `0.25`).
Multipart form upload:
- field name: `file`
- types: `image/jpeg`, `image/png`
Response:
- `faces[]`: each has `label` (`real`|`fake`), `confidence` (0..1), `bbox` (`x`, `y`, `w`, `h`)
- `latency_ms`
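For clients that are not the bundled frontend, the request can be assembled with the Python standard library alone. The sketch below is a minimal illustration built from the field name, content types, and endpoint listed above; the helper names `build_multipart` and `predict`, the `frame.jpg` filename, and the 10-second timeout are assumptions, not part of the project.

```python
import json
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type):
    """Encode one file part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(image_path, base_url="http://localhost:8000"):
    """POST a JPEG to /v1/predict and return the decoded JSON response."""
    with open(image_path, "rb") as handle:
        body, header = build_multipart(
            "file", "frame.jpg", handle.read(), "image/jpeg"
        )
    request = urllib.request.Request(
        f"{base_url}/v1/predict",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the API running, `predict("test.jpg")` should return a dictionary containing `faces` and `latency_ms` as described above.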
Copy `.env.example` → `.env` (optional). Key vars:
- `MODEL_PATH` (default `model/anti_spoofing.pt`)
- `CONFIDENCE_THRESHOLD` (default `0.25`, lower = more detections but also more noise)
- `DEVICE` (`auto`|`cpu`|`cuda`)
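One way to read these variables with validation is sketched below. This is not the backend's actual settings module: the `Settings` class and `load_settings` function are hypothetical, and defaulting `DEVICE` to `auto` is an assumption, since no default is stated for it.

```python
import os
from dataclasses import dataclass

ALLOWED_DEVICES = {"auto", "cpu", "cuda"}


@dataclass(frozen=True)
class Settings:
    model_path: str
    confidence_threshold: float
    device: str


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env

    # Assumption: DEVICE falls back to "auto" when unset.
    device = env.get("DEVICE", "auto").lower()
    if device not in ALLOWED_DEVICES:
        raise ValueError(f"DEVICE must be one of {sorted(ALLOWED_DEVICES)}")

    threshold = float(env.get("CONFIDENCE_THRESHOLD", "0.25"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("CONFIDENCE_THRESHOLD must be within 0..1")

    return Settings(
        model_path=env.get("MODEL_PATH", "model/anti_spoofing.pt"),
        confidence_threshold=threshold,
        device=device,
    )
```

Failing fast on an invalid value at startup is usually easier to debug than a model that silently runs on the wrong device.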
File: training/data_collection.py
Important: This script uses blur detection to ensure only focused faces are saved. Faces shown on mobile screens/phones tend to be blurrier than real faces, which helps the model learn to distinguish them.
How to collect data:
- For FAKE class (`classID = 0`):
  - Run the script with `classID = 0`
  - Show faces on mobile screens (photos/videos on phones/tablets) to the camera
  - Show printed photos of faces
  - The script will only save frames where faces are detected AND blur value > threshold (focused enough)
- For REAL class (`classID = 1`):
  - Run the script with `classID = 1`
  - Show live faces directly to the camera
  - Ensure good lighting and focus
Key parameters:
- `blurThreshold = 35`: Higher = more strict focus requirement (default 35 works well)
- `confidence = 0.8`: Face detection confidence threshold
- `classID`: Set to `0` for fake, `1` for real before running
Output: Dataset/DataCollect/*.jpg + *.txt (YOLO format labels)
After collection: Manually copy all collected images to Dataset/all/ before splitting.
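Each `*.txt` label file in YOLO format holds one line per object: the class ID followed by the box centre and size, all normalized to the image dimensions. The sketch below shows that conversion for a single face box. The function name `yolo_label` and the pixel `(x, y, w, h)` input with a top-left origin are assumptions about how the detector reports boxes.

```python
def yolo_label(class_id, bbox, image_size):
    """Format one YOLO label line.

    bbox is (x, y, w, h) in pixels with (x, y) the top-left corner;
    image_size is (width, height) in pixels.
    """
    x, y, w, h = bbox
    img_w, img_h = image_size
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{w / img_w:.6f} {h / img_h:.6f}"
    )


# A 320x240 face box centred in a 640x480 frame, labeled REAL (class 1).
line = yolo_label(1, (160, 120, 320, 240), (640, 480))
```

Here `line` is `1 0.500000 0.500000 0.500000 0.500000`, which would be written to the `.txt` file that shares its base name with the saved `.jpg`.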
File: training/split_data.py
- Reads from `Dataset/all/` (you must copy collected data here first)
- Creates `Dataset/SplitData/{train,val,test}/{images,labels}` + `data.yaml`
- Split ratio: 70% train, 20% val, 10% test
File: training/train.py
- Trains a YOLOv8 model using the generated `data.yaml`
- Default: `yolov8n.pt` (nano), 300 epochs, batch size 16
- Best model saved to `runs/anti_spoofing/weights/best.pt`
Class mapping: The model outputs class 0 = fake, class 1 = real (matches classNames = ["fake", "real"]).
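To show how this mapping connects to the API response, the sketch below turns one raw detection into a `faces[]` entry. The helper name `to_face_entry`, the corner-format input box, and the dictionary shape of `bbox` are assumptions; the backend's own post-processing may differ.

```python
CLASS_NAMES = ["fake", "real"]  # index 0 = fake, index 1 = real


def to_face_entry(class_id, confidence, xyxy):
    """Convert one detection into a response entry.

    xyxy is (x1, y1, x2, y2) in pixels; the output bbox uses the
    top-left corner plus width and height.
    """
    x1, y1, x2, y2 = xyxy
    return {
        "label": CLASS_NAMES[int(class_id)],
        "confidence": float(confidence),
        "bbox": {"x": x1, "y": y1, "w": x2 - x1, "h": y2 - y1},
    }


entry = to_face_entry(1, 0.9, (100, 50, 300, 350))
```

Keeping the class order in a single list shared by training and inference code helps avoid silently swapped labels.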
Directory: mobile/
Implemented:
- VisionCamera-based camera view
- JS-side detection coordinator + throttling
- Native module stubs:
  - Android: `mobile/android/.../YoloModule.kt` (ONNX Runtime + NNAPI hook)
  - iOS: `mobile/ios/YoloModule.swift` (CoreML/Vision hook)
Not yet turnkey:
- You must export your model to ONNX and place it at `mobile/assets/anti_spoofing.onnx`.
- You must wire native frame preprocessing properly (currently placeholder).
See docs: `docs/mobile.md`
app/ FastAPI runtime
training/ Offline training/data scripts
frontend/ Next.js web UI
mobile/ React Native app scaffold
docker/ Dockerfiles
tests/ Backend tests
docs/ Extra documentation
model/ Model artifacts (anti_spoofing.pt)