namann5/Ai_deepfake


💀 Deepfakes are getting scary good. Deepfake fights back — upload any image or video, get a verdict in seconds. Because not everything you see is real.


Department of Computer Science & Engineering (CSED) · Section 2FH



🧠 About the Project

Deepfake is a lightweight, web-based deepfake and AI-generated media verifier built as an academic project. As synthetic media becomes increasingly indistinguishable from reality, tools that help everyday users verify digital content are more critical than ever.

Deepfake analyzes uploaded images and videos through a 3-layer detection pipeline with a PyTorch deep learning model at its core:

| Layer | What it does |
|---|---|
| 🤖 Deep Learning Model | Xception backbone extracts deep visual forgery features; custom dense head produces a binary deepfake probability |
| 🔎 Metadata Forensics | EXIF parsing: camera info detection, AI software flagging, timestamp analysis |
| ⚖️ Score Aggregation | Weighted fusion → final probability score + verdict |

The system outputs a probability score (e.g., "72%, Likely Synthetic") with a confidence rating and a detailed breakdown, so no technical expertise is required to read the result.

⚠️ Light Version — built for academic and demonstration purposes. Not legal or forensic-grade.


✨ Features

| Feature | Description | Status |
|---|---|---|
| 🖼️ Image Upload | Drag-and-drop or click; JPEG, PNG, WEBP supported | ✅ Done |
| 🎞️ Video Upload | OpenCV frame sampling + Haar Cascade face detection | ✅ Done |
| 🤖 Deep Learning Detection | Xception + custom dense head; trained forgery feature extraction | ✅ Done |
| 📊 AI Probability Score | 0–100% confidence score with Real / Synthetic verdict | ✅ Done |
| 🔎 Metadata Forensics | EXIF parsing; flags missing camera, AI software, timestamps | ✅ Done |
| ⚖️ Weighted Score Fusion | 3 signals combined into one final score intelligently | ✅ Done |
| 🗄️ Result Logging | Every scan saved to MongoDB with timestamp & breakdown | ✅ Done |
| 🎨 React UI | Reusable components; responsive, fast, state-driven | ✅ Done |
| Fast Pipeline | End-to-end analysis completes in under 3 seconds | ✅ Done |
| 🐳 Docker Support | Full stack containerized; one command to run everything | ✅ Done |
| 🔁 CI/CD Pipeline | GitHub Actions; auto test, build, and publish on every push | ✅ Done |

🧬 ML Stack Architecture

The project contains three ML stacks. Only one is the active primary inference path.

✅ Active Stack — Primary Inference Path

| Component | Technology |
|---|---|
| Service file | image_server.py |
| Framework | PyTorch |
| Backbone | Xception (pytorchcv) |
| Custom head | BatchNorm1d → Linear → ReLU → BatchNorm1d → Linear |
| Pooling | AdaptiveAvgPool2d(1), resolution-robust feature compression |
| Video handling | OpenCV frame sampling + Haar Cascade face detection |
| API layer | FastAPI |
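
To illustrate why the pooling row describes AdaptiveAvgPool2d(1) as resolution-robust, here is a minimal pure-Python sketch of global average pooling. It is not the project's code: `global_avg_pool` and the toy feature maps are hypothetical, and plain lists stand in for PyTorch tensors. The point is only that each channel collapses to one number, so the output length depends on the channel count and not on the spatial size.

```python
from statistics import fmean


def global_avg_pool(feature_maps):
    """Average each channel's 2-D map down to a single float.

    feature_maps: list of channels, each a list of rows of floats
    (a stand-in for a C x H x W tensor; hypothetical layout).
    Returns one value per channel, whatever H and W are.
    """
    return [
        fmean(value for row in channel for value in row)
        for channel in feature_maps
    ]


# Same channel count (2), different spatial sizes.
small = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]            # 2 x 2 x 2
large = [[[1.0] * 3 for _ in range(3)], [[4.0] * 3 for _ in range(3)]]  # 2 x 3 x 3

pooled_small = global_avg_pool(small)  # one value per channel
pooled_large = global_avg_pool(large)  # same length as pooled_small
```

Both results have length 2, which is why a dense head with a fixed input width can follow the backbone regardless of the input resolution.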

Model-layer purpose:

  1. Xception backbone — extracts deep visual features from image texture and patterns
  2. AdaptiveAvgPool2d(1) — makes feature size robust for varied input resolutions
  3. Custom dense head — maps extracted features to a single binary logit
  4. Output — logit converted to deepfake probability, thresholded into REAL or DEEPFAKE
  5. Video — multiple sampled frames scored individually by the image model, then aggregated
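
Steps 4 and 5 above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in image_server.py: the 0.5 threshold, the use of a simple mean to aggregate frames, and the function names are all hypothetical, since the README does not document them.

```python
import math
from statistics import fmean

# Assumed decision threshold; the project's actual value is not documented.
THRESHOLD = 0.5


def logit_to_probability(logit):
    """Numerically stable sigmoid: map a raw logit to a probability in [0, 1]."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def classify(probability, threshold=THRESHOLD):
    """Threshold a deepfake probability into the two model-level labels."""
    return "DEEPFAKE" if probability >= threshold else "REAL"


def score_video(frame_logits):
    """Score every sampled frame, then average the probabilities.

    Mean aggregation is an assumption; a median or a max over frames
    would be equally plausible choices.
    """
    probabilities = [logit_to_probability(x) for x in frame_logits]
    video_probability = fmean(probabilities)
    return video_probability, classify(video_probability)
```

For a single image, `classify(logit_to_probability(logit))` gives the label; for a video, `score_video` takes one logit per sampled frame.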

🟡 Secondary Stack — Dedicated Video Classifier (Not Primary)

| Component | Technology |
|---|---|
| Service file | video_server.py |
| Framework | TensorFlow / Keras |
| Model format | .keras |
| Purpose | Direct video-classifier inference from selected frames |

The Node.js backend currently routes video through the image endpoint in image_server.py. This stack exists for dedicated video classification but is not the active runtime path.

🔴 Legacy Stack — Archived

| Component | Technology |
|---|---|
| Service file | deepscan-backend/ml_service/app.py |
| Framework | TensorFlow / Keras |
| Architecture | Reconstructed DenseNet121-based image model |
| API layer | Flask |
| Purpose | Older inference service, superseded by the active FastAPI stack |

Active vs Legacy at a glance:

  image_server.py (FastAPI + PyTorch + Xception)   ← PRIMARY ✅
  video_server.py (Keras .keras model)              ← Secondary 🟡
  deepscan-backend/ml_service/app.py (Flask + DenseNet121) ← Legacy 🔴

🏗️ System Architecture

┌─────────────────────────────────────────────────────────────┐
│                    LAYER 1 — FRONTEND                       │
│      React.js  ·  Component UI  ·  Upload Widget  ·  Axios  │
└─────────────────────┬───────────────────────────────────────┘
                      │  HTTP Request (multipart/form-data)
                      ▼
┌─────────────────────────────────────────────────────────────┐
│                 LAYER 2 — API GATEWAY                       │
│      Node.js / Express.js  ·  Multer Validator  ·  Router   │
└──────────┬──────────────────────────────┬───────────────────┘
           │                              │
           ▼                              ▼
┌──────────────────────┐      ┌───────────────────────────────┐
│  METADATA FORENSICS  │      │     ML DETECTION ENGINE       │
│  EXIF Parser (exifr) │      │  image_server.py (FastAPI)    │
│  Timestamp Checker   │      │  PyTorch + Xception Backbone  │
│  AI Software Flags   │      │  AdaptiveAvgPool2d(1)         │
│  Camera Info Check   │      │  Custom Dense Head            │
│                      │      │  OpenCV + Haar Cascade (video)│
└──────────┬───────────┘      └──────────────┬────────────────┘
           │                                  │
           └──────────────┬───────────────────┘
                          │  Weighted Score Fusion
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                  SCORE AGGREGATOR                           │
│     Model (50%)  +  Artifact (30%)  +  Metadata (20%)       │
│                  → Final Probability %                      │
│         → Verdict: REAL / UNCERTAIN / SYNTHETIC             │
└─────────────────────┬───────────────────────────────────────┘
                      │
                      ▼
              Result returned to UI
         [ 72% — ⚠ LIKELY SYNTHETIC · Medium Confidence ]

          ┌────────────────────────┐
          │        MongoDB         │
          │  Store results & logs  │
          └────────────────────────┘

Docker Architecture

  Browser (localhost:3001)
          │
          ▼
  [ Frontend Container ]     React / Nginx, port 3001→3000
          │
          │ HTTP API calls
          ▼
  [ Backend Container ]      Node.js + Express, port 5000
    /api/analyze
    /api/results
          │
          ├── HTTP (internal) ──▶ [ ML Service Container ]  FastAPI + PyTorch, image_server.py
          │
          └── Mongoose ODM ─────▶ [ MongoDB Container ]     mongo:7, port 27017, volume: mongo_data

🔄 Workflow

User Uploads Image / Video
       │
       ▼
┌─────────────┐
│  Validate   │  ← File type · Size limit · Multer
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Preprocess  │  ← Resize · Normalize · Extract Metadata
│             │    (Video: OpenCV frame sampling + Haar Cascade)
└──────┬──────┘
       │
       ├──────────────────────────┐
       ▼                          ▼
┌─────────────┐          ┌────────────────┐
│  ML Model   │          │    Metadata    │
│  Inference  │          │   Forensics    │
│ (Xception + │          │   (exifr)      │
│  PyTorch)   │          │               │
└──────┬──────┘          └───────┬────────┘
       │                          │
       └──────────┬───────────────┘
                  ▼
         ┌────────────────┐
         │Score Aggregator│  ← Weighted fusion
         └───────┬────────┘
                 │
                 ▼
         ┌──────────────┐
         │  Save to DB  │  ← MongoDB logging
         └───────┬──────┘
                 │
                 ▼
        Display Result to User
      ┌─────────────────────────┐
      │  Final Score: 72%       │
      │  Verdict: ⚠ SYNTHETIC   │
      │  Confidence: Medium     │
      │  Breakdown: shown       │
      └─────────────────────────┘

Step-by-step breakdown:

  1. 📤 Upload: JPEG, PNG, WEBP, or video; files up to 5MB are accepted
  2. Validate: Multer checks file type and size before processing
  3. 🤖 ML Inference: the Xception backbone and custom dense head extract deep forgery features; for video, OpenCV samples frames and Haar Cascade detects faces before per-frame scoring
  4. 🔎 Metadata Forensics: EXIF analysis flags missing camera info, AI software signatures, and timestamp anomalies
  5. ⚖️ Score Fusion: the Model (50%), Artifact (30%), and Metadata (20%) scores are combined into a single probability score
  6. 🗄️ DB Logging: the full result is saved to MongoDB for scan history
  7. 📊 Response: score, verdict, confidence level, and detailed breakdown are returned to the frontend
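
The score fusion step can be sketched as a weighted average. This is a minimal illustration that assumes the 50/30/20 weights shown in the architecture diagram; the verdict cut-offs (40 and 60) and the function names are hypothetical, because the README does not state where REAL, UNCERTAIN, and SYNTHETIC begin.

```python
# Weights in percent, taken from the Score Aggregator diagram.
WEIGHTS = {"model_score": 50, "artifact_score": 30, "metadata_score": 20}


def fuse_scores(breakdown):
    """Combine the three 0-100 sub-scores into one 0-100 score."""
    return sum(WEIGHTS[name] * breakdown[name] for name in WEIGHTS) / 100


def verdict_for(score, real_below=40.0, synthetic_from=60.0):
    """Map a fused score to a verdict.

    The two cut-offs are placeholders chosen for illustration only.
    """
    if score < real_below:
        return "REAL"
    if score >= synthetic_from:
        return "SYNTHETIC"
    return "UNCERTAIN"


# Sub-scores copied from the sample /api/analyze response in this README.
breakdown = {"model_score": 68, "artifact_score": 75, "metadata_score": 85}
final_score = fuse_scores(breakdown)  # (50*68 + 30*75 + 20*85) / 100 = 73.5
verdict = verdict_for(final_score)
```

With these sub-scores a plain weighted average gives 73.5, while the sample response reports 72, so the real aggregator presumably applies rounding or adjustments that this sketch does not capture.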

🛠️ Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React.js · Axios · CSS3 · Nginx | Interactive UI, API communication, production serving |
| Backend | Node.js · Express.js · Multer | REST API, file handling, routing |
| ML Inference | PyTorch · Xception (pytorchcv) · FastAPI | Deep learning model serving (primary inference path) |
| Video Processing | OpenCV · Haar Cascade | Frame sampling, face-focused preprocessing for video input |
| Metadata Analysis | exifr | EXIF metadata extraction and forensics scoring |
| Database | MongoDB · Mongoose | Scan result persistence and history |
| DevOps | Docker · Docker Compose · GitHub Actions | Containerization, orchestration, CI/CD |
| Dev Tools | Postman · VS Code · Git · GitHub | API testing, development, version control |

🔬 EXIF Metadata Scoring

The metadata forensics layer analyzes the EXIF data that cameras normally embed in the photos they take. AI-generated images typically carry none of this data, so its absence is treated as a strong synthetic signal.

Base Score: 50  (neutral starting point)

┌─────────────────────────────────────────────────────────────────┐
│                     EXIF SIGNAL TABLE                          │
├────────────────────────────┬────────────────────────────────────┤
│  Signal                    │  Score Impact                      │
├────────────────────────────┼────────────────────────────────────┤
│  No EXIF data at all       │  → score = 85  (instant flag)      │
│  No camera make/model      │  → score += 20                     │
│  Camera make/model found   │  → score -= 20                     │
│  AI software detected *    │  → score += 40                     │
│  No timestamp              │  → score += 10                     │
│  Timestamp found           │  → score -= 10                     │
│  No GPS data               │  → score += 5                      │
│  GPS data found            │  → score -= 10                     │
├────────────────────────────┼────────────────────────────────────┤
│  Final score               │  clamped between 0 and 100         │
└────────────────────────────┴────────────────────────────────────┘

* AI software detection covers:
  Stable Diffusion · Midjourney · DALL-E · Adobe Firefly
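
The signal table above translates into a short scoring function. The sketch below follows the table row by row but is not the project's implementation (which uses exifr in the Node.js backend): the function name and the tag keys (`Make`, `Model`, `Software`, `DateTimeOriginal`, `GPSLatitude`, `GPSLongitude`) are assumptions about what the parsed EXIF object looks like.

```python
# AI tool names from the footnote; substring matching is an assumption.
AI_SOFTWARE = ("stable diffusion", "midjourney", "dall-e", "adobe firefly")


def metadata_score(exif):
    """Return a 0-100 synthetic-likelihood score from parsed EXIF tags.

    exif: dict of tag name -> value, or None / empty when no EXIF exists.
    """
    if not exif:
        return 85  # no EXIF at all: instant flag

    score = 50  # neutral starting point

    has_camera = bool(exif.get("Make") or exif.get("Model"))
    score += -20 if has_camera else 20

    software = str(exif.get("Software", "")).lower()
    if any(name in software for name in AI_SOFTWARE):
        score += 40

    has_timestamp = bool(exif.get("DateTimeOriginal"))
    score += -10 if has_timestamp else 10

    has_gps = "GPSLatitude" in exif and "GPSLongitude" in exif
    score += -10 if has_gps else 5

    return max(0, min(100, score))  # clamp to [0, 100]
```

A photo with camera, timestamp, and GPS tags scores 50 − 20 − 10 − 10 = 10, while a file whose only tag is an AI tool name in `Software` reaches 125 before clamping and is returned as 100.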

💡 Note: Images shared via WhatsApp or Telegram typically have their EXIF data stripped, which inflates the metadata score even for real photos. For accurate results, upload the original file (for example, one copied directly from the camera or phone over USB).


📡 API Endpoints

POST /api/analyze

Upload an image or video for deepfake analysis.

Request

Body    : form-data
Key     : image (type: File)
Allowed : .jpg · .jpeg · .png · .webp · video formats
Max Size: 5MB

Response

{
  "id": "69b03d5a06d049d960e7937c",
  "final_score": 72,
  "verdict": "LIKELY SYNTHETIC",
  "confidence": "Medium",
  "breakdown": {
    "model_score": 68,
    "artifact_score": 75,
    "metadata_score": 85
  },
  "flags": ["No EXIF data found — strong synthetic signal"],
  "analyzed_at": "2026-03-10T15:48:42.235Z"
}
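
As a usage illustration, the request above can be sent from Python using only the standard library. This is a hedged client-side sketch, not part of the repository: `build_multipart` and `analyze` are hypothetical helper names, and the base URL assumes the backend is running locally on port 5000 as described in Getting Started.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

MAX_BYTES = 5 * 1024 * 1024  # 5MB limit from the request description


def build_multipart(field, filename, data):
    """Encode one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def analyze(path, base_url="http://localhost:5000"):
    """POST a local file to /api/analyze and return the parsed JSON result."""
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file exceeds the 5MB upload limit")
    with open(path, "rb") as fh:
        # The form key is "image", even for video uploads, per the request spec.
        body, content_type = build_multipart("image", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `analyze("photo.jpg")` against a running backend should return a dictionary shaped like the sample response, so `result["final_score"]` and `result["verdict"]` can be read directly.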

GET /api/results

Fetch the 20 most recent scan results from the database.

Response

{
  "results": [ ...array of past scans... ]
}

🚀 Getting Started

Option 1: Docker (Recommended)

See Docker Setup below — one command runs everything.

Option 2: Manual Setup

Prerequisites: Node.js v18+ · Python 3.9+ · MongoDB running locally

git clone https://github.com/namann5/Ai_deepfake.git
cd Ai_deepfake

ML Service (Active — image_server.py):

cd ml_service
pip install torch torchvision pytorchcv fastapi uvicorn opencv-python exifr
uvicorn image_server:app --port 8000

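Backend (from the repository root, in a separate terminal):

Create .env inside deepfake-backend/ before starting the server:

PORT=5000
MONGODB_URI=mongodb://localhost:27017/deepfake
ML_SERVICE_URL=http://localhost:8000

Then install the dependencies and start it:

cd deepfake-backend
npm install
node server.js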

Frontend:

cd deepfake-frontend
npm install
npm start

Access: Frontend → http://localhost:3000 · Backend → http://localhost:5000 · ML Service → http://localhost:8000


🐳 Docker Setup

Prerequisites: Install Docker Desktop

Start everything (dev mode with hot-reload)

docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d

Stop everything (keep data)

docker compose -f docker-compose.yml -f docker-compose.dev.yml down

Stop and delete all data (fresh start)

docker compose -f docker-compose.yml -f docker-compose.dev.yml down -v

Rebuild after Dockerfile changes

docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build

View live logs

docker compose -f docker-compose.yml -f docker-compose.dev.yml logs -f

Access URLs (Docker)

| Service | URL |
|---|---|
| Frontend App | http://localhost:3001 |
| Backend API | http://localhost:5000 |
| Backend Health | http://localhost:5000/ |
| ML Service | http://localhost:8000 |
| All Results | http://localhost:5000/api/results |
| MongoDB | mongodb://localhost:27017/deepfake |
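
After starting the containers, a quick way to confirm that the HTTP services respond is a small probe script. The sketch below is an optional convenience and not part of the repository: `is_up` and `check_all` are hypothetical names, and any HTTP response (including an error status) is counted as "up" because the README does not specify what each root path returns.

```python
import urllib.error
import urllib.request

# HTTP endpoints from the access table (MongoDB is not HTTP and is skipped).
SERVICES = {
    "frontend": "http://localhost:3001",
    "backend": "http://localhost:5000/",
    "ml_service": "http://localhost:8000",
}


def is_up(url, opener=urllib.request.urlopen, timeout=3):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server replied, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...


def check_all(services=SERVICES, opener=urllib.request.urlopen):
    """Probe every service and return {name: bool}."""
    return {name: is_up(url, opener) for name, url in services.items()}
```

Running `print(check_all())` a few seconds after `docker compose ... up -d` shows which containers are reachable from the host.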

Why Docker?

| Before Docker | After Docker |
|---|---|
| Install Node + Python + MongoDB manually | docker compose up, and you are done |
| 4 separate terminals to start | One command starts everything |
| "Works on my machine" bugs | Identical environment everywhere |
| Hours to onboard a new dev | ~5 minutes from clone to running |
| Manual MongoDB + ML service setup | All containers start automatically |

🔁 CI/CD Pipeline

Every push or pull request to main automatically triggers the GitHub Actions pipeline:

Push to main
     │
     ├── Job 1: Backend Test    → npm ci → run Jest tests
     ├── Job 2: Frontend Build  → npm ci → npm run build
     └── Job 3: Docker Push     → build images → push to GHCR
                                  (only on direct push, not PRs)

Pipeline file: .github/workflows/ci.yml


📐 Project Scope

✅ In Scope

  • Detection of AI-generated and deepfake images
  • Video deepfake detection via frame sampling and face detection
  • Deep learning model inference using PyTorch + Xception backbone
  • Metadata forensics via EXIF analysis
  • Probability scores with verdict and confidence rating
  • Clean web-based UI accessible to non-technical users
  • Scan history stored in MongoDB
  • Docker containerization for consistent cross-platform environments
  • CI/CD automated testing and deployment pipeline

❌ Out of Scope

  • Real-time live-stream deepfake detection
  • Legal or forensic-grade accuracy guarantees
  • Detection of content from generative models released after the training data cutoff
  • Audio deepfake analysis

📅 Implementation Plan

Phase 1  ██████████  Requirement Analysis & Problem Understanding   ✅ Done
Phase 2  ██████████  Dataset & Pre-trained Model Selection          ✅ Done
Phase 3  ██████████  System Architecture Design                     ✅ Done
Phase 4  ██████████  Backend Development                            ✅ Done
Phase 5  ██████████  Frontend Development                           ✅ Done
Phase 6  ██████████  Frontend–Backend Integration                   ✅ Done
Phase 7  ████░░░░░░  Testing & Performance Evaluation               🔄 In Progress
Phase 8  ██░░░░░░░░  Deployment & Documentation                     🔄 In Progress

👥 Team

| Name | Role | ID |
|---|---|---|
| Anurag Singh | Backend Development & System Analysis | 12515990006 |
| Arpita Raj | Frontend Development & UI Design | 12515990007 |
| Harshita Nagpal | Frontend Development & Documentation | 12515990016 |
| Naman Singh | Backend Development & Testing | 12515990024 |

Supervisor: Mr. Abhishek Singh (Technical Trainer)
Submitted To: Mr. Sanjay Madaan


📚 References

  1. Research papers on Deepfake Detection
  2. FaceForensics++ dataset — Kaggle
  3. Xception architecture — Chollet, F. (2017)
  4. pytorchcv documentation — pre-trained model library
  5. PyTorch documentation — model training and inference
  6. OpenCV documentation — video processing and face detection
  7. exifr documentation — EXIF metadata parsing
  8. FastAPI documentation — ML model serving
  9. IEEE and ACM digital libraries
  10. Docker documentation — containerization
  11. GitHub Actions documentation — CI/CD

Deepfake — Fighting synthetic misinformation, one pixel at a time. 🔍
