Face Matrix Backend

A production-ready facial recognition backend built with FastAPI, PostgreSQL, and NVIDIA Triton. This system detects faces, extracts feature vectors, performs similarity matching, and handles user enrollment with a complete REST API.

Getting Started

Prerequisites

Docker and Docker Compose installed on your system
(Optional) NVIDIA GPU with CUDA support for faster inference

The system runs on CPU by default. GPU speeds up inference by ~3-5x but isn't required. Model files are included in the repository (model_repository/facenet/).

Quick Start

Clone the repository:

git clone <your-repo-url>
cd face-matrix-backend

Set up environment variables:

cp .env.example .env
# Edit .env with your preferred settings

Build and run with Docker Compose:

docker build -t fastapi-server .
docker-compose up -d

The API will be available at http://localhost:8080

Try it Out

Test the recognition endpoint with any face image:

curl -X POST http://localhost:8080/api/v1/recognise_user \
  -F "image=@/path/to/your/image.jpg"

If no match is found, you'll get a req_id. Use it to enroll:

curl -X POST http://localhost:8080/api/v1/add_user \
  -H "Content-Type: application/json" \
  -d '{
    "name": "John Doe",
    "birthdate": "1990-01-01",
    "req_id": "your-req-id-here"
  }'

Services

FastAPI Backend: http://localhost:8080
PostgreSQL Database: localhost:5432
Triton Inference Server: http://localhost:8000 (HTTP), localhost:8001 (gRPC), http://localhost:8002 (Metrics)

System Requirements

Functional Requirements

Accept an image as input.
Detect a face in the image.
Extract feature vectors using a pre-trained model.
Search for a similar face in the database.
Return user information if a match is found.
Handle user enrollment by storing vectors and details in the database.

Non Functional Requirements

Performance: API response time < 1 second.
Scalability: Handle 10,000 faces with 1 reqs/sec.
Availability: 99.99% uptime for up to 5 reqs/sec.
Maintainability: Maintainability index ≥ 85 (≥ 20 by Visual Studio standards).

Application Flow

Image Input: User uploads an image.
Face Detection: Detect a face in the image.
- If no face is detected, raise an exception.
Feature Extraction: Extract facial features from the detected face.
Similarity Search: Perform a similarity search in the database for matching facial vectors.
- If a match is found, return user information.
- If no match is found:
  - Store the facial vectors in the cache.
  - Return a cache key to the user.
User Uploads Info?: Check if the user provides additional information.
- If yes, add the user info and facial vectors to the database.
- If no, expire the cached facial vectors.

Architecture

System Components

Face Detection:
- Model: MTCNN
- Role: Detects, crops and aligns faces for features extraction.
Feature Extraction:
- Model: FaceNet (InceptionResNetV1 pre-trained on “vggface2” dataset)
- Role: Generates 512-D vectors.
Similarity Search:
- Tech: PostgreSQL with pgvector.
- Role: Perform similarity lookups.
Database:
- Tech: PostgreSQL.
- Role: Persistent storage for user info and feature vectors, temporarily cache unmatched feature vectors.
Model Inference Server:
- Tech: NVIDIA Triton Inference Server
- Role: Serves ML models via HTTP/gRPC. Handles batching, model versioning, and can run on CPU or GPU. Config files are in model_repository/facenet/config.pbtxt.

Technical Specifications

API Documentation

The REST APIs will expose two key endpoints:

Image Recognition API: Responsible for recognising a user from their uploaded image.
Add User API: Will allow new users to add themselves in the system.

Image Recognition API

API Endpoint

POST /api/v1/recognise_user
Content-Type: multipart/form-data

Request Parameters

image : The image file containing the face to be recognised.
```
{
  "image": [binary data]
}
```
Response

The API will respond with a JSON object containing the following fields:
- status: Indicates whether the recognition attempt was successful ("success" or "error").
- message: A message describing the result of the recognition attempt (e.g., "Match found", "No match found").
- user_info: Contains user details if a match is found. If no match is found, this will be null.
- enrollmentKey: (only if no match is found) Includes:
  - req_id: A unique identifier that the user can use later for enrollment without re-uploading the image.
  - expiresAt: The expiration time of the temporary data in the cache.
  - metadata: Information about the quality and confidence of the detected face in the image.
Example successful response:
```
{
  "status": "success",
  "message": "Match found",
  "user_info": {
    "id": "12345",
    "name": "John Doe",
    "email": "john.doe@example.com"
  }
}
```
Example response when no match is found:
```
{
  "status": "failure",
  "message": "no_match",
  "req_id": "req-67890"
}
```

Add User API

API Endpoint

POST /api/v1/add_user
Content-Type: multipart/form-data

Request Parameters

The request should include the following parameters:
- name (string, required): The user's full name.
- birthdate (string, required): The user's birthdate in Date format.
- req_id (string, optional): Key to the cached facial vectors.
Request Example (Using Cached Request ID for Enrollment):
```
{
  "name": "John Doe",
  "birthdate": "2002-10-19",
  "req_id": "req-67890"
}
```
Response

The API will respond with a JSON object containing the following fields:
- status: Indicates whether the enrollment attempt was successful ("success" or "error").
- message: A descriptive message explaining the result (e.g., "User added successfully", "User already exists").
- user_id: The unique identifier assigned to the new user, returned if the user is successfully enrolled.
- error_code: Optional, included in case of errors.
Example Successful Response:
```
{
  "status": "success",
  "message": "User added successfully",
  "user_id": "12345"
}
```
Example Error Response:
```
{
  "status": "error",
  "message": "User already exists with this email",
  "error_code": 404
}
```

Database Schema

Users Table

CREATE TABLE users (
    id SERIAL PRIMARY KEY,                    
    name VARCHAR,               
    email VARCHAR(255) UNIQUE NOT NULL,       
    birthdate DATE NOT NULL,                  
    face_vector VECTOR(128) NOT NULL,

Cache Table

CREATE TABLE IF NOT EXISTS "vectors_cache" (
    timestamp timestamp NOT NULL DEFAULT NOW(),
    key VARCHAR PRIMARY KEY,
    face_vectors vector(512) NOT NULL
);

Error Handling

Each service layer follows a standard approach to managing errors

Error Categories

Client Errors (4XX):
- 400 - Bad Request: Input validation failures. (Face Not Detected, Similarity Threshold Not Met)
- 404 - Not Found: Resource not found.
- 409 - Conflict: Duplicate entries.
Server Errors (5XX):
- 500 - Internal Server Error: Generic server issues.
- 503 - Service Unavailable: When external dependencies are down.

Error Response Structure

All error responses are structured like this:

error_code: error code as per the above categories.
message: error message.
requestId: Unique identifier to trace the error to specific requests.

{
  "status": "error",
  "error_code": 601,
  "message": "Face not detected in the image"
}

Performance Benchmarking

Test Environment

All benchmarks and load tests were conducted on the following hardware:

Instance Type: AWS g4dn.xlarge
GPU: 1x NVIDIA Tesla T4 (16 GB GPU memory)
vCPUs: 4 vCPUs
RAM: 16 GB
Storage: Amazon EBS

This configuration represents a typical mid-range GPU instance suitable for production facial recognition workloads.

Dataset

Name: Labeled Faces in the Wild (LFW)
Link: Dataset Link
Details:
- Total number of unique identities: 5,750+
- Images per identity: Varies, with multiple images for some identities.

Accuracy Benchmarking

False Rejection Rate (FRR):
- Objective: Measure how often the system fails to recognise the correct match for a given identity.
- Setup:
  - Extracted unique identities with more than one image.
  - Used the first image to create the database of embeddings.
  - Tested the second image of each identity for recognition.
False Acceptance Rate (FAR):
- Objective: Measure how often the system incorrectly matches an input image to a different identity.
- Setup:
  - Divided the dataset into two halves.
  - Used the first half to populate the database of embeddings.
  - Tested the second half for recognition.

Metric	DB Size	Test data size	Errors	Rate(%)
FRR	5650	1680	108	6.43
FAR	2796	3557	149	4.19

Key Insights

False Rejection Rate (FRR): 6.43% indicates that approximately 1 in 16 genuine users might need to retry authentication
False Acceptance Rate (FAR): 4.19% shows strong security, with only ~4 in 100 false matches
These metrics can be tuned by adjusting the similarity threshold in the configuration
Results may vary based on image quality, lighting conditions, and face angles

Note: The benchmarking scripts and test data are available in the project-data/benchmarking/ directory for reproducibility.

Load Testing Summary

Test Configuration

The load test was conducted on the same hardware environment (AWS g4dn.xlarge with NVIDIA T4 GPU) using the following parameters:

Load Simulation

Execution using the constant-vus executor.

Scenarios

1st stage: 4 VUs (duration: 5 minutes)
2nd stage: 8 VUs (duration: 5 minutes)
3rd stage: 12 VUs (duration: 5 minutes)

Thresholds

95% of reqs must have a response time < 3000 ms.
Failure rate must be < 5%.

Test Data

Dataset: Link
Requests: Each request contained a randomly selected image from the enrolled dataset.

Results

HTTP Performance Summary

Metric	Value
Max Throughput	5.44 req/s
HTTP Failures	0.0120 req/s
Avg. Response Time	1.46 s
95% Response Time	2.86 s
Total Requests	4,968
Failed Requests	11 (0%)
Unique URLs Tested	6

Top Slowest URLs (sorted by P95 response time)

Endpoint	Scenario	Method	Status	Count	Avg	P95	Max
/api/recognise_user	scenario_12_users	POST	200	1.72K	2.11s	3.44s	5.80s
/api/recognise_user	scenario_12_users	POST	400	4	1.66s	2.63s	2.76s
/api/recognise_user	scenario_8_users	POST	200	1.65K	1.46s	2.59s	4.06s
/api/recognise_user	scenario_8_users	POST	400	3	1.62s	2.16s	2.26s
/api/recognise_user	scenario_4_users	POST	400	4	1.09s	2.03s	2.13s
/api/recognise_user	scenario_4_users	POST	200	1.59K	753ms	1.33s	2.73s

Final Evaluation

A total of 9,936 checks were evaluated with 22 failures, resulting in a 99.8% overall success rate.

Check Name	Success Rate	Success Count	Fail Count
status is 200 (scenario_8_users)	99.8%	1,649	3
response matches schema (scenario_8_users)	99.8%	1,649	3
status is 200 (scenario_12_users)	99.8%	1,715	4
status is 200 (scenario_4_users)	99.7%	1,593	4
response matches schema (scenario_4_users)	99.7%	1,593	4
response matches schema (scenario_12_users)	99.8%	1,715	4

Thresholds

All performance expectations were met, with no defined thresholds being exceeded:

http_req_duration: p(95) < 3000ms ✓
http_req_failed: rate < 0.05 ✓

Performance by Scenario

Response Time Peak

Maximum response time: 6s (at 12 VUs)
Average response time at peak: 3s
95% of requests at peak: < 5s

Throughput

Overall average: 5.5 req/s
Peak: 7.3 req/s (at 12 VUs)

Bandwidth

Data sent peak: 66.5 KB/s (at 12 VUs)
Data received peak: 1.79 KB/s (at 12 VUs)
Total data sent: 44.1 MB
Total data received: 1.18 MB

Resource Utilisation

CPU Usage

Started at ~250% and stayed constant about ~300% as the load reached it’s peak.

Memory Usage

Started at ~150 MB and peaked at ~300 MB.

Common Issues

Triton won't start: Check that model_repository/facenet/1/model.onnx exists. The model file should be ~94MB.

Database connection errors: Wait 10-15 seconds after docker-compose up for Postgres to initialize. Check logs with docker-compose logs postgres.

Port conflicts: If 8080, 5432, or 8000-8002 are in use, modify the ports in docker-compose.yml.

Slow inference on CPU: First request is always slower (~2-3s). Subsequent requests should be under 1s. For faster performance, enable GPU support by uncommenting the runtime settings in docker-compose.yml.

Architecture Notes

The system is designed with modularity in mind:

Face detection and feature extraction models are served via Triton and can be swapped out
Vector similarity search uses PostgreSQL with pgvector, but the database layer is abstracted
The FastAPI backend handles orchestration and can be extended with additional endpoints

Contributing

Contributions are welcome. Feel free to fork this project and adapt it to your needs.

Name		Name	Last commit message	Last commit date
Latest commit History 57 Commits
.github/workflows		.github/workflows
docs-data		docs-data
model_repository		model_repository
src		src
.dockerignore		.dockerignore
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
docker-compose.yml		docker-compose.yml
init.sh		init.sh
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Face Matrix Backend

Getting Started

Prerequisites

Quick Start

Try it Out

Services

System Requirements

Functional Requirements

Non Functional Requirements

Application Flow

Architecture

System Components

Technical Specifications

API Documentation

Image Recognition API

Add User API

Database Schema

Users Table

Cache Table

Error Handling

Performance Benchmarking

Test Environment

Dataset

Accuracy Benchmarking

Key Insights

Load Testing Summary

Test Configuration

Load Simulation

Scenarios

Thresholds

Test Data

Results

HTTP Performance Summary

Top Slowest URLs (sorted by P95 response time)

Final Evaluation

Thresholds

Performance by Scenario

Response Time Peak

Throughput

Bandwidth

Resource Utilisation

CPU Usage

Memory Usage

Common Issues

Architecture Notes

Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages