AI Document Intelligence API


Swagger UI preview

A backend service that turns raw documents into structured JSON using a mix of text extraction, AI analysis and metadata extraction.

Upload a PDF, DOCX or TXT file and the API will:

  • extract the text
  • generate a short summary
  • classify the document
  • pull out a few useful key fields
  • extract entities, including character offsets
  • store everything in SQLite, per user

All of that comes back as a single, well-defined JSON object that can be used by other services or UIs.



Features

  • Document upload (PDF, DOCX, TXT) via FastAPI
  • Text extraction (see the sketch after this list)
    • PDFs via pypdf
    • DOCX via python-docx
    • Plain text files as-is
  • LLM-backed analysis using OpenAI GPT-4o-mini (or an OpenAI compatible API)
  • Structured JSON output:
    • summary (3–7 sentences)
    • classification (document_type and category)
    • key_fields (date, due_date, total_amount, person_or_company, reference_id)
    • entities (PERSON, ORG, DATE, MONEY, LOCATION, OTHER) with optional offsets
  • Entity character offsets
    • Each entity can include start_offset and end_offset into the original document text
    • Useful for UI highlighting or redaction workflows
  • SQLite backed storage
    • Users, documents, analyses and entities stored in SQLite
    • Per user history with /documents and /documents/{id}
  • JWT authentication
    • /auth/register and /auth/login
    • Bearer token required for analysis and document endpoints
  • OpenAI compatible endpoint support
    • Use OpenAI directly
    • Or point to an OpenAI compatible base URL via OPENAI_BASE_URL
  • Interactive docs
    • Swagger UI at /docs
    • ReDoc at /redoc
  • Simple HTML UI
    • Minimal upload page at /
    • Paste token, upload a file, see the raw JSON response
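
The text extraction step above lives in app/extractor.py. As a rough illustration of the approach (function names here are illustrative and may not match the repository), a minimal extractor built on pypdf and python-docx could look like this:

# Minimal text-extraction sketch using pypdf and python-docx.
# Names are illustrative; the real app/extractor.py may differ.
from pathlib import Path

from docx import Document as DocxDocument
from pypdf import PdfReader


def extract_text(path: str) -> str:
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        reader = PdfReader(path)
        # Join every page; extract_text() can return None for image-only pages.
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        return "\n".join(p.text for p in DocxDocument(path).paragraphs)
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8", errors="ignore")
    raise ValueError(f"Unsupported file type: {suffix}")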

Architecture

High level flow:

flowchart LR
    U[Client] -->|Upload file + JWT| A[FastAPI /analyze]
    A -->|Auth check| AUTH[JWT verification]
    A -->|Save temp file| F[uploads/]
    A -->|Detect type| T[Extractor]
    T -->|Extract text| TXT[Document text]
    TXT -->|Prompt| L[AI client]
    L -->|Strict JSON| J[AnalysisResult]
    J -->|Persist| DB[(SQLite)]
    DB -->|History and detail| H[History endpoints]
    A -->|Return JSON| U

The service is intentionally monolithic. A single FastAPI app owns routing, analysis, persistence and authentication.
The goal is to keep the code readable and straightforward rather than overly abstract.
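
The "AI client" box in the diagram is where the extracted text is turned into strict JSON. The sketch below shows roughly what that call can look like with the OpenAI Python client; the prompt, function names and module layout are illustrative and will differ from the actual ai_client.py:

# Illustrative sketch of the LLM call behind the "AI client" step.
# Requires OPENAI_API_KEY (and optionally OPENAI_BASE_URL / MODEL_NAME) in the environment.
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL") or None,  # OpenAI-compatible endpoints
)


def analyze_text(text: str) -> dict:
    response = client.chat.completions.create(
        model=os.environ.get("MODEL_NAME", "gpt-4o-mini"),
        response_format={"type": "json_object"},  # ask the model for strict JSON
        messages=[
            {
                "role": "system",
                "content": "Return JSON with summary, classification, key_fields and entities.",
            },
            {"role": "user", "content": text[:20000]},  # naive truncation for long documents
        ],
    )
    return json.loads(response.choices[0].message.content)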


Tech Stack

  • Backend: FastAPI, Uvicorn
  • Language: Python 3.12+
  • AI / LLM: OpenAI client (GPT-4o-mini by default)
  • Parsing: pypdf, python-docx
  • Config: python-dotenv
  • Auth: JWT using python-jose and passlib
  • Database: SQLite via SQLAlchemy

Getting Started

Requirements

  • Python 3.10 or later (developed with 3.12)
  • A working pip installation
  • An OpenAI API key or a compatible provider

Installation

Clone the repository and set up a virtual environment:

git clone https://github.com/ZXRProductions/document-intel-api.git
cd document-intel-api

python -m venv venv

# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate

pip install --upgrade pip
pip install -r requirements.txt

Environment Variables

Copy the example file and fill in your details:

cp .env.example .env

.env:

# LLM config
OPENAI_API_KEY=your_api_key_here
MODEL_NAME=gpt-4o-mini

# Optional OpenAI compatible endpoint
# Example: https://api.groq.com/openai/v1
OPENAI_BASE_URL=

# App behaviour
DEBUG=false
MAX_FILE_SIZE_MB=10

# Database (leave default for local sqlite)
# DATABASE_URL=sqlite:///document_intel.db

# Auth / security
SECRET_KEY=CHANGE_ME_IN_PRODUCTION
ACCESS_TOKEN_EXPIRE_MINUTES=60
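
app/config.py is responsible for reading these values at startup. A minimal sketch of how that can be done with python-dotenv (variable names mirror the .env keys above; the real config.py may structure this differently):

# Illustrative settings loader built on python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # read .env from the project root into os.environ

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]            # required
MODEL_NAME = os.getenv("MODEL_NAME", "gpt-4o-mini")
OPENAI_BASE_URL = os.getenv("OPENAI_BASE_URL") or None   # empty string -> default OpenAI endpoint
DEBUG = os.getenv("DEBUG", "false").lower() == "true"
MAX_FILE_SIZE_MB = int(os.getenv("MAX_FILE_SIZE_MB", "10"))
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///document_intel.db")
SECRET_KEY = os.getenv("SECRET_KEY", "CHANGE_ME_IN_PRODUCTION")
ACCESS_TOKEN_EXPIRE_MINUTES = int(os.getenv("ACCESS_TOKEN_EXPIRE_MINUTES", "60"))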

Running the API

From the project root:

uvicorn app.main:app --reload

Open:

  • http://127.0.0.1:8000/ for the minimal HTML upload page
  • http://127.0.0.1:8000/docs for Swagger UI
  • http://127.0.0.1:8000/redoc for ReDoc


API Reference

Auth

POST /auth/register

Create a new user.

Request body:

{
  "email": "user@example.com",
  "password": "yourpassword"
}

Response (201):

{
  "id": 1,
  "email": "user@example.com"
}

POST /auth/login

Authenticate and obtain a JWT access token.

Form data (application/x-www-form-urlencoded):

  • username (email)
  • password

Example:

curl -X POST "http://127.0.0.1:8000/auth/login" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=user@example.com&password=yourpassword"

Response (200):

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer"
}

Use this token in the Authorization header:

Authorization: Bearer <token>
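
From a Python client, the full register / login / call-with-token sequence can be scripted with the requests package (the base URL assumes a local uvicorn instance):

# Register, log in and reuse the JWT from a Python client.
import requests

BASE = "http://127.0.0.1:8000"
creds = {"email": "user@example.com", "password": "yourpassword"}

# POST /auth/register expects a JSON body (only needed once per user).
requests.post(f"{BASE}/auth/register", json=creds)

# POST /auth/login expects form data with username/password.
resp = requests.post(
    f"{BASE}/auth/login",
    data={"username": creds["email"], "password": creds["password"]},
)
token = resp.json()["access_token"]

# Send the token on every protected endpoint.
headers = {"Authorization": f"Bearer {token}"}
print(requests.get(f"{BASE}/me", headers=headers).json())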

GET /me

Return basic information about the authenticated user.

Response:

{
  "id": 1,
  "email": "user@example.com"
}

System

GET /health

Simple health check.

Response:

{
  "status": "ok",
  "detail": "Service is up."
}

Analysis

POST /analyze

Upload a document and get structured analysis back. Requires a valid Bearer token.

  • Method: POST
  • Content-Type: multipart/form-data
  • Field: file (PDF, DOCX, TXT)
  • Auth: Authorization: Bearer <token>

Possible responses:

  • 200 analysis JSON
  • 400 invalid file or empty text
  • 401 missing or invalid token
  • 422 extraction failed
  • 500 unexpected server or AI error

Documents

GET /documents

List analysed documents for the current user.

Response:

[
  {
    "id": 1,
    "filename": "invoice-june.pdf",
    "created_at": "2025-11-28T19:22:15.123456",
    "document_type": "invoice",
    "category": "finance",
    "summary": "Short summary of the invoice..."
  }
]

GET /documents/{id}

Get full details for a single document owned by the current user.

Response:

{
  "id": 1,
  "filename": "invoice-june.pdf",
  "mime_type": "application/pdf",
  "created_at": "2025-11-28T19:22:15.123456",
  "full_text": "Full extracted document text here...",
  "analysis": {
    "summary": "This invoice from Acme Corp...",
    "classification": {
      "document_type": "invoice",
      "category": "finance"
    },
    "key_fields": {
      "date": "2025-06-30",
      "due_date": "2025-07-30",
      "total_amount": "£1,250.00",
      "person_or_company": "Acme Corp",
      "reference_id": "INV-2025-0612"
    },
    "entities": [
      {
        "type": "ORG",
        "text": "Acme Corp",
        "start_offset": 15,
        "end_offset": 24
      },
      {
        "type": "PERSON",
        "text": "John Doe",
        "start_offset": 120,
        "end_offset": 128
      }
    ]
  }
}

If the document does not exist or is not owned by the current user, a 404 is returned.
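
As a quick Python example, the history endpoints can be walked like this (reuse the Bearer token obtained from /auth/login):

# List the current user's documents, then fetch full detail for the first one.
import requests

BASE = "http://127.0.0.1:8000"
headers = {"Authorization": "Bearer <your_token>"}

documents = requests.get(f"{BASE}/documents", headers=headers).json()
for doc in documents:
    print(doc["id"], doc["filename"], doc["document_type"], doc["category"])

if documents:
    detail = requests.get(f"{BASE}/documents/{documents[0]['id']}", headers=headers).json()
    print(detail["analysis"]["summary"])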


Example Request

Using curl:

curl -X POST "http://127.0.0.1:8000/analyze" \
  -H "accept: application/json" \
  -H "Authorization: Bearer <your_token>" \
  -F "file=@example_invoice.pdf"
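
Or the same request from Python with the requests package:

# Equivalent of the curl command above.
import requests

headers = {"Authorization": "Bearer <your_token>"}
with open("example_invoice.pdf", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8000/analyze",
        headers=headers,
        files={"file": ("example_invoice.pdf", f, "application/pdf")},
    )
print(resp.json())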

Example Response

{
  "summary": "This document is an invoice issued by Acme Corp to John Doe for consulting services rendered in June 2025. It lists the invoice number, billing address, service description, total amount due and payment terms. The invoice highlights a single line item with hourly consulting fees. Payment is due within 30 days via bank transfer. Contact details are provided for billing queries. The document serves as a formal request for payment.",
  "classification": {
    "document_type": "invoice",
    "category": "finance"
  },
  "key_fields": {
    "date": "2025-06-30",
    "due_date": "2025-07-30",
    "total_amount": "£1,250.00",
    "person_or_company": "Acme Corp",
    "reference_id": "INV-2025-0612"
  },
  "entities": [
    {
      "type": "ORG",
      "text": "Acme Corp",
      "start_offset": 10,
      "end_offset": 19
    },
    {
      "type": "PERSON",
      "text": "John Doe",
      "start_offset": 120,
      "end_offset": 128
    },
    {
      "type": "DATE",
      "text": "30 June 2025",
      "start_offset": 200,
      "end_offset": 212
    },
    {
      "type": "MONEY",
      "text": "£1,250.00",
      "start_offset": 260,
      "end_offset": 270
    }
  ]
}

The exact values will depend on the uploaded document and the model output.
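
Because start_offset and end_offset index into the extracted text (returned as full_text by /documents/{id}), a client can highlight or redact entities without any extra matching. A small illustrative example, using made-up text and the ORG entity shape from the response above:

# Redact entities in the extracted text using the offsets from the analysis response.
full_text = "Invoice - Acme Corp ..."  # would normally come from /documents/{id}
entities = [
    {"type": "ORG", "text": "Acme Corp", "start_offset": 10, "end_offset": 19},
]

redacted = full_text
# Apply replacements right-to-left so earlier offsets stay valid.
for ent in sorted(entities, key=lambda e: e["start_offset"], reverse=True):
    redacted = (
        redacted[: ent["start_offset"]]
        + f"[{ent['type']}]"
        + redacted[ent["end_offset"] :]
    )
print(redacted)  # Invoice - [ORG] ...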


Project Structure

document-intel-api/
  ├─ app/
  │   ├─ main.py          # FastAPI app, routes, error handling, simple HTML UI
  │   ├─ extractor.py     # PDF / DOCX / TXT extraction
  │   ├─ ai_client.py     # AI prompt and OpenAI client wrapper
  │   ├─ auth.py          # JWT auth, register/login, current user dependency
  │   ├─ database.py      # SQLAlchemy engine and session helpers
  │   ├─ db_models.py     # ORM models (User, Document, Entity)
  │   ├─ models.py        # Pydantic models (request and response)
  │   ├─ config.py        # Environment and settings
  │   └─ utils.py         # Helper utilities (file saving, normalisation, offsets)
  │
  ├─ uploads/             # Temporary uploads (kept out of git)
  │   └─ .gitkeep
  │
  ├─ README.md
  ├─ .env.example
  ├─ requirements.txt
  ├─ .gitignore
  └─ LICENSE

Planned Roadmap

The current version focuses on a clean, end to end flow rather than every possible feature.

AI and NLP

  • More robust NER with confidence scores
  • Richer entity types and relationships
  • Better handling of long and complex documents
  • Table extraction from invoices and reports
  • PII redaction utilities for personal data
  • A dedicated endpoint for follow up questions about a document

Backend and API

  • More detailed analytics and metrics
  • Pagination and filtering for /documents
  • Optional soft deletion or archiving of documents
  • API keys for service to service access
  • Rate limiting options for public deployments

Frontend

  • A more polished HTML/JS front end
  • Optional React dashboard with:
    • drag and drop upload
    • document list with filters
    • detail view with highlighted entities
    • simple charts for document distribution

Deployment

  • Dockerfile and container image build
  • Example configuration for Railway, Render or Fly.io
  • CORS tightening for production
  • Health and readiness endpoints for container platforms

Testing

  • pytest suite
  • Fixtures with fake PDF, DOCX and TXT documents
  • Mocked AI responses for deterministic tests
  • Integration tests covering auth, analysis and history

Development Notes

Some practical details when working on the project:

  • Documents and analyses are stored per user and only visible to the owner.
  • Entity extraction is handled by the AI model, not a separate NER library.
  • Offsets are calculated using a simple first-match approach (sketched after this list). It is usually enough for demos and small tools but can be refined.
  • SQLite is the default to keep setup simple. Switching to Postgres or another database mainly involves updating DATABASE_URL and running migrations.
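
For reference, "first match" means roughly the following (an illustrative sketch, not necessarily the exact code in utils.py):

# First-match offset calculation: locate the first occurrence of each entity's text
# in the document and derive character offsets from it.
def add_offsets(text: str, entities: list[dict]) -> list[dict]:
    for entity in entities:
        start = text.find(entity["text"])  # -1 if the model's text is not verbatim
        if start != -1:
            entity["start_offset"] = start
            entity["end_offset"] = start + len(entity["text"])
    return entities


print(add_offsets("Invoice from Acme Corp", [{"type": "ORG", "text": "Acme Corp"}]))

The obvious limitation is that repeated strings always resolve to their first occurrence, which is one of the refinements worth making for anything beyond demos.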

Contributing

If you spot something odd or want to extend the project, feel free to:

  1. Fork the repository.

  2. Create a feature branch:

    git checkout -b feature/my-idea

  3. Commit your changes:

    git commit -am "Describe your change"

  4. Push the branch:

    git push origin feature/my-idea

  5. Open a pull request.

Bug reports and small improvements to the documentation are very welcome.


License

This project is licensed under the MIT License.
