InvoiceAI is a practical backend project that combines OCR and AI to extract key information from invoices. It demonstrates how to integrate Tesseract OCR, LangChain/OpenAI models, and FastAPI to build a smart, production-ready API for document understanding — fully containerized with Docker and persistently storing results in SQLite.

vickytilotia/InvoiceAI

🧾 InvoiceAI - OCR + AI Agent Invoice Field Extractor

🚀 Overview

InvoiceAI is a practical AI-driven backend project built to automatically extract key invoice fields such as invoice number, date, total amount, and customer details.
The objective was to explore OCR (Optical Character Recognition) and AI Agents (LangChain + LLMs) in real-world document understanding workflows.


🧠 Features

  • Invoice text extraction using pdf2image + pytesseract
  • AI-powered field extraction using LangChain with a local Hugging Face model (flan-t5-small) or, optionally, the OpenAI API
  • REST API built with FastAPI
  • Asynchronous processing, packaged for Docker deployment
  • Automatic JSON structuring of the extracted invoice fields
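
To make the "field extraction" and "JSON structuring" features concrete, here is a minimal sketch of how a prompt could be built and how free-form model output could be coerced into a fixed set of keys. The names `FIELDS`, `build_prompt`, and `parse_model_output` are hypothetical and not taken from the repository; the key list mirrors the example response shown later in this README.

```python
import json
import re

# Hypothetical key list, mirroring the example response in this README.
FIELDS = ["invoice_number", "invoice_date", "total_amount", "currency", "vendor"]


def build_prompt(raw_text: str) -> str:
    """Ask the model for one JSON object with exactly the expected keys."""
    keys = ", ".join(FIELDS)
    return (
        "Extract the following fields from the invoice text and answer "
        f"with a single JSON object using the keys: {keys}. "
        "Use null for anything that is missing.\n\n"
        f"Invoice text:\n{raw_text}"
    )


def parse_model_output(raw_output: str) -> dict:
    """Parse the span from the first '{' to the last '}' and normalise the keys.

    Small models often wrap JSON in extra words, so anything outside the
    braces is ignored; unparseable output yields None for every field.
    """
    data = {}
    match = re.search(r"\{.*\}", raw_output, flags=re.DOTALL)
    if match:
        try:
            loaded = json.loads(match.group(0))
            if isinstance(loaded, dict):
                data = loaded
        except json.JSONDecodeError:
            data = {}
    return {field: data.get(field) for field in FIELDS}
```

Returning every expected key, even when the model omits it, keeps the API response shape stable for clients.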

🧩 System Architecture

flowchart TD
  A[Client: Upload invoice] --> B[FastAPI /upload endpoint]
  B --> C{File type}
  C -->|PDF| D[pdf2image → images]
  C -->|Image| D
  D --> E["Tesseract OCR (pytesseract) → raw_text"]
  E --> F["ai_parser.extract_invoice_fields(raw_text)"]
  F --> G{Parser mode}
  G -->|Local LLM| H["HuggingFace pipeline (flan-t5-base / small model)"]
  G -->|OpenAI| I["OpenAI API (gpt-3.5-turbo)"]
  H --> J[raw_model_output]
  I --> J
  J --> K["Post-process (optional): regex / heuristics → structured fields"]
  K --> L["Save to SQLite (InvoiceText table)"]
  L --> M["Response: {id, filename, extracted_text, structured_data}"]
  M --> N["Client receives result"]
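
The optional post-processing step in the diagram (regex / heuristics → structured fields) might look roughly like the sketch below. The patterns and the helper names `heuristic_fields` and `merge_fields` are illustrative assumptions, not the repository's code; real invoices vary widely, so treat the regexes as a starting point only.

```python
import re

# Illustrative patterns only; they assume English labels and ISO dates.
PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.I
    ),
    "invoice_date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
    "total_amount": re.compile(r"\btotal\b[^\d\n]*(\d+(?:[.,]\d{2})?)", re.I),
    "currency": re.compile(r"\b(EUR|USD|GBP|INR)\b"),
}


def heuristic_fields(raw_text: str) -> dict:
    """Run every pattern over the OCR text; missing matches become None."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(raw_text)
        result[name] = match.group(1) if match else None
    return result


def merge_fields(model_fields: dict, raw_text: str) -> dict:
    """Keep the model's values and fill empty or missing ones from the regexes."""
    merged = dict(model_fields)
    for name, value in heuristic_fields(raw_text).items():
        if not merged.get(name):
            merged[name] = value
    return merged
```

Letting the model's answer win and using the regexes only as a fallback is one possible design; the opposite priority may suit fields with rigid formats, such as dates.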

⚙️ Tech Stack

| Layer | Technology |
| --- | --- |
| Backend Framework | FastAPI |
| OCR Engine | pytesseract |
| AI Model | LangChain + HuggingFace (flan-t5-small) |
| File Handling | pdf2image |
| Containerization | Docker |
| Language | Python 3.10+ |

🧰 Installation & Setup

1️⃣ Clone the repository

git clone https://github.com/vickytilotia/InvoiceAI.git
cd InvoiceAI

2️⃣ Create and activate virtual environment

python -m venv venv
venv\Scripts\activate  # On Windows
source venv/bin/activate  # On Linux/Mac

3️⃣ Install dependencies

pip install -r requirements.txt

4️⃣ Run the FastAPI server

uvicorn app.main:app --reload

The server will be live at 👉 http://127.0.0.1:8000


🐳 Docker Setup

Build the Docker image

docker build -t invoice-ai .

Run the container

docker run -d -p 8000:8000 invoice-ai

Access the API docs at: http://localhost:8000/docs


📄 API Endpoint

POST /upload
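Upload an invoice (PDF or image) to extract structured fields.

Example of the extracted fields (per the architecture diagram, the full response also includes id, filename, and extracted_text):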

{
  "invoice_number": "123100401",
  "invoice_date": "2024-03-01",
  "total_amount": "251.12",
  "currency": "EUR",
  "vendor": "CPB Software GmbH"
}
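
On the client side, the string values in this example can be converted into typed values before further use. The sketch below is an assumption about how a consumer might do that; `InvoiceFields` and `parse_fields` are hypothetical names, and the code assumes an ISO date and a plain decimal string, as in the example above.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceFields:
    """Typed view of the extracted fields (hypothetical, client-side only)."""
    invoice_number: str
    invoice_date: date
    total_amount: Decimal
    currency: str
    vendor: str


def parse_fields(payload: str) -> InvoiceFields:
    """Convert the JSON text into typed values; raises on malformed input."""
    raw = json.loads(payload)
    return InvoiceFields(
        invoice_number=raw["invoice_number"],
        invoice_date=date.fromisoformat(raw["invoice_date"]),
        total_amount=Decimal(raw["total_amount"]),  # Decimal avoids float rounding
        currency=raw["currency"],
        vendor=raw["vendor"],
    )


example = """{
  "invoice_number": "123100401",
  "invoice_date": "2024-03-01",
  "total_amount": "251.12",
  "currency": "EUR",
  "vendor": "CPB Software GmbH"
}"""
fields = parse_fields(example)
```

Since the values come from OCR plus a language model, a real client would probably catch `KeyError`, `ValueError`, and `decimal.InvalidOperation` here instead of letting them propagate.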

GET /invoices

Returns a list of stored invoices with id, filename, and created_at.

GET /invoices/{id}

Returns the details of a single invoice, including the extracted text and the stored parsed result.
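
The storage behind these two endpoints could be sketched with the standard `sqlite3` module as follows. The table name `invoice_text`, the column layout, and the helper functions are assumptions inferred from the fields mentioned in this README (id, filename, extracted_text, structured_data, created_at); the repository may well use an ORM instead.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import List, Optional

# Assumed schema; column names follow the fields named in this README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS invoice_text (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filename TEXT NOT NULL,
    extracted_text TEXT NOT NULL,
    structured_data TEXT,
    created_at TEXT NOT NULL
)
"""


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(SCHEMA)


def save_invoice(conn, filename: str, extracted_text: str, structured_data: dict) -> int:
    """Insert one invoice; structured_data is stored as a JSON string."""
    cur = conn.execute(
        "INSERT INTO invoice_text (filename, extracted_text, structured_data, created_at) "
        "VALUES (?, ?, ?, ?)",
        (filename, extracted_text, json.dumps(structured_data),
         datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cur.lastrowid


def list_invoices(conn) -> List[dict]:
    """Summary rows, as GET /invoices is described to return."""
    rows = conn.execute(
        "SELECT id, filename, created_at FROM invoice_text ORDER BY id"
    ).fetchall()
    return [{"id": r[0], "filename": r[1], "created_at": r[2]} for r in rows]


def get_invoice(conn, invoice_id: int) -> Optional[dict]:
    """Full record for GET /invoices/{id}, or None if the id is unknown."""
    row = conn.execute(
        "SELECT id, filename, extracted_text, structured_data, created_at "
        "FROM invoice_text WHERE id = ?",
        (invoice_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "id": row[0],
        "filename": row[1],
        "extracted_text": row[2],
        "structured_data": json.loads(row[3]) if row[3] else None,
        "created_at": row[4],
    }
```

In an API handler, a `None` result from `get_invoice` would typically be translated into a 404 response.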


🧭 Project Flow Summary

  1. PDF uploaded → converted into images (image uploads go straight to OCR)
  2. OCR extracts visible text using Tesseract
  3. LLM processes the text → predicts key fields
  4. Result structured into JSON format and saved to SQLite
  5. Returned via REST API
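
The steps above can be sketched as a single function that dispatches on file type, runs OCR, and hands the text to a field parser. Everything here is a hedged illustration: `process_invoice`, `ocr_pdf_bytes`, `ocr_image_bytes`, and `IMAGE_SUFFIXES` are hypothetical names, and the OCR helpers assume that pdf2image (with Poppler), Pillow, and pytesseract (with the Tesseract binary) are installed. The third-party imports are deferred so the dispatch logic can be exercised with stand-in functions.

```python
import io
from pathlib import Path
from typing import Callable

# Assumed set of accepted image types; adjust to what the API really allows.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def ocr_pdf_bytes(pdf_bytes: bytes) -> str:
    """Render each PDF page to an image, then OCR the pages in order."""
    from pdf2image import convert_from_bytes  # deferred: optional heavy dependency
    import pytesseract

    pages = convert_from_bytes(pdf_bytes)
    return "\n".join(pytesseract.image_to_string(page) for page in pages)


def ocr_image_bytes(image_bytes: bytes) -> str:
    """OCR a single uploaded image."""
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(io.BytesIO(image_bytes)))


def process_invoice(
    filename: str,
    data: bytes,
    parse_fields: Callable[[str], dict],
    ocr_pdf: Callable[[bytes], str] = ocr_pdf_bytes,
    ocr_image: Callable[[bytes], str] = ocr_image_bytes,
) -> dict:
    """Upload bytes -> OCR text -> structured fields, as one result dict."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        raw_text = ocr_pdf(data)
    elif suffix in IMAGE_SUFFIXES:
        raw_text = ocr_image(data)
    else:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    return {
        "filename": filename,
        "extracted_text": raw_text,
        "structured_data": parse_fields(raw_text),
    }
```

Passing the OCR and parsing steps in as callables keeps the flow testable without Tesseract or a model, for example `process_invoice("a.pdf", b"", parse_fields=lambda t: {}, ocr_pdf=lambda b: "text")`.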

🧪 Conclusion

✅ Successfully integrated OCR with AI model inference.
✅ Demonstrated how LangChain can structure unstructured text intelligently.
✅ Explored a modular and production-ready architecture with FastAPI and Docker.

This project lays a strong foundation for advanced document understanding pipelines like automated invoice validation, expense tracking, or ERP integration.


✨ Author

Vivek Kumar
💼 Backend Developer | 🧠 Exploring AI + Automation
