Enterprise-grade AI system that achieves 98% validation accuracy in categorizing financial transactions using an ensemble of MCC codes, rules, machine learning, and LLMs.
- Overview
- Key Features
- Architecture
- Performance
- Quick Start
- API Endpoints
- Configuration
- Training & Evaluation
- Monitoring
- Project Structure
- Development
Transaction AI is a privacy-first, production-ready system for automatically categorizing financial transactions with high accuracy. It combines multiple AI techniques in an intelligent ensemble to achieve 98%+ accuracy while maintaining fast response times (~100ms in fast mode).
- Pushed a repository maintenance update for this month's activity
- Cleaned up historical evaluation logs for public-history hygiene
- Re-reviewed documentation for public presentation
- High Accuracy: 98.38% validation accuracy, 69.2% on real-world data
- Privacy-First: 100% local processing, no cloud APIs required
- Fast Performance: ~100ms latency with intelligent fast-path optimization
- Hybrid Intelligence: ensemble of MCC codes, rules, ML embeddings, and LLMs
- Production-Ready: Docker deployment, monitoring, health checks, auto-retraining
- PDF Support: extract and categorize transactions from bank statements
- Active Learning: auto-retrains from user feedback every 50 corrections
- MCC Classifier (15% weight)
  - Uses ISO 18245 merchant category codes
  - 85-95% confidence on transactions with MCC data
  - Instant categorization for MCC-enabled transactions
- Rule-Based Engine (15% weight)
  - 90+ keyword patterns across 29 categories
  - Regex matching for merchant names
  - 90-98% confidence, <35ms latency
- ML Embedding Classifier (65% weight, highest)
  - LightGBM model trained on 22,664+ transactions
  - sentence-transformers embeddings (all-MiniLM-L6-v2)
  - 96%+ accuracy with semantic understanding
- LLM Classifier (5% weight)
  - Llama 3.1 8B (Ollama) or Azure GPT-4/GPT-4o
  - Few-shot learning with 5 category examples
  - 92% accuracy, handles edge cases
- Fast Mode: skips the LLM when the rule and ML classifiers agree (≥90% confidence)
  - 70% of transactions take the fast path
  - ~100ms latency vs. 850ms with the full ensemble
  - Maintains 98% accuracy
- Early Exit: high-confidence merchant/MCC matches skip the ensemble entirely
- Category-Specific Thresholds:
  - Critical categories (Investments, Rent): 90% auto-accept
  - Medium categories (Travel, Health): 85% auto-accept
  - Low-risk categories (Food, Shopping): 80% auto-accept
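The weighting, fast-path, and threshold rules above can be sketched in a few lines of Python. This is an illustrative stand-in for the actual implementation in `core/model/ensemble_router.py`; the function names and the exact score normalization are assumptions, and only the weights and thresholds come from this README.

```python
# Sketch of weighted voting with fast-path and per-category thresholds.
# Weights/thresholds mirror the README; everything else is illustrative.
from collections import defaultdict

WEIGHTS = {"mcc": 0.15, "rule": 0.15, "ml": 0.65, "llm": 0.05}

# Category-specific auto-accept thresholds; low-risk defaults to 0.80.
AUTO_ACCEPT = {"investments": 0.90, "rent": 0.90, "travel": 0.85, "health": 0.85}

def ensemble_vote(votes: dict) -> dict:
    """votes maps classifier name -> (category, classifier confidence)."""
    scores = defaultdict(float)
    for name, (category, conf) in votes.items():
        scores[category] += WEIGHTS[name] * conf
    category, score = max(scores.items(), key=lambda kv: kv[1])
    # Normalize by the weight of the classifiers that actually voted,
    # so skipping the LLM does not depress confidence.
    norm = score / sum(WEIGHTS[n] for n in votes)
    return {
        "category": category,
        "confidence": round(norm, 4),
        "requires_review": norm < AUTO_ACCEPT.get(category, 0.80),
    }

def fast_path(rule_vote, ml_vote, threshold=0.90) -> bool:
    """Fast mode: skip the LLM when rule and ML agree with high confidence."""
    (r_cat, r_conf), (m_cat, m_conf) = rule_vote, ml_vote
    return r_cat == m_cat and min(r_conf, m_conf) >= threshold

votes = {"mcc": ("food_dining", 0.90),
         "rule": ("food_dining", 0.95),
         "ml": ("food_dining", 0.97)}
if fast_path(votes["rule"], votes["ml"]):
    result = ensemble_vote(votes)  # LLM vote skipped entirely
```

With the votes above, the rule and ML classifiers agree at ≥90% confidence, so the LLM is skipped and the transaction auto-accepts as `food_dining`.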
```
food_dining       groceries          transport              travel
bills             utilities          fuel                   health
education         shopping           entertainment          subscriptions
income_salary     transfers_upi      investments            atm_cash
rent              insurance          professional_services  automotive
electronics       home_improvement   pets                   kids_family
personal_care     gifts_occasions    charity_donations      taxes_government
fees_charges      fraud_security     other
```
- User feedback stored in corrections.jsonl + database
- Auto-retraining triggered every 50 corrections
- Hot model reload with zero downtime
- User-corrected categories cached for instant future lookups
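A minimal sketch of that trigger logic, assuming corrections are appended to `corrections.jsonl`. The real pipeline (`scripts/feedback_learning.py`) also writes to the database and hot-reloads the model; `record_correction` is a hypothetical helper, not the project's API.

```python
# Append a correction and signal a retrain every 50th correction.
import json
from pathlib import Path

AUTO_RETRAIN_THRESHOLD = 50

def record_correction(path: Path, text: str, predicted: str, correct: str) -> bool:
    """Append one correction to corrections.jsonl; return True when the
    accumulated count hits a multiple of the retrain threshold."""
    entry = {"transaction_text": text,
             "predicted_category": predicted,
             "correct_category": correct}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    n_corrections = sum(1 for _ in path.open(encoding="utf-8"))
    return n_corrections % AUTO_RETRAIN_THRESHOLD == 0
```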
- Upload PDF bank statements (PhonePe, ICICI, etc.)
- Automatic transaction extraction using pdfplumber
- Batch categorization of all extracted transactions
- Supports multi-page statements (tested up to 26 pages)
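After pdfplumber's `page.extract_text()` dumps a page to raw text, a parser turns matching lines into transactions. A stdlib-only sketch; the date/description/amount line format below is illustrative, not the actual PhonePe/ICICI layouts handled by `core/parsers/pdf_parser.py`.

```python
# Parse raw statement text (as produced by pdfplumber's extract_text)
# into transaction dicts. The regex assumes a simple
# "DD/MM/YYYY  narration  amount" row layout for illustration.
import re

LINE_RE = re.compile(
    r"(?P<date>\d{2}/\d{2}/\d{4})\s+"    # e.g. 05/01/2025
    r"(?P<desc>.+?)\s+"                   # merchant / narration
    r"(?P<amount>-?[\d,]+\.\d{2})$"       # e.g. 1,299.00
)

def parse_statement_text(text: str) -> list:
    """Keep only lines that look like transaction rows."""
    txns = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            txns.append({
                "date": m.group("date"),
                "text": m.group("desc"),
                "amount": float(m.group("amount").replace(",", "")),
            })
    return txns
```

In the real flow, each parsed `text` field is then sent through the same ensemble as any other transaction.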
- Single transaction categorization
- Batch CSV/text upload (max 1000 transactions)
- PDF bank statement upload
- Real-time ensemble voting visualization
- System health monitoring (7 components)
- Performance statistics
- User feedback submission
```
                      Transaction AI System

 ┌────────────┐      ┌────────────┐      ┌────────────┐
 │  Next.js   │      │  FastAPI   │      │ PostgreSQL │
 │ Dashboard  │◀────▶│  REST API  │◀────▶│  Database  │
 │ (Port 3000)│      │ (Port 8000)│      │ (Port 5432)│
 └────────────┘      └─────┬──────┘      └────────────┘
                           │
              ┌────────────┴────────────┐
              ▼                         ▼
       ┌─────────────┐           ┌──────────────┐
       │ Redis Cache │           │    Ollama    │
       │ (Port 6379) │           │ LLM Service  │
       └─────────────┘           │ (Port 11435) │
                                 └──────────────┘

 ┌─────────────── Ensemble Router ──────────────┐
 │  ┌───────┐  ┌───────┐  ┌───────┐  ┌───────┐  │
 │  │  MCC  │  │ Rules │  │  ML   │  │  LLM  │  │
 │  │ (15%) │  │ (15%) │  │ (65%) │  │ (5%)  │  │
 │  └───┬───┘  └───┬───┘  └───┬───┘  └───┬───┘  │
 │      └──────────┴──────────┴──────────┘      │
 │             Weighted Voting System           │
 └──────────────────────────────────────────────┘

 ┌──────────── Monitoring Stack ────────────┐
 │  Prometheus (Metrics) + Grafana (Viz)    │
 └──────────────────────────────────────────┘
```
```
Input Transaction
       │
       ▼
┌──────────────┐
│ Preprocessor │ ◀── Extract MCC, amount, date, merchant
└──────┬───────┘
       ▼
┌──────────────┐
│  Normalizer  │ ◀── Clean text, resolve merchant aliases
└──────┬───────┘
       ▼
┌──────────────┐
│    Router    │ ◀── Fast path check (high confidence?)
└──────┬───────┘
       │
       ├── YES ──▶ Return category (< 35ms)
       │
       NO
       ▼
┌──────────────────────────────┐
│       Ensemble Voting        │
│  ┌─────┬──────┬─────┬─────┐  │
│  │ MCC │ Rule │ ML  │ LLM │  │
│  │ 15% │ 15%  │ 65% │ 5%  │  │
│  └─────┴──────┴─────┴─────┘  │
│              │               │
│       Weighted Vote          │
│              │               │
│       ┌──────▼──────┐        │
│       │ Confidence  │        │
│       │   >= 80%?   │        │
│       └──────┬──────┘        │
│              │               │
│      YES ◀───┴───▶ NO        │
│       ▼              ▼       │
│    Accept    Flag for Review │
└──────────────────────────────┘
       │
       ▼
Return Result + Cache
```
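The Normalizer step above can be illustrated with a small sketch. The alias table and noise tokens here are toy stand-ins for `data/gazetteer/merchant_aliases.csv` and the real normalizer in `core/normalize/normalizer.py`.

```python
# Clean raw narration text, then try to resolve a canonical merchant.
import re

# Toy alias table; the real one lives in merchant_aliases.csv.
ALIASES = {"sbux": "STARBUCKS", "starbuck": "STARBUCKS", "netflix.com": "NETFLIX"}

def normalize(text: str) -> str:
    """Lowercase, strip payment-rail tokens and punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"\b(upi|pos|neft|imps|txn|ref)\b", " ", text)  # rail noise
    text = re.sub(r"[^a-z0-9.& ]+", " ", text)                    # punctuation
    return re.sub(r"\s+", " ", text).strip()

def resolve_merchant(text: str):
    """Return the canonical merchant if any known alias appears."""
    for alias, canonical in ALIASES.items():
        if alias in text:
            return canonical
    return None
```

For example, `normalize("UPI/TXN-SBUX#4821 Coffee")` yields `"sbux 4821 coffee"`, which the resolver maps to `STARBUCKS` and the router can return via the early-exit path.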
| Dataset | Accuracy | Samples |
|---|---|---|
| Validation Set | 98.38% | 5,600 |
| Real-World (PhonePe) | 66.7% | 12 |
| Real-World (ICICI) | 71.4% | 14 |
| Well-Known Brands | 95%+ | - |
| Mode | P50 | P95 | P99 | Throughput |
|---|---|---|---|---|
| Fast Mode (70% of traffic) | 100ms | 150ms | 200ms | ~70 req/s |
| Full Ensemble | 850ms | 1200ms | 1500ms | ~10 req/s |
| Rules Only | 35ms | 50ms | 75ms | ~1000 req/s |
| ML Only | 115ms | 180ms | 250ms | ~100 req/s |
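The P50/P95/P99 figures above are latency percentiles; to sanity-check your own deployment you can compute them from measured request times. A stdlib-only sketch using the nearest-rank method (the helper name and sample data are illustrative):

```python
# Nearest-rank percentile: the smallest sample >= p% of all samples.
import math

def percentile(samples: list, p: float) -> float:
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

latencies_ms = [80, 90, 95, 100, 105, 110, 120, 130, 150, 200]
p50 = percentile(latencies_ms, 50)  # 105
p95 = percentile(latencies_ms, 95)  # 200 (with only 10 samples, P95 is the max)
```

With small sample counts the tail percentiles collapse onto the maximum, which is why the table's P95/P99 numbers assume sustained traffic.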
- RAM: 16GB (8GB LLM, 4GB ML, 4GB system)
- Disk: 20GB
- CPU: 8 cores recommended (4 minimum)
- GPU: Optional (5-10x faster LLM inference)
- Docker 20.10+ and Docker Compose 2.0+
- 16GB RAM, 20GB disk space
- (Optional) NVIDIA GPU for faster LLM inference
```bash
git clone https://github.com/Rahul1269227/transaction-ai
cd transaction-ai
cp .env.example .env
# Edit .env to configure database passwords, LLM provider, etc.
```

Key configurations:
```bash
# Database
POSTGRES_PASSWORD=your_secure_password

# LLM Provider (choose one)
LLM_PROVIDER=ollama   # Local LLM (recommended)
# LLM_PROVIDER=azure  # Azure OpenAI

# Ensemble Weights
MCC_WEIGHT=0.15
RULE_WEIGHT=0.15
ML_WEIGHT=0.65
LLM_WEIGHT=0.05

# Performance
FAST_MODE=true
FAST_MODE_THRESHOLD=0.90
```

```bash
# First time: download the LLM model (llama3.1:8b, ~5GB)
docker-compose --profile llm-setup up llm-loader

# Start all services (full stack with local LLM)
docker-compose --profile llm up -d

# Minimal stack (no LLM, no monitoring)
docker-compose up -d postgres redis api ui

# Full stack with monitoring
docker-compose --profile llm --profile monitoring up -d
```

Verify the deployment:

```bash
# Check API health
curl http://localhost:8000/health

# Check all services
docker-compose ps
```

- Dashboard UI: http://localhost:3001
- API Docs: http://localhost:8000/docs
- Grafana: http://localhost:4000 (admin/admin)
- Prometheus: http://localhost:9090
Categorize a single transaction:

```bash
curl -X POST http://localhost:8000/categorize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Payment to Starbucks Coffee",
    "amount": 5.50,
    "currency": "USD",
    "mcc": "5814"
  }'
```

Response:

```json
{
  "category": "food_dining",
  "subcategory": "Cafes & Coffee",
  "confidence": 0.95,
  "method": "merchant_gazetteer",
  "ensemble_votes": {
    "mcc": "food_dining",
    "rule": "food_dining",
    "ml": "food_dining",
    "llm": null
  },
  "requires_review": false
}
```

Batch categorization (max 1000 transactions):

```bash
curl -X POST http://localhost:8000/batch-categorize \
  -H "Content-Type: application/json" \
  -d '{
    "transactions": [
      "Netflix monthly subscription",
      "Uber ride to airport",
      "Whole Foods groceries"
    ]
  }'
```

Upload a PDF bank statement:

```bash
curl -X POST http://localhost:8000/upload-pdf \
  -F "file=@bank_statement.pdf"
```

Submit feedback on a prediction:

```bash
curl -X POST http://localhost:8000/feedback \
  -H "Content-Type: application/json" \
  -d '{
    "transaction_text": "Payment to Netflix",
    "predicted_category": "entertainment",
    "correct_category": "subscriptions_memberships",
    "was_incorrect": true
  }'
```

Trigger retraining from accumulated feedback:

```bash
curl -X POST http://localhost:8000/feedback-learning
```

Health, statistics, and Prometheus metrics:

```bash
curl http://localhost:8000/health
curl http://localhost:8000/stats
curl http://localhost:8000/metrics
```

Database configuration:

```bash
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_DB=transactions
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
```

Ollama (local):
```bash
LLM_PROVIDER=ollama
LLM_URL=http://llm-service:11434
LLM_MODEL=llama3.1:8b
LLM_TIMEOUT=120.0
```

Azure OpenAI:

```bash
LLM_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your_api_key
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_VERSION=2024-08-01-preview
```
```bash
# Weights (must sum to 1.0)
MCC_WEIGHT=0.15
RULE_WEIGHT=0.15
ML_WEIGHT=0.65
LLM_WEIGHT=0.05

# Thresholds
ML_CONFIDENCE_THRESHOLD=0.80
RULE_CONFIDENCE_THRESHOLD=0.80

# Performance
USE_ENSEMBLE=true
FAST_MODE=true
FAST_MODE_THRESHOLD=0.90
ENABLE_PARALLEL=true
```

Auto-retraining:

```bash
AUTO_RETRAIN_ENABLED=true
AUTO_RETRAIN_THRESHOLD=50   # Retrain after 50 corrections
```

Edit `data/taxonomy.yaml` to add or modify categories:
```yaml
categories:
  - name: "Food & Dining"
    id: "food_dining"
    description: "Restaurants, food delivery, cafes"
    mcc_codes:
      - "5812"  # Restaurants
      - "5814"  # Fast Food
    keywords:
      - "restaurant"
      - "cafe"
      - "starbucks"
    patterns:
      - "(?i).*restaurant.*"
      - "(?i).*cafe.*"
```
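Taxonomy entries like the one above drive the rule engine: keywords act as substring matches and patterns as regexes against the transaction text. A minimal sketch of that lookup (the production engine lives in `core/rules/engine.py`; `rule_categorize` and the inline taxonomy dict are illustrative):

```python
# First-match rule categorization driven by taxonomy-style entries.
import re

TAXONOMY = [
    {"id": "food_dining",
     "keywords": ["restaurant", "cafe", "starbucks"],
     "patterns": [r"(?i).*restaurant.*", r"(?i).*cafe.*"]},
]

def rule_categorize(text: str):
    lowered = text.lower()
    for cat in TAXONOMY:
        if any(kw in lowered for kw in cat["keywords"]):
            return cat["id"]
        if any(re.match(p, text) for p in cat["patterns"]):
            return cat["id"]
    return None  # no rule fired; fall through to ML/LLM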
Add merchant aliases in `data/gazetteer/merchant_aliases.csv`:

```csv
merchant_id,canonical_name,aliases,category,subcategory
1,STARBUCKS,"starbucks,starbuck,sbux",food_dining,Cafes & Coffee
2,NETFLIX,"netflix,netflix.com",subscriptions_memberships,Streaming Services
```

Train with defaults:

```bash
python3 scripts/train.py
```

Or with explicit hyperparameters:

```bash
python3 scripts/train_model.py \
  --train data/train.jsonl \
  --val data/test.jsonl \
  --output models/transaction_classifier \
  --n-estimators 200 \
  --learning-rate 0.05 \
  --max-depth 10
```

Hyperparameters:

- `n_estimators`: number of boosting rounds (default: 200)
- `learning_rate`: learning rate (default: 0.05)
- `max_depth`: maximum tree depth (default: 10)
- `num_leaves`: maximum number of leaves (default: 50)
- `min_child_samples`: minimum samples per leaf (default: 20)
Evaluate F1:

```bash
python3 scripts/evaluate_f1.py \
  --model models/transaction_classifier \
  --test data/test.jsonl
```

Evaluate fairness:

```bash
python3 scripts/evaluate_bias.py \
  --model models/transaction_classifier \
  --test data/test.jsonl \
  --output reports/bias_report.json
```

Retrain with user corrections:

```bash
python3 scripts/retrain_with_corrections.py \
  --corrections data/corrections/corrections.jsonl \
  --model-path models/transaction_classifier

# Background auto-retraining
python3 scripts/feedback_learning.py
```

Access the metrics endpoint:

```bash
curl http://localhost:8000/metrics
```

Available metrics:

- `categorization_requests_total`: total requests by endpoint
- `categorization_latency_seconds`: latency histogram
- `method_usage_total`: usage by method (rule/ml/llm)
- `categorization_requires_review_total`: review rate
- `categorization_cache_events_total`: cache hits/misses
- `ensemble_agreement_ratio`: method agreement rate
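`/metrics` serves the standard Prometheus text exposition format, so a script can pull a single value without a Prometheus server by parsing the lines directly. A stdlib-only sketch (`read_metric` is a hypothetical helper; the metric and label names follow the list above):

```python
# Parse Prometheus text exposition lines like:
#   categorization_requests_total{endpoint="/categorize"} 42
import re

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][\w:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>[-0-9.eE+]+)$'
)

def read_metric(exposition: str, name: str, labels: dict = None):
    """Return the first sample value matching name (and labels, if given)."""
    wanted = labels or {}
    for line in exposition.splitlines():
        line = line.strip()
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        m = SAMPLE_RE.match(line)
        if not m or m.group("name") != name:
            continue
        pairs = dict(re.findall(r'(\w+)="([^"]*)"', m.group("labels") or ""))
        if all(pairs.get(k) == v for k, v in wanted.items()):
            return float(m.group("value"))
    return None

text = '''# HELP categorization_requests_total Total requests
# TYPE categorization_requests_total counter
categorization_requests_total{endpoint="/categorize"} 42
categorization_requests_total{endpoint="/batch-categorize"} 7
'''
```

In production you would point this at `http://localhost:8000/metrics` output rather than an inline string.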
- Access Grafana: http://localhost:4000
- Login: admin/admin
- Navigate to pre-configured dashboard: "Transaction AI Performance"
Dashboard Panels:
- Request Rate & Throughput
- P50/P95/P99 Latency
- Cache Hit Ratio
- Method Distribution
- Review Rate Trends
- Resource Usage (CPU, Memory)
Component-level health:

```bash
curl http://localhost:8000/health | jq
```

Response:

```json
{
  "status": "healthy",
  "components": {
    "router": "healthy",
    "normalizer": "healthy",
    "rule_categorizer": "healthy",
    "ml_classifier": "healthy",
    "llm_classifier": "healthy",
    "merchant_resolver": "healthy",
    "database": "healthy",
    "cache": "healthy"
  }
}
```

```
transaction-ai/
├── apps/
│   └── api/
│       └── main.py                  # FastAPI application (1,480 lines)
├── core/
│   ├── model/
│   │   ├── ensemble_router.py       # Ensemble voting system
│   │   ├── llm_classifier.py        # LLM categorization
│   │   ├── classifier.py            # ML classifier
│   │   ├── mcc_classifier.py        # MCC code classifier
│   │   └── router.py                # Hybrid router
│   ├── rules/
│   │   └── engine.py                # Rule-based categorization
│   ├── normalize/
│   │   └── normalizer.py            # Text normalization
│   ├── resolve/
│   │   └── resolver.py              # Merchant resolution
│   ├── parsers/
│   │   └── pdf_parser.py            # PDF bank statement parser
│   └── models.py                    # Pydantic models
├── data/
│   ├── taxonomy.yaml                # 29 category definitions
│   ├── gazetteer/
│   │   └── merchant_aliases.csv     # Merchant aliases (353+)
│   ├── train.jsonl                  # Training data (22,664)
│   ├── test.jsonl                   # Test data (5,600)
│   └── corrections/
│       └── corrections.jsonl        # User feedback
├── scripts/
│   ├── train.py                     # Training script
│   ├── evaluate_f1.py               # F1 evaluation
│   ├── evaluate_bias.py             # Fairness evaluation
│   └── feedback_learning.py         # Auto-retraining
├── ui/                              # Next.js dashboard
│   ├── app/
│   ├── components/
│   └── package.json
├── infra/
│   ├── docker-compose.yaml          # Multi-container orchestration
│   └── Dockerfile                   # API container
├── monitoring/
│   ├── prometheus.yml               # Metrics config
│   ├── grafana-dashboard.json       # Pre-built dashboard
│   └── alerts.yml                   # Alert rules
├── tests/                           # Test suite (15+ files)
├── models/                          # Trained models
├── docs/                            # Documentation
├── requirements.txt                 # Python dependencies
├── .env.example                     # Environment template
└── README.md                        # This file
```
```bash
# Clone repository
git clone https://github.com/Rahul1269227/transaction-ai
cd transaction-ai

# Create virtual environment
python3 -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install UI dependencies
cd ui && npm install && cd ..
```

Run services locally:

```bash
# Terminal 1: start database & cache
docker-compose up -d postgres redis

# Terminal 2: start API
MODEL_PATH=models/transaction_classifier \
  python3 -m uvicorn apps.api.main:app --reload --port 8000

# Terminal 3: start UI
cd ui && npm run dev
```

Run tests:

```bash
# All tests
pytest

# Specific test file
pytest tests/test_ensemble_router.py

# With coverage
pytest --cov=core --cov-report=html
```

We welcome contributions! Please see our Contributing Guide for details.
To add a new category:

1. Edit `data/taxonomy.yaml`:

```yaml
- name: "New Category"
  id: "new_category"
  description: "Description"
  keywords: ["keyword1", "keyword2"]
  patterns: ["(?i)pattern.*"]
```

2. Add training examples to `data/train.jsonl`:

```json
{"text": "Example transaction", "label": "new_category"}
```

3. Retrain the model:

```bash
python3 scripts/train.py
```

To add a new merchant, edit `data/gazetteer/merchant_aliases.csv`:

```csv
100,NEW_MERCHANT,"merchant,alias1,alias2",category,subcategory
```

Reload the API to apply the changes.
This project is licensed under the MIT License - see the LICENSE file for details.
- LightGBM - Microsoft's gradient boosting framework
- sentence-transformers - Hugging Face semantic embeddings
- Ollama - Local LLM inference
- FastAPI - Modern Python web framework
- Next.js - React framework for production
- Documentation: https://transaction-ai.readthedocs.io/en/latest/
- Issues: GitHub Issues
- Real-World Testing: See REAL_WORLD_TEST_RESULTS.md
- Mobile app (React Native)
- Real-time transaction streaming
- Multi-language support
- Custom category training UI
- Fraud detection integration
- Export to accounting software (QuickBooks, Xero)
- Smart budgeting recommendations
- Transaction deduplication
Built with ❤️ for accurate, private, and intelligent transaction categorization