This directory contains evaluation scripts for testing AXIA API data extraction performance on different datasets.
The SROIE 2019 script (eval_sroie2019) evaluates receipt data extraction using the SROIE 2019 dataset.
Dataset: SROIE2019 receipt images and expected outputs
API Endpoint: /api/extract/SROIEReceipt
Evaluators:
- CompanyEvaluator: String similarity matching for company names
- AddressEvaluator: String similarity matching for addresses
- DateEvaluator: Binary matching with DD/MM swap handling for dates
- TotalEvaluator: Numerical similarity for receipt totals
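The date and total evaluators above can be sketched roughly as follows. This is a minimal illustration, not the scripts' actual code: the function names `parse_date`, `date_score`, and `total_score`, the list of accepted date formats, and the relative-error formula for totals are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical list of accepted formats; the real evaluator may differ.
DATE_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%Y-%m-%d")


def parse_date(text: str):
    """Return a date for the first format that parses, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def date_score(expected: str, actual: str) -> float:
    """Binary match: 1.0 if the dates agree directly or with day and month swapped."""
    exp, act = parse_date(expected), parse_date(actual)
    if exp is None or act is None:
        return 0.0
    if exp == act:
        return 1.0
    # A swap is only a valid date when the day could also be a month.
    if act.day <= 12:
        swapped = act.replace(day=act.month, month=act.day)
        if swapped == exp:
            return 1.0
    return 0.0


def total_score(expected: float, actual: float) -> float:
    """Numerical similarity in [0, 1]: one minus the relative error, floored at 0."""
    if expected == actual:
        return 1.0
    denom = max(abs(expected), abs(actual))  # non-zero here, since the values differ
    return max(0.0, 1.0 - abs(expected - actual) / denom)
```

With this sketch, an annotation of `03/04/2018` and an extracted `04/03/2018` count as a match, while `24/12/2018` against `25/12/2018` does not.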
The handwriting script (eval_handwriting) evaluates handwritten name extraction from the handwriting recognition dataset.
Dataset: HANDWRITING images with expected name labels
API Endpoint: /api/extract/Name
Evaluators:
- SimilarityEvaluator: String similarity matching for extracted names
- ExactEvaluator: Binary exact matching for extracted names
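The difference between the two name evaluators above can be illustrated with the standard library. This is a sketch under assumptions: `similarity_score`, `exact_score`, and `normalize` are hypothetical names, the use of `difflib.SequenceMatcher` as the similarity measure is a stand-in, and the case-insensitive normalization is an assumption that the real evaluators may not apply.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case (an assumed normalization)."""
    return " ".join(name.split()).casefold()


def similarity_score(expected: str, actual: str) -> float:
    """Graded score in [0, 1]; partial credit for near misses."""
    return SequenceMatcher(None, normalize(expected), normalize(actual)).ratio()


def exact_score(expected: str, actual: str) -> float:
    """Binary score: 1.0 only when the normalized names are identical."""
    return 1.0 if normalize(expected) == normalize(actual) else 0.0
```

A single misread letter, such as `MARTIM` for `MARTIN`, scores 0.0 on the exact measure but stays close to 1.0 on the similarity measure, which is why reporting both can be useful.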
- Create and activate a virtual environment:

      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Set the AXIA API key environment variable:

      export AXIA_API_KEY=your_api_key_here

- Ensure datasets are available:
  - SROIE2019: SROIE2019/cases.yaml and corresponding image/text files
  - Handwriting: HANDWRITING/written_name_test_short.csv and test images
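Before running an evaluation, it can help to confirm that the setup above is complete. The following is a minimal sketch of such a check, not part of the evaluation scripts: the `preflight` helper is hypothetical, and it only looks for the environment variable and the two dataset index files named above, not the individual images.

```python
import os
from pathlib import Path

# Dataset index files named in the setup steps.
REQUIRED_FILES = (
    Path("SROIE2019/cases.yaml"),
    Path("HANDWRITING/written_name_test_short.csv"),
)


def preflight(root: Path = Path("."), env=os.environ) -> list[str]:
    """Return a list of human-readable setup problems (empty when all is well)."""
    problems = []
    if not env.get("AXIA_API_KEY"):
        problems.append("AXIA_API_KEY is not set")
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing dataset file: {rel}")
    return problems
```

Calling `preflight()` from the directory that contains the datasets returns an empty list when the API key and both files are present.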
Run the SROIE 2019 evaluation:

    python -m eval_sroie2019

Run the handwriting evaluation:

    python -m eval_handwriting

Both scripts support concurrent processing (20 concurrent requests) and provide detailed evaluation reports including input/output comparisons and timing information.
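One common way to cap the number of in-flight requests at 20 while recording timing per case is an `asyncio.Semaphore`. The sketch below illustrates that pattern only; whether the scripts use asyncio is an assumption, and `fake_extract`, `run_all`, and the result dictionary layout are hypothetical names that stand in for the real API call and report format.

```python
import asyncio
import time

MAX_CONCURRENCY = 20  # the concurrency level stated above


async def fake_extract(image_path: str) -> dict:
    """Hypothetical stand-in for the real extraction request."""
    await asyncio.sleep(0.01)
    return {"image": image_path}


async def run_all(image_paths, extract=fake_extract, limit=MAX_CONCURRENCY):
    """Process every path, never running more than `limit` calls at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(path):
        async with semaphore:
            started = time.perf_counter()
            try:
                output, error = await extract(path), None
            except Exception as exc:  # record the failure and keep going
                output, error = None, str(exc)
            return {
                "input": path,
                "output": output,
                "error": error,
                "seconds": time.perf_counter() - started,
            }

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in image_paths))
```

For example, `asyncio.run(run_all(paths))` returns one record per image, so a failed request shows up as a record with an `error` string instead of aborting the whole run.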
- Concurrent Processing: Both scripts process multiple images simultaneously (20 concurrent requests)
- Robust Date Handling: SROIE evaluator handles DD/MM swapping in annotations
- Detailed Reporting: Comprehensive evaluation reports with timing and comparison data
- Error Handling: Graceful handling of API errors and missing data