[FEAT] Implement Batch LLM Extraction for 10x Performance Gain #391
Is your feature request related to a problem?
The current LLM extraction logic makes sequential HTTP requests to Ollama for every single field in a PDF. On CPU-based systems, this is extremely slow (~1-2 minutes per field), making the app difficult to use for standard forms with 5+ fields.
Describe the solution you'd like
I propose a "Batch Extraction" engine that:
- Consolidates all field requests into a single, structured prompt.
- Utilizes Ollama's `format: "json"` mode for structured output.
- Extracts all data in a single LLM call.
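A minimal sketch of what the batch call could look like, using Ollama's `/api/generate` endpoint with `format: "json"`. The field names, model name, and endpoint URL below are illustrative assumptions, not taken from the issue:

```python
import json
import urllib.request


def build_batch_prompt(fields, document_text):
    """Consolidate all field requests into one structured prompt."""
    # Describe the expected JSON shape so the model returns every field at once.
    schema = ", ".join(f'"{f}": "<value or null>"' for f in fields)
    return (
        "Extract the following fields from the document below. "
        f"Respond with a single JSON object of the form {{{schema}}}.\n\n"
        f"Document:\n{document_text}"
    )


def extract_all_fields(fields, document_text, model="llama3",
                       url="http://localhost:11434/api/generate"):
    """One LLM call for all fields instead of one call per field."""
    payload = {
        "model": model,
        "prompt": build_batch_prompt(fields, document_text),
        "format": "json",   # Ollama's structured-output mode
        "stream": False,    # return the full response in one body
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    # The model's JSON answer arrives as a string in the "response" key.
    return json.loads(body["response"])
```

With this shape, adding more fields grows the prompt, not the number of round trips, which is where the speedup on CPU-bound setups would come from.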
Describe alternatives you've considered
Keeping the sequential model, but it is too slow for production use on standard hardware.
Additional context
Preliminary testing shows that this reduces extraction time from ~10 minutes down to ~1 minute for a standard form (a 10x speedup). This also improves contextual accuracy as the LLM sees all fields at once.