[FEAT]: Department Profile System for Pre-Mapped PDF Templates #206
📝 Description
FireForm currently extracts PDF field names as machine-generated identifiers (e.g. textbox_0_0, textbox_0_1). When these are sent to Mistral for extraction, the model has no semantic context and either returns null for all fields or hallucinates a single value repeated across unrelated fields (see related Bug #173).
A Department Profile system would ship pre-built mappings between human-readable field labels and the internal PDF field identifiers for common agency forms used by Fire Departments, Police, and EMS.
💡 Rationale
FireForm's mission is to serve real first responders out of the box. Currently:
- A firefighter uploads a Cal Fire incident form
- Mistral receives {"textbox_0_0": "", "textbox_0_1": ""}
- It has no idea what these fields mean → returns null or wrong values
- The filled PDF is blank or incorrect
With department profiles:
- The profile provides {"Officer Name": "textbox_0_0", "Incident Location": "textbox_0_1"}
- Mistral receives human-readable labels → extracts correctly
- The filled PDF is accurate
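The mapping above implies a translation step after the model responds: values come back keyed by label and have to be rekeyed by PDF field ID before the form is filled. A minimal Python sketch, assuming the model returns a flat JSON object keyed by the profile's labels. `to_pdf_fields` is a hypothetical helper for illustration, not existing FireForm code:

```python
from typing import Dict, Optional


def to_pdf_fields(
    label_to_field: Dict[str, str],
    extracted: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Rekey label-keyed extraction results by internal PDF field ID.

    Labels that are missing from the extraction, or extracted as None,
    become empty strings so the PDF field stays blank instead of guessed.
    """
    filled = {}
    for label, field_id in label_to_field.items():
        value = extracted.get(label)
        filled[field_id] = "" if value is None else str(value)
    return filled


# Hypothetical data mirroring the mapping shown above.
profile_fields = {"Officer Name": "textbox_0_0", "Incident Location": "textbox_0_1"}
model_output = {"Officer Name": "Smith", "Incident Location": None}
pdf_values = to_pdf_fields(profile_fields, model_output)
```

Iterating over the profile rather than the model output means any extra keys the model invents are silently ignored, and every mapped PDF field gets an explicit value.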
This addresses the root cause of Issue #173 without changing the model or the rest of the LLM pipeline; only the prompt construction needs to use the profile labels.
🛠️ Proposed Solution
- Create a src/profiles/ directory with JSON profile files
- Each profile maps human-readable field labels → internal PDF field IDs
- Add a profile selector to the frontend UI (dropdown by department type)
- Pass the field label mapping to the LLM prompt during extraction
Profile schema:
{
"department": "Fire Department",
"description": "Standard Cal Fire incident report",
"fields": {
"Officer Name": "textbox_0_0",
"Badge Number": "textbox_0_1",
"Incident Location": "textbox_0_2",
"Incident Date": "textbox_0_3",
"Number of Victims": "textbox_0_4"
},
"example_transcript": "Officer Smith, badge 4421, responding to structure fire at 742 Evergreen Terrace on March 8th. Two victims on scene."
}

Profiles to implement:
- fire_department.json: Cal Fire incident report
- police_report.json: Standard police incident form
- ems_medical.json: EMS patient care report
- Logic change in src/llm.py to use profile labels in the prompt
- Frontend dropdown to select the department profile
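As a rough illustration of the src/llm.py change, the sketch below loads a profile and builds a prompt that exposes only the human-readable labels to the model. The function names `load_profile` and `build_prompt`, the prompt wording, and the default directory are assumptions made for this example; the actual structure of src/llm.py is not described in this issue.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Assumed location, taken from the proposal above.
PROFILE_DIR = Path("src/profiles")


def load_profile(name: str, profile_dir: Path = PROFILE_DIR) -> Dict[str, Any]:
    """Read <profile_dir>/<name>.json and return it as a dict."""
    with open(profile_dir / f"{name}.json", encoding="utf-8") as fh:
        return json.load(fh)


def build_prompt(profile: Dict[str, Any], transcript: str) -> str:
    """Build an extraction prompt that lists human-readable labels only.

    The internal PDF field IDs are deliberately left out of the prompt.
    """
    labels = "\n".join(f"- {label}" for label in profile["fields"])
    return (
        f"Form: {profile['description']}\n"
        "Extract a value for each field below from the transcript.\n"
        "Reply with a JSON object keyed by the field labels; "
        "use null when the transcript does not mention a field.\n"
        f"Fields:\n{labels}\n"
        f"Transcript:\n{transcript}\n"
    )
```

A call such as `build_prompt(load_profile("fire_department"), transcript)` would then replace the current prompt built from raw field identifiers.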
✅ Acceptance Criteria
- At least 3 department profiles ship with the repo
- Profile labels are injected into the Mistral prompt
- Extraction accuracy improves for pre-mapped forms (no null output)
- Feature works in Docker container
- Documentation updated in docs/
- JSON output validates against the schema
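One way to check the shipped profiles against the profile schema, sketched with the standard library only. The required keys follow the schema example in this issue; treating "example_transcript" as optional and rejecting duplicate field IDs are assumptions of this sketch, and `profile_errors` is a hypothetical name.

```python
from typing import Any, List

# Keys assumed to be mandatory, based on the schema example in this issue.
REQUIRED_KEYS = ("department", "description", "fields")


def profile_errors(profile: Any) -> List[str]:
    """Return a list of problems found in a profile; empty means valid."""
    if not isinstance(profile, dict):
        return ["profile must be a JSON object"]
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in profile]
    if "fields" not in profile:
        return errors
    fields = profile["fields"]
    if not isinstance(fields, dict) or not fields:
        errors.append("'fields' must be a non-empty object")
        return errors
    seen = {}  # field ID -> first label that used it
    for label, field_id in fields.items():
        if not isinstance(field_id, str) or not field_id:
            errors.append(f"field ID for {label!r} must be a non-empty string")
        elif field_id in seen:
            errors.append(f"{label!r} and {seen[field_id]!r} share field ID {field_id!r}")
        else:
            seen[field_id] = label
    return errors
```

Running this over every file in the profiles directory in a test would catch malformed profiles before they reach the Docker image.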
📌 Additional Context
Related bugs this directly addresses: #173 (PDF filler hallucinates repeating values)
Related features this complements: #111 (Field Mapping Wizard — for custom PDFs not covered by profiles)
This is especially important for FireForm's stated mission as a UN Digital Public Good — the system should work correctly for real first responders without requiring technical setup.