[FEATURE] Prompt Engineering Layer for Robust LLM-Based Data Extraction #412
## 📝 Description
FireForm uses an LLM (via Ollama/Mistral) to convert unstructured incident descriptions into structured JSON.
In practice, the extraction is not always consistent. Some common issues I noticed:
- prompts are too generic in some cases
- some fields are missing or only partially extracted
- JSON output formatting can break
- ambiguous input text leads to inconsistent results
This affects the reliability of downstream processing.
## 💡 Rationale
Right now, most quality checks happen after extraction, in the validation layer.
Improving the prompt itself can catch errors earlier in the pipeline and make the output more consistent before it reaches validation.
## 🛠️ Proposed Solution
Add a small prompt engineering layer before calling the LLM.
This can include:
- structured prompt templates instead of raw prompts
- a few examples (few-shot) to guide extraction
- clearer instructions for expected JSON format
- field-level hints (e.g., expected formats for dates, IDs, etc.)
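The items above could be combined into a small template layer. A minimal sketch follows; the function name, field names, and the few-shot example are hypothetical placeholders, not FireForm's actual schema:

```python
# Hypothetical prompt-engineering layer: field hints + few-shot examples
# assembled into one structured prompt. Schema and examples are placeholders.

FIELD_HINTS = {
    "date": "ISO 8601, e.g. 2024-05-17",
    "incident_type": "one of: fire, smoke, false_alarm, other",
    "location": "free text, as specific as the report allows",
}

FEW_SHOT = [
    {
        "input": "Smoke reported at the Oak St warehouse on May 17th 2024.",
        "output": '{"date": "2024-05-17", "incident_type": "smoke", '
                  '"location": "Oak St warehouse"}',
    },
]

def build_extraction_prompt(description: str) -> str:
    """Assemble a structured prompt: instructions, field-level hints,
    few-shot examples, then the actual incident description."""
    hints = "\n".join(f"- {name}: {hint}" for name, hint in FIELD_HINTS.items())
    examples = "\n\n".join(
        f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in FEW_SHOT
    )
    return (
        "Extract the following fields from the incident description and "
        "respond with a single JSON object only, no extra text.\n"
        f"Fields:\n{hints}\n\n"
        f"Examples:\n{examples}\n\n"
        f"Input: {description}\nOutput:"
    )
```

The resulting string would then be passed to the existing Ollama call unchanged; if the version of the Ollama client in use supports it, setting `format="json"` on the request could additionally constrain the output shape.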
## ✅ Acceptance Criteria
- Prompt templates added and reusable
- Few-shot examples included
- JSON output parses successfully and matches the expected schema across varied inputs
- Measurable reduction in malformed or incomplete extractions
- Clean integration with existing extraction flow
## 📎 Additional Context
This focuses on improving extraction quality before validation, rather than changing the validation layer itself.