[FEATURE] Prompt Engineering Layer for Robust LLM-Based Data Extraction #412

@Lochit-Vinay

📝 Description

FireForm uses an LLM (via Ollama/Mistral) to convert unstructured incident descriptions into structured JSON.
In practice, the extraction is not always consistent. Some common issues I noticed:

  • prompts are too generic in some cases
  • some fields are missing or only partially extracted
  • JSON output formatting can break (e.g., stray prose or markdown fences around the object)
  • ambiguous input text leads to inconsistent results

This affects the reliability of downstream processing.
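To illustrate the JSON-formatting failure mode, here is a minimal sketch (not FireForm's actual code; `parse_llm_json` is a hypothetical helper) of a tolerant parser that recovers a JSON object when the model wraps it in a markdown fence or surrounds it with prose:

```python
import json
import re

def parse_llm_json(raw: str) -> dict:
    """Best-effort parse of an LLM response that should contain one JSON object.

    Handles two common failure modes: the model wrapping the JSON in a
    markdown code fence, and the model adding prose before/after the object.
    """
    # Strip markdown code fences if present
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    # Fall back to the outermost {...} span if extra prose surrounds it
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no JSON object found in LLM output")
    return json.loads(cleaned[start:end + 1])
```

Even with better prompts, a fallback like this catches the residual cases; the prompt layer's job is to make it fire rarely.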


💡 Rationale

Right now, most checks happen after extraction, in the validation layer.
Improving the prompt itself reduces errors earlier in the pipeline and makes the output more consistent before it even reaches validation.


🛠️ Proposed Solution

Add a small prompt engineering layer before calling the LLM.
This can include:

  • structured prompt templates instead of raw prompts
  • a few examples (few-shot) to guide extraction
  • clearer instructions for expected JSON format
  • field-level hints (e.g., expected formats for dates and IDs)
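The pieces above could be combined roughly as follows. This is only a sketch: the schema fields (`date`, `location`, `severity`), the example incidents, and the `build_extraction_prompt` name are illustrative assumptions, not FireForm's actual API.

```python
import json

# Clear instructions for the expected JSON format, with field-level hints.
SYSTEM_INSTRUCTIONS = (
    "You are a data extraction assistant. Return ONLY a single JSON object "
    "with exactly these keys: date (YYYY-MM-DD), location (string), "
    "severity (one of: low, medium, high). Use null for missing fields. "
    "Do not add commentary or markdown fences."
)

# A few worked examples (few-shot) to guide extraction toward the schema.
FEW_SHOT = [
    ("A small brush fire near Oak Ridge on 3rd May 2021, quickly contained.",
     {"date": "2021-05-03", "location": "Oak Ridge", "severity": "low"}),
    ("Major warehouse blaze downtown last night; date unknown.",
     {"date": None, "location": "downtown", "severity": "high"}),
]

def build_extraction_prompt(incident_text: str) -> str:
    """Assemble a structured prompt: instructions, few-shot pairs, then input."""
    parts = [SYSTEM_INSTRUCTIONS, ""]
    for text, expected in FEW_SHOT:
        parts.append(f"Input: {text}")
        parts.append(f"Output: {json.dumps(expected)}")
        parts.append("")
    parts.append(f"Input: {incident_text}")
    parts.append("Output:")
    return "\n".join(parts)
```

Keeping the template and examples in one module makes them reusable across call sites, and the few-shot pairs double as documentation of the expected schema.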

✅ Acceptance Criteria

  • Prompt templates added and reusable
  • Few-shot examples included
  • JSON output is more consistent across inputs
  • Fewer malformed or incomplete outputs observed
  • Clean integration with existing extraction flow

📎 Additional Context

This focuses on improving extraction quality before validation, rather than changing the validation layer itself.
