🚨 Improve Red Flags Detection with Enhanced LLM Prompt#4

Open
pranaya-mathur wants to merge 4 commits into main from improve-red-flags

@pranaya-mathur

Owner

Problem

The LLM was not consistently detecting and listing red flags in suspicious claims. Even when it identified issues in fraud_explanation, the red_flags array remained empty.

Solution

Enhanced LLM Prompt

Added an explicit red flag checklist with clear examples:

IMPORTANT - RED FLAGS to check:
- "delayed-report" - mentions filing later ("will file later", "baad mein")
- "no-police-report" - no police report filed ("no FIR", "nahi karwayi")
- "vague-details" - lacks specific information
- "short-narrative" - unusually brief (under 50 chars)
- "inconsistent-timeline" - events don't follow logic
- "excessive-claim" - disproportionate amount
- "suspicious-timing" - timing raises questions

Key Improvements:

  1. Specific Instructions - The prompt now tells the LLM exactly which flags to look for
  2. Bilingual Support - Includes Hindi patterns ("baad mein", roughly "later"; "nahi karwayi", roughly "did not get it filed")
  3. Fraud Risk Calibration - Rule: "If multiple red flags present, fraud_risk should be 0.4 or higher"
  4. Role Clarity - Changed the LLM's role from "claim evaluator" to "fraud detection expert"
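The calibration rule in item 3 is enforced through the prompt. As a minimal sketch of how the same rule could also be checked after the LLM responds, the snippet below normalizes a response dict. The names `calibrate`, `KNOWN_RED_FLAGS`, and `MULTI_FLAG_RISK_FLOOR` are hypothetical and not part of this PR. Only the `red_flags` and `fraud_risk` field names and the 0.4 threshold come from the description above.

```python
# Hypothetical post-processing guard; the PR itself applies this rule in the prompt.
KNOWN_RED_FLAGS = {
    "delayed-report",
    "no-police-report",
    "vague-details",
    "short-narrative",
    "inconsistent-timeline",
    "excessive-claim",
    "suspicious-timing",
}

# "If multiple red flags present, fraud_risk should be 0.4 or higher"
MULTI_FLAG_RISK_FLOOR = 0.4


def calibrate(result: dict) -> dict:
    """Return a copy of `result` with unknown flags dropped and the risk floor applied."""
    # Keep only flags from the checklist, de-duplicated in their original order.
    flags = list(dict.fromkeys(
        f for f in result.get("red_flags", []) if f in KNOWN_RED_FLAGS
    ))
    risk = float(result.get("fraud_risk", 0.0))
    # "Multiple" is interpreted here as two or more flags (an assumption).
    if len(flags) >= 2:
        risk = max(risk, MULTI_FLAG_RISK_FLOOR)
    return {**result, "red_flags": flags, "fraud_risk": risk}
```

A guard like this would make the rule hold even when the model ignores the instruction, at the cost of overriding the model's own score.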

Updated Fallback Heuristic:

Also improved the local_fallback() function to detect Hindi patterns:

  • Added "baad mein" detection
  • Added "nahi karwayi" detection
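The body of local_fallback() is not shown in this description, so the following is only a sketch of what keyword-based detection of these patterns might look like. The function name `fallback_red_flags` and the `PHRASE_FLAGS` table are hypothetical. The phrases and the 50-character threshold are taken from the checklist above.

```python
# Hypothetical keyword heuristic; not the actual local_fallback() implementation.
PHRASE_FLAGS = {
    "delayed-report": ("will file later", "baad mein"),
    "no-police-report": ("no fir", "nahi karwayi"),
}

SHORT_NARRATIVE_CHARS = 50  # "unusually brief (under 50 chars)"


def fallback_red_flags(narrative: str) -> list:
    """Return red flags found by simple case-insensitive substring matching."""
    text = narrative.lower()
    flags = [
        flag
        for flag, phrases in PHRASE_FLAGS.items()
        if any(phrase in text for phrase in phrases)
    ]
    if len(narrative.strip()) < SHORT_NARRATIVE_CHARS:
        flags.append("short-narrative")
    return flags


hindi = "Kal raat accident hua. FIR abhi tak nahi karwayi. Baad mein file kar dunga."
print(fallback_red_flags(hindi))  # ['delayed-report', 'no-police-report']
```

Plain substring matching cannot detect flags such as "vague-details" or "inconsistent-timeline", which is one reason a heuristic like this only makes sense as a fallback.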

Testing Required:

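Test Case 1 - Delayed Report (Hindi):

The narrative translates roughly to: "The accident happened last night. The FIR has not been filed yet. I will file it later."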

{
  "narrative": "Kal raat accident hua. FIR abhi tak nahi karwayi. Baad mein file kar dunga."
}

Expected:

  • red_flags: ["delayed-report", "no-police-report"]
  • fraud_risk: >= 0.4

Test Case 2 - Vague + Short:

{
  "narrative": "Accident. Will file later."
}

Expected:

  • red_flags: ["short-narrative", "delayed-report", "vague-details"]
  • fraud_risk: >= 0.5

Test Case 3 - Legitimate Claim:

{
  "narrative": "Car accident on highway. Police report filed immediately with FIR number 123. Full damage documentation attached."
}

Expected:

  • red_flags: []
  • fraud_risk: < 0.2
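One possible way to turn the three expectations above into automated checks is sketched below. `Expectation`, `EXPECTATIONS`, and `meets_expectation` are hypothetical names. The sketch assumes the evaluation result is a dict with `red_flags` and `fraud_risk` keys, and it compares flags as sets because the description does not say whether order matters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expectation:
    red_flags: frozenset
    risk_at_least: Optional[float] = None  # inclusive lower bound (">=")
    risk_below: Optional[float] = None     # exclusive upper bound ("<")


# Expected values copied from the three test cases in this description.
EXPECTATIONS = {
    "delayed_report_hindi": Expectation(
        frozenset({"delayed-report", "no-police-report"}), risk_at_least=0.4
    ),
    "vague_and_short": Expectation(
        frozenset({"short-narrative", "delayed-report", "vague-details"}),
        risk_at_least=0.5,
    ),
    "legitimate_claim": Expectation(frozenset(), risk_below=0.2),
}


def meets_expectation(result: dict, expected: Expectation) -> bool:
    """Check one evaluation result against the expected flags and risk bounds."""
    if set(result["red_flags"]) != expected.red_flags:
        return False
    risk = result["fraud_risk"]
    if expected.risk_at_least is not None and risk < expected.risk_at_least:
        return False
    if expected.risk_below is not None and risk >= expected.risk_below:
        return False
    return True
```

Because LLM output can vary between runs, checks like these may need to be run several times per case before the results are treated as validation.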

Impact:

  • ✅ More accurate fraud detection
  • ✅ Better transparency (users see specific red flags)
  • ✅ Consistent results across similar cases
  • ✅ Bilingual pattern recognition

Ready for testing - merge after validation!

- Removed .env file containing exposed GROQ_API_KEY
- Added .env.example template
- Created comprehensive .gitignore to prevent future credential leaks
…lity

- Increased ThreadPoolExecutor workers from 2 to 10
- Added rate limiting (slowapi): 60/min for evaluate, 20/min for batch
- Implemented proper LLM timeout enforcement using signal module
- Added batch size validation (max 100 claims)
- Converted file writes to async to avoid blocking
- Pinned all dependency versions
- Improved error handling and logging
- API version bumped to 0.2
- Added docker-compose.yml with API + Prometheus stack
- Improved Dockerfile with health checks and layer caching
- Created requirements-dev.txt with testing and dev tools
- Added .pre-commit-config.yaml for automated code quality
- Created .dockerignore to optimize Docker builds
- Added Makefile with common development commands
- Complete local development environment ready
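The commit notes above mention batch size validation with a maximum of 100 claims. As a rough, framework-independent sketch of such a check, the snippet below uses hypothetical names (`validate_batch`, `BatchTooLargeError`, `MAX_BATCH_SIZE`). How the actual API performs this validation is not shown in this PR.

```python
MAX_BATCH_SIZE = 100  # limit stated in the commit notes ("max 100 claims")


class BatchTooLargeError(ValueError):
    """Raised when a batch request carries more claims than allowed."""


def validate_batch(claims: list) -> list:
    """Return `claims` unchanged, or raise if the batch exceeds the limit."""
    if len(claims) > MAX_BATCH_SIZE:
        raise BatchTooLargeError(
            f"batch has {len(claims)} claims; maximum is {MAX_BATCH_SIZE}"
        )
    return claims
```

In a web API, an error like this would typically be translated into a 4xx response before any claim is sent to the LLM.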