- Per-Sender Isolated Memory: Each sender has an independent memory store, which prevents cross-user context leakage and keeps conversation tracking clean.
- Dual Memory Architecture: Combines `InMemoryChatMessageHistory` (full conversational context for the LLM) with a lightweight structured history log (category, action, subject) for fast system-level checks.
- Context Injection Strategy: When a new email arrives, the agent builds a composite context string (chat history + system summary) and injects it into the LLM prompt to enable follow-up awareness.
- Action-Aware Metadata Tracking: The lightweight history log enables duplicate detection, escalation stickiness, and quick retrieval of the last handled category without querying the full chat memory.
- Structured Confidence Scoring: The LLM outputs a confidence score (0.0–1.0) that is validated against a Pydantic schema, so routing decisions rest on a checked numeric value rather than heuristic guesswork.
- Threshold-Based Automation Control: Emails with confidence ≥ 0.75 are eligible for auto-reply; anything below triggers escalation to avoid risky automation.
- Confidence as Risk Guardrail: A low-confidence classification routes the email to a human, so ambiguous or unclear messages are handled by people rather than answered automatically.
- Multi-Layer Escalation Checks: Escalation occurs if confidence is low, if a high-risk keyword is detected (`fraud`, `hack`, `sue`, `legal`, `lawyer`), or if the sender was previously escalated.
- Sticky Escalation Rule: Once a sender is escalated, all future emails from that sender in the session are automatically escalated for consistent handling.
- LLM-Bypass for Duplicates: Duplicate follow-ups skip LLM analysis entirely and receive a system-generated acknowledgment, improving efficiency and reducing API cost.
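
The routing rules listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code. The names `Router`, `SenderState`, `Analysis`, `normalize`, and the `analyze` callable are hypothetical. A standard-library dataclass stands in for the Pydantic schema, and a plain list of strings stands in for `InMemoryChatMessageHistory`. Treating a repeated subject (ignoring `Re:` prefixes) as a duplicate, matching keywords on word boundaries, and running the duplicate check before the escalation checks are all guesses, since the list does not specify them.

```python
import re
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.75  # from the list: >= 0.75 is eligible for auto-reply
# Assumption: keywords match whole words, so "issue" does not trigger "sue".
RISK_PATTERN = re.compile(r"\b(fraud|hack|sue|legal|lawyer)\b", re.IGNORECASE)


@dataclass
class Analysis:
    """Stand-in for the schema-validated LLM output."""
    category: str
    confidence: float

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


@dataclass
class SenderState:
    """Per-sender memory: full message history plus the lightweight log."""
    messages: list = field(default_factory=list)  # stand-in for chat history
    log: list = field(default_factory=list)       # (category, action, subject)
    escalated: bool = False


def normalize(subject):
    """Lowercase and drop leading 'Re:' prefixes (assumed duplicate key)."""
    s = subject.strip().lower()
    while s.startswith("re:"):
        s = s[3:].strip()
    return s


class Router:
    def __init__(self, analyze):
        self.analyze = analyze  # hypothetical callable(context, body) -> Analysis
        self.senders = {}       # sender address -> SenderState

    def build_context(self, state):
        """Composite context: chat history plus a one-line system summary."""
        if state.log:
            category, action, _ = state.log[-1]
            summary = f"last category={category}, last action={action}"
        else:
            summary = "no prior emails"
        return "\n".join(state.messages + [f"[system] {summary}"])

    def handle(self, sender, subject, body):
        state = self.senders.setdefault(sender, SenderState())
        key = normalize(subject)

        # Duplicate follow-up: answer from the log without calling the LLM.
        match = next((e for e in state.log if e[2] == key), None)
        if match is not None:
            state.log.append((match[0], "duplicate_ack", key))
            return "duplicate_ack"

        result = self.analyze(self.build_context(state), body)
        escalate = (
            result.confidence < CONFIDENCE_THRESHOLD
            or RISK_PATTERN.search(f"{subject} {body}") is not None
            or state.escalated  # sticky: a previously escalated sender stays escalated
        )
        action = "escalate" if escalate else "auto_reply"
        state.escalated = state.escalated or escalate
        state.messages.append(f"{sender}: {body}")
        state.log.append((result.category, action, key))
        return action
```

In this sketch `handle` returns one of `"auto_reply"`, `"escalate"`, or `"duplicate_ack"`. Because each sender gets its own `SenderState`, one sender's escalation flag or history never affects another's.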