Run find_contradictions automatically inside recall and surface debug ranking #50

@ramonlimaramos

Problem

Two hygiene gaps:

1. find_contradictions() is a tool nobody calls

The tool exists, but agents rarely invoke it proactively. Over weeks, memories can drift and contradict each other (e.g. 'use library X' vs 'we migrated off library X to Y'). The contradiction surfaces only when the user notices the agent applying the stale rule.

2. recall ranking is opaque

recall() returns each memory with its score, decay, and trust, but gives the agent no insight into WHY one memory ranked above another. When two contradictory memories both score ~0.1, the agent can pick the wrong one without knowing it.

Proposal

Inline contradiction detection

recall() runs a lightweight contradiction check on its result set (top-K memories) and surfaces conflicts inline:

[working] (project) score=0.11 decay=1.00 trust=0.50
  Use library X for date formatting.
  id=aaa-...

[working] (project) score=0.10 decay=1.00 trust=0.50
  We migrated off library X to Y on 2026-04-15.
  id=bbb-...

⚠️ Contradiction detected between aaa-... and bbb-...
   Prefer the newer memory (bbb-...) or call resolve_contradictions(aaa, bbb) to pick one.
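
A minimal sketch of how the inline check could work on the top-K result set, assuming a pairwise scan. The `Memory` and `Conflict` types, the `created` field, and the injected `contradicts` predicate are all hypothetical names for illustration; the issue does not say how find_contradictions() decides that two memories conflict, so that logic is passed in rather than guessed.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Memory:
    # Hypothetical record shape; the real memory object may differ.
    id: str
    text: str
    created: date


@dataclass(frozen=True)
class Conflict:
    older: Memory
    newer: Memory

    def render(self) -> str:
        # Mirrors the warning format proposed above.
        return (
            f"⚠️ Contradiction detected between {self.older.id} and {self.newer.id}\n"
            f"   Prefer the newer memory ({self.newer.id}) or call "
            f"resolve_contradictions({self.older.id}, {self.newer.id}) to pick one."
        )


def find_conflicts(
    top_k: Sequence[Memory],
    contradicts: Callable[[Memory, Memory], bool],
) -> List[Conflict]:
    """Check every pair in the result set; K is small, so O(K^2) is acceptable."""
    conflicts = []
    for a, b in combinations(top_k, 2):
        if contradicts(a, b):
            # Order each pair by creation date so the newer one can be preferred.
            older, newer = sorted((a, b), key=lambda m: m.created)
            conflicts.append(Conflict(older, newer))
    return conflicts
```

recall() would then append `conflict.render()` for each returned conflict after the normal result listing. Keeping the predicate injectable means the existing find_contradictions() logic could be reused here without recall() knowing its internals.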

Optional debug mode for ranking

recall(query, debug=True) includes, for each memory:

score: 0.115
breakdown:
  semantic_similarity: 0.81 (cos)
  keyword_overlap: 0.42 (BM25)
  hrr_match: 0.55
  decay_factor: 1.00
  trust_factor: 0.50
final = semantic*w1 + keyword*w2 + hrr*w3 - age_penalty

The agent (and the user inspecting traces) can see why a result ranked higher and trust the ordering.
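
As a rough illustration, the breakdown could be carried in a small record that both computes the final score and renders the debug text. This sketch follows the formula exactly as written above; the `Weights` and `Breakdown` names and the default weight values are placeholders, since the issue does not specify w1..w3. The formula also does not show how decay_factor and trust_factor enter the score, so here they are reported but not applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    # Placeholder values; the issue does not define w1..w3.
    w1: float = 0.5
    w2: float = 0.3
    w3: float = 0.2


@dataclass(frozen=True)
class Breakdown:
    semantic_similarity: float
    keyword_overlap: float
    hrr_match: float
    age_penalty: float = 0.0
    # Reported only: the formula in the issue does not use these two.
    decay_factor: float = 1.0
    trust_factor: float = 1.0

    def final(self, w: Weights = Weights()) -> float:
        # final = semantic*w1 + keyword*w2 + hrr*w3 - age_penalty
        return (
            self.semantic_similarity * w.w1
            + self.keyword_overlap * w.w2
            + self.hrr_match * w.w3
            - self.age_penalty
        )

    def render(self, w: Weights = Weights()) -> str:
        # Produce the per-memory debug text in the layout proposed above.
        return "\n".join([
            f"score: {self.final(w):.3f}",
            "breakdown:",
            f"  semantic_similarity: {self.semantic_similarity:.2f} (cos)",
            f"  keyword_overlap: {self.keyword_overlap:.2f} (BM25)",
            f"  hrr_match: {self.hrr_match:.2f}",
            f"  decay_factor: {self.decay_factor:.2f}",
            f"  trust_factor: {self.trust_factor:.2f}",
        ])
```

With debug=False, recall() would skip `render()` and return only the final score, so the extra output costs nothing in the default path.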

Why this matters

Memories accumulate. Without proactive hygiene, the same human-authored rule can end up stored in 3 versions over 6 months, and the agent is left guessing which one applies.

Acceptance

  • recall() flags contradicting pairs in its result set with their ids.
  • recall(debug=True) returns the ranking breakdown for each result.


Labels: enhancement (New feature or request)
