Push keyword search down to Qdrant via a full-text payload index #72
Merged
keyword_search() previously transferred up to 5000 full payloads per query and substring-matched in Python; past that cap, older memories silently became unsearchable. Now a full-text index on the data field (created lazily and idempotently on first keyword search; outcome cached) lets Qdrant prefilter candidates server-side with MatchText, and the original substring check verifies them so semantics are unchanged. MatchText is token-based, so queries that only match inside a word (or yield no surviving candidates) transparently fall back to the legacy scan - recall never regresses, and stores on Qdrant versions that cannot create the index keep working via the scan path. Closes #64 https://claude.ai/code/session_01H2Dbh6kD8bseWZZEf7kGhx
Closes #64
What
`keyword_search()` previously pulled up to 5,000 full payloads out of Qdrant on every keyword query and substring-matched them in Python: cost grew linearly with store size, and memories beyond the scan cap silently became unsearchable. Matching is now pushed down to Qdrant in two stages:

1. A full-text index on the `data` field (created lazily and idempotently on first keyword search; the outcome is cached process-wide behind a lock) lets Qdrant return only points containing every query token via a `MatchText` filter, combined with the existing exact-match scoping (`user_id`, provenance filters).
2. The original substring check then verifies those candidates, so search semantics are unchanged.

Recall never regresses
`MatchText` is token-based, so three cases transparently fall back to the legacy full scan, among them a query that leaves no verified candidates (a mid-token fragment such as `hil` matching `Philips`, or non-contiguous tokens).

Worst case (a query with no matches at all) costs one indexed query plus one scan, the scan being the status-quo price. The typical whole-word query transfers only the matching points.
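The prefilter, verify, and fallback flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the PR's code: `STORE`, `tokens`, `indexed_candidates`, and `full_scan` are hypothetical stand-ins (the real prefilter is Qdrant's `MatchText` filter, not an in-memory loop), and case-insensitive substring matching is assumed.

```python
import re

# Hypothetical in-memory stand-in for the collection: point id -> `data` text.
STORE = {
    1: "Bought a Philips monitor",
    2: "The monitor stand is Philips branded",
    3: "Call the dentist on Monday",
}


def tokens(text: str) -> set[str]:
    """Lowercased word tokens, roughly what a word tokenizer would produce."""
    return set(re.findall(r"\w+", text.lower()))


def indexed_candidates(query: str) -> list[int]:
    """Stand-in for the token-based prefilter: every query token must appear."""
    wanted = tokens(query)
    return [pid for pid, text in STORE.items() if wanted <= tokens(text)]


def full_scan(query: str) -> list[int]:
    """Legacy path: substring-match every payload (assumed case-insensitive)."""
    needle = query.lower()
    return [pid for pid, text in STORE.items() if needle in text.lower()]


def keyword_search(query: str, index_ready: bool = True) -> tuple[list[int], str]:
    """Return (matching ids, path taken) so the chosen path is visible."""
    if not query.strip():
        return [], "empty"  # empty-query short-circuit
    if index_ready:
        try:
            candidates = indexed_candidates(query)
        except Exception:
            candidates = []  # an indexed-query failure falls through to the scan
        # Stage 2: the substring check verifies the token-level candidates.
        needle = query.lower()
        hits = [pid for pid in candidates if needle in STORE[pid].lower()]
        if hits:
            return hits, "index"
    # Index unavailable, or no candidate survived verification.
    return full_scan(query), "scan"
```

In this sketch, `keyword_search("philips monitor")` is answered from the index path, while `keyword_search("hil")` finds no token-level candidates and is answered by the scan, which is the mid-token case the description mentions.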
No new dependencies (`qdrant-client` was already pinned) and no settings changes; the index is created automatically with no operator action.

Tests
8 new tests in `tests/test_memory.py` (index created once and with the right schema, `MatchText` + `MatchValue` filter composition, full scan skipped when the index hits, token-but-not-substring rejection falls back, mid-token fragment recall via fallback, creation-failure caching, indexed-query failure fallback, empty-query short-circuit) plus an autouse conftest fixture isolating the index-state cache between tests. Existing keyword tests now exercise the scan path explicitly. Full suite: 174 passed, `ruff` clean.

Review
Adversarial review surfaced 8 candidates; 2 were fixed (a lock around the index-state cache so racing first searches can't clobber a successful creation with a transient failure's `False`; an explicit `scroll` stub in the REST keyword test instead of relying on a MagicMock unpack failure). Case-handling concerns are covered by Qdrant tokenizing queries with the index's `lowercase=true` config plus the substring fallback.

https://claude.ai/code/session_01H2Dbh6kD8bseWZZEf7kGhx
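One way the locked index-state cache from the review could look is sketched below. The names `ensure_index`, `reset_index_state`, and the module-level variables are hypothetical, and holding the lock for the whole creation attempt is an assumption about the design, not something the PR text confirms.

```python
import threading
from typing import Callable, Optional

_index_lock = threading.Lock()
# None = creation not attempted yet; True/False = cached outcome.
_index_ready: Optional[bool] = None


def ensure_index(create: Callable[[], None]) -> bool:
    """Attempt index creation at most once per process and cache the outcome."""
    global _index_ready
    with _index_lock:
        # A racing second caller blocks here, then sees the cached outcome
        # instead of overwriting a success with its own transient failure.
        if _index_ready is None:
            try:
                create()
                _index_ready = True
            except Exception:
                _index_ready = False  # failure is cached too; callers use the scan
        return _index_ready


def reset_index_state() -> None:
    """Clear the cache, e.g. from a test fixture that isolates state."""
    global _index_ready
    with _index_lock:
        _index_ready = None
```

Serializing the first attempt costs a brief wait for concurrent first searches, in exchange for a single, stable cached answer.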
Generated by Claude Code