feat: memory retrieval performance improvements (SPEC-2025-12-27-002) by zircote · Pull Request #35 · zircote/git-notes-memory

zircote · 2025-12-27T21:38:16Z

Summary

Improve memory retrieval accuracy from 65% (13/20) to 90%+ (18/20) through:

- Hybrid Search: Combine BM25 and vector search using Reciprocal Rank Fusion (RRF)
- Entity Indexing: Extract and index named entities (PERSON, PROJECT, TECHNOLOGY, FILE)
- Temporal Indexing: Parse and normalize dates for time-based queries
- Query Expansion: LLM-powered query enhancement (opt-in)
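For readers unfamiliar with RRF, the following is a minimal sketch of weighted Reciprocal Rank Fusion, where each source contributes `weight / (k + rank)` for every item it returns. It is not the code in `rrf_fusion.py`: the function name `rrf_fuse`, its arguments, and the return shape are assumptions made for illustration. Only the `k=60` default, the configurable source weights, and the per-source contribution tracking are taken from the commit notes in this PR.

```python
from collections import defaultdict


def rrf_fuse(rankings, weights=None, k=60):
    """Fuse ranked ID lists with weighted Reciprocal Rank Fusion.

    rankings: dict of source name -> list of item IDs, best first
              (each ID is assumed to appear at most once per source).
    weights:  optional dict of source name -> weight (missing sources get 1.0).
    Returns a list of (item_id, score, contributions), best first.
    """
    weights = weights or {}
    scores = defaultdict(float)
    contributions = defaultdict(dict)
    for source, ranked_ids in rankings.items():
        weight = weights.get(source, 1.0)
        # Ranks are 1-indexed, so the top hit contributes weight / (k + 1).
        for rank, item_id in enumerate(ranked_ids, start=1):
            part = weight / (k + rank)
            scores[item_id] += part
            contributions[item_id][source] = part
    # Sort by descending score; break ties by ID so the order is deterministic.
    ordered = sorted(scores, key=lambda i: (-scores[i], str(i)))
    return [(i, scores[i], contributions[i]) for i in ordered]


# Hypothetical result lists from a BM25 search and a vector search.
fused = rrf_fuse({"bm25": ["m1", "m2", "m3"], "vector": ["m2", "m4", "m1"]})
for item_id, score, parts in fused:
    print(item_id, round(score, 5), sorted(parts))
```

Items found by both sources accumulate two contributions, which is why a memory ranked second and first can outrank one ranked first and third.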

Implementation Plan

| Phase | Focus | Tasks |
|-------|-------|-------|
| Phase 1 | Foundation (Schema v5, RRF) | 4 |
| Phase 2 | Hybrid Search (BM25 + Vector) | 4 |
| Phase 3 | Entity Indexing (NER + boost) | 6 |
| Phase 4 | Temporal Indexing (dateparser) | 5 |
| Phase 5 | Query Expansion (LLM) | 4 |

Total: 21 tasks, 10 ADRs

Test Plan

- Unit tests for RRF fusion algorithm
- Unit tests for EntityExtractor (regex + spaCy)
- Unit tests for TemporalExtractor (dateparser)
- Integration tests for HybridSearchEngine
- Benchmark harness validation (target: 90%+)

Add comprehensive specification for improving memory retrieval accuracy from 65% to 90%+ through hybrid search, entity indexing, temporal indexing, and LLM-powered query expansion.

Key documents:
- REQUIREMENTS.md: 4 P0, 4 P1, 3 P2 requirements
- ARCHITECTURE.md: 5 new components, schema v5
- IMPLEMENTATION_PLAN.md: 5 phases, 21 tasks
- DECISIONS.md: 10 ADRs including RRF, FTS5 BM25, spaCy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Phase 1 - Foundation complete:

Schema v5 Migration:
- Add entities table for named entity storage
- Add memory_entities junction table for entity-memory mapping
- Add temporal_refs table for date references
- Update migration logic to run on new databases

RRF Fusion Engine (src/git_notes_memory/index/rrf_fusion.py):
- Implement Reciprocal Rank Fusion algorithm (k=60 default)
- Support configurable source weights
- Track source contributions per result
- 28 unit tests for edge cases and score calculations

HybridSearchConfig (src/git_notes_memory/retrieval/config.py):
- Frozen dataclass with all hybrid search settings
- Environment variable loading with sensible defaults
- Integration with RRFConfig for weight extraction
- 23 unit tests for config loading

Retrieval Module Scaffold:
- New retrieval/ module with lazy imports
- Factory function for config singleton

Tests: 141 passing (90 index + 28 RRF + 23 config)

Part of SPEC-2025-12-27-002: Memory Retrieval Performance Improvements
Target: Improve benchmark accuracy from 65% to 90%+

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
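To illustrate the "frozen dataclass with environment variable loading" pattern described for HybridSearchConfig, here is a rough sketch. It is not the contents of `config.py`: the class name `HybridSearchConfigSketch`, the field names, the weight defaults of 1.0, and every `HYBRID_SEARCH_*` environment variable name are hypothetical. Only the frozen dataclass, the three search modes, the `SearchMode` cast, and the `k=60` default are mentioned in this PR.

```python
import os
from dataclasses import dataclass
from typing import Dict, Literal, Mapping, Optional, cast

SearchMode = Literal["hybrid", "vector", "bm25"]
_VALID_MODES = ("hybrid", "vector", "bm25")


@dataclass(frozen=True)
class HybridSearchConfigSketch:
    """Immutable settings bundle; all field names are illustrative."""

    mode: SearchMode = "hybrid"
    rrf_k: int = 60
    vector_weight: float = 1.0  # assumed default
    bm25_weight: float = 1.0    # assumed default

    @classmethod
    def from_env(
        cls, env: Optional[Mapping[str, str]] = None
    ) -> "HybridSearchConfigSketch":
        # Fall back to the process environment; a mapping can be injected in tests.
        env = os.environ if env is None else env
        raw_mode = env.get("HYBRID_SEARCH_MODE", "hybrid").strip().lower()
        if raw_mode not in _VALID_MODES:
            raise ValueError(f"unknown search mode: {raw_mode!r}")
        return cls(
            # Explicit cast so a type checker accepts the validated string.
            mode=cast(SearchMode, raw_mode),
            rrf_k=int(env.get("HYBRID_SEARCH_RRF_K", "60")),
            vector_weight=float(env.get("HYBRID_SEARCH_VECTOR_WEIGHT", "1.0")),
            bm25_weight=float(env.get("HYBRID_SEARCH_BM25_WEIGHT", "1.0")),
        )

    def rrf_weights(self) -> Dict[str, float]:
        # Per-source weights in the shape a fusion step could consume.
        return {"vector": self.vector_weight, "bm25": self.bm25_weight}


config = HybridSearchConfigSketch.from_env({"HYBRID_SEARCH_MODE": "bm25"})
print(config.mode, config.rrf_k, config.rrf_weights())
```

Freezing the dataclass means a shared config singleton cannot be mutated by one caller and silently change behavior for another.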

- Fix B007 in rrf_fusion.py: rename unused loop variable to _source_name
- Fix mypy type narrowing in config.py: explicit SearchMode cast

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Phase 2 implementation of memory retrieval improvements:

- Task 2.1: HybridSearchEngine with parallel vector + BM25 search
  - Reciprocal Rank Fusion combining multiple strategies
  - Mode selection: hybrid, vector, bm25
  - Configurable weights and RRF k parameter
  - Observability integration with metrics and tracing
- Task 2.2: Extend SearchEngine with ranking methods
  - search_vector_ranked() returns (memory, rank, distance)
  - search_text_ranked() returns (memory, rank, bm25_score)
  - Ranks are 1-indexed for RRF compatibility
- Task 2.3: Extend RecallService with hybrid parameters
  - search_hybrid() method for RRF-fused search
  - Lazy-initialized HybridSearchEngine
  - Thread-safe initialization with double-checked locking

Tests: 74 passing (21 hybrid, 28 RRF, 25 config)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
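The lazy, thread-safe initialization mentioned for Task 2.3 can be sketched as follows. This is an illustration of double-checked locking in general, not the actual RecallService code: `RecallServiceSketch`, `HybridEngineStub`, `_get_hybrid_engine`, and the `engine_factory` parameter are all hypothetical names.

```python
import threading


class HybridEngineStub:
    """Stand-in for an engine that is expensive to construct (hypothetical)."""

    instances = 0

    def __init__(self):
        # Only ever called while the service lock is held in this sketch.
        HybridEngineStub.instances += 1


class RecallServiceSketch:
    def __init__(self, engine_factory=HybridEngineStub):
        self._engine_factory = engine_factory
        self._engine = None
        self._engine_lock = threading.Lock()

    def _get_hybrid_engine(self):
        # First check, without the lock: the cheap path once the engine exists.
        engine = self._engine
        if engine is None:
            with self._engine_lock:
                # Second check, under the lock: another thread may have
                # built the engine while this one was waiting.
                if self._engine is None:
                    self._engine = self._engine_factory()
                engine = self._engine
        return engine


service = RecallServiceSketch()
first = service._get_hybrid_engine()
second = service._get_hybrid_engine()
print(first is second)
```

The second check is what keeps two threads that both saw `None` from each constructing an engine; the first check avoids taking the lock on every later call.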

zircote and others added 2 commits December 27, 2025 16:37

zircote changed the base branch from main to v1.0.0 December 27, 2025 22:31

zircote and others added 2 commits December 27, 2025 17:38

zircote wants to merge 4 commits into v1.0.0 from feature/memory-retrieval-improvements
