Overview
Currently, the insights system uses a flat vector search approach where every insight's full content (topic + name + overview + details) is embedded into a single vector. This approach has performance and relevance limitations when searching large insight collections.
Proposed Solution: Hierarchical Vector Search
Implement a three-level hierarchical vector index that mirrors the natural structure of the insights system:
Topic (embedding of topic name + summary)
├── Insight 1 (embedding of name + overview)
│ └── Details (embedding of full details)
├── Insight 2 (embedding of name + overview)
│ └── Details (embedding of full details)
└── ...
Benefits
- Performance: 5-10x faster searches by progressive filtering
- Relevance: Better keyword matching with topic/title embeddings
- Scalability: Efficient elimination of irrelevant content categories
- Flexibility: Different embedding strategies for different content types
Implementation Approach
New Data Models
// New: Topic-level vectors
pub struct TopicVector {
    pub topic_id: String,            // "backend-api"
    pub topic_name: String,          // "Backend API"
    pub topic_embedding: Vec<f32>,   // Embedding of topic name + topic summary
    pub insight_count: i32,          // How many insights in this topic
    pub created_at: String,
    pub updated_at: String,
}

// Enhanced: Multi-level insight vectors
pub struct HierarchicalInsightRecord {
    pub insight_id: String,          // "backend-api:authentication"
    pub topic_id: String,            // "backend-api"
    pub name: String,
    pub overview: String,
    pub details: String,
    // Multiple embeddings for different search strategies
    pub title_embedding: Vec<f32>,   // Name + overview (keyword-optimized)
    pub content_embedding: Vec<f32>, // Full details (semantic-optimized)
    pub created_at: String,
    pub updated_at: String,
}

Three-Level Search Strategy
pub async fn hierarchical_search(
    &self,
    query: &str,
    limit: usize,
) -> Result<Vec<EmbeddingSearchResult>> {
    let query_embedding = create_embedding(query).await?;

    // Level 1: Find relevant topics (fast, broad filter)
    let relevant_topics = search_topic_vectors(&query_embedding, limit * 2).await?;

    // Level 2: Find relevant insights within those topics (focused)
    let relevant_insights = search_insight_vectors(
        &query_embedding,
        &relevant_topics.iter().map(|t| &t.topic_id).collect::<Vec<_>>(),
        limit,
    ).await?;

    // Level 3: Search detailed content only for promising insights
    search_content_vectors(&query_embedding, &relevant_insights, limit).await
}

Migration Checklist
Phase 1: Infrastructure Setup
- Create new TopicVector model and LanceDB table
- Create new HierarchicalInsightRecord model
- Add topic summary generation (automatically derive from existing insights)
- Implement topic embedding generation
- Add migration utility to populate topic vectors from existing data (see the sketch after this list)
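As a rough illustration of that migration utility, the pass below groups existing insights by topic, derives a topic summary by concatenating their overviews, embeds it, and writes a TopicVector row. This is a minimal sketch under assumptions: load_all_insights and store_topic_vector are hypothetical helpers, the existing record is assumed to expose insight_id and overview fields, and the topic id is assumed to be the prefix of the "topic:insight" id; create_embedding is the same helper already used in the search function above.

use std::collections::HashMap;

// Hypothetical migration pass: derive TopicVector rows from existing insights.
pub async fn migrate_topic_vectors() -> anyhow::Result<()> {
    // Group existing insights by topic, using the "topic:insight" id prefix.
    let mut by_topic: HashMap<String, Vec<InsightRecord>> = HashMap::new();
    for insight in load_all_insights().await? {
        let topic_id = insight.insight_id.split(':').next().unwrap_or_default().to_string();
        by_topic.entry(topic_id).or_default().push(insight);
    }

    for (topic_id, insights) in by_topic {
        // Simplest possible summary strategy: concatenate the insight overviews.
        let summary = insights.iter().map(|i| i.overview.as_str()).collect::<Vec<_>>().join(" ");
        let topic_name = topic_id.replace('-', " ");
        let topic_embedding = create_embedding(&format!("{topic_name}. {summary}")).await?;
        let now = chrono::Utc::now().to_rfc3339();

        store_topic_vector(TopicVector {
            topic_id,
            topic_name,
            topic_embedding,
            insight_count: insights.len() as i32,
            created_at: now.clone(),
            updated_at: now,
        })
        .await?;
    }
    Ok(())
}

The summary strategy here is deliberately naive; a better version could truncate long summaries or use an LLM-generated abstract, as long as regeneration stays cheap enough to run whenever insights change.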
Phase 2: Enhanced Insight Vectors
- Modify InsightRecord to support multiple embeddings
- Update schema to include both title_embedding and content_embedding
- Implement separate embedding generation for title vs content
- Add migration utility to split existing insight embeddings
- Update insight storage to generate both embedding types (see the sketch after this list)
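A minimal sketch of how storage could produce the two embeddings per insight, assuming create_embedding is the single-text embedding helper from above; build_hierarchical_record and its id-slugging logic are illustrative, not existing APIs.

// Sketch: build a HierarchicalInsightRecord with separate title and content embeddings.
pub async fn build_hierarchical_record(
    topic_id: &str,
    name: &str,
    overview: &str,
    details: &str,
) -> anyhow::Result<HierarchicalInsightRecord> {
    // Title embedding: short, keyword-heavy text (name + overview).
    let title_embedding = create_embedding(&format!("{name}. {overview}")).await?;
    // Content embedding: full details, optimized for semantic matches.
    let content_embedding = create_embedding(details).await?;

    let now = chrono::Utc::now().to_rfc3339();
    Ok(HierarchicalInsightRecord {
        insight_id: format!("{topic_id}:{}", name.to_lowercase().replace(' ', "-")),
        topic_id: topic_id.to_string(),
        name: name.to_string(),
        overview: overview.to_string(),
        details: details.to_string(),
        title_embedding,
        content_embedding,
        created_at: now.clone(),
        updated_at: now,
    })
}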
Phase 3: Hierarchical Search Implementation
- Implement topic-level search functions
- Implement insight-level search with topic filtering (see the sketch after this list)
- Implement content-level search with insight filtering
- Create combined hierarchical search function
- Add performance benchmarking and comparison with existing search
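To make the level-2 step concrete, the sketch below restricts candidate insights to the topics that survived level 1 and ranks them by cosine similarity of the title embedding. The real implementation would push the topic_id filter and vector ranking down into LanceDB rather than scanning in memory; this in-memory version only illustrates the filtering logic.

// Level-2 sketch: topic-filtered ranking over title embeddings.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

fn search_insights_in_topics(
    query_embedding: &[f32],
    candidates: &[HierarchicalInsightRecord],
    topic_ids: &[&str],
    limit: usize,
) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = candidates
        .iter()
        // The level-1 result acts as a hard filter on topic_id.
        .filter(|insight| topic_ids.contains(&insight.topic_id.as_str()))
        .map(|insight| {
            (
                insight.insight_id.clone(),
                cosine_similarity(query_embedding, &insight.title_embedding),
            )
        })
        .collect();
    // Highest similarity first, then truncate to the requested limit.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(limit);
    scored
}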
Phase 4: API Integration
- Add hierarchical search endpoint alongside existing search
- Update search handlers to support both search types
- Add configuration option to choose search strategy (see the sketch after this list)
- Update CLI to support hierarchical search parameters
- Add search performance metrics and logging
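One possible shape for that configuration option: a strategy enum that the handlers dispatch on, defaulting to the existing flat search during migration. The SearchStrategy enum, InsightStore type, and flat_search method name are assumptions for illustration, not the current API.

// Hypothetical strategy toggle; names are illustrative.
#[derive(Clone, Copy, Debug, Default)]
pub enum SearchStrategy {
    #[default]
    Flat,         // existing single-vector search
    Hierarchical, // new three-level search
}

pub async fn search_insights(
    store: &InsightStore,
    query: &str,
    limit: usize,
    strategy: SearchStrategy,
) -> anyhow::Result<Vec<EmbeddingSearchResult>> {
    match strategy {
        SearchStrategy::Flat => store.flat_search(query, limit).await,
        SearchStrategy::Hierarchical => store.hierarchical_search(query, limit).await,
    }
}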
Phase 5: Optimization & Migration
- Compare performance between flat and hierarchical search
- Tune similarity thresholds for each search level
- Implement adaptive search: fall back to content search if topic/insight searches return few results (see the sketch after this list)
- Gradually migrate default search to hierarchical approach
- Add comprehensive test coverage for hierarchical search
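The adaptive fallback could be as simple as the sketch below: try the hierarchical path first, and fall back to the existing flat content search when it returns too few hits, so recall does not regress for queries that the topic filter handles poorly. MIN_HITS, InsightStore, and flat_search are assumed names for illustration.

// Sketch of the adaptive fallback between hierarchical and flat search.
const MIN_HITS: usize = 3;

pub async fn adaptive_search(
    store: &InsightStore,
    query: &str,
    limit: usize,
) -> anyhow::Result<Vec<EmbeddingSearchResult>> {
    let hits = store.hierarchical_search(query, limit).await?;
    if hits.len() >= MIN_HITS.min(limit) {
        return Ok(hits);
    }
    // Topic/insight filtering was too aggressive for this query;
    // fall back to the flat full-content search.
    store.flat_search(query, limit).await
}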
Phase 6: Cleanup (Optional)
- Deprecate flat search approach (if hierarchical proves superior)
- Remove single embedding field from InsightRecord
- Clean up legacy search code
- Update documentation and examples
Technical Considerations
- Backward Compatibility: Keep existing search functionality during migration
- Storage Overhead: Multiple embeddings per insight will increase storage ~2-3x
- Index Management: Topic summaries need to be regenerated when insights change
- Performance Monitoring: Track search latency and relevance metrics during migration
- Embedding Consistency: Ensure all embedding types use the same model version (see the sketch below)
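One lightweight way to enforce that last point is to tag every stored vector with the model version that produced it and refuse to compare across versions. The model version string, VersionedEmbedding type, and check_embedding_version helper are illustrative additions, not part of the proposed schema.

// Illustrative guard for embedding consistency; names are hypothetical.
pub const EMBEDDING_MODEL_VERSION: &str = "text-embedding-v1";

pub struct VersionedEmbedding {
    pub vector: Vec<f32>,
    pub model_version: String,
}

pub fn check_embedding_version(stored: &VersionedEmbedding) -> anyhow::Result<()> {
    if stored.model_version != EMBEDDING_MODEL_VERSION {
        anyhow::bail!(
            "embedding was created with model {}, expected {}; re-embed before searching",
            stored.model_version,
            EMBEDDING_MODEL_VERSION
        );
    }
    Ok(())
}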
Success Criteria
- Search performance improves by 5-10x for large insight collections
- Search relevance improves, especially for keyword-based queries
- System remains backward compatible during migration
- Migration can be completed incrementally without service disruption
Related Files
- crates/insights/src/server/services/lancedb/models.rs - Data models
- crates/insights/src/server/services/lancedb/search.rs - Search implementation
- crates/insights/src/server/services/lancedb/records.rs - Arrow schema
- crates/insights/src/server/handlers/insights.rs - API handlers
This enhancement would significantly improve the insights system's search performance and relevance, especially as the insight collection grows larger.