Open
Labels
enhancement: New feature or request
Description
Feature Request: Dynamic Metadata Filtering for RAG Queries
Problem Statement
Currently, the RAG implementation only supports filtering by:
- Number of results (n_results)
- Distance threshold (distance_threshold)
However, ChromaDB supports powerful metadata filtering through its where parameter, which could significantly improve the precision of document retrieval.
Proposed Solution
Add a dynamic filter builder UI that allows users to create metadata-based filters for their RAG queries.
Implementation Overview
Backend Changes
- RAG Config Service (rag_config_service.py)
  - Add a method to detect available metadata fields from the collection
  - Modify query_collection() to accept a where parameter
  - Store filter preferences in config
- API Endpoints (routes.py)
  - GET /api/rag/metadata-fields - Return available fields with types and unique values
  - Update POST /api/chat to accept filter parameters
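As a rough illustration of the field-detection step, the sketch below summarizes field names, value types, and unique values from a list of metadata dicts. The function name detect_metadata_fields and the max_unique cap are hypothetical and not part of the existing codebase; it also assumes metadata values are scalars (strings, numbers, booleans). In practice the input could come from something like collection.get(include=["metadatas"])["metadatas"].

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def detect_metadata_fields(
    metadatas: list[dict[str, Any] | None],
    max_unique: int = 50,  # hypothetical cap so the UI is not flooded with values
) -> dict[str, dict[str, Any]]:
    """Summarize field names, value types, and unique values (assumed scalar)."""
    types: dict[str, set[str]] = defaultdict(set)
    values: dict[str, set[Any]] = defaultdict(set)
    for metadata in metadatas:
        if not metadata:
            continue  # skip documents stored without metadata
        for key, value in metadata.items():
            types[key].add(type(value).__name__)
            if len(values[key]) < max_unique:
                values[key].add(value)
    return {
        key: {
            "types": sorted(types[key]),
            "unique_values": sorted(values[key], key=str),
        }
        for key in sorted(types)
    }
```

A GET /api/rag/metadata-fields handler could return this dictionary as JSON so the frontend can populate its field and value selectors.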
Frontend Changes
- Main Chat Interface (script.js, index.html)
  - Collapsible filter panel below RAG toggle
  - Dynamic filter rows with field/operator/value selectors
  - Support AND/OR logic between filters
  - Show active filter count badge
- Settings Page (settings.js, settings.html)
  - Preview available metadata fields when a collection is selected
  - Configure default filters
Filter Types Support
- Text fields: equals, one of several values (using $in, which is exact-match membership rather than substring matching)
- Numbers: equals, $gt, $lt, $gte, $lte, range
- Lists: multi-select with $in/$nin
- Dates: date picker with comparison operators
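One way the backend might encode the list above is a lookup table from field type to permitted operators, plus a helper for the "range" case. This is only a sketch: the type names, ALLOWED_OPERATORS, is_operator_allowed, and range_filter are all hypothetical, and the operator sets simply mirror the bullets above.

```python
from typing import Any

# Hypothetical mapping from UI field type to the operators offered for it.
ALLOWED_OPERATORS: dict[str, set[str]] = {
    "text": {"$eq", "$in"},
    "number": {"$eq", "$gt", "$lt", "$gte", "$lte"},
    "list": {"$in", "$nin"},
    "date": {"$eq", "$gt", "$lt", "$gte", "$lte"},
}


def is_operator_allowed(field_type: str, operator: str) -> bool:
    """Return True if the operator is offered for the given field type."""
    return operator in ALLOWED_OPERATORS.get(field_type, set())


def range_filter(field: str, low: Any, high: Any) -> dict[str, Any]:
    """Express an inclusive range as two comparisons joined with $and."""
    return {"$and": [{field: {"$gte": low}}, {field: {"$lte": high}}]}
```

The same table could drive the operator dropdown in the filter rows, so the frontend and backend agree on what is valid for each field.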
Example Filter Format
{
  "filters": [
    {"field": "author", "operator": "$eq", "value": "John Doe"},
    {"field": "chapter", "operator": "$in", "value": [1, 2, 3]},
    {"field": "date", "operator": "$gte", "value": "2024-01-01"}
  ],
  "logic": "$and"
}
The "logic" value can be either "$and" or "$or".
User Benefits
- Precision: Target specific document subsets (e.g., "only search in chapter 3")
- Efficiency: Reduce noise from irrelevant content
- Flexibility: Build complex queries without writing code
- Discovery: Explore metadata patterns in the corpus
- Performance: Smaller, more relevant result sets
Additional Features to Consider
- Save/load filter presets
- Quick filter templates ("Recent docs", "By author")
- Filter match explanations in results
- Visual indicators for active filters
- Recently used filters history
ChromaDB Reference
ChromaDB supports these metadata filter operators:
- Comparison: $eq, $ne, $gt, $gte, $lt, $lte
- Logical: $and, $or
- Inclusion: $in, $nin
Documentation: https://docs.trychroma.com/docs/querying-collections/metadata-filtering
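To show how the request payload from the Example Filter Format section might map onto these operators, here is a minimal sketch of a translator that produces a dict suitable for the where parameter. The function name build_where is hypothetical. It assumes that a logical operator needs at least two sub-conditions, so a single filter is returned unwrapped and an empty list yields None. The range operators ($gt, $gte, $lt, $lte) appear to accept only numeric operands in ChromaDB, so a date such as "2024-01-01" would likely need to be stored and compared as a numeric timestamp; this is worth verifying against the linked documentation.

```python
from __future__ import annotations

from typing import Any

COMPARISON_OPERATORS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"}
LOGICAL_OPERATORS = {"$and", "$or"}


def build_where(payload: dict[str, Any]) -> dict[str, Any] | None:
    """Translate the issue's filter payload into a where-style dict."""
    logic = payload.get("logic", "$and")
    if logic not in LOGICAL_OPERATORS:
        raise ValueError(f"Unsupported logic operator: {logic!r}")

    conditions = []
    for item in payload.get("filters", []):
        operator = item["operator"]
        if operator not in COMPARISON_OPERATORS:
            raise ValueError(f"Unsupported operator: {operator!r}")
        conditions.append({item["field"]: {operator: item["value"]}})

    if not conditions:
        return None  # no filters: query without a where clause
    if len(conditions) == 1:
        return conditions[0]  # a lone condition needs no $and/$or wrapper
    return {logic: conditions}
```

The result could then be passed through query_collection() to the underlying ChromaDB query as its where argument.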
Acceptance Criteria
- Users can add/remove filter conditions dynamically
- Filters persist across page refreshes
- Filter UI shows available fields from current collection
- Applied filters are visible in chat details modal
- Clear documentation on how to use filters