feat: hybrid retrieval approach #8

Open
srini047 wants to merge 4 commits into devrev:main from srini047:srini-01

Conversation

srini047 commented Mar 16, 2026

Implementation:
BM42 hybrid retrieval with Haystack + Qdrant: a hybrid search system that combines sparse (BM42) and dense (mxbai-embed-large-v1) embeddings for information retrieval over the DevRev knowledge base.

Indexing Pipeline:
DocumentCleaner → FastembedSparseDocumentEmbedder (BM42)
→ SentenceTransformersDocumentEmbedder (1024-dim)
→ DocumentWriter (Qdrant)

Retrieval Pipeline:
FastembedSparseTextEmbedder (BM42) + SentenceTransformersTextEmbedder
→ QdrantHybridRetriever (RRF fusion)
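
For readers unfamiliar with the fusion step, here is a minimal sketch of Reciprocal Rank Fusion (RRF) in plain Python. It is not the code in this PR and not Qdrant's internal implementation; the function name `rrf_fuse`, the constant `k=60` (a commonly used default), and the document IDs are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); the contributions are summed across lists.
    k=60 is an assumed default, not a value taken from this PR.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings from the sparse (BM42) and dense retrieval branches.
sparse_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([sparse_ranking, dense_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF uses only rank positions, the sparse and dense scores do not need to be normalized to a common scale before fusion.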

Work item: ISS-269621

@srini047
Author

@nimit2801 Please help validate the PR description.

@srini047
Author

Please use this JSON file for the results: test_queries_results.json. The Parquet file is available as well.

@nimit2801
Collaborator

Hey @srini047,

Please add this link to your PR description: https://app.devrev.ai/devrev/works/ISS-269621

@prakhar7651
Contributor

Hey!
These are your scores.
Recall@10: 0.3188
Precision@10: 0.2598
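
For context on how such numbers are typically computed, below is a minimal per-query sketch of Recall@k and Precision@k. The exact definitions used by the evaluator in this thread are not stated, so this is an assumption: precision divides by `k`, recall divides by the number of relevant documents, and the reported scores would be averages over all test queries. The function names and document IDs are hypothetical.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant_set)
    return hits / len(relevant_set)


def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k slots filled by relevant documents.

    Assumption: the denominator is k even if fewer than k results came back.
    """
    relevant_set = set(relevant)
    hits = len(set(retrieved[:k]) & relevant_set)
    return hits / k


# Hypothetical single query: 10 retrieved IDs and 4 known-relevant IDs.
retrieved = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = ["d1", "d4", "d9", "d20"]

print(recall_at_k(retrieved, relevant))     # 3 of 4 relevant docs found
print(precision_at_k(retrieved, relevant))  # 3 of 10 slots are relevant
```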
