This project implements a Retrieval-Augmented Generation (RAG) system for stock news analysis, built around a hybrid retrieval strategy that combines semantic and keyword search.
- Google News RSS ingestion
- Semantic retrieval using embeddings
- Keyword retrieval using BM25
- Hybrid scoring for better Recall@K
- Grounded LLM responses
- Retrieval-first evaluation mindset
- Python
- sentence-transformers
- BM25 (rank-bm25)
- OpenAI / Gemini
- No RAG frameworks used
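
Since the project optimizes for Recall@K, a minimal sketch of that metric may help make the target concrete. The function name `recall_at_k` and the document IDs are hypothetical and not taken from this repository; the sketch assumes each query has a known set of relevant document IDs.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results.

    retrieved_ids: ranked list of document IDs, best first (assumed input).
    relevant_ids:  collection of IDs judged relevant for the query (assumed input).
    """
    relevant = set(relevant_ids)
    if not relevant:
        # No relevant documents defined for this query; report 0.0 by convention.
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical example: 3 relevant articles, 1 of them found in the top 2.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}
score_at_2 = recall_at_k(ranked, relevant, k=2)
score_at_4 = recall_at_k(ranked, relevant, k=4)
```

Comparing this value for embedding-only, BM25-only, and hybrid rankings on the same queries is one way to check whether the hybrid step actually helps.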
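Embedding search captures meaning, so it can match articles that paraphrase the query. BM25 matches exact terms such as tickers and numbers, which semantic search can miss. Combining the two improves recall and coverage over either retriever alone.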
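
One common way to combine the two signals is a weighted sum of normalized scores. The sketch below is an illustration under stated assumptions, not necessarily how `src/main.py` does it: the function names, the `alpha` weight, and the sample scores are hypothetical. It assumes the semantic scores are cosine similarities from a sentence-transformers model and the keyword scores come from something like `BM25Okapi.get_scores` in rank-bm25, both aligned to the same document order.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores identical: no ranking signal, so map everything to 0.0.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(semantic, keyword, alpha=0.5):
    """Weighted sum of normalized semantic and keyword scores.

    alpha is a hypothetical tuning weight: 1.0 means embeddings only,
    0.0 means BM25 only.
    """
    if len(semantic) != len(keyword):
        raise ValueError("score lists must cover the same documents")
    sem = min_max_normalize(semantic)
    kw = min_max_normalize(keyword)
    return [alpha * s + (1 - alpha) * k for s, k in zip(sem, kw)]


def top_k(scores, k):
    """Indices of the k highest-scoring documents, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Made-up scores for three documents, in the same order for both retrievers.
semantic = [0.82, 0.40, 0.75]   # e.g. cosine similarities
keyword = [1.2, 7.5, 0.0]       # e.g. raw BM25 scores
combined = hybrid_scores(semantic, keyword, alpha=0.5)
best = top_k(combined, k=2)
```

In this made-up case the second document ranks last on embeddings alone but enters the top 2 because of its strong exact-term match, which is the kind of recall gain the hybrid approach aims for.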
pip install -r requirements.txt
python src/main.py