docs: RLM vs RAG benchmark (FinanceBench 50/50 + full)#64

Merged
miguelgfierro merged 1 commit into
feat/rlm-integration from
docs/rlm-vs-rag-benchmark
Jun 18, 2026

@miguelgfierro

Contributor

Reframes the benchmark doc as a comparison of RLM against hybrid vector RAG on FinanceBench 50/50 and full, matching the experiments-repo README format (RAGAS, custom, and retrieval metrics, plus time and cost; rows ordered by Answer-Correctness). The doc shows RLM ahead on answer quality on both datasets (AnsCorr 0.497/0.501 vs. 0.434/0.422 for the best RAG configuration), contrasts the vector pipeline's embedding-ingest cost (~1h16m / ~2h36m) with RLM's lazy approach, which needs no ingest step, and reports the 50/50 production-sandbox validation (0.510, 0 failures). Renames docs/rlm-benchmark.md to docs/rlm-vs-rag-benchmark.md. Lands in PR #43.
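To put the headline Answer-Correctness numbers in perspective, here is a minimal sketch that turns the quoted scores into absolute and relative gains. It assumes the first value of each pair refers to FinanceBench 50/50 and the second to the full set, following the order in the description. The `ANS_CORR` dictionary and the `gain` helper are illustrative names, not part of the repository.

```python
# Answer-Correctness figures quoted in the PR description.
# Assumption: in each "a/b" pair, a is FinanceBench 50/50 and b is the full set.
ANS_CORR = {
    "50/50": {"rlm": 0.497, "best_rag": 0.434},
    "full": {"rlm": 0.501, "best_rag": 0.422},
}


def gain(rlm: float, rag: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain as a fraction of the RAG score)."""
    absolute = rlm - rag
    return absolute, absolute / rag


# Hypothetical helper output: one (absolute, relative) pair per dataset.
gains = {name: gain(s["rlm"], s["best_rag"]) for name, s in ANS_CORR.items()}

for name, (absolute, relative) in gains.items():
    print(f"{name}: +{absolute:.3f} absolute, +{relative:.1%} relative")
```

Under that reading, RLM is ahead by 0.063 and 0.079 AnsCorr points, or roughly 14.5% and 18.7% relative to the best RAG configuration.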

Rename rlm-benchmark.md -> rlm-vs-rag-benchmark.md; match the experiments
README format; RLM vs hybrid vector RAG across both datasets with retrieval
+ generation metrics, time (incl. the embedding-heavy vector ingest vs RLM's
lazy no-ingest), and cost; PageIndex omitted (RLM-vs-RAG view).
@miguelgfierro miguelgfierro merged commit 8a7a76c into feat/rlm-integration Jun 18, 2026
@miguelgfierro miguelgfierro deleted the docs/rlm-vs-rag-benchmark branch June 18, 2026 14:17