Update RAG metrics by jmsevin · Pull Request #13 · CyberCRI/WeLearn-api

jmsevin · 2025-05-02T15:34:56Z

Description

To evaluate future changes (choice of a new LLM, the hybrid search feature in Qdrant...), we needed to update the RAG metrics script.

Why?

The Ragas library has evolved quite a lot, so we had to update the metrics script to reflect those changes. Moreover, we decided to focus on metrics that don't need ground-truth data, as the way we synthesized that data the first time may not be relevant for our current use case. Finally, we added an option in the script to compute RAG metrics without context (i.e. when the LLM doesn't get resources from the WeLearn database).

How?

Five RAG metrics are computed now:

The script is run from the command line with python rag-metrics.py. Three options can be added:

--all_corpus to aggregate WeLearn resources by language and provide one context only (per language) to the LLM
--reranking to rerank WeLearn resources in order to improve diversity
--vanilla to compute RAG metrics without any context (i.e. WeLearn resources) for the LLM
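
As an illustration of the three options above, here is a minimal sketch of how such flags could be declared with argparse. The PR description does not show the script's actual argument parsing, so the use of argparse, the boolean (store_true) behaviour, the help strings, and the build_parser helper name are all assumptions; only the flag names and the script name come from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three documented options."""
    parser = argparse.ArgumentParser(
        prog="rag-metrics.py",
        description="Compute RAG metrics (illustrative sketch only).",
    )
    # Assumption: each option is a simple on/off switch.
    parser.add_argument(
        "--all_corpus",
        action="store_true",
        help="aggregate WeLearn resources by language into one context per language",
    )
    parser.add_argument(
        "--reranking",
        action="store_true",
        help="rerank WeLearn resources to improve diversity",
    )
    parser.add_argument(
        "--vanilla",
        action="store_true",
        help="compute metrics without any WeLearn context for the LLM",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, python rag-metrics.py --vanilla would set args.vanilla to True and leave the other two flags False. Whether the real script allows the options to be combined is not stated in the description.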

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

My code follows the code style of this project.
My code is tested.
I have updated the documentation accordingly.

Copilot

Pull Request Overview

This PR updates the RAG metrics script and related chat service functions to align with the latest library changes and improve error handling.

Added try/except blocks and enhanced logging in the chat functions.
Updated dependency versions and added new dependencies in pyproject.toml to support evolved libraries.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

File	Description
src/app/services/abst_chat.py	Improved error handling in chat calls and updated model strings.
pyproject.toml	Updated dependency versions for ragas and added new dependencies.

Comments suppressed due to low confidence (1)

src/app/services/abst_chat.py:595

The print statement in the chat_schema function may be left over from debugging. Consider replacing it with a logging statement or removing it to avoid unintended console output in production.

print(completion)
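
For reference, the logging alternative suggested above might look like the following minimal sketch. The module-level logger, the log_completion helper name, and the message format are assumptions for illustration and are not taken from abst_chat.py; the follow-up commit in this PR removed the print statement rather than replacing it.

```python
import logging

# Assumption: a module-level logger, as is conventional in Python services.
logger = logging.getLogger(__name__)


def log_completion(completion: object) -> None:
    """Record the completion at DEBUG level instead of printing to stdout."""
    # Lazy %-style formatting: the string is only built if DEBUG is enabled.
    logger.debug("chat_schema completion: %r", completion)
```

Logging at DEBUG level keeps the output available during development while leaving the console quiet in production, where the level is typically INFO or higher.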

Jean-Marc SEVIN added 6 commits May 2, 2025 16:15

Update OpenAI and Mistral config

faa83dc

Update dependencies

b2026cb

Complete update of RAG metrics

588b322

Add RAG evaluation dataset

b05bc90

Remove duplicate method

11899d7

Update dependencies

cad2f40

jmsevin requested review from Copilot and sandragjacinto May 2, 2025 15:34

Copilot AI reviewed May 2, 2025

Remove debugging print

8caa480

sandragjacinto approved these changes May 5, 2025

jmsevin merged commit e5658d1 into main May 5, 2025
3 checks passed

jmsevin deleted the update-metrics branch May 5, 2025 14:38

jmsevin merged 7 commits into main from update-metrics
