[SC-16470] Support keyless Gemini fallback for judge configuration #521
Co-authored-by: Cursor <cursoragent@cursor.com>
johnwalz97
left a comment
had one note but lgtm!
    return "gemini"
    return None
    return "gemini"
Just a thought: what if, instead of trying to auto-configure creds and config for the LLM, we just accept a LangChain client object? That way the user has full flexibility to use whatever provider and credentials they want.
@juanmleng this is similar to what I was referring to today re: "bring your own client/judge" :)
I'd say this change is good for now (if a new version is needed soon), but we should figure out a more flexible interface so we don't have to change the internal implementation whenever the underlying LLM/client interface changes.
Great point @johnwalz97, totally agree. Worth noting that DeepEval scorers can’t use a raw LangChain client directly, so we would need an adapter around it. So perhaps we can leave this as is for now, and in the next iteration give it a bit of thought on how to expose a cleaner client-based API as @cachafla suggested?
There was a separate issue I ran into while testing this, and it also reproduces on main: DeepEval is now trying to log scorer results to the Confident AI platform, which causes the scorer flow to fail locally with a missing/invalid DeepEval/Confident API key error. The last commit ( When you get a chance, could you please do a quick second sanity-check pass on the latest commit before we merge?
PR Summary
This PR introduces significant updates to the LLM and DeepEval integration within the ValidMind project. The key changes include:
Overall, these changes improve the integration with the Gemini LLM and DeepEval components, provide better default behavior when environment variables are missing, and add thorough test coverage for the new and modified functionality.
Test Suggestions
Pull Request Description
What and why?
Supports keyless Gemini setups for ValidMind judge flows by defaulting to Gemini when OpenAI and Azure are not explicitly configured, instead of requiring Gemini API keys up front. This also extends the DeepEval scorer path to work without Gemini keys and updates tests and notebook guidance to reflect the new keyed and keyless Gemini behavior.
This is needed so enterprise Gemini users can run evaluations in environments where Gemini access is available without API keys.
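The fallback rule described above can be sketched roughly as follows. This is an illustration only, not the PR's actual code: the function name `pick_judge_provider` and the environment variable names are assumptions, since the description does not list which variables count as "explicitly configured".

```python
import os
from typing import Mapping, Optional

# Assumed variable names, for illustration only; the PR does not list them.
OPENAI_VARS = ("OPENAI_API_KEY",)
AZURE_VARS = ("AZURE_OPENAI_KEY", "AZURE_OPENAI_ENDPOINT")


def pick_judge_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the judge provider name, falling back to Gemini."""
    env = os.environ if env is None else env
    if any(env.get(name) for name in OPENAI_VARS):
        return "openai"
    if all(env.get(name) for name in AZURE_VARS):
        return "azure"
    # Neither OpenAI nor Azure is explicitly configured: fall back to
    # Gemini, whether or not a Gemini API key is present.
    return "gemini"


if __name__ == "__main__":
    print(pick_judge_provider({}))
```

The point of the sketch is that the Gemini branch no longer checks for a key, so keyless environments resolve to a provider instead of to nothing.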
How to test
Run
uv run pytest tests/unit_tests/test_ai_utils.py
What needs special review?
Dependencies, breaking changes, and deployment notes
Release notes
Added support for keyless Gemini evaluation in ValidMind. Gemini now works as the default fallback judge provider when OpenAI and Azure are not explicitly configured, including DeepEval scorer flows, and the related notebook guidance was updated to document both keyed and keyless Gemini setups.
Checklist