fix(evaluation): populate developer_instructions when invocation_even… #5595
Open
cthurston-clgx wants to merge 1 commit into google:main
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Response from ADK Triaging Agent
Hello @cthurston-clgx, thank you for your contribution! Before we can merge this pull request, you'll need to sign the Contributor License Agreement (CLA). It seems that the CLA check has failed. Please visit https://cla.developers.google.com/ to sign the agreement. This is a necessary step for us to be able to accept your contribution. Thanks!
…ts is empty (google#5593)
The rubric_based_final_response_quality_v1 evaluator fails to pass developer_instructions to the judge when the agent makes zero tool calls (empty invocation_events list). This occurs because the agent name is resolved exclusively from invocation_events[0].author, with no fallback for the zero-event case.
This is critical for evaluating out-of-scope rejection behavior, where an agent correctly declines a request without calling any tools. The judge receives an empty <developer_instructions> block and cannot validate rubrics that reference the system prompt.
Fix: When invocation_events is empty, fall back to the first agent name in app_details.agent_details to resolve developer_instructions. This mirrors how hallucinations_v1.py handles the same scenario.
Fixes google#5593
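To make the described fallback concrete, here is a minimal, self-contained sketch. It is not the code from this PR: the `Event`, `AgentDetails`, and `AppDetails` classes are hypothetical stand-ins, the `instructions` field name is an assumption, and `agent_details` is assumed to be a mapping keyed by agent name. Only the resolution order (first event's author, else the first agent name in `app_details.agent_details`) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Hypothetical stand-ins for the real evaluation data model (assumption).
@dataclass
class Event:
    author: str


@dataclass
class AgentDetails:
    name: str
    instructions: str = ""  # assumed field holding the developer instructions


@dataclass
class AppDetails:
    # Assumed to be a mapping from agent name to its details.
    agent_details: Dict[str, AgentDetails] = field(default_factory=dict)


def resolve_agent_name(
    invocation_events: List[Event], app_details: Optional[AppDetails]
) -> Optional[str]:
    """Pick the agent name, falling back when there are no events."""
    if invocation_events:
        # Original behavior: take the author of the first event.
        return invocation_events[0].author
    if app_details is not None and app_details.agent_details:
        # Fallback for the zero-event case: first agent name in the mapping.
        return next(iter(app_details.agent_details))
    return None


def resolve_developer_instructions(
    invocation_events: List[Event], app_details: Optional[AppDetails]
) -> str:
    """Return the instructions for the resolved agent, or "" if unknown."""
    name = resolve_agent_name(invocation_events, app_details)
    if name is None or app_details is None:
        return ""
    details = app_details.agent_details.get(name)
    return details.instructions if details is not None else ""


app = AppDetails(
    agent_details={
        "root_agent": AgentDetails("root_agent", "Only answer billing questions.")
    }
)
# Zero tool calls: the instructions are still resolved through the fallback.
print(resolve_developer_instructions([], app))
```

In this sketch, an agent that declines an out-of-scope request without emitting any events still yields a non-empty instructions string, so a judge prompt built from it would not contain an empty <developer_instructions> block.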
67ab346 to ecdc7fd