fix(e2e): remove overly broad "error" string check in chat response assertions by Abirdcfly · Pull Request #919 · volcano-sh/kthena

Abirdcfly · 2026-04-24T08:21:57Z

What type of PR is this?
/kind bug

What this PR does / why we need it:

fix test in ci:E2E Tests (gateway-api) like https://github.com/volcano-sh/kthena/actions/runs/24876825004/job/72835282280?pr=916#step:7:8123

The previous check strings.Contains(response, "error") caused false positives when chat messages naturally contained the word "error". Now properly checks JSON error field instead.

Fixes flaky TestMetrics and similar tests.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

volcano-sh-bot · 2026-04-24T08:22:08Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

test/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist

Code Review

This pull request introduces a warmup step in the TestMetricsShared function within test/e2e/router/shared.go by calling utils.CheckChatCompletions before capturing baseline metrics. This change ensures the route is ready and prevents retry-induced requests from affecting metric calculations. I have no feedback to provide.

Copilot

Pull request overview

This PR updates the router/gateway-api E2E metrics test to reduce CI flakiness by sending a “warmup” request before capturing baseline Prometheus metrics, so initial reconciliation/retry traffic doesn’t skew the measured metric deltas.

Changes:

Add a warmup chat-completions request before baseline metrics are captured in TestMetricsShared.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

This PR aims to reduce flakiness in the E2E metrics test by ensuring the chat route is exercised before taking a metrics baseline, and adjusts chat response error detection to avoid false positives from LLM-generated content.

Changes:

Add a warmup chat request in TestMetricsShared prior to capturing baseline metrics.
Remove naive "error" substring assertions from chat response checks.
Refine containsError detection logic used by retry/ready checks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
test/e2e/utils/chat.go	Updates chat response validation and error-detection heuristics used by retries/readiness helpers.
test/e2e/router/shared.go	Adds a warmup request before capturing baseline metrics in `TestMetricsShared`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Abirdcfly · 2026-04-27T09:04:08Z

 	// Assert successful response
 	assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
 	assert.NotEmpty(t, resp.Body, "Chat response is empty")
-	assert.NotContains(t, resp.Body, "error", "Chat response contains error")



The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Abirdcfly · 2026-04-27T09:04:26Z

@@ -197,7 +196,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe
 	resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil)
 	assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
 	assert.NotEmpty(t, resp.Body, "Chat response is empty")


The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

…ssertions The previous check strings.Contains(response, "error") caused false positives when chat messages naturally contained the word "error". Now properly checks JSON error field instead. Fixes flaky TestMetrics and similar tests. Signed-off-by: Abirdcfly <fp544037857@gmail.com>

Copilot

Pull request overview

This PR updates e2e chat response validation to avoid flaky failures caused by checking for the substring "error" in normal chat content, and instead relies on structured JSON error detection.

Changes:

Removed assert.NotContains(..., "error") checks from chat completion helpers.
Updated containsError to look for a top-level JSON error field instead of substring matching.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Abirdcfly · 2026-04-27T09:05:04Z

 	// Assert successful response
 	assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
 	assert.NotEmpty(t, resp.Body, "Chat response is empty")
-	assert.NotContains(t, resp.Body, "error", "Chat response contains error")



The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Abirdcfly · 2026-04-27T09:05:10Z

@@ -197,7 +195,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe
 	resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil)
 	assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
 	assert.NotEmpty(t, resp.Body, "Chat response is empty")


The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Abirdcfly · 2026-04-27T09:09:25Z

 func containsError(response string) bool {
-	responseLower := strings.ToLower(response)
-	return strings.Contains(responseLower, "error")
+	var r struct {
+		Error json.RawMessage `json:"error"`
+	}
+	_ = json.Unmarshal([]byte(response), &r)
+	return len(r.Error) > 0


This is the standard OpenAI response format, and in my opinion, there is no need to consider HTML errors or other proxy errors in test

Abirdcfly · 2026-04-27T09:07:23Z

+		Error json.RawMessage `json:"error"`
+	}
+	_ = json.Unmarshal([]byte(response), &r)
+	return len(r.Error) > 0


This is the standard OpenAI format, no { "error": null } is returned.

Copilot AI review requested due to automatic review settings April 24, 2026 08:21

volcano-sh-bot added kind/bug do-not-merge/work-in-progress labels Apr 24, 2026

volcano-sh-bot requested review from YaoZengzeng and hzxuzhonghu April 24, 2026 08:22

volcano-sh-bot added the size/XS label Apr 24, 2026

Copilot started reviewing on behalf of Abirdcfly April 24, 2026 08:22 View session

Abirdcfly force-pushed the fixci branch from 69fd804 to 5f05126 Compare April 24, 2026 08:22

gemini-code-assist Bot reviewed Apr 24, 2026

View reviewed changes

Copilot AI reviewed Apr 24, 2026

View reviewed changes

Comment thread test/e2e/router/shared.go Outdated

Abirdcfly force-pushed the fixci branch from 5f05126 to c384b81 Compare April 24, 2026 09:16

volcano-sh-bot added size/S and removed size/XS labels Apr 24, 2026

Abirdcfly force-pushed the fixci branch from c384b81 to 0f9098c Compare April 24, 2026 09:17

Copilot AI review requested due to automatic review settings April 24, 2026 09:45

Abirdcfly force-pushed the fixci branch from 0f9098c to 121f423 Compare April 24, 2026 09:45

Copilot started reviewing on behalf of Abirdcfly April 24, 2026 09:45 View session

Copilot AI reviewed Apr 24, 2026

View reviewed changes

Abirdcfly force-pushed the fixci branch from 121f423 to 33f36af Compare April 27, 2026 08:35

Abirdcfly changed the title ~~Add warmup request for metrics baseline in TestMetrics~~ fix(e2e): remove overly broad "error" string check in chat response assertions Apr 27, 2026

Copilot AI review requested due to automatic review settings April 27, 2026 08:59

Abirdcfly force-pushed the fixci branch from 33f36af to f50e08b Compare April 27, 2026 08:59

Copilot AI reviewed Apr 27, 2026

View reviewed changes

Abirdcfly marked this pull request as ready for review April 27, 2026 09:37

volcano-sh-bot removed the do-not-merge/work-in-progress label Apr 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(e2e): remove overly broad "error" string check in chat response assertions#919

fix(e2e): remove overly broad "error" string check in chat response assertions#919
Abirdcfly wants to merge 1 commit intovolcano-sh:mainfrom
Abirdcfly:fixci

Abirdcfly commented Apr 24, 2026 •

edited

Loading

Uh oh!

volcano-sh-bot commented Apr 24, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Abirdcfly Apr 27, 2026

Uh oh!

Abirdcfly Apr 27, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Abirdcfly Apr 27, 2026 •

edited

Loading

Uh oh!

Abirdcfly Apr 27, 2026 •

edited

Loading

Uh oh!

Abirdcfly Apr 27, 2026

Uh oh!

Abirdcfly Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Abirdcfly commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

volcano-sh-bot commented Apr 24, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Abirdcfly Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Abirdcfly Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Abirdcfly commented Apr 24, 2026 •

edited

Loading

Abirdcfly Apr 27, 2026 •

edited

Loading

Abirdcfly Apr 27, 2026 •

edited

Loading