Skip to content

fix(e2e): remove overly broad "error" string check in chat response assertions#919

Open
Abirdcfly wants to merge 1 commit intovolcano-sh:mainfrom
Abirdcfly:fixci
Open

fix(e2e): remove overly broad "error" string check in chat response assertions#919
Abirdcfly wants to merge 1 commit intovolcano-sh:mainfrom
Abirdcfly:fixci

Conversation

@Abirdcfly
Copy link
Copy Markdown
Contributor

@Abirdcfly Abirdcfly commented Apr 24, 2026

What type of PR is this?
/kind bug

What this PR does / why we need it:

fix test in ci:E2E Tests (gateway-api) like https://github.com/volcano-sh/kthena/actions/runs/24876825004/job/72835282280?pr=916#step:7:8123

The previous check strings.Contains(response, "error") caused false positives when chat messages naturally contained the word "error". Now properly checks JSON error field instead.

Fixes flaky TestMetrics and similar tests.

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@volcano-sh-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a warmup step in the TestMetricsShared function within test/e2e/router/shared.go by calling utils.CheckChatCompletions before capturing baseline metrics. This change ensures the route is ready and prevents retry-induced requests from affecting metric calculations. I have no feedback to provide.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the router/gateway-api E2E metrics test to reduce CI flakiness by sending a “warmup” request before capturing baseline Prometheus metrics, so initial reconciliation/retry traffic doesn’t skew the measured metric deltas.

Changes:

  • Add a warmup chat-completions request before baseline metrics are captured in TestMetricsShared.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/router/shared.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce flakiness in the E2E metrics test by ensuring the chat route is exercised before taking a metrics baseline, and adjusts chat response error detection to avoid false positives from LLM-generated content.

Changes:

  • Add a warmup chat request in TestMetricsShared prior to capturing baseline metrics.
  • Remove naive "error" substring assertions from chat response checks.
  • Refine containsError detection logic used by retry/ready checks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
test/e2e/utils/chat.go Updates chat response validation and error-detection heuristics used by retries/readiness helpers.
test/e2e/router/shared.go Adds a warmup request before capturing baseline metrics in TestMetricsShared.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/utils/chat.go
Comment on lines 186 to 189
// Assert successful response
assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
assert.NotEmpty(t, resp.Body, "Chat response is empty")
assert.NotContains(t, resp.Body, "error", "Chat response contains error")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Comment thread test/e2e/utils/chat.go
@@ -197,7 +196,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe
resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil)
assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
assert.NotEmpty(t, resp.Body, "Chat response is empty")
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Comment thread test/e2e/utils/chat.go Outdated
@Abirdcfly Abirdcfly changed the title Add warmup request for metrics baseline in TestMetrics fix(e2e): remove overly broad "error" string check in chat response assertions Apr 27, 2026
…ssertions

The previous check strings.Contains(response, "error") caused false positives when chat messages naturally contained the word "error". Now properly checks JSON error field instead.

Fixes flaky TestMetrics and similar tests.

Signed-off-by: Abirdcfly <fp544037857@gmail.com>
Copilot AI review requested due to automatic review settings April 27, 2026 08:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates e2e chat response validation to avoid flaky failures caused by checking for the substring "error" in normal chat content, and instead relies on structured JSON error detection.

Changes:

  • Removed assert.NotContains(..., "error") checks from chat completion helpers.
  • Updated containsError to look for a top-level JSON error field instead of substring matching.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/e2e/utils/chat.go
Comment on lines 185 to 188
// Assert successful response
assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
assert.NotEmpty(t, resp.Body, "Chat response is empty")
assert.NotContains(t, resp.Body, "error", "Chat response contains error")

Copy link
Copy Markdown
Contributor Author

@Abirdcfly Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Comment thread test/e2e/utils/chat.go
@@ -197,7 +195,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe
resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil)
assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code")
assert.NotEmpty(t, resp.Body, "Chat response is empty")
Copy link
Copy Markdown
Contributor Author

@Abirdcfly Abirdcfly Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.

Comment thread test/e2e/utils/chat.go
Comment on lines 239 to +244
func containsError(response string) bool {
responseLower := strings.ToLower(response)
return strings.Contains(responseLower, "error")
var r struct {
Error json.RawMessage `json:"error"`
}
_ = json.Unmarshal([]byte(response), &r)
return len(r.Error) > 0
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the standard OpenAI response format, and in my opinion, there is no need to consider HTML errors or other proxy errors in test

Comment thread test/e2e/utils/chat.go
Comment on lines +241 to +244
Error json.RawMessage `json:"error"`
}
_ = json.Unmarshal([]byte(response), &r)
return len(r.Error) > 0
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the standard OpenAI format, no { "error": null } is returned.

@Abirdcfly Abirdcfly marked this pull request as ready for review April 27, 2026 09:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants