fix(e2e): remove overly broad "error" string check in chat response assertions#919
fix(e2e): remove overly broad "error" string check in chat response assertions#919Abirdcfly wants to merge 1 commit intovolcano-sh:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request introduces a warmup step in the TestMetricsShared function within test/e2e/router/shared.go by calling utils.CheckChatCompletions before capturing baseline metrics. This change ensures the route is ready and prevents retry-induced requests from affecting metric calculations. I have no feedback to provide.
There was a problem hiding this comment.
Pull request overview
This PR updates the router/gateway-api E2E metrics test to reduce CI flakiness by sending a “warmup” request before capturing baseline Prometheus metrics, so initial reconciliation/retry traffic doesn’t skew the measured metric deltas.
Changes:
- Add a warmup chat-completions request before baseline metrics are captured in
TestMetricsShared.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
This PR aims to reduce flakiness in the E2E metrics test by ensuring the chat route is exercised before taking a metrics baseline, and adjusts chat response error detection to avoid false positives from LLM-generated content.
Changes:
- Add a warmup chat request in
TestMetricsSharedprior to capturing baseline metrics. - Remove naive
"error"substring assertions from chat response checks. - Refine
containsErrordetection logic used by retry/ready checks.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/e2e/utils/chat.go | Updates chat response validation and error-detection heuristics used by retries/readiness helpers. |
| test/e2e/router/shared.go | Adds a warmup request before capturing baseline metrics in TestMetricsShared. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Assert successful response | ||
| assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code") | ||
| assert.NotEmpty(t, resp.Body, "Chat response is empty") | ||
| assert.NotContains(t, resp.Body, "error", "Chat response contains error") | ||
|
|
There was a problem hiding this comment.
The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.
| @@ -197,7 +196,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe | |||
| resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil) | |||
| assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code") | |||
| assert.NotEmpty(t, resp.Body, "Chat response is empty") | |||
There was a problem hiding this comment.
The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.
…ssertions The previous check strings.Contains(response, "error") caused false positives when chat messages naturally contained the word "error". Now properly checks JSON error field instead. Fixes flaky TestMetrics and similar tests. Signed-off-by: Abirdcfly <fp544037857@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR updates e2e chat response validation to avoid flaky failures caused by checking for the substring "error" in normal chat content, and instead relies on structured JSON error detection.
Changes:
- Removed
assert.NotContains(..., "error")checks from chat completion helpers. - Updated
containsErrorto look for a top-level JSONerrorfield instead of substring matching.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Assert successful response | ||
| assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code") | ||
| assert.NotEmpty(t, resp.Body, "Chat response is empty") | ||
| assert.NotContains(t, resp.Body, "error", "Chat response contains error") | ||
|
|
There was a problem hiding this comment.
The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.
| @@ -197,7 +195,6 @@ func CheckChatCompletionsQuiet(t *testing.T, modelName string, messages []ChatMe | |||
| resp := SendChatRequestWithRetryQuiet(t, DefaultRouterURL, modelName, messages, nil) | |||
| assert.Equal(t, http.StatusOK, resp.StatusCode, "Expected HTTP 200 status code") | |||
| assert.NotEmpty(t, resp.Body, "Chat response is empty") | |||
There was a problem hiding this comment.
The sendChatRequestWithRetry function was called when obtaining the resp, which already includes the containsError function. A retry is made when an error is present, and this judgment:assert.NotContains(t, resp.Body, "error") is redundant and ineffective.
| func containsError(response string) bool { | ||
| responseLower := strings.ToLower(response) | ||
| return strings.Contains(responseLower, "error") | ||
| var r struct { | ||
| Error json.RawMessage `json:"error"` | ||
| } | ||
| _ = json.Unmarshal([]byte(response), &r) | ||
| return len(r.Error) > 0 |
There was a problem hiding this comment.
This is the standard OpenAI response format, and in my opinion, there is no need to consider HTML errors or other proxy errors in test
| Error json.RawMessage `json:"error"` | ||
| } | ||
| _ = json.Unmarshal([]byte(response), &r) | ||
| return len(r.Error) > 0 |
There was a problem hiding this comment.
This is the standard OpenAI format, no { "error": null } is returned.
What type of PR is this?
/kind bug
What this PR does / why we need it:
fix test in ci:E2E Tests (gateway-api) like https://github.com/volcano-sh/kthena/actions/runs/24876825004/job/72835282280?pr=916#step:7:8123
The previous check
strings.Contains(response, "error")caused false positives when chat messages naturally contained the word"error". Now properly checks JSON error field instead.Fixes flaky
TestMetricsand similar tests.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: