chore(storage): add retry max attempts to MRD range requests #14576
cpriti-os wants to merge 3 commits into
Code Review
This pull request implements per-range retry limits for the MultiRangeDownloader to prevent infinite reconnection loops. It introduces an attempts field to the rangeRequest struct and updates the stream handling logic to track and enforce the MaxAttempts configuration. A new test case, TestMultiRangeDownloaderRetryLimitEmulated, is added to verify this behavior. Review feedback suggests improving the test's robustness by checking for errors in the gRPC stream interceptor, using UnixNano to avoid resource name collisions, and explicitly handling timeouts in the test's polling loop.
for i := 0; i < 100; i++ {
	r1Mu.Lock()
	done := range1Done
	r1Mu.Unlock()
	if done {
		break
	}
	time.Sleep(50 * time.Millisecond)
}
The test waits for the range request to fail by polling in a bounded loop (100 iterations with a 50 ms sleep, roughly 5 seconds), but it never checks whether the loop ended because the operation finished or because the iterations ran out. On a timeout, range1Done is still false, and the subsequent if err1 == nil check fails with a misleading message ("expected Range 1 to fail, but it succeeded"). It is better to check for the timeout explicitly.
// Wait for Range 1 to fail.
for i := 0; i < 100; i++ {
	r1Mu.Lock()
	done := range1Done
	r1Mu.Unlock()
	if done {
		break
	}
	time.Sleep(50 * time.Millisecond)
}
r1Mu.Lock()
done := range1Done
err1 := range1Err
r1Mu.Unlock()
if !done {
	t.Fatalf("timed out waiting for Range 1 to fail")
}
References
- In tests, explicit checks for conditions like timeouts are clearer and preferred over implicit failures.
fixes b/512296671