Skip to content

chore(storage): add retry max attempts to MRD range requests#14576

Open
cpriti-os wants to merge 3 commits into
googleapis:mainfrom
cpriti-os:retry
Open

chore(storage): add retry max attempts to MRD range requests#14576
cpriti-os wants to merge 3 commits into
googleapis:mainfrom
cpriti-os:retry

Conversation

@cpriti-os
Copy link
Copy Markdown
Contributor

fixes b/512296671

@cpriti-os cpriti-os requested review from a team as code owners May 12, 2026 08:27
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label May 12, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements per-range retry limits for the MultiRangeDownloader to prevent infinite reconnection loops. It introduces an attempts field to the rangeRequest struct and updates the stream handling logic to track and enforce the MaxAttempts configuration. A new test case, TestMultiRangeDownloaderRetryLimitEmulated, is added to verify this behavior. Review feedback suggests improving the test's robustness by checking for errors in the gRPC stream interceptor, using UnixNano to avoid resource name collisions, and explicitly handling timeouts in the test's polling loop.

Comment thread storage/client_test.go
Comment thread storage/client_test.go Outdated
Comment thread storage/client_test.go Outdated
Comment on lines +2287 to +2295
for i := 0; i < 100; i++ {
r1Mu.Lock()
done := range1Done
r1Mu.Unlock()
if done {
break
}
time.Sleep(50 * time.Millisecond)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test waits for the range request to fail using a loop with a timeout, but it doesn't explicitly check if the loop terminated because the operation finished or because it timed out. If it times out, range1Done will be false, and the subsequent check if err1 == nil will trigger a failure with a potentially misleading message ("expected Range 1 to fail, but it succeeded"). It's better to explicitly check for the timeout.

	// Wait for Range 1 to fail.
	for i := 0; i < 100; i++ {
		r1Mu.Lock()
		done := range1Done
		r1Mu.Unlock()
		if done {
			break
		}
		time.Sleep(50 * time.Millisecond)
	}

	r1Mu.Lock()
	done := range1Done
	err1 := range1Err
	r1Mu.Unlock()

	if !done {
		t.Fatalf("timed out waiting for Range 1 to fail")
	}
References
  1. In tests, explicit checks for conditions like timeouts are clearer and preferred over implicit failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the Cloud Storage API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant