chore(storage): log additional bytes#14608
Conversation
There was a problem hiding this comment.
Code Review
This pull request implements logging to detect and report instances where Google Cloud Storage returns more data than requested, covering both the standard Reader and the gRPC multi-range downloader. Feedback highlights a missing 'bytes' import in the tests and suggests a more robust method for capturing log output to avoid side effects on global state. Additionally, the reviewer noted that the current logging logic might under-report the total number of extra bytes if they are delivered across multiple read calls or gRPC chunks.
| // failRange capacity calculations. | ||
| if req.length >= 0 && req.bytesWritten > req.length { | ||
| if !req.extraBytesLogged { | ||
| log.Printf("storage: received %d more bytes than requested from GCS", req.bytesWritten-req.length) |
There was a problem hiding this comment.
Similar to the issue in Reader.Read, the count of extra bytes logged here might be under-reported if the excess data is delivered across multiple gRPC data chunks. The extraBytesLogged flag ensures only the first chunk that exceeds the limit is logged, which may not represent the total amount of additional data received from GCS.
| req.completed = true | ||
| delete(mrdStream.pendingRanges, req.readID) | ||
| mrdStream.updateCapacity(m, -1, 0) | ||
| if req.length >= 0 && req.bytesWritten > req.length { |
There was a problem hiding this comment.
Did you check scenarios with negative length and negative offset? I know we convert them when the metadata is available but can be useful to doubt check.
|
|
||
| func (r *Reader) Read(p []byte) (int, error) { | ||
| n, err := r.reader.Read(p) | ||
| r.mu.Lock() |
There was a problem hiding this comment.
I think this locking will introduce regressions. Should perform some regression testing I think.
| if r.remain != -1 { | ||
| r.remain -= int64(n) | ||
| } |
There was a problem hiding this comment.
Why the explicit check for -1? I don't see us initializing r.remain with -1 anywhere. Can you check if it is still relevant?
|
|
||
| r := &Reader{ | ||
| reader: m, | ||
| remain: 5, // We pretend we only requested 5 bytes |
There was a problem hiding this comment.
nit: Add periods everywhere.
| } | ||
|
|
||
| logStr := logOutput.String() | ||
| expectedLog := `storage: received 7 more bytes than requested from GCS for bucket "my-bucket", object "my-object"` |
There was a problem hiding this comment.
nit: replace "my-bucket" with r.bucket?
No description provided.