aiscp: Handle files > 2,147,479,552 bytes#157
Conversation
This change allows aiscp to copy files that are larger than the maximum allowed IO size on linux.
There was a problem hiding this comment.
Pull request overview
This pull request enhances the aiscp utility to support copying files larger than approximately 2GB (2,147,479,552 bytes) by implementing a chunked copying strategy. The previous implementation attempted to read and write the entire file in a single operation, which fails on Linux when exceeding system I/O size limits.
Changes:
- Introduced chunked I/O with AISCP_CHUNK_SIZE constant set to just under 2GB
- Added nested loop structure to read and write data in manageable chunks
- Added alignUp helper function to ensure proper alignment for O_DIRECT I/O operations
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| /// @brief Round value to the next multiple of align. Align _must_ be a power of 2. | ||
| /// @param value The value to round up. | ||
| /// @param align Value will be round up to a multiple of align |
There was a problem hiding this comment.
The parameter description for 'align' contains a grammatical error: "Value will be round up" should be "Value will be rounded up".
| /// @param align Value will be round up to a multiple of align | |
| /// @param align Value will be rounded up to a multiple of align |
| while (static_cast<size_t>(ncopy) < file_size) { | ||
| nread = hipFileRead(src_handle, devbuf, buffer_size, ncopy, 0); | ||
| if (nread < 0) { | ||
| fprintf(stderr, "Could not read from %s (%zd) (%s)\n", src_path, nread, | ||
| IS_HIPFILE_ERR(nread) ? HIPFILE_ERRSTR(nread) : strerror(errno)); | ||
| goto free_devbuf; | ||
| } | ||
|
|
||
| nbytes = hipFileWrite(dst_handle, devbuf, buffer_size, 0, 0); | ||
| if (nbytes < 0 || buffer_size != static_cast<size_t>(nbytes)) { | ||
| fprintf(stderr, "Could not write to %s (%zd) (%s)\n", src_path, nbytes, | ||
| IS_HIPFILE_ERR(nbytes) ? HIPFILE_ERRSTR(nbytes) : strerror(errno)); | ||
| goto free_devbuf; | ||
| nwrite = 0; | ||
| while (nwrite < nread) { | ||
| nbytes = | ||
| hipFileWrite(dst_handle, devbuf, alignUp(static_cast<size_t>(nread - nwrite), block_size), | ||
| static_cast<hoff_t>(ncopy + nwrite), static_cast<hoff_t>(nwrite)); | ||
| if (nbytes < 0) { | ||
| fprintf(stderr, "Could not write to %s (%zd) (%s)\n", src_path, nbytes, | ||
| IS_HIPFILE_ERR(nbytes) ? HIPFILE_ERRSTR(nbytes) : strerror(errno)); | ||
| goto free_devbuf; | ||
| } | ||
| nwrite += nbytes; | ||
| } | ||
| ncopy += nread; | ||
| } |
There was a problem hiding this comment.
The outer while loop continues until ncopy equals file_size, but there's no handling for the case when hipFileRead returns 0 (EOF) before reaching file_size. If hipFileRead returns 0 prematurely, this will result in an infinite loop since ncopy will never reach file_size. Consider adding a check: if nread is 0, break from the loop or handle it as an error.
| hipFileWrite(dst_handle, devbuf, alignUp(static_cast<size_t>(nread - nwrite), block_size), | ||
| static_cast<hoff_t>(ncopy + nwrite), static_cast<hoff_t>(nwrite)); | ||
| if (nbytes < 0) { | ||
| fprintf(stderr, "Could not write to %s (%zd) (%s)\n", src_path, nbytes, |
There was a problem hiding this comment.
The error message incorrectly references src_path instead of dst_path. When hipFileWrite fails, the error should indicate the destination file, not the source file.
| fprintf(stderr, "Could not write to %s (%zd) (%s)\n", src_path, nbytes, | |
| fprintf(stderr, "Could not write to %s (%zd) (%s)\n", dst_path, nbytes, |
| * This _very_basic_ program copies SOURCE to DEST via GPU memory. | ||
| */ | ||
|
|
||
| #ifndef AISCP_CHUNK_SIZE |
There was a problem hiding this comment.
The AISCP_CHUNK_SIZE constant (0x7ffff000lu = 2,147,479,552 bytes) should have a comment explaining why this specific value was chosen and how it relates to the Linux maximum I/O size mentioned in the PR description. This would help future maintainers understand the rationale behind this magic number.
| #ifndef AISCP_CHUNK_SIZE | |
| #ifndef AISCP_CHUNK_SIZE | |
| // Use a chunk size just below 2 GiB (2^31 bytes) and one page (4 KiB) smaller than | |
| // that limit: 0x7ffff000 = 2_147_479_552 bytes. This keeps each I/O request within | |
| // the Linux maximum I/O size constraints while remaining page-aligned for O_DIRECT. |
| goto free_devbuf; | ||
| } | ||
| while (static_cast<size_t>(ncopy) < file_size) { | ||
| nread = hipFileRead(src_handle, devbuf, buffer_size, ncopy, 0); |
There was a problem hiding this comment.
The hipFileRead call always reads buffer_size bytes, even on the last chunk where fewer bytes may remain. While hipFileRead likely handles this correctly by returning the actual bytes read, it would be more explicit and safer to limit the read size to the remaining bytes: std::min(buffer_size, file_size - static_cast<size_t>(ncopy)). This also makes the code's intent clearer.
| nread = hipFileRead(src_handle, devbuf, buffer_size, ncopy, 0); | |
| const size_t remaining = file_size - static_cast<size_t>(ncopy); | |
| const size_t read_size = std::min(buffer_size, remaining); | |
| nread = hipFileRead(src_handle, devbuf, read_size, ncopy, 0); |
|
Going to re-work this implementation and then put it back up for review. |
This change allows aiscp to copy files that are larger than the maximum allowed IO size on linux.