[local] Write files atomically to avoid 500s on concurrent reads #30
Merged
SerhatG
approved these changes
Jun 30, 2026
The file-server intermittently returns HTTP 500 on GET of a stored file when the same file is overwritten while it is being read.
Root cause: LocalFileStorageService.putFile used Files.copy(in, file, REPLACE_EXISTING), which is not atomic. The JDK implementation deletes the target and then recreates it (CREATE_NEW) before streaming the bytes, so during an overwrite the file briefly does not exist. LocalFileController.getFile checks existence and returns a FileUrlResource, but the read and the contentLength() call happen later, when Spring's ResourceHttpMessageConverter serializes the body, outside the controller's try/catch. If the writer's delete lands in that window, contentLength() throws FileNotFoundException, which escapes to the DispatcherServlet and becomes a 500.
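For reviewers who want to see the window, here is a minimal sketch of what the copy does, as described above. It is a simplified stand-in, not the JDK source. The class and method names (NonAtomicCopySketch, copyReplacing) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not code from this PR or from the JDK.
final class NonAtomicCopySketch {

    // Approximates the observable steps of Files.copy(in, target, REPLACE_EXISTING).
    static long copyReplacing(InputStream in, Path target) throws IOException {
        // Step 1: remove the old file. Until step 2 runs, the path does not exist,
        // so a concurrent size or open call on it fails.
        Files.deleteIfExists(target);

        // Step 2: recreate the file empty, then stream the bytes into it.
        // A concurrent reader can also observe a partially written file here.
        try (OutputStream out = Files.newOutputStream(
                target, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            return in.transferTo(out);
        }
    }
}
```

Both steps are visible to other threads and processes, which is why an existence check in the controller cannot protect a read that happens later.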
This surfaced in Register: on upload the API writes validation.json twice (a pending result, then the real result) while the frontend polls the validation-result endpoint in a tight loop, so overwrites overlap live reads. A single 500 there makes the upload fail for the user. Only the local storage profile is affected; the s3 profile redirects to an S3 URL and S3 overwrites are atomic.
Fix: stage the content in a temp file in the same directory and ATOMIC_MOVE it into place, so a concurrent reader always sees either the complete old file or the complete new one.
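A minimal sketch of that approach, assuming a POSIX-like file system where a rename replaces an existing target. The names AtomicWriteSketch, putFileAtomically and the ".upload-" prefix are hypothetical, and the actual change in this PR may differ in detail.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of "temp file in the same directory, then ATOMIC_MOVE".
final class AtomicWriteSketch {

    static void putFileAtomically(InputStream in, Path target) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);

        // Staging in the same directory keeps source and target on one file system,
        // which an atomic rename requires.
        Path tmp = Files.createTempFile(dir, ".upload-", ".tmp");
        try {
            // Readers never look at tmp, so a non-atomic copy is fine here.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);

            // Swap the finished file into place. Whether an existing target is
            // replaced is implementation specific per the Javadoc; this sketch
            // assumes it is replaced, as with rename on POSIX systems.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if the copy or move failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

One thing to check when adapting this: Files.createTempFile may create the file with more restrictive permissions than the original target had, so the permissions might need to be set explicitly before the move.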
Reproduction, 8 concurrent readers and one writer overwriting for 3s:
non-atomic copy: 339751 reads failed with FileNotFoundException (each a 500)
atomic move: 0 failures
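The harness used for the numbers above is not part of this description. A rough sketch of an equivalent experiment, with hypothetical names (OverwriteRaceSketch, run) and Files.size standing in for contentLength(), could look like this:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical reproduction sketch; not the harness that produced the figures above.
final class OverwriteRaceSketch {

    /** Returns {reads attempted, reads that failed with an IOException}. */
    static long[] run(boolean atomic, int readers, long millis) throws Exception {
        Path dir = Files.createTempDirectory("overwrite-race");
        Path target = dir.resolve("validation.json");
        byte[] payload = "{\"status\":\"pending\"}".getBytes(StandardCharsets.UTF_8);
        Files.write(target, payload);

        AtomicBoolean stop = new AtomicBoolean(false);
        AtomicLong reads = new AtomicLong();
        AtomicLong failures = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        for (int i = 0; i < readers; i++) {
            pool.execute(() -> {
                while (!stop.get()) {
                    reads.incrementAndGet();
                    try {
                        Files.size(target); // fails if the file is momentarily missing
                    } catch (IOException e) {
                        failures.incrementAndGet();
                    }
                }
            });
        }

        // Single writer: overwrite the same file repeatedly until the deadline.
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
        while (System.nanoTime() < deadline) {
            if (atomic) {
                Path tmp = Files.createTempFile(dir, ".upload-", ".tmp");
                Files.write(tmp, payload);
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } else {
                Files.copy(new ByteArrayInputStream(payload), target,
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }

        stop.set(true);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new long[] {reads.get(), failures.get()};
    }
}
```

Calling run(false, 8, 3000) and run(true, 8, 3000) mirrors the setup described above. Absolute counts will vary with hardware and file system, so only the contrast between the two modes is meaningful.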