fix(cli): retry processor upload on ECONNRESET and other transient network errors #72
Merged
…rors The upload step PUTs the packed processor to a signed storage URL via node-fetch. node-fetch intermittently surfaces a "socket hang up" (ECONNRESET) against GCS, but the retry guard only matched EPIPE, so a single reset aborted the whole upload with no retry. Widen the retry whitelist to cover the common transient socket errnos (ECONNRESET, ETIMEDOUT, ECONNREFUSED, EAI_AGAIN) in addition to EPIPE. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
`sentio upload` fails intermittently at the final step, which PUTs the packed processor (`dist/lib.js`) to the signed storage URL, with an `ECONNRESET` ("socket hang up") error.
Build, codegen, and packaging all succeed, and the signed URL is obtained without issue; only the upload PUT (via `node-fetch`) gets reset. The same PUT to the same GCS host succeeds reliably via `curl` and via Node's built-in `fetch` (undici), so the reset is a transient connection issue that `node-fetch` 2.x surfaces and does not transparently retry.
Root cause
The retry guard in `tryUploading` only matched `EPIPE`. `ECONNRESET` (the common one against GCS) fell through to the `else` branch and aborted the whole upload after a single attempt, with no retry, even though the `triedCount`/backoff machinery was already in place for up to 5 tries.
Fix
Widen the retry whitelist to the common transient socket errnos (`ECONNRESET`, `ETIMEDOUT`, `ECONNREFUSED`, `EAI_AGAIN`) in addition to `EPIPE`.
No behavior change for non-transient errors (they still surface immediately).
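The widened guard could be sketched as a small predicate. This is an illustration rather than the actual diff: `TRANSIENT_ERRNOS` and `isTransientNetworkError` are hypothetical names, and the sketch assumes the error exposes the errno string on a `code` property, as Node system errors and `node-fetch` 2.x `FetchError` objects do.

```typescript
// Hypothetical names for illustration; the real check lives inside tryUploading.
// The errnos listed are exactly the ones named in this PR.
const TRANSIENT_ERRNOS: ReadonlySet<string> = new Set([
  'EPIPE', // the only errno matched before this change
  'ECONNRESET', // "socket hang up", the common failure against GCS
  'ETIMEDOUT',
  'ECONNREFUSED',
  'EAI_AGAIN', // temporary DNS resolution failure
])

// Returns true only when the error carries one of the whitelisted errnos.
// Anything else (including errors without a string `code`) is non-transient.
function isTransientNetworkError(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false
  const code = (err as { code?: unknown }).code
  return typeof code === 'string' && TRANSIENT_ERRNOS.has(code)
}
```

Keeping the list in one set makes the whitelist explicit, so errors outside it still surface immediately.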
Test
Reproduced the intermittent `ECONNRESET` with `node-fetch` 2.7.0 PUT-ing ~5MB to the GCS bucket host; undici and curl never reset. After the
change, a reset triggers the existing 1s-delay retry loop instead of aborting,
which is what lets the upload succeed on a subsequent attempt.
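The retry behavior described above can also be exercised without a network by injecting a fake upload that resets a fixed number of times. The following is a sketch under stated assumptions, not the CLI's code: `uploadWithRetry` and `flakyUpload` are hypothetical helpers, and the defaults (5 tries, 1s delay) simply mirror the numbers mentioned in this PR.

```typescript
// Errnos treated as retryable, as listed in this PR.
const RETRYABLE = new Set(['EPIPE', 'ECONNRESET', 'ETIMEDOUT', 'ECONNREFUSED', 'EAI_AGAIN'])

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical helper: calls `upload` until it succeeds, a non-retryable
// error occurs, or `maxTries` attempts are used. Resolves to the attempt count.
async function uploadWithRetry(
  upload: () => Promise<void>,
  maxTries = 5,
  delayMs = 1000,
): Promise<number> {
  for (let tried = 1; ; tried++) {
    try {
      await upload()
      return tried
    } catch (err) {
      const code = (err as { code?: string } | null)?.code
      const retryable = code !== undefined && RETRYABLE.has(code)
      // Non-transient errors, and the final failed attempt, surface to the caller.
      if (!retryable || tried >= maxTries) throw err
      await sleep(delayMs)
    }
  }
}

// Hypothetical stand-in for the PUT: fails `failures` times with `code`, then succeeds.
function flakyUpload(failures: number, code: string): () => Promise<void> {
  let calls = 0
  return async () => {
    calls++
    if (calls <= failures) {
      throw Object.assign(new Error('socket hang up'), { code })
    }
  }
}
```

Passing a small `delayMs` keeps such a check fast, for example `await uploadWithRetry(flakyUpload(2, 'ECONNRESET'), 5, 1)`, which succeeds on the third attempt.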