Correctness fixes: maxFee, 504 retries, key cache by CassioMG · Pull Request #238 · stellar/typescript-wallet-sdk

CassioMG · 2026-05-27T21:38:16Z

Summary

Three independent correctness bug fixes, one commit each:

submitWithFeeIncrease now respects maxFee on every retry, not just the first — previously the fee cap was silently dropped after the first retry, letting fees grow unbounded.
submitTransaction 504 retries are now bounded with exponential backoff instead of recursing forever with no delay — avoids hammering Horizon.
KeyManager.removeKey now clears the in-memory cache — previously a removed key lingered in the cache and could still be used to sign when shouldCache was on.

No public API changes. No new dependencies.

1. Forward `maxFee` across `submitWithFeeIncrease` retries (`1dd8411`)

submitWithFeeIncrease rebuilt its params object on each recursive retry but omitted maxFee. As a result the cap was only enforced on the first retry; from the second retry onward maxFee was undefined and the guard if (maxFee && …) short-circuited to false, letting the fee grow unbounded.

Forward maxFee in the recursive call.
Tighten the guard from maxFee && newFee > maxFee to maxFee !== undefined && newFee > maxFee.
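
The forwarding fix can be illustrated with a minimal sketch. It is not the SDK's code: `FeeRetryParams`, `submitWithFeeIncreaseSketch`, and the simulated `failuresBeforeSuccess` counter are hypothetical stand-ins, and the guard uses the strict form described in this section.

```typescript
// Hypothetical, simplified stand-in for the retry parameters (not the SDK's real type).
interface FeeRetryParams {
  fee: number; // current fee, in stroops
  baseFeeIncrease: number; // amount added to the fee on each retry
  maxFee?: number; // optional cap on the fee
  failuresBeforeSuccess: number; // simulated count of remaining tx_too_late failures
}

// Returns the fee at which the simulated submission finally succeeds.
function submitWithFeeIncreaseSketch(params: FeeRetryParams): number {
  const { fee, baseFeeIncrease, maxFee, failuresBeforeSuccess } = params;
  if (failuresBeforeSuccess === 0) {
    return fee; // simulated successful submission
  }

  const newFee = fee + baseFeeIncrease;
  if (maxFee !== undefined && newFee > maxFee) {
    throw new Error(`fee ${newFee} would exceed maxFee ${maxFee}`);
  }

  return submitWithFeeIncreaseSketch({
    fee: newFee,
    baseFeeIncrease,
    maxFee, // the fix: forward the cap so every later retry still sees it
    failuresBeforeSuccess: failuresBeforeSuccess - 1,
  });
}
```

If the `maxFee` line in the recursive call is dropped, the second retry sees `maxFee === undefined` and the guard never fires again, which is the bug this commit describes.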

2. Bound `submitTransaction` 504 retries with exponential backoff (`c1dc76f`)

On HTTP 504 the method recursed with no retry limit, no delay, and no backoff. Under sustained Horizon 504s this hammered the already-slow server and could eventually exhaust the call stack (RangeError: Maximum call stack size exceeded), since V8 doesn't apply tail-call optimization to await.

Convert the recursion to an iterative loop with a fixed five-retry budget.
Capped exponential backoff: 1s base, 30s ceiling, additive jitter up to +50%.
After exhausting retries, the last 504 is re-thrown.
Use optional chaining on e?.response?.status to avoid a latent crash when a non-axios error enters the catch block.
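
A rough sketch of the loop described above, assuming the constants listed in this section. The names `submitWithBounded504Retries`, `backoffDelayMs`, and the injectable `sleep` parameter are hypothetical and chosen for illustration; the jitter shown is the additive form this commit describes.

```typescript
// Hypothetical constants mirroring the numbers in the description above.
const MAX_RETRIES = 5;
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Capped exponential delay plus additive jitter of up to +50% of the capped delay.
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const capped = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  return capped + random() * (capped / 2);
}

async function submitWithBounded504Retries<T>(
  submit: () => Promise<T>,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      return await submit();
    } catch (e) {
      // Optional chaining: e may not be an axios-style error carrying a response.
      const status = (e as { response?: { status?: number } } | null | undefined)
        ?.response?.status;
      if (status !== 504) {
        throw e; // non-504 errors are re-thrown immediately
      }
      lastError = e;
      if (attempt < MAX_RETRIES) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError; // retry budget exhausted: surface the last 504
}
```

Passing `sleep` as a parameter is a design choice of this sketch so that tests can substitute a no-op instead of waiting for real timers.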

3. Clear `KeyManager` cache entry on `removeKey` (`e2d9691`)

_writeIndexCache's guard if (this.shouldCache && key) skipped the body when key was undefined, so removeKey's _writeIndexCache(id, undefined) call was a no-op. With shouldCache: true, a removed key stayed in the in-memory cache and subsequent signTransaction / fetchAuthToken calls kept returning it.

Split the guard so the method writes when given a key and deletes the cache entry when given undefined.
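
The split guard might look roughly like the following. This is a simplified sketch with a hypothetical `KeyCacheSketch` class and `CachedKey` shape, not the real `KeyManager`, which stores richer key objects.

```typescript
// Hypothetical minimal key shape, for illustration only.
interface CachedKey {
  id: string;
  publicKey: string;
}

class KeyCacheSketch {
  private readonly cache = new Map<string, CachedKey>();
  private readonly shouldCache: boolean;

  constructor(shouldCache: boolean) {
    this.shouldCache = shouldCache;
  }

  // Writes when given a key, deletes the entry when given undefined.
  writeIndexCache(id: string, key: CachedKey | undefined): void {
    if (!this.shouldCache) {
      return;
    }
    if (key) {
      this.cache.set(id, key);
    } else {
      this.cache.delete(id); // previously skipped, so removed keys lingered
    }
  }

  read(id: string): CachedKey | undefined {
    return this.cache.get(id);
  }
}
```

With the single combined guard `if (this.shouldCache && key)`, the `else` branch is unreachable, which is why the remove path never cleared the cache.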

Testing

Each fix ships with a focused unit test:

submitWithFeeIncrease cap is enforced across multiple retries (not just the first).
submitTransaction 504 handling: retries then succeeds, re-throws non-504 immediately, and gives up after the retry budget.
removeKey purges the cache so a subsequent sign attempt misses and fails as expected.

No public API changes. No new dependencies.

🤖 Generated with Claude Code

Copilot

Pull request overview

Three independent correctness bug fixes to the core SDK and key-management package, each backed by a focused unit test.

Changes:

Forward maxFee in submitWithFeeIncrease recursive retries and tighten the guard to maxFee !== undefined so the cap is enforced on every retry, not just the first.
Replace unbounded recursion in submitTransaction's 504 handling with a bounded iterative loop (5 retries) using capped exponential backoff + jitter, plus optional chaining on e?.response?.status to harden non-axios error handling.
Fix KeyManager._writeIndexCache so it actually deletes the cached entry when key is undefined, ensuring removeKey purges the in-memory cache.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| @stellar/typescript-wallet-sdk/src/walletSdk/Horizon/Stellar.ts | Bounded 504 retry loop with backoff; forward `maxFee` and tighten its guard in `submitWithFeeIncrease`. |
| @stellar/typescript-wallet-sdk/test/stellar.test.ts | Adds tests for multi-retry `maxFee` enforcement and the three 504-handling paths (retry+success, non-504 rethrow, exhaustion). |
| @stellar/typescript-wallet-sdk-km/src/keyManager.ts | Split cache guard so `_writeIndexCache(id, undefined)` deletes the entry. |
| @stellar/typescript-wallet-sdk-km/test/keyManager.test.ts | Adds `removeKey` cache-purge test; minor formatting cleanup. |

The recursive call rebuilt the params object without maxFee, so the cap was enforced on the first retry only. After that, maxFee was undefined and the guard short-circuited to false, letting the fee grow without bound. Forward maxFee in the recursive call and tighten the guard from `maxFee && …` to `maxFee !== undefined && …`.

The previous implementation recursed on every 504 with no delay, no retry limit, and no backoff. Under sustained Horizon 504s this both hammered the already-congested server and eventually crashed the process with RangeError: Maximum call stack size exceeded. Convert to an iterative loop with capped exponential backoff (1s base, 30s ceiling, additive jitter up to +50%) and a fixed five-retry budget. After exhausting retries, the last 504 is rethrown. Also use optional chaining on `e?.response?.status` to avoid a latent crash when a non-axios error (e.g., TransactionSubmitFailedError from the `!response.successful` branch) enters the catch.

The guard `if (this.shouldCache && key)` skipped the body when key was undefined, so removeKey's `_writeIndexCache(id, undefined)` call was a no-op. The cached Key remained in memory and subsequent signTransaction / fetchAuthToken calls returned the "deleted" key. Split the guard so the method writes when given a key and deletes the cache entry when given undefined.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

piyalbasu

A couple of correctness concerns on the new submitTransaction / submitWithFeeIncrease logic.

piyalbasu · 2026-05-28T17:31:21Z

        const newFee = parseInt(transaction.fee) + baseFeeIncrease;

-        if (maxFee && newFee > maxFee) {
+        if (maxFee !== undefined && newFee > maxFee) {


maxFee: 0 regression. This guard changed from if (maxFee && newFee > maxFee) to if (maxFee !== undefined && newFee > maxFee). The old behavior treated maxFee === 0 as falsy ("no cap"); the new behavior treats it as a hard zero cap that always trips, since newFee = parseInt(transaction.fee) + baseFeeIncrease > 0.

Any caller passing maxFee: 0 (defensive default, numeric coercion of an unset env var, or as a documented no-cap sentinel under the old behavior) will now throw TransactionSubmitWithFeeIncreaseFailedError(0, e) on the very first tx_too_late and never resubmit. Worth a CHANGELOG note flagging the breaking change, or — if 0 was previously undocumented — at least narrowing the type to a positive number.

Separately, this is also bypassed when parseInt(transaction.fee) returns NaN: NaN > maxFee is false, so submitWithFeeIncrease recurses with baseFee: NaN. Worth guarding with Number.isFinite(newFee).

CassioMG · 2026-05-29

On maxFee: 0: agreed, reverted in 4939bb2. The guard is back to if (maxFee && newFee > maxFee), preserving the behavior in which maxFee: 0 means "no cap". The actual bug fix (forwarding maxFee across the recursive call) is what closes the report; the guard tightening was spec creep.

On NaN guard: pushing back. transaction.fee comes from @stellar/stellar-sdk's TransactionBuilder, which sets fee from a numeric value — it's always a numeric string. For parseInt to return NaN here, the SDK contract would have to break, and we already depend on that contract elsewhere (baseFee.toString() in makeFeeBump, the builder constructor, etc.). I'd rather not sprinkle Number.isFinite defenses against contracts we already lean on; happy to revisit if there's a concrete path where transaction.fee could be non-numeric.
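
To make the two guards discussed in this thread concrete, here is a small comparison. The function names are hypothetical; only the two boolean expressions come from the diff above.

```typescript
// Truthy guard (restored in 4939bb2): both undefined and 0 mean "no cap".
function exceedsCapTruthy(newFee: number, maxFee?: number): boolean {
  return Boolean(maxFee && newFee > maxFee);
}

// Strict guard (reverted): only undefined means "no cap", so 0 is a hard zero cap.
function exceedsCapStrict(newFee: number, maxFee?: number): boolean {
  return maxFee !== undefined && newFee > maxFee;
}

// The two guards differ only when maxFee is 0.
const truthyWithZeroCap = exceedsCapTruthy(200, 0); // false: retry proceeds
const strictWithZeroCap = exceedsCapStrict(200, 0); // true: first retry throws

// Any comparison with NaN is false, so neither guard trips on a NaN fee.
const nanFee = parseInt("not-a-number", 10);
const strictWithNaNFee = exceedsCapStrict(nanFee, 150); // false
const truthyWithNaNFee = exceedsCapTruthy(nanFee, 150); // false
```

Whether the `NaN` case is reachable in practice is exactly what the reply above disputes; the sketch only shows what the comparison evaluates to if it is.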

piyalbasu · 2026-05-28T17:31:21Z

-      if (e.response.status === 504) {
-        // in case of 504, keep retrying this tx until submission succeeds or we get a different error
+    let lastError: unknown;
+    for (let attempt = 0; attempt <= SUBMIT_504_MAX_RETRIES; attempt++) {


504 retry semantics changed from unbounded to ~6 attempts / ~46s. The previous implementation retried recursively forever; this caps at SUBMIT_504_MAX_RETRIES + 1 attempts and then throws the last 504. Two follow-ups worth considering:

1. Caller cascade: submitWithFeeIncrease catches submitTransaction errors and only handles tx_too_late. A bubbled 504 has getResultCode(e) === '', so it rethrows immediately, even though the original transaction may still land on-chain. A naive caller that retries on its own will double-submit. Worth either documenting this in the JSDoc or special-casing 504 in submitWithFeeIncrease.

2. Jitter exceeds the documented cap (line 160): cappedDelay = Math.min(BASE * 2**attempt, MAX) is clamped to 30s, but jitter (Math.random() * cappedDelay/2, up to 15s) is then added on top, giving ~45s waits on the final retries, 50% over SUBMIT_504_MAX_DELAY_MS. If the cap is meant to bound caller latency, apply jitter first and Math.min after, or use full-jitter (Math.random() * cappedDelay).

CassioMG · 2026-05-29

On 1 (caller cascade): you're right that an exhausted-504 bubbles through submitWithFeeIncrease unchanged, and the developer-experience concern stands. Documented in e7aba81: the new JSDoc on submitTransaction explicitly calls out that an exhausted-504 leaves the on-chain status indeterminate and that callers should poll the tx hash rather than resubmit blindly. It also notes that resubmitting the same signed tx will fail with tx_bad_seq once the original lands, so this is a DX problem rather than a fund-loss one; Stellar's sequence numbers prevent the double-submit you're concerned about.

On 2 (jitter exceeds the cap): fixed in ba3cb75 — switched to equal-jitter (cappedDelay/2 + Math.random() * cappedDelay/2), so sleeps now stay in [cap/2, cap]. Retains the herd-smoothing benefit with a deterministic progressive floor, and the named SUBMIT_504_MAX_DELAY_MS actually bounds the wait now.
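
The difference between the two jitter schemes can be sketched as follows. `SUBMIT_504_MAX_DELAY_MS` is named in the thread; `SUBMIT_504_BASE_DELAY_MS` and the function names are assumptions made for this illustration, and `random` is injectable only so the bounds can be checked deterministically.

```typescript
const SUBMIT_504_BASE_DELAY_MS = 1000; // hypothetical name for the 1s base
const SUBMIT_504_MAX_DELAY_MS = 30000; // 30s ceiling

function cappedDelayMs(attempt: number): number {
  return Math.min(SUBMIT_504_BASE_DELAY_MS * 2 ** attempt, SUBMIT_504_MAX_DELAY_MS);
}

// Previous scheme: jitter is added on top of the capped delay,
// so the sleep can reach 1.5x the cap.
function additiveJitterMs(attempt: number, random: () => number = Math.random): number {
  const capped = cappedDelayMs(attempt);
  return capped + random() * (capped / 2);
}

// Equal jitter: half the capped delay is fixed and half is random,
// so the sleep stays within [capped / 2, capped].
function equalJitterMs(attempt: number, random: () => number = Math.random): number {
  const capped = cappedDelayMs(attempt);
  return capped / 2 + random() * (capped / 2);
}
```

Since `Math.random()` returns a value in [0, 1), the equal-jitter sleep never reaches the cap itself in practice, while the fixed half keeps each attempt's minimum wait growing with the attempt number.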

piyalbasu

This looks good to me. I'd say the maxFee comment is worth addressing, but the 504 improvement could maybe be pushed off

The previous additive jitter (cappedDelay + Math.random()*cappedDelay/2) let sleeps exceed SUBMIT_504_MAX_DELAY_MS by 50% (up to 45s with the 30s cap). Switch to equal-jitter: each attempt waits cappedDelay/2 plus a random amount up to cappedDelay/2, so the total stays in [cap/2, cap]. The deterministic floor keeps the schedule progressive while preserving the herd-smoothing benefit. Update the 504 retry tests to match the new (cappedDelay/2)-floor sleeps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous tightening to `maxFee !== undefined && newFee > maxFee` changed behavior for callers passing `maxFee: 0` (e.g. as a defensive default or numeric coercion of an unset env var): the old truthy check treated 0 as "no cap" and recursed, while the strict check threw on the very first tx_too_late. Restore the original `maxFee && newFee > maxFee` guard. The actual fix (forwarding `maxFee` across the recursive call) remains in place, which is what closes the underlying bug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Make the JSDoc describe the actual behavior: a bounded number of 504 retries with exponential backoff, immediate rethrow on non-504 errors, and an explicit note that an exhausted-504 leaves the transaction's on-chain status indeterminate so callers should poll the tx hash instead of resubmitting blindly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CassioMG · 2026-05-29T02:05:27Z

> This looks good to me. I'd say the maxFee comment is worth addressing, but the 504 improvement could maybe be pushed off

@piyalbasu Both your comments have been addressed, thanks for the review!

Bug-fix release rolling up #237 (SEP-10 challenge integrity) and #238 (submitWithFeeIncrease maxFee forwarding, bounded submitTransaction 504 retries, KeyManager removeKey cache deletion). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 27, 2026 21:38

Copilot started reviewing on behalf of CassioMG May 27, 2026 21:38

Copilot AI reviewed May 27, 2026

CassioMG changed the title ~~Correctness fixes: submitWithFeeIncrease maxFee, submitTransaction 504 retries, KeyManager cache~~ Correctness fixes: maxFee, 504 retries, key cache May 27, 2026

CassioMG added 3 commits May 27, 2026 15:08

CassioMG force-pushed the cg-correctness-improvements branch from a216429 to 4593135 May 27, 2026 22:09

CassioMG requested a review from piyalbasu May 27, 2026 22:14

CassioMG self-assigned this May 27, 2026

CassioMG requested a review from Copilot May 27, 2026 22:15

Copilot started reviewing on behalf of CassioMG May 27, 2026 22:15

Copilot AI reviewed May 27, 2026

piyalbasu reviewed May 28, 2026

piyalbasu approved these changes May 28, 2026

CassioMG and others added 3 commits May 28, 2026 18:44

CassioMG merged commit 07da147 into main May 29, 2026
8 checks passed

CassioMG deleted the cg-correctness-improvements branch May 29, 2026 02:06

CassioMG mentioned this pull request May 29, 2026

Release/3.0.1 #239 (merged, 3 tasks)

CassioMG merged 6 commits into main from cg-correctness-improvements
