Skip to content

fix(payments): Guard reserve headroom#561

Open
kotevcode wants to merge 1 commit into
mainfrom
fix/preflight-reserve-guard
Open

fix(payments): Guard reserve headroom#561
kotevcode wants to merge 1 commit into
mainfrom
fix/preflight-reserve-guard

Conversation

@kotevcode
Copy link
Copy Markdown
Contributor

Description

Adds a seller-side preflight reserve headroom guard before provider routing so requests are not served when their minimum/estimated max cost cannot fit inside the confirmed on-chain reserve. Also adds a recoverable reserve_headroom_required payment code so buyers top up the existing channel and retry, and lowers the channel top-up settled threshold from 85% to 65%.

This addresses the observed Diem case where the seller had 684445/1000000 settled/confirmed spend, a top-up failed at the previous 85% threshold, and the next request pushed cumulative spend beyond the confirmed reserve.

Release Notes

  • Sellers now request a reserve top-up before routing paid work when confirmed channel headroom is too low for the next request.
  • Buyers automatically respond to recoverable reserve-headroom 402s by topping up the active channel and retrying.
  • Channel top-ups are allowed once 65% of the current reserve is settled, reducing top-up failures near the reserve ceiling.

Types of Changes

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Chore (maintenance tasks, refactoring, or non-functional changes)

Checklist

  • My code follows the code style of this project
  • I have added the necessary documentation (if appropriate)
  • I have added tests (if appropriate)
  • Lint and unit tests pass locally with my changes

@kc-zero-lab
Copy link
Copy Markdown
Contributor

Reviewed as a coding agent. I think there’s a correctness blocker in the new reserve_headroom_required flow.

The seller can now emit reserve_headroom_required whenever the forward-looking required headroom exceeds remaining reserve, even when the accepted/settled cumulative is still far below the contract’s top-up gate. Example from the new tests: spent=accepted=100_000, reserveMax=1_000_000, large max-token estimate requires requiredCumulativeAmount=1_100_000. The buyer handles this by calling topUpReserve() directly, but topUpReserve() sends the current cumulative amount as-is. At 100k/1M, the seller’s on-chain topUp() should still fail TopUpThresholdNotMet because the contract now requires 65% of the current deposit to be settleable before top-up.

That creates a deadlock for large early requests:

  1. seller refuses to route because there is not enough confirmed headroom,
  2. buyer sends a top-up auth with only the current low cumulative,
  3. seller cannot apply the top-up on-chain because the 65% threshold is not met,
  4. retry gets the same reserve_headroom_required again.

This is different from the existing handleNeedAuth() top-up path, which first advances the SpendingAuth up to the old ceiling when top-up is needed. The new preflight path intentionally avoids signing undelivered spend, so it cannot satisfy the top-up threshold when accepted < 65% * reserveMax.

I’d suggest one of these fixes before merge:

  • only emit reserve_headroom_required when the accepted/settleable cumulative is already above the contract top-up threshold; otherwise return a distinct non-retryable/renegotiate response, or route only if the request fits current confirmed headroom;
  • or have the buyer retire/renegotiate a fresh larger reserve when headroom is needed but the current channel is below the top-up threshold;
  • or change the protocol payload so the seller can explicitly distinguish “top-up is currently possible” from “this request cannot fit current headroom yet”.

Also worth adding a buyer+seller integration test for the low-spend large-request case (accepted < 65% reserveMax and requiredHeadroom > remainingHeadroom) to prove it does not retry-loop.

I tried to run the targeted node tests, but this checkout is missing node_modules/vitest/tsc, so I could only do static review here.

@kotevcode kotevcode force-pushed the fix/preflight-reserve-guard branch from a124031 to c9920c4 Compare May 18, 2026 17:51
@kotevcode
Copy link
Copy Markdown
Contributor Author

Addressed the low-spend large-request retry-loop blocker.

The seller now only emits recoverable reserve_headroom_required when the accepted cumulative is already at the 65% top-up threshold. If confirmed headroom is insufficient before that threshold, it returns channel_exhausted, closes the current channel through the existing close path, and lets the buyer renegotiate a fresh reserve instead of retrying an impossible top-up.

Added coverage for the low-spend/max-token case and re-ran:

  • pnpm --filter @antseed/node test -- node-payment-pricing.test.ts buyer-payment-negotiator.test.ts payment-codec.test.ts
  • pnpm --filter @antseed/node typecheck
  • git diff --check

@Augustas11
Copy link
Copy Markdown
Contributor

Review verdict

APPROVE_WITH_NITS. Headroom math is correct, the deadlock blocker @kc-zero-lab flagged is properly resolved (seller now returns channel_exhausted + closes when accepted < 65% so the buyer never retries an impossible top-up), the 65% threshold is consistent on-chain and off-chain, and tests cover all three preflight branches (top-up, close, fits). Findings below are non-blocking.

Findings

P2 — Worth addressing

  1. Threshold drift risk: hardcoded 6500 BPS in two new placespackages/node/src/seller-request-handler.ts:633-636 (_canTopUpAtCurrentAccepted) and packages/node/src/payments/buyer-payment-manager.ts:32-34 bake 6500/0.65 as literals. AntseedChannels.TOP_UP_SETTLED_THRESHOLD_BPS is owner-settable via setTopUpSettledThresholdBps (AntseedChannels.sol:504). If an operator ever changes it, the seller's preflight mis-classifies canTopUpNow, emits reserve_headroom_required when the on-chain gate will revert, and the buyer's topUpReserve() round-trip burns seller gas on a guaranteed TopUpThresholdNotMet. Suggest reading the threshold from ChannelsClient at startup and threading it into seller-payment-manager.
  2. Per-request top-up amortization — at 65% the preflight (estimate vs remaining headroom) can fire for cheap requests with large max_tokens even when actual delivered cost is small. There is no per-channel minimum top-up batching, so bursty requests can churn one on-chain topUp() per request. Suggest requiredHeadroom > max(remainingHeadroom, minTopUpDelta) so micro-shortfalls fall through to the existing NeedAuth path.

P3 — Minor

  1. Parallel-request race in topUpReservebuyer-payment-manager.ts:974-1014 reads prevCeiling, computes newCeiling, signs, sends, then commits. Two concurrent reserve_headroom_required 402s (pipelined requests) can both observe the same prevCeiling and emit duplicate ReserveAuths — the seller's second topUp() reverts TopUpAmountTooLow (INV-10). Not unsafe, but a per-seller mutex around topUpReserve would suppress the spurious revert/gas.
  2. Buyer skips deposit-headroom check before signing top-upbuyer-payment-negotiator.ts:347-359 only checks balance.available <= 0n before calling topUpReserve. PR fix(node): Stop short deposit auth retries #563 adds a stricter available < maxReserveAmountUsdc short-circuit on the negotiation path. Once fix(node): Stop short deposit auth retries #563 lands, mirror that check in the new reserveHeadroomRequired branch so the buyer doesn't sign a top-up the seller can't fund.
  3. Memory drift — INV-09 and INV-60 still state "85%"; post-merge bookkeeping for our local invariants doc.

Notes

  • The requiredCumulativeAmount semantics split in types/protocol.ts:89-104 is documented well — forward-looking headroom target (recoverable) vs claimable SpendingAuth (exhausted).
  • The 100k/1M low-spend / max-token regression test (node-payment-pricing.test.ts:557-595) directly captures @kc-zero-lab's blocker scenario — nice coverage.
  • CI green: Test + Typecheck + E2E Payment Flow all SUCCESS.
  • Cross-cut with PR fix(payments): close zombie channels #554 (zombie-close): the preflight close path feeds the same settleSession flow fix(payments): close zombie channels #554 hardened; no obvious interaction issue.

Review assisted by Claude Code agent (@Augustas11).

@kotevcode kotevcode force-pushed the fix/preflight-reserve-guard branch from c9920c4 to c12c0e7 Compare May 19, 2026 21:26
@kotevcode kotevcode force-pushed the fix/preflight-reserve-guard branch from c12c0e7 to 810edeb Compare May 19, 2026 21:41
@Gekko-dot-ETH
Copy link
Copy Markdown

Gekko-dot-ETH commented May 21, 2026

Edit — redacting this: I got the version wrong. We weren't on 0.2.87
in prod — our seller was on @antseed/cli@0.1.121, which pins
@antseed/node@0.2.86, so it resolved the CLI's nested 0.2.86 copy rather
than the 0.2.87 we had at top-level. The 0xa97ce07e… overdraft was
0.2.86's effectiveMax = max(reserveMax, pendingTopUp.newMaxAmount), which
0.2.87 already removed — so it doesn't reproduce there.
@antseed/cli@0.1.122 already bumps the pin to 0.2.87; this was just a
stale CLI on our side. Sorry for the noise.

@kotevcode
Copy link
Copy Markdown
Contributor Author

kotevcode commented May 21, 2026 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants