
Fix: Improve anchor program deploy reliability with resumable buffers #4531

Open
swaroop-osec wants to merge 11 commits into otter-sec:master from swaroop-osec:fix/anchor-deploy

Conversation

@swaroop-osec
Collaborator

@swaroop-osec swaroop-osec commented May 14, 2026

Summary

  • Makes anchor program deploy and anchor program upgrade resumable across retries and reruns using persistent per-program buffer keypairs.
  • Improves priority-fee selection by querying contention-aware recent fees for the accounts being write-locked.
  • Honors --skip-preflight consistently across buffer writes, deploys, and upgrades.
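
To illustrate the contention-aware fee idea from the second bullet, here is a minimal sketch of how a fee could be picked from recent samples that were already scoped to the write-locked accounts. The `FeeSample` type, the `recommend_fee` name, the nearest-rank percentile, and the choice to ignore zero-fee slots are all assumptions made for this example; they are not taken from the PR's code.

```rust
/// Hypothetical shape of one recent-fee sample, e.g. one entry from a
/// fee query that was scoped to the accounts being write-locked.
#[derive(Debug, Clone, Copy)]
pub struct FeeSample {
    pub slot: u64,
    /// Fee in micro-lamports per compute unit observed for that slot.
    pub micro_lamports_per_cu: u64,
}

/// Returns the `percentile`-th (0..=100) fee among the non-zero samples,
/// using the nearest-rank method. Returns 0 when no slot paid a fee.
/// Dropping zero-fee slots is an assumption of this sketch.
pub fn recommend_fee(samples: &[FeeSample], percentile: u8) -> u64 {
    let mut fees: Vec<u64> = samples
        .iter()
        .map(|s| s.micro_lamports_per_cu)
        .filter(|&fee| fee > 0)
        .collect();
    if fees.is_empty() {
        return 0;
    }
    fees.sort_unstable();

    // Nearest rank: rank = ceil(p / 100 * n), clamped to at least 1.
    let p = percentile.min(100) as usize;
    let rank = (p * fees.len() + 99) / 100;
    fees[rank.max(1) - 1]
}
```

A caller would pass the samples returned for the program and buffer accounts only, so that the result reflects contention on those accounts rather than the cluster as a whole.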

Closes #4481

@vercel

vercel Bot commented May 14, 2026

@swaroop-osec is attempting to deploy a commit to the Solana Foundation Team on Vercel.

A member of the Team first needs to authorize it.

@greptile-apps

greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR overhauls anchor program deploy and anchor program upgrade to be resumable across retries and reruns: it introduces per-program persistent buffer keypairs stored under target/deploy/, a fetch_buffer_program_data helper for diff-only resume (re-sending only un-landed chunks), an outer retry loop in both deploy and upgrade paths, and consistent --skip-preflight and priority-fee propagation through all transaction types.

  • Persistent buffer + diff-only resume: ensure_buffer_keypair_arg pins a per-program keypair; prepare_write_messages skips already-matching chunks on each attempt, enabling efficient recovery from partial failures.
  • Contention-aware fees: get_recommended_micro_lamport_fee now accepts write_locked_accounts so the fee query is scoped to transactions that write-locked the same program/buffer accounts, replacing the global-median approach.
  • Skip-preflight propagation: deploy_program, upgrade_program, close_undersized_buffer, and all write batches now honour the caller's skip_preflight flag rather than always running preflight.
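
The diff-only resume described in the first bullet can be sketched as follows. This is a simplified, standalone illustration and not the PR's prepare_write_messages: the function name `chunks_to_write`, its signature, and the idea of comparing raw byte slices are assumptions for the example. It takes the local program bytes and the bytes currently held in the buffer account, and returns only the chunks that still need to be sent.

```rust
/// Returns `(offset, bytes)` for every chunk of `program` whose bytes do not
/// already match `on_chain` at the same offset. Chunks that have already
/// landed are skipped, so a retry only re-sends the missing ones.
///
/// `on_chain` is assumed to be the program-data region of the buffer
/// account (without any account header); it may be shorter than `program`.
pub fn chunks_to_write<'a>(
    program: &'a [u8],
    on_chain: &[u8],
    chunk_size: usize,
) -> Vec<(usize, &'a [u8])> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    program
        .chunks(chunk_size)
        .enumerate()
        .filter_map(|(index, chunk)| {
            let offset = index * chunk_size;
            // `get` returns None when the buffer is too short to hold this
            // chunk, which is treated as "not landed".
            let landed = on_chain.get(offset..offset + chunk.len()) == Some(chunk);
            if landed {
                None
            } else {
                Some((offset, chunk))
            }
        })
        .collect()
}
```

Note that this comparison only covers offsets inside the local binary, so it says nothing about bytes the buffer holds past the end of the program.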

Confidence Score: 3/5

The deploy/upgrade paths work correctly for the common case, but the persistent-buffer resume logic has a correctness gap when the user shrinks the program binary after a failed full-buffer-write attempt.

The stale-tail-bytes issue can silently deploy a corrupted program binary or fail ELF verification in a scenario the PR was specifically designed to handle — fully-written buffer, failed upgrade tx, then a smaller binary on retry.
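
The stale-tail condition can be made concrete with a small check. This is a sketch under stated assumptions and not code from the PR: `BufferReuse` and `classify_buffer` are hypothetical names, and the check assumes that a fresh buffer is zero-filled, so any non-zero byte past the end of the new binary must come from an earlier, larger upload.

```rust
/// Outcome of checking whether an existing buffer can be reused as-is
/// for a binary of a given length. Hypothetical type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum BufferReuse {
    /// Buffer is large enough and holds no leftover bytes past the binary.
    Reusable,
    /// Buffer cannot hold the new binary.
    TooSmall,
    /// Buffer still holds non-zero bytes from a previous, larger upload.
    StaleTail { first_stale_offset: usize },
}

/// `on_chain` is assumed to be the program-data region of the buffer account.
pub fn classify_buffer(on_chain: &[u8], new_program_len: usize) -> BufferReuse {
    if on_chain.len() < new_program_len {
        return BufferReuse::TooSmall;
    }
    // Scan everything after the new binary for leftover non-zero bytes.
    match on_chain[new_program_len..].iter().position(|&byte| byte != 0) {
        Some(i) => BufferReuse::StaleTail {
            first_stale_offset: new_program_len + i,
        },
        None => BufferReuse::Reusable,
    }
}
```

How a `StaleTail` result should be handled (for example by closing and recreating the buffer, or by overwriting the tail) is left open here; the sketch only detects the condition.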

cli/src/program.rs deserves a close second look, particularly the oversized-buffer case in both program_deploy and program_upgrade, and the commitment level passed to fetch_buffer_program_data.

Important Files Changed

| Filename | Overview |
| --- | --- |
| cli/src/lib.rs | Adds write_locked_accounts param to fee-query helpers for contention-aware priority fees; aligns DEFAULT_MAX_SIGN_ATTEMPTS to 5 (Agave default); moves --buffer injection out of this helper and into per-program callers; removes the global add_recommended_deployment_solana_args call from the workspace deploy() wrapper. |
| cli/src/program.rs | Large refactor introducing resumable buffer logic: persistent per-program keypairs, diff-only resume, close_undersized_buffer, outer retry loop, and skip_preflight propagation. Key issues: stale tail bytes when a fully-written larger binary's upgrade fails and the user then shrinks the binary; fetch_buffer_program_data uses potentially-finalized RPC commitment; CreateBuffer tx lacks a compute unit price. |

Reviews (1): Last reviewed commit: "fix(cli): ensure deploy directory is val..."

Comment thread cli/src/program.rs
Comment thread cli/src/program.rs
Comment thread cli/src/program.rs
@swaroop-osec swaroop-osec requested a review from jamie-osec May 14, 2026 14:44

Development

Successfully merging this pull request may close these issues.

anchor deploy getting stuck in 1.0.x