feat(sdk): high-level document APIs lack idempotent retry — timeout causes duplicate state transitions #3090

@thepastaclaw

Problem

The high-level document APIs (sdk.documents.create(), sdk.documents.replace(), sdk.documents.delete()) are atomic and opaque — they bundle nonce management, ST construction, signing, broadcasting, and waiting into a single call. This means:

  1. The nonce is bumped internally during ST construction (via get_identity_contract_nonce(bump=true) in the Rust SDK / put_to_platform_and_wait_for_response)
  2. On timeout, the caller has no access to the signed ST bytes — they cannot rebroadcast the same transition
  3. Retrying the high-level call creates a brand new ST with a new nonce, leading to duplicate state transitions (e.g., double posts, double deletes)

This is not a theoretical issue: it actively causes double-posting in real applications when the DAPI gateway returns 504 timeouts. See PastaPastaPasta/yappr#260 for a detailed application-level workaround.
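
The failure mode can be reproduced with a toy model. The sketch below is not SDK code: `FakePlatform`, `createDocument`, `timeoutError`, and the two retry helpers are hypothetical stand-ins, and it assumes the transition is accepted by the network while the response is lost. It also simplifies duplicate handling by silently ignoring a repeated nonce rather than returning a rejection.

```typescript
// Toy model of the double-post scenario. All names are hypothetical.

interface SignedTransition {
  nonce: number;
  payload: string;
}

function timeoutError(): Error {
  const e = new Error("gateway timeout");
  e.name = "TimeoutError";
  return e;
}

function isTimeout(e: unknown): boolean {
  return e instanceof Error && e.name === "TimeoutError";
}

class FakePlatform {
  readonly applied: SignedTransition[] = [];
  private dropNextResponse = true;

  // Applies a transition unless its nonce was already used.
  // The first response is "lost" to simulate a 504 from the gateway.
  broadcast(st: SignedTransition): void {
    const duplicate = this.applied.some((a) => a.nonce === st.nonce);
    if (!duplicate) this.applied.push(st);
    if (this.dropNextResponse) {
      this.dropNextResponse = false;
      throw timeoutError();
    }
  }
}

let nextNonce = 1;

// Mimics the opaque high-level call: every invocation bumps the nonce
// and builds a brand new transition.
function createDocument(platform: FakePlatform, payload: string): void {
  const st: SignedTransition = { nonce: nextNonce++, payload };
  platform.broadcast(st);
}

// Naive retry of the high-level call: the post lands twice.
function naiveRetry(platform: FakePlatform, payload: string): void {
  try {
    createDocument(platform, payload);
  } catch (e) {
    if (!isTimeout(e)) throw e;
    createDocument(platform, payload); // new nonce, new transition
  }
}

// Retry of the same signed transition: the duplicate nonce is not applied again.
function retrySameTransition(platform: FakePlatform, payload: string): void {
  const st: SignedTransition = { nonce: nextNonce++, payload };
  try {
    platform.broadcast(st);
  } catch (e) {
    if (!isTimeout(e)) throw e;
    platform.broadcast(st); // identical nonce, deduplicated
  }
}
```

In this model `naiveRetry` leaves two applied transitions for one logical post, while `retrySameTransition` leaves one.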

Current Workaround (application-level)

Applications must bypass the high-level API entirely and manually:

  1. Build Document → DocumentCreateTransition → BatchTransition → StateTransition
  2. Fetch and manage the identity contract nonce themselves
  3. Sign the ST and cache the serialized bytes (e.g., localStorage)
  4. Call sdk.wasm.broadcastStateTransition() and sdk.wasm.waitForResponse() separately
  5. On timeout/retry, deserialize the cached bytes and rebroadcast the identical signed ST
  6. Manually call refreshIdentityNonce() after success since the SDK's internal nonce cache is now stale

This is ~200 lines of intricate, error-prone code that every application consuming the SDK would need to independently implement. It should be handled by the SDK.
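
Steps 3 to 5 reduce to "persist the signed bytes before broadcasting, reuse them on the next attempt". A minimal sketch of that core, under stated assumptions: `KeyValueStore` (shaped like `localStorage`), `Broadcaster`, `submitOnce`, and the `pending-st:` key prefix are all hypothetical, and `broadcastAndWait` stands in for calling `broadcastStateTransition()` and then `waitForResponse()`. Nonce refresh (step 6) is not covered.

```typescript
// Hypothetical sketch: cache signed ST bytes so a retry rebroadcasts the same ST.

interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface Broadcaster {
  // Stand-in for broadcastStateTransition() followed by waitForResponse().
  broadcastAndWait(signedBytes: Uint8Array): Promise<void>;
}

function toHex(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return out;
}

function fromHex(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// opId identifies the logical operation (for example, a client-side post ID).
async function submitOnce(
  opId: string,
  store: KeyValueStore,
  client: Broadcaster,
  buildAndSign: () => Promise<Uint8Array>
): Promise<void> {
  const key = "pending-st:" + opId;
  const cached = store.getItem(key);
  let signedBytes: Uint8Array;
  if (cached !== null) {
    // Retry path: reuse the identical signed transition.
    signedBytes = fromHex(cached);
  } else {
    // First attempt: the nonce is consumed here, so persist before broadcasting.
    signedBytes = await buildAndSign();
    store.setItem(key, toHex(signedBytes));
  }
  // If this throws (for example on timeout), the cache entry is kept for the retry.
  await client.broadcastAndWait(signedBytes);
  // Confirmed: the cached bytes are no longer needed.
  store.removeItem(key);
}
```

Calling `submitOnce` again with the same `opId` after a timeout sends the cached bytes and does not invoke `buildAndSign` a second time.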

Affected APIs

All high-level document operations in wasm-sdk (and likely rs-sdk):

  • sdk.documents.create() → document_create() → put_to_platform_and_wait_for_response()
  • sdk.documents.replace() → document_replace() → put_to_platform_and_wait_for_response()
  • sdk.documents.delete() → document_delete() → sdk.document_delete()
  • sdk.documents.transfer() → document_transfer() → transfer_document_to_identity_and_wait_for_response()
  • sdk.documents.purchase() / sdk.documents.updatePrice()

Source: packages/wasm-sdk/src/state_transitions/document.rs

Proposed Solution

Option A: Two-Phase API (Prepare + Execute)

Add a prepare variant for each document operation that returns a signed StateTransition without broadcasting:

// Phase 1: Build, sign, and return the ST (bumps nonce internally)
const signedST: StateTransition = await sdk.documents.prepareCreate({
  document, identityKey, signer
});

// Application can now cache signedST.toBytes() for retry safety

// Phase 2: Broadcast and wait (can be retried with the same ST)
await sdk.wasm.broadcastStateTransition(signedST);
const result = await sdk.wasm.waitForResponse(signedST);

Pros: Minimal API change, gives applications full control over retry and caching strategies.
Cons: Applications still manage their own retry/cache logic (though much simpler now).
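
With a prepared ST in hand, the remaining application-side retry logic could be as small as the helper below. This is a sketch under assumptions, not a proposed SDK surface: `TwoPhaseClient`, `RetryPolicy`, and `broadcastWithRetry` are hypothetical names, the client methods only mirror `broadcastStateTransition()` and `waitForResponse()`, and the error predicates are supplied by the caller because the issue does not specify how timeouts or duplicate rejections are reported. Backoff between attempts is omitted.

```typescript
// Hypothetical retry helper for a two-phase (prepare + execute) flow.

interface TwoPhaseClient<ST, R> {
  broadcast(st: ST): Promise<void>;
  waitForResponse(st: ST): Promise<R>;
}

interface RetryPolicy {
  maxAttempts: number;
  isTimeout(err: unknown): boolean;   // e.g. a 504 from the gateway
  isDuplicate(err: unknown): boolean; // e.g. "already in chain/mempool"
}

async function broadcastWithRetry<ST, R>(
  client: TwoPhaseClient<ST, R>,
  signedST: ST,
  policy: RetryPolicy
): Promise<R> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= policy.maxAttempts; attempt++) {
    try {
      try {
        await client.broadcast(signedST);
      } catch (err) {
        // A duplicate rejection means an earlier attempt got through,
        // so continue and wait for that attempt's result.
        if (!policy.isDuplicate(err)) throw err;
      }
      return await client.waitForResponse(signedST);
    } catch (err) {
      // Only timeouts are retried; everything else is surfaced immediately.
      if (!policy.isTimeout(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

Because every attempt reuses the same `signedST`, the nonce is consumed exactly once regardless of how many attempts are made.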

Option B: Built-in Idempotent Retry

Make the existing high-level API internally idempotent:

  1. After building and signing the ST, cache the signed bytes internally before broadcasting
  2. If waitForResponse times out, query Platform for the document to check if it landed
  3. On retry (same document ID + same nonce), rebroadcast the cached ST bytes instead of building a new one
  4. On confirmation or "already exists" error, clear the cache and return success

// Same API, but now idempotent: safe to retry on timeout
const result = await sdk.documents.create({
  document, identityKey, signer,
  settings: { retryOnTimeout: true }  // opt-in for backward compat
});

Pros: Zero application-level complexity, "just works."
Cons: More complex SDK internals, cache storage strategy needs thought (in-memory? configurable?).
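
The four steps of Option B could be sketched roughly as follows, using an in-memory cache keyed by document ID. Every name here (`IdempotentDeps`, `makeIdempotentCreate`, and the dependency functions) is hypothetical and only models the control flow; the real SDK internals, error types, and cache storage are open questions in this issue. The sketch rethrows the timeout when the document has not landed, so the caller retries by invoking the same function again.

```typescript
// Hypothetical model of an internally idempotent create().

interface IdempotentDeps {
  buildAndSign(documentId: string): Promise<Uint8Array>; // bumps the nonce
  broadcastAndWait(signedBytes: Uint8Array): Promise<void>;
  documentExists(documentId: string): Promise<boolean>;  // query Platform
  isTimeout(err: unknown): boolean;
  isAlreadyExists(err: unknown): boolean;
}

function makeIdempotentCreate(
  deps: IdempotentDeps
): (documentId: string) => Promise<void> {
  // In-memory cache of signed STs awaiting confirmation.
  const pending: Record<string, Uint8Array | undefined> = {};

  return async function create(documentId: string): Promise<void> {
    // Steps 1 and 3: sign once, then reuse the cached bytes on retry.
    let signed = pending[documentId];
    if (signed === undefined) {
      signed = await deps.buildAndSign(documentId);
      pending[documentId] = signed;
    }
    try {
      await deps.broadcastAndWait(signed);
    } catch (err) {
      if (deps.isAlreadyExists(err)) {
        // Step 4: an earlier attempt landed; treat as success.
      } else if (deps.isTimeout(err)) {
        // Step 2: check whether the document landed despite the timeout.
        const landed = await deps.documentExists(documentId);
        if (!landed) throw err; // cache entry is kept for the next attempt
      } else {
        // Handling of other errors is left open here; the entry stays cached.
        throw err;
      }
    }
    // Step 4: confirmed, so clear the cache entry.
    delete pending[documentId];
  };
}
```

An in-memory cache does not survive a page reload or process restart, which is one reason the storage strategy noted under Cons needs a decision.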

Option C: Both

Implement Option A (prepare/execute split) first as it's the simpler change, then build Option B on top of it. Applications that need custom retry logic use the two-phase API; applications that want simplicity use the high-level API with built-in idempotency.

Additional Context

  • The low-level primitives already exist: broadcastStateTransition(), waitForResponse(), StateTransition.toBytes()/fromBytes() (in broadcast.rs)
  • The nonce bump happens in the Rust SDK's PutDocument trait implementation — deep enough that WASM consumers cannot intercept it
  • Platform's protocol already handles duplicate STs gracefully (rejects as "already in chain/mempool"). The idempotency guarantee exists at the protocol level; the SDK just does not leverage it
  • refreshIdentityNonce() exists, but applications must call it manually as a workaround whenever the SDK's internal nonce cache goes stale after manual ST construction

References

  • Application-level workaround: PastaPastaPasta/yappr#260
  • WASM SDK document operations: packages/wasm-sdk/src/state_transitions/document.rs
  • Broadcast primitives: packages/wasm-sdk/src/state_transitions/broadcast.rs
