Outbound webhooks: retry with jitter, signature per attempt, and delivery id
## Description
When **StreamPay** calls customer **webhooks** (stream settled, payment tick), implement
durable retry with exponential backoff, jitter, a maximum attempt count, and a
delivery id that stays the same across every retry of one event. Customers must dedupe
on that id; on our side, events that exhaust their retries must land in the DLQ where
they are visible, not be dropped silently.
Pair this with the HMAC work from the inbound verification issue for the full story.
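
To make the pieces above concrete, here is a minimal sketch of a full-jitter backoff delay, a delivery id that is fixed when the event is first enqueued, and a signature computed per attempt over the same body bytes. Everything named here is an assumption for illustration, not StreamPay's actual contract: the header names (`X-Delivery-Id`, `X-Signature`, and so on), the `timestamp.body` signing format, the `base` and `cap` defaults, and the choice of "full jitter" as the jitter strategy.

```python
import hashlib
import hmac
import json
import random
import uuid


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)] seconds.

    base and cap are illustrative defaults, not agreed values.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)


def new_delivery(event: dict) -> dict:
    """Serialize the body once so every retry sends and signs identical bytes."""
    body = json.dumps(event, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return {"delivery_id": str(uuid.uuid4()), "body": body}


def attempt_headers(delivery: dict, secret: bytes, attempt: int, timestamp: int) -> dict:
    """Headers for one attempt: same delivery id, fresh timestamp and signature.

    All header names and the "<timestamp>.<body>" format are hypothetical.
    """
    signed = str(timestamp).encode("ascii") + b"." + delivery["body"]
    signature = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    return {
        "Content-Type": "application/json",
        "X-Delivery-Id": delivery["delivery_id"],
        "X-Delivery-Attempt": str(attempt),
        "X-Signature-Timestamp": str(timestamp),
        "X-Signature": "sha256=" + signature,
    }
```

Because the body is serialized once in `new_delivery`, a retry can carry a new timestamp and signature without ever signing a different body, and the receiver can dedupe on the unchanged delivery id.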
## Requirements and context
- **At-least-once** delivery; document that any **2xx** counts as success and that 5xx responses and timeouts are retried.
- Jittered schedule; per-endpoint circuit breaker if an endpoint always returns 500.
- Table of attempts with `next_attempt_at` for observability.
- Tests: flaky receiver in integration tests; ensure eventual success or DLQ.
- PII in payload minimized; document retention.
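
The requirements above can be sketched as a small state model: classify each response, record a row per attempt with `next_attempt_at`, and open a per-endpoint circuit after repeated failures. This is a sketch under stated assumptions, not a schema proposal. `MAX_ATTEMPTS`, the breaker threshold and cooldown, the `AttemptRow` fields other than `next_attempt_at`, and the decision to dead-letter non-2xx, non-5xx statuses without retrying are all placeholders that the issue does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_ATTEMPTS = 8  # assumption: the real limit is still to be agreed


def classify(status: Optional[int]) -> str:
    """Map an HTTP status (None means timeout or connection error) to an outcome."""
    if status is None or 500 <= status <= 599:
        return "retry"
    if 200 <= status <= 299:
        return "delivered"
    return "dead"  # assumption: 3xx/4xx are not retried and go to the DLQ


@dataclass
class AttemptRow:
    """One row in the attempts table (field names are hypothetical)."""
    delivery_id: str
    attempt: int
    status: Optional[int]
    outcome: str
    next_attempt_at: Optional[datetime]


def record_attempt(delivery_id: str, attempt: int, status: Optional[int],
                   delay_seconds: float, now: datetime) -> AttemptRow:
    """Build the row for a finished attempt; exhausted retries become 'dead'."""
    outcome = classify(status)
    if outcome == "retry" and attempt >= MAX_ATTEMPTS:
        outcome = "dead"
    next_at = now + timedelta(seconds=delay_seconds) if outcome == "retry" else None
    return AttemptRow(delivery_id, attempt, status, outcome, next_at)


class EndpointBreaker:
    """Per-endpoint circuit: pause sends after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 5, cooldown: timedelta = timedelta(minutes=10)):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.open_until: Optional[datetime] = None

    def allow(self, now: datetime) -> bool:
        """True when the circuit is closed or the cooldown has elapsed."""
        return self.open_until is None or now >= self.open_until

    def record(self, outcome: str, now: datetime) -> None:
        if outcome == "delivered":
            self.failures = 0
            self.open_until = None
        elif outcome == "retry":
            self.failures += 1
            if self.failures >= self.threshold:
                self.open_until = now + self.cooldown
```

A worker would call `breaker.allow(now)` before sending, then feed the outcome of `record_attempt` back into `breaker.record`. Rows with outcome `dead` are the ones that need DLQ visibility.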
## Suggested execution
1. `git checkout -b feature/webhook-delivery-retry`
- Extract sender from ad-hoc code; add integration test with `httpbin` or mock server.
- PR with table of status codes; security note on not signing different bodies across retries.
- SLO for delivery latency in PR description.
- Timeframe: 96h; split DLQ admin UI into a follow-up.
- Run the full test suite; add or update tests until the agreed coverage bar is met.
- Cover edge cases listed in this issue; document any intentional exclusions with brief rationale in the PR.
- Include relevant test output (e.g. test runner summary) or a link to a passing CI run in the pull request.
- Add security notes for auth, keys, PII, chain settlement, or money movement (assumptions verified, out-of-scope items).
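
For the flaky-receiver integration test mentioned in the steps above, one possible shape uses only the Python standard library: a local receiver that fails a fixed number of times before succeeding, and a delivery loop that either succeeds eventually or gives up to the DLQ. The helper names (`make_flaky_server`, `deliver`), the `/hook` path, and the `"delivered"`/`"dlq"` return values are hypothetical, and the backoff sleep is deliberately left out so the test runs quickly. A real sender would wait between attempts.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_flaky_server(failures_before_success: int):
    """Start a local receiver that answers 500 N times, then 200."""
    state = {"calls": 0}

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)  # drain the request body
            state["calls"] += 1
            code = 500 if state["calls"] <= failures_before_success else 200
            self.send_response(code)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, *args):
            pass  # keep test output quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, state


def deliver(url: str, body: bytes, max_attempts: int) -> str:
    """Try up to max_attempts times; return 'delivered' or 'dlq'."""
    for _ in range(max_attempts):
        req = urllib.request.Request(url, data=body, method="POST")
        try:
            with urllib.request.urlopen(req, timeout=2) as resp:
                if 200 <= resp.status <= 299:
                    return "delivered"
        except urllib.error.HTTPError as exc:
            exc.close()
            if exc.code < 500:
                return "dlq"  # assumption: non-5xx errors are not retried
        except OSError:
            pass  # timeout or connection failure: retry
    return "dlq"


def run_checks() -> None:
    server, state = make_flaky_server(failures_before_success=2)
    url = "http://127.0.0.1:%d/hook" % server.server_address[1]
    assert deliver(url, b"{}", max_attempts=5) == "delivered"
    assert state["calls"] == 3  # two 500s, then one 200
    server.shutdown()
    server.server_close()
```

Calling `run_checks()` covers the eventual-success path. The DLQ path is the same setup with `failures_before_success` larger than `max_attempts`.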
## Example commit message
`feat(webhooks): outbound retry with jitter, idempotent delivery id, and DLQ on failure`
## Guidelines
- Target: at least 95% coverage on new or meaningfully changed code (per the repo’s standard tooling).
- Documentation: update contributor-facing or API documentation where a reviewer would be blocked without it.
- Timeframe: 96 hours to ready-for-review (surface blockers early).