Skip to content

Incident runbook: settlement stuck, double payout suspicion, and chain outage #141

@greatest0fallt1me

Description

@greatest0fallt1me

On-call runbook and POST /internal/incident/… (optional) for StreamPay

    ## Description

    **Author** a practical **runbook** in `docs/runbooks/`: (1) **stuck** settlement, (2)

suspected double payout or escrow leak, (3) Stellar outage or
Horizon hard-down. Link Grafana queries, log searches, and Soroban RPC
checks. This issue can add optional read-only internal status endpoints to speed triage.
No manual money move without 2-person rule—documented.

    ## Requirements and context

    - **Step-by-step** for L1; **escalation** for L2 with roles.
  • Links to rollback and key rotation runbooks.

  • Comms template for “we are having payment delays.”

  • PII rules in public customer comms.

  • Quarterly table-top exercise checkbox in the doc.

      ## Suggested execution
    
      1. `git checkout -b docs/runbook-stream-settlement`
    
  1. Interviews with current on-call; capture real grep and queries.

  2. PR with reviewer = security+ops; no code required, but 95% is N/A; instead completeness checklist.

  3. If internal endpoints: small tests; otherwise docs-only.

  4. Timeframe: 96h for v1; iterate after first incident.

     ## Test and commit
    
  • Run the full test suite; add or update tests until the agreed coverage bar is met.
  • Cover edge cases listed in this issue; document any intentional exclusions with brief rationale in the PR.
  • Include relevant test output (e.g. test runner summary) or a link to a passing CI run in the pull request.
  • Add security notes for auth, keys, PII, chain settlement, or money movement (assumptions verified, out-of-scope items).

Example commit message

docs(ops): add settlement and Stellar incident runbook with triage and escalation

Guidelines

  • Target: at least 95% coverage on new or meaningfully changed code (per the repo’s standard tooling).
  • Documentation: update contributor-facing or API documentation where a reviewer would be blocked without it.
  • Timeframe: 96 hours to ready-for-review (surface blockers early).

Metadata

Metadata

Assignees

No one assigned

    Labels

    area-settlementStreamPay ghit: area-settlementdomain-operationsStreamPay ghit: domain-operationspriority-p1StreamPay ghit: priority-p1type-docsStreamPay ghit: type-docs

    Type

    No fields configured for Task.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions