docs: production operations guide for SSE, replay, and backpressure #223

Description

@catgoose

Summary

Add an operator-focused guide for running Tavern-backed SSE systems in production.

Why

The library's delivery semantics are maturing, but production success also depends on operational choices:

  • proxy/load balancer buffering and timeouts
  • replay retention windows
  • subscriber pressure and eviction behavior
  • observability around disconnects, degradation, and recovery
  • sizing/fanout expectations

These concerns are central to the real value proposition of honest live delivery.

Scope

Document:

  • proxy/server settings that matter for SSE
  • retention/replay tradeoffs
  • interpreting backpressure and degradation signals
  • suggested metrics/logging
  • operational failure modes and expected client behavior

Non-goals

  • promising universal deployment recipes for every platform
  • replacing app-specific monitoring/alerting

Deliverables

  • production guide doc
  • README pointers to it
  • examples tied back to Tavern control events and delivery states

Acceptance criteria

  • a team adopting Tavern has a concrete starting point for production deployment and troubleshooting
  • the doc reflects the current delivery semantics rather than generic SSE advice
