|
| 1 | +# OJS Real-Time Status — Extension Specification |
| 2 | + |
| 3 | +| Field | Value | |
| 4 | +|-------------|---------------------------------------------| |
| 5 | +| **Title** | OJS Real-Time Job Status Updates | |
| 6 | +| **Version** | 0.1.0 | |
| 7 | +| **Status** | Experimental (Stage 0) | |
| 8 | +| **Maturity** | Experimental | |
| 9 | +| **Date** | 2025-07-15 | |
| 10 | +| **Layer** | 3 — Protocol Binding | |
| 11 | +| **Depends On** | ojs-core.md, ojs-events.md | |
| 12 | + |
| 13 | +--- |
| 14 | + |
| 15 | +## Abstract |
| 16 | + |
| 17 | +This extension defines how clients subscribe to real-time job status changes via **Server-Sent Events (SSE)** and **WebSocket** protocols. It eliminates the need for polling by pushing state-change notifications directly to connected clients. |
| 18 | + |
| 19 | +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). |
| 20 | + |
| 21 | +--- |
| 22 | + |
| 23 | +## Table of Contents |
| 24 | + |
| 25 | +1. [Introduction and Motivation](#1-introduction-and-motivation) |
| 26 | +2. [Server-Sent Events (SSE) Binding](#2-server-sent-events-sse-binding) |
| 27 | +3. [WebSocket Binding](#3-websocket-binding) |
| 28 | +4. [Event Format](#4-event-format) |
| 29 | +5. [Connection Management](#5-connection-management) |
| 30 | +6. [Security Considerations](#6-security-considerations) |
| 31 | +7. [Conformance Requirements](#7-conformance-requirements) |
| 32 | +8. [Examples](#8-examples) |
| 33 | + |
| 34 | +--- |
| 35 | + |
| 36 | +## 1. Introduction and Motivation |
| 37 | + |
| 38 | +Polling-based status checks create unnecessary load on the server and introduce latency between state transitions and client awareness. Real-time push notifications solve both problems by delivering events to subscribed clients the moment a job's state changes. |
| 39 | + |
| 40 | +**Rationale:** Background job systems frequently power user-facing workflows (file uploads, report generation, payment processing). Users expect immediate feedback when their job completes or fails. Without a standardized real-time mechanism, every OJS implementation invents its own, fragmenting the ecosystem and preventing portable dashboards and monitoring tools. |
| 41 | + |
| 42 | +This extension provides two complementary transport bindings: |
| 43 | + |
| 44 | +- **SSE** — Simple, HTTP-based, unidirectional streaming. Ideal for browser-based dashboards and monitoring tools. Built on standard HTTP infrastructure with automatic reconnection. |
| 45 | +- **WebSocket** — Full-duplex communication. Ideal for interactive applications that need to subscribe/unsubscribe dynamically and receive events with minimal overhead. |
| 46 | + |
| 47 | +An implementation that advertises real-time support in its manifest MUST support the SSE binding. Support for the WebSocket binding is not required (see Section 7). |
| 48 | + |
| 49 | +--- |
| 50 | + |
| 51 | +## 2. Server-Sent Events (SSE) Binding |
| 52 | + |
| 53 | +### 2.1 Subscribe to Job Updates |
| 54 | + |
| 55 | +``` |
| 56 | +GET /ojs/v1/jobs/{id}/events |
| 57 | +Accept: text/event-stream |
| 58 | +``` |
| 59 | + |
| 60 | +The server MUST respond with `Content-Type: text/event-stream` and begin streaming events for the specified job. |
| 61 | + |
| 62 | +**Rationale:** Per-job subscriptions enable lightweight, targeted monitoring. A client tracking a single job should not receive the full event firehose. |
| 63 | + |
| 64 | +If the job does not exist, the server MUST respond with HTTP 404 and a standard OJS error body (not an SSE stream). |
| 65 | + |
| 66 | +If the job is in a terminal state (`completed`, `cancelled`, `discarded`), the server SHOULD send a single `job.state_changed` event reflecting the current state and then close the stream. |
| 67 | + |
| 68 | +**Rationale:** Terminal jobs will never produce further events. Sending the current state and closing prevents clients from holding idle connections. |
| 69 | + |
| 70 | +### 2.2 Subscribe to Queue Events |
| 71 | + |
| 72 | +``` |
| 73 | +GET /ojs/v1/queues/{name}/events |
| 74 | +Accept: text/event-stream |
| 75 | +``` |
| 76 | + |
| 77 | +The server MUST respond with `Content-Type: text/event-stream` and begin streaming events for all jobs in the specified queue. |
| 78 | + |
| 79 | +If the queue does not exist, the server MUST respond with HTTP 404. |
| 80 | + |
| 81 | +### 2.3 SSE Protocol Requirements |
| 82 | + |
| 83 | +The server MUST comply with the Server-Sent Events section of the [WHATWG HTML Living Standard](https://html.spec.whatwg.org/multipage/server-sent-events.html): |
| 84 | + |
| 85 | +- Each event MUST include an `event` field indicating the event type. |
| 86 | +- Each event MUST include a `data` field containing the JSON-encoded event payload. |
| 87 | +- Each event MUST include an `id` field containing a monotonically increasing event identifier. |
| 88 | +- The stream SHOULD include a `retry` field (in milliseconds), for example once at the start of the stream, to advise the client on the reconnection interval. The RECOMMENDED default is `3000`. |
| 89 | + |
| 90 | +**Rationale:** The `id` field enables automatic reconnection via the `Last-Event-ID` header, preventing event loss during transient disconnections. |
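As a minimal sketch of the framing rules above, the following Python helper serializes one event into SSE wire format. The function name `format_sse_event` and its parameters are illustrative assumptions and are not defined by this specification.

```python
import json
from typing import Optional


def format_sse_event(event_type: str, payload: dict, event_id: str,
                     retry_ms: Optional[int] = None) -> str:
    """Serialize one event as an SSE frame (hypothetical helper)."""
    lines = []
    if retry_ms is not None:
        # Optional reconnection hint, in milliseconds.
        lines.append(f"retry: {retry_ms}")
    lines.append(f"id: {event_id}")
    lines.append(f"event: {event_type}")
    # Compact JSON never contains raw newlines, so one data line suffices.
    lines.append("data: " + json.dumps(payload, separators=(",", ":")))
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

Calling `format_sse_event("job.progress", {"job_id": "j1", "progress": 75}, "evt_0001", retry_ms=3000)` yields a frame with `retry`, `id`, `event`, and `data` lines followed by a blank line.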
| 91 | + |
| 92 | +### 2.4 Heartbeat |
| 93 | + |
| 94 | +The server MUST send a comment line (`:heartbeat`) at least every **15 seconds** when no events are pending. |
| 95 | + |
| 96 | +**Rationale:** SSE connections traverse proxies and load balancers that may close idle connections. Regular heartbeats prevent premature termination. |
| 97 | + |
| 98 | +### 2.5 Reconnection |
| 99 | + |
| 100 | +When a client reconnects with a `Last-Event-ID` header, the server SHOULD replay any events that occurred after the specified ID. If the server cannot replay (e.g., events were not retained), it MUST resume from the current point without error. |
| 101 | + |
| 102 | +**Rationale:** At-least-once delivery during reconnection is critical for reliable monitoring. However, implementations are not required to maintain unbounded event history. |
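The replay decision can be sketched as follows. This example assumes event identifiers are plain integer sequence numbers and that the server keeps retained events in an ordered list; both are simplifications for illustration (the examples in this document use string identifiers such as `evt_0042`), and `events_to_replay` is a hypothetical name.

```python
from typing import List, Tuple

# Assumed shape: (sequence number, serialized event).
Event = Tuple[int, str]


def events_to_replay(buffer: List[Event], last_event_id: int) -> List[Event]:
    """Return retained events newer than last_event_id.

    If the buffer no longer reaches back to last_event_id, return an
    empty list so the caller resumes from the current point without error.
    """
    if not buffer:
        return []
    oldest_seq = buffer[0][0]
    if last_event_id < oldest_seq - 1:
        # Events between last_event_id and the oldest retained event are gone.
        return []
    return [event for event in buffer if event[0] > last_event_id]
```

Returning nothing when a gap is detected is one possible reading of "resume from the current point"; an implementation could instead choose to replay whatever it still retains.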
| 103 | + |
| 104 | +--- |
| 105 | + |
| 106 | +## 3. WebSocket Binding |
| 107 | + |
| 108 | +### 3.1 Connection Endpoint |
| 109 | + |
| 110 | +``` |
| 111 | +WS /ojs/v1/ws |
| 112 | +``` |
| 113 | + |
| 114 | +The server MUST accept WebSocket upgrade requests at this path. The server MUST select the `ojs.v1` WebSocket subprotocol when the client offers it. |
| 115 | + |
| 116 | +### 3.2 Subscribe Message |
| 117 | + |
| 118 | +After connecting, a client subscribes to event channels by sending: |
| 119 | + |
| 120 | +```json |
| 121 | +{ |
| 122 | + "action": "subscribe", |
| 123 | + "channel": "job:{id}" |
| 124 | +} |
| 125 | +``` |
| 126 | + |
| 127 | +Supported channel patterns: |
| 128 | + |
| 129 | +| Pattern | Description | |
| 130 | +|-----------------|--------------------------------------| |
| 131 | +| `job:{id}` | Events for a specific job | |
| 132 | +| `queue:{name}` | Events for all jobs in a queue | |
| 133 | +| `all` | All events (global firehose) | |
| 134 | + |
| 135 | +The server MUST respond with an acknowledgment: |
| 136 | + |
| 137 | +```json |
| 138 | +{ |
| 139 | + "type": "subscribed", |
| 140 | + "channel": "job:01926f5e-..." |
| 141 | +} |
| 142 | +``` |
| 143 | + |
| 144 | +If the referenced job or queue does not exist, the server MUST respond with an error message: |
| 145 | + |
| 146 | +```json |
| 147 | +{ |
| 148 | + "type": "error", |
| 149 | + "code": "not_found", |
| 150 | + "message": "Job not found." |
| 151 | +} |
| 152 | +``` |
| 153 | + |
| 154 | +### 3.3 Unsubscribe Message |
| 155 | + |
| 156 | +```json |
| 157 | +{ |
| 158 | + "action": "unsubscribe", |
| 159 | + "channel": "job:{id}" |
| 160 | +} |
| 161 | +``` |
| 162 | + |
| 163 | +The server MUST respond with: |
| 164 | + |
| 165 | +```json |
| 166 | +{ |
| 167 | + "type": "unsubscribed", |
| 168 | + "channel": "job:{id}" |
| 169 | +} |
| 170 | +``` |
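The subscribe and unsubscribe exchanges in Sections 3.2 and 3.3 could be handled along these lines. This is a sketch only: `handle_client_message`, the `channel_exists` callback, the generic error text, and the `invalid_action` code for unknown actions are assumptions that this specification does not define.

```python
import json
from typing import Callable, Set


def handle_client_message(raw: str, subscriptions: Set[str],
                          channel_exists: Callable[[str], bool]) -> dict:
    """Apply one client message and return the server reply (hypothetical)."""
    msg = json.loads(raw)
    action = msg.get("action")
    channel = msg.get("channel", "")
    if action == "subscribe":
        if not channel_exists(channel):
            # Mirrors the error shape shown in Section 3.2.
            return {"type": "error", "code": "not_found",
                    "message": "Channel target not found."}
        subscriptions.add(channel)
        return {"type": "subscribed", "channel": channel}
    if action == "unsubscribe":
        subscriptions.discard(channel)
        return {"type": "unsubscribed", "channel": channel}
    # Assumption: the spec defines no reply for unknown actions.
    return {"type": "error", "code": "invalid_action",
            "message": "Unknown action."}
```

The `subscriptions` set is per connection, so a server would keep one set for each open WebSocket.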
| 171 | + |
| 172 | +### 3.4 Event Messages |
| 173 | + |
| 174 | +The server pushes events to subscribed clients in the envelope shown below. The `data` field carries the same payload as the SSE `data` field (Section 4): |
| 175 | + |
| 176 | +```json |
| 177 | +{ |
| 178 | + "type": "event", |
| 179 | + "channel": "job:01926f5e-...", |
| 180 | + "event": "job.state_changed", |
| 181 | + "data": { ... }, |
| 182 | + "id": "evt_0042", |
| 183 | + "timestamp": "2025-07-15T10:30:00.000Z" |
| 184 | +} |
| 185 | +``` |
| 186 | + |
| 187 | +### 3.5 Connection Health |
| 188 | + |
| 189 | +The server MUST send WebSocket Ping frames at least every **30 seconds**. If a client fails to respond with a Pong within **10 seconds**, the server SHOULD close the connection. |
| 190 | + |
| 191 | +**Rationale:** Ping/Pong ensures dead connections are detected promptly, freeing server resources. |
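The timing rule can be expressed as a small pure function, shown here as a sketch. The name `next_action`, the string return values, and the 25-second ping interval (chosen to stay under the 30-second ceiling with some margin) are assumptions for illustration.

```python
PING_INTERVAL = 25.0  # seconds; assumed margin below the 30-second maximum
PONG_TIMEOUT = 10.0   # seconds; from Section 3.5


def next_action(now: float, last_ping_sent: float, awaiting_pong: bool) -> str:
    """Decide what the server should do for one connection (hypothetical)."""
    if awaiting_pong:
        # A Ping is outstanding: close once the Pong deadline has passed.
        if now - last_ping_sent > PONG_TIMEOUT:
            return "close"
        return "wait"
    if now - last_ping_sent >= PING_INTERVAL:
        return "ping"
    return "wait"
```

A server loop would call this periodically with a monotonic clock reading and act on the result.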
| 192 | + |
| 193 | +--- |
| 194 | + |
| 195 | +## 4. Event Format |
| 196 | + |
| 197 | +The payload of every real-time event is a JSON object whose fields depend on the event type: |
| 198 | + |
| 199 | +### 4.1 `job.state_changed` |
| 200 | + |
| 201 | +Emitted when a job transitions between lifecycle states. |
| 202 | + |
| 203 | +```json |
| 204 | +{ |
| 205 | + "job_id": "01926f5e-7a3c-7def-8000-111111111111", |
| 206 | + "queue": "default", |
| 207 | + "type": "email.send", |
| 208 | + "from": "active", |
| 209 | + "to": "completed", |
| 210 | + "timestamp": "2025-07-15T10:30:00.000Z" |
| 211 | +} |
| 212 | +``` |
| 213 | + |
| 214 | +| Field | Type | Required | Description | |
| 215 | +|-------------|--------|----------|---------------------------------------------------| |
| 216 | +| `job_id` | string | Yes | UUIDv7 of the job | |
| 217 | +| `queue` | string | Yes | Queue the job belongs to | |
| 218 | +| `type` | string | Yes | Job type | |
| 219 | +| `from` | string | Yes | Previous state (one of the 8 OJS lifecycle states) | |
| 220 | +| `to` | string | Yes | New state | |
| 221 | +| `timestamp` | string | Yes | RFC 3339 timestamp of the transition | |
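A consumer might check incoming `job.state_changed` payloads against the table above before acting on them. The sketch below is an assumption-laden example: `validate_state_changed` is a hypothetical name, `datetime.fromisoformat` only approximates RFC 3339 parsing, and the lifecycle state names are not checked because they are defined in the core specification rather than here.

```python
from datetime import datetime
from typing import List

REQUIRED_FIELDS = ("job_id", "queue", "type", "from", "to", "timestamp")


def validate_state_changed(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not isinstance(payload.get(field), str):
            errors.append(f"{field}: missing or not a string")
    ts = payload.get("timestamp")
    if isinstance(ts, str):
        try:
            # Older Python versions reject a trailing "Z", so normalize it.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            errors.append("timestamp: not a valid RFC 3339 timestamp")
    return errors
```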
| 222 | + |
| 223 | +### 4.2 `job.progress` |
| 224 | + |
| 225 | +Emitted when a worker reports progress on an active job. |
| 226 | + |
| 227 | +```json |
| 228 | +{ |
| 229 | + "job_id": "01926f5e-7a3c-7def-8000-111111111111", |
| 230 | + "progress": 75, |
| 231 | + "message": "Processing page 3 of 4", |
| 232 | + "timestamp": "2025-07-15T10:29:55.000Z" |
| 233 | +} |
| 234 | +``` |
| 235 | + |
| 236 | +| Field | Type | Required | Description | |
| 237 | +|-------------|---------|----------|----------------------------------------| |
| 238 | +| `job_id` | string | Yes | UUIDv7 of the job | |
| 239 | +| `progress` | integer | Yes | Percentage complete (0–100) | |
| 240 | +| `message` | string | No | Human-readable progress description | |
| 241 | +| `timestamp` | string | Yes | RFC 3339 timestamp | |
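The constraints in the `job.progress` table, in particular the integer range and the optional `message`, can be checked with a short function. This is a sketch; `validate_progress` and its error strings are hypothetical.

```python
from typing import List


def validate_progress(payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    if not isinstance(payload.get("job_id"), str):
        errors.append("job_id: missing or not a string")
    progress = payload.get("progress")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (not isinstance(progress, int) or isinstance(progress, bool)
            or not 0 <= progress <= 100):
        errors.append("progress: must be an integer from 0 to 100")
    if "message" in payload and not isinstance(payload["message"], str):
        errors.append("message: must be a string when present")
    if not isinstance(payload.get("timestamp"), str):
        errors.append("timestamp: missing or not a string")
    return errors
```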
| 242 | + |
| 243 | +--- |
| 244 | + |
| 245 | +## 5. Connection Management |
| 246 | + |
| 247 | +### 5.1 Graceful Shutdown |
| 248 | + |
| 249 | +When the server is shutting down, it MUST: |
| 250 | + |
| 251 | +1. Stop accepting new SSE and WebSocket connections. |
| 252 | +2. Send a `server.shutdown` event to all connected clients. |
| 253 | +3. Close all connections within a RECOMMENDED grace period of **5 seconds**. |
| 254 | + |
| 255 | +**Rationale:** Graceful shutdown prevents clients from hanging on dead connections and allows them to reconnect to another server instance. |
| 256 | + |
| 257 | +### 5.2 Client Limits |
| 258 | + |
| 259 | +The server SHOULD enforce a maximum number of concurrent real-time connections per client (identified by IP address or authentication token). The RECOMMENDED default limit is **100 connections**. |
| 260 | + |
| 261 | +**Rationale:** Without connection limits, a single misbehaving client could exhaust server resources. |
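One way to enforce such a limit is a per-client counter, sketched below. The class name `ConnectionLimiter` and its methods are illustrative assumptions; the sketch is not thread-safe and would need a lock or an event-loop-only access pattern in a real server.

```python
from collections import defaultdict


class ConnectionLimiter:
    """Track open real-time connections per client key (hypothetical)."""

    def __init__(self, max_per_client: int = 100):
        self.max_per_client = max_per_client
        self._counts = defaultdict(int)

    def try_acquire(self, client_key: str) -> bool:
        """Reserve a slot; return False if the client is at its limit."""
        if self._counts[client_key] >= self.max_per_client:
            return False
        self._counts[client_key] += 1
        return True

    def release(self, client_key: str) -> None:
        """Free a slot when a connection closes."""
        if self._counts.get(client_key, 0) > 0:
            self._counts[client_key] -= 1
            if self._counts[client_key] == 0:
                del self._counts[client_key]
```

The `client_key` would be the IP address or authentication token, whichever the server uses to identify clients.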
| 262 | + |
| 263 | +### 5.3 Event Buffering |
| 264 | + |
| 265 | +The server MAY buffer recent events (RECOMMENDED: last 100 events per channel) to support SSE reconnection via `Last-Event-ID`. |
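A bounded per-channel buffer can be built on `collections.deque` with `maxlen`, which discards the oldest entry automatically once the capacity is reached. The `ChannelBuffer` class below is a sketch under that assumption, not a prescribed design.

```python
from collections import defaultdict, deque
from typing import Any, List


class ChannelBuffer:
    """Keep the most recent events for each channel (hypothetical)."""

    def __init__(self, capacity: int = 100):
        # Each channel gets its own deque that drops the oldest event when full.
        self._events = defaultdict(lambda: deque(maxlen=capacity))

    def append(self, channel: str, event: Any) -> None:
        self._events[channel].append(event)

    def recent(self, channel: str) -> List[Any]:
        """Return retained events for a channel, oldest first."""
        return list(self._events.get(channel, ()))
```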
| 266 | + |
| 267 | +--- |
| 268 | + |
| 269 | +## 6. Security Considerations |
| 270 | + |
| 271 | +- Real-time endpoints SHOULD be subject to the same authentication and authorization mechanisms as other OJS API endpoints. |
| 272 | +- The server MUST NOT leak job data to unauthorized subscribers. If a client is not authorized to view a job, subscribe requests MUST be rejected with HTTP 403 (SSE) or an error message (WebSocket). |
| 273 | +- WebSocket connections SHOULD validate the `Origin` header to prevent cross-site WebSocket hijacking. |
| 274 | + |
| 275 | +--- |
| 276 | + |
| 277 | +## 7. Conformance Requirements |
| 278 | + |
| 279 | +An implementation claiming conformance to this extension: |
| 280 | + |
| 281 | +- MUST support the SSE binding (Section 2). |
| 282 | +- MUST emit `job.state_changed` events for all lifecycle transitions. |
| 283 | +- SHOULD support the WebSocket binding (Section 3). |
| 284 | +- MAY support the `job.progress` event type. |
| 285 | +- MUST implement heartbeats as specified (Section 2.4 and, if the WebSocket binding is supported, Section 3.5). |
| 286 | +- MUST handle graceful shutdown as specified (Section 5.1). |
| 287 | + |
| 288 | +--- |
| 289 | + |
| 290 | +## 8. Examples |
| 291 | + |
| 292 | +### 8.1 SSE — Monitoring a Single Job |
| 293 | + |
| 294 | +**Request:** |
| 295 | +```http |
| 296 | +GET /ojs/v1/jobs/01926f5e-7a3c-7def-8000-111111111111/events HTTP/1.1 |
| 297 | +Accept: text/event-stream |
| 298 | +``` |
| 299 | + |
| 300 | +**Response stream:** |
| 301 | +``` |
| 302 | +retry: 3000 |
| 303 | + |
| 304 | +:heartbeat |
| 305 | + |
| 306 | +id: evt_0001 |
| 307 | +event: job.state_changed |
| 308 | +data: {"job_id":"01926f5e-7a3c-7def-8000-111111111111","queue":"default","type":"email.send","from":"available","to":"active","timestamp":"2025-07-15T10:30:00.000Z"} |
| 309 | + |
| 310 | +:heartbeat |
| 311 | + |
| 312 | +id: evt_0002 |
| 313 | +event: job.state_changed |
| 314 | +data: {"job_id":"01926f5e-7a3c-7def-8000-111111111111","queue":"default","type":"email.send","from":"active","to":"completed","timestamp":"2025-07-15T10:30:05.000Z"} |
| 315 | + |
| 316 | +``` |
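On the client side, a stream like the one above can be split into events with a small parser. The sketch below is deliberately simplified: it skips comment lines such as `:heartbeat`, ignores `retry`, and does not carry the last event ID across events as a full SSE client would. The name `parse_sse` is an assumption.

```python
from typing import Dict, List


def parse_sse(stream: str) -> List[Dict[str, str]]:
    """Parse SSE text into a list of events (simplified, hypothetical)."""
    events = []
    current: Dict[str, str] = {}
    data_lines: List[str] = []
    for line in stream.split("\n"):
        if line == "":
            # A blank line dispatches the event if any data was collected.
            if data_lines:
                current["data"] = "\n".join(data_lines)
                events.append(current)
            current, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue  # Comment line, e.g. a heartbeat.
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # Strip the single optional leading space.
        if field == "data":
            data_lines.append(value)
        elif field in ("event", "id"):
            current[field] = value
    return events
```

Each returned dictionary holds the `id`, `event`, and `data` strings; the `data` value can then be passed to `json.loads` to obtain the payload described in Section 4.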
| 317 | + |
| 318 | +### 8.2 WebSocket — Subscribe and Receive Events |
| 319 | + |
| 320 | +**Client sends:** |
| 321 | +```json |
| 322 | +{"action":"subscribe","channel":"queue:default"} |
| 323 | +``` |
| 324 | + |
| 325 | +**Server responds:** |
| 326 | +```json |
| 327 | +{"type":"subscribed","channel":"queue:default"} |
| 328 | +``` |
| 329 | + |
| 330 | +**Server pushes event:** |
| 331 | +```json |
| 332 | +{"type":"event","channel":"queue:default","event":"job.state_changed","data":{"job_id":"01926f5e-7a3c-7def-8000-111111111111","queue":"default","type":"email.send","from":"available","to":"active","timestamp":"2025-07-15T10:30:00.000Z"},"id":"evt_0001","timestamp":"2025-07-15T10:30:00.000Z"} |
| 333 | +``` |
| 334 | + |
| 335 | +**Client unsubscribes:** |
| 336 | +```json |
| 337 | +{"action":"unsubscribe","channel":"queue:default"} |
| 338 | +``` |
| 339 | + |
| 340 | +**Server confirms:** |
| 341 | +```json |
| 342 | +{"type":"unsubscribed","channel":"queue:default"} |
| 343 | +``` |