fix: Graceful 429 rate-limit retry with countdown by vrinek · Pull Request #8 · ShinMegamiBoson/OpenPlanter

vrinek · 2026-02-20T19:30:05Z

Summary

When the API returns HTTP 429, the app now retries up to 5 times, honoring the `Retry-After` header, instead of crashing
A visible per-second countdown (`[d0/s6] Rate limited (attempt 1/5). Retrying in 20s...`) appears in the TUI trace so the user knows what's happening
Both _http_stream_sse (streaming LLM calls) and _http_json (model listing) are covered
Non-429 HTTP errors still raise immediately; connection-timeout retries remain independent
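The retry behavior summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in `agent/model.py`: `call_with_429_retry`, `HTTPStatusError`, the `send`/`sleep` callables, and the 5-second fallback delay are all hypothetical names and values chosen for the example, and it assumes "up to 5 times" means five retries after the initial request.

```python
from typing import Callable, Mapping, Tuple

# (status code, response headers, body); a stand-in for a real HTTP response.
Response = Tuple[int, Mapping[str, str], str]


class HTTPStatusError(RuntimeError):
    """Hypothetical error raised for non-429 responses or exhausted retries."""

    def __init__(self, status: int, body: str) -> None:
        super().__init__(f"HTTP {status}: {body[:200]}")
        self.status = status


def call_with_429_retry(
    send: Callable[[], Response],
    sleep: Callable[[float], None],
    max_retries: int = 5,        # retry count from the PR summary
    default_delay: float = 5.0,  # assumed fallback when Retry-After is unusable
) -> str:
    """Call send() and retry only on HTTP 429, waiting as Retry-After asks."""
    attempt = 0
    while True:
        status, headers, body = send()
        if status == 200:
            return body
        # Non-429 errors raise immediately; so does a 429 once retries run out.
        if status != 429 or attempt >= max_retries:
            raise HTTPStatusError(status, body)
        attempt += 1
        try:
            delay = float(headers.get("Retry-After", default_delay))
        except ValueError:
            delay = default_delay
        sleep(delay)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in an application it would typically be `time.sleep` or a countdown helper.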

Changes

| File | What changed |
| --- | --- |
| `agent/model.py` | Added `_parse_retry_after()`, `_notify_retry()`, `_sleep_with_countdown()` helpers. Added outer 429-retry loop to `_http_stream_sse()` and `_http_json()`. Added `on_retry` field to both model classes. Error bodies truncated to 8KB. |
| `agent/engine.py` | Wire `model.on_retry` at all depths (not depth-gated like `on_content_delta`). Clear in `finally` block. Messages include `[d/s]` prefix. |
| `tests/test_rate_limit.py` | 22 new tests: helpers, transport 429 logic, engine integration. |
| `tests/conftest.py` | Updated mock helper signatures for new params. |
| `tests/test_model_complex.py` | Updated `fake_stream_sse` signatures for new params. |
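For readers unfamiliar with the `Retry-After` header, here is a rough sketch of what helpers like `_parse_retry_after()` and `_sleep_with_countdown()` might look like. The function names below drop the leading underscore to signal that they are illustrative, not the PR's actual code. The 120-second cap comes from the known limitations in this PR; the 5-second fallback, the `notify` callback shape, and the handling of HTTP-date values are assumptions.

```python
import math
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Optional

MAX_DELAY = 120.0    # cap mentioned in the PR's known limitations
DEFAULT_DELAY = 5.0  # assumed fallback; the PR does not state one


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> float:
    """Turn a Retry-After header (delta-seconds or HTTP-date) into seconds."""
    if not value:
        return DEFAULT_DELAY
    value = value.strip()
    try:
        seconds = float(value)
    except ValueError:
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return DEFAULT_DELAY
        if when.tzinfo is None:
            # Treat a date without an offset as UTC.
            when = when.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        seconds = (when - now).total_seconds()
    # Clamp into [0, MAX_DELAY] so a bad header cannot stall the app.
    return max(0.0, min(seconds, MAX_DELAY))


def sleep_with_countdown(
    delay: float,
    notify: Callable[[int], None],
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Sleep in one-second steps, reporting the seconds remaining each step."""
    remaining = math.ceil(delay)
    while remaining > 0:
        notify(remaining)
        sleep(1)
        remaining -= 1
```

Rounding the delay up means the countdown may wait slightly longer than the server asked for, which is the safer direction when rate limited.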

Test plan

22 new tests covering all acceptance criteria
Full suite: 436 passed, 22 skipped, 0 failed
Manual verification: confirmed countdown appears in TUI during real 429 from Anthropic API

Known limitations (v1)

Parallel workers retry independently (no jitter or coordination)
_ThinkingDisplay spinner continues alongside countdown messages
HTTP 529 (overloaded_error) not handled (architecture supports trivial addition)
Worst-case hang: 5 retries × 120s cap ≈ 10 min per call (no total wall-clock timeout)
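The worst-case figure in the last item follows from simple arithmetic, sketched below. The constant names are made up for illustration, and the calculation counts only the sleep time between attempts, not the duration of the requests themselves.

```python
MAX_RETRIES = 5      # retry count from the PR summary
MAX_DELAY_S = 120    # per-retry cap from the known limitations


def worst_case_wait_seconds(retries: int = MAX_RETRIES, cap_s: int = MAX_DELAY_S) -> int:
    """Upper bound on time spent sleeping if every retry waits the full cap."""
    return retries * cap_s


if __name__ == "__main__":
    total = worst_case_wait_seconds()
    print(f"{total} s = {total / 60:.0f} min")
```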

When the Anthropic API returns HTTP 429, the app now retries up to 5 times honoring the Retry-After header, with a visible per-second countdown via on_retry callback. Both _http_stream_sse (streaming) and _http_json (model listing) are covered. Non-429 errors still raise immediately. Connection-timeout retries remain independent.

vrinek added 3 commits February 20, 2026 20:00

test: add engine integration test for on_retry wiring

fccc947

Verify that 429 retry messages reach the engine's on_event callback end-to-end, and that model.on_retry is cleared after complete() returns.
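The wiring this commit tests might look roughly like the sketch below. `FakeModel` and `run_step` are hypothetical stand-ins, not the classes in `agent/model.py` or `agent/engine.py`; the point is only the pattern of setting `on_retry` before the call and clearing it in a `finally` block so the callback does not outlive the call.

```python
from typing import Callable, List, Optional


class FakeModel:
    """Hypothetical model that reports one rate-limit retry, then succeeds."""

    def __init__(self) -> None:
        self.on_retry: Optional[Callable[[str], None]] = None

    def complete(self, prompt: str) -> str:
        if self.on_retry is not None:
            self.on_retry("Rate limited (attempt 1/5). Retrying in 20s...")
        return f"response to {prompt!r}"


def run_step(model: FakeModel, on_event: Callable[[str], None], prompt: str) -> str:
    """Forward retry messages to on_event for the duration of one call."""
    model.on_retry = on_event
    try:
        return model.complete(prompt)
    finally:
        # Always clear the callback, even if complete() raises.
        model.on_retry = None


if __name__ == "__main__":
    events: List[str] = []
    model = FakeModel()
    run_step(model, events.append, "hello")
    print(events, model.on_retry)
```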

fix: add depth/step prefix to rate-limit retry messages

79aac6b

Retry countdown messages now include the [d0/s6] prefix matching all other engine trace lines, for consistent TUI output.
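As an illustration of the message format, a formatter along these lines would reproduce the example shown in the summary. The function name and parameters are hypothetical; only the output shape is taken from this PR.

```python
def format_retry_message(
    depth: int, step: int, attempt: int, max_retries: int, remaining_s: int
) -> str:
    """Build a trace line with the [d<depth>/s<step>] prefix used by the engine."""
    return (
        f"[d{depth}/s{step}] Rate limited "
        f"(attempt {attempt}/{max_retries}). Retrying in {remaining_s}s..."
    )


if __name__ == "__main__":
    print(format_retry_message(depth=0, step=6, attempt=1, max_retries=5, remaining_s=20))
```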

vrinek wants to merge 3 commits into ShinMegamiBoson:main from vrinek:fix/429-rate-limit-retry
