fix: Graceful 429 rate-limit retry with countdown #8
Open
vrinek wants to merge 3 commits into ShinMegamiBoson:main from
When the Anthropic API returns HTTP 429, the app now retries up to 5 times, honoring the Retry-After header, and shows a visible per-second countdown through the on_retry callback. Both _http_stream_sse (streaming) and _http_json (model listing) are covered. Non-429 errors still raise immediately. Connection-timeout retries remain independent.
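The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent/model.py: the names RateLimited, parse_retry_after, sleep_with_countdown, and call_with_retry are hypothetical, and the 1-second fallback delay for a missing or unparseable header is an assumption. Only the limit of 5 retries and the message wording come from the PR text.

```python
import math
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Optional

MAX_RETRIES = 5        # "retries up to 5 times" per the PR description
DEFAULT_DELAY = 1.0    # assumption: fallback when Retry-After is absent or invalid


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, retry_after: Optional[str] = None):
        super().__init__("HTTP 429")
        self.retry_after = retry_after


def parse_retry_after(value: Optional[str],
                      default: float = DEFAULT_DELAY,
                      now: Optional[datetime] = None) -> float:
    """Return a delay in seconds from a Retry-After value (seconds or HTTP-date)."""
    if not value:
        return default
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def sleep_with_countdown(delay: float, attempt: int,
                         on_retry: Optional[Callable[[str], None]],
                         sleep: Callable[[float], None] = time.sleep) -> None:
    """Sleep in one-second ticks, reporting the remaining time on each tick."""
    remaining = math.ceil(delay)
    while remaining > 0:
        if on_retry:
            on_retry(f"Rate limited (attempt {attempt}/{MAX_RETRIES}). "
                     f"Retrying in {remaining}s...")
        sleep(1)
        remaining -= 1


def call_with_retry(request: Callable[[], object],
                    on_retry: Optional[Callable[[str], None]] = None,
                    sleep: Callable[[float], None] = time.sleep):
    """Call request(); on RateLimited, wait and retry up to MAX_RETRIES times."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return request()
        except RateLimited as exc:
            sleep_with_countdown(parse_retry_after(exc.retry_after),
                                 attempt, on_retry, sleep)
    # Last retry: a 429 raised here propagates to the caller.
    return request()
```

Only RateLimited is caught, so any other exception propagates on the first occurrence, which mirrors the "non-429 errors still raise immediately" behavior. The sleep parameter is injectable purely so the sketch can be exercised without real waiting.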
Verify that 429 retry messages reach the engine's on_event callback end-to-end, and that model.on_retry is cleared after complete() returns.
Retry countdown messages now include the [d0/s6] prefix matching all other engine trace lines, for consistent TUI output.
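The wiring these two commits describe (forwarding retry messages to on_event with the trace prefix, then clearing model.on_retry afterwards) might look something like the sketch below. FakeModel and run_step are hypothetical names, and reading the prefix as d = depth and s = step is an assumption; the actual code in agent/engine.py may differ.

```python
from typing import Callable, Optional


class FakeModel:
    """Hypothetical model exposing an optional on_retry hook."""

    def __init__(self) -> None:
        self.on_retry: Optional[Callable[[str], None]] = None

    def complete(self, prompt: str) -> str:
        # Simulate one rate-limit notification before succeeding.
        if self.on_retry:
            self.on_retry("Rate limited (attempt 1/5). Retrying in 1s...")
        return f"echo: {prompt}"


def run_step(model: FakeModel, prompt: str, depth: int, step: int,
             on_event: Callable[[str], None]) -> str:
    """Run one completion, forwarding retry messages with a [d/s] prefix."""

    def forward(message: str) -> None:
        # Assumption: d is the recursion depth and s the step counter.
        on_event(f"[d{depth}/s{step}] {message}")

    model.on_retry = forward
    try:
        return model.complete(prompt)
    finally:
        # Always clear the hook so it cannot outlive this call.
        model.on_retry = None


events: list = []
model = FakeModel()
result = run_step(model, "hi", depth=0, step=6, on_event=events.append)
```

Clearing the hook in finally means it is reset even when complete() raises, so a stale callback bound to an earlier depth or step cannot be reused by a later call.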
Summary
- HTTP 429 responses are retried honoring the Retry-After header instead of crashing
- A countdown message ([d0/s6] Rate limited (attempt 1/5). Retrying in 20s...) appears in the TUI trace so the user knows what's happening
- Both _http_stream_sse (streaming LLM calls) and _http_json (model listing) are covered

Changes
- agent/model.py: _parse_retry_after(), _notify_retry(), _sleep_with_countdown() helpers. Added outer 429-retry loop to _http_stream_sse() and _http_json(). Added on_retry field to both model classes. Error bodies truncated to 8KB.
- agent/engine.py: model.on_retry is set at all depths (not depth-gated like on_content_delta) and cleared in a finally block. Messages include [d/s] prefix.
- tests/test_rate_limit.py, tests/conftest.py, tests/test_model_complex.py: fake_stream_sse signatures for new params.

Test plan

Known limitations (v1)
- _ThinkingDisplay spinner continues alongside countdown messages
- Overloaded responses (overloaded_error) not handled (architecture supports trivial addition)