fix: resolve false "plan limit reached" error for cloud models#1251

Open
Dang-Ver wants to merge 1 commit into eigent-ai:main from Dang-Ver:fix/false-plan-limit-error

Conversation

@Dang-Ver
Summary

  • Fix incorrect model_platform mapping for cloud models routed through LiteLLM proxy — non-GPT models (Gemini, Claude, Minimax) were using native provider SDKs (gemini, anthropic) instead of openai-compatible-model, causing protocol mismatches that were misinterpreted as budget errors
  • Remove overly broad " 429" heuristic in error_format.py that falsely classified HTTP 429 rate limits as insufficient_quota errors, and add a separate rate_limit_exceeded handler
  • Add missing cloud models (GPT-5/5.1/5.2/5-mini, Gemini 3 Flash, Claude Sonnet 4-5) to backend ModelType enum
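The second fix above can be sketched as follows. This is a minimal illustration of the described behavior, not the actual `error_format.py` code; the function and category names are assumptions. The point is that HTTP 429 must be classified by its status code as a transient rate limit, rather than by a loose `" 429"` substring match that conflated it with quota exhaustion.

```python
# Hypothetical sketch of the error classification change described above.
# Function and category names are assumptions, not the real error_format.py API.

def classify_provider_error(status_code: int, message: str) -> str:
    """Map a provider error to a user-facing error category."""
    # HTTP 429 is a transient rate limit, not an exhausted plan. The old
    # heuristic matched the substring " 429" anywhere in the message and
    # mislabeled these responses as insufficient_quota.
    if status_code == 429:
        return "rate_limit_exceeded"
    # Genuine quota exhaustion is signaled explicitly by the provider,
    # e.g. with an "insufficient_quota" error code in the response body.
    if "insufficient_quota" in message:
        return "insufficient_quota"
    return "unknown_error"

# A real rate limit no longer surfaces as a "plan limit reached" error.
print(classify_provider_error(429, "Too Many Requests"))  # rate_limit_exceeded
```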

Test plan

  • Select Minimax M2.5 as cloud model and send a message — should no longer show "plan limit reached"
  • Select Gemini 3 Pro as cloud model and send a message — should work without false quota error
  • Select GPT 5.2 as cloud model and send a message — should work as before
  • Select GPT 4.1 as cloud model and send a message — verify no regression
  • Trigger an actual rate limit (429) and verify the new "Rate limit exceeded" message appears instead of "exceeded your quota"
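The `model_platform` fix from the summary can be sketched like this. This is an illustrative sketch only; the function name and platform strings mirror the PR description (`openai-compatible-model`, `gemini`, `anthropic`) but the real mapping logic in the codebase may differ. The key idea is that any model routed through the LiteLLM proxy speaks the OpenAI-compatible protocol, so model-name prefixes must not override that with a native provider SDK.

```python
# Hypothetical sketch of the platform-selection fix described above.
# The helper name and the routed_via_proxy flag are assumptions.

def platform_for(model_name: str, routed_via_proxy: bool) -> str:
    """Pick the client platform for a model."""
    if routed_via_proxy:
        # All cloud models behind the LiteLLM proxy use the
        # OpenAI-compatible protocol, regardless of vendor prefix.
        return "openai-compatible-model"
    # Direct connections may still use native provider SDKs.
    if model_name.startswith("gemini"):
        return "gemini"
    if model_name.startswith("claude"):
        return "anthropic"
    return "openai"

# Before the fix, prefix matching ran first, so proxied Gemini/Claude
# models hit native SDKs and the protocol mismatch surfaced as a
# false "plan limit reached" error.
print(platform_for("gemini-3-pro", routed_via_proxy=True))  # openai-compatible-model
```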

@Dang-Ver force-pushed the fix/false-plan-limit-error branch from 226e847 to 3351732 on February 14, 2026, 11:46
@Dang-Ver
Author

@Wendong-Fan, please take a look at this issue.

