Update agent model configs and provider budgets for 2025 #127
Merged
- Add browser User-Agent header to test-models.py to bypass Cloudflare bot protection (HTTP 403, error code 1010) on Groq and Cerebras APIs
- Handle list-type content in responses from reasoning models (Mistral magistral) that return content as an array of parts instead of a string
- Remove unavailable models: o4-mini (GitHub), qwen-long (Qwen)
- Update OpenRouter free model list to currently available models

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
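The list-type content handling described in this commit can be sketched as follows. This is a minimal illustration, not the code from test-models.py: the function name `content_to_text` is hypothetical, and the assumption that text parts are dicts carrying a string `text` field (with other part types skipped) is mine.

```python
from typing import Any


def content_to_text(content: Any) -> str:
    """Flatten a chat message's content into plain text.

    Assumption: list-type content is a sequence of parts, where each
    text part is either a bare string or a dict with a string "text"
    field. Parts of any other shape are skipped.
    """
    if content is None:
        return ""
    if isinstance(content, str):
        # The common case: content is already a plain string.
        return content
    if isinstance(content, list):
        pieces = []
        for part in content:
            if isinstance(part, str):
                pieces.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                pieces.append(part["text"])
        return "".join(pieces)
    # Unknown shape: fall back to its string representation.
    return str(content)
```

A test script can then pass every response through one function regardless of which provider returned it.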
Groq and Cerebras had hardcoded model lists with models that no longer exist (openai/gpt-oss-120b, kimi-k2-0905, qwen3-235b). Switch to fetch:true so LibreChat discovers available models from the API, keeping only known-good models as defaults. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Previous defaults (gemini-2.5-flash-preview, llama-4-scout, deepseek-r1-0528, qwen3-235b) don't exist in OpenRouter's free tier. Replaced with verified models from /models endpoint: gpt-oss-120b, nemotron-3-super-120b-a12b, llama-3.3-70b-instruct, hermes-3-llama-3.1-405b. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Add gpt-oss-120b, kimi-k2, qwen3-32b from confirmed /models response. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Model lists are now confirmed from live /models endpoints — no need for runtime discovery which can be blocked by Cloudflare. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
librechat-user.yaml:
- Add free usage amounts to each provider comment (RPM, RPD, tok/day)

agent-models.json:
- Fix Cerebras model IDs: qwen3-235b → qwen-3-235b-a22b-instruct-2507
- Remove gpt-oss-120b from Cerebras (doesn't exist there)
- Fix OpenRouter news agent: gemini-2.5-flash:free → minimax-m2.5:free
- Update budget notes with verified free tier limits

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
- Qwen: clarify one-time token grant (not monthly)
- GitHub Models: 10 RPM / 50 RPD for high-end, ~150 RPD for small
- OpenRouter: 20 RPM / 50 RPD (was ~50 RPD)
- agent-models.json: update all budget notes with verified numbers

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
- signals-analyst (L2): GitHub Models → Groq
- news-the-augur (L4): Cohere → Cerebras
- news-financial-augur (L4): OpenRouter → Groq
- news-finanz-augur (L4): GitHub Models → Mistral
- L5 default: GitHub Models → Cohere (allowed for live-chat tier)
- Cohere/GitHub Models/OpenRouter/Qwen reserved for prototyping and one-off bootstrap runs (e.g. Qwen's 1M-token data seeding)

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
- agent-models-bootstrap.json: Qwen-heavy config for initial data seeding (burns 1M-token free grant across qwen3-max/plus/flash)
- seed-agents.py --mode bootstrap|continuous: auto-selects models file
- Post-install info: model test, agent setup, bootstrap workflow
- 9 new tests: mode selection, bootstrap JSON validity, provider restrictions (no Qwen/GitHub/OpenRouter in L1-L4 continuous)

Workflow: seed --mode bootstrap → bootstrap-data.py → seed --mode continuous

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
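The provider restriction that the new tests check (no Qwen/GitHub/OpenRouter in L1-L4 for continuous mode) could look roughly like the sketch below. The schema is an assumption: the real layout of agent-models.json is not shown in this PR, so the `level` and `provider` keys, the provider spellings, and the function name are all hypothetical.

```python
# Providers kept out of the L1-L4 tiers in continuous mode, per the
# commit message. The lowercase spellings are an assumption.
RESTRICTED_PROVIDERS = {"qwen", "github", "openrouter"}
RESTRICTED_LEVELS = {"L1", "L2", "L3", "L4"}


def restricted_assignments(agents: dict) -> list:
    """Return the sorted names of agents that break the restriction.

    `agents` maps agent name -> {"level": ..., "provider": ...}.
    This shape is hypothetical and used only for illustration.
    """
    violations = []
    for name, cfg in agents.items():
        level = cfg.get("level")
        provider = str(cfg.get("provider", "")).lower()
        if level in RESTRICTED_LEVELS and provider in RESTRICTED_PROVIDERS:
            violations.append(name)
    return sorted(violations)
```

A test over the continuous config would then assert that the returned list is empty.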
- agent_client.py: shared AgentClient with discovery, streaming, .env loading; replaces duplicate code in cron dispatcher, trigger command, and bootstrap-data.py
- bootstrap-data.py: auto-loads AUGUR_AGENTS_API_KEY from .env, auto-discovers agent ID via API (no manual --api-key/--agent-id needed)
- Augur.sh cron/trigger: refactored to use AgentClient via PYTHONPATH
- Augur.sh bootstrap command: simplified, no more env var requirements
- 14 new tests for agent_client (load_env, find_agent, invoke, logging)
- Updated bootstrap tests for new AgentClient-based API

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
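For readers unfamiliar with the .env loading mentioned above, here is a minimal sketch of the idea. It is not the load_env from agent_client.py, whose signature and behavior are not shown in this PR. The parsing rules (skip comments, strip surrounding quotes) and the choice not to override variables that are already set are assumptions.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop optional surrounding quotes around the value.
        value = value.strip().strip('"').strip("'")
        if key:
            values[key] = value
    return values


def load_env(path: str, environ=None) -> dict:
    """Load a .env file into `environ` (os.environ by default).

    Assumption: variables that are already set take precedence over
    the file. Returns only the keys that were newly set.
    """
    env = os.environ if environ is None else environ
    file = Path(path)
    if not file.is_file():
        return {}
    loaded = {}
    for key, value in parse_env(file.read_text(encoding="utf-8")).items():
        if key not in env:
            env[key] = value
            loaded[key] = value
    return loaded
```

With something like this in place, a script can read AUGUR_AGENTS_API_KEY from the environment without requiring a command-line flag.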
The agents command no longer uses --mode flags; it uses --group and --all instead. Bootstrap workflow simplified to reflect that AUGUR_AGENTS_API_KEY is now set in .env rather than auto-discovered. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
- --bootstrap: shorthand for --mode bootstrap (Qwen free tokens, core only)
- --trading/--news: shorthand for --group trading/news
- Remove --group flag from CLI (covered by shorthands + --all)
- Update post-install info with correct bootstrap workflow

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Don't prescribe --bootstrap as mandatory; show it as an alternative to regular agents for users who want to use free Qwen tokens. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Replace cron-planner as bootstrap target with a purpose-built bootstrap agent. It coordinates L1 data agents (market-data, osint-data, signals-data) which maintain the unified store interface, rather than calling data tools directly.

- New agent: bootstrap (L4, group: bootstrap) with edges to all 3 L1 agents
- New prompt: prompts/bootstrap.md with delegation rules per kind
- Update bootstrap-data.py default agent from cron-planner to bootstrap
- Add bootstrap to live-chat edges, both model config files, ALL_GROUPS
- Update test expectations for 17th agent

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Extend bootstrap from profiles-only to a full 4-phase bootstrapper:

- Phase 1: Profiles (existing)
- Phase 2: Timeseries (existing)
- Phase 3: Events - seed current events via L1 delegation (GDELT, market moves, trending signals) for countries/stocks/commodities/crypto/regions
- Phase 4: Plans - create initial research plans and watchlists so cron-planner has work from day one

New CLI flags: --events, --plans, --all-phases

Add google-news MCP to bootstrap agent tools for event research.

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
…tats

Rewrite bootstrap-data.py:
- One agent call per kind per phase (not per batch of 10)
- No batching, no random sleeps; agent handles iteration internally
- Safe to re-run: enriches existing data, deduplicates events/plans
- Before/after profile coverage stats table
- Structured timestamped log output
- --phase flag for single phase, default runs all 4 phases
- 10min timeout per call (agent makes many tool calls per kind)

Rewrite agent_client.py:
- Retry with exponential backoff on 429, 5xx, timeout, network errors
- Up to 4 retries (2s, 4s, 8s, 16s), respects Retry-After header

Update tests for new API (no batch_targets, build_profiles_prompt, etc).

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
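The retry schedule described for agent_client.py (2s, 4s, 8s, 16s, with Retry-After taking effect when present) can be illustrated with the sketch below. The function names are hypothetical, and how the real client combines Retry-After with the backoff delay is not stated in the PR; here a numeric Retry-After simply replaces the computed delay.

```python
from typing import Optional

MAX_RETRIES = 4  # matches "up to 4 retries" in the commit message


def is_retryable_status(status: int) -> bool:
    """429 and any 5xx response are treated as retryable."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 2.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Without a Retry-After header the delays are base * 2**attempt,
    i.e. 2, 4, 8, 16 for attempts 0..3.
    Assumption: a Retry-After value given in seconds overrides the
    computed delay. The HTTP-date form is not handled in this sketch
    and falls back to the exponential schedule.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass
    return base * (2 ** attempt)
```

A caller would loop over `range(MAX_RETRIES)`, sleep for `backoff_delay(attempt, header_value)` after each retryable failure, and raise once the retries are exhausted.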
The EU Parliament API (data.europarl.europa.eu) regularly takes >60s to respond. Bump pytest-timeout to 120s for this specific test to avoid flaky CI failures. https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd
Summary
Updated agent model configurations, provider budget notes, and endpoint settings to reflect current API limits and model availability. Improved test script robustness for model validation.
Key Changes
Agent Model Configuration (agent-models.json)
- Cerebras: qwen-3-235b-a22b-instruct-2507 across all agents (replacing qwen3-235b and gpt-oss-120b)
- signals-analyst: moved from GitHub Models to Groq for better budget efficiency
- news-the-augur: moved from Cohere to Cerebras for consistency
- news-finanz-augur: moved from GitHub Models to Mistral (leveraging Mistral's European language strength)
- … (command-a-03-2025)

LibreChat Endpoint Configuration (librechat-user.yaml)
- Groq: moonshotai/kimi-k2-instruct and qwen/qwen3-32b; removed kimi-k2-0905
- Cerebras: qwen-3-235b-a22b-instruct-2507; removed gpt-oss-120b
- GitHub Models: removed o4-mini (unavailable)
- Qwen: removed qwen-long
- OpenRouter: minimax/minimax-m2.5:free, nvidia/nemotron-3-super-120b-a12b:free, openai/gpt-oss-120b:free; removed Gemini and DeepSeek free models; updated titleModel

Test Script (test-models.py)
- …

Notable Details
- … (qwen-3-235b-a22b-instruct-2507)

https://claude.ai/code/session_01GrTcXSgevbdYJ97Jq8tsbd