Tag gemini-2.5-pro as reasoning-class by fxspeiser · Pull Request #28 · fxspeiser/crosscheck-agent

fxspeiser · 2026-06-01T00:04:48Z

Summary

Closes the follow-up flagged in PR #27. gemini-2.5-pro burns reasoning tokens just like gpt-5 and claude-opus-4-7 — it should have been in `PROVIDER_CAPS["gemini"]["reasoning_prefixes"]` from day one.

Effects:

`_is_reasoning_model("gemini", "gemini-2.5-pro")` now returns True.
`_budget_for_purpose` picks reasoning ceilings (2048) for non-overridden purposes:
- debate: 1500 → 2048
- synth: 1024 → 2048
- audit: 768 → 2048
- confer + triangulate stay at the PR Bump confer / triangulate token budgets for OpenAI + Gemini #27 override (6144)
Prompt adapters now strip "think step by step" preambles before sending to gemini-2.5-pro (matches behavior for the other reasoning families; already shipped in PR Add provider-specific prompt adapters #19).

Future-proof: prefix is `"gemini-2.5-pro"` so a hypothetical `gemini-1.5-flash` or `gemini-2.0-flash` variant correctly falls through to the non-reasoning ceilings.

No API call change — `generationConfig.temperature` + `maxOutputTokens` are both accepted by gemini-2.5-pro.

Test plan

Updated `scripts/test_provider_token_budgets.py` asserts reasoning detection + new ceilings + future-proof non-reasoning fallthrough
Full suite (38 scripts) passes locally

🤖 Generated with Claude Code

Closes the follow-up flagged in PR #27. gemini-2.5-pro burns reasoning tokens just like gpt-5 and claude-opus-4-7 — it should have been in PROVIDER_CAPS["gemini"]["reasoning_prefixes"] from day one. Effects of the tag: - `_is_reasoning_model("gemini", "gemini-2.5-pro")` now returns True. - `_budget_for_purpose` picks the reasoning-class ceilings (2048) for every non-overridden purpose instead of falling through to the smaller non-reasoning ceilings: debate: 1500 -> 2048 synth: 1024 -> 2048 audit: 768 -> 2048 moderator/orchestrate/plan/review/coordinate: already 2048 confer + triangulate keep the PR #27 override at 6144. - Prompt adapters now strip "think step by step" / "think out loud" preambles before sending to gemini-2.5-pro (already shipped in PR #19; matches behavior for the other reasoning families). Future-proof: the prefix is "gemini-2.5-pro" so a hypothetical gemini-1.5-flash / gemini-2.0-flash variant correctly falls through to the non-reasoning ceilings (verified in the test). No change to the Gemini API call itself — `generationConfig.temperature` + `maxOutputTokens` are both accepted by gemini-2.5-pro. Tests (scripts/test_provider_token_budgets.py): - `_is_reasoning_model("gemini", "gemini-2.5-pro")` is True - gemini debate / synth / audit all = 2048 (reasoning default) - gemini confer / triangulate stay = 6144 (PR #27 override) - A hypothetical non-reasoning gemini still falls through correctly Full suite (38 scripts) passes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fxspeiser merged commit 7b54d80 into main Jun 1, 2026
1 check passed

fxspeiser deleted the feature/gemini-reasoning-class branch June 1, 2026 00:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tag gemini-2.5-pro as reasoning-class#28

Tag gemini-2.5-pro as reasoning-class#28
fxspeiser merged 1 commit into
mainfrom
feature/gemini-reasoning-class

fxspeiser commented Jun 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

fxspeiser commented Jun 1, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant