fix(deepagent): rename max_tokens → total_token_budget (closes #278) by fede-kamel · Pull Request #279 · oracle-samples/locus

fede-kamel · 2026-05-28T11:31:27Z

Closes #278.

What changed

Removed: create_deepagent(max_tokens=...) (BREAKING — beta SDK, no migrators).
Added: total_token_budget: int | None = None — explicit name for the run-level TokenLimit(total_token_budget) termination cap. Default None means no TokenLimit term in the algebra (was the silent-kill default at 80K).
Loud rejection: passing the old max_tokens= raises TypeError with a migration message pointing at both total_token_budget (run-level) and max_output_tokens (per-completion).
Kept: max_output_tokens — per-completion output cap, forwarded to AgentConfig.max_tokens → model provider's per-request max_tokens field.
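
The three behaviours listed above (removed kwarg, new run-level budget, untouched per-completion cap) can be sketched roughly as follows. This is an illustrative stand-in, not the Locus implementation: `create_deepagent_sketch`, the returned dict, and the exact error text are assumptions made up for this example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical wording; the real migration message may differ.
_LEGACY_MSG = (
    "create_deepagent() no longer accepts max_tokens=. "
    "Use total_token_budget=N for the run-level cap, or "
    "max_output_tokens=N for the per-completion output cap."
)


def create_deepagent_sketch(
    *,
    total_token_budget: int | None = None,
    max_output_tokens: int | None = None,
    **legacy: Any,
) -> dict[str, Any]:
    """Illustrative stand-in for create_deepagent (not the real function)."""
    # Loud rejection: the old name never flows through silently.
    if "max_tokens" in legacy:
        raise TypeError(_LEGACY_MSG)
    if legacy:
        raise TypeError(f"unexpected keyword arguments: {sorted(legacy)}")

    # Default None means no TokenLimit term is added at all.
    terminations = ["MaxIterations"]
    if total_token_budget is not None:
        terminations.append(f"TokenLimit({total_token_budget})")

    # The per-completion cap is forwarded independently of the run budget.
    return {
        "terminations": terminations,
        "agent_config_max_tokens": max_output_tokens,
    }
```

Calling `create_deepagent_sketch(max_tokens=65_536)` raises `TypeError`, while `create_deepagent_sketch(max_output_tokens=65_536)` yields a termination list without any `TokenLimit` entry.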

Why

See #278. The old max_tokens had semantics unique to Locus and opposite to other LLM SDKs: it was a run-level cap, not a per-completion cap. The 80K default silently killed any agent with a ~50K-token system prompt, and callers reasonably passing max_tokens=65536 for long-form output got empty results.
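
To make the arithmetic concrete, here is a rough model of a cumulative run budget. It assumes every iteration resends the full prompt, that the budget counts input plus output tokens, and that the limit fires once usage reaches the budget; the function name, the flat per-call output figure, and the `>=` comparison are all assumptions for illustration, not Locus internals.

```python
from __future__ import annotations


def iterations_until_limit(
    prompt_tokens: int,
    output_tokens_per_call: int,
    budget: int | None,
    max_iterations: int = 10,
) -> int | None:
    """Return the 1-based iteration on which cumulative usage reaches
    the budget, or None if the budget is never reached (or is None)."""
    used = 0
    for iteration in range(1, max_iterations + 1):
        # Each call pays for the whole prompt again plus its own output.
        used += prompt_tokens + output_tokens_per_call
        if budget is not None and used >= budget:
            return iteration
    return None


# ~50K-token prompt against the old 80K default: dies on iteration 2.
print(iterations_until_limit(50_000, 1_000, 80_000))   # 2
# 70K-token prompt with max_tokens=65536 read as a run cap: dies on iteration 1.
print(iterations_until_limit(70_000, 1_000, 65_536))   # 1
# New default (no budget): the token limit never fires.
print(iterations_until_limit(50_000, 1_000, None))     # None
```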

Tests

Unit (tests/unit/test_deepagent.py): 12 tests pass. Added 4 new; updated 1.
Integration stub (tests/integration/test_deepagent_token_budget.py): 5 new stub-mode tests covering bug shape + fix + loud rejection. No model calls.
Integration live (tests/integration/test_deepagent_token_budget_live.py): 2 new tests gated by RUN_LIVE_OCI=1. Verified against API_FREE_TIER + Gemini 2.5 Flash: test_long_prompt_with_old_default_would_have_died passes; test_long_prompt_with_default_none_produces_real_output xfails pending Locus bug #2 (separate issue: runtime_loop drops the final assistant message on non-submit-tool termination).

$ .venv/bin/python -m pytest tests/unit/test_deepagent.py tests/integration/test_deepagent_token_budget.py -q
17 passed

$ RUN_LIVE_OCI=1 OCI_PROFILE=API_FREE_TIER OCI_REGION=us-chicago-1 \
    .venv/bin/python -m pytest tests/integration/test_deepagent_token_budget_live.py -v
1 passed, 1 xfailed

Related bugs found but NOT fixed here

Two more Locus bugs surfaced during this work. Filing as separate issues:

Bug #2: runtime_loop drops the final assistant message when the agent terminates via MaxIterations / TokenLimit without calling submit_tool. AgentResult.text='' even when metrics.completion_tokens > 0.
Bug #3: OCIModel + Gemini rejects Pydantic-derived structured-output schemas containing additionalProperties: false. Need vendor-aware schema munging in the OCI provider.
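
For the structured-output bug, one possible shape for "schema munging" is to strip the offending key before the schema reaches the provider. The sketch below is only a guess at that approach: `strip_additional_properties` is a hypothetical helper, and whether removing the key is sufficient for Gemini, or is how the OCI provider will handle it, is not confirmed by this PR.

```python
from typing import Any


def strip_additional_properties(schema: Any) -> Any:
    """Return a copy of a JSON-schema-like structure with every
    'additionalProperties' key removed, at any nesting depth.

    Limitation: a model field literally named 'additionalProperties'
    would also be dropped from 'properties'.
    """
    if isinstance(schema, dict):
        return {
            key: strip_additional_properties(value)
            for key, value in schema.items()
            if key != "additionalProperties"
        }
    if isinstance(schema, list):
        return [strip_additional_properties(item) for item in schema]
    return schema


# Example with a hand-written schema of the kind Pydantic can emit.
schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "items": {
            "type": "array",
            "items": {"type": "object", "additionalProperties": False},
        }
    },
}
cleaned = strip_additional_properties(schema)
```

The input dict is left untouched, so the original schema can still be used for providers that accept it.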

Migration

# Before (silent foot-gun)
create_deepagent(model=..., tools=..., system_prompt=..., max_tokens=65_536)
#                                                          ^^^^^^^^^^^^^^^^
#                                                          interpreted as run-level cap

# After (explicit + clear)
create_deepagent(
    model=...,
    tools=...,
    system_prompt=...,
    max_output_tokens=65_536,  # per-completion cap
    # total_token_budget defaults to None — no TokenLimit termination
)

# Or if you really want a cumulative run cap:
create_deepagent(..., total_token_budget=500_000, max_output_tokens=65_536)
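
Since straggler call sites now fail at runtime with TypeError, it can help to find them ahead of time. The following is a small, hypothetical helper (not part of Locus) that uses the standard-library `ast` module to list the lines where `create_deepagent` is still called with `max_tokens=`. It matches on the function name only, so it would also flag an unrelated function that happens to share that name.

```python
import ast


def find_legacy_calls(source: str) -> list[int]:
    """Return sorted line numbers of create_deepagent(...) calls
    that still pass the removed max_tokens= keyword."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both create_deepagent(...) and module.create_deepagent(...).
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name == "create_deepagent" and any(
            kw.arg == "max_tokens" for kw in node.keywords
        ):
            hits.append(node.lineno)
    return sorted(hits)


example = (
    "a = create_deepagent(model=m, max_tokens=65_536)\n"
    "b = locus.create_deepagent(model=m, max_output_tokens=65_536)\n"
    "c = locus.create_deepagent(model=m, max_tokens=1)\n"
)
print(find_legacy_calls(example))  # [1, 3]
```

Each flagged call still needs a human decision: `max_output_tokens` if a per-completion cap was intended, `total_token_budget` if a cumulative run cap was intended.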

…278)

BREAKING: ``create_deepagent(max_tokens=...)`` is removed. Use ``total_token_budget=N`` for the run-level TokenLimit termination, or ``max_output_tokens=N`` for the per-completion output cap on each LLM call.

Background (this is what was silently failing):

The old ``max_tokens`` parameter controlled the TOTAL-RUN token budget (cumulative input+output across every iteration of one run), wired into the typed-termination algebra as ``TokenLimit(max_tokens)``. The name clashed with every LLM SDK on earth: OpenAI, Anthropic, Google all use ``max_tokens`` for the per-completion output cap.

Callers reasonably passing ``max_tokens=65536`` expecting Gemini-style "max 65K output tokens per call" got Locus's ``TokenLimit(65536)`` termination instead. On any agent with a long system prompt (graph-grounded research, evaluator prompts, multi-datastore RAG context), the input alone exceeded the cap on iteration 1 → ``TokenLimit`` fired → run exited via ``TerminateEvent`` with empty output. No warning, no diagnostic.

The old 80_000 default was harmful: any agent with a ~50K-token prompt was 1-2 iterations away from being silently killed. Cost real debugging hours in the observai/optic AFS DeepAgent integration before bisecting down to this.

What changes:

- ``max_tokens=`` kwarg removed entirely (beta SDK, no migration needed). Rejected loud via TypeError with a message pointing to the new names + version so any straggler call sites fail at the bound boundary instead of silently mis-behaving.
- ``total_token_budget: int | None = None`` is the new name for the run-level TokenLimit cap. ``None`` (default) means no TokenLimit term in the termination algebra; the run is bounded only by ToolCalled+Confidence or MaxIterations.
- ``max_output_tokens: int | None = None`` stays. This is the per-completion cap forwarded to ``AgentConfig.max_tokens`` (and from there to the model provider's per-request max_tokens field). This is the knob callers usually meant when they passed the old name.
- Docstring carries a loud "naming note" + "breaking change" block so anyone hitting the TypeError finds the migration path immediately.

Tests added:

Unit (tests/unit/test_deepagent.py):
- test_token_limit_omitted_when_budget_none: default doesn't add TokenLimit term (the foot-gun fix)
- test_legacy_max_tokens_kwarg_rejected: TypeError with clear message; can't silently flow through to AgentConfig
- test_max_output_tokens_propagated_independently_of_budget: per-completion cap lands on AgentConfig.max_tokens regardless of the run-budget setting
- Updated test_typed_termination_attached to use the new name

Integration stub (tests/integration/test_deepagent_token_budget.py):
- 5 stub-mode tests covering bug shape + fix + loud rejection
- No model calls; inspects termination tree directly

Integration live (tests/integration/test_deepagent_token_budget_live.py):
- Real OCI Gemini calls gated by RUN_LIVE_OCI=1
- Reproduces the bug shape with explicit 80K opt-in (passes)
- Verifies real-output happy path (xfailed pending Locus bug #2: runtime_loop drops final assistant message when agent terminates via MaxIterations without calling submit_tool; tracked in a follow-up issue)

Two related Locus bugs discovered during this work, NOT fixed here (will be filed as separate issues):

#2: runtime_loop's final-message flush path is conditional on the submit_tool exit branch; agents that terminate via MaxIterations / TokenLimit return AgentResult.text='' even when the model emitted completion tokens. Live test marks this xfail.

#3: OCIModel + Gemini rejects Pydantic-derived structured-output schemas containing additionalProperties:false ("Unsupported JSON Schema feature for Gemini"). Vendor-aware schema munging needed in the OCI provider. Out of scope here; live tests omit output_schema to avoid hitting this.

Refs: #278 (this issue), observai/optic AFS DeepAgent integration end-to-end testing surfaced all three bugs.

Signed-off-by: Federico Kamelhar <federico.kamelhar@oracle.com>

oracle-contributor-agreement Bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) May 28, 2026

This was referenced May 28, 2026

deepagent: runtime_loop drops final assistant message when agent terminates without calling submit_tool #280

Closed

OCIModel + Gemini: structured output rejects Pydantic schemas with additionalProperties:false #281

Closed

fede-kamel force-pushed the fix/deepagent-token-budget-naming branch from 17dbbaf to 89bafbe Compare May 28, 2026 11:32

fede-kamel merged commit 6304505 into main May 28, 2026
10 checks passed

fede-kamel deleted the fix/deepagent-token-budget-naming branch May 28, 2026 11:35

fede-kamel mentioned this pull request May 28, 2026

fix(agent): force summary call when assistant content is empty (closes #280) #282

Merged
