Replies: 1 comment
Nice approach with the monkey-patching pattern. We took a similar route on the Node.js side with burn0 -- it patches globalThis.fetch and node:http so every outbound API call gets cost-tracked automatically. No SDK-specific patching needed, which means it works across OpenAI, Anthropic, Gemini, and ~50 other services out of the box. For the multi-agent question you raised -- burn0 has a track() wrapper that lets you attribute costs to specific features or agent runs. Different ecosystem (JS vs Python) but same philosophy -- cost visibility shouldn't require a platform. Would be cool to compare notes on how you handle pricing updates for new models. |
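The track() attribution idea in this comment is language-agnostic. Here is a minimal Python sketch of the same pattern (illustrative only, not burn0's actual API): a context variable marks the current run, so a patched HTTP layer can bill each cost to whichever agent was active when the call happened.

```python
import contextvars

# Label for the currently running agent/feature; "untracked" when none is set.
current_run = contextvars.ContextVar("current_run", default="untracked")
costs = {}  # run label -> cumulative USD

def record(cost):
    """Called by the patched HTTP layer after each billable response."""
    label = current_run.get()
    costs[label] = costs.get(label, 0.0) + cost

def track(label, fn, *args, **kwargs):
    """Attribute every cost recorded while fn runs to `label`."""
    token = current_run.set(label)
    try:
        return fn(*args, **kwargs)
    finally:
        current_run.reset(token)
```

Using a context variable (rather than a global string) keeps attribution correct even when agent runs interleave on asyncio tasks.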
I'm building an autonomous agent that makes multiple OpenAI calls per run. Sometimes a single run costs $0.50, sometimes it spirals to $15+ if the model gets stuck reasoning in circles.
OpenAI's dashboard spend limits are account-wide — they don't help when you need per-agent or per-run budgets. I wanted a way to say "this agent run gets $5 max, period."
Here's what I ended up with using AgentGuard:
The key thing: patch_openai(tracer) monkey-patches the OpenAI client so every call gets cost-tracked automatically. When cumulative cost hits $4 (80%), the warning fires. At $5, BudgetExceeded is raised and the agent stops mid-run. The cost estimates use published per-token pricing (GPT-4o, GPT-4, GPT-3.5, etc.), updated monthly.
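The flow described above can be sketched in a few lines. This is a self-contained illustration of the pattern (warn at 80%, raise at the cap), not AgentGuard's actual code; only patch_openai-style patching and the BudgetExceeded name come from the post, and CostTracer, patch_client, and the price table are made up for the example.

```python
class BudgetExceeded(RuntimeError):
    """Raised when cumulative cost crosses the hard budget."""

class CostTracer:
    def __init__(self, budget, warn_ratio=0.8, on_warn=print):
        self.budget = budget
        self.warn_ratio = warn_ratio
        self.on_warn = on_warn
        self.spent = 0.0
        self._warned = False

    def record(self, cost):
        self.spent += cost
        if not self._warned and self.spent >= self.budget * self.warn_ratio:
            self._warned = True
            self.on_warn(f"budget warning: ${self.spent:.2f} of ${self.budget:.2f}")
        if self.spent > self.budget:
            raise BudgetExceeded(f"spent ${self.spent:.2f}, budget was ${self.budget:.2f}")

# Illustrative prices (USD per 1K tokens); a real tool loads published tables.
PRICES = {"gpt-4o": (0.0025, 0.010)}

def patch_client(client, tracer):
    """Monkey-patch client.create so every call is cost-tracked."""
    original = client.create
    def create(model, **kwargs):
        resp = original(model=model, **kwargs)
        in_price, out_price = PRICES[model]
        cost = (resp["prompt_tokens"] / 1000) * in_price \
             + (resp["completion_tokens"] / 1000) * out_price
        tracer.record(cost)  # may raise BudgetExceeded mid-run
        return resp
    client.create = create
```

Because the raise happens inside the patched call, the agent loop stops at the first call that pushes spend over budget, with no cooperation needed from the agent code itself.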
Zero dependencies, MIT licensed: https://github.com/bmdhodl/agent47
Has anyone else solved per-agent spend limits differently? I'm curious about approaches that work with streaming responses or multi-agent systems where each agent needs its own budget.