[REVIEW] agent-security: add resource budget enforcement evidence gates #1380

@catcherintheroad-hub

Skill Being Reviewed

Skill name: agent-security
Skill path: skills/ai-security/agent-security/

False Positive Analysis

Benign-looking budget control that can be over-credited:

agent:
  max_tokens: 8000
  timeout_seconds: 120
  rate_limit: "100 requests/minute"
retry:
  max_retries: 5
tools:
  web_search:
    cost_metering: false

Why this is a false positive:

The architecture lists token, timeout, and rate-limit controls, but it does not prove they are enforced across tenant, user, session, tool, retry, concurrency, and fallback paths. A review can credit resource containment while a compromised agent still burns spend through parallel sessions, recursive tool calls, unmetered tools, retry storms, or fail-open provider fallback.
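As a rough illustration of the gap, a reviewer could scan the declared configuration for limits that exist on paper but have no matching evidence of enforcement. The sketch below is only an assumption about how such a check might look: the keys `session_budget` and `charged_to_budget` are hypothetical and do not appear in the config above or in the skill.

```python
def find_budget_gaps(config: dict) -> list[str]:
    """Return reasons the declared limits do not prove enforcement."""
    gaps = []
    agent = config.get("agent", {})
    # A per-request cap alone says nothing about cumulative spend.
    # "session_budget" is a hypothetical key used for illustration.
    if "max_tokens" in agent and "session_budget" not in agent:
        gaps.append("per-request token cap without a cumulative session budget")
    # Retries multiply spend unless they are charged to the same budget.
    # "charged_to_budget" is a hypothetical key used for illustration.
    retry = config.get("retry", {})
    if retry.get("max_retries", 0) > 0 and not retry.get("charged_to_budget", False):
        gaps.append("retries are not charged to a budget")
    # Any tool with metering off (or unspecified) is an open spend path.
    for name, tool in config.get("tools", {}).items():
        if not tool.get("cost_metering", False):
            gaps.append(f"tool '{name}' is unmetered")
    return gaps


# The example config from above, expressed as a Python dict.
config = {
    "agent": {"max_tokens": 8000, "timeout_seconds": 120,
              "rate_limit": "100 requests/minute"},
    "retry": {"max_retries": 5},
    "tools": {"web_search": {"cost_metering": False}},
}

for gap in find_budget_gaps(config):
    print(gap)
```

Run against the example config, this reports three gaps: the missing session budget, the unaccounted retries, and the unmetered `web_search` tool. A static scan like this only finds missing declarations; it still does not prove that declared limits are enforced at runtime.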

Coverage Gaps

Missed variant 1: per-request limits without cumulative session budget

Each request is capped, but an agent can make hundreds of sequential requests or spawn sub-agents that each receive a fresh budget.
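One way to close this variant, sketched here under assumptions, is to have every agent and sub-agent charge a single cumulative ledger in addition to the per-request cap. The class names `SessionLedger` and `Agent` and the 20,000-token session cap are hypothetical; only the 8,000-token per-request cap comes from the config above.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a charge would exceed a per-request or session cap."""


@dataclass
class SessionLedger:
    session_cap: int  # hypothetical cumulative cap for the whole session
    spent: int = 0

    def charge(self, tokens: int) -> None:
        # Refuse before recording, so the cap is never exceeded.
        if self.spent + tokens > self.session_cap:
            raise BudgetExceeded("cumulative session cap would be exceeded")
        self.spent += tokens


@dataclass
class Agent:
    ledger: SessionLedger
    per_request_cap: int = 8000  # mirrors max_tokens in the example config

    def request(self, tokens: int) -> None:
        if tokens > self.per_request_cap:
            raise BudgetExceeded("per-request cap exceeded")
        self.ledger.charge(tokens)

    def spawn(self) -> "Agent":
        # Sub-agents share the parent's ledger instead of getting a fresh budget.
        return Agent(ledger=self.ledger, per_request_cap=self.per_request_cap)


ledger = SessionLedger(session_cap=20000)
parent = Agent(ledger)
parent.request(8000)
child = parent.spawn()
child.request(8000)  # charged to the same ledger as the parent
```

After these two calls the shared ledger holds 16,000 tokens, so a third 8,000-token request from either agent raises `BudgetExceeded` even though each request is individually within its cap.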

Missed variant 2: retries and fallback providers bypass metering

Timeouts trigger retries or provider fallback without decrementing the same budget ledger.
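The following is a minimal sketch, not the skill's method, of retry and fallback accounting against one ledger: every attempt is charged before it runs, and an exhausted budget stops the loop instead of falling through to the next provider. `Ledger`, `call_with_fallback`, the flat per-attempt cost, and the stub providers are all hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when the shared ledger has no room for another attempt."""


class Ledger:
    def __init__(self, cap: int) -> None:
        self.cap = cap
        self.spent = 0

    def charge(self, cost: int) -> None:
        if self.spent + cost > self.cap:
            raise BudgetExceeded("budget exhausted; failing closed")
        self.spent += cost


def call_with_fallback(ledger, providers, cost_per_attempt=1, max_retries=5):
    """Try each provider in order; every attempt is charged before it runs."""
    for provider in providers:
        for _ in range(max_retries + 1):
            # BudgetExceeded propagates to the caller: fail closed.
            ledger.charge(cost_per_attempt)
            try:
                return provider()
            except TimeoutError:
                continue  # retry, but the attempt has already been paid for
    raise RuntimeError("all providers failed")


def primary():
    # Stub provider that always times out.
    raise TimeoutError("primary provider timed out")


def fallback():
    # Stub provider that always succeeds.
    return "ok"


ledger = Ledger(cap=10)
result = call_with_fallback(ledger, [primary, fallback])
```

With `max_retries=5` the primary consumes six attempts and the fallback one, so the ledger records seven units for a single logical call. With a cap of 3, the same call raises `BudgetExceeded` before the fallback provider is ever reached.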

Missed variant 3: tool costs are not budgeted

LLM tokens are metered, but web search, browser automation, code execution, storage, email, and external API calls are unmetered.
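A hedged sketch of tool-level metering follows: each tool call is priced and charged, and a tool with no price is refused rather than run for free. The price table, its units, and the `ToolBudget` class are assumptions made for illustration.

```python
# Hypothetical price table, in abstract budget units per call.
TOOL_PRICES = {"web_search": 5, "code_execution": 20, "send_email": 2}


class ToolBudget:
    def __init__(self, cap: int) -> None:
        self.cap = cap
        self.spent_by_tool: dict[str, int] = {}

    @property
    def spent(self) -> int:
        return sum(self.spent_by_tool.values())

    def call(self, name: str, fn, *args):
        # Unpriced tools are refused rather than run unmetered.
        if name not in TOOL_PRICES:
            raise PermissionError(f"tool {name!r} has no price; refusing unmetered call")
        price = TOOL_PRICES[name]
        # Check the cap before running the tool, not after.
        if self.spent + price > self.cap:
            raise RuntimeError(f"tool budget exhausted before calling {name!r}")
        self.spent_by_tool[name] = self.spent_by_tool.get(name, 0) + price
        return fn(*args)


budget = ToolBudget(cap=12)
first = budget.call("web_search", str.upper, "agent security")
second = budget.call("web_search", len, "ab")
```

Two searches consume 10 of the 12 units, so a third search is refused, and a call to an unlisted tool such as `"browser"` raises `PermissionError` regardless of the remaining budget. Keeping the per-tool breakdown in `spent_by_tool` also gives the reviewer per-tool quota evidence.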

Edge Cases

  • Free-tier or internal-only agents still need denial-of-service limits even when direct API spend is low.
  • Batch jobs need separate budgets from interactive sessions, with explicit owner and kill switch.
  • Cached responses may reduce model spend but can still create storage, retrieval, or tool-call costs.
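The batch-job case above could be modeled, as a sketch under assumptions, with one budget object per workload kind, each carrying a named owner and a kill switch. The `Budget` class, the owner names, and the caps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    kind: str    # "interactive" or "batch"
    owner: str   # who is accountable for this budget
    cap: int
    spent: int = 0
    killed: bool = False

    def kill(self) -> None:
        # Kill switch: once set, every later charge is refused.
        self.killed = True

    def charge(self, cost: int) -> bool:
        """Record the cost and return True, or return False if refused."""
        if self.killed or self.spent + cost > self.cap:
            return False
        self.spent += cost
        return True


# Separate budgets, so a runaway batch job cannot drain interactive sessions.
interactive = Budget(kind="interactive", owner="alice", cap=100)
batch = Budget(kind="batch", owner="data-team", cap=1000)

batch.charge(900)
interactive.charge(50)
batch.kill()
```

After `batch.kill()`, further batch charges return `False` even though 100 units remain under the cap, while the interactive budget is unaffected.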

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add budget enforcement evidence gates for cumulative ledgers, per-tenant/user/session/tool quotas, retry/fallback accounting, concurrency limits, alert thresholds, and fail-closed behavior.

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| API gateway rate limiting | Partial | Usually covers request count but not token, tool, retry, or workflow budgets. |
| Cloud billing alerts | Partial | Detects spend after it accrues; does not prevent runaway agent actions. |
| LLM provider usage limits | Partial | Covers provider spend but not local tools, downstream APIs, or sub-agent fan-out. |

Overall Assessment

Strengths: Strong agent architecture coverage for least privilege, containment, HITL, audit trails, rollback, and multi-agent trust.

Needs improvement: Resource-limit checks should require proof of enforcement and accounting across the full agent workflow, not only configuration fields.

Priority recommendations:

  1. Add a budget enforcement evidence checklist under least privilege or blast radius containment.
  2. Require a shared budget ledger for model calls, tool calls, retries, fallback providers, sub-agents, and batch jobs.
  3. Add output fields for quota scope, enforcement point, fail mode, alerting, kill switch, and residual bypass paths.
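The output fields in recommendation 3 could be captured as a structured record with a simple gate over it. This is an illustrative sketch only: the field names follow the list above, but the `BudgetEvidence` type, the required scope set, and the pass criteria are assumptions rather than part of the skill.

```python
from dataclasses import dataclass, field

# Assumed minimum quota scopes; adjust to the deployment being reviewed.
REQUIRED_SCOPES = {"tenant", "user", "session", "tool"}


@dataclass
class BudgetEvidence:
    quota_scope: list[str]    # scopes where quotas are enforced
    enforcement_point: str    # where the limit is actually checked
    fail_mode: str            # "closed" or "open"
    alerting: bool
    kill_switch: bool
    residual_bypass_paths: list[str] = field(default_factory=list)


def gate(evidence: BudgetEvidence) -> list[str]:
    """Return blocking findings; an empty list means the gate passes."""
    findings = []
    missing = sorted(REQUIRED_SCOPES - set(evidence.quota_scope))
    if missing:
        findings.append("missing quota scopes: " + ", ".join(missing))
    if not evidence.enforcement_point:
        findings.append("no enforcement point identified")
    if evidence.fail_mode != "closed":
        findings.append("budget enforcement does not fail closed")
    if not evidence.alerting:
        findings.append("no alert thresholds configured")
    if not evidence.kill_switch:
        findings.append("no kill switch")
    for path in evidence.residual_bypass_paths:
        findings.append(f"residual bypass path: {path}")
    return findings


weak = BudgetEvidence(
    quota_scope=["session"],
    enforcement_point="",
    fail_mode="open",
    alerting=False,
    kill_switch=False,
    residual_bypass_paths=["provider fallback"],
)
findings = gate(weak)
```

For the `weak` record, the gate returns six findings, one per failed criterion plus the listed bypass path. A record covering all four scopes with a named enforcement point, fail-closed behavior, alerting, a kill switch, and no residual paths returns an empty list.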

Sources Checked

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: GitHub Sponsors
