Skill Being Reviewed
Skill name: agent-security
Skill path: skills/ai-security/agent-security/
False Positive Analysis
Benign-looking budget control that can be over-credited:
agent:
  max_tokens: 8000
  timeout_seconds: 120
  rate_limit: "100 requests/minute"
  retry:
    max_retries: 5
  tools:
    web_search:
      cost_metering: false
Why this is a false positive:
The architecture lists token, timeout, and rate-limit controls, but it does not prove they are enforced across tenant, user, session, tool, retry, concurrency, and fallback paths. A review can credit resource containment while a compromised agent still burns spend through parallel sessions, recursive tool calls, unmetered tools, retry storms, or fail-open provider fallback.
Coverage Gaps
Missed variant 1: per-request limits without cumulative session budget
Each request is capped, but an agent can make hundreds of sequential requests or spawn sub-agents that each receive a fresh budget.
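One way to close this gap is to keep a cumulative session budget that sub-agents share by reference instead of receiving a fresh allowance. The sketch below is illustrative only: `SessionBudget`, `Agent`, `spawn`, and the 20,000-token session cap are hypothetical names and values, not part of the skill under review.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push the session past its cumulative cap."""


@dataclass
class SessionBudget:
    # Hypothetical cumulative cap for the whole session, in tokens.
    limit_tokens: int
    used_tokens: int = 0

    def charge(self, tokens: int) -> None:
        # Refuse the charge before recording it, so the ledger never overshoots.
        if self.used_tokens + tokens > self.limit_tokens:
            raise BudgetExceeded(f"session cap of {self.limit_tokens} tokens exceeded")
        self.used_tokens += tokens


@dataclass
class Agent:
    budget: SessionBudget
    per_request_cap: int = 8000  # mirrors max_tokens in the sample config

    def request(self, tokens: int) -> None:
        # The per-request cap and the cumulative cap are separate checks.
        if tokens > self.per_request_cap:
            raise ValueError("per-request cap exceeded")
        self.budget.charge(tokens)

    def spawn(self) -> "Agent":
        # The child shares the parent's budget object; it does not get a new one.
        return Agent(budget=self.budget, per_request_cap=self.per_request_cap)


if __name__ == "__main__":
    budget = SessionBudget(limit_tokens=20000)
    parent = Agent(budget)
    child = parent.spawn()
    parent.request(8000)
    child.request(8000)
    print(budget.used_tokens)  # both requests land on the same ledger
```

With this shape, a third 8,000-token request from either agent raises `BudgetExceeded`, even though each individual request is within the per-request cap.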
Missed variant 2: retries and fallback providers bypass metering
Timeouts trigger retries or provider fallback without decrementing the same budget ledger.
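A minimal sketch of the behavior a reviewer would want evidence for: every attempt, including retries and fallback-provider calls, is charged against one ledger before the call is made, and an exhausted ledger stops the loop instead of failing open. `Ledger`, `call_with_fallback`, and the flat `cost_per_attempt` are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


class BudgetExceeded(RuntimeError):
    """Raised when the shared ledger cannot cover another attempt."""


class Ledger:
    def __init__(self, limit: int) -> None:
        self.limit = limit  # hypothetical budget units, e.g. cents
        self.spent = 0

    def charge(self, cost: int) -> None:
        if self.spent + cost > self.limit:
            raise BudgetExceeded("budget exhausted")
        self.spent += cost


def call_with_fallback(
    providers: Sequence[Callable[[], str]],
    ledger: Ledger,
    cost_per_attempt: int,
    max_retries: int,
) -> str:
    """Try each provider up to max_retries + 1 times, charging every attempt."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for _ in range(max_retries + 1):
            # Charge before the call so timed-out attempts still count.
            # BudgetExceeded propagates to the caller: the loop fails closed.
            ledger.charge(cost_per_attempt)
            try:
                return provider()
            except TimeoutError as exc:
                last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

The design choice worth noting is that the charge happens outside the `try` block, so a budget failure is never mistaken for a retryable provider error.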
Missed variant 3: tool costs are not budgeted
LLM tokens are metered, but web search, browser automation, code execution, storage, email, and external API calls are unmetered.
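As a rough illustration, tool calls can be routed through a meter that refuses any tool without a cost entry, so an unmetered tool fails closed instead of running for free. `ToolMeter` and the per-call prices below are hypothetical; real costs depend on the deployment.

```python
from dataclasses import dataclass, field


class UnmeteredToolError(RuntimeError):
    """Raised when a tool has no cost entry, so the call is refused."""


class BudgetExceeded(RuntimeError):
    """Raised when the tool budget cannot cover another call."""


@dataclass
class ToolMeter:
    # Hypothetical per-call prices in cents, keyed by tool name.
    prices_cents: dict
    limit_cents: int
    spent_cents: int = 0
    calls: dict = field(default_factory=dict)

    def charge(self, tool: str) -> None:
        # A missing price is treated as a policy violation, not as "free".
        if tool not in self.prices_cents:
            raise UnmeteredToolError(f"no cost entry for tool {tool!r}")
        cost = self.prices_cents[tool]
        if self.spent_cents + cost > self.limit_cents:
            raise BudgetExceeded("tool budget exhausted")
        self.spent_cents += cost
        self.calls[tool] = self.calls.get(tool, 0) + 1


if __name__ == "__main__":
    meter = ToolMeter(prices_cents={"web_search": 2, "code_execution": 5}, limit_cents=10)
    meter.charge("web_search")
    meter.charge("code_execution")
    print(meter.spent_cents, meter.calls)
```

Under this sketch, a config such as `cost_metering: false` for `web_search` would correspond to removing its price entry, which makes the tool unusable until it is metered.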
Edge Cases
- Free-tier or internal-only agents still need denial-of-service limits even when direct API spend is low.
- Batch jobs need budgets separate from interactive sessions, each with an explicit owner and a kill switch.
- Cached responses may reduce model spend but can still create storage, retrieval, or tool-call costs.
Remediation Quality
Comparison to Other Tools
| Tool | Catches this? | Notes |
| --- | --- | --- |
| API gateway rate limiting | Partial | Usually covers request count but not token, tool, retry, or workflow budgets. |
| Cloud billing alerts | Partial | Detects spend after it accrues; does not prevent runaway agent actions. |
| LLM provider usage limits | Partial | Covers provider spend but not local tools, downstream APIs, or sub-agent fan-out. |
Overall Assessment
Strengths: Strong agent architecture coverage for least privilege, containment, HITL, audit trails, rollback, and multi-agent trust.
Needs improvement: Resource-limit checks should require proof of enforcement and accounting across the full agent workflow, not only configuration fields.
Priority recommendations:
- Add a budget enforcement evidence checklist under least privilege or blast radius containment.
- Require a shared budget ledger for model calls, tool calls, retries, fallback providers, sub-agents, and batch jobs.
- Add output fields for quota scope, enforcement point, fail mode, alerting, kill switch, and residual bypass paths.
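The output fields in the last recommendation could be captured in a record like the one below. This is a sketch under assumptions: the class name, field types, and the crediting rule in `is_credited` are hypothetical and would need to match the skill's actual output schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class FailMode(Enum):
    FAIL_CLOSED = "fail_closed"
    FAIL_OPEN = "fail_open"
    UNKNOWN = "unknown"


@dataclass
class BudgetControlFinding:
    control: str                      # e.g. "max_tokens"
    quota_scope: List[str]            # e.g. ["request"], ["tenant", "session"]
    enforcement_point: Optional[str]  # where the limit is enforced; None if unproven
    fail_mode: FailMode
    alerting: bool
    kill_switch: bool
    residual_bypass_paths: List[str] = field(default_factory=list)

    def is_credited(self) -> bool:
        # Assumed rule: credit the control only when enforcement is evidenced,
        # it fails closed, it is observable and stoppable, and no bypass remains.
        return (
            self.enforcement_point is not None
            and self.fail_mode is FailMode.FAIL_CLOSED
            and self.alerting
            and self.kill_switch
            and not self.residual_bypass_paths
        )


if __name__ == "__main__":
    finding = BudgetControlFinding(
        control="max_tokens",
        quota_scope=["request"],
        enforcement_point=None,
        fail_mode=FailMode.UNKNOWN,
        alerting=False,
        kill_switch=False,
        residual_bypass_paths=["sub-agent fan-out", "unmetered web_search"],
    )
    print(finding.is_credited())
```

A config-only control like the sample above has no evidenced enforcement point, so it would not be credited under this rule.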
Sources Checked
Bounty Info