Threat Model

Overview

agent-tool-firewall is a default-deny policy gateway that sits between an LLM/AI agent and the backend tools it invokes. This document describes trust boundaries, known threats, existing mitigations, and residual risks.

Trust boundaries

+-------------------+        +-----------------------+        +---------------+
|   LLM / Agent     | -----> |  agent-tool-firewall  | -----> |  Backend Tool |
|   (untrusted)     |  HTTP  |  (trust boundary)     |        |  (trusted)    |
+-------------------+        +-----------------------+        +---------------+
                                      |
                                      v
                               Audit log (JSONL)

Boundary	From	To	Trust level
B1: Agent -> Firewall	LLM agent	Firewall `/v1/evaluate`	Untrusted. The agent may be prompt-injected or compromised. All input is treated as adversarial.
B2: Firewall -> Backend	Firewall decision	Backend tool execution	Trusted. The firewall's allow decision is authoritative. The backend tool trusts that the firewall has validated the call.
B3: Admin -> Firewall	Operator	`/v1/reload` endpoint	Privileged. Protected by bearer token authentication. Policy changes affect the entire security posture.
B4: Firewall -> Audit log	Firewall	Audit log file	Trusted output. The audit log must not be writable by the agent or backend tools.

Threats

T1: Policy bypass

Description: An attacker crafts a tool call that circumvents policy evaluation, gaining access to a denied or unlisted tool.

Attack vectors:

Tool name manipulation (case variations, Unicode confusables)
Exploiting gaps between policy evaluation and actual tool invocation

Severity: Critical

T2: Path traversal

Description: An attacker uses directory traversal sequences to escape the allowlisted directories and access sensitive files.

Attack vectors:

../../etc/shadow style relative path escapes
Null-byte injection (file.txt\x00.jpg)
Symlink following to escape the allowlist jail
URL-encoded path components (%2e%2e%2f)
Unicode normalization tricks (fullwidth characters, combining dots)

Severity: Critical

T3: Argument injection

Description: An attacker injects malicious content into tool arguments to exploit downstream tools.

Attack vectors:

Shell metacharacters in arguments passed to shell-based tools
Oversized arguments to trigger buffer overflows in downstream consumers
Blocklisted patterns hidden via encoding or case manipulation

Severity: High

T4: Token exfiltration

Description: An attacker extracts the service bearer token used for the /v1/reload endpoint, enabling unauthorized policy changes.

Attack vectors:

Reading the token file via a path traversal in a filesystem tool
Exfiltrating the token from process environment or memory
Sniffing the token on the network (if not using TLS)

Severity: High

T5: Denial of service via rate limit abuse

Description: An attacker floods the firewall with requests to exhaust the rate limit budget, blocking legitimate tool calls.

Attack vectors:

Rapid-fire requests to consume the sliding window budget
Distributed requests from multiple compromised agents
Oversized request bodies to consume memory/CPU

Severity: Medium

Mitigations in place

Threat	Mitigation	Implementation
T1: Policy bypass	Default-deny policy	Tools must be explicitly listed in the allow list. Unknown tools are rejected. Deny list is evaluated before allow list.
T2: Path traversal	Path canonicalization and validation	`filepath.Clean` + `filepath.Abs` resolve all `../` sequences. Null bytes are rejected. Paths are matched against allowlist/denylist after canonicalization.
T3: Argument injection	Argument filtering and length limits	`max_arg_length` enforces size limits. `args_blocklist` rejects arguments containing dangerous patterns. Case-insensitive matching.
T4: Token exfiltration	Bearer token auth with constant-time comparison	Token loaded from a file (not environment variable). `crypto/subtle.ConstantTimeCompare` prevents timing attacks. Token file should be root-owned with mode 0400.
T5: DoS via rate limiting	Sliding window rate limiter	Configurable `requests_per_minute` with burst support. Request body size capped at 64 KB via `http.MaxBytesReader`.
General	Minimal attack surface	Single static binary with zero runtime dependencies. No shell access. No network egress. Systemd sandboxing (DynamicUser, PrivateNetwork, seccomp).
General	Audit logging	Every policy decision is logged to a structured JSONL audit trail with timestamp, tool name, parameters, decision, and reason.

Residual risks

Risk	Description	Recommended mitigation
R1: Symlink race conditions	Path validation now resolves symlinks for the deepest existing prefix, but a filesystem race could still swap a path between evaluation and tool execution.	Mount allowlisted directories with `nosymfollow` and ensure the executor opens files relative to a trusted directory descriptor.
R2: Unicode / encoded path tricks	Percent-encoded traversal and known Unicode path confusables are rejected before matching; other filesystem-specific canonicalization quirks may still exist.	Keep deny-by-default policies and add platform-specific tests for any new execution environment.
R3: TOCTOU on policy reload	A race between reading the policy file and applying it could lead to inconsistent state.	The current `sync.RWMutex` serializes reads and writes, but file-level TOCTOU remains if the file is modified during read. Use atomic file replacement (rename).
R4: No TLS by default	The firewall listens on plain HTTP. If exposed beyond localhost, tokens and tool call data are transmitted in cleartext.	Default bind is `127.0.0.1` (localhost only). For multi-host deployments, use a reverse proxy with mTLS termination.
R5: Single-point rate limiter	The rate limiter is in-process and per-instance. Multiple firewall instances do not share rate limit state.	Acceptable for single-appliance deployment. For distributed deployments, use an external rate limiter (e.g., envoy, nginx).
R6: No request authentication on /v1/evaluate	Any process on localhost can submit tool evaluation requests.	Acceptable for appliance mode where only the trusted agent process has access. For multi-tenant deployments, add per-client authentication.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Threat Model

Overview

Trust boundaries

Threats

T1: Policy bypass

T2: Path traversal

T3: Argument injection

T4: Token exfiltration

T5: Denial of service via rate limit abuse

Mitigations in place

Residual risks

FilesExpand file tree

THREAT_MODEL.md

Latest commit

History

THREAT_MODEL.md

File metadata and controls

Threat Model

Overview

Trust boundaries

Threats

T1: Policy bypass

T2: Path traversal

T3: Argument injection

T4: Token exfiltration

T5: Denial of service via rate limit abuse

Mitigations in place

Residual risks