trustabl · trustabl · Jun 4, 2026 · Jun 4, 2026
diff --git a/POLICY_INDEX.md b/POLICY_INDEX.md
diff --git a/docs/Policy/langchain/agent_safety.md b/docs/Policy/langchain/agent_safety.md
@@ -0,0 +1,116 @@
+---
+policy_id: langchain_agent_safety
+category: langchain
+topic: agent_safety
+rules:
+  - id: LC-101
+    severity: high
+    confidence: 0.85
+    scope: agent
+    fix_type: code
+  - id: LC-102
+    severity: medium
+    confidence: 0.8
+    scope: agent
+    fix_type: config
+  - id: LC-111
+    severity: medium
+    confidence: 0.8
+    scope: agent
+    fix_type: config
+references: [LLM06, LLM10]
+---
+
+# Policy Rationale: LangChain Agent Safety
+
+**Policy ID:** `langchain_agent_safety`
+**File:** `langchain/agent_safety.yaml`
+**Rules:** LC-101, LC-102, LC-111
+**Severities:** high, medium
+**Fix types:** code, config
+**References:** LLM06 (Excessive Agency), LLM10 (Unbounded Consumption)
+
+---
+
+## What this policy covers
+
+Agent-scope rules for the constructor-shaped LangChain / LangGraph agents Trustabl
+discovers: `create_react_agent` and `create_agent` (normalized class `ReactAgent` /
+`CreateAgent`) and the legacy `AgentExecutor`. The rules cover the two highest-signal
+agent-level risks: wiring a code-execution/shell built-in tool (LC-101) and an
+unbounded tool-calling loop (LC-102 / LC-111).
+
+A raw `StateGraph` agent is a documented discovery gap: its tools and model are
+assembled across many call sites, so it is not yet modeled as a single agent.
+
+---
+
+## Rule-by-rule defense
+
+### LC-101 — Agent wires a code-execution or shell built-in tool (Severity: high, Confidence: 0.85, Fix type: code)
+
+**What we detect:** a LangChain agent (`ReactAgent` / `CreateAgent` / `AgentExecutor`)
+whose resolved tool set includes `PythonREPLTool`, `PythonAstREPLTool`, or
+`ShellTool` (predicate `agent_uses_hosted_tool_class`). Discovery recognizes these
+built-ins when they appear in the agent's tool list — including the common
+positional form, `create_react_agent(model, [PythonREPLTool()])` — and records them
+as hosted-tool edges.
+
+**Why it is flaggable:** these built-ins execute code or shell commands chosen by
+the model. Once one is on the tool surface, a prompt injection or a confused model
+has a direct path to arbitrary execution in the agent process. `PythonREPLTool` and
+`ShellTool` have been the concrete vector in multiple published LangChain RCE
+advisories. This is excessive agency (LLM06) in its most literal form: the agent is
+granted the ability to run anything.
+
+**Real-world consequence:** an agent built to "answer questions about a CSV" is
+given a `PythonREPLTool`; a crafted question makes it run `__import__('os').system(...)`
+and read the deployment's secrets.
+
+**Severity high:** the capability is the defect; the fix is to remove the built-in or
+sandbox-and-gate it. **Confidence 0.85:** a few agents legitimately need a REPL and
+have sandboxed it out of band, which the class-name match cannot see.
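
The class-name match described for LC-101 can be approximated with a short AST walk. This is a minimal sketch, not Trustabl's `agent_uses_hosted_tool_class` implementation: `find_code_exec_tools` and `_callee_name` are hypothetical names, and the sketch only sees tools written inline in the constructor call, whereas the rule text speaks of a resolved tool set.

```python
import ast
from typing import List, Optional, Tuple

# Class and constructor names are the ones this policy lists.
CODE_EXEC_TOOLS = {"PythonREPLTool", "PythonAstREPLTool", "ShellTool"}
AGENT_CONSTRUCTORS = {"create_react_agent", "create_agent", "AgentExecutor"}


def _callee_name(call: ast.Call) -> Optional[str]:
    """Return the trailing name of a call target: f(...) or pkg.f(...)."""
    func = call.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_code_exec_tools(source: str) -> List[Tuple[str, str, int]]:
    """Hypothetical helper: (constructor, tool_class, line) for each flagged tool."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        constructor = _callee_name(node)
        if constructor not in AGENT_CONSTRUCTORS:
            continue
        # Inspect positional and keyword arguments alike, so both
        # create_react_agent(model, [PythonREPLTool()]) and tools=[...] match.
        arguments = list(node.args) + [kw.value for kw in node.keywords]
        for argument in arguments:
            for inner in ast.walk(argument):
                if isinstance(inner, ast.Call) and _callee_name(inner) in CODE_EXEC_TOOLS:
                    findings.append((constructor, _callee_name(inner), inner.lineno))
    return findings
```

A tool list built in a variable first (`tools = [ShellTool()]`, then passed by name) is invisible to this sketch. Like the class-name match in the rule, it also cannot tell whether a flagged tool is sandboxed out of band.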
+
+### LC-102 — AgentExecutor has no max_iterations limit (Severity: medium, Confidence: 0.8, Fix type: config)
+
+**What we detect:** an `AgentExecutor` with no effective `max_iterations` kwarg
+(predicate `agent_kwarg_missing`).
+
+**Why it is flaggable:** with no iteration ceiling, a model that never emits a final
+answer — it loops calling tools, or oscillates between two — runs until it exhausts
+the API budget or wall-clock (LLM10, Unbounded Consumption). When the looped tools
+have side effects, the runaway loop is also a correctness and safety problem, not
+just a cost one.
+
+**Severity medium:** a cost/availability incident rather than a direct compromise.
+**Confidence 0.8:** an executor wrapped by an external timeout or a custom loop
+guard is over-flagged.
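
The missing-kwarg check can be sketched the same way. This is an illustration under stated assumptions, not the `agent_kwarg_missing` predicate itself: `executors_missing_max_iterations` is a hypothetical name, and treating an explicit `max_iterations=None` as "not effective" is an assumption about what the rule means by an effective limit.

```python
import ast
from typing import List


def executors_missing_max_iterations(source: str) -> List[int]:
    """Hypothetical helper: lines of AgentExecutor(...) calls with no usable limit."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "AgentExecutor":
            continue
        limit = next(
            (kw.value for kw in node.keywords if kw.arg == "max_iterations"), None
        )
        # Assumption: max_iterations=None is present but does not bound the loop.
        explicit_none = isinstance(limit, ast.Constant) and limit.value is None
        if limit is None or explicit_none:
            flagged.append(node.lineno)
    return sorted(flagged)
```

A `**kwargs` spread that supplies the limit at runtime is not resolved here, so such a call would be flagged. That is the same kind of over-flagging the confidence note describes for external timeouts.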
+
+### LC-111 — TypeScript AgentExecutor has no maxIterations limit (Severity: medium, Confidence: 0.8, Fix type: config)
+
+**What we detect:** a TS `AgentExecutor` constructed with no effective
+`maxIterations` option.
+
+**Why it is flaggable / consequence:** the same as LC-102, applied to LangChain.js.
+
+**Severity medium / Confidence 0.8:** same profile as LC-102.
+
+---
+
+## What this policy does not cover
+
+The raw `StateGraph` agent (discovery gap), the `Requests*` SSRF built-ins (recorded
+as hosted edges but not yet a dedicated agent rule), v1 `create_agent` middleware
+quality, and whether a code-execution tool is *actually* sandboxed out of band. The
+iteration rules check `AgentExecutor` only — `create_react_agent` / `create_agent`
+enforce their own recursion limit differently and are out of scope here.
+
+---
+
+## Recommendations beyond the fix
+
+Remove REPL/shell built-ins from production agents; if code execution is required,
+run it in an isolated sandbox and gate it behind a human-in-the-loop approval (a
+LangGraph `interrupt_before` breakpoint or a tool-approval middleware). Set
+`max_iterations` / `maxIterations` (and a `max_execution_time`) sized to the task,
+and choose `handle_parsing_errors` deliberately, so that a malformed model step
+either surfaces as an error or is retried within the iteration ceiling instead of
+retrying forever.
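
To make the two ceilings concrete, here is a toy loop in plain Python. It is a sketch of the idea only and not LangChain's executor: `run_bounded` and `step` are hypothetical, and the parameter names simply mirror `max_iterations` and `max_execution_time` for readability.

```python
import time
from typing import Callable, Optional


def run_bounded(
    step: Callable[[int], Optional[str]],
    max_iterations: int = 10,
    max_execution_time: float = 30.0,
) -> str:
    """Call step(i) until it returns a final answer or a ceiling is hit.

    step is a stand-in for one model-plus-tool round: it returns None to
    keep looping, or a string when it has a final answer.
    """
    deadline = time.monotonic() + max_execution_time
    for i in range(max_iterations):
        # Check the wall-clock ceiling before spending another round.
        if time.monotonic() >= deadline:
            return "stopped: time limit"
        answer = step(i)
        if answer is not None:
            return answer
    return "stopped: iteration limit"
```

A step function that never returns an answer stops after `max_iterations` rounds instead of running until the budget is gone. The time check covers the case where each round is slow rather than numerous.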
diff --git a/docs/Policy/langchain/code_execution.md b/docs/Policy/langchain/code_execution.md
@@ -0,0 +1,101 @@
+---
+policy_id: langchain_code_execution
+category: langchain
+topic: code_execution
+rules:
+  - id: LC-004
+    severity: high
+    confidence: 0.85
+    scope: tool
+    fix_type: code
+  - id: LC-012
+    severity: high
+    confidence: 0.85
+    scope: tool
+    fix_type: code
+references: [LLM05, LLM06]
+---
+
+# Policy Rationale: LangChain Dynamic Code Execution
+
+**Policy ID:** `langchain_code_execution`
+**File:** `langchain/code_execution.yaml`
+**Rules:** LC-004, LC-012
+**Severities:** high
+**Fix types:** code
+**References:** LLM05 (Improper Output Handling), LLM06 (Excessive Agency)
+
+> **Read [openai_sdk/code_execution.md](../openai_sdk/code_execution.md) for the
+> full threat model.** This document covers the LangChain-specific differences only.
+
+---
+
+## What this policy covers
+
+LangChain tools whose body evaluates code at runtime. Python (LC-004) fires on a
+bare `eval` / `exec` / `compile` callee (predicate `has_code_exec_call`, an AST
+walk). TypeScript (LC-012) reads the `code_exec` discovery fact, set when a handler
+calls `eval` or constructs `new Function(...)`.
+
+---
+
+## Why dynamic evaluation is a distinct concern in LangChain agents
+
+The mechanism is identical to the OpenAI case — a model-influenced string reaching
+an interpreter is arbitrary code execution; see
+[openai_sdk/code_execution.md](../openai_sdk/code_execution.md). The
+LangChain-specific note is that this ecosystem *ships* code execution as a feature:
+`PythonREPLTool` / `PythonAstREPLTool` (flagged at agent scope by LC-101) are built-ins,
+and the pandas/CSV "dataframe" agents are built on a Python REPL. Hand-rolling `eval()` inside
+a `@tool` reproduces that capability with none of the (already thin) sandboxing the
+REPL tools attempt, and hides it inside an ordinary tool body. The result reaches
+the model and the user unsanitized (LLM05), and the model can drive it (LLM06).
+
+---
+
+## Rule-by-rule defense
+
+### LC-004 — Python tool body evaluates dynamic code (Severity: high, Confidence: 0.85, Fix type: code)
+
+**What we detect:** a Python LangChain tool whose body calls `eval`, `exec`, or
+`compile` as a bare builtin (so `re.compile` and other attribute calls are not
+flagged).
+
+**Why it is flaggable / consequence:** a tool that evaluates its string input can
+be steered by prompt injection to run attacker-chosen Python in the agent process —
+read secrets, pivot to the network, or rewrite state.
+
+**Severity high:** the fix is to remove the evaluation or sandbox it; partial input
+filtering does not contain `eval`. **Confidence 0.85:** the bare-callee match
+avoids the obvious false positives, but a tool that only ever evaluates a trusted
+constant is over-flagged.
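
The bare-callee distinction is easy to see in the AST: a bare `eval(...)` has an `ast.Name` callee, while `re.compile(...)` has an `ast.Attribute` callee. The following is a minimal sketch of that idea, assuming nothing about how `has_code_exec_call` is actually written; `bare_code_exec_calls` is a hypothetical name, and the sketch scans a whole source string rather than only tool bodies.

```python
import ast
from typing import List, Tuple

DYNAMIC_EVAL_BUILTINS = {"eval", "exec", "compile"}


def bare_code_exec_calls(source: str) -> List[Tuple[str, int]]:
    """Hypothetical helper: (builtin, line) for each bare eval/exec/compile call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only Name callees count; attribute calls such as re.compile(...)
        # are skipped on purpose.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DYNAMIC_EVAL_BUILTINS
        ):
            hits.append((node.func.id, node.lineno))
    return sorted(hits, key=lambda hit: hit[1])
```

An alias such as `run = eval` followed by `run(text)` slips past this kind of match. That is in line with the indirect cases this policy lists as not covered.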
+
+### LC-012 — TypeScript tool evaluates dynamic code (Severity: high, Confidence: 0.85, Fix type: code)
+
+**What we detect:** a TS LangChain tool whose handler calls `eval()` or
+`new Function(...)` (the `code_exec` fact).
+
+**Why it is flaggable / consequence:** the same as LC-004, in the Node runtime:
+`eval` and `new Function` both evaluate a string as code, so a model-influenced
+argument is RCE.
+
+**Severity high / Confidence 0.85:** same profile as LC-004.
+
+---
+
+## What this policy does not cover
+
+Indirect evaluation (`importlib`, `pickle.loads`, `vm.runInContext`, a templating
+engine with code execution), evaluation behind a cross-module helper, and the
+`PythonREPLTool` built-in itself (agent scope, LC-101). Whether a given evaluated
+string is attacker-reachable is not proven — the presence of the primitive is the
+signal.
+
+---
+
+## Recommendations beyond the fix
+
+Parse structured input with a real parser (`ast.literal_eval` / `JSON.parse` / a
+typed schema) instead of evaluating it. If code execution is genuinely the product,
+run it in a locked-down sandbox (no filesystem, no network, no credentials, hard
+timeout) and gate it behind a human approval rather than letting the model invoke
+it unattended.
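
As a small illustration of the parser-instead-of-`eval` advice, the sketch below wraps `ast.literal_eval`, which accepts only Python literals and rejects calls, names, and attribute access. `parse_literal` is a hypothetical helper name.

```python
import ast
from typing import Any


def parse_literal(text: str) -> Any:
    """Parse a Python literal (numbers, strings, lists, dicts, ...) without running code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError) as err:
        # Anything that is not a plain literal is rejected rather than evaluated.
        raise ValueError(f"not a plain literal: {text!r}") from err
```

A tool that receives `"{'rows': [1, 2, 3]}"` gets a dict back, while `"__import__('os').system('id')"` raises `ValueError` instead of executing. For input arriving as JSON, `json.loads` plays the same role.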
diff --git a/docs/Policy/langchain/repo_hygiene.md b/docs/Policy/langchain/repo_hygiene.md
@@ -0,0 +1,94 @@
+---
+policy_id: langchain_repo_hygiene
+category: langchain
+topic: repo_hygiene
+rules:
+  - id: LC-201
+    severity: low
+    confidence: 0.9
+    scope: repo
+    fix_type: config
+references: [LLM06]
+---
+
+# Policy Rationale: LangChain Repo Hygiene
+
+**Policy ID:** `langchain_repo_hygiene`
+**File:** `langchain/repo_hygiene.yaml`
+**Rules:** LC-201
+**Severities:** low
+**Fix types:** config
+**References:** LLM06 (Excessive Agency)
+
+> This rule is one of the cross-SDK "missing agent-guidance doc" family. See
+> [openai_sdk/repo_hygiene.md](../openai_sdk/repo_hygiene.md) and
+> [google_adk/repo_hygiene.md](../google_adk/repo_hygiene.md) for the shared
+> rationale; this document covers the LangChain-specific framing.
+
+---
+
+## What this policy covers
+
+A repo that uses LangChain / LangGraph in code (`SDKLangChain` observed in the
+inventory) but ships no agent-guidance doc — neither `AGENTS.md` (the cross-vendor
+convention) nor `CLAUDE.md` — at any depth. Fires once per scan
+(`repo_has_sdk_in_code: [langchain]` AND NOT `repo_component_present: [agents_md,
+claude_md]`). It carries no `language` field, so it fires for both Python and
+TypeScript LangChain repos.
+
+---
+
+## Why a missing guidance doc matters for a LangChain repo
+
+A coding agent editing the repo reads `AGENTS.md` or `CLAUDE.md` before it acts. With neither file present,
+any agent that opens this repo has no project-specific guidance on the choices that
+make LangChain code safe or unsafe, and LangChain offers an unusually wide menu of
+those choices:
+
+- which agent constructor to use — `create_react_agent` (deprecated), the v1
+  `create_agent`, the legacy `AgentExecutor`, or a raw `StateGraph`;
+- how tools must be defined (typed `args_schema`, descriptions) and guarded;
+- whether the REPL/shell built-ins (`PythonREPLTool`, `ShellTool`) are permitted at
+  all, and behind what sandbox/approval;
+- the local test, lint, and build commands.
+
+Without that guidance, a generative agent reaches for the most-documented pattern —
+often the deprecated one, or a REPL tool — and produces code that violates the
+project's tool and agent contracts. That is a slow-acting excessive-agency risk
+(LLM06): nothing in-tree teaches the next agent the local rules.
+
+---
+
+## Rule-by-rule defense
+
+### LC-201 — LangChain project ships no agent-guidance doc (Severity: low, Confidence: 0.9, Fix type: config)
+
+**What we detect:** `SDKLangChain` in `SDKsDetected` and no `agents_md` or
+`claude_md` component anywhere in the repo.
+
+**Why it is flaggable:** the absence is the signal; it is a hygiene gap, not a
+runtime vulnerability.
+
+**Severity low:** advisory. **Confidence 0.9:** the check is a near-deterministic
+file-presence test; the residual uncertainty is a repo that documents agent guidance
+under a non-standard filename the component scan does not recognize.
+
+**Fix type — config:** adding a doc file is a repo-config change, not a code edit.
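
The file-presence half of the rule can be sketched in a few lines. This is an approximation, not the component scan itself: `has_agent_guidance_doc` is a hypothetical name, and exact-case filename matching is an assumption.

```python
from pathlib import Path

GUIDANCE_FILENAMES = {"AGENTS.md", "CLAUDE.md"}


def has_agent_guidance_doc(repo_root: str) -> bool:
    """Hypothetical helper: True if AGENTS.md or CLAUDE.md exists at any depth."""
    return any(
        path.is_file() and path.name in GUIDANCE_FILENAMES
        for path in Path(repo_root).rglob("*.md")
    )
```

The rule would fire when this returns `False` and LangChain usage was observed in code. A real scan would likely skip vendored directories such as `node_modules`, which this sketch does not do.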
+
+---
+
+## What this policy does not cover
+
+The *quality* of a present `AGENTS.md`/`CLAUDE.md` (any such file silences the
+rule), guidance docs under non-standard names, and whether the documented
+conventions are actually followed in code.
+
+---
+
+## Recommendations beyond the fix
+
+Add an `AGENTS.md` at the repo root (a `CLAUDE.md` also satisfies the rule). State
+which LangChain agent constructors the project uses and why, how tools must be
+defined and guarded, whether the REPL/shell built-ins are permitted and behind what
+gate, and the exact test, lint, and build commands. Keep it short and concrete so an
+editing agent can act on it without re-deriving the conventions.