From 6b39579305e7e052bc2aa4f1416ec15bafe09f33 Mon Sep 17 00:00:00 2001 From: agent-of-mkmeral Date: Wed, 17 Jun 2026 04:14:51 +0000 Subject: [PATCH 1/2] docs: add Strands Shell SDK documentation section --- site/src/config/navigation.yml | 8 + .../docs/user-guide/shell-sdk/commands.mdx | 133 ++++++++++++++ .../user-guide/shell-sdk/configuration.mdx | 171 ++++++++++++++++++ .../docs/user-guide/shell-sdk/index.mdx | 87 +++++++++ .../docs/user-guide/shell-sdk/mcp-server.mdx | 107 +++++++++++ .../docs/user-guide/shell-sdk/quickstart.mdx | 161 +++++++++++++++++ .../docs/user-guide/shell-sdk/security.mdx | 66 +++++++ 7 files changed, 733 insertions(+) create mode 100644 site/src/content/docs/user-guide/shell-sdk/commands.mdx create mode 100644 site/src/content/docs/user-guide/shell-sdk/configuration.mdx create mode 100644 site/src/content/docs/user-guide/shell-sdk/index.mdx create mode 100644 site/src/content/docs/user-guide/shell-sdk/mcp-server.mdx create mode 100644 site/src/content/docs/user-guide/shell-sdk/quickstart.mdx create mode 100644 site/src/content/docs/user-guide/shell-sdk/security.mdx diff --git a/site/src/config/navigation.yml b/site/src/config/navigation.yml index 662f48c3f..1a79e97b6 100644 --- a/site/src/config/navigation.yml +++ b/site/src/config/navigation.yml @@ -231,6 +231,14 @@ sidebar: - docs/user-guide/evals-sdk/how-to/result_caching - docs/user-guide/evals-sdk/how-to/experiment_management - docs/user-guide/evals-sdk/how-to/serialization + - label: Strands Shell SDK + items: + - docs/user-guide/shell-sdk + - docs/user-guide/shell-sdk/quickstart + - docs/user-guide/shell-sdk/configuration + - docs/user-guide/shell-sdk/commands + - docs/user-guide/shell-sdk/mcp-server + - docs/user-guide/shell-sdk/security - docs/user-guide/versioning-and-support - label: Examples diff --git a/site/src/content/docs/user-guide/shell-sdk/commands.mdx b/site/src/content/docs/user-guide/shell-sdk/commands.mdx new file mode 100644 index 000000000..6fb90711e --- 
/dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/commands.mdx @@ -0,0 +1,133 @@ +--- +title: Strands Shell Commands +description: The full command inventory in Strands Shell, with supported flags, shell builtins, and known divergences from GNU and BSD coreutils behavior. +sidebar: + label: "Commands" +--- + +Strands Shell reimplements a curated subset of POSIX and coreutils in Rust, targeting the operations agents reach for most rather than a complete toolset. This page is the reference for what is supported and where behavior diverges from GNU or BSD: missing flags, unsupported features, and known gaps. + +The shell is Bourne-compatible, with pipes, loops, functions, and subshells. The commands group into text processing, file contents, file management, path and system utilities, networking, JSON, search, and scripting. + +## General callouts + +These notes apply across commands: + +- Regex uses the Rust `regex` crate. Backreferences and lookaround are unavailable, so `grep -P` is unsupported and GNU BRE escapes are not translated. +- `jq` is backed by [`jaq`](https://github.com/01mf02/jaq), a jq subset. +- Unsupported flags are rejected, not ignored. Idioms such as `cp -p`, `set -o pipefail`, or `ln -sf` produce an error rather than silently doing the wrong thing. +- Multiple file arguments are not uniformly supported. `cut` and `uniq` read only the first file; `head` and `tail` return a hard error; `cat`, `sort`, and `wc` handle multiple files correctly. +- Malformed numeric input passes silently. `test`, `[`, and arithmetic `$(( ))` treat non-numeric or empty operands as `0` without an error. +- Stdin under `strands-shell -c` is not connected to commands (`bad fd 0`). Use an in-shell pipe instead. + +## Text processing + +| Command | Notable gaps | +|---|---| +| `grep` | No `-P`, backreferences, lookaround, or `-f`. `-o` on empty-matching patterns emits blank lines. | +| `sed` | No branching (`b`, `t`, `:label`) or multiline (`N`, `D`, `P`); no `-f`. 
`s///N` replaces the wrong match; range `c` emits per line. | +| `tr` | Missing `[:punct:]`, `[:cntrl:]`, and `[c*n]` repeats. `-c` two-set translate uses the wrong replacement character. | +| `cut` | No `-b` or `--complement`. Reads only the first file when given several. | +| `sort` | No `-c`, `-o`, `-V`, or `-h` (`-h` prints help instead). Keyed-tie order is non-deterministic. Loads all input into memory. | +| `uniq` | No `-w` or `-D`. Reads only the first file when given several. | +| `wc` | No `-m` (char count) or `-L`. Counts are correct; column padding differs from coreutils. | + +## File contents + +| Command | Notable gaps | +|---|---| +| `cat` | No `-b`, `-s`, `-A`, `-e`, or `-t`; no `-` stdin operand. | +| `head` | No `-c`, negative `-n`, or `head -5` shorthand. Multiple files produce a hard error (no `==>` headers). | +| `tail` | No `-c`, `-f` or `-F` follow, or shorthand. Multiple files produce a hard error. | +| `tee` | No `-i`. Standard usage (stdout plus files, `-a`) is supported. | + +## File management + +| Command | Notable gaps | +|---|---| +| `cp` | `-r` works, including nested directories; no `-p`, `-a`, `-f`, or `-i`. `cp f f` silently succeeds. | +| `mv` | Rename, move, and cross-mount operations work; no `-f`, `-i`, `-n`, or `-v`. | +| `rm` | `-r`, `-f`, and symlink non-follow behave correctly; no `-d` or `-i`. | +| `mkdir` | No `-m`. An invalid-option path returns exit 0. | +| `rmdir` | No `-p`. | +| `touch` | File creation and timestamp update only; no `-t`, `-d`, `-c`, or `-r` (cannot set a specific time). | +| `ln` | `-s` only; hard links are refused by design; no `-f`, so `ln -sf` fails. | +| `chmod` | Octal and symbolic modes are supported; no `-R`. | + +## Path and system + +| Command | Notable gaps | +|---|---| +| `ls` | `-l` omits owner, group, and link count; `-a` omits `.` and `..`. No `-t`, `-S`, `-d`, `-F`, or `-h`. Output is always single-column. | +| `basename` | No `-a` or `-s`. `basename /` and `""` diverge from coreutils. 
| +| `dirname` | Single operand only. A trailing slash is mishandled (`/usr/lib/` becomes `/usr/lib`). | +| `readlink` | Plain read only; no `-f`, `-e`, or `-m` canonicalization. | +| `mktemp` | No `-u` or `-t`. | +| `date` | `%s` is not expanded; unknown specifiers pass through literally; no `-d` or `-r`. UTC only. | +| `env` | Print-only. `env VAR=v cmd`, `-i`, and `-u` are unsupported; use the shell-prefix form instead. | +| `echo` | Expands escapes by default; `-e` and `-E` are printed literally (escape expansion cannot be disabled). | +| `pwd` | Full support, including `-L` and `-P`. | +| `sleep` | No unit suffixes (`0.1s`). Respects the shell timeout. | +| `true`, `false` | Full support. | + +## Networking + +| Command | Notable gaps | +|---|---| +| `curl` | GET, POST, JSON, headers, auth, redirects, and `-w` are supported. No `--max-time`, `--connect-timeout`, or `--retry` (risk of an indefinite hang); no `-I`, `-F`, `-A`, `-G`, or cookie jar. SSRF and credential controls are enforced. | + +The SSRF guard and credential injection on `curl` are security features, covered in the [security model](security.md). They are not optional and cannot be disabled per command. + +## JSON + +| Command | Notable gaps | +|---|---| +| `jq` (jaq) | Broad filter coverage. A missing nested key throws an error, which breaks the `.a.b // default` pattern; no `--arg`, `--argjson`, or `-S`; `inputs` and `setpath` are absent. | + +## Search + +| Command | Notable gaps | +|---|---| +| `find` | `-name`, `-type`, `-maxdepth`, `-exec`, and `-print0` are supported. No `-delete`, `-size`, `-mtime`, `-prune`, or `-regex`. Children are sorted alphabetically, not in readdir order. | +| `xargs` | Space-separated `-n`, `-I`, `-0`, and `-d` are supported; attached forms (`-n1`, `-I{}`) and `-t`, `-r`, `-L`, `-P` are not. Implicitly behaves as `-r`. | + +## Scripting + +| Command | Notable gaps | +|---|---| +| `lua` | Full support. Sandboxed Lua 5.4 with VFS-backed `io` and `os`. 
Metatables are disabled (`setmetatable` removed); no `os.date` or `debug`. No wall-clock timeout under a bare `-c`. | + +Reach for `lua` when shell syntax gets awkward: multi-step transforms, structured data manipulation, or logic that would otherwise sprawl across a chain of pipes. + +## Shell builtins + +| Builtin | Notable gaps | +|---|---| +| `test`, `[` | No `[[ ]]`. Non-numeric or empty operands are treated as `0` (succeeds rather than erroring). | +| `printf` | No floating-point (`%f`, `%e`, `%g` print literally); `%d` does not parse `0x` or octal prefixes. | +| `read` | No `-a`, `-n`, or `-d`; a prefix `IFS=` is ignored; `-r` is effectively always on. | +| `set` | No `-o`, so `set -o pipefail` fails; only `e`, `u`, and `x` are supported. | +| `trap` | `EXIT` works; numeric signals (`trap ... 0`) are ignored; `-p` and `-l` are no-ops. | +| `alias` | Can be defined and listed, but aliases are never expanded during execution. | +| `readonly` | Enforced, but a violation returns exit 0; no `-p`. | +| `local` | Works within functions; silently succeeds when used outside a function. | +| `type` | No `-t` or `-a`. | +| `cd` | No `CDPATH`; `-P` and `-L` are not parsed. | +| `getopts`, `export`, `unset`, `shift`, `umask`, `hash`, `wait`, `:` | Full support. | + +Arithmetic `$(( ))` handles precedence, bitwise operators, the ternary, hex, and octal correctly. Malformed input returns a wrong result with exit 0 (`$((2 3))` yields `2`, `$((1/0))` yields `0`). Post-increment `x++` and `x--` return the value without updating the variable. 
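
Because malformed numeric input passes silently in `test`, `[`, and `$(( ))`, host code that interpolates values into shell arithmetic may want to validate them before building the command. The helper below is a minimal sketch in plain Python. The names `require_int` and `arithmetic_command` are hypothetical and are not part of the Strands Shell API.

```python
import re

# Hypothetical helper names for illustration; not part of the Strands Shell API.
_INT_RE = re.compile(r"[+-]?[0-9]+")


def require_int(value: str) -> int:
    """Return value as an int, or raise ValueError if it is not a plain decimal integer."""
    if not _INT_RE.fullmatch(value):
        raise ValueError(f"not an integer: {value!r}")
    return int(value)


def arithmetic_command(left: str, op: str, right: str) -> str:
    """Build an `echo $(( ... ))` command string from validated operands only."""
    if op not in {"+", "-", "*", "/"}:
        raise ValueError(f"unsupported operator: {op!r}")
    a, b = require_int(left), require_int(right)
    if op == "/" and b == 0:
        # The shell is documented to yield 0 with exit status 0 for $((1/0)),
        # so the host rejects the input here instead.
        raise ValueError("division by zero")
    return f"echo $(( {a} {op} {b} ))"


print(arithmetic_command("2", "+", "3"))
```

Validating on the host side turns a silent wrong answer into an explicit `ValueError` before anything reaches the shell.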
+ +## Shell language + +| Feature | Supported | +|---|---| +| Pipelines and lists | `\|`, `&&`, `\|\|`, `;` | +| Redirections | `>`, `>>`, `<`, `2>`, `&>`, here-docs `<<` | +| Conditionals | `if`, `elif`, `else`, `case` | +| Loops | `for`, `while`, `until` | +| Functions | definitions, `local` variables, `return` | +| Grouping | command groups `{ }`, subshells `( )` | +| Expansion | variables (`${VAR:-default}`, `${VAR%pat}`, `${#VAR}`), command substitution (`` `cmd` ``, `$(cmd)`), arithmetic `$(( ))`, globs | +| Quoting | single, double | +| Jobs | background `&` | +| Scripts | `. script.sh`, `source` | diff --git a/site/src/content/docs/user-guide/shell-sdk/configuration.mdx b/site/src/content/docs/user-guide/shell-sdk/configuration.mdx new file mode 100644 index 000000000..8707660f7 --- /dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/configuration.mdx @@ -0,0 +1,171 @@ +--- +title: Configuring Strands Shell +description: Bind host directories, inject credentials per URL, set the network allowlist, and tune resource limits for Strands Shell, in code or from a TOML file. +sidebar: + label: "Configuration" +--- + +A fresh shell can reach nothing. You open it up by declaring three things: which directories the agent sees, which credentials get injected on which requests, and which URLs the agent may reach. This guide covers each, plus resource limits and the TOML format that captures the whole policy in one file. + +Every option is available both as a constructor argument and as a TOML key. Use the constructor when the policy is dynamic (per session, computed at runtime); use TOML when the policy is static and you want it under version control or shared with the [MCP server](mcp-server.md). + +The examples below use the Python API. The Node.js API takes the same options as a config object with camelCase keys (`allowedUrls`, `configFile`, `maxOutput`), and the TOML format is identical across both. 
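
The naming relationship between the two APIs can be sketched as a mechanical conversion. This assumes the Node.js keys are always the snake_case Python names converted to camelCase, which is what the three keys listed above suggest but which this guide does not state as a rule. `to_camel_case` is a hypothetical helper, not part of either API.

```python
def to_camel_case(name: str) -> str:
    """Convert a snake_case option name to camelCase."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# The three option names mentioned in this guide.
for key in ["allowed_urls", "config_file", "max_output"]:
    print(key, "->", to_camel_case(key))
```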
+ +## Binds + +A bind maps a host directory into the shell's virtual filesystem. The agent sees the destination path; everything outside a bound path does not exist. + +Each bind has a mode that decides how host and sandbox relate: + +- `copy` snapshots the host directory into the VFS at construction time. The agent's reads and writes stay inside the sandbox and never touch your host files. Use this for source code. +- `direct` passes reads and writes through to the host in real time. The agent can modify your live files, and host-side changes after construction are visible to the agent. Use this only for designated output directories. + +```python +import strands_shell + +shell = strands_shell.Shell( + binds=[ + strands_shell.Bind("/host/project", "/workspace", mode="copy"), + strands_shell.Bind("/tmp/output", "/output", mode="direct"), + ], +) +``` + +Add `readonly=True` to reject writes through a mount even in `direct` mode, which lets you expose a live host directory for reading without risking modification. + +```python +strands_shell.Bind("/host/reference", "/ref", mode="direct", readonly=True) +``` + +:::caution +A `direct` bind is live. The agent can read and modify host files in real time, including deleting them. Never `direct`-bind a directory that holds secrets, credentials, or configuration you do not want the agent to change. When in doubt, use `copy`. +::: + +## Credentials + +A credential rule attaches a secret to a URL prefix. When a command makes a request to a matching URL, the Kernel injects the secret as a bearer token at request time. The agent never holds the value: it does not appear in the environment, in command output, or in the Lua scripting context. + +Provide the secret one of two ways, and exactly one: an inline `token`, or an `env_var` resolved against the process environment when the shell is constructed. 
+ +```python +shell = strands_shell.Shell( + credentials=[ + strands_shell.Cred("https://api.example.com/", env_var="API_TOKEN"), + ], + allowed_urls=["https://api.example.com/"], +) + +# The bearer token from $API_TOKEN is injected automatically. +result = shell.run("curl https://api.example.com/v1/status") +``` + +The Kernel matches on URL prefix with a path-boundary check, and it injects only on the original request. It never re-injects on a redirect, even a redirect back to the same host, which closes the credential-exfiltration path where an agent follows a redirect to a logging endpoint. + +## Network access + +By default `curl` blocks private address ranges (RFC1918, link-local, loopback, and cloud metadata endpoints such as IMDS) while letting public URLs through. The block happens at DNS-resolution time, so a public hostname that resolves to a private address is still blocked. + +To permit a specific internal host, add its URL prefix to `allowed_urls`. + +```python +shell = strands_shell.Shell( + allowed_urls=["https://internal-api.corp.example.com/"], +) +``` + +:::caution +Do not allowlist a bare scheme like `https://`. A prefix that broad permits every host and disables the SSRF guard entirely. List the specific endpoints the agent needs. +::: + +## Behavioral settings + +Three top-level settings tune how commands run: + +- `env` seeds environment variables into the shell. +- `timeout` sets a per-command wall-clock limit in seconds. It defaults to 30, which bounds runaway commands out of the box; raise it for long-running work. It must be a positive, finite number. +- `umask` sets the file-creation mask. The default is `0o022`. + +```python +shell = strands_shell.Shell( + env={"PROJECT": "demo"}, + timeout=30.0, +) +``` + +## Resource limits + +Resource caps go in a single `Limits` bundle, separate from behavioral settings, so protective caps stay visually distinct from runtime behavior. 
Override only the caps you care about; the rest keep their defaults. + +```python +shell = strands_shell.Shell( + limits=strands_shell.Limits( + max_output=1 << 20, + max_file_size=10 << 20, + ), +) +``` + +| Limit | Default | Caps | +|---|---|---| +| `max_output` | 1 MiB | Bytes of output captured from a single command | +| `max_file_size` | 10 MiB | Bytes a single file may reach on write or read | +| `max_fds` | 128 | Open file descriptors at once | +| `max_bg_jobs` | 8 | Concurrent background jobs | +| `max_pipeline` | 16 | Stages in a single pipeline | +| `max_input` | 1 MiB | Bytes of a single input | +| `max_inodes` | 10,000 | Total files and directories in the VFS | +| `max_depth` | 64 | Directory nesting depth | + +These caps are best-effort. They stop a runaway agent from exhausting memory or hanging; they are not a defense against an adversary actively trying to break out. For hard guarantees, see the [security model](security.md). + +## TOML configuration + +When the policy is static, declare it in a TOML file. The Python and Node.js constructors accept a config file path, and the [MCP server](mcp-server.md) reads the same format through its `--config` flag. Pass a file and explicit constructor arguments together, and the explicit arguments win. + +```toml +# Top-level keys must appear before the first table header. +umask = "022" +allowed_urls = ["https://api.openai.com/"] + +[[bind]] +mode = "copy" +source = "/host/project" +destination = "/workspace" + +[[bind]] +mode = "direct" +source = "/tmp/output" +destination = "/output" + +[[cred]] +url = "https://api.openai.com/v1/" +methods = ["POST"] +kind = "bearer" +api_key_env = "OPENAI_API_KEY" + +[env] +PROJECT = "demo" + +[limits] +timeout = 30 +max_output = 1048576 +``` + +Load it in code: + +```python +shell = strands_shell.Shell(config_file="sandbox.toml") +``` + +A few rules the parser enforces: + +- Unknown keys are rejected, so a typo like `timeout_seconds` fails loudly instead of being silently ignored. 
+- `bind` mode defaults to `copy` when omitted, which is the safe choice. +- A `cred` entry needs a `kind` (`bearer` or `query`) and exactly one of `api_key` or `api_key_env`. A `query` credential also needs a `param` naming the query parameter. +- `timeout` is in whole seconds and defaults to 30 when the key is absent. A value of `0` is rejected, because it would expire every command immediately. + +## Next steps + +- [Commands](commands.md): the full command inventory and supported flags. +- [MCP Server](mcp-server.md): serve this configuration to any MCP-compatible agent. +- [Security Model](security.md): how binds, credential injection, and the SSRF guard hold up, and where they stop. diff --git a/site/src/content/docs/user-guide/shell-sdk/index.mdx b/site/src/content/docs/user-guide/shell-sdk/index.mdx new file mode 100644 index 000000000..1230a6898 --- /dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/index.mdx @@ -0,0 +1,87 @@ +--- +title: Strands Shell +description: Strands Shell is an in-process shell sandbox that gives agents a fast, isolated command line with filesystem, network, and credential mediation built in. +sidebar: + label: "Overview" +--- + +Agents run shell commands in tight loops: install dependencies, run tests, grep for errors, iterate. Those loops need to be fast, and they need to be contained. An agent that can run `curl` can also read your cloud credentials, reach your internal network, and overwrite files you never meant to expose. + +Strands Shell is a Bourne-compatible shell that runs inside your own process. It ships `grep`, `sed`, `jq`, `curl`, `find`, and dozens of other commands without ever calling `fork`, `exec`, or a raw syscall. You declare what the agent can reach (files, URLs, credentials) up front, and everything else does not exist as far as the agent is concerned. + +## Why a virtual shell + +Most agent setups reach for a container or a cloud sandbox to isolate command execution. 
Both work, and both cost you something on every command: a Docker container adds a cold start and a daemon to manage, a cloud sandbox adds a network round trip and a platform to depend on. When an agent runs hundreds of commands per task, that overhead compounds. + +Strands Shell takes a different position. The isolation boundary is an in-process virtual filesystem and a mediation layer, not an operating-system primitive. There is no container to start and no VM to provision, so constructing a shell and running a command costs under a millisecond. + +| | Docker | Cloud sandbox | Strands Shell | +|---|---|---|---| +| Cold start | ~200ms | ~1s (network) | under 1ms | +| Isolation | Container namespace | MicroVM | In-process VFS | +| Network | iptables or sidecar | Platform policy | URL allowlist plus SSRF guard | +| Secrets | Environment variables the agent can read | Platform-specific | Injected per request, agent never sees them | +| Setup | Docker daemon | API key plus network | `pip install strands-shell` | +| Platforms | Linux | Cloud only | macOS, Linux, WASM | + +This is a different tradeoff, not a strictly better one. A container isolates at the kernel; Strands Shell isolates at the process. The [security model](#what-strands-shell-is-not) section below draws that line precisely, because picking the wrong tool for an adversarial workload is a real risk. + +## How it works + +Your code talks to the shell through one of three surfaces: an MCP server, the Python API, or the Node.js API. Every command, file read, and network request flows through the Kernel, which is the single mediation boundary. 
+ +```mermaid +flowchart TB + agent["Your agent code"] + agent -->|"MCP, Python, or Node.js"| shell + + subgraph shell ["Strands Shell"] + direction TB + subgraph kernel ["Kernel (mediation boundary)"] + vfs["VFS: isolated filesystem"] + net["Network: SSRF guard plus allowlist"] + creds["Credentials: injected per URL"] + limits["Limits: timeout, output, fds"] + end + engine["Shell engine: parser, builtins, commands, Lua"] + end +``` + +The engine parses and runs shell syntax: pipelines, loops, functions, subshells. When a command needs to touch the outside world, it asks the Kernel, and the Kernel decides. The filesystem is an in-memory VFS with explicit bind mounts. Network access goes through an SSRF guard that blocks private address ranges by default. Credentials are injected per request and stripped before the agent can read them. + +State persists across `run` calls. Environment variables, the working directory, and shell functions set in one command are visible in the next, so an agent can `cd` into a directory and keep working there across turns. + +Strands Shell is written in Rust and compiles from one source to native bindings for Python (via PyO3) and Node.js (via napi-rs), plus a WASM target. The three language surfaces are intentionally parallel: the same config shape, the same command set, the same mediation guarantees. + +## What you control + +A fresh shell is an empty sandbox: no files, no network, no credentials. You grant access explicitly through three mechanisms. + +**Binds** map a host directory into the shell's filesystem. A `copy` bind snapshots the directory at construction time, isolating the agent from your live files. A `direct` bind passes reads and writes through to the host in real time. Prefer `copy` for source code and reserve `direct` for output directories. + +**Credentials** attach a secret to a URL prefix. When a command makes a request to a matching URL, the Kernel injects the credential at request time. 
The agent never holds the secret, and the Kernel never re-injects it on a redirect, even back to the same host. + +**Allowed URLs** widen the network policy. By default the SSRF guard blocks private ranges (RFC1918, link-local, loopback, and cloud metadata endpoints) while letting public URLs through. Add a prefix to the allowlist to permit a specific internal host. + +The [Configuration](configuration.md) guide covers each of these in depth, including the TOML format that lets you declare the whole policy in a file. + +## What Strands Shell is not + +Strands Shell is a mediation layer, not a hardened sandbox. The Kernel enforces what the agent *should* reach through deny-by-default policy, and it runs in the same process as your code. It does not protect against memory-safety exploits in the shell engine itself, timing side channels, or an attacker who already controls the host process. + +The distinction matters for your threat model: + +- For "my agent should not touch anything I have not explicitly allowed," the Kernel handles it. This is the common case: a coding agent, a research agent, a CI assistant. +- For "an untrusted tenant is running arbitrary adversarial code," you need OS-level isolation. Run each Strands Shell instance inside a container or microVM, and let the Kernel handle the in-process mediation on top. + +Resource limits (timeouts, output caps, file-descriptor and inode limits) are best-effort. They stop a runaway agent from filling memory or hanging forever; they do not stop someone actively trying to break out. For hard guarantees, use OS-level cgroups. + +A Strands Shell instance is single-owner. If you serve multiple agents, create one shell per session. Construction is cheap (no containers, no VMs, just an in-memory VFS), so spinning one up per request is the intended pattern, not a workaround. + +## Next steps + +- [Quickstart](quickstart.md): install the shell and run your first sandboxed command through MCP, Python, or Node.js. 
+- [Configuration](configuration.md): bind directories, inject credentials, set the network allowlist, and load it all from TOML. +- [Commands](commands.md): the full command inventory, supported flags, and known gaps versus GNU coreutils. +- [MCP Server](mcp-server.md): expose the shell to any MCP-compatible agent framework over stdio. +- [Security Model](security.md): the Kernel boundary, the SSRF guard, credential handling, and how to layer OS isolation for adversarial workloads. diff --git a/site/src/content/docs/user-guide/shell-sdk/mcp-server.mdx b/site/src/content/docs/user-guide/shell-sdk/mcp-server.mdx new file mode 100644 index 000000000..b322ed362 --- /dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/mcp-server.mdx @@ -0,0 +1,107 @@ +--- +title: Strands Shell MCP Server +description: Expose Strands Shell to any MCP-compatible agent over stdio, with four sandboxed tools and optional nested MCP servers surfaced as Lua modules. +sidebar: + label: "MCP Server" +--- + +The built-in MCP server exposes a sandboxed shell over JSON-RPC on stdio, so any framework that speaks the [Model Context Protocol](https://modelcontextprotocol.io/) can use it without a line of binding code. This is the integration path when you are not embedding the Python or Node.js API directly. + +## Start the server + +Run the server with the `--mcp` flag. With no config, it starts a bare in-memory sandbox: no host files, no network, no credentials. + +```bash +uvx strands-shell --mcp +``` + +To grant access, pass a [TOML config file](configuration.md#toml-configuration) before the flag. The server applies the same binds, credentials, allowlist, and limits you would set in code. 
+ +```bash +uvx strands-shell --config sandbox.toml --mcp +``` + +To register the server with an MCP client, point the client's configuration at the same command: + +```json +{ + "mcpServers": { + "shell": { + "command": "uvx", + "args": ["strands-shell", "--config", "sandbox.toml", "--mcp"] + } + } +} +``` + +## The four tools + +The server exposes four tools. All filesystem paths are absolute paths inside the sandbox VFS. + +### `shell` + +Runs a command in the virtual shell. The response carries two text blocks: the first is stdout, the second is stderr (both always present, empty when a stream produced nothing). The exit code is in the response metadata. + +| Parameter | Type | Required | Description | +|---|---|---|---| +| `command` | string | yes | The shell command string to execute. | +| `timeout_ms` | number | no | Timeout in milliseconds. Default 30000. | + +State persists across calls on the same connection. A `cd`, an `export`, or a function defined in one `shell` call is visible in the next. + +### `read_file` + +Reads a file from the VFS. Text files return as line-numbered text, 1-indexed, honoring `offset` and `limit`. Images return as image content; other binary files return as embedded resource blobs. + +| Parameter | Type | Required | Description | +|---|---|---|---| +| `file_path` | string | yes | Absolute path to the file. | +| `offset` | number | no | 1-indexed line to start from. Default 1. | +| `limit` | number | no | Maximum lines to return. Default 2000. | + +### `write_file` + +Creates or overwrites a file in the VFS. + +| Parameter | Type | Required | Description | +|---|---|---|---| +| `file_path` | string | yes | Absolute path to the file. | +| `content` | string | yes | The content to write. | + +### `list_dir` + +Lists the entries in a directory in the VFS. + +| Parameter | Type | Required | Description | +|---|---|---|---| +| `dir_path` | string | yes | Absolute path to the directory. 
| + +## Nested MCP servers as Lua modules + +A shell config can declare other MCP servers under `[[mcp]]` entries. Each one becomes a Lua module inside the shell, callable from the `lua` command with `require`. This lets an agent reach your existing MCP tools through the same sandboxed shell, with the tool's responses available as ordinary Lua tables. + +```toml +[[mcp]] +name = "my_tools" +command = "/path/to/mcp-server" +args = ["--stdio"] +``` + +Inside the shell, the server's tools arrive as a table: + +```lua +local tools = require("my_tools") +local result = tools.search({ query = "deny by default" }) +print(result) +``` + +Nested MCP servers are a native-target feature. They are not available under the WASM build, which has no MCP server or client and no `--config` flag. + +## When to use the MCP server + +Reach for the MCP server when your agent already speaks MCP and you want isolation without writing binding code: the agent gets a shell plus file tools, all mediated by the Kernel, configured from one TOML file. When you are writing the agent host yourself in Python or Node.js, embed the [Python](quickstart.md#python) or [Node.js](quickstart.md#nodejs) API directly instead, which gives you the typed `Output`, file operations, and per-session control without a subprocess. + +## Next steps + +- [Configuration](configuration.md): the full TOML schema for binds, credentials, the allowlist, and limits. +- [Security Model](security.md): what the sandbox guarantees per session and why one shell per session is the contract. 
diff --git a/site/src/content/docs/user-guide/shell-sdk/quickstart.mdx b/site/src/content/docs/user-guide/shell-sdk/quickstart.mdx new file mode 100644 index 000000000..3286201a3 --- /dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/quickstart.mdx @@ -0,0 +1,161 @@ +--- +title: Strands Shell Quickstart +description: Install Strands Shell and run your first sandboxed command through the MCP server, the Python API, or the Node.js API in just a few minutes. +sidebar: + label: "Quickstart" +tags: [quickstart] +--- + +This guide gets you from zero to a running sandboxed command. You will install Strands Shell, create a shell with a single bound directory, and run a command against it. By the end you will have a working sandbox you can hand to an agent. + +Strands Shell offers three surfaces. Pick the one that matches how you build: + +- **MCP server** works with any agent framework that speaks the Model Context Protocol. Nothing to write in your own language. +- **Python API** embeds the shell directly in a Python program. +- **Node.js API** embeds the shell directly in a JavaScript or TypeScript program. + +## MCP server + +The fastest way to give an existing agent a sandboxed shell is the built-in MCP server. It needs no code: point your MCP client at the `strands-shell` command and the agent gets four tools (`shell`, `read_file`, `write_file`, `list_dir`). + +Add this to your MCP client configuration: + +```json +{ + "mcpServers": { + "shell": { + "command": "uvx", + "args": ["strands-shell", "--mcp"] + } + } +} +``` + +This starts a bare in-memory sandbox: no host files, no network, no credentials. 
To grant access, write a [TOML config file](configuration.md#toml-configuration) and pass it before the `--mcp` flag: + +```json +{ + "mcpServers": { + "shell": { + "command": "uvx", + "args": ["strands-shell", "--config", "sandbox.toml", "--mcp"] + } + } +} +``` + +The [MCP Server](mcp-server.md) page documents the four tools, their parameters, and how to expose other MCP servers as Lua modules inside the shell. + +## Python + +Install the shell. You need Python 3.10 or later. + +```bash +pip install strands-shell +``` + +Create a shell, bind a directory into it, and run a command. The bind is the grant: `/my/project` on your host becomes `/workspace` inside the sandbox, and nothing else on your filesystem is visible. + +```python +import strands_shell + +shell = strands_shell.Shell( + binds=[strands_shell.Bind("/my/project", "/workspace", mode="copy")], +) + +result = shell.run("grep -rn TODO /workspace") +print(result.stdout) +``` + +`run` returns an `Output` with three fields: `stdout`, `stderr`, and `status` (the exit code). It does not raise when a command fails, so check `status` to branch on success. + +```python +result = shell.run("test -f /workspace/pyproject.toml") +if result.status == 0: + print("found pyproject.toml") +``` + +State carries across calls. Export a variable or change directory in one `run`, and the next `run` sees it. + +```python +shell.run("cd /workspace && export PROJECT=demo") +result = shell.run("echo $PROJECT in $(pwd)") +print(result.stdout) + +# Typical output: +# demo in /workspace +``` + +### Reading and writing files + +You can touch the sandbox filesystem directly, without going through a shell command. This is the path to use when your own code needs to seed an input file or collect a result. 
+ +```python +shell.write_file("/workspace/note.txt", b"hello") +data = shell.read_file("/workspace/note.txt") +print(data.decode()) + +entries = shell.list_files("/workspace") +for entry in entries: + print(entry.name) +``` + +`read_file` and `write_file` work in bytes. A missing path raises `strands_shell.FileNotFoundError`, which also subclasses the built-in `FileNotFoundError`, so existing error-handling code catches it without a translation shim. + +## Node.js + +Install the shell. You need Node.js 18 or later. + +```bash +npm install @strands-agents/shell +``` + +Create a shell with `Shell.create`, which returns a promise. Bind a directory, then run a command. + +```javascript +import { Shell } from '@strands-agents/shell' + +const shell = await Shell.create({ + binds: [{ source: '/my/project', destination: '/workspace', mode: 'copy' }], +}) + +const result = await shell.run('grep -rn TODO /workspace') +console.log(result.stdout) +``` + +Every method returns a promise. `run` resolves to an `Output` with `stdout`, `stderr`, and `status`, and it resolves even when the command exits non-zero, so check `status` rather than catching an error. + +```javascript +const result = await shell.run('test -f /workspace/package.json') +if (result.status === 0) { + console.log('found package.json') +} +``` + +### Reading and writing files + +File operations take and return `Uint8Array`. A missing path rejects with `NotFoundError`, which carries a `.code` of `'ENOENT'` and the offending `.path`. 
+ +```javascript +const enc = new TextEncoder() +const dec = new TextDecoder() + +await shell.writeFile('/workspace/note.txt', enc.encode('hello')) +const data = await shell.readFile('/workspace/note.txt') +console.log(dec.decode(data)) + +const entries = await shell.listFiles('/workspace') +for (const entry of entries) { + console.log(entry.name) +} +``` + +## What you built + +You created a sandbox with exactly one directory visible to it and ran a command that could not reach anything else: not your home directory, not your credentials, not the network. That is the whole model. You add capability by adding binds, credentials, and allowed URLs, and the agent gets nothing you did not grant. + +## Next steps + +- [Configuration](configuration.md): the difference between `copy` and `direct` binds, credential injection, the network allowlist, and the TOML format. +- [Commands](commands.md): which commands and flags are supported, and where they diverge from GNU coreutils. +- [Security Model](security.md): what the sandbox guarantees, what it does not, and when to add OS-level isolation. diff --git a/site/src/content/docs/user-guide/shell-sdk/security.mdx b/site/src/content/docs/user-guide/shell-sdk/security.mdx new file mode 100644 index 000000000..decde9f40 --- /dev/null +++ b/site/src/content/docs/user-guide/shell-sdk/security.mdx @@ -0,0 +1,66 @@ +--- +title: Strands Shell Security Model +description: How the Kernel boundary mediates filesystem, network, and credential access, what Strands Shell guarantees, and when to add OS-level isolation. +sidebar: + label: "Security Model" +--- + +Strands Shell exists to answer one question: what should this agent be allowed to touch? The answer is enforced by the Kernel, a single mediation boundary that every filesystem, network, and credential operation flows through. This page explains where that boundary sits, what it guarantees, and the threat models it does and does not cover. 
+ +The framing that matters most: **Strands Shell is a mediation layer, not a hardened sandbox.** It enforces what an agent *should* reach through deny-by-default policy. It runs in the same process as your code, not in a VM. Getting this distinction right is the difference between an appropriate deployment and a false sense of safety. + +## The Kernel boundary + +All effects flow through the Kernel. There is no path around it: the shell makes no `fork`, no `exec`, and no raw syscalls, because the engine runs entirely in userspace. A command that wants to read a file, reach a URL, or use a credential asks the Kernel, and the Kernel applies policy. + +The default Kernel is backed by an in-memory virtual filesystem and enforces four guarantees: + +1. **Filesystem isolation.** The VFS exposes only explicitly bound paths. No path can escape its declared mounts, and `..` components, symlinks, and bind-path tricks are checked against the mount boundary. +2. **Network SSRF guard.** A two-layer check filters network access: a URL-level parse plus IP filtering at DNS-resolution time. It blocks RFC1918, link-local, loopback, IMDS, IPv4-mapped IPv6, 6to4, and Teredo addresses by default. +3. **Credential injection.** Secrets are matched per URL prefix with a path-boundary check and injected only on the original request, then stripped on redirects. +4. **No syscalls.** There is no `fork`, `exec`, or raw syscall path for a command to reach the host environment. + +A custom Kernel (for embedding the shell in another runtime) carries its own security properties. The guarantees above describe the bundled implementation. + +## Deny by default + +Out of the box a shell is empty: no files, no network, no credentials. You add capability explicitly, and the safe choice is the default at every turn. 
The principle is least privilege, and these practices follow from it: + +- **Prefer `copy` over `direct` binds for source code.** A `copy` bind snapshots files into the VFS, isolating the agent from your live filesystem. Reserve `direct` for output directories where the agent must persist results. +- **Scope binds narrowly.** Bind `/my/project/src`, not `/my/project` or `/`. The agent does not need your `.git`, your `.env`, or your `node_modules`. +- **Allowlist URLs explicitly.** Never allowlist a bare `https://`; that disables the SSRF guard. List the specific endpoints the agent needs. +- **Keep a timeout set.** The config-driven API defaults to a 30-second per-command timeout, which bounds runaway commands. Raise it for long-running work rather than removing it. +- **Set `max_output`.** Cap the output an agent can generate so a single command cannot fill memory. 1 MiB is a reasonable default. + +The [configuration guide](configuration.md) shows how to apply each of these. + +## How credentials stay out of reach + +The credential design solves a specific problem: an agent that can make HTTP requests should be able to *use* a secret without being able to *read* it. A secret in an environment variable fails this, because any command can print the environment. + +Strands Shell injects credentials at request time, keyed to a URL prefix. The value never enters the agent's reach: it is absent from the environment, from command output, from error messages, and from the Lua scripting context. The Kernel injects only on the original request and never re-injects on a redirect, even one that points back to the same host. That closes the exfiltration path where an agent follows a crafted redirect to a logging endpoint and reads the forwarded header. + +## What is out of scope + +Some properties are explicitly not part of the security boundary. Treating them as guarantees is a mistake: + +- **Resource exhaustion within configured limits.** Limits are best-effort.
They catch runaway agents; they do not stop deliberate attempts to exhaust CPU or memory. Use OS-level cgroups for hard caps. +- **Speculative execution and side channels.** The same-process architecture does not defend against timing attacks. Use VM isolation for that threat model. +- **Multi-tenancy within one process.** A shell instance is single-owner. Sharing one instance across mutually distrusting agents is a documented non-goal. +- **Memory-safety exploits in the engine itself.** The Kernel mediates policy; it is not a defense against an attacker who compromises the host process. +- **Reading files the agent was granted.** A bind is a grant. An agent reading a directory you bound is the system working as designed. + +## Choosing the right isolation + +Match the tool to the threat model: + +- **"My agent should not touch anything I have not allowed."** The Kernel handles this directly. This covers the common cases: a coding agent on a repository, a research agent with a scoped API key, a CI assistant. No container required. +- **"An untrusted tenant is running arbitrary adversarial code."** Add OS-level isolation. Run each shell instance inside a container or microVM, and let the Kernel handle in-process mediation on top. Construction is cheap, so one shell per session is the intended pattern, not a workaround. + +The two layers compose. The Kernel gives you fine-grained, fast, deny-by-default mediation; the container gives you a hard kernel boundary. For adversarial workloads, use both. + +## Reporting a security issue + +A bypass of filesystem mediation, the SSRF guard, or credential injection is a security issue, not a normal bug. Reaching a path outside a bound mount, defeating the metadata-service block, exfiltrating an injected credential, or causing one MCP session to read another session's state all qualify. + +Do not open a public GitHub issue. 
Report through the [AWS Vulnerability Disclosure Program on HackerOne](https://hackerone.com/aws_vdp) or by email to [aws-security@amazon.com](mailto:aws-security@amazon.com). The repository's [SECURITY.md](https://github.com/strands-agents/shell/blob/main/SECURITY.md) has the full policy and the complete in-scope and out-of-scope lists. From 7dafbaa1660390d350b6a9a3a429ad9b4e8009e9 Mon Sep 17 00:00:00 2001 From: agent-of-mkmeral Date: Wed, 17 Jun 2026 04:46:03 +0000 Subject: [PATCH 2/2] review-pass-1: fix build-breaking mailto link in shell-sdk security page The astro-broken-links-checker (throwError: true) failed the build on security.mdx: the mailto: link was rewritten by PageLink.astro into a bogus relative path /docs/user-guide/shell-sdk/mailto:... because isRelativeLink() in src/util/links.ts does not exempt the mailto: scheme. Render the address as inline code instead; HackerOne remains the linked reporting channel. Full npm run build now passes clean (689 pages, no broken links). --- site/src/content/docs/user-guide/shell-sdk/security.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/site/src/content/docs/user-guide/shell-sdk/security.mdx b/site/src/content/docs/user-guide/shell-sdk/security.mdx index decde9f40..722e052d1 100644 --- a/site/src/content/docs/user-guide/shell-sdk/security.mdx +++ b/site/src/content/docs/user-guide/shell-sdk/security.mdx @@ -63,4 +63,4 @@ The two layers compose. The Kernel gives you fine-grained, fast, deny-by-default A bypass of filesystem mediation, the SSRF guard, or credential injection is a security issue, not a normal bug. Reaching a path outside a bound mount, defeating the metadata-service block, exfiltrating an injected credential, or causing one MCP session to read another session's state all qualify. -Do not open a public GitHub issue. 
Report through the [AWS Vulnerability Disclosure Program on HackerOne](https://hackerone.com/aws_vdp) or by email to [aws-security@amazon.com](mailto:aws-security@amazon.com). The repository's [SECURITY.md](https://github.com/strands-agents/shell/blob/main/SECURITY.md) has the full policy and the complete in-scope and out-of-scope lists. +Do not open a public GitHub issue. Report through the [AWS Vulnerability Disclosure Program on HackerOne](https://hackerone.com/aws_vdp) or by email to `aws-security@amazon.com`. The repository's [SECURITY.md](https://github.com/strands-agents/shell/blob/main/SECURITY.md) has the full policy and the complete in-scope and out-of-scope lists.