
perstack-ai/perstack

Perstack: A Harness for Micro-Agents.


Docs · Getting Started · Website · Discord · X

If you want to build practical agentic apps like Claude Code or OpenClaw, a harness helps manage the complexity.

Perstack is a harness for agentic apps, built around three principles:

  • Do big things with small models: If a smaller model can do the job, there's no reason to use a bigger one.
  • Quality is a system property, not a model property: Building agentic software people actually use doesn't require an AI science degree, just a solid understanding of the problems you're solving.
  • Keep it simple and reliable: The biggest mistake is cramming AI into an overly complex harness and ending up with an inoperable agent.

Getting Started

Perstack keeps expert definition, orchestration, and application integration as separate concerns.

create-expert scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure.

Defining your first expert

To get started, use the built-in create-expert expert to scaffold your first agentic app:

# Use `create-expert` to scaffold a micro-agent team named `ai-gaming`
docker run --pull always --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./ai-gaming:/workspace \
  perstack/perstack start create-expert \
    --provider fireworks \
    --model accounts/fireworks/models/kimi-k2p5 \
    "Form a team named ai-gaming to build a Bun-based CLI indie game playable on Bash for AI."

create-expert is a built-in expert. It generates a perstack.toml that defines a team of micro-agents, runs them, evaluates the results, and iterates until the setup works. Each agent has a single responsibility and its own context window; complex tasks are broken down and delegated to specialists. A generated perstack.toml looks like this:

[experts."ai-gaming"]
description = "Game development team lead"
instruction = "Coordinate the team to build a CLI dungeon crawler."
delegates = ["@ai-gaming/level-designer", "@ai-gaming/programmer", "@ai-gaming/tester"]

[experts."@ai-gaming/level-designer"]
description = "Designs dungeon layouts and game mechanics"
instruction = "Design engaging dungeon levels, enemy encounters, and progression systems."

[experts."@ai-gaming/programmer"]
description = "Implements the game in TypeScript"
instruction = "Write the game code using Bun, targeting terminal-based gameplay."

[experts."@ai-gaming/tester"]
description = "Tests the game and reports bugs"
instruction = "Play-test the game, find bugs, and verify fixes."

Running your expert

To put your experts to work on a real task, run them interactively with the perstack start command:

# Let `ai-gaming` build a Wizardry-like dungeon crawler
docker run --pull always --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./ai-gaming:/workspace \
  perstack/perstack start ai-gaming \
    --provider fireworks \
    --model accounts/fireworks/models/kimi-k2p5 \
    "Create a Wizardry-like dungeon crawler in a fixed 10-floor labyrinth with complex layouts, traps, fixed room encounters, and random battles. Include special-effect gear drops, leveling, and a skill tree for one playable character. Balance difficulty around build optimization. Death in the dungeon causes loss of one random equipped item."

Here is an example of a game built with these commands: demo-dungeon-crawler. It was built entirely with Kimi K2.5 on Fireworks. You can play it directly:

npx perstack-demo-dungeon-crawler start
Generation stats for demo-dungeon-crawler:

  • Date: March 8, 2026
  • Duration: 32 min 44 sec
  • Steps: 199
  • Generated code: 13,587 lines across 25 files
  • Tokens (input): 11.4 M + 10.7 M cached
  • Tokens (output): 257.3 K
  • Cost: $2.27 (via Fireworks)

Integrating with your app

Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client.

┌─────────────────┐              ┌──────────────────┐
│  Your app       │   events     │  perstack run    │
│  (React, TUI…)  │ ◄─────────── │  (@perstack/     │
│                 │  SSE / WS /  │    runtime)      │
│  @perstack/     │  any stream  │                  │
│    react        │              │                  │
└─────────────────┘              └──────────────────┘
     Frontend                         Server

Swap models, change agent topology, or scale the harness — without touching application code. @perstack/react provides hooks (useJobStream, useRun) that turn the event stream into React state. See the documentation for details.
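Because the frontend only consumes an event stream, the UI logic reduces to folding events into state. The sketch below shows the idea as a pure reducer; the event names (`stepStarted`, `textDelta`, `runFinished`) and state shape are illustrative stand-ins, not the actual @perstack/react event types — see the documentation for the real ones.

```typescript
// Illustrative event shapes; the real Perstack event types differ.
type RunEvent =
  | { type: "stepStarted"; step: number }
  | { type: "textDelta"; text: string }
  | { type: "runFinished"; result: string };

interface RunState {
  step: number;
  output: string;
  done: boolean;
}

const initialState: RunState = { step: 0, output: "", done: false };

// Pure reducer: the same shape a useJobStream-style hook could use
// internally to turn a stream of events into React state.
function reduceRun(state: RunState, event: RunEvent): RunState {
  switch (event.type) {
    case "stepStarted":
      return { ...state, step: event.step };
    case "textDelta":
      return { ...state, output: state.output + event.text };
    case "runFinished":
      return { ...state, output: event.result, done: true };
  }
}

// Replaying a captured event stream yields the final UI state.
const events: RunEvent[] = [
  { type: "stepStarted", step: 1 },
  { type: "textDelta", text: "Rolling initiative..." },
  { type: "runFinished", result: "done" },
];
const finalState = events.reduce(reduceRun, initialState);
```

Keeping the reducer pure means it works the same whether events arrive over SSE, WebSockets, or a replayed log.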

Deployment

Ship an expert as a container by extending the base image:

FROM perstack/perstack:latest
COPY perstack.toml .
RUN perstack install
ENTRYPOINT ["perstack", "run", "my-expert"]

The image is Ubuntu-based, multi-arch (linux/amd64, linux/arm64), and is ~74 MB. perstack install pre-resolves MCP servers and prewarms tool definitions for faster, reproducible startup. The runtime can also be imported directly as a TypeScript library (@perstack/runtime) for serverless environments. See the deployment guide for details.

Why micro-agents?

Perstack is a harness for micro-agents — purpose-specific agents with a single responsibility.

  • Reusable: Delegates are dependency management for agents — like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects.
  • Cost-Effective: Purpose-specific experts are designed to run on affordable models. A focused agent with the right domain knowledge on a cheap model can outperform a generalist on an expensive one.
  • Fast: Smaller models generate tokens faster, and fine-grained tasks broken into delegates run concurrently via parallel delegation.
  • Maintainable: Editing a monolithic system prompt is like refactoring without tests: every change risks breaking something. Single-responsibility experts are independently testable. Test each one, then compose them.
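The parallel-delegation idea can be sketched in a few lines. This is a conceptual illustration only — `askDelegate` is a stand-in for the harness's real delegation machinery, which runs each delegate in its own context window.

```typescript
// Stand-in for a real delegation call; in Perstack the harness runs
// each delegate as an isolated child run. Here we just simulate it.
async function askDelegate(expert: string, task: string): Promise<string> {
  return `${expert}: ${task} -> ok`;
}

// A coordinator fans one task out to single-responsibility delegates.
// Independent subtasks run concurrently and results are aggregated.
async function coordinate(task: string): Promise<string[]> {
  const delegates = ["level-designer", "programmer", "tester"];
  return Promise.all(delegates.map((d) => askDelegate(d, task)));
}
```

Because each delegate is independent, the coordinator's wall-clock time is bounded by the slowest delegate rather than the sum of all of them.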

Prerequisites

Providing API keys

There are two ways to provide API keys:

1. Pass host environment variables with -e

Export the key on the host and forward it to the container:

export FIREWORKS_API_KEY=fw_...
docker run --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query" --provider fireworks

2. Store keys in a .env file in the workspace

Create a .env file in the workspace directory. Perstack loads .env and .env.local by default:

# ./workspace/.env
FIREWORKS_API_KEY=fw_...

docker run --rm -it \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query"

You can also specify custom .env file paths with --env-path:

perstack start my-expert "query" --env-path .env.production

What's inside?

An agent harness needs a broad set of capabilities—almost like an operating system.

┌───────────────────────────────────────────────────────────────────┐
│  Interface                                                        │
│  CLI · Event streaming · Programmatic API                         │
├───────────────────────────────────────────────────────────────────┤
│  Runtime                                                          │
│  Agentic loop · Event-sourcing · Checkpointing · Tool use         │
├───────────────────────────────────────────────────────────────────┤
│  Context                                                          │
│  System prompts · Prompt caching · AgenticRAG · Extended thinking │
├───────────────────────────────────────────────────────────────────┤
│  Definition                                                       │
│  Multi-agent topology · MCP skills · Provider abstraction         │
├───────────────────────────────────────────────────────────────────┤
│  Infrastructure                                                   │
│  Sandbox isolation · Workspace boundary · Secret management       │
└───────────────────────────────────────────────────────────────────┘

Most of the features below are not new ideas. Perstack takes the usual harness building blocks — tool use, delegation, checkpointing, prompt caching, etc. — makes them easy to operate, puts them on top of standards you already know (MCP, TOML, Docker, SSE), and ships them as one runtime. Where cost or operational burden demands it, Perstack introduces its own take — micro-agents being the first example.

Full feature matrix
| Layer | Feature | Description |
| --- | --- | --- |
| Definition | perstack.toml | Declarative project config with global defaults (model, reasoning budget, retries, timeout) |
| | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert |
| | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions |
| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings |
| | Model tiers | Provider-aware model selection via defaultModelTier (low / middle / high) with fallback cascade |
| | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options |
| | Lockfile | perstack.lock — resolved snapshot of experts and tool definitions for reproducible deployments |
| Context | Meta-prompts | Role-specific system prompts (coordinator vs. delegate) with environment injection (time, working directory, sandbox) |
| | Context window tracking | Per-model context window lookup with usage ratio monitoring |
| | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts |
| | Prompt caching | Provider-specific cache control with cache-hit tracking |
| | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation |
| | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) |
| | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations |
| | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point |
| Runtime | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) |
| | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability |
| | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata |
| | Skill manager | Dynamic skill lifecycle — connect, discover tools, execute, disconnect — with adapter pattern |
| | Tool execution | Parallel MCP tool calls with priority classification (MCP → delegate → interactive) |
| | Error handling | Configurable retries with provider-specific error normalization and retryability detection |
| | Job hierarchy | Job → run → checkpoint structure with step continuity across delegations |
| | Streaming | Real-time reasoning and result deltas via streaming callbacks |
| Infrastructure | Container isolation | Docker image (Ubuntu, multi-arch, ~74 MB) with PERSTACK_SANDBOX=1 marker and non-root user |
| | Workspace boundaries | Path validation with symlink resolution to prevent traversal and escape attacks |
| | Env / secrets | .env loading with --env-path, requiredEnv minimal-privilege filtering, and protected-variable blocklist |
| | Exec protection | Filtered environment for subprocesses blocking LD_PRELOAD, NODE_OPTIONS, and similar vectors |
| | Install & lockfile | perstack install pre-resolves tool definitions for faster, reproducible startup |
| Interface | perstack CLI | start (interactive TUI), run (JSON events), log (history query), install, and expert management commands |
| | TUI | React/Ink terminal UI with real-time activity log, token metrics, delegation tree, and job/checkpoint browser |
| | JSON event stream | Machine-readable event output via perstack run with --filter for programmatic integration |
| | @perstack/runtime | TypeScript library for serverless and custom apps — run() with event listener, checkpoint storage callbacks |
| | @perstack/react | React hooks (useRun, useJobStream) and event-to-activity processing utilities |
| | Studio | Expert lifecycle management — create, push, version, publish, yank — via Perstack API |
| | Log system | Query execution history by job, run, step, or event type with terminal and JSON formatters |
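Several of the Definition-layer features above come together in a single perstack.toml. The fragment below is a hypothetical sketch: defaultModelTier and requiredEnv appear in the feature matrix, but the other key names (retries, timeout) and the exact placement of global defaults are illustrative assumptions — consult the references for the real schema.

```toml
# Hypothetical perstack.toml fragment; exact key names and structure may differ.
defaultModelTier = "low"   # provider-aware tier selection (low / middle / high)
retries = 3                # illustrative global default
timeout = 120              # illustrative global default, in seconds

[experts."my-expert"]
description = "Example single-responsibility expert"
instruction = "Do one thing well."
requiredEnv = ["FIREWORKS_API_KEY"]  # minimal-privilege env filtering
```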

Documentation

| Topic | Link |
| --- | --- |
| Getting started | Getting Started |
| Architecture and core concepts | Understanding Perstack |
| Expert definitions | Making Experts |
| Rapid prototyping | Rapid Prototyping Guide |
| Breaking agents into specialists | Taming Prompt Sprawl |
| Adding tools via MCP | Extending with Tools |
| Deployment | Deployment |
| CLI and API reference | References |

Status

Pre-1.0. The runtime is production-tested, but the API surface may change. Pin your versions.

Community

Contributing

See CONTRIBUTING.md.

License

Apache License 2.0 — see LICENSE for details.
