
perstack-ai/perstack

Perstack: A Harness for Micro-Agents.


Docs · Getting Started · Website · Discord · X

If you want to build practical agentic apps like Claude Code or OpenClaw, a harness helps manage the complexity.

Perstack is a harness for agentic apps, built around three principles:

  • Do big things with small models: If a smaller model can do the job, there's no reason to use a bigger one.
  • Quality is a system property, not a model property: Building agentic software people actually use doesn't require an AI science degree, just a solid understanding of the problems you're solving.
  • Keep it simple and reliable: The biggest mistake is cramming AI into an overly complex harness and ending up with an inoperable agent.

Getting Started

Perstack keeps expert definition, orchestration, and application integration as separate concerns.

create-expert scaffolds experts, the harness handles orchestration, and deployment stays simple because Perstack runs on standard container and serverless infrastructure.

Defining your first expert

To get started, use the built-in create-expert expert to scaffold your first agentic app:

# Use `create-expert` to scaffold a micro-agent team named `ai-gaming`
docker run --pull always --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./ai-gaming:/workspace \
  perstack/perstack start create-expert \
    --provider fireworks \
    --model accounts/fireworks/models/kimi-k2p5 \
    "Form a team named ai-gaming to build a Bun-based CLI indie game playable on Bash for AI."

create-expert is a built-in expert. It generates a perstack.toml that defines a team of micro-agents, runs them, evaluates the results, and iterates until the setup works. Each agent has a single responsibility and its own context window; complex tasks are broken down and delegated to specialists. A generated perstack.toml looks like this:

[experts."ai-gaming"]
description = "Game development team lead"
instruction = "Coordinate the team to build a CLI dungeon crawler."
delegates = ["@ai-gaming/level-designer", "@ai-gaming/programmer", "@ai-gaming/tester"]

[experts."@ai-gaming/level-designer"]
description = "Designs dungeon layouts and game mechanics"
instruction = "Design engaging dungeon levels, enemy encounters, and progression systems."

[experts."@ai-gaming/programmer"]
description = "Implements the game in TypeScript"
instruction = "Write the game code using Bun, targeting terminal-based gameplay."

[experts."@ai-gaming/tester"]
description = "Tests the game and reports bugs"
instruction = "Play-test the game, find bugs, and verify fixes."

Running your expert

To put your experts to work on a real task, run them interactively with the perstack start command:

# Let `ai-gaming` build a Wizardry-like dungeon crawler
docker run --pull always --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./ai-gaming:/workspace \
  perstack/perstack start ai-gaming \
    --provider fireworks \
    --model accounts/fireworks/models/kimi-k2p5 \
    "Create a Wizardry-like dungeon crawler in a fixed 10-floor labyrinth with complex layouts, traps, fixed room encounters, and random battles. Include special-effect gear drops, leveling, and a skill tree for one playable character. Balance difficulty around build optimization. Death in the dungeon causes loss of one random equipped item."

Here is an example of a game built with these commands: demo-dungeon-crawler. It was built entirely with Kimi K2.5 on Fireworks. You can play it directly:

npx perstack-demo-dungeon-crawler start
Generation stats for demo-dungeon-crawler:

  • Date: March 8, 2026
  • Duration: 32 min 44 sec
  • Steps: 199
  • Generated code: 13,587 lines across 25 files
  • Tokens (input): 11.4 M + 10.7 M cached
  • Tokens (output): 257.3 K
  • Cost: $2.27 (via Fireworks)

Integrating with your app

Perstack separates the agent harness from the application layer. Your app stays a normal web or terminal app, with no LLM dependencies in the client.

┌─────────────────┐              ┌──────────────────┐
│  Your app       │   events     │  perstack run    │
│  (React, TUI…)  │ ◄─────────── │  (@perstack/     │
│                 │  SSE / WS /  │    runtime)      │
│  @perstack/     │  any stream  │                  │
│    react        │              │                  │
└─────────────────┘              └──────────────────┘
     Frontend                         Server

Swap models, change agent topology, or scale the harness — without touching application code. @perstack/react provides hooks (useJobStream, useRun) that turn the event stream into React state. See the documentation for details.
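Because the frontend only consumes an event stream, the UI logic reduces to folding events into state. The sketch below shows the idea as a pure reducer; the event names (`stepStarted`, `textDelta`, `runFinished`) and state shape are illustrative stand-ins, not the actual @perstack/react event types — see the documentation for the real ones.

```typescript
// Illustrative event shapes; the real Perstack event types differ.
type RunEvent =
  | { type: "stepStarted"; step: number }
  | { type: "textDelta"; text: string }
  | { type: "runFinished"; result: string };

interface RunState {
  step: number;
  output: string;
  done: boolean;
}

const initialState: RunState = { step: 0, output: "", done: false };

// Pure reducer: the same shape a useJobStream-style hook could use
// internally to turn a stream of events into React state.
function reduceRun(state: RunState, event: RunEvent): RunState {
  switch (event.type) {
    case "stepStarted":
      return { ...state, step: event.step };
    case "textDelta":
      return { ...state, output: state.output + event.text };
    case "runFinished":
      return { ...state, output: event.result, done: true };
  }
}

// Replaying a captured event stream yields the final UI state.
const events: RunEvent[] = [
  { type: "stepStarted", step: 1 },
  { type: "textDelta", text: "Rolling initiative..." },
  { type: "runFinished", result: "done" },
];
const finalState = events.reduce(reduceRun, initialState);
```

Keeping the reducer pure means it works the same whether events arrive over SSE, WebSockets, or a replayed log.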

Deployment

Ship an expert as a container by extending the base image:

FROM perstack/perstack:latest
COPY perstack.toml .
RUN perstack install
ENTRYPOINT ["perstack", "run", "my-expert"]

The image is Ubuntu-based, multi-arch (linux/amd64, linux/arm64), and is ~74 MB. perstack install pre-resolves MCP servers and prewarms tool definitions for faster, reproducible startup. The runtime can also be imported directly as a TypeScript library (@perstack/runtime) for serverless environments. See the deployment guide for details.

Why micro-agents?

Perstack is a harness for micro-agents — purpose-specific agents with a single responsibility.

  • Reusable: Delegates are dependency management for agents — like npm packages or crates. Separate concerns through delegate chains, and compose purpose-built experts across different projects.
  • Cost-Effective: Purpose-specific experts are designed to run on affordable models. A focused agent with the right domain knowledge on a cheap model can outperform a generalist on an expensive one.
  • Fast: Smaller models generate tokens faster, and fine-grained tasks broken into delegates run concurrently via parallel delegation.
  • Maintainable: Editing a monolithic system prompt is like refactoring without tests: every change risks breaking something. Single-responsibility experts are independently testable. Test each one, then compose them.
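The parallel-delegation idea can be sketched in a few lines. This is a conceptual illustration only — `askDelegate` is a stand-in for the harness's real delegation machinery, which runs each delegate in its own context window.

```typescript
// Stand-in for a real delegation call; in Perstack the harness runs
// each delegate as an isolated child run. Here we just simulate it.
async function askDelegate(expert: string, task: string): Promise<string> {
  return `${expert}: ${task} -> ok`;
}

// A coordinator fans one task out to single-responsibility delegates.
// Independent subtasks run concurrently and results are aggregated.
async function coordinate(task: string): Promise<string[]> {
  const delegates = ["level-designer", "programmer", "tester"];
  return Promise.all(delegates.map((d) => askDelegate(d, task)));
}
```

Because each delegate is independent, the coordinator's wall-clock time is bounded by the slowest delegate rather than the sum of all of them.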

Prerequisites

Providing API keys

There are two ways to provide API keys:

1. Pass host environment variables with -e

Export the key on the host and forward it to the container:

export FIREWORKS_API_KEY=fw_...
docker run --rm -it \
  -e FIREWORKS_API_KEY \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query" --provider fireworks

2. Store keys in a .env file in the workspace

Create a .env file in the workspace directory. Perstack loads .env and .env.local by default:

# ./workspace/.env
FIREWORKS_API_KEY=fw_...

docker run --rm -it \
  -v ./workspace:/workspace \
  perstack/perstack start my-expert "query"

You can also specify custom .env file paths with --env-path:

perstack start my-expert "query" --env-path .env.production

What's inside?

An agent harness needs a broad set of capabilities—almost like an operating system.

┌───────────────────────────────────────────────────────────────────┐
│  Interface                                                        │
│  CLI · Event streaming · Programmatic API                         │
├───────────────────────────────────────────────────────────────────┤
│  Runtime                                                          │
│  Agentic loop · Event-sourcing · Checkpointing · Tool use         │
├───────────────────────────────────────────────────────────────────┤
│  Context                                                          │
│  System prompts · Prompt caching · AgenticRAG · Extended thinking │
├───────────────────────────────────────────────────────────────────┤
│  Definition                                                       │
│  Multi-agent topology · MCP skills · Provider abstraction         │
├───────────────────────────────────────────────────────────────────┤
│  Infrastructure                                                   │
│  Sandbox isolation · Workspace boundary · Secret management       │
└───────────────────────────────────────────────────────────────────┘

Most of the features below are not new ideas. Perstack takes the usual harness building blocks — tool use, delegation, checkpointing, prompt caching, etc. — makes them easy to operate, puts them on top of standards you already know (MCP, TOML, Docker, SSE), and ships them as one runtime. Where cost or operational burden demands it, Perstack introduces its own take — micro-agents being the first example.

Full feature matrix
| Layer | Feature | Description |
| --- | --- | --- |
| Definition | perstack.toml | Declarative project config with global defaults (model, reasoning budget, retries, timeout) |
| | Expert definitions | Instruction, description, delegates, tags, version, and minimum runtime version per expert |
| | Skill types | MCP stdio, MCP SSE, and interactive skills with tool pick/omit filtering and domain restrictions |
| | Provider config | 9 providers (Anthropic, OpenAI, Google, Fireworks, DeepSeek, Ollama, Azure OpenAI, Amazon Bedrock, Google Vertex) with per-provider settings |
| | Model tiers | Provider-aware model selection via defaultModelTier (low / middle / high) with fallback cascade |
| | Provider tools | Provider-native capabilities (web search, code execution, image generation, etc.) with per-tool options |
| | Lockfile | perstack.lock — resolved snapshot of experts and tool definitions for reproducible deployments |
| Context | Meta-prompts | Role-specific system prompts (coordinator vs. delegate) with environment injection (time, working directory, sandbox) |
| | Context window tracking | Per-model context window lookup with usage ratio monitoring |
| | Message types | Instruction, user, expert, and tool messages with text, image, file, thinking, and tool-call parts |
| | Prompt caching | Provider-specific cache control with cache-hit tracking |
| | Delegation | Parallel child runs with isolated context, parent history preservation, and result aggregation |
| | Extended thinking | Provider-specific reasoning budgets (Anthropic thinking, OpenAI reasoning effort, Google thinking config) |
| | Token usage | Input, output, reasoning, cached, and total token tracking accumulated across steps and delegations |
| | Resume / continue | Resume from any checkpoint, specific job, or delegation stop point |
| Runtime | State machine | 9-state machine (init → generate → call tools → resolve → finish, with delegation and interactive stops) |
| | Event-sourcing | 21 run events, 6 streaming events, and 5 runtime events for full execution observability |
| | Checkpoints | Immutable state snapshots with messages, usage, pending tool calls, and delegation metadata |
| | Skill manager | Dynamic skill lifecycle — connect, discover tools, execute, disconnect — with adapter pattern |
| | Tool execution | Parallel MCP tool calls with priority classification (MCP → delegate → interactive) |
| | Error handling | Configurable retries with provider-specific error normalization and retryability detection |
| | Job hierarchy | Job → run → checkpoint structure with step continuity across delegations |
| | Streaming | Real-time reasoning and result deltas via streaming callbacks |
| Infrastructure | Container isolation | Docker image (Ubuntu, multi-arch, ~74 MB) with PERSTACK_SANDBOX=1 marker and non-root user |
| | Workspace boundaries | Path validation with symlink resolution to prevent traversal and escape attacks |
| | Env / secrets | .env loading with --env-path, requiredEnv minimal-privilege filtering, and protected-variable blocklist |
| | Exec protection | Filtered environment for subprocesses blocking LD_PRELOAD, NODE_OPTIONS, and similar vectors |
| | Install & lockfile | perstack install pre-resolves tool definitions for faster, reproducible startup |
| Interface | perstack CLI | start (interactive TUI), run (JSON events), log (history query), install, and expert management commands |
| | TUI | React/Ink terminal UI with real-time activity log, token metrics, delegation tree, and job/checkpoint browser |
| | JSON event stream | Machine-readable event output via perstack run with --filter for programmatic integration |
| | @perstack/runtime | TypeScript library for serverless and custom apps — run() with event listener, checkpoint storage callbacks |
| | @perstack/react | React hooks (useRun, useJobStream) and event-to-activity processing utilities |
| | Studio | Expert lifecycle management — create, push, version, publish, yank — via Perstack API |
| | Log system | Query execution history by job, run, step, or event type with terminal and JSON formatters |
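Several of the Definition-layer features above come together in a single perstack.toml. The fragment below is a hypothetical sketch: defaultModelTier and requiredEnv appear in the feature matrix, but the other key names (retries, timeout) and the exact placement of global defaults are illustrative assumptions — consult the references for the real schema.

```toml
# Hypothetical perstack.toml fragment; exact key names and structure may differ.
defaultModelTier = "low"   # provider-aware tier selection (low / middle / high)
retries = 3                # illustrative global default
timeout = 120              # illustrative global default, in seconds

[experts."my-expert"]
description = "Example single-responsibility expert"
instruction = "Do one thing well."
requiredEnv = ["FIREWORKS_API_KEY"]  # minimal-privilege env filtering
```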

Documentation

| Topic | Link |
| --- | --- |
| Getting started | Getting Started |
| Architecture and core concepts | Understanding Perstack |
| Expert definitions | Making Experts |
| Rapid prototyping | Rapid Prototyping Guide |
| Breaking agents into specialists | Taming Prompt Sprawl |
| Adding tools via MCP | Extending with Tools |
| Deployment | Deployment |
| CLI and API reference | References |

Status

Pre-1.0. The runtime is production-tested, but the API surface may change. Pin your versions.

Community

Contributing

See CONTRIBUTING.md.

License

Apache License 2.0 — see LICENSE for details.
