stage0-agent-runtime-guard

stage0-agent-runtime-guard is a customer-facing proof-of-value demo for SignalPulse, showing why AI agents need a runtime policy authority before they can publish, deploy, keep retrying autonomously, or otherwise create side effects.

This repository is intentionally lightweight:

a small autonomous agent
a Stage0 API client aligned with the current hosted runtime contract
side-by-side guarded vs unguarded demos
customer-oriented scenarios you can show in a sales call, pilot, or internal evaluation

If you want the full product surface, dashboards, billing flows, API-key lifecycle, and hosted runtime described in the broader SignalPulse app, this repo is the shortest path to understanding the core runtime-guard value proposition.

Why buyers care

Most teams already know how to make an agent produce text. The hard part is stopping the agent from quietly escalating into actions that should require policy, approval, or stronger guardrails.

Without a runtime guard, an agent can:

turn research into advice
turn drafting into publishing
turn analysis into deployment
turn "help me think" into "I already executed it"

Stage0 addresses that by validating execution intent before the action happens and returning an external verdict: ALLOW, DENY, or DEFER.

The current hosted contract also carries runtime metadata such as:

decision (GO, NO_GO, DEFER, ERROR)
request_id
policy_pack_version / policy_version
runtime context fields for approvals, roles, environments, channels, and loop state

What this demo proves

This repo demonstrates the boundary between:

useful bounded assistance
unsafe autonomous escalation

The demo intentionally compares two modes:

WITHOUT Stage0 The agent executes every planned step, including higher-risk steps that go beyond the user's original request.
WITH Stage0 The agent still completes safe informational work, but Stage0 denies or defers steps that attempt to publish, deploy, or keep looping without explicit guardrails.

What this proves

An external runtime guard can block escalation before it happens
Safe work (research, analysis, drafting) still proceeds
The agent cannot self-approve sensitive actions
High-risk side effects (publish, deploy) require explicit guardrails
Runtime context (approvals, environment, roles) shapes policy decisions

What this does NOT prove

This is a demo, not a production integration
The agent is simulated; real agents may behave differently
Policy decisions depend on your Stage0 configuration
This does not cover all possible escalation vectors
Live API behavior may differ from simulated mode

Customer scenarios included

This repository now includes four scenarios designed to map to common buyer conversations:

frameworks Research assistant for AI product teams. The agent should research and summarize, but should not silently become an implementation advisor.
policy_publish Customer-facing content publisher. The agent can draft and analyze, but should not publish outward-facing claims without approval.
deployment Production ops assistant. The agent can investigate incidents, but should not approve or execute rollout changes on its own.
agent_loops Runtime loop guard. The agent can inspect retry history and prepare an operator handoff, but should not keep extending its own retry budget or repeat sensitive tool paths unattended.

These scenarios are useful for:

AI SaaS founders
internal tools teams
platform and DevOps teams
buyers evaluating runtime safety controls for agents

Quick start

Prerequisites

Python 3.10 or newer
A Stage0 API key from SignalPulse if you want live policy decisions

Setup

git clone https://github.com/Starlight143/stage0-agent-runtime-guard.git
cd stage0-agent-runtime-guard
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env

Then set:

STAGE0_API_KEY=your_api_key_here
STAGE0_BASE_URL=https://api.signalpulse.org

If no API key is configured, the guarded demo falls back to simulated Stage0 responses so you can still demonstrate the control flow.

Run the demo

Run one scenario interactively

python run_demo.py --scenario frameworks

Run with buyer-friendly concise output (recommended for demos)

python run_demo.py --scenario frameworks --concise --auto

The --concise flag shows a clean side-by-side comparison of guarded vs unguarded execution, perfect for sales calls and presentations.

Run with deterministic simulated output (for recordings/CI)

python run_demo.py --scenario frameworks --concise --simulated --auto

The --simulated flag forces simulated Stage0 responses even with an API key configured, ensuring deterministic output for recordings and CI.

Run a scenario without pause prompts

python run_demo.py --scenario deployment --auto

Run every scenario in sequence

python run_demo.py --scenario all --auto

Available scenario keys: frameworks, policy_publish, deployment, agent_loops

Sample outputs

Pre-generated sample outputs are available in artifacts/:

artifacts/frameworks.txt - Research Assistant scenario
artifacts/policy_publish.txt - Content Publisher scenario
artifacts/deployment.txt - Production Ops scenario
artifacts/agent_loops.txt - Loop Guard scenario
artifacts/all-scenarios.txt - All scenarios combined

Included reference docs

This demo repo now includes lightweight customer-facing reference docs derived from the main product:

docs/runtime-contract.md
docs/reference-scenarios.md
docs/service-overview.md

Expected outcome

For each scenario, you will see:

an unguarded run
a guarded run
the exact steps that Stage0 denied or deferred
a summary of why those denied steps matter commercially and operationally

The important observation is not "the agent was blocked." The real value is:

safe work still proceeds
risky work is denied or deferred before execution
the guard is external to the agent

DEFER is especially useful when the agent is drifting into loop-like behavior, under-specified follow-up actions, or situations that need a human checkpoint instead of a silent retry.

Integrate Stage0 into your own agent

The minimum integration is intentionally small:

from stage0 import Stage0Client
from stage0.client import Verdict

client = Stage0Client()
response = client.check_goal(
    goal="Publish the weekly changelog",
    success_criteria=["Post to the public changelog"],
    constraints=["human approval required"],
    tools=["shell"],
    side_effects=["publish"],
    context={
        "actor_role": "platform_admin",
        "approval_status": "approved",
        "environment": "production",
        "request_channel": "dashboard",
    },
)

if response.verdict != Verdict.ALLOW:
    raise RuntimeError(response.reason)

In a real implementation, you should validate every execution step, not just the top-level task. The client in this repo now supports both the request context object and richer response fields like decision, request_id, policy_version, guardrail_checks, and decision_trace_summary.

For loop-style workloads, the same pattern should be applied to runtime state as well. In practice that means passing stable execution identifiers and constraints such as retry budgets, elapsed-time ceilings, or repeated-tool thresholds into the policy layer before the agent decides to continue.

Repository structure

agent/              Agent planning and execution logic
demo/               Guarded and unguarded scenario runners
stage0/             Stage0 API client
docs/               Buyer-facing contract and scenario references
tests/              Unit and smoke tests
artifacts/          Pre-generated sample outputs
run_demo.py         Multi-scenario demo entrypoint

What changed in this repo

This repo has been updated to better support customer conversion and live evaluation:

richer customer-facing demo scenarios
a new agent-loop scenario aligned to runtime retry controls
non-interactive CLI mode for recordings and scripted demos
--concise flag for buyer-friendly comparison output
pre-generated sample outputs in artifacts/
smoke tests for all scenarios
clearer environment configuration for hosted Stage0
stronger buyer-oriented README positioning around ALLOW / DENY / DEFER

When to use this repo vs the full app

Use this repo when you want to:

show the concept quickly
demo the difference between guarded and unguarded execution
validate runtime guard behavior before a larger integration
explain why loop guards matter before integrating runtime state into production agents
support customer conversations with a minimal example

Use the broader SignalPulse application when you want:

account management
hosted billing flows
API key lifecycle and issuance
dashboards, logs, and analytics
a fuller product experience

Get access

To try the real Stage0 runtime:

Visit signalpulse.org
Create an account
Generate an API key
Re-run this demo with your key in .env

License

See the repository license and remote project terms before production use.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

stage0-agent-runtime-guard

Why buyers care

What this demo proves

What this proves

What this does NOT prove

Customer scenarios included

Quick start

Prerequisites

Setup

Run the demo

Run one scenario interactively

Run with buyer-friendly concise output (recommended for demos)

Run with deterministic simulated output (for recordings/CI)

Run a scenario without pause prompts

Run every scenario in sequence

Sample outputs

Included reference docs

Expected outcome

Integrate Stage0 into your own agent

Repository structure

What changed in this repo

When to use this repo vs the full app

Get access

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github/ISSUE_TEMPLATE		.github/ISSUE_TEMPLATE
agent		agent
artifacts		artifacts
demo		demo
docs		docs
stage0		stage0
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
README.md		README.md
requirements.txt		requirements.txt
run_demo.py		run_demo.py

Folders and files

Latest commit

History

Repository files navigation

stage0-agent-runtime-guard

Why buyers care

What this demo proves

What this proves

What this does NOT prove

Customer scenarios included

Quick start

Prerequisites

Setup

Run the demo

Run one scenario interactively

Run with buyer-friendly concise output (recommended for demos)

Run with deterministic simulated output (for recordings/CI)

Run a scenario without pause prompts

Run every scenario in sequence

Sample outputs

Included reference docs

Expected outcome

Integrate Stage0 into your own agent

Repository structure

What changed in this repo

When to use this repo vs the full app

Get access

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages