A specification for how services communicate their operational limits to humans and autonomous agents.
It is written for API and service operators, and for the agent builders calling them, who need operational limits expressed in a way autonomous callers can actually act on.
Services signal limits with status codes (429, 403, 500) that agents can't interpret, so agents retry blindly and the waste compounds. Graceful Boundaries is a specification for communicating operational limits to humans and autonomous agents.
https://gracefulboundaries.dev/
Every unclear response generates follow-up traffic. A vague 429 causes blind retries. A vague 403 causes re-attempts with different credentials. A generic 500 causes indefinite retries. When autonomous agents are the caller, the waste compounds: agents retry faster, probe more systematically, and lack the human judgment to know when to stop.
Most services enforce rate limits but communicate them poorly. A 429 Too Many Requests with Retry-After: 60 tells a retry loop what to do. It doesn't tell an autonomous agent whether to retry, use a cached result, try a different endpoint, or inform the human. It doesn't tell a developer what the limits are before they hit them. It doesn't tell anyone why the limit exists.
The specification addresses three gaps that existing standards cover separately but no specification combines:
- Proactive discovery -- limits are machine-readable before they are hit
- Structured refusal -- every non-success response explains what happened, why, and what to do next
- Constructive guidance -- refusals include a useful next step, not just a block
This applies to every HTTP error class, not just rate limits. A 400 explains the validation rule and its security rationale. A 404 tells you whether the resource never existed or expired, and offers a creation path. A 500 names the affected subsystem and suggests a retry window. Every non-success response MUST include error, detail, and why.
Read the full specification: spec.md
These examples use Siteline, a Level 4 conformant reference implementation.
Discover limits before hitting them:

    curl -s https://siteline.to/api/limits | jq '{service, limits: .limits.scan}'

    {
      "service": "Siteline",
      "limits": {
        "scan": {
          "endpoint": "/api/scan",
          "method": "GET",
          "limits": [
            {
              "type": "ip-rate",
              "maxRequests": 10,
              "windowSeconds": 3600,
              "description": "10 scans per IP per hour."
            }
          ]
        }
      }
    }

Structured refusal with constructive guidance (when a rate limit is exceeded):

    {
      "error": "rate_limit_exceeded",
      "detail": "You can run up to 10 scans per hour. Try again in 2400 seconds.",
      "limit": "10 scans per IP per hour",
      "retryAfterSeconds": 2400,
      "why": "Siteline is a free service. Rate limits keep it available for everyone and prevent abuse.",
      "alternativeEndpoint": "/api/result?id=example.com"
    }

The caller knows the limit, when to retry, why the limit exists, and where to get the result without waiting.
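As a rough sketch of how a caller might act on a refusal like this one, the function below picks a next step from the parsed body. The function name `choose_action` and the action labels are illustrative assumptions; the specification defines the fields, not this policy.

```python
from typing import Any


def choose_action(refusal: dict[str, Any]) -> tuple[str, Any]:
    """Pick a next step from a parsed refusal body.

    The action labels ("use_alternative", "wait", "ask_human") are
    hypothetical; they are not defined by the specification.
    """
    # Prefer a path that yields a result without waiting.
    # A real caller would also check that this URL stays on the
    # service's own origin before following it.
    alternative = refusal.get("alternativeEndpoint")
    if isinstance(alternative, str) and alternative:
        return ("use_alternative", alternative)

    # Otherwise wait for the machine-readable retry time, if it is sane.
    retry_after = refusal.get("retryAfterSeconds")
    if (
        isinstance(retry_after, (int, float))
        and not isinstance(retry_after, bool)
        and retry_after >= 0
    ):
        return ("wait", retry_after)

    # Nothing actionable: stop and surface the explanation to a person.
    return ("ask_human", refusal.get("detail", ""))


refusal = {
    "error": "rate_limit_exceeded",
    "detail": "You can run up to 10 scans per hour. Try again in 2400 seconds.",
    "limit": "10 scans per IP per hour",
    "retryAfterSeconds": 2400,
    "why": "Siteline is a free service.",
    "alternativeEndpoint": "/api/result?id=example.com",
}
print(choose_action(refusal))
```

The ordering (alternative first, then wait, then escalate) is one reasonable choice; a caller with a hard deadline might prefer to escalate sooner.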
Every error class is self-explanatory, not just 429s:

    {
      "error": "invalid_input",
      "detail": "This URL is outside the scanner's accepted public-target policy.",
      "why": "Siteline accepts only public scan targets to prevent the scanner from being used as a proxy.",
      "field": "url",
      "expected": "A public URL with a resolvable hostname."
    }

An agent reading this 400 understands the input-safety policy and can fix the input. Without why, it would blindly retry with different URLs.
Proactive headers on successful responses:

    curl -s 'https://siteline.to/api/result?id=example.com' \
      -D - -o /dev/null 2>&1 | grep ratelimit

    ratelimit: limit=60, remaining=59, reset=60
    ratelimit-policy: 60;w=60

A caller seeing remaining=1 self-throttles before the next request. A caller seeing remaining=59 knows it has budget.
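One way a caller might turn that header into a self-throttling decision is sketched below. It assumes the comma-separated `key=value` form shown in the sample output; other revisions of the RateLimit header draft use a different syntax, and the `low_water_mark` parameter is an illustrative assumption rather than anything the specification defines.

```python
def parse_ratelimit(header_value: str) -> dict[str, int]:
    """Parse 'limit=60, remaining=59, reset=60' into integers.

    Assumes the simple key=value list form; unparseable parts are skipped.
    """
    parsed: dict[str, int] = {}
    for part in header_value.split(","):
        key, sep, value = part.strip().partition("=")
        value = value.strip()
        if sep and value.isascii() and value.isdigit():
            parsed[key.strip().lower()] = int(value)
    return parsed


def delay_before_next_request(header_value: str, low_water_mark: int = 1) -> float:
    """Return seconds to pause before the next call; 0.0 means budget remains."""
    fields = parse_ratelimit(header_value)
    remaining = fields.get("remaining")
    reset = fields.get("reset")
    if remaining is None or reset is None:
        return 0.0  # no usable signal, so do not invent a delay
    if remaining > low_water_mark:
        return 0.0
    # Budget is nearly spent: wait until the window resets.
    return float(reset)


print(delay_before_next_request("limit=60, remaining=59, reset=60"))
print(delay_before_next_request("limit=60, remaining=1, reset=42"))
```

With the default threshold, the first call reports no delay and the second waits out the reset window.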
For more examples, see docs/curl-examples.md.
Services self-declare a conformance level. The eval suite validates the claim.
| Level | What it requires |
|---|---|
| N/A: Not Applicable | No API endpoints, rate limits, or agentic interaction surface. |
| Level 0: Non-Conformant | Limits exist but are not described per this specification. |
| Level 1: Structured Refusal | All non-success responses include error, detail, and why. All 429s add limit and retryAfterSeconds. |
| Level 2: Discoverable | Level 1 + a limits discovery endpoint. |
| Level 3: Constructive | Level 2 + refusal responses include constructive guidance when applicable. |
| Level 4: Proactive | Level 3 + successful responses include proactive limit headers. |
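
The levels in the table are cumulative: each one requires everything below it. The sketch below illustrates that rule only. It is not the eval suite's logic, the parameter names are assumptions, and it does not model the N/A declaration.

```python
def conformance_level(
    structured_refusals: bool,
    discovery_endpoint: bool,
    constructive_guidance: bool,
    proactive_headers: bool,
) -> int:
    """Return the highest level whose requirements, and all lower ones, hold."""
    level = 0
    for satisfied in (
        structured_refusals,    # Level 1
        discovery_endpoint,     # Level 2
        constructive_guidance,  # Level 3
        proactive_headers,      # Level 4
    ):
        if not satisfied:
            break  # a gap caps the level, whatever comes after it
        level += 1
    return level


# Proactive headers without constructive guidance still cap at Level 2.
print(conformance_level(True, True, False, True))
```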
Run the checker directly via npx (no install, no dependencies):

    npx graceful-boundaries check https://siteline.to # Level 4 — proactive headers
    npx graceful-boundaries check https://google.com # Level 0 — no conformance
    npx graceful-boundaries check https://your-service.com --json
    npx graceful-boundaries check https://your-service.com --min-level 2 # nonzero exit below Level 2
    npx graceful-boundaries check https://your-service.com --check-cloaking

Or clone and run from the project root with node evals/check.js <url>. Run the unit test suite (256 tests, no dependencies):

    npm test

The repo doubles as a composite GitHub Action. Add a job that fails when your deployed service drops below its declared level:

    - uses: snapsynapse/graceful-boundaries@v1
      with:
        url: https://staging.your-service.com
        min-level: "2"

Levels 2 and 4 are confirmable passively and make reliable CI gates; Levels 1 and 3 require observing a live refusal (see CONFORMANCE.md).
Graceful Boundaries adopts the GuideCheck Human-Verifiable Assistant Guide profile. The assistant-facing guide is committed at assistant-guide.txt and published from https://gracefulboundaries.dev/.well-known/assistant-guide.txt.
Verify the guide before asking an assistant to follow it. Use the hosted verifier at guidecheck.org/verify or a conformant local verifier:

    python3 /path/to/guidecheck/scripts/guidecheck_verify.py assistant-guide.txt

The committed root copy and the well-known copy must remain byte-identical.
The current GuideCheck implementation and all agent-facing repository surfaces are documented in docs/agentic-surfaces.md. Current local verification: GuideCheck reference verifier 0.3.2, achieved Level 3, guide SHA-256 7dbf6472d5a49905054b0d541c27a4246bdc1f10e5d7bb9c16c028fa04b8bfdd, with 0 blocking findings and 0 warnings.
- No API or agentic surface? Declare not-applicable; it takes 5 minutes.
- API with rate limits? Start at Level 1 (structured refusals). This is the minimum useful level.
- Agents call your API? Target Level 2 (add discovery) so agents learn the rules before breaking them.
- Want to reduce 429 traffic? Target Level 3 (constructive guidance) — offer cached results and alternatives instead of bare refusals.
- High-traffic API with agent callers? Target Level 4 (proactive headers) — callers self-throttle before hitting limits.
For a step-by-step walkthrough with code samples, see the implementation guide.
The shortest path from "read the spec" to "conformant service":
- Copy a middleware example -- dependency-free Level 2 implementations (Level 4 with one flag) for Express, FastAPI, Cloudflare Workers, and Hono.
- Copy the nearest limits.json -- complete discovery responses for a SaaS API, free scanner, LLM API, and content site.
- Validate your body shapes -- published JSON Schemas for refusals, 429s, and the discovery endpoint, served at https://gracefulboundaries.dev/schema/. Importable into OpenAPI and CI validators; run the checker for origin-aware SC-6 URL safety and conformance.
- Verify -- npx graceful-boundaries check https://your-service.example, then gate it in CI with the GitHub Action above.
- Declare -- add yourself to ADOPTERS.md and embed a conformance badge.
Already emitting RFC 9457 Problem Details? You can adopt without changing your content type: see the RFC 9457 compatibility profile.
Building the agent side? evals/test-agent-behavior.js is an agent compliance suite: point it at your retry/refusal-handling logic and it checks you respect retryAfterSeconds, prefer cached results, ignore off-origin guidance URLs (SC-6), and treat guidance text as untrusted data (SC-16).
Start here -- Every non-success response (400, 401, 403, 404, 429, 500, 503) MUST include three core fields: error (stable machine-parseable string), detail (human-readable explanation), and why (the security, policy, or operational reason). This applies to all error classes, not just rate limits.
Level 1 -- All non-success responses include the three core fields. All 429 responses also include limit (the exact constraint) and retryAfterSeconds (machine-parseable retry time).
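A quick self-check for Level 1 might look like the sketch below. It tests only for the presence of the fields named above; the published JSON Schemas remain the authoritative shapes, and the helper name `missing_level1_fields` is an assumption.

```python
from typing import Any

CORE_FIELDS = ("error", "detail", "why")
RATE_LIMIT_FIELDS = ("limit", "retryAfterSeconds")


def missing_level1_fields(status: int, body: dict[str, Any]) -> list[str]:
    """List the Level 1 fields that are absent or empty in a refusal body."""
    required = list(CORE_FIELDS)
    if status == 429:
        # Rate-limit refusals carry two additional required fields.
        required += RATE_LIMIT_FIELDS
    # A value of 0 counts as present, so retryAfterSeconds: 0 passes.
    return [name for name in required if body.get(name) in (None, "")]


bare_429 = {"error": "rate_limit_exceeded", "detail": "Too many requests.", "why": "Fair use."}
print(missing_level1_fields(429, bare_429))
```

An empty list means the body carries every field this sketch knows about; it says nothing about whether the values are well formed.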
Level 2 -- Add a discovery endpoint at /api/limits or /.well-known/limits that returns all enforced limits as structured JSON. Agents can plan before they hit anything. Optionally include changelog and feed URLs so agents can detect limit changes. Services with cost, token, duration, size, quota, burst, or queue constraints can publish those as optional limit metadata.
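As an illustration of planning from discovery data, the sketch below derives an even request spacing from a parsed discovery document. It assumes the response shape shown in the Siteline example earlier (a `limits` object keyed by name, each entry holding a `limits` array with `maxRequests` and `windowSeconds`); the function name is hypothetical, and even pacing is only one possible strategy.

```python
from typing import Any


def min_seconds_between_requests(discovery: dict[str, Any], limit_name: str) -> float:
    """Spacing that keeps evenly paced calls inside the tightest published window."""
    entry = discovery.get("limits", {}).get(limit_name, {})
    spacing = 0.0
    for limit in entry.get("limits", []):
        max_requests = limit.get("maxRequests")
        window = limit.get("windowSeconds")
        if isinstance(max_requests, int) and isinstance(window, int) and max_requests > 0:
            # The slowest required pace across all limits wins.
            spacing = max(spacing, window / max_requests)
    return spacing


discovery = {
    "service": "Siteline",
    "limits": {
        "scan": {
            "endpoint": "/api/scan",
            "method": "GET",
            "limits": [
                {"type": "ip-rate", "maxRequests": 10, "windowSeconds": 3600}
            ],
        }
    },
}
print(min_seconds_between_requests(discovery, "scan"))  # 360.0
```

A return value of 0.0 means no usable limit was published for that name, not that the endpoint is unlimited.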
Level 3 -- Add constructive guidance to refusals. When a cached result exists, include cachedResultUrl. When a different endpoint can help, include alternativeEndpoint. When paid access has higher limits, include upgradeUrl. For resource-dedup limits, return the cached result as a 200 with returnsCached: true in the discovery endpoint so agents skip retry logic entirely.
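On the service side, assembling such a refusal could be sketched as follows. The builder function, its parameters, and the wording of `detail` are assumptions for illustration; only the field names come from the text above.

```python
from typing import Any, Optional


def build_rate_limit_refusal(
    limit: str,
    retry_after_seconds: int,
    why: str,
    cached_result_url: Optional[str] = None,
    alternative_endpoint: Optional[str] = None,
    upgrade_url: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a 429 body, adding guidance fields only when they apply."""
    body: dict[str, Any] = {
        "error": "rate_limit_exceeded",
        "detail": f"Limit reached ({limit}). Try again in {retry_after_seconds} seconds.",
        "limit": limit,
        "retryAfterSeconds": retry_after_seconds,
        "why": why,
    }
    guidance = {
        "cachedResultUrl": cached_result_url,
        "alternativeEndpoint": alternative_endpoint,
        "upgradeUrl": upgrade_url,
    }
    # Omit guidance that does not apply rather than sending empty values.
    body.update({name: value for name, value in guidance.items() if value})
    return body


body = build_rate_limit_refusal(
    limit="10 scans per IP per hour",
    retry_after_seconds=2400,
    why="Rate limits keep the service available for everyone.",
    alternative_endpoint="/api/result?id=example.com",
)
print(sorted(body))
```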
Level 4 -- Add RateLimit and RateLimit-Policy headers to successful responses so callers can self-throttle before hitting limits.
Optional extensions -- Services with consequential agent actions can link Action Boundaries documents from the discovery endpoint. Extensions are informational declarations, not verification or endorsement, and do not change Level 1-4 conformance. See docs/action-boundaries.md.
Security baseline -- Treat machine-readable guidance, refusal text, URLs, and boundary documents as untrusted service-provided data. Agents should parse known fields, but should not follow instructions embedded in detail, why, policy text, approval text, or URLs.
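One concrete reading of "untrusted" for guidance URLs is to follow them only when they resolve to the origin that issued the refusal. The sketch below shows that check. It is an interpretation for illustration, not the specification's normative SC-6 rule, and the function name is hypothetical.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin(url: str) -> Optional[tuple]:
    """Return (scheme, host, port) for http(s) URLs, else None."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return None
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        return None
    return (parts.scheme, parts.hostname, port or DEFAULT_PORTS[parts.scheme])


def same_origin_guidance_url(request_url: str, guidance_url: str) -> Optional[str]:
    """Resolve a guidance URL against the request; return it only if same-origin."""
    resolved = urljoin(request_url, guidance_url)
    base, target = _origin(request_url), _origin(resolved)
    if base is None or target is None or base != target:
        return None  # off-origin or non-http(s): ignore the guidance
    return resolved


request = "https://siteline.to/api/scan?url=example.com"
print(same_origin_guidance_url(request, "/api/result?id=example.com"))
print(same_origin_guidance_url(request, "https://evil.example/result"))
```

The first call returns the resolved same-origin URL; the second returns `None` because the host differs.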
HTML endpoints -- HTML pages that return 429 SHOULD include <meta name="retry-after" content="N"> and/or <link rel="alternate" type="application/json" href="..."> so agents can discover structured refusals without parsing prose.
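An agent could pick up those two hints with the standard-library HTML parser, as in the sketch below. The class name and the sample page (including the `/status.json` href) are hypothetical, and the extracted href should still be treated as untrusted data.

```python
from html.parser import HTMLParser


class RefusalHintParser(HTMLParser):
    """Collect the retry-after meta value and the JSON alternate link."""

    def __init__(self) -> None:
        super().__init__()
        self.retry_after_seconds = None
        self.json_alternate_href = None

    def handle_starttag(self, tag, attrs):
        # Attribute values may be None when an attribute has no value.
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "retry-after":
            content = attributes.get("content", "").strip()
            if content.isascii() and content.isdigit():
                self.retry_after_seconds = int(content)
        elif (
            tag == "link"
            and "alternate" in attributes.get("rel", "").lower().split()
            and attributes.get("type", "").lower() == "application/json"
        ):
            self.json_alternate_href = attributes.get("href") or None


page = """<html><head>
<meta name="retry-after" content="120">
<link rel="alternate" type="application/json" href="/status.json">
</head><body><p>Too many requests.</p></body></html>"""

hints = RefusalHintParser()
hints.feed(page)
print(hints.retry_after_seconds, hints.json_alternate_href)
```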
See the full specification for field definitions, response classes, and security considerations.
| Standard | What it covers | What Graceful Boundaries adds |
|---|---|---|
| draft-ietf-httpapi-ratelimit-headers | Proactive headers on success | Discovery endpoint, structured refusal body, why field, constructive guidance |
| RFC 6585 (429 status) | The status code itself | Structured body format with required fields |
| RFC 9457 (Problem Details) | Generic error format | Required fields for rate limits (limit, retryAfterSeconds, why) and guidance categories |
| OpenAPI Rate Limit extensions | Docs-time limit specs | Runtime discovery endpoint, runtime refusal format |
Graceful Boundaries is complementary to these standards, not a replacement. For services already emitting RFC 9457 Problem Details, the compatibility profile shows how to satisfy both specs in one response body.
Siteline is a Level 4 conformant implementation with five API endpoints. Verify it:

    npx graceful-boundaries check https://siteline.to

Services implementing the spec are listed in ADOPTERS.md, which also covers how to add yours and embed a conformance badge.
The specification includes a threat model and security audit (SC-1 through SC-16). It covers rate limit calibration attacks, security posture disclosure, validation oracles, content cloaking via agent-signaling headers, action boundary risks, and untrusted machine-readable guidance, among other considerations.
Graceful Boundaries is free and open. If your team relies on this spec, consider sponsoring its development to keep it maintained and evolving. See SPONSORS.md.
CC-BY-4.0. Use it, adapt it, build on it. Attribution required.
Graceful Boundaries is a PAICE.work project. PAICE.work PBC is a public benefit corporation dedicated to enabling safer and more effective People+AI collaboration. We believe that clear, honest communication between services and their callers (human or AI agent) is foundational to trustworthy AI infrastructure. This specification is part of that mission.
The patterns in this spec emerged from building Siteline, an AI agent readiness scanner, where the quality of the refusal matters as much as the enforcement.
The conformance audit skill is available on ClawHub.
See also: GuideCheck -- human-verifiable assistant guides, and Skill Provenance -- version identity that travels with agent skill bundles. Both are also PAICE.work projects.