dox402

Edge-native, accountless, pay-per-use AI inference using Cloudflare Durable Objects + the x402 payment protocol.

Flow: Client → POST /infer → 402 + PAYMENT-REQUIRED header → client pays on-chain (USDC on Base) → retries with PAYMENT-SIGNATURE header → streamed AI output.
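
The flow above can be sketched from the client side as follows. This is a minimal sketch under stated assumptions, not the project's own client: `HttpPost`, `Pay`, and `inferWithPayment` are hypothetical names, the request body is treated as an opaque string, and authentication is left to the `post` callback. Only the `/infer` path, the 402 status, and the two header names come from this README.

```typescript
// Minimal response shape so the sketch does not depend on a specific HTTP client.
interface HttpResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical transport: sends a POST and resolves with the response.
type HttpPost = (
  path: string,
  headers: Record<string, string>,
  body: string
) => Promise<HttpResponse>;

// Hypothetical payer: takes the raw PAYMENT-REQUIRED header value, settles
// on-chain, and resolves with the value for the PAYMENT-SIGNATURE header.
type Pay = (paymentRequired: string) => Promise<string>;

async function inferWithPayment(
  post: HttpPost,
  pay: Pay,
  body: string
): Promise<HttpResponse> {
  const baseHeaders = { "Content-Type": "application/json" };

  // First attempt: the server may answer 402 with a PAYMENT-REQUIRED header.
  const first = await post("/infer", baseHeaders, body);
  if (first.status !== 402) return first;

  const requirement = first.headers.get("PAYMENT-REQUIRED");
  if (requirement === null) {
    throw new Error("402 response without a PAYMENT-REQUIRED header");
  }

  // Pay, then retry the same request with the proof attached.
  const proof = await pay(requirement);
  return post(
    "/infer",
    { "Content-Type": "application/json", "PAYMENT-SIGNATURE": proof },
    body
  );
}
```

Injecting `post` and `pay` keeps wallet handling and the HTTP client outside the retry logic, so the same function can be driven by a mock in tests.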

No signup. No API key. No custody of funds.


Prerequisites

# Authenticate with Cloudflare (required for Workers AI, even in local dev)
npx wrangler login

Workers AI calls are remote even during local development — you need a Cloudflare account with Workers AI access.


Local Development

npm install
npx wrangler dev          # starts on http://localhost:8787

Ensure .dev.vars contains:

PAYMENT_ADDRESS=0x...
BASE_RPC_URL=https://mainnet.base.org
MOCK_PAYMENTS=true
SESSION_SECRET=<any-hex-string>
ADMIN_SECRET=dev-admin-secret-for-local-testing

Run unit tests (no server needed):

npm test

Run the 9-phase end-to-end test (requires a running dev server):

npm run test:e2e

The E2E test generates a random wallet, signs SIWE messages programmatically using @noble/curves, authenticates, deposits with a mock proof, runs inference, and verifies balance/history.


Deploy to Cloudflare

# Set production secrets (use wrangler secret, not vars)
npx wrangler secret put PAYMENT_ADDRESS   # your USDC-receiving wallet
npx wrangler secret put BASE_RPC_URL      # e.g. https://mainnet.base.org
npx wrangler secret put SESSION_SECRET    # generate with: openssl rand -hex 32
npx wrangler secret put ADMIN_SECRET      # generate with: openssl rand -hex 32

npx wrangler deploy

# Verify
curl https://<worker>.workers.dev/health

API

Authenticated endpoints accept either an Authorization: Bearer <token> header or an ig_session HttpOnly cookie (both obtained via SIWE login).

Public

| Endpoint | Description |
| --- | --- |
| GET /health | Liveness probe |
| GET /payment-info | Payment address + network details |

Auth (SIWE)

| Endpoint | Description |
| --- | --- |
| GET /auth/nonce?wallet=0x... | Generate one-time nonce |
| POST /auth/login | Verify SIWE signature → session cookie + JWT |
| POST /auth/logout | Clear session cookie |

Authenticated

| Endpoint | Description |
| --- | --- |
| POST /infer | Run inference (post-billed from balance, SSE stream). Accepts optional systemPrompt (max 2000 chars) for persistent instructions. |
| POST /deposit | Top up balance with payment proof |
| GET /balance | Token balance + usage stats |
| GET /history | Conversation messages + metadata |
| DELETE /history | Clear conversation |
| POST /documents | Upload document for RAG (embedding cost deducted) |
| GET /documents | List uploaded documents |
| DELETE /documents/:id | Delete document + Vectorize embeddings |
| POST /documents/reindex | Re-upsert all document vectors (fixes metadata indexing) |

Admin (requires ADMIN_SECRET Bearer token)

| Endpoint | Description |
| --- | --- |
| GET /admin/wallets | Paginated list of registered wallets |
| GET /admin/wallets/:wallet/status | Detailed wallet status (balance, usage, documents) |
| GET /admin/stats | Aggregate statistics (total wallets) |
| GET /admin/stale | Identify zero-balance inactive wallets |

SIWX (Single-Request Auth)

x402-compatible clients can skip the nonce/login flow and pass a SIGN-IN-WITH-X header on POST /infer for stateless, single-request wallet authentication. The 402 response advertises supported chains via a sign-in-with-x extension.


Agent Discovery

dox402 publishes machine-readable specs so AI agents can discover and use the API without out-of-band configuration.

Discovery endpoints

| Endpoint | Description |
| --- | --- |
| GET /openapi.json | OpenAPI 3.1 specification of every REST endpoint |
| GET /SKILL.md | Agent-readable usage guide |
| GET /.well-known/agent.json | A2A agent card (identity, transport, skill) |
| GET /.well-known/agents.json | Multi-step flows (login, infer, balance, etc.) |
| GET /.well-known/api-catalog | RFC 9727 linkset (application/linkset+json) pointing to OpenAPI / SKILL.md / health |
| GET /.well-known/agent-skills/index.json | Cloudflare Agent Skills Discovery v0.2.0 index |
| GET /robots.txt | Crawl rules + AI bot policies + Content Signals (ai-train=no, search=yes, ai-input=yes) |
| GET /sitemap.xml | Canonical URLs |

Content negotiation on /

  • Accept: text/markdown → returns SKILL.md with Content-Type: text/markdown; charset=utf-8 and x-markdown-tokens header (Cloudflare Markdown for Agents).
  • Accept: text/html → HTML response includes RFC 8288 Link headers advertising openapi.json, agent.json, agents.json, and SKILL.md.
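
The negotiation rule above can be illustrated with a small helper. This is a simplified sketch, not the Worker's actual logic: `pickRootRepresentation` is a hypothetical name, q-values are ignored, and falling back to HTML when `text/markdown` is absent is an assumption.

```typescript
type RootRepresentation = "markdown" | "html";

// Decide which representation of "/" to serve from the Accept header.
// Simplification: any listed text/markdown wins, regardless of q-values.
function pickRootRepresentation(accept: string | null): RootRepresentation {
  if (accept === null) return "html"; // assumed default when no Accept header is sent

  const mediaTypes = accept
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());

  return mediaTypes.indexOf("text/markdown") !== -1 ? "markdown" : "html";
}
```

A client that wants the agent-readable guide would send `Accept: text/markdown` on `GET /`; a browser's default Accept header would select the HTML branch.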

In-browser agents (WebMCP)

When the homepage loads in a WebMCP-capable browser, it registers 5 tools via navigator.modelContext.registerTool(): connect_wallet, get_balance, send_inference, view_history, open_deposit_ui. No tool auto-spends — deposits still require an explicit user wallet signature.

What we deliberately don't publish

See public/.well-known/README.md for the rationale on /.well-known/openid-configuration, /.well-known/oauth-authorization-server, /.well-known/oauth-protected-resource, /.well-known/http-message-signatures-directory, /.well-known/mcp/server-card.json, /.well-known/ucp, and /.well-known/acp.json — none of which fit dox402's architecture (wallet-signed auth, per-call API monetization, no MCP endpoint, no product catalog).


Models

| Model | Context Window | Speed |
| --- | --- | --- |
| Llama 3.1 8B | 7,968 tokens | Fast |
| Llama 3.3 70B | 24,000 tokens | Medium |
| Gemma 3 12B | 8,000 tokens | Fast |
| Mistral 7B | 8,000 tokens | Fast |
| DeepSeek R1 32B | 80,000 tokens | Medium |

Total input (prompt + conversation history + RAG file context) is validated against the selected model's context window before inference. Requests exceeding the limit receive a 413 error.
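
The pre-inference check described above might look roughly like this. It is a sketch under assumptions: `CONTEXT_WINDOWS` and `checkContextWindow` are hypothetical names, the map keys are the display names from the Models table rather than whatever identifiers the API accepts, and the caller is assumed to supply token counts that were computed elsewhere.

```typescript
// Limits copied from the Models table above (keys are display names, an assumption).
const CONTEXT_WINDOWS: Record<string, number> = {
  "Llama 3.1 8B": 7968,
  "Llama 3.3 70B": 24000,
  "Gemma 3 12B": 8000,
  "Mistral 7B": 8000,
  "DeepSeek R1 32B": 80000,
};

type ContextCheck =
  | { ok: true; total: number; limit: number }
  | { ok: false; status: 413; total: number; limit: number };

// Sum prompt, history, and RAG tokens and compare against the model's window.
function checkContextWindow(
  model: string,
  promptTokens: number,
  historyTokens: number,
  ragTokens: number
): ContextCheck {
  const limit = CONTEXT_WINDOWS[model];
  if (limit === undefined) {
    throw new Error("Unknown model: " + model);
  }

  const total = promptTokens + historyTokens + ragTokens;
  if (total > limit) {
    // Mirrors the 413 response the README describes for oversized input.
    return { ok: false, status: 413, total: total, limit: limit };
  }
  return { ok: true, total: total, limit: limit };
}
```

For example, 7,000 prompt tokens plus 900 history tokens plus 68 RAG tokens exactly fills the Llama 3.1 8B window and passes in this sketch; one more token would yield the 413 branch.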


Payment Headers (x402 spec)

| Header | Direction | Content |
| --- | --- | --- |
| PAYMENT-REQUIRED | Server → Client | base64-encoded PaymentRequired JSON |
| PAYMENT-SIGNATURE | Client → Server | base64-encoded PaymentProof JSON |
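
Both headers carry base64-encoded JSON, so a client needs a pair of helpers along these lines. This is a sketch assuming standard (not URL-safe) base64 and UTF-8 JSON; `encodeHeaderJson` and `decodeHeaderJson` are hypothetical names, and the fields of PaymentRequired and PaymentProof are left untyped because this README does not list them.

```typescript
// Serialize a value to JSON and base64-encode its UTF-8 bytes.
function encodeHeaderJson(value: unknown): string {
  const bytes = new TextEncoder().encode(JSON.stringify(value));
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Reverse of encodeHeaderJson: base64 → UTF-8 bytes → parsed JSON.
function decodeHeaderJson(headerValue: string): unknown {
  const binary = atob(headerValue);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return JSON.parse(new TextDecoder().decode(bytes));
}
```

A client would call `decodeHeaderJson` on the PAYMENT-REQUIRED value from a 402 response and `encodeHeaderJson` on its proof object before sending PAYMENT-SIGNATURE.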

Architecture

  • Durable Object (InferenceGate): one instance per wallet address with embedded SQLite storage. Holds token balance, conversation history, rate limits, and replay-prevention data. All balance updates happen inside storage.transactionSync() to prevent race conditions. New DOs are co-located in Eastern North America (locationHint: 'enam') to minimize latency to Base chain RPC providers.
  • Worker (index.ts): validates wallet address format, authenticates via SIWE session tokens or SIWX headers, routes to the correct DO instance via typed RPC stubs.
  • KV Registry (WALLET_REGISTRY): global index of all active wallet DO instances, updated on first use via fire-and-forget ctx.waitUntil() calls. Powers the admin endpoints for fleet visibility.
  • Replay prevention: each payment hash stored in the seen_transactions SQL table with automatic 1-hour TTL cleanup via DO alarms.
  • Authentication: SIWE (EIP-4361) proves wallet ownership; HMAC-SHA256 JWTs delivered as HttpOnly cookies (browser) or Bearer tokens (API). SIWX single-request auth also supported for x402 clients.
  • Verification: Tier 1 structural checks + Tier 2 on-chain RPC receipt verification via eth_getTransactionReceipt. Grace mode provides provisional tokens when RPC is unreachable, with automatic alarm-based re-verification.
  • Streaming: SSE responses with heartbeat keepalive (:keepalive comments every 15s of inactivity) and a 2-minute max-duration guard to prevent runaway streams. Backpressure is handled naturally via await writer.write().
  • Billing safeguards: failed AI responses (empty, error JSON, stream errors) are detected and not billed — tokens are only deducted for successful inference.
  • RAG (Retrieval-Augmented Generation): per-wallet document knowledge base powered by Cloudflare Vectorize and Workers AI embeddings (bge-base-en-v1.5). Documents are chunked (1600 chars, 200 char overlap), embedded, and stored in a shared Vectorize index with per-wallet metadata filtering. Opt-in via useRag: true on /infer — relevant chunks are retrieved (top-5, cosine similarity ≥ 0.45) and injected as system context. Total input (prompt + history + RAG context) is validated against each model's context window. RAG failure is non-fatal.
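
The RAG parameters in the last bullet (1600-character chunks with 200 characters of overlap, top-5 retrieval at cosine similarity ≥ 0.45) can be illustrated as follows. This is a sketch only: `chunkText`, `selectChunks`, and `ScoredChunk` are hypothetical names, the chunker splits purely by character offset (the real one may respect word or sentence boundaries), and scores are assumed to come from a vector search performed elsewhere.

```typescript
// Split text into fixed-size windows where each window repeats the last
// `overlap` characters of the previous one.
function chunkText(text: string, size: number = 1600, overlap: number = 200): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap must be in [0, size)");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}

interface ScoredChunk {
  text: string;
  score: number; // cosine similarity reported by the vector index (assumed)
}

// Keep matches at or above the threshold, best first, capped at k.
function selectChunks(
  matches: ScoredChunk[],
  k: number = 5,
  minScore: number = 0.45
): ScoredChunk[] {
  return matches
    .filter((m) => m.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

With the defaults, consecutive chunks start 1,400 characters apart, so a 3,000-character document produces two chunks that share characters 1,400 through 1,599.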
