anvil engineering specification

1. Overview

anvil is a Cloudflare-native CI runner for personal projects and small teams.

The v1 architecture is intentionally split across Cloudflare products by access pattern:

D1 stores relational control-plane data shared across users and projects.
KV stores short-lived session state with TTL-based expiry.
Durable Objects with SQLite store hot coordination state and live run state.
Queues decouple trigger ingestion from runner execution.
Sandbox runs builds in isolated Linux environments.
React provides the operator UI, served from the same Worker application.

anvil is designed around three hard requirements from the start:

A single public API prefix that can be protected by one WAF rate limit rule.
Multi-user ownership with single-owner projects in v1.
Repository-defined pipeline config instead of UI-defined commands.

2. Goals

2.1 v1 goals

Multiple users.
Multiple projects per user.
Custom HTTPS Git repositories.
Manual run trigger.
Webhook-triggered runs.
Repository-defined config from .anvil.yml.
Invite-only access for v1.
One active run per project.
Per-project FIFO pending run queue.
User-initiated cancellation of active or pending runs.
Live log streaming.
Strong coordination around run creation and run state.

2.2 Explicit v1 non-goals

Deployments.
Preview environments.
SSH Git auth.
Matrix builds.
DAG or multi-stage orchestration.
Warm reusable runners.
User-specified runner images.
Artifact browser.
R2 log archiving.
Human approval gates.
Shared multi-user projects and project collaboration (deferred to a future version).

3. Technology stack

3.1 Language and runtime

TypeScript
Cloudflare Workers
Hono
React
Vite
@cloudflare/vite-plugin

Use Hono as the Worker HTTP framework and routing layer.

3.2 Validation and contracts

@cloudflare/util-en-garde

All external and internal boundary payloads must be described with util-en-garde codecs and inferred TypeScript types. If usage patterns are unclear, refer to en-garde.README.md.

3.3 Persistence

D1 for relational data across users/projects/runs.
Workers KV for short-lived session state.
SQLite-backed Durable Objects for project-local and run-local state.
Drizzle ORM for D1 and Durable Object SQLite access.

All application-level database reads and writes must use drizzle-orm by default. Use Drizzle's documented APIs for transactional and batched database work where appropriate; see the Drizzle documentation on transactions and on the batch API. Raw SQL may be used only when drizzle-orm cannot express the required operation cleanly or when it is absolutely necessary for correctness or performance, and any such usage must be narrowly scoped.

3.4 Identifier conventions

All durable entity IDs use the format:

{prefix}_{base62(uuidv7)}

Examples:

usr_000Ff2k9A6pQzL1cM8xYwR
prj_000Ff2m4sC7vTb9Jk2nHdP
run_000Ff2qQw8LmNc3Xy6rStU

Rules:

the base62 suffix is fixed-width at 22 characters
the canonical base62 alphabet is 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
public IDs are opaque and stable
this format applies to durable entity IDs such as usr_, prj_, run_, inv_, and whk_
high-entropy security tokens such as session IDs, invite tokens, WebSocket tickets, and webhook secrets do not use this format
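
As a rough illustration of the rules above, the following sketch encodes 16 UUID bytes into the fixed-width 22-character base62 suffix. The names encodeBase62 and formatEntityId are hypothetical, and generating the UUIDv7 bytes is assumed to happen elsewhere.

```typescript
// Canonical alphabet from the rules above: digits, then uppercase, then lowercase.
const BASE62_ALPHABET =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const SUFFIX_WIDTH = 22;

// Encode 16 big-endian bytes as a zero-padded, fixed-width base62 string.
// Uses repeated long division by 62 over the byte array, so no BigInt is needed.
function encodeBase62(bytes: Uint8Array): string {
  if (bytes.length !== 16) {
    throw new Error("expected exactly 16 bytes (one UUID)");
  }
  const work: number[] = Array.from(bytes);
  const out: string[] = new Array<string>(SUFFIX_WIDTH).fill("0");
  for (let pos = SUFFIX_WIDTH - 1; pos >= 0; pos--) {
    let remainder = 0;
    for (let i = 0; i < work.length; i++) {
      const acc = remainder * 256 + work[i];
      work[i] = Math.floor(acc / 62);
      remainder = acc % 62;
    }
    // The remainder of this division pass is the next least-significant digit.
    out[pos] = BASE62_ALPHABET.charAt(remainder);
  }
  return out.join("");
}

// Hypothetical helper: joins an entity prefix such as "usr" with the encoded suffix.
function formatEntityId(prefix: string, uuidBytes: Uint8Array): string {
  return `${prefix}_${encodeBase62(uuidBytes)}`;
}
```

Because the alphabet is in ASCII order and the suffix is fixed-width, plain string comparison of two suffixes agrees with the numeric order of the underlying bytes, so IDs built from UUIDv7 values sort roughly by creation time.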

4. High-level architecture

4.1 Components

Worker frontdoor
- API routing
- auth/session checks
- D1 access
- Durable Object RPC invocation
- authenticated WebSocket upgrade routing
- Queue producer
- frontend asset serving
ProjectDO
- one object per project
- project-level concurrency and trigger arbitration
- accepted run state and D1 sync/dispatch reconciliation
- webhook configuration and encrypted secret storage
- active and pending run lock state
RunDO
- one object per run
- live run metadata
- rolling log storage
- WebSocket fanout for log viewers
- run completion and tail retention
Queue consumer
- sandbox creation
- git checkout
- repo config parsing
- sequential command execution
- log streaming into RunDO
Sandbox
- isolated build execution per run

4.2 Control-plane split

The control plane is deliberately divided:

Global relational control plane in D1:
- users
- projects
- invites
- run index rows
Project-local coordination plane in ProjectDO:
- active run lock
- pending run queue coordination
- accepted run metadata and D1 sync/dispatch retry state
- webhook definitions and encrypted secret material
Run-local live plane in RunDO:
- hot status
- step state
- rolling log tail
- WebSocket attachments and tags

This is the core architectural boundary of anvil.

4.3 Durable Object routing model

Public identifiers are not Durable Object IDs.

ProjectDO is addressed internally via idFromName(projectId)
RunDO is addressed internally via idFromName(runId)
Durable Object IDs remain internal implementation details and are never exposed as API identifiers

4.4 Durable Object invocation model

All non-WebSocket interactions with Durable Objects must use Workers RPC.

the Worker frontdoor and queue consumer call typed RPC methods on ProjectDO and RunDO stubs
Durable Objects are internal actors, not general-purpose HTTP handlers for private API routes
the Worker owns HTTP parsing, request validation, authentication, authorization, and response shaping before invoking RPC
Durable Object RPC methods receive trusted typed inputs and enforce project-local or run-local invariants
the log-stream WebSocket upgrade is the only fetch-based Durable Object path in v1, and the Worker authenticates the upgrade before handing it to RunDO

5. API surface and routing

5.1 Route prefixes

All public, unauthenticated, or brute-forceable endpoints must live under one shared prefix:

/api/public/*

All authenticated application endpoints must live under:

/api/private/*

5.2 WAF strategy

A single WAF rate limit rule should protect:

starts_with(http.request.uri.path, "/api/public/")

This one rule is the primary public attack-surface control for:

login brute force
session abuse
webhook spray
password reset abuse, if added later

5.3 Public routes

POST /api/public/auth/login
POST /api/public/auth/logout
POST /api/public/auth/invite/accept (only route that can create a user in v1)
POST /api/public/hooks/:provider/:ownerSlug/:projectSlug

Registration is invite-only in v1. There is no open self-signup route.

5.4 Private routes

GET /api/private/me
GET /api/private/projects
POST /api/private/projects
PATCH /api/private/projects/:projectId
GET /api/private/projects/:projectId
GET /api/private/projects/:projectId/runs
POST /api/private/projects/:projectId/runs
GET /api/private/projects/:projectId/webhooks
PUT /api/private/projects/:projectId/webhooks/:provider
POST /api/private/projects/:projectId/webhooks/:provider/rotate-secret
DELETE /api/private/projects/:projectId/webhooks/:provider
POST /api/private/runs/:runId/cancel
GET /api/private/runs/:runId
POST /api/private/runs/:runId/log-ticket
GET /api/private/runs/:runId/logs (WebSocket upgrade)
POST /api/private/invites

6. Authentication and sessions

6.1 Session storage choice

Session records are stored in KV, not D1.

The frontend stores the opaque session identifier in browser localStorage, not cookies.

Each session key is:

random opaque identifier
written with expirationTtl
returned by login and stored in browser localStorage
sent by the frontend on private requests, typically using an Authorization: Bearer <sessionId> header
deleted on logout or allowed to expire naturally

6.2 Session payload

Recommended KV value:

{
  "userId": "usr_...",
  "issuedAt": "2026-03-16T00:00:00.000Z",
  "expiresAt": "2026-03-16T06:00:00.000Z",
  "version": 1
}

Suggested key pattern:

sess:{sessionId}
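
A minimal sketch of the session record and key handling described above. The helper names are hypothetical, and the character set accepted for session IDs (base64url-style) is an assumption, not something this specification fixes.

```typescript
// Shape of the KV value recommended above.
interface SessionRecord {
  userId: string;
  issuedAt: string; // ISO 8601 timestamp
  expiresAt: string; // ISO 8601 timestamp
  version: number;
}

// Builds the KV key using the suggested sess:{sessionId} pattern.
function sessionKey(sessionId: string): string {
  return `sess:${sessionId}`;
}

// Extracts the opaque session ID from an "Authorization: Bearer <sessionId>" value.
// Assumption: session IDs only contain base64url-style characters.
function extractBearerSessionId(headerValue: string | null): string | null {
  if (headerValue === null) return null;
  const match = /^Bearer ([A-Za-z0-9_-]+)$/.exec(headerValue.trim());
  return match ? match[1] : null;
}

// Defensive check on the stored payload; KV's expirationTtl removes the key on its own
// schedule, so the recorded expiresAt is still compared against the current time.
function isSessionLive(record: SessionRecord, nowMs: number): boolean {
  const expiresMs = Date.parse(record.expiresAt);
  return record.version === 1 && Number.isFinite(expiresMs) && nowMs < expiresMs;
}
```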

6.3 TTL guidance

Session TTL should be short and renewable.

Recommended v1 policy:

default TTL: 6 hours
refresh-on-use: refresh when less than 1 hour remains
delete on logout
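
The refresh-on-use policy can be sketched as a small pure function. The type and function names are hypothetical; the two durations come from the recommended policy above.

```typescript
const SESSION_TTL_SECONDS = 6 * 60 * 60; // default TTL: 6 hours
const REFRESH_THRESHOLD_SECONDS = 60 * 60; // refresh when less than 1 hour remains

type SessionAction =
  | { kind: "reject" }
  | { kind: "keep" }
  | { kind: "refresh"; newExpiresAtMs: number; expirationTtl: number };

// Decides what to do with a session on each authenticated request.
function planSessionRefresh(expiresAtMs: number, nowMs: number): SessionAction {
  const remainingMs = expiresAtMs - nowMs;
  if (remainingMs <= 0) return { kind: "reject" };
  if (remainingMs < REFRESH_THRESHOLD_SECONDS * 1000) {
    // Rewrite the KV entry with a fresh expiresAt and a fresh expirationTtl.
    return {
      kind: "refresh",
      newExpiresAtMs: nowMs + SESSION_TTL_SECONDS * 1000,
      expirationTtl: SESSION_TTL_SECONDS,
    };
  }
  return { kind: "keep" };
}
```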

6.4 KV caveat

KV is eventually consistent across regions. This is acceptable for short-lived opaque sessions, but the design must tolerate:

logout invalidation not becoming globally visible instantly
recently-created sessions taking some time to appear in far regions

Mitigations:

use random session IDs with high entropy
keep session payload minimal
do not use KV for authorization data beyond the user ID and expiry
fetch authorization and project ownership from D1 on private requests
treat logout as best-effort: immediate where the deletion is already visible, and globally convergent shortly after

6.4.1 Browser storage caveat

Because the frontend uses localStorage rather than cookies:

the application avoids ambient cookie attachment and the CSRF exposure tied to cookie-based session transport
the application must treat XSS resistance as critical because localStorage is accessible to frontend JavaScript
the frontend must never place the session identifier in URLs, WebSocket query strings, or any other browser-visible location beyond the dedicated auth storage key
logout must clear in-memory auth state and remove the localStorage entry immediately
the frontend must enforce a strict Content Security Policy and avoid inline script execution
run logs and all other untrusted runner output must be rendered as text, not raw HTML
any rich log formatting such as ANSI colorization must start from escaped text and apply only an allowlisted presentation transform

6.4.2 Disabled-user behavior

For v1:

login must reject users whose disabled_at is set
private requests must reject sessions whose user row is disabled in D1, even if the KV session has not yet expired
disabled users cannot create new projects, runs, webhooks, or invites

6.5 Password data

Password credential rows remain in D1.

Recommended v1 password storage format:

algorithm: PBKDF2
per-user random salt stored alongside the password hash
iteration count stored alongside the password hash so parameters can be raised later
derived key length and digest algorithm recorded as metadata if the implementation wants explicit forward compatibility

Suggested columns:

user_id
algorithm
digest
iterations
salt
password_hash
updated_at

The salt is required so identical passwords do not map to identical stored hashes and to make precomputed rainbow tables ineffective.

6.6 Future authentication methods

Not in v1, but the architecture should leave room for:

OAuth login
SAML login

Recommended future shape:

keep local password auth as one provider
add an identity_providers table in D1 later
add user_identities rows mapping users to external providers and stable provider subject IDs
keep /api/public/auth/* as the public auth ingress prefix so WAF protection remains unchanged

7. D1 usage model

7.1 What belongs in D1

D1 is the global relational source of truth for:

users
password credentials
projects
project ownership
run index
canonical prefixed entity identifiers and owner-scoped slugs
encrypted user-provided project credentials
invite tokens

7.2 What stays out of D1

The following should not be stored centrally in D1:

live run logs
active-run lock state
webhook configuration
encrypted webhook secret material
live WebSocket connection state
per-project accepted-run and pending-queue coordination state

Webhook configuration lives in ProjectDO.

8. D1 Sessions API and read-replication design

anvil should use the D1 Sessions API whenever possible, especially on read-heavy application routes.

8.1 Session helper policy

Create two D1 helpers:

openReadSession(request, env)
openPrimarySession(request, env)

`openReadSession`

Use when the route is logically read-only.

Behavior:

read bookmark from request header if present
call env.DB.withSession(bookmark ?? "first-unconstrained")
execute all D1 reads through this session
return the updated bookmark back to the client

`openPrimarySession`

Use when the route may write or must start from the latest primary state.

Behavior:

call env.DB.withSession("first-primary")
execute D1 read/write operations through this session
return the updated bookmark back to the client
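
A minimal sketch of the two helpers, under stated assumptions. The interfaces below are local structural stand-ins for the D1 binding rather than the real types, the helpers take a header reader and a database handle instead of the full (request, env) pair, and reusing x-anvil-d1-bookmark as the request header name is an assumption.

```typescript
// Local stand-ins so the sketch is self-contained; the real binding has more members.
interface D1SessionLike {
  getBookmark(): string | null;
}
interface D1Like {
  withSession(constraintOrBookmark: string): D1SessionLike;
}
interface HeaderReader {
  get(name: string): string | null;
}

const BOOKMARK_HEADER = "x-anvil-d1-bookmark";

// Read path: continue from the client's bookmark if one was sent, else start unconstrained.
function openReadSession(headers: HeaderReader, db: D1Like): D1SessionLike {
  const bookmark = headers.get(BOOKMARK_HEADER)?.trim();
  return db.withSession(bookmark ? bookmark : "first-unconstrained");
}

// Write path: the first query in the session is served from the primary.
function openPrimarySession(db: D1Like): D1SessionLike {
  return db.withSession("first-primary");
}

// Headers to add to the response so the client can send the bookmark next time.
function bookmarkResponseHeaders(session: D1SessionLike): Record<string, string> {
  const bookmark = session.getBookmark();
  return bookmark === null ? {} : { [BOOKMARK_HEADER]: bookmark };
}
```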

8.2 Bookmark transport

Use lightweight, browser-visible storage for the D1 bookmark.

Recommended initial approach:

response header: x-anvil-d1-bookmark
mirrored into browser localStorage by the frontend fetch wrapper

The bookmark is not auth material. It is only a consistency token.

8.3 Likely read-only routes

These routes should use openReadSession:

GET /api/private/me
GET /api/private/projects
GET /api/private/projects/:projectId
GET /api/private/projects/:projectId/runs
GET /api/private/projects/:projectId/webhooks
GET /api/private/runs/:runId
POST /api/private/runs/:runId/log-ticket (D1 is only read for ownership verification; the minted ticket is written to KV before the WebSocket upgrade)

Potentially read-only public routes, if later added:

GET /api/public/auth/session
GET /api/public/projects/:ownerSlug/:projectSlug/info if ever exposed

8.4 Write-capable routes

These routes should use openPrimarySession:

POST /api/public/auth/login
POST /api/public/auth/invite/accept
POST /api/private/invites
POST /api/private/projects
PATCH /api/private/projects/:projectId
POST /api/private/projects/:projectId/runs
PUT /api/private/projects/:projectId/webhooks/:provider
POST /api/private/projects/:projectId/webhooks/:provider/rotate-secret
DELETE /api/private/projects/:projectId/webhooks/:provider
POST /api/private/runs/:runId/cancel
POST /api/public/hooks/:provider/:ownerSlug/:projectSlug

8.5 Read route principle

Any route that only:

validates session via KV
checks ownership in D1
returns data without mutating D1

should use the D1 Sessions API read path.

9. Durable Objects

9.1 ProjectDO

One ProjectDO exists per project.

Responsibilities

serialize run trigger requests
allocate runId values for accepted runs
enforce one-active-run-per-project
own the per-project FIFO pending run queue in v1
persist accepted run metadata before D1 sync and queue dispatch succeed
snapshot the non-secret execution inputs required to execute an accepted run
act as the single durable reconciler for queue dispatch and D1 run-summary sync
store webhook definitions and encrypted secrets
deduplicate webhook deliveries
return webhook verification material to the Worker and accept verified control-plane actions via RPC
coordinate run start, cancellation, and lock release

SQLite tables in ProjectDO

`project_state`

project_id TEXT PRIMARY KEY
active_run_id TEXT NULL
updated_at INTEGER NOT NULL

`project_runs`

id TEXT PRIMARY KEY
project_id TEXT NOT NULL
run_id TEXT NOT NULL
trigger_type TEXT NOT NULL
triggered_by_user_id TEXT NULL
branch TEXT NOT NULL
commit_sha TEXT NULL
provider TEXT NULL
delivery_id TEXT NULL
repo_url TEXT NOT NULL
config_path TEXT NOT NULL
position INTEGER NULL
status TEXT NOT NULL
d1_sync_status TEXT NOT NULL
dispatch_status TEXT NOT NULL
dispatch_attempts INTEGER NOT NULL
last_error TEXT NULL
created_at INTEGER NOT NULL
cancel_requested_at INTEGER NULL

project_runs is ProjectDO's durable reconciliation ledger for accepted runs.

status tracks ProjectDO's accepted-run and queue-local state
d1_sync_status tracks whether the D1 run summary is reconciled for both initial acceptance and terminal completion
dispatch_status tracks whether the currently executable run has been queued for execution

At acceptance time, ProjectDO snapshots the non-secret inputs required for execution:

the effective branch
repo_url
config_path

Repository credentials are not snapshotted. The queue consumer resolves the latest stored repository token from D1 at execution time.

`project_runs` enum guidance

status values in v1:

pending
executable
active
cancel_requested
passed
failed
canceled

Allowed status transitions:

pending -> executable
pending -> canceled
executable -> active
executable -> failed
executable -> canceled
active -> cancel_requested
active -> passed
active -> failed
cancel_requested -> canceled
cancel_requested -> failed

d1_sync_status values in v1:

needs_create
current
needs_terminal_update
done

Allowed d1_sync_status transitions:

needs_create -> current
needs_create -> needs_terminal_update
current -> needs_terminal_update
needs_terminal_update -> done

dispatch_status values in v1:

blocked
pending
queued
started
terminal

Allowed dispatch_status transitions:

blocked -> pending
pending -> queued
queued -> started
blocked -> terminal
pending -> terminal
queued -> terminal
started -> terminal
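
The three transition lists above can be expressed as lookup tables with one shared guard. This is a sketch only; the type and function names are hypothetical, and where the guard is enforced inside ProjectDO is left open.

```typescript
type RunStatus =
  | "pending" | "executable" | "active" | "cancel_requested"
  | "passed" | "failed" | "canceled";
type D1SyncStatus = "needs_create" | "current" | "needs_terminal_update" | "done";
type DispatchStatus = "blocked" | "pending" | "queued" | "started" | "terminal";

// Maps each state to the states it may move to; terminal states map to an empty list.
type TransitionTable<S extends string> = Record<S, readonly S[]>;

const RUN_STATUS_TRANSITIONS: TransitionTable<RunStatus> = {
  pending: ["executable", "canceled"],
  executable: ["active", "failed", "canceled"],
  active: ["cancel_requested", "passed", "failed"],
  cancel_requested: ["canceled", "failed"],
  passed: [],
  failed: [],
  canceled: [],
};

const D1_SYNC_TRANSITIONS: TransitionTable<D1SyncStatus> = {
  needs_create: ["current", "needs_terminal_update"],
  current: ["needs_terminal_update"],
  needs_terminal_update: ["done"],
  done: [],
};

const DISPATCH_TRANSITIONS: TransitionTable<DispatchStatus> = {
  blocked: ["pending", "terminal"],
  pending: ["queued", "terminal"],
  queued: ["started", "terminal"],
  started: ["terminal"],
  terminal: [],
};

function canTransition<S extends string>(
  table: TransitionTable<S>,
  from: S,
  to: S,
): boolean {
  return table[from].indexOf(to) !== -1;
}

// Throws instead of silently writing an illegal state change.
function assertTransition<S extends string>(
  table: TransitionTable<S>,
  from: S,
  to: S,
): void {
  if (!canTransition(table, from, to)) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
}
```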

`project_webhooks`

id TEXT PRIMARY KEY
project_id TEXT NOT NULL
provider TEXT NOT NULL
secret_ciphertext BLOB NOT NULL
secret_key_version INTEGER NOT NULL
secret_nonce BLOB NOT NULL
enabled INTEGER NOT NULL
created_at INTEGER NOT NULL
updated_at INTEGER NOT NULL

`project_webhook_deliveries`

id TEXT PRIMARY KEY
project_id TEXT NOT NULL
provider TEXT NOT NULL
delivery_id TEXT NOT NULL
run_id TEXT NULL
received_at INTEGER NOT NULL

ProjectDO index plan

CREATE UNIQUE INDEX idx_project_webhooks_project_provider ON project_webhooks(project_id, provider);
CREATE INDEX idx_project_webhooks_provider_enabled ON project_webhooks(provider, enabled);
CREATE INDEX idx_project_webhooks_project_enabled ON project_webhooks(project_id, enabled);
CREATE UNIQUE INDEX idx_project_webhook_deliveries_project_provider_delivery ON project_webhook_deliveries(project_id, provider, delivery_id);
CREATE INDEX idx_project_webhook_deliveries_project_received_at ON project_webhook_deliveries(project_id, received_at);
CREATE UNIQUE INDEX idx_project_runs_project_position ON project_runs(project_id, position);
CREATE INDEX idx_project_runs_project_status_position ON project_runs(project_id, status, position);
CREATE UNIQUE INDEX idx_project_runs_run_id ON project_runs(run_id);

The project_state table is primary-key driven and does not need extra indexes in v1.

9.2 RunDO

One RunDO exists per run.

Responsibilities

receive live log events from runner
persist a rolling log tail
own all log stream WebSockets
broadcast to viewers
keep authoritative hot run state during execution
finalize run completion metadata
return minimal trusted run metadata to the Worker when a newly accepted runId is not yet visible in D1
expose run-state and log mutation operations via RPC

RunDO is authoritative for active run state and recent run detail. D1 run_index is the durable query/index layer and may lag while a run is active. RunDO's fetch handler is reserved for the Worker-authenticated WebSocket upgrade path.

SQLite tables in RunDO

`run_meta`

id TEXT PRIMARY KEY
project_id TEXT NOT NULL
status TEXT NOT NULL
trigger_type TEXT NOT NULL
branch TEXT NOT NULL
commit_sha TEXT NULL
current_step INTEGER NULL
started_at INTEGER NULL
finished_at INTEGER NULL
exit_code INTEGER NULL
error_message TEXT NULL

`run_steps`

id TEXT PRIMARY KEY
run_id TEXT NOT NULL
position INTEGER NOT NULL
name TEXT NOT NULL
command TEXT NOT NULL
status TEXT NOT NULL
started_at INTEGER NULL
finished_at INTEGER NULL
exit_code INTEGER NULL

`run_logs`

id TEXT PRIMARY KEY
run_id TEXT NOT NULL
seq INTEGER NOT NULL
stream TEXT NOT NULL
chunk TEXT NOT NULL
created_at INTEGER NOT NULL

RunDO index plan

CREATE UNIQUE INDEX idx_run_logs_run_seq ON run_logs(run_id, seq);
CREATE INDEX idx_run_logs_run_created_at ON run_logs(run_id, created_at);
CREATE UNIQUE INDEX idx_run_steps_run_position ON run_steps(run_id, position);
CREATE INDEX idx_run_meta_project_started_at ON run_meta(project_id, started_at);

The most common queries in RunDO are:

fetch latest log tail for one run
append ordered log chunks
fetch ordered steps for one run

These indexes are designed specifically for those patterns.

10. WebSocket Hibernation design

This is a first-class design decision, not an implementation detail.

Run log streaming must use the Durable Object WebSocket Hibernation API.

10.1 Why Hibernation is mandatory

CI logs are bursty:

large bursts while commands are active
idle gaps during install, network wait, or subprocess silence
viewers can remain attached for long periods

Hibernation is the right fit because:

clients stay connected while the object is evicted from memory
the object wakes automatically on the next event
duration charges do not accrue while the object is sleeping
anvil does not need to pin a RunDO in memory just because a browser tab is open

10.2 Required Hibernation APIs

RunDO must use:

ctx.acceptWebSocket(ws)
ctx.getWebSockets()
ws.serializeAttachment(...)
ws.deserializeAttachment()
ctx.setWebSocketAutoResponse(...)

10.3 Attachment contents

Each WebSocket attachment should store:

runId
userId
connectedAt
lastAckedSeq if incremental replay is later added

10.4 Wake-up behavior

When RunDO wakes after hibernation:

constructor runs again
in-memory state is rebuilt from SQLite and socket attachments
attached sockets are recovered via ctx.getWebSockets()
replay state must not depend on old memory
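
The wake-up rules above can be sketched with a registry that keeps no viewer list of its own. The interfaces are local stand-ins that mirror only the hibernation methods named in section 10.2, the class name is hypothetical, and auto-response setup is omitted.

```typescript
interface ViewerAttachment {
  runId: string;
  userId: string;
  connectedAt: number;
}
// Stand-in for a hibernatable WebSocket.
interface SocketLike {
  serializeAttachment(value: unknown): void;
  deserializeAttachment(): unknown;
  send(message: string): void;
}
// Stand-in for the Durable Object state (ctx).
interface HibernationStateLike {
  acceptWebSocket(ws: SocketLike): void;
  getWebSockets(): SocketLike[];
}

class LogViewerRegistry {
  private readonly ctx: HibernationStateLike;

  constructor(ctx: HibernationStateLike) {
    // Runs again after every wake-up; nothing from the previous instance is assumed.
    this.ctx = ctx;
  }

  attach(ws: SocketLike, meta: ViewerAttachment): void {
    this.ctx.acceptWebSocket(ws);
    // The attachment survives hibernation together with the socket.
    ws.serializeAttachment(meta);
  }

  // Viewer metadata is always recovered from the runtime, never from old memory.
  viewers(): ViewerAttachment[] {
    const out: ViewerAttachment[] = [];
    for (const ws of this.ctx.getWebSockets()) {
      const attachment = ws.deserializeAttachment() as ViewerAttachment | null;
      if (attachment != null) out.push(attachment);
    }
    return out;
  }

  broadcast(seq: number, chunk: string): void {
    const payload = JSON.stringify({ seq, chunk });
    for (const ws of this.ctx.getWebSockets()) ws.send(payload);
  }
}
```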

10.5 Log replay model

For v1:

keep a bounded rolling log tail in RunDO SQLite
cap retained hot log storage at 2 MiB per run
on new WebSocket connection, replay the recent tail
then stream live events

Full log archival is deferred to future R2 integration.
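
The eviction policy behind the bounded tail can be modeled in memory as follows. In RunDO the tail lives in SQLite, so this sketch only illustrates the trimming rule; the class name, the stdout/stderr stream values, and the choice to always keep the newest chunk even if it alone exceeds the cap are assumptions.

```typescript
const MAX_TAIL_BYTES = 2 * 1024 * 1024; // 2 MiB cap per run

interface LogChunk {
  seq: number;
  stream: "stdout" | "stderr";
  chunk: string;
  bytes: number;
}

// Counts UTF-8 bytes without relying on platform encoders.
function utf8ByteLength(text: string): number {
  let total = 0;
  for (let i = 0; i < text.length; i++) {
    const cp = text.codePointAt(i) ?? 0;
    if (cp > 0xffff) i++; // astral code point: skip its low surrogate
    total += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return total;
}

class RollingLogTail {
  private chunks: LogChunk[] = [];
  private totalBytes = 0;
  private nextSeq = 1;
  private readonly capBytes: number;

  constructor(capBytes: number = MAX_TAIL_BYTES) {
    this.capBytes = capBytes;
  }

  // Appends a chunk, then evicts the oldest chunks until the tail fits the cap again.
  append(stream: "stdout" | "stderr", chunk: string): number {
    const entry: LogChunk = {
      seq: this.nextSeq++,
      stream,
      chunk,
      bytes: utf8ByteLength(chunk),
    };
    this.chunks.push(entry);
    this.totalBytes += entry.bytes;
    while (this.totalBytes > this.capBytes && this.chunks.length > 1) {
      const dropped = this.chunks.shift();
      if (dropped) this.totalBytes -= dropped.bytes;
    }
    return entry.seq;
  }

  // Chunks to replay, oldest first, when a new viewer connects.
  replay(): readonly LogChunk[] {
    return this.chunks;
  }

  size(): number {
    return this.totalBytes;
  }
}
```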

10.6 Cost-sensitive behavior

Use auto-response for ping/pong-style keepalive traffic so idle viewers do not wake the object unnecessarily.

10.7 WebSocket auth

Browser WebSocket clients cannot attach an Authorization header during the upgrade flow. For v1, anvil uses a short-lived log-stream ticket stored in KV.

authenticated client calls POST /api/private/runs/:runId/log-ticket
Worker validates session identity and run ownership before minting the ticket
the ticket is stored in KV with runId, userId, and expiry metadata
the ticket is best-effort single-use and should expire after 60 seconds
the browser connects using GET /api/private/runs/:runId/logs?ticket=...
the Worker validates and consumes the ticket before forwarding the upgrade to RunDO
strict global single-use is not required in v1 because KV is eventually consistent; the security boundary is short TTL plus binding the ticket to runId and userId
the Worker forwards trusted authenticated upgrade metadata to RunDO; RunDO must not treat the browser query string as auth material
session identifiers must never appear in WebSocket query strings
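
The ticket checks performed by the Worker before forwarding the upgrade might look like the sketch below. KV reads and writes are left out; the record shape and function names are hypothetical, while the 60-second lifetime and the binding to runId and userId come from the list above.

```typescript
const LOG_TICKET_TTL_SECONDS = 60;

// Value stored in KV under the ticket's key.
interface LogTicketRecord {
  runId: string;
  userId: string;
  expiresAtMs: number;
}

type TicketCheck =
  | { ok: true; userId: string }
  | { ok: false; reason: "missing" | "expired" | "wrong_run" };

// Called after session identity and run ownership have been verified.
function mintLogTicketRecord(runId: string, userId: string, nowMs: number): LogTicketRecord {
  return { runId, userId, expiresAtMs: nowMs + LOG_TICKET_TTL_SECONDS * 1000 };
}

// Validates the record loaded from KV against the run named in the upgrade request.
function checkLogTicket(
  record: LogTicketRecord | null,
  requestedRunId: string,
  nowMs: number,
): TicketCheck {
  if (record === null) return { ok: false, reason: "missing" };
  if (nowMs >= record.expiresAtMs) return { ok: false, reason: "expired" };
  if (record.runId !== requestedRunId) return { ok: false, reason: "wrong_run" };
  // The userId from the record, not anything in the query string, is forwarded to RunDO.
  return { ok: true, userId: record.userId };
}
```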

11. Queue and runner execution

11.1 Queue role

Queues provide durable handoff between trigger ingestion and execution.

Each queue message contains:

{
  "projectId": "prj_...",
  "runId": "run_..."
}

A queue message is a delivery hint, not the source of truth for scheduling.

Cloudflare Queues do not provide strict FIFO delivery guarantees, so v1 must not rely on queue delivery order to preserve per-project execution order.

ProjectDO is authoritative for:

whether a run is still pending
whether a run is currently active
whether a run has been canceled
which pending run is next in FIFO order

The queue consumer must re-check ProjectDO before starting work and must no-op stale, duplicate, canceled, or superseded queue messages.
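
A sketch of the re-check, assuming ProjectDO exposes the ledger status of each accepted run to the consumer. The view shape and function name are hypothetical; only a run that ProjectDO currently reports as executable is started.

```typescript
interface RunQueueMessage {
  projectId: string;
  runId: string;
}

type LedgerStatus =
  | "pending" | "executable" | "active" | "cancel_requested"
  | "passed" | "failed" | "canceled";

// Hypothetical read-only view returned by a ProjectDO RPC call.
interface ProjectRunsView {
  projectId: string;
  statusByRunId: ReadonlyMap<string, LedgerStatus>;
}

type DeliveryDecision =
  | { action: "start" }
  | { action: "ack_noop"; reason: string };

// The message is only a hint; ProjectDO's view decides whether work begins.
function decideDelivery(msg: RunQueueMessage, view: ProjectRunsView): DeliveryDecision {
  if (msg.projectId !== view.projectId) {
    return { action: "ack_noop", reason: "project_mismatch" };
  }
  const status = view.statusByRunId.get(msg.runId);
  if (status === undefined) {
    return { action: "ack_noop", reason: "unknown_run" };
  }
  if (status !== "executable") {
    // Covers stale, duplicate, canceled, completed, and not-yet-promoted runs.
    return { action: "ack_noop", reason: `not_executable:${status}` };
  }
  return { action: "start" };
}
```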

11.1.1 Run acceptance boundary

A run is considered accepted once ProjectDO durably writes the accepted run record to its local SQLite state.

ProjectDO allocates the canonical runId
the accepted run record snapshots the non-secret execution inputs required for execution
the accepted run record is written before D1 sync and queue enqueue are required to succeed
the API returns 202 Accepted with runId after the ProjectDO commit succeeds
D1 run_index creation is a post-acceptance reconciliation step
queue enqueue is a post-acceptance reconciliation step only when the accepted run is currently executable

11.1.2 Queue and reconciliation policy

For v1:

maximum pending accepted runs per project: 20
ProjectDO is the single durable reconciler for queue dispatch and D1 run-summary sync
ProjectDO is also the durable watchdog owner for an active accepted run until terminalization is confirmed
only the currently executable run should have a queue message enqueued
accepted runs behind an active run remain only in the ProjectDO FIFO queue until promoted
when ProjectDO promotes the next pending run to executable, exactly one queue message should be enqueued for that run
queue enqueue failures before execution begins should be retried from ProjectDO with bounded exponential backoff
D1 sync failures for both initial acceptance and terminal completion should be retried from ProjectDO using an alarm or equivalent retry mechanism
if dispatch retries are exhausted before sandbox execution begins, the run is marked failed with a system reason such as dispatch_failed
while a run is active, the queue consumer must periodically heartbeat execution progress to ProjectDO
if the heartbeat becomes stale before a terminal update is recorded, ProjectDO marks the run failed with a system reason such as runner_lost, reconciles D1, releases the active lock, and advances the queue
once a sandbox has started, anvil does not automatically rerun the build on worker-side failure; it only finalizes the accepted run
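
The retry and watchdog rules can be reduced to two small decisions. Every numeric value in this sketch is an illustrative assumption; this section only requires that the backoff be bounded and exponential and that stale heartbeats be detected.

```typescript
// Assumed values, not mandated by this specification.
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;
const MAX_DISPATCH_ATTEMPTS = 6;
const HEARTBEAT_STALE_AFTER_MS = 90000;

// Delay before the next enqueue attempt, or null once retries are exhausted
// (at which point the run would be marked failed with a reason such as dispatch_failed).
function nextDispatchDelayMs(attemptsSoFar: number): number | null {
  if (attemptsSoFar >= MAX_DISPATCH_ATTEMPTS) return null;
  return Math.min(BASE_DELAY_MS * Math.pow(2, attemptsSoFar), MAX_DELAY_MS);
}

// True when the active run should be finalized with a reason such as runner_lost.
function isHeartbeatStale(lastHeartbeatMs: number, nowMs: number): boolean {
  return nowMs - lastHeartbeatMs > HEARTBEAT_STALE_AFTER_MS;
}
```

A ProjectDO alarm handler could call nextDispatchDelayMs with the row's dispatch_attempts value to schedule the next alarm.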

11.1.3 Platform execution limits

For v1:

queue consumer invocations have a 15 minute wall-clock limit on Cloudflare
the queue consumer Worker should run on a paid plan with limits.cpu_ms set to 300000
whole-run timeout must stay below the queue consumer wall-clock limit so checkout, reconciliation, and cleanup have headroom
the queue consumer should use Sandbox SDK WebSocket transport to avoid per-operation subrequest pressure
active CI sandboxes should use keepAlive: true and must always be explicitly destroyed

11.2 Queue consumer responsibilities

load project summary from D1, including the latest encrypted repository credential metadata if present
call ProjectDO RPC to confirm run ownership and queue state and retrieve the accepted-run execution snapshot
no-op the message if ProjectDO reports the run is stale, duplicate, canceled, already completed, or not the current executable run
treat a message for a non-executable run as an unexpected but tolerated stale delivery and emit a structured log or metric before acknowledging it
create Sandbox with keepAlive: true
use the Sandbox SDK to check out the repository inside the Sandbox
load the repository config from the snapshotted config_path
validate config with util-en-garde
transition the run through starting and running in RunDO via RPC
create step rows in RunDO via RPC
start heartbeat updates to ProjectDO while the run is active
run named steps sequentially
stream output to RunDO via RPC using batched/coalesced log appends rather than one-row-per-small-fragment writes
finalize run in RunDO via RPC and report the terminal summary back to ProjectDO
let ProjectDO perform or retry the D1 run-summary sync
release the project lock in ProjectDO via RPC
advance the ProjectDO FIFO queue via RPC if another pending run exists and enqueue exactly one queue message for the newly promoted executable run
destroy sandbox in finally

11.3 Failure boundaries

The queue consumer is responsible for best-effort cleanup on:

sandbox startup failure
checkout failure
config parse failure
command non-zero exit
worker-side exception

RunDO should still receive a terminal state update for all of those paths. If a command timeout or cancellation occurs, the queue consumer must explicitly terminate the underlying Sandbox process or session and must not assume the SDK timeout alone has stopped execution.

11.4 Runner model and cancellation semantics

The runner model must make cancellation explicit.

Each executing build step must run in a way that exposes a controllable Sandbox process or session handle.

Required semantics:

soft cancel attempts to stop execution at the running process boundary or via a graceful process signal
hard cancel escalates by explicitly killing the Sandbox process group, session, or sandbox when graceful shutdown does not complete within 30 seconds
command timeout alone is not sufficient as a cancellation mechanism; the implementation must actively terminate the underlying process or session because Sandbox SDK command timeouts only end the caller-side wait
the next FIFO run must not be promoted until the active run is confirmed stopped
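
The escalation order can be sketched as a pure decision function that the consumer or ProjectDO re-evaluates periodically. The state shape and names are hypothetical; the 30-second grace period comes from the semantics above.

```typescript
const GRACE_PERIOD_MS = 30 * 1000;

type CancelStep = "send_soft_cancel" | "wait" | "hard_kill" | "confirmed_stopped";

interface CancelState {
  cancelRequestedAtMs: number;
  softCancelSent: boolean;
  processStopped: boolean; // set only after the Sandbox process or session is confirmed gone
}

// Returns the next action for a run whose cancellation has been requested.
function nextCancelStep(state: CancelState, nowMs: number): CancelStep {
  if (state.processStopped) return "confirmed_stopped";
  if (!state.softCancelSent) return "send_soft_cancel";
  if (nowMs - state.cancelRequestedAtMs >= GRACE_PERIOD_MS) return "hard_kill";
  return "wait";
}

// The next FIFO run may be promoted only once the active run is confirmed stopped.
function mayPromoteNextRun(state: CancelState): boolean {
  return state.processStopped;
}
```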

11.5 Platform runner image

v1 uses one platform-owned runner image. Repositories cannot choose or override the runner image in .anvil.yml.

Recommended image source:

docker/runner.Dockerfile

Recommended Dockerfile:

ARG SANDBOX_VERSION=0.7.0
FROM docker.io/cloudflare/sandbox:${SANDBOX_VERSION}-python

ENV DEBIAN_FRONTEND=noninteractive \
    CI=1 \
    COREPACK_ENABLE_DOWNLOAD_PROMPT=0 \
    NPM_CONFIG_UPDATE_NOTIFIER=false \
    NPM_CONFIG_FUND=false \
    PNPM_HOME=/opt/pnpm \
    PATH=/opt/pnpm:$PATH

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    file \
    git \
    git-lfs \
    jq \
    pkg-config \
    procps \
    rsync \
    unzip \
    wget \
    xz-utils \
    zip \
    && rm -rf /var/lib/apt/lists/*

RUN corepack enable \
    && corepack prepare pnpm@9.15.0 --activate \
    && corepack prepare yarn@4.6.1 --activate

WORKDIR /workspace

Runner contract for v1:

Ubuntu 22.04-based Cloudflare Sandbox image
Node.js 20 LTS with npm
Bun from the Cloudflare base image
Python 3.11 with pip and venv
pnpm and yarn via corepack
common CI utilities including git, git-lfs, curl, wget, jq, zip, unzip, file, procps, rsync, build-essential, and pkg-config

The Docker base image version must stay in lockstep with the @cloudflare/sandbox npm package version used by the application.

11.6 Repository checkout and credentials

Repository checkout in v1 should use a deliberately narrow policy.

Allowed repository URL policy:

repository URLs must use https://
the host must be a normal DNS hostname
embedded credentials in the URL are rejected
query strings and fragments are rejected
explicit non-default ports are rejected
localhost, loopback hosts, and IP-literal hosts are rejected
standard TLS validation is required; self-signed or private CA repositories are unsupported in v1
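
A sketch of the URL checks above, using the standard URL parser. The function name is hypothetical, and requiring at least two DNS labels is one possible reading of "normal DNS hostname". TLS validation happens at fetch time and is not covered here.

```typescript
type RepoUrlCheck = { ok: true; url: string } | { ok: false; reason: string };

// One or more dot-separated labels followed by a final label; letters, digits, hyphens.
const DNS_HOSTNAME =
  /^(?=.{1,253}$)[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)+$/;
const IPV4_LITERAL = /^\d{1,3}(\.\d{1,3}){3}$/;

function checkRepoUrl(raw: string): RepoUrlCheck {
  const fail = (reason: string): RepoUrlCheck => ({ ok: false, reason });
  if (raw.indexOf("?") !== -1 || raw.indexOf("#") !== -1) {
    return fail("query strings and fragments are rejected");
  }
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return fail("not a valid URL");
  }
  if (url.protocol !== "https:") return fail("repository URLs must use https://");
  if (url.username !== "" || url.password !== "") {
    return fail("embedded credentials are rejected");
  }
  // The parser drops an explicit default port (443), so any remaining port is non-default.
  if (url.port !== "") return fail("explicit non-default ports are rejected");
  const host = url.hostname; // already lowercased by the parser
  if (host === "localhost" || host.endsWith(".localhost")) {
    return fail("localhost is rejected");
  }
  // IPv6 literals keep their brackets in hostname; IPv4 literals are dotted digits.
  if (host.startsWith("[") || IPV4_LITERAL.test(host)) {
    return fail("IP-literal hosts are rejected");
  }
  if (!DNS_HOSTNAME.test(host)) return fail("host must be a normal DNS hostname");
  return { ok: true, url: url.href };
}
```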

Private repository credential handling in v1:

each project may store one encrypted repository token in D1
the token is decrypted only for clone or fetch operations
the queue consumer may construct an in-memory credentialed HTTPS URL in the provider's supported PAT format and pass it directly to sandbox.gitCheckout(...)
the credentialed URL is an ephemeral runtime value only and must never be stored in D1, ProjectDO, RunDO, .git/config, or any persisted repository config files
the clean repository URL stored in D1 must remain uncredentialed
checkout failures and runner logs must redact credentialed URLs and tokens before they are emitted or returned
the token must never appear in structured logs, user-visible error messages, or persisted configuration

The v1 checkout model is intentionally limited to keep repository access predictable and avoid leaking credentials through common git transport surfaces.
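
The redaction requirement could be met with a filter applied to checkout errors and log chunks before they leave the queue consumer. This is a sketch with a hypothetical function name; masking any remaining URL userinfo is an extra safeguard assumed here, not something the list above mandates.

```typescript
const REDACTED = "[REDACTED]";

// Replaces every occurrence of each secret, in raw and percent-encoded form,
// then masks any userinfo still present in https URLs.
function redactSecrets(text: string, secrets: readonly string[]): string {
  let out = text;
  for (const secret of secrets) {
    if (secret.length === 0) continue;
    for (const variant of [secret, encodeURIComponent(secret)]) {
      out = out.split(variant).join(REDACTED);
    }
  }
  return out.replace(/(https:\/\/)[^\/\s@]+@/g, `$1${REDACTED}@`);
}
```

For example, a git error that echoes a credentialed URL comes out with the userinfo replaced, while a clean repository URL passes through unchanged.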

12. Repository configuration

Pipeline configuration is repository-defined.

12.1 Default path

.anvil.yml

12.2 Optional override

Projects may store a custom path in D1, for example:

.config/anvil.yml
ci/anvil.yml

For v1, config_path must be repo-relative. Absolute paths and path traversal such as .. are rejected.
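
A sketch of the repo-relative path rule. The function name is hypothetical, and the length cap, the backslash handling, and the drive-letter check are defensive assumptions that go beyond the text above.

```typescript
// Returns true only for paths that stay inside the repository checkout.
function isSafeRepoRelativePath(path: string): boolean {
  if (path.length === 0 || path.length > 512) return false; // assumed length cap
  if (path.startsWith("/") || path.startsWith("\\")) return false; // absolute path
  if (/^[A-Za-z]:/.test(path)) return false; // drive-letter absolute path
  if (path.indexOf("\0") !== -1) return false; // embedded NUL byte
  // Reject any ".." segment, whichever separator is used.
  return path.split(/[\\/]/).every((segment) => segment !== "..");
}
```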

12.3 v1 config schema

version: 1
checkout:
  depth: 1
run:
  workingDirectory: .
  timeoutSeconds: 720
  steps:
    - name: install
      run: npm ci
    - name: test
      run: npm test
    - name: build
      run: npm run build

v1 step shape is intentionally minimal:

name
run

run.timeoutSeconds is a whole-run timeout, not a per-step timeout.

12.3.1 v1 config limits

For v1:

maximum config file size: 64 KiB
maximum step count: 20
maximum step name length: 64
maximum step command length: 4096 bytes
maximum run.timeoutSeconds: 720
workingDirectory must be repo-relative
absolute paths are rejected
path traversal such as .. is rejected

12.4 Validation behavior

The config file is validated after checkout.

If validation fails:

the run is marked failed
a structured warn or error log line is emitted
no build commands are executed

Validation rules:

unknown top-level fields must be rejected
unknown step-level fields must be rejected
config values exceeding the v1 limits above must be rejected
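
Rejecting unknown fields could look like the following sketch, which works on an already-parsed config object. It uses plain TypeScript rather than util-en-garde codecs, whose API is not shown in this section, and the name findUnknownFields is hypothetical. The allowed key sets are taken from the v1 schema example.

```typescript
const TOP_LEVEL_KEYS = new Set(["version", "checkout", "run"]);
const STEP_KEYS = new Set(["name", "run"]);

// Returns the paths of all unknown top-level and step-level fields.
function findUnknownFields(config: Record<string, unknown>): string[] {
  const unknown: string[] = [];
  for (const key of Object.keys(config)) {
    if (!TOP_LEVEL_KEYS.has(key)) unknown.push(key);
  }
  const run = config["run"];
  const steps =
    typeof run === "object" && run !== null
      ? (run as Record<string, unknown>)["steps"]
      : undefined;
  if (Array.isArray(steps)) {
    steps.forEach((step, i) => {
      if (typeof step !== "object" || step === null) return;
      for (const key of Object.keys(step)) {
        if (!STEP_KEYS.has(key)) unknown.push(`run.steps[${i}].${key}`);
      }
    });
  }
  return unknown;
}
```

A non-empty result would fail the run before any build command executes.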

12.5 Reserved future expansion fields

v1 keeps repository config intentionally small, but the schema should leave room for future expansion such as:

environment variables
cache hints
artifact declarations
image selection
conditional steps

These are not implemented in v1. The v1 runner image is platform-owned and cannot be selected from .anvil.yml.

13. D1 schema

All D1 id columns use the canonical prefixed identifier format defined in section 3.4.

13.1 users

id TEXT PRIMARY KEY
slug TEXT NOT NULL UNIQUE
email TEXT NOT NULL UNIQUE
display_name TEXT NOT NULL
created_at INTEGER NOT NULL
disabled_at INTEGER NULL

users.slug is the canonical owner slug.

Indexes:

CREATE UNIQUE INDEX idx_users_slug ON users(slug);
CREATE UNIQUE INDEX idx_users_email ON users(email);

13.2 password_credentials

user_id TEXT PRIMARY KEY
algorithm TEXT NOT NULL
digest TEXT NOT NULL
iterations INTEGER NOT NULL
salt BLOB NOT NULL
password_hash BLOB NOT NULL
updated_at INTEGER NOT NULL

13.3 projects

id TEXT PRIMARY KEY
owner_user_id TEXT NOT NULL
owner_slug TEXT NOT NULL
project_slug TEXT NOT NULL
name TEXT NOT NULL
repo_url TEXT NOT NULL
default_branch TEXT NOT NULL
config_path TEXT NOT NULL DEFAULT '.anvil.yml'
repo_token_ciphertext BLOB NULL
repo_token_key_version INTEGER NULL
repo_token_nonce BLOB NULL
created_at INTEGER NOT NULL
updated_at INTEGER NOT NULL

projects.owner_slug is a denormalized copy of users.slug kept for owner-scoped lookup efficiency.

Indexes:

CREATE UNIQUE INDEX idx_projects_owner_project_slug ON projects(owner_slug, project_slug);
CREATE INDEX idx_projects_owner_user_updated_at ON projects(owner_user_id, updated_at DESC);
CREATE INDEX idx_projects_updated_at ON projects(updated_at DESC);

13.4 invites

id TEXT PRIMARY KEY
created_by_user_id TEXT NOT NULL
token_hash BLOB NOT NULL
expires_at INTEGER NOT NULL
accepted_by_user_id TEXT NULL
accepted_at INTEGER NULL
created_at INTEGER NOT NULL

Indexes:

CREATE UNIQUE INDEX idx_invites_token_hash ON invites(token_hash);
CREATE INDEX idx_invites_created_by_created_at ON invites(created_by_user_id, created_at DESC);
CREATE INDEX idx_invites_expires_at ON invites(expires_at);

13.5 run_index

id TEXT PRIMARY KEY
project_id TEXT NOT NULL
triggered_by_user_id TEXT NULL
trigger_type TEXT NOT NULL
branch TEXT NOT NULL
commit_sha TEXT NULL
status TEXT NOT NULL
queued_at INTEGER NOT NULL
started_at INTEGER NULL
finished_at INTEGER NULL
exit_code INTEGER NULL

run_index is the last-synced durable summary in D1. While a run is active, RunDO remains authoritative and D1 status may lag. Immediately after acceptance, the D1 row may be temporarily absent until ProjectDO reconciliation succeeds.

Indexes:

CREATE INDEX idx_run_index_project_queued_at ON run_index(project_id, queued_at DESC);
CREATE INDEX idx_run_index_project_started_at ON run_index(project_id, started_at DESC);
CREATE INDEX idx_run_index_user_queued_at ON run_index(triggered_by_user_id, queued_at DESC);
CREATE INDEX idx_run_index_status_queued_at ON run_index(status, queued_at DESC);

13.6 Query patterns these indexes support

list projects for current user
fetch one project by owner-scoped slug or id
resolve owner-scoped public webhook routes efficiently
list recent runs for one project using keyset pagination, not offset pagination
list recent runs initiated by one user using keyset pagination, not offset pagination
fetch one run summary by id
create and redeem invite tokens efficiently
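
As an illustration of keyset pagination over run_index, the sketch below builds a parameterized query for one project's recent runs. The function name buildRecentRunsQuery, the cursor shape, and the use of id as a tiebreaker for equal queued_at values are assumptions, not part of the schema above.

```typescript
interface RunCursor { queuedAt: number; id: string; }
interface Query { sql: string; params: (string | number)[]; }

// Builds a keyset-paginated query; pass the last row of the previous page as cursor.
function buildRecentRunsQuery(projectId: string, limit: number, cursor?: RunCursor): Query {
  const base = "SELECT id, status, branch, queued_at FROM run_index WHERE project_id = ?";
  const order = " ORDER BY queued_at DESC, id DESC LIMIT ?";
  if (!cursor) {
    return { sql: base + order, params: [projectId, limit] };
  }
  // Continue strictly after the cursor row instead of skipping rows with OFFSET.
  const after = " AND (queued_at < ? OR (queued_at = ? AND id < ?))";
  return {
    sql: base + after + order,
    params: [projectId, cursor.queuedAt, cursor.queuedAt, cursor.id, limit],
  };
}
```

Unlike offset pagination, the cost of fetching a page does not grow with how deep the caller has paged.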

14. Private API auth and authorization flow

For every private route:

Read the session identifier from the request, typically from the Authorization header.
Resolve session in KV.
Reject if missing or expired.
Open D1 session.
Read project ownership or resource ownership from D1.
Validate request payload and derive the target Durable Object public ID if applicable.
If Durable Object state is needed, invoke the target object via RPC using trusted typed inputs.
Shape the HTTP response in the Worker.

The session in KV identifies the user. The authoritative authorization checks still happen in D1. Durable Objects must not read browser session headers or perform primary authentication for private routes.

14.1 Run-scoped route resolution during D1 lag

runId may exist before its D1 run_index row is visible because ProjectDO accepts the run before reconciliation completes.

For private run-scoped routes such as:

GET /api/private/runs/:runId
POST /api/private/runs/:runId/cancel
POST /api/private/runs/:runId/log-ticket

the Worker should:

validate the session via KV
attempt to resolve the run from D1 run_index
if the D1 row is missing, call RunDO using runId to fetch minimal trusted metadata such as projectId and current run status
authorize the caller by checking project ownership in D1 using that projectId
continue with the route-specific logic

This preserves D1 as the source of authorization while allowing newly accepted runs to be queried or canceled immediately.
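
The fallback order above can be sketched with in-memory maps standing in for KV, D1, and the RunDO metadata RPC. All names here are hypothetical, and the HTTP status codes are assumptions, since this section does not specify them.

```typescript
interface RunRef { projectId: string; }
interface RunStores {
  sessions: Map<string, { userId: string }>; // stand-in for KV sessions
  runIndex: Map<string, RunRef>;             // stand-in for D1 run_index
  runDoMeta: Map<string, RunRef>;            // stand-in for RunDO metadata RPC
  projectOwners: Map<string, string>;        // stand-in for D1 projects: id -> owner_user_id
}
type Resolution =
  | { ok: true; projectId: string; source: "d1" | "run_do" }
  | { ok: false; status: 401 | 403 | 404 };

function resolveRunAccess(stores: RunStores, sessionId: string, runId: string): Resolution {
  const session = stores.sessions.get(sessionId);
  if (!session) return { ok: false, status: 401 };
  // Prefer the D1 summary row; fall back to RunDO only when the row is missing.
  const fromD1 = stores.runIndex.get(runId);
  const run = fromD1 ?? stores.runDoMeta.get(runId);
  if (!run) return { ok: false, status: 404 };
  // Ownership is always checked against D1, whichever source supplied projectId.
  if (stores.projectOwners.get(run.projectId) !== session.userId) {
    return { ok: false, status: 403 };
  }
  return { ok: true, projectId: run.projectId, source: fromD1 ? "d1" : "run_do" };
}
```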

14.2 WebSocket log stream auth flow

GET /api/private/runs/:runId/logs is authenticated by a short-lived log-stream ticket rather than by the Authorization header.

Client calls POST /api/private/runs/:runId/log-ticket.
Worker validates the session via KV.
Worker checks run ownership via D1 or, if needed during reconciliation lag, via the RunDO-assisted ownership flow above.
Worker stores a short-lived best-effort single-use ticket in KV.
Client opens GET /api/private/runs/:runId/logs?ticket=....
Worker validates and consumes the ticket.
Worker forwards the authenticated WebSocket upgrade to RunDO.
RunDO attaches the socket using trusted Worker-provided auth context.
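
The mint and consume steps can be sketched with an in-memory map standing in for KV. The class name TicketStore and its method signatures are hypothetical. Because KV is eventually consistent, a real implementation can only offer the best-effort single-use guarantee described above, whereas this stand-in is strictly single-use.

```typescript
interface Ticket { runId: string; userId: string; expiresAt: number; }

// In-memory stand-in for the KV-backed ticket store.
class TicketStore {
  private readonly tickets = new Map<string, Ticket>();

  mint(ticketId: string, runId: string, userId: string, now: number, ttlMs: number): void {
    this.tickets.set(ticketId, { runId, userId, expiresAt: now + ttlMs });
  }

  // Returns the user id if the ticket is valid for this run, otherwise null.
  // The ticket is deleted on first lookup so that a second use fails.
  consume(ticketId: string, runId: string, now: number): string | null {
    const ticket = this.tickets.get(ticketId);
    if (!ticket) return null;
    this.tickets.delete(ticketId);
    if (ticket.runId !== runId || now >= ticket.expiresAt) return null;
    return ticket.userId;
  }
}
```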

15. Public webhook flow

Webhook configuration is owned by ProjectDO, not D1.

15.1 Public webhook ingress steps

Request arrives at /api/public/hooks/:provider/:ownerSlug/:projectSlug.
Worker frontdoor resolves (ownerSlug, projectSlug) -> projectId from D1.
Worker frontdoor derives ProjectDO from idFromName(projectId).
Worker calls ProjectDO RPC to load the minimal webhook verification material and project-local webhook settings for that provider.
Worker authenticates the incoming webhook request using the provider signature scheme.
Worker normalizes the verified event and applies event-type and branch policy checks.
Worker calls ProjectDO RPC to deduplicate the delivery and accept the verified trigger.
If accepted, ProjectDO allocates runId and durably records the accepted run.
Worker attempts to write the D1 summary row and, if the run is currently executable, enqueue it.
If D1 sync or enqueue fails, ProjectDO retains reconciliation state and retries later.

15.2 v1 provider and event scope

Supported providers in v1:

GitHub
GitLab
Gitea

Webhook management scope in v1:

users configure provider webhooks manually in the upstream provider UI
anvil stores provider verification material and enablement state
anvil does not create, update, or delete provider webhooks through provider APIs in v1

Webhook trigger policy in v1:

only push events create runs
only pushes to projects.default_branch create runs
provider ping/test events return success but do not create runs
duplicate webhook deliveries must be deduplicated by (project_id, provider, delivery_id) for 72 hours
manual triggers are not deduplicated
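
A minimal sketch of the trigger policy and the 72-hour dedupe window, operating on an already verified and normalized event. The type and class names are hypothetical, the branch is assumed to be normalized to a bare branch name, and the in-memory map stands in for the ProjectDO delivery records.

```typescript
type WebhookEvent =
  | { kind: "push"; branch: string }
  | { kind: "ping" }
  | { kind: "other" };
type Decision = "create_run" | "no_run";

// Only pushes to the project's default branch create runs.
function decideWebhook(event: WebhookEvent, defaultBranch: string): Decision {
  return event.kind === "push" && event.branch === defaultBranch ? "create_run" : "no_run";
}

const DEDUPE_WINDOW_MS = 72 * 60 * 60 * 1000;

class DeliveryDeduper {
  private readonly seen = new Map<string, number>(); // key -> first-seen time in ms

  // Returns true and records the delivery if it is new within the window.
  accept(projectId: string, provider: string, deliveryId: string, now: number): boolean {
    const key = JSON.stringify([projectId, provider, deliveryId]);
    const firstSeen = this.seen.get(key);
    if (firstSeen !== undefined && now - firstSeen < DEDUPE_WINDOW_MS) return false;
    this.seen.set(key, now);
    return true;
  }
}
```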

15.3 Why slug lookup still uses D1

The webhook config itself is not in D1. Only the stable mapping from public owner-scoped slug to project identity is in D1.

Slug policy for v1:

allowed characters: alphanumeric, hyphen (-), underscore (_)
user slug is chosen once at signup
rename flow is deferred
project slug is unique within an owner scope

This keeps webhook secrets localized to the project actor while preserving simple public routing and a Worker-owned authentication boundary.
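
The allowed-character rule in the slug policy above can be expressed as a single pattern. This sketch assumes ASCII alphanumerics and does not model length limits or case normalization, which this section leaves unspecified. The names SLUG_PATTERN and isValidSlug are hypothetical.

```typescript
// Alphanumeric, hyphen, and underscore only; at least one character.
const SLUG_PATTERN = /^[A-Za-z0-9_-]+$/;

function isValidSlug(slug: string): boolean {
  return SLUG_PATTERN.test(slug);
}
```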

16. Project and run lifecycle

16.1 Project creation

Authenticated user calls POST /api/private/projects.
Worker uses D1 primary session.
Worker inserts projects row with owner identity.
Worker returns project summary.
ProjectDO is created lazily on first use.

16.2 Manual run trigger

Authenticated user calls POST /api/private/projects/:projectId/runs.
Worker validates session via KV.
Worker checks project ownership via D1.
Request may include an optional branch override; if omitted, anvil uses projects.default_branch.
Worker calls ProjectDO RPC to accept the run.
ProjectDO allocates runId, snapshots the non-secret execution inputs for the run, records the accepted run, and initializes RunDO.
Worker returns 202 Accepted with runId.
Worker attempts to insert the run_index row in D1.
Worker enqueues the run only if ProjectDO reports that it is currently executable.
If D1 sync or enqueue fails, ProjectDO retries reconciliation asynchronously.

16.2.1 Run cancellation

Users may cancel:

the active run for a project
any pending run in that project's FIFO queue

Cancellation is requested through:

POST /api/private/runs/:runId/cancel

Behavior:

if the run is pending, Worker authorizes the caller and then invokes ProjectDO RPC to remove it from the FIFO queue and mark it canceled
if the run is active, repeated cancel requests are idempotent and do not create a second cancellation workflow
if the run is active, Worker authorizes the caller and then invokes ProjectDO and RunDO via RPC; anvil first attempts a soft cancel at the running process boundary
if soft cancel does not complete in time and the runtime allows it, anvil escalates to a hard kill of the sandbox process or session
RunDO transitions the run toward canceled and ProjectDO advances the next queued run
ProjectDO reconciles D1 run_index to terminal status canceled

16.3 Webhook run trigger

Public webhook request hits WAF-protected prefix.
Worker resolves project identity in D1 using owner-scoped slug.
Worker calls ProjectDO RPC to load verification material for the provider.
Worker authenticates the webhook request and validates provider event type, default-branch policy, and delivery idempotency preconditions.
If the delivery should create a run, Worker calls ProjectDO RPC to accept it and append it to the per-project FIFO queue.
ProjectDO allocates runId, snapshots the non-secret execution inputs for the run, and initializes RunDO.
Worker attempts to insert the run_index row in D1.
Worker enqueues the run only if ProjectDO reports that it is currently executable.
If D1 sync or enqueue fails, ProjectDO retries reconciliation asynchronously.

16.4 Run execution

Queue consumer receives {projectId, runId}.
Queue consumer confirms with ProjectDO that the run is still the current executable run for the project and retrieves the accepted-run execution snapshot.
If ProjectDO reports the message is stale, duplicate, canceled, or not executable, the consumer acknowledges it without creating a Sandbox.
Queue consumer creates Sandbox with keepAlive: true.
Queue consumer uses the accepted-run snapshot and the latest repository token to check out the repository inside the Sandbox.
Sandbox loads the snapshotted config path.
Queue consumer calls RunDO RPC to transition the run from queued to starting.
Validated commands are written to RunDO step rows through RPC.
Queue consumer starts heartbeat updates to ProjectDO and then calls RunDO RPC to transition the run to running.
Commands execute sequentially.
Output chunks stream to RunDO through RPC.
RunDO broadcasts to viewers.
Terminal state is written to RunDO through RPC and reported back to ProjectDO.
ProjectDO updates or retries the D1 run_index terminal sync.
Queue consumer calls ProjectDO RPC to release the lock.
Queue consumer calls ProjectDO RPC to advance the next FIFO pending run, if any, and enqueue exactly one queue message for the newly promoted executable run.
Sandbox is destroyed in finally.
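
The control flow of the steps above, reduced to its skeleton, might look like the following sketch. The RunnerDeps interface and its methods are hypothetical stand-ins for the ProjectDO, RunDO, and Sandbox calls, and mapping a thrown execution error to failed is an assumption consistent with the terminal state guarantee.

```typescript
type TerminalStatus = "passed" | "failed";
interface SandboxHandle { destroy(): Promise<void>; }
interface RunnerDeps {
  confirmExecutable(projectId: string, runId: string): Promise<boolean>;
  createSandbox(): Promise<SandboxHandle>;
  execute(sandbox: SandboxHandle): Promise<TerminalStatus>; // checkout, config, steps
  reportTerminal(runId: string, status: TerminalStatus): Promise<void>;
  releaseAndAdvance(projectId: string, runId: string): Promise<void>;
}

async function handleQueueMessage(
  deps: RunnerDeps,
  msg: { projectId: string; runId: string },
): Promise<"acked_stale" | TerminalStatus> {
  // Stale, duplicate, or canceled messages are acknowledged without a Sandbox.
  if (!(await deps.confirmExecutable(msg.projectId, msg.runId))) return "acked_stale";
  const sandbox = await deps.createSandbox();
  try {
    let status: TerminalStatus;
    try {
      status = await deps.execute(sandbox);
    } catch {
      status = "failed"; // assumption: unexpected errors end the run as failed
    }
    await deps.reportTerminal(msg.runId, status);
    await deps.releaseAndAdvance(msg.projectId, msg.runId);
    return status;
  } finally {
    // Runs on every path once the Sandbox exists.
    await sandbox.destroy();
  }
}
```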

17. Frontend

17.1 App structure

Frontend lives under src/client and is served by the Worker.

Recommended stack:

React Router
TanStack Query
typed API wrapper consuming util-en-garde contracts
frontend auth wrapper storing the session identifier in browser localStorage
frontend D1 bookmark wrapper storing the latest read-replication bookmark in browser localStorage
log stream wrapper that mints short-lived tickets before opening the WebSocket

17.2 Pages

/app/projects
/app/projects/new
/app/projects/:projectId
/app/runs/:runId
/app/login

17.3 Core UI responsibilities

Projects list

show projects owned by current user
show last known run status

Project detail

repo URL
default branch
config path
recent runs
trigger run button
webhook summary
pending queue summary

Run detail

run status
step list
live log panel
reconnecting log stream client

18. Contracts

Shared contracts live under src/contracts.

Recommended files:

auth.ts
project.ts
run.ts
webhook.ts
repo-config.ts
log.ts
common.ts

Required contract types

LoginRequest
LoginResponse
CreateProjectRequest
ProjectSummary
ProjectDetail
TriggerRunRequest
RunSummary
RunDetail
LogStreamTicketResponse
WebhookSummary
UpsertWebhookRequest
WebhookTriggerPayload
RepoConfig
LogEvent

19. Repo layout

anvil/
  src/
    contracts/
      auth.ts
      project.ts
      run.ts
      webhook.ts
      repo-config.ts
      log.ts
      common.ts

    worker/
      index.ts
      env.ts

      api/
        public/
          auth.ts
          webhooks.ts
        private/
          me.ts
          projects.ts
          runs.ts
          webhooks.ts

      auth/
        headers.ts
        sessions.ts
        passwords.ts
        tickets.ts

      durable/
        project-do.ts
        run-do.ts

      queue/
        consumer.ts
        messages.ts

      sandbox/
        runner.ts
        git.ts
        repo-config.ts
        commands.ts

      db/
        d1/
          schema/
          repositories/
        durable/
          schema/
          repositories/
        migrate.ts

      services/
        project-service.ts
        run-service.ts
        webhook-service.ts
        id-service.ts

    client/
      main.tsx
      app.tsx
      router.tsx
      pages/
      components/
      lib/

  drizzle/
    d1/
    durable/

  docker/
    runner.Dockerfile

  public/
  wrangler.jsonc
  package.json
  tsconfig.json

20. Future extension points

20.1 Cloudflare Workflows

Workflows are not part of v1 execution, but the specification should leave room for them.

Likely fit for Workflows later

multi-stage pipelines
retries across long-running steps
durable approval gates
scheduled retries or backoff across external systems
artifact publication or promotion flows
long waits for external events

Probable future shape

v1 uses:

Worker frontdoor
Queue
ProjectDO
RunDO
Sandbox

A future v2 may add Workflows as an orchestration layer above the queue consumer:

trigger accepted
workflow started
workflow step starts sandbox
workflow step waits for completion event
workflow step publishes artifacts or notifies external systems

If Workflows are added later, every step must be designed idempotently.

20.2 R2 for full logs and artifacts

R2 is not in v1, but the specification should reserve a clear role for it.

Future use of R2

full run log archival
uploaded artifacts
test reports
compressed logs for completed runs
build outputs too large for Durable Object SQLite retention

Future log storage split

RunDO SQLite retains only a bounded hot tail for live UI and recent history.
R2 stores immutable completed-run log archives.

Future artifact shape

Suggested key patterns:

logs/{projectId}/{runId}.txt
logs/{projectId}/{runId}.jsonl
artifacts/{projectId}/{runId}/{artifactName}

Suggested metadata rows later

If R2 is added later, add D1 tables such as:

run_archives
run_artifacts

v1 does not implement these.

21. Concurrency rules

21.1 v1 rule

one active run per project
FIFO pending queue per project

21.2 Queue behavior and cancellation

v1 supports:

one active run per project
FIFO pending queue for additional accepted runs
user-initiated cancellation of active runs
user-initiated cancellation of pending runs

ProjectDO is responsible for queue mutation, cancellation, and advancement.

21.3 Ownership of concurrency

ProjectDO is the sole owner of project-level concurrency state.

No other component should attempt to coordinate active-run state or pending-queue state outside ProjectDO.
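
The one-active-run and FIFO rules can be modeled in a few lines. This is an in-memory sketch with hypothetical names. The real ProjectDO would persist this state in its SQLite tables.

```typescript
class ProjectQueueModel {
  private active: string | null = null;
  private readonly pending: string[] = [];

  // Returns true if the run became active and should be enqueued for execution now.
  accept(runId: string): boolean {
    if (this.active === null) {
      this.active = runId;
      return true;
    }
    this.pending.push(runId);
    return false;
  }

  // Removes a pending run; returns false if the run was not pending.
  cancelPending(runId: string): boolean {
    const index = this.pending.indexOf(runId);
    if (index === -1) return false;
    this.pending.splice(index, 1);
    return true;
  }

  // Releases the active run and promotes the oldest pending run, if any.
  // Returns the promoted run id so that exactly one queue message is sent for it.
  releaseAndAdvance(runId: string): string | null {
    if (this.active !== runId) return null; // stale or duplicate release
    this.active = this.pending.shift() ?? null;
    return this.active;
  }
}
```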

22. Error handling and cleanup

22.1 Terminal state guarantee

Every accepted run must end in exactly one terminal state:

passed
failed
canceled

22.1.1 Canonical run statuses

The canonical run status enum for v1 is:

queued
starting
running
cancel_requested
canceling
passed
failed
canceled

RunDO should expose the freshest status. D1 run_index.status uses the same enum but may lag for active runs.

pending is an internal ProjectDO queue concept in v1, not a public run status. Public APIs and persisted run summaries should use only the canonical status enum above.

22.1.2 Allowed status transitions

v1 should allow only these transitions:

queued -> starting
queued -> canceled
starting -> running
starting -> failed
starting -> cancel_requested
running -> passed
running -> failed
running -> cancel_requested
cancel_requested -> canceling
cancel_requested -> canceled
canceling -> canceled
canceling -> failed if forced termination or cleanup fails after cancellation has begun

Terminal states do not transition further.
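
The transition list above maps directly onto a lookup table. The names RunStatus, ALLOWED_TRANSITIONS, and canTransition are hypothetical.

```typescript
type RunStatus =
  | "queued" | "starting" | "running"
  | "cancel_requested" | "canceling"
  | "passed" | "failed" | "canceled";

// Each status maps to the statuses it may move to; terminal states map to none.
const ALLOWED_TRANSITIONS: Record<RunStatus, readonly RunStatus[]> = {
  queued: ["starting", "canceled"],
  starting: ["running", "failed", "cancel_requested"],
  running: ["passed", "failed", "cancel_requested"],
  cancel_requested: ["canceling", "canceled"],
  canceling: ["canceled", "failed"],
  passed: [],
  failed: [],
  canceled: [],
};

function canTransition(from: RunStatus, to: RunStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

RunDO could consult such a table before every status write so that an out-of-order RPC cannot move a run backward.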

22.2 Cleanup responsibilities

queue consumer destroys sandbox
RunDO finalizes run status
ProjectDO reconciles and retries D1 run_index updates
ProjectDO releases active lock

22.2.1 Retention and pruning

For v1:

project_webhook_deliveries rows should be retained for 72 hours
terminal project_runs rows that are fully reconciled to D1 should be pruned after 7 days
RunDO detail state (run_meta, run_steps, and the retained hot log tail) should be retained for 7 days after terminal completion

After RunDO detail retention expires:

D1 run_index remains the durable summary source
GET /api/private/runs/:runId should still return the D1 summary if it exists
the response should indicate that detailed run state is no longer available

22.3 Timeouts

For v1, enforce:

run.timeoutSeconds from repo config as the user-visible whole-run timeout
run.timeoutSeconds must not exceed 720
the configured run timeout must leave headroom within the queue consumer's 15-minute (900-second) wall-clock limit for checkout, reconciliation, cancellation, and cleanup; at the 720-second maximum, 180 seconds remain
the queue consumer Worker should run with limits.cpu_ms set to 300000
internal platform safety timeouts may exist, but they are implementation details rather than user-configurable step timeouts

ProjectDO must use an alarm or equivalent watchdog mechanism to detect stale active-run heartbeats and recover orphaned runs in v1.
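
The watchdog check itself can be very small. In this sketch the function name findOrphanedRun is hypothetical and the staleness threshold is a caller-supplied assumption, since this section does not fix a heartbeat interval.

```typescript
interface ActiveRun { runId: string; lastHeartbeatAt: number; } // time in ms

// Returns the run id to recover if its heartbeat is older than the threshold.
function findOrphanedRun(active: ActiveRun | null, now: number, staleAfterMs: number): string | null {
  if (active === null) return null;
  return now - active.lastHeartbeatAt > staleAfterMs ? active.runId : null;
}
```

An alarm handler could call this on each wake-up and reschedule itself while a run is active.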

22.4 Structured logging and observability

Structured, leveled logging is required in v1.

All runtime components should emit structured JSON logs:

Worker frontdoor
ProjectDO
RunDO
queue consumer

Required log levels:

debug
info
warn
error

Minimum required fields on every log event:

ts
level
event
component

Include these contextual fields whenever available:

requestId
projectId
runId
userId
queueMessageId
provider
deliveryId
attempt
status
errorCode

Required structured log events include at least:

run acceptance
queue dispatch retry
D1 sync retry
stale queue delivery
sandbox startup failure
checkout failure
config validation failure
run cancellation request
cancel escalation to hard kill
watchdog recovery of an orphaned run

Structured logs must never contain:

repository tokens or PATs
session identifiers
webhook secrets
log-stream tickets
raw Authorization headers
credentialed repository URLs
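
A minimal sketch of a log line builder that always sets the four required fields and drops context keys that should never be logged. The function name, the forbidden key names, and the ISO 8601 format for ts are assumptions. Key-based filtering is only a second line of defense and does not replace redacting secrets inside message text.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Hypothetical context key names that must never reach a log line.
const FORBIDDEN_KEYS = new Set([
  "token", "pat", "sessionId", "webhookSecret", "ticket", "authorization", "credentialedUrl",
]);

function buildLogLine(
  level: LogLevel,
  component: string,
  event: string,
  context: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(context)) {
    if (!FORBIDDEN_KEYS.has(key)) safe[key] = value;
  }
  // Required fields are spread last so context cannot overwrite them.
  return JSON.stringify({ ...safe, ts: now.toISOString(), level, event, component });
}
```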

23. Security model

23.1 Secrets and encryption

webhook secrets live in ProjectDO SQLite as encrypted blobs
user-provided repository tokens are encrypted before being stored in D1
stored repository tokens are used for Git access only in v1
plaintext repository tokens and webhook secrets are never persisted in D1, KV, or Durable Object SQLite
short-lived WebSocket log-stream tickets live in KV
password hashes are derived with PBKDF2 using a per-user random salt

23.1.1 Versioned master-key encryption for stored project tokens

anvil should support storing one user-provided repository token per project in D1 using application-level encryption.

Recommended v1 design:

one global app master key in the Worker environment
the master key has a monotonically increasing integer version
when a user saves a token, anvil encrypts it before writing to D1
the D1 project row stores ciphertext plus the key version and nonce/IV
reads decrypt using the master key matching the stored version
future key rotation is performed by introducing a new version and re-encrypting rows over time

Suggested storage fields per encrypted token:

repo_token_ciphertext
repo_token_key_version
repo_token_nonce

The exact cipher can be implementation-defined, but it should be an authenticated encryption mode. The important invariant for the specification is that token plaintext never lands in the database.
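
The versioning rules above can be sketched independently of the cipher. The example below shows only key selection and rotation detection. The function names are hypothetical, the key type is left generic, and the encryption call itself is deliberately omitted.

```typescript
interface EncryptedTokenRow { keyVersion: number; }

// Picks the master key matching the version stored with the ciphertext.
function selectMasterKey<K>(keys: ReadonlyMap<number, K>, row: EncryptedTokenRow): K {
  const key = keys.get(row.keyVersion);
  if (key === undefined) {
    throw new Error(`no master key configured for version ${row.keyVersion}`);
  }
  return key;
}

// The highest configured version is used for all new writes.
function currentKeyVersion(keys: ReadonlyMap<number, unknown>): number {
  if (keys.size === 0) throw new Error("no master keys configured");
  return Math.max(...Array.from(keys.keys()));
}

// Rows written under an older version are candidates for re-encryption.
function needsReencryption(keys: ReadonlyMap<number, unknown>, row: EncryptedTokenRow): boolean {
  return row.keyVersion < currentKeyVersion(keys);
}
```

Keeping older versions configured until every row reports needsReencryption as false allows rotation to proceed gradually.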

23.1.2 Versioned master-key encryption for stored webhook secrets

anvil should support storing one webhook secret per provider per project in ProjectDO SQLite using the same application-level encryption model.

Recommended v1 design:

use the same master-key versioning strategy as encrypted repository tokens
encrypt the webhook secret before writing it to project_webhooks
store ciphertext plus the key version and nonce/IV alongside the webhook row
decrypt only in the Worker-owned webhook verification path before invoking ProjectDO acceptance RPC

Suggested storage fields per encrypted webhook secret:

secret_ciphertext
secret_key_version
secret_nonce

23.2 Public edge protection

all public routes live under /api/public/*
a single WAF rate limit rule protects that prefix
login and webhook ingress share the same outer rate limit boundary

23.3 Authorization

KV authenticates session identity
D1 authorizes project ownership in v1
the Worker authenticates and authorizes both private API requests and public webhook requests before invoking Durable Objects
ProjectDO and RunDO enforce only trusted RPC invariants and object-local state transitions
private API requests carry the opaque session identifier explicitly rather than relying on browser cookies
WebSocket log streaming is authorized by a short-lived, best-effort single-use KV ticket after D1 ownership verification

23.4 Invite-only onboarding

v1 is invite-only.

Recommended D1 table:

invites
- hashed invite token
- inviter user id
- expiry
- accepted by user id
- accepted at

v1 invite semantics:

any registered user may generate an invite link
invite links carry a simple opaque token
the stored database value should be a hash of that token, not the raw token itself
only a valid invite token allows a new user record to be created in v1
v1 does not impose per-user invite caps or invite-specific application rate limits beyond normal authenticated route protections
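
A redeemability check over an invites row might look like this sketch. Treating an invite as single-use is an assumption inferred from the accepted_by_user_id and accepted_at columns. The function name is hypothetical, and timestamps are assumed to share one unit with the supplied current time.

```typescript
interface InviteRow { expiresAt: number; acceptedAt: number | null; }

// An invite is redeemable if the row exists, has not been accepted, and has not expired.
function isInviteRedeemable(invite: InviteRow | undefined, now: number): boolean {
  return invite !== undefined && invite.acceptedAt === null && now < invite.expiresAt;
}
```

The row would be looked up by the hash of the presented token, so the raw token never needs to be stored.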

24. Recommended implementation order

Implementation should be split into separate backend and frontend tracks. Each phase should deliver a coherent product slice and minimize dependencies on unfinished work in other phases.

24.1 Backend

Backend Phase 1: foundation and access control

repo skeleton
shared contracts under src/contracts
.anvil.yml schema
D1 schema + Drizzle setup
canonical ID generator and prefix conventions
structured logger foundation
KV session helper
login route
private auth middleware
invite generation and invite acceptance flow

Backend Phase 2: project management API

GET /api/private/me
GET /api/private/projects
POST /api/private/projects
PATCH /api/private/projects/:projectId
D1 read and primary session helpers
project ownership checks in D1
repository URL validation
config_path validation
encrypted repository token storage in D1

Backend Phase 3: manual run execution MVP

ProjectDO schema and project-local coordination state
accepted-run ledger and FIFO queue logic
minimal RunDO schema for run metadata, steps, and rolling logs
GET /api/private/projects/:projectId
GET /api/private/projects/:projectId/runs
GET /api/private/runs/:runId
POST /api/private/projects/:projectId/runs
queue message contract and queue consumer
platform runner image and docker/runner.Dockerfile
Sandbox runner
repository checkout flow
repository config parsing and validation
D1 run_index creation and terminal update reconciliation

Backend Phase 4: live run control and recovery

POST /api/private/runs/:runId/cancel
POST /api/private/runs/:runId/log-ticket
authenticated WebSocket upgrade flow for GET /api/private/runs/:runId/logs
RunDO WebSocket Hibernation implementation
rolling tail replay for newly attached viewers
active-run heartbeat updates from the queue consumer
ProjectDO watchdog recovery for stale active runs
queue dispatch retry and stale delivery handling
D1 sync retry for accepted and terminal runs
cancel flow for pending and active runs

Backend Phase 5: webhook automation

GET /api/private/projects/:projectId/webhooks
PUT /api/private/projects/:projectId/webhooks/:provider
POST /api/private/projects/:projectId/webhooks/:provider/rotate-secret
DELETE /api/private/projects/:projectId/webhooks/:provider
ProjectDO webhook configuration and encrypted secret storage
public webhook ingress route
provider-specific verification adapters for GitHub, GitLab, and Gitea
webhook delivery dedupe
default-branch push trigger policy

24.2 Frontend

Frontend work should begin as soon as the corresponding backend slice exposes stable contracts and routes. The frontend track does not need to wait for the entire backend track to be complete.

Frontend Phase 1: app shell and project management

app shell and route structure
frontend auth wrapper using localStorage
typed API wrapper consuming shared contracts
frontend D1 bookmark wrapper using localStorage
login page
projects list page
create project page

Frontend Phase 2: project operations

project detail page
recent runs list
manual trigger run action
polling-based run status refresh
project metadata display for repository URL, default branch, and config path
queue and active-run summary display

Frontend Phase 3: live run UX

run detail page
live log panel
reconnecting log stream client
cancel run action
run state presentation for active, canceling, canceled, failed, and passed runs

Frontend Phase 4: webhook settings

webhook settings UI
webhook provider summary display
secret rotation and provider enablement flows

25. Open questions intentionally deferred

session rotation policy on privileged operations
whether logout should blacklist old sessions beyond KV delete
exact R2 retention policy when archives are added
whether Workflows should replace the queue consumer or sit above it
whether local password auth will remain mandatory once OAuth/SAML arrive

26. Summary

anvil v1 should be built around a simple but strong architecture:

KV for short-lived session state
D1 for relational control-plane data
ProjectDO for project-local coordination, accepted-run reconciliation, FIFO run queue, and webhook config
RunDO for hot run state and log fanout
WebSocket Hibernation as the default log-stream transport
Queue + Sandbox for execution on a platform-owned runner image
repo-defined config from .anvil.yml

This design keeps each Cloudflare product aligned with the kind of state it handles best, while leaving clean extension points for Workflows, R2 log archiving, and artifacts later.
