From cd0a9dd2879e2745fab36149f817d0e89c865e1a Mon Sep 17 00:00:00 2001
From: Chris Alfano
Date: Tue, 19 May 2026 09:27:59 -0400
Subject: [PATCH 1/4] docs(specs): add hot-reload webhook to storage behavior + plan

Captures the contract for POST /api/_internal/reload-data: bearer-auth,
optional body, cheap ancestry pre-check, lock-protected reconcile,
in-place state rebuild. Plan tracks #65.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 plans/hot-reload-webhook.md | 63 +++++++++++++++++++++++++++++++++++++
 specs/behaviors/storage.md  | 16 ++++++++++
 2 files changed, 79 insertions(+)
 create mode 100644 plans/hot-reload-webhook.md

diff --git a/plans/hot-reload-webhook.md b/plans/hot-reload-webhook.md
new file mode 100644
index 0000000..56f8e78
--- /dev/null
+++ b/plans/hot-reload-webhook.md
@@ -0,0 +1,63 @@
+---
+status: in-progress
+depends: []
+specs:
+  - specs/behaviors/storage.md
+issues: [65]
+---
+
+# Plan: Hot-reload webhook for the public data branch
+
+## Scope
+
+Add `POST /api/_internal/reload-data` — an authenticated webhook that pulls the latest commit on the configured `CFP_DATA_BRANCH` and atomically rebuilds the in-memory state, so a push to `published` propagates to the running pod without a restart.
+
+Triggered by a GitHub Actions workflow living on the `codeforphilly-data` repo (delivered as PR-body YAML, not committed here).
+
+Out of scope: HMAC payload signing, multi-pod fanout, push-side schema validation in the GH Action.
+
+## Implements
+
+- [specs/behaviors/storage.md](../specs/behaviors/storage.md) — new "Hot reload" subsection covering the endpoint's existence + behavior.
+
+## Approach
+
+1. **Env var.** Add `CFP_DATA_RELOAD_SECRET` to `apps/api/src/env.ts` + `envJsonSchema` (optional, min 32 chars). When unset, the route exists but responds 503.
+2. 
**Helper.** New `apps/api/src/store/memory/reload.ts` exports `reloadInMemoryStateAndFts(fastify)` — builds a fresh `InMemoryState` first, then mutates the live state's Maps in a tight synchronous block. Failure during the build leaves the running state untouched; failure during the mutate block is loud and the pod is in undefined state (caller returns 5xx). Adds a `reload(state)` method to `FtsEngine` that drops every FTS5 table's rows and re-inserts.
+3. **Route.** New `apps/api/src/routes/internal.ts` registers `POST /api/_internal/reload-data`:
+   - `schema: { hide: true }` so it's omitted from the public OpenAPI doc.
+   - Bearer-token auth using `crypto.timingSafeEqual` (length-checked first); generic 401 message.
+   - 503 if `CFP_DATA_RELOAD_SECRET` is unset.
+   - Body `{ branch?: string, commitHash?: string }` — both optional, validated via Fastify schema.
+   - **Cheap pre-check**: if `commitHash` given and `git merge-base --is-ancestor commitHash HEAD` exits 0 → 200 noChanges, no lock acquired.
+   - Otherwise calls `fastify.reconcileDataRepo({ branch })`. If outcome is `'in-sync'` → 200 noChanges with the outcome. Anything else → rebuild via the helper and return 200 with `rebuilt: true`.
+4. **Wire-up.** Register `internalRoutes` in `apps/api/src/app.ts` alongside other routes.
+5. **Tests.** `apps/api/tests/internal-reload.test.ts` covers 401 (missing/wrong token), 503 (unset secret), 200 noChanges via pre-check, 200 in-sync, 200 fast-forward + rebuilt with the new record visible via a service call.
+6. **Docs.** Add `CFP_DATA_RELOAD_SECRET` row to the deploy.md env table. New "Hot-reload webhook" section in runbook.md.
+7. **Workflow YAML.** Not committed to this repo; delivered in the PR body for the operator to drop into `codeforphilly-data`.
+
+## Validation
+
+- [ ] `npm run -w apps/api type-check` passes
+- [ ] `npm run -w apps/api test` — full suite green, including the new `internal-reload.test.ts`
+- [ ] `POST /api/_internal/reload-data` without Authorization → 401 generic message
+- [ ] Wrong bearer token → 401 (same shape; constant-time comparison verified by code review)
+- [ ] Unset `CFP_DATA_RELOAD_SECRET` → 503 "hot-reload not configured"
+- [ ] Body `{ commitHash: }` → 200 noChanges with no rebuild
+- [ ] Empty body, no remote changes → 200 noChanges with `outcome: 'in-sync'`
+- [ ] Empty body, remote ahead of local → 200 with `outcome: 'fast-forwarded'`, `rebuilt: true`, and a service call sees the new record
+- [ ] Half-built rebuild does not corrupt running state (validated by reading reload.ts — fresh state is built fully before live state mutates)
+
+## Risks / unknowns
+
+- **In-place mutation of `fastify.inMemoryState`** — services hold references to the live state object. We must mutate Map contents in place; replacing the object would orphan the services. Mitigation: helper clears + re-populates Maps on the existing object.
+- **FTS reload mid-failure** — if the DELETE succeeds but inserts throw, the running FTS index is in a partial state. Mitigation: load fresh state to a local variable first; if FTS reload throws, log loudly and the route returns 500 so the operator can manually restart the pod.
+- **Push-daemon self-trigger** — the API pushes its own commits, the workflow fires, the webhook arrives for a commit the pod already has. The cheap pre-check handles this without a fetch.
+
+## Notes
+
+(To be filled in at closeout.)
+
+## Follow-ups
+
+(To be filled in at closeout.)
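
Reviewer note on the bearer-token check named in the plan's Approach (step 3): a minimal sketch, assuming the secret arrives as a plain string and the header is the raw `Authorization` value. The function name `bearerTokenMatches` is hypothetical; the plan does not name the helper, and the real route in `apps/api/src/routes/internal.ts` may be structured differently.

```typescript
import { timingSafeEqual } from 'node:crypto';

// Hypothetical helper (name not taken from the patch): returns true only when
// the Authorization header carries a bearer token equal to the secret.
export function bearerTokenMatches(
  authorizationHeader: string | undefined,
  secret: string,
): boolean {
  const prefix = 'Bearer ';
  if (!authorizationHeader || !authorizationHeader.startsWith(prefix)) {
    return false;
  }

  const presented = Buffer.from(authorizationHeader.slice(prefix.length), 'utf8');
  const expected = Buffer.from(secret, 'utf8');

  // crypto.timingSafeEqual throws when the two buffers differ in byte length,
  // which is why the plan calls for a length check first.
  if (presented.length !== expected.length) {
    return false;
  }

  return timingSafeEqual(presented, expected);
}
```

Missing header, wrong scheme, wrong length, and wrong value all return the same `false`, so the caller can answer each with the same generic 401 that the Validation checklist expects.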
diff --git a/specs/behaviors/storage.md b/specs/behaviors/storage.md
index b1f27dc..95ac102 100644
--- a/specs/behaviors/storage.md
+++ b/specs/behaviors/storage.md
@@ -289,6 +289,22 @@ Migration scripts live in `apps/api/scripts/migrations/-
 
 At our corpus size (~5,000 records, mostly small), boot is sub-second on a modest container. Boot time grows linearly with corpus size; at 50K records expect 5–10s. If that becomes painful, partition the read or cache the in-memory representation to disk — but neither is needed at civic scale.
 
+## Hot reload
+
+A push to the configured `CFP_DATA_BRANCH` from outside the API (typically a merge of the importer branch into `published`) makes the local working tree stale. Rather than rolling the pod, the API exposes a hidden webhook that the data repo's GitHub Actions workflow calls on each push.
+
+- **Endpoint** — `POST /api/_internal/reload-data`. Hidden from the public OpenAPI doc; not advertised externally.
+- **Auth** — `Authorization: Bearer ` where the token matches the `CFP_DATA_RELOAD_SECRET` env var (a Kubernetes Secret in production). Constant-time comparison. When the env var is unset, the endpoint is registered but every request gets 503 ("hot-reload not configured").
+- **Body** — `{ branch?: string, commitHash?: string }`. Both optional. `branch` defaults to `CFP_DATA_BRANCH`. `commitHash` is the commit the caller observed pushing; the endpoint uses it to short-circuit when the pod already has it.
+- **Cheap pre-check** — if `commitHash` is an ancestor of local HEAD (`git merge-base --is-ancestor`), respond 200 noChanges immediately without acquiring the data-repo lock. This covers the common self-trigger where the API's own push daemon prompted the workflow.
+- **Reconcile + rebuild** — otherwise acquire the data-repo lock, call the same reconciliation state machine the boot path uses (`fastify.reconcileDataRepo`), and:
+  - If outcome is `'in-sync'`, skip the rebuild and return 200 noChanges with the outcome.
+  - Otherwise rebuild the in-memory state and FTS index from the new tree, then return 200 with the outcome, the old and new commit, and `rebuilt: true`.
+- **Atomicity** — the rebuild constructs a fresh `InMemoryState` first; only after that succeeds does it mutate the live Maps in place. The FTS engine exposes a `reload(state)` that drops and re-inserts every FTS5 table. If the rebuild throws partway, the route returns 500 and the operator should restart the pod.
+- **Concurrency** — uses the same `dataRepoLock` as boot reconciliation, so a webhook-triggered reload can't race a `transact`-driven write.
+
+The GitHub Actions workflow that calls this endpoint lives in the `codeforphilly-data` repo (`.github/workflows/notify-deployments.yml`), not in this app repo. It fires on push to `CFP_DATA_BRANCH` and posts `{ branch, commitHash: }` with the secret as a bearer token.
+
 ## Disaster recovery
 
 The data repo is git. Recovery from total local loss:

From 173c10cb1981d6157e0847806cb257803479b0e1 Mon Sep 17 00:00:00 2001
From: Chris Alfano
Date: Tue, 19 May 2026 09:50:57 -0400
Subject: [PATCH 2/4] feat(api): hot-reload webhook for the public data branch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

POST /api/_internal/reload-data — closes the loop on #65. Hidden from
the OpenAPI doc; bearer-auth via CFP_DATA_RELOAD_SECRET with
constant-time compare; 503 when the secret is unset so the deployment
surface stays uniform.

Two layers of no-op coverage:

1. Cheap pre-check via `git merge-base --is-ancestor` — handles the
   self-trigger from push-daemon-emitted pushes without acquiring the
   data-repo lock or hitting the network.
2. After a reconcile, outcome === 'in-sync' short-circuits the rebuild.

Otherwise the reconcile state machine (#66) runs under the data-repo
lock and the helper at `store/memory/reload.ts` re-opens the public
store (gitsheets caches a dataTree per Sheet, so the snapshot has to be
replaced after a fast-forward), builds a fresh InMemoryState, mutates
the live Maps in place, swaps the public-store reference, reloads the
FTS index in a single SQLite transaction, and invalidates the facet
cache. If the rebuild throws after the swap has begun, the route logs
loudly and returns 500 so the operator knows a restart is warranted.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 apps/api/src/app.ts                    |   2 +
 apps/api/src/env.ts                    |   8 +
 apps/api/src/routes/internal.ts        | 262 +++++++++++++++++
 apps/api/src/store/fts.ts              |  60 +++-
 apps/api/src/store/memory/reload.ts    | 123 ++++++++
 apps/api/src/store/store.ts            |  19 +-
 apps/api/tests/internal-reload.test.ts | 393 +++++++++++++++++++++++++
 7 files changed, 856 insertions(+), 11 deletions(-)
 create mode 100644 apps/api/src/routes/internal.ts
 create mode 100644 apps/api/src/store/memory/reload.ts
 create mode 100644 apps/api/tests/internal-reload.test.ts

diff --git a/apps/api/src/app.ts b/apps/api/src/app.ts
index d579de6..edc0d7a 100644
--- a/apps/api/src/app.ts
+++ b/apps/api/src/app.ts
@@ -53,6 +53,7 @@ import { helpWantedRoutes } from './routes/projects-help-wanted.js';
 import { projectMembershipRoutes } from './routes/projects-members.js';
 import { previewRoutes } from './routes/preview.js';
 import { samlRoutes } from './routes/saml.js';
+import { internalRoutes } from './routes/internal.js';
 
 declare module 'fastify' {
   interface FastifyInstance {
@@ -174,6 +175,7 @@ export async function buildApp(opts: BuildAppOptions = {}): Promise