Merged
30 commits
44a8f21
Add CTO review & unified go-forward roadmap
Jun 17, 2026
043dd17
P0-4: add BigQuery query-parameter support; parameterize allowlist
Jun 17, 2026
bb18f54
P0-1: cache stats-page loaders in front of BigQuery
Jun 17, 2026
e2ca1cc
P0-2: un-hide the region trend chart
Jun 17, 2026
7d79e0c
P0-3: add provider-agnostic error-reporting seam
Jun 17, 2026
8ae8cb4
chore: apply Prettier formatting to roadmap doc
Jun 17, 2026
c44e72d
Revert P0-2: re-hide the region trend chart
Jun 17, 2026
5e07aac
docs: reclassify P0-2 charts item after site-wide-hidden finding
Jun 17, 2026
69a48be
P1-5: centralize the stats filter contract in lib/filters.ts
Jun 17, 2026
97d65a6
P1-6: regression tests for the two search postmortems + lib unit tests
Jun 17, 2026
6e9f9a7
P1-7: make fixed card heights responsive
Jun 17, 2026
4917d85
P1-8: add local-only Playwright smoke test
Jun 17, 2026
a49bf3c
P1-9: add ops runbook for secret hygiene + monitoring
Jun 17, 2026
58f77e5
P1-7 follow-up: cap scroll-card height on desktop
Jun 17, 2026
7ab7c67
P1-7 follow-up: stop Achievements card from stretching
Jun 17, 2026
ae2d21c
P2-10: parameterize remaining BigQuery string interpolation
Jun 17, 2026
76f91d8
P2-11: remove dead code and unused dependencies
Jun 17, 2026
90cb783
P2-15: make the leaderboard active - ranks + emphasized metric
Jun 17, 2026
367f91d
P2-15 follow-up: revert metric emphasis, keep rank numbers
Jun 17, 2026
c2abaeb
P2-16: collapse event-card attendee list on mobile
Jun 17, 2026
9802a1e
P2-14: reorder region dashboard to a decision pyramid
Jun 17, 2026
db12436
P2-12a: extract shared EntityDataUnavailable empty state
Jun 17, 2026
7afd467
P2-12b: share renderStat + HelpHint across SummaryCards
Jun 17, 2026
8ef3f80
P2-12c: extract buildBreadcrumb() helper
Jun 17, 2026
23563a5
P2-13: centralize BQ->domain normalization in normalizeDeep()
Jun 17, 2026
8e2e9b6
P3-18: add keyboard focus-visible baseline for native elements
Jun 17, 2026
78c9f6d
P3-22: ADR for the deferred BigQuery serving layer
Jun 17, 2026
ccf6c94
P3-19: add F3 brand glyph to the navbar
Jun 17, 2026
be3d6b8
P3-18 + P2-17: extract ActiveFilterChips, make chips removable
Jun 17, 2026
8c40056
docs: record remaining/deferred work after the roadmap pass
Jun 17, 2026
55 changes: 55 additions & 0 deletions .context/adr/0001-bigquery-serving-layer.md
@@ -0,0 +1,55 @@
# ADR 0001: BigQuery as the serving layer (and when to add one)

- **Status:** Accepted (deferred build)
- **Date:** 2026-06-16
- **Roadmap ref:** P3-22 / the "documented-but-unbuilt serving tier" in
[cto-review-roadmap.md](../cto-review-roadmap.md)

## Context

Stats pages read directly from **BigQuery**, an OLAP warehouse, on render.
BigQuery has per-query latency (hundreds of ms to seconds) and bills by bytes
scanned, so using it as an interactive app's serving database is, in principle,
the wrong shape, and it was the root cause behind the two search postmortems.

**P0-1 mitigated this** by caching each stats page's query in front of BigQuery
(`lib/cache.ts`, `unstable_cache`, 1h revalidate). That removes BigQuery from
the hot path for repeat views and cuts cost/latency substantially.

**Known limit of the current cache:** Next's Data Cache on Cloud Run is
**per-instance** (in-memory + `.next/cache`), not shared across scaled-out
instances, and is lost on cold starts. Each instance still serves repeat loads
from cache within the revalidate window, but there is no single global cache.
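
To make the per-instance behaviour concrete, here is a minimal sketch of a revalidate-window cache. It is not the contents of `lib/cache.ts`, and it is not how `unstable_cache` is implemented (this sketch uses plain expiry and holds entries in a process-local `Map`). `cacheWithRevalidate`, the fake loader, and the injected clock are illustrative assumptions.

```typescript
// Hypothetical stand-in for the idea behind a revalidate-window cache:
// memoize a loader per argument list for a fixed number of seconds.
// The Map lives in this process only, mirroring the per-instance limit.
type Entry<T> = { value: T; expiresAt: number };

function cacheWithRevalidate<A extends unknown[], T>(
  loader: (...args: A) => Promise<T>,
  revalidateSeconds: number,
  now: () => number = Date.now,
): (...args: A) => Promise<T> {
  const entries = new Map<string, Entry<T>>();
  return async (...args: A): Promise<T> => {
    const key = JSON.stringify(args);
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) return hit.value; // served from memory
    const value = await loader(...args); // cache miss: the loader runs
    entries.set(key, { value, expiresAt: now() + revalidateSeconds * 1000 });
    return value;
  };
}

// Illustrative only: a fake loader standing in for a warehouse-backed query.
let warehouseCalls = 0;
let clockMs = 0; // injected clock so the window can be exercised without waiting
const loadRegionStats = cacheWithRevalidate(
  async (regionId: number) => {
    warehouseCalls += 1;
    return { regionId };
  },
  3600, // 1h, matching the revalidate window described above
  () => clockMs,
);
```

Two instances running this code would each hold their own `Map`, so each would run the loader once per window, and a cold start would begin with an empty `Map`.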

## Decision

**Do not build a dedicated serving tier (Postgres / materialized views) now.**
For a solo/volunteer-maintained app at current traffic, the per-instance cache
is sufficient and a serving tier is maintenance the team shouldn't carry yet.
Revisit when the triggers below fire.

## Triggers to revisit (build the serving tier when any holds)

- BigQuery spend becomes material (watch the budget alert from the ops runbook).
- p95 page latency on cache misses is consistently user-visible, or cold
starts / many instances make per-instance caching ineffective.
- Traffic grows enough that cache-miss query volume is a real cost or
rate-limit concern.

## Sketch (if/when built)

- Keep **BigQuery as the source of truth** and the place heavy aggregation runs.
- Precompute the per-page aggregates the loaders need into a **serving store**
(Postgres, which P2-11 removed as a dependency and we'd re-add, or a managed
KV/edge cache) on a schedule (e.g. after the daily data refresh).
- Loaders read from the serving store; BigQuery is queried only by the
precompute job. The `lib/cache.ts` seam stays in front for hot data.
- Alternatively, a shared Next `cacheHandler` (e.g. Redis) gives a global cache
with far less work than a full serving tier; consider this first.
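
The split between the precompute job and the loaders could look roughly like the sketch below. Every name in it (`ServingStore`, `RegionAggregate`, `InMemoryStore`, `precompute`, `loadRegion`) is hypothetical, and the in-memory store only stands in for whatever Postgres or KV store would actually be chosen.

```typescript
// Hypothetical shape of one precomputed per-page aggregate.
interface RegionAggregate {
  regionId: number;
  totalPosts: number;
  computedAt: string;
}

// Hypothetical storage seam; a real build would back this with Postgres or KV.
interface ServingStore {
  put(key: string, value: RegionAggregate): Promise<void>;
  get(key: string): Promise<RegionAggregate | undefined>;
}

// In-memory stand-in so the sketch runs without any external service.
class InMemoryStore implements ServingStore {
  private rows = new Map<string, RegionAggregate>();
  async put(key: string, value: RegionAggregate): Promise<void> {
    this.rows.set(key, value);
  }
  async get(key: string): Promise<RegionAggregate | undefined> {
    return this.rows.get(key);
  }
}

// Scheduled job: the only code path that queries the warehouse.
async function precompute(
  store: ServingStore,
  queryWarehouse: () => Promise<RegionAggregate[]>,
): Promise<number> {
  const rows = await queryWarehouse();
  for (const row of rows) {
    await store.put(`region:${row.regionId}`, row);
  }
  return rows.length; // number of aggregates written
}

// Page loader: reads precomputed rows and never touches the warehouse.
async function loadRegion(
  store: ServingStore,
  regionId: number,
): Promise<RegionAggregate | undefined> {
  return store.get(`region:${regionId}`);
}
```

Keeping the store behind an interface is one way to leave the Postgres-versus-KV choice open until a trigger actually fires.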

## Consequences

- Today: lowest maintenance; accept per-instance cache + occasional cache-miss
BigQuery latency.
- Deferred: the migration is non-trivial (a precompute job + a store to operate),
which is exactly why it waits for evidence that it's needed.
257 changes: 257 additions & 0 deletions .context/cto-review-roadmap.md

Large diffs are not rendered by default.

61 changes: 61 additions & 0 deletions .context/ops-runbook.md
@@ -0,0 +1,61 @@
# PAX-Vault Ops Runbook

Operator checklist for the reliability/security items that live outside the
codebase (GCP console, billing, secret management). These are **manual actions**;
there is no code to deploy for them. Roadmap ref: P1-9.

## 1. Secret hygiene

**Problem:** production secrets currently sit in plaintext in `.env.firebase`
on a developer laptop. That file is correctly git-ignored (not in the repo), but
a laptop copy is still an exposure.

- [ ] Treat **Google Secret Manager** as the source of truth for production
secrets (it already backs `apphosting.yaml`). Delete `.env.firebase` from
the laptop once confirmed; pull values on demand with
`gcloud secrets versions access` rather than keeping a local copy.
- [ ] Keep `.env.local` (dev-only values) on the laptop; that's expected.
- [ ] Confirm `.gitignore` still excludes `.env*` except `.env.example`
(it does today).

### Rotate `SESSION_SECRET` (hygiene, ~quarterly or on suspicion)

Rotating invalidates the HMAC on every existing `__session` cookie, so **all
users are logged out and must sign in again** (low impact for this app).

1. Generate a new secret: `openssl rand -hex 32`
2. Add it as a new version in Secret Manager for the `SESSION_SECRET` secret.
3. Trigger a rollout so App Hosting picks up the new version
(`npm run firebase:deploy`).
4. Verify sign-in end-to-end after the rollout.
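
As a rough illustration of why rotation logs everyone out: a cookie signed with the old secret no longer verifies once the server recomputes the HMAC with the new one. The cookie shape below (`<base64url JSON>.<base64url HMAC-SHA256>`) follows the smoke test in this PR; `sign`, `verify`, and both secrets are hypothetical, since `src/lib/auth/session.ts` is not shown here.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Sign a payload in the `<base64url JSON>.<base64url HMAC-SHA256>` shape.
function sign(payload: object, secret: string): string {
  const json = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const signature = createHmac("sha256", secret)
    .update(json)
    .digest("base64url");
  return `${json}.${signature}`;
}

// Hypothetical verifier: recompute the HMAC with the current secret and
// compare it to the signature carried by the cookie in constant time.
function verify(cookie: string, secret: string): boolean {
  const [json, signature] = cookie.split(".");
  if (!json || !signature) return false;
  const expected = createHmac("sha256", secret).update(json).digest();
  const actual = Buffer.from(signature, "base64url");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}

// Placeholder secrets for illustration only; never hard-code real ones.
const oldSecret = "old-secret-for-illustration";
const newSecret = "new-secret-for-illustration";
const cookie = sign({ sub: "user@example.com" }, oldSecret);

console.log(verify(cookie, oldSecret)); // true: valid before rotation
console.log(verify(cookie, newSecret)); // false: rejected after rotation
```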

## 2. Uptime check (catch "site is down")

Use either option; both have free tiers. Target the **public** landing page
(`/`) so the check doesn't need auth.

- **Google Cloud Monitoring** → Uptime checks → Create:
  - Protocol HTTPS, host `pax-vault.f3nation.com`, path `/`, check every 5 min.
  - Alert policy → notify your email/Slack on failure.
- **or UptimeRobot** (external): HTTPS monitor on the same URL, 5-min interval.

## 3. BigQuery cost guardrail (catch "runaway query cost")

Two complementary controls:

- [ ] **Billing budget alert** (Cloud Billing → Budgets & alerts → Create
budget): scope to the `f3data` project (or the BigQuery service), set a
monthly amount, and add alerts at 50/90/100%. This notifies; it does not cap.
- [ ] **Custom query quota** (optional hard cap): IAM & Admin → Quotas →
filter "BigQuery API: Query usage per day", set a per-day bytes ceiling so
a pathological query/loop can't run up an unbounded bill.

> Pairs with the P0 caching work: cached page loads no longer issue BigQuery
> jobs, so steady-state cost should be low and a sudden spike is a real signal.

## 4. Error tracking (cross-ref P0-3)

The provider-agnostic seam is already wired (`src/lib/observability.ts`,
`reportError`). When you adopt a provider (e.g. Sentry), forward from the single
`TODO(P0-3)` hook there and set its DSN as a Secret Manager secret; no call
sites change.
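
For readers who have not opened `src/lib/observability.ts`, a seam of this kind might be shaped like the sketch below. Only the name `reportError` comes from this PR; `ErrorContext`, `ErrorReporter`, and `setErrorReporter` are hypothetical, and the real file may be organized differently.

```typescript
type ErrorContext = Record<string, unknown>;
type ErrorReporter = (error: unknown, context?: ErrorContext) => void;

// Default behaviour until a provider is adopted: log to the console.
let reporter: ErrorReporter = (error, context) => {
  console.error("[reportError]", error, context ?? {});
};

// Hypothetical hook: the single place a provider (e.g. Sentry) would be
// plugged in, so that no call site has to change.
function setErrorReporter(next: ErrorReporter): void {
  reporter = next;
}

// Call sites use only this function. It never throws, so a failing
// provider cannot turn an already-failing request into a second error.
function reportError(error: unknown, context?: ErrorContext): void {
  try {
    reporter(error, context);
  } catch {
    // Swallow reporter failures deliberately.
  }
}
```

A call site would then read `reportError(err, { route: "/stats" })` regardless of which provider, if any, sits behind the hook.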
52 changes: 52 additions & 0 deletions .context/remaining-work.md
@@ -0,0 +1,52 @@
# Remaining Work: after the CTO roadmap pass

Status as of the `cto-review-roadmap` branch. The substantive roadmap
([cto-review-roadmap.md](cto-review-roadmap.md)) is executed; this tracks what
was **deliberately deferred**, **blocked on a decision**, or is an **operator
action**. Nothing here is a pressing code defect.

## Deferred by design (incremental / opportunistic / needs a visual pass)

- **#17: finish the big-component splits.** The `pageFilter` accordion drawer
(Region/AO/Tag/Type/Category sections, heavily `useState`-coupled) and
`events.tsx` (~36KB). Do incrementally when next editing them. Note: HeroUI
`<Accordion>` requires `<AccordionItem>` as direct children, so any section
extraction must return `AccordionItem` directly (not wrap it in a component).
_Already done:_ the active-filter chip subsystem was extracted to
`ActiveFilterChips.tsx`.
- **#12: finish the SummaryCard de-dup.** Unify the per-row markup into a
shared `StatRow` (region rows use `text-sm`, area/sector don't, so normalizing
needs a visual check) and migrate the PAX `SummaryCard` (11 rows, structurally
different). `renderStat` + `HelpHint` are already shared.
- **#20: tighten the server/client seam.** Move static dashboard cards to
Server Components, pushing `"use client"` to interactive leaves. Real
bundle-size win, but a large refactor; do when next reworking the dashboard.
- **#21: file-naming consistency.** PascalCase components / camelCase utils.
Low value, churn-y; opportunistic only. (`nation/` needs no loader: it's a
static page.)

## Blocked on a product decision

- **Charts (#2 / #18 trend deltas).** Charts are intentionally hidden site-wide
(region/area/sector). "Ship charts properly" = un-hide all three consistently
**and** fix the chart-quality issues (readable labels, color-encoding), then
add trend deltas/CTAs. Real feature work, awaiting a go/no-go.
- **#19: warmer empty/celebration copy.** Subjective; pair on it if wanted.

## Operator actions (not code; see [ops-runbook.md](ops-runbook.md))

- Rotate `SESSION_SECRET`; move prod secrets off the laptop to Secret Manager.
- Add an uptime check (public landing page) + a BigQuery billing budget alert.
- Pick an error-tracking provider and wire the single `TODO(P0-3)` hook in
`src/lib/observability.ts` (set its DSN as a secret).
- Run `npm run test:e2e` (local smoke suite) in your workflow; it needs
`npx playwright install chromium` once.

## Done (for reference)

P0–P3 executed: caching, error seam, full query parameterization, the filter
contract, postmortem regression tests, mobile/height fixes, local smoke test,
dead-code/dep cleanup, entity de-dup (empty-state / `renderStat` / `HelpHint` /
breadcrumb), `normalizeDeep`, dashboard reorder, leaderboard ranks, mobile
attendee collapse, focus-visible baseline, navbar F3 glyph, removable filter
chips, ops runbook, and the serving-tier ADR.
8 changes: 7 additions & 1 deletion .gitignore
@@ -53,4 +53,10 @@ pr.md

# large log files in issue context
.context/issues/**/*.json
certificates
certificates

# Playwright (local-only e2e)
/test-results/
/playwright-report/
/blob-report/
/playwright/.cache/
79 changes: 79 additions & 0 deletions e2e/smoke.spec.ts
@@ -0,0 +1,79 @@
import { test, expect, type BrowserContext } from "@playwright/test";
import { createHmac } from "crypto";

/**
 * Mint a `__session` cookie the same way src/lib/auth/session.ts does, so the
 * smoke test can exercise authenticated pages without driving real OAuth.
 * (Replicated here because Playwright does not resolve the app's "@/" alias.
 * Keep in sync with createSessionValue / SESSION_COOKIE_NAME.)
 */
function mintSessionValue(email: string): string {
  const secret = process.env.SESSION_SECRET;
  if (!secret) {
    throw new Error("SESSION_SECRET is required (load it from .env.local)");
  }
  const payload = {
    sub: email,
    email,
    name: "Smoke Test",
    iat: Math.floor(Date.now() / 1000),
  };
  const json = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const signature = createHmac("sha256", secret)
    .update(json)
    .digest("base64url");
  return `${json}.${signature}`;
}

const BASE_URL = process.env.PLAYWRIGHT_BASE_URL ?? "https://localhost:3001";
const SAMPLE_REGION = process.env.SAMPLE_REGION;

async function signIn(context: BrowserContext) {
  await context.addCookies([
    {
      name: "__session",
      value: mintSessionValue("smoke@pax-vault.test"),
      domain: new URL(BASE_URL).hostname,
      path: "/",
      httpOnly: true,
      secure: true,
      sameSite: "Lax",
    },
  ]);
}

test.describe("pax-vault smoke", () => {
  test("public landing page renders", async ({ page }) => {
    await page.goto("/");
    await expect(page.getByText(/PAX Vault/i).first()).toBeVisible();
  });

  test("unauthenticated stats route redirects to landing", async ({ page }) => {
    await page.goto(`/stats/region/${SAMPLE_REGION ?? "1"}`);
    // Middleware bounces unauthenticated users back to "/".
    await expect(page).toHaveURL(/\/(\?.*)?$/);
  });

  test("authenticated region dashboard loads", async ({ page, context }) => {
    test.skip(!SAMPLE_REGION, "SAMPLE_REGION not set in .env.local");
    await signIn(context);
    await page.goto(`/stats/region/${SAMPLE_REGION}`);
    await expect(page).toHaveURL(new RegExp(`/stats/region/${SAMPLE_REGION}`));
    // Some dashboard chrome should render once the data resolves.
    await expect(page.getByText(/Summary|Leaders/i).first()).toBeVisible({
      timeout: 15000,
    });
  });

  test("authenticated search API returns a well-formed 200", async ({
    context,
  }) => {
    await signIn(context);
    const res = await context.request.get("/api/search?q=no");
    expect(res.status()).toBe(200);
    const body = await res.json();
    expect(body).toHaveProperty("regions");
    expect(body).toHaveProperty("aos");
    expect(body).toHaveProperty("pax");
  });
});