2 changes: 1 addition & 1 deletion docs/plans/2026-05-31-large-upload-gcs-resumable-plan.md
@@ -34,7 +34,7 @@ Updated 2026-06-01. Marks what has actually landed so a fresh agent can resume w
- [x] **Phase 1 — Small-path fix & error clarity** (T1.1, T1.2) — merged
- [x] **Phase 2 — Upload contract + state machine + schema** (T2.1–T2.3) — merged (#92)
- [x] **Phase 3 — GCS resumable** (T3.1–T3.3) — merged (#93)
- [ ] **Phase 4 — Async processing (Cloud Tasks)** (T4.1–T4.3) — **next**; detailed sub-plan: [2026-06-01-phase4-cloud-tasks-impl.md](2026-06-01-phase4-cloud-tasks-impl.md)
- [x] **Phase 4 — Async processing (Cloud Tasks)** (T4.1–T4.3) — merged (PRs #99, #102, #101). Sub-plan: [2026-06-01-phase4-cloud-tasks-impl.md](2026-06-01-phase4-cloud-tasks-impl.md). **Prod activation pending** (set Cloud Run env): [2026-06-06-phase4-cloud-run-activation.md](2026-06-06-phase4-cloud-run-activation.md)
- [ ] **Phase 5 — Handler registry + Tier 1 + safe ZIP** (T5.1–T5.3)
- [ ] **Phase 6 — Dashboard large-upload UX** (T6.1, T6.2)
- [ ] **Phase 7 — Cleanup, observability, deployment docs**
94 changes: 94 additions & 0 deletions docs/plans/2026-06-06-phase4-cloud-run-activation.md
@@ -0,0 +1,94 @@
# Phase 4 — Cloud Run activation

**Date:** 2026-06-06 · **Status:** Ready to run (maintainer step) · **Parent:**
[infra/cloud-tasks/README.md](../../infra/cloud-tasks/README.md),
[2026-06-01-phase4-cloud-tasks-impl.md](2026-06-01-phase4-cloud-tasks-impl.md)

Phase 4 (async upload processing via Google Cloud Tasks) is merged to `main` and the
image auto-deploys to Cloud Run (Cloud Build source-deploy on push to `main`). The
feature is **inert** until four env vars are set on the service and the request timeout
is raised. Everything else — the Cloud Tasks queue, the OIDC invoker SA, the GCS bucket,
and all IAM bindings — is already provisioned by
[infra/cloud-tasks/setup.sh](../../infra/cloud-tasks/setup.sh); do **not** re-run it.

> Identifiers below are placeholders (this is a public repo). Resolve the real values
> from gcloud at run time — the active gcloud project may differ from the target, so
> always pass `--project` explicitly. Confirm your account with
> `gcloud config get-value account`.

## Resolve your values first

```sh
# Placeholders are quoted so an unedited paste is not parsed as a shell redirect.
PROJECT="<your-project>"              # GCP project id of the deployment
REGION="<your-region>"                # region of the Cloud Run service + queue + bucket
SERVICE="<your-cloud-run-service>"    # Cloud Run service name
QUEUE="<your-upload-processing-queue>"
TASKS_SA="<tasks-sa>@${PROJECT}.iam.gserviceaccount.com"
SERVICE_URL=$(gcloud run services describe "$SERVICE" \
  --project="$PROJECT" --region="$REGION" --format='value(status.url)')
```
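
Before going further, it can help to confirm that every value was actually resolved. The following is a minimal sketch, not part of the repo: `check_resolved` is a hypothetical helper that fails if a variable is empty, still contains a `<placeholder>` bracket, or if `SERVICE_URL` does not start with `https://`.

```sh
# Hypothetical helper (an assumption, not shipped with the repo): fail fast
# when a value is empty, still a <placeholder>, or the URL is not https.
check_resolved() {
  for name in PROJECT REGION SERVICE QUEUE TASKS_SA SERVICE_URL; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    case "$value" in
      ''|*'<'*|*'>'*) echo "unresolved: $name" >&2; return 1 ;;
    esac
  done
  case "$SERVICE_URL" in
    https://*) return 0 ;;
    *) echo "SERVICE_URL is not an https URL: $SERVICE_URL" >&2; return 1 ;;
  esac
}
```

Calling `check_resolved && echo "values look resolved"` right after the assignments gives an early stop before any `gcloud run services update` is attempted.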

Optional sanity checks (the only gap is the four env vars + the timeout):

```sh
# Already-deployed image (expect the current main commit SHA):
gcloud run services describe "$SERVICE" --project="$PROJECT" --region="$REGION" \
  --format='value(spec.template.spec.containers[0].image)'
# Queue is RUNNING:
gcloud tasks queues describe "$QUEUE" --project="$PROJECT" --location="$REGION" \
  --format='value(state)'
# Current request timeout (raise to 600 below):
gcloud run services describe "$SERVICE" --project="$PROJECT" --region="$REGION" \
  --format='value(spec.template.spec.timeoutSeconds)'
```
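
The outputs of the queue and timeout checks can be interpreted mechanically. This is a small sketch under stated assumptions: `preflight_report` is a hypothetical helper that takes the queue state and the current timeout (in seconds) as arguments, as printed by the two `--format='value(...)'` commands above. It only reports; it changes nothing.

```sh
# Hypothetical helper: summarize the two sanity-check outputs.
# $1 = queue state (e.g. RUNNING), $2 = current request timeout in seconds.
preflight_report() {
  state=$1
  timeout=$2
  rc=0
  if [ "$state" = "RUNNING" ]; then
    echo "queue: ok"
  else
    echo "queue: not RUNNING (got '$state')"
    rc=1
  fi
  if [ "$timeout" -ge 600 ]; then
    echo "timeout: already ${timeout}s"
  else
    echo "timeout: ${timeout}s, will be raised to 600"
  fi
  return "$rc"
}
```

A non-RUNNING queue is the only condition treated as a failure here, since a low timeout is expected before activation.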

`GCS_UPLOAD_BUCKET` is expected to already be set on the service (Phase 3 storage); if
it is not, add it to the `--update-env-vars` list below.

## Activate (the only change)

```sh
gcloud run services update "$SERVICE" --project="$PROJECT" --region="$REGION" \
  --update-env-vars="CLOUD_TASKS_QUEUE=${QUEUE},CLOUD_TASKS_LOCATION=${REGION},CLOUD_TASKS_SERVICE_ACCOUNT=${TASKS_SA},UPLOAD_PROCESS_URL=${SERVICE_URL}/api/upload/process" \
  --timeout=600
```

- `UPLOAD_PROCESS_URL` is BOTH the Cloud Tasks target base (task → `<URL>/<uploadId>`)
AND the OIDC audience the app verifies. It must be the live `https://…` service URL;
the app rejects an invalid URL at startup (zod `.url()`).
- `CLOUD_TASKS_QUEUE`, `UPLOAD_PROCESS_URL`, and `CLOUD_TASKS_SERVICE_ACCOUNT` must all
be set together; if any one is missing, the app falls back to the in-memory queue and
logs a warning.
- `--update-env-vars` only adds/overwrites the listed keys; other env is untouched.
- This changes live production config (it creates a new revision from the same
already-deployed image). It is outward-facing, so confirm before running it.
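
The all-or-nothing rule and the task target shape described in the bullets above can be illustrated in shell. This is an illustration only, not the app's actual code: `queue_mode` and `task_target` are hypothetical names, and the sketch assumes the target is formed by plain concatenation of `UPLOAD_PROCESS_URL`, a slash, and the upload id.

```sh
# Illustration only (not the app's code): which queue would be selected,
# given the rule that all three variables must be set together.
queue_mode() {
  if [ -n "${CLOUD_TASKS_QUEUE:-}" ] && [ -n "${UPLOAD_PROCESS_URL:-}" ] \
     && [ -n "${CLOUD_TASKS_SERVICE_ACCOUNT:-}" ]; then
    echo "cloud-tasks"
  else
    echo "in-memory"
  fi
}

# Task target for one upload: <UPLOAD_PROCESS_URL>/<uploadId>
task_target() {
  echo "${UPLOAD_PROCESS_URL}/$1"
}
```

With only two of the three variables set, `queue_mode` prints `in-memory`, which matches the fallback behaviour the bullets describe.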

## Verify after (smoke test)

See the checklist in
[2026-05-31-large-upload-gcs-resumable-plan.md](2026-05-31-large-upload-gcs-resumable-plan.md)
§13.

1. Internal endpoint rejects non-OIDC callers — must never return 200:

```sh
curl -s -o /dev/null -w '%{http_code}\n' -X POST \
  "${SERVICE_URL}/api/upload/process/test-id"   # expect 401 or 403
```

2. End-to-end: a large upload flows `/init` → resumable PUT to GCS → `/complete` (202)
→ a task appears in the queue → `GET /api/upload/:id/status` goes
`queued → processing → completed` with a document created. Watch:

```sh
gcloud run services logs read "$SERVICE" --project="$PROJECT" --region="$REGION" --limit=50
gcloud tasks queues describe "$QUEUE" --project="$PROJECT" --location="$REGION"
```
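
The rejection check in step 1 can be turned into a pass/fail result rather than a code to eyeball. A minimal sketch, assuming a hypothetical helper named `expect_rejected` that receives the HTTP status code as its only argument:

```sh
# Hypothetical helper: pass only when the internal endpoint rejected the call.
expect_rejected() {
  case "$1" in
    401|403) echo "ok: rejected with $1" ;;
    *) echo "FAIL: expected 401 or 403, got $1" >&2; return 1 ;;
  esac
}

# Usage (not run here; needs a resolved SERVICE_URL):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     "${SERVICE_URL}/api/upload/process/test-id")
#   expect_rejected "$code"
```

Any other code, including `200` or curl's `000` for a failed connection, is treated as a failure so that an unreachable service is not mistaken for a correct rejection.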

## Rollback

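Removing the four env vars reverts the app to the in-memory queue (the Cloud Tasks path
goes inert again). The command below does not touch the raised request timeout: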

```sh
gcloud run services update "$SERVICE" --project="$PROJECT" --region="$REGION" \
  --remove-env-vars=CLOUD_TASKS_QUEUE,CLOUD_TASKS_LOCATION,CLOUD_TASKS_SERVICE_ACCOUNT,UPLOAD_PROCESS_URL
```
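
To confirm the rollback took effect, the service description can be scanned for the four variable names. This is a sketch, not a repo script: `phase4_env_absent` is a hypothetical helper that reads any textual service description on stdin and does a plain whole-word search, so it would also flag a name that appears somewhere other than the env list.

```sh
# Hypothetical helper: read a service description on stdin and fail if any
# of the four Phase 4 env var names still appears in it.
phase4_env_absent() {
  desc=$(cat)
  for key in CLOUD_TASKS_QUEUE CLOUD_TASKS_LOCATION \
             CLOUD_TASKS_SERVICE_ACCOUNT UPLOAD_PROCESS_URL; do
    if printf '%s\n' "$desc" | grep -qw "$key"; then
      echo "still set: $key" >&2
      return 1
    fi
  done
  echo "rollback ok: no Phase 4 env vars found"
}

# Usage (not run here):
#   gcloud run services describe "$SERVICE" --project="$PROJECT" \
#     --region="$REGION" --format=yaml | phase4_env_absent
```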
5 changes: 5 additions & 0 deletions infra/cloud-tasks/README.md
@@ -40,6 +40,11 @@ re-run.

## Configure Cloud Run (do this when the T4.1–T4.3 code is deployed)

> **The T4.1–T4.3 code is now merged and deployed.** For the remaining activation
> steps (which env vars are still missing, how to resolve the live service URL,
> and copy-paste commands), see
> [docs/plans/2026-06-06-phase4-cloud-run-activation.md](../../docs/plans/2026-06-06-phase4-cloud-run-activation.md).

`setup.sh` prints the exact values. The processing code reads these env vars;
they are inert until that code ships, so set them at the same deploy:
