feat(support): opaque per-conversation customer reconnect token (US-071)#34

Merged
hcho22 merged 5 commits into main from fm/us071-token-k8 on Jun 25, 2026
Conversation

hcho22 (Owner) commented Jun 25, 2026

Intent

The developer set out to implement US-071, an opaque per-conversation customer token for the Agentic_RAG support-widget surface, providing anonymous customers a reconnect credential that deliberately stays off the Supabase JWT trust surface. They required the token to be cryptographically-random opaque bytes (not signed with the JWT secret, distinct from US-068's bot JWT), returned in the clear exactly once at conversation creation and stored server-side only as a SHA-256 hash in a dedicated conversation_tokens(token_hash, conversation_id, expires_at, created_at) table, with no server-side customer-identity table. They specified a ~24h lifetime refreshed on activity while status != 'resolved', invalidation on resolve, each token bound to exactly one conversation, plus resume and transcript-GET endpoints that reject missing/expired/resolved tokens cleanly without creating rows. A security-critical automated test was mandated to prove a token for conversation X returns only X, cannot read conversation Y, and stops resuming once X is resolved, mirroring the existing asyncpg/DATABASE_URL-gated test style. Constraints included using a strictly-later migration timestamp, not merging the two existing conversation trust models, never putting workspace_membership.role in a visibility predicate, never logging/returning the raw token, scoping to issuance/storage/resume (leaving US-078's lazy creation flow out), and not wiping the shared local test DB while ensuring typecheck/lint pass and the migration applies cleanly.

What Changed

  • Added backend/conversation_tokens.py with the token primitives: generate_conversation_token() (256-bit secrets.token_urlsafe, opaque and signed with nothing — distinct from US-068's bot JWT) and hash_conversation_token() (SHA-256 hex, the only representation ever stored), plus the ~24h TTL constant.
  • Added migration 20260624160000_conversation_tokens.sql: a conversation_tokens(token_hash, conversation_id, expires_at, created_at) table with RLS enabled and zero policies (anon/authenticated denied wholesale), a resume_conversation(text, boolean) SECURITY DEFINER RPC granted to service_role only that resolves a hash to its single bound conversation while expires_at > now() AND status <> 'resolved' (the p_slide flag gates the 24h activity refresh so a read-only GET never extends lifetime), and an AFTER-UPDATE trigger that purges a conversation's tokens on transition into resolved.
  • Wired POST /widget/conversations/resume and GET /widget/conversations/{id}/transcript in backend/main.py, both authed by an X-Conversation-Token header hashed and resolved via the service-role RPC (binding-enforced, transcript limited to user/assistant roles, service-role-unset fail-closed to 503), with the _issue_conversation_token issuance helper for US-078, plus the security-critical backend/test_us071_conversation_tokens.py and doc updates (AGENTS.md, README, .env.example).
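
The token primitives in the first bullet can be illustrated with a minimal sketch. The function names and `CONVERSATION_TOKEN_TTL_SECONDS` come from this PR; the bodies, the UTF-8 encoding choice, and the `ValueError` on empty input are assumptions, not the committed code in backend/conversation_tokens.py.

```python
import hashlib
import secrets

# Name taken from the review notes in this PR; 24h expressed in seconds.
CONVERSATION_TOKEN_TTL_SECONDS = 24 * 60 * 60


def generate_conversation_token() -> str:
    """Return 32 random bytes (256 bits) as URL-safe text; nothing is signed."""
    return secrets.token_urlsafe(32)


def hash_conversation_token(raw_token: str) -> str:
    """Return the SHA-256 hex digest, the only form meant to be stored."""
    if not raw_token:
        # Assumption: the real helper may reject empty input differently.
        raise ValueError("conversation token must be non-empty")
    # Assumption: the token is encoded as UTF-8 before hashing.
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()
```

Because the token is random rather than signed, the server can only recognise it by looking up its hash, which is why the raw value never needs to be stored.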

Risk Assessment

✅ Low: A well-bounded, additive feature (new migration, pure token primitives, two public widget endpoints) whose security-critical binding invariant holds; the two actionable prior-round findings are applied and verified, and the two remaining items are deliberate author-accepted decisions.

Testing

Ran the committed backend/test_us071_conversation_tokens.py (unit + integration) against a running local Supabase with the US-071 migration already applied; both layers pass and prove the DB-level token boundary. Because unit/DB tests alone don't show the end-user surface, I additionally drove the real FastAPI widget endpoints over HTTP with tokens issued via the production backend helper, capturing a request→response transcript that demonstrates the customer reconnect flow: opaque token issued once and stored hashed, valid resume of X, transcript limited to user/assistant roles, a token for X rejected (401) when reaching for Y (with Ty→Y positive control), missing/invalid/resolved tokens cleanly 401 without creating rows, and resolve invalidating the token. All seed data and the transient harness were cleaned up; the working tree is clean.

Evidence: US-071 widget endpoints end-to-end HTTP transcript (customer reconnect flow)

  • POST /widget/conversations/resume: (missing)->401; (invalid)->401; (valid Tx)->200 {id:X, status:active}, workspace_id hidden; token rows 2->2 (resume never creates a row)
  • GET /widget/conversations/{X}/transcript (Tx)->200, messages roles=[user,assistant] (system/tool filtered)
  • GET /widget/conversations/{Y}/transcript (Tx)->401 cross-conversation reject; positive control Ty->Y 200
  • Refresh: GET leaves expiry unchanged; POST slides 24h window forward
  • Resolve(X): 0 token rows survive (purge trigger); POST resume Tx ->401; GET transcript Tx ->401
  • Issuance: raw token opaque (no dot segments), DB stores only sha256 hash != raw, raw not a key (0 rows)


=== Issuance: backend mints opaque tokens, stores ONLY the SHA-256 hash ===
  raw token Tx (returned once, to iframe only): dFpvmr2a7TMpADksZ1I4bjoFb1hUhHSln25FxFlgYN0
  is it a JWT?  False (opaque -> no dot segments)
  DB stores token_hash: 0f9d3d40dd314285fe51c435842053269257146c2709b0fd86ddc2b3b2b0e97d
  hash == sha256(Tx)?  True
  DB row equals raw token?  False  (raw never persisted)
  rows where token_hash == RAW Tx: 0  (0 -> raw is not a key)

=== POST /widget/conversations/resume — missing token (clean reject, no row created) ===
  POST /widget/conversations/resume
    X-Conversation-Token: (none)
    -> 401  {"detail": "missing conversation token"}

=== POST /widget/conversations/resume — invalid/unknown token ===
  POST /widget/conversations/resume
    X-Conversation-Token: not-a-re…(raw opaque token)
    -> 401  {"detail": "invalid or expired conversation token"}

=== POST /widget/conversations/resume — valid Tx resumes conversation X ===
  POST /widget/conversations/resume
    X-Conversation-Token: dFpvmr2a…(raw opaque token)
    -> 200  {"conversation": {"id": "a45367dd-32fa-4d67-aca9-fea93cb17919", "status": "active", "created_at": "2026-06-25T12:48:02.771076+00:00"}}
    resumes X: True; workspace_id hidden: True
  token rows before/after the rejected+valid resumes: 2 -> 2 (resume never creates a row)

=== GET /widget/conversations/{X}/transcript — Tx reads X (system/tool filtered out) ===
  GET /widget/conversations/a45367dd-32fa-4d67-aca9-fea93cb17919/transcript
    X-Conversation-Token: dFpvmr2a…(raw opaque token)
    -> 200  {"conversation": {"id": "a45367dd-32fa-4d67-aca9-fea93cb17919", "status": "active", "created_at": "2026-06-25T12:48:02.771076+00:00"}, "messages": [{"id": "23a8e06b-56b5-4578-beae-bb35e10e8ce9", "role": "user", "content": "How do I reset my password?", "created_at": "2026-06-25T12:48:02.775028+00:00"}, {"id": "dd19754f-bbbe-46b0-b1c8-943c2262a624", "role": "assistant", "content": "Open Settings \u2192 Security \u2192 Reset password.", "created_at": "2026-06-25T12:48:02.775028+00:00"}]}
    roles returned: ['user', 'assistant']  (system/tool excluded: True)

=== GET /widget/conversations/{Y}/transcript with Tx — CROSS-CONVERSATION reject ===
  GET /widget/conversations/e35e673f-c89a-4ac6-97be-978bb87b0ee1/transcript
    X-Conversation-Token: dFpvmr2a…(raw opaque token)
    -> 401  {"detail": "invalid conversation token for this conversation"}
    Tx cannot read Y: 401 (binding check: resumed id must equal path id)
    positive control — Ty reads Y: 200 (id=e35e673f-c89a-4ac6-97be-978bb87b0ee1)

=== Refresh semantics: POST /resume slides the 24h window; GET transcript does not ===
  expiry after GET transcript unchanged: True
  expiry after POST resume moved forward: True

=== Resolve X — token Tx is invalidated (row purged) and can no longer resume ===
  Tx rows surviving resolve: 0 (purge-on-resolve trigger)
  POST /widget/conversations/resume
    X-Conversation-Token: dFpvmr2a…(raw opaque token)
    -> 401  {"detail": "invalid or expired conversation token"}
  GET /widget/conversations/a45367dd-32fa-4d67-aca9-fea93cb17919/transcript
    X-Conversation-Token: dFpvmr2a…(raw opaque token)
    -> 401  {"detail": "invalid conversation token for this conversation"}

ALL END-TO-END CHECKS PASSED: opaque token issued once + stored hashed, resume/transcript work over HTTP, cross-conversation reads rejected, missing/expired/resolved tokens 401 cleanly, resolve invalidates the token.
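
The binding check and role filter exercised in the transcript above ("resumed id must equal path id") can be sketched as pure functions. This is an illustration only: `authorize_transcript`, `visible_messages`, and the `Resolver` callable are hypothetical names standing in for the endpoint logic and the service-role RPC call, and the missing-token response for the transcript route is assumed to match the one shown for resume.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical resolver: token hash -> bound conversation id, or None when
# the hash is unknown, expired, or belongs to a resolved conversation.
Resolver = Callable[[str], Optional[str]]

VISIBLE_ROLES = {"user", "assistant"}


def authorize_transcript(
    raw_token: Optional[str], path_id: str, resolve: Resolver
) -> tuple[int, str]:
    """Return (status, detail) in the style of the responses shown above."""
    if not raw_token:
        return 401, "missing conversation token"
    token_hash = hashlib.sha256(raw_token.encode("utf-8")).hexdigest()
    bound_id = resolve(token_hash)
    # Binding check: the conversation the token resolves to must be the one
    # named in the URL path; an unresolvable token fails the same way.
    if bound_id is None or bound_id != path_id:
        return 401, "invalid conversation token for this conversation"
    return 200, "ok"


def visible_messages(messages: list[dict]) -> list[dict]:
    """Drop system/tool rows so only user/assistant turns are returned."""
    return [m for m in messages if m.get("role") in VISIBLE_ROLES]
```

The caller never supplies the conversation the token should unlock; the path id is only compared against what the token itself resolves to, so a token for X has no route to Y.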
Evidence: US-071 committed security test output (DB RPC + trigger boundary)

  • unit: opaque/unique/URL-safe/not-JWT; hash deterministic 64-hex one-way != raw; empty rejected; TTL 24h
  • schema: RLS on + 0 policies; resume_conversation EXECUTE = service_role only; purge trigger present
  • step1: resume(Tx)->X only, resume(Ty)->Y control
  • step2: Tx X-bound, 'Tx requesting Y' rejected
  • step2b: anon RPC denied, service-role->X
  • refresh: POST slides expiry, GET (slide=False) does not
  • expiry: expired rejected & not resurrected
  • step3: resolve(X) purges Tx and rejects resume(Tx)
  • OK: 9 exact assertions, zero cross-conversation leak

  unit: tokens are opaque, unique, URL-safe, not JWT-shaped
  unit: hash is deterministic 64-hex, one-way, != raw token
  unit: empty token rejected by hash
  unit: TTL constant is 24h
  (4 unit checks passed)
  schema: conversation_tokens has RLS on + 0 policies (backend-mediated)
  schema: resume_conversation EXECUTE = service_role only (anon/authenticated denied)
  schema: purge-on-resolve trigger present
  step 1: resume(Tx) → X only; resume(Ty) → Y (positive control)
  step 2: Tx is X-bound; 'Tx requesting Y' rejected by binding check
  step 2b: anon RPC denied (non-200); service-role RPC → X (backend path)
  refresh: POST resume slides the 24h expiry; GET transcript (slide=False) does not
  expiry: an expired token is rejected and not resurrected
  step 3: resolve(X) purges Tx and rejects resume(Tx) (invalidated on resolve)
OK: US-071 integration passed — 9 exact assertions; the opaque token is X-bound, backend-mediated, refreshed on activity, expiry- and resolve-invalidated, with zero cross-conversation leak
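
The refresh, expiry, and resolve behaviour asserted above can be modelled in memory to make the rules concrete. This is a sketch of the semantics described in this PR (valid while `expires_at > now` and status is not resolved, slide only when requested, purge on resolve); `TokenStore` and its methods are hypothetical and stand in for the SQL RPC and trigger, which are not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

TTL = timedelta(hours=24)


@dataclass
class TokenStore:
    """In-memory stand-in for conversation_tokens plus conversation status."""

    expires: dict[str, datetime] = field(default_factory=dict)  # hash -> expires_at
    bound: dict[str, str] = field(default_factory=dict)         # hash -> conversation id
    status: dict[str, str] = field(default_factory=dict)        # conversation id -> status

    def issue(self, token_hash: str, conversation_id: str, now: datetime) -> None:
        self.bound[token_hash] = conversation_id
        self.expires[token_hash] = now + TTL
        self.status.setdefault(conversation_id, "active")

    def resume(self, token_hash: str, slide: bool, now: datetime) -> Optional[str]:
        conv = self.bound.get(token_hash)
        # Reject unknown or expired tokens; an expired row is never refreshed.
        if conv is None or self.expires[token_hash] <= now:
            return None
        if self.status[conv] == "resolved":
            return None
        if slide:
            # Only the POST resume path extends the window; a GET passes False.
            self.expires[token_hash] = now + TTL
        return conv

    def resolve(self, conversation_id: str) -> None:
        self.status[conversation_id] = "resolved"
        # Mirrors the purge trigger: delete every token bound to it.
        for h in [h for h, c in self.bound.items() if c == conversation_id]:
            del self.bound[h]
            del self.expires[h]
```

Checking expiry before sliding is what keeps an expired token from being resurrected by a later resume attempt.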

Pipeline

Updates from git push no-mistakes

✅ **intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

🔧 **Review** - 4 issues found → auto-fixed ✅
  • ⚠️ supabase/migrations/20260624160000_conversation_tokens.sql:67 - The migration's RPC signature changed across the branch's commits: e922fc7 created resume_conversation(text), e61eb01 changed it to resume_conversation(text, boolean). Editing an already-applied migration file is a hazard: any environment that applied the earlier version - notably the shared local test DB the author was told not to wipe - keeps the stale single-arg function, and a plain supabase migration up will NOT reconcile it (it tracks applied filenames). On such a DB the mandated security test (select ... from public.resume_conversation($1, $2)) and the backend's 2-arg RPC call would fail with 'function does not exist'. Fresh CI/prod applies are correct; this only bites environments that ran the intermediate version. Recommend a db reset (or reapply) on the local DB before relying on the test.
  • ℹ️ AGENTS.md:116 - The verify hint references public.resume_conversation(text), but the function is now resume_conversation(text, boolean). has_function_privilege('anon','public.resume_conversation(text)','EXECUTE') will raise 'function does not exist' rather than return false. Update the documented signature to (text, boolean) (the test already uses the correct one).
  • ℹ️ supabase/migrations/20260624160000_conversation_tokens.sql:108 - The 24h window is encoded in two places that must stay in lockstep but can silently drift: CONVERSATION_TOKEN_TTL_SECONDS = 24*60*60 (Python, used only for the initial issuance expires_at) and the hardcoded interval '24 hours' in the RPC's slide-refresh UPDATE. Changing the constant would alter issuance TTL while leaving every slid token at 24h (and vice versa). Consider passing the interval into the RPC, or at least cross-referencing the two so the coupling is explicit.
  • ℹ️ backend/test_us071_conversation_tokens.py:299 - Step 2b issues httpx calls to PostgREST (port 54321) but only the asyncpg connect (port 54322) is guarded for clean-skip. If Postgres is up while PostgREST/Kong is down, the http.post raises an uncaught httpx.ConnectError and the integration layer errors instead of skipping cleanly, slightly violating the stated skip contract. Low probability since supabase start brings both up together.
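
One way to honour the skip contract raised in the last finding is to probe both ports before the integration layer starts. The sketch below is an assumption about how such a guard could look, not the fix that was committed; `service_reachable` and `should_run_integration` are hypothetical helpers, and only the port numbers come from the finding.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def should_run_integration(host: str = "127.0.0.1") -> bool:
    """Skip unless both Postgres (54322) and PostgREST (54321) are listening."""
    return service_reachable(host, 54322) and service_reachable(host, 54321)
```

An alternative with the same effect would be to catch `httpx.ConnectError` around the step 2b request and treat it as a skip.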

🔧 Fix: fix stale RPC signature doc and clean-skip PostgREST in US-071 test
✅ Re-checked - no issues remain.

✅ **Test** - passed

✅ No issues found.

  • python3 -m backend.test_us071_conversation_tokens — unit + integration/security layers green against local Supabase (9 integration assertions: RLS deny-all, service_role-only RPC, anon-denied via PostgREST, Tx→X only with Ty→Y control, slide/no-slide refresh, expiry rejection, resolve-purge invalidation)
  • Confirmed the US-071 migration objects exist in the local DB: public.conversation_tokens table, resume_conversation(text, boolean) RPC, and conversations_purge_tokens_on_resolve trigger
  • End-to-end through the real FastAPI endpoints via TestClient (transient backend/_e2e_us071_widget.py, since removed): issued tokens via the real _issue_conversation_token helper; exercised POST /widget/conversations/resume (missing→401, invalid→401, valid Tx→200 resuming X with workspace_id hidden, no row created) and GET /widget/conversations/{id}/transcript (Tx reads X with system/tool roles filtered out, Tx→Y rejected 401 with Ty→Y positive control, resolved-token→401)
  • Verified token issuance stores only the SHA-256 hash (raw token never persisted, not JWT-shaped) and that resolving X purges its token row, plus POST slides the 24h expiry while GET transcript does not
  • Verified DB cleanup left no E2E/US-071 seed rows and git status is clean
✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

hcho22 added 5 commits June 25, 2026 05:20
…hed storage, resume (US-071)

The anonymous support-widget customer's reconnect credential: cryptographically
random opaque bytes (NOT a Supabase JWT, signed with nothing - distinct from
US-068's bot JWT), returned in the clear exactly once at conversation creation
and stored only as a SHA-256 hash. Keeps the customer structurally off the
Supabase trust surface (ADR-0008, amends ADR-0004) with NO server-side
customer-identity table - continuity comes solely from the iframe-stored token.

- migration 20260623140000_conversation_tokens.sql: conversation_tokens
  (token_hash pk, conversation_id fk, expires_at, created_at) with RLS on + ZERO
  policies (deny-all to anon/authenticated); a service-role-only SECURITY DEFINER
  resume_conversation(p_token_hash) RPC that resolves the hash to its bound
  conversation iff not-expired AND status<>'resolved', slides the 24h window, and
  returns the ONE bound conversation (no caller-supplied id, so X cannot read Y);
  and an AFTER-UPDATE purge trigger that deletes a conversation's tokens on
  resolve (literal invalidation). Revokes EXECUTE from public/anon/authenticated
  by name - Supabase grants it directly via default privileges, so revoking PUBLIC
  alone is not enough.
- backend/conversation_tokens.py: pure primitives (generate_conversation_token,
  hash_conversation_token, 24h TTL).
- main.py: _issue_conversation_token (for US-078's first-message flow),
  POST /widget/conversations/resume, GET /widget/conversations/{id}/transcript -
  authed by the X-Conversation-Token header (never Authorization); transcript
  enforces the token<->id binding so a token for X cannot read Y's transcript.
- test_us071_conversation_tokens.py: unit layer (always runs) + security
  integration layer (DATABASE_URL-gated, skips cleanly) encoding the PRD
  validation test.

Typecheck/lint clean (no new mypy errors vs HEAD; flake8 clean); migration applies
cleanly; all 13 US-071 assertions plus US-066/067/068 tests pass.

vercel Bot commented Jun 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| agentic-rag | Ready | Preview, Comment | Jun 25, 2026 12:57pm |

github-actions (Contributor)

Retrieval eval — PR vs main

n = 50 questions × 3 modes (vector, keyword, hybrid) on a 14-chunk corpus. PR ran in 77.16s; main in 82.37s.

Headline (each cell: PR value, Δ vs main)

| Mode | recall@5 | MRR | nDCG@5 |
|---|---|---|---|
| vector | 0.860 (±0.000) | 0.772 (±0.000) | 0.779 (±0.000) |
| keyword | 0.110 (±0.000) | 0.120 (±0.000) | 0.112 (±0.000) |
| hybrid | 0.860 (±0.000) | 0.759 (±0.000) | 0.769 (±0.000) |

Per-category recall@5

| Mode | single_chunk | multi_hop | adversarial | paraphrase |
|---|---|---|---|---|
| vector | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |
| keyword | 0.250 (±0.000) | 0.033 (±0.000) | 0.000 (±0.000) | 0.000 (±0.000) |
| hybrid | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |

Comment is updated in place on each push by .github/workflows/retrieval-eval.yml (US-035). Comment-only — never blocks the build.

hcho22 merged commit c304905 into main on Jun 25, 2026
3 checks passed
hcho22 deleted the fm/us071-token-k8 branch on June 25, 2026 13:12