
feat(support): support-bot per-turn retrieval seam + non-security workspace filter (US-070)#33

Merged
hcho22 merged 2 commits into main from fm/us070-bot-retrieval on Jun 25, 2026

@hcho22 hcho22 commented Jun 25, 2026

Intent

Implement US-070 (Phase-2 PRD): the support-bot per-turn retrieval seam, now rebased onto main after US-068 (#31) and US-069 (#32) landed. Each customer turn mints a ~60s Supabase-compatible bot JWT (sub=bot_user_id, via US-068 mint_supabase_jwt), calls match_chunks AS the bot, then discards the token (no cross-turn cache), so the bot answers only from chunk_acl share-to-bot docs.

Key points a reviewer should know:

(1) US-069's PR #32 created backend/support_bot.py for bot PROVISIONING (provision_workspace_bot); US-070 adds the RETRIEVAL seam (run_bot_deflection_turn). During rebase I intentionally MERGED both into one support_bot.py module under a unified docstring - they are two phases of one support-bot lifecycle meeting at conversations.bot_user_id - preserving both sides' code verbatim.

(2) The US-068 minter is DEPENDENCY-INJECTED (MintToken callable) into the retrieval path to avoid a main.py import cycle and keep it pure/testable; the endpoint that wires it is US-080.

(3) filter_workspace_id is added to BOTH match_chunks and keyword_search as an ORDINARY NON-SECURITY narrowing filter (null default = no-op; trust boundary stays the auth.uid() workspace_membership clause, untouched) because hybrid fuses both legs; E4/E6/permissions stay byte-identical.

(4) An earlier review correctly flagged a no-op bearer-scrub in a finally block; I removed it and the comment now states the real structural no-cache guarantee honestly.

Tests: unit (no DB/secret), live RLS integration (bot sees only the shared doc, 0 from non-shared; filter narrows non-securely), and US-069's provisioning test all pass.

The bot bearer token must never reach an SSE/response/log surface; only DeflectionResult.customer_message is client-safe.
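The mint, use, discard shape described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in backend/support_bot.py: the MintToken and Pipeline signatures, the run_bot_turn_sketch name, and the single-field DeflectionResult are hypothetical stand-ins. The point it shows is that the token lives only in a local variable for one turn, and that the minter arrives as an argument rather than as an import.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

# Hypothetical shapes for illustration only; the real signatures may differ.
MintToken = Callable[[str, int], str]  # (sub, ttl_seconds) -> JWT string
Pipeline = Callable[[dict, str, str], Awaitable[str]]  # (headers, workspace_id, question)


@dataclass(frozen=True)
class DeflectionResult:
    customer_message: str  # the only field intended for the client


async def run_bot_turn_sketch(
    mint: MintToken,
    pipeline: Pipeline,
    bot_user_id: str,
    workspace_id: str,
    question: str,
) -> DeflectionResult:
    # Mint a fresh short-lived token for this turn; nothing is cached at module level.
    token = mint(bot_user_id, 60)
    # The bearer is confined to the headers handed to the retrieval pipeline.
    headers = {"Authorization": f"Bearer {token}"}
    message = await pipeline(headers, workspace_id, question)
    # The token goes out of scope here; the result carries only the message.
    return DeflectionResult(customer_message=message)
```

Because the minter is a plain callable, a unit test can pass a fake that records each call and check that two turns produce two distinct tokens, with neither appearing in the returned result.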

What Changed

  • Added run_bot_deflection_turn to backend/support_bot.py (merged with US-069's provision_workspace_bot into one support-bot lifecycle module): each customer turn mints a ~60s sub=bot_user_id Supabase JWT via a dependency-injected mint_supabase_jwt (avoids a main.py import cycle), runs the deflection pipeline as the bot principal, then discards the token with no cross-turn cache, so the bot answers only from chunk_acl share-to-bot documents; escalation.py/retrieval.py are updated to plumb workspace_id through.
  • Added filter_workspace_id uuid default null to both match_chunks and keyword_search (migrations 20260624150000 / 20260624150100) as an ordinary non-security narrowing filter AND-ed beside filter_topics on both hybrid legs; null is a no-op so /api/chat and the E4/E6 paths stay byte-identical, and the trust boundary remains the auth.uid() workspace_membership clause.
  • Added unit tests (no DB/secret) and live-RLS integration tests proving the bot sees only the shared doc and the filter narrows without leaking, plus AGENTS.md and a CONTEXT.md glossary sync; note run_bot_deflection_turn has no production caller yet (the wiring endpoint is US-080).
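To make the "narrowing, not security" distinction concrete, here is a toy in-memory model of the two predicates. It is not the SQL in the migrations: the Doc type, the readers set (which collapses workspace membership and chunk_acl grants into one set), and the visible function are all assumptions made for illustration. The fixture mirrors the D/E/F documents used in the evidence below.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Doc:
    name: str
    workspace: str
    readers: frozenset  # principals the security boundary lets through (assumed model)


def visible(docs: list, principal: str, filter_workspace_id: Optional[str] = None) -> list:
    # Security boundary: evaluated first and never influenced by the filter.
    allowed = [d for d in docs if principal in d.readers]
    # Ordinary narrowing: None is a no-op, otherwise it can only subtract.
    if filter_workspace_id is None:
        return [d.name for d in allowed]
    return [d.name for d in allowed if d.workspace == filter_workspace_id]


DOCS = [
    Doc("D", "W", frozenset({"owner", "bot"})),  # shared to the bot
    Doc("E", "W", frozenset({"owner"})),         # not shared
    Doc("F", "W2", frozenset({"owner"})),        # other workspace
]
```

In this model `visible(DOCS, "bot")` and `visible(DOCS, "bot", "W")` both give `["D"]`, while the owner sees `["D", "E", "F"]` unfiltered and `["D", "E"]` with the filter set to `"W"`. Omitting the filter never widens what a principal can read.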

Risk Assessment

✅ Low: The change is well-bounded and backward-compatible: migration bodies are byte-identical to their predecessors except an optional null-default narrowing param AND-ed under the untouched membership boundary, every new Python param defaults to None (no-op for existing callers), and the new bot-retrieval seam is covered by both unit and live RLS integration tests with no bearer-leak path.

Testing

Ran the US-070 unit seam test (6 groups, no DB/secret) and the live RLS integration test (5 checks, the PRD US-070 validation) against the running local Supabase with the migrations confirmed applied; both pass. Verified no regression in the touched paths via the deflection-pipeline, US-069 provisioning, and permissions/RLS suites.

As reviewer-visible evidence I captured the actual raw PostgREST match_chunks/keyword_search responses under a self-minted bot JWT: the bot retrieves only the share-to-bot doc D (non-shared E and other-workspace F return zero rows), the owner positive control sees D+E (proving the data is retrievable), and filter_workspace_id only subtracts within membership - demonstrating the auth.uid() membership clause stays the trust boundary.

This is backend security work with no UI surface, so the evidence is API-response transcripts rather than screenshots. Overall: all green, no findings.
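For readers unfamiliar with what "self-minted bot JWT" involves, the following is a minimal standard-library sketch of signing an HS256 token with sub and role=authenticated claims and a 60-second lifetime. It is not the repository's _mint_jwt or mint_supabase_jwt: the function name, the now parameter, and the exact claim set are assumptions, and the real helpers may add further claims (the integration helper also takes an email, for instance).

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWT segments use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_hs256_jwt(sub: str, secret: str, ttl_seconds: int = 60, now: int | None = None) -> str:
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    # Assumed minimal claim set; a real Supabase-compatible token may carry more.
    claims = {"sub": sub, "role": "authenticated", "iat": issued, "exp": issued + ttl_seconds}
    head = _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    body = _b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    signing_input = f"{head}.{body}"
    signature = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

A token built this way would be sent as the Authorization bearer, so that the database resolves auth.uid() to the sub claim for the duration of the request.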

Evidence: Raw PostgREST RPC responses (bot vs owner) proving the US-070 retrieval boundary

==============================================================================
US-070 SUPPORT-BOT RETRIEVAL — RAW PostgREST RPC RESPONSES (live local Supabase)
==============================================================================
Supabase REST : http://127.0.0.1:54321
Workspace  W  : 2141c4c5-7271-4cef-8d3a-228c3d972389
Workspace  W2 : ccc923cf-08a4-4f9a-9f0d-14994131fec7
Owner  U      : cddf0a78-c766-4b33-87c3-186e7e205fd2  (member of W and W2; owns D,E,F)
Bot    B      : a441a6a8-bd27-4df0-b834-ef71c2fb2f61  (member of W only; share-to-bot on D's chunk)
Bot bearer    : self-minted HS256 role=authenticated JWT, sub=B (US-068 shape)

[1] match_chunks AS OWNER U, filter_workspace_id=W  (positive control)
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)']
    (data IS retrievable — both D and E exist and match)

[2] match_chunks AS BOT B,   filter_workspace_id=W   <<< PRD US-070 VALIDATION
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    raw rows:
    [
      {
        "document_id": "f1c9c0dc-78fd-4125-ad66-e34ed296bc26",
        "content": "shared answer about returns policy",
        "granting_principal_display": "bot-a441a6a8@test.local"
      }
    ]
    => bot retrieves ONLY the share-to-bot doc D; NON-shared E -> 0 rows (zero-leak)

[3] match_chunks AS BOT B,   NO workspace filter
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    => still ONLY D — the chunk_acl grant (not the filter) is the boundary

[4] match_chunks AS OWNER U — filter_workspace_id is NON-SECURITY narrowing
    no filter -> ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)', 'F  (in W2, owner-only, other workspace)']
    filter=W  -> ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)']
    filter=W2 -> ['F  (in W2, owner-only, other workspace)']
    => the filter only SUBTRACTS within membership; omitting it leaks nothing

[5] keyword_search AS BOT B, filter_workspace_id=W  (hybrid's second leg)
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    => keyword leg scoped identically; both hybrid legs coherent

==============================================================================
RESULT: bot answers only from share-to-bot docs; filter_workspace_id narrows
        non-securely; the auth.uid() membership clause remains the boundary.
==============================================================================
Evidence: Evidence capture script (reusable; drives real PostgREST RPCs as bot/owner)
"""US-070 evidence capture: drives the REAL match_chunks / keyword_search RPCs
through PostgREST under a self-minted bot JWT (the exact production path) and
records the raw HTTP responses so a reviewer can see the support-bot retrieval
boundary working end-to-end.

This reuses the integration test's fixture/seed/mint helpers verbatim, then prints
labelled raw API responses (which doc each principal actually retrieves) and writes
the same transcript to a file. Not a pass/fail harness — it shows the product
surface (PostgREST RPC responses) directly.
"""

from __future__ import annotations

import asyncio
import json
import os
import sys
from pathlib import Path

import asyncpg
import httpx

ROOT = Path(__file__).resolve().parent
# Import the integration test module's helpers (seed/mint/fixture).
sys.path.insert(0, "/Users/hcho/.no-mistakes/worktrees/3074c1251a17/01KVY3Q4QB7C7P51750366T7VX")
sys.path.insert(0, "/Users/hcho/.no-mistakes/worktrees/3074c1251a17/01KVY3Q4QB7C7P51750366T7VX/backend")

from backend.test_us070_bot_retrieval_integration import (  # noqa: E402
    EMBEDDING,
    LOCAL_ANON_KEY,
    LOCAL_JWT_SECRET,
    Fixture,
    _headers,
    _mint_jwt,
    _seed,
    _cleanup,
)

OUT = ROOT / "us070_postgrest_responses.txt"
lines: list[str] = []


def emit(s: str = "") -> None:
    print(s)
    lines.append(s)


async def call(http, url, headers, rpc, body):
    r = await http.post(f"{url}/rest/v1/rpc/{rpc}", headers=headers, json=body)
    return r.status_code, r.json()


async def main() -> None:
    db_url = os.environ["DATABASE_URL"]
    url = os.environ.get("SUPABASE_URL", "http://127.0.0.1:54321")
    anon = os.environ.get("SUPABASE_ANON_KEY", LOCAL_ANON_KEY)
    secret = os.environ.get("SUPABASE_JWT_SECRET", LOCAL_JWT_SECRET)

    http = httpx.AsyncClient(timeout=10.0)
    fx = Fixture()
    conn = await asyncpg.connect(db_url)
    # Human-readable labels for the doc ids the RPC returns.
    label = {
        fx.doc_d: "D  (in W,  SHARED-to-bot via chunk_acl)",
        fx.doc_e: "E  (in W,  NOT shared)",
        fx.doc_f: "F  (in W2, owner-only, other workspace)",
    }

    def describe(rows):
        seen = {row["document_id"] for row in rows}
        return [label.get(d, d) for d in sorted(seen, key=lambda d: list(label).index(d) if d in label else 99)]

    try:
        await conn.execute("notify pgrst, 'reload schema'")
        await _seed(conn, fx)
        owner_email = await conn.fetchval("select email from auth.users where id=$1::uuid", fx.owner)
        bot_email = await conn.fetchval("select email from auth.users where id=$1::uuid", fx.bot)
        owner_h = _headers(_mint_jwt(fx.owner, owner_email, secret), anon)
        bot_h = _headers(_mint_jwt(fx.bot, bot_email, secret), anon)

        emit("=" * 78)
        emit("US-070 SUPPORT-BOT RETRIEVAL — RAW PostgREST RPC RESPONSES (live local Supabase)")
        emit("=" * 78)
        emit(f"Supabase REST : {url}")
        emit(f"Workspace  W  : {fx.ws1}")
        emit(f"Workspace  W2 : {fx.ws2}")
        emit(f"Owner  U      : {fx.owner}  (member of W and W2; owns D,E,F)")
        emit(f"Bot    B      : {fx.bot}  (member of W only; share-to-bot on D's chunk)")
        emit("Bot bearer    : self-minted HS256 role=authenticated JWT, sub=B (US-068 shape)")
        emit("")

        base = {"query_embedding": EMBEDDING, "match_threshold": 0.5, "match_count": 50}

        # 1. owner positive control, scoped to W
        st, rows = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        emit("[1] match_chunks AS OWNER U, filter_workspace_id=W  (positive control)")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    (data IS retrievable — both D and E exist and match)")
        emit("")

        # 2. bot scoped to W — the PRD validation
        st, rows = await call(http, url, bot_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        emit("[2] match_chunks AS BOT B,   filter_workspace_id=W   <<< PRD US-070 VALIDATION")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    raw rows:")
        emit("    " + json.dumps([
            {"document_id": r["document_id"], "content": r["content"],
             "granting_principal_display": r["granting_principal_display"]}
            for r in rows], indent=2).replace("\n", "\n    "))
        emit("    => bot retrieves ONLY the share-to-bot doc D; NON-shared E -> 0 rows (zero-leak)")
        emit("")

        # 3. bot with NO filter — grant is the gate, not the filter
        st, rows = await call(http, url, bot_h, "match_chunks", base)
        emit("[3] match_chunks AS BOT B,   NO workspace filter")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    => still ONLY D — the chunk_acl grant (not the filter) is the boundary")
        emit("")

        # 4. owner no filter vs W vs W2 — filter is non-security narrowing
        st, rows_all = await call(http, url, owner_h, "match_chunks", base)
        st, rows_w = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        st, rows_w2 = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws2})
        emit("[4] match_chunks AS OWNER U — filter_workspace_id is NON-SECURITY narrowing")
        emit(f"    no filter -> {describe(rows_all)}")
        emit(f"    filter=W  -> {describe(rows_w)}")
        emit(f"    filter=W2 -> {describe(rows_w2)}")
        emit("    => the filter only SUBTRACTS within membership; omitting it leaks nothing")
        emit("")

        # 5. keyword leg honours the filter identically
        kbase = {"query": "returns policy", "match_count": 50}
        st, rows = await call(http, url, bot_h, "keyword_search", {**kbase, "filter_workspace_id": fx.ws1})
        emit("[5] keyword_search AS BOT B, filter_workspace_id=W  (hybrid's second leg)")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    => keyword leg scoped identically; both hybrid legs coherent")
        emit("")
        emit("=" * 78)
        emit("RESULT: bot answers only from share-to-bot docs; filter_workspace_id narrows")
        emit("        non-securely; the auth.uid() membership clause remains the boundary.")
        emit("=" * 78)
    finally:
        await _cleanup(conn, fx)
        await conn.close()
        await http.aclose()

    OUT.write_text("\n".join(lines) + "\n")
    emit("")
    emit(f"(transcript written to {OUT})")


if __name__ == "__main__":
    asyncio.run(main())
Evidence: Live integration test transcript (PRD validation, 5 checks)
owner sees both D and E in W (positive control — data is retrievable)
bot sees EXACTLY {D} — shared doc only; E (not shared) -> 0 rows
bot with NO workspace filter still sees only {D} (grant is the gate, not the filter)
owner: no-filter -> {D,E,F}; filter=W -> {D,E}; filter=W2 -> {F} (narrowing only)
keyword_search honours filter_workspace_id identically (both hybrid legs scoped)

PASS: 5 US-070 live integration checks

Pipeline

Updates from git push no-mistakes

✅ **Intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

⚠️ **Review** - 1 info
  • ℹ️ backend/support_bot.py:372 - run_bot_deflection_turn (and the workspace_id plumbing into run_deflection_pipeline) has no production caller yet — only the two new tests exercise it. This is intentional per the US-070 scope (the endpoint that wires it is US-080), so it is not dead code to remove; flagging only so it isn't mistaken for an oversight during review.
✅ **Test** - passed

✅ No issues found.

  • python -m backend.test_us070_bot_retrieval (unit, no DB/secret): per-turn mint sub=bot_user_id+60s TTL is the Bearer on both legs, filter_workspace_id on both legs, no cross-turn token cache, bot token absent from DeflectionResult and from propagated errors, no-visible-chunks => generic-deferral escalate
  • python -m backend.test_us070_bot_retrieval_integration against live Supabase (DATABASE_URL=postgresql://postgres:postgres@127.0.0.1:54322/postgres, SUPABASE_URL=http://127.0.0.1:54321): bot sees EXACTLY {D} (share-to-bot), 0 from non-shared E and other-workspace F; owner positive control sees D+E; filter_workspace_id narrows non-securely on both vector and keyword legs
  • Confirmed US-070 migrations applied live: match_chunks/keyword_search both carry filter_workspace_id uuid (queried pg_proc identity args)
  • Regression checks (no boundary/byte-identical change): python -m backend.test_deflection_pipeline, python -m backend.test_us069_bot_provisioning (module-merge intact), python -m backend.test_permissions
  • Evidence capture script driving the real PostgREST RPCs as the bot vs owner principals, writing raw HTTP responses to the evidence dir
✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

hcho22 added 2 commits June 24, 2026 17:42
…ty workspace filter (US-070)

Each customer turn the deflection pipeline now retrieves AS the per-workspace
support bot, not as a privileged reader, so the bot answers only from
share-to-bot documents (ADR-0008; no new content role, no new enforcement path).

- backend/support_bot.py (new): run_bot_deflection_turn mints a ~60s
  role=authenticated bot JWT (sub=bot_user_id) via an injected minter
  (US-068 mint_supabase_jwt, dependency-injected to avoid a main.py import
  cycle and keep the module pure/testable), runs the ADR-0003 deflection
  pipeline with that JWT in the Supabase headers so match_chunks/keyword_search
  resolve auth.uid() to the bot, then discards the token (no cross-turn cache).
  The bearer is confined to the Supabase headers and never reaches an
  SSE/response/log surface; only DeflectionResult.customer_message is client-safe.

- migrations 20260624150000 / 20260624150100: add filter_workspace_id uuid
  default null to match_chunks / keyword_search as an ORDINARY non-security
  narrowing filter (the extension point the US-007 note + documents_workspace_id_idx
  reserved). AND-ed beside filter_topics, so it can only subtract within what the
  auth.uid()-resolved membership clause already allows; null is a no-op, so
  /api/chat and E4/E6 are byte-identical. The trust boundary stays the membership
  EXISTS clause, never this param. Applied to both hybrid legs for coherence.

- retrieval.py / escalation.run_deflection_pipeline: thread an optional
  workspace_id=None through search_documents/keyword_search/keyword_only_search/
  hybrid_search and the pipeline; default None leaves every existing caller
  unchanged.

Tests: test_us070_bot_retrieval.py (no DB/LLM/secret) pins mint-per-turn,
bot bearer + filter on both legs, no caching, token absent from result/error.
test_us070_bot_retrieval_integration.py (live Postgres+PostgREST under a
self-minted bot JWT) pins the PRD validation: bot retrieves only the
share-to-bot doc and 0 rows from a non-shared doc; filter_workspace_id narrows
non-securely. AGENTS.md documents the seam and the sharp edge.
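One way to picture the optional workspace_id plumbing is a request-body builder that leaves the RPC payload unchanged unless a workspace is supplied. This is a hedged sketch, not the code in retrieval.py: the function name is hypothetical, and omitting the key (rather than sending an explicit null) is an assumption. Either way the SQL default of null applies. The parameter names match the RPC body used by the evidence capture script.

```python
from __future__ import annotations

from typing import Any, Optional, Sequence


def build_match_chunks_body(
    query_embedding: Sequence[float],
    match_threshold: float,
    match_count: int,
    workspace_id: Optional[str] = None,
) -> dict[str, Any]:
    body: dict[str, Any] = {
        "query_embedding": list(query_embedding),
        "match_threshold": match_threshold,
        "match_count": match_count,
    }
    # Existing callers pass no workspace_id, so their payload is unchanged.
    if workspace_id is not None:
        body["filter_workspace_id"] = workspace_id
    return body
```

With the default, the payload has exactly the three pre-existing keys, which is what keeps callers such as /api/chat unaffected.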

vercel Bot commented Jun 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agentic-rag | Ready | Preview, Comment | Jun 25, 2026 1:01am |

@github-actions

Retrieval eval — PR vs main

n = 50 questions × 3 modes (vector, keyword, hybrid) on a 14-chunk corpus. PR ran in 82.29s; main in 71.79s.

Headline (each cell: PR value, Δ vs main)

| Mode | recall@5 | MRR | nDCG@5 |
| --- | --- | --- | --- |
| vector | 0.860 (±0.000) | 0.772 (±0.000) | 0.779 (±0.000) |
| keyword | 0.110 (±0.000) | 0.120 (±0.000) | 0.112 (±0.000) |
| hybrid | 0.860 (±0.000) | 0.759 (±0.000) | 0.769 (±0.000) |

Per-category recall@5

| Mode | single_chunk | multi_hop | adversarial | paraphrase |
| --- | --- | --- | --- | --- |
| vector | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |
| keyword | 0.250 (±0.000) | 0.033 (±0.000) | 0.000 (±0.000) | 0.000 (±0.000) |
| hybrid | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |

Comment is updated in place on each push by .github/workflows/retrieval-eval.yml (US-035). Comment-only — never blocks the build.
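As a reading aid for the three headline metrics, here is a small sketch using the common binary-relevance definitions. The eval workflow's exact formulas are not shown in this PR, so treat these as assumptions: recall@k as the fraction of relevant chunks found in the top k, reciprocal rank as 1 over the position of the first relevant chunk (MRR is its mean over questions), and nDCG@k with a log2 position discount.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    # 1 / rank of the first relevant item, or 0.0 if none was retrieved.
    for position, chunk_id in enumerate(ranked, start=1):
        if chunk_id in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    # Binary gains, discounted by log2(position + 1).
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, chunk_id in enumerate(ranked[:k], start=1)
        if chunk_id in relevant
    )
    # Ideal ordering places every relevant item at the top.
    ideal = sum(1.0 / math.log2(position + 1) for position in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0
```

For a ranking `["c3", "c1", "c9", "c2", "c7"]` with relevant set `{"c1", "c2"}`, recall@5 is 1.0 and the reciprocal rank is 0.5, since the first relevant chunk sits at position 2.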

@hcho22 hcho22 merged commit 59ad0d2 into main Jun 25, 2026
3 checks passed
@hcho22 hcho22 deleted the fm/us070-bot-retrieval branch June 25, 2026 03:19