feat(support): support-bot per-turn retrieval seam + non-security workspace filter (US-070) by hcho22 · Pull Request #33 · hcho22/Agentic_RAG

hcho22 · 2026-06-25T01:01:16Z

Intent

Implement US-070 (Phase-2 PRD): the support-bot per-turn retrieval seam, now rebased onto main after US-068 (#31) and US-069 (#32) landed. Each customer turn mints a ~60s Supabase-compatible bot JWT (sub=bot_user_id, via US-068 mint_supabase_jwt), calls match_chunks AS the bot, then discards the token (no cross-turn cache), so the bot answers only from chunk_acl share-to-bot docs.

Key points a reviewer should know:

(1) US-069's PR #32 created backend/support_bot.py for bot PROVISIONING (provision_workspace_bot); US-070 adds the RETRIEVAL seam (run_bot_deflection_turn). During rebase I intentionally MERGED both into one support_bot.py module under a unified docstring - they are two phases of one support-bot lifecycle meeting at conversations.bot_user_id - preserving both sides' code verbatim.

(2) The US-068 minter is DEPENDENCY-INJECTED (MintToken callable) into the retrieval path to avoid a main.py import cycle and keep it pure/testable; the endpoint that wires it is US-080.

(3) filter_workspace_id is added to BOTH match_chunks and keyword_search as an ORDINARY NON-SECURITY narrowing filter (null default = no-op; trust boundary stays the auth.uid() workspace_membership clause, untouched) because hybrid fuses both legs; E4/E6/permissions stay byte-identical.

(4) An earlier review correctly flagged a no-op bearer-scrub in a finally block; I removed it and the comment now states the real structural no-cache guarantee honestly.

Tests: unit (no DB/secret), live RLS integration (bot sees only the shared doc, 0 from non-shared; filter narrows non-securely), and US-069's provisioning test all pass.

The bot bearer token must never reach an SSE/response/log surface; only DeflectionResult.customer_message is client-safe.
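The per-turn mint and the injected minter described above can be sketched as follows. This is a minimal illustration, not the code in backend/support_bot.py: `run_turn`, `TurnResult`, the `Retrieve` callable, the two-argument `MintToken` shape, and the deferral message are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical shapes; the real signatures are not shown in this PR description.
MintToken = Callable[[str, int], str]                 # (sub, ttl_seconds) -> JWT
Retrieve = Callable[[str, str, str], Sequence[str]]   # (bearer, workspace_id, question) -> chunks


@dataclass(frozen=True)
class TurnResult:
    """Client-safe result: it has no field that could carry the bearer."""
    customer_message: str
    escalate: bool


def run_turn(bot_user_id: str, workspace_id: str, question: str,
             mint: MintToken, retrieve: Retrieve, ttl_seconds: int = 60) -> TurnResult:
    # Mint a fresh short-lived token for this turn only.
    bearer = mint(bot_user_id, ttl_seconds)
    chunks = retrieve(bearer, workspace_id, question)
    # `bearer` is a local variable and nothing stores it, so there is no
    # cross-turn cache to scrub once this function returns.
    if not chunks:
        # No visible chunks: generic deferral and escalate (message text is made up).
        return TurnResult("Let me connect you with a teammate.", escalate=True)
    return TurnResult(" ".join(chunks), escalate=False)


# Fakes standing in for the US-068 minter and the retrieval legs.
minted: List[str] = []
seen_bearers: List[str] = []


def fake_mint(sub: str, ttl: int) -> str:
    token = f"tok-{len(minted)}-{sub}-{ttl}"
    minted.append(token)
    return token


def fake_retrieve(bearer: str, workspace_id: str, question: str) -> Sequence[str]:
    seen_bearers.append(bearer)
    return ["shared answer about returns policy"] if "returns" in question else []


r1 = run_turn("bot-1", "ws-1", "what is the returns policy?", fake_mint, fake_retrieve)
r2 = run_turn("bot-1", "ws-1", "something unrelated", fake_mint, fake_retrieve)
print(len(minted), r1.escalate, r2.escalate)
```

Because the minter arrives as a plain callable, a test can swap in a fake like `fake_mint` without importing main.py or holding a real secret, which is the property the dependency injection is meant to buy.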

What Changed

Added run_bot_deflection_turn to backend/support_bot.py (merged with US-069's provision_workspace_bot into one support-bot lifecycle module): each customer turn mints a ~60s sub=bot_user_id Supabase JWT via a dependency-injected mint_supabase_jwt (avoids a main.py import cycle), runs the deflection pipeline as the bot principal, then discards the token with no cross-turn cache, so the bot answers only from chunk_acl share-to-bot documents; escalation.py/retrieval.py are updated to plumb workspace_id through.
Added filter_workspace_id uuid default null to both match_chunks and keyword_search (migrations 20260624150000 / 20260624150100) as an ordinary non-security narrowing filter AND-ed beside filter_topics on both hybrid legs; null is a no-op so /api/chat and the E4/E6 paths stay byte-identical, and the trust boundary remains the auth.uid() workspace_membership clause.
Added unit tests (no DB/secret) and live-RLS integration tests proving the bot sees only the shared doc and the filter narrows without leaking, plus AGENTS.md and a CONTEXT.md glossary sync; note run_bot_deflection_turn has no production caller yet (the wiring endpoint is US-080).
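To make the "non-security narrowing" claim concrete, here is a toy in-memory model of the same fixture used in the evidence (docs D, E, F; owner U; bot B). It is only a sketch: the real check lives in the SQL of match_chunks / keyword_search, which is not reproduced here, and `Doc`, `DOCS`, `MEMBERSHIPS`, and `search` are hypothetical names. The `allowed` expression is a simplified stand-in for the membership clause plus the chunk_acl grant.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Doc:
    name: str
    workspace: str
    owner: str
    shared_with: FrozenSet[str] = frozenset()


DOCS = [
    Doc("D", "W", "U", frozenset({"B"})),  # in W, shared to the bot
    Doc("E", "W", "U"),                    # in W, not shared
    Doc("F", "W2", "U"),                   # in W2, owner-only
]
MEMBERSHIPS = {"U": {"W", "W2"}, "B": {"W"}}


def search(caller: str, filter_workspace_id: Optional[str] = None) -> List[str]:
    visible = []
    for doc in DOCS:
        # Trust boundary (toy version): caller must be a member of the doc's
        # workspace and either own the doc or hold a grant on it.
        allowed = doc.workspace in MEMBERSHIPS[caller] and (
            doc.owner == caller or caller in doc.shared_with
        )
        # Ordinary narrowing filter: None is a no-op, otherwise it can only
        # remove rows that `allowed` already admitted.
        narrowed = filter_workspace_id is None or doc.workspace == filter_workspace_id
        if allowed and narrowed:
            visible.append(doc.name)
    return visible


print(search("U"))        # ['D', 'E', 'F']
print(search("U", "W"))   # ['D', 'E']
print(search("U", "W2"))  # ['F']
print(search("B", "W"))   # ['D']
print(search("B"))        # ['D']
```

Since `narrowed` is AND-ed with `allowed`, every filtered result is a subset of the unfiltered one for the same caller, so omitting or changing the filter cannot widen what a principal sees.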

Risk Assessment

✅ Low: The change is well-bounded and backward-compatible: migration bodies are byte-identical to their predecessors except an optional null-default narrowing param AND-ed under the untouched membership boundary, every new Python param defaults to None (no-op for existing callers), and the new bot-retrieval seam is covered by both unit and live RLS integration tests with no bearer-leak path.

Testing

Ran the US-070 unit seam test (6 groups, no DB/secret) and the live RLS integration test (5 checks, the PRD US-070 validation) against the running local Supabase with the migrations confirmed applied; both pass. Verified no regression in the touched paths via the deflection-pipeline, US-069 provisioning, and permissions/RLS suites. As reviewer-visible evidence I captured the actual raw PostgREST match_chunks/keyword_search responses under a self-minted bot JWT: the bot retrieves only the share-to-bot doc D (non-shared E and other-workspace F return zero rows), the owner positive control sees D+E (proving the data is retrievable), and filter_workspace_id only subtracts within membership - demonstrating the auth.uid() membership clause stays the trust boundary. This is backend security work with no UI surface, so the evidence is API-response transcripts rather than screenshots. Overall: all green, no findings.

Evidence: Raw PostgREST RPC responses (bot vs owner) proving the US-070 retrieval boundary


==============================================================================
US-070 SUPPORT-BOT RETRIEVAL — RAW PostgREST RPC RESPONSES (live local Supabase)
==============================================================================
Supabase REST : http://127.0.0.1:54321
Workspace  W  : 2141c4c5-7271-4cef-8d3a-228c3d972389
Workspace  W2 : ccc923cf-08a4-4f9a-9f0d-14994131fec7
Owner  U      : cddf0a78-c766-4b33-87c3-186e7e205fd2  (member of W and W2; owns D,E,F)
Bot    B      : a441a6a8-bd27-4df0-b834-ef71c2fb2f61  (member of W only; share-to-bot on D's chunk)
Bot bearer    : self-minted HS256 role=authenticated JWT, sub=B (US-068 shape)

[1] match_chunks AS OWNER U, filter_workspace_id=W  (positive control)
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)']
    (data IS retrievable — both D and E exist and match)

[2] match_chunks AS BOT B,   filter_workspace_id=W   <<< PRD US-070 VALIDATION
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    raw rows:
    [
      {
        "document_id": "f1c9c0dc-78fd-4125-ad66-e34ed296bc26",
        "content": "shared answer about returns policy",
        "granting_principal_display": "bot-a441a6a8@test.local"
      }
    ]
    => bot retrieves ONLY the share-to-bot doc D; NON-shared E -> 0 rows (zero-leak)

[3] match_chunks AS BOT B,   NO workspace filter
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    => still ONLY D — the chunk_acl grant (not the filter) is the boundary

[4] match_chunks AS OWNER U — filter_workspace_id is NON-SECURITY narrowing
    no filter -> ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)', 'F  (in W2, owner-only, other workspace)']
    filter=W  -> ['D  (in W,  SHARED-to-bot via chunk_acl)', 'E  (in W,  NOT shared)']
    filter=W2 -> ['F  (in W2, owner-only, other workspace)']
    => the filter only SUBTRACTS within membership; omitting it leaks nothing

[5] keyword_search AS BOT B, filter_workspace_id=W  (hybrid's second leg)
    HTTP 200  ->  ['D  (in W,  SHARED-to-bot via chunk_acl)']
    => keyword leg scoped identically; both hybrid legs coherent

==============================================================================
RESULT: bot answers only from share-to-bot docs; filter_workspace_id narrows
        non-securely; the auth.uid() membership clause remains the boundary.
==============================================================================

Evidence: Evidence capture script (reusable; drives real PostgREST RPCs as bot/owner)

"""US-070 evidence capture: drives the REAL match_chunks / keyword_search RPCs
through PostgREST under a self-minted bot JWT (the exact production path) and
records the raw HTTP responses so a reviewer can see the support-bot retrieval
boundary working end-to-end.

This reuses the integration test's fixture/seed/mint helpers verbatim, then prints
labelled raw API responses (which doc each principal actually retrieves) and writes
the same transcript to a file. Not a pass/fail harness — it shows the product
surface (PostgREST RPC responses) directly.
"""

from __future__ import annotations

import asyncio
import json
import os
import sys
from pathlib import Path

import asyncpg
import httpx

ROOT = Path(__file__).resolve().parent
# Import the integration test module's helpers (seed/mint/fixture).
sys.path.insert(0, "/Users/hcho/.no-mistakes/worktrees/3074c1251a17/01KVY3Q4QB7C7P51750366T7VX")
sys.path.insert(0, "/Users/hcho/.no-mistakes/worktrees/3074c1251a17/01KVY3Q4QB7C7P51750366T7VX/backend")

from backend.test_us070_bot_retrieval_integration import (  # noqa: E402
    EMBEDDING,
    LOCAL_ANON_KEY,
    LOCAL_JWT_SECRET,
    Fixture,
    _headers,
    _mint_jwt,
    _seed,
    _cleanup,
)

OUT = ROOT / "us070_postgrest_responses.txt"
lines: list[str] = []


def emit(s: str = "") -> None:
    print(s)
    lines.append(s)


async def call(http, url, headers, rpc, body):
    r = await http.post(f"{url}/rest/v1/rpc/{rpc}", headers=headers, json=body)
    return r.status_code, r.json()


async def main() -> None:
    db_url = os.environ["DATABASE_URL"]
    url = os.environ.get("SUPABASE_URL", "http://127.0.0.1:54321")
    anon = os.environ.get("SUPABASE_ANON_KEY", LOCAL_ANON_KEY)
    secret = os.environ.get("SUPABASE_JWT_SECRET", LOCAL_JWT_SECRET)

    http = httpx.AsyncClient(timeout=10.0)
    fx = Fixture()
    conn = await asyncpg.connect(db_url)
    # Human-readable labels for the doc ids the RPC returns.
    label = {
        fx.doc_d: "D  (in W,  SHARED-to-bot via chunk_acl)",
        fx.doc_e: "E  (in W,  NOT shared)",
        fx.doc_f: "F  (in W2, owner-only, other workspace)",
    }

    def describe(rows):
        seen = {row["document_id"] for row in rows}
        return [label.get(d, d) for d in sorted(seen, key=lambda d: list(label).index(d) if d in label else 99)]

    try:
        await conn.execute("notify pgrst, 'reload schema'")
        await _seed(conn, fx)
        owner_email = await conn.fetchval("select email from auth.users where id=$1::uuid", fx.owner)
        bot_email = await conn.fetchval("select email from auth.users where id=$1::uuid", fx.bot)
        owner_h = _headers(_mint_jwt(fx.owner, owner_email, secret), anon)
        bot_h = _headers(_mint_jwt(fx.bot, bot_email, secret), anon)

        emit("=" * 78)
        emit("US-070 SUPPORT-BOT RETRIEVAL — RAW PostgREST RPC RESPONSES (live local Supabase)")
        emit("=" * 78)
        emit(f"Supabase REST : {url}")
        emit(f"Workspace  W  : {fx.ws1}")
        emit(f"Workspace  W2 : {fx.ws2}")
        emit(f"Owner  U      : {fx.owner}  (member of W and W2; owns D,E,F)")
        emit(f"Bot    B      : {fx.bot}  (member of W only; share-to-bot on D's chunk)")
        emit("Bot bearer    : self-minted HS256 role=authenticated JWT, sub=B (US-068 shape)")
        emit("")

        base = {"query_embedding": EMBEDDING, "match_threshold": 0.5, "match_count": 50}

        # 1. owner positive control, scoped to W
        st, rows = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        emit("[1] match_chunks AS OWNER U, filter_workspace_id=W  (positive control)")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    (data IS retrievable — both D and E exist and match)")
        emit("")

        # 2. bot scoped to W — the PRD validation
        st, rows = await call(http, url, bot_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        emit("[2] match_chunks AS BOT B,   filter_workspace_id=W   <<< PRD US-070 VALIDATION")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    raw rows:")
        emit("    " + json.dumps([
            {"document_id": r["document_id"], "content": r["content"],
             "granting_principal_display": r["granting_principal_display"]}
            for r in rows], indent=2).replace("\n", "\n    "))
        emit("    => bot retrieves ONLY the share-to-bot doc D; NON-shared E -> 0 rows (zero-leak)")
        emit("")

        # 3. bot with NO filter — grant is the gate, not the filter
        st, rows = await call(http, url, bot_h, "match_chunks", base)
        emit("[3] match_chunks AS BOT B,   NO workspace filter")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    => still ONLY D — the chunk_acl grant (not the filter) is the boundary")
        emit("")

        # 4. owner no filter vs W vs W2 — filter is non-security narrowing
        st, rows_all = await call(http, url, owner_h, "match_chunks", base)
        st, rows_w = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws1})
        st, rows_w2 = await call(http, url, owner_h, "match_chunks", {**base, "filter_workspace_id": fx.ws2})
        emit("[4] match_chunks AS OWNER U — filter_workspace_id is NON-SECURITY narrowing")
        emit(f"    no filter -> {describe(rows_all)}")
        emit(f"    filter=W  -> {describe(rows_w)}")
        emit(f"    filter=W2 -> {describe(rows_w2)}")
        emit("    => the filter only SUBTRACTS within membership; omitting it leaks nothing")
        emit("")

        # 5. keyword leg honours the filter identically
        kbase = {"query": "returns policy", "match_count": 50}
        st, rows = await call(http, url, bot_h, "keyword_search", {**kbase, "filter_workspace_id": fx.ws1})
        emit("[5] keyword_search AS BOT B, filter_workspace_id=W  (hybrid's second leg)")
        emit(f"    HTTP {st}  ->  {describe(rows)}")
        emit("    => keyword leg scoped identically; both hybrid legs coherent")
        emit("")
        emit("=" * 78)
        emit("RESULT: bot answers only from share-to-bot docs; filter_workspace_id narrows")
        emit("        non-securely; the auth.uid() membership clause remains the boundary.")
        emit("=" * 78)
    finally:
        await _cleanup(conn, fx)
        await conn.close()
        await http.aclose()

    OUT.write_text("\n".join(lines) + "\n")
    emit("")
    emit(f"(transcript written to {OUT})")


if __name__ == "__main__":
    asyncio.run(main())

Evidence: Live integration test transcript (PRD validation, 5 checks)

owner sees both D and E in W (positive control — data is retrievable)
bot sees EXACTLY {D} — shared doc only; E (not shared) -> 0 rows
bot with NO workspace filter still sees only {D} (grant is the gate, not the filter)
owner: no-filter -> {D,E,F}; filter=W -> {D,E}; filter=W2 -> {F} (narrowing only)
keyword_search honours filter_workspace_id identically (both hybrid legs scoped)

PASS: 5 US-070 live integration checks

Pipeline

Updates from git push no-mistakes

✅ **intent** - passed

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

⚠️ **Review** - 1 info

ℹ️ backend/support_bot.py:372 - run_bot_deflection_turn (and the workspace_id plumbing into run_deflection_pipeline) has no production caller yet — only the two new tests exercise it. This is intentional per the US-070 scope (the endpoint that wires it is US-080), so it is not dead code to remove; flagging only so it isn't mistaken for an oversight during review.

✅ **Test** - passed

✅ No issues found.

python -m backend.test_us070_bot_retrieval (unit, no DB/secret): per-turn mint sub=bot_user_id+60s TTL is the Bearer on both legs, filter_workspace_id on both legs, no cross-turn token cache, bot token absent from DeflectionResult and from propagated errors, no-visible-chunks => generic-deferral escalate
python -m backend.test_us070_bot_retrieval_integration against live Supabase (DATABASE_URL=postgresql://postgres:postgres@127.0.0.1:54322/postgres, SUPABASE_URL=http://127.0.0.1:54321): bot sees EXACTLY {D} (share-to-bot), 0 from non-shared E and other-workspace F; owner positive control sees D+E; filter_workspace_id narrows non-securely on both vector and keyword legs
Confirmed US-070 migrations applied live: match_chunks/keyword_search both carry filter_workspace_id uuid (queried pg_proc identity args)
Regression checks (no boundary/byte-identical change): python -m backend.test_deflection_pipeline, python -m backend.test_us069_bot_provisioning (module-merge intact), python -m backend.test_permissions
Evidence capture script driving the real PostgREST RPCs as the bot vs owner principals, writing raw HTTP responses to the evidence dir

✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

…ty workspace filter (US-070)

Each customer turn the deflection pipeline now retrieves AS the per-workspace support bot, not as a privileged reader, so the bot answers only from share-to-bot documents (ADR-0008; no new content role, no new enforcement path).

- backend/support_bot.py (new): run_bot_deflection_turn mints a ~60s role=authenticated bot JWT (sub=bot_user_id) via an injected minter (US-068 mint_supabase_jwt, dependency-injected to avoid a main.py import cycle and keep the module pure/testable), runs the ADR-0003 deflection pipeline with that JWT in the Supabase headers so match_chunks/keyword_search resolve auth.uid() to the bot, then discards the token (no cross-turn cache). The bearer is confined to the Supabase headers and never reaches an SSE/response/log surface; only DeflectionResult.customer_message is client-safe.
- migrations 20260624150000 / 20260624150100: add filter_workspace_id uuid default null to match_chunks / keyword_search as an ORDINARY non-security narrowing filter (the extension point the US-007 note + documents_workspace_id_idx reserved). AND-ed beside filter_topics, so it can only subtract within what the auth.uid()-resolved membership clause already allows; null is a no-op, so /api/chat and E4/E6 are byte-identical. The trust boundary stays the membership EXISTS clause, never this param. Applied to both hybrid legs for coherence.
- retrieval.py / escalation.run_deflection_pipeline: thread an optional workspace_id=None through search_documents/keyword_search/keyword_only_search/hybrid_search and the pipeline; default None leaves every existing caller unchanged.

Tests: test_us070_bot_retrieval.py (no DB/LLM/secret) pins mint-per-turn, bot bearer + filter on both legs, no caching, token absent from result/error. test_us070_bot_retrieval_integration.py (live Postgres+PostgREST under a self-minted bot JWT) pins the PRD validation: bot retrieves only the share-to-bot doc and 0 rows from a non-shared doc; filter_workspace_id narrows non-securely.

AGENTS.md documents the seam and the sharp edge.


vercel · 2026-06-25T01:01:22Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agentic-rag	Ready	Preview, Comment	Jun 25, 2026 1:01am

github-actions · 2026-06-25T01:07:47Z

Retrieval eval — PR vs `main`

n = 50 questions × 3 modes (vector, keyword, hybrid) on a 14-chunk corpus. PR ran in 82.29s; main in 71.79s.

Headline (each cell: PR value, Δ vs `main`)

Mode	recall@5	MRR	nDCG@5
vector	0.860 (±0.000)	0.772 (±0.000)	0.779 (±0.000)
keyword	0.110 (±0.000)	0.120 (±0.000)	0.112 (±0.000)
hybrid	0.860 (±0.000)	0.759 (±0.000)	0.769 (±0.000)

Per-category recall@5

Mode	single_chunk	multi_hop	adversarial	paraphrase
vector	0.900 (±0.000)	0.933 (±0.000)	0.600 (±0.000)	1.000 (±0.000)
keyword	0.250 (±0.000)	0.033 (±0.000)	0.000 (±0.000)	0.000 (±0.000)
hybrid	0.900 (±0.000)	0.933 (±0.000)	0.600 (±0.000)	1.000 (±0.000)

Comment is updated in place on each push by .github/workflows/retrieval-eval.yml (US-035). Comment-only; never blocks the build.
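For readers unfamiliar with the three headline metrics, the sketch below computes them for a single question using the usual binary-relevance definitions. Whether retrieval-eval.yml uses exactly these formulas is an assumption (the workflow's code is not shown here), and the chunk ids and function names are made up for illustration. MRR in the table would then be the mean of the per-question reciprocal rank.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the relevant chunks that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant chunk, or 0.0 if none is retrieved."""
    for position, chunk_id in enumerate(ranked, start=1):
        if chunk_id in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, chunk_id in enumerate(ranked[:k], start=1)
        if chunk_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranking for one question: relevant chunks sit at ranks 2 and 4.
ranked = ["c3", "c1", "c9", "c2", "c7"]
relevant = {"c1", "c2"}

print(recall_at_k(ranked, relevant))      # 1.0
print(reciprocal_rank(ranked, relevant))  # 0.5
print(round(ndcg_at_k(ranked, relevant), 3))
```

Under these definitions a mode can match another on recall@5 while scoring lower on MRR and nDCG@5, as hybrid does against vector in the table, because the latter two also penalize relevant chunks that land further down the top five.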

hcho22 added 2 commits June 24, 2026 17:42

no-mistakes(document): sync CONTEXT glossary with US-070 workspace na…

e7954e2

…rrowing filter

hcho22 merged commit 59ad0d2 into main Jun 25, 2026
3 checks passed

hcho22 deleted the fm/us070-bot-retrieval branch June 25, 2026 03:19
