
feat(support): lazy per-workspace support-bot provisioning + is_bot flag (US-069)#32

Merged
hcho22 merged 3 commits into main from fm/us069-ship-z2
Jun 25, 2026

@hcho22 hcho22 commented Jun 25, 2026

Owner

What Changed

  • Added migration 20260624120000_workspace_membership_is_bot.sql. It adds an is_bot boolean not null default false flag to workspace_membership (administrative metadata only, never used in any visibility/retrieval predicate) and a partial unique index, workspace_membership_one_bot_per_workspace ... where is_bot, which enforces at most one bot per workspace at the DB layer and serves as the serialization point for lost races.
  • Added backend/support_bot.py:provision_workspace_bot(workspace_id), an async, lazy, idempotent primitive. It creates the bot's auth.users row via the GoTrue admin API (service-role key, fail-closed at call time), then inserts a role='member' membership row. A lost race is recognized only by the 23505 unique-violation marker (other 409s, e.g. FK 23503, surface as errors); on a lost race the orphan auth.users row is dropped, with cleanup status failures logged, and the existing bot id is returned. Backed by backend/test_us069_bot_provisioning.py (unit + skip-clean integration layers).
  • Wired the new optional SUPPORT_BOT_EMAIL_DOMAIN env (default bots.support.internal) and documented the service-role key's bot-provisioning use across backend/.env.example, backend/main.py, README.md, and AGENTS.md.
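
The one-bot-per-workspace guard and the `not is_bot` listing filter can be illustrated with a small stand-in. This is a minimal sketch that uses Python's built-in sqlite3 (which also supports partial unique indexes) instead of Postgres; the reduced column set, the column types, and the primary key are assumptions for illustration, not the migration's actual DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE workspace_membership (
        workspace_id TEXT NOT NULL,
        user_id      TEXT NOT NULL,
        role         TEXT NOT NULL,
        is_bot       INTEGER NOT NULL DEFAULT 0,  -- SQLite stand-in for boolean
        PRIMARY KEY (workspace_id, user_id)       -- assumed key, for illustration
    );
    -- Only rows with is_bot set are indexed, so human rows are unconstrained
    -- while each workspace can hold at most one bot row.
    CREATE UNIQUE INDEX workspace_membership_one_bot_per_workspace
        ON workspace_membership (workspace_id) WHERE is_bot = 1;
    """
)


def add_member(workspace_id, user_id, is_bot=False):
    """Insert a role='member' row; raises sqlite3.IntegrityError on conflict."""
    conn.execute(
        "INSERT INTO workspace_membership (workspace_id, user_id, role, is_bot) "
        "VALUES (?, ?, 'member', ?)",
        (workspace_id, user_id, int(is_bot)),
    )


add_member("w1", "human-1")
add_member("w1", "human-2")
add_member("w1", "bot-1", is_bot=True)

# A second bot row for the same workspace violates the partial unique index.
try:
    add_member("w1", "bot-2", is_bot=True)
    second_bot_rejected = False
except sqlite3.IntegrityError:
    second_bot_rejected = True

# Member-management listings filter the bot out.
human_listing = [
    row[0]
    for row in conn.execute(
        "SELECT user_id FROM workspace_membership "
        "WHERE workspace_id = ? AND NOT is_bot ORDER BY user_id",
        ("w1",),
    )
]
```

Because the index covers only bot rows, any number of human members can share a workspace, and an empty workspace (no bot yet) is also valid, which is what makes lazy provisioning possible.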

Risk Assessment

✅ Low: Well-bounded new support-bot provisioning primitive with a DB-enforced one-bot-per-workspace guard; the two prior-round findings were correctly and minimally fixed, and no new material issues remain.

Testing

Ran the existing US-069 unit + integration test against a running local Supabase (migration already applied); all 11 assertions pass. Then produced reviewer-visible product evidence with two harnesses that drive the real provisioning primitive against the live GoTrue admin API and Postgres. The first captures the full lifecycle: no bot at creation, a real GoTrue auth.users bot row with is_support_bot/workspace_id app_metadata, an idempotent second call, member-listing exclusion, and the DB race guard. The second exercises the two branches the review-fix commit changed, using real PostgREST SQLSTATE responses: a 23505 race loss returns the winner and cleans up the orphan, while a 23503 FK 409 raises rather than being swallowed. No rendered UI surface exists for this backend/DB change, so the evidence is CLI transcripts plus JSON of persisted DB state and live API responses rather than screenshots. All fixtures were cleaned up and the worktree is clean; no issues found.

Evidence: End-to-end provisioning lifecycle transcript (real Supabase: GoTrue user + persisted is_bot membership, idempotency, member-listing exclusion, DB race guard)

=== 1. Workspace seeded: 1 human member, NO bot (lazy - nothing provisions a bot at creation) ===
  workspace: 368c0ad0-d8e6-49e5-bfee-0434ae6f7d53
  human_member: e9824c82-ab5e-4b2a-b8fc-65134b308201
  bot_rows_now: [{'user_id': UUID('e9824c82-ab5e-4b2a-b8fc-65134b308201'), 'role': 'member', 'is_bot': False}]
  is_bot_count: 0

=== 2. provision_workspace_bot(W) -> bot id (real GoTrue user + is_bot membership created) ===
  returned_bot_id: f7512fed-a31b-4897-a277-1fd5d1c49939
  persisted_membership: [{'user_id': UUID('e9824c82-ab5e-4b2a-b8fc-65134b308201'), 'role': 'member', 'is_bot': False}, {'user_id': UUID('f7512fed-a31b-4897-a277-1fd5d1c49939'), 'role': 'member', 'is_bot': True}]

=== 3. The bot id resolves to a REAL auth.users row (GoTrue admin API GET) ===
  http_status: 200
  id: f7512fed-a31b-4897-a277-1fd5d1c49939
  email: support-bot-1bdc26eb61ad49f4b176d91147fd82c3@bots.support.internal
  app_metadata: {'is_support_bot': True, 'provider': 'email', 'providers': ['email'], 'workspace_id': '368c0ad0-d8e6-49e5-bfee-0434ae6f7d53'}
  user_metadata: {'display_name': 'Support Bot', 'email_verified': True}

=== 4. Second provision is idempotent: same id, still exactly ONE is_bot row (one bot per workspace, not per key) ===
  first_call_id: f7512fed-a31b-4897-a277-1fd5d1c49939
  second_call_id: f7512fed-a31b-4897-a277-1fd5d1c49939
  ids_equal: True
  is_bot_count: 1

=== 5. Member-management listing excludes the bot (where not is_bot); unfiltered includes it ===
  member_listing_for_humans: [{'user_id': UUID('e9824c82-ab5e-4b2a-b8fc-65134b308201'), 'role': 'member', 'is_bot': False}]
  unfiltered_listing: [{'user_id': UUID('e9824c82-ab5e-4b2a-b8fc-65134b308201'), 'role': 'member', 'is_bot': False}, {'user_id': UUID('f7512fed-a31b-4897-a277-1fd5d1c49939'), 'role': 'member', 'is_bot': True}]

=== 6. DB-layer race guard: a raw SECOND is_bot insert for W is rejected (two concurrent provisions cannot create two bots) ===
  second_bot_insert: rejected by partial unique index: workspace_membership_one_bot_per_workspace

Wrote transcript JSON -> /private/var/folders/9t/k_yy9fqs5vd27rf12jx_rzqh0000gn/T/no-mistakes-evidence/01KVY1FBFD8Z99YJQB3690XMWN/us069_evidence.json

Cleaned up workspace, human/extra users, and the provisioned bot user.
Evidence: Review-fix (e4eb66c) branch transcript: 23505 race loss vs non-23505 FK 409, with real PostgREST SQLSTATE bodies

=== A. Genuine lost race: insert hits 23505 -> provision returns the WINNER, drops the orphan it created ===
  winner_id: 585292d4-60e9-4637-aae6-81c7ed25ad32
  orphan_user_created: 621bd577-8ce8-4e53-9fb3-ce47074631f7
  provision_returned: 585292d4-60e9-4637-aae6-81c7ed25ad32
  returned_is_winner: True
  orphan_auth_user_cleaned_up: True
  is_bot_rows_for_workspace: 1

=== B. Non-23505 failure (23503 FK violation, nonexistent workspace): provision RAISES, NOT swallowed as a race loss; orphan dropped ===
  nonexistent_workspace: 7ce74b0f-d91e-4f80-a92d-0be613b10c4a
  orphan_user_created: 44aad539-9257-4125-b677-3e28e2c01527
  raised_runtime_error: True
  error_snippet: support-bot provisioning failed during membership insert: HTTP 409 {"code":"23503","details":"Key (workspace_id)=(7ce74b0f-d91e-4f80-a92d-0be613b10c4a) is not present in table \"workspaces\".","hint":
  mentions_23503: True
  orphan_auth_user_cleaned_up: True

Wrote transcript JSON -> /private/var/folders/9t/k_yy9fqs5vd27rf12jx_rzqh0000gn/T/no-mistakes-evidence/01KVY1FBFD8Z99YJQB3690XMWN/us069_raceloss_evidence.json

Cleaned up all race/error-path fixtures.
Evidence: Lifecycle evidence JSON (machine-readable persisted state per step)
[
  {
    "step": "1. Workspace seeded: 1 human member, NO bot (lazy - nothing provisions a bot at creation)",
    "workspace": "368c0ad0-d8e6-49e5-bfee-0434ae6f7d53",
    "human_member": "e9824c82-ab5e-4b2a-b8fc-65134b308201",
    "bot_rows_now": [
      {
        "user_id": "e9824c82-ab5e-4b2a-b8fc-65134b308201",
        "role": "member",
        "is_bot": false
      }
    ],
    "is_bot_count": 0
  },
  {
    "step": "2. provision_workspace_bot(W) -> bot id (real GoTrue user + is_bot membership created)",
    "returned_bot_id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
    "persisted_membership": [
      {
        "user_id": "e9824c82-ab5e-4b2a-b8fc-65134b308201",
        "role": "member",
        "is_bot": false
      },
      {
        "user_id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
        "role": "member",
        "is_bot": true
      }
    ]
  },
  {
    "step": "3. The bot id resolves to a REAL auth.users row (GoTrue admin API GET)",
    "http_status": 200,
    "id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
    "email": "support-bot-1bdc26eb61ad49f4b176d91147fd82c3@bots.support.internal",
    "app_metadata": {
      "is_support_bot": true,
      "provider": "email",
      "providers": [
        "email"
      ],
      "workspace_id": "368c0ad0-d8e6-49e5-bfee-0434ae6f7d53"
    },
    "user_metadata": {
      "display_name": "Support Bot",
      "email_verified": true
    }
  },
  {
    "step": "4. Second provision is idempotent: same id, still exactly ONE is_bot row (one bot per workspace, not per key)",
    "first_call_id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
    "second_call_id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
    "ids_equal": true,
    "is_bot_count": 1
  },
  {
    "step": "5. Member-management listing excludes the bot (where not is_bot); unfiltered includes it",
    "member_listing_for_humans": [
      {
        "user_id": "e9824c82-ab5e-4b2a-b8fc-65134b308201",
        "role": "member",
        "is_bot": false
      }
    ],
    "unfiltered_listing": [
      {
        "user_id": "e9824c82-ab5e-4b2a-b8fc-65134b308201",
        "role": "member",
        "is_bot": false
      },
      {
        "user_id": "f7512fed-a31b-4897-a277-1fd5d1c49939",
        "role": "member",
        "is_bot": true
      }
    ]
  },
  {
    "step": "6. DB-layer race guard: a raw SECOND is_bot insert for W is rejected (two concurrent provisions cannot create two bots)",
    "second_bot_insert": "rejected by partial unique index: workspace_membership_one_bot_per_workspace"
  }
]
Evidence: Race/error-path evidence JSON
[
  {
    "step": "A. Genuine lost race: insert hits 23505 -> provision returns the WINNER, drops the orphan it created",
    "winner_id": "585292d4-60e9-4637-aae6-81c7ed25ad32",
    "orphan_user_created": "621bd577-8ce8-4e53-9fb3-ce47074631f7",
    "provision_returned": "585292d4-60e9-4637-aae6-81c7ed25ad32",
    "returned_is_winner": true,
    "orphan_auth_user_cleaned_up": true,
    "is_bot_rows_for_workspace": 1
  },
  {
    "step": "B. Non-23505 failure (23503 FK violation, nonexistent workspace): provision RAISES, NOT swallowed as a race loss; orphan dropped",
    "nonexistent_workspace": "7ce74b0f-d91e-4f80-a92d-0be613b10c4a",
    "orphan_user_created": "44aad539-9257-4125-b677-3e28e2c01527",
    "raised_runtime_error": true,
    "error_snippet": "support-bot provisioning failed during membership insert: HTTP 409 {\"code\":\"23503\",\"details\":\"Key (workspace_id)=(7ce74b0f-d91e-4f80-a92d-0be613b10c4a) is not present in table \\\"workspaces\\\".\",\"hint\":",
    "mentions_23503": true,
    "orphan_auth_user_cleaned_up": true
  }
]

Pipeline

Updates from git push no-mistakes

⏭️ **intent** - skipped

✅ No issues found.

✅ **Rebase** - passed

✅ No issues found.

🔧 **Review** - 2 issues found → auto-fixed ✅
  • ⚠️ backend/support_bot.py:214 - _insert_bot_membership treats any HTTP 409 as a one-bot-per-workspace race loss, but PostgREST also returns 409 for foreign-key violations (SQLSTATE 23503). workspace_membership.workspace_id references public.workspaces(id), so a valid-UUID-but-nonexistent workspace creates an auth.users row, FK-fails the membership insert (409 -> return False), deletes the bot, then raises the misleading 'membership conflicted but no existing bot was found ... (unexpected partial state)' instead of a clear 'workspace does not exist'. Gate the return-False on the 23505 marker specifically and let other 409s fall through to _provisioning_error. The two '23505' text sub-checks are also dead given the bare status_code==409, and '"23505"' in text is subsumed by '23505' in text.
  • ℹ️ backend/support_bot.py:233 - _delete_bot_user is documented as best-effort with a warning log, but it never calls raise_for_status, so a non-2xx DELETE response (4xx/5xx from the GoTrue admin API) returns silently with no log - only transport errors (httpx.HTTPError) reach the warning. Orphan auth.users rows are harmless but accumulate invisibly when the most likely failure mode (an error status) leaves no breadcrumb. Check the response status (e.g. r.raise_for_status() inside the try) so HTTP-status failures also log.

🔧 Fix: gate membership-insert race loss on 23505, log orphan-cleanup status failures
✅ Re-checked - no issues remain.
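
The two fixes can be sketched as small pure functions. This is an illustrative sketch only, not the code in backend/support_bot.py: the function names and the logger name are hypothetical, and it assumes the PostgREST error body is JSON with a top-level "code" field, as in the 23503 response quoted in the evidence above.

```python
import json
import logging

logger = logging.getLogger("support_bot_sketch")  # hypothetical logger name

UNIQUE_VIOLATION = "23505"  # Postgres SQLSTATE for unique_violation


def is_race_loss(status_code: int, body_text: str) -> bool:
    """Return True only for a 409 whose body carries SQLSTATE 23505.

    Any other 409 (for example 23503, a foreign-key violation) is not a
    race loss and should surface as a provisioning error instead.
    """
    if status_code != 409:
        return False
    try:
        payload = json.loads(body_text)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("code") == UNIQUE_VIOLATION


def cleanup_succeeded(status_code: int, user_id: str) -> bool:
    """Best-effort cleanup check: log non-2xx DELETE responses, never raise."""
    if 200 <= status_code < 300:
        return True
    logger.warning("orphan bot user %s was not deleted: HTTP %s", user_id, status_code)
    return False
```

Keying on the parsed SQLSTATE rather than the bare status code keeps a nonexistent workspace (23503) from being misreported as a lost race.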

✅ **Test** - passed

✅ No issues found.

  • python3 -m backend.test_us069_bot_provisioning (unit + integration against running local Supabase; all 11 assertions pass)
  • End-to-end lifecycle harness us069_evidence.py: seed workspace with no bot, provision twice, fetch the real GoTrue user via admin API, verify idempotency, member-listing exclusion (where not is_bot), and the partial-unique-index race guard
  • Review-fix branch harness us069_raceloss_evidence.py: forced a genuine 23505 race loss (provision returns winner, drops orphan auth.users row) and a non-23505 409 (real 23503 FK violation -> RuntimeError raised, orphan dropped)
  • Verified no leaked DB fixtures (0 support-bot users, 0 is_bot rows, 0 US069 workspaces, 0 @test.local users) and a clean worktree via git status --porcelain
✅ **Document** - passed

✅ No issues found.

✅ **Lint** - passed

✅ No issues found.

✅ **Push** - passed

✅ No issues found.

hcho22 added 3 commits June 24, 2026 13:53
Add the per-workspace support bot as an ordinary auth.users +
workspace_membership(role='member', is_bot=true) row - a FLAG, not a
content role (ADR-0008, ADR-0002 intact).

- migration 20260624120000_workspace_membership_is_bot.sql: adds
  is_bot boolean not null default false and a partial unique index
  (workspace_id) where is_bot - the DB-layer one-bot-per-workspace race
  guard. is_bot is administrative metadata, never in any visibility
  predicate; member listings must filter `not is_bot`; room left for an
  optional explicit write-deny policy.
- backend/support_bot.py:provision_workspace_bot(workspace_id) ->
  bot_user_id: lazy, idempotent provisioning via the GoTrue admin API +
  service-role PostgREST; fail-closed on missing SUPABASE_URL /
  SUPABASE_SERVICE_ROLE_KEY; orphan-cleanup on a lost race. The returned
  id populates conversations.bot_user_id. Designed for US-072 to call on
  first widget-key issuance (not wired here - out of scope).
- backend/test_us069_bot_provisioning.py: unit layer (always runs) +
  integration layer (skips cleanly without local Supabase / is_bot
  column / API) proving one bot per workspace across two calls,
  role=member, member-listing exclusion, no bot content-role, and the
  partial-unique-index race guard.
- main.py: note SUPABASE_SERVICE_ROLE_KEY is also the provisioning key.
- AGENTS.md: US-069 invariant section.
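
The provisioning flow described above can be sketched against an in-memory fake. This is a synchronous illustration under stated assumptions, not the async implementation in backend/support_bot.py: FakeBackend, UniqueViolation, and provision_bot_sketch are hypothetical names, the existence check before user creation is assumed, and the random hex in the email local part is an assumption.

```python
import uuid


class UniqueViolation(Exception):
    """Stand-in for a PostgREST 409 carrying SQLSTATE 23505."""


class FakeBackend:
    """In-memory stand-in for GoTrue plus workspace_membership (hypothetical)."""

    def __init__(self):
        self.users = set()  # ids of created auth.users rows
        self.bots = {}      # workspace_id -> bot user id

    def create_user(self, email):
        user_id = str(uuid.uuid4())
        self.users.add(user_id)
        return user_id

    def delete_user(self, user_id):
        self.users.discard(user_id)

    def find_bot(self, workspace_id):
        return self.bots.get(workspace_id)

    def insert_bot_membership(self, workspace_id, user_id):
        if workspace_id in self.bots:  # mimics the partial unique index
            raise UniqueViolation(workspace_id)
        self.bots[workspace_id] = user_id


def provision_bot_sketch(backend, workspace_id, env):
    # Fail closed: refuse to run without the service-role configuration.
    for name in ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY"):
        if not env.get(name):
            raise RuntimeError(f"{name} is required for bot provisioning")

    existing = backend.find_bot(workspace_id)
    if existing is not None:
        return existing  # idempotent: a bot already exists

    domain = env.get("SUPPORT_BOT_EMAIL_DOMAIN", "bots.support.internal")
    user_id = backend.create_user(f"support-bot-{uuid.uuid4().hex}@{domain}")
    try:
        backend.insert_bot_membership(workspace_id, user_id)
    except UniqueViolation:
        backend.delete_user(user_id)  # drop the orphan this call created
        winner = backend.find_bot(workspace_id)
        if winner is None:
            raise RuntimeError("membership conflicted but no existing bot was found")
        return winner
    return user_id
```

Calling provision_bot_sketch twice for the same workspace returns the same id and leaves a single user behind; a caller that loses the insert race deletes its own user and returns the winner's id.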
@vercel

vercel Bot commented Jun 25, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agentic-rag | Ready | Preview, Comment | Jun 25, 2026 12:22am |

@github-actions

Contributor

Retrieval eval — PR vs main

n = 50 questions × 3 modes (vector, keyword, hybrid) on a 14-chunk corpus. PR ran in 61.71s; main in 64.87s.

Headline (each cell: PR value, Δ vs main)

| Mode | recall@5 | MRR | nDCG@5 |
| --- | --- | --- | --- |
| vector | 0.860 (±0.000) | 0.772 (±0.000) | 0.779 (±0.000) |
| keyword | 0.110 (±0.000) | 0.120 (±0.000) | 0.112 (±0.000) |
| hybrid | 0.860 (±0.000) | 0.759 (±0.000) | 0.769 (±0.000) |

Per-category recall@5

| Mode | single_chunk | multi_hop | adversarial | paraphrase |
| --- | --- | --- | --- | --- |
| vector | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |
| keyword | 0.250 (±0.000) | 0.033 (±0.000) | 0.000 (±0.000) | 0.000 (±0.000) |
| hybrid | 0.900 (±0.000) | 0.933 (±0.000) | 0.600 (±0.000) | 1.000 (±0.000) |
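
For readers unfamiliar with the three metrics, here is a minimal sketch of the standard binary-relevance definitions. The eval workflow's exact formulas are not shown in this comment, so these definitions, the function names, and the chunk ids are assumptions for illustration.

```python
import math


def recall_at_k(ranked, relevant, k=5):
    """Fraction of the relevant chunks that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant chunk, or 0.0 if none is retrieved.

    MRR is the mean of this value over all questions.
    """
    for rank, chunk_id in enumerate(ranked, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k=5):
    """Binary-gain nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, chunk_id in enumerate(ranked[:k], start=1)
        if chunk_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical single question: the only relevant chunk is ranked second.
ranked = ["c3", "c1", "c7", "c2", "c9"]
relevant = {"c1"}
scores = (
    recall_at_k(ranked, relevant),
    reciprocal_rank(ranked, relevant),
    ndcg_at_k(ranked, relevant),
)
```

recall@5 only asks whether the relevant chunk made the top five, while MRR and nDCG@5 also reward ranking it higher. That is how, under these definitions, vector and hybrid can tie on recall@5 but differ on the other two columns.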

Comment is updated in place on each push by .github/workflows/retrieval-eval.yml (US-035). Comment-only — never blocks the build.

@hcho22 hcho22 merged commit 5aaf81b into main Jun 25, 2026
3 checks passed
@hcho22 hcho22 deleted the fm/us069-ship-z2 branch June 25, 2026 00:36