
Fix auth-handoff cookies not reaching parallel-session agent leases #1

Merged: federicodeponte merged 1 commit into main from fix/auth-replica-cookie-sync on Jun 10, 2026

Conversation

@federicodeponte

Member

Symptom

A human completed a login in the broker auth portal (auth_request → portal_url → noVNC, identity chrome-depontefede), but a concurrent/subsequent agent browser_lease on the SAME identity did NOT see the logged-in session — api.slack.com still showed logged-out in the agent's lease tab. The portal's authenticated cookies did not become visible to the agent lease, breaking the "human logs in once, agent continues" handoff.

Root cause

  • The auth portal launches Chrome with --user-data-dir={identity.profile_dir} (auth.py:451) — i.e. the identity's base profile dir. The login cookies land there.
  • For identities with max_parallel_sessions > 1 (chrome-depontefede = 3), agent leases are served from per-slot replicas under profiles/.replicas/<identity>/<slot> (pool._identity_profile_for_slot).
  • Replicas are only rsynced from base when a slot is (re)activated (write_slot_config_sync_replica_profile), and the lease path deliberately reuses warm replicas without re-syncing (test_identity_lease_prefers_warm_active_replica_without_resync).
  • So a replica synced before the human's login never picks up the new cookies → the agent lease sees a logged-out session.

Verified empirically: the base chrome-depontefede profile had 23 Slack cookies, while replicas pool-b and pool-d had 0 each. The same pattern appeared on chrome-rocketlist (base: 11 Supabase cookies; replica pool-c: 0).
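The base-versus-replica cookie counts above can be reproduced with a small read-only query against each profile's cookie store. This is a minimal sketch, not the code in this PR: `count_cookies` and `COOKIE_DB_CANDIDATES` are hypothetical names, and the cookie DB location inside the user-data-dir is an assumption that varies with Chrome version and profile name. It relies only on Chrome's cookie store being a SQLite file with a `cookies` table and a `host_key` column.

```python
import sqlite3
from pathlib import Path

# Assumed candidate locations of Chrome's cookie store inside a user-data-dir;
# the exact path depends on the Chrome version and the profile name.
COOKIE_DB_CANDIDATES = ("Default/Network/Cookies", "Default/Cookies")


def count_cookies(profile_dir: Path, host_suffix: str) -> int:
    """Count cookies whose host_key ends with host_suffix (0 if no cookie DB)."""
    for rel in COOKIE_DB_CANDIDATES:
        db = profile_dir / rel
        if not db.is_file():
            continue
        # Open read-only so the diagnostic can never modify the profile.
        uri = db.resolve().as_uri() + "?mode=ro"
        conn = sqlite3.connect(uri, uri=True)
        try:
            row = conn.execute(
                "SELECT COUNT(*) FROM cookies WHERE host_key LIKE ?",
                (f"%{host_suffix}",),
            ).fetchone()
            return int(row[0])
        finally:
            conn.close()
    return 0
```

Calling `count_cookies(base_dir, "slack.com")` and then the same for each replica directory gives the comparison. Run it only after the browser using that profile has stopped, since a running Chrome may hold a lock on the DB and may not yet have flushed cookies to disk.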

Fix (root cause, preserves replica isolation)

  1. Lease freshness guard (pool.py): when a lease would be served from a warm replica whose cookie DB predates the identity's base profile, force re-activation so the replica re-syncs from base before it is handed out. Self-healing for any base update, not only auth.
  2. Replica invalidation on auth completion (identities.invalidate_identity_replicas): on /auth/{token}/complete, drop stale replicas + slot configs for every non-leased slot of the identity so the next lease rebuilds from the freshly-authenticated base. Slots with an active foreign lease are skipped (guard re-syncs on their next lease) so live sessions are never corrupted.
  3. Cookie-landed verification (api._verify_auth_cookie_landed): after stopping the portal browser (flush to disk), verify the target-origin cookie actually landed in the base profile; surface a cookie_verification block + warning telemetry if not.
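Items 1 and 2 can be illustrated with a short sketch. It is not the PR's implementation: the function names, the cookie DB paths, and the on-disk layout of one directory per slot are all assumptions made for the example, and the real code also drops slot configs, which is omitted here.

```python
import shutil
from pathlib import Path
from typing import List, Optional, Set

# Assumed cookie DB locations inside a Chrome user-data-dir (hypothetical).
COOKIE_DB_CANDIDATES = ("Default/Network/Cookies", "Default/Cookies")


def cookie_db_mtime(profile_dir: Path) -> Optional[float]:
    """Newest mtime among the candidate cookie DBs, or None if none exist."""
    mtimes = [
        (profile_dir / rel).stat().st_mtime
        for rel in COOKIE_DB_CANDIDATES
        if (profile_dir / rel).is_file()
    ]
    return max(mtimes) if mtimes else None


def replica_is_stale(base_dir: Path, replica_dir: Path) -> bool:
    """True when the replica's cookie DB predates the base profile's."""
    base_mtime = cookie_db_mtime(base_dir)
    if base_mtime is None:
        return False  # base has no cookie DB, so there is nothing newer to sync
    replica_mtime = cookie_db_mtime(replica_dir)
    return replica_mtime is None or replica_mtime < base_mtime


def invalidate_replicas(replica_root: Path, leased_slots: Set[str]) -> List[str]:
    """Delete replica dirs of non-leased slots; return the dropped slot names."""
    dropped: List[str] = []
    if not replica_root.is_dir():
        return dropped
    for slot_dir in sorted(replica_root.iterdir()):
        if not slot_dir.is_dir() or slot_dir.name in leased_slots:
            continue  # never touch a profile that a live session is using
        shutil.rmtree(slot_dir)
        dropped.append(slot_dir.name)
    return dropped
```

The strict `<` comparison means a replica copied with preserved timestamps (as `rsync -a` does) is treated as fresh immediately after a sync. An mtime-only check is a heuristic: a replica whose own browser wrote to its cookie DB after the base login would look fresh, which is one reason to pair the guard with explicit invalidation on auth completion.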

Verification

  • Full suite: 168 passed (+6 new tests covering staleness detection, forced re-sync on stale warm replica, invalidation skipping leased slots, and the auth-complete wiring).
  • Run against real diverged on-disk profiles: chrome-depontefede and chrome-rocketlist replicas with 0 target-origin cookies are correctly flagged stale and would re-sync from base.

Deploy note

The broker runtime is the /root/ax-browser-broker checkout (uvicorn imports from cwd). Applying the fix requires a broker restart, which was deliberately NOT performed because 2 active foreign leases (kimi-baradona-gsc on pool-a/pool-b) were live. Restart only after those leases drain.

🤖 Generated with Claude Code

The auth portal logs the human in against an identity's BASE profile dir
(auth.py launches Chrome with --user-data-dir={identity.profile_dir}), but
identities with max_parallel_sessions > 1 serve agent leases from per-slot
replica copies under profiles/.replicas/<identity>/<slot>. Replicas are only
rsynced from base when a slot is (re)activated, and the lease path deliberately
reuses warm replicas without re-syncing. A replica synced before the human's
login therefore never picks up the new cookies, so the agent lease sees a
logged-out session (e.g. api.slack.com still logged out after a portal login).

Root-cause fix, preserving per-slot replica isolation:
- pool.lease: when a lease would be served from a warm replica whose cookie DB
  predates the identity's base profile, force re-activation so the replica is
  re-synced from base before it is handed out (_replica_is_stale_against_base,
  _cookie_db_mtime). Self-healing for any base profile update, not just auth.
- identities.invalidate_identity_replicas: on auth completion, drop stale
  replicas + their slot configs for every NON-leased slot of the identity so
  the next lease rebuilds them from the freshly-authenticated base. Slots with
  an active foreign lease are skipped (the pool freshness guard re-syncs them on
  their next lease) so live sessions are never corrupted.
- api auth_complete: stop the portal browser FIRST (flush cookies to disk),
  then invalidate replicas and verify the target-origin cookie actually landed
  in the base profile, surfacing a cookie_verification block + warning event.

Verified against real diverged on-disk profiles: chrome-depontefede and
chrome-rocketlist replicas with 0 target-origin cookies are correctly flagged
stale and would re-sync from base (which holds the authenticated cookies).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
federicodeponte merged commit ed104e8 into main on Jun 10, 2026
1 check failed
