Require explicit pg-sync source URLs#4576
Open
KyleAMathews wants to merge 9 commits into
Open
Conversation
Contributor
Contributor
Electric Agents Mobile BuildLocal mobile checks ran for commit The EAS Android preview build was skipped because the |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4576 +/- ##
==========================================
- Coverage 58.35% 56.78% -1.58%
==========================================
Files 370 327 -43
Lines 40674 38008 -2666
Branches 11553 10955 -598
==========================================
- Hits 23735 21582 -2153
+ Misses 16865 16391 -474
+ Partials 74 35 -39
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Exclude per-request metadata (wakeId, principal, tenant) from the sourceRef hash so re-registrations reuse the same bridge and stream, and fetch the shape log once before registering so bad URLs fail the registration with Electric's error instead of dying silently in the bridge retry loop. Add Horton prompt guidance for the now-required shape URL. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Electric returns 200 on non-shape paths like its root, so an ok status alone can't validate a source URL. The probe now requires the electric-handle header and suggests the /v1/shape path when missing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Extract shared manifest-rebind helper in process-wake; log unlogged registration 500s; delete legacy url-less pg-sync registry rows at startup instead of warn-spamming every boot; drop the tenant-less streamUrl fallbacks (observe tool and legacy manifest entries); warn when change messages are dropped; remove the dead second parameter of pgSyncMessageToDurableEvent; harden tests (manifest streamUrl write side, wake-registry old_value mapping, fetch unstub). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
pg-sync observations (the
observe_pg_syncHorton tool) now require an explicit Electric shape endpoint URL, and the server validates that endpoint at registration time instead of failing silently. This fixes the core bug where pg-sync observations registered "successfully" but never delivered any row changes to the agent.Root cause
Two independent defects made pg-sync observations silently dead:
Per-request metadata polluted the source identity.
sourceRefForPgSync()hashed the entire canonical options object, including themetadatafield (principal, tenant,wakeId,runtimeConsumerId). SincewakeIdis unique per wake, every re-observation minted a brand-newsourceRef→ a brand-new empty durable stream → a brand-new bridge that subscribed fromoffset: now. The change that triggered the wake lived on the previous wake's stream, so the agent's handle DB was always empty. Old bridges leaked, one ShapeStream per wake.Registration succeeded before anything touched Electric. The server returned
200and the tool reported success before the bridge ever connected. The real connection happened later inside the bridge'sShapeStreamsubscription, and any failure (wrong URL, unreachable host, nonexistent table) landed in an infiniterecoverStream()backoff loop with only a server-sidewarn. A bare host URL likehttp://localhost:30000is especially deadly: Electric answers200with an empty body on its root path, so even an "ok" check passed while the shape API actually lives at/v1/shape.Approach
Stable identity.
sourceRefForPgSync()now stripsmetadatabefore hashing, so identity derives only from the observed shape (url,table,columns,where,params,replica). Re-registrations reuse the same bridge and stream. The server'sregister()reuses this same helper, so client and server agree onsourceRefby construction.Registration-time probe.
PgSyncBridgeManager.register()fetches the shape log once before persisting a registry row or starting a bridge:A genuine shape response always carries the
electric-handleheader — that's what distinguishes a real shape endpoint from Electric's root path.PgSyncSourceValidationErrormaps to HTTP400in the router, so the failure reaches the agent as actionable feedback (it can correct the URL/table and retry) rather than dying in a retry loop. The probe runs only for not-yet-running sources, so the hot path stays a single round-trip.Prompt guidance. The Horton system prompt and tool descriptions now tell the model the
urlis an HTTP(S) shape endpoint (with example), not apostgres://string, that there's no default, and to ask the user rather than guess.Key invariants
sourceRefForPgSync(options)depends only on shape identity, never on request context. Same shape ⇒ same ref ⇒ same bridge/stream, regardless of who or which wake registers it.electric-handle).ShapeStream, so "probe passed" predicts "bridge can connect" (caveat under Trade-offs).ShapeStream.Non-goals
url) can never resume, so server startup now deletes them with one clear log line instead of warn-spamming every boot — affected agents must re-observe. This was an explicit decision (the user-facing contract was already broken).Trade-offs
buildShapeProbeUrlre-derives the Electric TS client's query encoding (comma-joined arrays,params[n]). Unlike the client it does not quote column identifiers, so probe and stream encoding can diverge for exotic column names. Documented in a comment; acceptable because the probe is a validity check, not the data path.electric-handleheuristic. A proxy/CDN in front of Electric that strips response headers would fail the probe even if the shape works. Low risk, and the error message points at/v1/shape, so it's debuggable.Verification
Manually verified end-to-end against a live Electric instance: a bare URL (
http://localhost:30000) now fails fast with a/v1/shapesuggestion, the agent retries with the correct URL, registration succeeds, and row changes flow through to wakes.Files changed
Core fix
agents-runtime/src/observation-sources.ts— stripmetadatafrom the identity hash.agents-server/src/pg-sync-bridge-manager.ts—buildShapeProbeUrl,probeSource,PgSyncSourceValidationError,electric-handlecheck; delete legacy url-less rows at startup; drop the dead second param ofpgSyncMessageToDurableEvent; warn when a change message is dropped.agents-server/src/routing/pg-sync-router.ts— requireurl; mapPgSyncSourceValidationError→ 400; log unexpected 500s.agents-server/src/entity-registry.ts—deletePgSyncBridge.agents-server/src/manifest-side-effects.ts— resolve pgSync source URL fromconfig.streamUrl; drop the tenant-less fallback path.agents-runtime/src/process-wake.ts—withRegisteredManifestEntryhelper rebinds entities/pgSync sources to their server-assigned ref/streamUrl.agents-runtime/src/setup-context.ts,types.ts— propagatestreamUrlon the observation handle.agents-server/src/wake-registry.ts— read pg-syncold_valueintochange.oldValue.Tool & prompt
agents/src/tools/observe-pg-sync.ts— requiredurl; return the server handle'sstreamUrl(throw if absent, no guessed fallback).agents/src/agents/horton.ts— "Observing Postgres tables" prompt section.Tests — probe failure taxonomy, sourceRef stability under differing metadata, legacy-row deletion, manifest
config.streamUrlwrite side, wake-registryold_valuemapping, router 400/500 mapping.