fix(cycle): cache empty propose_takes scans #1218

Open
AdityaRajeshGadgil wants to merge 1 commit into garrytan:master from AdityaRajeshGadgil:fix/propose-takes-empty-scan-cache

@AdityaRajeshGadgil
Summary

propose_takes used take_proposals as both the operator-facing queue and the per-page idempotency cache. That works only when the extractor produces at least one proposal. The valid result [] left no row behind, so unchanged pages could re-spend extractor tokens every autopilot cycle.

This PR splits those two jobs:

  • adds take_proposal_page_scans as the successful extractor-call cache, keyed by (source_id, page_slug, content_hash, prompt_version) and written even for []
  • keeps take_proposals as proposal rows only, with dedup moved to (source_id, page_slug, content_hash, prompt_version, claim_text) so same-page multi-claim output is not collapsed
  • treats malformed or wrong-shape extractor output as retryable failure instead of caching it as empty success
  • threads dryRun into propose_takes so dry runs do not call the LLM or write proposal/cache rows
  • adds migration v80 plus fresh Postgres/PGLite schema coverage, including explicit Postgres RLS enablement for the new cache table
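
As a rough illustration of the split described above, here is a minimal in-memory sketch. It is not the PR's implementation: `ProposeTakesStore`, `proposeTakesForPage`, and the synchronous `extract` callback are hypothetical names, and the two collections merely stand in for the `take_proposal_page_scans` and `take_proposals` tables. Only the key columns are taken from the description.

```typescript
// Hypothetical sketch: in-memory stand-ins for the two tables named in the PR.
type ScanKey = {
  sourceId: string;
  pageSlug: string;
  contentHash: string;
  promptVersion: string;
};

// Serialize the four key columns into a single lookup key.
const scanKey = (k: ScanKey): string =>
  JSON.stringify([k.sourceId, k.pageSlug, k.contentHash, k.promptVersion]);

class ProposeTakesStore {
  private scans = new Set<string>(); // stand-in for take_proposal_page_scans
  private proposals = new Set<string>(); // stand-in for take_proposals

  hasScan(k: ScanKey): boolean {
    return this.scans.has(scanKey(k));
  }

  // Records a successful extractor call, even when claims is empty.
  // Returns how many proposal rows were actually new.
  recordScan(k: ScanKey, claims: string[]): number {
    let inserted = 0;
    for (const claim of claims) {
      // Dedup key includes claim_text, so distinct claims on one page survive.
      const proposalKey = JSON.stringify([scanKey(k), claim]);
      if (!this.proposals.has(proposalKey)) {
        this.proposals.add(proposalKey);
        inserted++;
      }
    }
    this.scans.add(scanKey(k));
    return inserted;
  }
}

type Outcome = "dry_run" | "cache_hit" | "scanned";

function proposeTakesForPage(
  store: ProposeTakesStore,
  key: ScanKey,
  extract: () => string[], // assumed to throw on malformed extractor output
  dryRun: boolean,
): { outcome: Outcome; inserted: number } {
  // Dry runs neither call the extractor nor write any rows.
  if (dryRun) return { outcome: "dry_run", inserted: 0 };
  // An unchanged page with a recorded scan is skipped, even if it yielded [].
  if (store.hasScan(key)) return { outcome: "cache_hit", inserted: 0 };
  // If extract() throws, no scan row is written, so the page is retried later.
  const claims = extract();
  return { outcome: "scanned", inserted: store.recordScan(key, claims) };
}
```

In this sketch, a page whose extractor returns `[]` is scanned once and then reported as `cache_hit` on later cycles, while a page whose extractor throws leaves no scan row and stays eligible for retry.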

Test Coverage

  • bun test ./test/propose-takes.test.ts ./test/propose-takes-cache-schema.test.ts
  • bun test ./test/schema-bootstrap-coverage.test.ts ./test/propose-takes.test.ts ./test/propose-takes-cache-schema.test.ts
  • bun run typecheck
  • bun run check:all
  • bun run build
  • git diff --cached --check

bun run ci:local:diff could not run in this environment because Docker is not installed (docker: command not found). The diff-aware gate reached Docker startup before failing.

Pre-Landing Review

Two independent review passes were run before opening this PR. The first review found two blockers — wrong-shape JSON could still be cached as [], and the new cache table needed explicit RLS coverage — plus an insert-count nit. This version fixes all three:

  1. strict production parsing now throws on malformed/wrong-shape output, while literal JSON [] remains a valid cacheable empty result
  2. migration v80 and fresh schema SQL enable RLS for take_proposal_page_scans on Postgres, with a PGLite override for the local engine
  3. proposal inserts use RETURNING id, so proposals_inserted reflects rows actually inserted after ON CONFLICT DO NOTHING
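
Fixes 1 and 3 above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code in this PR: `parseExtractorOutput`, `insertProposal`, the synchronous `Query` callback, and the `{ claim_text: string }` item shape are hypothetical, and the column list in the SQL is inferred from the dedup key in the Summary.

```typescript
// Hypothetical item shape; the real extractor output may carry more fields.
type ExtractedClaim = { claim_text: string };

// Strict parsing: anything other than a JSON array of well-formed items throws,
// so the caller can treat it as a retryable failure. A literal [] is valid.
function parseExtractorOutput(raw: string): ExtractedClaim[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("extractor output is not valid JSON");
  }
  if (!Array.isArray(parsed)) {
    throw new Error("extractor output is not a JSON array");
  }
  return parsed.map((item: unknown, i: number) => {
    const text =
      typeof item === "object" && item !== null
        ? (item as { claim_text?: unknown }).claim_text
        : undefined;
    if (typeof text !== "string") {
      throw new Error(`extractor item ${i} has no string claim_text`);
    }
    return { claim_text: text };
  });
}

// Hypothetical synchronous query function standing in for the database engine.
type Query = (sql: string, params: unknown[]) => { id: number }[];

type ProposalRow = {
  sourceId: string;
  pageSlug: string;
  contentHash: string;
  promptVersion: string;
  claimText: string;
};

// With ON CONFLICT DO NOTHING, RETURNING yields rows only for actual inserts,
// so the returned row count is the number of proposals really inserted.
function insertProposal(query: Query, row: ProposalRow): number {
  const rows = query(
    `INSERT INTO take_proposals
       (source_id, page_slug, content_hash, prompt_version, claim_text)
     VALUES ($1, $2, $3, $4, $5)
     ON CONFLICT DO NOTHING
     RETURNING id`,
    [row.sourceId, row.pageSlug, row.contentHash, row.promptVersion, row.claimText],
  );
  return rows.length;
}
```

Counting returned rows rather than attempted inserts keeps a counter such as `proposals_inserted` accurate when a conflicting row already exists.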

The second review passed with no blocking findings.
