Development #342

Merged
ofcskn merged 6 commits into main from development
Jun 21, 2026
Conversation

ofcskn (Contributor) commented Jun 21, 2026

No description provided.

ofcskn and others added 4 commits June 20, 2026 09:57
Deep review found the agents/battles/workflows execution layer fails for
distinct reasons that share one root: a single-threaded, timeout-less worker
loop plus DB recovery primitives that were never wired up.

Worker (apps/worker):
- C1: bound every provider call. callProvider/queued runs get AbortSignal
  timeouts; media calls are raced via withTimeout. A hung provider no longer
  wedges the tick. (new lib/timeout.ts)
- C2: decouple the loop. Each job type runs an independent self-scheduling
  poller (startPollLoop) instead of one tick awaiting all four, so a slow
  workflow/Chainabit poll can't starve battles/agents/queued runs.
- C3: team runs actually execute. team-run-worker was a stub that marked runs
  completed without running anything; it now creates a linked workflow_run and
  executes it through the shared executor (new run-workflow-graph.ts).
- C4: no more zombie worker. SIGTERM/SIGINT graceful drain, process.exit(1) on
  crash, _workerReady reset so a dead loop stops reporting healthy.
- H1: remove dead BYOK-battle code that read columns absent from workflow_runs
  (battles execute via battle_jobs, not workflow_runs).
- H2: idempotent event dispatch (advisory-lock dedup) so N replicas can't
  create N duplicate runs; dispatcher resubscribes on realtime drop.
- H3: wire the existing-but-unused crash recovery. Executor heartbeats the run;
  new workflow-recovery-worker reclaims stale runs via fn_claim_stale_workflow_run.
- H4: render agent personality with task input + fall back to raw note.
- M2/M5: warn on unrecognised model id; /health requires WORKER_HEALTH_TOKEN.
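
For readers who have not opened the diff, C1 and C2 can be sketched roughly as follows. This is a minimal illustration, not the code in lib/timeout.ts: the names withTimeout and startPollLoop come from the commit message, but the signatures, the PollHandle type, and the error wording are assumptions made here.

```typescript
// Hypothetical sketch of C1/C2. Signatures and bodies are assumptions.

/** Settles with `work`, or rejects once `ms` milliseconds have passed. */
function withTimeout<T>(work: Promise<T>, ms: number, label = "operation"): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
    // Clear the timer on either outcome so it cannot keep the process alive.
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err: unknown) => { clearTimeout(timer); reject(err); },
    );
  });
}

interface PollHandle {
  /** Stops scheduling new ticks and waits for the in-flight one. */
  stop(): Promise<void>;
}

/** Self-scheduling poller: the next tick is queued only after the previous one settles. */
function startPollLoop(
  loopName: string,
  tick: () => Promise<void>,
  intervalMs: number,
): PollHandle {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;
  let inFlight: Promise<void> = Promise.resolve();

  const run = (): void => {
    if (stopped) return;
    inFlight = tick()
      // A failing tick is logged and never kills the loop.
      .catch((err: unknown) => console.error(`[${loopName}] tick failed`, err))
      .then(() => {
        if (!stopped) timer = setTimeout(run, intervalMs);
      });
  };
  run();

  return {
    async stop(): Promise<void> {
      stopped = true;
      if (timer !== undefined) clearTimeout(timer);
      await inFlight;
    },
  };
}
```

Because each job type would get its own loop, a slow workflow tick only delays the next workflow tick. Note that withTimeout in this form stops waiting but does not cancel the underlying call; for fetch-based provider calls, passing `AbortSignal.timeout(ms)` as the request signal also aborts the request. The stop() handle is the kind of hook a SIGTERM drain (C4) could await.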

Migration (additive + backward-compatible):
- fn_worker_set_workflow_run_status: service-role-safe terminal write.
  fn_update_workflow_run_status is human-auth gated (NULL for the worker) and
  raised 42501 on private workflows, so runs never terminalized — which the new
  recovery loop would re-execute forever. Workers repointed to the safe writer.
- fn_claim_scheduled_workflow_run: stamp heartbeat_at + run_worker_id at claim
  to close the race where recovery steals a freshly-claimed run.
- fn_worker_claim_scheduled_workflow_run: wrapper return type corrected from 6
  mismatched columns to the 7 the inner fn returns (it raised 42804 on every
  call, so scheduled runs were never claimable).
- fn_worker_finalize_team_run, fn_worker_get_run_exec_context,
  fn_worker_create_team_run_workflow_run: support team-run execution + recovery.
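
The claim/recovery race described above is easier to see in a small in-memory model. The sketch below is not the SQL in the migration: the row shape, the 60-second staleness threshold, and the rule that a NULL heartbeat counts as stale are all assumptions chosen to illustrate why stamping heartbeat_at at claim time matters.

```typescript
// Hypothetical in-memory model of claim + recovery. Not the migration's SQL.
interface RunRow {
  id: string;
  status: "scheduled" | "running" | "completed" | "failed";
  heartbeatAt: number | null; // epoch milliseconds
  runWorkerId: string | null;
}

const STALE_AFTER_MS = 60_000; // assumed threshold

/** Claims a scheduled run and stamps heartbeat + owner in the same step. */
function claimScheduledRun(run: RunRow, workerId: string, now: number): boolean {
  if (run.status !== "scheduled") return false;
  run.status = "running";
  run.heartbeatAt = now; // without this stamp, recovery would see a NULL heartbeat
  run.runWorkerId = workerId;
  return true;
}

/** Recovery takes over only a running run whose heartbeat is missing or too old. */
function claimStaleRun(run: RunRow, workerId: string, now: number): boolean {
  if (run.status !== "running") return false; // terminal runs are never reclaimed
  const stale = run.heartbeatAt === null || now - run.heartbeatAt > STALE_AFTER_MS;
  if (!stale) return false;
  run.heartbeatAt = now;
  run.runWorkerId = workerId;
  return true;
}
```

In this model a run claimed at t=0 cannot be taken by recovery at t=1s, but can be at t=61s if no heartbeat arrived in between. It also shows why the terminal write has to succeed: a run stuck in "running" with an old heartbeat is reclaimed on every recovery pass.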

Tests: added timeout + workflow-recovery specs; rewrote team-run spec for real
execution; updated scheduled-workflow + battle-worker specs for new mocks/args.
Verified by a static review agent (all checks pass). Local jest/nx could not be
run in this sandbox (dependency install blocked); CI is the gate.

Follow-ups (pre-existing, out of scope): regenerate database.types.ts for the
claim wrapper; fn_worker_get_workflow_context lacks workspace_id; reconcile
fn_build_lenser_prompt_context / fn_worker_render_template signatures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tence

Pre-existing latent issues surfaced during the worker-runtime-stability
review (migration 20270602000001):

- fn_worker_get_workflow_context now returns workspace_id (the workflow
  owner's personal workspace). The executor reads wfCtxRow.workspace_id to
  attribute workflow-output media, but the function only returned
  (workflow_id, triggered_by), so workspace_id was always NULL and the
  media-object insert never fired. The insert is now also best-effort
  (try/catch) so a media-store failure can never fail the run.
- fn_worker_render_template(text, jsonb): the worker calls a body-based
  render, but only the version-based (uuid, jsonb) overload existed in
  committed migrations. Added the body overload ([[key]] substitution, the
  repo-wide token convention); distinct signature so it coexists.
- fn_build_lenser_prompt_context: added a public wrapper delegating to the
  agents-schema function, which the worker's public-schema rpc() could never
  resolve before — agent memory context was silently dropped.
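
As an illustration of the [[key]] token convention mentioned for the body-based render, here is a TypeScript sketch of the substitution. It is not the SQL function: renderTemplateBody is a hypothetical name, and the handling of unknown keys (left as-is), null values (empty string), and non-string values (JSON-encoded) are assumptions.

```typescript
/**
 * Hypothetical sketch of [[key]] substitution.
 * Unknown keys are left untouched so a missing input stays visible in the output.
 */
function renderTemplateBody(body: string, vars: Record<string, unknown>): string {
  return body.replace(/\[\[\s*([A-Za-z0-9_.-]+)\s*\]\]/g, (token: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, key)) return token;
    const value = vars[key];
    if (value === null || value === undefined) return "";
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}
```

For example, rendering `"Hi [[name]], n=[[count]] [[missing]]"` with `{ name: "Ada", count: 3 }` yields `"Hi Ada, n=3 [[missing]]"`.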

Also regenerated the database.types.ts entry for
fn_worker_claim_scheduled_workflow_run to the corrected 7-column return shape
(context_inputs/global_model_id/triggered_by) matching the wrapper fix in the
prior commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Disambiguate the two public properties so they stop declaring the same
canonical host: apps/web canonicalizes to moon.lenserfight.com and apps/arena
to the apex lenserfight.com, across index.html, the build-time SEO generator
(tools/seo/app-seo.mjs), PageMeta, SEOHead, seoService, robots, sitemaps, and
llms files. Remove the non-existent arena.lenserfight.com host everywhere
(generator sameAs, docs policy links, auth return allow-list, both SEO specs).

Strengthen GEO/structured data: lenses as CreativeWork, battles and threads
as DiscussionForumPosting, lensers as ProfilePage>Person, each wrapped in an
@graph with BreadcrumbList and interactionStatistic. Fix the ?lang=tr canonical
self-reference and hreflang emission, noindex thin search routes, and allow-list
AI crawlers. Convert entity card/row navigation from onClick handlers to real
LocaleLink anchors so crawlers (and middle-click/new-tab) see real hrefs.
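
A rough sketch of the @graph shape described above, for a battle or thread page. The schema.org types and properties are real; the function name, the input fields, and the choice of CommentAction as the counted interaction are assumptions for illustration and may differ from what tools/seo/app-seo.mjs emits.

```typescript
// Hypothetical builder. Input shape and interaction type are assumptions.
interface ForumPostingSeoInput {
  url: string;
  headline: string;
  authorName: string;
  datePublished: string; // ISO 8601
  commentCount: number;
  breadcrumbs: Array<{ name: string; url: string }>;
}

function buildForumPostingJsonLd(input: ForumPostingSeoInput): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@graph": [
      {
        "@type": "DiscussionForumPosting",
        "@id": `${input.url}#posting`,
        url: input.url,
        headline: input.headline,
        datePublished: input.datePublished,
        author: { "@type": "Person", name: input.authorName },
        interactionStatistic: {
          "@type": "InteractionCounter",
          interactionType: "https://schema.org/CommentAction",
          userInteractionCount: input.commentCount,
        },
      },
      {
        "@type": "BreadcrumbList",
        // ListItem positions are 1-based.
        itemListElement: input.breadcrumbs.map((crumb, index) => ({
          "@type": "ListItem",
          position: index + 1,
          name: crumb.name,
          item: crumb.url,
        })),
      },
    ],
  };
}
```

The returned object would be serialized with JSON.stringify into a `<script type="application/ld+json">` tag; keeping both nodes in one @graph lets them share a single @context.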

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Claude-Session: https://claude.ai/code/session_01FY9vwRL6YA7dNiV4x18EBn
Replace the client-side public-view + private-merge search with a single
SECURITY DEFINER RPC (fn_search_lenses) that re-encodes the base-table RLS
contract: authors see their own lenses at any visibility/status, everyone else
sees only published lenses gated by visibility. Matches title, description,
content, tag name/slug, and parameter labels, scoped optionally to one profile.
The workflow palette now passes a null owner scope so the RPC enforces
per-viewer visibility instead of the auth id matching nothing.

Add getMySavedLenses (fn_get_my_saved_lenses) for bookmarked lenses, gated by
current visibility so a lens that turns private after saving drops off, surfaced
as an owner-only Saved tab on the profile page. Both functions set
row_security off with a scoped search_path to satisfy the definer+FORCE-RLS
contract.
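
The per-viewer rule that fn_search_lenses re-encodes can be stated as a small predicate. This is a sketch of the rule as worded above, not the SQL: the LensRow shape and the concrete status/visibility values ("draft", "published", "public", "private") are assumptions, and the real function may distinguish more visibility levels.

```typescript
// Hypothetical model of the visibility contract. Enum values are assumed.
interface LensRow {
  authorId: string;
  status: "draft" | "published";
  visibility: "public" | "private";
}

/** Authors see their own lenses in any state; everyone else needs published + public. */
function canViewLens(lens: LensRow, viewerId: string | null): boolean {
  if (viewerId !== null && lens.authorId === viewerId) return true;
  return lens.status === "published" && lens.visibility === "public";
}

/** Optional owner scope narrows results; a null scope searches across all authors. */
function filterSearchResults(
  lenses: LensRow[],
  viewerId: string | null,
  ownerScope: string | null,
): LensRow[] {
  return lenses.filter(
    (lens) =>
      (ownerScope === null || lens.authorId === ownerScope) && canViewLens(lens, viewerId),
  );
}
```

The second function mirrors the workflow-palette fix: with a null owner scope the viewer's own private lenses and other authors' public ones are both returned, whereas scoping to an id that matches no profile returns nothing.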

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Claude-Session: https://claude.ai/code/session_01FY9vwRL6YA7dNiV4x18EBn
const FORUM_HOST = 'https://moon.lenserfight.com'
/** Apex marketing site (different app) — reserved for genuine cross-links only, not entity canonicals. */
// eslint-disable-next-line @typescript-eslint/no-unused-vars
const ARENA_HOST = 'https://lenserfight.com'
ofcskn added 2 commits June 21, 2026 11:16
Raise the pnpm override floor from >=6.27.0 to ">=7.28.0 <8" so the single
resolved undici version moves from the vulnerable 7.24.5 to 7.28.0, closing
three high-severity advisories pulled in transitively via expo > @expo/cli:
TLS validation bypass via SOCKS5 ProxyAgent (GHSA-vmh5-mc38-953g), WebSocket
DoS via fragment-count bypass (GHSA-vxpw-j846-p89q), and cross-origin routing
via SOCKS5 pool reuse (GHSA-hm92-r4w5-c3mj). The <8 upper bound keeps the tree
on the 7.x line @expo/cli expects instead of forcing a jump to undici 8.
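
To make the effect of the new range concrete, the check below hand-evaluates ">=7.28.0 <8" for plain x.y.z versions. It is a simplified sketch, not pnpm's resolver: it ignores pre-release tags and build metadata, and the function names are made up for this example.

```typescript
// Hypothetical, simplified range check for ">=7.28.0 <8" (no pre-release handling).
type Version = [number, number, number];

function parseVersion(text: string): Version {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (!match) throw new Error(`not a plain x.y.z version: ${text}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

/** Negative if a < b, zero if equal, positive if a > b. */
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

function satisfiesUndiciOverride(version: string): boolean {
  const parsed = parseVersion(version);
  const floor: Version = [7, 28, 0];
  return compareVersions(parsed, floor) >= 0 && parsed[0] < 8;
}
```

Under this check 7.24.5 is rejected, 7.28.0 and later 7.x releases are accepted, and 8.0.0 is rejected, which matches the intent stated in the commit message.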

pnpm audit --prod --audit-level high now exits 0 (high: 3 -> 0).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Claude-Session: https://claude.ai/code/session_01FY9vwRL6YA7dNiV4x18EBn
The search() method moved from a vw_lenses_public ilike + fn_list_my_private_lenses
merge to a single SECURITY DEFINER RPC (fn_search_lenses). Update the three stale
assertions to expect the RPC call with p_query/p_owner_id/p_offset/p_limit, covering
both owner-scoped and null-owner search paths.
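
The shape the updated assertions expect can be sketched like this. The RPC name and the four parameter names come from the commit message; the RpcClient interface is a minimal stand-in for the Supabase client's rpc(fn, args) call, and the searchLenses signature and default paging values are assumptions.

```typescript
// Hypothetical sketch. Only the RPC name and p_* parameter names come from the commit.
interface RpcResult {
  data: unknown;
  error: { message: string } | null;
}

interface RpcClient {
  rpc(fn: string, args: Record<string, unknown>): Promise<RpcResult>;
}

async function searchLenses(
  client: RpcClient,
  query: string,
  ownerId: string | null, // null = no owner scope; the RPC applies per-viewer visibility
  offset = 0,
  limit = 20,
): Promise<unknown> {
  const { data, error } = await client.rpc("fn_search_lenses", {
    p_query: query,
    p_owner_id: ownerId,
    p_offset: offset,
    p_limit: limit,
  });
  if (error) throw new Error(error.message);
  return data;
}
```

A spec can then pass a recording fake for RpcClient and assert on the captured function name and arguments for both the owner-scoped and null-owner paths, without touching a database.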

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Claude-Session: https://claude.ai/code/session_01FY9vwRL6YA7dNiV4x18EBn
@ofcskn ofcskn merged commit c4b0587 into main Jun 21, 2026
16 of 18 checks passed