feat: normalize labels into indexed table for efficient filtering by coji · Pull Request #103 · coji/durably

coji · 2026-03-08T09:45:11Z

Summary

- Add durably_run_labels(run_id, key, value) normalized table with (key, value) index for O(log n) label filtering instead of full-table-scan JSON extraction
- Dual-write strategy: labels written to both JSON column (source of truth) and normalized table (derived performance index for filtering)
- getRuns() label filtering uses EXISTS subqueries with JSON fallback (json_extract/->>) to ensure runs are never lost even if label row insert hasn't completed
- Consolidated migrations into single v1 (pre-release, no existing databases need incremental migration)
- deleteRun() cleans up label rows in the same transaction
- Fix SQLITE_BUSY with libsql: Added write mutex to serialize all mutating Store operations, preventing concurrent write contention caused by libsql's connection model (transactions open separate SQLite connections)
- Atomicity fix: enqueue() now wraps run insert + label row insert in a transaction (was two separate operations)
- Batch optimization: enqueueMany() collects all label rows into a single INSERT instead of one per run
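
A minimal sketch of the label filter described in the summary, assuming SQLite-style `json_extract` and `?` placeholders. The helper name `buildLabelFilter`, the decision to AND the per-label clauses together, and the quoted JSON path are assumptions for illustration; the actual code in storage.ts may build the query differently.

```typescript
// Hypothetical helper: builds a WHERE fragment for label filtering.
// Table and column names come from the PR; everything else is illustrative.
interface SqlFragment {
  sql: string
  params: string[]
}

function buildLabelFilter(labels: Record<string, string>): SqlFragment {
  const clauses: string[] = []
  const params: string[] = []

  for (const [key, value] of Object.entries(labels)) {
    // Indexed path: probe durably_run_labels through the (key, value) index.
    const indexed =
      'EXISTS (SELECT 1 FROM durably_run_labels ' +
      'WHERE durably_run_labels.run_id = durably_runs.id ' +
      'AND durably_run_labels.key = ? AND durably_run_labels.value = ?)'
    // Fallback path: read the JSON column, which stays the source of truth.
    const fallback = 'json_extract(durably_runs.labels, ?) = ?'
    clauses.push(`(${indexed} OR ${fallback})`)
    // The JSON path is passed as a parameter; keys containing a double
    // quote would need extra escaping, which this sketch does not handle.
    params.push(key, value, `$."${key}"`, value)
  }

  // Assumption: every requested label must match, so clauses are ANDed.
  return { sql: clauses.join(' AND '), params }
}

const labelFilter = buildLabelFilter({ tenant: 'acme', env: 'prod' })
console.log(labelFilter.sql)
console.log(labelFilter.params)
```

The OR keeps a run visible even if its row in durably_run_labels is missing, at the cost of the JSON branch not being able to use the index.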

Motivation

For multi-tenant workloads (e.g., 10 tenants × 48 runs/day × 365 × 3 years = 525K runs), JSON-based label filtering requires scanning every row. The normalized table with a (key, value) index reduces this to O(log n).
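
The arithmetic behind that estimate, worked through as a quick check. The binary-search figure is only an order-of-magnitude stand-in for an index lookup; real B-tree indexes have a much larger fan-out and touch fewer pages than this bound suggests.

```typescript
// Figures taken from the Motivation paragraph.
const tenants = 10
const runsPerDay = 48
const daysPerYear = 365
const years = 3

const totalRuns = tenants * runsPerDay * daysPerYear * years
console.log(totalRuns) // 525600

// A JSON-based filter inspects every row in the worst case.
const rowsScanned = totalRuns
console.log(rowsScanned) // 525600

// Upper bound on comparisons for a binary search over the same number of keys.
const binarySearchSteps = Math.ceil(Math.log2(totalRuns))
console.log(binarySearchSteps) // 20
```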

libsql's transaction model opens separate SQLite connections, causing SQLITE_BUSY when concurrent writes happen (e.g., worker processing + user enqueue). The write mutex serializes mutations within a single process to prevent this.
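
A minimal sketch of an in-process async write mutex of the kind described above. The class name `WriteMutex`, the method `runExclusive`, and the `demo` harness are hypothetical and not taken from the PR's code; the sketch only shows how chaining promises serializes mutating operations while reads stay unlocked.

```typescript
// Minimal in-process async mutex: each task waits for the previous one.
class WriteMutex {
  private tail: Promise<void> = Promise.resolve()

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    // Chain the task after whatever is currently queued.
    const result = this.tail.then(task)
    // Keep the queue usable even if the task rejects.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    )
    return result
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

async function demo(): Promise<string[]> {
  const mutex = new WriteMutex()
  const log: string[] = []

  // Stand-in for a mutating Store operation such as enqueue() or deleteRun().
  const write = (name: string, ms: number) =>
    mutex.runExclusive(async () => {
      log.push(`${name}:start`)
      await sleep(ms)
      log.push(`${name}:end`)
    })

  // Started concurrently, but executed strictly one after another.
  await Promise.all([
    write('worker', 20),
    write('enqueue', 5),
    write('deleteRun', 1),
  ])
  return log
}

demo().then((log) => console.log(log))
```

Because the lock lives in process memory, it only prevents contention between writers sharing one process; separate processes writing to the same database file would still need to handle SQLITE_BUSY themselves.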

Test plan

All 161 existing tests pass
pnpm validate passes (format, lint, typecheck, tests)
Migration test updated to verify consolidated v1 schema with label indexes
New libsql write contention tests (5 scenarios: enqueue, batchTrigger, deleteRun, triggerAndWait with idempotencyKey, concurrent enqueue+batchTrigger)

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a normalized run labels table with indexes; labels are persisted on run creation (including batch enqueues) and removed on run delete.
Refactor
- Switched label filtering to use indexed label rows with a safe JSON fallback.
- Serialized all mutating operations within a process to avoid write contention.
Tests
- Added migration assertions for label indexes and tests simulating libsql/SQLite write contention.

Labels were stored as JSON in durably_runs.labels and filtered via json_extract/json_each, which requires full table scans. At scale (500K+ runs for multi-tenant workloads), this becomes a bottleneck. Add durably_run_labels(run_id, key, value) with (key, value) index for O(log n) lookups. Uses dual-write strategy: JSON column kept for reads/events, normalized table used for WHERE-clause filtering via EXISTS subqueries. Migration v2 backfills from existing JSON data with SQLite/PostgreSQL-specific JSON functions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel · 2026-03-08T09:45:15Z

Project	Deployment	Actions	Updated (UTC)
durably-demo	Ready	Preview	Mar 8, 2026 10:58am

coderabbitai · 2026-03-08T09:45:24Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0f405bcf-3220-4f43-98df-21ec564ae3e2

📥 Commits

Reviewing files that changed from the base of the PR and between 6975bb6 and 675be95.

📒 Files selected for processing (1)

packages/durably/src/storage.ts

🚧 Files skipped from review as they are similar to previous changes (1)

packages/durably/src/storage.ts

📝 Walkthrough

Adds a normalized durably_run_labels table and indexes, persists labels on enqueue/delete, replaces JSON-only label filtering with indexed EXISTS checks (with JSON fallback), serializes mutating storage operations with an in-process write lock, and records the detected backend for migrations and storage creation.

Changes

Cohort / File(s)	Summary
Schema & Migrations `packages/durably/src/schema.ts`, `packages/durably/src/migrations.ts`, `packages/durably/tests/node/migration-consolidated.test.ts`	Introduce `RunLabelsTable` and `durably_run_labels` table; add unique index on `(run_id,key)` and non-unique index on `(key,value)`; migration/backfill added; tests assert new indexes.
Storage & Concurrency `packages/durably/src/storage.ts`, `packages/durably/src/durably.ts`	Add in-process write mutex to serialize writes; add `insertLabelRows` and persist labels on `enqueue`/`enqueueMany`; deleteRun removes label rows; label-based filtering uses indexed EXISTS with JSON fallback; propagate detected backend into store creation.
Docs `CLAUDE.md`	Update schema note to mention new `durably_run_labels` table.
Tests: Write Contention `packages/durably/tests/node/libsql-write-contention.test.ts`	Add tests simulating concurrent enqueue/enqueueMany/deleteRun under an active worker to validate write-lock behavior and absence of SQLITE_BUSY errors.
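
A sketch of DDL that would match the schema row above, held as TypeScript string constants. The table and column names come from the PR; the column types, `IF NOT EXISTS` clauses, and index names are assumptions, and the real migration in migrations.ts may differ.

```typescript
// Hypothetical DDL for the normalized label table and its two indexes.
const labelTableDdl: string[] = [
  `CREATE TABLE IF NOT EXISTS durably_run_labels (
     run_id TEXT NOT NULL,
     key    TEXT NOT NULL,
     value  TEXT NOT NULL
   )`,
  // Unique index: at most one value per key per run.
  `CREATE UNIQUE INDEX IF NOT EXISTS idx_durably_run_labels_run_key
     ON durably_run_labels (run_id, key)`,
  // Non-unique index: serves filters of the form key = ? AND value = ?.
  `CREATE INDEX IF NOT EXISTS idx_durably_run_labels_key_value
     ON durably_run_labels (key, value)`,
]

for (const statement of labelTableDdl) {
  console.log(statement + ';')
}
```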

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant State as DurablyState
    participant Backend as BackendDetector
    participant Migrations as Migrations
    participant DB as Database

    App->>State: initialize(options)
    State->>Backend: detectBackend(options.dialect)
    Backend-->>State: backend
    State->>Migrations: runMigrations(db, backend)
    Migrations->>DB: check schema version
    alt needs v2
        Migrations->>DB: create durably_run_labels table
        Migrations->>DB: create indexes (run_id+key unique, key+value non-unique)
        alt backend supports set-based backfill
            Migrations->>DB: set-based insert of label rows
        else
            Migrations->>DB: iterate runs and insert label rows
        end
    end
    DB-->>Migrations: complete
    Migrations-->>State: done
    State-->>App: ready

sequenceDiagram
    participant Client as App/API
    participant Store as Storage Layer
    participant Mutex as In-Process WriteLock
    participant DB as Database
    participant Labels as durably_run_labels
    participant Worker as Worker

    Client->>Store: enqueue(run, labels)
    Store->>Mutex: acquire write lock
    Mutex-->>Store: lock granted
    Store->>DB: insert into durably_runs -> run_id
    Store->>Labels: insert normalized rows (run_id, key, value)
    Labels-->>Store: rows inserted
    Store->>Mutex: release write lock
    Store-->>Client: enqueue result

    Note over Worker,Store: concurrent mutating ops serialized by Mutex

    Client->>Store: getRuns(filter with labels)
    Store->>Labels: EXISTS query for matching (key,value)
    alt EXISTS supported/used
        Labels-->>Store: matching run_ids
        Store->>DB: fetch runs by ids/filters
    else JSON fallback
        Store->>DB: JSON-based label filtering
    end
    DB-->>Store: runs
    Store-->>Client: filtered runs

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

coji/durably PR 45: Modifies run-claiming logic in packages/durably/src/storage.ts — overlaps with write-lock and claim/claimNext-related changes.
coji/durably PR 75: Changes label handling in storage and API — directly related to normalized label storage and queries.
coji/durably PR 101: Refactors storage/migration surface — touches backend propagation and migration behavior affected here.

Poem

🐇 I tunneled through rows and indexed vines,
Turned JSON nests into tidy lines,
Locks keep my burrow calm and sure,
Labels hop home — precise, secure,
Hooray — a carrot for each indexed sign! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main feature: normalizing labels into an indexed table to improve filtering efficiency.

No need for separate v2 migration since there are no existing databases to migrate from. Remove backend parameter from runMigrations since backfill logic is no longer needed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

packages/durably/src/storage.ts (1)

486-500: Consider avoiding redundant JSON re-parsing.

The labels are already available from each input.labels in the original inputs array. Currently, the code JSON-stringifies labels into run.labels (line 471) and then re-parses them (line 490). While functionally correct, you could avoid this round-trip by tracking which input corresponds to each new run.

♻️ Optional refactor to avoid re-parsing

         // Insert all new runs in a single batch
-        const newRuns = runs.filter((r) => r.created_at === now)
+        const newRunIndices: number[] = []
+        for (let i = 0; i < runs.length; i++) {
+          if (runs[i].created_at === now) newRunIndices.push(i)
+        }
+        const newRuns = newRunIndices.map((i) => runs[i])
         if (newRuns.length > 0) {
           await trx.insertInto('durably_runs').values(newRuns).execute()

           // Insert normalized labels for indexed filtering
           const labelRows: { run_id: string; key: string; value: string }[] = []
-          for (const run of newRuns) {
-            const labels = JSON.parse(run.labels) as Record<string, string>
+          for (const idx of newRunIndices) {
+            const labels = inputs[idx].labels ?? {}
             for (const [key, value] of Object.entries(labels)) {
-              labelRows.push({ run_id: run.id, key, value })
+              labelRows.push({ run_id: runs[idx].id, key, value })
             }
           }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/durably/src/storage.ts` around lines 486 - 500, The code reparses
run.labels from newRuns to build labelRows; instead, carry the parsed labels
from the original inputs to avoid the JSON round-trip: when creating the runs
(the runRows/newRuns creation around run.labels assignment), attach or preserve
the parsed labels (e.g., parsedLabels or use the inputs[index].labels) and then
use that property instead of JSON.parse(run.labels) when building labelRows for
inserting into durably_run_labels via trx.insertInto; reference symbols:
newRuns, run.labels, inputs, labelRows, durably_run_labels, trx.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a2a061c-fe99-48ce-9364-4ee92e77b4ac

📥 Commits

Reviewing files that changed from the base of the PR and between a8ddecf and 1e68092.

📒 Files selected for processing (6)

CLAUDE.md
packages/durably/src/durably.ts
packages/durably/src/migrations.ts
packages/durably/src/schema.ts
packages/durably/src/storage.ts
packages/durably/tests/node/migration-consolidated.test.ts

…bel filtering

- Extract shared insertLabelRows() helper for enqueue/enqueueMany
- Avoid JSON round-trip in enqueueMany by tracking labels in a Map
- Add JSON fallback to getRuns label filter (EXISTS OR json_extract) to handle atomicity gap without requiring transactions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Tests that exercise concurrent writes between the worker and user operations (enqueue, enqueueMany, deleteRun, triggerAndWait). deleteRun currently fails with SQLITE_BUSY due to libsql's connection model opening separate connections for transactions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

libsql opens separate SQLite connections for transactions, causing SQLITE_BUSY when concurrent writes happen from the same Kysely instance (e.g., worker processing + user enqueue/delete). Add an async write mutex that serializes all mutating operations within a single process. Read operations are not locked. This preserves all existing transactions (deleteRun, recordStep, enqueueMany) while preventing write/write conflicts. Also fix contention test type errors by adding output schemas. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

packages/durably/src/storage.ts (1)

515-519: Consider batching all label inserts for efficiency.

Currently, this iterates and issues a separate insertLabelRows call per run. For large batch enqueues with many labeled runs, this could be optimized to a single insert.

♻️ Optional: Single batch insert for all labels

         // Insert normalized labels for indexed filtering
-        for (const [runId, labels] of newRunLabels) {
-          await insertLabelRows(trx, runId, labels)
-        }
+        const allLabels = [...newRunLabels.entries()].flatMap(([runId, labels]) =>
+          Object.entries(labels).map(([key, value]) => ({ run_id: runId, key, value }))
+        )
+        if (allLabels.length > 0) {
+          await trx.insertInto('durably_run_labels').values(allLabels).execute()
+        }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/durably/src/storage.ts` around lines 515 - 519, The loop issues one
insertLabelRows(trx, runId, labels) per run which is inefficient; instead
collect all label rows from newRunLabels into a single array and perform one
bulk insert within the transaction (either by extending insertLabelRows to
accept a batched payload or adding a new insertLabelRowsBatch(trx, rows)
helper). Build rows with runId, key, value (or the same shape insertLabelRows
expects) and call the single bulk insert using the existing trx so all labels
are inserted in one query.
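
A self-contained sketch of the flattening step that suggestion relies on, assuming `newRunLabels` is a Map from run id to a labels record as in the snippet above. The function name `toLabelRows` and the sample data are hypothetical.

```typescript
interface LabelRow {
  run_id: string
  key: string
  value: string
}

// Flatten a runId -> labels map into rows suitable for one multi-row INSERT.
function toLabelRows(
  newRunLabels: Map<string, Record<string, string>>,
): LabelRow[] {
  const rows: LabelRow[] = []
  newRunLabels.forEach((labels, runId) => {
    for (const [key, value] of Object.entries(labels)) {
      rows.push({ run_id: runId, key, value })
    }
  })
  return rows
}

const sampleRunLabels = new Map<string, Record<string, string>>([
  ['run-1', { tenant: 'acme', env: 'prod' }],
  ['run-2', {}], // unlabeled runs contribute no rows
  ['run-3', { tenant: 'globex' }],
])

const labelRows = toLabelRows(sampleRunLabels)
console.log(labelRows.length) // 3
```

The caller would still need to skip the INSERT when the array is empty, since a multi-row insert with no values is not valid SQL.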

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2104fc7a-955e-4feb-ab26-1ab3eaaf9f80

📥 Commits

Reviewing files that changed from the base of the PR and between 1e68092 and 6975bb6.

📒 Files selected for processing (5)

packages/durably/src/durably.ts
packages/durably/src/migrations.ts
packages/durably/src/storage.ts
packages/durably/tests/node/libsql-write-contention.test.ts
packages/durably/tests/node/migration-consolidated.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

packages/durably/src/durably.ts
packages/durably/tests/node/migration-consolidated.test.ts

- Wrap enqueue() run+labels in transaction for atomicity
- Batch all label rows in enqueueMany() into single INSERT
- Fix orphaned JSDoc comment placement

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

vercel Bot deployed to Preview March 8, 2026 09:45

vercel Bot deployed to Preview March 8, 2026 09:48

coderabbitai Bot reviewed Mar 8, 2026

coji and others added 3 commits March 8, 2026 19:08

vercel Bot deployed to Preview March 8, 2026 10:51

coderabbitai Bot reviewed Mar 8, 2026

refactor: fix enqueue atomicity and batch label inserts

675be95

vercel Bot deployed to Preview March 8, 2026 10:58

coji merged commit 912039d into main Mar 8, 2026
4 checks passed

coji deleted the normalize-labels branch March 8, 2026 11:00

coji mentioned this pull request Mar 8, 2026

RFC: Phase 1 — Lease-based runtime redesign #98

Closed

28 tasks

coderabbitai Bot mentioned this pull request Mar 12, 2026

feat: add purgeRuns API and retainRuns auto-cleanup option #109

Merged

9 tasks

coji mentioned this pull request Mar 16, 2026

chore: bump version to 0.13.0 #110

Merged

2 tasks

coderabbitai Bot mentioned this pull request Mar 16, 2026

perf: denormalize step_count and remove labels JSON fallback #132

Merged

This was referenced Mar 26, 2026

feat: accept status array in useRuns filter (#140) #154

Merged

test: formalize store contract tests (#162 step 4/4) #177

Merged

coji merged 6 commits into main from normalize-labels
