
feat: compile() — the ergonomic .each path, codegen'd to bindColumns speed #86

Merged: andymai merged 1 commit into main from perf/compile-each on Jun 8, 2026

@andymai andymai commented Jun 8, 2026

What

Adds query.compile(body) — a source-rewriting query compiler for the ergonomic iteration path. It reads an .each-style callback's own .toString(), rewrites every e.<component>.<field> into direct typed-array column indexing, and codegens a specialized per-archetype loop (reusing the PR1 new Function mechanism). The readable accessor syntax now lands near eachChunk/bindColumns instead of paying the per-row proxy tax.

const run = q.compile<{ dt: number }>((e, ctx) => {
  e.position.x += e.velocity.dx * ctx.dt
  e.position.y += e.velocity.dy * ctx.dt
})
run({ dt: 1 / 60 }) // per frame

Measured (50k integrate): compiled ~1.6 ns/entity vs ~10 ns/entity for proxy .each, a ~6× speedup that matches bindColumns. The CI regression lane asserts that compile stays ≤ 1.0× a same-run bitECS control.
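
To make the rewrite concrete, here is a hand-written sketch of the two loop shapes being compared. The `Columns` layout and both function names are assumptions for illustration only; this is not the code that `compile()` emits, just the general idea of replacing each accessor hop with a typed-array index.

```typescript
// Hypothetical column storage for one archetype: one typed array per field.
interface Columns {
  positionX: Float32Array
  positionY: Float32Array
  velocityDx: Float32Array
  velocityDy: Float32Array
  count: number
}

// Accessor style: every field access goes through a getter/setter keyed on `row`.
function integrateAccessorStyle(cols: Columns, dt: number): void {
  let row = 0
  const e = {
    position: {
      get x(): number { return cols.positionX[row] },
      set x(v: number) { cols.positionX[row] = v },
      get y(): number { return cols.positionY[row] },
      set y(v: number) { cols.positionY[row] = v },
    },
    velocity: {
      get dx(): number { return cols.velocityDx[row] },
      get dy(): number { return cols.velocityDy[row] },
    },
  }
  for (row = 0; row < cols.count; row++) {
    e.position.x += e.velocity.dx * dt
    e.position.y += e.velocity.dy * dt
  }
}

// Column style: the same two statements with each access replaced by array[i].
function integrateColumnStyle(cols: Columns, dt: number): void {
  const px = cols.positionX
  const py = cols.positionY
  const dx = cols.velocityDx
  const dy = cols.velocityDy
  for (let i = 0; i < cols.count; i++) {
    px[i] += dx[i] * dt
    py[i] += dy[i] * dt
  }
}
```

Both functions perform the same arithmetic on the same storage, so they should produce identical column contents; only the access path differs.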

Why it's safe (never changes results)

  • Conservative transform. Compiles only straight-line numeric-scalar bodies; falls back to the unchanged proxy .each for anything it can't prove safe (control flow / ? / && / return / loops / nested fns, strings / comments / regex literals, destructuring-assignment, non-numeric-scalar fields vec/bool/eid/bigint/rich, non-required components, any e use beyond e.comp.field, per-row ctx writes, row-filtered queries, runtimes that block new Function).
  • No name collisions. Every generated identifier is __-prefixed and the body is rejected if it contains __, so a user local (v0, trackWrite, …) can never silently shadow a generated one.
  • Self-containment pre-flight. A scratch run on throwaway 1-row arrays catches bodies that close over an outer variable (ReferenceError) → proxy fallback, instead of crashing the first real frame.
  • Reactivity preserved. A written component is recorded in the write log exactly as the accessor setter would (component-granular, gated on tracking.active), with the branch hoisted out of the loop so the no-consumer path is a clean kernel. .changed()/observers see compiled writes identically.
  • Property-tested byte-identical to proxy .each under random spawn/despawn/write/growth churn, with and without a .changed consumer.
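
The self-containment pre-flight in the list above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Runner` signature, the `__v`/`__n`/`__i` names, the empty scratch `ctx`, and the `buildWithPreflight` helper are hypothetical and not taken from `compile-each.ts`. The only grounded idea is the one the description states: run the generated function once on throwaway 1-row arrays and fall back if it throws.

```typescript
// Hypothetical runner shape: column views, row count, user ctx.
type Runner = (views: Float32Array[], count: number, ctx: unknown) => void

// Builds a runner from an already-rewritten body and test-runs it once on
// throwaway 1-row columns. Returns null to signal "use the proxy path instead".
function buildWithPreflight(rewrittenBody: string, viewCount: number): Runner | null {
  let runner: Runner
  try {
    // Strict mode makes a write to an undeclared name throw instead of
    // silently creating a global.
    runner = new Function(
      '__v', '__n', 'ctx',
      `"use strict"; for (let __i = 0; __i < __n; __i++) { ${rewrittenBody} }`,
    ) as Runner
  } catch {
    return null // syntax error, or a runtime that blocks new Function
  }
  const scratch = Array.from({ length: viewCount }, () => new Float32Array(1))
  try {
    runner(scratch, 1, {}) // a closed-over outer variable surfaces here as ReferenceError
  } catch {
    return null
  }
  return runner
}
```

A body that only touches its columns and `ctx` passes the scratch run; one that names a variable from the enclosing scope fails it, because a function created with `new Function` does not capture the scope that created it.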

An adversarial review caught the generated-name-collision (silent shadow), destructuring-write-miss, and prefix-++ write-detection gaps — all fixed and regression-tested here.

Changes

  • new packages/core/src/query/compile-each.ts — the analyzer/transform
  • LiveQuery.compile() + public Query/LooseQuery.compile type
  • bench bucket (ecsia compile) wired into bench:report + the CI regression-ratio lane
  • docs: performance.md "Compile the ergonomic path" section
  • bundle budget ratcheted (+~2.3 KB gz — compile is a query-kernel method, not tree-shakeable)

Test plan

  • pnpm test (1269 passed)
  • pnpm typecheck:tests / typecheck:extras
  • pnpm docs:check
  • pnpm size
  • BENCH_REGRESSION=1 regression lane (4/4 ratios under ceiling)

…speed

`query.compile(body)` reads an `.each`-style callback's own source, rewrites
each `e.<comp>.<field>` to direct typed-array column indexing, and codegens a
specialized per-archetype loop — so the readable accessor syntax lands near
eachChunk/bindColumns (~1.6 ns/entity, ~6x faster than the proxy .each) while
you keep writing `e.position.x += e.velocity.dx * ctx.dt`.

Unlike bindColumns it PRESERVES reactivity: a written component is recorded in
the write log exactly as the accessor setter would (component-granular, gated on
tracking.active, so free when no .changed/observer consumer exists). The tracking
branch is hoisted OUT of the loop, so the common no-consumer path is a clean
kernel V8 compiles like bindColumns.
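
A minimal sketch of what hoisting the tracking branch out of the loop can look like. The `Tracking` shape, the per-row log entries, and `integrateX` are assumptions for illustration; the real write log and its granularity live in the library and are not shown in this commit message.

```typescript
// Hypothetical write log: [row, componentId] pairs, recorded only while active.
interface Tracking {
  active: boolean
  log: Array<[number, number]>
}

function integrateX(
  px: Float32Array,
  dx: Float32Array,
  dt: number,
  tracking: Tracking,
  positionId: number,
): void {
  const n = px.length
  if (tracking.active) {
    // Tracked variant: the body is straight-line, so the write always runs
    // and can be recorded unconditionally after it.
    for (let i = 0; i < n; i++) {
      px[i] += dx[i] * dt
      tracking.log.push([i, positionId])
    }
  } else {
    // Clean variant: no tracking check inside the loop at all.
    for (let i = 0; i < n; i++) {
      px[i] += dx[i] * dt
    }
  }
}
```

The `tracking.active` test runs once per call rather than once per row, which is what keeps the no-consumer loop free of bookkeeping.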

Pure speedup that can never change results: the analyzer is conservative and
falls back to the unchanged proxy .each for anything it can't prove safe —
non-straight-line bodies, strings/comments/regex literals, destructuring-assign,
non-numeric-scalar fields (vec/bool/eid/bigint/rich), non-required components,
any `e` use beyond `e.comp.field`, per-row ctx writes, row-filtered queries,
`__`-prefixed locals (no generated-name collision), and runtimes that block
`new Function`. Every generated identifier is `__`-prefixed and the body is
rejected if it contains `__`, so a user local can never silently shadow one. A
scratch pre-flight runs the runner once on throwaway arrays, so a body that
closes over an outer variable falls back instead of crashing the first frame.

Correctness is property-tested byte-identical to proxy .each under random
spawn/despawn/write/growth churn, with and without a .changed consumer.

- new: packages/core/src/query/compile-each.ts (the source transform)
- LiveQuery.compile() + public Query/LooseQuery.compile type
- bench bucket (ecsia compile) + CI regression-ratio lane (≤1.0x bitECS)
- docs: performance.md "Compile the ergonomic path" section
- bundle budget ratcheted (+~2.3KB gz: compile lives in the query kernel)
@andymai andymai enabled auto-merge (squash) June 8, 2026 09:48
@andymai andymai merged commit fb1317c into main Jun 8, 2026
9 checks passed
@andymai andymai deleted the perf/compile-each branch June 8, 2026 09:50
@release-kun release-kun Bot mentioned this pull request Jun 8, 2026

greptile-apps Bot commented Jun 8, 2026

Greptile Summary

Adds query.compile(body) — a source-rewriting compiler that reads an .each-style callback's .toString(), rewrites every e.<comp>.<field> to direct typed-array column indexing, and codegens a per-archetype runner via the existing new Function path. The result lands near bindColumns speed (~6× faster than the proxy .each) while preserving reactivity, with a conservative bail-out to the unchanged proxy for anything it cannot prove safe.

  • compile-each.ts implements the pure analyzer/transform: regex-based hazard detection (control flow, strings, regex literals, destructuring-assign, __-name collisions), write-detection for assignment operators and pre/post ++/--, ctx-property hoisting, and tracked/clean loop codegen with a per-archetype factory.
  • live-query.ts wires the plan into LiveQuery.compile(): scratch pre-flight to catch closed-over variables before the first real frame, per-archetype binding cache with lazy rebuild on archetype-set growth, column view-change detection (re-back), and cold-archetype proxy fallback.
  • Bench, regression, bundle-budget, type definitions, and docs are all updated consistently.
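
The view-change detection mentioned above can be pictured as a reference comparison against the last bound view. This is a sketch under assumptions: `ColumnStore`, `grow`, and `makeBinding` are hypothetical names, and the real binding cache in `live-query.ts` may be organized quite differently.

```typescript
// Hypothetical column store whose `view` is replaced when the column grows.
interface ColumnStore {
  view: Float32Array
}

// Replaces the backing array with a larger one, which invalidates old views.
// Assumes capacity >= store.view.length.
function grow(store: ColumnStore, capacity: number): void {
  const next = new Float32Array(capacity)
  next.set(store.view)
  store.view = next
}

// Caches the bound view and rebinds only when the store's view object changed.
function makeBinding(store: ColumnStore) {
  let bound = store.view
  let rebinds = 0
  return {
    column(): Float32Array {
      if (store.view !== bound) {
        bound = store.view // re-back onto the new view
        rebinds++
      }
      return bound
    },
    rebindCount: (): number => rebinds,
  }
}
```

Comparing references is a cheap check per archetype per run, and it needs no explicit growth notification from the store.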

Confidence Score: 4/5

Safe to merge; the only behavioral issue is a narrow false positive in write detection that causes over-reporting of .changed() notifications in a contrived pattern, never data corruption.

The compiler is well-designed with conservative fallback semantics — any analysis failure degrades to the proxy, never to wrong data. Three small gaps in compile-each.ts are worth fixing: a preInc check that samples the full preceding block text instead of just the current statement interval (spurious write detection across line boundaries), a dead idByName map, and a regex literal at block start that slips past REGEX_LITERAL. None affect data correctness; only the first could cause extra .changed() notifications in an unusual pattern.

packages/core/src/query/compile-each.ts — the three findings above are all in the analyzer; live-query.ts and the test files look solid.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/src/query/compile-each.ts | New ergonomic-path compiler: parses callback .toString(), rewrites e.comp.field accesses to column indexing, and codegens a per-archetype loop. Conservative bail-out design is sound; three minor issues: preInc cross-statement false positive (spurious .changed()), dead idByName map, and regex literal at block start is not caught by REGEX_LITERAL. |
| packages/core/src/query/live-query.ts | Adds compile() method with scratch pre-flight, per-archetype binding cache, lazy rebuild on archetype-set changes, and column-growth detection via view reference comparison. Logic is solid; seam caching, cold-archetype proxy fallback, and reinvoke on re-back are all handled correctly. |
| packages/core/test/compile-each.test.ts | Unit tests covering correctness vs .each, column growth, reactivity, fallback paths, and the analyzeEachBody transform. Good coverage of the bail conditions documented in the header. |
| packages/core/test/compile-each.property.test.ts | Property-based tests using fast-check assert byte-identical output vs the proxy path under random spawn/despawn/write/growth sequences, and matching .changed() drain counts. Solid oracle design with two separate worlds driven by the same op sequence. |
| packages/schema/src/index.ts | Adds compile<Ctx>() to the Query and LooseQuery public interfaces with accurate JSDoc. Signatures are consistent with the runtime implementation. |
| bench/iterate.ts | Adds makeEcsiaCompiledIter benchmark case wired into the report and honesty-gate cross-validation. Straightforward extension of the existing pattern. |
| bench/test/regression.bench.test.ts | Adds the compile bucket to the regression lane at a 1.0× ceiling (vs the measured ~0.75× baseline), giving adequate headroom for run-to-run noise without being so loose as to miss real regressions. |
| website/guide/performance.md | Adds a clear compile section documenting the ergonomic path, reactivity preservation, and fallback behaviour. Bundle size numbers updated to match the new budget. |


Reviews (1): Last reviewed commit: "feat: compile() — the ergonomic .each pa..."

// written. Straight-line bodies guarantee the write always runs, so the after-body trackWrite is exact.
const rest = block.slice(accessRe.lastIndex).replace(/^\s+/, '')
const postInc = rest.startsWith('++') || rest.startsWith('--')
const preInc = /(?:\+\+|--)\s*$/.test(block.slice(0, m.index)) // prefix ++/-- (any whitespace)

P2 The `preInc` check tests whether the full block text up to this match ends with `++\s*`. This can fire falsely across statement boundaries. For a block like `e.position.x++` followed on the next line by `e.position.x` (the second access is just a read), `block.slice(0, m.index)` is `e.position.x++` plus a trailing newline; it ends with `++`, and the newline satisfies `\s*`, so `preInc` fires and `position` is falsely added to `writtenIds`. This causes spurious `.changed()` notifications for bodies that post-increment one access and then read the same field later as a standalone expression.

Suggested change
- const preInc = /(?:\+\+|--)\s*$/.test(block.slice(0, m.index)) // prefix ++/-- (any whitespace)
+ // Only consider the text between the previous match end (or start of block) and this match start.
+ const preInc = /(?:\+\+|--)\s*$/.test(block.slice(last, m.index)) // prefix ++/-- (any whitespace)
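
The false positive described in this comment can be reproduced in isolation. The sketch below uses a hypothetical stand-in for `accessRe` (the real pattern is not shown in this thread) and assumes `last` means the end index of the previous access match, as the suggestion's comment describes. Under those assumptions the narrowed slice still contains the `++` and the newline for this input, so it gives the same answer as the full-prefix slice. Telling a postfix operator on the previous access apart from a prefix operator on the current one may therefore need an additional check.

```typescript
// Hypothetical stand-in for accessRe: matches e.<comp>.<field>.
const ACCESS = /\be\.(\w+)\.(\w+)/g
// The prefix ++/-- test from the reviewed line.
const PRE_INC = /(?:\+\+|--)\s*$/

interface PreIncProbe {
  access: string
  wholePrefix: boolean // block.slice(0, m.index), as in the original line
  sinceLast: boolean // block.slice(last, m.index), as in the suggestion
}

function probePreInc(block: string): PreIncProbe[] {
  const out: PreIncProbe[] = []
  let last = 0 // assumed meaning: end index of the previous access match
  for (const m of block.matchAll(ACCESS)) {
    const start = m.index ?? 0
    out.push({
      access: m[0],
      wholePrefix: PRE_INC.test(block.slice(0, start)),
      sinceLast: PRE_INC.test(block.slice(last, start)),
    })
    last = start + m[0].length
  }
  return out
}

// Post-increment on line 1, plain read on line 2.
const probes = probePreInc('e.position.x++\ne.position.x')
```

For this input the second entry in `probes` has both flags set, even though the second access is only a read.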

Comment on lines +171 to +174
const specs: EachViewSpec[] = []
const specIndex = new Map<string, number>() // "comp.field" -> views[] index (dedups repeat accesses)
const writtenIds = new Set<number>()
const idByName = new Map<string, number>()

P2 The idByName map is populated inside the access loop but never read after it. This is dead code, likely a leftover from an earlier version of the factory-source builder that referenced component ids by name.

Suggested change
  const specs: EachViewSpec[] = []
  const specIndex = new Map<string, number>() // "comp.field" -> views[] index (dedups repeat accesses)
  const writtenIds = new Set<number>()
- const idByName = new Map<string, number>()

// A regex literal could spell a real `e.comp.field` that string-rewriting would corrupt. A `/` right after
// an operator/open-bracket starts a regex (division's `/` follows a value: identifier/`)`/`]`/number), so
// this flags regex literals without bailing on division.
const REGEX_LITERAL = /[=(,:[!&|?{;+\-*%<>~^]\s*\//

P2 Regex literal at block start not caught

REGEX_LITERAL only fires when / is preceded by an operator character, so a regex literal at the very start of a block (/e\.position\.x/ .test(...)) would not be caught. In that contrived case accessRe would match the literal text e.position.x inside the regex pattern and corrupt the generated source. A simple additional guard — bail if the trimmed block starts with / — closes this gap.
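
The gap and the proposed guard can be checked directly against the `REGEX_LITERAL` pattern quoted above. The wrapper `hasRegexLiteralHazard` is a hypothetical name for illustration; the guard itself (bail if the trimmed block starts with `/`) is the one this comment suggests.

```typescript
// Pattern as quoted in the reviewed source.
const REGEX_LITERAL = /[=(,:[!&|?{;+\-*%<>~^]\s*\//

// Hypothetical wrapper: adds the suggested block-start guard to the existing test.
function hasRegexLiteralHazard(block: string): boolean {
  return block.trimStart().startsWith('/') || REGEX_LITERAL.test(block)
}

// A regex literal at the very start of the block has no preceding operator.
const leading = '/e\\.position\\.x/ .test(s)'
const missedByPattern = !REGEX_LITERAL.test(leading)
const caughtByGuard = hasRegexLiteralHazard(leading)
```

The guard also rejects a block that begins with a `//` or `/*` comment, which is consistent with comments already being a fallback condition.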

