-
Notifications
You must be signed in to change notification settings - Fork 0
ci: bench regression lane — ecsia/bitECS ratios under a ceiling #84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| { | ||
| "_comment": "CI bench regression ceilings — MAX allowed ns/entity RATIO of each ecsia path vs a SAME-RUN bitECS control (so machine drift cancels). A real regression (e.g. codegen breaking → bindColumns deopts from ~0.72x to ~1.5x) trips the ceiling; ~10% run-to-run noise does not. RATCHET: when a path durably improves, lower its ceiling here. Measured 2026-06-08: bindColumns ~0.72x, eachChunk ~1.08x, each ~7.4x.", | ||
| "ratiosVsBitecs": { | ||
| "bindColumns": 0.9, | ||
| "eachChunk": 1.3, | ||
| "each": 9.0 | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| // CI bench REGRESSION lane. Times each ecsia iteration path against a SAME-RUN bitECS control and | ||
| // asserts the ns/entity RATIO stays under a committed ceiling (bench/regression-baseline.json). The | ||
| // ratio cancels machine drift — a noisy shared runner moves both ecsia and bitECS together — so a | ||
| // failure means a genuine regression (e.g. codegen breaking and bindColumns deopting to ~1.5x), not | ||
| // scheduling noise. Gated behind BENCH_REGRESSION=1 so it runs ONLY in its dedicated CI job, never | ||
| // in the default `pnpm test` (where measurement noise would flake unit CI). Ratchet ceilings down in | ||
| // the baseline file when a path durably improves. | ||
|
|
||
| import { describe, expect, test } from 'vitest' | ||
| import { readFileSync } from 'node:fs' | ||
| import { fileURLToPath } from 'node:url' | ||
| import { makeEcsiaIter, makeEcsiaCursorIter, makeEcsiaPinnedIter, makeBitEcsIter } from '../iterate.js' | ||
| import type { IterCase } from '../iterate.js' | ||
|
|
||
| const ENABLED = process.env['BENCH_REGRESSION'] === '1' | ||
| const N = 50_000 | ||
| const WARMUP = 300 | ||
| const TIMED = 1500 | ||
| const REPS = 3 // best-of-N rounds (each round rebuilds + re-warms) to shake off a single bad schedule | ||
|
|
||
| interface CtxIter extends IterCase { | ||
| step(): void | ||
| } | ||
|
|
||
| /** p50 ns/entity over TIMED samples, taking the best (min) p50 across REPS rebuilds. */ | ||
| function nsPerEntity(make: (n: number) => CtxIter): number { | ||
| let best = Infinity | ||
| for (let rep = 0; rep < REPS; rep++) { | ||
| const c = make(N) | ||
| for (let i = 0; i < WARMUP; i++) c.step() | ||
| const s: number[] = [] | ||
| for (let r = 0; r < TIMED; r++) { | ||
| const t0 = performance.now() | ||
| c.step() | ||
| s.push(performance.now() - t0) | ||
| } | ||
| s.sort((a, b) => a - b) | ||
| const p50 = (s[s.length >> 1] as number) * 1e6 / N | ||
| if (p50 < best) best = p50 | ||
| } | ||
| return best | ||
| } | ||
|
|
||
| const baseline = JSON.parse( | ||
| readFileSync(fileURLToPath(new URL('../regression-baseline.json', import.meta.url)), 'utf8'), | ||
| ) as { ratiosVsBitecs: Record<string, number> } | ||
|
|
||
| describe.skipIf(!ENABLED)('bench regression — ecsia/bitECS ns/entity ratios under ceiling', { timeout: 120_000 }, () => { | ||
| // ONE bitECS control measured in the same process/run as the ecsia paths below. | ||
| const bit = nsPerEntity(makeBitEcsIter) | ||
|
|
||
|
Comment on lines
+49
to
+51
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Even when Prompt To Fix With AIThis is a comment left during a code review.
Path: bench/test/regression.bench.test.ts
Line: 49-51
Comment:
**`bit` and ecsia paths measured in different execution phases**
Even when `ENABLED=true`, `bit` is measured during the `describe` callback (collection phase) while the three ecsia paths are measured inside `test.each` (execution phase). On a noisy shared runner, the machine state can differ noticeably between these two phases — other background work, JIT warm-up of unrelated code, etc. — slightly undermining the "same-run same-conditions" goal. Moving `bit` into a `beforeAll` would keep all timing in the same execution phase.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time! |
||
| test.each([ | ||
| ['bindColumns', makeEcsiaPinnedIter as (n: number) => CtxIter], | ||
| ['eachChunk', makeEcsiaCursorIter as (n: number) => CtxIter], | ||
| ['each', makeEcsiaIter as (n: number) => CtxIter], | ||
| ])('%s ratio vs bitECS stays under its ceiling', (name, make) => { | ||
| const ns = nsPerEntity(make) | ||
| const ratio = ns / bit | ||
| const ceiling = baseline.ratiosVsBitecs[name] as number | ||
| // Report the actual ratio in the assertion message so a CI failure shows the regression size. | ||
| expect(ratio, `${name}: ${ns.toFixed(2)} ns/e = ${ratio.toFixed(3)}x bitECS (${bit.toFixed(2)} ns/e); ceiling ${ceiling}x`).toBeLessThanOrEqual(ceiling) | ||
| }) | ||
| }) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bitruns at describe-body scope, defeating theBENCH_REGRESSIONflagconst bit = nsPerEntity(makeBitEcsIter)sits in thedescribecallback body, which Vitest evaluates during test collection — even when the suite is marked skip viadescribe.skipIf. Because thebenchproject (bench/test/**/*.test.ts) is included in the sharedvitest.config.tsandpnpm testruns all projects, this fires the full bitECS measurement (3 × 1 800 iterations × 50 k entities) on every ordinarypnpm test -- --coverageCI run, directly contradicting the stated intent. Moving it into abeforeAllcallback is the standard fix: Vitest does not invokebeforeAllfor skipped suites, so the work is fully suppressed whenENABLEDis false.Prompt To Fix With AI