Merged
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -77,6 +77,9 @@ jobs:
- name: Runnable engine + build manifest
run: npm run test:runnable

- name: Lazy engine (--lazy-engine load()/runScoped() + cone scoping)
run: npm run test:lazy-engine

- name: Artifact slimming
run: npm run test:slimming

48 changes: 48 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,53 @@
# excel-to-engine — Changelog

## 2026-05-29 — Chunked-build scaling walls: streamed emit, borrowed partitions, opt-in lazy engine (#22)

With the partition-hang fixed, a clean `ete init` on the real models got *past*
partitioning but then drove the parser past 18 GB in the module-emit step (and a
complete build was still slow to *run* as an oracle). Three walls closed — two
internal memory fixes done unconditionally, one opt-in runtime feature.

- **Wall C — streamed module emit (`chunked_emitter.rs`).** The emit did
`partitions.par_iter().map(generate_sheet_module).collect()` and wrote in a
*second* pass — holding **all ~800 MB** of generated JS in memory at once (on
top of the multi-million-cell workbook), with nothing in `sheets/` until every
module finished. It now **writes each module the instant it's generated and
drops the string**; the few "heavy" sheets (≥200k formula cells) are emitted
one-at-a-time (peak ≈ one big module) while the many light sheets stay parallel.
Files land incrementally; a write failure is still fatal.
- **Wall B — `SheetPartition` borrows cells instead of cloning
(`sheet_partition.rs`).** `partition_sheets` did `cell.clone()` into the
partition while `workbook.sheets` still held the originals — a full second copy
of ~6M `CellData` (addresses + values + formula strings) → peak-memory doubling.
`SheetPartition<'a>` now holds `Vec<&'a CellData>` (the workbook outlives every
partition), so the partition is a few pointers per cell. The four consumers are
read-only, so they're unchanged beyond the borrow.
- **Wall A — opt-in `--lazy-engine` (`chunked_emitter.rs`, `main.rs`,
`cli/`).** The default `engine.js` statically imports every sheet module, so
`import('engine.js')` pulls ~800 MB into the heap before `run()` can be called.
`ete init --lazy-engine` (parser `--lazy-engine`) now emits a lazy orchestrator:
sheet modules load on demand via `export async function load(options)` (with
**output-cone scoping** — `load({ sheets })` / `load({ cells })` loads only the
requested sheets' transitive dependency closure, expanding whole clusters), a
synchronous `run()` guarded against being called before any load, and
`runScoped(inputs, options)` (load + run in one await). **The default engine is
unchanged** — it stays eager + synchronous, so the Mippy contract, `ete eval`,
the smoke test, and the engine suite are untouched. The eager and lazy engines
share the `run()` body via `emit_run_function`, so they can't drift.
- New `npm run test:lazy-engine` (19) + CI step: asserts the lazy engine has no
static sheet imports, exports `run`/`load`/`runScoped`, throws before load,
matches the eager engine's `run()` output after load (base + cross-sheet
override), and that cone scoping loads only the closure.
- Validated: `cargo build --release`, `cargo test` 17/17, `smoke` 78/78,
`test:engine` 21/21, `test:runnable` 20/20, `test:depgraph` 11/11,
`test:lazy-engine` 19/19, `test:slimming` 13/13, `test:golden` 20/20, full
`npm test`, and an `ete init --lazy-engine` end-to-end build.
- **Residual (deeper, deferred):** `generate_sheet_module` builds a `Vec<String>`
of lines then `.join("\n")` — ~2× a monster module transiently; and even one
~200 MB monster module is heavy to import. Row-chunking the monster sheets
(Owned_Asset_PP_E, Future_Owned_Acquisitions, Technology) into smaller lazy
modules is the next step to make them usable, not just emittable.
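
The output-cone scoping described under Wall A can be pictured as a transitive
closure over sheet-level dependencies, with circular clusters pulled in whole.
The sketch below is only an illustration under stated assumptions: the `deps`
and `clusters` shapes, the sample sheet graph, and the helper names
`outputCone` and `sheetsOfCells` are hypothetical and are not taken from the
generated engine.

```js
// Hypothetical shapes: `deps` maps a sheet to the sheets it reads from, and
// `clusters` lists groups of sheets that must load together (circular refs).
function outputCone(requested, deps, clusters = []) {
  // Index each sheet to its cluster so a hit on one member pulls in all of them.
  const clusterOf = new Map();
  for (const cluster of clusters) {
    for (const sheet of cluster) clusterOf.set(sheet, cluster);
  }

  const cone = new Set();
  const stack = [...requested];
  while (stack.length > 0) {
    const sheet = stack.pop();
    if (cone.has(sheet)) continue;
    cone.add(sheet);
    // Expand the whole cluster, then follow this sheet's own dependencies.
    for (const member of clusterOf.get(sheet) ?? []) stack.push(member);
    for (const dep of deps[sheet] ?? []) stack.push(dep);
  }
  return cone;
}

// Cell addresses reduce to their sheet name before the closure is taken.
function sheetsOfCells(cells) {
  return [...new Set(cells.map((c) => c.slice(0, c.lastIndexOf('!'))))];
}

// Made-up dependency graph for illustration only.
const deps = {
  Returns: ['CashFlow'],
  CashFlow: ['Debt', 'Assumptions'],
  Debt: ['CashFlow'], // circular with CashFlow
  Technology: ['Assumptions'],
};
const clusters = [['CashFlow', 'Debt']];

const cone = outputCone(sheetsOfCells(['Returns!D22', 'Returns!E22']), deps, clusters);
console.log([...cone].sort()); // [ 'Assumptions', 'CashFlow', 'Debt', 'Returns' ]
```

In this toy graph `Technology` never enters the cone, which is the point of
scoping: a sheet that no requested output depends on is never loaded.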

## 2026-05-29 — Fix chunked-build hang in `partition_sheets` (range-expansion blowup)

A clean `ete init` on the full real models hung for ~12h in the chunked emitter,
36 changes: 29 additions & 7 deletions HANDOFF.md
@@ -51,13 +51,35 @@ formulas. Validated: `cargo test` 17/17, `smoke` 78/78, `test:depgraph`/`runnabl
`engine` 11/20/21. **Rebuild the release parser** (`cd pipelines/rust && cargo
build --release`) before re-running the regen — the fix is in the binary.

Next session (all nice-to-have, none on the critical path): **P3 (#22)**
output-cone scoping / lazy sheet loading — now also the home for the two residual
scaling walls (partition clones every cell → peak-memory doubling; `engine.js`
eagerly imports ~800 MB of sheet modules → Node load-time wall, so even a complete
build is slow to *run* as the oracle). Plus **deeper transpiler coverage** (the
11,813 `_fn` offenders behind #26) and **cluster-once eval**. The Mippy contract +
its trust gates are complete.
**Latest session (chunked-build scaling walls, 2026-05-29):** the three #22 walls
are closed. A clean build got *past* partitioning but the module-emit step drove
the parser past 18 GB (it `collect()`ed all ~800 MB of generated module strings
before writing any), and even a complete engine was slow to *run* (eager imports).
- **Wall C (streamed emit):** `chunked_emitter.rs` writes each sheet module to
disk the instant it's generated and drops the string (heavy sheets ≥200k
formulas one-at-a-time, light ones parallel) — peak ≈ one monster module, files
land incrementally. **This is the fix for the 18 GB OOM the regen hit.**
- **Wall B (borrowed partitions):** `SheetPartition<'a>` holds `Vec<&CellData>`
(`sheet_partition.rs`) instead of cloning ~6M cells — no more peak-memory
doubling during emit.
- **Wall A (opt-in lazy engine):** `ete init --lazy-engine` (parser
`--lazy-engine`) emits an engine whose sheet modules load on demand via async
`load()`/`runScoped()` with **output-cone scoping** (`load({sheets})` /
`load({cells})` loads only the dependency closure, whole clusters included);
sync `run()` preserved, guarded against pre-load calls. **Default engine.js is
unchanged** (eager + sync) — Mippy / `ete eval` / smoke / engine suite untouched.
Eager & lazy share the `run()` body so they can't drift. New
`npm run test:lazy-engine` (19) + CI. **Rebuild the release parser before regen.**
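
The changelog entry for this change describes `test:lazy-engine` as asserting,
among other things, that the lazy engine has no static sheet imports. A check of
that kind could look roughly like the sketch below. It is not the actual test:
the helper name `staticSheetImports`, the `.mjs` extension, and both sample
sources are assumptions made for illustration.

```js
// Return the specifiers of static `import ... from '...'` statements that
// point into a sheets/ directory. Side-effect-only imports (`import './x'`)
// are deliberately not covered by this simple pattern.
function staticSheetImports(source) {
  const pattern = /^\s*import\s[^;]*?from\s+['"]([^'"]*sheets\/[^'"]+)['"]/gm;
  return [...source.matchAll(pattern)].map((m) => m[1]);
}

// Hypothetical eager orchestrator: every sheet is imported at module load.
const eagerSource = [
  "import * as s0 from './sheets/Assumptions.mjs';",
  "import * as s1 from './sheets/Returns.mjs';",
  'export function run(inputs) { /* ... */ }',
].join('\n');

// Hypothetical lazy orchestrator: sheets are reached only via dynamic import().
const lazySource = [
  "const loaders = { Returns: () => import('./sheets/Returns.mjs') };",
  'export async function load(options) { /* ... */ }',
  'export function run(inputs) { /* ... */ }',
].join('\n');

console.log(staticSheetImports(eagerSource).length); // 2
console.log(staticSheetImports(lazySource).length); // 0
```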

Next session (none on the critical path): **a clean A1/A2 regen** to confirm the
emit completes within memory (this couldn't be measured here because the models
are gitignored); then **row-chunk the 3 monster sheets** (Owned_Asset_PP_E,
Future_Owned_Acquisitions, Technology) so that each resulting module is small to
both generate and import (`generate_sheet_module` still builds a `Vec<String>`
then joins, transiently ~2× a monster module). Plus the rest of
**#22's umbrella** (`--output-profile contract` to skip the per-sheet emit for
contract-only consumers; guided `ete create` skill), **deeper transpiler coverage**
(the 11,813 `_fn` offenders behind #26), and **cluster-once eval**. The Mippy
contract + its trust gates are complete.

**Baseline (real models, `npm run bench`):** Model A **84.3%**,
Model B **85.5%** — standalone sheets only (cluster + 190 MB PP&E skipped).
45 changes: 38 additions & 7 deletions PLAN.md
@@ -1,12 +1,43 @@
# excel-to-engine — Plan

> **Next session:** the real-model chunked build now gets *past* `partition_sheets`
> (the 12h hang is fixed). The remaining scaling walls are about actually
> *running* the oracle at this size — the partition step still clones every cell
> (peak-memory doubling) and `engine.js` eagerly imports ~800 MB of sheet modules
> (Node load-time wall). Both fold into **P3 (#22) output-cone scoping / lazy
> sheet loading**. Also still open: deeper transpiler coverage (the 11,813 `_fn`
> offenders behind #26) and cluster-once eval.
> **Next session:** the three chunked-build scaling walls are closed (#22) — the
> emit streams module-by-module (was: hold all ~800 MB before writing), partitions
> borrow cells instead of cloning (was: peak-memory doubling), and `ete init
> --lazy-engine` emits an on-demand engine so a consumer can run the oracle without
> importing ~800 MB up front. **Next: a full clean regen on the real A1/A2 models
> to confirm the emit now completes within memory** (couldn't be measured here —
> the models are gitignored), then the deeper residual: **row-chunk the 3 monster
> sheets** (Owned_Asset_PP_E, Future_Owned_Acquisitions, Technology) so even one is
> small to generate + import. Also still open: deeper transpiler coverage (the
> 11,813 `_fn` offenders behind #26) and cluster-once eval.

## Status: Chunked-build scaling walls closed (streamed emit + borrowed partitions + opt-in lazy engine) — landed 2026-05-29

With the partition hang fixed, a clean build got *past* partitioning but then the
module-emit step drove the parser past 18 GB (it materialized all ~800 MB of
generated module strings before writing any), and even a complete ~800 MB engine
was slow to *run* (eager imports). Three walls (#22) closed:

- **Wall C — streamed emit (`chunked_emitter.rs`).** Each sheet module is written
to disk the instant it's generated and the string dropped, instead of
collect-all-then-write; heavy sheets (≥200k formulas) emit one-at-a-time, light
ones in parallel. Peak emit memory ≈ one monster module, not the whole output.
- **Wall B — borrowed partitions (`sheet_partition.rs`).** `SheetPartition<'a>`
holds `Vec<&'a CellData>` instead of cloning ~6M cells; removes the second
full copy that doubled peak memory during emit.
- **Wall A — opt-in `--lazy-engine` (`chunked_emitter.rs`, `main.rs`, `cli/`).**
Emits a chunked `engine.js` whose sheet modules load on demand via async
`load()`/`runScoped()` with output-cone scoping; sync `run()` is preserved
(guarded against pre-load calls). **Default engine.js is unchanged** (eager +
sync) — the Mippy contract and all in-repo consumers are untouched. Eager/lazy
share the `run()` body, so they can't drift.

New `npm run test:lazy-engine` (19) + CI. Validated: `cargo test` 17/17, `smoke`
78/78, `test:engine`/`test:runnable`/`test:depgraph` 21/20/11, `test:lazy-engine`
19/19, `test:slimming`/`test:golden` 13/20, full `npm test`, `ete init
--lazy-engine` e2e. **Residual (deferred):** `generate_sheet_module` still builds
a `Vec<String>` then joins (~2× a monster transiently), and a single monster
module is still heavy to import — row-chunking the monster sheets is the next step.
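
The streamed-emit idea (Wall C) can be sketched outside Rust as well. The
JavaScript analogue below is not the `chunked_emitter.rs` implementation: the
function name `emitStreamed`, the injected `generate` and `write` callbacks, the
`formulaCount` field, the light-before-heavy ordering, and the pool size are all
assumptions. It only shows the shape of the technique: write each module as soon
as it exists, let light sheets overlap through a small worker pool, and emit
heavy sheets strictly one at a time.

```js
// Sheets at or above this many formula cells are treated as "heavy".
const HEAVY_FORMULA_THRESHOLD = 200_000;

async function emitStreamed(partitions, generate, write, lightConcurrency = 4) {
  const emitOne = async (partition) => {
    const source = generate(partition); // build one module's text
    await write(`${partition.name}.mjs`, source); // a rejected write propagates
    // `source` goes out of scope here, so nothing accumulates across modules.
  };

  const heavy = partitions.filter((p) => p.formulaCount >= HEAVY_FORMULA_THRESHOLD);
  const light = partitions.filter((p) => p.formulaCount < HEAVY_FORMULA_THRESHOLD);

  // Light sheets: a small pool of workers drains a shared queue, so at most
  // `lightConcurrency` generated strings are alive at once.
  const queue = [...light];
  const worker = async () => {
    for (let p = queue.shift(); p !== undefined; p = queue.shift()) {
      await emitOne(p);
    }
  };
  await Promise.all(Array.from({ length: lightConcurrency }, () => worker()));

  // Heavy sheets: strictly sequential, so the peak is about one big module.
  for (const partition of heavy) await emitOne(partition);
}

// Usage with an in-memory stand-in for the file system.
const written = [];
emitStreamed(
  [
    { name: 'Assumptions', formulaCount: 120 },
    { name: 'Owned_Asset_PP_E', formulaCount: 1_620_000 },
  ],
  (p) => `// ${p.name}: ${p.formulaCount} formulas\n`,
  async (file) => { written.push(file); },
).then(() => console.log(written));
```

Compared with collect-all-then-write, the number of generated strings held at
any moment is bounded by the pool size rather than by the number of sheets.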

## Status: Chunked-build partition hang fixed — landed 2026-05-29

36 changes: 36 additions & 0 deletions README.md
@@ -385,6 +385,42 @@ synthetic fixture (`npm run test:golden`); point it at a real build with
`ETE_GOLDEN_DIR` + a gitignored `canonical-returns.json` to verify a regenerated
model still reproduces the hand-port's gross/net MOIC & IRR exactly.

### Lazy engine for large models (`--lazy-engine`)

The default `engine.js` statically imports every per-sheet module, so
`import('engine.js')` pulls **all** of them into memory (hundreds of MB on the
big PE models — dominated by a couple of monster sheets) before `run()` can be
called. For a consumer that only needs to *sample* the model (the calibration-
oracle use case), that load is the wall.

`ete init --lazy-engine` emits an engine that imports sheet modules **on demand**:

```js
import engine from './my-model/chunked/engine.js';

// Load only what you need (just the dependency cone of the requested cells),
// then run() synchronously (same return shape as always).
await engine.load({ cells: ['Returns!D22', 'Returns!E22'] });
const { values, meta } = engine.run({ 'Assumptions!B3': 18 }); // override + run

// Or do both in one call:
const r = await engine.runScoped({ 'Assumptions!B3': 18 }, { cells: ['Returns!D22'] });
```

- **`load(options)`** — `{ sheets: [...] }` and/or `{ cells: ['Sheet!A1', ...] }`
loads only those sheets plus their transitive dependency closure (whole
circular clusters are pulled in as a unit). No options ⇒ load everything (still
lazy, but complete). To scope to named outputs, map their names → cells via
`named-outputs.json`, then pass `cells`.
- **`run(inputs, options)`** — unchanged synchronous semantics; throws if called
before anything is loaded. Sheets outside the loaded cone are simply skipped.
- **`runScoped(inputs, options)`** — `await load(options)` then `run(inputs, options)`.
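
How the three functions fit together can be shown with a toy orchestrator. This
is a minimal sketch, not the generated `engine.js`: `createLazyEngine`, the
`sheetLoaders` map, the per-sheet `compute` function, and the contents of `meta`
are hypothetical, and dependency-closure expansion is left out so that only the
load / guard / run flow is visible.

```js
// `loaders` stands in for per-sheet dynamic imports; in a real build each entry
// would be something like () => import('./sheets/<name>.mjs') (assumed layout).
function createLazyEngine(loaders, order) {
  const loaded = new Map(); // sheet name -> module exposing compute(values)

  async function load(options = {}) {
    // No `sheets` option means "load everything" (still lazy, but complete).
    const wanted = options.sheets ?? Object.keys(loaders);
    await Promise.all(
      wanted
        .filter((name) => !loaded.has(name))
        .map(async (name) => loaded.set(name, await loaders[name]())),
    );
  }

  function run(inputs = {}) {
    if (loaded.size === 0) {
      throw new Error('run() called before load(); await load() first');
    }
    const values = { ...inputs };
    // Sheets that were never loaded are skipped rather than treated as errors.
    const sheetsRun = order.filter((name) => loaded.has(name));
    for (const name of sheetsRun) loaded.get(name).compute(values);
    return { values, meta: { sheetsRun } };
  }

  async function runScoped(inputs, options) {
    await load(options);
    return run(inputs);
  }

  return { load, run, runScoped };
}

// Two made-up sheets: Returns!D22 doubles Assumptions!B3 (default 10).
const sheetLoaders = {
  Assumptions: async () => ({
    compute: (v) => { if (!('Assumptions!B3' in v)) v['Assumptions!B3'] = 10; },
  }),
  Returns: async () => ({
    compute: (v) => { v['Returns!D22'] = v['Assumptions!B3'] * 2; },
  }),
};

const engine = createLazyEngine(sheetLoaders, ['Assumptions', 'Returns']);
engine
  .runScoped({ 'Assumptions!B3': 18 }, { sheets: ['Assumptions', 'Returns'] })
  .then((r) => console.log(r.values['Returns!D22'])); // 36
```

Keeping `run()` synchronous and pushing all I/O into `load()` is what lets an
eager and a lazy engine share one `run()` body.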

The **default build is unchanged** — `engine.js` stays eager and `run()` stays
synchronous, so existing integrations are untouched. `--lazy-engine` is purely
opt-in. (Per-sheet modules are emitted either way; the flag only changes how
`engine.js` loads them.)

### The Delta Cascade

When you run a scenario, the CLI doesn't re-execute the full engine (which can take 10+ minutes on large models). Instead, it:
34 changes: 24 additions & 10 deletions ROADMAP.md
@@ -60,12 +60,18 @@ Mippy. Order (issues on ebootheee/excel-to-engine; the Done line is the contract
artifacts. New `npm run test:runnable` + CI. See CHANGELOG/PLAN.
- **P2 · [#25] — pin the value-bearing cells as named-outputs. ✅ DONE (2026-05-29).** Per-class MIP Proceeds, hurdle/threshold, participation %, equity basis, valuation/shares, not just MOIC/IRR. Schedules and timelines (such as debt, equity base, cash flow) are now surfaced and participate fully in closure analysis via range expansion. Drivable driver-inputs (`exitMultiple`, `exitYearSelector`, and `hurdleRate`) are also mapped under `named-inputs.json`.
- **P2 · [#26] — `_fn` fallback audit (`_fn-fallbacks.json`). ✅ DONE (2026-05-29).** Scans the generated sheet modules → `_fn-fallbacks.json`, and checks each named output/schedule's dependency closure against it. **Reports** by default (annotates affected outputs with `resolvesThroughFallback`, records `stats.fallbackViolations`, `ete init` warns); **hard-fails only under `--assert-no-fallbacks`** so the gate doesn't block the real models (~11,813 fallbacks today). The "assert no value cell uses a stub" target is the golden-master CI check below, run with `--assert-no-fallbacks`.
- **P3 (nice-to-have) · [#22] — output-cone scoping / lazy sheet loading.**
Cheaper oracle; not required (we don't ship the blob). Now also the home for
the two scaling walls that remain after the partition-hang fix below: the
partition step `clone()`s every cell (peak-memory doubling), and `engine.js`
eagerly imports ~800 MB of sheet modules (Node load-time wall — even a complete
build is slow to *run*). Scoping/lazy-loading addresses both.
- **P3 · [#22] — scaling walls + lazy sheet loading. ✅ DONE (2026-05-29).**
Three walls closed so the real models both *build* and *run* at scale:
**(C) streamed emit** — write each sheet module to disk as generated and drop
the string (was: hold all ~800 MB before writing → 18 GB peak); heavy sheets
emit one-at-a-time. **(B) borrowed partitions** — `SheetPartition<'a>` holds
`Vec<&CellData>` (was: clone ~6M cells → peak-memory doubling). **(A) opt-in
`ete init --lazy-engine`** — emits an engine whose sheet modules load on demand
via async `load()`/`runScoped()` with output-cone scoping; sync `run()`
preserved; **default engine unchanged** (eager + sync, Mippy untouched).
`npm run test:lazy-engine` (19) + CI. Still open under #22's original umbrella:
the `--output-profile contract` knob (skip the per-sheet emit entirely for
contract-only consumers) and a guided `ete create` skill.

Supporting (makes the oracle trustworthy, not on the critical path):
- **Golden-master CI assert ✅ DONE (2026-05-29).** `eval/golden-master.mjs` +
@@ -123,10 +129,18 @@ Issues filed: [#22] (output scoping) and [#23] (parser/emitter perf).
range-expanding `extract_refs` (post-Round-2 it explodes every range to ≤1000
cells per formula, then discards the same-sheet ones) on the 1.62M-formula PP&E
sheet → swap thrash. Now uses a sheet-names-only scanner (`collect_sheet_deps`);
cycle detection uses `extract_refs_shallow`. **Residual scaling walls (→ [#22]):**
partition still `clone()`s every cell (peak-memory doubling), and the generated
`engine.js` eagerly imports ~800 MB of sheet modules (Node load-time wall). Also
still wanted: within-sheet parallelism for the heaviest sheets.
cycle detection uses `extract_refs_shallow`. ✅ **Two more walls fixed
(2026-05-29, #22):** the emit was materializing all ~800 MB of generated module
strings before writing any (18 GB peak) — now **streamed** (write + drop per
module, heavy sheets one-at-a-time); and `partition_sheets` cloned every cell
(peak-memory doubling) — now **borrows** (`Vec<&CellData>`). The eager
`engine.js` still imports all modules, so `ete init --lazy-engine` adds an
on-demand engine for the run-the-oracle path. **Residual (deferred):**
`generate_sheet_module` builds a `Vec<String>` then joins (~2× a monster
transiently), and a single ~200 MB monster module is still heavy to import →
**row-chunk the 3 monster sheets** into smaller lazy modules. Also still wanted:
within-sheet parallelism for the heaviest sheets. **Not yet measured on the real
models** (gitignored) — a clean A1/A2 regen should confirm the emit completes.
- **`--output-profile` / guided `ete create` ([#22]).** Skip the ~752 MB
per-sheet engine emit when a consumer only needs ground truth + contract maps.
- **Transpiler coverage — 11,813 `_fn()` fallbacks (unchanged old→new).** That
5 changes: 5 additions & 0 deletions cli/commands/init.mjs
@@ -109,6 +109,11 @@ export function runInit(excelPath, args) {
// in chunked mode by default; see request #8 slimming).
const parserArgs = [resolve(excelPath), absOutput, '--chunked'];
if (args.emitDebug) parserArgs.push('--emit-debug');
// --lazy-engine emits a chunked engine.js that imports sheet modules on
// demand (async load()/runScoped() + output-cone scoping) instead of
// eagerly at module-load time — so a consumer can run the engine without
// pulling every sheet module into memory just to import it (#22, Wall A).
if (args.lazyEngine) parserArgs.push('--lazy-engine');
const result = spawnSync(
parserBin,
parserArgs,
5 changes: 4 additions & 1 deletion cli/index.mjs
@@ -212,7 +212,10 @@ Commands:
--assert-no-fallbacks (hard-fail if any named
output resolves through an _fn() stub),
--emit-debug (retain dependency-graph.json,
_graph.json, model-map.json for offline analysis)
_graph.json, model-map.json for offline analysis),
--lazy-engine (engine.js loads sheet modules on
demand via async load()/runScoped() + output-cone
scoping; run() stays sync — await load() first)
summary <modelDir> One-shot model overview (--terse to hide suspects)
query <modelDir> [args] Query ground truth cells
pnl <modelDir> Extract annual P&L by segment
1 change: 1 addition & 0 deletions package.json
@@ -41,6 +41,7 @@
"test:engine": "node pipelines/rust/tests/test-engine-runtime.mjs",
"test:depgraph": "node pipelines/rust/tests/test-dependency-graph.mjs",
"test:runnable": "node pipelines/rust/tests/test-runnable-engine.mjs",
"test:lazy-engine": "node pipelines/rust/tests/test-lazy-engine.mjs",
"test:slimming": "node tests/cli/test-artifact-slimming.mjs",
"test:golden": "node tests/cli/test-golden-master.mjs",
"golden": "node eval/golden-master.mjs",