mcp-tool-shop-org · mcp-tool-shop · Jun 10, 2026
diff --git a/packages/vector-caliper/baselines/DOGFOOD-NOTES-2026-06-09.md b/packages/vector-caliper/baselines/DOGFOOD-NOTES-2026-06-09.md
@@ -0,0 +1,48 @@
+# Dogfood notes — first real-data run (qwen-lora-tallow-fen-v1, 2026-06-09)
+
+Friction and findings from feeding VectorCaliper its first production training run.
+Input for the next working session; nothing here has been patched upstream yet.
+
+## 1. Determinism guarantee is not actually byte-deterministic
+`src/projection/engine.ts` — the PCA power-iteration seeds eigenvector init with raw
+`Math.random()` while the seeded Mulberry32 (`createSeededRandom`) sits unused in the
+same file. PCA usually converges to the same components up to sign, but the README's
+determinism promise ("deterministic, reproducible rendering") is not guaranteed at the
+byte level. Fix: thread the seeded RNG into `pca()`.
+
+## 2. Raw Node ESM cannot consume the package
+`tsc` emits the source's extensionless and directory relative imports verbatim
+(`from './schema'`, `from './types/state'`); Node ESM rejects both
+(`ERR_MODULE_NOT_FOUND` for an extensionless file specifier, `ERR_UNSUPPORTED_DIR_IMPORT`
+for a directory specifier). Rendering required patching 32 dist files to append
+`.js` / `/index.js`. Fix: `moduleResolution: "NodeNext"` + explicit `.js` extensions in source
+imports. Related: no `dist/` ships and `files` only includes `dist/`, so a consumer must build from source with devDeps.
+
+## 3. Naming/metadata drift
+- README installs `@mcp-tool-shop/vector-caliper`; package.json says
+  `@mcptoolshop/vector-caliper` (and `"private": true` — not published at all).
+- `repository.url` points at `mcp-tool-shop-org/VectorCaliper.git`; the source lives
+  in `mcp-tool-shop-org/prototypes`.
+
+## 4. API fit for diffusion/LoRA training runs
+The schema REQUIRES `uncertainty.{entropy, margin, calibration}` — natural for
+classifiers, nonexistent for diffusion LoRA runs. This baseline used documented
+proxies (entropy of the normalized centroid-similarity distribution; style-vs-photo
+text-anchor contrast gap as margin; similarity std as calibration). Options:
+make the uncertainty group optional like `dynamics`, or ship a domain preset
+("diffusion-style-lora") that defines blessed proxies so cross-run baselines stay
+comparable.
+
+## 5. The demo bypasses the product
+`demo/canonical-demo.ts` hand-rolls its SVG and uses a flat ad-hoc JSON, bypassing
+ProjectionEngine/SemanticMapper/SceneBuilder/SVGRenderer entirely — so the checked-in
+canonical output exercises none of the public pipeline. This baseline's SVG is, as far
+as the dogfood could tell, the first artifact rendered through the real pipeline.
+
+## 6. What worked
+Zero-dep pure-TS core imported cleanly once dist was patched; all 8 states passed
+`createModelState` validation on the first attempt (the capture script pre-clamped
+its [0,1] proxies specifically because the factories fail closed — the contract
+shaped the producer, which is the point of a strict schema); budget classes were a
+non-issue at n=8; the semantic encoding (hue←effdim, radius←spread) makes the
+step-2000 cloud collapse visible in the SVG without reading any numbers.
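
Note 1 above proposes threading the seeded RNG into `pca()`. A minimal sketch of that idea follows, assuming a plain power iteration over a symmetric covariance matrix. The names `mulberry32`, `normalize`, and `topComponent` are hypothetical stand-ins; the real `createSeededRandom` and `pca()` signatures in `src/projection/engine.ts` are not shown in this diff and may differ.

```typescript
// Mulberry32: a small seeded PRNG returning floats in [0, 1).
// Hypothetical stand-in for the repo's createSeededRandom.
function mulberry32(seedValue: number): () => number {
  let a = seedValue >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Scale a vector to unit length.
function normalize(v: number[]): number[] {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}

// Power iteration for the dominant eigenvector of a symmetric matrix.
// The initial vector comes from the injected RNG rather than Math.random(),
// so the same seed gives the same result, sign included.
function topComponent(cov: number[][], rand: () => number, iters = 200): number[] {
  let v = normalize(cov.map(() => rand() - 0.5));
  for (let k = 0; k < iters; k++) {
    const w = cov.map((row) => row.reduce((sum, x, j) => sum + x * v[j], 0));
    v = normalize(w);
  }
  return v;
}

// Two runs with the same seed produce identical components.
const exampleCov = [
  [2, 0],
  [0, 1],
];
const runA = topComponent(exampleCov, mulberry32(42));
const runB = topComponent(exampleCov, mulberry32(42));
console.log(JSON.stringify(runA) === JSON.stringify(runB)); // prints true
```

Passing the RNG as a parameter keeps the function pure with respect to its inputs, which is what a byte-level determinism promise needs; a module-level `Math.random()` call cannot offer that.
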
diff --git a/packages/vector-caliper/baselines/README.md b/packages/vector-caliper/baselines/README.md
@@ -0,0 +1,28 @@
+# Baselines
+
+Real measured trajectories from production training runs, shaped to the
+`createModelState()` contract. These are VectorCaliper's ground truth for the
+"establish baselines → predict/hypothesize" roadmap: once several runs are in,
+early-trajectory geometry (e.g. spread-collapse rate by step 500) can be tested
+as a predictor of where the binding peak lands.
+
+## qwen-lora-tallow-fen-v1 (2026-06-09) — first real-data baseline
+
+A Qwen-Image rank-16 style LoRA (`tallow_fen_style_v1`, RTX 5090, 2000 steps,
+8 checkpoints). Per checkpoint: a fixed 12-prompt eval grid was generated and the
+CLIP ViT-B/32 embedding cloud measured. Field mapping and uncertainty PROXIES are
+documented in the capture script docstring (a diffusion-LoRA run has no native
+classifier entropy/margin/ECE — see dogfood notes #4).
+
+**What this baseline demonstrates** (the headline for the tool's thesis):
+between steps 1750→2000, `performance.accuracy` (CLIP-sim to the style centroid)
+ROSE 0.7796→0.7937 while `geometry.anisotropy` spiked 8.21→12.50 and
+`geometry.effectiveDimension` collapsed 7.01→6.76. The similarity gain came from a
+collapsing, less-diverse embedding cloud — overfit masquerading as improvement.
+Performance-only checkpoint selection picks step 2000; geometry+performance picks
+step 1250 (also the CMMD minimum, 0.1351, and the checkpoint chosen by eye: human
+review saw the same overfit as monochrome drift on neutral subjects). **The combined view
+caught what the single metric missed.**
+
+- `qwen-lora-tallow-fen-v1.json` — 8 states (capture: `E:/AI/training/_caliper_capture.py` on the rig)
+- `qwen-lora-tallow-fen-v1.svg` — rendered through the real pipeline (ProjectionEngine → SceneBuilder → SVGRenderer)
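
The README above contrasts performance-only selection with geometry+performance selection but does not state the combined rule. The sketch below shows one possible rule, purely as an assumption: drop any checkpoint whose `geometry.anisotropy` rose more than 25% over the previous checkpoint, then take the highest `performance.accuracy` among the rest. The 25% threshold and the helper names are hypothetical; the values are copied from `qwen-lora-tallow-fen-v1.json`, rounded to four decimals.

```typescript
interface Checkpoint {
  step: number;
  accuracy: number; // performance.accuracy (CLIP-sim to the style centroid)
  anisotropy: number; // geometry.anisotropy
}

// Values from the baseline JSON, rounded to four decimals.
const checkpoints: Checkpoint[] = [
  { step: 250, accuracy: 0.7648, anisotropy: 6.6219 },
  { step: 500, accuracy: 0.7736, anisotropy: 7.761 },
  { step: 750, accuracy: 0.7802, anisotropy: 6.9791 },
  { step: 1000, accuracy: 0.7884, anisotropy: 7.3209 },
  { step: 1250, accuracy: 0.7905, anisotropy: 8.2798 },
  { step: 1500, accuracy: 0.7843, anisotropy: 8.5189 },
  { step: 1750, accuracy: 0.7796, anisotropy: 8.2135 },
  { step: 2000, accuracy: 0.7937, anisotropy: 12.4964 },
];

// Highest-accuracy checkpoint in a non-empty list.
function argmaxAccuracy(cs: Checkpoint[]): Checkpoint {
  return cs.reduce((best, c) => (c.accuracy > best.accuracy ? c : best));
}

// Hypothetical geometry gate: keep a checkpoint only if anisotropy did not
// jump by more than maxRatio relative to the checkpoint just before it.
function withoutAnisotropySpikes(cs: Checkpoint[], maxRatio: number): Checkpoint[] {
  return cs.filter((c, i) => i === 0 || c.anisotropy / cs[i - 1].anisotropy <= maxRatio);
}

const performanceOnly = argmaxAccuracy(checkpoints);
const combined = argmaxAccuracy(withoutAnisotropySpikes(checkpoints, 1.25));

console.log(performanceOnly.step); // prints 2000
console.log(combined.step); // prints 1250
```

Under this rule only step 2000 is gated out (its anisotropy is about 1.52× that of step 1750), so the combined pick lands on step 1250. A different threshold or a gate on `effectiveDimension` could behave differently; a single run is not enough to calibrate either.
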
diff --git a/packages/vector-caliper/baselines/qwen-lora-tallow-fen-v1.json b/packages/vector-caliper/baselines/qwen-lora-tallow-fen-v1.json
@@ -0,0 +1,235 @@
+{
+  "run": "tallow_fen_style_v1",
+  "captured": "2026-06-09",
+  "reference": {
+    "dir": "E:\\AI\\training\\dataset_tallow_fen",
+    "n": 44,
+    "sigma": 0.47900169753130234
+  },
+  "states": [
+    {
+      "id": "tallow-fen-v1-step-250",
+      "time": 250,
+      "geometry": {
+        "effectiveDimension": 7.757477402078029,
+        "anisotropy": 6.621865643989259,
+        "spread": 0.8430308699607849,
+        "density": 7.102524389014206
+      },
+      "uncertainty": {
+        "entropy": 3.576477527618408,
+        "margin": 0.01452073804102838,
+        "calibration": 0.08244698494672775
+      },
+      "performance": {
+        "accuracy": 0.7647979855537415,
+        "loss": 0.16690856218338013
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-500",
+      "time": 500,
+      "geometry": {
+        "effectiveDimension": 7.180756051976849,
+        "anisotropy": 7.760998726659996,
+        "spread": 0.8325328826904297,
+        "density": 7.181163759334916
+      },
+      "uncertainty": {
+        "entropy": 3.575670003890991,
+        "margin": 0.025575989857316017,
+        "calibration": 0.08706717193126678
+      },
+      "performance": {
+        "accuracy": 0.7735676169395447,
+        "loss": 0.15337598323822021
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-750",
+      "time": 750,
+      "geometry": {
+        "effectiveDimension": 7.409353421390496,
+        "anisotropy": 6.979115957964436,
+        "spread": 0.8266791701316833,
+        "density": 7.201104059214372
+      },
+      "uncertainty": {
+        "entropy": 3.5770695209503174,
+        "margin": 0.03614329965785146,
+        "calibration": 0.08109613507986069
+      },
+      "performance": {
+        "accuracy": 0.7801888585090637,
+        "loss": 0.14684104919433594
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-1000",
+      "time": 1000,
+      "geometry": {
+        "effectiveDimension": 7.3610166758567885,
+        "anisotropy": 7.320895825606809,
+        "spread": 0.8148157596588135,
+        "density": 7.246805784660056
+      },
+      "uncertainty": {
+        "entropy": 3.577988862991333,
+        "margin": 0.041199419647455215,
+        "calibration": 0.07699479907751083
+      },
+      "performance": {
+        "accuracy": 0.788444459438324,
+        "loss": 0.13987720012664795
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-1250",
+      "time": 1250,
+      "geometry": {
+        "effectiveDimension": 7.075301788220325,
+        "anisotropy": 8.279761391066161,
+        "spread": 0.8146253824234009,
+        "density": 7.2864497225930664
+      },
+      "uncertainty": {
+        "entropy": 3.5780458450317383,
+        "margin": 0.045075961388647556,
+        "calibration": 0.07691206783056259
+      },
+      "performance": {
+        "accuracy": 0.7905473709106445,
+        "loss": 0.13513541221618652
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-1500",
+      "time": 1500,
+      "geometry": {
+        "effectiveDimension": 7.096816902083409,
+        "anisotropy": 8.518901948452722,
+        "spread": 0.8217833638191223,
+        "density": 7.262251305894052
+      },
+      "uncertainty": {
+        "entropy": 3.5785281658172607,
+        "margin": 0.04449977073818445,
+        "calibration": 0.07387224584817886
+      },
+      "performance": {
+        "accuracy": 0.7842926979064941,
+        "loss": 0.1435713768005371
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-1750",
+      "time": 1750,
+      "geometry": {
+        "effectiveDimension": 7.008293973017701,
+        "anisotropy": 8.21347886570257,
+        "spread": 0.8244690895080566,
+        "density": 7.25062711021022
+      },
+      "uncertainty": {
+        "entropy": 3.5780160427093506,
+        "margin": 0.04056647885590792,
+        "calibration": 0.07635637372732162
+      },
+      "performance": {
+        "accuracy": 0.7795696258544922,
+        "loss": 0.15124398469924927
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    },
+    {
+      "id": "tallow-fen-v1-step-2000",
+      "time": 2000,
+      "geometry": {
+        "effectiveDimension": 6.761863905951503,
+        "anisotropy": 12.496390278327958,
+        "spread": 0.7998718023300171,
+        "density": 7.395438826698832
+      },
+      "uncertainty": {
+        "entropy": 3.5797951221466064,
+        "margin": 0.051343273371458054,
+        "calibration": 0.0672067403793335
+      },
+      "performance": {
+        "accuracy": 0.7937332987785339,
+        "loss": 0.14313781261444092
+      },
+      "metadata": {
+        "source": "tallow_fen_style_v1 (Qwen-Image LoRA, RTX 5090)",
+        "version": "1.0.0",
+        "tags": [
+          "n=12",
+          "clip-vit-b32",
+          "proxy-uncertainty"
+        ]
+      }
+    }
+  ]
+}
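
The `uncertainty.entropy` values in the JSON above are described in the dogfood notes as the entropy of the normalized centroid-similarity distribution. The capture script is not part of this diff, so the following is only a sketch of how such a proxy could be computed, assuming non-negative similarities normalized to sum to 1 and a base-2 logarithm. The recorded values (3.576 to 3.580) sit just under log2(12) ≈ 3.585, which is consistent with that assumption for a 12-prompt grid but does not confirm it. The function name and the sample inputs are hypothetical.

```typescript
// Shannon entropy (in bits) of a list of non-negative similarities after
// normalizing them to sum to 1. Assumed form of the entropy proxy.
function normalizedEntropyBits(similarities: number[]): number {
  const total = similarities.reduce((sum, x) => sum + x, 0);
  return similarities.reduce((h, x) => {
    const p = x / total;
    // Terms with p = 0 contribute nothing to the entropy.
    return p > 0 ? h - p * Math.log2(p) : h;
  }, 0);
}

// Twelve identical similarities give the maximum, log2(12).
const uniformSims: number[] = new Array(12).fill(0.78);
const uniformEntropy = normalizedEntropyBits(uniformSims);

// Made-up, slightly uneven similarities give a value just below the maximum.
const unevenSims = [0.9, 0.85, 0.8, 0.8, 0.78, 0.78, 0.76, 0.75, 0.74, 0.72, 0.7, 0.65];
const unevenEntropy = normalizedEntropyBits(unevenSims);

console.log(uniformEntropy.toFixed(3)); // prints 3.585
console.log(unevenEntropy < uniformEntropy); // prints true
```

Because similarities near 0.78 are almost uniform after normalization, this proxy moves only in the third decimal place across the run, which may be worth keeping in mind when a domain preset defines blessed proxies.
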
diff --git a/packages/vector-caliper/baselines/qwen-lora-tallow-fen-v1.svg b/packages/vector-caliper/baselines/qwen-lora-tallow-fen-v1.svg