
bug: PGLite apply-migrations wedges with zembed-1 2560d due to HNSW index emitted on migrate path #1189

@jeades

Summary

A fresh PGLite brain initialized with ZeroEntropy zembed-1 at 2560 dimensions can become wedged when gbrain apply-migrations (or any migration schema replay) runs.

The DB itself remains structurally intact, but the migration ledger records repeated partial entries and doctor reports a wedged migration. The failure appears to come from the migrate/schema replay path using the default 1536-dim gateway state instead of the configured embedding_dimensions: 2560. As a result, schema replay emits the HNSW chunk index even though pgvector HNSW indexes cap at 2000 dimensions.

Related but not identical: #1141. That issue is Postgres/Supabase schema v66 -> v67. This report is the PGLite fresh-install/apply-migrations surface.

Environment / config

Current upstream inspected while triaging:

  • master: 3a0e1116e76b1be45a136cf184d1a49d431b6b60
  • VERSION: 0.36.1.0
  • Engine: PGLite
  • Pages: 0

Config shape:

{
  "engine": "pglite",
  "database_path": "~/.gbrain/brain.pglite",
  "embedding_model": "zeroentropyai:zembed-1",
  "embedding_dimensions": 2560
}

Observed

Doctor / state after failure:

  • schema_version remained current / unchanged
  • connection: Connected
  • pages: 0
  • search_mode: balanced
  • config preserved
  • migration ledger had repeated partial entries for 0.11.0, later also 0.12.0
  • doctor reported WEDGED MIGRATION(s) after repeated partials

Representative underlying failure:

column cannot have more than 2000 dimensions for hnsw index

Expected

All schema replay / migration paths should apply the same HNSW policy as fresh schema generation:

  • if embedding dimensions <= 2000: create idx_chunks_embedding HNSW index
  • if embedding dimensions > 2000: skip HNSW index and rely on exact vector scan

No migration should wedge a 2560-dim brain by trying to create the chunk HNSW index.
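
A minimal sketch of the policy above, as a single helper that every schema path could share. The function name chunkEmbeddingIndexSql, the chunks table, the embedding column, and the vector_cosine_ops operator class are assumptions for illustration, not taken from the gbrain source; only the index name idx_chunks_embedding and the 2000-dim limit come from this report.

```typescript
// pgvector rejects HNSW indexes on vector columns wider than 2000 dimensions.
const HNSW_MAX_DIMS = 2000;

// Hypothetical helper: returns the DDL for the chunk embedding index,
// or null when the column is too wide for HNSW.
// Table name, column name, and operator class are assumed, not confirmed.
function chunkEmbeddingIndexSql(dims: number): string | null {
  if (!Number.isInteger(dims) || dims <= 0) {
    throw new Error(`invalid embedding dimensions: ${dims}`);
  }
  if (dims > HNSW_MAX_DIMS) {
    // Skip the index; queries rely on exact vector scan instead.
    return null;
  }
  return (
    "CREATE INDEX IF NOT EXISTS idx_chunks_embedding " +
    "ON chunks USING hnsw (embedding vector_cosine_ops)"
  );
}
```

If fresh schema generation and migration replay both went through one helper like this, the two paths could not disagree on whether the index is emitted for a given dimension count.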

Suspected root

init --migrate-only loads config and calls engine.initSchema(), but does not appear to configure the AI gateway from config before schema generation.

PGLite initSchema() reads dimensions from the gateway and falls back to 1536 when the gateway is not configured. That makes getPGLiteSchema() apply the <=2000 HNSW branch even though the actual brain config/column is 2560.
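
To illustrate the suspected fallback, here is a rough sketch of how dimension resolution could prefer the brain config over an unconfigured gateway. BrainConfig, resolveEmbeddingDims, and the gatewayDims parameter are hypothetical names; the actual gbrain code may be structured differently, and configuring the gateway from config before initSchema() would be an equally valid fix.

```typescript
// Hypothetical shape of the relevant config keys (names match the config shown above).
interface BrainConfig {
  embedding_model?: string;
  embedding_dimensions?: number;
}

// The default this report says is used when the gateway is not configured.
const DEFAULT_DIMS = 1536;

// Hypothetical resolver: configured dimensions win, then whatever a configured
// gateway reports, and only then the 1536 default.
function resolveEmbeddingDims(config: BrainConfig, gatewayDims?: number): number {
  return config.embedding_dimensions ?? gatewayDims ?? DEFAULT_DIMS;
}

// Config from this report, with the gateway left unconfigured (undefined).
const reported: BrainConfig = {
  embedding_model: "zeroentropyai:zembed-1",
  embedding_dimensions: 2560,
};
const dims = resolveEmbeddingDims(reported);
```

With the reported config, dims resolves to 2560 rather than 1536, so a dimension-aware schema generator would take the >2000 branch and skip the HNSW index.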

This may also affect the in-process PGLite schema path in the open #1182 fix-wave unless that path configures the gateway before eng.initSchema().

Related

Workaround

Use zembed-1 at 1280 dimensions for PGLite fresh installs until the migrate/schema replay path is fixed.

Happy to close this as a duplicate of #1141 if maintainers prefer, but wanted to preserve the PGLite-specific repro surface.
