Bug: VECTOR_DIMS hardcoded to 384, ignores TROVE_EMBEDDING_MODEL

**Problem**
`database.py` hardcodes `VECTOR_DIMS = 384`, which matches the default `BAAI/bge-small-en-v1.5` model. Setting `TROVE_EMBEDDING_MODEL` to any model with a different output dimension (e.g. `intfloat/multilingual-e5-large` at 1024, or `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` at 768) causes:

`sqlite3.OperationalError: Dimension mismatch for inserted vector for the "embedding" column. Expected 384 dimensions but received 1024.`

This also blocks most multilingual use: the fastembed-supported multilingual models output 384, 768, or 1024 dims, and only 384 matches the hardcoded value.

**Suggested fix**
Derive `VECTOR_DIMS` from the actual embedding model at init time:

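```
from fastembed import TextEmbedding

def get_vector_dims(model_name: str) -> int:
    """Return the output dimension of a fastembed model by embedding a probe string."""
    model = TextEmbedding(model_name=model_name)
    # embed() yields one vector per input; take the first and measure its length.
    return len(next(iter(model.embed(["dimension probe"]))))
```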

Then use the result when creating the chunks_vec virtual table:

```
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_vec USING vec0(
    embedding float[{vector_dims}]
);
```
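To make the interpolation step concrete, here is a minimal sketch of how the DDL could be built from the probed dimension. `build_chunks_vec_ddl` is a hypothetical helper name, not something that exists in the project; the integer check is there because a column type cannot be passed as a bound SQL parameter, so the value has to be formatted into the statement.

```python
def build_chunks_vec_ddl(vector_dims: int) -> str:
    """Return the CREATE VIRTUAL TABLE statement for a given embedding size.

    Hypothetical helper for illustration; not part of the existing code base.
    """
    # Only accept a positive integer, since the value is interpolated into SQL.
    # bool is excluded explicitly because it is a subclass of int.
    if isinstance(vector_dims, bool) or not isinstance(vector_dims, int) or vector_dims <= 0:
        raise ValueError(f"vector_dims must be a positive int, got {vector_dims!r}")
    return (
        "CREATE VIRTUAL TABLE IF NOT EXISTS chunks_vec USING vec0(\n"
        f"    embedding float[{vector_dims}]\n"
        ");"
    )
```

It would be called as something like `conn.execute(build_chunks_vec_ddl(get_vector_dims(model_name)))`. One caveat: `IF NOT EXISTS` leaves an already-created 384-dimension table untouched, so an existing database would presumably need to be rebuilt after switching models.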

**Secondary issue: `TROVE_PATHS` splits on `:`, breaks Windows drive letters**
`C:\Users\foo\Documents` gets split into `C` and `\Users\foo\Documents`. Fix: split on `os.pathsep` (`;` on Windows, `:` on Linux and macOS) instead of the hardcoded `":"`.
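A minimal sketch of the proposed split, assuming `TROVE_PATHS` is read as a single string. `parse_trove_paths` and its `sep` parameter are hypothetical names used only for illustration; the parameter is there so the behavior can be checked on any platform.

```python
import os

def parse_trove_paths(raw: str, sep: str = os.pathsep) -> list[str]:
    """Split a TROVE_PATHS-style string into individual paths.

    Hypothetical helper; `sep` defaults to the platform separator
    (";" on Windows, ":" on POSIX).
    """
    # Drop empty entries so a trailing separator does not yield "".
    return [part.strip() for part in raw.split(sep) if part.strip()]

# Splitting on ":" reproduces the bug; splitting on ";" keeps the drive letter.
broken = parse_trove_paths(r"C:\Users\foo\Documents", sep=":")
fixed = parse_trove_paths(r"C:\Users\foo\Documents;D:\notes", sep=";")
```

Here `broken` holds the two fragments described above, while `fixed` holds the two intact Windows paths.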

**Environment**
- Windows 11
- mcp-trove-crunchtools 0.3.0 via uvx
- fastembed with `intfloat/multilingual-e5-large`
