feat: support Gemini Embedding 2 as native multimodal embedding provider

## Problem

GBrain v0.37 has a useful multimodal embedding path, but the native Google recipe only exposes `gemini-embedding-001` as a text embedding model. `embedMultimodal()` currently rejects the Google recipe because `supports_multimodal` is not set, and the error message points operators to Voyage.

Google has shipped Gemini Embedding 2 / `gemini-embedding-2-preview`, a native multimodal embedding model that maps text, images, video, audio, and documents into a single embedding space.

## Evidence

Google docs: https://ai.google.dev/gemini-api/docs/embeddings

- `gemini-embedding-2` / `gemini-embedding-2-preview` support multimodal input: text, images, video, audio, PDF.
- Output dimensionality is configurable from 128 to 3072; the recommended values are 768, 1536, and 3072.
- The current GBrain Google recipe lists only `gemini-embedding-001` under embedding.
- The current `embedMultimodal()` error message names only Voyage as supported: "Today: `voyage:voyage-multimodal-3`".

## Local validation

A LiteLLM proxy in front of Gemini can expose `gemini-embedding-2-preview`, and text embeddings work through it:

```text
POST /v1/embeddings model=gemini-embedding-2-preview input="hello world"
→ OK, 3072 dims
```

But the OpenAI-compatible multimodal content-array shape that GBrain sends through `embedMultimodalOpenAICompat()` fails via LiteLLM:

```text
POST /v1/embeddings
input=[{type:"image_url", image_url:{url:"data:image/png;base64,..."}}]
→ 400 INVALID_ARGUMENT
Invalid value at 'requests[0].content.parts[0]' (text), Starting an object on a scalar field
```

So the LiteLLM path is not currently sufficient as a drop-in bridge for Gemini multimodal embeddings.
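The 400 above says an object arrived where a scalar `text` field was expected, so a native path would need to translate each `image_url` item into an inline-data part before sending. A minimal sketch of that translation follows. The type names and `imageUrlToInlinePart` are hypothetical (not taken from GBrain), and the `inlineData` / `mimeType` / `data` field names are my reading of the Gemini REST `Part` shape and should be checked against the Google docs linked above.

```typescript
// Hypothetical types for illustration; neither comes from GBrain's source.
type OpenAIImageItem = { type: "image_url"; image_url: { url: string } };
// Assumed Gemini part shape; verify field names against the Gemini API reference.
type GeminiInlinePart = { inlineData: { mimeType: string; data: string } };

// Convert an OpenAI-style image_url item carrying a base64 data: URL
// into an inline-data part. Remote http(s) URLs are rejected here because
// this sketch does not fetch anything.
function imageUrlToInlinePart(item: OpenAIImageItem): GeminiInlinePart {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(item.image_url.url);
  const mimeType = match?.[1];
  const data = match?.[2];
  if (!mimeType || !data) {
    throw new Error("expected a base64 data: URL in image_url.url");
  }
  return { inlineData: { mimeType, data } };
}
```

Rejecting non-`data:` URLs keeps the helper pure; whether GBrain should also fetch remote images is a separate design question.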

## Desired behavior

Allow configs like:

```bash
gbrain config set embedding_multimodal true
gbrain config set embedding_multimodal_model google:gemini-embedding-2-preview
# or google:gemini-embedding-2 when stable
```

Then image ingestion / `gbrain reindex --multimodal` / cross-modal query paths should use Google native multimodal embeddings instead of requiring Voyage.
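For illustration, the gate that currently rejects the Google recipe could resolve a `provider:model` string and check a per-model `supports_multimodal` flag. This is only a sketch: `parseModelSpec`, `RECIPES`, and `assertMultimodal` are hypothetical names, and the table below is a stand-in for the real recipe files, not a copy of them.

```typescript
interface ModelSpec {
  provider: string;
  model: string;
}

// Split "provider:model" on the first colon only, so model ids that
// themselves contain colons are preserved.
function parseModelSpec(spec: string): ModelSpec {
  const idx = spec.indexOf(":");
  if (idx <= 0 || idx === spec.length - 1) {
    throw new Error(`expected "provider:model", got "${spec}"`);
  }
  return { provider: spec.slice(0, idx), model: spec.slice(idx + 1) };
}

// Hypothetical stand-in for recipe metadata; the real structure lives in
// src/core/ai/recipes/*.ts and may look different.
const RECIPES: Record<string, Record<string, { supports_multimodal: boolean }>> = {
  voyage: { "voyage-multimodal-3": { supports_multimodal: true } },
  google: {
    "gemini-embedding-001": { supports_multimodal: false },
    "gemini-embedding-2-preview": { supports_multimodal: true },
  },
};

// Throw unless the configured model is known and flagged as multimodal.
function assertMultimodal(spec: string): ModelSpec {
  const parsed = parseModelSpec(spec);
  const entry = RECIPES[parsed.provider]?.[parsed.model];
  if (!entry || !entry.supports_multimodal) {
    throw new Error(`${spec} is not configured as a multimodal embedding model`);
  }
  return parsed;
}
```

With a table like this, `assertMultimodal("google:gemini-embedding-2-preview")` would pass while `google:gemini-embedding-001` would still be rejected.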

## Why this matters

- Single-vendor deployment for users already on Gemini.
- Native all-modal support: text, images, video, audio, documents.
- Avoids Voyage free-tier 429s during vault image ingestion.
- Aligns with GBrain's provider-agnostic embedding architecture.

## Implementation sketch

- Extend `src/core/ai/recipes/google.ts` with Gemini Embedding 2 model(s).
- Add Google native `embedMultimodal` request serialization for `models/*:embedContent` with `parts` containing text and inline image data.
- Keep output dimensions consistent with the GBrain schema (the `embedding_image` and `embedding_multimodal` columns).
- Add tests for text-only, image-only, and text+image request payloads, plus model validation.
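The serialization step in the list above might look roughly like the following. It is a sketch under assumptions: `EmbedInput` and `buildEmbedContentRequest` are hypothetical names, and the body field names (`content.parts`, `inlineData`, `outputDimensionality`) reflect my understanding of the `models/*:embedContent` REST body and need to be confirmed against the Google docs. The 128 to 3072 bound comes from the Evidence section.

```typescript
// Hypothetical input type for illustration only.
type EmbedInput =
  | { kind: "text"; text: string }
  | { kind: "image"; mimeType: string; base64: string };

// Assumed request shapes; verify against the Gemini API reference.
type Part = { text: string } | { inlineData: { mimeType: string; data: string } };

interface EmbedContentRequest {
  model: string;
  content: { parts: Part[] };
  outputDimensionality?: number;
}

// Build one embedContent body from mixed text and image inputs.
function buildEmbedContentRequest(
  model: string,
  inputs: EmbedInput[],
  dims?: number,
): EmbedContentRequest {
  if (inputs.length === 0) {
    throw new Error("at least one input part is required");
  }
  if (dims !== undefined && (!Number.isInteger(dims) || dims < 128 || dims > 3072)) {
    throw new Error(`output dimensions must be an integer in 128..3072, got ${dims}`);
  }
  const parts: Part[] = inputs.map((input): Part =>
    input.kind === "text"
      ? { text: input.text }
      : { inlineData: { mimeType: input.mimeType, data: input.base64 } },
  );
  const request: EmbedContentRequest = { model: `models/${model}`, content: { parts } };
  if (dims !== undefined) {
    request.outputDimensionality = dims; // omitted entirely when not configured
  }
  return request;
}
```

Keeping the builder free of network calls makes the text-only, image-only, and text+image payload tests simple equality checks on the returned object.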

