Summary
Add first-class config support for running mem0's embedder and/or fact-extraction LLM against a local Ollama instance, so memserv can run with zero per-call cost and no data leaving the host.
Idea from OB1
OB1's recipes/local-ollama-embeddings provides "on-device embedding generation" as an alternative to cloud embedding APIs — cutting cost and keeping content private.
Why it fits memserv
- app/memory.py::_build_config already abstracts embedder.provider and llm.provider from settings, and mem0 supports Ollama for both. This is largely a config + docs feature, not a re-architecture.
- Directly addresses two PRD §19 operational footguns: embedding cost and the Anthropic fact-extraction bill for high-volume automation.
- Privacy win for a self-hosted single-user system: no content sent to OpenAI/Anthropic.
Proposed approach
- Extend Settings (app/config.py) to accept mem0_embed_provider=ollama / mem0_llm_provider=ollama plus OLLAMA_BASE_URL and model names.
- Wire the Ollama-specific config keys through _build_config.
- Critical reminder: MEM0_EMBED_DIMS must match the Ollama embed model's real output dimension, and changing models requires dropping/recreating the Qdrant collection (per the existing architecture invariant). Document this prominently.
- Add docker-compose.yml notes / an optional Ollama service for the bundled-stack deploy path.
- Document in docs/USER_GUIDE.md.
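A rough sketch of how the settings and wiring steps above might fit together, including a guard for the embed-dimension reminder. This is illustrative only: the Settings field names, the build_config and check_embed_dims helpers, the default model names, and the KNOWN_EMBED_DIMS table are assumptions, not the existing memserv code. The ollama_base_url and embedding_model_dims keys follow mem0's documented Ollama and Qdrant examples as I understand them and should be checked against the installed mem0 version.

```python
from dataclasses import dataclass

# Output dimensions for a few common Ollama embedding models (assumed table;
# extend or replace with a live probe of the model in a real implementation).
KNOWN_EMBED_DIMS = {
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
    "all-minilm": 384,
}


@dataclass
class Settings:
    # Hypothetical fields mirroring the proposed env vars; defaults keep the
    # current anthropic + openai behaviour so Ollama stays opt-in.
    mem0_embed_provider: str = "openai"
    mem0_llm_provider: str = "anthropic"
    mem0_embed_model: str = "text-embedding-3-small"
    mem0_llm_model: str = "claude-3-5-haiku-latest"  # placeholder default
    mem0_embed_dims: int = 1536
    ollama_base_url: str = "http://localhost:11434"


def check_embed_dims(settings: Settings) -> None:
    """Fail fast when MEM0_EMBED_DIMS disagrees with a known Ollama model."""
    if settings.mem0_embed_provider != "ollama":
        return
    expected = KNOWN_EMBED_DIMS.get(settings.mem0_embed_model)
    if expected is not None and expected != settings.mem0_embed_dims:
        raise ValueError(
            f"MEM0_EMBED_DIMS={settings.mem0_embed_dims} but "
            f"{settings.mem0_embed_model} emits {expected}-dim vectors; "
            "fix the setting and recreate the Qdrant collection"
        )


def _provider_block(provider: str, model: str, base_url: str) -> dict:
    """Build one provider section, adding the base URL only for Ollama."""
    config = {"model": model}
    if provider == "ollama":
        config["ollama_base_url"] = base_url
    return {"provider": provider, "config": config}


def build_config(settings: Settings) -> dict:
    """Assemble a mem0-style config dict from settings (sketch only)."""
    check_embed_dims(settings)
    return {
        "embedder": _provider_block(
            settings.mem0_embed_provider,
            settings.mem0_embed_model,
            settings.ollama_base_url,
        ),
        "llm": _provider_block(
            settings.mem0_llm_provider,
            settings.mem0_llm_model,
            settings.ollama_base_url,
        ),
        "vector_store": {
            "provider": "qdrant",
            "config": {"embedding_model_dims": settings.mem0_embed_dims},
        },
    }
```

With this shape, switching only the embedder to Ollama (for example Settings(mem0_embed_provider="ollama", mem0_embed_model="nomic-embed-text", mem0_embed_dims=768)) leaves the LLM section on its default provider, and a stale MEM0_EMBED_DIMS raises at startup instead of surfacing later as a Qdrant dimension error.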
Notes / scope
Keep the default as anthropic+openai; Ollama is opt-in via env. No changes to the single-Memory-per-process invariant.
Source: https://github.com/NateBJones-Projects/OB1/tree/main/recipes