Build a local vector store system for Obsidian notes that enables semantic search and AI-assisted note management.
```
┌─────────────────────┐
│   Obsidian Vault    │  /Users/ernestkoe/Documents/Obsidian/ProofKit/
└──────────┬──────────┘
           │ file watcher (watchdog)
           ▼
┌─────────────────────┐
│   Indexer Service   │  Background daemon
│ • Parse markdown    │
│ • Chunk by heading  │
│ • Generate embeds   │  via Ollama/OpenAI/LMStudio/CoreML
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│      ChromaDB       │  Local persistent store
│ ~/Projects/obsidian-memory/data/
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│     MCP Server      │  stdio transport
│ • search_notes      │
│ • get_similar       │
│ • get_note_context  │
└─────────────────────┘
```
```
~/Projects/obsidian-memory/
├── pyproject.toml          # Project config, dependencies
├── README.md               # Documentation
├── src/
│   └── obsidian_memory/
│       ├── __init__.py
│       ├── indexer.py      # Markdown parsing, chunking, embedding
│       ├── store.py        # ChromaDB wrapper
│       ├── watcher.py      # File system watcher
│       ├── server.py       # MCP server
│       └── cli.py          # Command-line interface
├── data/                   # ChromaDB persistent storage
└── tests/
    └── test_indexer.py
```
- Parse markdown files
- Chunk by heading (preserve context)
- Generate embeddings via Ollama
- Store metadata: file path, heading, modified date, tags
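The chunking step above could be sketched roughly as follows (illustrative only — `chunk_by_heading` and the heading-prefix behavior are assumptions, not the project's actual code):

```python
import re

def chunk_by_heading(text: str, path: str) -> list[dict]:
    """Split markdown into chunks at ## / ### headings, prefixing each
    chunk with its heading so the text stands alone semantically."""
    chunks = []
    current_heading = "(preamble)"
    buf: list[str] = []

    def flush():
        body = "\n".join(buf).strip()
        if body:
            chunks.append({
                "id": f"{path}#{current_heading}",
                "heading": current_heading,
                # prepend the heading as context for the embedding
                "text": f"{current_heading}\n\n{body}",
            })

    for line in text.splitlines():
        m = re.match(r"^(#{2,3})\s+(.*)", line)
        if m:
            flush()           # close out the previous section
            buf = []
            current_heading = m.group(2).strip()
        else:
            buf.append(line)
    flush()                   # final section
    return chunks
```

Only `##` and `###` start new chunks, matching the chunking decision at the end of this document; `#` titles and deeper headings stay inside the enclosing chunk.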
- ChromaDB wrapper with collections:
  - `daily_notes` - Daily journal entries
  - `notes` - All other notes
- CRUD operations: add, update, delete, query
- Metadata filtering (by date, path, tags)
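Update and delete only work cleanly if chunk IDs are deterministic, so re-indexing a changed file replaces its old entries instead of duplicating them. A sketch of how IDs and metadata might be derived (these helpers are hypothetical; ChromaDB metadata values must be scalars, hence the joined tag string):

```python
import hashlib

def chunk_id(file_path: str, heading: str, position: int) -> str:
    """Deterministic ID: the same file/heading/position always maps to the
    same ID, so upserts replace stale entries rather than duplicating."""
    raw = f"{file_path}::{heading}::{position}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]

def chunk_metadata(file_path: str, heading: str,
                   mtime: float, tags: list[str]) -> dict:
    """Flat metadata dict for filtering by date, path, and tags."""
    return {
        "path": file_path,
        "heading": heading,
        "modified": mtime,
        "tags": ",".join(tags),  # scalar-only metadata: join tags
    }
```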
- Monitor Obsidian vault for changes
- Trigger incremental reindex on file changes
- Handle creates, updates, deletes, renames
Tools exposed:
- `search_notes(query, limit, collection)` - Semantic search
- `get_similar(note_path, limit)` - Find related notes
- `get_note_context(note_path)` - Get note + related context
- `reindex(path?)` - Force reindex (optional path filter)
- `get_stats()` - Collection stats
Commands:
- `obsidian-memory index` - Full reindex
- `obsidian-memory search "query"` - Test search
- `obsidian-memory serve` - Start MCP server
- `obsidian-memory watch` - Start watcher daemon
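A skeleton of that command surface (sketch only: the plan calls for click, but argparse is used here to stay dependency-free):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI skeleton mirroring the planned obsidian-memory commands."""
    p = argparse.ArgumentParser(prog="obsidian-memory")
    sub = p.add_subparsers(dest="command", required=True)
    sub.add_parser("index", help="Full reindex")
    search = sub.add_parser("search", help="Test search")
    search.add_argument("query")
    search.add_argument("--limit", type=int, default=5)
    sub.add_parser("serve", help="Start MCP server")
    sub.add_parser("watch", help="Start watcher daemon")
    return p
```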
```toml
# config.toml or pyproject.toml
[tool.obsidian-memory]
vault_path = "/Users/ernestkoe/Documents/Obsidian/ProofKit"
data_path = "./data"
ollama_url = "http://localhost:11434"
embedding_model = "nomic-embed-text"

[collections]
daily_notes = "Daily Notes/**/*.md"
notes = "**/*.md"
exclude = ["attachments/**", ".obsidian/**"]
```
Dependencies:
- `chromadb` - Vector store
- `watchdog` - File system monitoring
- `mcp` - Model Context Protocol SDK
- `httpx` - Ollama/LMStudio API calls
- `openai` - OpenAI embeddings
- `pyyaml` - Frontmatter parsing
- `click` - CLI framework
Optional:
- `coremltools` - CoreML model conversion (for the `[coreml]` extra)
- Set up project structure with pyproject.toml
- Implement markdown parser with heading-based chunking
- Implement Ollama embedding client
- Implement ChromaDB store wrapper
- Create CLI with `index` and `search` commands
- Test with Daily Notes folder
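The Ollama embedding client in phase 1 could be as small as this (a sketch: the plan calls for httpx, but stdlib urllib keeps the example self-contained; the `/api/embeddings` endpoint shown here returns `{"embedding": [...]}` for a single prompt):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default from the config above

def build_embed_request(model: str, text: str) -> dict:
    """Payload for Ollama's /api/embeddings endpoint."""
    return {"model": model, "prompt": text}

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Fetch one embedding vector from a locally running Ollama."""
    body = json.dumps(build_embed_request(model, text)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```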
- Implement MCP server with search tools
- Add to Claude Code MCP config
- Test semantic search from Claude
- Implement watchdog-based file monitoring
- Incremental updates (not full reindex)
- Daemon mode for background operation
- Handle edge cases (empty files, binary files, etc.)
- Add stats/health endpoints
- Optimize chunking strategy based on real usage
Add native macOS embedding support to eliminate external dependencies.
Goals:
- No need for Ollama or OpenAI API
- Runs on Neural Engine (power efficient on Apple Silicon)
- Fully offline operation
Approach:
- Convert `all-MiniLM-L6-v2` (22M params) to CoreML format using `coremltools`
- Add `coreml` as a provider option alongside openai/ollama/lmstudio
- Create a `CoreMLEmbedder` class following the existing embedder pattern
- Bundle or download the converted model on first use
Tradeoffs:
| Aspect | CoreML | Ollama/nomic-embed-text |
|---|---|---|
| Quality | Good (MiniLM) | Better (nomic) |
| Dependencies | None | Ollama running |
| Power usage | Low (Neural Engine) | Higher |
| Model size | ~90MB | ~270MB |
Alternative models:
- `bge-small-en-v1.5` - similar size, slightly better quality
- Apple's built-in `NLEmbedding` - zero dependencies but older/lower quality
Implementation notes:
- Add `coremltools` as an optional dependency: `pip install obsidian-notes-rag[coreml]`
- Model stored in `~/Library/Application Support/obsidian-notes-rag/models/`
- First run downloads or converts the model
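One detail the converted model will need: sentence-transformer models like `all-MiniLM-L6-v2` output per-token vectors, so the embedder must apply masked mean pooling and L2 normalization itself. A pure-Python sketch of that post-processing (real code would use numpy over the CoreML output arrays):

```python
import math

def pool_and_normalize(token_vecs: list[list[float]],
                       mask: list[int]) -> list[float]:
    """Masked mean pooling + L2 normalization: averages the vectors of
    non-padding tokens, then scales the result to unit length."""
    dim = len(token_vecs[0])
    total = [0.0] * dim
    n = max(sum(mask), 1)          # avoid division by zero on empty input
    for vec, m in zip(token_vecs, mask):
        if m:                      # skip padding tokens
            for i in range(dim):
                total[i] += vec[i]
    mean = [x / n for x in total]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]
```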
- Chunking: By heading (## and ###) - preserves semantic sections
- Scope: Full vault - all markdown files
- Collections: Single collection with metadata filtering (type, path, date)