A personal journal insight engine that ingests handwritten journal pages, voice notes, and fitness data, then answers natural language queries about them.
- Ingests handwritten journal pages (OCR via Claude Opus 4.6) and voice notes (transcription via a configurable provider: gpt-4o-transcribe by default, with Gemini as an alternative and whisper-1 as a fallback)
- Pulls fitness data from Strava (activities) and Garmin Connect (activities plus daily wellness: sleep, HRV, body battery, training load) so journal entries can be correlated against training and recovery state
- Stores entries in dual databases: SQLite for structured queries, ChromaDB for semantic search
- Answers natural language questions like "Which friends did I meet in February?" or "What makes me happy?"
- Interfaces: MCP server (for AI assistants), CLI, and API endpoints
- Python 3.13+
- uv for dependency management
- API keys for Anthropic and OpenAI
- Docker (for ChromaDB and deployment)
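As a rough illustration of the dual-store idea in the feature list (SQLite for structured queries, ChromaDB for semantic search), the sketch below covers only the structured half, using Python's built-in sqlite3 module to answer a question like "Which friends did I meet in February?". The table names (entries, mentions), the columns, and the sample rows are assumptions made up for this example and are not the project's actual schema. The semantic half would go through ChromaDB and is not shown.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entries (
            id INTEGER PRIMARY KEY,
            entry_date TEXT NOT NULL,   -- ISO date, e.g. 2026-02-03
            text TEXT NOT NULL
        );
        CREATE TABLE mentions (
            entry_id INTEGER NOT NULL REFERENCES entries(id),
            person TEXT NOT NULL
        );
        """
    )
    # Sample rows invented for the example.
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?)",
        [
            (1, "2026-02-03", "Coffee with Atlas"),
            (2, "2026-02-14", "Dinner with Robin and Atlas"),
            (3, "2026-03-22", "Long run, then lunch with Sam"),
        ],
    )
    conn.executemany(
        "INSERT INTO mentions VALUES (?, ?)",
        [(1, "Atlas"), (2, "Robin"), (2, "Atlas"), (3, "Sam")],
    )
    return conn


def people_met(conn: sqlite3.Connection, start: str, end: str) -> list[str]:
    """Return distinct people mentioned in entries dated in [start, end)."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.person
        FROM mentions AS m
        JOIN entries AS e ON e.id = m.entry_id
        WHERE e.entry_date >= ? AND e.entry_date < ?
        ORDER BY m.person
        """,
        (start, end),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(people_met(conn, "2026-02-01", "2026-03-01"))
```

ISO-formatted date strings sort lexicographically in date order, which is why the plain string comparison in the WHERE clause is enough here. Open-ended questions such as "What makes me happy?" do not map to a filter like this, which is where the vector store comes in.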
# Clone and install
git clone https://github.com/johnmathews/journal-server.git
cd journal-server
uv sync
# Set up environment
cp .env.example .env # Edit with your API keys
# Run tests
uv run pytest
# Start ChromaDB locally
docker run -d --name chromadb -p 8000:8000 -v ./chroma-data:/data chromadb/chroma:1.5.5
# Use the CLI
uv run journal ingest page.jpg --date 2026-03-22
uv run journal search "meetings with Atlas"
uv run journal stats

# Set API keys
export ANTHROPIC_API_KEY=your-key
export OPENAI_API_KEY=your-key
# Start the full stack
docker compose up -d

This starts:
- Journal MCP server on port 8000 (streamable HTTP)
- ChromaDB on port 8001
                   MCP Client (Nanoclaw)
                             |
                   MCP Server (FastMCP)
                             |
                 +-----------+-----------+
                 |                       |
           Query Service         Ingestion Service
                 |                       |
       +---------+---------+    +--------+--------+
       |         |         |    |        |        |
    SQLite   ChromaDB    Embed OCR    Whisper   Embed
    (FTS5)   (vectors)    API  API      API      API
All external APIs are behind provider-agnostic interfaces (Python Protocols), making it easy to swap providers.
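To make the provider-agnostic idea concrete, here is a minimal sketch of a Protocol-based interface with a simple fallback chain. Every name in it (TranscriptionProvider, FailingProvider, EchoProvider, transcribe_with_fallback) is hypothetical and chosen for illustration; the project's real interfaces and retry logic may look quite different.

```python
from typing import Protocol, Sequence


class TranscriptionProvider(Protocol):
    """Hypothetical interface: anything with a name and a transcribe method."""

    name: str

    def transcribe(self, audio: bytes) -> str: ...


class FailingProvider:
    """Stand-in for a provider whose API call fails."""

    def __init__(self, name: str) -> None:
        self.name = name

    def transcribe(self, audio: bytes) -> str:
        raise RuntimeError(f"{self.name} unavailable")


class EchoProvider:
    """Stand-in for a working provider; returns a placeholder transcript."""

    def __init__(self, name: str) -> None:
        self.name = name

    def transcribe(self, audio: bytes) -> str:
        return f"[{self.name}] {len(audio)} bytes transcribed"


def transcribe_with_fallback(
    audio: bytes, providers: Sequence[TranscriptionProvider]
) -> str:
    """Try each provider in order and return the first successful result."""
    errors: list[str] = []
    for provider in providers:
        try:
            return provider.transcribe(audio)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


chain = [FailingProvider("primary"), EchoProvider("fallback")]
print(transcribe_with_fallback(b"abc", chain))
```

Because Protocols use structural typing, neither stand-in class inherits from TranscriptionProvider; matching the attribute and method signature is enough. That is what keeps swapping a provider down to writing one new class.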
- Architecture — System design and data flow
- Configuration — Environment variables reference
- Transcription Providers — Multi-provider stack, retry/fallback, shadow mode
- Fitness Pipeline — Strava + Garmin data flow (engineer-facing overview)
- Fitness Operations — Re-auth, backfill, troubleshooting (operator runbook)
- Fitness Integration Plan — Decisions and rationale (sacred raw archive, four-layer pipeline, daily cadence, library pins)
- Fitness Schema — Tables, columns, indexes, migration sequencing
- Development — Local setup and contributing
- API Reference — MCP tool documentation
Roughly $3.52/month, assuming about 3 handwritten pages/day plus 10 min of voice notes. See project-brief.md for detailed estimates.