A local, AI-powered knowledge curation system. Drop in a PDF, article, or note — The Curator automatically atomizes it into an interlinked wiki of entities, concepts, and summaries. Chat with your knowledge in a multi-turn AI conversation. Explore everything as a visual knowledge graph in Obsidian. Sync seamlessly across your own computers via a private GitHub repository — or contribute to a collective Shared Brain with your cohort, team, or research group (v3.0.0-beta+, opt-in).
Built on the Karpathy llm-wiki concept: instead of one giant notebook where everything gets lost, you maintain dedicated, compounding wikis per domain (e.g. AI/Tech, Business, Personal Growth). Each one gets smarter with every source you add.
Your job is to curate sources, ask the right questions, and think about what it all means. The Curator's job is everything else — summarizing, cross-referencing, filing, and bookkeeping.
See The Curator in action: drop a PDF, watch it atomize into an interlinked wiki, explore the knowledge graph, and chat with your knowledge.
1. Drop in a PDF, article, or note
↓
2. The Curator reads it and writes 5–15 interlinked wiki pages
(summary + entity pages + concept pages, with YAML frontmatter)
↓
3. Chat with your knowledge — multi-turn AI conversation
with full memory, cited answers, persistent history
↓
4. Open Obsidian → explore the auto-colored visual knowledge graph
↓
5. Sync now → your knowledge backs up to your private GitHub repo
↓
6. (v3.0.0-beta+, optional) Join a Shared Brain → your opted-in
domain contributes to a collective wiki shared with your cohort,
team, or research group; everyone's reading compounds together
Everything is stored as plain markdown files on your computer. No subscriptions, no database, no cloud accounts — except a Google Gemini or Anthropic Claude API key (Gemini has a free tier with strict daily quotas; pay-as-you-go costs roughly €5/month for moderate solo use, or €10–20/month for an admin running cohort-scale synthesis weekly — see Cost & API keys for the full breakdown).
Most AI integrations use RAG (Retrieval-Augmented Generation): the AI scans raw files, retrieves chunks at query time, and forgets everything the moment the chat ends. It rediscovers knowledge from scratch on every question. Nothing compounds.
The Curator works differently. When you ingest a source:
- The AI reads it, extracts key people/tools/ideas, and writes persistent wiki pages
- On every subsequent ingest, it updates existing pages rather than creating duplicates
- Cross-references are baked in: contradictions are flagged and the synthesis is maintained
- The wiki compounds with every source you add
The knowledge is compiled once and kept current — not re-derived on every query. This is the shift from a file cabinet to a neural network.
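The "update existing pages rather than creating duplicates" behaviour can be pictured as an idempotent upsert. The sketch below is illustrative only and is not The Curator's actual merge pipeline: the in-memory `wiki` dict, the `merge_page` helper, and the case-insensitive bullet matching are all assumptions made for the example.

```python
def merge_page(wiki, slug, bullets):
    """Upsert bullets into the page stored under `slug` (hypothetical helper).

    `wiki` maps a page slug to its list of bullet strings. Matching bullets
    case-insensitively is an assumption for this sketch.
    """
    created = slug not in wiki
    page = wiki.setdefault(slug, [])
    seen = {existing.strip().lower() for existing in page}
    added = 0
    for bullet in bullets:
        key = bullet.strip().lower()
        if key and key not in seen:
            page.append(bullet.strip())
            seen.add(key)
            added += 1
    return {"created": created, "added": added}


wiki = {}
first = merge_page(wiki, "concepts/rag", ["Combines retrieval with generation"])
second = merge_page(
    wiki,
    "concepts/rag",
    ["Combines retrieval with generation", "Retrieves chunks at query time"],
)
# Re-ingesting knowledge the page already holds changes nothing.
third = merge_page(wiki, "concepts/rag", ["Retrieves chunks at query time"])
```

Under these assumptions the first ingest creates the page, the second adds only the new bullet, and the third is a no-op, which is the property that lets a wiki compound instead of fragmenting.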
- Drop in a `.pdf`, `.txt`, or `.md` file and the AI does the rest
- Atomic Decomposition: automatic extraction of Entities (people, tools, companies), Concepts (ideas, techniques, frameworks), and Summaries (source narratives)
- Every page cross-references related pages with `[[wiki-links]]`
- YAML frontmatter on every page: structured metadata (`type`, `tags`, `created`) that powers Obsidian's Properties panel, Dataview queries, and automatic graph coloring
- Auto-colored knowledge graph: type tags (`type/entity`, `type/concept`, `type/summary`) let Obsidian color-code every node automatically; set it up once, every future ingest colors itself
- Multi-turn AI chat with persistent conversation history: ask follow-ups, connect the dots across sources, pick up where you left off
- Compile to Wiki (v2.5.0) — turn any chat conversation into permanent wiki pages with one click. The AI reads the dialogue, extracts the durable knowledge, writes a summary page plus any new entities/concepts that emerged, and updates everything related — same merge pipeline as ingest, no parallel write surface. Compiling the same conversation twice is a safe no-op. After every compile (and every ingest) you see exactly which pages were created and which were updated, with byte counts and per-section bullet deltas.
- Visual knowledge graph via Obsidian (free app, reads the same files)
- Personal Sync — one-time 3-minute setup, then a single Sync now button (with optional Push-only / Pull-only advanced controls) backs up your full wiki across any number of YOUR own computers via a private GitHub repository
- Shared Brain (v3.0.0-beta+, opt-in): contribute to a collective wiki shared with your cohort, team, or research group. Each contributor keeps a private Curator; only opted-in domains push LLM-synthesised Delta summaries to a shared private GitHub repo; the synthesised collective wiki pulls back as a separate read-only `shared-<slug>/` mirror domain. Two-primitives security model (invite token = metadata only, PAT = per-contributor identity), GDPR Article 17 right-to-erasure built in, two IP modes (`contributor_retains` for cohorts / `organisational` for enterprise). v3.1 will add Cloudflare R2 storage for EU data residency. See docs/shared-brain-user-guide.md
- Domain management: create, rename, and delete domains from the UI; four AI-tuned templates auto-generate the right schema
- Settings tab — manage API keys, view version info, and check for updates from within the app
- System Check (Settings): one click confirms the app itself is set up correctly, covering API key configured, knowledge folder writable, credential files locked down (`0600`), and sync status. A free, instant, local-only check that never touches your wiki content, plus an optional, cost-confirmed "Verify AI connection" test (~$0.0001) that makes one tiny request to your provider so you can tell at a glance whether a failure is your key or a provider outage. (Distinct from the Health tab, which cleans up your wiki content.) See docs/system-check.md
- Wiki Health tab: one-click scan for broken links, orphans, duplicate entities, folder-prefix violations, and missing backlinks. Auto-fix categories rewrite in place; broken links with a suggested target get an Apply button (and a bulk Apply all suggestions action); genuine ambiguities stay review-only. Review-only rows also get a ✨ Ask AI button that uses your configured LLM: for broken links it proposes a target; for orphans it proposes up to 5 existing pages that should link to the orphan, each with an AI-written bullet description. An opt-in Scan for semantic duplicates feature finds pages that describe the same concept under different slugs (e.g. `[[email]]` + `[[e-mail]]`, `[[rag]]` + `[[retrieval-augmented-generation]]`), with cost preview, user-configurable ceiling, and a mandatory Preview-diff safety gate before any merge. Decisions persist: dismiss any review-only issue or semantic-duplicate pair once and it stops surfacing on future scans, and because dismissals live inside the wiki folder, they sync to your other computers automatically. See docs/ai-health.md.
- First-run onboarding wizard: guided 3-step setup (API keys, create a domain, sync) on first launch
- Live UI updates — domain stats, wiki pages, and page counts refresh automatically after ingest and sync — no manual browser reload needed
- Auto-update — check for updates in Settings; the app pulls the latest version, rebuilds the Dock app, and restarts automatically
- One-command installer — auto-detects and installs Node.js, builds the Dock app, opens on completion
- Supports Google Gemini (recommended, very cheap) and Anthropic Claude
- Three built-in domains: AI/Tech · Business/Finance · Personal Growth
- Add unlimited custom domains — no terminal or file editing required
- Mac Dock app — double-click to launch, no terminal needed
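To make the `[[wiki-links]]` and Wiki Health items in the list above more concrete, here is a minimal sketch of how backlinks, broken links, and orphans can be derived from page bodies. It is not the app's scanner: the `pages` dict, the `scan` function, and the sample slugs are assumptions for illustration. The regex follows Obsidian's `[[target]]`, `[[target|alias]]`, and `[[target#heading]]` link forms.

```python
import re

# Captures the link target and ignores an optional |alias or #heading part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def extract_links(markdown):
    """Return the set of page slugs that a markdown body links to."""
    return {target.strip() for target in WIKILINK.findall(markdown)}


def scan(pages):
    """pages maps slug -> markdown body (a hypothetical in-memory wiki)."""
    backlinks = {slug: set() for slug in pages}
    broken = set()
    for source, body in pages.items():
        for target in extract_links(body):
            if target in pages:
                backlinks[target].add(source)
            else:
                broken.add((source, target))
    # An orphan is a page that no other page links to.
    orphans = {slug for slug, incoming in backlinks.items() if not incoming}
    return backlinks, broken, orphans


pages = {
    "rag": "See [[retrieval]] and [[karpathy|Andrej Karpathy]].",
    "retrieval": "Used by [[rag]].",
    "karpathy": "Proposed the llm-wiki idea.",
    "email": "Also written as [[e-mail]].",
}
backlinks, broken, orphans = scan(pages)
```

In this toy wiki, `email` is an orphan and its link to `e-mail` is broken, which is the kind of row a health scan would surface for review.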
| Mode | Tool | Best for |
|---|---|---|
| Chat | Built-in AI (Chat tab) | "How does X relate to Y?", synthesising across sources, multi-turn conversation |
| Visual | Obsidian graph view | Seeing the full knowledge map, spotting clusters, browsing pages |
| Frontier LLM | Claude Desktop via My Curator MCP bridge (v2.3+) | Deep research with Opus / Sonnet over the full graph — tags, links, backlinks, topology |
All three read the same markdown files — no sync or export needed between them. Set up My Curator from the Settings tab; see docs/mcp-user-guide.md.
The Curator is domain-agnostic. It works for anyone who accumulates knowledge over time and wants it organized, connected, and queryable rather than scattered.
Ingest all your reading material and research. When outlining a new video or article, open the Obsidian graph and look at the largest Concept nodes to see which themes you naturally gravitate toward. Click any Entity node to see every source you've read about that person or tool — generating a rich, fully cited script in minutes. Turns passive consumption into a content assembly line.
Batch-upload 20+ PDFs on a topic. The Curator extracts all distinct methodologies (Concepts) and authors (Entities). Use the graph's "Idea Collisions" to identify gaps in the literature — intersections between concepts that no existing paper has addressed. Query the chat to synthesise findings across all papers simultaneously with source citations.
Upload quarterly reports, competitor analyses, and meeting transcripts. Build an "Expertise Map" where the most-referenced nodes grow largest — giving you a visual heat map of where your intelligence is concentrated and where the gaps are. Query: "Synthesise the main friction points from the last 20 customer interviews." The Curator connects dots across months of documents, bypassing recency bias entirely.
Ingest architecture decision records (ADRs), API specs, post-mortems, and README files. The app builds a dependency graph of your codebase's decisions, not just its code. New team members can ask: "Why did we choose Postgres over MongoDB for the auth service?" and get an answer cited directly from an ADR written years ago.
Drop in clinical trial PDFs and academic papers. The Curator extracts Entities (genes, proteins, drugs, compounds) and Concepts (pathways, methodologies, biomarkers). The graph reveals hidden intersections — a compound used in one domain showing efficacy in a completely different study — by visually bridging nodes across your entire literature corpus.
Feed the app customer interview transcripts, investor updates, and market research reports. Build an external "Board of Advisors" from your own collected intelligence. If considering a product pivot, see which Concept nodes are growing fastest. Query the chat for synthesised strategic answers grounded entirely in your own research.
Ingest journal entries, book highlights, therapy notes, and podcast summaries. The app extracts recurring Entities (people, situations, environments) and Concepts (anxiety triggers, flow states, core values). Query: "What themes recur on high-stress days?" The Curator connects dots across months of journaling with the objectivity of a third party.
The use cases above are for individual users. Shared Brain extends The Curator with a collective layer where multiple contributors build a shared wiki together — each keeps their personal brain private, while one or more opted-in domains push synthesised contributions to a shared GitHub repo. Synthesis runs locally on the admin's machine using their LLM key; the collective wiki comes back to every contributor's machine as a separate read-only mirror domain.
A 20-student ML reading cohort each ingests papers into their personal work-ai domain and opts that one domain into the cohort Shared Brain. Synthesis runs weekly. The cohort ends the semester with a 500-page collective wiki that no single student could have built alone — every paper is in the entity graph, every concept cross-referenced, every contribution attributed. Privacy: students' other domains never leave their machines.
Four AI-safety researchers each contribute their papers domain to a shared brain. Nightly Pull brings everyone's notes into everyone else's shared-safety/ mirror. Friday meeting: someone asks Claude (via the My Curator MCP) "Which mechanistic-interpretability papers contradict each other on the role of induction heads?" — Claude reads the collective, surfaces three contradictions with paper citations. Synthesis resolves disagreements via the Jaccard contradiction heuristic + targeted LLM call.
A boutique strategy firm with 15 consultants contributes a sanitised client-insights domain to a firm-knowledge Shared Brain (in organisational IP mode — employment contracts cover IP assignment). The collective wiki becomes accumulated institutional intelligence that survives partner departures and onboards new hires in days instead of weeks.
A 50-person SaaS company pilots a Shared Brain for the engineering team. Each engineer opts in one engineering-knowledge domain with ADRs, post-mortems, and internal RFCs. New engineers query Claude: "Why did we pick PostgreSQL over MongoDB?" — Claude reads the collective via MCP, cites the 2023 ADR. Per-engineer attribution preserves who contributed what. shared-engineering/ is read-only for direct edits, so engineers can't accidentally overwrite the collective.
A product team (PM + designers + engineers + researcher) contributes 4 role-specific domains over 6 months. The collective wiki becomes the project's queryable memory. Six months later, the retrospective is informed by an actual searchable corpus, not just whoever happened to keep good notes.
→ More cohort & team patterns in docs/use-cases.md. Setup walkthrough in docs/shared-brain-user-guide.md. Architecture in docs/shared-brain.md. Compliance in docs/shared-brain-compliance.md.
Shared Brain's architecture supports paid access — domain experts, educators, researchers, artists, and consultants can charge audiences for access to a brain they curate. This works today on v3.0.0-beta.1 with zero code changes, using no-code payment platforms you already know (Gumroad, Lemon Squeezy, Stripe).
- Independent researchers — sell a recurring subscription to your curated reading domain (€10-30/mo). Example: an AI safety researcher with 4 years of paper reading + weekly synthesis.
- Educators & professors — package your cognitive-science / philosophy / history domain as a paid student companion or public knowledge product.
- Artists & designers — turn your 10-year visual-reference library with commentary into a paid resource.
- Industry experts — VC analysts, biotech researchers, longevity scientists with deep niche expertise.
- Consulting firms — sell sanitised pattern recognition (anonymised) to current clients as a recurring add-on.
- SaaS companies — sell domain expertise as a recurring asset bundled with their software product.
Unlike a Notion template (bought once, frozen) or a newsletter (single read, archived), a Shared Brain compounds. Buyers who pay in month 1 see the brain grow richer every synthesis run, and they can query it via Claude Desktop for deep research like "across this brain, which papers contradict each other on X?" The value keeps growing, which is exactly why subscription pricing works.
Pricing comparables:
| Product | Typical price | Why Shared Brain compares |
|---|---|---|
| Substack newsletter | €5-15/mo | Single-read content |
| Stratechery (Ben Thompson) | €15/mo | One expert's recurring analysis |
| Patreon tiers | €3-50/mo | Audience access |
| Shared Brain subscription | €10-30/mo | Compounding queryable knowledge graph + recurring synthesis + Claude integration |
Shared Brain puts four serial steps between a buyer and brain access. The first is the only one you (the admin) control 100%:
- 🚪 GitHub collaborator status — pay → you add → access granted; cancel → you remove → access revoked. This is THE money gate.
- 🚪 PAT scope — you instruct buyers to create a Read-only PAT (read-only tier) or Read AND Write PAT (contributor tier). Two tiers with no code.
- 🚪 Invite token — metadata-only, safe to email or even publish publicly. Not a gate, just a UX touchpoint.
- 🚪 The Curator app — buyer installs the free open-source app on their machine.
→ More example use cases (independent experts, artists, consulting firms, SaaS companies) in docs/use-cases.md.
Paste this into Terminal and press Enter:
```bash
curl -fsSL https://raw.githubusercontent.com/talirezun/the-curator/main/install.sh | bash
```

The script auto-detects and installs Node.js if needed, clones the repo, installs dependencies, and builds The Curator.app, all in one step. When it finishes, the app opens automatically. An onboarding wizard walks you through API key setup on first launch.
Pin it to your Dock. The installer puts The Curator.app in `~/the-curator/` but doesn't add it to your Dock automatically: open a Finder window, navigate to `~/the-curator`, and drag the app icon down into your Dock. Now you can launch The Curator with one click any time.
Lifecycle on macOS. The app is a local web server that opens in your browser. Closing the browser tab does not stop the server — it keeps running in the background using virtually no CPU, so clicking the Dock icon again instantly reopens it. To fully quit: right-click The Curator in the Dock → Quit.
Optional: The repo includes a `research/` folder with articles and papers about second brain architecture. This is not required to run the app. If you want to save disk space after installation, you can safely delete `~/the-curator/research/`; the app will work perfectly without it. The research folder is available for interested users who want to explore the concepts behind The Curator.
The Node.js server runs anywhere Node 18+ runs. Only the one-line installer and the auto-built .app Dock launcher are macOS-specific — the app itself is fully cross-platform.
Prerequisites
- Node.js 18+
- An API key — Google Gemini (free tier available, paid tier ~€5/month for moderate use) or Anthropic Claude (paid only)
- Obsidian for the knowledge graph (free, optional)
```bash
# 1. Clone the project
git clone https://github.com/talirezun/the-curator.git
cd the-curator

# 2. Install dependencies
npm install

# 3. Start the server
node src/server.js        # macOS / Linux
# Windows PowerShell:
# $env:CURATOR_NO_OPEN=1; node src\server.js
```

Open http://localhost:3333 in your browser.
Windows / Linux notes: the auto-update, Dock-app, and folder-picker UI buttons are macOS-only; everything else (ingest, chat, wiki, MCP, sync, Health) works identically. Set `DOMAINS_PATH=...` to point at your knowledge folder, and `CURATOR_NO_OPEN=1` to skip the macOS-only `open` browser-launch on startup.
Install with a coding agent: Claude Code, Cursor, Augment, Cline, and other CLI-aware AI coding agents can install The Curator for you — paste the prompt from User Guide §20.
API keys: The onboarding wizard appears on first launch and asks for your key. You can also add or change keys anytime in the Settings tab. Alternatively, developers can create a `.env` file manually (`cp .env.example .env`) and set `GEMINI_API_KEY` there.
For the Mac Dock app (double-click to launch, no Terminal needed), see docs/mac-app.md.
First time? Read the full User Guide — it covers every step in plain language, including how to get your API key, real-world cost estimates, how to use the chat, and how to set up Obsidian.
The Curator itself is free, open-source software. The only paid component is the AI provider you connect for the features that actually call an LLM. Knowing which features cost tokens and which don't makes the bill predictable.
| Feature | Uses tokens? | Why |
|---|---|---|
| Ingest (drop in a PDF / article / note) | ✅ Yes | The LLM reads the source and writes the wiki pages. This is by far the largest consumer of tokens. |
| Chat (built-in tab) | ✅ Yes | Each message + reply is one LLM call. Cheap — typically a few cents per long conversation. |
| Wiki Health — ✨ Ask AI on broken links (Phase 1) | ✅ Yes | One LLM call per click. ~$0.0001–0.0005 each. |
| Wiki Health — ✨ Ask AI on orphan pages (Phase 2) | ✅ Yes | One LLM call per click. ~$0.0001–0.0005 each. |
| Wiki Health — Semantic duplicate scan (Phase 3) | ✅ Yes — opt-in, cost-gated | A confirm dialog shows the estimate before you run it (typical: $0.003–$0.03 on Gemini Flash Lite). |
| Shared Brain — Push contributions (v3.0.0-beta+, contributor side) | ✅ Yes | Each push runs local LLM pre-processing to generate DeltaSummary objects from your changed pages. One LLM call per changed page. Typical: $0.001–0.01 per push on Gemini Flash Lite. |
| Shared Brain — Run synthesis (v3.0.0-beta+, admin side) | ✅ Yes — but contradiction-only | Synthesis only invokes the LLM for contradiction candidates flagged by the Jaccard heuristic. Most contributions don't conflict, so most synthesis runs are nearly free. Typical: $0.001–0.05 per synthesis on Gemini Flash Lite, scaling with disagreement rather than corpus size. |
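The table names a Jaccard heuristic as the filter that decides which contributions reach the LLM during synthesis, but does not spell it out. The sketch below shows only the general idea of Jaccard similarity over word sets. The tokenisation, the `low`/`high` thresholds, and the "middle band means candidate" rule are assumptions for illustration, not the app's documented behaviour.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased word set; a deliberately simple tokeniser for the sketch."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| over the two texts' word sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def contradiction_candidates(claims, low=0.2, high=0.8):
    """Pairs that overlap enough to share a topic but are not near-duplicates.

    Assumed rule: similarity >= high is treated as a duplicate, <= low as
    unrelated; only the band in between would be worth an LLM call.
    """
    return [
        (a, b)
        for a, b in combinations(claims, 2)
        if low < jaccard(a, b) < high
    ]


claims = [
    "Induction heads drive in-context learning",
    "Induction heads do not drive in-context learning",
    "Sparse autoencoders find features",
]
candidates = contradiction_candidates(claims)
```

Here the first two claims share 6 of 8 distinct words (similarity 0.75) and form the only candidate pair, while the third claim shares no words with either. This matches the cost pattern in the table: LLM calls scale with the number of candidate pairs, not with corpus size.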
| Feature | Why it's free |
|---|---|
| Wiki tab (browse pages) | Pure file rendering. No LLM call. |
| Domain management (create / rename / delete) | Filesystem operations only. |
| Settings, API keys, updates | Local. No LLM call. |
| Personal Sync (Sync now / Push only / Pull only) | A git push / git pull over HTTPS to your own private repo. |
| Wiki Health — structural scan & deterministic fixes (broken-link auto-fix, folder-prefix, hyphen variants, cross-folder dedup, missing backlinks) | Algorithmic — runs entirely on your machine. |
| My Curator MCP server (locally, on this machine) | The bridge itself is free. The frontier model you connect to it (Claude Desktop, etc.) bills you on its own plan, not through your Curator API key. |
| Shared Brain — Pull updates / Disconnect / List connections | GitHub REST API calls to read pages or list metadata — no LLM involved. |
| Shared Brain — Revoke a contributor (GDPR Article 17) | Storage operations only (delete contributions, scan + delete tainted pages, append audit log). Synthesis re-runs after — that step uses the LLM as above. |
| Provider | Free tier? | Cost (paid) | Real-world cost |
|---|---|---|---|
| Google Gemini 2.5 Flash Lite (default, recommended) | Yes — 15 RPM, 1,000 requests/day, 250k tokens/min (details) | $0.10/M input · $0.40/M output | ~€5/month at heavy use (50 articles × ~10 pages, plus daily chat) |
| Anthropic Claude Haiku 4.5 | No | $1/M input · $5/M output | ~10× the Gemini bill for the same workload |
About the Gemini "free tier": it exists, and it's enough to try the app — but the daily quota was tightened by 50–80% in December 2025, so a single batch ingest of 5–10 PDFs will usually exhaust it. For real use, enable billing in Google AI Studio — the per-token cost is so low that most users pay €1–€10/month total. See User Guide §19 for a full cost breakdown and pricing math.
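The per-token prices in the provider table translate into a bill with simple arithmetic. The sketch below uses only the prices listed above. The dictionary keys are labels for this example rather than API model identifiers, and the monthly token counts are invented for illustration, not measurements.

```python
# USD per million tokens (input, output), copied from the provider table.
PRICES_USD_PER_MILLION = {
    "gemini-2.5-flash-lite": (0.10, 0.40),
    "claude-haiku-4.5": (1.00, 5.00),
}


def cost_usd(provider, input_tokens, output_tokens):
    """Estimated spend in USD for a given token volume."""
    input_price, output_price = PRICES_USD_PER_MILLION[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical month: 2M input tokens and 0.5M output tokens.
gemini = cost_usd("gemini-2.5-flash-lite", 2_000_000, 500_000)
claude = cost_usd("claude-haiku-4.5", 2_000_000, 500_000)
ratio = claude / gemini
```

For this made-up workload the estimate is about $0.40 on Gemini and $4.50 on Claude Haiku, a ratio a little above the table's "~10×" because output tokens are priced 12.5 times higher.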
Context window: Gemini 2.5 Flash Lite has a 1,048,576-token window (≈1M tokens), which means The Curator can in principle ingest articles of 200–300 pages in a single pass. The current ingest pipeline caps inputs at 80k characters per call (≈20k tokens) and uses a multi-phase pipeline for larger documents — books and very long PDFs work but haven't been stress-tested at the full 1M-token ceiling.
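A rough picture of what an 80k-character cap per call implies for long documents: the text has to be split before it is sent. The sketch below is a simplified stand-in, not the app's multi-phase pipeline. The paragraph-packing strategy and the `split_into_chunks` and `approx_tokens` helpers are assumptions. The 4-characters-per-token ratio comes from the "80k characters ≈ 20k tokens" figure above.

```python
MAX_CHARS = 80_000  # per-call input cap quoted above


def approx_tokens(char_count):
    """Very rough estimate using ~4 characters per token."""
    return char_count // 4


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is itself over the cap.
        pieces = [
            paragraph[i:i + max_chars]
            for i in range(0, len(paragraph), max_chars)
        ] or [""]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


document = "\n\n".join(["a" * 50_000, "b" * 50_000, "c" * 170_000])
chunks = split_into_chunks(document)
sizes = [len(chunk) for chunk in chunks]
```

With this input the two 50k-character paragraphs cannot share a chunk, and the 170k-character paragraph is cut into three pieces, so every call stays at or under the cap.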
Building a second brain is rewarding. Querying it with a frontier model is the moment it becomes irreplaceable.
For most second-brain users, the loop is: ingest sources → admire the Obsidian graph. The graph is beautiful, the visual structure is enjoyable, and the local Chat tab handles everyday lookups. But the graph is something you look at. The synapses — the actual connections between thousands of knowledge nodes accumulated over years — are mostly invisible to you while you're inside the graph.
My Curator MCP is the bridge that opens that synapse layer to a frontier model. From v2.3 onwards, The Curator ships a local MCP server that exposes your wiki to any Model Context Protocol-compatible client — most importantly Claude Desktop with Opus or Sonnet, but also VS Code with an MCP-aware coding agent, LM Studio with a local model, or any other MCP client. From v2.5.2+, the bridge is read+write — Claude can save what you discussed, clean up wiki problems, and manage dismissals without you ever leaving the conversation.
This is not "another way to read your files." It's a graph-native access path. Seventeen dedicated tools — ten read tools (seven retrieval, three explicitly graph-shaped) and seven write tools (compile, scan/fix Health, manage dismissals) — let the model:
- Pull a topology overview of any domain — central hubs, cluster shape, orphan sample, top tags — in one call
- Traverse multi-hop neighbourhoods around any concept or entity
- Get bidirectional backlinks — "every source that mentions Karpathy"
- Search across every domain you've ever built, simultaneously
- Pivot from a tag to its pages, from a page to its links, from a link to its incoming references
- Save research findings back into the wiki (v2.5.2+) — "compile what we discussed and add it to my second brain"
- Heal the wiki on request (v2.5.2+) — "check for problems and fix what's safe" (auto-fixes the unambiguous ones, asks before destructive merges)
Imagine you've built your second brain over years. Thousands of nodes. Dozens of domains. Articles, research papers, books, customer interviews, journal entries — all ingested, all interconnected. You sit in Claude Desktop with Opus and ask:
"What are the most important ideas in my AI domain that I have never explicitly connected to my business strategy domain?"
Opus traverses the graph. Pulls hubs from both domains. Finds the intersections. Surfaces connections you made unconsciously, over years, without ever noticing them.
Or:
"For the white paper I'm drafting on organisational resilience, pull every entity and concept tagged `crisis-response` across all domains, group them by source, and build a citation skeleton."
Or:
"Across my last six months of journal entries, identify recurring patterns I haven't named yet, and propose names for them — citing the specific entries each pattern shows up in."
And after the research, you finish the loop:
"Compile everything we just figured out and save it as a research summary in my `business` domain, title it 'Q2 Strategic Patterns'."
Claude calls compile_to_wiki, and the synthesis lands in your wiki as a permanent page with bidirectional links to every entity and concept it referenced. The next research session can build on it.
That is not a chat interface. That is a frontier model doing deep research over your own intellectual history — and committing the conclusions back into it — with full citations, no hallucinations beyond your wiki, and no data ever leaving your machine.
When you join a Shared Brain (see docs/shared-brain-user-guide.md), the collective wiki appears on your machine as a shared-<slug>/ domain. MCP read tools work fully on it — Claude can search_wiki, get_node, get_index, search_cross_domain across the collective just like any other domain. This is where the cohort/team use cases get powerful: a research team can ask "across our shared brain, which papers contradict each other on X?" and Claude reads everyone's combined reading to surface the answer with citations.
MCP write tools refuse on shared-* mirrors by design — direct writes wouldn't propagate to other contributors and would be overwritten on the next Pull. To contribute, Claude writes to your personal opted-in domain (e.g. work-ai/), then you Push from the Sync tab. The skill (claude-skills/my-curator/SKILL.md §3.1) teaches Claude this contract so it knows where to compile when you say "save this to the shared brain."
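The write contract described above (mirrors are read-only, contributions go through a personal opted-in domain) can be sketched as a small guard. This is illustrative only: per the text, the real write tools simply refuse on `shared-*` domains and Claude chooses the personal domain by following the skill. The `resolve_write_target` function and the `opted_in` mapping are hypothetical names that combine both behaviours in one place.

```python
SHARED_PREFIX = "shared-"


def is_read_only_mirror(domain):
    """True for collective mirror domains such as shared-cohort/."""
    return domain.strip("/").startswith(SHARED_PREFIX)


def resolve_write_target(requested, opted_in):
    """Pick the domain a write should land in.

    `opted_in` maps a mirror name to the personal domain that feeds it
    (a hypothetical structure for this sketch).
    """
    name = requested.strip("/")
    if not is_read_only_mirror(name):
        return name
    if name not in opted_in:
        raise PermissionError(
            f"{name} is a read-only mirror and no opted-in personal domain is known"
        )
    return opted_in[name]


target = resolve_write_target("shared-cohort/", {"shared-cohort": "work-ai"})
```

The write lands in `work-ai`, and it only reaches the collective after a Push from the Sync tab, so nothing is lost when the next Pull overwrites the mirror.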
Most "AI for personal knowledge" tools are RAG wrappers: they re-derive answers from raw files at query time and forget everything afterwards. Nothing compounds. Nothing traverses.
My Curator inverts that: ingest builds a persistent, graph-shaped knowledge structure during writing, and MCP exposes that graph as first-class structured data at read time. The model doesn't pretend to be your second brain — it uses your second brain, the way an analyst uses a database. Topology, tags, links, backlinks — all queryable, all cited, all yours.
For teams, Shared Brain extends this further: now the analyst-database is collective — built by every cohort member's reading, queried by everyone's Claude. The first time you ask Opus about your team's combined corpus and it surfaces a contradiction between two papers your colleagues read months apart, you understand why this matters.
This is what makes the difference between "I have a folder of notes" and "I have a queryable, compounding extension of my own thinking that any frontier model can reason against on demand."
📖 Setup is under 2 minutes from the Settings tab inside the app — see docs/mcp-user-guide.md for the wizard, prompt patterns, and the privacy/security model.
💡 The My Curator Claude skill (v2.5.7+): drop claude-skills/my-curator/SKILL.md into Claude Code's `~/.claude/skills/` (or upload it to any Claude Desktop project's knowledge files) and every conversation that touches the my-curator MCP automatically follows the playbook: ground every wikilink, refuse speculative writes on fresh domains, three-tier-track Health fixes, respect domain siloing. No more typing detailed prompts every time. Install instructions in the MCP guide.
The Chat tab is a full multi-turn conversation interface. Ask anything about your wiki — the AI answers from your own pages, cites its sources, and remembers the entire conversation thread. Past conversations are saved and survive server restarts.
You: What is RAG and why does it matter?
AI: RAG combines retrieval with generation… [source: concepts/rag.md]
You: How does it compare to fine-tuning?
AI: As I mentioned, the key advantage is… [source: summaries/rag-paper.md]
Create multiple conversations per domain. Delete old ones. Pick up any thread later.
The Domains tab is a full GUI for creating, renaming, and deleting domains — no Finder or terminal needed.
Create a domain — type a display name, pick a template, and click Create. The folder and schema are generated automatically:
| Template | Best for |
|---|---|
| ⚙️ Tech / AI | Software, AI research, developer tools |
| 📈 Business / Finance | Startups, investing, strategy |
| 🌱 Personal Growth | Books, habits, mental models |
| 📁 Generic | Any other topic |
Rename — click the pencil icon. The folder is renamed on disk; all wiki pages, conversations, and Obsidian links update instantly.
Delete — click the trash icon. The confirmation panel shows exact page and conversation counts before you commit.
If GitHub sync is configured, a rename or delete shows a reminder to Sync now so all your computers stay consistent.
📖 Full reference: docs/domains.md — the CLAUDE.md schema, how domains relate to each other (siloed by default), and custom templates for specialised topics like history, health, or legal.
The Sync tab connects The Curator to a private GitHub repository so your wiki and chat history are available on every machine.
One-time setup (~3 minutes):
- Create a free, empty private repository on GitHub (no README/.gitignore/license)
- Create a Personal Access Token: fine-grained (recommended; Contents: Read and write on that one repo) or classic (`repo` scope; can be set to never expire)
- Open the Sync tab → follow the 3-step wizard
Three ways to set it up:
- In-app wizard (most users) — Sync tab → 3 steps. Full guide: docs/sync.md.
- With a coding agent (Claude Code, Cursor, opencode, Aider…) — paste one prompt and it does the whole thing: docs/sync-via-coding-agent.md.
- Manual — create the repo + token yourself and enter them in the wizard.
Daily use:
- Click Sync now at the start and end of every work session — it pulls remote changes first, then pushes yours. One button, both directions.
- Need a one-way operation? Open the Advanced disclosure in the Sync tab for Push only and Pull only buttons.
What syncs: wiki pages, chat history, domain schemas. What stays local: source files, API keys, app code.
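The three sync buttons differ only in which git operations run and in what order. As a sketch (the mode names and the `sync_plan` helper are assumptions, not the app's internals), the mapping looks like this:

```python
def sync_plan(mode="sync"):
    """Return the git commands a sync mode would run, in order."""
    plans = {
        "sync": ["pull", "push"],   # Sync now: pull remote changes first, then push
        "push-only": ["push"],      # Advanced: one-way upload
        "pull-only": ["pull"],      # Advanced: one-way download
    }
    if mode not in plans:
        raise ValueError(f"unknown sync mode: {mode}")
    return [["git", step] for step in plans[mode]]


default_plan = sync_plan()
```

Pulling before pushing means changes made on another computer are merged locally before yours are uploaded.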
See docs/sync.md for the full guide, including token permissions and troubleshooting.
The Shared Brain lets a cohort, team, or research group contribute to a collective wiki without merging personal data. Each contributor keeps their private Curator; only opted-in domains push to a shared private GitHub repo. The LLM-synthesised collective wiki comes back as a separate read-only mirror domain on every contributor's machine.
Alice's Mac Bob's PC Carlos's laptop
personal/ personal/ personal/ ← stays private
work-ai/ → work-ai/ → work-ai/ ← opted-in, pushes
shared-cohort/ shared-cohort/ shared-cohort/ ← pulled back (read-only)
↓ ↓ ↓
shared GitHub repo
(admin's private)
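
The three rows of the diagram correspond to three domain roles. The sketch below is illustrative only: `domainRole` and `assertWritable` are hypothetical names, and the `shared-` prefix check assumes the `shared-<slug>` naming this README uses for mirror domains.

```javascript
// Sketch only: hypothetical helpers, not The Curator's actual API.

// Classify a domain folder as private, opted-in, or a read-only mirror.
function domainRole(name, optedIn = new Set()) {
  if (name.startsWith('shared-')) return 'mirror'; // pulled back, read-only
  if (optedIn.has(name)) return 'opted-in';        // pushes to the Shared Brain
  return 'private';                                // never leaves the machine
}

// Refuse direct writes to mirror domains.
function assertWritable(name) {
  if (domainRole(name) === 'mirror') {
    throw new Error(`"${name}" is a read-only mirror domain`);
  }
}

const optedIn = new Set(['work-ai']);
for (const domain of ['personal', 'work-ai', 'shared-cohort']) {
  console.log(`${domain}: ${domainRole(domain, optedIn)}`);
}
```
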
Use cases: educational cohorts (each student contributes a work domain), enterprise knowledge management (employees opt in their work domain), research teams (a shared research domain compounds everyone's reading).
v3.0.0-beta.1 is opt-in. Open the Sync tab, scroll to "Shared Brains", click Enable Shared Brain (beta). Then pick a card: 📨 I have an invite token → Join if your cohort admin sent you a token, or ⚙ I'm starting a new Shared Brain → Set up if you're spinning one up for your team. The 5-step wizard (Token → Access → PAT → Domains → Save) walks through it.
Future generations: v3.1 adds Cloudflare R2 as a second storage backend for EU data residency and custom-domain endpoints. v3.2 adds GitHub App mode and SSO for enterprise. See the roadmap.
→ Shared Brain User Guide (step-by-step) · Architecture (concept + decisions) · Admin Operations · Compliance reference (GDPR / IP / EU residency)
After ingesting your first document, open Obsidian → Open folder as vault → select your Knowledge Base folder (shown in the Domains tab → Knowledge Base Location). Click the graph icon to see all your knowledge as an interactive, zoomable network.
Tip: The Domains tab shows your Knowledge Base Location path and has a Copy button — paste it directly into Obsidian's vault picker.
Activate graph colors (one-time setup): In Graph View → ⚙ → Groups, create three groups:
| Group | Query | Color |
|---|---|---|
| Entities | `tag:#type/entity` | Blue |
| Concepts | `tag:#type/concept` | Green |
| Summaries | `tag:#type/summary` | Purple |
Every future ingest auto-colors new nodes — no manual work needed. See the User Guide for full instructions.
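
The group queries above match on nested tags, so auto-coloring works when each generated page carries a `type/...` tag in its YAML frontmatter. Here is a rough sketch of what writing such a page could look like. The function name `renderPage` and the `title` field are assumptions; the real frontmatter produced by the ingest pipeline may contain different or additional fields.

```javascript
// Sketch only: field names other than the type tag are assumptions.
const PAGE_TYPES = ['entity', 'concept', 'summary'];

// Render a wiki page as markdown with YAML frontmatter whose tag
// is what an Obsidian group query like tag:#type/entity matches.
function renderPage({ type, title, body }) {
  if (!PAGE_TYPES.includes(type)) throw new Error(`Unknown page type: ${type}`);
  return [
    '---',
    `title: ${JSON.stringify(title)}`, // double-quoted string, valid YAML for simple titles
    'tags:',
    `  - type/${type}`,
    '---',
    '',
    body,
    '',
  ].join('\n');
}

console.log(renderPage({
  type: 'entity',
  title: 'Obsidian',
  body: 'A markdown note-taking app.',
}));
```
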
The Curator uses precise language for what it does. Understanding these terms helps you get the most out of it:
| Term | Definition |
|---|---|
| Atomic Decomposition | Breaking a large document into three discrete network components: Entities, Concepts, and Summaries |
| Entities (The Nouns) | Specific people, companies, tools, datasets — nodes with a proper name |
| Concepts (The Verbs/Ideas) | Broad theories, techniques, frameworks, principles — ideas without a single owner |
| Summaries (The Glue) | The narrative that connects specific entities to concepts for a given source |
| Semantic Intelligence | The system's ability to read raw text, comprehend context, and extract structured knowledge |
| Hidden Relations | Intersections between concepts that only become visible in the graph — what search bars can never show you |
| Contextual Provenance | The ability to trace any synthesised idea back to its exact source page |
| Network Compounding | Each new source updates existing pages rather than duplicating — knowledge builds on itself |
If you're working with Shared Brain, you'll see these specific terms in the UI, docs, and audit logs. Confusing them — especially invite token vs PAT — is the #1 setup mistake.
| Term | Definition | Not to be confused with |
|---|---|---|
| Shared Brain | A collective Curator wiki shared with a cohort, team, or research group. Each contributor's personal Curator stays private; only opted-in domains push to a shared private GitHub repo | Personal Sync, which backs up YOUR full wiki to YOUR own private repo |
| Contributor | Anyone in the cohort who joins and pushes contributions. There are N contributors per cohort | The Admin (just one per cohort) |
| Admin | The one person who creates the GitHub repo, generates the invite token, invites collaborators, and runs synthesis | A contributor, though the admin is also a contributor with their own data |
| Invite token (`sbi_...`) | Metadata-only label that tells the wizard which repo to connect to. Contains NO credentials. Safe to share with the whole cohort via Slack or email | A PAT; they are completely different things |
| Personal Access Token (`github_pat_...`) | Credential issued by GitHub. Each contributor creates their OWN. Never shared with anyone. Stays on the contributor's machine only | The invite token. Sharing your PAT is a security disaster |
| Opted-in domain | A personal Curator domain that the contributor explicitly chose to push to the Shared Brain. Other personal domains stay private | A `shared-<slug>` mirror domain (which is the pull destination, not the push source) |
| Mirror domain (`shared-<slug>/`) | The local read-only copy of the synthesised collective wiki, pulled to every contributor's machine | An opted-in domain. The Curator app, MCP write tools, and Health fixes refuse direct writes to mirror domains by design |
| Delta summary | The LLM-pre-processed payload that gets pushed to shared storage: `{new_facts, removed_links, ...}` for each changed page. Not a raw markdown file | The wiki page itself; the delta is the structured change, not the page |
| Synthesis | The admin-triggered process that merges all contributions into the collective wiki, applies merge rules 1-5 (union facts, resolve contradictions, attribute provenance, rebuild index) | Push (which sends contributions) or Pull (which fetches synthesised pages) |
| Provenance | The auto-appended section on every collective page listing contributor UUIDs (or names, per Decision 6a) | Authorship of a personal opted-in page; that stays purely on the contributor's machine |
| Conflict marker | The `## CONFLICTING SOURCES` block that synthesis inserts when two contributors disagree and the LLM can't unify their facts | A Health-broken-link issue. Conflict markers are specific to Shared Brain synthesis |
| Data handling terms | The admin's IP-mode choice at brain setup: `contributor_retains` (default; educational/cohort) or `organisational` (enterprise IP transfer). Locked once invites go out | Privacy controls. This is specifically about copyright in contributed content, not about who sees what |
| Revocation (GDPR Article 17) | Admin-triggered operation that permanently deletes a contributor's submissions + their facts from collective pages + appends an audit log entry. Irreversible | Removing a contributor as a GitHub collaborator (which stops future pushes but doesn't erase past contributions) |
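
Because the invite token and the PAT have distinct prefixes, a form could catch the most common mix-up before anything is saved. The following is a minimal sketch with hypothetical helper names (`classifyToken`, `checkInviteField`); it is not the wizard's actual validation. The `sbi_` and `github_pat_` prefixes come from the table above, and `ghp_` is the prefix GitHub uses for classic PATs.

```javascript
// Sketch only: hypothetical helpers, not the wizard's real validation.

// Guess what kind of token a pasted string is, based on its prefix.
function classifyToken(value) {
  const token = String(value).trim();
  if (token.startsWith('sbi_')) return 'invite-token';            // safe to share
  if (token.startsWith('github_pat_')) return 'fine-grained-pat'; // secret
  if (token.startsWith('ghp_')) return 'classic-pat';             // secret
  return 'unknown';
}

// Validate the field where an invite token is expected.
function checkInviteField(value) {
  const kind = classifyToken(value);
  if (kind === 'invite-token') return { ok: true };
  if (kind.endsWith('-pat')) {
    return { ok: false, reason: 'This looks like a GitHub PAT. Do not paste it where an invite token is expected.' };
  }
  return { ok: false, reason: 'Expected an invite token starting with sbi_.' };
}

console.log(classifyToken('sbi_example'));
console.log(checkInviteField('github_pat_example'));
```
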
the-curator/
├── src/
│ ├── server.js Express server (port 3333)
│ ├── routes/ API route handlers
│ ├── brain/
│ │ ├── llm.js LLM abstraction (Gemini + Claude)
│ │ ├── ingest.js Ingest pipeline (single-pass + multi-phase for large docs)
│ │ ├── chat.js Multi-turn chat with persistent conversations
│ │ ├── sync.js GitHub sync (git --git-dir / --work-tree)
│ │ └── files.js Filesystem helpers
│ └── public/ Web UI (vanilla JS, no build step)
├── domains/
│ └── <domain>/
│ ├── CLAUDE.md Domain schema (instructions for the AI)
│ ├── raw/ Your original uploaded files (local only)
│ ├── wiki/ Auto-generated knowledge pages
│ └── conversations/ Saved chat threads
├── scripts/ Maintenance utilities (dedup, repair, bulk-reingest)
├── images/ App icon in multiple sizes
└── docs/ Full documentation
For users
| Document | What it covers |
|---|---|
| User Guide | Full setup + usage: install, ingest, chat, costs, MCP, Health, sync, troubleshooting |
| Knowledge Immortality (essay) | The why — what a second brain is, why markdown matters, what compounding looks like in practice |
| My Curator MCP Guide | Connect the wiki to Claude Desktop (or any MCP client) for frontier-model research over your graph |
| AI Wiki Health Guide | AI-assisted broken-link / orphan / semantic-duplicate cleanup — what each phase does and the privacy tradeoffs |
| System Check | Settings → System Check: confirm the app setup (key, folder, credentials, sync) + an optional AI connection test |
| Sync Guide | Personal Sync — GitHub backup of your full wiki across your own computers (wizard, token permissions, troubleshooting) |
| Sync with a coding agent | Automated sync setup via Claude Code / Cursor / opencode / Aider — one copy-paste prompt |
| Shared Brain — User Guide | v3.0.0-beta+ — step-by-step for contributors AND admins; daily workflow; troubleshooting |
| Shared Brain — Architecture | What Shared Brain is, how it works internally, engineering decisions, v3.x+ roadmap |
| Shared Brain — Admin Operations | Advanced admin reference: synthesis cadence, revocation, contributor management |
| Shared Brain — Compliance | GDPR / IP / data residency reference for organisations evaluating deployment |
| Shared Brain — Monetization | Paid Shared Brain access: how independent experts, artists, professors, consulting firms, and SaaS companies can charge for brain access today using no-code payment platforms |
| Use Cases | Detailed workflows for every user profile, including cohort & team Shared Brain scenarios |
| Mac App Setup | Double-click Dock launcher for Mac |
For developers
| Document | What it covers |
|---|---|
| Contributing | Developer setup, running the tests (`npm test` / `npm run test:live`), adding a test, cutting a release |
| Ingestion Pipeline | The deep dive on the most important code path in The Curator — every safeguard, every failure mode, the quality contract, Mermaid diagrams |
| Domains | Full reference — managing domains, the CLAUDE.md schema, siloing model, custom templates |
| Model Lifecycle | Provider/model fallback policy, retiring deprecated models |
| API Reference | REST API documentation |
| Architecture | System design for developers |
- API keys can be stored via the Settings tab (saved in `.curator-config.json`) or in `.env`; both are gitignored and never committed. Credential files (`.curator-config.json`, `.sync-config.json`, `.sharedbrain-config.json`, `.env`) are written with `0600` permissions (owner-only) as of v3.0.1-beta.20
- Sync token lives in `.sync-config.json`, which is gitignored and never committed
- The app runs entirely on your local machine; the only outbound calls are to Gemini/Claude and (when syncing) to your own private GitHub repo
- The server binds to `127.0.0.1` (loopback) only, so it is not reachable from your local network, and a cross-origin guard rejects state-changing requests from other web origins (CSRF / DNS-rebinding defense). It still has no per-request authentication: it is a single-user local app and should not be reverse-proxied onto a public network
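
Two of the safeguards above, owner-only credential files and the cross-origin guard, can be illustrated in a few lines. This is a sketch under stated assumptions, not the app's source: `writeCredentialFile` and `isAllowedOrigin` are hypothetical names, the allow-list of origins is a guess, and treating a missing `Origin` header as allowed (for non-browser clients) is an assumption. Port 3333 is the one listed for `src/server.js`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Write a credential file that only the owner can read or write (0600).
function writeCredentialFile(filePath, data) {
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2), { mode: 0o600 });
  // `mode` only applies when the file is created, so tighten existing files too.
  fs.chmodSync(filePath, 0o600);
}

// Allow state-changing requests only from the app's own loopback origin.
// Assumption: requests without an Origin header (non-browser clients) pass.
function isAllowedOrigin(origin, port = 3333) {
  if (!origin) return true;
  return [`http://127.0.0.1:${port}`, `http://localhost:${port}`].includes(origin);
}

// Demo against a throwaway temp directory, with a placeholder key.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'curator-demo-'));
const demoFile = path.join(demoDir, '.curator-config.json');
writeCredentialFile(demoFile, { apiKey: 'example-only' });
console.log((fs.statSync(demoFile).mode & 0o777).toString(8)); // permission bits in octal (POSIX)
console.log(isAllowedOrigin('https://evil.example'));
```

In an Express app the loopback binding itself is one call, `app.listen(3333, '127.0.0.1')`, and a check like `isAllowedOrigin(req.headers.origin)` could run in middleware ahead of any POST, PUT, or DELETE route.
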
MIT — see LICENSE.
