Tiered Retrieval Memory System for AI Agents
An AI memory system with a native desktop app, 100% local storage, and automatic conflict resolution.
Language: English | 繁體中文 | 日本語
No Docker. No Python. No configuration. Just download and go!
1️⃣ Download → Kiroku Memory.app from GitHub Releases
2️⃣ Install → npx skills add yelban/kiroku-memory
3️⃣ Restart → Restart Claude Code and enjoy persistent memory!
| Feature | Kiroku | mem0 | claude-mem |
|---|---|---|---|
| 🖥️ Desktop GUI | ✅ Native App | ❌ Cloud | ❌ Web |
| 🔒 100% Local | ✅ | ❌ Cloud-first | ✅ |
| 🔄 Conflict Resolution | ✅ | ❌ | ❌ |
| ⏰ Time Decay | ✅ | ❌ | ❌ |
Core differentiators:
- Native Desktop App — Visual memory browser, not just CLI
- Fully Local — Your data never leaves your machine
- Smart Memory — Auto-detects contradictions, confidence decays over time
A production-ready memory system for AI agents that implements persistent, evolving memory with tiered retrieval. Built on the principles from Rohit's "How to Build an Agent That Never Forgets" and community feedback.
Traditional RAG (Retrieval-Augmented Generation) faces fundamental challenges at scale:
- Semantic similarity ≠ Factual truth: Embeddings capture similarity, not correctness
- No temporal context: Cannot handle "user liked A before, now prefers B"
- Memory contradictions: Information accumulated over time may conflict
- Scalability issues: Retrieval performance degrades with tens of thousands of memories
This system addresses these challenges with a Hybrid Memory Stack architecture.
Leading researchers in AI agents and cognitive science emphasize why persistent memory is crucial:
In her influential article "LLM Powered Autonomous Agents", Lilian Weng identifies memory as a core component:
Memory enables agents to go beyond stateless interactions, accumulating knowledge across sessions.
Kiroku implements this through Tiered Retrieval — summaries first, then drill-down — avoiding the semantic drift problem of naive RAG.
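The summaries-first flow can be sketched as follows. This is a minimal illustration with a hypothetical data model and function names, not Kiroku's actual code: Tier 1 always returns the cheap, bounded category summaries, and Tier 2 drills down to individual facts only for categories whose summary looks relevant to the query.

```python
# Illustrative sketch of tiered retrieval (hypothetical data model).

def tiered_retrieve(query, summaries, facts, max_facts=3):
    """Tier 1: always include category summaries.
    Tier 2: drill down to facts only for categories whose
    summary mentions a query term."""
    terms = query.lower().split()
    context = {"summaries": summaries, "facts": []}
    for category, summary in summaries.items():
        if any(t in summary.lower() for t in terms):
            matching = [f for f in facts if f["category"] == category]
            context["facts"].extend(matching[:max_facts])
    return context

summaries = {
    "preferences": "User prefers Neovim and dark mode.",
    "facts": "User works at Google as a software engineer.",
}
facts = [
    {"category": "preferences", "text": "prefers Neovim"},
    {"category": "facts", "text": "works at Google"},
]
result = tiered_retrieve("which editor does the user prefer", summaries, facts)
```

Because summaries are consulted first, the prompt stays small even with tens of thousands of stored facts; only the relevant categories are expanded.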
The LangChain team outlines three layers of agent memory: Episodic (events), Semantic (facts), Procedural (skills).
| LangChain Concept | Kiroku Implementation |
|---|---|
| Episodic | events category |
| Semantic | facts, preferences categories |
| Procedural | skills category |
Plus: Conflict Resolution automatically detects contradicting facts, and Cross-project Sharing via global:user scope.
From Daniel Kahneman's "Thinking, Fast and Slow" — System 1 (intuition) vs. System 2 (analysis).
Kiroku's implementation:
| Mode | Feature | Benefit |
|---|---|---|
| System 1 | Auto-load context | Claude "knows" you instantly |
| System 2 | /remember command | Explicit marking of important info |
Real impact: No more repeating "I prefer uv for Python" every session.
These experts converge on one insight: Memory transforms AI from a tool into a partner.
- Continuity — Conversations aren't isolated islands
- Personalization — AI truly "knows" you
- Efficiency — Eliminates cognitive overhead of re-explaining context
- Evolution — Memory accumulates, making AI smarter over time
- Append-only Raw Logs: Immutable provenance tracking
- Atomic Facts Extraction: LLM-powered structured fact extraction (subject-predicate-object)
- Category-based Organization: 6 default categories with evolving summaries
- Tiered Retrieval: Summaries first, drill down to facts when needed
- Conflict Resolution: Automatic detection and archival of contradicting facts
- Time Decay: Exponential decay of memory confidence over time
- Vector Search: pgvector-powered semantic similarity search
- Knowledge Graph: Relationship mapping between entities
- Scheduled Maintenance: Nightly, weekly, and monthly maintenance jobs
- Production Ready: Structured logging, metrics, and health checks
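Conflict resolution over atomic (subject, predicate, object) facts can be sketched as follows. This is a hedged illustration with invented names, not Kiroku's actual implementation: a new fact with the same subject and predicate but a different object supersedes the old one, which is archived rather than deleted, preserving provenance.

```python
# Sketch of conflict resolution on atomic facts (names illustrative).

def resolve_conflict(active, new_fact):
    """active: dict keyed by (subject, predicate) -> fact dict.
    Returns the archived old fact when a contradiction is found."""
    key = (new_fact["subject"], new_fact["predicate"])
    archived = None
    if key in active and active[key]["object"] != new_fact["object"]:
        # Contradiction: same subject+predicate, different object.
        archived = active[key] | {"status": "archived"}
    active[key] = new_fact
    return archived

active = {}
resolve_conflict(active, {"subject": "user", "predicate": "prefers_editor", "object": "Vim"})
old = resolve_conflict(active, {"subject": "user", "predicate": "prefers_editor", "object": "Neovim"})
```

After the second call, `active` holds the Neovim preference and `old` holds the archived Vim fact, so "user liked A before, now prefers B" is representable without losing history.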
```mermaid
flowchart TB
    subgraph KM["Kiroku Memory"]
        direction TB
        Ingest["Ingest<br/>(Raw Log)"] --> Resources[("Resources<br/>(immutable)")]
        Resources --> Extract["Extract<br/>(Facts)"]
        Extract --> Classify["Classify<br/>(Category)"]
        Classify --> Conflict["Conflict<br/>Resolver"]
        Conflict --> Items[("Items<br/>(active)")]
        Items --> Embeddings["Embeddings<br/>(pgvector)"]
        Items --> Summary["Summary<br/>Builder"]
        Embeddings --> Retrieve["Retrieve<br/>(Tiered + Priority)"]
        Summary --> Retrieve
    end
```
The easiest way to run Kiroku Memory — no Docker, no Python setup required.
Download the latest release for your platform from GitHub Releases:
| Platform | Architecture | Format |
|---|---|---|
| macOS | Apple Silicon (M1/M2/M3) | .dmg |
| macOS | Intel | .dmg |
| Windows | x86_64 | .msi |
| Linux | x86_64 | .AppImage |
- Install: Double-click the downloaded file to install
- Run: Launch "Kiroku Memory" from your applications
- Configure (Optional): Click settings icon to add your OpenAI API Key for semantic search
The Desktop App uses embedded SurrealDB — all data is stored locally with zero external dependencies.
- Zero Configuration: Works out of the box, no Docker or database setup
- Embedded Database: SurrealDB stores data in your app data directory
- Cross-Platform: Native apps for macOS, Windows, and Linux
- Same API: Full REST API available at `http://127.0.0.1:8000`
For developers who want to run from source or customize the system.
- Python 3.11+
- Docker (for PostgreSQL + pgvector) OR SurrealDB (embedded, no Docker needed)
- OpenAI API Key
New to development? See the detailed installation guide with step-by-step instructions.
```bash
# Clone the repository
git clone https://github.com/yelban/kiroku-memory.git
cd kiroku-memory

# Install dependencies using uv
uv sync

# Copy environment file
cp .env.example .env
# Edit .env and set your OPENAI_API_KEY
```

Option A: PostgreSQL (Docker)

```bash
# Start PostgreSQL with pgvector
docker compose up -d

# Start the API server
uv run uvicorn kiroku_memory.api:app --reload
# The API will be available at http://localhost:8000
```

Option B: SurrealDB (embedded, no Docker needed)

```bash
# Configure backend in .env
echo "BACKEND=surrealdb" >> .env

# Start the API server
uv run uvicorn kiroku_memory.api:app --reload
# Data stored in ./data/kiroku/
```

Verify the server is running:

```bash
# Health check
curl http://localhost:8000/health
# Expected: {"status":"ok","version":"0.1.0"}

# Detailed health status
curl http://localhost:8000/health/detailed
```

Ingest a raw message:

```bash
curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "content": "My name is John and I work at Google as a software engineer. I prefer using Neovim.",
    "source": "user:john",
    "metadata": {"channel": "chat"}
  }'
```

Extract facts from the ingested resource:

```bash
curl -X POST http://localhost:8000/extract \
  -H "Content-Type: application/json" \
  -d '{"resource_id": "YOUR_RESOURCE_ID"}'
```

This extracts structured facts like:

- `John` works at `Google` (category: facts)
- `John` is a `software engineer` (category: facts)
- `John` prefers `Neovim` (category: preferences)

Build category summaries, then retrieve:

```bash
curl -X POST http://localhost:8000/summarize

# Tiered retrieval (summaries + items)
curl "http://localhost:8000/retrieve?query=What%20does%20John%20do"

# Get context for agent prompt
curl "http://localhost:8000/context"
```

| Method | Path | Description |
|---|---|---|
| POST | `/ingest` | Ingest raw message into memory |
| GET | `/resources` | List raw resources |
| GET | `/resources/{id}` | Get specific resource |
| GET | `/retrieve` | Tiered memory retrieval |
| GET | `/items` | List extracted items |
| GET | `/categories` | List categories with summaries |
| Method | Path | Description |
|---|---|---|
| POST | `/extract` | Extract facts from resource |
| POST | `/process` | Batch process pending resources |
| POST | `/summarize` | Build category summaries |
| GET | `/context` | Get memory context for agent prompt |
| Method | Path | Description |
|---|---|---|
| POST | `/jobs/nightly` | Run nightly consolidation |
| POST | `/jobs/weekly` | Run weekly maintenance |
| POST | `/jobs/monthly` | Run monthly re-indexing |
| Method | Path | Description |
|---|---|---|
| GET | `/health` | Basic health check |
| GET | `/health/detailed` | Detailed health status |
| GET | `/metrics` | Application metrics |
| POST | `/metrics/reset` | Reset metrics |
Install launchd jobs for automatic maintenance:
```bash
bash launchd/install.sh
```

| Job | Schedule | Description |
|---|---|---|
| nightly | 03:00 daily | Decay calculation, cleanup, summaries |
| weekly | 04:00 Sunday | Archive, compress |
| monthly | 05:00 1st | Embeddings rebuild, graph rebuild |

Verify installation:

```bash
launchctl list | grep kiroku
```

Install as a Claude Code skill:

```bash
npx skills add yelban/kiroku-memory
```

Or via the plugin marketplace:

```bash
# Step 1: Add the marketplace
/plugin marketplace add https://github.com/yelban/kiroku-memory.git

# Step 2: Install the plugin
/plugin install kiroku-memory
```

Or with the install script:

```bash
# One-click install
curl -fsSL https://raw.githubusercontent.com/yelban/kiroku-memory/main/skill/assets/install.sh | bash

# Or clone and install
git clone https://github.com/yelban/kiroku-memory.git
cd kiroku-memory/skill/assets && ./install.sh
```

After installation, restart Claude Code and use:
```
/remember User prefers dark mode   # Save memory
/recall editor preferences         # Search memories
/memory-status                     # Check status
```

Features:
- Auto-load: SessionStart hook injects memory context
- Smart-save: Stop hook automatically saves important facts
- Priority ordering: preferences > facts > goals (hybrid static+dynamic weights)
- Smart truncation: Never truncates mid-category, maintains completeness
- Cross-project: Global + project-specific memory scopes
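The priority ordering and smart truncation described above can be sketched as follows. The weights and budget here are illustrative assumptions, not Kiroku's actual values; the key behavior is that a category is either included whole or skipped, never cut mid-category.

```python
# Sketch of priority-ordered, category-atomic truncation
# (illustrative weights; not Kiroku's actual implementation).

PRIORITY = {"preferences": 3, "facts": 2, "goals": 1}

def build_context(categories, budget):
    """categories: {name: text}. Include whole categories in
    priority order; skip a category rather than truncate it."""
    out, used = [], 0
    ordered = sorted(categories, key=lambda n: PRIORITY.get(n, 0), reverse=True)
    for name in ordered:
        block = f"### {name}\n{categories[name]}"
        if used + len(block) > budget:
            continue  # skip the whole category instead of cutting it
        out.append(block)
        used += len(block)
    return "\n".join(out)
```

With a tight budget, high-priority categories like preferences survive intact while an oversized low-priority category is dropped entirely, keeping the injected context well-formed.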
When hooks are working correctly, you'll see this at conversation start:
```
SessionStart:startup hook success: <kiroku-memory>
## User Memory Context
### Preferences
...
</kiroku-memory>
```
This confirms:
- ✅ SessionStart hook executed successfully
- ✅ API service is connected
- ✅ Memory context has been injected
If memory content is empty (only category headers), no memories have been stored yet. Use /remember to store manually.
Stop Hook uses a Fast + Slow dual-phase architecture:
Phase 1: Fast Path (<1s, sync)
Regex-based pattern matching for immediate capture:
| Pattern Type | Examples | Min Weighted Length |
|---|---|---|
| Preferences | `I prefer...`, `I like...` | 10 |
| Decisions | `decided to use...`, `chosen...` | 10 |
| Discoveries | `discovered...`, `found that...`, `solution is...` | 10 |
| Learnings | `learned...`, `root cause...`, `the issue was...` | 10 |
| Facts | `work at...`, `live in...` | 10 |
| No pattern | General content | 35 |
Also extracts conclusion markers from Claude's responses:
`Solution`, `Discovery`, `Conclusion`, `Recommendation`, `Root cause`
Weighted length: CJK chars × 2.5 + other chars × 1
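The fast path can be sketched as a combination of trigger patterns and the weighted-length rule. The patterns below are an abbreviated, illustrative subset, and the CJK check covers only the basic Unified Ideographs block; the real hook is more thorough.

```python
import re

# Sketch of the fast path: regex triggers plus CJK-weighted length
# (illustrative patterns, not the full set used by the hook).

PATTERNS = [r"\bI prefer\b", r"\bdecided to use\b", r"\bdiscovered\b",
            r"\blearned\b", r"\bwork at\b"]

def weighted_length(text):
    # CJK characters count 2.5x; everything else counts 1x.
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    return cjk * 2.5 + (len(text) - cjk)

def should_capture(text):
    # A pattern match lowers the length threshold from 35 to 10.
    matched = any(re.search(p, text, re.IGNORECASE) for p in PATTERNS)
    threshold = 10 if matched else 35
    return weighted_length(text) >= threshold
```

The weighting means a short CJK sentence can clear the same threshold as a longer English one, since CJK packs more meaning per character.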
Phase 2: Slow Path (5-15s, async)
Background LLM analysis using Claude CLI:
- Runs in detached subprocess (doesn't block Claude Code)
- Analyzes last 6 user + 4 assistant messages
- Extracts up to 5 memories with type/confidence
- Memory types: `discovery`, `decision`, `learning`, `preference`, `fact`

Filtered out (noise):

- Short responses: `OK`, `好的`, `Thanks`
- Questions: `What is...`, `How to...`
- Errors: `error`, `failed`
For long conversations, memories are captured incrementally during the session:
- Trigger: After each tool use, with throttling
- Throttle conditions: ≥5 min interval AND ≥10 new messages
- Offset tracking: Only analyzes new messages since last capture
- Smart skip: Skips if content too short
This distributes the capture load and ensures early conversation content isn't lost.
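The throttle rule can be sketched as follows; state field names are illustrative. Capture fires only when both conditions hold, and the offset is advanced so the next pass analyzes only newer messages.

```python
from datetime import datetime, timedelta

# Sketch of incremental-capture throttling (illustrative state fields).

MIN_INTERVAL = timedelta(minutes=5)
MIN_NEW_MESSAGES = 10

def should_capture_incremental(now, state, message_count):
    """Capture only if >=5 min elapsed AND >=10 new messages arrived."""
    if now - state["last_capture"] < MIN_INTERVAL:
        return False
    if message_count - state["last_offset"] < MIN_NEW_MESSAGES:
        return False
    # Record progress so the next run only sees newer messages.
    state["last_capture"] = now
    state["last_offset"] = message_count
    return True
```

Requiring both conditions prevents rapid tool-use bursts from triggering repeated captures while still catching long-running sessions.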
See Claude Code Integration Guide for details.
For custom MCP server integration:
```python
# memory_mcp.py
from mcp.server import Server

from kiroku_memory.db.database import get_session
from kiroku_memory.summarize import get_tiered_context

app = Server("memory-system")

@app.tool("memory_context")
async def memory_context():
    async with get_session() as session:
        return await get_tiered_context(session)
```

Configure in `~/.claude/mcp.json`:

```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["run", "python", "memory_mcp.py"]
    }
  }
}
```

A minimal chat bot integration in JavaScript:

```javascript
const MEMORY_API = "http://localhost:8000";

// Get memory context before responding
async function getMemoryContext(userId) {
  const response = await fetch(`${MEMORY_API}/context`);
  const data = await response.json();
  return data.context;
}

// Save important information after conversation
async function saveToMemory(userId, content) {
  await fetch(`${MEMORY_API}/ingest`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      content,
      source: `bot:${userId}`
    })
  });
}

// Use in your bot
const memoryContext = await getMemoryContext(userId);
const enhancedPrompt = `${memoryContext}\n\n${SYSTEM_PROMPT}`;
```

See the Integration Guide for detailed examples.
Set up cron jobs for automatic maintenance:

```bash
# Nightly: Merge duplicates, promote hot memories
0 2 * * * curl -X POST http://localhost:8000/jobs/nightly

# Weekly: Apply time decay, archive old items
0 3 * * 0 curl -X POST http://localhost:8000/jobs/weekly

# Monthly: Rebuild embeddings and knowledge graph
0 4 1 * * curl -X POST http://localhost:8000/jobs/monthly
```

Memories decay exponentially with a configurable half-life (default: 30 days):

```python
from datetime import datetime, timezone

def time_decay_score(created_at, half_life_days=30):
    # Confidence halves every half_life_days days
    age_days = (datetime.now(timezone.utc) - created_at).days
    return 0.5 ** (age_days / half_life_days)
```

| Variable | Default | Description |
|---|---|---|
| `BACKEND` | `postgres` | Backend selection: `postgres` or `surrealdb` |
| `DATABASE_URL` | `postgresql+asyncpg://...` | PostgreSQL connection string |
| `SURREAL_URL` | `file://./data/kiroku` | SurrealDB URL (`file://` for embedded) |
| `SURREAL_NAMESPACE` | `kiroku` | SurrealDB namespace |
| `SURREAL_DATABASE` | `memory` | SurrealDB database name |
| `OPENAI_API_KEY` | (required) | OpenAI API key for embeddings |
| `EMBEDDING_MODEL` | `text-embedding-3-small` | OpenAI embedding model |
| `EMBEDDING_DIMENSIONS` | `1536` | Vector dimensions |
| `DEBUG` | `false` | Enable debug mode |
```
.
├── kiroku_memory/        # Core Python package
│   ├── api.py            # FastAPI endpoints
│   ├── ingest.py         # Resource ingestion
│   ├── extract.py        # Fact extraction (LLM)
│   ├── classify.py       # Category classification
│   ├── conflict.py       # Conflict resolution
│   ├── summarize.py      # Summary generation
│   ├── embedding.py      # Vector search
│   ├── observability.py  # Metrics & logging
│   ├── db/               # Database layer
│   └── jobs/             # Maintenance jobs
├── skill/                # Claude Code Skill
│   ├── SKILL.md          # Skill documentation (EN)
│   ├── SKILL.zh-TW.md    # Traditional Chinese
│   ├── SKILL.ja.md       # Japanese
│   ├── scripts/          # Commands & hooks
│   ├── references/       # Reference docs
│   └── assets/           # Install script
├── tests/
├── docs/
├── docker-compose.yml
├── pyproject.toml
└── README.md
```
- Installation Guide - Step-by-step installation for beginners
- Architecture Design - System architecture and design decisions
- Development Journey - From idea to implementation
- User Guide - Comprehensive usage guide
- Integration Guide - Integration with chat bots and custom agents
- Claude Code Integration - Claude Code skill setup and usage
- Renaming Changelog - Project renaming history
- Language: Python 3.11+
- Framework: FastAPI + asyncio
- Database: PostgreSQL 16 + pgvector OR SurrealDB (embedded)
- ORM: SQLAlchemy 2.x / SurrealDB Python SDK
- Embeddings: OpenAI text-embedding-3-small
- Package Manager: uv
Contributions are welcome! Please read our contributing guidelines before submitting a pull request.
This project is licensed under the PolyForm Noncommercial License 1.0.0.
Free for: Personal use, academic research, non-profit organizations, evaluation.
Commercial use: Please contact yelban@gmail.com for licensing.
- Rohit (@rohit4verse) for the original "How to Build an Agent That Never Forgets" article
- MemoraX team for open-source implementation reference
- Rishi Sood for LC-OS Context Engineering papers
- The community for valuable feedback and suggestions
