Merged
212 changes: 132 additions & 80 deletions DOCS.md

Large diffs are not rendered by default.

30 changes: 15 additions & 15 deletions README.md
@@ -1,12 +1,12 @@
<img src="https://raw.githubusercontent.com/huss-mo/GroundMemory/master/_assets/icon.png" alt="om logo" width="140">
<img src="https://raw.githubusercontent.com/huss-mo/GroundMemory/master/_assets/icon.png" alt="om logo" width="140">

# GroundMemory

**Persistent, semantic memory for AI agents - mcp-native, local-first, framework-agnostic, production-ready.**

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
[![Unit Tests](https://github.com/huss-mo/GroundMemory/actions/workflows/unit-tests.yml/badge.svg)](https://github.com/huss-mo/GroundMemory/actions/workflows/unit-tests.yml)
[![Test Suite](https://img.shields.io/badge/test%20suite-330%20tests-blue.svg)](#running-the-test-suite)
[![Test Suite](https://img.shields.io/badge/test%20suite-380%20tests-blue.svg)](#running-the-test-suite)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
![GitHub repo size](https://img.shields.io/github/repo-size/huss-mo/GroundMemory)
![GitHub language count](https://img.shields.io/github/languages/count/huss-mo/GroundMemory)
@@ -103,17 +103,17 @@ Comparison reflects publicly documented features as of MAR-2026. Submit a PR if

| Feature | GroundMemory | Mem0 | Letta | memsearch | Zep |
|---|:---:|:---:|:---:|:---:|:---:|
| Zero-setup (no API key, no GPU) | ✅ | | | | |
| Local-first / offline | ✅ | | | Partial¹ | |
| Human-readable Markdown memory | ✅ | | | ✅ | |
| Structured memory tiers | ✅ | ✅² | ✅³ | | |
| Hybrid BM25 + vector search | ✅ | | | ✅ | ✅ |
| Entity relation graph | ✅ | ✅ | | | ✅ |
| MCP-native server | ✅ | Partial⁴ | Partial⁵ | | |
| Compaction hooks | ✅ | | ✅ | | |
| Temporal knowledge graph | ⁶ | | | | ✅ |
| Full agent framework | | | ✅ | | |
| Managed cloud service | | ✅ | ✅ | | ✅ |
| Zero-setup (no API key, no GPU) | ✅ | - | - | - | - |
| Local-first / offline | ✅ | - | - | Partial¹ | - |
| Human-readable Markdown memory | ✅ | - | - | ✅ | - |
| Structured memory tiers | ✅ | ✅² | ✅³ | - | - |
| Hybrid BM25 + vector search | ✅ | - | - | ✅ | ✅ |
| Entity relation graph | ✅ | ✅ | - | - | ✅ |
| MCP-native server | ✅ | Partial⁴ | Partial⁵ | - | - |
| Compaction hooks | ✅ | - | ✅ | - | - |
| Temporal knowledge graph | -⁶ | - | - | - | ✅ |
| Full agent framework | - | - | ✅ | - | - |
| Managed cloud service | - | ✅ | ✅ | - | ✅ |

¹ memsearch supports local ONNX embeddings + Milvus Lite, but requires initial model download <br/>
² Mem0 organizes memory into Conversation, Session, User, and Organizational layers <br/>
@@ -128,11 +128,11 @@ Comparison reflects publicly documented features as of MAR-2026. Submit a PR if

GroundMemory exposes 9 tools via MCP and the Python API: `memory_bootstrap`, `memory_write`, `memory_search`, `memory_get`, `memory_list`, `memory_delete`, `memory_replace_text`, `memory_replace_lines`, and `memory_relate`.

`MEMORY.md` and all `daily/*.md` files are **append-only** `memory_delete`, `memory_replace_text`, and `memory_replace_lines` enforce this and will reject edits to those files. Only `USER.md`, `AGENTS.md`, and `RELATIONS.md` are mutable.
`MEMORY.md` and all `daily/*.md` files are **append-only** - `memory_delete`, `memory_replace_text`, and `memory_replace_lines` enforce this and will reject edits to those files. Only `USER.md`, `AGENTS.md`, and `RELATIONS.md` are mutable.

**When using the MCP server**, instruct your agent to call `memory_bootstrap` at the start of every session before doing anything else. This loads the full memory context (MEMORY.md, USER.md, AGENTS.md, RELATIONS.md, daily logs) into the conversation. Clients that support the MCP Prompts primitive (Cline, Claude Desktop) can instead use the `memory_bootstrap_prompt` prompt from their Prompts panel.

**When using the Python API**, call `session.bootstrap()` and pass the result as your system prompt no tool call is needed.
**When using the Python API**, call `session.bootstrap()` and pass the result as your system prompt - no tool call is needed.

For the full tools reference including parameters, tiers, and source filters, see [DOCS.md - Tools Reference](DOCS.md#tools-reference).
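The append-only rule above can be pictured as a simple path-based guard. The following is a minimal illustrative sketch of that policy, not GroundMemory's actual enforcement code - the function and constant names are invented for this example:

```python
from fnmatch import fnmatch

# Append-only files: memory_delete / memory_replace_* must reject these.
APPEND_ONLY_PATTERNS = ["MEMORY.md", "daily/*.md"]
# Mutable files: in-place edits are permitted.
MUTABLE_FILES = {"USER.md", "AGENTS.md", "RELATIONS.md"}

def is_edit_allowed(path: str) -> bool:
    """Return True if an in-place edit or delete of *path* should be accepted."""
    if any(fnmatch(path, pattern) for pattern in APPEND_ONLY_PATTERNS):
        return False
    return path in MUTABLE_FILES

print(is_edit_allowed("MEMORY.md"))            # False - append-only
print(is_edit_allowed("daily/2026-03-01.md"))  # False - append-only
print(is_edit_allowed("USER.md"))              # True  - mutable
```

Anything outside both lists is rejected by default, which keeps the workspace surface small and predictable.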

4 changes: 2 additions & 2 deletions examples/anthropic_agent.py
@@ -1,4 +1,4 @@
"""
"""
examples/anthropic_agent.py
===========================
Minimal example of an Anthropic Claude-powered agent that uses groundmemory to
@@ -72,7 +72,7 @@
approx_tokens = sum(len(str(m.get("content", ""))) // 4 for m in messages)
if session.should_compact(approx_tokens, 200_000):
prompts = session.compaction_prompts()
print("[groundmemory] Context window approaching limit triggering memory flush.")
print("[groundmemory] Context window approaching limit - triggering memory flush.")
# Inject a compact request as a user turn (Anthropic doesn't support system mid-stream)
messages.append({"role": "user", "content": prompts["user"]})

6 changes: 3 additions & 3 deletions examples/openai_agent.py
@@ -1,4 +1,4 @@
"""
"""
examples/openai_agent.py
========================
Minimal example of an OpenAI-powered agent that uses groundmemory to persist
@@ -69,11 +69,11 @@
messages.append({"role": "user", "content": user_input})

# Check if we should compact before calling the model
# (token counts are approximate here use tiktoken for production)
# (token counts are approximate here - use tiktoken for production)
approx_tokens = sum(len(m.get("content", "") or "") // 4 for m in messages)
if session.should_compact(approx_tokens, 128_000):
prompts = session.compaction_prompts()
print("[groundmemory] Context window approaching limit triggering memory flush.")
print("[groundmemory] Context window approaching limit - triggering memory flush.")
messages.append({"role": "system", "content": prompts["system"]})
messages.append({"role": "user", "content": prompts["user"]})

4 changes: 2 additions & 2 deletions groundmemory/__init__.py
@@ -1,5 +1,5 @@
"""
groundmemory local-first, model-agnostic persistent memory for AI agents.
"""
groundmemory - local-first, model-agnostic persistent memory for AI agents.

Quick start
-----------
2 changes: 1 addition & 1 deletion groundmemory/adapters/__init__.py
@@ -1 +1 @@
"""Adapters format groundmemory tool schemas for specific LLM provider APIs."""
"""Adapters - format groundmemory tool schemas for specific LLM provider APIs."""
10 changes: 5 additions & 5 deletions groundmemory/adapters/anthropic.py
@@ -1,5 +1,5 @@
"""
Anthropic adapter converts groundmemory tool schemas into the format expected
"""
Anthropic adapter - converts groundmemory tool schemas into the format expected
by the Anthropic Messages API (``tools`` parameter).

Usage
@@ -84,8 +84,8 @@ def handle_tool_calls(
-------
(assistant_turn, tool_result_turn)
Two message dicts ready to be appended to your messages list.
``assistant_turn`` the assistant message containing tool_use blocks.
``tool_result_turn`` a user message containing all tool_result blocks.
``assistant_turn`` - the assistant message containing tool_use blocks.
``tool_result_turn`` - a user message containing all tool_result blocks.

Example
-------
@@ -182,7 +182,7 @@ def run_agent_loop(
if should_flush(used, compaction_cfg):
prompts = get_compaction_prompts(compaction_cfg)
messages.append({"role": "user", "content": prompts["user"]})
# One dedicated flush turn let the model write to memory
# One dedicated flush turn - let the model write to memory
flush_response = client.messages.create(
model=model,
max_tokens=max_tokens,
6 changes: 3 additions & 3 deletions groundmemory/adapters/openai.py
@@ -1,5 +1,5 @@
"""
OpenAI adapter converts groundmemory tool schemas into the format expected by
"""
OpenAI adapter - converts groundmemory tool schemas into the format expected by
the OpenAI Chat Completions ``tools`` parameter (function-calling API).

Usage
@@ -168,7 +168,7 @@ def run_agent_loop(
if should_flush(used, compaction_cfg):
prompts = get_compaction_prompts(compaction_cfg)
messages.append({"role": "user", "content": prompts["user"]})
# One dedicated flush turn let the model write to memory
# One dedicated flush turn - let the model write to memory
flush_response = client.chat.completions.create(
model=model,
messages=[{"role": "system", "content": prompts["system"]}] + messages,
2 changes: 1 addition & 1 deletion groundmemory/bootstrap/__init__.py
@@ -1 +1 @@
"""Bootstrap package system-prompt injection and compaction helpers."""
"""Bootstrap package - system-prompt injection and compaction helpers."""
8 changes: 4 additions & 4 deletions groundmemory/bootstrap/compaction.py
@@ -1,5 +1,5 @@
"""
Compaction helpers detect when the context window is nearly full and
"""
Compaction helpers - detect when the context window is nearly full and
provide prompts that instruct the agent to flush important information
to memory before the session is compacted.
"""
@@ -21,7 +21,7 @@
3. Updated user preferences or project state.
4. Anything the user would want remembered in a future session.

Write concisely prefer bullet points. Do NOT include information that is \
Write concisely - prefer bullet points. Do NOT include information that is \
already recorded in previous memory entries unless it has changed.\
"""

@@ -53,7 +53,7 @@ def should_flush(
The flush fires when token usage reaches the lower of two limits::

soft limit : cfg.soft_threshold_tokens
(absolute usage ceiling fire no later than this)
(absolute usage ceiling - fire no later than this)
hard limit : context_window - cfg.reserve_floor_tokens
(always keep this many tokens free for the model's reply)
"""
6 changes: 3 additions & 3 deletions groundmemory/bootstrap/injector.py
@@ -1,5 +1,5 @@
"""
Bootstrap injector builds the system-prompt string that loads an agent's
"""
Bootstrap injector - builds the system-prompt string that loads an agent's
long-term memory context at the start of a session.

Design goals
@@ -43,7 +43,7 @@ def _read_capped(path: Path, max_chars: int) -> tuple[str, bool]:

def _section(title: str, body: str, truncated: bool = False) -> str:
"""Wrap *body* in a labelled Markdown block."""
marker = " [TRUNCATED use memory_get to read the rest]" if truncated else ""
marker = " [TRUNCATED - use memory_get to read the rest]" if truncated else ""
return f"### {title}{marker}\n\n{body}\n"


116 changes: 72 additions & 44 deletions groundmemory/config/.env.example
@@ -1,4 +1,4 @@
# GroundMemory - Environment Variable Reference
# GroundMemory - Environment Variable Reference
# Copy this file to .env in your config directory and fill in your values.
#
# Config file locations (first match wins):
@@ -20,47 +20,13 @@
# Default workspace name (default: "default")
# GROUNDMEMORY_WORKSPACE=default

# ---------------------------------------------------------------------------
# MCP server (groundmemory-mcp command)
# ---------------------------------------------------------------------------

# Host address the MCP server binds to (default: 127.0.0.1 - localhost only).
# Change to 0.0.0.0 to accept connections from other machines on your network.
# See "Network access" below before changing this.
# GROUNDMEMORY_MCP__HOST=127.0.0.1

# TCP port the MCP server listens on (default: 4242)
# GROUNDMEMORY_MCP__PORT=4242

# ---------------------------------------------------------------------------
# Network access
# ---------------------------------------------------------------------------
# By default the server is localhost-only. To allow access from other devices
# on your local network, uncomment and set HOST and ALLOWED_HOSTS together.
#
# GROUNDMEMORY_MCP__HOST: bind address. Set to 0.0.0.0 for LAN access.
# GROUNDMEMORY_MCP__ALLOWED_HOSTS: DNS-rebinding protection allowlist.
# List every "host:port" value clients will use in the Host: header.
# localhost and 127.0.0.1 are always allowed implicitly.
# Separate multiple values with commas.
#
# Example - LAN access:
# GROUNDMEMORY_MCP__HOST=0.0.0.0
# GROUNDMEMORY_MCP__ALLOWED_HOSTS=192.168.1.50:4242
#
# GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS: only needed when running behind a
# reverse proxy (nginx, Caddy, Traefik). Set to the proxy's internal IP so
# uvicorn trusts the X-Forwarded-For / X-Real-IP headers it sends.
# Leave unset for direct LAN access (no proxy in front of GroundMemory).
# GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS=127.0.0.1

# ---------------------------------------------------------------------------
# Embedding provider
# ---------------------------------------------------------------------------
# PROVIDER options:
# "none" BM25 keyword search only (no embeddings, no vector search) [default]
# "openai" OpenAI-compatible HTTP API (OpenAI, Ollama, LM Studio, etc.)
# "local" sentence-transformers (requires: pip install groundmemory[local])
# "none" - BM25 keyword search only (no embeddings, no vector search) [default]
# "openai" - OpenAI-compatible HTTP API (OpenAI, Ollama, LM Studio, etc.)
# "local" - sentence-transformers (requires: pip install groundmemory[local])

GROUNDMEMORY_EMBEDDING__PROVIDER=none

@@ -79,7 +79,7 @@ GROUNDMEMORY_EMBEDDING__PROVIDER=none
# GROUNDMEMORY_EMBEDDING__BATCH_SIZE=64

# ---------------------------------------------------------------------------
# Search
# Hybrid search
# ---------------------------------------------------------------------------

# Number of results returned by memory_search
@@ -100,7 +100,7 @@ GROUNDMEMORY_EMBEDDING__PROVIDER=none
# GROUNDMEMORY_SEARCH__MMR_LAMBDA=0.0

# ---------------------------------------------------------------------------
# Chunking
# Text chunking
# ---------------------------------------------------------------------------

# Target chunk size in approximate tokens (1 token ~= 4 chars)
@@ -135,12 +135,74 @@ GROUNDMEMORY_EMBEDDING__PROVIDER=none
# ---------------------------------------------------------------------------
# Compaction (pre-context-window-flush hooks)
# ---------------------------------------------------------------------------
#
# When token usage crosses the flush threshold the adapter injects a message
# asking the agent to call memory_write for anything worth keeping, before the
# LLM provider silently drops or summarises old messages.
#
# Flush fires when: current_tokens >= min(soft_threshold_tokens, context_window_tokens - reserve_floor_tokens)

# Enable compaction detection hooks
# GROUNDMEMORY_COMPACTION__ENABLED=true

# Tokens remaining in context window that trigger a flush suggestion
# GROUNDMEMORY_COMPACTION__SOFT_THRESHOLD_TOKENS=4000
# Total token capacity of the model being used.
# Used to derive the hard flush limit together with reserve_floor_tokens.
# GROUNDMEMORY_COMPACTION__CONTEXT_WINDOW_TOKENS=128000

# Flush when this many tokens have been consumed in the context window
# (counted from zero - this is token usage, not tokens remaining).
# GROUNDMEMORY_COMPACTION__SOFT_THRESHOLD_TOKENS=64000

# Always keep this many tokens free for the model's reply.
# Hard flush limit = context_window_tokens - reserve_floor_tokens.
# GROUNDMEMORY_COMPACTION__RESERVE_FLOOR_TOKENS=32000

# Minimum tokens always kept free for model responses
# GROUNDMEMORY_COMPACTION__RESERVE_FLOOR_TOKENS=20000
# Messages injected at the flush turn (override if you need custom wording)
# GROUNDMEMORY_COMPACTION__SYSTEM_PROMPT=Session nearing compaction. Store durable memories now.
# GROUNDMEMORY_COMPACTION__USER_PROMPT=Review the conversation and write lasting facts to memory using memory_write. Reply DONE when finished.
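# How the three thresholds above combine: a minimal sketch of the documented
# formula (flush fires at min(soft_threshold, context_window - reserve_floor)).
# This is illustrative Python, not GroundMemory's internal implementation:

```python
def flush_threshold(context_window: int, soft_threshold: int, reserve_floor: int) -> int:
    """Token-usage level at which a memory flush fires (the lower of two limits)."""
    hard_limit = context_window - reserve_floor  # always leave room for the reply
    return min(soft_threshold, hard_limit)

# With the defaults documented above (128k window, 64k soft, 32k reserve),
# the soft threshold wins:
print(flush_threshold(128_000, 64_000, 32_000))  # 64000

# With a smaller model window, the hard limit dominates instead:
print(flush_threshold(40_000, 64_000, 32_000))   # 40000 - 32000 = 8000
```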

# ---------------------------------------------------------------------------
# Optional tools
# ---------------------------------------------------------------------------

# Expose the memory_list tool (lists workspace files with sizes and line counts).
# Disabled by default - enable only if your agent/client needs it.
# GROUNDMEMORY_EXPOSE_MEMORY_LIST=false

# Enable dispatcher mode: replaces all four core tools with a single memory_tool
# dispatcher. Useful for clients/agents that handle a unified action+args interface.
# In dispatcher mode, only memory_tool is registered (not the four core tools).
# GROUNDMEMORY_DISPATCHER_MODE=false
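# Dispatcher mode can be pictured as one entry point that routes an action
# name plus arguments to the matching handler. The sketch below uses invented
# stand-in handlers - the real memory_tool signature and behavior may differ:

```python
# Hypothetical handlers standing in for the real tool implementations.
def _write(text: str) -> str:
    return f"wrote: {text}"

def _search(query: str) -> str:
    return f"results for: {query}"

HANDLERS = {"memory_write": _write, "memory_search": _search}

def memory_tool(action: str, **args):
    """Single dispatcher: route *action* with keyword *args* to a handler."""
    handler = HANDLERS.get(action)
    if handler is None:
        raise ValueError(f"unknown action: {action}")
    return handler(**args)

print(memory_tool("memory_search", query="deploy steps"))  # results for: deploy steps
```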

# ---------------------------------------------------------------------------
# MCP server (groundmemory-mcp command)
# ---------------------------------------------------------------------------

# Host address the MCP server binds to (default: 127.0.0.1 - localhost only).
# Change to 0.0.0.0 to accept connections from other machines on your network.
# See "Network access" below before changing this.
# GROUNDMEMORY_MCP__HOST=127.0.0.1

# TCP port the MCP server listens on (default: 4242)
# GROUNDMEMORY_MCP__PORT=4242

# --- Network access (disabled by default) ---
#
# By default the server is localhost-only. To allow access from other devices
# on your local network, uncomment and set HOST and ALLOWED_HOSTS together.
#
# GROUNDMEMORY_MCP__HOST: bind address. Set to 0.0.0.0 for LAN access.
# GROUNDMEMORY_MCP__ALLOWED_HOSTS: DNS-rebinding protection allowlist.
# List every "host:port" value clients will use in the Host: header.
# localhost and 127.0.0.1 are always allowed implicitly.
# Separate multiple values with commas.
#
# Example - LAN access:
# GROUNDMEMORY_MCP__HOST=0.0.0.0
# GROUNDMEMORY_MCP__ALLOWED_HOSTS=192.168.1.50:4242
#
# GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS: only needed when running behind a
# reverse proxy (nginx, Caddy, Traefik). Set to the proxy's internal IP so
# uvicorn trusts the X-Forwarded-For / X-Real-IP headers it sends.
# Leave unset for direct LAN access (no proxy in front of GroundMemory).
# GROUNDMEMORY_MCP__FORWARDED_ALLOW_IPS=127.0.0.1