CyClaw

Offline-first, RAG-enforced, soul-governed personal AI assistant (no internet required!) Version 1.4.0 (production) · Next 1.5.0 (planning) · Python 3.12 · LM Studio + ChromaDB + BM25 + LangGraph

[Screenshots of the local AI web interface]

What It Does

CyClaw is a personal RAG (Retrieval-Augmented Generation) backend that:

Answers questions exclusively from your local Markdown corpus — no internet by default
Enforces every safety invariant via LangGraph topology — not prompts, not config flags, not discipline
Maintains a persistent soul/personality layer (soul.md) with SHA-256 drift detection, atomic evolution writes, and user-gated modification
Falls back to Grok (xAI) only with explicit user confirmation in hybrid mode — triple-gated at config, env, and per-query level
Exposes both a FastAPI HTTP gateway and an MCP server for Claude Desktop / Copilot Studio integration

Zero telemetry. Binds to 127.0.0.1:8787 only. All embeddings run locally via sentence-transformers. No cloud dependency for offline operation.

Version History

Version	Status	Key Changes
v1.2.0	Superseded	8 OWASP patterns, 90-day TTL, sanitizer baseline
v1.3.0	Pre-Langgrinch	Rate limiting (60/min), 13 OWASP patterns, soul SHA-256 drift detection, atomic writes, TTL→365 days
v1.4.0	Production (current)	Updated requirements.txt to patch vulns and modernize for Python 3.12
v1.5.0	Planning	Fix stemmer.py, SQL write placeholder code sections, other cleanups, test Dropbox corpus sync integration, BM25 SHA integrity detection

Architecture

User Query (HTTP POST /query or MCP tool call)
         │
         ▼
    ┌─────────────────────────────────────────────────────┐
    │  gate.py  (FastAPI, 127.0.0.1:8787)                 │
    │  • Rate limit (60 req/min per IP — RUNS FIRST)      │
    │  • Injection filter (sanitizer.py, config-driven)   │
    │  • Soul init (PersonalityManager closure)           │
    │  • Telemetry kill block (before any SDK import)     │
    └──────────────────┬──────────────────────────────────┘
                       │
                       ▼
    ┌─────────────────────────────────────────────────────┐
    │  graph.py  (LangGraph 7-node State Machine)         │
    │                                                     │
    │  [ENTRY]                                            │
    │     ↓                                               │
    │  1. retrieve  (Chroma + BM25 + RRF fusion)          │
    │     ↓                                               │
    │  2. route_score  (top_score >= 0.028 RRF?)          │
    │     ├─ YES ──→ 3. local_llm (LM Studio :1234)       │
    │     └─ NO  ──→ 4. user_gate (needs_confirm=true)    │
    │                    ├─ confirmed + hybrid ──→        │
    │                    │      5. grok_fallback          │
    │                    └─ declined / offline ──→        │
    │                           6. offline_best_effort    │
    │     ↓ (all paths converge)                          │
    │  7. audit_logger (SHA-256 + PII redact → jsonl)     │
    │     ↓                                               │
    │  [END]                                              │
    └─────────────────────────────────────────────────────┘
                       │
                       ▼
    ┌─────────────────────────────────────────────────────┐
    │  HybridRetriever  (retrieval/hybrid_search.py)      │
    │  • ChromaDB  (semantic, all-MiniLM-L6-v2, 384d)     │
    │  • BM25Okapi (keyword, Porter stemming)             │
    │  • RRF fusion (k=60, equal 1.0/1.0 weighting)       │
    │  • Per-chunk provenance metadata in every result    │
    └─────────────────────────────────────────────────────┘
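
The audit_logger step in the diagram stores a SHA-256 digest of the query plus PII-redacted metadata as one JSONL line. A minimal sketch of that idea, assuming a hypothetical record layout and a single illustrative e-mail rule (the project's real fields and redaction patterns live in utils/logger.py and utils/sanitizer.py and may differ):

```python
import hashlib
import json
import re

# Hypothetical redaction rule for illustration; not the project's pattern set.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(text: str) -> str:
    """Replace e-mail addresses with a fixed placeholder."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def audit_record(query: str, route: str, top_score: float,
                 metadata: dict[str, str]) -> str:
    """Build one JSONL line. The raw query text is never stored, only its digest."""
    entry = {
        "query_sha256": hashlib.sha256(query.encode("utf-8")).hexdigest(),
        "route": route,          # e.g. "local_llm" or "offline_best_effort"
        "top_score": top_score,  # fused RRF score of the best chunk
        "metadata": {key: redact_pii(value) for key, value in metadata.items()},
    }
    return json.dumps(entry, sort_keys=True)


line = audit_record("what is RRF?", "local_llm", 0.031,
                    {"note": "asked by chris@example.com"})
```

Hashing keeps the log useful for spotting repeated queries without retaining their text.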

Five security invariants enforced by graph edges — not prompts:

#	Invariant	Enforcement
1	RAG-First	`retrieve` is the unconditional graph entry point — no LLM call can precede it
2	Topology = Policy	Routing is graph edges, not LLM decisions or if/else code
3	Triple-Gated External	Grok requires: `mode=hybrid` AND `grok.enabled=true` AND `user_confirmed_online=true` — simultaneously
4	Audit Convergence	All 6 execution paths converge at `audit_logger` — no shortcut path exists
5	Soul Governance	Soul evolution requires explicit human reason string; no autonomous modification from any path
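
Invariants 2 and 3 can be read as a small decision table. The sketch below is plain Python rather than the project's graph.py: it collapses the user_gate step into one function of the kind a conditional edge might call, and the names GateState, grok_allowed, and next_node are hypothetical.

```python
from dataclasses import dataclass

MIN_SCORE = 0.028  # retrieval.min_score from config.yaml


@dataclass(frozen=True)
class GateState:
    mode: str                    # app.mode: "offline" or "hybrid"
    grok_enabled: bool           # grok.enabled
    user_confirmed_online: bool  # per-query confirmation from the user


def grok_allowed(state: GateState) -> bool:
    """All three gates must be open at the same time."""
    return (state.mode == "hybrid"
            and state.grok_enabled
            and state.user_confirmed_online)


def next_node(top_score: float, state: GateState) -> str:
    """Return the node that follows retrieval for a given top RRF score."""
    if top_score >= MIN_SCORE:
        return "local_llm"
    return "grok_fallback" if grok_allowed(state) else "offline_best_effort"
```

Under these assumptions, closing any single gate sends a low-score query to offline_best_effort, so no single setting can enable the external call on its own.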

Quick Start

Prerequisites

Requirement	Version	Notes
Python	3.12	Primary supported runtime (3.11 also works)
LM Studio	Any	Must be running on `localhost:1234`
GGUF model loaded in LM Studio	—	`mistral-7b-instruct` or `qwen2.5-7b` work well

Install

git clone https://github.com/CGFixIT/CyClaw
cd CyClaw
python3.12 -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# 1) Install CPU-only torch first (pinned >=2.6.0 for CVE-2025-32434 safety)
pip install torch==2.6.0+cpu --index-url https://download.pytorch.org/whl/cpu

# 2) Install the rest, pinned to the verified transitive tree.
pip install -r requirements.txt -c constraints.txt

Detailed Setup Guide

Upgrading from a pre-1.4.0 checkout? ChromaDB moved from 0.4.x to 1.5.x and the on-disk index format changed — delete index/ and rebuild with python -m retrieval.indexer.

Offline note: embeddings use all-MiniLM-L6-v2. Because cyclaw_telemetry_kill.env sets HF_HUB_OFFLINE=1, the model must be cached locally first. On a machine with network, run the indexer once (it downloads + caches the model); afterwards it runs fully offline.

Configure

Key settings in config.yaml:

app:
  mode: "offline"          # "offline" | "hybrid" (hybrid enables Grok fallback)

models:
  local_llm:
    base_url: "http://127.0.0.1:1234/v1"   # LM Studio default
    model: "your-model-name-here"           # must match LM Studio loaded model name exactly
    timeout_sec: 720                        # long-context inference budget
    max_tokens: 5000

personality:
  enabled: true
  soul_path: "data/personality/soul.md"    # your identity file — source of truth
  interaction_ttl_days: 365               # audit window

retrieval:
  min_score: 0.028          # RRF fused-rank threshold (NOT cosine sim — different scale)
  top_k_semantic: 5
  top_k_keyword: 5
  rrf_k: 60
  max_context_tokens: 5000

Project Structure

CyClaw/
├── gate.py                     FastAPI gateway + soul endpoints
├── graph.py                    LangGraph 7-node state machine
├── mcp_hybrid_server.py        MCP server (retrieval-only, no LLM)
├── metrics.py                  Audit JSONL analyzer
├── config.yaml                 Single source of truth for all config
├── requirements.txt            Pinned Python deps
├── cyclaw_telemetry_kill.env  Kill-switch for LangChain/Chroma/OTel telemetry
├── cyclaw_suggestions_fix.md  Dev notes and open issues
├── .gitignore
├── old.md                      Archived prior README
├── llm/
│   └── client.py               LocalLLMClient + GrokClient
├── retrieval/
│   ├── embeddings.py           sentence-transformers wrapper
│   ├── hybrid_search.py        ChromaDB + BM25 + RRF fusion
│   ├── indexer.py              Corpus ingestion + index build
│   └── stemmer.py              Porter stemmer (tech-vocabulary tuned)
├── schemas/
│   └── api.py                  Pydantic request/response models
├── utils/
│   ├── errors.py               Typed RAGError hierarchy
│   ├── health.py               Startup dependency health checks
│   ├── logger.py               Audit JSONL + SHA-256 query hashing
│   ├── personality.py          PersonalityManager (soul CRUD + governance)
│   └── sanitizer.py            Prompt injection filter + PII redaction
├── static/
│   ├── extractor.html          Browser-based simplified version of insight_extractor.py for generating .md corpus files
│   └── terminal.html           Browser UI / Soul Console
├── data/
│   ├── corpus/                 .md / .txt knowledge base (gitignored runtime content)
│   └── personality/
│       └── soul.md             Identity source-of-truth
└── tests/
    ├── conftest.py
    ├── test_gate.py
    ├── test_graph.py
    ├── test_hybrid_search.py
    ├── test_sanitizer.py
    ├── test_personality.py
    ├── test_personality_changes.py
    ├── test_rate_limit.py
    ├── test_audit.py
    ├── test_stemmer.py
    ├── apipsTest.ps1           Windows PowerShell smoke test
    └── cmd2index.bat           Windows index rebuild shortcut

Soul / Personality Layer

CyClaw maintains a persistent identity through soul.md. Key properties:

File-as-truth: data/personality/soul.md is always the canonical version
Shadow SQLite DB: cyclaw_soul.db stores version history and interaction logs
SHA-256 drift detection: on startup, file hash vs. DB hash — mismatch triggers forensic log entry
Atomic writes: backup → atomic disk write (tmp file + os.replace) → DB version insert → in-memory update; the os.replace is what makes a crash unable to leave a half-written soul.md
Advisory injection scan on propose: POST /soul/propose runs an OWASP injection scan whose flags are advisory — surfaced for human review alongside the diff; propose never writes
Enforced injection scan on apply: POST /soul/apply is human-gated (explicit reason string required) and re-runs the injection scan at the write boundary — a proposed soul containing injection patterns is rejected with 400 PROMPT_INJECTION_BLOCKED before any file/DB write, closing the soul-poisoning vector. The trusted restore path (restore_from_backup, re-applying a previously vetted .bak) bypasses the scan via scan=False
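
The drift check and the tmp file + os.replace write described in this list can be sketched with the standard library alone. This is an illustration under assumptions, not the code in utils/personality.py: sha256_of, atomic_write, and has_drifted are hypothetical names, and the backup and SQLite steps are left out.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex digest of the file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def has_drifted(path: Path, recorded_hash: str) -> bool:
    """True when the file on disk no longer matches the recorded hash."""
    return sha256_of(path) != recorded_hash


def atomic_write(path: Path, text: str) -> None:
    """Write to a temp file in the same directory, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_name, path)  # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)     # never leave a stray temp file behind
        raise
```

The temp file is created next to the target because os.replace is only atomic within one filesystem. A reader therefore sees either the old soul.md or the new one, never a partial file.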

Security Model

Layer	Mechanism
Network	Binds `127.0.0.1:8787` — no external exposure by design
Input	Config-driven injection filter (`policy.prompt_filter`, 31 patterns), 4000 char max
Rate limit	60 req/min per IP — thread-safe in-memory sliding window (`utils/ratelimit.py`, lock-guarded)
Telemetry	Kill block runs before any SDK import in `gate.py`
Audit	All paths (HTTP and MCP) log SHA-256 query hash + PII-redacted metadata
Grok gating	Triple gate: `mode=hybrid` AND `grok.enabled=true` AND `user_confirmed_online=true`
Soul writes	Enforced injection scan at the write boundary (`apply_evolution`, → `400 PROMPT_INJECTION_BLOCKED`) + human reason string + atomic (`os.replace`) crash-safe write
Corpus	Chunk sanitization at index time via `sanitizer.py`
Model Weights	Trusted/verified sources only. Safetensors strongly preferred. `torch.load(..., weights_only=True)` alone was insufficient on torch<2.6 (CVE-2025-32434). We pin torch==2.6.0+cpu and keep loading paths (embeddings.py) minimal + documented.
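
The rate-limit row describes a lock-guarded, in-memory sliding window. One way such a limiter could look, as a sketch under assumptions rather than the contents of utils/ratelimit.py (the class name, method name, and injectable clock are hypothetical):

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window_sec` span."""

    def __init__(self, limit: int = 60, window_sec: float = 60.0,
                 clock=time.monotonic):
        self.limit = limit
        self.window_sec = window_sec
        self._clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # client IP -> request timestamps
        self._lock = threading.Lock()

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        with self._lock:  # guard the shared dict across worker threads
            hits = self._hits[client_ip]
            # Drop timestamps that have slid out of the window.
            while hits and now - hits[0] >= self.window_sec:
                hits.popleft()
            if len(hits) >= self.limit:
                return False
            hits.append(now)
            return True
```

A gateway would call allow() with the caller's IP before any other processing and reject the request when it returns False, which matches the "runs first" ordering shown in the architecture diagram.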

MCP Server

For Claude Desktop or other MCP-compatible clients:

{
  "mcpServers": {
    "cyclaw": {
      "command": "python",
      "args": ["/path/to/CyClaw/mcp_hybrid_server.py"]
    }
  }
}

The MCP server exposes a single hybrid_search tool. It has no sampling capability — sampling: null is set at the protocol level, making it architecturally impossible for this server to invoke an LLM.

Status & Roadmap

What works in v1.4.0:

RAG-first pipeline (ChromaDB + BM25 + RRF)
FastAPI /query with LangGraph 7-node controller
Local LLM via LM Studio
Optional Grok fallback (triple-gated)
MCP server (retrieval-only)
Audit JSONL with SHA-256 hashing and PII redaction
Soul persistence with drift detection and atomic writes
Rate limiting (60/min per IP)
Browser UI via static/terminal.html

v1.5.0 targets:

Dropbox/cloud corpus sync
plan_node for multi-step query decomposition
BM25 index SHA-256 integrity check on load
General-purpose agent (tool invocation from corpus context)

Not yet planned:

Multi-user or network-exposed deployment
Production security hardening (external pentest)

Built by Chris Grady · cgfixit.com/linkedin
