npm package: `mcp-memory-gateway`. Install with `npx mcp-memory-gateway init`.
Solo dev? Free tier has everything you need. Team or multi-repo? Pro syncs prevention rules across machines and team members. $49 one-time.
Thumbs down a mistake. It never happens again.
The safety net for vibe coding. Give your AI agent a thumbs-down and it auto-generates a prevention rule. Give a thumbs-up and it reinforces good behavior. Pre-action gates physically block the agent before it repeats a known mistake. It is a reliability layer for one sharp agent, without another planner or swarm.
Honest disclaimer: this is not RLHF weight training. ThumbGate is context engineering plus enforcement. Feedback becomes searchable memory, prevention rules, and gates that block known-bad actions before they execute.
Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.
Live Demo Dashboard | Landing Page | Verification Evidence
Most memory tools only help an agent remember. ThumbGate also enforces.
The problem without it:
BEFORE: Agent force-pushes to main. You correct it. Next session, it force-pushes again.
With ThumbGate (mcp-memory-gateway):
AFTER: Gate blocks the force-push before it executes. Agent can't repeat the mistake.
- `recall` injects the right context at session start.
- `search_lessons` shows promoted lessons plus the corrective action, lifecycle state, linked rules, linked gates, and the next harness fix the system should make.
- `search_rlhf` searches raw RLHF state across feedback logs, ContextFS memory, and prevention rules.
- Pre-action gates physically block tool calls that match known failure patterns.
- Session handoff and primer keep continuity across sessions without adding an extra orchestrator.
Free and self-hosted users can invoke search_lessons directly through MCP, and via the CLI with npx mcp-memory-gateway lessons.
```text
$ npx mcp-memory-gateway serve
[gate] Blocked: git push --force (rule: no-force-push, confidence: 0.94)
[gate] Passed: git push origin feature-branch
```
```bash
# One-command install: auto-detects your agent
npx mcp-memory-gateway init

# Or add the MCP server directly
claude mcp add rlhf -- npx -y mcp-memory-gateway serve
codex mcp add rlhf -- npx -y mcp-memory-gateway serve
amp mcp add rlhf -- npx -y mcp-memory-gateway serve
gemini mcp add rlhf "npx -y mcp-memory-gateway serve"

# Wire PreToolUse enforcement hooks
npx mcp-memory-gateway init --agent claude-code
npx mcp-memory-gateway init --agent codex
npx mcp-memory-gateway init --agent gemini

# Health check and inspect lessons
npx mcp-memory-gateway doctor
npx mcp-memory-gateway lessons
npx mcp-memory-gateway dashboard
```

1. You give feedback → 👎 "Force-pushed and lost commits"
2. ThumbGate validates → Rejects vague signals, promotes actionable ones
3. Rules auto-generate → "Block git push --force to protected branches"
4. Gates enforce → PreToolUse hook fires → BLOCKED before execution
5. Agent improves → Same mistake never happens again

Pipeline: Capture → Validate → Remember → Distill → Prevent → Gate → Export
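
The validate and distill stages of this pipeline can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ThumbGate's implementation: the feedback object shape, the `isActionable` word-count heuristic, and the promotion threshold of 2 are all hypothetical.

```javascript
// Hypothetical sketch: validate feedback, then promote repeated failures to rules.
// None of these names come from mcp-memory-gateway; they are illustrative only.

const PROMOTION_THRESHOLD = 2; // assumed: failures per tag before a rule is generated

// Treat feedback as actionable only if its context has at least three words.
function isActionable(feedback) {
  const words = (feedback.context || "").trim().split(/\s+/).filter(Boolean);
  return words.length >= 3;
}

// Split feedback into accepted entries and a rejection ledger with reasons.
function validate(feedbackList) {
  const accepted = [];
  const rejected = [];
  for (const fb of feedbackList) {
    if (isActionable(fb)) accepted.push(fb);
    else rejected.push({ feedback: fb, reason: "context too vague" });
  }
  return { accepted, rejected };
}

// Count thumbs-down feedback per tag and emit a block rule once a tag repeats.
function distillRules(accepted) {
  const counts = new Map();
  for (const fb of accepted) {
    if (fb.signal !== "down") continue;
    for (const tag of fb.tags || []) counts.set(tag, (counts.get(tag) || 0) + 1);
  }
  const rules = [];
  for (const [tag, count] of counts) {
    if (count >= PROMOTION_THRESHOLD) {
      rules.push({ id: `no-${tag}`, tag, action: "block", evidence: count });
    }
  }
  return rules;
}

const sampleFeedback = [
  { signal: "down", context: "Force-pushed and lost commits", tags: ["force-push"] },
  { signal: "down", context: "bad", tags: ["force-push"] },
  { signal: "down", context: "Force-pushed to main again", tags: ["force-push"] },
  { signal: "up", context: "Ran tests before committing", tags: ["testing"] },
];

const { accepted, rejected } = validate(sampleFeedback);
const rules = distillRules(accepted);
console.log(rules, rejected.length);
```

In this sketch the one-word "bad" signal lands in the rejection ledger, and the two remaining force-push failures cross the assumed threshold and yield a single block rule.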
Gates are the enforcement layer. They do not ask the agent to cooperate; they physically block the action.
```text
Agent tries git push --force
  → PreToolUse hook fires
  → gates-engine checks rules
  → BLOCKED: no force pushes to protected branches
```
Built-in gates:
- `push-without-thread-check`: block push if PR threads unresolved
- `force-push`: block `git push --force` to protected branches
- `protected-branch-push`: block direct pushes to main/master
- `package-lock-reset`: block destructive lock file changes
- `env-file-edit`: block edits to `.env` files with secrets
Define custom gates in config/gates/custom.json.
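
To make the gate check concrete, here is a minimal sketch of pattern-based matching against a proposed tool call. The rule shape (`id`, `pattern`, `message`) and the `checkGates` function are assumptions for illustration; the actual schema of `config/gates/custom.json` and the logic in `scripts/gates-engine.js` may differ.

```javascript
// Hypothetical gate rules; the real custom.json schema may look different.
const gates = [
  {
    id: "force-push",
    pattern: /\bgit\s+push\b.*(--force\b|\s-f\b)/,
    message: "no force pushes to protected branches",
  },
  {
    id: "env-file-edit",
    pattern: /(^|\/)\.env(\.|$)/,
    message: "no edits to .env files",
  },
];

// Decide whether a proposed tool call may run, before it executes.
function checkGates(toolCall, rules = gates) {
  // Match against the shell command or the target file path, whichever is set.
  const subject = toolCall.command || toolCall.filePath || "";
  for (const rule of rules) {
    if (rule.pattern.test(subject)) {
      return { allowed: false, gate: rule.id, reason: rule.message };
    }
  }
  return { allowed: true };
}

console.log(checkGates({ command: "git push --force" }));
// { allowed: false, gate: 'force-push', reason: 'no force pushes to protected branches' }
console.log(checkGates({ command: "git push origin feature-branch" }));
// { allowed: true }
```

How a blocked decision is reported back to the agent depends on that agent's hook protocol, so the sketch stops at returning the decision.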
| Actually works | Does not work |
|---|---|
| `recall` injects past context into the next session | Thumbs up/down changing model weights |
| `session_handoff` and `session_primer` preserve continuity | Agents magically remembering what happened last session |
| `search_lessons` exposes corrective actions, lifecycle state, linked rules, linked gates, and next harness fixes | Feedback stats automatically improving behavior by themselves |
| Pre-action gates block known-bad tool calls before execution | Agents self-correcting without context injection or gates |
| Auto-promotion turns repeated failures into warn/block rules | Calling this "RLHF" in the strict training sense |
| Rejection ledger shows why vague feedback was rejected | Vague signals silently helping the system |
| Tool | Purpose |
|---|---|
| `capture_feedback` | Accept up/down signal + context, validate, promote to memory |
| `recall` | Recall relevant past failures and rules for the current task |
| `search_lessons` | Search promoted lessons with corrective action, lifecycle state, rules, gates |
| `search_rlhf` | Search raw RLHF state across feedback logs, ContextFS, and rules |
| `prevention_rules` | Generate prevention rules from repeated mistakes |
| `enforcement_matrix` | Inspect promotion rate, active gates, and rejection ledger |
| `feedback_stats` | Approval rate and failure-domain summary |
| `estimate_uncertainty` | Bayesian uncertainty estimate for risky tags |
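
One common way to turn thumbs-up/down counts into a Bayesian uncertainty estimate is a Beta posterior. The sketch below assumes a uniform Beta(1, 1) prior; the `betaUncertainty` name and return shape are hypothetical, and `estimate_uncertainty` may compute something different.

```javascript
// Hypothetical sketch: Beta posterior over a tag's approval rate.
// Assumes a uniform Beta(1, 1) prior; not taken from mcp-memory-gateway.
function betaUncertainty(up, down) {
  const alpha = up + 1; // prior pseudo-count plus observed thumbs-up
  const beta = down + 1; // prior pseudo-count plus observed thumbs-down
  const n = alpha + beta;
  const mean = alpha / n; // posterior mean approval rate
  const variance = (alpha * beta) / (n * n * (n + 1)); // Beta variance formula
  return { mean, stdDev: Math.sqrt(variance) };
}

// No evidence yet: mean 0.5 with a wide spread.
console.log(betaUncertainty(0, 0));
// Mostly negative feedback: low mean, and the spread narrows as evidence grows.
console.log(betaUncertainty(1, 9));
console.log(betaUncertainty(10, 90));
```

A tag with a low posterior mean and a small standard deviation is one where the evidence for "risky" is strong; a wide standard deviation signals that more feedback is needed before trusting the estimate.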
Lean install for recall + gates + lesson search only:
```bash
RLHF_MCP_PROFILE=essential claude mcp add rlhf -- npx -y mcp-memory-gateway serve
```

Free and self-hosted users can invoke `search_lessons` directly through MCP to inspect the corrective action for each lesson. For broader retrieval across feedback logs, ContextFS memory, and prevention rules, use `search_rlhf` through MCP or the authenticated `GET /v1/search` API.
Phone-safe read-only surface for remote ops:
```bash
RLHF_MCP_PROFILE=dispatch claude mcp add rlhf -- npx -y mcp-memory-gateway serve
npx mcp-memory-gateway dispatch
```

Guide: docs/guides/dispatch-ops.md
| Feature | ThumbGate | SpecLock | Mem0 | .cursorrules |
|---|---|---|---|---|
| Blocks mistakes before execution | Yes (PreToolUse gates) | Yes (Patch Firewall) | No | No |
| Learns from your feedback | Yes (thumbs up/down) | No (manual spec writing) | Yes (auto-capture) | No |
| Works across sessions | Yes (SQLite + JSONL) | Yes (encrypted store) | Yes (cloud) | No (per-project) |
| Auto-generates rules | Yes (from repeated failures) | No (manual or Gemini compile) | No | No |
| Agent support | Claude Code, Codex, Gemini, Amp, Cursor, OpenCode | Claude Code, Cursor, Windsurf, Cline, Bolt.new | Claude, Cursor | Cursor only |
| Install | `npx mcp-memory-gateway init` | `npx speclock setup` | Cloud signup | Edit file |
| Cost | Free (Pro $49 for teams) | Free | Free tier + paid | Free |
| npm weekly downloads | 724 | 98 | N/A | N/A |
When to use ThumbGate: You want your agent to learn from mistakes automatically and enforce what it learned. One thumbs-down creates a gate.
When to use SpecLock: You have a written spec/PRD and want to lock specific sections from AI modification. Manual constraint authoring.
When to use Mem0: You want cloud-hosted memory shared across apps. No enforcement.
- Node.js `>=18.18.0`
- Module system: CommonJS CLI/server runtime
- Primary entry points: CLI, MCP stdio server, authenticated HTTP API, OpenAPI adapters
- MCP stdio: adapters/mcp/server-stdio.js
- HTTP API: src/api/server.js
- OpenAPI surfaces: openapi/openapi.yaml, adapters/chatgpt/openapi.yaml
- CLI: `npx mcp-memory-gateway ...`
- Local memory: JSONL logs in `.claude/memory/feedback` or `.rlhf/*`
- Lesson DB (v0.8.0): SQLite + FTS5 full-text search via `better-sqlite3`, dual-written alongside JSONL. Indexed by signal, domain, tags, importance. Replaces linear Jaccard token-overlap with sub-millisecond ranked search.
- Corrective actions (v0.8.0): On negative feedback, `capture_feedback` returns `correctiveActions[]`, the top 3 remediation steps inferred from similar past failures by tag/domain overlap.
- Context assembly: ContextFS packs and provenance logs
- Default retrieval path: SQLite FTS5 (primary) with JSONL Jaccard fallback
- Semantic/vector lane: LanceDB + Apache Arrow + local embeddings via Hugging Face Transformers
- MemAlign-inspired dual recall: Principle-based memory (distilled rules) + episodic context (raw feedback with timestamps). Recall surfaces both lanes ranked by relevance.
- Thompson Sampling: Bayesian multi-armed bandit over feedback tags. It adapts gate sensitivity per failure domain based on observed positive/negative signal ratios.
- Corrective action inference: On negative feedback, the lesson DB infers top-3 remediation steps from similar past failures by tag/domain overlap.
- Bayesian belief update: Each memory carries a posterior belief that updates on new evidence; high-entropy contradictions are auto-pruned.
- PreToolUse enforcement: `scripts/gates-engine.js`
- Hook wiring: `init --agent claude-code|codex|gemini`
- Browser automation / ops: `playwright-core`
- Social analytics store: `better-sqlite3`
- Billing: Stripe
- Hosted API / landing page: Railway
- Worker lane: Cloudflare Workers in `workers/`
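
The Jaccard token-overlap fallback mentioned in the list above can be illustrated with a short sketch. It assumes lowercase alphanumeric tokenization and plain-string lessons; the function names are hypothetical, and the project's fallback may tokenize and score differently.

```javascript
// Hypothetical sketch of Jaccard token-overlap ranking over lesson text.
// Tokenization and data shapes are assumptions, not the project's code.

// Lowercase the text and collect its unique alphanumeric tokens.
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const token of a) {
    if (b.has(token)) intersection += 1;
  }
  return intersection / (a.size + b.size - intersection);
}

// Score every lesson against the query, drop non-matches, sort best first.
function rankLessons(query, lessons) {
  const q = tokenize(query);
  return lessons
    .map((text) => ({ text, score: jaccard(q, tokenize(text)) }))
    .filter((result) => result.score > 0)
    .sort((x, y) => y.score - x.score);
}

console.log(
  rankLessons("force push", [
    "Never force push to main",
    "Run tests before commit",
    "push after review",
  ])
);
```

Because every lesson is tokenized and scored on each query, this approach is linear in the number of lessons, which is the cost the FTS5 index is described as avoiding.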
For autonomous agent runs against this or any repo using this workflow:
- WORKFLOW.md β scope, proof-of-work, hard stops, done criteria
- .github/ISSUE_TEMPLATE/ready-for-agent.yml β bounded intake template
- .github/pull_request_template.md β proof-first PR handoff
$49 one-time: hosted dashboard, priority support, commercial license.
MIT. See LICENSE.