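CacheGuard (缓存卫士): a drop-in proxy that sits between any coding agent and DeepSeek

CacheGuard compares each turn's request prefix byte by byte, pinpoints on the spot which field broke the DeepSeek server-side prefix-cache hit, and brings a silently doubled bill back down to its original price. You do not need to switch agents; just put one layer in front.

CacheGuard is a drop-in proxy that sits between any coding agent (a Cursor-style harness, Codex, or your own in-house harness) and the DeepSeek API. On every turn it fingerprints the system + history prefix of each /chat/completions request and compares it byte by byte with the previous turn. If the prefix has changed since the previous turn (the root cause of the DeepSeek server-side context-cache discount lapsing and the bill silently doubling), the terminal immediately prints a red alert naming the field and the byte that broke the cache. v0.1 only detects and reports; it never modifies your payload.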
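The byte-by-byte comparison described above can be sketched as follows. This is a minimal illustration, not CacheGuard's actual implementation: the names `Fingerprint` and `FirstDiff`, the choice of SHA-256, and the plain-text prefix layout in `main` are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Fingerprint returns a hex SHA-256 digest of the serialized prefix bytes.
// (Hypothetical helper; the hash CacheGuard uses is not stated in this README.)
func Fingerprint(prefix []byte) string {
	sum := sha256.Sum256(prefix)
	return hex.EncodeToString(sum[:])
}

// FirstDiff returns the index of the first byte at which curr stops matching
// prev, or -1 if curr still begins with all of prev (the old prefix is intact).
func FirstDiff(prev, curr []byte) int {
	n := len(prev)
	if len(curr) < n {
		n = len(curr)
	}
	for i := 0; i < n; i++ {
		if prev[i] != curr[i] {
			return i
		}
	}
	if len(curr) < len(prev) {
		return len(curr) // curr was cut short: it diverges where it ends
	}
	return -1
}

func main() {
	prev := []byte("system: be concise\nuser: hi\n")
	grown := []byte("system: be concise\nuser: hi\nassistant: hello\n")
	mutated := []byte("system: be concise at 02:06\nuser: hi\n")

	fmt.Println("fingerprint:", Fingerprint(prev))
	fmt.Println("grown history, first diff:", FirstDiff(prev, grown))
	fmt.Println("mutated system, first diff:", FirstDiff(prev, mutated))
}
```

A fingerprint is cheap to store per turn, while the byte offset from the direct comparison is what makes the alert actionable.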

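Architecture

Architecture: Coding Agent → CacheGuard proxy (fingerprint → diff → lint → report) → DeepSeek API

CacheGuard is a single binary running as a single process, with no external dependencies other than the DeepSeek upstream. The proxy forwards traffic with net/http/httputil and never rewrites the request body (it only clones it to read the prefix). Streaming responses (SSE) pass through unchanged. Any parse error degrades to plain pass-through plus a one-line warning, so CacheGuard never blocks your agent.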

Quick start

From git clone to your first report in 10 minutes or less:

# 1. Fetch the source and build the single binary
git clone https://github.com/SuperMarioYL/cacheguard.git
cd cacheguard
go build -o cacheguard ./cmd/cacheguard
# or simply: go install github.com/SuperMarioYL/cacheguard/cmd/cacheguard@latest

# 2. Start the proxy in front of DeepSeek, listening on localhost:8788
./cacheguard proxy --upstream https://api.deepseek.com

# 3. Point your agent's base_url at CacheGuard; the api_key is passed through as before
#    (Cursor / Codex / in-house harnesses need no other configuration changes)
export OPENAI_BASE_URL=http://localhost:8788

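Use your agent to write code as usual. CacheGuard silently records the prefix fingerprint of every turn, and the moment the prefix is broken the terminal prints a red alert. When the session ends, run cacheguard report to see the hit rate and how many times over you paid: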

./cacheguard report

Usage

There are two subcommands: proxy (intercept + lint) and report (render a session).

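# Start a reverse proxy in front of DeepSeek. Every flag is optional; the values
# shown for --upstream, --listen, and --log-dir are the defaults.
#   --upstream  upstream DeepSeek base URL
#   --listen    local listen address
#   --log-dir   directory where session JSONL logs are written
#   --config    optional config file (flags override the file)
#   --no-color  disable colored alerts
cacheguard proxy \
  --upstream https://api.deepseek.com \
  --listen   :8788 \
  --log-dir  ~/.cacheguard \
  --config   ./cacheguard.yaml \
  --no-color

# Render a session report: cache-hit rate / list of mutation events / estimated billing multiplier
cacheguard report                          # renders the most recent session by default
cacheguard report --session ~/.cacheguard/<id>.jsonl   # render a specific session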

When the prefix is broken, CacheGuard immediately prints an alert like the one below. It always starts with the literal PREFIX MUTATION: so that it is easy to grep:

PREFIX MUTATION: system[0] byte 142 differs — cache hit voided (injected timestamp: "2026-06-23T02:06:00")
  fix: remove the live timestamp from the prefix (or move it past the cached boundary)
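Because the alert starts with a fixed literal, it can also be picked out of captured terminal output programmatically. The sketch below assumes that every alert follows the shape of the single sample above (`PREFIX MUTATION: <field> byte <offset> differs`); that generalisation, the regular expression, and the names `Mutation` and `ParseAlert` are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Mutation holds the field name and byte offset extracted from one alert line.
type Mutation struct {
	Field  string
	Offset int
}

// alertRE matches the start of an alert line shaped like the sample above.
// Anything after "differs" is ignored.
var alertRE = regexp.MustCompile(`^PREFIX MUTATION: (\S+) byte (\d+) differs`)

// ParseAlert returns the parsed mutation and true if line looks like an alert.
func ParseAlert(line string) (Mutation, bool) {
	m := alertRE.FindStringSubmatch(line)
	if m == nil {
		return Mutation{}, false
	}
	offset, err := strconv.Atoi(m[2])
	if err != nil {
		return Mutation{}, false
	}
	return Mutation{Field: m[1], Offset: offset}, true
}

func main() {
	lines := []string{
		"PREFIX MUTATION: system[0] byte 142 differs",
		"  fix: remove the live timestamp from the prefix",
	}
	for _, line := range lines {
		if mut, ok := ParseAlert(line); ok {
			fmt.Printf("field=%s offset=%d\n", mut.Field, mut.Offset)
		}
	}
}
```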

Output of cacheguard report:

CacheGuard session report
────────────────────────────────────────────────────
  session      2026-06-23T02-06-19
  turns        3
  cache-hit    95.0%  (2/2 warm turns)
  upstream     94.7%  (from DeepSeek prompt_cache_hit/miss tokens)
  billing      1.05x prompt-token cost vs. an all-hits session
────────────────────────────────────────────────────
  ✓ no prefix mutations — cache stayed warm.

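Demo

The whole demo runs in under 30 seconds: start the proxy → inject a timestamp into system[0] on turn 3 → CacheGuard flags the exact offending field in red → cacheguard report shows the hit rate recovering from 0% to 95%, along with the savings multiplier.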

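Why you need it

DeepSeek is an MoE model that discounts prompt tokens on server-side context-cache hits, but the discount is byte-exact. A live timestamp added to the system prompt, a single reordering of tools[], or a history whose head has been trimmed is enough to wipe out the discount on the whole prefix. The bill silently doubles, and your agent reports no error at all.

The cache-hit fields returned by DeepSeek only tell you that the cache was missed, not which field or which byte broke the prefix. CacheGuard's entire value is locating that root cause so you can fix it.
No need to switch agents. This is not yet another agent (which would mean replacing your existing harness) but a thin drop-in proxy: Cursor, Codex, and in-house harnesses stay untouched; you just put one layer in front.
Never takes the blame. v0.1 only detects and reports. It never rewrites your payload automatically to force a cache hit; how to fix the offending field is your decision.
Non-blocking. Every parse failure degrades to plain pass-through, so your agent always keeps working.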
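One of the failure modes described above, a reordered tools[] array, can be told apart from other changes with a simple check. This is a hedged sketch under the assumption that tools are compared by name only; the function `Reordered` is hypothetical, and a real linter would likely compare the full serialized tool definitions.

```go
package main

import (
	"fmt"
	"sort"
)

// Reordered reports whether curr holds exactly the same tool names as prev
// but in a different order. Identical lists and lists with different members
// both return false.
func Reordered(prev, curr []string) bool {
	if len(prev) != len(curr) {
		return false
	}
	identical := true
	for i := range prev {
		if prev[i] != curr[i] {
			identical = false
			break
		}
	}
	if identical {
		return false
	}
	// Compare sorted copies so the callers' slices are left untouched.
	a := append([]string(nil), prev...)
	b := append([]string(nil), curr...)
	sort.Strings(a)
	sort.Strings(b)
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	turn1 := []string{"read_file", "write_file", "run_tests"}
	turn2 := []string{"write_file", "read_file", "run_tests"}
	fmt.Println("tools reordered:", Reordered(turn1, turn2))
}
```

The content is semantically the same in both turns, yet the serialized bytes differ, which is why a byte-exact cache treats the reordering as a miss.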

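Configuration

CacheGuard works with zero configuration. Every key in the table below has a sensible default; CLI flags override the config file, and the config file overrides the built-in defaults. See configs/cacheguard.example.yaml for a complete example.

Key	Default	Description
`proxy.upstream`	`https://api.deepseek.com`	Upstream DeepSeek base URL (v0.1 supports DeepSeek only)
`proxy.listen`	`:8788`	Local listen address; point your agent's base_url here
`proxy.color`	`true`	Colored "PREFIX MUTATION" terminal alerts
`report.log_dir`	`~/.cacheguard`	One flat JSONL log per session
`report.cache_discount_factor`	`0.1`	Price factor for cache-hit prefix tokens (roughly 10× cheaper on DeepSeek), used to estimate the billing multiplier
`rules.injected_timestamp`	`true`	A live timestamp was injected into the cached segment (the number one culprit)
`rules.reordered_tools`	`true`	tools[] was reordered or changed between two turns
`rules.rewritten_system`	`true`	system[0] was rewritten between two turns
`rules.trimmed_history`	`true`	Early messages were trimmed, invalidating the cache tail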
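Roadmap

m1 · Pass-through proxy: the local reverse proxy works end to end, and the agent reaches DeepSeek through CacheGuard with zero change in behaviour.
m2 · Prefix lint: prefix fingerprinting + per-turn diff + rule engine, pinpointing the offending field and the first differing byte in real time.
m3 · Session report: cache-hit rate, list of mutation events, and estimated billing multiplier, written to JSONL and rendered by cacheguard report.
Adapters for the Kimi / GLM context-cache primitives (stretch).
Hosted team edition: upgrades the single-machine report into a team-level cache-hit-rate dashboard plus bill-regression alerts (see "Paid / Team edition" below).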

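Paid / Team edition

The self-hosted open-source version is free forever: drop-in proxy + linter + single-machine report. This repository contains all of it.

Once the bill has bitten you and you are a small team running DeepSeek across several people or repositories, a single-machine JSONL report cannot show who is burning the money. The hosted team edition upgrades it to:

A cache-hit-rate dashboard broken down per agent and per repository (multi-user, with retention);
Bill-regression alerts pushed to a Feishu / Slack / DingTalk webhook when the prefix hit rate drops below a threshold;
Team-level prefix-stability policies: lint in CI to catch whose harness has broken the cache again.

Pricing (initial reference): ¥99 per seat per month, or ¥499 per month including 5 seats plus a monthly bill-regression report. Migration cost is minimal: point your local proxy at the hosted endpoint. A 7-day trial lets you try the team dashboard before subscribing on your own. Want an early seat? Leave your email in Issues to join the waitlist.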

License

This project is open source under the MIT License; see LICENSE for details.
