39 changes: 36 additions & 3 deletions docs/rfc/codex-backend-phase2-feasibility.md
@@ -1,8 +1,8 @@
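# Codex Backend: Phase 2 Feature Feasibility Analysis

> **Status**: Feasibility analysis (includes live probes against codex 0.141.0, 2026-06-21)
> **Status**: Feasibility analysis (includes live probes against codex 0.141.0, 2026-06-21); **embedded_context implemented** (§6)
> **Prerequisites**: `codex-backend.md` v2 + `codex-backend-validation.md` (Phase 1 delivered in PR #2216)
> **Scope**: Evaluation of landing three claude-exclusive UX features on the codex backend. **This document contains no implementation code.**
> **Scope**: Evaluation of landing three claude-exclusive UX features on the codex backend.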

---

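@@ -90,6 +90,39 @@ After Phase 1, the codex and kiro Feature maps are **bit-for-bit identical** (7 entries), three
> Each feature should ship as its own PR (independently revertible). If the underlying plumbing for passthrough/embedded_context is made generic, kiro benefits as well (it is also a non-replay backend).

## 5. Live Probe Records (2026-06-21, codex 0.141 + gpt-5.5 @ Bedrock)
- embedded_context: `@context.txt` was read via the shell tool (not inlined natively) → leaning toward a structured mention
- embedded_context: see the expanded probe matrix in §6
- passthrough: `clientUserMessageId` round-trips verbatim as `item.clientId`; a mid-turn `turn/steer` successfully returns `{turnId}`, and the clientId of both slots is echoed back.
- askuser: the `requestUserInput` response schema was not captured (deferred as a prerequisite task).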

---

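## 6. embedded_context Implementation (2026-06-21, Option A)

### 6.1 Decisive Probe Matrix

The initial judgment in §1 ("leaning toward a structured mention") was **overturned** by further probes. Structured mentions are not pre-inlined:

| Probe | Sandbox | Result |
|---|---|---|
| Standalone `mention` UserInput (cwd-relative path) | read-only | ❌ Not resolved ("unknown") |
| Standalone `mention` (absolute path) | read-only | ❌ Not resolved |
| `text`+`text_elements`+`mention` (the real TUI wire format) | read-only | ❌ "can't determine without reading the file" |
| **Plain `text` containing `@path`** | workspace-write | ✅ codex parses `@path` and issues a `commandExecution` to read the file (agentic) |

**Conclusion**: codex's `@path` is not claude-style static inlining. It **relies on the agentic shell tool to read the file**, and it only takes effect when the sandbox permits the read. A structured `mention` UserInput does not help inlining under 0.141 (every probe came back unresolved), so mention entries are **not sent**: doing so would be speculative complexity that my own probes could not show to be beneficial.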
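
Because resolution depends on the sandbox, a caller may want to know which `@path` tokens are at risk of going unread. The following is a minimal sketch of that idea, not part of this PR: the names `atPathRe`, `atPaths`, and `unreadableMentions` are hypothetical, the regular expression is only a rough approximation (codex's own `@path` parser was not inspected), and the read-only rule simply mirrors the probe results above.

```go
package main

import (
	"fmt"
	"regexp"
)

// atPathRe is a rough, hypothetical pattern: an "@" at the start of the
// prompt or after whitespace, followed by a run of non-space characters.
// It is an assumption for illustration, not codex's actual parser.
var atPathRe = regexp.MustCompile(`(^|\s)@([^\s@]+)`)

// atPaths returns the path part of every @path token found in prompt.
func atPaths(prompt string) []string {
	var out []string
	for _, m := range atPathRe.FindAllStringSubmatch(prompt, -1) {
		out = append(out, m[2])
	}
	return out
}

// unreadableMentions returns the @path tokens that, per the probe matrix,
// codex would likely leave unread under the given sandbox mode.
func unreadableMentions(prompt, sandbox string) []string {
	if sandbox != "read-only" {
		return nil
	}
	return atPaths(prompt)
}

func main() {
	const prompt = "summarize @docs/design.md and @src/main.go please"
	fmt.Println(unreadableMentions(prompt, "read-only"))
	fmt.Println(unreadableMentions(prompt, "workspace-write"))
}
```

A helper like this could drive a dashboard hint, but it would only be advisory: the actual decision to read the file is made by codex at runtime.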

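### 6.2 Landing (Minimal Correct Change)

- `internal/cli/backend/profile_codex.go`: `Features["embedded_context"] = true`, with an honest comment explaining the semantic difference (agentic read vs. static inlining, dependent on the sandbox).
- **No protocol code changes**: `@path` already passes through with the `text` into the text UserInput of `turn/start` (`CodexProtocol.WriteMessage` writes the text as-is). The dashboard's `featureForCurrent('embedded_context')` gate (dashboard.js:4136) only requires that "the backend can read file paths from inside the prompt", which codex satisfies.
- Tests: `TestCodexProtocol_WriteMessage_AtMentionVerbatim` (`@path` enters the text UserInput verbatim) + a profile_test assertion that `embedded_context=true` while askuser and passthrough remain false.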

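### 6.3 Honest Differences from claude

| | claude | codex |
|---|---|---|
| Mechanism | CLI statically inlines file content into the prompt | Agentic read via the shell tool |
| Pure conversation / read-only sandbox | ✅ Always inlines | ⚠️ Cannot read (needs sandbox permission) |
| naozhi-side code | Pure passthrough | Pure passthrough (zero file reads, zero new security surface) |

This is weaker than claude's static guarantee, but it matches the dashboard contract and adds zero security surface. If a claude-style strong guarantee is wanted in the future, Option B (server-side inlining in naozhi) is available, but it would introduce security surface around path traversal, size limits, and workspace confinement, so it is left to a separate RFC.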
20 changes: 17 additions & 3 deletions internal/cli/backend/profile_codex.go
@@ -42,18 +42,32 @@ func codexProfile() Profile {
		// thread/tokenUsage/updated; there is no USD figure on the wire.
		// Dashboard cost cells render unitless with a "tokens" suffix.
		CostUnit: "tokens",
		// RFC §5 phase1 conservative values (validated 2026-06-21):
		// Feature values (validated 2026-06-21; embedded_context 2026-06-21 phase2):
		//   - askuser: requestUserInput reverse request not yet card-ified (phase2)
		//   - passthrough: turn/steer not yet wired to /urgent (phase2)
		//   - embedded_context: @file mention not yet plumbed (phase1)
		//   - embedded_context: @file mention works, but via a DIFFERENT
		//     mechanism than claude. claude statically inlines the file
		//     content into the prompt; codex does NOT. The `@path` rides
		//     through verbatim in the turn/start text UserInput and codex
		//     reads the file agentically with its shell tool. Verified
		//     2026-06-21: codex parses `@path` from plain prompt text and
		//     issues a commandExecution to read it. The dashboard gate
		//     (dashboard.js featureForCurrent('embedded_context')) only needs
		//     the backend to "read file paths from inside the prompt", which
		//     codex satisfies. Caveat: resolution depends on the runtime
		//     sandbox permitting the read (codex default is workspace-write,
		//     which does); a read-only sandbox would leave the file unread.
		//     This is honestly a weaker guarantee than claude's static inline,
		//     but matches the dashboard contract and needs zero file-reading
		//     code in naozhi (no new path-traversal / size-cap surface).
		//   - image_input: codex responses accepts data: URL images (gpt-5.x path)
		//   - audio_input: no direct audio
		//   - mcp_http: codex supports HTTP MCP servers
		//   - mcp_sse: not supported
		Features: map[string]bool{
			"askuser":          false,
			"passthrough":      false,
			"embedded_context": false,
			"embedded_context": true,
			"image_input":      true,
			"audio_input":      false,
			"mcp_http":         true,
13 changes: 13 additions & 0 deletions internal/cli/backend/profile_test.go
@@ -248,6 +248,19 @@ func TestRegisterDefaults_RegistersClaudeAndKiro(t *testing.T) {
		if len(codex.RequiredNodeCaps) != 1 || codex.RequiredNodeCaps[0] != "codex-app-server" {
			t.Errorf("codex RequiredNodeCaps = %v; want [\"codex-app-server\"]", codex.RequiredNodeCaps)
		}
		// embedded_context: codex reads @path from inside the prompt
		// (agentically via shell), satisfying the dashboard gate's contract.
		// Phase-2 enabled 2026-06-21.
		if !codex.Features["embedded_context"] {
			t.Error("codex Features[embedded_context] = false; want true (codex reads @path from prompt)")
		}
		// image_input + mcp_http supported; the rest stay phase-1 false.
		if !codex.Features["image_input"] || !codex.Features["mcp_http"] {
			t.Error("codex should support image_input + mcp_http")
		}
		if codex.Features["askuser"] || codex.Features["passthrough"] {
			t.Error("codex askuser/passthrough must stay false (phase-2 pipelines not built)")
		}
	})
}

31 changes: 31 additions & 0 deletions internal/cli/protocol_codex_test.go
@@ -165,6 +165,37 @@ func TestCodexProtocol_WriteMessage_TurnStart(t *testing.T) {
	}
}

// TestCodexProtocol_WriteMessage_AtMentionVerbatim pins the embedded_context
// mechanism: codex reads @path agentically from the prompt text, so the
// `@path` token MUST survive verbatim into the turn/start text UserInput
// (no stripping/rewriting). The dashboard only forwards the @-mention when
// the backend declares embedded_context=true (profile_codex.go); this test
// guarantees the wire side keeps the token intact for codex to act on.
func TestCodexProtocol_WriteMessage_AtMentionVerbatim(t *testing.T) {
	t.Parallel()
	p := &CodexProtocol{}
	p.storeThreadID("t-2")
	var w bytes.Buffer
	const msg = "summarize @docs/design.md and @src/main.go please"
	if err := p.WriteMessage(&w, msg, nil); err != nil {
		t.Fatalf("WriteMessage error: %v", err)
	}
	var req struct {
		Params struct {
			Input []struct {
				Type string `json:"type"`
				Text string `json:"text"`
			} `json:"input"`
		} `json:"params"`
	}
	if err := json.Unmarshal(w.Bytes(), &req); err != nil {
		t.Fatalf("turn/start not valid JSON: %v", err)
	}
	if len(req.Params.Input) != 1 || req.Params.Input[0].Text != msg {
		t.Errorf("@-mention text not preserved verbatim; got %+v want %q", req.Params.Input, msg)
	}
}

func TestCodexProtocol_WriteInterrupt(t *testing.T) {
	t.Parallel()
	p := &CodexProtocol{}