
🤖 fix: remove redundant advisor pending-call handoff #3423

Merged
ThomasK33 merged 1 commit into main from debug-console-xhjn on May 29, 2026

@ThomasK33

Member

Summary

Removes the redundant serialized advisor({"question": ...}) pending tool-call block from advisor handoff messages, so debug-console advisor requests show the advisor question only once.

Background

The debug LLM request modal renders backend-captured messages verbatim. Advisor handoff construction already includes the question as prose, so the extra pending-tool-call block duplicated the same content.

Implementation

  • Removed advisor-only pending tool call formatting from buildAdvisorHandoffMessage.
  • Preserved the question, current-step commentary, and current-step reasoning sections.
  • Updated advisor tool tests to assert the redundant block is absent.
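
As a rough illustration of the change listed above, the handoff builder after the removal might look like the sketch below. Only the name `buildAdvisorHandoffMessage`, the snapshot's `input`, and the three section labels come from this PR. The `AdvisorSnapshotSketch` type, its `commentary` and `reasoning` field names, the section separator, and the `null` return for the no-context case are assumptions made for illustration, not the actual code in `advisor.ts`.

```typescript
// Hypothetical snapshot shape; the real AdvisorToolCallSnapshot may differ.
interface AdvisorSnapshotSketch {
  input: { question?: string | null };
  commentary?: string | null; // assumed field name
  reasoning?: string | null; // assumed field name
}

// Returns the trimmed value, or null when it is missing or blank.
function nonEmpty(value: string | null | undefined): string | null {
  const trimmed = value?.trim() ?? "";
  return trimmed.length > 0 ? trimmed : null;
}

function buildAdvisorHandoffMessage(
  question: string | null | undefined,
  snapshot: AdvisorSnapshotSketch | null | undefined,
): string | null {
  const sections: string[] = [];

  const prose = nonEmpty(question);
  if (prose !== null) sections.push(`**Question:** ${prose}`);

  const commentary = nonEmpty(snapshot?.commentary);
  if (commentary !== null) {
    sections.push(`**Current-step commentary:**\n${commentary}`);
  }

  const reasoning = nonEmpty(snapshot?.reasoning);
  if (reasoning !== null) {
    sections.push(`**Current-step reasoning:**\n${reasoning}`);
  }

  // The "**Pending tool call:**" section is no longer appended here.
  // Early return preserved: with no sections there is nothing to hand off.
  if (sections.length === 0) return null;
  return sections.join("\n\n");
}
```

The snapshot is still accepted as a parameter because it remains the source of the same-step commentary and reasoning; only its serialized `input` stops being echoed.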

Validation

  • bun test src/node/services/tools/advisor.test.ts
  • make typecheck
  • make static-check

Risks

Low. The change is scoped to advisor handoff message formatting and leaves generic debug rendering and advisor snapshot capture untouched.


📋 Implementation Plan

Plan: Remove redundant advisor pending-call text from debug-console payload

Goal

Stop the advisor-model invocation/debug LLM request from repeating the same advisor question as a serialized pending tool call:

**Pending tool call:**
advisor({"question":"..."})

The question should remain available once as prose in the advisor handoff, along with any useful same-step context.

Verified repo context

  • The repeated text is generated in backend code, not by a debug-console React renderer.
  • src/node/services/tools/advisor.ts builds the synthetic advisor handoff message:
    • buildAdvisorHandoffMessage(question, snapshot) appends **Question:** ....
    • The same function currently appends **Pending tool call:**\nadvisor(${JSON.stringify(snapshot.input)}) whenever an advisor snapshot exists.
    • formatPendingToolCall(input) exists only for that serialized pending-call block.
  • src/browser/components/DebugLlmRequestModal/DebugLlmRequestModal.tsx renders captured snapshot.messages with JSON.stringify(..., null, 2); it is showing the backend-generated message verbatim rather than adding tool-call formatting itself.
  • src/node/services/aiService.ts freezes an AdvisorToolCallSnapshot for toolName === "advisor" during streaming and passes it into the advisor tool runtime via takeToolCallSnapshot(toolCallId).
  • src/common/utils/tools/toolDefinitions.ts defines the advisor input schema as only question?: string | null, so the serialized advisor({"question":"..."}) block normally carries no additional user-visible information beyond the prose question.
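
To make the duplication concrete, here is a minimal sketch that reconstructs the pre-change formatting from the description above and counts how often the question text appears. The body of `formatPendingToolCall` is inferred from the quoted template string, and `handoffBefore`, `handoffAfter`, `countOccurrences`, and the sample question are hypothetical names and values introduced only for this example.

```typescript
// Input shape per the advisor schema described above: only an optional question.
type AdvisorInput = { question?: string | null };

// Reconstruction of the pre-change block (assumed from the quoted template).
function formatPendingToolCall(input: AdvisorInput): string {
  return `**Pending tool call:**\nadvisor(${JSON.stringify(input)})`;
}

// Counts non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  return needle.length === 0 ? 0 : haystack.split(needle).length - 1;
}

// Simplified handoff before the change: prose question plus serialized call.
function handoffBefore(question: string): string {
  return [`**Question:** ${question}`, formatPendingToolCall({ question })].join("\n\n");
}

// Simplified handoff after the change: prose question only.
function handoffAfter(question: string): string {
  return `**Question:** ${question}`;
}

const sampleQuestion = "Should we cache the result?";
console.log(countOccurrences(handoffBefore(sampleQuestion), sampleQuestion)); // 2
console.log(countOccurrences(handoffAfter(sampleQuestion), sampleQuestion)); // 1
```

Because the schema carries nothing but `question`, the serialized block adds a second copy of the same text and no new information.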

Recommended approach

Remove the advisor-only pending tool-call block from the advisor handoff.

Net product LoC estimate: -6 to -8 LoC.

  1. In src/node/services/tools/advisor.ts:
    • Delete formatPendingToolCall(input) if it becomes unused.
    • Delete the if (snapshot != null) { ... sections.push("**Pending tool call:**...") } branch inside buildAdvisorHandoffMessage.
    • Keep the existing **Question:**, **Current-step commentary:**, and **Current-step reasoning:** sections unchanged.
    • Preserve the current early return where no handoff is added if question, same-step commentary, and same-step reasoning are all absent.
  2. Do not change generic tool rendering, the debug modal, or advisor snapshot capture. The snapshot is still useful for same-step commentary/reasoning.

Rejected narrower alternative

A conditional helper could suppress the block only when snapshot.input.question equals the normalized prose question and retain it for malformed or future extra fields. That is about +8 to +12 product LoC and is more complex than needed because the current advisor schema only supports question. If advisor inputs later gain more fields, revisit the handoff format then.
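
For contrast, the rejected conditional helper could have looked roughly like the sketch below. The function name `shouldIncludePendingToolCall`, the use of `trim()` as the normalization step, and the treatment of a missing question as "nothing to show" are all assumptions for illustration; the PR did not implement any of this.

```typescript
// Loosely typed on purpose so malformed or future inputs can be represented.
type AdvisorInput = Record<string, unknown>;

// Hypothetical helper: keep the serialized block only when it carries
// something the prose question does not already show.
function shouldIncludePendingToolCall(
  input: AdvisorInput,
  proseQuestion: string | null,
): boolean {
  // Any field other than "question" is extra information worth showing.
  const hasExtraFields = Object.keys(input).some((key) => key !== "question");
  if (hasExtraFields) return true;

  const raw = input["question"];
  if (raw === undefined || raw === null) return false; // nothing to show
  if (typeof raw !== "string") return true; // malformed question value

  // Same text after trimming means the block would be a pure duplicate.
  return raw.trim() !== (proseQuestion ?? "").trim();
}
```

Even in this small form the helper adds branching that the current single-field schema never exercises, which is the complexity argument made above.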

Implementation steps and quality gates

  1. Update advisor handoff construction

    • Edit only src/node/services/tools/advisor.ts.
    • Verify by inspection that no other references to formatPendingToolCall remain.
    • Quality gate: product diff should be a deletion-only/small surgical change in advisor handoff code.
  2. Update unit coverage

    • In src/node/services/tools/advisor.test.ts, update the existing test around the full advisor handoff with question + same-step snapshot context.
    • Expected handoff should still include:
      • the prose **Question:** ... section,
      • current-step commentary,
      • current-step reasoning.
    • Expected handoff should no longer include:
      • **Pending tool call:**,
      • advisor({"question":...}).
    • Prefer updating the existing behavioral test and adding not.toContain assertions over creating a tautological prose-only test.
    • Quality gate: targeted unit test fails before the implementation change and passes after it.
  3. Run validation

    • Targeted: bun test src/node/services/tools/advisor.test.ts
    • Type/static confidence: make typecheck
    • If nearby changes or failures suggest broader impact, run make test or make static-check before declaring the work done.
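
The expectations in step 2 can be sketched as a small checker. In the real test file these would presumably be `toContain` and `not.toContain` assertions under `bun test`; the version below uses a plain function instead so it stays self-contained. The name `checkHandoff` and the sample handoff strings are hypothetical.

```typescript
// Returns a list of violations; an empty list means the handoff matches the plan.
function checkHandoff(handoff: string): string[] {
  const mustContain = [
    "**Question:**",
    "**Current-step commentary:**",
    "**Current-step reasoning:**",
  ];
  const mustNotContain = ["**Pending tool call:**", 'advisor({"question":'];

  const problems: string[] = [];
  for (const fragment of mustContain) {
    if (!handoff.includes(fragment)) problems.push(`missing: ${fragment}`);
  }
  for (const fragment of mustNotContain) {
    if (handoff.includes(fragment)) problems.push(`unexpected: ${fragment}`);
  }
  return problems;
}

// Sample handoff in the expected post-change shape (made-up content).
const fixedHandoff = [
  "**Question:** Which migration path is safer?",
  "**Current-step commentary:**\nComparing both options.",
  "**Current-step reasoning:**\nOption A avoids a schema change.",
].join("\n\n");

// Sample handoff in the pre-change shape, with the redundant block appended.
const oldHandoff =
  fixedHandoff +
  '\n\n**Pending tool call:**\nadvisor({"question":"Which migration path is safer?"})';

console.log(checkHandoff(fixedHandoff).length); // 0
console.log(checkHandoff(oldHandoff).length); // 2
```

Checking absence on the same full-context fixture, rather than on a question-only one, is what lets the test fail before the implementation change and pass after it.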

Dogfooding plan

Skill context read for this plan: dogfood, agent-browser, and dev-server-sandbox.

  1. Prepare the dogfood session and evidence directory

    • Use the direct agent-browser binary, never npx agent-browser.
    • Before executing browser automation, load the live CLI workflow content so commands match the installed version:
      • agent-browser skills get core
      • If driving the Electron desktop shell rather than a plain browser URL, also load agent-browser skills get electron.
    • Create an output directory such as ./dogfood-output/advisor-pending-call/ with screenshots/ and videos/ subdirectories, and keep all artifacts instead of deleting/restarting mid-session.
  2. Start an isolated Mux dev environment

    • Prefer the sandboxed backend/Vite flow to avoid touching the user’s normal Mux state: make dev-server-sandbox.
    • If the run must start from an empty project/provider state, use make dev-server-sandbox DEV_SERVER_SANDBOX_ARGS="--clean-providers --clean-projects".
    • Record the ports/URLs the sandbox prints (BACKEND_PORT, VITE_PORT). Note that the sandbox copies provider/project config when available but intentionally does not copy secrets.json.
    • Open the served UI URL with agent-browser --session advisor-pending-call open <target-url> when a browser target is sufficient; if a desktop shell is required, use the agent-browser Electron workflow after the sandbox server is running.
  3. Orient and capture baseline evidence

    • Wait for page load/network idle, then capture an annotated initial screenshot and interactive snapshot:
      • agent-browser --session advisor-pending-call screenshot --annotate ./dogfood-output/advisor-pending-call/screenshots/initial.png
      • agent-browser --session advisor-pending-call snapshot -i
    • Check browser console/errors before testing the scenario so unrelated failures are visible.
  4. Exercise the behavior with recorded repro evidence

    • Start a short video before reproducing the advisor/debug-console flow:
      • agent-browser --session advisor-pending-call record start ./dogfood-output/advisor-pending-call/videos/advisor-debug-request.webm
    • In a workspace, trigger an advisor call with a clear question.
    • Open the debug LLM request/debug console view for the advisor model request.
    • Inspect the rendered messages payload.
    • Use human-paced interactions for the recording, with screenshots at key steps: before triggering advisor, after opening debug console, and at the final messages payload.
  5. Verify and capture the final state

    • Final screenshot must show the advisor prose **Question:** ... section and no **Pending tool call:** block in the advisor model request payload.
    • If same-step commentary/reasoning appears, include it in the evidence to show those useful context sections were preserved.
    • Stop the video with agent-browser --session advisor-pending-call record stop.
    • Save console/error output from the session in the dogfood notes; failures unrelated to this change should be called out separately, not hidden.

Acceptance criteria

  • Advisor handoff/debug request payload no longer contains the redundant **Pending tool call:**\nadvisor({"question":"..."}) section for normal advisor invocations.
  • The advisor question remains present once as prose.
  • Same-step commentary and reasoning context remain present when available.
  • Existing no-context behavior is preserved: no extra handoff message is added when there is no question, commentary, or reasoning.
  • Generic debug console rendering and non-advisor tool behavior are untouched.
  • Targeted advisor tests pass, and typecheck/static validation is clean or any blocker is reported with exact failure details.

Generated with mux • Model: openai:gpt-5.5 • Thinking: xhigh • Cost: 4.70

Removes the serialized advisor(...) pending tool call from the advisor handoff while preserving the prose question and same-step context.

@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. Another round soon, please!


@ThomasK33 ThomasK33 added this pull request to the merge queue May 29, 2026
Merged via the queue into main with commit 87c88f1 May 29, 2026
24 checks passed
@ThomasK33 ThomasK33 deleted the debug-console-xhjn branch May 29, 2026 15:34