🤖 fix: reveal tail plans in hyper density #3420

Merged
ThomasK33 merged 1 commit into main from ui-density-87xq on May 29, 2026

@ThomasK33 ThomasK33 commented May 29, 2026

Member

Summary

Reveal the final propose_plan tool call in hyper transcript density by default-expanding only the work and operational bundles that contain the tail plan.

Background

In hyper density, settled work and operational bundles can hide tool rows before ProposePlanToolCall renders. That made an agent pause after propose_plan look like no actionable plan was visible. The requested behavior is scoped to the last displayed tool call, not the latest plan anywhere in history.

Implementation

  • Detect when the last displayed tool call is propose_plan.
  • Resolve the containing work/operational bundle keys for the tail plan.
  • Default-expand only those matching bundles without mutating expansion override state.
  • Let explicit user collapse overrides win after the auto-reveal has made the tail plan visible.
  • Keep historical plans collapsed when any later tool call exists, including image_generate/image_edit tool rows.
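The precedence described in the bullets above can be sketched as a pure function. This is a minimal illustration, not the code in ChatPane.tsx: `BundleInfo`, `ExpansionOverrides`, and `resolveBundleExpanded` are hypothetical names, and the real bundle shapes may differ. The assumption is that an explicit override is read first, and the tail-plan reveal only changes the default.

```typescript
// Hypothetical, simplified shapes; the real bundle and override types may differ.
interface BundleInfo {
  key: string;
  messageIds: readonly string[];
  defaultExpanded: boolean;
}

type ExpansionOverrides = ReadonlyMap<string, boolean>;

// Resolve whether a bundle renders expanded without writing to the override map.
function resolveBundleExpanded(
  bundle: BundleInfo,
  overrides: ExpansionOverrides,
  tailPlanToolId: string | undefined,
): boolean {
  // An explicit user choice (expand or collapse) always wins.
  const override = overrides.get(bundle.key);
  if (override !== undefined) return override;

  // Otherwise, default-expand only a bundle that contains the tail plan.
  const containsTailPlan =
    tailPlanToolId !== undefined &&
    bundle.messageIds.some((id) => id === tailPlanToolId);
  return containsTailPlan || bundle.defaultExpanded;
}

const settledBundle: BundleInfo = {
  key: "work:1",
  messageIds: ["read-1", "plan-1"],
  defaultExpanded: false,
};
const noOverrides: ExpansionOverrides = new Map<string, boolean>();
const userCollapsed: ExpansionOverrides = new Map<string, boolean>([["work:1", false]]);

console.log(resolveBundleExpanded(settledBundle, noOverrides, "plan-1")); // true
console.log(resolveBundleExpanded(settledBundle, userCollapsed, "plan-1")); // false
console.log(resolveBundleExpanded(settledBundle, noOverrides, undefined)); // false
```

Because the function only reads the override map, the auto-reveal disappears on its own once a later tool call moves the tail elsewhere.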

Validation

  • bun test ./tests/ui/chat/transcriptDensity.test.ts
  • TEST_INTEGRATION=1 bun x jest tests/ui/chat/transcriptDensity.test.ts --runInBand
  • make static-check
  • make storybook-build
  • Dogfood via agent-browser against HyperTailProposePlanExpanded; captured screenshot/video artifacts under dogfood-output/ui-density-tail-plan/ and verified no browser errors.

Risks

Low-to-medium: the change touches transcript render projection consumption in ChatPane, but the default expansion is keyed to the exact tail plan's containing bundles and does not persist state changes.


📋 Implementation Plan

Plan: reveal tail propose_plan in hyper transcript density

Goal

When transcript density is hyper, a propose_plan tool call should be immediately visible when it is the last tool call in the displayed transcript, so the user can see why the agent paused. This must not globally expand old/historical propose_plan calls; if any later tool call exists, the older plan should stay collapsed under the normal hyper-density rules.

Evidence gathered

  • ChatPane.tsx owns transcript density, hyper bundle projection, expansion overrides, and per-row hide/show decisions.
  • Hyper density uses two collapse layers:
    • work bundles from computeWorkBundleInfos(deferredMessages);
    • operational bundles from computeOperationalBundleInfos(deferredMessages, ...).
  • Settled bundles default collapsed, which hides tool rows before MessageRenderer can render the tool component.
  • ProposePlanToolCall already uses useToolExpansion(true), so the missing UI is caused by parent bundle hiding, not by the plan component itself being collapsed.
  • ChatPane.tsx already computes a latest propose_plan id for isLatestProposePlan, but that means “latest plan anywhere”; the requested behavior is based on the last tool call, so this should remain separate.
  • Existing coverage exists in tests/ui/chat/transcriptDensity.test.ts and src/browser/features/Messages/TranscriptDensity.stories.tsx; plan mocks are available via createProposePlanTool in src/browser/stories/mocks/tools.ts.
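The difference between "latest plan anywhere" and "last tool call is a plan" can be made concrete with two small scans. The message shape and both function names below are assumptions for illustration; the existing latest-plan logic in ChatPane.tsx is not shown in this PR text and may be structured differently.

```typescript
// Hypothetical, simplified message shape for illustration only.
type DisplayedMessage =
  | { type: "tool"; id: string; toolName: string }
  | { type: "assistant"; id: string }
  | { type: "user"; id: string };

// "Latest plan anywhere": the most recent propose_plan, even if other tools ran later.
function findLatestProposePlanId(messages: readonly DisplayedMessage[]): string | undefined {
  for (const message of [...messages].reverse()) {
    if (message.type === "tool" && message.toolName === "propose_plan") return message.id;
  }
  return undefined;
}

// "Tail plan": defined only when the last tool call of any kind is propose_plan.
function findTailProposePlanId(messages: readonly DisplayedMessage[]): string | undefined {
  for (const message of [...messages].reverse()) {
    if (message.type !== "tool") continue; // later non-tool rows are skipped
    return message.toolName === "propose_plan" ? message.id : undefined;
  }
  return undefined;
}

const historicalPlan: DisplayedMessage[] = [
  { type: "user", id: "u1" },
  { type: "tool", id: "t1", toolName: "propose_plan" },
  { type: "tool", id: "t2", toolName: "image_generate" },
];

const pausedOnPlan: DisplayedMessage[] = [
  { type: "user", id: "u1" },
  { type: "tool", id: "t1", toolName: "propose_plan" },
  { type: "assistant", id: "a1" },
];

console.log(findLatestProposePlanId(historicalPlan)); // "t1"
console.log(findTailProposePlanId(historicalPlan)); // undefined
console.log(findTailProposePlanId(pausedOnPlan)); // "t1"
```

In the first transcript the two ids diverge, which is why reusing the latest-plan id would auto-expand a historical plan.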

Recommended approach

Approach A — ChatPane-owned tail-plan forced expansion

  • Product-code net LoC estimate: +20 to +35 LoC.
  • Keep the change local to ChatPane.tsx unless implementation needs a tiny pure helper for readability.
  • Use deferredMessages for tail-tool detection because it is the same message snapshot used by hyper bundle projections and rendering.
  • Do not mutate user expansion override maps; compute a render-time forced-expanded state while the condition holds.

Alternative considered

Extracting a reusable helper into transcriptRenderProjection.ts would make the tail-tool detection easier to unit test in isolation, but it likely raises product-code LoC to +35 to +60 LoC without much benefit because the actual bug is in ChatPane's render-time interaction between work and operational bundles. Use this only if the local helper becomes awkward.

Implementation steps

  1. Compute the last displayed tool call

    • In ChatPane.tsx, scan deferredMessages from the end for message.type === "tool".
    • Derive tailProposePlanToolId only when transcriptDensity === "hyper" and that last tool has toolName === "propose_plan".
    • Keep the existing latestProposePlanId logic unchanged for ProposePlanToolCall freshness/actions.
  2. Force-expand only bundles containing the tail plan

    • Add a small local predicate such as bundleContainsMessageId(bundle, tailProposePlanToolId).
    • For the work bundle containing tailProposePlanToolId, compute:
      • isWorkBundleExpanded = forceRevealTailPlan || (override ?? workBundle.defaultExpanded).
    • For the operational bundle containing tailProposePlanToolId, compute:
      • isOperationalBundleExpanded = forceRevealTailPlan || (override ?? operationalBundle.defaultExpanded).
    • Apply this at both operational-bundle sites if there is a top-level and nested path inside expanded work bundles.
    • Do not force-expand unrelated bundles.
  3. Preserve expected non-plan behavior

    • Normal density remains unchanged.
    • Hyper density remains collapsed for historical plan calls when a later tool call exists.
    • Later assistant text should not disqualify the plan if no later tool call exists; this follows the user’s “last tool call in the transcript” wording.
    • If a user manually collapses the tail-plan bundle while it is still the last tool call, the render-time forced reveal will reopen it. This is intentional for the pause state; once a later tool appears, normal user/default state applies again.
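Steps 2 and 3 as planned here can be sketched with the `forceRevealTailPlan || (override ?? defaultExpanded)` formula applied to both collapse layers. `Bundle`, `isBundleExpandedPerPlan`, and `isRowVisible` are hypothetical names, and the assumption that a row is visible only when every bundle containing it is expanded is a simplification. Note that the PR summary describes explicit collapse overrides winning in the merged change, so this reflects the plan as written rather than the final behavior.

```typescript
// Hypothetical bundle shape shared by work and operational bundles in this sketch.
interface Bundle {
  key: string;
  messageIds: readonly string[];
  defaultExpanded: boolean;
}

function bundleContainsMessageId(bundle: Bundle, messageId: string | undefined): boolean {
  return messageId !== undefined && bundle.messageIds.some((id) => id === messageId);
}

// Planned formula: the forced reveal takes precedence over overrides and defaults.
function isBundleExpandedPerPlan(
  bundle: Bundle,
  override: boolean | undefined,
  tailProposePlanToolId: string | undefined,
): boolean {
  const forceRevealTailPlan = bundleContainsMessageId(bundle, tailProposePlanToolId);
  return forceRevealTailPlan || (override ?? bundle.defaultExpanded);
}

// A row is hidden if any collapse layer that contains it is collapsed.
function isRowVisible(
  rowId: string,
  layers: readonly Bundle[],
  overrides: ReadonlyMap<string, boolean>,
  tailProposePlanToolId: string | undefined,
): boolean {
  return layers.every(
    (bundle) =>
      !bundleContainsMessageId(bundle, rowId) ||
      isBundleExpandedPerPlan(bundle, overrides.get(bundle.key), tailProposePlanToolId),
  );
}

const workBundle: Bundle = { key: "work:1", messageIds: ["read-1", "plan-1"], defaultExpanded: false };
const operationalBundle: Bundle = { key: "op:1", messageIds: ["plan-1"], defaultExpanded: false };
const layers = [workBundle, operationalBundle];
const collapsedByUser = new Map<string, boolean>([["work:1", false]]);

// Tail plan: both layers are forced open, even over a manual collapse.
console.log(isRowVisible("plan-1", layers, collapsedByUser, "plan-1")); // true
// No tail plan (a later tool exists): settled defaults hide the plan again.
console.log(isRowVisible("plan-1", layers, new Map<string, boolean>(), undefined)); // false
```

Handling only one layer would leave the plan hidden behind the other summary row, which is why the formula is applied at each bundle site.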

Tests

  1. Update tests/ui/chat/transcriptDensity.test.ts

    • Import createProposePlanTool.
    • Add a positive hyper-density case where a settled transcript has propose_plan as the last tool call.
      • Assert the relevant work bundle is expanded without a click.
      • Assert the relevant operational bundle is expanded without a click, or assert the plan tool UI/body is visible without a click.
    • Add a negative case where an earlier propose_plan is followed by another tool call.
      • Assert the older plan remains hidden under collapsed hyper-density defaults.
      • Optionally click the bundle afterward to prove the plan exists but was not auto-revealed.
  2. Add Storybook visual coverage in src/browser/features/Messages/TranscriptDensity.stories.tsx

    • Add a HyperTailProposePlanExpanded story that sets density to hyper and ends with createProposePlanTool(...).
    • Keep it deterministic using existing story helpers and STABLE_TIMESTAMP.
    • If the story depends on a viewport, pin Chromatic mode/globals consistently with existing story conventions.
  3. Optional pure-helper tests

    • Only if helper logic moves to transcriptRenderProjection.ts, add focused tests in src/browser/utils/messages/transcriptRenderProjection.test.ts for “last tool is plan” vs “later non-plan tool exists”.
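If the optional pure helper were extracted, its cases could be laid out as a small table. The sketch below is framework-free so it stays self-contained; `tailProposePlanToolId`, `Row`, and the density gate inside the helper are assumptions, and a real test would use the repository's bun/jest setup and the createProposePlanTool mock instead of these hand-built rows.

```typescript
type Density = "normal" | "hyper";

// Hypothetical minimal row shape for the helper under test.
interface Row {
  type: "tool" | "assistant" | "user";
  id: string;
  toolName?: string;
}

// Returns the tail plan id only in hyper density and only when the last tool is propose_plan.
function tailProposePlanToolId(rows: readonly Row[], density: Density): string | undefined {
  if (density !== "hyper") return undefined;
  const lastTool = [...rows].reverse().find((row) => row.type === "tool");
  if (lastTool === undefined) return undefined;
  return lastTool.toolName === "propose_plan" ? lastTool.id : undefined;
}

const plan: Row = { type: "tool", id: "plan-1", toolName: "propose_plan" };
const imageEdit: Row = { type: "tool", id: "edit-1", toolName: "image_edit" };
const assistantText: Row = { type: "assistant", id: "a-1" };

const cases: Array<{ name: string; rows: Row[]; density: Density; expected: string | undefined }> = [
  { name: "last tool is plan", rows: [imageEdit, plan], density: "hyper", expected: "plan-1" },
  { name: "later non-plan tool exists", rows: [plan, imageEdit], density: "hyper", expected: undefined },
  { name: "trailing assistant text", rows: [plan, assistantText], density: "hyper", expected: "plan-1" },
  { name: "normal density", rows: [plan], density: "normal", expected: undefined },
];

const failures = cases
  .filter((testCase) => tailProposePlanToolId(testCase.rows, testCase.density) !== testCase.expected)
  .map((testCase) => testCase.name);

console.log(failures.length === 0 ? "all cases pass" : `failed: ${failures.join(", ")}`);
```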

Acceptance criteria

  • In hyper density, when the last displayed tool call is propose_plan, the plan tool UI is visible without manually expanding any transcript bundle.
  • Historical propose_plan calls do not auto-expand when a later tool call exists.
  • Both work-bundle and operational-bundle collapse layers are handled, so the plan is not hidden behind either summary row.
  • Existing latest-plan semantics used by ProposePlanToolCall remain unchanged.
  • Normal transcript density behavior is unchanged.
  • Tests cover the positive tail-plan case and the negative historical-plan case.

Validation and quality gates

Run after implementation:

  1. Targeted UI test:
    • bun test tests/ui/chat/transcriptDensity.test.ts
  2. If transcriptRenderProjection.ts is touched:
    • bun test src/browser/utils/messages/transcriptRenderProjection.test.ts
  3. Static checks:
    • make static-check
  4. Storybook build or targeted story validation as time allows:
    • make storybook-build

Proceed to the next phase only after the relevant gate passes; fix failures before moving on.

Dogfooding plan

Skill guidance applied: read dogfood, agent-browser, dev-server-sandbox, and dogfood/references/issue-taxonomy.md, plus the current core guidance printed by agent-browser skills get core. Use direct agent-browser commands, not npx; use snapshots for refs; re-snapshot after page changes; collect screenshots/videos as evidence; check console/errors; and run isolated app instances with a temporary MUX_ROOT via make dev-server-sandbox.

  1. Start an isolated Mux dev surface

    • Run make dev-server-sandbox DEV_SERVER_SANDBOX_ARGS="--clean-projects" so dogfooding uses a fresh temporary MUX_ROOT, free backend/Vite ports, copied provider config if present, no copied projects, and no collision with the user’s real app state.
    • Use the Vite URL printed by the sandbox command as the target URL. Keep the sandbox running until evidence collection is complete.
    • Before browser automation, refresh the installed CLI guidance with agent-browser skills get core.
  2. Initialize browser evidence capture

    • Create a dogfood output directory such as ./dogfood-output/ui-density-tail-plan/ with screenshots/ and videos/ subdirectories.
    • Open the app with a named session, e.g. agent-browser --session ui-density-tail-plan open <VITE_URL> followed by agent-browser --session ui-density-tail-plan wait --load networkidle.
    • Capture an initial annotated screenshot and interactive snapshot:
      • agent-browser --session ui-density-tail-plan screenshot --annotate ./dogfood-output/ui-density-tail-plan/screenshots/initial.png
      • agent-browser --session ui-density-tail-plan snapshot -i
    • Use only browser-observed behavior for dogfood findings; do not inspect source while documenting UI issues.
  3. Positive behavior check: tail propose_plan is visible

    • Navigate to the deterministic HyperTailProposePlanExpanded Storybook story if the fixture is implemented there, or load the equivalent seeded app state in the sandbox.
    • Record video before reproducing: agent-browser --session ui-density-tail-plan record start ./dogfood-output/ui-density-tail-plan/videos/tail-plan-visible.webm.
    • Reload/open the scenario, wait for the transcript to settle, and take step screenshots.
    • Assert visually that hyper density shows the propose_plan tool UI without manually clicking work/operational bundle rows.
    • Capture the final annotated screenshot and stop recording.
    • Check agent-browser --session ui-density-tail-plan errors and agent-browser --session ui-density-tail-plan console.
  4. Negative behavior check: historical plan stays collapsed

    • Open a scenario where a propose_plan is followed by a later non-plan tool call.
    • Capture an annotated screenshot showing the older plan remains hidden/collapsed under hyper-density defaults.
    • If interaction is needed to prove the plan exists, record a short video: start recording, click the relevant bundle, show the historical plan, then stop recording.
    • Re-check console/errors.
  5. Focused exploratory pass around the changed surface

    • Apply the dogfood issue taxonomy to this feature only: visual layout, functional controls, UX clarity, performance/jank, console errors, accessibility/focus, and responsive clipping around expanded/collapsed transcript rows.
    • Use screenshots for static findings and video plus step screenshots for interactive findings.
    • Record any issues immediately with severity and reproduction steps; target evidence quality over issue count.
  6. Wrap up dogfood artifacts

    • Close the browser session with agent-browser --session ui-density-tail-plan close.
    • Keep screenshots/videos available for review and attach the key positive/negative screenshots if the execution environment supports attachments.
    • If KEEP_SANDBOX=1 was used for debugging, note the sandbox root; otherwise let the temporary MUX_ROOT clean up on exit.
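One way to keep the session name and output paths consistent across the steps above is to assemble the command lines in one place. The helper below is a hypothetical convenience, not part of the PR; it only formats the agent-browser invocations already quoted in this plan and does not execute anything. The `<VITE_URL>` placeholder stands for the URL printed by the sandbox command.

```typescript
// Hypothetical helper: builds the agent-browser command lines listed in the dogfooding plan.
function dogfoodCommands(session: string, outputDir: string, targetUrl: string): string[] {
  const agentBrowser = `agent-browser --session ${session}`;
  return [
    `${agentBrowser} open ${targetUrl}`,
    `${agentBrowser} wait --load networkidle`,
    `${agentBrowser} screenshot --annotate ${outputDir}/screenshots/initial.png`,
    `${agentBrowser} snapshot -i`,
    `${agentBrowser} record start ${outputDir}/videos/tail-plan-visible.webm`,
    `${agentBrowser} errors`,
    `${agentBrowser} console`,
    `${agentBrowser} close`,
  ];
}

const commands = dogfoodCommands(
  "ui-density-tail-plan",
  "./dogfood-output/ui-density-tail-plan",
  "<VITE_URL>",
);

// Print the list for copy/paste; running the commands is left to the operator.
console.log(commands.join("\n"));
```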

Review focus

  • The last-tool predicate should be based on ordered displayed messages, not “latest plan”.
  • Forced expansion should be scoped to the containing work/operational bundles only.
  • The change should not persist or overwrite user expansion overrides.
  • Tests should assert behavior/visibility rather than tautological copy.

Generated with mux • Model: openai:gpt-5.5 • Thinking: xhigh • Cost: $39.77

@ThomasK33

Member Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed3b4a47f7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/components/ChatPane/ChatPane.tsx
@ThomasK33

Member Author

@codex review
Please take another look.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12b63810d3

Comment thread src/browser/components/ChatPane/ChatPane.tsx Outdated
@ThomasK33

Member Author

@codex review
Please take another look.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Delightful!

@ThomasK33

Member Author

@codex review
Please take another look after the integration-test fix.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Delightful!

@ThomasK33 ThomasK33 added this pull request to the merge queue May 29, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 29, 2026
@ThomasK33

Member Author

@codex review
Please take another look after rebasing onto main.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

@ThomasK33

Member Author

@codex review
Please take another look after the post-rebase CI test fix.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🚀

@ThomasK33 ThomasK33 added this pull request to the merge queue May 29, 2026
Merged via the queue into main with commit d06c58e May 29, 2026
24 checks passed
@ThomasK33 ThomasK33 deleted the ui-density-87xq branch May 29, 2026 15:34

1 participant