test: add mock-LLM E2E test for /model slash command (mid-conversation profile switch) by malhotra5 · Pull Request #998 · OpenHands/agent-canvas

malhotra5 · 2026-06-01T19:33:56Z

Summary

Adds a new mock-LLM E2E test spec (tests/e2e/mock-llm/mock-llm-model-switch.spec.ts) that exercises the /model slash command for mid-conversation LLM profile switching — a key feature that previously had zero test coverage.

What it tests

The test exercises the full /model flow end-to-end against the real agent-server with a scripted mock LLM backend:

Step 1: Setup

Creates two LLM profiles (A: openai/mock-model-alpha, B: openai/mock-model-beta) via the profiles API, both pointing at the mock LLM server
Activates profile A as the default
Registers a custom trajectory with two text replies

Step 2: Conversation + `/model` switch

Starts a conversation from the home page, sends a message
Waits for the agent to reply (trajectory turn 0)
Verifies the conversation started with profile A's model via GET /api/settings
Types /model model-switch-profile-b in the chat input and submits
Verifies the switch succeeded:
- ✅ The "Switched to profile" confirmation message renders in the chat UI (data-testid="model-messages")
- ✅ The POST /api/conversations/{id}/switch_profile request was intercepted with the correct profile name
Verifies post-switch continuity:
- Sends a follow-up message after the switch
- Verifies the agent responds (trajectory turn 1), proving the conversation continues working under the new profile
Verifies no error banners appear
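
The two switch assertions above can be pictured with a small, framework-free sketch. It assumes the intercepted request has been reduced to a method, a URL, and a parsed JSON body. `CapturedRequest`, `isSwitchProfileRequest`, and `profileNameOf` are hypothetical names used for illustration and are not taken from the spec; only the endpoint path and the `profile_name` field come from this PR.

```typescript
// Hypothetical helpers for illustration; not the code in the spec file.

/** A captured request, reduced to what the check needs. */
interface CapturedRequest {
  method: string;
  url: string;
  body: unknown;
}

// Matches /api/conversations/{id}/switch_profile with any non-empty id segment.
const SWITCH_PROFILE_PATH = /^\/api\/conversations\/[^/]+\/switch_profile$/;

/** Returns true when the request is a POST to the switch_profile endpoint. */
function isSwitchProfileRequest(req: CapturedRequest): boolean {
  // Drop scheme and host, then any query string or fragment.
  const withoutOrigin = req.url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]+/i, "");
  const path = withoutOrigin.split(/[?#]/)[0] ?? "";
  return req.method === "POST" && SWITCH_PROFILE_PATH.test(path);
}

/** Reads profile_name with a direct field check instead of a substring match. */
function profileNameOf(body: unknown): string | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const value = (body as Record<string, unknown>).profile_name;
  return typeof value === "string" ? value : undefined;
}
```

In a Playwright spec the captured request would presumably come from a route or request listener; the sketch leaves that part out so the matching logic can be read on its own.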

Test infrastructure

Uses the existing mock-LLM test framework (playwright.mock-llm.config.ts)
Creates/cleans profiles via direct API calls (no UI dependency for setup)
Custom trajectory registered via the mock LLM admin API
Serial test mode (step 2 depends on step 1's profile setup)

Coverage

Covers the following items from APP-1785:

[E2E / mock] Type /model gpt4 and submit → intercept the switch-profile API call and assert it is called with the correct profile name

This PR was created by an AI agent (OpenHands) on behalf of @rmalhot.

🐳 Docker images for this PR

• GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component	Value
Image	`ghcr.io/openhands/agent-canvas`
Architectures	amd64, arm64
Agent Server	`ghcr.io/openhands/agent-server:1.24.0-python`
Automation	`openhands-automation==1.0.0a5`
Commit	`b5562647de4143341d67eb1da958cde6466ecf37`

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-b556264

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-b556264

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-b556264-amd64
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch-amd64
ghcr.io/openhands/agent-canvas:pr-998-amd64
ghcr.io/openhands/agent-canvas:sha-b556264-arm64
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch-arm64
ghcr.io/openhands/agent-canvas:pr-998-arm64
ghcr.io/openhands/agent-canvas:sha-b556264
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch
ghcr.io/openhands/agent-canvas:pr-998

About Multi-Architecture Support

Each tag (e.g., sha-b556264) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., sha-b556264-amd64) are also available if needed

vercel · 2026-06-01T19:34:03Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-canvas	Ready	Preview, Comment	Jun 1, 2026 10:28pm

Add tests/e2e/mock-llm/mock-llm-model-switch.spec.ts exercising the full /model mid-conversation profile switch flow against the real agent-server with a scripted mock LLM backend.

Step 1: setup
- ensureMockLLMProfile() configures agent_settings.llm (proven pattern)
- POST /api/profiles/{name} creates profile B as the switch target
- Registers a 3-entry trajectory: padding for the internal LLM call, INITIAL_REPLY_TOKEN, POST_SWITCH_REPLY_TOKEN

Step 2: conversation + /model switch
- Starts conversation from home page, waits for agent reply
- Types '/model model-switch-profile-b' and submits
- Verifies 'Switched to profile' confirmation in data-testid=model-messages
- Verifies POST /api/conversations/{id}/switch_profile was intercepted
- Sends follow-up, verifies agent responds (post-switch continuity)
- Verifies no error banners

Co-authored-by: openhands <openhands@all-hands.dev>

Two Docker E2E reliability fixes:

1. Concurrency group: workflow_run triggers used the unique workflow_run.id as the group key, so multiple Docker builds completing on main each spawned their own E2E run (they never cancelled each other). Changed to key by the triggering workflow's head_branch, so main-branch workflow_run triggers share the group 'wr-main' and only the latest run survives. pull_request triggers still key by PR number (unchanged).
2. Retry: Docker Playwright config now uses retries:1 in CI to handle transient container startup ECONNREFUSED failures. The webServer health-check confirms the stack is up, but occasional races between container readiness and the first test request can still cause failures.

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot · 2026-06-01T22:06:59Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

all-hands-bot

The previous review comments have all been addressed cleanly — the setChatInput helper is now correctly in shared utils, the trajectory padding comment is well-documented with upstream references, and the switchProfileBody assertion uses a direct field check rather than a loose JSON.stringify substring match. The redundant toBeVisible check is gone too.

The CI changes are well-motivated: restricting workflow_run to main/master prevents duplicate E2E runs on e2e-tests PRs, the revamped concurrency group makes cancellation semantics explicit, and the single retry in CI is a pragmatic fix for Docker startup races. Comments in the workflow are unusually thorough.

A couple of things still worth addressing before merge:

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

- Make saveProfile idempotent with best-effort delete-first so stale profiles from crashed CI runs don't cause persistent 409 failures
- Increase switch-confirmation timeout from 15s to 30s for consistency with all other waitForNonUserMessageText calls in this test
- Add waitForTestId guard before post-switch setChatInput to handle potential UI disable during profile switch settling

Co-authored-by: openhands <openhands@all-hands.dev>
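
The delete-first idea in the first item can be sketched as follows. This is a minimal illustration under assumptions, not the helper from the PR: `HttpCall` and `saveProfileIdempotent` are hypothetical names, the HTTP layer is injected so the snippet runs without Playwright, and using DELETE on the same `/api/profiles/{name}` path is assumed rather than confirmed by this thread.

```typescript
// Hypothetical sketch; the real saveProfile helper may differ.

/** Minimal HTTP abstraction standing in for the test's request client. */
type HttpCall = (
  method: "DELETE" | "POST",
  path: string,
  body?: unknown,
) => Promise<{ status: number }>;

/**
 * Creates a profile so that reruns succeed even if an earlier run crashed
 * before cleanup and left a profile with the same name behind.
 */
async function saveProfileIdempotent(
  call: HttpCall,
  name: string,
  profile: unknown,
): Promise<void> {
  const path = `/api/profiles/${encodeURIComponent(name)}`;
  try {
    // Best-effort cleanup: a 404 (nothing to delete) is an acceptable outcome.
    await call("DELETE", path);
  } catch {
    // Ignored on purpose; the POST below is the call that must succeed.
  }
  const res = await call("POST", path, profile);
  if (res.status < 200 || res.status >= 300) {
    throw new Error(`saveProfile(${name}) failed with HTTP ${res.status}`);
  }
}
```

Because the delete result is never inspected, a stale profile and a missing profile both lead to the same clean POST, which is what removes the persistent 409.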

github-actions · 2026-06-01T22:17:02Z

✅ Mock-LLM E2E Tests

14/14 passed

Commit: ba7b2aeb · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	210ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	23.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.5s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	173ms
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

all-hands-bot · 2026-06-01T22:17:47Z

✅ Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

github-actions · 2026-06-01T22:22:24Z

❌ Mock-LLM Docker E2E Test Results

8/14 passed · 2 failed · 4 skipped

Commit: ba7b2aeb · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	506ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	28.8s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.4s
❌	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server (1 retries)	541ms
⏭️	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API (1 retries)	0ms
⏭️	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM (1 retries)	0ms
⏭️	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away (1 retries)	0ms
❌	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory (1 retries)	1.7s
⏭️	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch (1 retries)	0ms

🔍 Failure details (2)

❌ mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server

Error: page.goto: net::ERR_CONNECTION_REFUSED at http://localhost:18300/settings/llm
Call log:
  - navigating to "http://localhost:18300/settings/llm", waiting until "domcontentloaded"

❌ mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory

Error: apiRequestContext.get: connect ECONNREFUSED ::1:18300
Call log:
  - → GET http://localhost:18300/api/settings
    - user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.15 Safari/537.36
    - accept: */*
    - accept-encoding: gzip,deflate,br
    - X-Session-API-Key: b51e7f966f8c2c47bbafde9933ce15f97acb97293095f2587901bddf9f2b968e
    - X-Expose-Secrets: encrypted

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T22:22:40Z

❌ Mock-LLM Docker E2E Test Results

12/14 passed · 1 failed · 1 skipped

Commit: ba7b2aeb · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.9s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	521ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	28.8s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s
❌	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory (1 retries)	1.6s
⏭️	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch (1 retries)	0ms

🔍 Failure details (1)

❌ mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory

Error: apiRequestContext.get: connect ECONNREFUSED ::1:18300
Call log:
  - → GET http://localhost:18300/api/settings
    - user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.15 Safari/537.36
    - accept: */*
    - accept-encoding: gzip,deflate,br
    - X-Session-API-Key: 377c6b5cf09fa3a5eea2d749919051c7fae12cbcbee1c6ff81d155fd9d825e8f
    - X-Expose-Secrets: encrypted

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

all-hands-bot

All feedback from the previous two review rounds has been addressed cleanly. The current state of the code is solid. A couple of minor observations below — neither is a blocker.

🟢 Good taste overall. This is a well-structured, comprehensive E2E test that fills a real gap in coverage for the /model slash command. The serial step split, idempotent profile setup (delete-before-post), trajectory padding documentation, shared setChatInput helper, and CI deduplication guards are all done correctly.

[IMPROVEMENT OPPORTUNITIES]

[tests/e2e/mock-llm/mock-llm-model-switch.spec.ts, line 244–245] After expect(switchProfileBody).toBeTruthy() on line 242, the subsequent (switchProfileBody as Record<string, unknown>)?.profile_name cast is slightly redundant — the ?. optional chain suggests null is still possible, but the assertion above already guarantees it isn't. switchProfileBody!.profile_name is more direct and makes the intent clearer (asserting on a known-non-null value). Minor nit, not a blocker.
[tests/e2e/mock-llm/utils/mock-llm-helpers.ts, line 394] The error thrown inside page.evaluate() reads "Chat input not found" — since testId is passed as a parameter, including it in the message (e.g. `Chat input [data-testid="${testId}"] not found`) would make failures easier to diagnose if the helper is ever extended to support different test IDs. Tiny nit.
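
Both nits follow the same idea: fail once, loudly, and then treat the value as known to be present. A small sketch, assuming a generic lookup result rather than the real DOM call; `chatInputNotFoundMessage`, `requireElement`, and the test ID used in the usage note are hypothetical.

```typescript
// Hypothetical sketch of the suggested error message and non-null handling.

/** Builds the diagnostic suggested in the review, including the test ID. */
function chatInputNotFoundMessage(testId: string): string {
  return `Chat input [data-testid="${testId}"] not found`;
}

/**
 * Throws if the lookup came back empty. After this returns, callers hold a
 * value typed as T, so no optional chaining is needed further down.
 */
function requireElement<T>(el: T | null | undefined, testId: string): T {
  if (el === null || el === undefined) {
    throw new Error(chatInputNotFoundMessage(testId));
  }
  return el;
}
```

For example, `requireElement(lookupResult, "chat-input")` either returns the element or throws an error that names the `chat-input` test ID.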

CI changes are well-motivated:

Restricting workflow_run to main/master to eliminate duplicate E2E comment pairs on e2e-tests PRs is the right call.
The revamped concurrency group keying (pr_number || wr-{branch} || github.ref) is correct and explicit; the inline comment explaining each case is unusually good.
The empty pr_number= for workflow_run events (results go to step summary only, no PR comment) is documented clearly — that's the right trade-off.
retries: process.env.CI ? 1 : 0 is a pragmatic fix for Docker container startup races with no correctness downside.
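
The keying rule and the retry setting above can be modelled in a few lines. This is a sketch under assumptions: `TriggerContext`, `concurrencyGroup`, and `retriesFor` are hypothetical names, and the real workflow writes the key as a GitHub Actions expression (possibly with a prefix) rather than as TypeScript.

```typescript
// Hypothetical model of the CI behaviour described in this review.

/** The inputs the concurrency key depends on. */
interface TriggerContext {
  prNumber?: number;          // set for pull_request triggers
  workflowRunBranch?: string; // head_branch of the triggering workflow (workflow_run)
  ref: string;                // github.ref, used as the fallback
}

/** Mirrors the order pr_number || wr-{branch} || github.ref. */
function concurrencyGroup(ctx: TriggerContext): string {
  if (ctx.prNumber !== undefined) return String(ctx.prNumber);
  if (ctx.workflowRunBranch) return `wr-${ctx.workflowRunBranch}`;
  return ctx.ref;
}

/** Mirrors `retries: process.env.CI ? 1 : 0` with the environment passed in. */
function retriesFor(env: Record<string, string | undefined>): number {
  return env.CI ? 1 : 0;
}
```

Two workflow_run triggers from main produce the same key, so the newer run can cancel the older one, while each PR keeps its own key.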

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW
New E2E test spec with zero production code changes. CI workflow changes are scoped to test infrastructure. The retry increase introduces minor latency for flaky-startup tests in CI but carries no correctness risk. The workflow_run main/master guard removes a class of duplicate comment noise without affecting coverage.

VERDICT:
✅ Worth merging — all prior feedback addressed, code is clean, meaningful E2E coverage added for a previously untested flow.

KEY INSIGHT:
The serial step split with test.describe.configure({ mode: "serial" }) is the correct pattern here — step 2's browser interaction genuinely depends on step 1's API state, and Playwright's serial mode ensures skip-on-failure without requiring brittle test.skip guards.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

- Use switchProfileBody!.profile_name after toBeTruthy guard instead of optional chain (the assertion guarantees non-null)
- Include testId in setChatInput error message for easier diagnosis; accept optional testId parameter for future reuse

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-06-01T22:31:01Z

✅ Mock-LLM E2E Tests

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.5s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.1s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	215ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	24.4s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.2s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.7s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	214ms
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T22:34:05Z

✅ Mock-LLM Docker E2E Test Results

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.7s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	2.1s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	497ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	26.9s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.0s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.4s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.6s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	210ms
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T22:34:28Z

✅ Mock-LLM Docker E2E Test Results

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status	Test	Duration
✅	mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key	3.8s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured	1.2s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key	1.4s
✅	mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage)	1.5s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory	199ms
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI	29.7s
✅	mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server	4.1s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM	4.3s
✅	mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away	3.6s
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory	216ms
✅	mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch	4.7s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

github-actions · 2026-06-01T22:38:11Z

📸 Snapshot Test Report

✅ All snapshots match the main branch baselines.

Category	Count
🔴 Changed	0
🆕 New	0
✅ Unchanged	73
Total	73

✅ Unchanged snapshots (73)

archived-conversation

conversation-panel-with-archived-badges
conversation-view-archived
conversation-view-sandbox-error

automations

automations-delete-modal
automations-list-active-inactive
automations-no-automations
automations-search-no-results

backends-extended

backend-add-blank-disabled
backend-add-cloud-advanced-open
backend-add-cloud-no-key-disabled
backend-add-cloud-with-key-enabled
backend-add-form-partially-filled
backend-add-invalid-url-disabled
backend-add-local-ready
backend-add-name-only-disabled
backend-add-two-column-layout
backend-add-whitespace-host-disabled
backend-after-switch
backend-cancel-nothing-saved
backend-dropdown-two-backends
backend-edit-prefilled
backend-manage-after-removal
backend-manage-two-listed
backend-remove-cancelled
backend-remove-confirmation
backend-switch-overlay

backends

backend-add-modal
backend-manage-modal
backend-selector-open

changes-tab

changes-deleted-file
changes-diff-viewer
changes-empty

collapsible-thinking

reasoning-content-collapsed
reasoning-content-expanded
think-action-collapsed
think-action-expanded

mcp-page

mcp-custom-server-1-editor-open
mcp-custom-server-2-url-filled
mcp-custom-server-3-all-filled
mcp-custom-server-4-installed
mcp-custom-server-editor
mcp-empty-installed
mcp-search-filtered
mcp-slack-install-1-marketplace
mcp-slack-install-2-modal
mcp-slack-install-3-filled
mcp-slack-install-4-installed

onboarding

onboarding-step-0-choose-agent
onboarding-step-1-check-backend
onboarding-step-2-setup-llm
onboarding-step-3-say-hello

projects-workspace-browser

projects-workspace-browser

settings-page

add-backend-modal
analytics-consent-modal
home-screen
settings-app-page
settings-page

settings-secrets

secrets-add-form-filled
secrets-add-form
secrets-after-save
secrets-delete-confirm
secrets-list

settings-verification

condenser-settings
verification-settings-off
verification-settings-on

sidebar

sidebar-collapsed
sidebar-conversation-panel
sidebar-filter-menu

skills-page

skills-empty
skills-loaded
skills-no-match
skills-search-filtered
skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

malhotra5 added the e2e-tests label (Triggers mock-LLM E2E tests on PRs) Jun 1, 2026, with OpenHands AI

vercel Bot deployed to Preview June 1, 2026 19:34 View deployment

linear Bot mentioned this pull request Jun 1, 2026

Test coverage: bulletproof the implemented MVP 'I Can' statements #511

Open

47 tasks

vercel Bot deployed to Preview June 1, 2026 19:41 View deployment

vercel Bot deployed to Preview June 1, 2026 19:47 View deployment

vercel Bot deployed to Preview June 1, 2026 19:52 View deployment

vercel Bot deployed to Preview June 1, 2026 20:02 View deployment

vercel Bot deployed to Preview June 1, 2026 20:09 View deployment

malhotra5 marked this pull request as ready for review June 1, 2026 20:19

OpenHands deleted a comment from github-actions Bot Jun 1, 2026

malhotra5 force-pushed the test/mock-llm-model-switch branch from a806b0f to b566848 Compare June 1, 2026 20:55

vercel Bot deployed to Preview June 1, 2026 20:56 View deployment

malhotra5 requested a review from all-hands-bot June 1, 2026 21:01

OpenHands deleted a comment from github-actions Bot Jun 1, 2026

vercel Bot deployed to Preview June 1, 2026 21:42 View deployment

OpenHands deleted a comment from github-actions Bot Jun 1, 2026

malhotra5 force-pushed the test/mock-llm-model-switch branch from 6484f16 to 1ab0495 Compare June 1, 2026 21:51

vercel Bot deployed to Preview June 1, 2026 21:52 View deployment

OpenHands deleted a comment from github-actions Bot Jun 1, 2026

malhotra5 requested a review from all-hands-bot June 1, 2026 22:06

all-hands-bot reviewed Jun 1, 2026

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts Outdated

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts

OpenHands deleted a comment from github-actions Bot Jun 1, 2026

vercel Bot deployed to Preview June 1, 2026 22:14 View deployment

malhotra5 requested a review from all-hands-bot June 1, 2026 22:16

all-hands-bot reviewed Jun 1, 2026

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts Outdated

Comment thread tests/e2e/mock-llm/utils/mock-llm-helpers.ts Outdated

vercel Bot deployed to Preview June 1, 2026 22:28 View deployment
