
test: add mock-LLM E2E test for /model slash command (mid-conversation profile switch)#998

Open
malhotra5 wants to merge 6 commits into main from test/mock-llm-model-switch

Conversation

@malhotra5
Member

@malhotra5 malhotra5 commented Jun 1, 2026

Summary

Adds a new mock-LLM E2E test spec (tests/e2e/mock-llm/mock-llm-model-switch.spec.ts) that exercises the /model slash command for mid-conversation LLM profile switching — a key feature that previously had zero test coverage.

What it tests

The test exercises the full /model flow end-to-end against the real agent-server with a scripted mock LLM backend:

Step 1: Setup

  • Creates two LLM profiles (A: openai/mock-model-alpha, B: openai/mock-model-beta) via the profiles API, both pointing at the mock LLM server
  • Activates profile A as the default
  • Registers a custom trajectory with two text replies
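
A scripted trajectory can be pictured as an ordered queue of canned replies, one consumed per LLM call. The sketch below is only a model of that idea: `ScriptedTrajectory` and `ScriptedReply` are hypothetical names, and the reply texts are placeholders, not the mock LLM admin API or the tokens used in the spec.

```typescript
// Hypothetical model of a scripted mock-LLM trajectory. Replies are served
// strictly in order, one per LLM call. This is not the real mock server API.
type ScriptedReply = { text: string };

class ScriptedTrajectory {
  private readonly replies: readonly ScriptedReply[];
  private cursor = 0;

  constructor(replies: readonly ScriptedReply[]) {
    this.replies = replies;
  }

  // Return the next scripted reply, or throw once the script is exhausted so
  // an unexpected extra LLM call fails loudly instead of hanging the test.
  next(): ScriptedReply {
    const reply = this.replies[this.cursor];
    if (reply === undefined) {
      throw new Error(`trajectory exhausted after ${this.replies.length} replies`);
    }
    this.cursor += 1;
    return reply;
  }

  // Number of scripted replies that have not been served yet.
  remaining(): number {
    return this.replies.length - this.cursor;
  }
}

// Two text replies: one before the profile switch and one after it.
const trajectory = new ScriptedTrajectory([
  { text: "placeholder reply before the switch" }, // trajectory turn 0
  { text: "placeholder reply after the switch" }, // trajectory turn 1
]);
```

Because the replies are consumed in order, a test that sees the turn 1 text after the switch knows the conversation made a second LLM call.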

Step 2: Conversation + /model switch

  1. Starts a conversation from the home page, sends a message
  2. Waits for the agent to reply (trajectory turn 0)
  3. Verifies the conversation started with profile A's model via GET /api/settings
  4. Types /model model-switch-profile-b in the chat input and submits
  5. Verifies the switch succeeded:
    • ✅ The "Switched to profile" confirmation message renders in the chat UI (data-testid="model-messages")
    • ✅ The POST /api/conversations/{id}/switch_profile request was intercepted with the correct profile name
  6. Verifies post-switch continuity:
    • Sends a follow-up message after the switch
    • Verifies the agent responds (trajectory turn 1), proving the conversation continues working under the new profile
  7. Verifies no error banners appear
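
The request the test expects to intercept in step 5 can be sketched as a pure function from chat input to request. This is a minimal sketch under assumptions: `parseModelCommand` and `buildSwitchRequest` are hypothetical helpers, the parsing rule (a single whitespace-free argument) is a guess at the UI's behaviour, and only the endpoint path and the `profile_name` field come from this PR.

```typescript
// Hypothetical shape of the request the test intercepts.
type SwitchRequest = {
  method: "POST";
  path: string;
  body: { profile_name: string };
};

// Extract the profile name from "/model <name>". Returns null for any other
// input, including a bare "/model" with no argument.
function parseModelCommand(input: string): string | null {
  const match = /^\/model\s+(\S+)$/.exec(input.trim());
  return match?.[1] ?? null;
}

// Build the switch_profile request for a conversation, or null when the
// input is not a /model command.
function buildSwitchRequest(conversationId: string, input: string): SwitchRequest | null {
  const profileName = parseModelCommand(input);
  if (profileName === null) return null;
  return {
    method: "POST",
    path: `/api/conversations/${encodeURIComponent(conversationId)}/switch_profile`,
    body: { profile_name: profileName },
  };
}
```

Asserting on `body.profile_name` directly, rather than on a substring of the serialized body, keeps the check tied to the one field that matters.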

Test infrastructure

  • Uses the existing mock-LLM test framework (playwright.mock-llm.config.ts)
  • Creates/cleans profiles via direct API calls (no UI dependency for setup)
  • Custom trajectory registered via the mock LLM admin API
  • Serial test mode (step 2 depends on step 1's profile setup)

Coverage

Covers the following items from APP-1785:

  • [E2E / mock] Type /model gpt4 and submit → intercept the switch-profile API call and assert it is called with the correct profile name

This PR was created by an AI agent (OpenHands) on behalf of @rmalhot.



🐳 Docker images for this PR

GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

| Component | Value |
| --- | --- |
| Image | ghcr.io/openhands/agent-canvas |
| Architectures | amd64, arm64 |
| Agent Server | ghcr.io/openhands/agent-server:1.24.0-python |
| Automation | openhands-automation==1.0.0a5 |
| Commit | b5562647de4143341d67eb1da958cde6466ecf37 |

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-b556264

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-b556264

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-b556264-amd64
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch-amd64
ghcr.io/openhands/agent-canvas:pr-998-amd64
ghcr.io/openhands/agent-canvas:sha-b556264-arm64
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch-arm64
ghcr.io/openhands/agent-canvas:pr-998-arm64
ghcr.io/openhands/agent-canvas:sha-b556264
ghcr.io/openhands/agent-canvas:test-mock-llm-model-switch
ghcr.io/openhands/agent-canvas:pr-998

About Multi-Architecture Support

  • Each tag (e.g., sha-b556264) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., sha-b556264-amd64) are also available if needed

@malhotra5 added the e2e-tests label (Triggers mock-LLM E2E tests on PRs) on Jun 1, 2026, with OpenHands AI
@vercel

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agent-canvas | Ready | Preview, Comment | Jun 1, 2026 10:28pm |


@malhotra5 malhotra5 marked this pull request as ready for review June 1, 2026 20:19
@OpenHands OpenHands deleted a comment from github-actions Bot Jun 1, 2026
Add tests/e2e/mock-llm/mock-llm-model-switch.spec.ts exercising the
full /model mid-conversation profile switch flow against the real
agent-server with a scripted mock LLM backend.

Step 1 — setup:
  - ensureMockLLMProfile() configures agent_settings.llm (proven pattern)
  - POST /api/profiles/{name} creates profile B as the switch target
  - Registers a 3-entry trajectory: padding for the internal LLM call,
    INITIAL_REPLY_TOKEN, POST_SWITCH_REPLY_TOKEN

Step 2 — conversation + /model switch:
  - Starts conversation from home page, waits for agent reply
  - Types '/model model-switch-profile-b' and submits
  - Verifies 'Switched to profile' confirmation in data-testid=model-messages
  - Verifies POST /api/conversations/{id}/switch_profile was intercepted
  - Sends follow-up, verifies agent responds (post-switch continuity)
  - Verifies no error banners

Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands OpenHands deleted a comment from github-actions Bot Jun 1, 2026
Two Docker E2E reliability fixes:

1. Concurrency group: workflow_run triggers used the unique
   workflow_run.id as the group key, so multiple Docker builds
   completing on main each spawned their own E2E run (they never
   cancelled each other). Changed to key by the triggering
   workflow's head_branch, so main-branch workflow_run triggers
   share the group 'wr-main' and only the latest run survives.
   pull_request triggers still key by PR number (unchanged).

2. Retry: Docker Playwright config now uses retries:1 in CI to
   handle transient container startup ECONNREFUSED failures. The
   webServer health-check confirms the stack is up, but occasional
   races between container readiness and the first test request
   can still cause failures.
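
What a single retry buys in item 2 can be pictured with a small helper. This is an illustration only, not Playwright's retry implementation: `withRetries` and `retriesFor` are hypothetical names, and the only detail taken from this PR is that CI gets one retry while local runs get none.

```typescript
// Illustration only: run `attempt`, and on failure try again up to `retries`
// more times. The last error is rethrown if every attempt fails.
async function withRetries<T>(attempt: () => Promise<T>, retries: number): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= retries; i++) {
    try {
      return await attempt();
    } catch (error) {
      // Remember the failure and fall through to the next attempt, if any.
      lastError = error;
    }
  }
  throw lastError;
}

// One retry in CI, none locally, so local failures are never masked.
function retriesFor(env: { CI?: string }): number {
  return env.CI ? 1 : 0;
}
```

A first request that races container readiness fails once and then succeeds on the retry, while a failure that persists still fails the run.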

Co-authored-by: openhands <openhands@all-hands.dev>
Contributor

all-hands-bot commented Jun 1, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

Contributor

@all-hands-bot all-hands-bot left a comment

The previous review comments have all been addressed cleanly — the setChatInput helper is now correctly in shared utils, the trajectory padding comment is well-documented with upstream references, and the switchProfileBody assertion uses a direct field check rather than a loose JSON.stringify substring match. The redundant toBeVisible check is gone too.

The CI changes are well-motivated: restricting workflow_run to main/master prevents duplicate E2E runs on e2e-tests PRs, the revamped concurrency group makes cancellation semantics explicit, and the single retry in CI is a pragmatic fix for Docker startup races. Comments in the workflow are unusually thorough.

A couple of things still worth addressing before merge:

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts
Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts Outdated
Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts
- Make saveProfile idempotent with best-effort delete-first so stale
  profiles from crashed CI runs don't cause persistent 409 failures
- Increase switch-confirmation timeout from 15s to 30s for consistency
  with all other waitForNonUserMessageText calls in this test
- Add waitForTestId guard before post-switch setChatInput to handle
  potential UI disable during profile switch settling
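
The delete-first idea in the first bullet can be sketched as below. This is a sketch under assumptions: `ProfileApi`, `InMemoryProfileApi` and `saveProfileIdempotent` are hypothetical names, the real helper talks HTTP, and the 201/204/404 status codes are assumed (only the 409 on a stale profile comes from this commit).

```typescript
// Hypothetical profile API surface. Each call resolves to an HTTP status.
interface ProfileApi {
  create(name: string, body: object): Promise<number>;
  remove(name: string): Promise<number>;
}

// Delete first (best effort), then create, so a profile left behind by a
// crashed run cannot cause a 409 on the next run.
async function saveProfileIdempotent(api: ProfileApi, name: string, body: object): Promise<void> {
  try {
    await api.remove(name); // a 404 here simply means there was nothing stale
  } catch {
    // Best effort: a failed delete must not block the create below.
  }
  const status = await api.create(name, body);
  if (status < 200 || status >= 300) {
    throw new Error(`saving profile "${name}" failed with status ${status}`);
  }
}

// In-memory stand-in that answers 409 on a duplicate create.
class InMemoryProfileApi implements ProfileApi {
  private readonly profiles = new Map<string, object>();

  async create(name: string, body: object): Promise<number> {
    if (this.profiles.has(name)) return 409;
    this.profiles.set(name, body);
    return 201;
  }

  async remove(name: string): Promise<number> {
    return this.profiles.delete(name) ? 204 : 404;
  }
}
```

Calling `saveProfileIdempotent` twice with the same name succeeds both times, which is the property that matters when CI reuses a backend.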

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

14/14 passed

Commit: ba7b2aeb · Workflow run · Test artifacts

Status Test Duration
mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key 3.5s
mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured 1.2s
mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error 1.5s
mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key 1.8s
mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) 1.5s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory 210ms
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI 23.5s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page 4.2s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server 4.2s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API 4.2s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM 4.5s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away 3.7s
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory 173ms
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch 4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

Contributor

all-hands-bot commented Jun 1, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

8/14 passed · 2 failed · 4 skipped

Commit: ba7b2aeb · Workflow run · Test artifacts

| Status | Test | Duration |
| --- | --- | --- |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key | 3.8s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured | 1.2s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error | 1.4s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key | 1.8s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) | 1.5s |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory | 506ms |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI | 28.8s |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page | 4.4s |
| ❌ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server (1 retries) | 541ms |
| ⏭️ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API (1 retries) | 0ms |
| ⏭️ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM (1 retries) | 0ms |
| ⏭️ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away (1 retries) | 0ms |
| ❌ | mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory (1 retries) | 1.7s |
| ⏭️ | mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch (1 retries) | 0ms |

🔍 Failure details (2)

❌ mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server

Error: page.goto: net::ERR_CONNECTION_REFUSED at http://localhost:18300/settings/llm
Call log:
  - navigating to "http://localhost:18300/settings/llm", waiting until "domcontentloaded"

❌ mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory

Error: apiRequestContext.get: connect ECONNREFUSED ::1:18300
Call log:
  - → GET http://localhost:18300/api/settings
    - user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.15 Safari/537.36
    - accept: */*
    - accept-encoding: gzip,deflate,br
    - X-Session-API-Key: b51e7f966f8c2c47bbafde9933ce15f97acb97293095f2587901bddf9f2b968e
    - X-Expose-Secrets: encrypted

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

❌ Mock-LLM Docker E2E Test Results

12/14 passed · 1 failed · 1 skipped

Commit: ba7b2aeb · Workflow run · Test artifacts

| Status | Test | Duration |
| --- | --- | --- |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key | 3.9s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured | 1.2s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error | 1.5s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key | 1.8s |
| ✅ | mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) | 1.5s |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory | 521ms |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI | 28.8s |
| ✅ | mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page | 4.2s |
| ✅ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server | 4.3s |
| ✅ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API | 4.4s |
| ✅ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM | 4.4s |
| ✅ | mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away | 3.7s |
| ❌ | mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory (1 retries) | 1.6s |
| ⏭️ | mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch (1 retries) | 0ms |

🔍 Failure details (1)

❌ mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory

Error: apiRequestContext.get: connect ECONNREFUSED ::1:18300
Call log:
  - → GET http://localhost:18300/api/settings
    - user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.7727.15 Safari/537.36
    - accept: */*
    - accept-encoding: gzip,deflate,br
    - X-Session-API-Key: 377c6b5cf09fa3a5eea2d749919051c7fae12cbcbee1c6ff81d155fd9d825e8f
    - X-Expose-Secrets: encrypted

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

Contributor

@all-hands-bot all-hands-bot left a comment

All feedback from the previous two review rounds has been addressed cleanly. The current state of the code is solid. A couple of minor observations below — neither is a blocker.


🟢 Good taste overall. This is a well-structured, comprehensive E2E test that fills a real gap in coverage for the /model slash command. The serial step split, idempotent profile setup (delete-before-post), trajectory padding documentation, shared setChatInput helper, and CI deduplication guards are all done correctly.


[IMPROVEMENT OPPORTUNITIES]

  • [tests/e2e/mock-llm/mock-llm-model-switch.spec.ts, line 244–245] After expect(switchProfileBody).toBeTruthy() on line 242, the subsequent (switchProfileBody as Record<string, unknown>)?.profile_name cast is slightly redundant — the ?. optional chain suggests null is still possible, but the assertion above already guarantees it isn't. switchProfileBody!.profile_name is more direct and makes the intent clearer (asserting on a known-non-null value). Minor nit, not a blocker.

  • [tests/e2e/mock-llm/utils/mock-llm-helpers.ts, line 394] The error thrown inside page.evaluate() reads "Chat input not found" — since testId is passed as a parameter, including it in the message (e.g. `Chat input [data-testid="${testId}"] not found`) would make failures easier to diagnose if the helper is ever extended to support different test IDs. Tiny nit.
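
Both nits can be sketched in isolation. The sketch below uses assumed names throughout: `SwitchProfileBody`, `requireCaptured` and `chatInputNotFound` are hypothetical and are not the helpers in the spec. Only the `profile_name` field and the suggested message format come from the review.

```typescript
// Hypothetical shape of the intercepted request body.
type SwitchProfileBody = { profile_name?: unknown };

// Throws when nothing was captured, so the caller receives a non-null value
// and needs neither `?.` nor a cast on the following line.
function requireCaptured(body: SwitchProfileBody | null): SwitchProfileBody {
  if (body === null) {
    throw new Error("switch_profile request was never intercepted");
  }
  return body;
}

// Error text that names the selector that failed to match.
function chatInputNotFound(testId: string): string {
  return `Chat input [data-testid="${testId}"] not found`;
}
```

The first function shows the same guarantee the `toBeTruthy()` assertion gives: once the null case has been ruled out, the property access can be written without optional chaining.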


CI changes are well-motivated:

  • Restricting workflow_run to main/master to eliminate duplicate E2E comment pairs on e2e-tests PRs is the right call.
  • The revamped concurrency group keying (pr_number || wr-{branch} || github.ref) is correct and explicit; the inline comment explaining each case is unusually good.
  • The empty pr_number= for workflow_run events (results go to step summary only, no PR comment) is documented clearly — that's the right trade-off.
  • retries: process.env.CI ? 1 : 0 is a pragmatic fix for Docker container startup races with no correctness downside.
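
The precedence in the concurrency group key (PR number, then `wr-{branch}`, then `github.ref`) can be modelled as a plain function. The real key is a GitHub Actions expression in the workflow file, so this TypeScript is only a model of the precedence; `TriggerEvent` and `concurrencyGroup` are hypothetical names.

```typescript
// Hypothetical model of the events that can trigger the E2E workflow.
type TriggerEvent =
  | { kind: "pull_request"; prNumber: number; ref: string }
  | { kind: "workflow_run"; headBranch: string; ref: string }
  | { kind: "other"; ref: string };

// Pick the concurrency group: PR number first, then the triggering
// workflow's branch, then the plain ref as a fallback.
function concurrencyGroup(event: TriggerEvent): string {
  switch (event.kind) {
    case "pull_request":
      return String(event.prNumber);
    case "workflow_run":
      return `wr-${event.headBranch}`;
    default:
      return event.ref;
  }
}
```

Two `workflow_run` triggers on main both map to `wr-main`, so the newer run cancels the older one, whereas keying by the unique run id would never produce a collision.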

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW
    New E2E test spec with zero production code changes. CI workflow changes are scoped to test infrastructure. The retry increase introduces minor latency for flaky-startup tests in CI but carries no correctness risk. The workflow_run main/master guard removes a class of duplicate comment noise without affecting coverage.

VERDICT:
Worth merging — all prior feedback addressed, code is clean, meaningful E2E coverage added for a previously untested flow.

KEY INSIGHT:
The serial step split with test.describe.configure({ mode: "serial" }) is the correct pattern here — step 2's browser interaction genuinely depends on step 1's API state, and Playwright's serial mode ensures skip-on-failure without requiring brittle test.skip guards.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and context the reviewer is missing. See the customization docs for the required frontmatter format.
  2. Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

Was this review helpful? React with 👍 or 👎 to give feedback.

Comment thread tests/e2e/mock-llm/mock-llm-model-switch.spec.ts Outdated
Comment thread tests/e2e/mock-llm/utils/mock-llm-helpers.ts Outdated
- Use switchProfileBody!.profile_name after toBeTruthy guard instead
  of optional chain (the assertion guarantees non-null)
- Include testId in setChatInput error message for easier diagnosis;
  accept optional testId parameter for future reuse

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM E2E Tests

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status Test Duration
mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key 3.5s
mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured 1.1s
mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error 1.4s
mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key 1.7s
mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) 1.4s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory 215ms
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI 24.4s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page 4.1s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server 4.2s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API 4.1s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM 4.4s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away 3.7s
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory 214ms
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch 4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM Docker E2E Test Results

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status Test Duration
mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key 3.8s
mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured 1.2s
mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error 1.4s
mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key 1.7s
mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) 2.1s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory 497ms
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI 26.9s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page 4.0s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server 4.1s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API 4.4s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM 4.3s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away 3.6s
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory 210ms
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch 4.8s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

✅ Mock-LLM Docker E2E Test Results

14/14 passed

Commit: b5562647 · Workflow run · Test artifacts

Status Test Duration
mock-llm-auth-modes.spec.ts › auth mode: non-public key rotation › recovers when localStorage has a stale session API key 3.8s
mock-llm-auth-modes.spec.ts › auth mode: public gate › shows the auth screen when no key is configured 1.2s
mock-llm-auth-modes.spec.ts › auth mode: public gate › rejects an incorrect key with an inline error 1.4s
mock-llm-auth-modes.spec.ts › auth mode: public gate › allows access after pasting the correct key 1.4s
mock-llm-auth-modes.spec.ts › auth mode: public gate › re-prompts when the server rotates its key (stale localStorage) 1.5s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 1: setup LLM profile and register automation trajectory 199ms
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 2: create automation and dispatch run via the UI 29.7s
mock-llm-automation.spec.ts › mock-LLM automation lifecycle › step 3: verify automation and run on the automations page 4.1s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 1: create an LLM profile pointing at the mock LLM server 4.1s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 2: activate the mock-llm profile and verify settings API 4.3s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 3: run a conversation with the mock LLM 4.3s
mock-llm-conversation.spec.ts › mock-LLM agent-server conversation › step 4: resume conversation from sidebar after navigating away 3.6s
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 1: configure LLM, create switch-target profile, register trajectory 216ms
mock-llm-model-switch.spec.ts › mock-LLM /model slash command › step 2: start conversation, switch profile via /model, verify switch 4.7s

Posted by the Mock-LLM E2E workflow · results are deterministic (scripted LLM responses)

@github-actions
Contributor

github-actions Bot commented Jun 1, 2026

📸 Snapshot Test Report

✅ All snapshots match the main branch baselines.

| Category | Count |
| --- | --- |
| 🔴 Changed | 0 |
| 🆕 New | 0 |
| ✅ Unchanged | 73 |
| Total | 73 |

✅ Unchanged snapshots (73)

archived-conversation

  • conversation-panel-with-archived-badges
  • conversation-view-archived
  • conversation-view-sandbox-error

automations

  • automations-delete-modal
  • automations-list-active-inactive
  • automations-no-automations
  • automations-search-no-results

backends-extended

  • backend-add-blank-disabled
  • backend-add-cloud-advanced-open
  • backend-add-cloud-no-key-disabled
  • backend-add-cloud-with-key-enabled
  • backend-add-form-partially-filled
  • backend-add-invalid-url-disabled
  • backend-add-local-ready
  • backend-add-name-only-disabled
  • backend-add-two-column-layout
  • backend-add-whitespace-host-disabled
  • backend-after-switch
  • backend-cancel-nothing-saved
  • backend-dropdown-two-backends
  • backend-edit-prefilled
  • backend-manage-after-removal
  • backend-manage-two-listed
  • backend-remove-cancelled
  • backend-remove-confirmation
  • backend-switch-overlay

backends

  • backend-add-modal
  • backend-manage-modal
  • backend-selector-open

changes-tab

  • changes-deleted-file
  • changes-diff-viewer
  • changes-empty

collapsible-thinking

  • reasoning-content-collapsed
  • reasoning-content-expanded
  • think-action-collapsed
  • think-action-expanded

mcp-page

  • mcp-custom-server-1-editor-open
  • mcp-custom-server-2-url-filled
  • mcp-custom-server-3-all-filled
  • mcp-custom-server-4-installed
  • mcp-custom-server-editor
  • mcp-empty-installed
  • mcp-search-filtered
  • mcp-slack-install-1-marketplace
  • mcp-slack-install-2-modal
  • mcp-slack-install-3-filled
  • mcp-slack-install-4-installed

onboarding

  • onboarding-step-0-choose-agent
  • onboarding-step-1-check-backend
  • onboarding-step-2-setup-llm
  • onboarding-step-3-say-hello

projects-workspace-browser

  • projects-workspace-browser

settings-page

  • add-backend-modal
  • analytics-consent-modal
  • home-screen
  • settings-app-page
  • settings-page

settings-secrets

  • secrets-add-form-filled
  • secrets-add-form
  • secrets-after-save
  • secrets-delete-confirm
  • secrets-list

settings-verification

  • condenser-settings
  • verification-settings-off
  • verification-settings-on

sidebar

  • sidebar-collapsed
  • sidebar-conversation-panel
  • sidebar-filter-menu

skills-page

  • skills-empty
  • skills-loaded
  • skills-no-match
  • skills-search-filtered
  • skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

Labels

e2e-tests Triggers mock-LLM E2E tests on PRs

3 participants