Add AGENTS.md with Docker E2E setup + fix E2E login state race #3815

Closed
yasserfaraazkhan wants to merge 26 commits into master from copy_cursor/setup-agents-md-c5d4

Conversation

@yasserfaraazkhan
Contributor

@yasserfaraazkhan yasserfaraazkhan commented May 8, 2026

Summary
AGENTS.md — Cursor Cloud-specific instructions for the desktop repo. Documents Node version, headless Linux launch, and most importantly: how to spin up a local Mattermost server via Docker to run and verify E2E test fixes. No GitHub Actions workflow files needed — agents use @cursoragent PR comments or Cursor Cloud's CI monitoring.

E2E login state race fix — Added waitForLoggedIn() helper that polls ServerManager.isLoggedIn directly in the Electron main process (the source of truth), bypassing the slow multi-hop IPC chain that caused 30s timeouts on all three CI platforms. Applied to window_menu, drag_and_drop, popout_windows, tab_management tests. Also added retry-reload for bad_servers expired certificate test.
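A minimal sketch of the two-phase wait described above, assuming hypothetical `LoginProbe` and `UiWait` callbacks in place of the real Playwright `electronApp.evaluate()` and `mainWindow.waitForSelector()` calls. The name `waitForLoggedInSketch`, the 250 ms polling interval, and the callback shapes are illustrative assumptions, not the helper as committed in `e2e/helpers/login.ts`.

```typescript
// Hypothetical shapes: in the real helper these would wrap Playwright calls.
// LoginProbe ~ electronApp.evaluate(...) reading ServerManager.isLoggedIn
// UiWait     ~ mainWindow.waitForSelector('#newTabButton', {timeout})
type LoginProbe = () => Promise<boolean>;
type UiWait = (timeoutMs: number) => Promise<void>;

const sleep = (ms: number): Promise<void> =>
    new Promise((resolve) => setTimeout(resolve, ms));

async function waitForLoggedInSketch(
    isLoggedIn: LoginProbe,
    waitForUi: UiWait,
    timeoutMs = 60000, // default budget mentioned in the review summary
    intervalMs = 250, // assumed polling interval
): Promise<void> {
    const start = Date.now();

    // Phase 1: poll the main-process flag until it flips or the budget runs out.
    while (!(await isLoggedIn())) {
        if (Date.now() - start >= timeoutMs) {
            throw new Error(`waitForLoggedIn: not logged in after ${timeoutMs}ms`);
        }
        await sleep(intervalMs);
    }

    // Phase 2: give the renderer the remaining budget, but never less than 5s.
    const remaining = Math.max(5000, timeoutMs - (Date.now() - start));
    await waitForUi(remaining);
}
```

Polling the main process first means a slow IPC chain only delays the test rather than failing it, and the 5 s floor keeps the final DOM check from starting with a near-zero timeout.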

Approach for Cursor automation: Instead of adding workflow files to the repo, agents are triggered via:

- @cursoragent fix e2e comments on PRs (works today, zero setup)
- Cursor Cloud dashboard CI monitoring (configurable per-repo)
Agents follow AGENTS.md instructions to start a Docker Mattermost server, reproduce failures, fix tests, and verify fixes locally before pushing.

Release note: NONE

Change Impact: 🟢 Low

Regression Risk: Minimal. All changes are isolated to E2E test infrastructure and documentation (AGENTS.md). No production code was modified. The new waitForLoggedIn() helper safely polls ServerManager.isLoggedIn via standard Playwright electronApp.evaluate() IPC, accessing an existing, widely-used property (isLoggedIn) that is already stable in production code (used in MainPage.tsx, tabManager.ts, navigationManager.ts, MattermostWebContentsView.ts). The polling mechanism is defensive with appropriate error handling and timeouts, and the synchronization is applied only to test setup hooks in a handful of E2E tests.

QA Recommendation: Minimal manual QA required beyond automated E2E test execution. The changes are test-only and do not affect production behavior. Focus should be limited to verifying that the E2E tests (window_menu, drag_and_drop, popout_windows, tab_management, and bad_servers) run reliably with the new login synchronization logic and that the retry-reload mechanism in bad_servers properly detects ErrorView on slow CI environments. No QA testing needed on the AGENTS.md documentation changes.

Generated by CodeRabbitAI

cursoragent and others added 26 commits April 13, 2026 01:07
Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
When Matterwick provisions cloud Mattermost instances for a PR, the server
URLs are now surfaced back to the PR as a comment immediately after the
matrix is ready — while tests are still running.  This lets developers
connect to the exact same servers that are running the CI suite to
reproduce and fix failing tests without waiting for the run to finish.

Changes:
- Add findPrNumber() helper to e2e/utils/github-actions.js, extracting the
  three-step PR resolution logic (explicit input → run.pull_requests →
  branch/SHA lookup) that was previously duplicated across removeE2ELabel
  and the remove-e2e-label job.
- Add postServerInfoComment() to e2e/utils/github-actions.js.  It builds a
  markdown table of per-platform server URLs, includes the admin username
  and server version, and adds a ready-to-run shell snippet.  The comment
  is idempotent: a hidden HTML marker (<!-- e2e-server-info -->) is used to
  find and update an existing comment on re-runs rather than appending a
  new one each time.  The admin password is intentionally omitted.
- Add post-server-info job to e2e-functional.yml.  It runs in parallel with
  update-initial-status after prepare-matrix succeeds, is skipped for
  nightly runs (no PR to comment on), and requires issues:write /
  pull-requests:write permissions.  The top-level workflow permissions block
  is extended to include those two scopes.
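
The marker-based upsert described for postServerInfoComment() can be sketched as follows. This is an illustration under assumptions: `CommentStore` is a hypothetical in-memory stand-in for the GitHub issue-comments API, and `upsertServerInfo` is not the function from e2e/utils/github-actions.js.

```typescript
const SERVER_INFO_MARKER = '<!-- e2e-server-info -->';

interface PrComment {
    id: number;
    body: string;
}

// Hypothetical stand-in for the comment API used by the workflow.
interface CommentStore {
    list(): PrComment[];
    create(body: string): PrComment;
    update(id: number, body: string): void;
}

function upsertServerInfo(store: CommentStore, content: string): 'created' | 'updated' {
    const body = `${SERVER_INFO_MARKER}\n${content}`;

    // Look for a previous comment carrying the hidden marker.
    let existing: PrComment | null = null;
    for (const c of store.list()) {
        if (c.body.indexOf(SERVER_INFO_MARKER) !== -1) {
            existing = c;
            break;
        }
    }

    if (existing) {
        // Re-run: overwrite in place so the PR keeps a single info comment.
        store.update(existing.id, body);
        return 'updated';
    }
    store.create(body);
    return 'created';
}
```

Because the marker is an HTML comment, it is invisible in the rendered PR but still present in the raw body that the lookup inspects.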

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
When Matterwick provisions cloud Mattermost instances for a PR, the server
URLs are now surfaced back to the PR as a comment immediately after the
matrix is ready — while tests are still running.  This lets developers
connect to the exact same servers that are running the CI suite to
reproduce and fix failing tests without waiting for the run to finish.

Changes:
- Add findPrNumber() helper to e2e/utils/github-actions.js, extracting the
  three-step PR resolution logic (explicit input → run.pull_requests →
  branch/SHA lookup) that was previously duplicated across removeE2ELabel
  and the remove-e2e-label job.
- Add postServerInfoComment() to e2e/utils/github-actions.js.  It builds a
  markdown table of per-platform server URLs, includes the admin username
  and server version, and adds a ready-to-run shell snippet.  The comment
  is idempotent: a hidden HTML marker (<!-- e2e-server-info -->) is used to
  find and update an existing comment on re-runs rather than appending a
  new one each time.  The admin password is intentionally omitted.
- Add post-server-info job to e2e-functional.yml.  It runs in parallel with
  update-initial-status after prepare-matrix succeeds, is skipped for
  nightly runs (no PR to comment on), and requires issues:write /
  pull-requests:write permissions.  The top-level workflow permissions block
  is extended to include those two scopes.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Inputs interpolated directly into single-quoted JS string literals inside
an actions/github-script step could be escaped by a malicious dispatcher
(e.g. a value containing a single quote) to break out of the string and
run arbitrary JavaScript on the runner, which holds issues:write and
pull-requests:write permissions.

Fix: move all three user-controlled inputs (pr_number, MM_TEST_USER_NAME,
MM_SERVER_VERSION) out of the ${{ }} expression context and into env:
variables on the step, then read them via process.env.* inside the script.
The platforms JSON comes from a trusted internal job output (not a raw
workflow input) so its interpolation is safe and is left as-is.
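
To illustrate why the env-based approach is safer, the sketch below simulates both shapes in plain TypeScript. `renderInterpolated`, `readPrFromEnv`, and the `PR_NUMBER` key are hypothetical names for illustration; the first function only mimics the textual substitution that happens when an expression is pasted into a script body.

```typescript
// Mimics expression interpolation: the input becomes part of the script SOURCE.
function renderInterpolated(prInput: string): string {
    return `const pr = '${prInput}';`;
}

// Env-based shape: the script body is constant and the input is read as DATA.
function readPrFromEnv(env: Record<string, string | undefined>): string {
    return env['PR_NUMBER'] ?? '';
}

// A value containing a single quote closes the string literal early.
const hostileInput = "1'; doSomethingBad(); //";

// The "script" now contains an extra statement chosen by the dispatcher.
const renderedScript = renderInterpolated(hostileInput);

// Read via env, the same value stays an inert string.
const safeValue = readPrFromEnv({PR_NUMBER: hostileInput});
```

In the first case the hostile value changes what code runs; in the second it can only change what a variable contains, which later validation can reject.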

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Three issues raised in the CodeRabbit review of PR #3774:

1. Overly broad top-level permissions: remove issues:write and
   pull-requests:write from the workflow-level permissions block.
   Those scopes are already granted at job level on post-server-info
   and remove-e2e-label which are the only jobs that need them.

2. Comment posting not fault-tolerant: wrap postServerInfoComment in
   a try/catch inside the github-script step so a GitHub API error
   (e.g. rate limit, permissions) logs a warning but does not fail
   the post-server-info job and block the overall workflow run.

3. Loose PR number parsing in findPrNumber: replace parseInt which
   accepts '123abc' and negative values with strict validation —
   trim the input, test against /^\d+$/, then confirm the result
   is a positive integer before returning.
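
The strict validation in item 3 might look like the sketch below. `parsePrNumber` is a hypothetical name; the actual logic lives inside findPrNumber and may differ in detail.

```typescript
function parsePrNumber(raw: string | null | undefined): number | null {
    if (raw == null) {
        return null;
    }
    const trimmed = raw.trim();

    // Digits only: rejects '123abc', '-5', '1.5', and the empty string.
    if (!/^\d+$/.test(trimmed)) {
        return null;
    }

    const value = Number(trimmed);
    // Confirm a positive integer, which also rules out '0'.
    return Number.isInteger(value) && value > 0 ? value : null;
}
```

By contrast, `parseInt('123abc', 10)` returns 123 and `parseInt('-5', 10)` returns -5, which is the looseness the commit removes.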

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
…iven fixes

Matterwick destroys the provisioned cloud Mattermost servers as soon as the
E2E/Run label is removed from a PR. To allow agents to connect to those
servers and fix failing tests within the same PR run, both label removal
paths are disabled:

- e2e-functional.yml: remove-e2e-label job is commented out in full.
- e2e-label-cleanup.yml: removal logic replaced with a no-op job that logs
  a message; the workflow trigger and permissions block are preserved so
  the file remains valid YAML and the workflow still runs (harmlessly).

Re-enable both when automated fix-and-rerun is no longer needed.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Label removal:
- Disable remove-e2e-label job in e2e-functional.yml (commented out) so
  Matterwick keeps provisioned servers alive after tests finish.
- Replace the remove-e2e-label job in e2e-label-cleanup.yml with a no-op
  job for the same reason.
  Servers are destroyed when the label is removed; keeping them alive lets
  agents connect and fix failing tests in the same PR run.

Test fix — post-login tabsDisabled race (5 failing tests across all 3 OSes):
Root cause: MainPage.tabsDisabled is set to !currentServer.isLoggedIn.
After loginToMattermost() returns (web app shell ready), the
SERVER_LOGGED_IN_CHANGED IPC event still needs to travel from the server
WebContentsView through the main process ServerManager to the renderer
MainPage, where it triggers updateServers() which fetches the updated
currentServer.isLoggedIn value. Tests that called mainWindow.click('#newTabButton')
or mainWindow.waitForSelector('#newTabButton') immediately after
loginToMattermost() were racing this propagation and timing out because the
button was still disabled (tabsDisabled=true).

Fix: change the post-login wait in each affected beforeAll from
  waitForSelector('#newTabButton')          // present but possibly disabled
to
  waitForSelector('#newTabButton:not([disabled])')  // present AND enabled

Affected specs:
- e2e/specs/server_management/drag_and_drop.test.ts
- e2e/specs/server_management/popout_windows.test.ts
- e2e/specs/server_management/tab_management.test.ts
- e2e/specs/menu_bar/window_menu.test.ts

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Lint (Expected line before comment):
- Add blank line before each explanatory comment added in the previous
  commit across drag_and_drop, popout_windows, tab_management, and
  window_menu test files.

CodeRabbit review items (carried over from PR #3774 review of the
original commit, now applied to this branch):
- Remove issues:write / pull-requests:write from the workflow-level
  permissions block; those scopes already exist at job level on the
  post-server-info job.
- Add continue-on-error: true to the Post provisioned server URLs step
  so a comment API failure never blocks the overall workflow.
- Move workflow inputs (pr_number, MM_TEST_USER_NAME, MM_SERVER_VERSION)
  into env: vars and read them via process.env.* to prevent script
  injection (security fix from DryRun Security report).
- Replace parseInt() in findPrNumber with strict /^\d+$/ + positive
  integer validation to reject partial strings and negative values.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Security (DryRun Security findings on automation-changed files only):

drs_40c108a1 — Code injection via expression interpolation:
  `${{ needs.prepare-matrix.outputs.platforms }}` was interpolated
  directly into a JS string literal inside actions/github-script. A
  poisoned instance_details input could escape the shell command in
  prepare-matrix, control the platforms output, and inject arbitrary JS.
  Fix: pass PLATFORMS via env: and parse with JSON.parse(process.env.PLATFORMS)
  so the value is never textually interpolated into the script body.

drs_e5221983 — Markdown injection in PR comments:
  platform and url values from the platforms array were inserted raw into a
  Markdown table in postServerInfoComment(). An attacker controlling
  instance_details could inject pipe characters to break the table or embed
  malicious links/mentions.
  Fix: add sanitizeMd() helper in github-actions.js that escapes |, `,
  [ and ] before inserting any platform-provided value into the comment body.

CodeRabbit review (on automation-changed files only):

- Remove inputs.pr_number gate from post-server-info job condition so
  findPrNumber fallback resolution can run when pr_number is absent.
  New condition: if: ${{ !inputs.nightly }}

- Remove ~60 lines of commented-out remove-e2e-label implementation from
  e2e-functional.yml; replaced with a single explanatory comment line.
  Git history preserves the full implementation.

- Fix e2e-label-cleanup.yml noop job to never allocate a runner:
  change if condition to ${{ false }} and add permissions: {} so the job
  is skipped entirely and consumes no permissions or compute.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
…gents-md-c5d4

Merges PR #3774 into the combined branch. Conflict resolution:

e2e-functional.yml (4 conflicts):
- post-server-info job condition: took PR #3774 side (no pr_number gate,
  allows findPrNumber fallback resolution)
- Post step: took PR #3774 side (continue-on-error, PLATFORMS env var,
  JSON.parse instead of direct expression interpolation — security fix)
- remove-e2e-label block: took PR #3774 side (single comment line instead
  of ~60 lines of commented-out dead code)

e2e-label-cleanup.yml: took PR #3774 side entirely (if: false + permissions: {})

e2e/utils/github-actions.js: took PR #3774 side entirely (sanitizeMd
  injection protection, strict findPrNumber validation)

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
CodeRabbit review fixes (automation-changed files only):

AGENTS.md:
- Add 'bash' language tag to the fenced code block containing the nvm
  command so markdown linters and readers recognize it as a shell snippet.

e2e/utils/github-actions.js:

1. Comment out removeE2ELabel function body — entire implementation is
   commented out so the GitHub Actions master-branch cleanup job no longer
   removes the E2E/Run label after tests complete. Matterwick keeps the
   provisioned servers alive, allowing agents to connect and fix failures
   in the same run. Function signature is preserved; all callers continue
   to compile and run without errors.

2. sanitizeMd — extend to also strip/replace newline characters (\r, \n
   → space) and escape HTML-sensitive chars (&, <, >) in addition to the
   existing pipe/backtick/bracket escaping, preventing table-breaking and
   raw HTML injection in PR comments.

3. findPrNumber — remove fallback to prs.data[0] when no PR matches the
   run's head SHA; return null instead so the caller skips posting a
   comment rather than posting to the wrong PR.

4. postServerInfoComment — update the server lifetime note in the PR
   comment to reflect that label cleanup is disabled and servers may be
   retained after the run ends.
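
The sanitizeMd behavior described in item 2 could be sketched as below. This is an assumption about the shape of the helper, not a copy of it: the backslash escaping and the order of the replacements are illustrative choices.

```typescript
function sanitizeMd(value: string): string {
    return value
        // Newlines would end the table row, so collapse them to a space.
        .replace(/[\r\n]+/g, ' ')
        // Escape & first so the entities produced below are not double-escaped.
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        // Backslash-escape pipe, backtick, and square brackets.
        .replace(/[|`\[\]]/g, (ch) => `\\${ch}`);
}
```

Escaping brackets is enough to neutralize Markdown links, since `\[text\](url)` no longer parses as a link.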

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
New file: .github/workflows/e2e-fix-trigger.yml
  Fires on workflow_run:completed for 'Electron Playwright Tests'.
  Reads failure counts, applies triage logic, then either:
  - Skips with a PR comment if mass failure (all platforms failed or
    total >= MASS_FAILURE_THRESHOLD=15) — prevents wasting tokens on
    systemic build/infra breakages.
  - Launches a Cursor cloud agent (POST /v0/agents) with source.prUrl
    and target.autoBranch:false so fixes push to the PR's head branch
    and re-trigger a new E2E run automatically.
  The prompt tells the agent to:
  - Fix test bugs (selector changes, race conditions, wrong assertions)
    in e2e/ only, capped at 8 files per run.
  - Post a PR comment for product bugs instead of modifying tests.
  - Run each fixed spec against the live server before committing.
  Requires CURSOR_API_KEY repository secret.
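
The triage decision can be sketched as a pure function. This is a rough illustration under assumptions: `PlatformResult`, `triageFailures`, and the result labels are hypothetical, and "all platforms failed" is read here as every platform reporting at least one failure.

```typescript
interface PlatformResult {
    platform: string;
    failures: number;
}

type TriageDecision = 'nothing-to-fix' | 'skip-mass-failure' | 'launch-agent';

const MASS_FAILURE_THRESHOLD = 15;

function triageFailures(
    results: PlatformResult[],
    threshold = MASS_FAILURE_THRESHOLD,
): TriageDecision {
    const total = results.reduce((sum, r) => sum + r.failures, 0);
    if (total === 0) {
        return 'nothing-to-fix';
    }

    // Assumed reading: every platform has at least one failing test.
    const allPlatformsFailed = results.every((r) => r.failures > 0);

    // Systemic breakage: skip the agent and leave a PR comment instead.
    if (allPlatformsFailed || total >= threshold) {
        return 'skip-mass-failure';
    }
    return 'launch-agent';
}
```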

e2e/utils/github-actions.js (CodeRabbit fixes):
  - removeE2ELabel: entire function body is commented out so the
    GitHub Actions master-branch cleanup job no longer removes E2E/Run.
  - sanitizeMd: also escapes \r/\n (→ space) and & < > (HTML entities)
    to prevent table-breaking and raw HTML injection in PR comments.
  - findPrNumber: return null instead of falling back to prs.data[0]
    when no PR matches the run's head SHA.
  - postServerInfoComment: update server lifetime note to reflect that
    label cleanup is disabled and servers may be retained after the run.

AGENTS.md (CodeRabbit fix):
  - Add 'bash' language tag to the nvm fenced code block.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
New file: .github/workflows/e2e-cursor-commands.yml

Adds two @cursor commands that any PR contributor can invoke via a PR
comment. Both trigger immediately (issue_comment event) and reply with a
reaction and a confirmation comment so the user knows the command landed.

Supported commands (case-insensitive):

  @cursor fix e2e
  @cursor fix e2e failures
    Launched after a large change caused mass E2E failures that the
    automatic fix trigger skipped. The agent looks for a shared root
    cause first (renamed selector, changed IPC channel, modified config),
    fixes it at the shared layer, then handles remaining individual test
    bugs. Product bugs are reported via PR comment, not test modification.
    Hard limit: 10 test files per run.

  @cursor add e2e tests
  @cursor add e2e tests for pr
    Generates new E2E test cases covering the behavioral changes in the PR.
    The agent reads each changed non-e2e file, writes Playwright tests in
    e2e/specs/, and runs them against the live server before committing.
    Hard limit: 3 new spec files per run.

Both commands:
  - React to the triggering comment with 👀 immediately
  - Use source.prUrl + target.autoBranch:false so the agent pushes
    directly to the PR branch (triggering a new CI run)
  - Include server URLs from the existing PR comment so the agent can
    reproduce and validate against the exact same servers
  - Post a summary comment when done (or a failure notice if the API key
    is missing/invalid)

Requires: CURSOR_API_KEY repository secret.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Derived from every CodeRabbit and DryRun Security finding raised on PRs
#3773 and #3774. Each rule is direct and includes a wrong/correct example
so future agents cannot repeat the same mistakes.

Rules added:

1. Script injection — never interpolate ${{ inputs.* }} inside github-script;
   use env: + process.env.* instead. JSON arrays must use JSON.parse().

2. Markdown injection — always sanitize platform/URL values with sanitizeMd()
   (escape newlines, HTML entities, pipes, backticks, brackets) before inserting
   into PR comment tables.

3. Permissions — issues:write and pull-requests:write must be declared at
   job level, never at workflow top level.

4. Disabled/noop jobs — must set permissions: {} and if: ${{ false }} to
   prevent unnecessary runner allocation and permission grants.

5. Fault tolerance — auxiliary side-effect steps (posting comments, labels,
   notifications) must set continue-on-error: true and wrap API calls in
   try/catch with core.warning().

6. Job condition gates — do not gate on an optional input like
   inputs.pr_number != '' when fallback resolution logic can find the PR
   without it; gate on the minimum necessary condition only.

7. Dead code — do not leave commented-out job blocks in workflow YAML.
   Remove them entirely; rely on git history.

8. Input validation — use /^\d+$/ + Number.isInteger() + > 0 for PR numbers;
   never parseInt() alone. Never fall back to prs.data[0] when a SHA match
   fails.

9. workflow_run execution context — these workflows run from the DEFAULT BRANCH.
   Changes to utilities called by workflow_run must land on master to take effect.

10. Fenced code blocks — always include a language tag (bash, yaml, ts, etc.).

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Root cause of mass failures across all 3 platforms in CI run #24329752006:
Matterwick cloud servers provisioned fresh on master have no default team,
so after login the Mattermost server redirects to /select_team instead of
directly to a channel. waitForAppShell only checked for channel-view
selectors (#post_textbox, #channelHeaderTitle, search bar) which are absent
on /select_team, causing it to time out and throw:
  'loginToMattermost: login succeeded but the app shell never became ready.'
This cascaded to every test whose beforeAll called loginToMattermost
(window_menu, drag_and_drop, popout_windows, tab_management, view_menu,
copy_link, add_server_modal, and others).

Fix: add /select_team and /create_team URL matching as a fourth accepted
'app shell ready' condition in waitForAppShell. The user is authenticated
when either page appears — they just have not yet joined a team. This is
consistent with the existing behaviour for channel pages and does not
weaken the check for servers that do have teams.

Also fix: AGENTS.md heading hyphenation (CodeRabbit duplicate comment)
  'Cursor Cloud specific instructions' -> 'Cursor Cloud-specific instructions'

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Problem
-------
Matterwick provisions fresh cloud Mattermost instances on every E2E run.
Fresh servers have no default team, so after login the app lands on
/select_team instead of a channel. Every server-backed test whose
beforeAll called loginToMattermost() then failed — either throwing
'app shell never became ready' or 'Target page closed' (Windows) —
cascading to 18 failures on Windows, 7 on macOS, 5 on Linux per run.

Solution
---------
New file: e2e/utils/server-setup.js
  provisionServer() — runs once at the start of the E2E suite via
  globalSetup. Uses the Mattermost REST API (no extra npm deps; Node 18
  fetch is used) to:
    1. Authenticate as the admin user (MM_TEST_USER_NAME / MM_TEST_PASSWORD)
    2. Check whether a team named 'e2e-team' exists; create it (type: open)
       if absent.
    3. Add the admin user to the team (idempotent — 409 is silently ignored).
    4. Confirm town-square channel exists (log only).
  Fully idempotent: safe to call on servers that already have the team.
  No-ops silently when MM_TEST_SERVER_URL or MM_TEST_PASSWORD are absent
  (local runs without a server are unaffected).
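
The idempotent team setup in steps 2 and 3 can be sketched against an abstract client. `TeamApi` and `ensureTeamMembership` are hypothetical names introduced for illustration; the real provisionServer() calls the Mattermost REST API directly, and its request details are not reproduced here.

```typescript
// Hypothetical client; each method would wrap one REST call in the real helper.
interface TeamApi {
    findTeam(teamName: string): Promise<{id: string} | null>;
    createTeam(teamName: string): Promise<{id: string}>;
    // Resolves to the HTTP status code of the add-member request.
    addMember(teamId: string, userId: string): Promise<number>;
}

async function ensureTeamMembership(
    api: TeamApi,
    teamName: string,
    userId: string,
): Promise<string> {
    // Create the team only when the lookup finds nothing.
    const team = (await api.findTeam(teamName)) ?? (await api.createTeam(teamName));

    const code = await api.addMember(team.id, userId);
    // 409 is treated as "already a member", matching the commit description.
    if (code !== 409 && (code < 200 || code >= 300)) {
        throw new Error(`addMember failed with status ${code}`);
    }
    return team.id;
}
```

Calling the function twice against the same server leaves one team and one membership, which is what makes it safe to run from globalSetup on every suite start.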

e2e/global-setup.ts
  Import and await provisionServer() at the end of globalSetup so it runs
  exactly once before Playwright spins up any workers.

e2e/helpers/login.ts
  Revert the /select_team band-aid added in the previous commit — now that
  provisioning guarantees a team exists, login always lands in a channel
  and the extra URL check is no longer needed.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
…_team workaround

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Keep the branch's intentional disable of remove-e2e-label in both
e2e-functional.yml and e2e-label-cleanup.yml. Label removal is
suppressed so Matterwick does not destroy provisioned servers before
agents can connect and fix failing tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…3785)

* ci: fix security, permissions, and reliability issues in Cursor E2E automation workflows

- e2e-fix-trigger.yml:
  * Move issues/pull-requests write permissions from top-level to job-level (least privilege)
  * Pass github.event.workflow_run.id via env to avoid interpolation inside script
  * Replace parseInt('${{ steps.*.outputs.pr_number }}') with strict positive-integer
    validation via process.env (prevents injection and fixes weak parseInt fallback)
  * Add id: launch to the curl step so AGENT_URL is read from step outputs
    instead of the broken env.AGENT_URL reference
  * Add continue-on-error: true to both auxiliary comment steps
  * Fix GITHUB_OUTPUT key casing (agent_id / agent_url)

- e2e-cursor-commands.yml:
  * Move issues/pull-requests write permissions from top-level to job-level
  * Pass github.event.comment.body via COMMENT_BODY env var and read as
    process.env.COMMENT_BODY — eliminates critical script-injection vector
    where user-controlled comment content was interpolated into a JS string
  * Pass github.event.comment.id and issue.number via env vars
  * Apply strict positive-integer validation for all PR number reads
  * Add continue-on-error: true to react, reply-success, and reply-failure steps

- e2e-functional.yml:
  * Replace direct JSON-array interpolation (${{ needs.*.outputs.platforms }})
    with env: + JSON.parse() in both update-initial-status and update-final-status
    github-script blocks

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>

* ci: improve E2E fix agent prompts — run-before-commit, exact test names, per-platform URLs

The agent prompts previously told the agent to 'run the test' without providing
the build commands, server credentials, or a hard prohibition on committing
without a passing run. This meant the agent could guess fixes without verifying
them on the live servers.

Changes in e2e-fix-trigger.yml and e2e-cursor-commands.yml:

Triage / context step:
- Download JUnit XML artifacts (test-results-{linux,macos,windows}) and parse
  exact failing test names using an in-memory AdmZip parse. The agent prompt
  now contains the exact test names instead of just 'linux: 1 failure(s)'.
- Extract per-platform server URLs from the server-info PR comment individually
  (SERVER_URL_LINUX, SERVER_URL_MACOS, SERVER_URL_WINDOWS) in addition to the
  raw table block. The admin username is also extracted.
- Expose the workflow run ID so the agent can use 'gh run download' if the
  in-workflow JUnit parse fails.

Agent prompt (both fix workflows):
- Step 1: exact build commands (nvm use, npm ci x2, npm run build-test)
- Step 2: how to get exact failing spec file paths from JUnit
- Step 3 (fix flow): shared-root-cause analysis before per-spec fixes
- Step 4: reproduce BEFORE editing — exact xvfb-run command with all env vars
          filled in; hard rule 'YOU MUST see the failure before touching code'
- Step 5: fix and re-run — hard rule 'DO NOT COMMIT until re-run shows passing'
- Platform strategy: run against Linux server to validate logic; macOS/Windows
  confirmed by CI re-run after push (cannot run natively on Linux agent)
- Summary comment requirement: must include exact run command + 'PASSED' result

AGENTS.md:
- New 'Cursor secrets required for E2E fix agents' section documenting that
  MM_TEST_PASSWORD must be added as a Cursor Dashboard secret so the agent can
  authenticate against the provisioned Matterwick servers.
- New 'Running E2E tests locally on this Linux VM' section with the exact
  xvfb-run command an agent or developer should use.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>

* fix: paginate all comment pages when finding the server-info comment

The previous implementation used a single listComments call with per_page:100.
On a PR with more than 100 comments the marker would never be found, causing
postServerInfoComment to call createComment on every E2E run instead of
updating the existing comment.

Replace the single-page fetch with a paginated loop that walks all pages until
the marker is found or the last page is consumed. All other upsert logic is
unchanged.
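
A sketch of the paginated search, assuming a hypothetical `FetchPage` callback in place of the real listComments call. `findMarkedComment` is an illustrative name, and the short-page stop condition is one common way to detect the last page, not necessarily the one used in the commit.

```typescript
interface PrComment {
    id: number;
    body: string;
}

// Hypothetical page fetcher: returns up to perPage comments for a 1-based page.
type FetchPage = (page: number, perPage: number) => Promise<PrComment[]>;

async function findMarkedComment(
    fetchPage: FetchPage,
    marker: string,
    perPage = 100,
): Promise<PrComment | null> {
    for (let page = 1; ; page++) {
        const batch = await fetchPage(page, perPage);

        for (const c of batch) {
            if (c.body.indexOf(marker) !== -1) {
                return c; // stop as soon as the marker is found
            }
        }

        // A short (or empty) page means there is nothing further to fetch.
        if (batch.length < perPage) {
            return null;
        }
    }
}
```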

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>

* fix: use correct Cursor secret names for E2E credentials

Cursor secrets are injected by name. The prompts and AGENTS.md were
referencing MM_TEST_PASSWORD as if it were a Cursor secret, but the
actual secrets added to the Cursor Dashboard are named:

  MM_DESKTOP_E2E_USER_NAME        (the admin username)
  MM_DESKTOP_E2E_USER_CREDENTIALS (the admin password)

Update every run command in the agent prompts and AGENTS.md to read
the credentials from the correct env var names. The Playwright env vars
MM_TEST_USER_NAME and MM_TEST_PASSWORD are still used as the test runner
expects them — they are now assigned from the Cursor secrets:

  export MM_TEST_USER_NAME="${MM_DESKTOP_E2E_USER_NAME}"
  export MM_TEST_PASSWORD="${MM_DESKTOP_E2E_USER_CREDENTIALS}"

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
When @mm-cloud-bot cancels a run mid-flight, the install-node-dependencies
step is killed before e2e/node_modules is populated. The check-for-failures
step ran unconditionally via `if: always()`, crashing with a missing
fast-xml-parser module. Guard it with steps.install-deps.outcome == 'success'
so it is silently skipped on cancellation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The #newTabButton is conditionally rendered — it only exists in the DOM
when tabsDisabled is false (which requires isLoggedIn=true). Since the
button is removed entirely from the DOM when disabled, the original
waitForSelector('#newTabButton') already waited for login state
propagation. The :not([disabled]) suffix was redundant and caused
30s timeouts across all 3 CI platforms (6 failures on linux, 7 on
macos, 7 on windows).

For window_menu.test.ts, remove the entire added wait block since
master never had it — the existing focusMainWindow() call suffices.

For drag_and_drop, popout_windows, tab_management: restore the
original waitForSelector('#newTabButton') selector.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
Root cause: after loginToMattermost() completes (web app shell visible),
the desktop app's isLoggedIn state must travel through a multi-hop IPC
chain: web app calls desktopAPI.onLogin() → TAB_LOGIN_CHANGED IPC →
WebContentsManager.handleTabLoginChanged → ServerManager.setLoggedIn →
SERVER_LOGGED_IN_CHANGED event → MainWindow.sendToRenderer → renderer
MainPage.updateServers → getOrderedServers IPC → setState. Only then
does tabsDisabled become false and #newTabButton render.

On slow CI servers (Matterwick-provisioned instances), this chain can
take longer than the 30s timeout used by waitForSelector('#newTabButton').

Fix: add waitForLoggedIn() helper that polls ServerManager.isLoggedIn
directly in the main process via electronApp.evaluate(). This is the
source of truth — no DOM/renderer round-trip needed. The helper then
waits for #newTabButton in the renderer with the remaining budget.

Applied to: window_menu, drag_and_drop, popout_windows, tab_management.

Also: add retry-reload + increased timeout for bad_servers expired
certificate test where .ErrorView may not appear after a single reload.
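
The retry-reload idea can be sketched as below. `ReloadablePage` and `waitWithReloadRetry` are hypothetical names, and the attempt count is an assumption; in the real test the two methods would wrap Playwright page calls and the selector would be .ErrorView.

```typescript
// Hypothetical page wrapper for illustration.
interface ReloadablePage {
    reload(): Promise<void>;
    // Resolves true if the selector shows up within timeoutMs, false otherwise.
    appears(selector: string, timeoutMs: number): Promise<boolean>;
}

async function waitWithReloadRetry(
    page: ReloadablePage,
    selector: string,
    attempts = 3, // assumed retry count
    timeoutMs = 60000, // the longer wait mentioned for slow CI
): Promise<boolean> {
    for (let attempt = 1; attempt <= attempts; attempt++) {
        if (await page.appears(selector, timeoutMs)) {
            return true;
        }
        // Reload only when another attempt will follow.
        if (attempt < attempts) {
            await page.reload();
        }
    }
    return false;
}
```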

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
…ud E2E guidance

Remove the GitHub Actions-based Cursor automation (e2e-fix-trigger.yml,
e2e-cursor-commands.yml, e2e-functional.yml changes, e2e-label-cleanup.yml
changes, server-setup.js, global-setup.ts provisionServer call) in favor
of a simpler approach:

- Cursor Cloud agents can be triggered via @cursoragent PR comments or
  the Cursor dashboard's CI monitoring — no repo workflow files needed.
- AGENTS.md now contains Docker-based local Mattermost server setup
  instructions so agents can spin up their own server instance, run E2E
  tests, and verify fixes without depending on Matterwick CI servers.
- Stripped workflow-specific sections (GitHub Actions coding rules, Cursor
  secrets, etc.) from AGENTS.md — not needed for the Docker approach.

The E2E test fixes (waitForLoggedIn helper, bad_servers retry) are
preserved in separate commits.

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
…-md-c5d4

# Conflicts:
#	AGENTS.md

Co-authored-by: yasser khan <attitude3cena.yf@gmail.com>
@github-actions github-actions Bot added the E2E/Run Run Desktop E2E Tests label May 8, 2026
@mm-cloud-bot

❌ E2E Test Setup Failed

Failed to create E2E test instances: installation wait cancelled: context canceled

@coderabbitai
Contributor

coderabbitai Bot commented May 8, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce7bcb35-d0ca-4c0d-aa5a-0597d197fc09

📥 Commits

Reviewing files that changed from the base of the PR and between 9538d4d and c80551d.

📒 Files selected for processing (7)
  • AGENTS.md
  • e2e/helpers/login.ts
  • e2e/specs/menu_bar/window_menu.test.ts
  • e2e/specs/server_management/bad_servers.test.ts
  • e2e/specs/server_management/drag_and_drop.test.ts
  • e2e/specs/server_management/popout_windows.test.ts
  • e2e/specs/server_management/tab_management.test.ts


Walkthrough

This PR introduces a new E2E helper function, waitForLoggedIn, that polls the Electron main process to verify login completion. It updates four E2E specs (window_menu, drag_and_drop, popout_windows, tab_management) to call this helper rather than relying only on DOM-based readiness checks, documents Cursor Cloud environment setup including the login state propagation issue, and makes error detection in the expired-certificate test more resilient.

Changes

E2E Login Synchronization and Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Cursor Cloud Documentation (AGENTS.md) | Comprehensive setup guide for running the desktop app and E2E tests in headless Linux environments, covering Node version requirements, Electron/sandbox configuration, Docker Mattermost bootstrapping, API initialization, E2E test execution steps, troubleshooting for hanging tests, and documentation of the login state propagation flake motivating the new helper. |
| Login State Helper (e2e/helpers/login.ts) | New ElectronApplication type alias and exported waitForLoggedIn helper that polls ServerManager.isLoggedIn via the Electron main process, times out with an error if login does not complete, and waits for the renderer DOM to confirm login with a configurable timeout (default 60s, minimum 5s for final UI check). |
| Test Login Synchronization (e2e/specs/menu_bar/window_menu.test.ts, e2e/specs/server_management/drag_and_drop.test.ts, e2e/specs/server_management/popout_windows.test.ts, e2e/specs/server_management/tab_management.test.ts) | Four test specs now import and call waitForLoggedIn(electronApp, mainWindow) immediately after loginToMattermost, replacing previous DOM-selector-based readiness checks (#newTabButton) with explicit main-process polling for authentication completion. |
| ErrorView Detection Resilience (e2e/specs/server_management/bad_servers.test.ts) | Enhanced the expired-certificate test to retry renderer readiness and wait longer (60s) for .ErrorView if the initial check returns null, improving stability when error states are slow to propagate. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


@yasserfaraazkhan yasserfaraazkhan deleted the copy_cursor/setup-agents-md-c5d4 branch May 8, 2026 21:33

Labels

E2E/Run Run Desktop E2E Tests release-note-none


3 participants