@@ -35,6 +35,7 @@ index:
- rule: public/uploads/rules/git-worktrees/rule.mdx
- rule: public/uploads/rules/start-vibe-coding-best-practices/rule.mdx
- rule: public/uploads/rules/ai-with-design-system/rule.mdx
- rule: public/uploads/rules/playwright-with-ai/rule.mdx
created: 2024-08-26T22:47:01.000Z
createdBy: 'Eddie Kranz [SSW]'
createdByEmail: EddieKranz@ssw.com.au
@@ -37,6 +37,7 @@ index:
- rule: public/uploads/rules/do-you-only-roll-forward/rule.mdx
- rule: public/uploads/rules/use-testcontainers/rule.mdx
- rule: public/uploads/rules/playwright-page-object-model/rule.mdx
- rule: public/uploads/rules/playwright-with-ai/rule.mdx

---

139 changes: 139 additions & 0 deletions public/uploads/rules/playwright-with-ai/rule.mdx
@@ -0,0 +1,139 @@
---
type: rule
title: Do you know how to use Playwright with your AI Agents?
uri: playwright-with-ai
categories:
- category: categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx
- category: categories/software-engineering/rules-to-better-testing.mdx
authors:
- title: Michael Qiu
url: 'https://www.ssw.com.au/people/michael-qiu'
- title: Hark Singh
url: 'https://www.ssw.com.au/people/hark-singh'
- title: Daniel Mackay
url: 'https://www.ssw.com.au/people/daniel-mackay'
related:
- rule: public/uploads/rules/automated-ui-testing/rule.mdx
- rule: public/uploads/rules/playwright-page-object-model/rule.mdx
- rule: public/uploads/rules/mcp-servers-for-context/rule.mdx
- rule: public/uploads/rules/ai-cli-tools/rule.mdx
guid: 05f8fdb2-bbf8-4c93-a94c-eedd362cc218
seoDescription: 'Learn the different ways to combine Playwright with AI coding agents - the Playwright MCP server, the Playwright CLI, codegen, and the official Playwright Test Agents - for UI testing, frontend development, and browser automation.'
created: 2026-05-05T05:12:20.286Z
createdBy: Michael Qiu
createdByEmail: MichaelQiu@ssw.com.au
lastUpdated: 2026-05-06T07:52:55.325Z
lastUpdatedBy: Hark
lastUpdatedByEmail: harksingh@ssw.com.au
---

Playwright is Microsoft's end-to-end browser automation framework. On its own, it is a powerful test runner that you can use to implement UI and visual tests, but paired with an AI agent it becomes much more. The agent gains the ability to write tests by recording real user flows, verify its own frontend changes in a live browser, heal flaky tests, and even automate repetitive web tasks.

There are a number of ways to give your AI Agent access to Playwright - the MCP server, the CLI, and more - and each is best suited to a different job. The trick is understanding which one to pick.

<endIntro />

<youtubeEmbed url="https://www.youtube.com/watch?v=Be0ceKN81S8" description="Video: Playwright CLI vs MCP - a new tool for your coding agent (6 min)" />

## Why pair Playwright with an AI agent?

Without access to a browser, an agent has to guess what the UI looks like. It writes selectors blindly and can't see the results of its own changes, so it can ship tests that break the moment they're run and build features that look completely wrong without realising it.

By giving the agent access to Playwright, we can close this loop and set up an iterative inner loop for the agent to check its own changes, allowing for:
* **Automated UI testing** - generating end-to-end tests for the frontend by exploring the layout of the app itself and self-healing the tests when certain selectors drift during development
* **Frontend development** - let the agent open the page it just built, click through it, and verify that its changes actually work before claiming it is done, reducing the number of iteration cycles required
* **Automation and scraping** - have agents drive multi-step web workflows (form fills, data extraction, scheduled checks, etc.) using the existing structured accessibility data instead of brittle screenshots that break with the slightest UI change

## How can you pair Playwright with an AI agent?

There are a variety of ways to give your AI Agent access to Playwright, but a few are built and optimised specifically for agents, and those integrate best.

### Playwright CLI (Recommended)

<boxEmbed
body={<>
Although the CLI is more efficient than the MCP, it requires a coding agent. It is designed for coding and testing workflows, and is best for those already using agents such as Claude Code or GitHub Copilot.

On the other hand, if your workflow is an agentic loop performing specific tasks for your scenarios, the MCP is still the superior option.
</>}
figurePrefix="none"
figure=""
style="info"
/>

The Playwright CLI driven by your AI agent is the **fastest and cheapest** option for day-to-day test authoring, running, and debugging. In benchmarks, the same prompt run via the CLI uses approximately **4× fewer tokens** than running it through the MCP.

##### Why it's token efficient

* **Results written to disk:** the agent runs a test, output goes to a file, and only the relevant lines come back into the chat context (instead of a full page snapshot every turn)
* **Independent bash commands per action:** each action is a one-shot CLI call, no long-lived session to keep alive in context

##### Best for:

* **Stateless tasks:** write a test, run a test, debug a failing test, walk a codebase
* Example: `"Go to 5 tina.io pages and check if everything is working"`

###### Docs and setup

Docs: [https://playwright.dev/agent-cli/introduction](https://playwright.dev/agent-cli/introduction)

```bash
npm install -g @playwright/cli@latest
playwright-cli install --skills
```

### Playwright MCP server

The Playwright MCP server works by exposing your browser to your AI Agent via the [Model Context Protocol](https://modelcontextprotocol.io/). Instead of taking screenshots and navigating from them, it sends the page's **accessibility tree** (i.e. ARIA roles, labels, and states), which is deterministic, LLM-friendly, and avoids the cost of vision models.
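To make that concrete, here is an illustrative snapshot of a simple page. The shape follows Playwright's ARIA snapshot format; the page content itself is made up:

```yaml
- banner:
  - heading "Acme Store" [level=1]
- navigation:
  - link "Products"
  - link "Cart"
- main:
  - searchbox "Search products"
  - button "Add to cart"
```

Because the agent sees roles and accessible names rather than pixels, it can target elements like `button "Add to cart"` deterministically, with no vision model in the loop.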

##### Best for:

* **Stateful tasks:** the agent needs to explore an app across multiple turns (e.g. clicking around to figure out how to reproduce a reported bug)
* You're building an **agentic workflow** where the same browser session carries context between steps (e.g. "log in, add three items to cart, check out, verify the receipt email")

**Trade-off:** higher token usage, because the page snapshot is part of the chat context every turn.

###### Docs and setup

Docs: [https://playwright.dev/mcp/introduction](https://playwright.dev/mcp/introduction)

```
AI Prompt: add playwright mcp to claude
```
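If you prefer to configure the server by hand instead of prompting your agent, the common shape of the config entry (used by Claude Desktop and similar MCP clients - the exact file location varies by client) looks like this:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

At the time of writing, the server also documents flags for restricting which origins the browser may visit - check the MCP docs before pointing it at anything sensitive.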

<boxEmbed
body={<>
Playwright MCP is **NOT** a security boundary. By default, it can navigate to any URL that the agent asks for, and submit any form.

You should **always** scope it to **safe origins**, and never point it at production with real credentials.
</>}
figurePrefix="none"
figure=""
style="warning"
/>

### Playwright Test Agents (Planner, Generator, Healer)

Playwright now ships with three Test Agents out of the box that cover the entire test lifecycle:

* 🎭 **Planner** - explores the app and produces a Markdown test plan covering one or many scenarios and user flows; the plan is human-readable but accurate enough to drive test generation
* 🎭 **Generator** - takes the Markdown test plan and produces executable Playwright tests, verifying selectors and assertions against the live app as it performs each scenario
* 🎭 **Healer** - automatically repairs failing tests by replaying the failing steps, inspecting the current UI, suggesting a patch, and rerunning until the test passes or the built-in guardrails stop the loop
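At the time of writing, the Playwright docs describe generating the agent definitions for your coding agent with a single command; the `--loop` value depends on which agent environment you use, so check the current docs for the supported options:

```shell
# Generates Planner/Generator/Healer agent definitions for your
# coding agent (e.g. VS Code, Claude Code) - verify the exact
# flags against the Playwright docs for your version
npx playwright init-agents --loop=vscode
```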

## Using Playwright for AI-assisted development

Most Playwright + AI content focuses on testing, but the more interesting day-to-day use case is the **inner development loop**. Without a browser, the agent's "is it done?" check is just "the file compiles." With Playwright (typically via the MCP), the agent can open the page it just changed, click through it, and verify the behaviour the same way a human dev would refresh the browser.

This unlocks a few high-value workflows:

* **Verify frontend changes inline** - After editing a component, the agent navigates to the page, captures an accessibility snapshot, and confirms the change actually rendered. Stops the "looks good to me, ship it" failure mode where the code compiles but the UI is broken
* **Reproduce reported bugs** - When a bug ticket comes in, the agent walks through the reproduction steps in the browser, captures the broken state, then proposes a fix. Faster than the human pasting screenshots back and forth
* **Explore before editing** - Before changing a flow the agent doesn't know, it navigates the app to understand the current behaviour (e.g. *"click through the checkout so you know what state I'm starting from"*). Reduces blind edits to code the agent has never seen run
* **Deep-link inner loops** - For changes buried behind login and several clicks, the agent automates the navigation each iteration, instead of asking the human to manually click through after every save
* **Component playground verification** - Agent navigates Storybook, Histoire, or your component sandbox after a change and confirms each variant still renders correctly
* **Network interception during dev** - Agent uses Playwright's route mocking to test edge cases (slow API, 500 errors, empty states) without needing to touch the real backend
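The last bullet can be sketched as a small Playwright test. The `/api/products` route, `/products` page, and `.error-banner` selector are hypothetical placeholders for your own app:

```typescript
import { test, expect } from '@playwright/test';

// Sketch: force the products API to return a 500 so the error state
// can be verified without touching the real backend. All routes and
// selectors here are hypothetical - substitute your own.
test('shows an error banner when the API fails', async ({ page }) => {
  await page.route('**/api/products', route =>
    route.fulfill({ status: 500, body: JSON.stringify({ error: 'oops' }) })
  );

  await page.goto('/products'); // assumes baseURL is set in playwright.config.ts
  await expect(page.locator('.error-banner')).toBeVisible();
});
```

An agent driving the CLI or MCP can generate and run variations of this for other edge cases, such as empty states (`body: '[]'`) or delayed responses.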

## Learn more

* For detailed setup instructions, refer to the [Playwright documentation](https://playwright.dev/docs/intro)