diff --git a/categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx b/categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx index e7912ca0798..66837e1aa8e 100644 --- a/categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx +++ b/categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx @@ -35,6 +35,7 @@ index: - rule: public/uploads/rules/git-worktrees/rule.mdx - rule: public/uploads/rules/start-vibe-coding-best-practices/rule.mdx - rule: public/uploads/rules/ai-with-design-system/rule.mdx + - rule: public/uploads/rules/playwright-with-ai/rule.mdx created: 2024-08-26T22:47:01.000Z createdBy: 'Eddie Kranz [SSW]' createdByEmail: EddieKranz@ssw.com.au diff --git a/categories/software-engineering/rules-to-better-testing.mdx b/categories/software-engineering/rules-to-better-testing.mdx index 49884083090..eaec3c29dec 100644 --- a/categories/software-engineering/rules-to-better-testing.mdx +++ b/categories/software-engineering/rules-to-better-testing.mdx @@ -37,6 +37,7 @@ index: - rule: public/uploads/rules/do-you-only-roll-forward/rule.mdx - rule: public/uploads/rules/use-testcontainers/rule.mdx - rule: public/uploads/rules/playwright-page-object-model/rule.mdx + - rule: public/uploads/rules/playwright-with-ai/rule.mdx --- diff --git a/public/uploads/rules/playwright-with-ai/rule.mdx b/public/uploads/rules/playwright-with-ai/rule.mdx new file mode 100644 index 00000000000..2d5ce19ed21 --- /dev/null +++ b/public/uploads/rules/playwright-with-ai/rule.mdx @@ -0,0 +1,139 @@ +--- +type: rule +title: Do you know how to use Playwright with your AI Agents? 
+uri: playwright-with-ai
+categories:
+  - category: categories/artificial-intelligence/rules-to-better-ai-assisted-development.mdx
+  - category: categories/software-engineering/rules-to-better-testing.mdx
+authors:
+  - title: Michael Qiu
+    url: 'https://www.ssw.com.au/people/michael-qiu'
+  - title: Hark Singh
+    url: 'https://www.ssw.com.au/people/hark-singh'
+  - title: Daniel Mackay
+    url: 'https://www.ssw.com.au/people/daniel-mackay'
+related:
+  - rule: public/uploads/rules/automated-ui-testing/rule.mdx
+  - rule: public/uploads/rules/playwright-page-object-model/rule.mdx
+  - rule: public/uploads/rules/mcp-servers-for-context/rule.mdx
+  - rule: public/uploads/rules/ai-cli-tools/rule.mdx
+guid: 05f8fdb2-bbf8-4c93-a94c-eedd362cc218
+seoDescription: 'Learn the different ways to combine Playwright with AI coding agents - the Playwright MCP server, the Playwright CLI, codegen, and the official Playwright Test Agents - for UI testing, frontend development, and browser automation.'
+created: 2026-05-05T05:12:20.286Z
+createdBy: Michael Qiu
+createdByEmail: MichaelQiu@ssw.com.au
+lastUpdated: 2026-05-06T07:52:55.325Z
+lastUpdatedBy: Hark
+lastUpdatedByEmail: harksingh@ssw.com.au
+---
+
+Playwright is Microsoft's end-to-end browser automation framework. On its own, it is a powerful test runner that you can use to implement visual tests, but when paired with an AI Agent it becomes much more. The agent gains the ability to write tests by recording real user flows, verify its own frontend changes in a live browser, heal flaky tests, and even automate repetitive web tasks.
+
+There are a number of ways to give your AI Agent access to Playwright, whether that be through the MCP, the CLI, or otherwise, and each one is best suited to a different job. The key is understanding which one to pick.
+
+
+
+
+
+## Why pair Playwright with an AI agent?
+
+Without access to a browser, agents have to guess and assume what the UI looks like.
This means that they write selectors blindly and can't see the results of their own changes. In turn, they can ship tests that break the moment they're run and build features that look completely wrong without a second thought.
+
+By giving the agent Playwright, we can close this loop and let the agent check its own changes iteratively, allowing for:
+
+* **Automated UI testing** - generating end-to-end tests for the frontend by exploring the layout of the app itself, and self-healing the tests when selectors drift during development
+* **Frontend development** - allowing the agent to open the page that it just built, click through it, and verify that the changes actually work as intended before claiming that it is done, reducing the number of iterative cycles required
+* **Automation and scraping** - having agents drive multi-step web workflows (form fills, data extraction, scheduled checks, etc.) using structured accessibility data instead of brittle screenshots that break with the slightest UI change
+
+## How can you pair Playwright with an AI agent?
+
+There are a variety of ways to give your AI Agent access to Playwright, but a few are built and optimised specifically for AI Agents, and those integrate best.
+
+### Playwright CLI (Recommended)
+
+
+    Although the CLI is more efficient than the MCP, it requires a coding agent, is designed mainly for coding and testing, and is best for those already using agents such as Claude Code, GitHub Copilot, etc.
+
+    On the other hand, if your workflow involves building an agentic loop that performs specific tasks for your scenarios, the MCP is still the better option.
+  }
+  figurePrefix="none"
+  figure=""
+  style="info"
+/>
+
+The Playwright CLI driven by your AI agent is the **fastest and cheapest** option for day-to-day test authoring, running, and debugging.
In benchmarks, the same prompt run via the CLI uses approximately **4× fewer tokens** than running it through the MCP.
+
+##### Why is it token efficient?
+
+* **Results written to disk:** the agent runs a test, output goes to a file, and only the relevant lines come back into the chat context (instead of a full page snapshot every turn)
+* **Independent bash commands per action:** each action is a one-shot CLI call, with no long-lived session to keep alive in context
+
+##### Best for:
+
+* **Stateless tasks:** write a test, run a test, debug a failing test, walk a codebase
+* Example: `“Go to 5 tina.io pages and check if everything is working”`
+
+###### Docs and setup
+
+Docs: [https://playwright.dev/agent-cli/introduction](https://playwright.dev/agent-cli/introduction)
+
+```bash
+npm install -g @playwright/cli@latest
+playwright-cli install --skills
+```
+
+### Playwright MCP server
+
+The Playwright MCP works by exposing your browser to your AI Agent via the [Model Context Protocol](https://modelcontextprotocol.io/). Rather than taking screenshots of the page and navigating from there, it sends the page's **accessibility tree** (i.e. ARIA roles, labels, and states), which is deterministic and LLM-friendly, and avoids the cost of vision models.
+
+##### Best for:
+
+* **Stateful tasks:** the agent needs to explore an app across multiple turns (e.g. clicking around to figure out how to reproduce a reported bug)
+* You're building an **agentic workflow** where the same browser session carries context between steps (e.g. "log in, add three items to cart, check out, verify the receipt email")
+
+**Trade-off:** higher token usage, because the page snapshot is part of the chat context every turn.
+
+###### Docs and setup
+
+Docs: [https://playwright.dev/mcp/introduction](https://playwright.dev/mcp/introduction)
+
+```
+AI Prompt: add playwright mcp to claude
+```
+
+
+    Playwright MCP is **NOT** a security boundary.
By default, it can navigate to any URL that the agent asks for and submit any form.
+
+    You should **always** scope it to **safe origins**, and never point it at production with real credentials.
+  }
+  figurePrefix="none"
+  figure=""
+  style="warning"
+/>
+
+### Playwright Test Agents (Planner, Generator, Healer)
+
+Playwright now ships with 3 Playwright Test Agents out of the box that cover the entire test lifecycle.
+
+* 🎭 **Planner** - explores the app and produces a Markdown test plan that covers one or many scenarios and user flows. The plan is human-readable but still precise enough to drive test generation
+* 🎭 **Generator** - takes the Markdown test plan and produces executable Playwright tests. It also verifies selectors and assertions live as it performs the relevant scenarios
+* 🎭 **Healer** - automatically heals failing tests created by the Generator agent by replaying the failing steps, inspecting the current UI, suggesting a patch, and rerunning until the test passes or the built-in guardrails stop the loop
+
+## Using Playwright for AI-assisted development
+
+Most Playwright + AI content focuses on testing, but the more interesting day-to-day use case is the **inner development loop**. Without a browser, the agent's "is it done?" check is just "the file compiles." With Playwright (typically via the MCP), the agent can open the page it just changed, click through it, and verify the behaviour the same way a human dev would refresh the browser.
+
+This unlocks a few high-value workflows:
+
+* **Verify frontend changes inline** - After editing a component, the agent navigates to the page, captures an accessibility snapshot, and confirms the change actually rendered.
This stops the "looks good to me, ship it" failure mode where the code compiles but the UI is broken
+* **Reproduce reported bugs** - When a bug ticket comes in, the agent walks through the reproduction steps in the browser, captures the broken state, then proposes a fix. This is faster than a human pasting screenshots back and forth
+* **Explore before editing** - Before changing a flow the agent doesn't know, it navigates the app to understand the current behaviour (e.g. *"click through the checkout so you know what state I'm starting from"*). This reduces blind edits to code the agent has never seen run
+* **Deep-link inner loops** - For changes buried behind login and several clicks, the agent automates the navigation each iteration, instead of asking the human to manually click through after every save
+* **Component playground verification** - The agent navigates Storybook, Histoire, or your component sandbox after a change and confirms each variant still renders correctly
+* **Network interception during dev** - The agent uses Playwright's route mocking to test edge cases (slow API, 500 errors, empty states) without needing to touch the real backend
+
+## Learn more
+
+* For more detailed information on the specifics of setting things up, refer to the [Playwright documentation](https://playwright.dev/docs/intro)