cdp-browser-mcp

An MCP server for browser automation that uses Chrome's native Accessibility API instead of injected JavaScript. Returns the full page in a single compact snapshot — 3.3x fewer tokens than chrome-devtools-mcp and 4.6x fewer than Playwright MCP.

One cdp_navigate call returns a compact DOM with every interactive element indexed — ready for clicking, typing, and reading.

Why this exists

LLM-driven browser automation has a token problem. Every page snapshot eats context window. We benchmarked 5 browser tools across 8 page types:

Tool	Median tokens	Full page?	Notes
browser-use (80K+ stars)	~2,000 per viewport	Viewport only	Needs scroll calls to see full page
browser-autopilot / cdp-browser-mcp	~3,800	Full page	Complete DOM in 1 call
chrome-devtools-mcp (Google, 28K+ stars)	~10,900	Full page	Verbose AX tree format
Playwright MCP	~17,400	Full page	YAML ariaSnapshot format

browser-use has the smallest per-viewport size because it filters to viewport-visible elements only. But when you need the full page, scrolling adds up fast. We measured the actual full-page cost by scrolling browser-use through each page and summing all viewport snapshots:

Page	browser-use (full page)	cdp-browser-mcp (1 call)	Ratio
Wikipedia	66,067 (19 scrolls)	7,092	9.3x more
YouTube	50,544 (4 scrolls)†	3,961	12.8x more
Shoelace	26,295 (9 scrolls)	3,705	7.1x more
BBC News	15,624 (7 scrolls)	4,288	3.6x more
DataTables	8,874 (4 scrolls)	2,332	3.8x more
GitHub Issues	874 (fits in viewport)	4,006	browser-use wins
Excalidraw	986 (fits in viewport)	282	3.5x more
example.com	28 (fits in viewport)	31	~same

† YouTube has infinite scroll — we capped at 4 viewports (initial page height) for a fair comparison.

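browser-use wins only on GitHub Issues, which fits in a single viewport. The other single-viewport pages do not follow suit: example.com is a tie, and Excalidraw still costs browser-use 3.5x more. For everything else, especially content-heavy or infinite-scroll pages, the full-page approach uses far fewer total tokens.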
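The Ratio column can be reproduced directly from the two token columns. The sketch below recomputes it from the numbers in the table; the variable and function names are illustrative only and are not part of this project.

```javascript
// Token counts copied from the table above:
// [page, browser-use full page, cdp-browser-mcp single call].
const rows = [
  ["Wikipedia", 66067, 7092],
  ["YouTube", 50544, 3961],
  ["Shoelace", 26295, 3705],
  ["BBC News", 15624, 4288],
  ["DataTables", 8874, 2332],
  ["GitHub Issues", 874, 4006],
  ["Excalidraw", 986, 282],
  ["example.com", 28, 31],
];

// A ratio above 1 means browser-use spent more tokens to cover the full page.
function fullPageRatio(browserUseTokens, cdpTokens) {
  return browserUseTokens / cdpTokens;
}

// Map each page to its ratio, rounded to one decimal place.
const ratios = Object.fromEntries(
  rows.map(([page, bu, cdp]) => [page, Number(fullPageRatio(bu, cdp).toFixed(1))])
);

console.log(ratios);
```

A ratio below 1 (GitHub Issues, example.com) marks the pages where the viewport-only snapshot is the cheaper one.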

cdp-browser-mcp gives you the full page in one call, which is the most token-efficient approach when you need to see or search across the entire page. It is built on browser-autopilot, which reads Chrome's Accessibility tree via CDP and compresses it into a compact indexed format. No JavaScript injection, no DOM walking, no SVG noise.

Quick start

1. Install

git clone https://github.com/echo-lumen/cdp-browser-mcp.git
cd cdp-browser-mcp
npm install

2. Start Chrome with CDP enabled

"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir=/tmp/chrome-debug-profile \
  --no-first-run --no-default-browser-check "about:blank" &

On Linux:

google-chrome --remote-debugging-port=9222 \
  --user-data-dir=/tmp/chrome-debug-profile \
  --no-first-run --no-default-browser-check "about:blank" &
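
Before wiring up the MCP config, it can help to confirm that Chrome is actually listening. The sketch below queries Chrome's `/json/version` HTTP endpoint, which Chrome serves on the remote debugging port. It assumes Node 18+ for the global `fetch`; the helper names `cdpVersionUrl` and `checkCdp` are made up for this example and do not exist in this project.

```javascript
// Build the URL of Chrome's version endpoint from a CDP base URL.
function cdpVersionUrl(cdpUrl) {
  return new URL("/json/version", cdpUrl).href;
}

// Resolve with Chrome's version info, or throw if the port is not answering.
// fetchImpl is injectable so the function can be exercised without a browser.
async function checkCdp(cdpUrl = "http://127.0.0.1:9222", fetchImpl = fetch) {
  const res = await fetchImpl(cdpVersionUrl(cdpUrl));
  if (!res.ok) throw new Error(`CDP endpoint returned HTTP ${res.status}`);
  const info = await res.json();
  // "Browser" and "webSocketDebuggerUrl" are fields of Chrome's /json/version response.
  return { browser: info.Browser, wsUrl: info.webSocketDebuggerUrl };
}
```

Calling `checkCdp().then(console.log)` with Chrome running should print the browser version and its WebSocket debugger URL; a connection error usually means Chrome was started without `--remote-debugging-port=9222`.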

3. Add to your MCP config

Claude Code (~/.claude/mcp.json):

{
  "mcpServers": {
    "cdp-browser": {
      "command": "node",
      "args": ["/path/to/cdp-browser-mcp/server.mjs"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "cdp-browser": {
      "command": "node",
      "args": ["/path/to/cdp-browser-mcp/server.mjs"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}

Tools (17)

Navigation

Tool	Description
`cdp_navigate`	Go to a URL. Returns DOM snapshot.
`cdp_go_back`	Navigate back. Returns DOM snapshot.
`cdp_reload`	Reload page. Returns DOM snapshot.

Reading page state

Tool	Description
`cdp_snapshot`	Get the current indexed DOM.
`cdp_screenshot`	Take a screenshot (PNG or JPEG).
`cdp_page_text`	Get `document.body.innerText`. Fills gaps where the accessibility tree misses prose/bio text.

Interaction

Tool	Description
`cdp_click`	Click element by `[N]` index.
`cdp_type`	Type into element by `[N]` index. Per-character key events (works with React/Vue).
`cdp_press_key`	Press a keyboard key (Enter, Escape, Tab, etc.).
`cdp_scroll`	Scroll up or down.
`cdp_click_at`	Click at x,y coordinates.

Tabs

Tool	Description
`cdp_tabs`	List, open, switch, or close tabs.

Advanced

Tool	Description
`cdp_evaluate`	Execute arbitrary JavaScript.
`cdp_handle_dialog`	Accept or dismiss alerts/confirms/prompts.
`cdp_wait`	Wait N seconds.
`cdp_wait_for_url`	Wait until URL contains a substring.
`cdp_element_html`	Get the HTML of an element by index.
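
MCP clients invoke these tools through the protocol's JSON-RPC `tools/call` method. The sketch below builds such requests for a short navigate, click, type sequence. The tool names come from the tables above, but the argument names (`url`, `index`, `text`) are assumptions made for illustration; the server's `tools/list` response is the authoritative source for each tool's input schema.

```javascript
// Build a JSON-RPC 2.0 request for MCP's tools/call method.
function toolCall(id, name, args = {}) {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names (url, index, text) are hypothetical; check the server's
// tools/list output for the real input schemas.
const session = [
  toolCall(1, "cdp_navigate", { url: "https://example.com" }),
  toolCall(2, "cdp_click", { index: 47 }),
  toolCall(3, "cdp_type", { index: 12, text: "hello" }),
];

for (const request of session) console.log(JSON.stringify(request));
```

In practice Claude Code or Claude Desktop builds these requests for you; the sketch only shows what travels over the wire.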

How the DOM snapshot works

Every tool that changes page state returns the updated DOM snapshot automatically, so no separate cdp_snapshot call is needed afterward. Interactive elements are indexed:

heading "Example User Verified account"
[46] button "Profile Summary"
[47] button "Search"
[48] button "Following @example_user"
article "Example User @example_user Mar 6 Just published a new blog post about..."
[98] button "41 Replies. Reply"
[99] button "29 reposts. Repost"
[100] button "545 Likes. Like"

Use the [N] index with cdp_click or cdp_type. Non-interactive content (headings, articles, text) is shown for context but is not assigned an index.
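
Because indexed lines follow the `[N] role "name"` shape shown above, a client can look up an index by role and accessible name. The following is a minimal sketch of that lookup, assuming the snapshot arrives as plain text with one element per line; `parseIndexed` and `findIndex` are hypothetical helpers, not functions exported by this server.

```javascript
// Matches indexed lines such as: [47] button "Search"
const INDEXED_LINE = /^\[(\d+)\]\s+(\S+)\s+"(.*)"$/;

// Return only the interactive (indexed) elements of a snapshot.
function parseIndexed(snapshot) {
  return snapshot
    .split("\n")
    .map((line) => INDEXED_LINE.exec(line.trim()))
    .filter(Boolean)
    .map(([, index, role, name]) => ({ index: Number(index), role, name }));
}

// Find the index to pass to cdp_click or cdp_type, by role and name fragment.
function findIndex(snapshot, role, nameFragment) {
  const hit = parseIndexed(snapshot).find(
    (el) => el.role === role && el.name.includes(nameFragment)
  );
  return hit ? hit.index : null;
}

// A shortened copy of the snapshot excerpt above.
const sample = [
  'heading "Example User Verified account"',
  '[46] button "Profile Summary"',
  '[47] button "Search"',
  'article "Example User @example_user Mar 6 Just published a new blog post about..."',
  '[100] button "545 Likes. Like"',
].join("\n");

console.log(findIndex(sample, "button", "Search")); // 47
```

Unindexed lines (the heading and the article) fail the pattern and are skipped, which mirrors the rule that only interactive elements receive an index.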

Benchmarks

Methodology

All benchmarks ran on the same machine (Mac Mini, Chrome 145) with all 5 tools connecting to the same Chrome instance via CDP on port 9222. Same viewport size (1,309 × 1,309px), same pages, same session.

What the numbers measure: Each tool returns a text representation of the page (accessibility tree, DOM snapshot, or YAML). We tokenized every raw snapshot using tiktoken (cl100k_base encoding). All token counts in the tables below are real tokenizer output, not estimates.

Earlier versions of this benchmark used characters ÷ 4 as a rough proxy. Actual tokenization showed this underestimates by 4–20% depending on the tool's format — structured text with brackets, short attribute names, and numbers tokenizes less efficiently than prose. The ratios between tools also shifted because each format tokenizes at a different rate (e.g. browser-use averages 3.3 chars/token, cdp-browser-mcp averages 3.6, chrome-devtools-mcp averages 3.2).
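
The size of that underestimate follows from the chars-per-token averages. If a format really averages r characters per token, a characters ÷ 4 estimate falls short of the real count by 1 − r/4. The sketch below applies this to the three averages quoted above; the function name is illustrative.

```javascript
// Fraction by which a chars/4 estimate falls short of the real token count,
// for a format that actually averages `charsPerToken` characters per token.
function underestimate(charsPerToken) {
  // estimate = chars / 4, actual = chars / charsPerToken
  // shortfall = 1 - estimate / actual = 1 - charsPerToken / 4
  return 1 - charsPerToken / 4;
}

// Average chars/token per tool, as quoted in the paragraph above.
const rates = { "browser-use": 3.3, "cdp-browser-mcp": 3.6, "chrome-devtools-mcp": 3.2 };

for (const [tool, rate] of Object.entries(rates)) {
  const pct = Math.round(underestimate(rate) * 1000) / 10;
  console.log(`${tool}: chars/4 is ${pct}% below the real count`);
}
```

The three results (17.5%, 10%, and 20%) fall inside the 4–20% range reported above.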

How snapshots were captured: Navigate to URL → wait 2–3 seconds for page load → capture the tool's DOM/accessibility representation. Each tool uses its own serialization format (see format comparison below).

Full-page scroll test (browser-use only): browser-use returns only the viewport-visible portion of the page. To measure the total cost of seeing the full page, we scrolled down by one viewport height at a time, re-captured the state at each position, and summed all viewport snapshots. This is what an LLM agent using browser-use would actually consume to read the entire page. For infinite-scroll pages (YouTube), we limited scrolling to the initial page height to keep the comparison fair.
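
The accounting behind the scroll test is simple: count how many viewport-sized captures cover the page, then sum the tokens of every capture. The sketch below shows both steps under that reading of the methodology. The helper names are illustrative, and the per-viewport counts in the last call are made-up numbers used only to demonstrate the summation.

```javascript
// Number of viewport-sized captures needed to cover a page of the given height.
function viewportsNeeded(pageHeightPx, viewportHeightPx) {
  return Math.max(1, Math.ceil(pageHeightPx / viewportHeightPx));
}

// Total cost of reading the page: every viewport snapshot is sent to the model.
function fullPageCost(tokensPerViewport) {
  return tokensPerViewport.reduce((sum, tokens) => sum + tokens, 0);
}

// YouTube's initial page height (4,318px) and the benchmark viewport height
// (1,309px), both taken from this README.
console.log(viewportsNeeded(4318, 1309)); // 4

// Hypothetical per-viewport token counts, only to show the summation.
console.log(fullPageCost([2000, 1800, 1500])); // 5300
```

With these inputs the viewport count matches the 4-viewport cap used for YouTube.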

Per-snapshot results (8 page types, 2026-03-07)

All values are tokens measured with tiktoken cl100k_base. browser-use shows both per-viewport and full-page cost (sum of all scroll positions):

#	Page type	browser-use (viewport)	browser-use (full page)	cdp-browser-mcp	chrome-devtools-mcp	Playwright MCP
1	Wikipedia (static article)	2,698	66,067 (19 scrolls)	7,092	67,946	58,138
2	DataTables (data table)	2,265	8,874 (4 scrolls)	2,332	8,960	9,632
3	Excalidraw (canvas app)	980	986	282	1,075	1,553
4	GitHub Issues (React SPA)	899	874	4,006	9,616	17,597
5	YouTube (iframe-heavy)	11,510	50,544 (4 scrolls)†	3,961	12,246	17,161
6	Shoelace (web components)	2,320	26,295 (9 scrolls)	3,705	12,120	17,739
7	example.com (minimal)	28	28	31	98	87
8	BBC News (news + ads)	1,819	15,624 (7 scrolls)	4,288	13,232	20,798

† YouTube has infinite scroll — new content lazy-loads as you scroll. We capped at 4 viewports (covering the initial 4,318px page height) for a fair comparison.

cdp-browser-mcp is a thin MCP wrapper around browser-autopilot's CDPBrowser — they produce identical snapshots (verified by running both independently on all 8 pages). The MCP layer adds zero token overhead.

browser-use full-page cost: When you need to see the entire page, browser-use's per-viewport advantage disappears. Wikipedia costs 66K tokens across 19 scrolls vs 7K in a single cdp-browser-mcp call (9.3x). YouTube costs 51K across 4 scrolls vs 4K (12.8x). browser-use wins on only one page, GitHub Issues, which fits in a single viewport (874 vs 4,006).

cdp-browser-mcp vs chrome-devtools-mcp: median 3.3x fewer tokens, range 2.4x–9.6x.

cdp-browser-mcp vs Playwright MCP: median 4.6x fewer tokens, range 2.8x–8.2x.
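
The Playwright MCP summary can be rechecked from the per-snapshot table. The sketch below assumes the summary is the median, minimum, and maximum of the per-page ratios; the README does not spell out the exact method, so treat that as an assumption. It covers the Playwright MCP column only.

```javascript
// [cdp-browser-mcp tokens, Playwright MCP tokens] per page, from the table above.
const pages = {
  Wikipedia: [7092, 58138],
  DataTables: [2332, 9632],
  Excalidraw: [282, 1553],
  "GitHub Issues": [4006, 17597],
  YouTube: [3961, 17161],
  Shoelace: [3705, 17739],
  "example.com": [31, 87],
  "BBC News": [4288, 20798],
};

// Median of a list of numbers; averages the two middle values for even lengths.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Per-page ratio: how many times more tokens Playwright MCP used.
const perPage = Object.values(pages).map(([cdp, other]) => other / cdp);

const summary = {
  median: median(perPage).toFixed(1),
  min: Math.min(...perPage).toFixed(1),
  max: Math.max(...perPage).toFixed(1),
};
console.log(summary);
```

Under that assumption the output matches the quoted figures: median 4.6, range 2.8 to 8.2.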

Why the differences?

Same Chrome instance, same pages — the differences come from how each tool serializes the page:

Tool	What it captures	Output format
browser-use	Viewport-visible elements only, filtered by paint order, 100-char text cap	`[N]<tag attr=val />` — HTML-like syntax
browser-autopilot / cdp-browser-mcp	Full page accessibility tree via CDP	`[N] role "name"` — compact indexed format, text inlined
chrome-devtools-mcp	Full page accessibility tree via Puppeteer	`uid=` prefix on every node, separate `StaticText` children
Playwright MCP	Full page via Playwright's `ariaSnapshot()`	YAML with `[ref=]` tags, nested indentation

Notes:

Amazon blocked cdp-browser-mcp and chrome-devtools-mcp (anti-bot) but loaded for browser-use (5,368 tokens). GitHub login/dashboard differed due to auth state. Both excluded from cross-tool ratios.
chrome-devtools-mcp has additional capabilities not tested here (performance tracing, Lighthouse audits, network inspection, device emulation).
browser-use has additional capabilities not tested here (autonomous agent loop, CAPTCHA solving, vision mode with screenshots).

Anti-bot detection

Browser fingerprinting: All 5 tools produced identical results against bot.sannysoft.com (30/30 pass) and Cloudflare Turnstile:

Signal	Result (all tools)
`navigator.webdriver`	`false`
Chrome object present	Yes
Plugin count	5
Sannysoft pass rate	30/30
Cloudflare Turnstile	Challenge shown

No stealth advantage at the fingerprinting level — they all present as real Chrome.

Real-world anti-bot (Amazon): During the initial benchmark run, Amazon served a 503 error page ("Sorry! Something went wrong!") to cdp-browser-mcp, chrome-devtools-mcp, and Playwright MCP — but loaded full search results for browser-use. All four tools were connected to the same Chrome instance via CDP.

We re-tested Amazon separately and all tools — including cdp-browser-mcp — loaded the full page without issue. The original blocking was likely rate-based: running 5 tools in rapid succession against the same pages triggered Amazon's request-rate detection, and the order tools ran in determined who got blocked (browser-use ran first).

Takeaway: basic fingerprinting tests (sannysoft, Cloudflare) show all tools as identical. Real-world anti-bot systems like Amazon's add server-side rate limiting that can produce inconsistent results depending on test conditions — but no tool has an inherent advantage.

Raw benchmark data (JSON results + sample snapshots) is available in the benchmarks/ directory.

When to use other tools instead

browser-use (80K+ stars):

You want an autonomous agent that drives the browser with an LLM loop (not just MCP tools)
You prefer viewport-only snapshots (smaller per call, but requires scrolling)
You need built-in CAPTCHA solving or vision-based interaction
You're building a Python pipeline, not using Claude Code/Desktop

chrome-devtools-mcp (Google, 28K+ stars):

You need performance profiling, Lighthouse audits, or memory snapshots
You need network request inspection or device emulation
You're debugging a web app, not automating browser tasks for an LLM agent

Playwright MCP:

You don't have Chrome running with CDP on port 9222 (Playwright launches its own browser)
You need Playwright-specific features like file upload or form fill helpers

How it works

This server is a thin MCP wrapper around browser-autopilot's CDPBrowser class, which reads Chrome's Accessibility tree via the Chrome DevTools Protocol. The key insight: Chrome already computes a semantic accessibility tree for screen readers. Instead of injecting JavaScript to walk the DOM and guess which elements are interactive, we just read what Chrome already knows.

The result is a compact, accurate representation of the page that naturally filters out SVG paths, hidden elements, and decorative wrappers — exactly the noise that inflates other tools' token counts.
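
To make the idea concrete, here is a minimal sketch of turning accessibility nodes into the compact indexed format. It is not browser-autopilot's actual implementation. It assumes input shaped like the `nodes` array returned by CDP's `Accessibility.getFullAXTree` (each node carrying `ignored`, `role.value`, `name.value`, and `backendDOMNodeId`), and the set of roles treated as interactive is an illustrative guess.

```javascript
// Roles treated as interactive here are an illustrative guess, not the project's list.
const INTERACTIVE_ROLES = new Set(["button", "link", "textbox", "checkbox", "combobox"]);

// Flatten AXNode-like objects into compact lines: interactive nodes get an
// [N] index, other named nodes are printed without one.
function compactSnapshot(nodes) {
  const lines = [];
  const indexToBackendId = new Map(); // [N] -> backendDOMNodeId, for later clicks
  let next = 1;
  for (const node of nodes) {
    const role = node.role?.value;
    const name = node.name?.value?.trim();
    // Drop ignored nodes and unnamed wrappers; they carry no useful context.
    if (node.ignored || !role || !name) continue;
    if (INTERACTIVE_ROLES.has(role)) {
      indexToBackendId.set(next, node.backendDOMNodeId);
      lines.push(`[${next++}] ${role} "${name}"`);
    } else {
      lines.push(`${role} "${name}"`);
    }
  }
  return { text: lines.join("\n"), indexToBackendId };
}

// Hand-written sample nodes standing in for a real getFullAXTree response.
const sampleNodes = [
  { nodeId: "1", ignored: false, role: { value: "heading" }, name: { value: "Example Domain" }, backendDOMNodeId: 5 },
  { nodeId: "2", ignored: true },
  { nodeId: "3", ignored: false, role: { value: "generic" }, name: { value: "" }, backendDOMNodeId: 6 },
  { nodeId: "4", ignored: false, role: { value: "link" }, name: { value: "More information..." }, backendDOMNodeId: 7 },
];

console.log(compactSnapshot(sampleNodes).text);
```

Keeping a map from `[N]` to `backendDOMNodeId` is one plausible way to resolve a later click back to a DOM node. A real implementation would also need to handle details this sketch ignores, such as duplicated text children and nested frames.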

Built on browser-autopilot

browser-autopilot is the underlying engine by @eigengajesh. It's a full autonomous browser agent framework that goes well beyond what this MCP server exposes:

Capability	cdp-browser-mcp	browser-autopilot
CDP browser control	17 MCP tools	Full CDPBrowser class + 25+ agent tools
Compact AX tree snapshots	Yes (the whole point)	Yes (this is where it comes from)
Autonomous agent loop	No (you're the agent)	Yes — multi-step LLM-driven via Vercel AI SDK
X11 fallback (Linux)	No	Yes — real mouse/keyboard when CDP gets blocked
Login orchestration	No	Yes — cached sessions, CDP login, X11 fallback
CAPTCHA solving	No	Yes — Capsolver, 2Captcha integration
File upload/paste	No	Yes — `upload_file`, `paste_content`, `paste_image`
Docker/cloud deployment	No	Yes — Xvfb, noVNC, containerized

If you need the MCP interface for Claude Code / Claude Desktop / Cursor — use this server. If you need the full autonomous agent framework for headless pipelines or authenticated workflows — use browser-autopilot directly.

npm install browser-autopilot ai zod

import { CDPBrowser, runAgent } from "browser-autopilot";

const browser = new CDPBrowser();
await browser.connect();

const { result, success } = await runAgent({
  browser,
  task: "Go to wikipedia.org and find the population of Tokyo.",
});

License

MIT
