lpc0387/guided-browser-runner

Guided Browser Runner 🎯

LLM-native browser automation engine: Intent DSL → Chrome headless execution. An LLM-friendly JSON interface that can stand in for Playwright / Selenium / Puppeteer in agent-driven workflows.



Why Guided Browser Runner?

Roughly 80% lower token cost than Playwright MCP (about 550 vs. 2500+ tokens per flow), with comparable flexibility through custom JS steps.

Feature Guided Runner @playwright/mcp Selenium Puppeteer
LLM-native intent DSL ❌ (raw API)
Token cost per flow ~550 ~2500+ N/A N/A
Execution speed ~2s ~10-15s ~3s ~2s
Auto LLM error recovery
Custom JS for complex ops ✅ (full Playwright)
No npm / heavy deps ✅ (Python only)

Use cases: LLM agents, browser automation, UI testing, web scraping, form filling, screenshot pipelines, AI-driven test recovery.


Quick Start

pip install guided-browser-runner

from guided_runner import GuidedRunner

runner = GuidedRunner()
result = runner.run({
    "steps": [
        {"action": "goto", "url": "https://example.com/login"},
        {"action": "fill", "selector": "#email", "value": "user@example.com"},
        {"action": "click", "selector": "#login-btn"},
        {"action": "wait", "timeout_ms": 2000},
        {"action": "expect_text", "selector": ".dashboard", "text": "Welcome"},
        # Complex action — LLM generates custom JS
        {"action": "custom", "code": """
            const fileInput = document.querySelector('#upload');
            const dt = new DataTransfer();
            dt.items.add(new File(['hello'], 'test.txt'));
            fileInput.files = dt.files;
            fileInput.dispatchEvent(new Event('change'));
        """},
        {"action": "screenshot"},
    ]
})
print(result["summary"])  # "7/7 steps in 3.2s (0 rewrites)"

How It Works

Intent DSL — LLM-native automation format

{
  "steps": [
    {"action": "click", "selector": "#login-btn"},
    {"action": "fill", "selector": "input[name=email]", "value": "admin"},
    {"action": "custom", "code": "/* LLM-generated JS for drag-and-drop */"}
  ]
}

The LLM generates a lightweight JSON plan (~150 tokens) instead of raw code. The Runner then executes the plan with no per-step LLM calls during normal operation.
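Because the plan is plain JSON, it can be checked before a browser is launched. The following is a minimal sketch, not part of the library: `REQUIRED_PARAMS` and `validate_plan` are hypothetical names, and the parameter lists are assumed from the actions documented in this README.

```python
# Hypothetical pre-flight check; not part of guided_runner's documented API.
# Parameter names are assumed from the actions described in this README.
REQUIRED_PARAMS = {
    "goto": ["url"],
    "click": ["selector"],
    "fill": ["selector", "value"],
    "wait": ["timeout_ms"],
    "expect_text": ["selector", "text"],
    "screenshot": [],
    "custom": ["code"],
}


def validate_plan(plan: dict) -> list[str]:
    """Return a list of problems; an empty list means the plan looks well-formed."""
    problems = []
    for index, step in enumerate(plan.get("steps", [])):
        action = step.get("action")
        if action not in REQUIRED_PARAMS:
            problems.append(f"step {index}: unknown action {action!r}")
            continue
        for param in REQUIRED_PARAMS[action]:
            if param not in step:
                problems.append(f"step {index}: {action} is missing {param!r}")
    return problems
```

Running a check like this on LLM output catches malformed steps cheaply, before any retry or recovery logic is involved.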

Template actions — zero token overhead

| Action | Parameters | Generated JS |
|---|---|---|
| goto | url | window.location.href = url |
| click | selector | document.querySelector(sel).click() |
| fill | selector, value | Set .value + dispatch input/change events |
| wait | timeout_ms | sleep(timeout) |
| expect_text | selector, text | Assert element contains text |
| screenshot | (none) | Captured via Chrome --screenshot flag |

Template actions execute with zero additional LLM tokens.
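To illustrate why template actions need no LLM involvement, here is a hedged sketch of how a step could be turned into JavaScript by plain string templating. `render_template_js` is a hypothetical helper, not the library's actual implementation; it covers only three of the actions and uses `json.dumps` to quote values safely as JavaScript string literals.

```python
import json


def render_template_js(step: dict) -> str:
    """Illustrative mapping from a template step to a JavaScript snippet (assumed design)."""
    action = step["action"]
    if action == "goto":
        return f"window.location.href = {json.dumps(step['url'])};"
    if action == "click":
        return f"document.querySelector({json.dumps(step['selector'])}).click();"
    if action == "fill":
        selector = json.dumps(step["selector"])
        value = json.dumps(step["value"])
        # Set the value, then fire the events that frameworks typically listen for.
        return (
            f"const el = document.querySelector({selector});"
            f" el.value = {value};"
            " el.dispatchEvent(new Event('input', {bubbles: true}));"
            " el.dispatchEvent(new Event('change', {bubbles: true}));"
        )
    raise ValueError(f"no template for action {action!r}")
```

Since the output depends only on the step's fields, the same plan always produces the same script.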

Custom actions — unlimited flexibility

{"action": "custom", "code": "/* any JavaScript */"}

Use for: drag-and-drop, file uploads, iframe switching, network intercept, multi-tab, canvas/SVG interaction, shadow DOM — anything the 6 templates don't cover.

The LLM generates the JS directly in the IntentStep, giving you full Playwright-level flexibility at ~50 tokens per custom step.

Auto error recovery — three-tier

L1 (auto):   3× retry with exponential backoff — no LLM cost
L2 (LLM):    Diagnose failure → fix selector / skip step / rollback
L3 (fatal):  Fallback to checkpoint screenshot + LLM recovery
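The L1 tier can be sketched as an ordinary retry loop. This is an assumption about the behavior, not the library's code: `retry_with_backoff`, the 0.5 s base delay, and the reading of "3× retry" as one initial attempt plus three retries are all illustrative choices.

```python
import time


def retry_with_backoff(operation, retries=3, base_delay_s=0.5, sleep=time.sleep):
    """Call operation(); on failure retry up to `retries` times, doubling the delay each time."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose: any step failure triggers a retry
            last_error = error
            if attempt < retries:
                # Delays grow as base, 2*base, 4*base, ...
                sleep(base_delay_s * 2 ** attempt)
    # All attempts failed; a real runner would escalate to the L2 tier here.
    raise last_error
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting.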

On failure the LLM receives the error, step context, and recent attempts, then returns a structured fix decision:

{
  "diagnose": "login button selector changed to .login-v2-btn",
  "type": "fix_selector",
  "new_step": {"action": "click", "selector": ".login-v2-btn"}
}
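A decision of this shape can be applied mechanically to the remaining plan. The sketch below is hypothetical: `apply_fix_decision` is not a documented function, and only the `fix_selector` type appears in this README; the `skip_step` type name is an assumption made for illustration.

```python
def apply_fix_decision(steps, failed_index, decision):
    """Return (new_steps, next_index) after applying an L2 fix decision (assumed shape)."""
    kind = decision["type"]
    if kind == "fix_selector":
        # Swap in the repaired step and retry from the same position.
        patched = list(steps)
        patched[failed_index] = decision["new_step"]
        return patched, failed_index
    if kind == "skip_step":  # assumed type name, not confirmed by the README
        return list(steps), failed_index + 1
    raise ValueError(f"unsupported decision type: {kind!r}")
```

Returning a new list rather than mutating the plan keeps the original steps available for logging or rollback.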

Comparison: Browser Automation Tools

vs Playwright / Puppeteer / Selenium

Guided Runner complements these tools rather than replacing them. Use Playwright for deterministic CI test suites; use Guided Runner for LLM-driven agentic workflows where the steps are dynamic.

vs @playwright/mcp

| | Guided Runner | @playwright/mcp |
|---|---|---|
| Architecture | Intent DSL → template execution | LLM calls tool each step |
| Token cost (3 steps) | ~550 | 2500+ (4-5× more) |
| Latency per step | ~0.5s (no LLM round-trip) | 2-5s (LLM decides) |
| Complex ops | Custom code field | Full Playwright API |
| Error recovery | Built-in L1/L2/L3 | Manual |
| Dependencies | Python 3.10+ + Chrome | npm + Playwright + MCP server |

vs Browser-use / Playwright-agent

| | Guided Runner | browser-use |
|---|---|---|
| Token efficiency | High (template + custom) | Low (LLM per action) |
| Code footprint | 382 lines | 5000+ lines |
| Custom JS | ✅ Direct | ❌ Agent abstraction |
| Checkpoint/resume | ✅ Built-in | |

Requirements

  • Python 3.10+
  • Chrome / Chromium (headless mode)
    # Install Chrome for testing (recommended)
    npx playwright install chrome
    
    # Or use system Chrome
    export GUIDED_CHROME_PATH=/usr/bin/google-chrome

Quick Install

pip install guided-browser-runner
# Or from source
git clone https://github.com/lpc0387/guided-browser-runner.git
cd guided-browser-runner
pip install -e .

Environment Variables

| Variable | Default | Description |
|---|---|---|
| GUIDED_CHROME_PATH | /opt/chrome-headless-shell-linux64/chrome-headless-shell | Path to Chrome/Chromium binary |
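The lookup this table describes can be sketched in a few lines. `resolve_chrome_path` and `DEFAULT_CHROME_PATH` are hypothetical names used for illustration; how the library actually reads the variable is not shown in this README.

```python
import os

# Default taken from the table above.
DEFAULT_CHROME_PATH = "/opt/chrome-headless-shell-linux64/chrome-headless-shell"


def resolve_chrome_path(env=None):
    """Return GUIDED_CHROME_PATH if set and non-empty, else the documented default."""
    env = os.environ if env is None else env
    return env.get("GUIDED_CHROME_PATH") or DEFAULT_CHROME_PATH
```

Passing a dictionary as `env` makes the function easy to test without touching the real environment.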

Security

  • Template actions are statically generated — no exec/eval/fetch
  • Custom code has a hard cap of 5 LLM rewrites per run
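  • All Chrome instances run with --no-sandbox --disable-gpu. Note that --no-sandbox turns off Chrome's own process sandbox rather than adding isolation, so run untrusted pages inside a container or VM.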

License

MIT — free for personal and commercial use.


Keywords: browser automation, LLM agent, intent DSL, Chrome headless, AI automation, web scraping, UI testing, Playwright alternative, Puppeteer alternative, Selenium alternative, AI test recovery, LLM-native automation, browser control, headless browser, Python automation
