AI-driven test generation and self-healing with trace-based backend validation -- proving not just that the UI looks right, but that every backend mutation actually happened correctly.
Auto QA Workbench is an autonomous QA testing tool that replaces brittle, coordinate-based E2E scripts with self-healing AI agents. It generates deterministic Playwright TypeScript tests from natural language descriptions, validates backend behavior through OpenTelemetry trace correlation (no direct database connections), and files defects in Jira automatically. It runs fully autonomously in CI/CD pipelines.
- AI-Powered Test Generation -- Describe tests in natural language; a multi-agent LangGraph pipeline (Planner, Executor, Healer, Evaluator) generates production-ready Playwright TypeScript specs
- Self-Healing Tests -- When locators break, a Healer agent re-extracts the DOM accessibility snapshot and reconstructs semantic locators (`getByRole`, `getByLabel`, `getByText`) automatically
- Trace-Based Backend Validation -- Injects OpenTelemetry `traceparent` headers and uses Tracetest to assert that backend mutations (database writes, service calls) actually happened, not just that the UI looks right
- GenAI Application Testing -- Validates AI application internals via OTel GenAI semantic conventions with version-adaptive attribute handling across 3 semconv epochs
- Ephemeral CI Infrastructure -- Testcontainers spins up OTel Collector, Jaeger, and Tracetest on demand; zero permanent infrastructure needed
- Automated Defect Filing -- Failures auto-create Jira tickets with full evidence: root-cause analysis (13-category taxonomy), trace deep links, and reproduction steps
- CI/CD Native -- JUnit XML output, structured JSON/HTML reports, headless mode; designed for pipeline integration
A LangGraph `StateGraph` orchestrates five specialized agents:
| Agent | Role |
|---|---|
| Supervisor | Routes work between agents, manages retries, enforces token budget |
| Planner | Creates step-by-step test plans by probing live DOM via browser accessibility snapshots |
| Executor | Generates deterministic Playwright TypeScript code from test plans |
| Healer | Re-extracts DOM snapshots and reconstructs broken locators using semantic selectors |
| Evaluator | Validates backend mutations via OpenTelemetry trace correlation with Tracetest assertions |
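The Supervisor's control loop can be sketched roughly as follows. This is assumed logic, not the project's implementation: the agent callables, return shapes, and status values are illustrative; only the retry-and-budget responsibilities come from the table above.

```python
# Illustrative Supervisor loop: per-agent retries plus a global token budget.
# Agent functions and their return shapes are assumptions, not the real API.

def run_pipeline(agents, max_retries=3, token_budget=150_000):
    tokens_used = 0
    for name, agent in agents:
        for attempt in range(1, max_retries + 1):
            result = agent()
            tokens_used += result["tokens"]
            if tokens_used > token_budget:
                return {"status": "aborted", "reason": "token budget exceeded"}
            if result["ok"]:
                break  # agent succeeded; move to the next one
        else:
            return {"status": "failed", "agent": name}  # retries exhausted
    return {"status": "passed", "tokens": tokens_used}

# Stub agents that succeed immediately:
agents = [(n, lambda: {"ok": True, "tokens": 500})
          for n in ("planner", "executor", "evaluator")]
print(run_pipeline(agents))
# {'status': 'passed', 'tokens': 1500}
```

The token budget is checked after every agent call, so a runaway agent aborts the whole run rather than silently burning through the budget on retries.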
Each agent uses Playwright MCP Server for browser interaction with separate MCP sessions -- the Executor intentionally starts from a clean browser state to avoid inheriting the Planner's navigated session. This prevents generated tests from silently skipping setup steps (login, navigation) that would fail in CI.
Reports are generated post-graph with three output formats: JUnit XML (CI integration), structured JSON (programmatic consumption), and HTML (human review with trace deep links).
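The JUnit XML output can be produced with nothing but the standard library. A minimal sketch (the function name and result-dict fields are assumptions, not the project's reporting code):

```python
# Minimal JUnit XML emitter sketch using only the standard library.
import xml.etree.ElementTree as ET

def to_junit_xml(suite_name: str, results: list[dict]) -> str:
    failures = sum(1 for r in results if not r["passed"])
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for r in results:
        case = ET.SubElement(suite, "testcase", name=r["name"],
                             time=f'{r["seconds"]:.3f}')
        if not r["passed"]:
            # CI systems surface this message next to the failing test case.
            ET.SubElement(case, "failure", message=r["message"])
    return ET.tostring(suite, encoding="unicode")

results = [
    {"name": "verify login page", "passed": True, "seconds": 2.1},
    {"name": "verify checkout", "passed": False, "seconds": 4.0,
     "message": "locator not found"},
]
print(to_junit_xml("auto-qa", results))
```

Because virtually every CI system (Jenkins, GitLab, GitHub Actions via plugins) parses this schema, the same report feeds dashboards without any custom integration.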
- Describe -- Provide a URL and a natural language test specification
- Plan -- The Planner agent navigates the target app, probes the DOM accessibility tree, and builds a step-by-step test plan with semantic locators
- Generate -- The Executor agent converts the plan into a deterministic Playwright TypeScript test file
- Validate -- The Evaluator agent runs the test with OTel trace injection and asserts backend behavior via Tracetest
- Heal -- If locators break on subsequent runs, the Healer agent automatically reconstructs them from fresh accessibility snapshots
- Report -- Results are written as JUnit XML, JSON, and HTML; failures auto-create Jira tickets with root-cause analysis
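The Validate step above hinges on the W3C Trace Context `traceparent` header, which is simple to construct by hand. A minimal sketch (the helper name is illustrative; the header layout follows the W3C spec):

```python
# Building a W3C Trace Context `traceparent` header:
# version "00", a 16-byte trace-id, an 8-byte parent-id, and trace flags.
import secrets

def make_traceparent(sampled: bool = True) -> str:
    trace_id = secrets.token_hex(16)   # 32 lowercase hex chars
    parent_id = secrets.token_hex(8)   # 16 lowercase hex chars
    flags = "01" if sampled else "00"  # 01 = sampled
    return f"00-{trace_id}-{parent_id}-{flags}"

header = make_traceparent()
print(header)  # e.g. 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```

Injecting a known trace-id into the browser's requests is what lets the Evaluator look up exactly the backend spans produced by that UI action, instead of guessing by timestamp.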
- Python 3.11+
- Node.js 18+ (Playwright runtime)
- Docker (Testcontainers)
- Anthropic API key
```bash
# Install
git clone https://github.com/damir-topic/auto-qa-workbench.git
cd auto-qa-workbench
uv venv && source .venv/bin/activate
uv pip install -e .
cp .env.example .env
# Add your ANTHROPIC_API_KEY to .env

# Install Playwright browsers
npx playwright install chromium

# Generate a test
auto-qa run --url https://example.com --spec "Verify the page has a heading"

# Run the generated test
npx playwright test generated/specs/src/
```
```
cli/        # Typer CLI entry point
config/     # Pydantic settings
graph/      # LangGraph agents (supervisor, planner, executor, healer, evaluator)
mcp/        # Playwright MCP client
infra/      # Testcontainers (OTel Collector, Jaeger, Tracetest)
tracetest/  # Tracetest API client
jira/       # Jira REST API integration
genai/      # GenAI semantic convention support
codegen/    # Playwright code generation and validation
reporting/  # JUnit XML, JSON, HTML report generation
```
All settings are managed via environment variables. See `.env.example` for the full list of available options, or check `src/config/settings.py` for detailed field documentation.
| Variable | Required | Default | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | -- | Anthropic API key for Claude |
| `MODEL_NAME` | No | `claude-sonnet-4-5-20250929` | Model to use for agents |
| `MAX_RETRIES` | No | `3` | Max retry attempts per agent |
| `TOKEN_BUDGET` | No | `150000` | Token budget per run |
| `OUTPUT_DIR` | No | `generated/specs` | Output directory for specs |
| `HEADLESS` | No | `false` | Run browser in headless mode |
Jira integration is disabled by default. See .env.example for Jira configuration variables.
Each run produces a Playwright test file in `generated/specs/`:
```typescript
// generated/specs/verify-login-page.spec.ts
import { test, expect } from '@playwright/test';

test('Verify login page has username and password fields', async ({ page }) => {
  await page.goto('https://example.com/login');
  await expect(page.getByRole('textbox', { name: 'Username' })).toBeVisible();
  await expect(page.getByRole('textbox', { name: 'Password' })).toBeVisible();
  await expect(page.getByRole('button', { name: 'Sign in' })).toBeEnabled();
});
```

Tests use accessibility-driven locators exclusively -- no CSS selectors or XPath.
Apache 2.0 -- see LICENSE for details.
See CONTRIBUTING.md for development setup and guidelines.