LLM-driven QA orchestration for web apps. An agent in the terminal explores your project, generates declarative qa.yaml test specs alongside route files, enqueues browser-based QA jobs to a BullMQ worker, and reads back per-job status to diagnose failures.
.
├── src/                          # Orchestrator + worker (TypeScript, run via tsx)
│   ├── main.ts                   # Interactive chat agent (the orchestrator)
│   ├── libs/
│   │   ├── inference.ts          # OpenAI Responses API streaming + tool dispatch
│   │   ├── queue.ts              # BullMQ producer
│   │   ├── redisConnection.ts    # ioredis connection helper
│   │   ├── worker.ts             # BullMQ consumer: launches Puppeteer, runs tests
│   │   ├── test/
│   │   │   ├── run.ts            # The actual test runner (CLICK/TYPE/EXPECT/...)
│   │   │   └── prompt.ts         # (legacy codegen prompt, currently unused)
│   │   └── tools/                # Function-call tools exposed to the agent
│   │       ├── filesys.ts        # read / list / grep / write project files
│   │       ├── worker.ts         # create_worker (enqueue a QA job)
│   │       └── status.ts         # list_jobs / read_job_status / search_job_status
│   └── types/Test.ts             # RunTestParams, PageTest, ActionType
│
├── web/                          # Sample Next.js app under test (own package.json)
│   └── src/app/**/qa.yaml        # Per-route QA specs live next to page.tsx
│
├── temp/orchestration/           # Worker-written job status history (gitignored)
│   └── job-<id>-status.json      # Rolling array of { summary, tests[] } per job
│
├── docker-compose.yml            # Redis 8 (BullMQ backend)
└── package.json                  # Orchestrator + worker scripts

- Node.js 22+
- Docker (for Redis), or any Redis you want to point at
- An OpenAI API key
npm install
cp .env.example .env # if you create one; otherwise just create .env with the keys below

.env contents:
OPENAI_API_KEY=sk-... # your OpenAI key
REDIS_URL=redis://127.0.0.1:6379
QA_HEADED=1 # optional: 1 = launch Puppeteer non-headless, 0/unset = headless
QA_BASE_URL=http://localhost:3000 # optional, base URL the worker hits (defaults to localhost:3000)

Start Redis:
docker compose up -d redis

(Optional) Start the sample Next.js app on port 3000 so the worker has something to test:
cd web
npm install
npm run dev

In two terminals:
# Terminal 1 — the BullMQ worker (Puppeteer runs here)
npm run dev:worker
# Terminal 2 — the orchestrator chat agent
npm run dev

The chat agent will introduce itself briefly, then wait. Try things like:
- list the qa specs in this project
- run full QA
- what jobs have we run?
- why did job 36 fail?
- regenerate the /todos qa.yaml

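
To make the .env keys listed earlier more concrete, here is a minimal sketch of how a worker could turn them into settings. `WorkerConfig`, `readWorkerConfig`, and `resolveRoute` are hypothetical names, not taken from the repository, and the Redis fallback is an assumption; only the `QA_HEADED` and `QA_BASE_URL` semantics come from the comments in the .env listing.

```typescript
// Hypothetical config reader; the real worker may parse these values differently.
interface WorkerConfig {
  redisUrl: string;
  headless: boolean;
  baseUrl: string;
}

type Env = Record<string, string | undefined>;

function readWorkerConfig(env: Env): WorkerConfig {
  return {
    // Assumed fallback: the local Redis started by docker compose.
    redisUrl: env.REDIS_URL ?? "redis://127.0.0.1:6379",
    // QA_HEADED=1 launches a visible browser; 0 or unset stays headless.
    headless: env.QA_HEADED !== "1",
    // Documented default base URL for the app under test.
    baseUrl: env.QA_BASE_URL ?? "http://localhost:3000",
  };
}

// Resolve a spec's route against the base URL with the standard URL class.
function resolveRoute(config: WorkerConfig, route: string): string {
  return new URL(route, config.baseUrl).toString();
}
```

In a Node.js process this would typically be called as `readWorkerConfig(process.env)`. Taking the environment as a parameter keeps the function easy to exercise without touching real environment variables.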
At runtime the pieces fit together as follows:
- The orchestrator (`src/main.ts`) talks to the model via `streamText` (`src/libs/inference.ts`), which exposes the file, worker, and status tool sets.
- When the model calls `create_worker`, `src/libs/tools/worker.ts` validates and pushes a job onto the `RUN_TEST` BullMQ queue (`src/libs/queue.ts`).
- `src/libs/worker.ts` picks up the job, launches Puppeteer, and runs each `PageTest` via `src/libs/test/run.ts`.
- After each job, the worker asks the model to summarize the run, then appends a `{ summary, tests[] }` entry to `temp/orchestration/job-<id>-status.json` (capped at the last 5 entries).
- The orchestrator can read those files back via `list_jobs` / `read_job_status` / `search_job_status` to diagnose failures.

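
The rolling status history described above can be illustrated with a small pure function. This is a sketch under assumptions: `StatusEntry`, `TestResult`, `appendStatusEntry`, and `statusFileName` are hypothetical names, and the fields inside each test result are guesses. Only the `{ summary, tests[] }` shape, the cap of 5, and the file name pattern come from this README.

```typescript
// Hypothetical shapes; the fields inside each test result are assumptions.
interface TestResult {
  name: string;
  passed: boolean;
}

interface StatusEntry {
  summary: string;
  tests: TestResult[];
}

// Documented cap on stored entries per job.
const MAX_ENTRIES = 5;

// Return a new history with the entry appended, keeping only the newest MAX_ENTRIES.
function appendStatusEntry(history: StatusEntry[], entry: StatusEntry): StatusEntry[] {
  return [...history, entry].slice(-MAX_ENTRIES);
}

// File name pattern from the project layout: job-<id>-status.json
function statusFileName(jobId: string): string {
  return `job-${jobId}-status.json`;
}
```

Returning a new array instead of mutating the input keeps the capping logic separate from file I/O, so the worker could read the JSON file, call this function, and write the result back.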
One YAML file per route, sibling to the page file. Mirrors RunTestParams in src/types/Test.ts.
route: "/todos"
tests:
  - name: "Loads seeded todos"
    globalReferences: # documentation-only; runner ignores
      - '[data-testid="todos-heading"]'
    process:
      - type: EXPECT
        references:
          - '[data-testid="todos-heading"]'
        inst: "Visible with text exactly Todos"
      - type: TYPE
        references:
          - '[data-testid="todos-input"]'
        inst: "Buy bread"
      - type: CLICK
        references:
          - '[data-testid="todos-add"]'
      - type: EXPECT
        references:
          - '[data-testid="todos-list"]'
        inst: "Visible and text contains Buy bread"

Action types: CLICK, TYPE, EXPECT, HOVER, PRESS, SELECT, CHECK, UNCHECK, SCROLL, WAIT, RELOAD, GOTO. See `src/libs/test/run.ts` for exact semantics; the orchestrator system prompt in `src/main.ts` documents them in detail.

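
Since the spec mirrors `RunTestParams`, its shape might be typed roughly as follows. This is a sketch inferred from the YAML example and the action list, not a copy of `src/types/Test.ts`: `Step`, `ACTION_TYPES`, and `isActionType` are hypothetical names, and which fields are optional is an assumption.

```typescript
// Assumed shapes inferred from the YAML example; src/types/Test.ts may differ.
const ACTION_TYPES = [
  "CLICK", "TYPE", "EXPECT", "HOVER", "PRESS", "SELECT",
  "CHECK", "UNCHECK", "SCROLL", "WAIT", "RELOAD", "GOTO",
] as const;

type ActionType = (typeof ACTION_TYPES)[number];

// Hypothetical name for one entry under `process`.
interface Step {
  type: ActionType;
  references?: string[]; // CSS selectors the action targets
  inst?: string;         // literal text or expectation, depending on the type
}

interface PageTest {
  name: string;
  globalReferences?: string[]; // documentation-only; the runner ignores it
  process: Step[];
}

interface RunTestParams {
  route: string;
  tests: PageTest[];
}

// Narrow an arbitrary string (for example, one parsed from YAML) to ActionType.
function isActionType(value: string): value is ActionType {
  return (ACTION_TYPES as readonly string[]).includes(value);
}

// The first step of the /todos example expressed as a typed object.
const todosSpec: RunTestParams = {
  route: "/todos",
  tests: [
    {
      name: "Loads seeded todos",
      process: [
        {
          type: "EXPECT",
          references: ['[data-testid="todos-heading"]'],
          inst: "Visible with text exactly Todos",
        },
      ],
    },
  ],
};
```

Deriving `ActionType` from a single constant array keeps the runtime check and the compile-time union in sync.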
- `text exactly X` / `text contains X` take everything after `exactly` / `contains` as the literal expected string. Don't add commentary in `inst`.
- `TYPE` requires non-empty literal text. To test "ignores empty submit", just `CLICK` the submit button without typing first.
- For checkboxes prefer `CHECK` / `UNCHECK` over `CLICK`: they're idempotent and assert post-state.
- There's no "not present" assertion. Use `WAIT` with `inst: "hidden"` on the removed selector.
- The worker does one `page.goto(route)` per job and shares the page across every `PageTest`. There is no per-test isolation yet; either order tests so they don't depend on prior state, or start state-mutating tests with a `RELOAD`.

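
Two of the authoring rules above lend themselves to a small illustration: how an `inst` string with `text exactly` / `text contains` could be split into a mode and a literal, and how a `TYPE` step without text could be flagged before a job is enqueued. This is a sketch, not the logic in `src/libs/test/run.ts`; `parseTextExpectation`, `checkStep`, and the `Step` shape are hypothetical.

```typescript
type TextMode = "exactly" | "contains";

interface TextExpectation {
  mode: TextMode;
  expected: string;
}

// Everything after "text exactly " or "text contains " is taken as the literal.
// Returns null when the instruction has no text expectation (e.g. "hidden").
function parseTextExpectation(inst: string): TextExpectation | null {
  const match = /text (exactly|contains) ([\s\S]*)$/.exec(inst);
  if (!match) return null;
  return { mode: match[1] as TextMode, expected: match[2] ?? "" };
}

// Hypothetical minimal step shape, based on the YAML example.
interface Step {
  type: string;
  references?: string[];
  inst?: string;
}

// Collect rule violations for one step; an empty array means no problems found.
function checkStep(step: Step): string[] {
  const problems: string[] = [];
  if (step.type === "TYPE" && (step.inst ?? "").length === 0) {
    problems.push("TYPE requires non-empty literal text in inst");
  }
  return problems;
}
```

Because the whole remainder of the string becomes the expected literal, any trailing commentary in `inst` would be compared against the page text, which is why the rule says to leave it out.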
| Command | What it does |
|---|---|
| `npm run dev` | Start the orchestrator chat agent (`src/main.ts`) |
| `npm run dev:worker` | Start the BullMQ QA worker (`src/libs/worker.ts`) |
| `docker compose up -d redis` | Start Redis 8 with append-only persistence |
| `cd web && npm run dev` | Start the sample Next.js app under test |

MIT — see LICENSE. Copyright (c) 2026 Alan (Xuren) Shen.