From 786646217360675c0026c7ab30fc75844a0a8cc8 Mon Sep 17 00:00:00 2001 From: DavertMik Date: Wed, 20 May 2026 10:44:53 +0300 Subject: [PATCH 1/7] Generate cases improved --- skills/generate-cases/SKILL.md | 30 ++++-- .../generate-cases/references/writing-rule.md | 100 ++++++++++++++++++ 2 files changed, 120 insertions(+), 10 deletions(-) create mode 100644 skills/generate-cases/references/writing-rule.md diff --git a/skills/generate-cases/SKILL.md b/skills/generate-cases/SKILL.md index 9be29de..65edff4 100644 --- a/skills/generate-cases/SKILL.md +++ b/skills/generate-cases/SKILL.md @@ -22,6 +22,12 @@ Trigger this skill when the user: - Shares designs (Figma, Miro, screenshots, etc.) and wants test coverage - Mentions QA activities or testing documentation needs +### Prerequisites + +- You already checked the project structure and identified what is in this folder (source code, e2e tests, test cases) using the /project-scan skill +- You pulled existing Testomat.io tests using the /sync-cases skill +- You have switched to PLAN MODE, if available, to start the interview section + ### User interaction - When user interaction required, good to highlight it with question mark emoji ❓. Example: @@ -48,6 +54,8 @@ This skill follows an **iterative approach**. **Don't omit steps and strictly ask for user approval/feedback before proceeding to the next step**. Generate **checklist** prior to **test cases** generation even if user request sounds like "generate test cases". +WHEN AVAILABLE USE ASK TOOL WHEN ASKING USER FOR INPUT WITH CHOICES + ### Workflow Steps 1. **Gather context and goals** @@ -157,6 +165,8 @@ I've gathered information from the following sources: 2. ✏️ Type changes ``` +You must also look for existing test cases to avoid duplicating them. If existing test cases are found, report this to the user and suggest expanding them or creating a new suite. + [Wait for user approval before proceeding to Step 2.] 
### Step 2: Ask for coverage size @@ -301,24 +311,21 @@ Example: ### Step 5: Generate Detailed Test Cases + **Proceed with this step only after user approval of checklist.** If user prompted checklist generation only (not test cases), ask user if he wants to generate test cases (e.g. **"Do you want to generate test cases for this checklist?"**) or proceed to test case generation. -Rules: - - If multiple features to test or whole product => put generated test cases of each feature in separate file. -- Step should consist of **action** (with optional test data) and **expected result**. -- All checks/verifications/assertions should be in **expected result**, not as separate steps. -- **Be thorough but practical**. Cover important scenarios without overwhelming detail (unless user asks for it or select appropriate role e.g. "nerd"). -- **Use clear language**. Test steps should be unambiguous and easy to follow. +- - **Be thorough but practical**. Cover important scenarios without overwhelming detail (unless user asks for it or select appropriate role e.g. "nerd"). - If user provides **existing test cases**, follow their **style**. -- **Think like a tester**. Consider not just what works, but what could break. - **Consider the user**. Focus on user-facing functionality first. - **Be adaptable**. Adjust depth, detail, testing type and other parameters based on user feedback. - **Keep scope**. Test only the functionality under test but not the related entities. If feature linked to other features, don't test them if user didn't ask for it explicitly. -- Do not include **preconditions** like "service is running" if it's not explicitly mentioned by user or is not actually what the test is about. -- Put preparations into **preconditions**/**description** section, not as steps. + +**Follow test case writing rules**: `./references/test-case-format.md` + +Always use provided Test case format **Don't change the user's source code. 
Only generate `*.test.md` files with test cases.** @@ -348,6 +355,9 @@ Checklist should have hierarchical and categorized structure. - **IMPORTANT:** **Strictly follow the `./references/test-case-format.md` format** for **suites**, **tests** and **steps**. - Use `` and `` blocks to wrap test cases. - If required, put `tags:` and `labels:` inside **each** `` metadata block (see [test metadata](./references/test-case-format.md#test-metadata)). Not only on the suite block. +- Try to reuse existing tags and labels obtained by MCP or from other test cases. +- Variables and placeholders always must be formatted as `${variable}` or `${placeholder}` (use backticks). +- Do not use labels if you are not aware of any existing ones. - If reasonable, add test metadata like priority, preconditions, test data, labels, tags based on analyzed information and context. - **IMPORTANT: NEVER GENERATE test or suite IDs of ANY kind**: - Do NOT include Testomat.io IDs (`id: @T*`, `id: @S*`, e.g. `@T12345678`, `@S380c64db`). @@ -408,4 +418,4 @@ When the user provides an existing test cases example and/or asks for "similar" 6. Display checklist in terminal and ask for user approval - ask if user wants to proceed - ask about amount of cases / level of details (more, less, keep as is) -7. Suggest user to upload generated checklist to Testomat.io via `sync-cases` skill \ No newline at end of file +7. Suggest user to upload generated checklist to Testomat.io via `sync-cases` skill diff --git a/skills/generate-cases/references/writing-rule.md b/skills/generate-cases/references/writing-rule.md new file mode 100644 index 0000000..6ac48e9 --- /dev/null +++ b/skills/generate-cases/references/writing-rule.md @@ -0,0 +1,100 @@ +## Test Suite Writing Rules + + +Formulas, business rules, edge-case reasoning, and feature +context live in the suite/test description. +Prerequisites which are common to all test cases must be written as bullet points in the suite description. 
+Formulas and diagrams relevant to all test cases can be included as code blocks in the suite description. + + +## Test Case Writing Rules + +Describe the intent of the test case. +Intent is needed only if the title alone does not explain it. + +A test case consists of a description and steps. +Description must be clear and concise. +Focus on black-box testing: each operation and observable result must be obtained via public APIs or the UI. +Prefer the UI over public APIs when possible. +If the UI is available and the test can be performed through it, the test must use it. +Understand UI pages and components or API endpoints from source code. If source code is not available, ask the user for it. +Add system checks only if it is not clear how to test via public APIs or the UI. +It is not a unit test; usually no direct server or code access is allowed. + +Why in description, what in steps. + +Put preparations into **preconditions**/**description** section, not as steps. + + +## Preconditions + +Do not include preconditions like "service is running" if it's not explicitly mentioned by user or is not actually what the test is about. +Define the preconditions and initial state of the system as bullet points. +Omit them if they are not needed or repeat the suite description. +If relevant, include the user role. + + +## Steps + +Step should consist of **action** (with optional test data) and **expected result**. +Each step must be a simple sentence. +Avoid commas and subordinate clauses. +Steps are mechanical: click, send, read, assert. +Each step must include exact instructions. +Prefer steps that include specific values and URLs. +Prefer concrete values over general words. Replace "small", "known", "around", "e.g.", "like", "(…)" with literal numbers, strings, etc. +If no values are available, use placeholder variable names like `${url}` or `${company-title}`. +Avoid general statements in steps. Move general statements to the description.
+Do not chain multiple distinct actions or use unnecessary And / Or combinations in a single sequence. +All step actions must be clear to perform via PUBLIC API or UI +All checks/verifications/assertions should be in **expected result**, not as separate steps. +You may add a codeblock after a step if needed to: + +- write API request +- write SQL +- write shell command +- to illustrate the point using pseudocode +- etc + +Avoid using bold/italic formatting in steps. + +If expected result has multiple conditions split them into separate lines: + +Instead of: + *Expected:* Comment appears in the Run's comment list with the current user as author* + +Use: + *Expected:* Comment appears in the Run's comment list + *Expected:* Current user is the author + + +## AI Writing Patterns to Avoid + +Avoid: + +Puffery: pivotal, crucial, vital, testament, enduring legacy +Empty "-ing" phrases: ensuring reliability, showcasing features, highlighting capabilities +Promotional adjectives: groundbreaking, seamless, robust, cutting-edge +Overused AI vocabulary: delve, leverage, multifaceted, foster, realm, tapestry +Formatting overuse: excessive bullets, emoji decorations, bold on every other word + +Be specific, not grandiose. Say what it actually does. Avoid marketing language. + + +Words to watch: stands/serves as, is a testament/reminder, plays a vital/significant/crucial/pivotal role, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing/enduring/lasting impact, key turning point, indelible mark, deeply rooted, profound heritage, steadfast dedication... + +Words to watch: ensuring ..., highlighting ..., emphasizing ..., reflecting ..., underscoring ..., showcasing ..., aligns with..., contributing to... + +Words to watch: continues to captivate, groundbreaking (in the figurative sense), stunning natural beauty, enduring/lasting legacy, nestled, in the heart of, boasts a... 
+ +Words to watch: it's important/critical/crucial to note/remember/consider, may vary... + +Words to watch: In summary, In conclusion, Overall... + +Words to watch: Despite its... faces several challenges..., Despite these challenges, Challenges and Legacy, Future Outlook... + +Words to watch: align/aligns/aligning with, crucial, delve/delves/delving (pre-2025), emphasizing, enduring, enhance/enhances/enhancing, fostering, garnered/garnering, highlight/highlighted/highlighting/highlights (as a verb), interplay, intricate/intricacies, key (as an adjective), landscape, leveraging, multifaceted, notably, nuanced, realm, robust, seamless/seamlessly, shed light on, showcasing, streamline, tapestry, testament, underpin/underpins/underpinning, underscore/underscores/underscoring, vibrant, vital, ... + +Parallel constructions involving "not", "but", or "however" such as "Not only ... but ..." or "It is not just about ..., it's ..." are common in LLM writing but are often unsuitable for writing in a neutral tone. + +Avoid using emojis if not explicitly intended. 
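A generated `*.test.md` file could be spot-checked against the word lists above with a plain `grep`. The sketch below is only an illustration: the function name `flag_watch_words` is hypothetical, and the pattern covers a small hand-picked subset of the listed words, not the full list.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the skill): reads text on stdin and prints
# every line that contains one of a few "words to watch", with its line number.
# The word subset is an assumption chosen for illustration.
flag_watch_words() {
  grep -n -i -E 'delve|leverage|multifaceted|seamless|robust|tapestry|testament|pivotal'
}

# Possible usage, assuming a generated file named login.test.md:
#   flag_watch_words < login.test.md
```

Substring matching is deliberate here so that inflected forms such as "seamlessly" are caught too; it can also flag legitimate uses, so treat the hits as review prompts only.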
From 221a7c5e7ed6abfb4b6db97cf6cc079f1dd788de Mon Sep 17 00:00:00 2001 From: DavertMik Date: Wed, 27 May 2026 04:43:37 +0300 Subject: [PATCH 2/7] added a skill to setup pr testing --- .../test-management/skills/setup-pr-testing | 1 + skills/setup-pr-testing/SKILL.md | 408 ++++++++++++++++++ skills/setup-pr-testing/evals/evals.json | 26 ++ .../references/REPORTER_CONTRACT.md | 173 ++++++++ 4 files changed, 608 insertions(+) create mode 120000 plugins/test-management/skills/setup-pr-testing create mode 100644 skills/setup-pr-testing/SKILL.md create mode 100644 skills/setup-pr-testing/evals/evals.json create mode 100644 skills/setup-pr-testing/references/REPORTER_CONTRACT.md diff --git a/plugins/test-management/skills/setup-pr-testing b/plugins/test-management/skills/setup-pr-testing new file mode 120000 index 0000000..82ae913 --- /dev/null +++ b/plugins/test-management/skills/setup-pr-testing @@ -0,0 +1 @@ +../../../skills/setup-pr-testing \ No newline at end of file diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md new file mode 100644 index 0000000..f9a52ba --- /dev/null +++ b/skills/setup-pr-testing/SKILL.md @@ -0,0 +1,408 @@ +--- +name: setup-pr-testing +description: Set up change-aware PR regression testing — wire CI so every pull request runs only the manual and automated/e2e tests affected by the diff, via Testomat.io coverage maps and @testomatio/reporter. Use this skill when the user wants PR-triggered regression, "run only affected tests on PRs", selective manual runs per PR, post-deploy e2e on PRs, a coverage-driven CI pipeline, weekly/grouped PR regression runs in a rungroup, creating a Testomat.io run in CI and dispatching e2e tests to another repo, deploy-on-tag / on-release / on-merge regression, or to connect coverage.manual.yml / coverage.e2e.yml to their CI. CI-agnostic — adapts to whatever CI the project already uses (GitHub Actions, GitLab CI, Jenkins, Bitbucket, CircleCI, etc.). 
Trigger it even if the user only says "make PRs run the right tests" or "set up regression for pull requests". +license: MIT +metadata: + author: Testomat.io + version: 1.0.0 +--- + +# SETUP-PR-TESTING SKILL: What I do + +I wire a project's CI so each **pull request** (or deploy thereof) runs only the +tests affected by its change — not the whole suite. There are exactly two flows, +and they are independent: + +``` +PR open ──▶ Regression Manual Tests coverage.manual.yml ──▶ reporter run --kind manual + + TESTOMATIO_TITLE + TESTOMATIO_RUNGROUP_TITLE + +deploy done ──▶ Regression Automated Tests coverage.e2e.yml + ├─ same repo: ──▶ reporter run "" + TITLE + RUNGROUP + └─ cross repo: ──▶ reporter run --filter-list … --format=grep (compute selection) + ──▶ reporter start (pre-create empty run, capture ID) + ──▶ dispatch e2e repo with {grep, run=, env} (tests there export TESTOMATIO_RUN) +``` + +Both flows are **change-aware**: a coverage map (`coverage.*.yml`) maps source +files → test/suite IDs, and `@testomatio/reporter` filters by the PR diff so +only impacted tests are selected. + +Every run — manual or automated — lands in Testomat.io with a meaningful +**title** and inside a **rungroup**. The grouping strategy (week / day / +milestone / submodule / release) is the user's call; we ask, we do not invent. + +For cross-repo e2e, the run is **pre-created in the source repo** (where the +diff and coverage map live) and the resulting run ID is passed into the e2e +repo so its test results attach to the same run instead of opening a new one. + +I do **not** invent tests, write CI from guesswork, or assume a CI system. I +discover the project, confirm the unknowns with the user, reuse existing +coverage maps (or delegate creating them), and express the two flows in +whatever CI the project actually uses. + +## What this skill is and isn't + +This skill is the **method**, not a catalogue of CI snippets. The valuable, +non-obvious knowledge is: + +1. 
the two-flow model and how they differ (manual = no deploy dependency; + automated = after deploy, never blocks the release); +2. the `@testomatio/reporter` command contract — title/rungroup env vars, the + `reporter start` "empty run" pattern, and the cross-repo `--filter-list + --format=grep` → `start` → dispatch flow (see + [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)); +3. the decisions — coverage maps required, where e2e runs, what to ask the user + (deploy trigger, completion signal, rungroup strategy, diff base). + +Translating a trigger into a specific CI's YAML/Groovy/config is **not** special +knowledge — you already know how GitHub Actions, GitLab CI, Jenkins, or any +other CI express "run on PR open", "run after job X", and "don't fail the +pipeline". Write that yourself for the CI in front of you. Do not expect, or +add, a per-CI recipe file. + +## When to use + +- "Run only the tests affected by a PR", "selective regression on pull requests". +- "Create a pending manual run for each PR with just the relevant cases". +- "Run e2e tests after a PR is deployed, but only the affected ones". +- "Hook coverage.manual.yml / coverage.e2e.yml into our CI". + +--- + +## CRITICAL CONSTRAINTS + +- **Discovery first, always.** Never write CI before delegating to `project-scan` + (frameworks, manual + automated tests, where e2e tests live). Decisions below + depend on its result. +- **Never assume the CI system, and never hardcode one.** Identify the CI the + project already uses by reading the repo; if you cannot tell, ask. Then write + config for *that* CI from your own knowledge of it. +- **Regression cannot work without a coverage map.** If the needed + `coverage.*.yml` is missing, the only correct move is to propose creating it + via the coverage skills — not to hand-write a mapping or skip filtering. +- **Confirm the end system's PR flow with the user.** We do not know how their + PRs deploy or how a deploy completion is observable. 
Ask — do not guess. +- **The automated-regression job must never fail the deploy/release pipeline.** + It is observation, not a gate. Isolate it. +- **Every run gets a meaningful title.** Set `TESTOMATIO_TITLE` on every reporter + call — manual and automated, same-repo and cross-repo. Derive it from the + scope in hand: PR title + PR number for PR-triggered flows; commit subject + + short SHA for commit-triggered flows; tag name for tag/release-triggered + flows. Never leave the run untitled. +- **Every run lands in a rungroup.** Set `TESTOMATIO_RUNGROUP_TITLE` on every + reporter call. The grouping strategy — week, day, milestone, submodule, + release — is the user's choice; ask in Step 4. Reuse the same strategy for + the manual and automated flows unless the user explicitly splits them. +- **Cross-repo e2e MUST pre-create the run here.** When the e2e suite lives in + another repo, create the run in *this* repo with `reporter start` (carrying + title + rungroup), capture the run ID, and pass it to the e2e repo as + `TESTOMATIO_RUN`. The e2e repo must not create its own run — the title and + rungroup belong with the diff, which lives here. +- **Only touch CI config and (if asked) coverage files.** Do not modify + application or test source. When e2e tests live in another repo, do not edit + that repo blindly — produce the change and tell the user where it goes. + +--- + +## Workflow + +### Step 1 — Discover the project (delegate to `project-scan`) + +Run **`project-scan`**. From its result capture: + +- **Frameworks** — is there an automated **e2e** framework here (Playwright, + Cypress, CodeceptJS, WebdriverIO, Puppeteer, Appium)? Unit/integration + frameworks do **not** count as e2e. +- **Manual tests** — are there `*.test.md` cases (here or pullable from Testomat.io)? +- **Automated tests** — present in this repo, or only elsewhere? + +This tells you which of the two flows are even applicable. 
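The title and rungroup constraints above could be expressed as two small shell helpers. This is a minimal sketch under stated assumptions: the function names and the exact title formats (`PR #<number>: <title>`, `<subject> (<short sha>)`) are hypothetical, and only the env var names `TESTOMATIO_TITLE` and `TESTOMATIO_RUNGROUP_TITLE` come from the skill text. The week bucket uses the `(day - 1) / 7 + 1` arithmetic, so days 1 to 7 fall into W1, 8 to 14 into W2, and so on.

```shell
#!/usr/bin/env sh
# Hypothetical helpers; the output formats are illustrative assumptions.

# run_title KIND ARGS...
#   pr     <number> <title>   PR-triggered flows
#   commit <sha> <subject>    commit-triggered flows (SHA shortened to 7 chars)
#   tag    <tag>              tag/release-triggered flows
run_title() {
  case "$1" in
    pr)     printf 'PR #%s: %s\n' "$2" "$3" ;;
    commit) printf '%s (%.7s)\n' "$3" "$2" ;;
    tag)    printf '%s\n' "$2" ;;
    *)      echo "unknown trigger: $1" >&2; return 1 ;;
  esac
}

# week_rungroup DAY_OF_MONTH MONTH_NAME YEAR
# ${1#0} drops one leading zero so "08" is not parsed as an octal number.
week_rungroup() {
  printf 'PR Regression W%d %s %s\n' "$(( (${1#0} - 1) / 7 + 1 ))" "$2" "$3"
}

# Possible usage in a CI step (PR_NUMBER and PR_TITLE are assumed CI variables):
#   export TESTOMATIO_TITLE="$(run_title pr "$PR_NUMBER" "$PR_TITLE")"
#   export TESTOMATIO_RUNGROUP_TITLE="$(week_rungroup "$(date +%d)" "$(date +%B)" "$(date +%Y)")"
```

Keeping the derivation in one place means the manual and automated flows can share the same scheme unless the user splits them.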
+ +### Step 2 — Identify the CI system (do not assume) + +Read the repo to see which CI it already uses — its CI config files / dotfiles +make this obvious. If nothing is configured, or several CIs are present, **ask +the user which CI runs their PRs**. You do not need a lookup table or recipes +for this; once you know the CI, you know how to write for it. + +### Step 3 — Locate or create the coverage maps + +Look for existing coverage files in the repo root: `coverage.manual.yml`, +`coverage.e2e.yml`, or any `coverage*.yml`. Inspect the keys to confirm they +map this repo's source. + +For each flow the user wants: + +- **Map exists** → reuse it. Validate it still resolves (the coverage skills + bundle a checker) before wiring CI to it. +- **Map missing** → regression for that flow **will not work**. Propose + creating it and, on agreement, delegate: + - manual flow → **`manual-coverage`** → produces `coverage.manual.yml` + - automated/e2e flow → **`automation-coverage`** → produces `coverage.e2e.yml` + +Do not proceed to CI for a flow whose map does not exist. + +### Step 4 — Ask the unknowns about the end system's PR flow + +We cannot see how their PRs deploy, and we do not know how they want runs +grouped. Before designing the flows, ask (skip whatever the project-scan / CI +config already clearly answers — read their CI files first so you don't ask +something already there): + +1. **When does a deploy that should trigger e2e happen?** Surface the common + options so the user can pick: + - on merge to the main/deploy branch, + - on a release event (GitHub Release published, GitLab release created, + etc.), + - on a git tag (e.g. `v*`), + - on every commit / push to a deploy branch, + - on each push to the PR branch (preview environment), + - never — deploy is manual, run e2e on demand. +2. **How does that deploy signal its completion?** This gates the e2e run. 
We + need a concrete, observable signal — for example a deploy job/stage + completing, a `workflow_run`/pipeline-finished event, a deployment event, a + status check turning green, a release/tag being created, or a health + endpoint going 200. Ask which one their system actually exposes. +3. **What rungroup strategy should PR Regression runs use?** Every run goes + into a rungroup; the question is *what defines a group*. Common choices to + offer (let the user pick or write their own): + - **week** — e.g. `PR Regression W2 May 2026` (one recipe: + `W$(( ($(date +%-d) - 1) / 7 + 1 )) $(date +'%B %Y')`), + - **day** — e.g. `PR Regression 2026-05-27`, + - **milestone** — the active sprint/milestone label, + - **release** — the upcoming release/version tag, + - **submodule** — the project area touched (useful in monorepos). + Default to one strategy for both manual and automated unless the user + explicitly splits them. +4. **Diff base** for "what changed": usually the PR's target branch + (`main`/`master`) for the PR-open manual run; for a post-deploy run it is + the range that was just deployed (see + [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md) on the + one-commit-per-deploy caveat). + +**Manual regression does not need the deploy answers (Q1, Q2, and the +post-deploy half of Q4).** A manual run can be created and left pending the +moment the PR opens — testers pick it up whenever. It still needs the rungroup +answer from Q3. + +### Step 5 — Decide WHERE the automated/e2e tests run + +This is the key architectural choice. Use the project-scan result. + +**Default — cross-repo dispatch (preferred for browser / UI e2e).** +For Playwright, Cypress, CodeceptJS, WebdriverIO, Puppeteer and similar +browser-driven suites, prefer running the tests in a **dedicated e2e repo** +even if a thin e2e folder also exists here. 
Reasons: + +- the e2e repo already owns the heavy setup — browsers, fixtures, environment + URLs, secrets, parallelism config; +- duplicating that setup in the source repo's pipeline causes secret sprawl + and CI drift; +- the source repo only needs its own coverage map + git history to compute the + selection. + +Pattern: this repo computes the grep, pre-creates the run via `reporter start`, +and triggers the **existing workflow in the e2e repo** with `{grep, run, env}`. +Step 6(b) details the ordering; the contract is in +[references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md). + +**Exceptions — run in this repo's pipeline instead:** + +- **Mobile (Appium, Detox, native iOS/Android simulators).** The runner is + bound to specific OS images, emulators and signing material that live with + the test job. Remote-triggering a mobile pipeline rarely buys anything; keep + it here. +- **Low-level / setup-heavy frameworks** where the e2e job is already wired + into this repo for legitimate reasons (compiled-in test harnesses, + fixtures generated from this repo's build, etc.). Don't fight an existing + setup that works. +- **API / contract tests** that exercise a server this repo can spin up. Start + the app locally, check out the API-tests repo if needed, run against + `localhost`. They are not gated on a remote deploy. + +**Same-repo path remains valid** when the e2e suite genuinely belongs here and +the setup cost is low. Run it directly with the reporter wrapping the runner. + +**No e2e suite anywhere (only unit/integration here).** Do **not** stand up an +e2e job from this repo. Set up the manual flow only and tell the user the +automated flow needs an e2e suite first (point them at `automation-coverage` / +test authoring). + +### Step 6 — Wire the two flows into the project's CI + +Express each applicable flow in the CI identified in Step 2, in that CI's own +syntax — you know it. 
The skill-specific parts are the reporter commands, the +env vars, and the structural rules below; the trigger/secret/job scaffolding +is ordinary CI config you write for whatever system it is. + +**Manual regression** — always safe, no deploy dependency: + +- Trigger: PR opened (add reopened/synchronize if the user wants refreshes). +- Required env on the job: `TESTOMATIO` (API key), `TESTOMATIO_TITLE`, + `TESTOMATIO_RUNGROUP_TITLE`. + - `TESTOMATIO_TITLE` — derived from the PR: e.g. `PR #`. + - `TESTOMATIO_RUNGROUP_TITLE` — computed from the strategy chosen in Step 4. + One recipe for a week-bucket strategy: + `PR Regression W `. The expression itself is + CI-shell trivia (use whatever date/expression syntax the CI gives you). +- Command (no runner — this only creates the run): + ``` + npx @testomatio/reporter run --kind manual \ + --filter "coverage:file=coverage.manual.yml,diff=" + ``` +- Result: a pending manual run in Testomat.io, titled and grouped, containing + only affected cases. Testers pick it up from the rungroup. + +**Automated/e2e regression** — after deploy, never blocks the pipeline: + +- Trigger: the deploy-completion signal from Step 4 — never "PR opened". +- Required env on every reporter call: `TESTOMATIO`, `TESTOMATIO_TITLE`, + `TESTOMATIO_RUNGROUP_TITLE`. + - Title derivation depends on the trigger: commit subject + short SHA for + commit-triggered, tag name for tag/release-triggered. + - Rungroup: reuse the same strategy as the manual flow unless the user + split them in Step 4. + +**(a) Same-repo run** — when Step 5 picked the same-repo path: + +``` +npx @testomatio/reporter run "" \ + --filter "coverage:file=coverage.e2e.yml,diff=" +``` + +The reporter creates the run, runs the filtered tests, and closes it. + +**(b) Cross-repo dispatch** — when Step 5 picked the cross-repo path. Four +explicit stages, in order: + +1. 
**Compute the grep** from this repo using the `--filter-list --format=grep` + dry-run (full mechanics in + [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)): + ``` + GREP=$(npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.e2e.yml,diff=" --format=grep) + ``` +2. **Skip the dispatch if `$GREP` is empty.** Nothing was affected; do not + trigger the e2e repo (most runners interpret an empty grep as "run + everything"). +3. **Pre-create the empty run here** with `reporter start`, carrying title + + rungroup, and capture the ID from stdout: + ``` + RUN_ID=$(npx @testomatio/reporter start --kind automated | tail -n1) + ``` +4. **Trigger the e2e repo's workflow** with `{grep: $GREP, run: $RUN_ID, + test_env: }` as inputs (whatever the CI's cross-repo trigger + mechanism is — repository_dispatch, pipeline trigger, project access token, + etc.). The e2e repo's job must export `TESTOMATIO_RUN=` so its + test results attach to this pre-created run. + +- **Isolation:** keep the automated flow off the release's critical path — + a separate workflow/pipeline triggered by the deploy signal, with the job + set non-failing for the release (every CI has such a knob: `continue-on-error`, + `allow_failure: true`, etc.). It reports to Testomat.io; it must not gate + the deploy. + +Always state the secrets/tokens the user must provision — the Testomat.io API +key (in both repos for the cross-repo path), and for cross-repo triggering a +token allowed to trigger the other repo (a CI's built-in token usually cannot +reach another repo; this typically means a PAT, project access token, or app +token with the appropriate scope). + +#### When the e2e repo has no dispatchable workflow + +If the e2e repo does not already accept a `grep` + `run` + `test_env` trigger, +do not silently restructure their pipeline. Propose a small dispatchable +workflow with this contract: + +- **Inputs:** + - `grep` — runner filter (e.g. `(@Sxxxx|@Tyyyy|@Tzzzz)`). 
+ - `run` — Testomat.io run ID to attach results to. + - `test_env` (or `target_env`) — which environment the suite should hit + (e.g. `beta`, `staging`, `preview-pr-123`). +- **Behavior:** + - Set `TESTOMATIO_RUN=` in the test job's env. + - Invoke the runner with the grep applied (the runner's `--grep` / + `--testNamePattern` / equivalent), pointed at the chosen `test_env`. + - Report results via `@testomatio/reporter` as normal — because + `TESTOMATIO_RUN` is set, they attach to the pre-created run rather than + creating a new one. + +Produce a draft of this workflow in the e2e repo's CI syntax, name the file +path it belongs at, and tell the user to commit it there. Do not push to +another repo on the user's behalf. + +### Step 7 — Summarize and hand off + +Report concisely: + +- CI system targeted; files written in this repo (and, for a separate e2e + repo, the file the user must commit *there* — you cannot push it for them). +- Which flows are wired; which were skipped and why (missing coverage map / no + e2e suite). +- Title scheme chosen for each flow (e.g. "PR title + #number" for manual, + "commit subject + SHA" for automated). +- Rungroup strategy chosen and where the value is computed (the CI shell + expression / script step). +- For cross-repo e2e: that the run is pre-created in **this** repo and the + e2e repo's job exports `TESTOMATIO_RUN`; also whether a new dispatchable + workflow was proposed for the e2e repo (and where it should be committed). +- Secrets/tokens to add before it works — `TESTOMATIO` in both repos for the + cross-repo path; cross-repo trigger token (PAT/project token/app) on the + source side. +- Any assumption that needs the user's confirmation (deploy signal, diff base, + rungroup recipe). +- Recommend committing the coverage map(s) so CI and the team share one + mapping. + +--- + +## Examples + +**Example 1 — e2e suite in a separate repo** +Input: "Make our PRs run the manual cases and, after deploy, the affected e2e +tests. 
E2E lives in a sibling repo." +Output: ask the rungroup question (user picks weekly). PR-opened job creates a +titled manual run (`PR #<n>`) inside `PR Regression W<n> <Month> <Year>`. +A separate job keyed off the deploy-completion signal: computes the grep with +`--filter-list --format=grep`, pre-creates the run with `reporter start` +(same title scheme, same rungroup), captures the run ID, and triggers the e2e +repo's existing workflow with `{grep, run, test_env}`. Tokens listed; the +automated job set non-failing for the release. All written in the project's +own CI. + +**Example 2 — no coverage files yet** +Input: "Set up selective regression on our pull requests." +Output: find no `coverage.*.yml`; explain regression can't work without them; +delegate to `manual-coverage` / `automation-coverage`; only then wire the CI. + +**Example 3 — only unit tests in repo** +Input: "Run affected e2e on PRs." +Output: project-scan finds no e2e framework anywhere; set up the manual flow if +manual cases exist; explain the automated flow needs an e2e suite first; do not +fabricate an e2e job. + +**Example 4 — deploy-on-tag, release-grouped runs, e2e repo has no dispatch yet** +Input: "We deploy when we push a `v*` tag. Group runs by release. Our e2e +tests are in a separate repo that doesn't have a workflow we can trigger." +Output: rungroup strategy = release (`PR Regression <tag>`). Tag-creation event +is the deploy trigger; deploy job completion is the signal. Source repo computes +grep, pre-creates the run titled with the tag, and would dispatch — but the e2e +repo has no dispatchable workflow yet. Propose a workflow file for the e2e repo +with inputs `{grep, run, test_env}` and `TESTOMATIO_RUN` wired into the runner +job; hand it to the user to commit in the e2e repo. Wire the source side once +the e2e workflow exists. 
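The cross-repo ordering used in Examples 1 and 4 (compute the grep, skip when it is empty, pre-create the run, then dispatch) could be sketched in shell as follows. The reporter invocations reuse the commands quoted in this skill; the function names, the `REPORTER` override, and `dispatch_e2e` are hypothetical, and `dispatch_e2e` only prints what it would send because the real trigger depends on the CI in use.

```shell
#!/usr/bin/env sh
# Sketch only. TESTOMATIO, TESTOMATIO_TITLE and TESTOMATIO_RUNGROUP_TITLE are
# assumed to be exported by the CI job before this runs.
# REPORTER is overridable so the ordering can be dry-run without network access.
REPORTER="${REPORTER:-npx @testomatio/reporter}"

# Hypothetical stand-in for the CI's cross-repo trigger.
# Arguments: grep expression, run ID, target environment.
dispatch_e2e() {
  echo "dispatch grep=$1 run=$2 test_env=$3"
}

# Arguments: coverage file, diff base, target environment.
cross_repo_regression() {
  # Stage 1: compute the selection from this repo's coverage map and diff.
  grep_expr=$($REPORTER run --filter-list "coverage:file=$1,diff=$2" --format=grep) || return 1

  # Stage 2: nothing affected, so the e2e repo is not triggered at all.
  if [ -z "$grep_expr" ]; then
    echo "no affected tests, skipping dispatch"
    return 0
  fi

  # Stage 3: pre-create the empty run here; the ID is the last line of stdout.
  run_id=$($REPORTER start --kind automated | tail -n1)

  # Stage 4: hand grep, run ID and environment to the e2e repo.
  dispatch_e2e "$grep_expr" "$run_id" "$3"
}

# Possible usage: cross_repo_regression coverage.e2e.yml main staging
```

Checking for an empty grep before `reporter start` avoids leaving an empty run in the rungroup when a change touches nothing covered by the map.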
+ +--- + +## References + +| Description | File | +| -------------------------------------- | ------------------------------- | +| Reporter command contract & cross-repo | references/REPORTER_CONTRACT.md | + +## Related skills + +`project-scan` (mandatory first), `manual-coverage` and `automation-coverage` +(create the maps this skill consumes), `reporter-setup` (install the reporter if +the project has no Testomat.io integration yet). diff --git a/skills/setup-pr-testing/evals/evals.json b/skills/setup-pr-testing/evals/evals.json new file mode 100644 index 0000000..e1334c4 --- /dev/null +++ b/skills/setup-pr-testing/evals/evals.json @@ -0,0 +1,26 @@ +{ + "skill_name": "setup-pr-testing", + "evals": [ + { + "id": 0, + "name": "github-cross-repo-happy-path", + "prompt": "The project is at FIXTURE/acme-api. Set up our PRs so they run only the manual cases affected by the change, and after the staging deploy run only the affected e2e tests. The e2e suite is not in this repo — it lives in the separate repo acme/e2e-tests, which already has a workflow e2e.yml that takes a `grep` input. Deploys happen on merge to main via the existing 'Deploy Staging' workflow.", + "expected_output": "Runs project-scan first; detects GitHub Actions (does not assume); reuses existing coverage.manual.yml + coverage.e2e.yml; writes a PR-opened manual workflow (reporter run --kind manual, no deploy dependency) and a separate post-deploy workflow gated on Deploy Staging success that computes the grep via --filter-list and dispatches acme/e2e-tests' existing e2e.yml; isolates the e2e job from the release; states the cross-repo token requirement.", + "files": [] + }, + { + "id": 1, + "name": "gitlab-missing-coverage-maps", + "prompt": "The project is at FIXTURE/shop-web. We want selective regression on our merge requests — only run the tests touched by the change. 
I don't think we have any coverage files set up yet.", + "expected_output": "Runs project-scan; detects GitLab CI (not GitHub); finds no coverage.*.yml; explains regression cannot work without coverage maps and proposes/delegates to manual-coverage and automation-coverage to create them before wiring .gitlab-ci.yml; asks how MRs deploy and how deploy completion is observable for the e2e flow; manual flow needs no deploy signal.", + "files": [] + }, + { + "id": 2, + "name": "no-e2e-suite-refuse-to-fabricate", + "prompt": "The project is at FIXTURE/lib-utils. Make our pull requests run the affected end-to-end tests automatically.", + "expected_output": "Runs project-scan; finds only Jest unit tests, no e2e framework anywhere and no e2e suite in another repo; does NOT fabricate an e2e job or pipeline; explains the automated/e2e flow needs an e2e suite first (points at authoring/automation-coverage); offers the manual flow only if manual cases exist; asks which CI is used since none is configured.", + "files": [] + } + ] +} diff --git a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md new file mode 100644 index 0000000..8688129 --- /dev/null +++ b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md @@ -0,0 +1,173 @@ +# Reporter Contract & Cross-Repo Triggering + +How `@testomatio/reporter` consumes a coverage map, and the exact mechanics for +the cross-repo e2e pattern. This is CI-independent — the CI recipes wrap these. + +## 1. The coverage filter + +A coverage map (`coverage.manual.yml` / `coverage.e2e.yml`) maps source globs → +test/suite IDs/tags. The reporter resolves *changed files* via git, maps them +through the YAML, and selects the matching tests. + +``` +--filter "coverage:file=<path-to-coverage.yml>,diff=<git-ref>" +``` + +- `file=` — path to the coverage map. **May be absolute.** It is read with + `fs`, independent of the working directory. +- `diff=` — git ref to diff against. 
Defaults to `master` if omitted. The + reporter runs `git diff <ref> --name-only` **in `process.cwd()`** (no `-C`). + So the reporter must be launched from inside the repo whose changes you want + to detect. Confirmed behavior, not configurable. + +Implication for cross-repo: if the coverage map + source live in repo A but the +test runner runs in repo B, launch the reporter with cwd = repo A (so the diff +is repo A's), and let the runner command `cd` into repo B. + +## 2. Manual flow — create a pending run + +No test runner. The reporter just creates a manual run in Testomat.io +containing only the affected cases: + +``` +npx @testomatio/reporter run --kind manual \ + --filter "coverage:file=coverage.manual.yml,diff=<base-branch>" +``` + +- Needs `TESTOMATIO` (API key) in env. +- `<base-branch>` for a PR is normally the target branch (`main`/`master`). +- Safe to run the instant a PR opens. No deploy dependency. Non-blocking. +- **Required in practice for `setup-pr-testing`:** + - `TESTOMATIO_TITLE` — meaningful run title. For PR-opened triggers derive + from PR title + number (e.g. `PR Add invoice export #482`). + - `TESTOMATIO_RUNGROUP_TITLE` — the rungroup bucket the run lands in + (week / day / milestone / submodule / release — whichever the user picked + in Step 4 of the skill). Supports `/` nesting for sub-grouping. + - `TESTOMATIO_ENV` (optional) — target environment label. + +## 3. Automated flow — run only affected tests + +The reporter wraps the runner command and injects the framework-appropriate +grep (`--grep`, `--testNamePattern`, Cypress env, etc.) for the matched IDs: + +``` +npx @testomatio/reporter run "<runner cmd>" \ + --filter "coverage:file=coverage.e2e.yml,diff=<base>" +``` + +Examples of `<runner cmd>`: `npx playwright test`, `npx cypress run`, +`npx codeceptjs run`, `npx wdio`. Run this from the repo that holds the +coverage map + git history (see §1). 
+ +- **Required in practice for `setup-pr-testing`:** + - `TESTOMATIO_TITLE` — for commit-triggered runs derive from the commit + subject + short SHA; for tag/release-triggered runs use the tag name. + - `TESTOMATIO_RUNGROUP_TITLE` — same strategy as the manual flow unless the + user split them. + - `TESTOMATIO_ENV` (optional) — target environment label. + +## 4. Inspect-only (dry run) — get the selection without running + +`--filter-list` is the documented mode to compute the affected tests **without +executing** them (docs: coverage pipe → "Retrieve a list of tests matching your +filter"). Use it when the selection must be computed in one repo and the tests +run elsewhere. + +Pair it with `--format=grep` to get the alternation string directly on stdout +(no parsing needed): + +```bash +GREP=$(npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.e2e.yml,diff=$BASE" --format=grep) +# -> GREP = "(@Sxxxx|@Tyyyy|...)"; use as: <runner> --grep "$GREP" +``` + +Other `--format` values: `json`, `newline`, `ids` (default: comma-separated). + +Only proceed to trigger the run when `$GREP` is non-empty — an empty grep +means "nothing affected", and most runners treat an empty grep as "run +everything". + +## 5. `reporter start` — pre-create an empty run + +`start` initiates a new test run in Testomat.io and returns its identifier, +**without running any tests**. Use it when the selection is computed in one +repo (this one) but the tests run in another job or another repo, and you want +their results to attach to a single, pre-titled run that already lives in the +correct rungroup. + +```bash +RUN_ID=$(npx @testomatio/reporter start --kind automated | tail -n1) +``` + +- `--kind` — `automated`, `manual`, or `mixed`. +- Required env: `TESTOMATIO`. In practice also set `TESTOMATIO_TITLE` and + `TESTOMATIO_RUNGROUP_TITLE` so the run is created with the right metadata. + `TESTOMATIO_ENV` optional. 
+- **Output:** the run ID is the last line of stdout — capture with `tail -n1`. +- **Pair:** the consuming job (whatever runs the tests — same repo or + another) exports `TESTOMATIO_RUN=$RUN_ID`. Any subsequent `@testomatio/reporter` + call there will attach to that run instead of creating a new one. + +## 6. Cross-repo e2e pattern (preferred when e2e lives elsewhere) + +Roles: + +- **Source repo** (this one): owns `coverage.e2e.yml` + the diff + the run's + identity (title, rungroup). After the deploy signal: computes `$GREP` via + §4, pre-creates the run via §5, then triggers the e2e repo carrying both + `$GREP` and `$RUN_ID`. +- **E2E repo**: already has a workflow that accepts a grep/filter input + a + run ID and runs the suite against the deployed environment. We trigger that + existing workflow — we do **not** duplicate it. + +Why this split: the e2e repo holds the runner, browsers, environment URLs and +secrets. Reproducing all that in the source repo is duplication and a secret- +sprawl risk. The source repo only needs its own coverage map + git + a Testomat.io +API key. + +**Four explicit stages:** + +1. Compute the selection: `GREP=$(reporter run --filter-list … --format=grep)` + (§4). +2. If `$GREP` is empty → stop. Nothing was affected. +3. Pre-create the run: `RUN_ID=$(reporter start … | tail -n1)` (§5). +4. Trigger the e2e repo's workflow with inputs `{grep: $GREP, run: $RUN_ID, + test_env: <env name>}`. The e2e repo's runner job exports + `TESTOMATIO_RUN=<run input>` so its reports attach to the pre-created run. + +Triggering mechanism — whatever cross-repo pipeline/workflow trigger the CI +offers, carrying the grep + run + env as inputs/variables. The built-in CI +token usually cannot reach another repo — a PAT / project access token / app +token / deploy trigger token with permission on the e2e repo is required. +State this to the user as a provisioning step. 
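The ordering of the four stages can be sketched as a small guard. This is an illustration under assumptions, not the reporter's own logic: `is_blank` and `next_stage` are hypothetical helpers, and the grep string and run ID are assumed to have been captured by the commands in §4 and §5.

```bash
# is_blank VALUE: succeeds when VALUE is empty or whitespace only.
is_blank() {
  [ -z "$(printf '%s' "$1" | tr -d '[:space:]')" ]
}

# next_stage GREP [RUN_ID]: prints which stage the pipeline should take next.
next_stage() {
  if is_blank "$1"; then
    echo "stop"          # stage 2: nothing affected, do not create a run
  elif is_blank "${2:-}"; then
    echo "start-run"     # stage 3: selection exists, run not yet pre-created
  else
    echo "trigger"       # stage 4: dispatch with {grep, run, test_env}
  fi
}
```

Checking the grep before calling `reporter start` avoids leaving empty runs in the rungroup, and checking the run ID before triggering avoids an e2e job that opens its own untitled run.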
+ +If the e2e repo's workflow does **not** yet accept `grep` + `run` + `test_env` +inputs, that is a small addition there — describe the contract (`SKILL.md` §6 +"When the e2e repo has no dispatchable workflow") and hand the user a draft; +do not silently restructure their pipeline. + +## 7. Diff base caveats + +- **Manual, on PR open**: `diff=<target-branch>` (e.g. `main`) — the natural + "what this PR changes". +- **Automated, after deploy**: the meaningful diff is *what was just deployed*. + On a squash/rebase merge that is one commit, so `diff=<deployed_sha>~1` + works. If the project lands **multi-commit** pushes onto the deploy branch, + `~1` under-selects — instead diff against the previously-deployed SHA + (carry/persist it between deploys). Always surface this assumption to the user + rather than hard-coding silently. + +## 8. Required environment + +- `TESTOMATIO` — Testomat.io API key (both flows; in both repos for cross-repo). +- `TESTOMATIO_TITLE` — meaningful run title. Required in practice for every + `setup-pr-testing` reporter call (manual, automated, `start`). +- `TESTOMATIO_RUNGROUP_TITLE` — rungroup the run lands in. Required in + practice for every `setup-pr-testing` reporter call. Supports `/` nesting. +- `TESTOMATIO_RUN` — **cross-repo only.** Set in the e2e repo's runner job to + the run ID returned by `reporter start` in the source repo, so its reports + attach to the pre-created run. +- `TESTOMATIO_ENV` (optional) — target environment label. +- Cross-repo CI: a token authorized to trigger the e2e repo's CI (PAT / + project access token / app token; the built-in CI token usually cannot). 
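A CI job could fail fast when one of the variables listed above is missing, before any reporter call is made. The following is a minimal sketch; `missing_env` is a hypothetical helper and is not provided by `@testomatio/reporter`.

```bash
# missing_env NAME...: prints each named variable that is unset or empty
# and returns non-zero if any of them is missing.
missing_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval; the names come from the caller, not from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Possible use in a job step:
#   missing_env TESTOMATIO TESTOMATIO_TITLE TESTOMATIO_RUNGROUP_TITLE || exit 1
```

In the e2e repo's runner job the same check would also name `TESTOMATIO_RUN`, since that is the variable that attaches results to the pre-created run.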
From 0c9478c42a93462098fcf062037285addc3e9879 Mon Sep 17 00:00:00 2001 From: DavertMik <davert@testomat.io> Date: Sat, 30 May 2026 21:12:40 +0300 Subject: [PATCH 3/7] Restructure setup-pr-testing to comment-on-open + run-after-merge + --remote - PR open: post a notice comment with affected manual/automated test counts (reporter run --filter-list), no runs created, never blocks the PR - PR merged: create the manual run (pending) and prepare the automated run as a shared run titled by commit (reporter start), not executed - deploy done: launch the prepared run on a Testomat.io CI profile via reporter run --remote, replacing cross-repo dispatch - rewrite REPORTER_CONTRACT.md and evals for the three-phase model Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --- skills/setup-pr-testing/SKILL.md | 581 +++++++++--------- skills/setup-pr-testing/evals/evals.json | 19 +- .../references/REPORTER_CONTRACT.md | 267 ++++---- 3 files changed, 441 insertions(+), 426 deletions(-) diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md index f9a52ba..2bb6f46 100644 --- a/skills/setup-pr-testing/SKILL.md +++ b/skills/setup-pr-testing/SKILL.md @@ -1,44 +1,56 @@ --- name: setup-pr-testing -description: Set up change-aware PR regression testing — wire CI so every pull request runs only the manual and automated/e2e tests affected by the diff, via Testomat.io coverage maps and @testomatio/reporter. Use this skill when the user wants PR-triggered regression, "run only affected tests on PRs", selective manual runs per PR, post-deploy e2e on PRs, a coverage-driven CI pipeline, weekly/grouped PR regression runs in a rungroup, creating a Testomat.io run in CI and dispatching e2e tests to another repo, deploy-on-tag / on-release / on-merge regression, or to connect coverage.manual.yml / coverage.e2e.yml to their CI. CI-agnostic — adapts to whatever CI the project already uses (GitHub Actions, GitLab CI, Jenkins, Bitbucket, CircleCI, etc.). 
Trigger it even if the user only says "make PRs run the right tests" or "set up regression for pull requests". +description: Set up change-aware PR regression testing — wire CI so each pull request posts a comment listing how many manual and automated tests its diff affects, and so that after the PR is merged and deployed only those affected tests are actually run, via Testomat.io coverage maps and @testomatio/reporter. Use this skill when the user wants PR-triggered regression, "comment affected test counts on PRs", "run only affected tests after merge/deploy", selective manual runs per PR, post-deploy e2e, a coverage-driven CI pipeline, weekly/grouped regression runs in a rungroup, launching automated regression on Testomat.io CI via `--remote`, or to connect coverage.manual.yml / coverage.e2e.yml to their CI. CI-agnostic — adapts to whatever CI the project already uses (GitHub Actions, GitLab CI, Jenkins, Bitbucket, CircleCI, etc.). Trigger it even if the user only says "make PRs comment affected tests" or "run the right tests after merge". license: MIT metadata: author: Testomat.io - version: 1.0.0 + version: 2.0.0 --- # SETUP-PR-TESTING SKILL: What I do -I wire a project's CI so each **pull request** (or deploy thereof) runs only the -tests affected by its change — not the whole suite. There are exactly two flows, -and they are independent: +I wire a project's CI so a pull request's change drives testing in **three +phases** — a cheap notice while the PR is open, real runs once it is merged, and +execution once it is deployed. Nothing heavy happens until the change is +actually going somewhere. 
``` -PR open ──▶ Regression Manual Tests coverage.manual.yml ──▶ reporter run --kind manual - + TESTOMATIO_TITLE + TESTOMATIO_RUNGROUP_TITLE - -deploy done ──▶ Regression Automated Tests coverage.e2e.yml - ├─ same repo: ──▶ reporter run "<runner>" + TITLE + RUNGROUP - └─ cross repo: ──▶ reporter run --filter-list … --format=grep (compute selection) - ──▶ reporter start (pre-create empty run, capture ID) - ──▶ dispatch e2e repo with {grep, run=<id>, env} (tests there export TESTOMATIO_RUN) +PR opened ──▶ NOTICE ONLY — comment the affected test counts on the PR + reporter run --filter-list (manual map + e2e map) → counts + posted via the CI's native PR-comment API (GitHub / GitLab / Bitbucket) + no runs created · never blocks the PR + e.g. "0 automated tests, 10 manual tests are affected by this PR" + +PR merged ──▶ Manual regression run (created, pending) + reporter start --kind manual --filter "coverage:file=coverage.manual.yml,diff=<base>" + titled by the merge commit · in a rungroup · testers pick it up + + ──▶ Automated regression run (created, NOT executed) + reporter start --filter "coverage:file=coverage.e2e.yml,diff=<base>" + TESTOMATIO_SHARED_RUN=1 · TESTOMATIO_TITLE="report for commit <sha>" + TESTOMATIO_SHARED_RUN_TIMEOUT=<covers the merge→deploy gap> + → capture RUN_ID + +deploy done ──▶ Launch automated execution on Testomat.io CI + reporter run --remote <ci-profile> + (TESTOMATIO_RUN=$RUN_ID + same shared-run title + TESTOMATIO_SHARED_RUN=1) + Testomat.io dispatches the project's CI profile; the affected e2e + tests run there and report back into the same prepared run. ``` -Both flows are **change-aware**: a coverage map (`coverage.*.yml`) maps source +Every phase is **change-aware**: a coverage map (`coverage.*.yml`) maps source files → test/suite IDs, and `@testomatio/reporter` filters by the PR diff so -only impacted tests are selected. +only impacted tests are counted, prepared, and run. 
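The PR-open notice text can be assembled from two lists of affected test IDs. The sketch below assumes each list arrives as newline-separated IDs (how the lists are obtained from `--filter-list` is left to the CI step); `count_lines` and `affected_notice` are hypothetical helper names.

```bash
# count_lines TEXT: counts the non-blank lines in TEXT (0 for an empty string).
count_lines() {
  printf '%s\n' "$1" | awk 'NF { n++ } END { print n + 0 }'
}

# affected_notice AUTOMATED_IDS MANUAL_IDS: formats the PR comment body.
affected_notice() {
  automated=$(count_lines "$1")
  manual=$(count_lines "$2")
  printf '%s automated tests, %s manual tests are affected by this PR\n' \
    "$automated" "$manual"
}
```

The resulting string is what the CI's native PR or MR comment API would post; the helper creates no runs and has no side effects, in line with the notice-only phase.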
-Every run — manual or automated — lands in Testomat.io with a meaningful -**title** and inside a **rungroup**. The grouping strategy (week / day / -milestone / submodule / release) is the user's call; we ask, we do not invent. - -For cross-repo e2e, the run is **pre-created in the source repo** (where the -diff and coverage map live) and the resulting run ID is passed into the e2e -repo so its test results attach to the same run instead of opening a new one. +The big shift from older setups: **execution moved to after merge+deploy**, the +**PR-open step is now just an informational comment**, and **automated tests are +launched through Testomat.io's `--remote` CI profile** instead of this repo +reaching into another repo's pipeline. If the e2e tests live elsewhere, the CI +profile points there — the reporter never triggers a foreign repo directly. I do **not** invent tests, write CI from guesswork, or assume a CI system. I discover the project, confirm the unknowns with the user, reuse existing -coverage maps (or delegate creating them), and express the two flows in +coverage maps (or delegate creating them), and express the three phases in whatever CI the project actually uses. ## What this skill is and isn't @@ -46,26 +58,33 @@ whatever CI the project actually uses. This skill is the **method**, not a catalogue of CI snippets. The valuable, non-obvious knowledge is: -1. the two-flow model and how they differ (manual = no deploy dependency; - automated = after deploy, never blocks the release); -2. the `@testomatio/reporter` command contract — title/rungroup env vars, the - `reporter start` "empty run" pattern, and the cross-repo `--filter-list - --format=grep` → `start` → dispatch flow (see +1. the three-phase model and why each phase is gated where it is (comment = + harmless, on open; manual run = on merge, no deploy needed; automated run = + prepared on merge, executed only after deploy, never blocking the release); +2. 
the `@testomatio/reporter` command contract — `--filter-list` for the notice + counts, `reporter start` to create runs without executing, the shared-run + env vars (`TESTOMATIO_SHARED_RUN`, `TESTOMATIO_TITLE`, + `TESTOMATIO_SHARED_RUN_TIMEOUT`), and `reporter run --remote <profile>` to + launch a prepared run on Testomat.io CI (see [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)); -3. the decisions — coverage maps required, where e2e runs, what to ask the user - (deploy trigger, completion signal, rungroup strategy, diff base). +3. the decisions — coverage maps required, whether a Testomat.io CI profile + exists for `--remote`, and what to ask the user (deploy trigger, completion + signal, rungroup strategy, diff base, shared-run timeout). Translating a trigger into a specific CI's YAML/Groovy/config is **not** special knowledge — you already know how GitHub Actions, GitLab CI, Jenkins, or any -other CI express "run on PR open", "run after job X", and "don't fail the -pipeline". Write that yourself for the CI in front of you. Do not expect, or -add, a per-CI recipe file. +other CI express "run on PR open", "run on merge", "run after deploy", "post a +PR comment", and "don't fail the pipeline". Write that yourself for the CI in +front of you. **Do not write explicit per-CI workflow files into this skill or +expect a per-CI recipe file** — give the agent the contract and let it author +the config for whatever CI it finds. ## When to use -- "Run only the tests affected by a PR", "selective regression on pull requests". -- "Create a pending manual run for each PR with just the relevant cases". -- "Run e2e tests after a PR is deployed, but only the affected ones". +- "Comment how many tests a PR affects", "show affected test counts on PRs". +- "Run only the affected tests after a PR is merged and deployed". +- "Create a pending manual regression run per merged PR with just the relevant cases". 
+- "Launch affected e2e on Testomat.io CI after deploy", "use `--remote` for regression". - "Hook coverage.manual.yml / coverage.e2e.yml into our CI". --- @@ -77,31 +96,48 @@ add, a per-CI recipe file. depend on its result. - **Never assume the CI system, and never hardcode one.** Identify the CI the project already uses by reading the repo; if you cannot tell, ask. Then write - config for *that* CI from your own knowledge of it. + config for *that* CI from your own knowledge of it. Do not bake per-CI + workflow files into this skill. - **Regression cannot work without a coverage map.** If the needed `coverage.*.yml` is missing, the only correct move is to propose creating it - via the coverage skills — not to hand-write a mapping or skip filtering. -- **Confirm the end system's PR flow with the user.** We do not know how their - PRs deploy or how a deploy completion is observable. Ask — do not guess. -- **The automated-regression job must never fail the deploy/release pipeline.** - It is observation, not a gate. Isolate it. -- **Every run gets a meaningful title.** Set `TESTOMATIO_TITLE` on every reporter - call — manual and automated, same-repo and cross-repo. Derive it from the - scope in hand: PR title + PR number for PR-triggered flows; commit subject + - short SHA for commit-triggered flows; tag name for tag/release-triggered - flows. Never leave the run untitled. -- **Every run lands in a rungroup.** Set `TESTOMATIO_RUNGROUP_TITLE` on every - reporter call. The grouping strategy — week, day, milestone, submodule, - release — is the user's choice; ask in Step 4. Reuse the same strategy for - the manual and automated flows unless the user explicitly splits them. -- **Cross-repo e2e MUST pre-create the run here.** When the e2e suite lives in - another repo, create the run in *this* repo with `reporter start` (carrying - title + rungroup), capture the run ID, and pass it to the e2e repo as - `TESTOMATIO_RUN`. 
The e2e repo must not create its own run — the title and - rungroup belong with the diff, which lives here. + via the coverage skills — not to hand-write a mapping or skip filtering. The + PR-open comment also needs the maps to count affected tests. +- **PR open is a NOTICE ONLY.** On PR open/update, compute affected counts with + `--filter-list` and post a PR comment. **Never create runs and never block the + PR** in this phase. Counts only — e.g. `0 automated tests, 10 manual tests are + affected by this PR`. Must work on GitHub, GitLab, and Bitbucket using each + one's native PR/MR comment API. +- **Regression runs are created on MERGE, not on PR open.** When the PR merges: + create the manual run (pending) and create the automated run (prepared, not + executed). Nothing executes until deploy. +- **Automated run is prepared, then launched separately.** On merge, create it + with `reporter start` as a **shared run** keyed by the merge commit + (`TESTOMATIO_SHARED_RUN=1`, `TESTOMATIO_TITLE="report for commit <sha>"`) and + do **not** execute it. After deploy, launch it with `reporter run --remote + <profile>`. +- **Set the shared-run timeout to span the merge→deploy gap.** The shared run is + matched by title only within `TESTOMATIO_SHARED_RUN_TIMEOUT` minutes (default + 20). If deploy takes longer, the execute step won't attach to the prepared run + and a stray new run appears. Ask how long deploys take and set the timeout + above it (e.g. `TESTOMATIO_SHARED_RUN_TIMEOUT=120`). +- **Execute automated through `--remote`, never by triggering another repo.** + `reporter run --remote <profile>` asks Testomat.io to dispatch the project's + configured CI profile. Do not reach into a foreign repo's pipeline with a PAT + and `{grep, run, env}` inputs anymore — that responsibility now lives in the + Testomat.io CI profile (configured under **Settings → CI**). +- **The automated execute step must never fail the deploy/release pipeline.** It + is observation, not a gate. 
Isolate it (a non-failing job keyed off the deploy + signal). +- **Every created run gets a meaningful title.** Set `TESTOMATIO_TITLE` on every + `start`/`run` call that creates or launches a run. Derive it from the merge + commit (subject + short SHA), e.g. `report for commit <sha>`. The automated + prepare and execute steps MUST use the **same** title so they converge on one + shared run. +- **Every created run lands in a rungroup.** Set `TESTOMATIO_RUNGROUP_TITLE` on + the manual and automated runs. The grouping strategy (week / day / milestone / + release / submodule) is the user's choice; ask in Step 4. - **Only touch CI config and (if asked) coverage files.** Do not modify - application or test source. When e2e tests live in another repo, do not edit - that repo blindly — produce the change and tell the user where it goes. + application or test source. --- @@ -117,281 +153,244 @@ Run **`project-scan`**. From its result capture: - **Manual tests** — are there `*.test.md` cases (here or pullable from Testomat.io)? - **Automated tests** — present in this repo, or only elsewhere? -This tells you which of the two flows are even applicable. +This tells you which maps you can build counts from, and which phases apply +(manual-only vs manual + automated). ### Step 2 — Identify the CI system (do not assume) Read the repo to see which CI it already uses — its CI config files / dotfiles make this obvious. If nothing is configured, or several CIs are present, **ask -the user which CI runs their PRs**. You do not need a lookup table or recipes -for this; once you know the CI, you know how to write for it. +the user which CI runs their PRs**. Once you know the CI, you know how to write +its triggers and how to post a PR/MR comment with it. ### Step 3 — Locate or create the coverage maps Look for existing coverage files in the repo root: `coverage.manual.yml`, -`coverage.e2e.yml`, or any `coverage*.yml`. Inspect the keys to confirm they -map this repo's source. 
+`coverage.e2e.yml`, or any `coverage*.yml`. Inspect the keys to confirm they map +this repo's source. -For each flow the user wants: +For each kind of test the user has: - **Map exists** → reuse it. Validate it still resolves (the coverage skills bundle a checker) before wiring CI to it. -- **Map missing** → regression for that flow **will not work**. Propose - creating it and, on agreement, delegate: - - manual flow → **`manual-coverage`** → produces `coverage.manual.yml` - - automated/e2e flow → **`automation-coverage`** → produces `coverage.e2e.yml` +- **Map missing** → both the affected-count comment and the regression run for + that kind **will not work**. Propose creating it and, on agreement, delegate: + - manual → **`manual-coverage`** → `coverage.manual.yml` + - automated/e2e → **`automation-coverage`** → `coverage.e2e.yml` -Do not proceed to CI for a flow whose map does not exist. +You need a map for each kind you want to report or run. The PR-open comment can +list only the kinds whose maps exist. -### Step 4 — Ask the unknowns about the end system's PR flow +### Step 4 — Ask the unknowns -We cannot see how their PRs deploy, and we do not know how they want runs -grouped. Before designing the flows, ask (skip whatever the project-scan / CI -config already clearly answers — read their CI files first so you don't ask -something already there): +Read their CI files first so you don't ask what's already there, then confirm: -1. **When does a deploy that should trigger e2e happen?** Surface the common - options so the user can pick: - - on merge to the main/deploy branch, - - on a release event (GitHub Release published, GitLab release created, - etc.), +1. **When does a deploy that should trigger automated execution happen?** Offer + the common options: + - on merge to the main/deploy branch (deploy is part of that pipeline), + - on a release event (GitHub Release published, GitLab release created), - on a git tag (e.g. 
`v*`), - - on every commit / push to a deploy branch, - - on each push to the PR branch (preview environment), - - never — deploy is manual, run e2e on demand. -2. **How does that deploy signal its completion?** This gates the e2e run. We - need a concrete, observable signal — for example a deploy job/stage - completing, a `workflow_run`/pipeline-finished event, a deployment event, a - status check turning green, a release/tag being created, or a health - endpoint going 200. Ask which one their system actually exposes. -3. **What rungroup strategy should PR Regression runs use?** Every run goes - into a rungroup; the question is *what defines a group*. Common choices to - offer (let the user pick or write their own): - - **week** — e.g. `PR Regression W2 May 2026` (one recipe: - `W$(( ($(date +%-d) - 1) / 7 + 1 )) $(date +'%B %Y')`), - - **day** — e.g. `PR Regression 2026-05-27`, - - **milestone** — the active sprint/milestone label, + - on every push to a deploy branch, + - never automated — deploy is manual, launch e2e on demand. +2. **How does that deploy signal its completion?** This gates the execute step. + Need a concrete, observable signal — a deploy job/stage completing, a + `workflow_run`/pipeline-finished event, a deployment event, a status check + going green, a release/tag created, or a health endpoint returning 200. +3. **How long does merge → deploy-complete usually take?** This sets + `TESTOMATIO_SHARED_RUN_TIMEOUT` (minutes, default 20) so the execute step + still matches the prepared shared run. Pick a value comfortably above the + typical deploy duration. +4. **What rungroup strategy should regression runs use?** Every created run goes + into a rungroup; the question is what defines a group: + - **week** — e.g. `Regression W2 May 2026` + (`W$(( ($(date +%-d) - 1) / 7 + 1 )) $(date +'%B %Y')`), + - **day** — e.g. 
`Regression 2026-05-27`, + - **milestone** — the active sprint/milestone, - **release** — the upcoming release/version tag, - - **submodule** — the project area touched (useful in monorepos). - Default to one strategy for both manual and automated unless the user - explicitly splits them. -4. **Diff base** for "what changed": usually the PR's target branch - (`main`/`master`) for the PR-open manual run; for a post-deploy run it is - the range that was just deployed (see - [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md) on the - one-commit-per-deploy caveat). - -**Manual regression does not need the deploy answers (Q1, Q2, and the -post-deploy half of Q4).** A manual run can be created and left pending the -moment the PR opens — testers pick it up whenever. It still needs the rungroup -answer from Q3. - -### Step 5 — Decide WHERE the automated/e2e tests run - -This is the key architectural choice. Use the project-scan result. - -**Default — cross-repo dispatch (preferred for browser / UI e2e).** -For Playwright, Cypress, CodeceptJS, WebdriverIO, Puppeteer and similar -browser-driven suites, prefer running the tests in a **dedicated e2e repo** -even if a thin e2e folder also exists here. Reasons: - -- the e2e repo already owns the heavy setup — browsers, fixtures, environment - URLs, secrets, parallelism config; -- duplicating that setup in the source repo's pipeline causes secret sprawl - and CI drift; -- the source repo only needs its own coverage map + git history to compute the - selection. - -Pattern: this repo computes the grep, pre-creates the run via `reporter start`, -and triggers the **existing workflow in the e2e repo** with `{grep, run, env}`. -Step 6(b) details the ordering; the contract is in -[references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md). 
- -**Exceptions — run in this repo's pipeline instead:** - -- **Mobile (Appium, Detox, native iOS/Android simulators).** The runner is - bound to specific OS images, emulators and signing material that live with - the test job. Remote-triggering a mobile pipeline rarely buys anything; keep - it here. -- **Low-level / setup-heavy frameworks** where the e2e job is already wired - into this repo for legitimate reasons (compiled-in test harnesses, - fixtures generated from this repo's build, etc.). Don't fight an existing - setup that works. -- **API / contract tests** that exercise a server this repo can spin up. Start - the app locally, check out the API-tests repo if needed, run against - `localhost`. They are not gated on a remote deploy. - -**Same-repo path remains valid** when the e2e suite genuinely belongs here and -the setup cost is low. Run it directly with the reporter wrapping the runner. - -**No e2e suite anywhere (only unit/integration here).** Do **not** stand up an -e2e job from this repo. Set up the manual flow only and tell the user the -automated flow needs an e2e suite first (point them at `automation-coverage` / -test authoring). - -### Step 6 — Wire the two flows into the project's CI - -Express each applicable flow in the CI identified in Step 2, in that CI's own -syntax — you know it. The skill-specific parts are the reporter commands, the -env vars, and the structural rules below; the trigger/secret/job scaffolding -is ordinary CI config you write for whatever system it is. - -**Manual regression** — always safe, no deploy dependency: - -- Trigger: PR opened (add reopened/synchronize if the user wants refreshes). -- Required env on the job: `TESTOMATIO` (API key), `TESTOMATIO_TITLE`, - `TESTOMATIO_RUNGROUP_TITLE`. - - `TESTOMATIO_TITLE` — derived from the PR: e.g. `PR <pr-title> #<pr-number>`. - - `TESTOMATIO_RUNGROUP_TITLE` — computed from the strategy chosen in Step 4. 
- One recipe for a week-bucket strategy: - `PR Regression W<week-of-month> <Month> <Year>`. The expression itself is - CI-shell trivia (use whatever date/expression syntax the CI gives you). -- Command (no runner — this only creates the run): + - **submodule** — the project area touched (monorepos). + Default to one strategy for both the manual and automated runs unless the + user splits them. +5. **Diff base** for "what changed": for the PR-open comment and the on-merge + runs, the PR's target branch (`main`/`master`) is the natural base. For a + post-deploy range see the one-commit-per-deploy caveat in + [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md). + +**The PR-open comment needs none of the deploy answers (Q1–Q3).** It only needs +the coverage maps and a base branch (Q5). **The manual run needs only the +rungroup answer (Q4)** — it is created pending on merge regardless of deploy. + +### Step 5 — Confirm how automated execution is launched (`--remote`) + +Automated execution runs through a **Testomat.io CI profile** triggered by +`reporter run --remote <profile>`. Decide and confirm: + +- **Is there a CI profile configured on the project?** (Testomat.io → **Settings + → CI**.) It names the workflow Testomat.io dispatches — in this repo or in a + dedicated e2e repo. If none exists, that is a prerequisite the user must set up + (point them at the CI configuration page); `--remote` cannot work without it. +- **The CI profile owns the runner, browsers, environment URLs, and secrets.** + This is exactly why `--remote` replaces cross-repo dispatch: the source repo + only needs its coverage map + git history to *prepare* the scoped run; the CI + profile decides *where and how* the suite actually executes. No PAT, no + foreign-repo trigger, no `{grep, run, env}` inputs to maintain. 
+- **Exceptions — run inline in this pipeline instead of `--remote`:** + - **Mobile (Appium, Detox, native simulators)** — bound to specific OS images + and signing material on the test job; keep it in this pipeline. + - **API / contract tests** that exercise a server this repo can spin up — run + against `localhost`; not gated on a remote deploy or a CI profile. + - **An existing same-repo e2e job that already works** — don't fight it; you + can still prepare the run here and let that job execute it. + For these, after deploy run the reporter wrapping the runner directly + (`reporter run "<runner cmd>" --filter "coverage:file=coverage.e2e.yml,diff=<base>"`) on the + prepared run instead of `--remote`. +- **No e2e suite anywhere (only unit/integration here).** Do not stand up an e2e + job. Set up the comment (manual only) and the manual run; tell the user the + automated phase needs an e2e suite first (point at `automation-coverage` / test + authoring). + +### Step 6 — Wire the three phases into the project's CI + +Express each applicable phase in the CI from Step 2, in that CI's own syntax — +you know it. The skill-specific parts are the reporter commands, the env vars, +and the structural rules; the trigger/secret/job scaffolding and the PR-comment +call are ordinary CI config you write for whatever system it is. + +**(a) PR opened → affected-counts comment (notice only).** + +- Trigger: PR/MR opened (add reopened/synchronize if the user wants the comment + to refresh on new commits). 
+- For each coverage map that exists, compute the affected count with a + `--filter-list` dry run — it lists matching IDs without running anything: ``` - npx @testomatio/reporter run --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=<base-branch>" + npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.manual.yml,diff=<base>" --format ids + npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.e2e.yml,diff=<base>" --format ids ``` -- Result: a pending manual run in Testomat.io, titled and grouped, containing - only affected cases. Testers pick it up from the rungroup. + Count the IDs (empty output = `0`). Assemble a one-line comment, e.g. + `0 automated tests, 10 manual tests are affected by this PR`. +- Post it with the CI's native PR/MR comment mechanism (GitHub PR comment, + GitLab MR note, Bitbucket PR comment). Prefer updating a single existing + comment over adding a new one on every push. +- This phase **creates no runs** and must **never fail the PR check**. -**Automated/e2e regression** — after deploy, never blocks the pipeline: +**(b) PR merged → create the regression runs.** -- Trigger: the deploy-completion signal from Step 4 — never "PR opened". -- Required env on every reporter call: `TESTOMATIO`, `TESTOMATIO_TITLE`, - `TESTOMATIO_RUNGROUP_TITLE`. - - Title derivation depends on the trigger: commit subject + short SHA for - commit-triggered, tag name for tag/release-triggered. - - Rungroup: reuse the same strategy as the manual flow unless the user - split them in Step 4. +- Trigger: PR merged into the main/deploy branch. +- Required env on these calls: `TESTOMATIO` (API key), `TESTOMATIO_TITLE`, + `TESTOMATIO_RUNGROUP_TITLE`. Title derives from the merge commit + (e.g. `report for commit <short-sha>`); rungroup from Step 4. 
-**(a) Same-repo run** — when Step 5 picked the same-repo path: + **Manual run** — created pending, testers pick it up; nothing executes: + ``` + npx @testomatio/reporter start --kind manual \ + --filter "coverage:file=coverage.manual.yml,diff=<base>" + ``` -``` -npx @testomatio/reporter run "<test runner cmd>" \ - --filter "coverage:file=coverage.e2e.yml,diff=<base>" -``` + **Automated run** — prepared as a shared run, **not executed**: + ``` + TESTOMATIO_SHARED_RUN=1 \ + TESTOMATIO_TITLE="report for commit <short-sha>" \ + TESTOMATIO_SHARED_RUN_TIMEOUT=<minutes covering deploy> \ + npx @testomatio/reporter start \ + --filter "coverage:file=coverage.e2e.yml,diff=<base>" + ``` + Capture the printed run id as `RUN_ID` and carry it to the execute step + (pipeline output/artifact). The shared title + timeout let the execute step + (and any parallel executors) converge on this same prepared run. + +**(c) Deploy done → launch automated execution via `--remote`.** + +- Trigger: the deploy-completion signal from Step 4 — never "PR opened". +- Launch the prepared run on the Testomat.io CI profile, reusing the same title + and shared-run flag, and pointing at the prepared run: + ``` + TESTOMATIO_RUN=$RUN_ID \ + TESTOMATIO_SHARED_RUN=1 \ + TESTOMATIO_TITLE="report for commit <short-sha>" \ + npx @testomatio/reporter run --remote <ci-profile> + ``` + `TESTOMATIO_RUN=$RUN_ID` ties the launch to the run prepared in (b); with no + fresh `--filter`, Testomat.io greps that run's own stored scope, so only the + affected e2e tests run. Testomat.io dispatches the CI profile; the suite runs + there and reports back into the same run. +- If the deploy pipeline is fully decoupled and cannot carry `RUN_ID`, the + execute step matches the prepared run by its shared-run **title** instead — + which is why the title must be identical and the shared-run timeout must still + be open. Keep `TESTOMATIO_SHARED_RUN=1` and the same `TESTOMATIO_TITLE`. 
+- **Isolation:** keep this off the release's critical path — a separate + job/pipeline keyed off the deploy signal, set non-failing for the release + (`continue-on-error`, `allow_failure: true`, etc.). It reports to Testomat.io; + it must not gate the deploy. +- **Inline exception (Step 5):** when not using a CI profile, replace the + `--remote` call with the runner wrapped directly, on the prepared run: + ``` + TESTOMATIO_RUN=$RUN_ID npx @testomatio/reporter run "<runner cmd>" \ + --filter "coverage:file=coverage.e2e.yml,diff=<base>" + ``` -The reporter creates the run, runs the filtered tests, and closes it. - -**(b) Cross-repo dispatch** — when Step 5 picked the cross-repo path. Four -explicit stages, in order: - -1. **Compute the grep** from this repo using the `--filter-list --format=grep` - dry-run (full mechanics in - [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)): - ``` - GREP=$(npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.e2e.yml,diff=<base>" --format=grep) - ``` -2. **Skip the dispatch if `$GREP` is empty.** Nothing was affected; do not - trigger the e2e repo (most runners interpret an empty grep as "run - everything"). -3. **Pre-create the empty run here** with `reporter start`, carrying title + - rungroup, and capture the ID from stdout: - ``` - RUN_ID=$(npx @testomatio/reporter start --kind automated | tail -n1) - ``` -4. **Trigger the e2e repo's workflow** with `{grep: $GREP, run: $RUN_ID, - test_env: <env name>}` as inputs (whatever the CI's cross-repo trigger - mechanism is — repository_dispatch, pipeline trigger, project access token, - etc.). The e2e repo's job must export `TESTOMATIO_RUN=<run input>` so its - test results attach to this pre-created run. 
- -- **Isolation:** keep the automated flow off the release's critical path — - a separate workflow/pipeline triggered by the deploy signal, with the job - set non-failing for the release (every CI has such a knob: `continue-on-error`, - `allow_failure: true`, etc.). It reports to Testomat.io; it must not gate - the deploy. - -Always state the secrets/tokens the user must provision — the Testomat.io API -key (in both repos for the cross-repo path), and for cross-repo triggering a -token allowed to trigger the other repo (a CI's built-in token usually cannot -reach another repo; this typically means a PAT, project access token, or app -token with the appropriate scope). - -#### When the e2e repo has no dispatchable workflow - -If the e2e repo does not already accept a `grep` + `run` + `test_env` trigger, -do not silently restructure their pipeline. Propose a small dispatchable -workflow with this contract: - -- **Inputs:** - - `grep` — runner filter (e.g. `(@Sxxxx|@Tyyyy|@Tzzzz)`). - - `run` — Testomat.io run ID to attach results to. - - `test_env` (or `target_env`) — which environment the suite should hit - (e.g. `beta`, `staging`, `preview-pr-123`). -- **Behavior:** - - Set `TESTOMATIO_RUN=<run input>` in the test job's env. - - Invoke the runner with the grep applied (the runner's `--grep` / - `--testNamePattern` / equivalent), pointed at the chosen `test_env`. - - Report results via `@testomatio/reporter` as normal — because - `TESTOMATIO_RUN` is set, they attach to the pre-created run rather than - creating a new one. - -Produce a draft of this workflow in the e2e repo's CI syntax, name the file -path it belongs at, and tell the user to commit it there. Do not push to -another repo on the user's behalf. +State the secrets the user must provision — the `TESTOMATIO` API key on the jobs +that talk to Testomat.io, and a token allowed to post PR/MR comments for phase +(a) (the CI's built-in token usually suffices for same-repo comments). 
A CI +profile for `--remote` is configured in Testomat.io, not via a repo secret. ### Step 7 — Summarize and hand off Report concisely: -- CI system targeted; files written in this repo (and, for a separate e2e - repo, the file the user must commit *there* — you cannot push it for them). -- Which flows are wired; which were skipped and why (missing coverage map / no - e2e suite). -- Title scheme chosen for each flow (e.g. "PR title + #number" for manual, - "commit subject + SHA" for automated). -- Rungroup strategy chosen and where the value is computed (the CI shell - expression / script step). -- For cross-repo e2e: that the run is pre-created in **this** repo and the - e2e repo's job exports `TESTOMATIO_RUN`; also whether a new dispatchable - workflow was proposed for the e2e repo (and where it should be committed). -- Secrets/tokens to add before it works — `TESTOMATIO` in both repos for the - cross-repo path; cross-repo trigger token (PAT/project token/app) on the - source side. -- Any assumption that needs the user's confirmation (deploy signal, diff base, +- CI system targeted; files written in this repo. +- Which phases are wired (comment / manual run / automated prepare / automated + execute) and which were skipped and why (missing coverage map / no e2e suite / + no CI profile). +- The comment format and where its counts come from (which maps, which base). +- Title scheme (merge commit) and that the automated prepare + execute steps + share one title; rungroup strategy and where it is computed. +- Shared-run timeout chosen and the deploy duration it must cover. +- How automated execution is launched: the `--remote <profile>` CI profile (or + the inline-runner exception), and that `RUN_ID` (or the shared title) links + the prepare and execute steps. +- Secrets/tokens to add before it works; that a Testomat.io CI profile must + exist for `--remote`. +- Any assumption needing confirmation (deploy signal, diff base, timeout, rungroup recipe). 
-- Recommend committing the coverage map(s) so CI and the team share one - mapping. +- Recommend committing the coverage map(s) so CI and the team share one mapping. --- ## Examples -**Example 1 — e2e suite in a separate repo** -Input: "Make our PRs run the manual cases and, after deploy, the affected e2e -tests. E2E lives in a sibling repo." -Output: ask the rungroup question (user picks weekly). PR-opened job creates a -titled manual run (`PR <title> #<n>`) inside `PR Regression W<n> <Month> <Year>`. -A separate job keyed off the deploy-completion signal: computes the grep with -`--filter-list --format=grep`, pre-creates the run with `reporter start` -(same title scheme, same rungroup), captures the run ID, and triggers the e2e -repo's existing workflow with `{grep, run, test_env}`. Tokens listed; the -automated job set non-failing for the release. All written in the project's -own CI. +**Example 1 — comment on open, regression after merge+deploy, e2e via CI profile** +Input: "Comment how many tests each PR touches, then after we merge and deploy, +run the affected e2e and create a manual run." +Output: ask the rungroup (weekly), deploy signal, and deploy duration. On PR +open, a non-blocking job posts `N automated, M manual tests affected` computed +with `--filter-list`. On merge: a pending manual run (`report for commit <sha>` +in `Regression W<n> <Month> <Year>`) and a prepared automated shared run with the +same title and `TESTOMATIO_SHARED_RUN_TIMEOUT` above the deploy time; `RUN_ID` +captured. On deploy-complete: a non-failing job runs `reporter run --remote +<profile>` with that `RUN_ID` + shared title, launching the affected e2e on +Testomat.io CI. All written in the project's own CI. **Example 2 — no coverage files yet** Input: "Set up selective regression on our pull requests." -Output: find no `coverage.*.yml`; explain regression can't work without them; -delegate to `manual-coverage` / `automation-coverage`; only then wire the CI. 
+Output: find no `coverage.*.yml`; explain the comment and the regression runs +can't work without them; delegate to `manual-coverage` / `automation-coverage`; +only then wire the CI. **Example 3 — only unit tests in repo** Input: "Run affected e2e on PRs." -Output: project-scan finds no e2e framework anywhere; set up the manual flow if -manual cases exist; explain the automated flow needs an e2e suite first; do not -fabricate an e2e job. - -**Example 4 — deploy-on-tag, release-grouped runs, e2e repo has no dispatch yet** -Input: "We deploy when we push a `v*` tag. Group runs by release. Our e2e -tests are in a separate repo that doesn't have a workflow we can trigger." -Output: rungroup strategy = release (`PR Regression <tag>`). Tag-creation event -is the deploy trigger; deploy job completion is the signal. Source repo computes -grep, pre-creates the run titled with the tag, and would dispatch — but the e2e -repo has no dispatchable workflow yet. Propose a workflow file for the e2e repo -with inputs `{grep, run, test_env}` and `TESTOMATIO_RUN` wired into the runner -job; hand it to the user to commit in the e2e repo. Wire the source side once -the e2e workflow exists. +Output: project-scan finds no e2e framework anywhere; set up the PR-open comment +(manual count only) and the on-merge manual run if manual cases exist; explain +the automated phase needs an e2e suite first; do not fabricate an e2e job. + +**Example 4 — no CI profile configured** +Input: "Use `--remote` to run our e2e after deploy." +Output: confirm there is no Testomat.io CI profile yet; explain `--remote` +dispatches a configured profile and cannot work without one; point the user at +Testomat.io **Settings → CI** to add the profile that runs the e2e suite; wire +the prepare step now and the `--remote` execute step once the profile exists. +Until then, offer the inline-runner exception (Step 5) if the e2e suite can run +in this pipeline. --- @@ -399,7 +398,7 @@ the e2e workflow exists. 
| Description | File | | -------------------------------------- | ------------------------------- | -| Reporter command contract & cross-repo | references/REPORTER_CONTRACT.md | +| Reporter command contract & `--remote` | references/REPORTER_CONTRACT.md | ## Related skills diff --git a/skills/setup-pr-testing/evals/evals.json b/skills/setup-pr-testing/evals/evals.json index e1334c4..4d0fa1a 100644 --- a/skills/setup-pr-testing/evals/evals.json +++ b/skills/setup-pr-testing/evals/evals.json @@ -3,23 +3,30 @@ "evals": [ { "id": 0, - "name": "github-cross-repo-happy-path", - "prompt": "The project is at FIXTURE/acme-api. Set up our PRs so they run only the manual cases affected by the change, and after the staging deploy run only the affected e2e tests. The e2e suite is not in this repo — it lives in the separate repo acme/e2e-tests, which already has a workflow e2e.yml that takes a `grep` input. Deploys happen on merge to main via the existing 'Deploy Staging' workflow.", - "expected_output": "Runs project-scan first; detects GitHub Actions (does not assume); reuses existing coverage.manual.yml + coverage.e2e.yml; writes a PR-opened manual workflow (reporter run --kind manual, no deploy dependency) and a separate post-deploy workflow gated on Deploy Staging success that computes the grep via --filter-list and dispatches acme/e2e-tests' existing e2e.yml; isolates the e2e job from the release; states the cross-repo token requirement.", + "name": "github-remote-profile-happy-path", + "prompt": "The project is at FIXTURE/acme-api. On each PR I want a comment showing how many manual and automated tests the change affects. After the PR is merged and the staging deploy finishes, run only the affected e2e tests. The e2e suite runs through our Testomat.io CI profile 'staging-e2e'. 
Deploys happen on merge to main via the existing 'Deploy Staging' workflow.", + "expected_output": "Runs project-scan first; detects GitHub Actions (does not assume); reuses existing coverage.manual.yml + coverage.e2e.yml. Phase 1: a non-blocking PR-opened job computes affected counts with reporter run --filter-list for both maps and posts/updates a single PR comment like '0 automated tests, 10 manual tests are affected by this PR'; creates no runs. Phase 2 (on merge): reporter start --kind manual creates a pending manual run, and reporter start with TESTOMATIO_SHARED_RUN=1, TESTOMATIO_TITLE set to the merge commit, and TESTOMATIO_SHARED_RUN_TIMEOUT above the deploy duration prepares the automated run scoped to the e2e map without executing it, capturing RUN_ID. Phase 3 (after Deploy Staging succeeds): a non-failing job runs reporter run --remote staging-e2e with TESTOMATIO_RUN=$RUN_ID + the same shared title, launching the affected e2e on the CI profile. Sets meaningful title (merge commit) and rungroup; isolates the execute job from the release; does NOT dispatch a foreign repo directly.", "files": [] }, { "id": 1, "name": "gitlab-missing-coverage-maps", - "prompt": "The project is at FIXTURE/shop-web. We want selective regression on our merge requests — only run the tests touched by the change. I don't think we have any coverage files set up yet.", - "expected_output": "Runs project-scan; detects GitLab CI (not GitHub); finds no coverage.*.yml; explains regression cannot work without coverage maps and proposes/delegates to manual-coverage and automation-coverage to create them before wiring .gitlab-ci.yml; asks how MRs deploy and how deploy completion is observable for the e2e flow; manual flow needs no deploy signal.", + "prompt": "The project is at FIXTURE/shop-web. I want each MR to comment how many tests it affects, and after merge+deploy run only the affected tests. 
I don't think we have any coverage files set up yet.", + "expected_output": "Runs project-scan; detects GitLab CI (not GitHub); finds no coverage.*.yml; explains both the affected-counts comment and the regression runs cannot work without coverage maps and proposes/delegates to manual-coverage and automation-coverage to create them before wiring .gitlab-ci.yml; asks how merges deploy, how deploy completion is observable, and roughly how long deploys take (for the shared-run timeout); notes the PR-open comment needs no deploy answers.", "files": [] }, { "id": 2, "name": "no-e2e-suite-refuse-to-fabricate", "prompt": "The project is at FIXTURE/lib-utils. Make our pull requests run the affected end-to-end tests automatically.", - "expected_output": "Runs project-scan; finds only Jest unit tests, no e2e framework anywhere and no e2e suite in another repo; does NOT fabricate an e2e job or pipeline; explains the automated/e2e flow needs an e2e suite first (points at authoring/automation-coverage); offers the manual flow only if manual cases exist; asks which CI is used since none is configured.", + "expected_output": "Runs project-scan; finds only Jest unit tests, no e2e framework anywhere; does NOT fabricate an e2e job, pipeline, or --remote launch; explains the automated phase needs an e2e suite first (points at authoring/automation-coverage); offers the PR-open comment (manual count only) and an on-merge pending manual run if manual cases exist; asks which CI is used since none is configured.", + "files": [] + }, + { + "id": 3, + "name": "no-ci-profile-for-remote", + "prompt": "The project is at FIXTURE/acme-api. Use --remote to run our affected e2e tests after deploy, and comment the affected counts on each PR. 
We use GitHub Actions and deploy on tag v*.", + "expected_output": "Runs project-scan; detects GitHub Actions; confirms whether a Testomat.io CI profile exists for --remote and, finding none, explains --remote dispatches a configured profile and cannot work without one, pointing the user to Testomat.io Settings > CI to create it. Still wires phase 1 (PR-open counts comment) and phase 2 (on-merge manual run + prepared automated shared run titled by commit). Wires the phase-3 reporter run --remote execute step keyed off the tag/deploy-complete signal as non-failing, to be enabled once the profile exists; offers the inline-runner exception if the e2e suite can run in this pipeline instead.", "files": [] } ] diff --git a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md index 8688129..09e0601 100644 --- a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md +++ b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md @@ -1,7 +1,9 @@ -# Reporter Contract & Cross-Repo Triggering +# Reporter Contract: counts, prepared runs & `--remote` -How `@testomatio/reporter` consumes a coverage map, and the exact mechanics for -the cross-repo e2e pattern. This is CI-independent — the CI recipes wrap these. +How `@testomatio/reporter` consumes a coverage map, counts affected tests for +the PR comment, prepares runs without executing them, and launches a prepared +run on a Testomat.io CI profile. This is CI-independent — the CI config wraps +these commands. ## 1. The coverage filter @@ -20,154 +22,161 @@ through the YAML, and selects the matching tests. So the reporter must be launched from inside the repo whose changes you want to detect. Confirmed behavior, not configurable. -Implication for cross-repo: if the coverage map + source live in repo A but the -test runner runs in repo B, launch the reporter with cwd = repo A (so the diff -is repo A's), and let the runner command `cd` into repo B. 
+The whole model rests on this: the repo that holds the coverage map + git +history is where the affected selection is computed — for the comment counts and +for scoping the prepared run alike. -## 2. Manual flow — create a pending run +## 2. Phase 1 — affected-counts comment (`--filter-list`) -No test runner. The reporter just creates a manual run in Testomat.io -containing only the affected cases: +`--filter-list` computes the affected tests **without executing or creating a +run** — exactly what the PR-open notice needs. Pair it with `--format` to get a +clean, parseable list on stdout (the banner is suppressed and logs go to +stderr): -``` -npx @testomatio/reporter run --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=<base-branch>" +```bash +npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.manual.yml,diff=$BASE" --format ids +npx @testomatio/reporter run \ + --filter-list "coverage:file=coverage.e2e.yml,diff=$BASE" --format ids ``` -- Needs `TESTOMATIO` (API key) in env. -- `<base-branch>` for a PR is normally the target branch (`main`/`master`). -- Safe to run the instant a PR opens. No deploy dependency. Non-blocking. -- **Required in practice for `setup-pr-testing`:** - - `TESTOMATIO_TITLE` — meaningful run title. For PR-opened triggers derive - from PR title + number (e.g. `PR Add invoice export #482`). - - `TESTOMATIO_RUNGROUP_TITLE` — the rungroup bucket the run lands in - (week / day / milestone / submodule / release — whichever the user picked - in Step 4 of the skill). Supports `/` nesting for sub-grouping. - - `TESTOMATIO_ENV` (optional) — target environment label. +`--format` values: `ids` (comma-separated, default), `newline` (one per line — +easy to `wc -l`), `json`, `grep` (`(@Sxxxx|@Tyyyy|...)`). -## 3. Automated flow — run only affected tests +- Count the entries per map; empty output = `0`. +- Assemble one line, e.g. 
`0 automated tests, 10 manual tests are affected by + this PR`, and post it via the CI's PR/MR comment API (GitHub / GitLab / + Bitbucket). Prefer updating one existing comment over re-posting. +- Needs `TESTOMATIO` (API key) for the `coverage:` resolution. Creates nothing, + must never fail the PR check. -The reporter wraps the runner command and injects the framework-appropriate -grep (`--grep`, `--testNamePattern`, Cypress env, etc.) for the matched IDs: +## 3. Phase 2 — create the regression runs on merge -``` -npx @testomatio/reporter run "<runner cmd>" \ - --filter "coverage:file=coverage.e2e.yml,diff=<base>" -``` +### 3a. Manual run — created pending, not executed -Examples of `<runner cmd>`: `npx playwright test`, `npx cypress run`, -`npx codeceptjs run`, `npx wdio`. Run this from the repo that holds the -coverage map + git history (see §1). +No test runner; the reporter creates a manual run containing only the affected +cases for testers to pick up: -- **Required in practice for `setup-pr-testing`:** - - `TESTOMATIO_TITLE` — for commit-triggered runs derive from the commit - subject + short SHA; for tag/release-triggered runs use the tag name. - - `TESTOMATIO_RUNGROUP_TITLE` — same strategy as the manual flow unless the - user split them. - - `TESTOMATIO_ENV` (optional) — target environment label. +```bash +npx @testomatio/reporter start --kind manual \ + --filter "coverage:file=coverage.manual.yml,diff=$BASE" +``` -## 4. Inspect-only (dry run) — get the selection without running +- Required env: `TESTOMATIO`, `TESTOMATIO_TITLE` (merge commit, e.g. `report for + commit <sha>`), `TESTOMATIO_RUNGROUP_TITLE` (the rungroup bucket; supports `/` + nesting). `TESTOMATIO_ENV` optional. -`--filter-list` is the documented mode to compute the affected tests **without -executing** them (docs: coverage pipe → "Retrieve a list of tests matching your -filter"). Use it when the selection must be computed in one repo and the tests -run elsewhere. +### 3b. 
Automated run — prepared as a shared run, not executed -Pair it with `--format=grep` to get the alternation string directly on stdout -(no parsing needed): +`start` creates the run scoped to the affected e2e tests and returns its id +**without running anything**. Created as a *shared* run so the later execute +step — and any parallel executors — converge on this one run by title: ```bash -GREP=$(npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.e2e.yml,diff=$BASE" --format=grep) -# -> GREP = "(@Sxxxx|@Tyyyy|...)"; use as: <runner> --grep "$GREP" +RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ + TESTOMATIO_TITLE="report for commit $SHA" \ + TESTOMATIO_SHARED_RUN_TIMEOUT=$DEPLOY_MINUTES \ + npx @testomatio/reporter start --filter "coverage:file=coverage.e2e.yml,diff=$BASE" \ + | tail -n1) ``` -Other `--format` values: `json`, `newline`, `ids` (default: comma-separated). +- `--filter` scopes the prepared run to the affected tests; that scope is stored + on the run and reused at launch time (§4). +- **Output:** the run id is the last line of stdout — capture with `tail -n1`. +- Required env: `TESTOMATIO`, `TESTOMATIO_TITLE`, `TESTOMATIO_RUNGROUP_TITLE`. -Only proceed to trigger the run when `$GREP` is non-empty — an empty grep -means "nothing affected", and most runners treat an empty grep as "run -everything". +### Shared-run env vars (the convergence mechanism) -## 5. `reporter start` — pre-create an empty run +- `TESTOMATIO_SHARED_RUN=1` — report/launch into the run matching + `TESTOMATIO_TITLE` instead of creating a new one. All parallel reporters with + the same title land in the same run. +- `TESTOMATIO_TITLE` — the match key. The prepare step (§3b) and the execute + step (§4) **must** use the identical title. +- `TESTOMATIO_SHARED_RUN_TIMEOUT` — minutes the title stays matchable, **default + 20**. After it elapses a new run is created instead. Set it above the typical + merge→deploy duration, or the execute step won't attach to the prepared run. 
+ Example: `TESTOMATIO_SHARED_RUN_TIMEOUT=120` for a 2-hour window. -`start` initiates a new test run in Testomat.io and returns its identifier, -**without running any tests**. Use it when the selection is computed in one -repo (this one) but the tests run in another job or another repo, and you want -their results to attach to a single, pre-titled run that already lives in the -correct rungroup. +## 4. Phase 3 — launch the prepared run on CI (`reporter run --remote`) + +`reporter run --remote <profile>` asks Testomat.io to dispatch the project's CI +profile (configured under **Settings → CI**) for an already-prepared run, +instead of executing tests locally. This replaces the old cross-repo dispatch: +the CI profile owns the runner, browsers, environment URLs and secrets — the +reporter just triggers it. + +```bash +TESTOMATIO_RUN=$RUN_ID \ +TESTOMATIO_SHARED_RUN=1 \ +TESTOMATIO_TITLE="report for commit $SHA" \ +npx @testomatio/reporter run --remote <profile> +``` + +- `TESTOMATIO_RUN=$RUN_ID` points the launch at the run prepared in §3b. With no + fresh `--filter`, Testomat.io greps that run's **own stored scope**, so only + the affected e2e tests run — no need to recompute the diff at deploy time. +- The CI profile name must exist on the project, otherwise the call fails with + `CI launch failed: No settings for <profile>`. `--remote` cannot be combined + with `--filter-list`. +- Testomat.io passes the run id into the dispatched workflow, so the e2e tests + running there report back into the same prepared run. +- On success the CLI prints the launched profile and run URL, then exits 0; the + run transitions as the CI reports results. + +**Decoupled deploy (no `RUN_ID` to hand over).** If the deploy pipeline can't +carry `RUN_ID`, drop it and let the shared-run **title** match the prepared run +— keep `TESTOMATIO_SHARED_RUN=1` and the identical `TESTOMATIO_TITLE`, and make +sure the shared-run timeout (§3b) is still open. 
Carrying `RUN_ID` is the more +direct link when the same pipeline does merge→deploy; matching by title is the +fallback for a separate deploy pipeline. + +**Optional CI overrides.** `--remote-param key=value` (repeatable) forwards +values into the CI profile config at launch (e.g. `--remote-param branch=main`). +The same launch config can be set via env: `TESTOMATIO_CI_PROFILE` (= the +profile) and `TESTOMATIO_CI_PARAMS` (= comma-separated `key=value` overrides). + +## 5. Inline exception — execute in this pipeline (no CI profile) + +When the e2e suite runs in this pipeline (mobile, API/contract, or an existing +same-repo job) rather than via a CI profile, launch the prepared run by wrapping +the runner directly after deploy: ```bash -RUN_ID=$(npx @testomatio/reporter start --kind automated | tail -n1) +TESTOMATIO_RUN=$RUN_ID npx @testomatio/reporter run "<runner cmd>" \ + --filter "coverage:file=coverage.e2e.yml,diff=$BASE" ``` -- `--kind` — `automated`, `manual`, or `mixed`. -- Required env: `TESTOMATIO`. In practice also set `TESTOMATIO_TITLE` and - `TESTOMATIO_RUNGROUP_TITLE` so the run is created with the right metadata. - `TESTOMATIO_ENV` optional. -- **Output:** the run ID is the last line of stdout — capture with `tail -n1`. -- **Pair:** the consuming job (whatever runs the tests — same repo or - another) exports `TESTOMATIO_RUN=$RUN_ID`. Any subsequent `@testomatio/reporter` - call there will attach to that run instead of creating a new one. - -## 6. Cross-repo e2e pattern (preferred when e2e lives elsewhere) - -Roles: - -- **Source repo** (this one): owns `coverage.e2e.yml` + the diff + the run's - identity (title, rungroup). After the deploy signal: computes `$GREP` via - §4, pre-creates the run via §5, then triggers the e2e repo carrying both - `$GREP` and `$RUN_ID`. -- **E2E repo**: already has a workflow that accepts a grep/filter input + a - run ID and runs the suite against the deployed environment. 
We trigger that - existing workflow — we do **not** duplicate it. - -Why this split: the e2e repo holds the runner, browsers, environment URLs and -secrets. Reproducing all that in the source repo is duplication and a secret- -sprawl risk. The source repo only needs its own coverage map + git + a Testomat.io -API key. - -**Four explicit stages:** - -1. Compute the selection: `GREP=$(reporter run --filter-list … --format=grep)` - (§4). -2. If `$GREP` is empty → stop. Nothing was affected. -3. Pre-create the run: `RUN_ID=$(reporter start … | tail -n1)` (§5). -4. Trigger the e2e repo's workflow with inputs `{grep: $GREP, run: $RUN_ID, - test_env: <env name>}`. The e2e repo's runner job exports - `TESTOMATIO_RUN=<run input>` so its reports attach to the pre-created run. - -Triggering mechanism — whatever cross-repo pipeline/workflow trigger the CI -offers, carrying the grep + run + env as inputs/variables. The built-in CI -token usually cannot reach another repo — a PAT / project access token / app -token / deploy trigger token with permission on the e2e repo is required. -State this to the user as a provisioning step. - -If the e2e repo's workflow does **not** yet accept `grep` + `run` + `test_env` -inputs, that is a small addition there — describe the contract (`SKILL.md` §6 -"When the e2e repo has no dispatchable workflow") and hand the user a draft; -do not silently restructure their pipeline. - -## 7. Diff base caveats - -- **Manual, on PR open**: `diff=<target-branch>` (e.g. `main`) — the natural - "what this PR changes". -- **Automated, after deploy**: the meaningful diff is *what was just deployed*. - On a squash/rebase merge that is one commit, so `diff=<deployed_sha>~1` - works. If the project lands **multi-commit** pushes onto the deploy branch, - `~1` under-selects — instead diff against the previously-deployed SHA - (carry/persist it between deploys). Always surface this assumption to the user - rather than hard-coding silently. - -## 8. 
Required environment - -- `TESTOMATIO` — Testomat.io API key (both flows; in both repos for cross-repo). -- `TESTOMATIO_TITLE` — meaningful run title. Required in practice for every - `setup-pr-testing` reporter call (manual, automated, `start`). -- `TESTOMATIO_RUNGROUP_TITLE` — rungroup the run lands in. Required in - practice for every `setup-pr-testing` reporter call. Supports `/` nesting. -- `TESTOMATIO_RUN` — **cross-repo only.** Set in the e2e repo's runner job to - the run ID returned by `reporter start` in the source repo, so its reports - attach to the pre-created run. +Examples of `<runner cmd>`: `npx playwright test`, `npx cypress run`, +`npx codeceptjs run`, `npx wdio`. The reporter injects the framework-appropriate +grep for the matched IDs and reports into `$RUN_ID`. + +## 6. Diff base caveats + +- **Comment + on-merge runs**: `diff=<target-branch>` (e.g. `main`) — the + natural "what this PR changes". +- **If you recompute at deploy time**: the meaningful diff is *what was just + deployed*. On a squash/rebase merge that is one commit, so `diff=<sha>~1` + works. With **multi-commit** pushes onto the deploy branch, `~1` under-selects + — diff against the previously-deployed SHA (persist it between deploys). + Preferably avoid recomputing: the prepared run already carries its scope, so + the `--remote` launch reuses it (§4) and no deploy-time diff is needed. + Surface this to the user rather than hard-coding silently. + +## 7. Required environment + +- `TESTOMATIO` — Testomat.io API key (every phase that talks to Testomat.io). +- `TESTOMATIO_TITLE` — run title; derive from the merge commit. The automated + prepare (§3b) and execute (§4) steps MUST use the same value. +- `TESTOMATIO_RUNGROUP_TITLE` — rungroup for the created runs. Supports `/` + nesting. +- `TESTOMATIO_SHARED_RUN` — `1` to converge on the title-matched run (automated + prepare + execute). 
+- `TESTOMATIO_SHARED_RUN_TIMEOUT` — minutes the shared title stays matchable + (default 20); set above the merge→deploy duration. +- `TESTOMATIO_RUN` — the prepared run id; set on the execute step to launch that + specific run. +- `TESTOMATIO_CI_PROFILE` / `TESTOMATIO_CI_PARAMS` — env equivalents of + `--remote` / `--remote-param`. - `TESTOMATIO_ENV` (optional) — target environment label. -- Cross-repo CI: a token authorized to trigger the e2e repo's CI (PAT / - project access token / app token; the built-in CI token usually cannot). +- A Testomat.io **CI profile** (Settings → CI) is required for `--remote`; it is + configured in Testomat.io, not as a repo secret. From 61d9dbbb7902b1e3e16c4e556ec8b85d96db5906 Mon Sep 17 00:00:00 2001 From: DavertMik <davert@testomat.io> Date: Sun, 31 May 2026 12:03:57 +0300 Subject: [PATCH 4/7] Clarify: skill runs locally to author CI config, never runs in/as CI MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The deliverable is a working pipeline committed to the project's own CI system. The skill never executes the reporter, runs tests, or creates runs itself — every reporter command is written into the pipeline for CI to run. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --- skills/setup-pr-testing/SKILL.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md index 2bb6f46..de8cb03 100644 --- a/skills/setup-pr-testing/SKILL.md +++ b/skills/setup-pr-testing/SKILL.md @@ -14,6 +14,15 @@ phases** — a cheap notice while the PR is open, real runs once it is merged, a execution once it is deployed. Nothing heavy happens until the change is actually going somewhere. 
+> **I run locally to set up the pipeline — I am never part of CI.** This skill +> executes on a developer's machine (or wherever an agent is invoked) and its +> only deliverable is **working CI configuration committed to the repo** that +> makes the project's own CI system do the three phases below. I do **not** run +> tests, create runs, post comments, or call `@testomatio/reporter` myself — +> every command shown here is something I *write into the pipeline* for CI to +> execute later. If you find yourself running a reporter command to "see it +> work", stop: the goal is a correct pipeline in the target CI, not a run. + ``` PR opened ──▶ NOTICE ONLY — comment the affected test counts on the PR reporter run --filter-list (manual map + e2e map) → counts @@ -91,6 +100,11 @@ the config for whatever CI it finds. ## CRITICAL CONSTRAINTS +- **The deliverable is a working pipeline in the project's CI — nothing else.** + This skill runs locally to *author* CI configuration; it is never a CI step + and never executes the reporter itself. Do not run tests, create runs, or call + `@testomatio/reporter` to "verify" — your output is committed CI config the + target system runs later. Done = a correct pipeline in the project's own CI. - **Discovery first, always.** Never write CI before delegating to `project-scan` (frameworks, manual + automated tests, where e2e tests live). Decisions below depend on its result. 
From 34dba4caedffdac42ae6cbbee3ba18359f27edf5 Mon Sep 17 00:00:00 2001 From: DavertMik <davert@testomat.io> Date: Sun, 31 May 2026 12:06:30 +0300 Subject: [PATCH 5/7] Frame the working CI pipeline as the explicit GOAL of the skill Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --- skills/setup-pr-testing/SKILL.md | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md index de8cb03..f0dd0ba 100644 --- a/skills/setup-pr-testing/SKILL.md +++ b/skills/setup-pr-testing/SKILL.md @@ -14,14 +14,16 @@ phases** — a cheap notice while the PR is open, real runs once it is merged, a execution once it is deployed. Nothing heavy happens until the change is actually going somewhere. -> **I run locally to set up the pipeline — I am never part of CI.** This skill -> executes on a developer's machine (or wherever an agent is invoked) and its -> only deliverable is **working CI configuration committed to the repo** that -> makes the project's own CI system do the three phases below. I do **not** run -> tests, create runs, post comments, or call `@testomatio/reporter` myself — -> every command shown here is something I *write into the pipeline* for CI to -> execute later. If you find yourself running a reporter command to "see it -> work", stop: the goal is a correct pipeline in the target CI, not a run. +> **GOAL: produce a working pipeline inside the project's own CI system.** That +> committed CI configuration is the one and only finished result — reaching it +> means the skill is done. +> +> **I run locally to set up that pipeline — I am never part of CI.** This skill +> executes on a developer's machine (or wherever an agent is invoked); it does +> **not** run tests, create runs, post comments, or call `@testomatio/reporter` +> itself. Every command shown here is something I *write into the pipeline* for +> CI to execute later. 
If you find yourself running a reporter command to "see +> it work", stop — that is not the goal; a correct pipeline in the target CI is. ``` PR opened ──▶ NOTICE ONLY — comment the affected test counts on the PR @@ -100,11 +102,12 @@ the config for whatever CI it finds. ## CRITICAL CONSTRAINTS -- **The deliverable is a working pipeline in the project's CI — nothing else.** - This skill runs locally to *author* CI configuration; it is never a CI step - and never executes the reporter itself. Do not run tests, create runs, or call - `@testomatio/reporter` to "verify" — your output is committed CI config the - target system runs later. Done = a correct pipeline in the project's own CI. +- **The goal is a working pipeline in the project's CI — nothing else.** Reaching + that committed CI configuration is the final goal; the task is complete only + when it exists. This skill runs locally to *author* that config; it is never a + CI step and never executes the reporter itself. Do not run tests, create runs, + or call `@testomatio/reporter` to "verify" — your output is committed CI config + the target system runs later. - **Discovery first, always.** Never write CI before delegating to `project-scan` (frameworks, manual + automated tests, where e2e tests live). Decisions below depend on its result. From f4e5760a60a3a03127331d7628c799a092aa16cf Mon Sep 17 00:00:00 2001 From: DavertMik <davert@testomat.io> Date: Mon, 1 Jun 2026 23:49:37 +0300 Subject: [PATCH 6/7] Use 'start --format id' for clean RUN_ID capture in setup-pr-testing Aligns the skill with the reporter change: start prints only the run id to stdout when --format is set, so capture with --format id instead of tail -n1. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --- skills/setup-pr-testing/SKILL.md | 15 ++++++++------- .../references/REPORTER_CONTRACT.md | 6 +++--- 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md index f0dd0ba..747f503 100644 --- a/skills/setup-pr-testing/SKILL.md +++ b/skills/setup-pr-testing/SKILL.md @@ -37,10 +37,10 @@ PR merged ──▶ Manual regression run (created, pending) titled by the merge commit · in a rungroup · testers pick it up ──▶ Automated regression run (created, NOT executed) - reporter start --filter "coverage:file=coverage.e2e.yml,diff=<base>" + reporter start --filter "coverage:file=coverage.e2e.yml,diff=<base>" --format id TESTOMATIO_SHARED_RUN=1 · TESTOMATIO_TITLE="report for commit <sha>" TESTOMATIO_SHARED_RUN_TIMEOUT=<covers the merge→deploy gap> - → capture RUN_ID + → capture RUN_ID (stdout is just the id with --format id) deploy done ──▶ Launch automated execution on Testomat.io CI reporter run --remote <ci-profile> @@ -306,15 +306,16 @@ call are ordinary CI config you write for whatever system it is. **Automated run** — prepared as a shared run, **not executed**: ``` - TESTOMATIO_SHARED_RUN=1 \ + RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ TESTOMATIO_TITLE="report for commit <short-sha>" \ TESTOMATIO_SHARED_RUN_TIMEOUT=<minutes covering deploy> \ npx @testomatio/reporter start \ - --filter "coverage:file=coverage.e2e.yml,diff=<base>" + --filter "coverage:file=coverage.e2e.yml,diff=<base>" --format id) ``` - Capture the printed run id as `RUN_ID` and carry it to the execute step - (pipeline output/artifact). The shared title + timeout let the execute step - (and any parallel executors) converge on this same prepared run. + `--format id` makes `start` print only the run id to stdout, so `RUN_ID` + captures it cleanly; carry it to the execute step (pipeline output/artifact). 
+ The shared title + timeout let the execute step (and any parallel executors) + converge on this same prepared run. **(c) Deploy done → launch automated execution via `--remote`.** diff --git a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md index 09e0601..3ddf74e 100644 --- a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md +++ b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md @@ -76,13 +76,13 @@ step — and any parallel executors — converge on this one run by title: RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ TESTOMATIO_TITLE="report for commit $SHA" \ TESTOMATIO_SHARED_RUN_TIMEOUT=$DEPLOY_MINUTES \ - npx @testomatio/reporter start --filter "coverage:file=coverage.e2e.yml,diff=$BASE" \ - | tail -n1) + npx @testomatio/reporter start --filter "coverage:file=coverage.e2e.yml,diff=$BASE" --format id) ``` - `--filter` scopes the prepared run to the affected tests; that scope is stored on the run and reused at launch time (§4). -- **Output:** the run id is the last line of stdout — capture with `tail -n1`. +- **Output:** `--format id` makes `start` print only the run id to stdout (banner + and logs go to stderr), so `RUN_ID=$(...)` captures just the id. - Required env: `TESTOMATIO`, `TESTOMATIO_TITLE`, `TESTOMATIO_RUNGROUP_TITLE`. ### Shared-run env vars (the convergence mechanism) From cae8b8e19ef11f7085f4afdbdff54c1ed56a9c84 Mon Sep 17 00:00:00 2001 From: DavertMik <davert@testomat.io> Date: Thu, 11 Jun 2026 15:15:11 +0300 Subject: [PATCH 7/7] Unify coverage skills into qa-test-coverage-map; per-project map naming Merge manual-coverage and automation-coverage into a single qa-test-coverage-map skill: manual and automated tests of one project go in one map. Split files per project, not per test kind. 
Coverage maps are now named by content so CI knows what is inside: coverage.<slug>.yml manual + automated coverage.<slug>.manual.yml manual only coverage.<slug>.e2e.yml automated only Update setup-pr-testing to read the filename suffix and pick the reporter --kind flag accordingly: --kind manual (manual only), --kind mixed (combined), or none (e2e only). One regression run is created per coverage map on merge. Refresh README, testomatio-flow, REPORTER_CONTRACT, and evals for the new skill name and naming scheme. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --- README.md | 3 +- .../skills/automation-coverage | 1 - .../test-management/skills/manual-coverage | 1 - .../skills/qa-test-coverage-map | 1 + skills/automation-coverage/SKILL.md | 267 -------- .../references/COVERAGE_FILE_FORMAT.md | 1 - .../scripts/check-coverage.mjs | 1 - skills/manual-coverage/SKILL.md | 253 -------- skills/qa-test-coverage-map/SKILL.md | 208 +++++++ .../references/COVERAGE_FILE_FORMAT.md | 57 +- .../references/E2E_FRAMEWORKS.md | 4 +- .../scripts/check-coverage.mjs | 4 +- skills/setup-pr-testing/SKILL.md | 587 +++++++----------- skills/setup-pr-testing/evals/evals.json | 10 +- .../references/REPORTER_CONTRACT.md | 89 ++- skills/testomatio-flow/SKILL.md | 3 +- 16 files changed, 542 insertions(+), 948 deletions(-) delete mode 120000 plugins/test-management/skills/automation-coverage delete mode 120000 plugins/test-management/skills/manual-coverage create mode 120000 plugins/test-management/skills/qa-test-coverage-map delete mode 100644 skills/automation-coverage/SKILL.md delete mode 120000 skills/automation-coverage/references/COVERAGE_FILE_FORMAT.md delete mode 120000 skills/automation-coverage/scripts/check-coverage.mjs delete mode 100644 skills/manual-coverage/SKILL.md create mode 100644 skills/qa-test-coverage-map/SKILL.md rename skills/{manual-coverage => qa-test-coverage-map}/references/COVERAGE_FILE_FORMAT.md (54%) rename skills/{automation-coverage => 
qa-test-coverage-map}/references/E2E_FRAMEWORKS.md (90%) rename skills/{manual-coverage => qa-test-coverage-map}/scripts/check-coverage.mjs (91%) diff --git a/README.md b/README.md index 39ae935..d76e5e3 100644 --- a/README.md +++ b/README.md @@ -32,8 +32,7 @@ For other ways of installation (Claude Code plugin, Codex, Cursor etc.) see [ins | `improve-test-cases` | Analyze and improve existing markdown test cases for clarity | | `find-duplicate-cases` | Find duplicate, near-duplicate, and overlapping test cases | | `sync-cases` | Synchronize Markdown test scenarios between local project and Testomat.io | -| `manual-coverage` | Map manual test cases to source files; generate `coverage.manual.yml` for affected-only runs | -| `automation-coverage` | Map e2e tests to source files; generate `coverage.e2e.yml` to run only the tests affected by a diff | +| `qa-test-coverage-map` | Map manual and automated tests to source files; generate a per-project `coverage.*.yml` for affected-only runs | | `testomatio-flow` | Orchestrate complete test case lifecycle: generate, improve, analyze coverage, upload to TMS | | `project-scan` | Scan project source code to inventory languages, frameworks, and existing tests | diff --git a/plugins/test-management/skills/automation-coverage b/plugins/test-management/skills/automation-coverage deleted file mode 120000 index c555298..0000000 --- a/plugins/test-management/skills/automation-coverage +++ /dev/null @@ -1 +0,0 @@ -../../../skills/automation-coverage \ No newline at end of file diff --git a/plugins/test-management/skills/manual-coverage b/plugins/test-management/skills/manual-coverage deleted file mode 120000 index 14c6de4..0000000 --- a/plugins/test-management/skills/manual-coverage +++ /dev/null @@ -1 +0,0 @@ -../../../skills/manual-coverage \ No newline at end of file diff --git a/plugins/test-management/skills/qa-test-coverage-map b/plugins/test-management/skills/qa-test-coverage-map new file mode 120000 index 0000000..ac1c194 --- 
/dev/null +++ b/plugins/test-management/skills/qa-test-coverage-map @@ -0,0 +1 @@ +../../../skills/qa-test-coverage-map \ No newline at end of file diff --git a/skills/automation-coverage/SKILL.md b/skills/automation-coverage/SKILL.md deleted file mode 100644 index f35fbf6..0000000 --- a/skills/automation-coverage/SKILL.md +++ /dev/null @@ -1,267 +0,0 @@ ---- -name: automation-coverage -description: Map automated end-to-end tests (Playwright, Cypress, WebdriverIO, CodeceptJS, Puppeteer, Appium) to source code files and produce a `coverage.e2e.yml` mapping consumed by `@testomatio/reporter --filter "coverage:..."`. Use this skill when the user wants to run only the e2e tests affected by a code change, generate an e2e coverage file, or build a traceability matrix between automated tests and source code. -license: MIT -metadata: - author: Testomat.io - version: 1.0.0 ---- - -# AUTOMATION-COVERAGE SKILL: What I do - -This skill analyzes **automated e2e tests** and the project source code, then produces `coverage.e2e.yml` — a mapping from source files (or globs) to test identifiers (suite IDs, test IDs, tags) embedded in the test code. The mapping is consumed by `@testomatio/reporter run "<runner>" --filter "coverage:file=coverage.e2e.yml,diff=<branch>"` to run only the e2e tests affected by the diff. - -## When to Use - -Trigger this skill when the user wants to: -- Map automated e2e tests to the source code they exercise. -- Generate `coverage.e2e.yml` (or a similarly named file) for use with `@testomatio/reporter`. -- Run only the e2e tests affected by a change instead of the full suite. -- Build a code → e2e-test traceability matrix. -- Speed up CI by limiting the e2e run to tests affected by the PR diff. -- Phrases: "e2e coverage", "automated test coverage", "map tests to code", "affected e2e tests", "selective e2e run", "generate coverage.e2e.yml". 
- ---- - -## CRITICAL CONSTRAINTS - -This skill works **only with automated e2e tests** (Playwright, Cypress, WebdriverIO, Puppeteer, CodeceptJS, Appium, etc.). - -- **DO NOT** process unit tests. -- **DO NOT** process manual markdown test cases (use `manual-coverage` instead - if already exists). -- **DO NOT** suggest creating new tests. -- **Only touch two files in this repo.** It may write `coverage.e2e.yml` (or the path the user gave) and add one `.testclaw-context/` line to `.gitignore` if it is missing. Nothing else — never a source or test file. If the e2e tests live in another repo, clone it into the gitignored `.testclaw-context/e2e-tests/`, never into a tracked folder (see Step 1). -- **Don't write scripts. Never use Python.** Read test files with your file tool; pull out IDs and tags with `grep` (Step 3). To check the finished coverage file, pipe it through `js-yaml` into the one tiny bundled helper (symlinked from `manual-coverage`): `npx js-yaml coverage.e2e.yml | node scripts/check-coverage.mjs` (Step 6). That's the only script. If you ever need more than a `grep`, a one-line `node -e '…'` is the limit — never `python`, never a parser of your own. - ---- - -## Workflow: Build E2E Coverage Map - -### Step 1: Discover the project (delegate to `project-scan`) - -Run the **`project-scan`** skill to inventory the project. Use its output as the source of truth for framework detection, the e2e test set, and the high-level codebase overview — do not duplicate those scans here. - -From the `project-scan` result, capture: -- **Frameworks** — automation framework(s) in use (Playwright, Cypress, WebdriverIO, CodeceptJS, Puppeteer, Appium, Mocha, Jest…). See [E2E Frameworks](./references/E2E_FRAMEWORKS.md) for detection signals if `project-scan` is ambiguous. -- **Automated Tests** — the list of e2e test files (the input to Step 3). -- **Project Overview** — languages, complexity (the framing for Step 5). 
- -If `project-scan` reports **no automated tests** (it checks `.testclaw-context/e2e-tests/` too, so this means nothing was cloned before), or no e2e framework is detected: -- ❓ Ask the user to either: - 1. Give the path to the e2e tests (then re-run `project-scan` there). - 2. Give the git URL of the e2e tests repo → `git clone <url> .testclaw-context/e2e-tests`, add `.testclaw-context/` to `.gitignore` if missing, and re-run `project-scan` against `.testclaw-context/e2e-tests`. - 3. Stop. - -Never clone the tests into a tracked folder in this repo. - -If two frameworks are detected (say Jest and Playwright), ask which one is the e2e framework — coverage filtering runs per runner. - -### Step 2: Verify Testomatio IDs are present - -Coverage filtering relies on `@S` / `@T` identifiers embedded in the test code: - -```javascript -describe('user settings @S92321384', () => { - it('updates avatar @Ta011dfa3', () => { ... }); -}); -``` - -```javascript -test('login @smoke @T6f8e9174', async ({ page }) => { ... }); -``` - -If most files have no `@S` / `@T` markers, stop and instruct the user to populate them first by running: - -```bash -npx check-tests@latest <Framework> "<glob>" --update-ids -``` - -(see the `reporter-setup` skill for the full per-framework command). Without IDs in the source, the reporter cannot select tests by coverage. - -### Step 3: Extract test information - -For each test file extract: - -- **Suite IDs** — `@S` + 8 chars in `describe` / `context` / `Feature` blocks. -- **Test IDs** — `@T` + 8 chars in `it` / `test` / `Scenario` blocks. -- **Tags** — other `@word` markers (`@smoke`, `@jira-123`, `@regression`). -- **What is exercised** — page objects imported, routes hit, fixtures used — used to reason about which source files each test covers. - -**How to extract.** Read the test files, or `grep` for the IDs and tags in test/suite names. 
Run these in whichever directory holds the tests (`tests/e2e`, or the cache `.testclaw-context/e2e-tests`): - -```bash -grep -rnoE '@[ST][0-9a-f]{8}' <dir> # suite/test IDs (+ which file/line) -grep -rnoE '@[A-Za-z0-9_-]+' <dir> | sort -u # every @token, tags included -``` - -Don't write a parser. Never use `python`. If there are no `@S`/`@T` IDs, add them first with `npx check-tests@latest <Framework> "<glob>" --update-ids` (see Step 2). - -### Step 4: Explore the source codebase - -Use the **Project Overview** from Step 1's `project-scan` result (languages, frameworks, complexity) as the starting frame, then identify business code the tests exercise. Skip: - -- Test code itself. -- Dependency / build / vendor folders. -- Framework configs (`playwright.config.*`, `cypress.config.*`, `wdio.conf.*`, `codecept.conf.*`, lock files). - -**Templates and views are source code — map them too.** E2E tests drive the rendered UI, so a change to a template breaks or alters the page a test asserts on. Treat view/template files as first-class mappable source alongside controllers, models, and components — do **not** skip them because they aren't `.js`/`.ts`/`.py`. Cover at least: - -- HTML & component templates: `.html`, `.htm`, `.vue`, `.svelte`, Angular `*.component.html`. -- Logic-in-markup engines: `.hbs`/`.handlebars`, `.ejs`, `.pug`/`.jade`, `.mustache`, `.liquid`. -- Server-side views: `.erb`, `.haml`, `.slim` (Rails); `.blade.php`, `.twig` (PHP); `.j2`/`.jinja`/`.jinja2` (Python); `.cshtml`/`.razor` (.NET); `.jsp` (Java). - -Map a template the same way as code: to the suite/test/tag whose e2e tests render and assert against that view (e.g. `app/views/sessions/new.html.erb` → the login suite). - -❓ If the project structure is large or ambiguous (`project-scan` reports `large` / `very-large` complexity), ask which directories to focus on or to exclude. 
- -### Step 5: Choose the best mapping per source file - -For each candidate source file, pick **one** strategy: - -**A) Map to Suite (`@S...`)** — when most tests in a suite relate to the file. - -```yaml -app/models/user.rb: - - "@S92321384" # Suite: user settings e2e tests -``` - -**B) Map to Test (`@T...`)** — when only one test matches the file. - -```yaml -app/controllers/sessions_controller.rb: - - "@Ta011dfa3" # Test: login with valid credentials -``` - -**C) Map to Tag (`@tag`)** — when a tag groups tests across suites. - -```yaml -tag:@smoke: - - "@Ta011dfa3" - - "@Tb022dfa4" -``` - -**Rules:** -- Prefer specific file paths when tests target a single file. -- Prefer globs (`app/services/jira/**`) when tests cover an entire subtree. -- Prefer Suite mapping when most of a suite relates — terser, survives test additions. -- Use Test mapping when only one test in a large suite is relevant. -- Use Tag mapping for cross-cutting concerns. -- Avoid empty entries. -- Add `#` comments next to each identifier explaining the mapping. - -See [Coverage File Format](./references/COVERAGE_FILE_FORMAT.md) for the full YAML grammar. - -### Step 6: Save and validate the coverage file - -Write the YAML to the resolved output path (default `coverage.e2e.yml` in the project root). If the user supplied a different path => use it. - -**Check it** — one command, run from the project root: - -```bash -npx js-yaml coverage.e2e.yml | node scripts/check-coverage.mjs -``` - -`npx js-yaml` parses the file (and fails loudly if the YAML is malformed, so a broken file never reaches the script). `check-coverage.mjs` then flags any key whose path is missing on disk, any key with no identifiers, and prints every `@S…` / `@T…` / tag it references. Check that list against the IDs you extracted in Step 3 — only you know which are real. Don't re-parse the test files; use the set you already have. Never use `python`. 
- -> The keys in the coverage file are paths to source files in this repo, never `.testclaw-context/...` paths. The cache holds the cloned tests; the coverage file points at the code they exercise. - -Then show the produced YAML to the user. - -### Step 7: Show next steps - -Tell the user how to use the file with `@testomatio/reporter`. The runner command **must** be passed in — `--filter` generates `--grep` that the runner consumes: - -```bash -# Playwright -npx @testomatio/reporter run "npx playwright test" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" - -# Cypress -npx @testomatio/reporter run "npx cypress run" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" - -# WebdriverIO -npx @testomatio/reporter run "npx wdio" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" - -# CodeceptJS -npx @testomatio/reporter run "npx codeceptjs run" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" -``` - -Recommend committing `coverage.e2e.yml` to the repository so CI uses the same mapping. Provide a GitHub Actions snippet on request — see [Coverage File Format](./references/COVERAGE_FILE_FORMAT.md) for the reporter contract. - -### Step 8: Final summary - -Report: -- Framework detected. -- Number of test files scanned. -- Number of tests with `@S` / `@T` IDs. -- Number of source files mapped. -- Output file path. - ---- - -## References - -| Description | File | -| ---------------------------- | --------------------------------------------- | -| Coverage YAML format | ./references/COVERAGE_FILE_FORMAT.md | -| E2E framework detection | ./references/E2E_FRAMEWORKS.md | - -## Bundled script - -`scripts/check-coverage.mjs` (~25 lines, zero deps; symlinked from `manual-coverage`) — reads the parsed coverage map on stdin, flags keys whose path is missing on disk and keys with no identifiers, and lists every `@S`/`@T`/tag the file references. 
Feed it via `js-yaml`, from the project root: - -```bash -npx js-yaml coverage.e2e.yml | node scripts/check-coverage.mjs -``` - -It's the only script — everything else is `grep`, your file tool, or `npx js-yaml`. Don't rewrite it in `python`. - ---- - -## Error Handling - -### Recovery - -- **No e2e tests found (cache included)** → ask for the path, or a git URL to clone into the gitignored `.testclaw-context/e2e-tests/`. -- **Tests have no `@S`/`@T` IDs** → instruct the user to run `npx check-tests <Framework> "<glob>" --update-ids` (cross-link `reporter-setup`). -- **Ambiguous source layout** → ask which directories are application code. - -### Hard Fail (stop immediately) - -- Cannot create or write the output file. -- User refuses to provide a tests directory or to clone a repo. -- The agent is asked to modify files other than the output file (refuse — see CRITICAL CONSTRAINTS). - ---- - -## Examples - -**Playwright project, default output:** -``` -Use automation-coverage skill to map our Playwright tests to source code -``` - -**Custom directory + output:** -``` -Use automation-coverage skill, tests in e2e/playwright, output to ops/coverage.e2e.yml -``` - -**Full workflow (CI):** -1. Run `reporter-setup` to install `@testomatio/reporter` and import tests via `check-tests --update-ids` (so suite/test IDs land in the source). -2. Run `automation-coverage` — internally delegates to `project-scan` for framework detection and inventory, then produces `coverage.e2e.yml`. -3. Commit `coverage.e2e.yml`. -4. In CI, run `npx @testomatio/reporter run "npx playwright test" --filter "coverage:file=coverage.e2e.yml,diff=origin/main"` — only the affected tests execute. 
- ---- - -## Quick Commands - -| Action | Command | -| ----------------------------------- | ------------------------------------------------------------------------------------------------------ | -| Populate IDs in test source | `npx check-tests@latest <Framework> "<glob>" --update-ids` | -| Run affected Playwright tests | `npx @testomatio/reporter run "npx playwright test" --filter "coverage:file=coverage.e2e.yml,diff=main"` | -| Run affected Cypress tests | `npx @testomatio/reporter run "npx cypress run" --filter "coverage:file=coverage.e2e.yml,diff=main"` | -| Run affected CodeceptJS tests | `npx @testomatio/reporter run "npx codeceptjs run" --filter "coverage:file=coverage.e2e.yml,diff=main"` | diff --git a/skills/automation-coverage/references/COVERAGE_FILE_FORMAT.md b/skills/automation-coverage/references/COVERAGE_FILE_FORMAT.md deleted file mode 120000 index fcf5bf8..0000000 --- a/skills/automation-coverage/references/COVERAGE_FILE_FORMAT.md +++ /dev/null @@ -1 +0,0 @@ -../../manual-coverage/references/COVERAGE_FILE_FORMAT.md \ No newline at end of file diff --git a/skills/automation-coverage/scripts/check-coverage.mjs b/skills/automation-coverage/scripts/check-coverage.mjs deleted file mode 120000 index d9271ac..0000000 --- a/skills/automation-coverage/scripts/check-coverage.mjs +++ /dev/null @@ -1 +0,0 @@ -../../manual-coverage/scripts/check-coverage.mjs \ No newline at end of file diff --git a/skills/manual-coverage/SKILL.md b/skills/manual-coverage/SKILL.md deleted file mode 100644 index 7144b7b..0000000 --- a/skills/manual-coverage/SKILL.md +++ /dev/null @@ -1,253 +0,0 @@ ---- -name: manual-coverage -description: Map manual test cases (markdown) to source code files and produce a `coverage.manual.yml` mapping consumed by `@testomatio/reporter --filter "coverage:..."`. 
Use this skill when the user wants to run only the manual tests affected by a code change, generate a manual coverage file, build a traceability matrix between manual cases and source code, or set up change-aware regression for manual QA. -license: MIT -metadata: - author: Testomat.io - version: 1.0.0 ---- - -# MANUAL-COVERAGE SKILL: What I do - -This skill analyzes **manual test cases in markdown format** and the project source code, then produces `coverage.manual.yml` — a mapping from source files (or globs) to manual test identifiers (suite IDs, test IDs, tags). The mapping is consumed by `@testomatio/reporter run --filter "coverage:file=coverage.manual.yml,diff=<branch>"` to create manual runs in Testomat.io that contain only the cases affected by the diff. - -## When to Use - -Trigger this skill when the user wants to: -- Map manual test cases to the source code that implements them. -- Generate `coverage.manual.yml` (or a similarly named file) for use with `@testomatio/reporter`. -- Run only manual regression tests relevant to a code change instead of the full suite. -- Build a code → manual-test traceability matrix. -- Find dead manual tests (tests with no matching source) or coverage gaps (source with no manual tests). -- Phrases: "manual test coverage", "map manual cases to code", "affected manual tests", "selective regression for manual tests", "generate coverage.manual.yml". - ---- - -## CRITICAL CONSTRAINTS - -This skill works **only with manual tests in markdown format**. - -- **DO NOT** process unit, functional, or e2e test files. -- **DO NOT** suggest creating new automated tests. -- **Only touch two files in this repo.** This skill runs inside the user's source-code repo. It may write `coverage.manual.yml` (or the path the user gave) and add one `.testclaw-context/` line to `.gitignore` if it is missing. Nothing else — never a source file. 
Cases pulled from Testomat.io go into the gitignored `.testclaw-context/manual-tests/`, never into a tracked folder (see Step 1). -- **Don't write scripts. Never use Python.** Read `.test.md` files with your file tool; pull out IDs and tags with `grep` (Step 2). To check the finished coverage file, pipe it through `js-yaml` into the one tiny bundled helper: `npx js-yaml coverage.manual.yml | node scripts/check-coverage.mjs` (Step 5). If more checks needed use custom script with `npx js-yaml`. - -If automated test files (e.g. e2e test, unit, api) are encountered while exploring, ignore them and continue with the manual markdown set. - ---- - -## Workflow: Build Manual Coverage Map - -### Step 1: Discover the project (delegate to `project-scan`) - -Run the **`project-scan`** skill to inventory the project. Use its output as the source of truth for both the manual test set and the high-level codebase overview — do not duplicate that scan here. - -From the `project-scan` result, capture: -- **Manual Tests** — the list of `.test.md` files and their suite/test titles (the input to Step 2). -- **Project Overview** — languages, frameworks, complexity (the framing for Step 3). - - -If `project-scan` reports **no manual tests** (it checks the cache too, so this means nothing was pulled before): -- ❓ Ask the user how to proceed: - 1. Pull cases from Testomat.io — have **`sync-cases`** pull into the gitignored cache: `npx check-tests pull -d .testclaw-context/manual-tests`, then add `.testclaw-context/` to `.gitignore` if missing. Re-run `project-scan` and continue. - 2. Point to a directory the scan missed (then re-run `project-scan` there). - 3. Stop. - -Never pull cases into a tracked folder. Don't repeat `sync-cases` pull logic here. - -### Step 2: Extract test information - -Each manual test markdown file follows the format described in [Classical Tests Markdown Format](../generate-cases/references/test-case-format.md) (canonical reference, owned by `generate-cases`). 
For every file extract: - -- **Suite ID** — `@S` + 8 chars, found in the `<!-- suite ... id: @S... -->` block. -- **Test IDs** — `@T` + 8 chars, found in `<!-- test ... id: @T... -->` blocks. -- **Tags** — `@tag` markers in suite/test titles and `tags:` lines inside the metadata blocks. -- **Context** — suite title, test titles, steps, and expected results — used to reason about which source files implement each behavior. - -**How to extract.** Read the `.test.md` files — the metadata blocks are short. For a quick overview, `grep` instead of a parser. Run these in whichever directory holds the cases (`manual-tests/`, or the cache `.testclaw-context/manual-tests/`): - -```bash -grep -rnE 'id:[[:space:]]*@S' <dir> # suite IDs (+ which file) -grep -rnE 'id:[[:space:]]*@T' <dir> # test IDs -grep -rnE '^tags:' <dir> # tags from metadata blocks -grep -rhoE '@[A-Za-z0-9_-]+' <dir> | sort -u # every @token (titles included) -``` - -Don't write a markdown parser. Never use `python`. - -If a file has no `@S` / `@T` IDs, the user hasn't pushed it to Testomat.io yet. Ask whether to push first via `sync-cases`, or skip those files — a mapping without IDs is useless to the reporter. - -### Step 3: Explore the codebase - -Use the **Project Overview** from Step 1's `project-scan` result (languages, frameworks, complexity) as the starting frame, then identify business code that implements the behaviors described by the manual tests. Skip everything not part of the application — `project-scan` already excludes most of this, but reinforce when reading source: - -- Test code: `*.test.*`, `*.spec.*`, `*_test.*`, `test_*.*`, `*.cy.*`, `__tests__/`, `__specs__/`, `tests/`, `spec/`, `specs/`. -- Markdown manual test directories themselves. -- Dependency / build / vendor folders. -- Framework configs and lock files. - -Work with the project structure (controllers, models, services, components, pages, routes, etc.) — not the testing infrastructure. 
- -**Templates and views are source code — map them too.** A manual case walks through the rendered UI, so a change to a template alters the screen the tester checks. Treat view/template files as first-class mappable source alongside controllers, models, and components — do **not** skip them because they aren't `.js`/`.ts`/`.py`. Cover at least: - -- HTML & component templates: `.html`, `.htm`, `.vue`, `.svelte`, Angular `*.component.html`. -- Logic-in-markup engines: `.hbs`/`.handlebars`, `.ejs`, `.pug`/`.jade`, `.mustache`, `.liquid`. -- Server-side views: `.erb`, `.haml`, `.slim` (Rails); `.blade.php`, `.twig` (PHP); `.j2`/`.jinja`/`.jinja2` (Python); `.cshtml`/`.razor` (.NET); `.jsp` (Java). - -Map a template the same way as code: to the suite/test/tag whose manual cases exercise that screen (e.g. `app/views/sessions/new.html.erb` → the login suite). - -❓ If the project structure is ambiguous or large (`project-scan` reports `large` / `very-large` complexity), ask the user which directories to focus on or to exclude. - -### Step 4: Choose the best mapping per source file - -For each candidate source file, pick **one** mapping strategy based on which option gives the cleanest, most stable selection: - -**A) Map to Suite (`@S...`)** — when most tests in a suite relate to the file. - -```yaml -app/models/user.rb: - - "@S816410d6" # Suite: User permissions -``` - -**B) Map to Test (`@T...`)** — when only one specific test matches the file. - -```yaml -app/controllers/sessions_controller.rb: - - "@T6f8e9174" # Test: User is blocked after 5 failed login attempts -``` - -**C) Map to Tag (`@tag`)** — when a tag groups tests across multiple suites that all relate to the file. - -```yaml -app/services/jira_service.rb: - - "@jira" # Tag: All JIRA integration manual tests -``` - -**Decision guidance:** -- Prefer Suite mapping over listing many individual Test IDs from the same suite. -- Prefer Test mapping when only one test in a large suite is relevant. 
-- Prefer Tag mapping when the relevant tests live across several suites. -- Globs (`app/services/jira/**`) are valid file keys when a whole subtree maps to the same identifiers. -- Avoid empty entries. -- Add `#` comments next to each identifier explaining the mapping. - -See [Coverage File Format](./references/COVERAGE_FILE_FORMAT.md) for the full YAML grammar. - -### Step 5: Save and validate the coverage file - -Write the YAML to the resolved output path (default `coverage.manual.yml`). If the user supplied a different path => use it. -(keep `#` comments next to each ID so future readers can audit the mapping without opening Testomat.io). - -**Check it** — one command, run from the project root: - -```bash -npx js-yaml coverage.manual.yml | node scripts/check-coverage.mjs -``` - -`npx js-yaml` parses the file (and fails loudly if the YAML is malformed, so a broken file never reaches the script). `check-coverage.mjs` then flags any key whose path is missing on disk, any key with no identifiers, and prints every `@S…` / `@T…` / tag it references. Check that list against the IDs you extracted in Step 2 — only you know which are real. Don't re-parse the markdown; use the set you already have. Never use `python`. - -> The keys in the coverage file are paths to source files in this repo, never `.testclaw-context/...` paths. The cache holds the test cases; the coverage file points at the code they cover. - -Then show the produced YAML to the user. 
- -### Step 6: Show next steps - -Tell the user how to use the file with `@testomatio/reporter`: - -```bash -# Create a pending manual run that includes only cases affected by the diff vs main -npx @testomatio/reporter run \ - --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=main" -``` - -For batched regression cycles, recommend grouping runs: - -```bash -TESTOMATIO_RUNGROUP="Regression 911" npx @testomatio/reporter run \ - --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=main" -``` - -Recommend committing `coverage.manual.yml` to the repository so CI and teammates use the same mapping. - -If you pulled the cases, tell the user they stay in the gitignored `.testclaw-context/manual-tests/`. A re-run or a follow-up question reuses them, and nothing was committed. Don't delete the cache or move it into a tracked folder. - -### Step 7: Suggest follow-ups - -Once the file is saved, propose any of: - -- Scan the source for **coverage gaps** (features without manual tests). On approval, propose new manual cases (delegate to `generate-cases`). -- Scan for **dead tests** (manual tests whose features no longer exist in source). -- Answer questions like "do we have manual tests for X?" from the cached cases in `.testclaw-context/manual-tests/`. -- If the user wants to *edit* cases, they can edit them right in `.testclaw-context/manual-tests/` and push back with `sync-cases` — gitignored doesn't mean read-only. - ---- - -## References - -| Description | File | -| ---------------------------- | ---------------------------------------------------------- | -| Coverage YAML format | ./references/COVERAGE_FILE_FORMAT.md | -| Manual test markdown format | ../generate-cases/references/test-case-format.md | - -## Bundled script - -`scripts/check-coverage.mjs` (~25 lines, zero deps) — reads the parsed coverage map on stdin, flags keys whose path is missing on disk and keys with no identifiers, and lists every `@S`/`@T`/tag the file references. 
Feed it via `js-yaml`, from the project root: - -```bash -npx js-yaml coverage.manual.yml | node scripts/check-coverage.mjs -``` - -It's the only script — everything else is `grep`, your file tool, or `npx js-yaml`. Don't rewrite it in `python`. - ---- - -## Error Handling - -### Recovery - -- **No manual tests found (cache included)** → have `sync-cases` pull into `.testclaw-context/manual-tests/`, or ask the user for a directory the scan missed. -- **Markdown files with no `@S`/`@T` IDs** → ask whether to push first via `sync-cases`, or skip those files. -- **Ambiguous source layout** → ask the user which directories are application code. - -### Hard Fail (stop immediately) - -- Cannot create or write the output file. -- User refuses to provide a tests directory or to pull from Testomat.io. -- The agent is asked to modify files other than the output file (refuse — see CRITICAL CONSTRAINTS). - ---- - -## Examples - -**Build the mapping in a source repo (cases pulled if needed):** -``` -Use manual-coverage skill to build coverage.manual.yml for our manual cases -``` -If there are no local `.test.md` files, it pulls them into the gitignored `.testclaw-context/manual-tests/` and works from there. - -**Cases already local:** -``` -Use manual-coverage skill for the cases in manual-tests/ -``` - -**With a custom output path:** -``` -Use manual-coverage skill, output to ops/coverage.qa.yml -``` - -**Full workflow (source repo):** -1. `manual-coverage` runs `project-scan`. No local cases, so it has `sync-cases` pull into `.testclaw-context/manual-tests/` (gitignored) and re-runs `project-scan`. -2. It maps source files to suite/test/tag IDs and writes `coverage.manual.yml`. The only tracked changes are that file and one line in `.gitignore`. -3. `npx @testomatio/reporter run --kind manual --filter "coverage:file=coverage.manual.yml,diff=main"` creates a pending run with only the affected cases. 
- ---- - -## Quick Commands - -| Action | Command | -| -------------------------------------- | ---------------------------------------------------------------------------------------------------- | -| Pull cases into the gitignored cache | `npx check-tests pull -d .testclaw-context/manual-tests` | -| Create affected manual run | `npx @testomatio/reporter run --kind manual --filter "coverage:file=coverage.manual.yml,diff=main"` | -| Group runs | `TESTOMATIO_RUNGROUP="Regression 911" npx @testomatio/reporter run --kind manual --filter "..."` | diff --git a/skills/qa-test-coverage-map/SKILL.md b/skills/qa-test-coverage-map/SKILL.md new file mode 100644 index 0000000..4afafcb --- /dev/null +++ b/skills/qa-test-coverage-map/SKILL.md @@ -0,0 +1,208 @@ +--- +name: qa-test-coverage-map +description: Map tests to the source code they cover and generate a per-project coverage map (`coverage.<project>.yml`) consumed by `@testomatio/reporter --filter "coverage:..."`. Handles manual markdown test cases and automated e2e tests (Playwright, Cypress, WebdriverIO, CodeceptJS, Puppeteer, Appium) in one file. Use this skill when the user wants to run only the tests affected by a code change, generate a coverage file (any coverage*.yml), build a code-to-test traceability matrix, or set up change-aware regression — for manual QA, automated suites, or both. Replaces the former manual-coverage and automation-coverage skills. +license: MIT +metadata: + author: Testomat.io + version: 2.0.0 +--- + +# QA-TEST-COVERAGE-MAP SKILL: What I do + +I analyze a project's tests — manual cases in markdown, automated e2e tests, or +both — and produce a **coverage map**: YAML mapping source files (or globs) to +the Testomat.io identifiers of the tests that cover them (`@S` suite IDs, `@T` +test IDs, `@tag`s). `@testomatio/reporter` consumes the map to count, create, +or run only the tests affected by a git diff: + +```bash +npx @testomatio/reporter run --filter "coverage:file=<map>,diff=main" ... 
+``` + +Manual and automated tests living in one project go into **one map**. Files are +split per *project*, never per test kind. + +## File naming is the contract + +One file per Testomat.io project, in the repo root. The name tells every +consumer — CI, the `setup-pr-testing` skill — what kinds of tests are inside +without opening the file: + +| Project's tests | File | +| ------------------ | ----------------------------------- | +| manual + automated | `coverage.<projectSlug>.yml` | +| manual only | `coverage.<projectSlug>.manual.yml` | +| automated e2e only | `coverage.<projectSlug>.e2e.yml` | + +`<projectSlug>` is a kebab-case identifier of the Testomat.io project (e.g. +`billing-app`). Most repos serve one project and get one file; a monorepo or a +split manual/e2e setup may serve several projects — one file each. Legacy +`coverage.manual.yml` / `coverage.e2e.yml` files (no slug) are single-kind maps +from the old scheme; offer to migrate them when you touch them. + +## Constraints + +- **Write only the coverage file(s)**, plus one `.testclaw-context/` line in + `.gitignore` if missing. Never modify source or test files. +- **Pulled or cloned material lives in the gitignored `.testclaw-context/`**: + manual cases pulled from Testomat.io → `.testclaw-context/manual-tests/`, an + external e2e repo → `.testclaw-context/e2e-tests/`. Never into a tracked + folder. Tests already in the repo are used where they are. +- **Only tests carrying Testomat.io IDs can be mapped.** A map without real + `@S`/`@T` IDs is useless to the reporter. +- **Unit and integration tests are out of scope** — map manual cases and e2e + tests only. +- **No ad-hoc scripts, never Python.** File reads, `grep`, `npx js-yaml`, and + the bundled `scripts/check-coverage.mjs` cover everything this skill needs. + +## Workflow + +### 1. Discover (delegate to `project-scan`) + +Run **`project-scan`**. 
It reports which test kinds exist (manual `.test.md` +files, automated e2e frameworks), the languages, and the project shape — that +determines the file name(s) and everything after. + +If a kind the user expects is missing locally: + +- manual cases → have `sync-cases` pull them: + `npx check-tests pull -d .testclaw-context/manual-tests` +- e2e tests in another repo → `git clone <url> .testclaw-context/e2e-tests` +- otherwise ask for the directory, or stop. + +If several automated frameworks are detected, ask which one is the e2e suite. + +### 2. Pick the file name(s) + +Confirm which Testomat.io project the tests belong to and its slug (ask if you +cannot tell). Choose the file name from the table above based on what Step 1 +found. Multiple projects → split the tests by project and build one map each. +If the user supplied an output path, use it. + +### 3. Extract test identifiers + +- **Manual cases** carry IDs in metadata comments — + `<!-- suite ... id: @S... -->`, `<!-- test ... id: @T... -->`, plus `tags:` + lines ([format reference](../generate-cases/references/test-case-format.md)). +- **Automated tests** carry IDs in test titles — + `describe('user settings @S92321384')`, `it('updates avatar @Ta011dfa3')` + (per-framework syntax: [E2E_FRAMEWORKS.md](./references/E2E_FRAMEWORKS.md)). + +`grep` is enough — don't write a parser: + +```bash +grep -rnoE '@[ST][0-9a-f]{8}' <dir> # suite/test IDs +grep -rhoE '@[A-Za-z0-9_-]+' <dir> | sort -u # every @token, tags included +``` + +Also read the tests themselves — titles, steps, page objects, routes — to +understand which source files each one exercises. + +**Missing IDs?** The tests haven't been synced with Testomat.io yet. Manual +cases: push via `sync-cases` first. Automated tests: +`npx check-tests@latest <Framework> "<glob>" --update-ids` (see +`reporter-setup`). Don't map tests without IDs. + +### 4. 
Map source files to identifiers + +Explore the business code — controllers, models, services, components, routes. +Skip test code, dependencies, build output, and framework configs. + +**Templates and views are source too** (`.html`, `.vue`, `.svelte`, `.erb`, +`.blade.php`, `.twig`, `.cshtml`, …): a template change alters the screen a +test checks, so map them like any other file. + +Per source file or glob, pick the identifier that gives the most stable +selection: + +- **Suite `@S...`** — most tests of a suite relate to the file (preferred, + survives test additions). +- **Test `@T...`** — only one test in a large suite is relevant. +- **Tag `@tag`** — the related tests are spread across suites. + +In a mixed project, manual and automated identifiers sit side by side under +the same key: + +```yaml +app/controllers/sessions_controller.rb: + - "@S816410d6" # Suite: Login (manual) + - "@Ta011dfa3" # Test: login with valid credentials (e2e) +``` + +Use globs (`app/services/jira/**`) for whole subtrees, annotate every +identifier with a `#` comment, avoid empty entries. Full grammar: +[COVERAGE_FILE_FORMAT.md](./references/COVERAGE_FILE_FORMAT.md). + +If the codebase is large or ambiguous, ask which directories to focus on. + +### 5. Validate + +```bash +npx js-yaml coverage.<slug>.yml | node scripts/check-coverage.mjs +``` + +`js-yaml` fails loudly on malformed YAML; the checker flags keys whose path is +missing on disk and keys with no identifiers, and lists every referenced ID — +cross-check that list against the set you extracted in Step 3. Map keys are +repo paths, never `.testclaw-context/...` paths. Show the final YAML to the +user. + +### 6. 
Hand off + +Show how the map is used — the suffix decides the `--kind` flag: + +```bash +# manual-only map → pending manual run with the affected cases +npx @testomatio/reporter run --kind manual \ + --filter "coverage:file=coverage.<slug>.manual.yml,diff=main" + +# mixed map → one run holding affected manual cases and automated tests +npx @testomatio/reporter run --kind mixed \ + --filter "coverage:file=coverage.<slug>.yml,diff=main" + +# automated-only map → wrap the runner, no --kind needed +npx @testomatio/reporter run "npx playwright test" \ + --filter "coverage:file=coverage.<slug>.e2e.yml,diff=main" +``` + +Recommend committing the map so CI and teammates share one mapping, and +**`setup-pr-testing`** to wire it into CI (PR comments with affected counts, +post-merge regression runs). If cases were pulled, they stay in the gitignored +cache for reuse — don't delete or commit them. + +Useful follow-ups: scan for coverage gaps (source with no tests → delegate to +`generate-cases`) or dead tests (tests whose feature no longer exists). + +## References + +| Description | File | +| --------------------------- | ------------------------------------------------ | +| Coverage YAML format | ./references/COVERAGE_FILE_FORMAT.md | +| E2E framework detection | ./references/E2E_FRAMEWORKS.md | +| Manual test markdown format | ../generate-cases/references/test-case-format.md | + +## Bundled script + +`scripts/check-coverage.mjs` (~25 lines, zero deps) — sanity-checks a parsed +map from stdin. It is the only script; everything else is `grep`, file tools, +or `npx js-yaml`. Don't rewrite it, don't add others, never use `python`. + +## Examples + +**Mixed project:** "Build a coverage map for our manual cases and Playwright +tests" → one `coverage.billing-app.yml` mapping source files to both kinds of +identifiers. + +**Manual-only, nothing local:** cases are pulled into +`.testclaw-context/manual-tests/`, mapped, and written to +`coverage.billing-app.manual.yml`. 
Tracked changes: that file and one +`.gitignore` line. + +**Two projects in a monorepo:** `apps/shop` and `apps/admin` sync to different +Testomat.io projects → `coverage.shop.yml` + `coverage.admin.yml`. + +## Related skills + +`project-scan` (mandatory first), `sync-cases` (pull/push manual cases), +`reporter-setup` (reporter install + `--update-ids`), `generate-cases` (author +missing cases), `setup-pr-testing` (consume the map in CI). diff --git a/skills/manual-coverage/references/COVERAGE_FILE_FORMAT.md b/skills/qa-test-coverage-map/references/COVERAGE_FILE_FORMAT.md similarity index 54% rename from skills/manual-coverage/references/COVERAGE_FILE_FORMAT.md rename to skills/qa-test-coverage-map/references/COVERAGE_FILE_FORMAT.md index 344e3a4..9f1ca43 100644 --- a/skills/manual-coverage/references/COVERAGE_FILE_FORMAT.md +++ b/skills/qa-test-coverage-map/references/COVERAGE_FILE_FORMAT.md @@ -1,10 +1,14 @@ # Coverage File Format -`coverage.manual.yml` and `coverage.e2e.yml` share the same grammar. Both are read by `@testomatio/reporter run --filter "coverage:file=<path>,diff=<branch>"`. +All coverage maps share one grammar, whatever their name — +`coverage.<project>.yml` (manual + automated), `coverage.<project>.manual.yml` +(manual only), `coverage.<project>.e2e.yml` (automated only). All are read by +`@testomatio/reporter --filter "coverage:file=<path>,diff=<branch>"`. ## Top-level shape -A YAML map. Keys are file paths or globs relative to the repository root. Values are lists of identifiers — Suite IDs, Test IDs, or tags. +A YAML map. Keys are file paths or globs relative to the repository root. +Values are lists of identifiers — Suite IDs, Test IDs, or tags. ```yaml <file or glob>: @@ -29,7 +33,13 @@ describe('user settings @S92321384', () => { }); ``` -They are populated by `npx check-tests <Framework> "<glob>" --update-ids` after the tests are imported into Testomat.io. 
+They are populated by `npx check-tests <Framework> "<glob>" --update-ids` after +the tests are imported into Testomat.io. + +For manual cases, IDs live in the markdown metadata blocks +(`<!-- suite ... id: @S... -->`, `<!-- test ... id: @T... -->`). In a mixed map +both kinds sit under the same key — the reporter doesn't care where an ID came +from. ## File keys @@ -62,23 +72,27 @@ app/models/user.rb: ## Reporter usage -The runner command **must** be the first positional argument — `--filter` generates a `--grep` that the runner consumes. +The map's suffix decides the `--kind` flag: ```bash -# Playwright -npx @testomatio/reporter run "npx playwright test" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" +# manual-only map — create a pending manual run +npx @testomatio/reporter run --kind manual \ + --filter "coverage:file=coverage.<project>.manual.yml,diff=main" -# Cypress -npx @testomatio/reporter run "npx cypress run" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" +# mixed map — one run holding manual cases and automated tests +npx @testomatio/reporter run --kind mixed \ + --filter "coverage:file=coverage.<project>.yml,diff=main" -# WebdriverIO / Mocha / Jest / CodeceptJS — pass the corresponding runner command -npx @testomatio/reporter run "npx codeceptjs run" \ - --filter "coverage:file=coverage.e2e.yml,diff=main" +# automated-only map — wrap the runner, no --kind +# (--filter generates a --grep that the runner consumes) +npx @testomatio/reporter run "npx playwright test" \ + --filter "coverage:file=coverage.<project>.e2e.yml,diff=main" ``` -The `diff` value must be a stable branch (`main`, `origin/main`) the reporter can run `git diff` against. +For automated execution, the runner command (`npx playwright test`, +`npx cypress run`, `npx codeceptjs run`, `npx wdio`, …) is the first positional +argument. The `diff` value must be a stable ref (`main`, `origin/main`) the +reporter can run `git diff` against. 
## GitHub Actions example @@ -104,7 +118,7 @@ jobs: TESTOMATIO: ${{ secrets.TESTOMATIO }} run: | npx @testomatio/reporter run "npx playwright test" \ - --filter "coverage:file=coverage.e2e.yml,diff=origin/main" + --filter "coverage:file=coverage.shop.e2e.yml,diff=origin/main" ``` ## Checking the coverage file @@ -112,15 +126,22 @@ jobs: One command, run from the project root — no parser of your own: ```bash -npx js-yaml coverage.manual.yml | node scripts/check-coverage.mjs # or coverage.e2e.yml +npx js-yaml coverage.<project>.yml | node scripts/check-coverage.mjs ``` -`npx js-yaml` parses the YAML (it fails loudly on a malformed file, so a broken one never reaches the script). `scripts/check-coverage.mjs` (~25 lines, zero deps; ships with both `manual-coverage` and `automation-coverage`, the latter as a symlink) reads that parsed map on stdin, flags any key whose path is missing on disk, flags any key with no identifiers, lists every `@S…` / `@T…` / tag the file references, and exits non-zero on a problem. +`npx js-yaml` parses the YAML (it fails loudly on a malformed file, so a broken +one never reaches the script). `scripts/check-coverage.mjs` (~25 lines, zero +deps; ships with `qa-test-coverage-map`) reads that parsed map on stdin, flags any key +whose path is missing on disk, flags any key with no identifiers, lists every +`@S…` / `@T…` / tag the file references, and exits non-zero on a problem. -The script can't tell which identifiers are real — check the listed ones against the set you extracted earlier in the workflow. Don't re-parse the test set, don't write your own YAML parser, and never use `python`. +The script can't tell which identifiers are real — check the listed ones +against the set you extracted earlier in the workflow. Don't re-parse the test +set, don't write your own YAML parser, and never use `python`. ## Authoring tips +- One file per Testomat.io project; the name says what's inside. - Prefer Suite IDs when most of a suite relates to a file. 
- Use Test IDs only when one test in a large suite is relevant. - Use tags (`@smoke`, `@billing`, `@jira-…`) for cross-cutting concerns. diff --git a/skills/automation-coverage/references/E2E_FRAMEWORKS.md b/skills/qa-test-coverage-map/references/E2E_FRAMEWORKS.md similarity index 90% rename from skills/automation-coverage/references/E2E_FRAMEWORKS.md rename to skills/qa-test-coverage-map/references/E2E_FRAMEWORKS.md index 452d318..ce0d6c2 100644 --- a/skills/automation-coverage/references/E2E_FRAMEWORKS.md +++ b/skills/qa-test-coverage-map/references/E2E_FRAMEWORKS.md @@ -57,9 +57,9 @@ npx check-tests@latest CodeceptJS "**/*_test.js" --update-ids npx check-tests@latest WebdriverIO "**/*.{test,e2e}.js" --update-ids ``` -`check-tests` rewrites the test files in place, inserting the IDs assigned by Testomat.io. Commit the changes before running `automation-coverage`. +`check-tests` rewrites the test files in place, inserting the IDs assigned by Testomat.io. Commit the changes before running `qa-test-coverage-map`. ## Related skills - `reporter-setup` — install `@testomatio/reporter` and import tests via `check-tests`. -- `sync-cases` — pull/push manual cases (the manual-coverage counterpart). See its [Testomat.io CLI reference](../../sync-cases/references/TESTOMATIO_CLI.md) for the full `check-tests` command set, including `--update-ids`. +- `sync-cases` — pull/push manual cases. See its [Testomat.io CLI reference](../../sync-cases/references/TESTOMATIO_CLI.md) for the full `check-tests` command set, including `--update-ids`. 
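Reviewer note: the checks this patch attributes to `scripts/check-coverage.mjs` (missing paths, keys with no identifiers, a list of every referenced identifier) could look roughly like the sketch below. It is not the shipped script. The function name `checkCoverage`, the injected `pathExists` callback, and the assumed map shape (source path mapped to a string or an array of strings) are all hypothetical and chosen for illustration only.

```javascript
// Hypothetical sketch of the checks described for check-coverage.mjs.
// ASSUMPTION: the parsed map looks like { "<source path>": ["@S…", "@T…", "@tag"] }
// and every key is a plain path rather than a glob pattern.
function checkCoverage(map, pathExists) {
  const problems = [];
  const identifiers = new Set();

  for (const [key, value] of Object.entries(map ?? {})) {
    // Accept a single string or an array; drop anything that is not a non-empty string.
    const ids = [value]
      .flat()
      .filter((id) => typeof id === 'string' && id.trim() !== '');

    if (!pathExists(key)) problems.push(`missing path: ${key}`);
    if (ids.length === 0) problems.push(`no identifiers: ${key}`);

    for (const id of ids) identifiers.add(id.trim());
  }

  // The caller decides the exit code: non-zero when problems is not empty.
  return { problems, identifiers: [...identifiers].sort() };
}
```

In a real script, `pathExists` would most likely be `existsSync` from `node:fs`, and the map would come from parsing the JSON that `npx js-yaml` writes to stdin. Injecting the callback keeps the sketch free of filesystem access, so it can be exercised without a project checkout.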
diff --git a/skills/manual-coverage/scripts/check-coverage.mjs b/skills/qa-test-coverage-map/scripts/check-coverage.mjs similarity index 91% rename from skills/manual-coverage/scripts/check-coverage.mjs rename to skills/qa-test-coverage-map/scripts/check-coverage.mjs index 0d9134c..267923b 100755 --- a/skills/manual-coverage/scripts/check-coverage.mjs +++ b/skills/qa-test-coverage-map/scripts/check-coverage.mjs @@ -1,10 +1,10 @@ #!/usr/bin/env node -// Sanity-check a coverage.*.yml mapping (manual or e2e — same format). +// Sanity-check a coverage*.yml mapping (manual, e2e, or mixed — same format). // // `js-yaml` does the YAML parsing; this script just checks the result. // Run it from the project root and pipe the parsed file in: // -// npx js-yaml coverage.manual.yml | node check-coverage.mjs +// npx js-yaml coverage.<project>.yml | node check-coverage.mjs // // (`npx js-yaml` prints JSON, and fails loudly if the YAML is malformed, // so a broken file never reaches this script.) diff --git a/skills/setup-pr-testing/SKILL.md b/skills/setup-pr-testing/SKILL.md index 747f503..12eaa3b 100644 --- a/skills/setup-pr-testing/SKILL.md +++ b/skills/setup-pr-testing/SKILL.md @@ -1,414 +1,283 @@ --- name: setup-pr-testing -description: Set up change-aware PR regression testing — wire CI so each pull request posts a comment listing how many manual and automated tests its diff affects, and so that after the PR is merged and deployed only those affected tests are actually run, via Testomat.io coverage maps and @testomatio/reporter. Use this skill when the user wants PR-triggered regression, "comment affected test counts on PRs", "run only affected tests after merge/deploy", selective manual runs per PR, post-deploy e2e, a coverage-driven CI pipeline, weekly/grouped regression runs in a rungroup, launching automated regression on Testomat.io CI via `--remote`, or to connect coverage.manual.yml / coverage.e2e.yml to their CI. 
CI-agnostic — adapts to whatever CI the project already uses (GitHub Actions, GitLab CI, Jenkins, Bitbucket, CircleCI, etc.). Trigger it even if the user only says "make PRs comment affected tests" or "run the right tests after merge". +description: Set up change-aware PR regression testing — wire CI so each pull request posts a comment listing how many manual and automated tests its diff affects, and so that after the PR is merged and deployed only those affected tests are actually run, via Testomat.io coverage maps and @testomatio/reporter. Use this skill when the user wants PR-triggered regression, "comment affected test counts on PRs", "run only affected tests after merge/deploy", selective manual runs per PR, post-deploy e2e, a coverage-driven CI pipeline, weekly/grouped regression runs in a rungroup, launching automated regression on Testomat.io CI via `--remote`, or to connect coverage*.yml maps to their CI. CI-agnostic — adapts to whatever CI the project already uses (GitHub Actions, GitLab CI, Jenkins, Bitbucket, CircleCI, etc.). Trigger it even if the user only says "make PRs comment affected tests" or "run the right tests after merge". license: MIT metadata: author: Testomat.io - version: 2.0.0 + version: 3.0.0 --- # SETUP-PR-TESTING SKILL: What I do I wire a project's CI so a pull request's change drives testing in **three -phases** — a cheap notice while the PR is open, real runs once it is merged, and +phases** — a cheap notice while the PR is open, runs created once it is merged, execution once it is deployed. Nothing heavy happens until the change is actually going somewhere. -> **GOAL: produce a working pipeline inside the project's own CI system.** That -> committed CI configuration is the one and only finished result — reaching it -> means the skill is done. 
-> -> **I run locally to set up that pipeline — I am never part of CI.** This skill -> executes on a developer's machine (or wherever an agent is invoked); it does -> **not** run tests, create runs, post comments, or call `@testomatio/reporter` -> itself. Every command shown here is something I *write into the pipeline* for -> CI to execute later. If you find yourself running a reporter command to "see -> it work", stop — that is not the goal; a correct pipeline in the target CI is. +> **GOAL: a working pipeline inside the project's own CI system.** That +> committed CI configuration is the one and only finished result. +> **I run locally to author it — I am never part of CI.** Do not run tests, +> create runs, or call `@testomatio/reporter` to "see it work"; every command +> below is something the pipeline executes later. ``` PR opened ──▶ NOTICE ONLY — comment the affected test counts on the PR - reporter run --filter-list (manual map + e2e map) → counts - posted via the CI's native PR-comment API (GitHub / GitLab / Bitbucket) + reporter run --filter-list per coverage map → counts + posted via the CI's native PR-comment API no runs created · never blocks the PR - e.g. 
"0 automated tests, 10 manual tests are affected by this PR" -PR merged ──▶ Manual regression run (created, pending) - reporter start --kind manual --filter "coverage:file=coverage.manual.yml,diff=<base>" - titled by the merge commit · in a rungroup · testers pick it up - - ──▶ Automated regression run (created, NOT executed) - reporter start --filter "coverage:file=coverage.e2e.yml,diff=<base>" --format id - TESTOMATIO_SHARED_RUN=1 · TESTOMATIO_TITLE="report for commit <sha>" - TESTOMATIO_SHARED_RUN_TIMEOUT=<covers the merge→deploy gap> - → capture RUN_ID (stdout is just the id with --format id) +PR merged ──▶ One regression run per coverage map (created, NOT executed) + coverage.<slug>.manual.yml → reporter start --kind manual + coverage.<slug>.yml → reporter start --kind mixed (shared run) + coverage.<slug>.e2e.yml → reporter start (shared run) + titled by the merge commit · in a rungroup + manual cases sit pending for testers · automated tests wait for deploy deploy done ──▶ Launch automated execution on Testomat.io CI - reporter run --remote <ci-profile> - (TESTOMATIO_RUN=$RUN_ID + same shared-run title + TESTOMATIO_SHARED_RUN=1) - Testomat.io dispatches the project's CI profile; the affected e2e - tests run there and report back into the same prepared run. + reporter run --remote <ci-profile> (runs containing automated tests) + TESTOMATIO_RUN=$RUN_ID · TESTOMATIO_SHARED_RUN=1 · same shared title ``` -Every phase is **change-aware**: a coverage map (`coverage.*.yml`) maps source -files → test/suite IDs, and `@testomatio/reporter` filters by the PR diff so -only impacted tests are counted, prepared, and run. - -The big shift from older setups: **execution moved to after merge+deploy**, the -**PR-open step is now just an informational comment**, and **automated tests are -launched through Testomat.io's `--remote` CI profile** instead of this repo -reaching into another repo's pipeline. 
If the e2e tests live elsewhere, the CI -profile points there — the reporter never triggers a foreign repo directly. - -I do **not** invent tests, write CI from guesswork, or assume a CI system. I -discover the project, confirm the unknowns with the user, reuse existing -coverage maps (or delegate creating them), and express the three phases in -whatever CI the project actually uses. - -## What this skill is and isn't - -This skill is the **method**, not a catalogue of CI snippets. The valuable, -non-obvious knowledge is: - -1. the three-phase model and why each phase is gated where it is (comment = - harmless, on open; manual run = on merge, no deploy needed; automated run = - prepared on merge, executed only after deploy, never blocking the release); -2. the `@testomatio/reporter` command contract — `--filter-list` for the notice - counts, `reporter start` to create runs without executing, the shared-run - env vars (`TESTOMATIO_SHARED_RUN`, `TESTOMATIO_TITLE`, - `TESTOMATIO_SHARED_RUN_TIMEOUT`), and `reporter run --remote <profile>` to - launch a prepared run on Testomat.io CI (see - [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)); -3. the decisions — coverage maps required, whether a Testomat.io CI profile - exists for `--remote`, and what to ask the user (deploy trigger, completion - signal, rungroup strategy, diff base, shared-run timeout). - -Translating a trigger into a specific CI's YAML/Groovy/config is **not** special -knowledge — you already know how GitHub Actions, GitLab CI, Jenkins, or any -other CI express "run on PR open", "run on merge", "run after deploy", "post a -PR comment", and "don't fail the pipeline". Write that yourself for the CI in -front of you. **Do not write explicit per-CI workflow files into this skill or -expect a per-CI recipe file** — give the agent the contract and let it author -the config for whatever CI it finds. 
+## Coverage maps drive everything + +A coverage map maps source files → test identifiers; the reporter filters it by +the PR diff so only impacted tests are counted, prepared, and run. **One map +per Testomat.io project, named by content** — created by the `qa-test-coverage-map` +skill, never hand-written here: + +| Map | Contains | On merge | After deploy | +| ---------------------------- | ------------------ | ----------------------------------------------------------- | ------------------------------------- | +| `coverage.<slug>.manual.yml` | manual only | `start --kind manual` — run is complete, testers pick it up | nothing | +| `coverage.<slug>.yml` | manual + automated | `start --kind mixed`, prepared as a shared run | launch automated tests via `--remote` | +| `coverage.<slug>.e2e.yml` | automated only | `start` (no `--kind`), prepared as a shared run | launch via `--remote` | + +The suffix is the whole contract: it tells CI which `--kind` to pass and +whether a deploy-time execute phase exists. Legacy `coverage.manual.yml` / +`coverage.e2e.yml` (no slug) mean manual-only / automated-only. A repo serving +several Testomat.io projects has several maps — repeat the per-map commands +for each. + +## Method, not snippets + +The valuable knowledge here is the three-phase model, the reporter command +contract ([references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md)), +and the decisions to confirm with the user. Translating a trigger into a +specific CI's YAML/Groovy is not — you already know how every CI expresses "on +PR open", "on merge", "after deploy", "post a PR comment", and "don't fail the +pipeline". Write that config yourself for the CI in front of you; never bake +per-CI workflow files into this skill. ## When to use -- "Comment how many tests a PR affects", "show affected test counts on PRs". -- "Run only the affected tests after a PR is merged and deployed". 
-- "Create a pending manual regression run per merged PR with just the relevant cases". -- "Launch affected e2e on Testomat.io CI after deploy", "use `--remote` for regression". -- "Hook coverage.manual.yml / coverage.e2e.yml into our CI". +- "Comment how many tests a PR affects." +- "Run only the affected tests after a PR is merged and deployed." +- "Create a pending manual regression run per merged PR." +- "Launch affected e2e on Testomat.io CI after deploy" / "use `--remote`". +- "Hook our coverage*.yml into CI." --- ## CRITICAL CONSTRAINTS -- **The goal is a working pipeline in the project's CI — nothing else.** Reaching - that committed CI configuration is the final goal; the task is complete only - when it exists. This skill runs locally to *author* that config; it is never a - CI step and never executes the reporter itself. Do not run tests, create runs, - or call `@testomatio/reporter` to "verify" — your output is committed CI config - the target system runs later. -- **Discovery first, always.** Never write CI before delegating to `project-scan` - (frameworks, manual + automated tests, where e2e tests live). Decisions below - depend on its result. -- **Never assume the CI system, and never hardcode one.** Identify the CI the - project already uses by reading the repo; if you cannot tell, ask. Then write - config for *that* CI from your own knowledge of it. Do not bake per-CI - workflow files into this skill. -- **Regression cannot work without a coverage map.** If the needed - `coverage.*.yml` is missing, the only correct move is to propose creating it - via the coverage skills — not to hand-write a mapping or skip filtering. The - PR-open comment also needs the maps to count affected tests. -- **PR open is a NOTICE ONLY.** On PR open/update, compute affected counts with - `--filter-list` and post a PR comment. **Never create runs and never block the - PR** in this phase. Counts only — e.g. `0 automated tests, 10 manual tests are - affected by this PR`. 
Must work on GitHub, GitLab, and Bitbucket using each - one's native PR/MR comment API. -- **Regression runs are created on MERGE, not on PR open.** When the PR merges: - create the manual run (pending) and create the automated run (prepared, not - executed). Nothing executes until deploy. -- **Automated run is prepared, then launched separately.** On merge, create it - with `reporter start` as a **shared run** keyed by the merge commit - (`TESTOMATIO_SHARED_RUN=1`, `TESTOMATIO_TITLE="report for commit <sha>"`) and - do **not** execute it. After deploy, launch it with `reporter run --remote - <profile>`. -- **Set the shared-run timeout to span the merge→deploy gap.** The shared run is - matched by title only within `TESTOMATIO_SHARED_RUN_TIMEOUT` minutes (default - 20). If deploy takes longer, the execute step won't attach to the prepared run - and a stray new run appears. Ask how long deploys take and set the timeout - above it (e.g. `TESTOMATIO_SHARED_RUN_TIMEOUT=120`). -- **Execute automated through `--remote`, never by triggering another repo.** - `reporter run --remote <profile>` asks Testomat.io to dispatch the project's - configured CI profile. Do not reach into a foreign repo's pipeline with a PAT - and `{grep, run, env}` inputs anymore — that responsibility now lives in the - Testomat.io CI profile (configured under **Settings → CI**). -- **The automated execute step must never fail the deploy/release pipeline.** It - is observation, not a gate. Isolate it (a non-failing job keyed off the deploy - signal). -- **Every created run gets a meaningful title.** Set `TESTOMATIO_TITLE` on every - `start`/`run` call that creates or launches a run. Derive it from the merge - commit (subject + short SHA), e.g. `report for commit <sha>`. The automated - prepare and execute steps MUST use the **same** title so they converge on one - shared run. -- **Every created run lands in a rungroup.** Set `TESTOMATIO_RUNGROUP_TITLE` on - the manual and automated runs. 
The grouping strategy (week / day / milestone / - release / submodule) is the user's choice; ask in Step 4. -- **Only touch CI config and (if asked) coverage files.** Do not modify - application or test source. +- **The deliverable is committed CI config — never execute the reporter + yourself.** +- **Discovery first.** Delegate to `project-scan` before writing anything. +- **Never assume or hardcode the CI system.** Read the repo; if unclear, ask. +- **No coverage map → no regression.** Missing maps are created via the + `qa-test-coverage-map` skill, never hand-written; filtering is never skipped. +- **PR open is a notice only.** Counts in a comment; no runs; never blocks the + PR. +- **Runs are created on merge, executed after deploy.** Manual-only runs are + complete at creation; runs containing automated tests are prepared as shared + runs (`TESTOMATIO_SHARED_RUN=1`, title from the merge commit, + `TESTOMATIO_SHARED_RUN_TIMEOUT` above the merge→deploy gap — the 20-minute + default is usually too short). +- **Automated execution goes through `reporter run --remote <profile>`** — a + Testomat.io CI profile (Settings → CI), never by reaching into another + repo's pipeline. The prepare and execute steps MUST share the same + `TESTOMATIO_TITLE` so they converge on one run. +- **The execute step never fails the deploy/release pipeline.** It is + observation, not a gate — isolate it as a non-failing job. +- **Every run gets a meaningful title and a rungroup.** `TESTOMATIO_TITLE` + from the merge commit; `TESTOMATIO_RUNGROUP_TITLE` per the user's grouping + strategy. +- **Only touch CI config** (and coverage maps if delegated). Never source or + test files. --- ## Workflow -### Step 1 — Discover the project (delegate to `project-scan`) +### Step 1 — Discover (delegate to `project-scan`) + +Capture: is there an e2e framework (unit/integration don't count), are there +manual `.test.md` cases, do automated tests live here or elsewhere. 
This tells +you which maps to expect and which phases apply. -Run **`project-scan`**. From its result capture: +### Step 2 — Identify the CI -- **Frameworks** — is there an automated **e2e** framework here (Playwright, - Cypress, CodeceptJS, WebdriverIO, Puppeteer, Appium)? Unit/integration - frameworks do **not** count as e2e. -- **Manual tests** — are there `*.test.md` cases (here or pullable from Testomat.io)? -- **Automated tests** — present in this repo, or only elsewhere? +Read the repo's CI config files. Several CIs or none → ask which one runs PRs. -This tells you which maps you can build counts from, and which phases apply -(manual-only vs manual + automated). +### Step 3 — Locate the coverage maps -### Step 2 — Identify the CI system (do not assume) +Look for `coverage*.yml` in the repo root. The filename tells you what each +map contains (table above). Validate that a map still resolves before wiring +CI to it (the `qa-test-coverage-map` skill bundles a checker). If the map for a test +kind the user wants is missing → propose creating it and delegate to +**`qa-test-coverage-map`**; the phases for that kind cannot be wired without it. -Read the repo to see which CI it already uses — its CI config files / dotfiles -make this obvious. If nothing is configured, or several CIs are present, **ask -the user which CI runs their PRs**. Once you know the CI, you know how to write -its triggers and how to post a PR/MR comment with it. +### Step 4 — Ask the unknowns -### Step 3 — Locate or create the coverage maps +Read the CI files first so you don't ask what's already answered: -Look for existing coverage files in the repo root: `coverage.manual.yml`, -`coverage.e2e.yml`, or any `coverage*.yml`. Inspect the keys to confirm they map -this repo's source. +1. **What triggers a deploy** that should launch automated execution — merge + to main, release event, tag, push to a deploy branch, or manual? +2. 
**How does the deploy signal completion?** A concrete observable event — + a job finishing, a pipeline event, a status check, a health endpoint. +3. **How long is merge → deploy-complete?** Sets + `TESTOMATIO_SHARED_RUN_TIMEOUT` comfortably above it. +4. **Rungroup strategy** — week / day / milestone / release / submodule. +5. **Diff base** — the PR's target branch is the natural base; for post-deploy + ranges see the caveat in REPORTER_CONTRACT.md. -For each kind of test the user has: +The PR comment needs only the maps and the base (Q5). Manual-only runs need +only the rungroup (Q4). Q1–Q3 matter only when a map contains automated tests. -- **Map exists** → reuse it. Validate it still resolves (the coverage skills - bundle a checker) before wiring CI to it. -- **Map missing** → both the affected-count comment and the regression run for - that kind **will not work**. Propose creating it and, on agreement, delegate: - - manual → **`manual-coverage`** → `coverage.manual.yml` - - automated/e2e → **`automation-coverage`** → `coverage.e2e.yml` +### Step 5 — Confirm how automated execution launches -You need a map for each kind you want to report or run. The PR-open comment can -list only the kinds whose maps exist. +`reporter run --remote <profile>` dispatches a **Testomat.io CI profile** +(Settings → CI) that owns the runner, browsers, environment URLs, and secrets — +whether the e2e suite lives in this repo or a dedicated one. No profile yet → +that is a prerequisite for the user; wire the execute step so it can be enabled +once the profile exists. -### Step 4 — Ask the unknowns +**Run inline instead of `--remote`** when the suite must run in this pipeline: +mobile (simulators, signing), API tests against a server this repo spins up, or +an existing same-repo e2e job that already works. The execute step then wraps +the runner directly on the prepared run. 
+ +**No e2e suite anywhere** → wire only the comment and manual phases; explain +the automated phase needs a suite first. Never fabricate an e2e job. + +### Step 6 — Wire the phases into the CI + +Write the jobs in the CI's own syntax. The skill-specific parts are the +reporter commands and env vars; triggers, secrets, and the PR-comment call are +ordinary CI config. + +**(a) PR opened → counts comment (notice only).** + +For each map, a `--filter-list` dry run lists the matching IDs without running +or creating anything: + +```bash +npx @testomatio/reporter run \ + --filter-list "coverage:file=<map>,diff=<base>" --format ids +``` + +Count the IDs (empty output = 0) and post one comment via the CI's native +PR/MR comment API, e.g. `0 automated tests, 10 manual tests are affected by +this PR`. Per-kind numbers come from `.manual`/`.e2e` maps; a mixed map yields +one combined count (`12 tests (manual + automated)`). Update one existing +comment rather than re-posting. Never fail the PR check. + +**(b) PR merged → create one run per map.** + +Required env on every call: `TESTOMATIO`, `TESTOMATIO_TITLE` (merge commit, +e.g. `report for commit <short-sha>`), `TESTOMATIO_RUNGROUP_TITLE`. + +Manual-only map — created pending, done: + +```bash +npx @testomatio/reporter start --kind manual \ + --filter "coverage:file=coverage.<slug>.manual.yml,diff=<base>" +``` + +Map containing automated tests — prepared as a shared run, **not executed**. +Mixed maps add `--kind mixed`; e2e-only maps take no `--kind`: + +```bash +RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ +TESTOMATIO_TITLE="report for commit <short-sha>" \ +TESTOMATIO_SHARED_RUN_TIMEOUT=<minutes covering deploy> \ +npx @testomatio/reporter start --kind mixed \ + --filter "coverage:file=coverage.<slug>.yml,diff=<base>" --format id) +``` + +`--format id` prints only the run id, so `RUN_ID` captures it cleanly — carry +it to the execute step (pipeline output/artifact). 
In a mixed run the manual +cases are immediately pending for testers; the automated part waits for +deploy. + +**(c) Deploy done → launch automated execution.** + +For each run prepared in (b), triggered by the deploy-completion signal — +never by PR open: + +```bash +TESTOMATIO_RUN=$RUN_ID \ +TESTOMATIO_SHARED_RUN=1 \ +TESTOMATIO_TITLE="report for commit <short-sha>" \ +npx @testomatio/reporter run --remote <ci-profile> +``` + +With no fresh `--filter`, Testomat.io reuses the run's stored scope — only the +affected automated tests run on the CI profile and report back into the same +run. If the deploy pipeline can't carry `RUN_ID`, the identical shared title +(within the timeout) matches the prepared run instead. Keep this job +non-failing and off the release's critical path. + +Inline exception (Step 5) — wrap the runner instead of `--remote`: + +```bash +TESTOMATIO_RUN=$RUN_ID npx @testomatio/reporter run "<runner cmd>" \ + --filter "coverage:file=<map>,diff=<base>" +``` -Read their CI files first so you don't ask what's already there, then confirm: - -1. **When does a deploy that should trigger automated execution happen?** Offer - the common options: - - on merge to the main/deploy branch (deploy is part of that pipeline), - - on a release event (GitHub Release published, GitLab release created), - - on a git tag (e.g. `v*`), - - on every push to a deploy branch, - - never automated — deploy is manual, launch e2e on demand. -2. **How does that deploy signal its completion?** This gates the execute step. - Need a concrete, observable signal — a deploy job/stage completing, a - `workflow_run`/pipeline-finished event, a deployment event, a status check - going green, a release/tag created, or a health endpoint returning 200. -3. **How long does merge → deploy-complete usually take?** This sets - `TESTOMATIO_SHARED_RUN_TIMEOUT` (minutes, default 20) so the execute step - still matches the prepared shared run. 
Pick a value comfortably above the - typical deploy duration. -4. **What rungroup strategy should regression runs use?** Every created run goes - into a rungroup; the question is what defines a group: - - **week** — e.g. `Regression W2 May 2026` - (`W$(( ($(date +%-d) - 1) / 7 + 1 )) $(date +'%B %Y')`), - - **day** — e.g. `Regression 2026-05-27`, - - **milestone** — the active sprint/milestone, - - **release** — the upcoming release/version tag, - - **submodule** — the project area touched (monorepos). - Default to one strategy for both the manual and automated runs unless the - user splits them. -5. **Diff base** for "what changed": for the PR-open comment and the on-merge - runs, the PR's target branch (`main`/`master`) is the natural base. For a - post-deploy range see the one-commit-per-deploy caveat in - [references/REPORTER_CONTRACT.md](references/REPORTER_CONTRACT.md). - -**The PR-open comment needs none of the deploy answers (Q1–Q3).** It only needs -the coverage maps and a base branch (Q5). **The manual run needs only the -rungroup answer (Q4)** — it is created pending on merge regardless of deploy. - -### Step 5 — Confirm how automated execution is launched (`--remote`) - -Automated execution runs through a **Testomat.io CI profile** triggered by -`reporter run --remote <profile>`. Decide and confirm: - -- **Is there a CI profile configured on the project?** (Testomat.io → **Settings - → CI**.) It names the workflow Testomat.io dispatches — in this repo or in a - dedicated e2e repo. If none exists, that is a prerequisite the user must set up - (point them at the CI configuration page); `--remote` cannot work without it. -- **The CI profile owns the runner, browsers, environment URLs, and secrets.** - This is exactly why `--remote` replaces cross-repo dispatch: the source repo - only needs its coverage map + git history to *prepare* the scoped run; the CI - profile decides *where and how* the suite actually executes. 
No PAT, no - foreign-repo trigger, no `{grep, run, env}` inputs to maintain. -- **Exceptions — run inline in this pipeline instead of `--remote`:** - - **Mobile (Appium, Detox, native simulators)** — bound to specific OS images - and signing material on the test job; keep it in this pipeline. - - **API / contract tests** that exercise a server this repo can spin up — run - against `localhost`; not gated on a remote deploy or a CI profile. - - **An existing same-repo e2e job that already works** — don't fight it; you - can still prepare the run here and let that job execute it. - For these, after deploy run the reporter wrapping the runner directly - (`reporter run "<runner cmd>" --filter "coverage:file=coverage.e2e.yml,diff=<base>"`) on the - prepared run instead of `--remote`. -- **No e2e suite anywhere (only unit/integration here).** Do not stand up an e2e - job. Set up the comment (manual only) and the manual run; tell the user the - automated phase needs an e2e suite first (point at `automation-coverage` / test - authoring). - -### Step 6 — Wire the three phases into the project's CI - -Express each applicable phase in the CI from Step 2, in that CI's own syntax — -you know it. The skill-specific parts are the reporter commands, the env vars, -and the structural rules; the trigger/secret/job scaffolding and the PR-comment -call are ordinary CI config you write for whatever system it is. - -**(a) PR opened → affected-counts comment (notice only).** - -- Trigger: PR/MR opened (add reopened/synchronize if the user wants the comment - to refresh on new commits). 
-- For each coverage map that exists, compute the affected count with a - `--filter-list` dry run — it lists matching IDs without running anything: - ``` - npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.manual.yml,diff=<base>" --format ids - npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.e2e.yml,diff=<base>" --format ids - ``` - Count the IDs (empty output = `0`). Assemble a one-line comment, e.g. - `0 automated tests, 10 manual tests are affected by this PR`. -- Post it with the CI's native PR/MR comment mechanism (GitHub PR comment, - GitLab MR note, Bitbucket PR comment). Prefer updating a single existing - comment over adding a new one on every push. -- This phase **creates no runs** and must **never fail the PR check**. - -**(b) PR merged → create the regression runs.** - -- Trigger: PR merged into the main/deploy branch. -- Required env on these calls: `TESTOMATIO` (API key), `TESTOMATIO_TITLE`, - `TESTOMATIO_RUNGROUP_TITLE`. Title derives from the merge commit - (e.g. `report for commit <short-sha>`); rungroup from Step 4. - - **Manual run** — created pending, testers pick it up; nothing executes: - ``` - npx @testomatio/reporter start --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=<base>" - ``` - - **Automated run** — prepared as a shared run, **not executed**: - ``` - RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ - TESTOMATIO_TITLE="report for commit <short-sha>" \ - TESTOMATIO_SHARED_RUN_TIMEOUT=<minutes covering deploy> \ - npx @testomatio/reporter start \ - --filter "coverage:file=coverage.e2e.yml,diff=<base>" --format id) - ``` - `--format id` makes `start` print only the run id to stdout, so `RUN_ID` - captures it cleanly; carry it to the execute step (pipeline output/artifact). - The shared title + timeout let the execute step (and any parallel executors) - converge on this same prepared run. 
- -**(c) Deploy done → launch automated execution via `--remote`.** - -- Trigger: the deploy-completion signal from Step 4 — never "PR opened". -- Launch the prepared run on the Testomat.io CI profile, reusing the same title - and shared-run flag, and pointing at the prepared run: - ``` - TESTOMATIO_RUN=$RUN_ID \ - TESTOMATIO_SHARED_RUN=1 \ - TESTOMATIO_TITLE="report for commit <short-sha>" \ - npx @testomatio/reporter run --remote <ci-profile> - ``` - `TESTOMATIO_RUN=$RUN_ID` ties the launch to the run prepared in (b); with no - fresh `--filter`, Testomat.io greps that run's own stored scope, so only the - affected e2e tests run. Testomat.io dispatches the CI profile; the suite runs - there and reports back into the same run. -- If the deploy pipeline is fully decoupled and cannot carry `RUN_ID`, the - execute step matches the prepared run by its shared-run **title** instead — - which is why the title must be identical and the shared-run timeout must still - be open. Keep `TESTOMATIO_SHARED_RUN=1` and the same `TESTOMATIO_TITLE`. -- **Isolation:** keep this off the release's critical path — a separate - job/pipeline keyed off the deploy signal, set non-failing for the release - (`continue-on-error`, `allow_failure: true`, etc.). It reports to Testomat.io; - it must not gate the deploy. -- **Inline exception (Step 5):** when not using a CI profile, replace the - `--remote` call with the runner wrapped directly, on the prepared run: - ``` - TESTOMATIO_RUN=$RUN_ID npx @testomatio/reporter run "<runner cmd>" \ - --filter "coverage:file=coverage.e2e.yml,diff=<base>" - ``` - -State the secrets the user must provision — the `TESTOMATIO` API key on the jobs -that talk to Testomat.io, and a token allowed to post PR/MR comments for phase -(a) (the CI's built-in token usually suffices for same-repo comments). A CI -profile for `--remote` is configured in Testomat.io, not via a repo secret. 
+State the secrets to provision: `TESTOMATIO` on jobs that talk to Testomat.io, +and a token able to post PR/MR comments (the CI's built-in token usually +suffices). The CI profile for `--remote` is configured in Testomat.io, not as +a repo secret. ### Step 7 — Summarize and hand off -Report concisely: - -- CI system targeted; files written in this repo. -- Which phases are wired (comment / manual run / automated prepare / automated - execute) and which were skipped and why (missing coverage map / no e2e suite / - no CI profile). -- The comment format and where its counts come from (which maps, which base). -- Title scheme (merge commit) and that the automated prepare + execute steps - share one title; rungroup strategy and where it is computed. -- Shared-run timeout chosen and the deploy duration it must cover. -- How automated execution is launched: the `--remote <profile>` CI profile (or - the inline-runner exception), and that `RUN_ID` (or the shared title) links - the prepare and execute steps. -- Secrets/tokens to add before it works; that a Testomat.io CI profile must - exist for `--remote`. -- Any assumption needing confirmation (deploy signal, diff base, timeout, - rungroup recipe). -- Recommend committing the coverage map(s) so CI and the team share one mapping. +Report: the CI targeted and files written; which phases are wired per map and +which were skipped (missing map / no e2e suite / no CI profile); the comment +format and its maps; title scheme and rungroup; the shared-run timeout and the +deploy duration it covers; how the execute step finds the prepared run +(`RUN_ID` or shared title); secrets and prerequisites still needed; assumptions +to confirm. Recommend committing the coverage maps. --- ## Examples -**Example 1 — comment on open, regression after merge+deploy, e2e via CI profile** -Input: "Comment how many tests each PR touches, then after we merge and deploy, -run the affected e2e and create a manual run." 
-Output: ask the rungroup (weekly), deploy signal, and deploy duration. On PR -open, a non-blocking job posts `N automated, M manual tests affected` computed -with `--filter-list`. On merge: a pending manual run (`report for commit <sha>` -in `Regression W<n> <Month> <Year>`) and a prepared automated shared run with the -same title and `TESTOMATIO_SHARED_RUN_TIMEOUT` above the deploy time; `RUN_ID` -captured. On deploy-complete: a non-failing job runs `reporter run --remote -<profile>` with that `RUN_ID` + shared title, launching the affected e2e on -Testomat.io CI. All written in the project's own CI. - -**Example 2 — no coverage files yet** -Input: "Set up selective regression on our pull requests." -Output: find no `coverage.*.yml`; explain the comment and the regression runs -can't work without them; delegate to `manual-coverage` / `automation-coverage`; -only then wire the CI. - -**Example 3 — only unit tests in repo** -Input: "Run affected e2e on PRs." -Output: project-scan finds no e2e framework anywhere; set up the PR-open comment -(manual count only) and the on-merge manual run if manual cases exist; explain -the automated phase needs an e2e suite first; do not fabricate an e2e job. - -**Example 4 — no CI profile configured** -Input: "Use `--remote` to run our e2e after deploy." -Output: confirm there is no Testomat.io CI profile yet; explain `--remote` -dispatches a configured profile and cannot work without one; point the user at -Testomat.io **Settings → CI** to add the profile that runs the e2e suite; wire -the prepare step now and the `--remote` execute step once the profile exists. -Until then, offer the inline-runner exception (Step 5) if the e2e suite can run -in this pipeline. +**Example 1 — one project, manual + automated (`coverage.shop.yml`)** +"Comment how many tests each PR touches, then after merge+deploy run the +affected e2e and give testers their cases." → Ask rungroup, deploy signal, +deploy duration. 
PR open: a non-blocking job posts the combined affected +count. On merge: one `start --kind mixed` shared run titled +`report for commit <sha>` — manual cases pending immediately, `RUN_ID` +captured. On deploy-complete: a non-failing job runs +`reporter run --remote <profile>` with that `RUN_ID`, executing the affected +automated tests into the same run. + +**Example 2 — no coverage maps yet** +"Set up selective regression on our PRs." → Find no `coverage*.yml`; explain +nothing can be counted or filtered without a map; delegate to `qa-test-coverage-map`; +only then wire CI. + +**Example 3 — manual-only project** +project-scan finds `.test.md` cases and no e2e framework → wire the PR comment +(manual count) and the on-merge `start --kind manual` run from +`coverage.<slug>.manual.yml`. No deploy phase at all; explain the automated +phase needs an e2e suite first. Never fabricate an e2e job. + +**Example 4 — no CI profile for `--remote`** +Confirm none exists; explain `--remote` dispatches a configured profile +(Testomat.io → Settings → CI) and cannot work without one. Wire phases (a) and +(b) now and the (c) execute step ready to enable once the profile exists; +offer the inline-runner exception meanwhile. --- @@ -420,6 +289,6 @@ in this pipeline. ## Related skills -`project-scan` (mandatory first), `manual-coverage` and `automation-coverage` -(create the maps this skill consumes), `reporter-setup` (install the reporter if -the project has no Testomat.io integration yet). +`project-scan` (mandatory first), `qa-test-coverage-map` (creates the maps this skill +consumes), `reporter-setup` (install the reporter if the project has no +Testomat.io integration yet). 
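Note on the PR-open count used in Example 1 above: a minimal sketch of turning the comma-separated `--format ids` output into the number posted in the comment, with empty output counting as `0`. The helper name `count_ids` and the sample IDs are assumptions made for illustration; they are not part of `@testomatio/reporter`.

```shell
# count_ids: hypothetical helper, not part of @testomatio/reporter.
# Counts the entries in a comma-separated ID list; empty input counts as 0.
count_ids() {
  # Strip whitespace so a bare newline from the CLI is treated as empty.
  ids=$(printf '%s' "$1" | tr -d '[:space:]')
  if [ -z "$ids" ]; then
    echo 0
    return
  fi
  # One entry per line, then count the non-empty lines.
  printf '%s\n' "$ids" | tr ',' '\n' | grep -c .
}

# In CI the list would come from the dry run, for example:
#   IDS=$(npx @testomatio/reporter run \
#     --filter-list "coverage:file=<map>,diff=<base>" --format ids)
IDS="id-one,id-two,id-three"   # made-up sample value, for illustration only
echo "$(count_ids "$IDS") tests (manual + automated) are affected by this PR"
```

Because the job must never fail the PR check, a real pipeline would likely also guard the `npx` call itself (for example with `|| true`) so that a reporter error degrades to a count of `0` instead of a red check.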
diff --git a/skills/setup-pr-testing/evals/evals.json b/skills/setup-pr-testing/evals/evals.json index 4d0fa1a..05fd1c8 100644 --- a/skills/setup-pr-testing/evals/evals.json +++ b/skills/setup-pr-testing/evals/evals.json @@ -4,29 +4,29 @@ { "id": 0, "name": "github-remote-profile-happy-path", - "prompt": "The project is at FIXTURE/acme-api. On each PR I want a comment showing how many manual and automated tests the change affects. After the PR is merged and the staging deploy finishes, run only the affected e2e tests. The e2e suite runs through our Testomat.io CI profile 'staging-e2e'. Deploys happen on merge to main via the existing 'Deploy Staging' workflow.", - "expected_output": "Runs project-scan first; detects GitHub Actions (does not assume); reuses existing coverage.manual.yml + coverage.e2e.yml. Phase 1: a non-blocking PR-opened job computes affected counts with reporter run --filter-list for both maps and posts/updates a single PR comment like '0 automated tests, 10 manual tests are affected by this PR'; creates no runs. Phase 2 (on merge): reporter start --kind manual creates a pending manual run, and reporter start with TESTOMATIO_SHARED_RUN=1, TESTOMATIO_TITLE set to the merge commit, and TESTOMATIO_SHARED_RUN_TIMEOUT above the deploy duration prepares the automated run scoped to the e2e map without executing it, capturing RUN_ID. Phase 3 (after Deploy Staging succeeds): a non-failing job runs reporter run --remote staging-e2e with TESTOMATIO_RUN=$RUN_ID + the same shared title, launching the affected e2e on the CI profile. Sets meaningful title (merge commit) and rungroup; isolates the execute job from the release; does NOT dispatch a foreign repo directly.", + "prompt": "The project is at FIXTURE/acme-api. On each PR I want a comment showing how many tests the change affects. After the PR is merged and the staging deploy finishes, run only the affected e2e tests. The e2e suite runs through our Testomat.io CI profile 'staging-e2e'. 
Deploys happen on merge to main via the existing 'Deploy Staging' workflow.", + "expected_output": "Runs project-scan first; detects GitHub Actions (does not assume); reuses the existing per-project coverage map(s) (coverage.<slug>.yml / coverage.<slug>.manual.yml / coverage.<slug>.e2e.yml) and reads the test kinds from the filename suffix. Phase 1: a non-blocking PR-opened job computes affected counts with reporter run --filter-list per map and posts/updates a single PR comment with the counts; creates no runs. Phase 2 (on merge): one reporter start per map with --kind derived from the suffix (--kind manual for manual-only, --kind mixed for a mixed map, no --kind for e2e-only); runs containing automated tests are prepared as shared runs with TESTOMATIO_SHARED_RUN=1, TESTOMATIO_TITLE set to the merge commit, and TESTOMATIO_SHARED_RUN_TIMEOUT above the deploy duration, capturing RUN_ID without executing anything. Phase 3 (after Deploy Staging succeeds): a non-failing job runs reporter run --remote staging-e2e with TESTOMATIO_RUN=$RUN_ID + the same shared title, launching the affected automated tests on the CI profile. Sets meaningful title (merge commit) and rungroup; isolates the execute job from the release; does NOT dispatch a foreign repo directly.", "files": [] }, { "id": 1, "name": "gitlab-missing-coverage-maps", "prompt": "The project is at FIXTURE/shop-web. I want each MR to comment how many tests it affects, and after merge+deploy run only the affected tests. 
I don't think we have any coverage files set up yet.", - "expected_output": "Runs project-scan; detects GitLab CI (not GitHub); finds no coverage.*.yml; explains both the affected-counts comment and the regression runs cannot work without coverage maps and proposes/delegates to manual-coverage and automation-coverage to create them before wiring .gitlab-ci.yml; asks how merges deploy, how deploy completion is observable, and roughly how long deploys take (for the shared-run timeout); notes the PR-open comment needs no deploy answers.", + "expected_output": "Runs project-scan; detects GitLab CI (not GitHub); finds no coverage*.yml; explains both the affected-counts comment and the regression runs cannot work without a coverage map and proposes/delegates to the qa-test-coverage-map skill to create the per-project map (coverage.<slug>.yml, or the .manual/.e2e variant matching the test kinds) before wiring .gitlab-ci.yml; asks how merges deploy, how deploy completion is observable, and roughly how long deploys take (for the shared-run timeout); notes the PR-open comment needs no deploy answers.", "files": [] }, { "id": 2, "name": "no-e2e-suite-refuse-to-fabricate", "prompt": "The project is at FIXTURE/lib-utils. 
Make our pull requests run the affected end-to-end tests automatically.", - "expected_output": "Runs project-scan; finds only Jest unit tests, no e2e framework anywhere; does NOT fabricate an e2e job, pipeline, or --remote launch; explains the automated phase needs an e2e suite first (points at authoring/automation-coverage); offers the PR-open comment (manual count only) and an on-merge pending manual run if manual cases exist; asks which CI is used since none is configured.", + "expected_output": "Runs project-scan; finds only Jest unit tests, no e2e framework anywhere; does NOT fabricate an e2e job, pipeline, or --remote launch; explains the automated phase needs an e2e suite first (points at test authoring / qa-test-coverage-map once a suite exists); offers the PR-open comment (manual count only) and an on-merge pending manual run via start --kind manual from the manual coverage map if manual cases exist; asks which CI is used since none is configured.", "files": [] }, { "id": 3, "name": "no-ci-profile-for-remote", "prompt": "The project is at FIXTURE/acme-api. Use --remote to run our affected e2e tests after deploy, and comment the affected counts on each PR. We use GitHub Actions and deploy on tag v*.", - "expected_output": "Runs project-scan; detects GitHub Actions; confirms whether a Testomat.io CI profile exists for --remote and, finding none, explains --remote dispatches a configured profile and cannot work without one, pointing the user to Testomat.io Settings > CI to create it. Still wires phase 1 (PR-open counts comment) and phase 2 (on-merge manual run + prepared automated shared run titled by commit). 
Wires the phase-3 reporter run --remote execute step keyed off the tag/deploy-complete signal as non-failing, to be enabled once the profile exists; offers the inline-runner exception if the e2e suite can run in this pipeline instead.", + "expected_output": "Runs project-scan; detects GitHub Actions; confirms whether a Testomat.io CI profile exists for --remote and, finding none, explains --remote dispatches a configured profile and cannot work without one, pointing the user to Testomat.io Settings > CI to create it. Still wires phase 1 (PR-open counts comment per coverage map) and phase 2 (one run per map on merge, --kind by filename suffix, runs containing automated tests prepared as shared runs titled by commit). Wires the phase-3 reporter run --remote execute step keyed off the tag/deploy-complete signal as non-failing, to be enabled once the profile exists; offers the inline-runner exception if the e2e suite can run in this pipeline instead.", "files": [] } ] diff --git a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md index 3ddf74e..8ca56b9 100644 --- a/skills/setup-pr-testing/references/REPORTER_CONTRACT.md +++ b/skills/setup-pr-testing/references/REPORTER_CONTRACT.md @@ -7,12 +7,16 @@ these commands. ## 1. The coverage filter -A coverage map (`coverage.manual.yml` / `coverage.e2e.yml`) maps source globs → -test/suite IDs/tags. The reporter resolves *changed files* via git, maps them -through the YAML, and selects the matching tests. +A coverage map maps source globs → test/suite IDs/tags. There is one map per +Testomat.io project, and its suffix says which kinds of tests it selects: +`coverage.<slug>.yml` (manual + automated), `coverage.<slug>.manual.yml` +(manual only), `coverage.<slug>.e2e.yml` (automated only). Legacy +`coverage.manual.yml` / `coverage.e2e.yml` are single-kind maps. The reporter +resolves *changed files* via git, maps them through the YAML, and selects the +matching tests. 
``` ---filter "coverage:file=<path-to-coverage.yml>,diff=<git-ref>" +--filter "coverage:file=<path-to-coverage-map>,diff=<git-ref>" ``` - `file=` — path to the coverage map. **May be absolute.** It is read with @@ -23,66 +27,81 @@ through the YAML, and selects the matching tests. to detect. Confirmed behavior, not configurable. The whole model rests on this: the repo that holds the coverage map + git -history is where the affected selection is computed — for the comment counts and -for scoping the prepared run alike. +history is where the affected selection is computed — for the comment counts +and for scoping the prepared run alike. + +### The `--kind` rule + +The map's suffix decides the `--kind` flag on `start`/`run`: + +| Map | `--kind` | +| ---------------------------- | --------------- | +| `coverage.<slug>.manual.yml` | `--kind manual` | +| `coverage.<slug>.yml` | `--kind mixed` | +| `coverage.<slug>.e2e.yml` | *(no flag)* | ## 2. Phase 1 — affected-counts comment (`--filter-list`) `--filter-list` computes the affected tests **without executing or creating a run** — exactly what the PR-open notice needs. Pair it with `--format` to get a clean, parseable list on stdout (the banner is suppressed and logs go to -stderr): +stderr). Run it once per coverage map: ```bash npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.manual.yml,diff=$BASE" --format ids -npx @testomatio/reporter run \ - --filter-list "coverage:file=coverage.e2e.yml,diff=$BASE" --format ids + --filter-list "coverage:file=<map>,diff=$BASE" --format ids ``` `--format` values: `ids` (comma-separated, default), `newline` (one per line — easy to `wc -l`), `json`, `grep` (`(@Sxxxx|@Tyyyy|...)`). - Count the entries per map; empty output = `0`. -- Assemble one line, e.g. `0 automated tests, 10 manual tests are affected by - this PR`, and post it via the CI's PR/MR comment API (GitHub / GitLab / - Bitbucket). Prefer updating one existing comment over re-posting. 
+- Assemble one line: per-kind counts when `.manual`/`.e2e` maps exist (`0 + automated tests, 10 manual tests are affected by this PR`); a mixed map + yields one combined count (`12 tests (manual + automated) are affected`). + Post it via the CI's PR/MR comment API (GitHub / GitLab / Bitbucket); prefer + updating one existing comment over re-posting. - Needs `TESTOMATIO` (API key) for the `coverage:` resolution. Creates nothing, must never fail the PR check. ## 3. Phase 2 — create the regression runs on merge -### 3a. Manual run — created pending, not executed +One run per coverage map, with `--kind` per the rule in §1. + +### 3a. Manual-only map — run created pending, complete at creation No test runner; the reporter creates a manual run containing only the affected -cases for testers to pick up: +cases for testers to pick up. Nothing happens at deploy time: ```bash npx @testomatio/reporter start --kind manual \ - --filter "coverage:file=coverage.manual.yml,diff=$BASE" + --filter "coverage:file=coverage.<slug>.manual.yml,diff=$BASE" ``` -- Required env: `TESTOMATIO`, `TESTOMATIO_TITLE` (merge commit, e.g. `report for - commit <sha>`), `TESTOMATIO_RUNGROUP_TITLE` (the rungroup bucket; supports `/` - nesting). `TESTOMATIO_ENV` optional. +- Required env: `TESTOMATIO`, `TESTOMATIO_TITLE` (merge commit, e.g. `report + for commit <sha>`), `TESTOMATIO_RUNGROUP_TITLE` (the rungroup bucket; + supports `/` nesting). `TESTOMATIO_ENV` optional. -### 3b. Automated run — prepared as a shared run, not executed +### 3b. Map containing automated tests — prepared as a shared run, not executed -`start` creates the run scoped to the affected e2e tests and returns its id +`start` creates the run scoped to the affected tests and returns its id **without running anything**. Created as a *shared* run so the later execute -step — and any parallel executors — converge on this one run by title: +step — and any parallel executors — converge on this one run by title. 
A mixed +map adds `--kind mixed` (its manual cases are immediately pending for +testers); an e2e-only map takes no `--kind`: ```bash RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ TESTOMATIO_TITLE="report for commit $SHA" \ TESTOMATIO_SHARED_RUN_TIMEOUT=$DEPLOY_MINUTES \ - npx @testomatio/reporter start --filter "coverage:file=coverage.e2e.yml,diff=$BASE" --format id) + npx @testomatio/reporter start --kind mixed \ + --filter "coverage:file=coverage.<slug>.yml,diff=$BASE" --format id) ``` -- `--filter` scopes the prepared run to the affected tests; that scope is stored - on the run and reused at launch time (§4). -- **Output:** `--format id` makes `start` print only the run id to stdout (banner - and logs go to stderr), so `RUN_ID=$(...)` captures just the id. +- `--filter` scopes the prepared run to the affected tests; that scope is + stored on the run and reused at launch time (§4). +- **Output:** `--format id` makes `start` print only the run id to stdout + (banner and logs go to stderr), so `RUN_ID=$(...)` captures just the id. - Required env: `TESTOMATIO`, `TESTOMATIO_TITLE`, `TESTOMATIO_RUNGROUP_TITLE`. ### Shared-run env vars (the convergence mechanism) @@ -103,7 +122,8 @@ RUN_ID=$(TESTOMATIO_SHARED_RUN=1 \ profile (configured under **Settings → CI**) for an already-prepared run, instead of executing tests locally. This replaces the old cross-repo dispatch: the CI profile owns the runner, browsers, environment URLs and secrets — the -reporter just triggers it. +reporter just triggers it. Applies to every run prepared in §3b (mixed and +e2e-only); manual-only runs have no launch phase: ```bash TESTOMATIO_RUN=$RUN_ID \ @@ -114,11 +134,12 @@ npx @testomatio/reporter run --remote <profile> - `TESTOMATIO_RUN=$RUN_ID` points the launch at the run prepared in §3b. With no fresh `--filter`, Testomat.io greps that run's **own stored scope**, so only - the affected e2e tests run — no need to recompute the diff at deploy time. 
+ the affected automated tests run — no need to recompute the diff at deploy + time. - The CI profile name must exist on the project, otherwise the call fails with `CI launch failed: No settings for <profile>`. `--remote` cannot be combined with `--filter-list`. -- Testomat.io passes the run id into the dispatched workflow, so the e2e tests +- Testomat.io passes the run id into the dispatched workflow, so the tests running there report back into the same prepared run. - On success the CLI prints the launched profile and run URL, then exits 0; the run transitions as the CI reports results. @@ -137,13 +158,13 @@ profile) and `TESTOMATIO_CI_PARAMS` (= comma-separated `key=value` overrides). ## 5. Inline exception — execute in this pipeline (no CI profile) -When the e2e suite runs in this pipeline (mobile, API/contract, or an existing -same-repo job) rather than via a CI profile, launch the prepared run by wrapping -the runner directly after deploy: +When the automated suite runs in this pipeline (mobile, API/contract, or an +existing same-repo job) rather than via a CI profile, launch the prepared run +by wrapping the runner directly after deploy: ```bash TESTOMATIO_RUN=$RUN_ID npx @testomatio/reporter run "<runner cmd>" \ - --filter "coverage:file=coverage.e2e.yml,diff=$BASE" + --filter "coverage:file=<map>,diff=$BASE" ``` Examples of `<runner cmd>`: `npx playwright test`, `npx cypress run`, diff --git a/skills/testomatio-flow/SKILL.md b/skills/testomatio-flow/SKILL.md index 618580b..77777f1 100644 --- a/skills/testomatio-flow/SKILL.md +++ b/skills/testomatio-flow/SKILL.md @@ -39,8 +39,7 @@ The skill orchestrates these specialized capabilities: | **improve-test-cases** | Improve existing test cases quality | | **sync-cases** | Upload test cases to Testomat.io TMS | | **reporter-setup** | Add Testomat.io reporter to your automation project | -| **manual-coverage** | Map manual test cases to source files (`coverage.manual.yml`) | -| **automation-coverage** | Map 
automated e2e tests to source files (`coverage.e2e.yml`) | +| **qa-test-coverage-map** | Map manual and automated tests to source files (per-project `coverage.*.yml`) | | **project-scan** | Scan project source code to inventory languages, frameworks, and existing tests | <!-- TODO: autotests-fixer, traceability-matrix -->
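Note on the `--kind` rule added to `REPORTER_CONTRACT.md` in this patch: the suffix-to-flag table can be sketched as a small shell function that a CI job might call once per coverage map. The function name `kind_flag_for_map` is hypothetical, and treating the legacy `coverage.manual.yml` / `coverage.e2e.yml` names like their per-project counterparts is an assumption based on the contract calling them single-kind maps.

```shell
# kind_flag_for_map: hypothetical helper, not part of @testomatio/reporter.
# Prints the --kind flag implied by a coverage map's filename suffix:
#   coverage.<slug>.manual.yml -> --kind manual
#   coverage.<slug>.e2e.yml    -> (no flag, prints nothing)
#   coverage.<slug>.yml        -> --kind mixed
kind_flag_for_map() {
  case "$(basename "$1")" in
    *.manual.yml)   echo "--kind manual" ;;
    *.e2e.yml)      printf '' ;;
    coverage.*.yml) echo "--kind mixed" ;;
    *)
      echo "unrecognized coverage map: $1" >&2
      return 1
      ;;
  esac
}

# Illustration only: print the flag each map name resolves to.
for map in coverage.shop.manual.yml coverage.shop.yml coverage.shop.e2e.yml; do
  echo "$map -> '$(kind_flag_for_map "$map")'"
done
```

The more specific `.manual.yml` and `.e2e.yml` patterns are listed before the generic `coverage.*.yml` pattern, since `case` takes the first match and a mixed-map pattern placed first would swallow the single-kind names.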