Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 8 additions & 3 deletions plugins/dotnet-test/agents/code-testing-generator.agent.md
Original file line number Diff line number Diff line change
Expand Up @@ -112,12 +112,17 @@ Run tests from the **full workspace scope** with a fresh build (never use `--no-

**Verify tests pin down behavior (mandatory pre-completion gate):**

For any non-trivial test addition (≥5 generated tests, or any task whose prompt describes specific behaviors to verify), run a quick self-review pass *before* reporting completion — and **after** any Step 8 coverage-gap iteration that adds or modifies tests, so the gate always runs against the final test set. The two skills below ship in this plugin and are designed exactly for this use case:
For any non-trivial test addition (≥5 generated tests, or any task whose prompt describes specific behaviors to verify), run a quick self-review pass *before* reporting completion — and **after** any Step 8 coverage-gap iteration that adds or modifies tests, so the gate always runs against the final test set. The first two checks below use skills that ship in this plugin; the third is a self-review against the prompt:

1. **Pseudo-mutation check** — invoke the `test-gap-analysis` skill against the source file(s) you tested and the test file(s) you produced. The skill reasons about plausible mutations (boundary flips, dropped null checks, removed exceptions, sign flips) and reports which would slip past your tests. For every gap it flags, either strengthen the existing assertion or add a follow-up test. Re-run until no gap is reported, or until the remaining gaps are explicitly out of scope (e.g., production bugs you cannot fix in a test-only PR).

2. **Assertion-depth check** — invoke the `assertion-quality` skill against the test file(s) you produced. If it flags trivial-only assertions (`IsNotNull` / `toBeDefined` / `assert x is not None`-only tests, tautological round-trip assertions, single-observable tests where the production code touches multiple observables), revise those tests — replace existence checks with concrete-value assertions, and add a secondary observable per behavior-radius guidance.

3. **Prompt-scenario coverage check** — when the prompt enumerates specific behaviors or scenarios to verify, map each one to a dedicated test before reporting completion. This guards against the common failure of testing an *adjacent* function and leaving the requested behavior uncovered:
Comment thread
Evangelink marked this conversation as resolved.
- **Target the exact function/feature named in the objective**, not a neighboring helper that merely looks related. Test the named symbol directly — do not substitute a similarly-named sibling and assume it transitively covers the target. Prefer extending the canonical existing test file for that feature over creating a new, narrower file.
- **Cover the full range each scenario's wording implies, not a single representative case.** Phrasing like "when the dimensions stay the same *or* change", "wider *or* narrower", or "first character *or* anywhere in the string" calls for multiple variations — exercise each variation (and combine them in one test when the wording groups them) rather than asserting a single instance.
- **Honor positional and structural qualifiers literally.** When a scenario pins a condition to a specific position or shape (e.g. "the *first* character after the prefix", "a filename containing a literal space"), construct an input that satisfies that exact qualifier — an input where the condition merely appears *somewhere* does not cover it.

Skip the gate only for trivially small tasks — fewer than 5 generated tests *and* no behaviors specified in the prompt (the exact inverse of the threshold above). For every other run, the gate is mandatory: a test that passes vacuously — that would still pass if the function body were emptied or returned a default — is a bug, not a test.
Comment thread
Evangelink marked this conversation as resolved.

Additional self-review heuristics (still required, even when running the skills):
Expand All @@ -135,7 +140,7 @@ After the previous phases complete, check for uncovered source files:
3. Identify source files with no corresponding test file.
4. Generate tests for each uncovered file, build, test, and fix.
5. Repeat until every non-trivial source file has tests or all reasonable targets are exhausted.
6. If this step added or modified any tests, re-run the Step 7 pre-completion gate (`test-gap-analysis` + `assertion-quality`) on the new/changed tests before reporting completion — Step 8 output must not bypass the gate.
6. If this step added or modified any tests, re-run the full Step 7 pre-completion gate (`test-gap-analysis` + `assertion-quality` + prompt-scenario coverage) on the new/changed tests before reporting completion — Step 8 output must not bypass the gate.

### Step 9: Report Results

Expand Down Expand Up @@ -192,6 +197,6 @@ All state is stored in `.testagent/` folder:
8. **Fix assertions, don't skip tests** — when tests fail, read production code and fix the expected value; never `[Ignore]` or `[Skip]`
9. **Clean up `.testagent/`** — after pipeline completion, delete the `.testagent/` folder or advise the user to add it to `.gitignore` so ephemeral state is not committed
10. **Read language extensions first** — always call the `code-testing-extensions` skill and read the relevant extension file before writing any code; it contains critical project registration and build validation steps
11. **Always validate** — final build, final test, coverage-gap review, and reporting are mandatory for ALL strategies including Direct; never skip final validation. The pre-completion self-review gate from Step 7 (`test-gap-analysis` + `assertion-quality` skills) is mandatory for every non-trivial test addition and may be skipped only for trivially small tasks (fewer than 5 generated tests *and* no behaviors specified in the prompt), per Step 7
11. **Always validate** — final build, final test, coverage-gap review, and reporting are mandatory for ALL strategies including Direct; never skip final validation. The pre-completion self-review gate from Step 7 (`test-gap-analysis` + `assertion-quality` skills, plus the prompt-scenario coverage check) is mandatory for every non-trivial test addition and may be skipped only for trivially small tasks (fewer than 5 generated tests *and* no behaviors specified in the prompt), per Step 7
12. **Preserve existing tests** — never delete or overwrite existing test files; create new files or append to existing ones
13. **Never mutate version control** — your only outputs are additive test files plus minimal build-manifest edits to register a new test project. Any command that reverts, restores, resets, stashes, or cleans the tree, or deletes tracked files, is out of scope — even when the workspace looks broken or incomplete.