feat(0.26.0): skills audit --require-tests gate (BRO-1411 slice 2) #73

Merged
broomva merged 1 commit into main from feature/bro-1411-require-tests on Jun 5, 2026

Conversation

broomva (Owner) commented Jun 5, 2026

What & why

Skillify step 3 (unit tests on deterministic code) — BRO-1411 slice 2, built bstack-native.

/checkit on Garry Tan's skillify essay surfaced that bstack skills audit covers hygiene (budget / duplicate / reachability) but never correctness of the skill layer. This adds the missing correctness report + a CI gate.

Changes

  • 6th audit report — "Untested deterministic code". Flags skills shipping scripts/*.{py,sh,mjs,js,ts} (or root-level code) with no test file (test_*.py, *_test.py, *.test.*, test_*.sh). Markdown-only skills exempt; test files don't count as code. --json gains untested.
  • --require-tests — escalates to a hard gate (exit 1 if any untested skill). Default stays exit 0 → informational, non-breaking; CI opts into enforcement.
  • +4 hermetic tests (T10–T13): detection (code-no-tests flagged; code+tests + md-only exempt), gate exit-1, informational exit-0, report-section presence.
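
The detection rule above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/skill-audit.py: the extension set and test patterns are copied from the description, while the function bodies, the assumption that a test file anywhere in the skill directory counts, and the shape of the returned records are guesses made for the example.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed values, taken from the PR description rather than from the real script.
CODE_EXTS = {".py", ".sh", ".mjs", ".js", ".ts"}
TEST_PATTERNS = ("test_*.py", "*_test.py", "*.test.*", "test_*.sh")


def is_test_file(path: Path) -> bool:
    """True when the file name matches one of the test-file patterns."""
    return any(fnmatch(path.name, pattern) for pattern in TEST_PATTERNS)


def skill_code_files(skill_dir: Path) -> list[Path]:
    """Non-test code files at the skill root or directly under scripts/."""
    candidates = list(skill_dir.glob("*")) + list(skill_dir.glob("scripts/*"))
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix in CODE_EXTS and not is_test_file(p)
    )


def skill_has_tests(skill_dir: Path) -> bool:
    """True when any file anywhere inside the skill looks like a test (assumption)."""
    return any(p.is_file() and is_test_file(p) for p in skill_dir.rglob("*"))


def detect_untested(skills_root: Path) -> list[dict]:
    """Skills that ship code but no tests; markdown-only skills never appear."""
    untested = []
    for skill_dir in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        code = skill_code_files(skill_dir)
        if code and not skill_has_tests(skill_dir):
            untested.append({
                "skill": skill_dir.name,
                "files": [f.relative_to(skill_dir).as_posix() for f in code],
            })
    return untested
```

Excluding test files from the code list is what keeps a skill that contains only test_*.py files from being flagged as shipping code.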

Validation (P11)

  • bash tests/skill-audit.test.sh → 14/14 pass.
  • Dogfood against the real ~/broomva/skills: 19 skills ship deterministic code with no tests (investment-management ×5 scripts, wealth-management ×4, social-intelligence, alkosto-wait-optimizer, …) — the bstack analog of GBrain's "6/40 dark skills". Informational today; backlog candidates.

Notes

  • Primitive count unchanged (20) — widens bstack skills audit, not a new P-row. VERSION 0.25.0 → 0.26.0.
  • Tracking: BRO-1411 (slices 3–4 remain: E2E skill-smoke, per-skill LLM-eval). Provenance: research/entities/concept/skillify.md.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added a sixth audit report for untested deterministic skill code
    • Introduced --require-tests flag that enforces test coverage requirements and returns a hard exit-code gate when violations are detected
  • Tests

    • Added test coverage for the new untested code audit and enforcement flag

Skillify step 3 (unit tests on deterministic code), bstack-native. Adds a 6th
audit report 'Untested deterministic code' (correctness, vs the 5 hygiene
reports) + --require-tests flag that gates CI (exit 1 if any skill ships
scripts/*.{py,sh,mjs,js,ts} with no test file). Markdown-only skills exempt.

First real run over ~/broomva/skills: 19 skills ship untested deterministic
code — the bstack analog of GBrain's 6/40 dark-skills finding.

+4 hermetic tests (14/14). VERSION 0.25.0 -> 0.26.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

linear Bot commented Jun 5, 2026

BRO-1411

coderabbitai Bot commented Jun 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 24a49a0e-4ee9-48b6-96ef-d9b720e7d431

📥 Commits

Reviewing files that changed from the base of the PR and between e3c2257 and 47c02a8.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • VERSION
  • scripts/skill-audit.py
  • tests/skill-audit.test.sh

📝 Walkthrough

This PR adds a new audit report (report 6) to the skills auditor that detects deterministic code shipped without test coverage, plus an optional --require-tests flag that gates CI on those findings. The feature includes helper functions for test-file classification and code-file enumeration, updated output (JSON and human-readable), hermetic test coverage, and release documentation.

Changes

Untested Code Detection and Gate

| Layer / File(s) | Summary |
| --- | --- |
| Untested code detection infrastructure (scripts/skill-audit.py) | Docstring updated from five to six audit reports; new constant CODE_EXTS, classifier _is_test_file(), enumerator _skill_code_files(), checker _skill_has_tests(), and detector detect_untested() added to support the untested code audit. |
| Audit report integration and output (scripts/skill-audit.py) | Compute untested list, derive gate-failure condition, add --require-tests CLI flag, extend JSON output with untested and require_tests fields, render human-readable "Untested deterministic code" section, and return exit code 1 when gated. |
| Test coverage for untested detection gate (tests/skill-audit.test.sh) | Hermetic fixture (FX2) with three skills (md-only, code-without-tests, code-with-tests); helper rt_audit() to run against fixture; T10–T13 assertions verify untested detection in JSON output, gate exit codes with/without --require-tests, and human-readable reporting section; cleanup after tests. |
| Release notes and version bump (VERSION, CHANGELOG.md) | Version incremented to 0.26.0; changelog documents new untested code report, gate behavior, and test additions. |
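
The gate behavior summarized in the table (informational by default, exit 1 only when --require-tests is set and something is untested) could look roughly like the sketch below. The --json and --require-tests flags and the untested / require_tests JSON fields come from this PR; the run_audit() function, the "skill-audit" program name, and the human-readable layout are hypothetical and only serve the illustration.

```python
import argparse
import json


def run_audit(untested: list[str], argv: list[str]) -> int:
    """Print the report and return the exit code for the given findings.

    `untested` stands in for whatever the detector produced; here it is
    simply a list of skill names (an assumption for this sketch).
    """
    parser = argparse.ArgumentParser(prog="skill-audit")  # hypothetical prog name
    parser.add_argument("--json", action="store_true",
                        help="emit machine-readable output")
    parser.add_argument("--require-tests", action="store_true",
                        help="exit 1 if any skill ships untested code")
    args = parser.parse_args(argv)

    # The gate fails only when enforcement is requested AND findings exist.
    gate_failed = args.require_tests and bool(untested)

    if args.json:
        print(json.dumps({"untested": untested,
                          "require_tests": args.require_tests}))
    else:
        print("Untested deterministic code")
        for name in untested or ["(none)"]:
            print(f"  - {name}")

    return 1 if gate_failed else 0
```

Keeping the default exit code at 0 means existing callers see only a new report section, and a CI job opts into enforcement by adding the flag.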

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

  • broomva/life#1632: This PR implements the Skill-QA "skill-script test gate" (slice 2) that detects and optionally gates on skills shipping untested deterministic code.

Possibly related PRs

  • broomva/bstack#61: This PR extends the skills auditor introduced in #61 by adding new untested code detection and a correctness gate, along with corresponding hermetic test coverage.

Poem

🐰 A new gate springs forth to guard the way,
No code without its tests shall stay;
Deterministic, true, and pure—
Detection makes the code more sure!
Hopping forward, v0.26! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a --require-tests gate for the skills audit with version 0.26.0, and is clear and specific. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@broomva broomva merged commit 1538732 into main Jun 5, 2026
6 checks passed
@broomva broomva deleted the feature/bro-1411-require-tests branch June 5, 2026 19:50