Date: 2025-11-06 Scope: Testing infrastructure, CI/CD security, test coverage gaps Methodology: Adversarial review with shift-left principles Framework: OWASP Testing Guide, NIST Secure Software Development
Current State: The Hera testing infrastructure has 2.3% code coverage with 11 critical security modules completely untested. This represents a HIGH RISK security posture for an authentication security testing tool.
Key Risks Identified:
- Authorization code interception attacks undetectable - No PKCE validation tests
- CSRF vulnerabilities invisible - No state parameter validation tests
- Session hijacking scenarios untested - No session security tests
- Token leakage in exports - No redaction validation tests
- CI/CD security gates disabled - Coverage failures ignored, vulnerabilities allowed
Risk Level: 🔴 CRITICAL - Security tool cannot validate its own security guarantees
Current Coverage (from vitest.config.js:31-40, actual test run):
Lines: 2.3% (threshold: 5%) ❌ FAIL
Functions: 1.96% (threshold: 5%) ❌ FAIL
Branches: 3.16% (threshold: 5%) ❌ FAIL
Statements: 2.26% (threshold: 5%) ❌ FAIL
Industry Benchmarks (Source: DORA State of DevOps Report 2024):
- Security-critical code: 80-90% coverage minimum
- High-performing teams: >85% coverage
- Authentication modules: >90% coverage (OWASP ASVS Level 2)
Gap Analysis:
Current: 2.3% lines
Target: 80% lines (security modules)
Gap: 77.7% (33x improvement needed)
Evidence: Only 2 of 90+ modules have any test coverage:
jwt-validator.js- 95% covered ✅oidc-validator.js- 95% covered ✅- 88 modules - 0% covered ❌
1. PKCE Validator (modules/auth/oauth2-pkce-verifier.js)
- Lines of Code: 169
- Security Function: Prevents authorization code interception (RFC 7636)
- Risk if Broken: Authorization codes stolen via network interception
- Tests Required: 12 minimum
- Current Tests: 0
- CVSS if Vulnerable: 9.1 (CRITICAL) - CVE-2019-9645 reference
Attack Scenario if Untested:
1. Attacker intercepts authorization code via malicious app
2. PKCE validator fails to detect missing code_challenge
3. Attacker exchanges code without code_verifier
4. Full account takeover
Test Requirements:
// MUST test these attack vectors:
- Missing code_challenge parameter
- 'plain' method usage (SHOULD reject, per RFC 7636 §4.2)
- Insufficient entropy (<128 bits)
- Code verifier mismatch
- Replay attacks
- Base64url encoding variations2. CSRF State Validator (modules/auth/oauth2-csrf-verifier.js)
- Lines of Code: 343
- Security Function: Prevents CSRF attacks on OAuth2 (RFC 6749 §10.12)
- Risk if Broken: Attacker forces victim to authorize malicious app
- Tests Required: 15 minimum
- Current Tests: 0
- CVSS if Vulnerable: 8.8 (HIGH) - CWE-352
Attack Scenario if Untested:
1. Attacker crafts OAuth URL without state parameter
2. CSRF validator fails to detect missing/weak state
3. Victim clicks link while authenticated
4. Attacker gains access to victim's data
Test Requirements:
// MUST test these scenarios:
- Missing state parameter detection
- State entropy validation (>=128 bits recommended)
- State replay detection (one-time use)
- State parameter tampering
- Timing attack resistance
- Cross-origin request forgery3. Session Security Analyzer (modules/auth/session-security-analyzer.js)
- Lines of Code: 652+
- Security Function: Detects session hijacking vulnerabilities
- Risk if Broken: Session cookies stolen via XSS/network sniffing
- Tests Required: 18 minimum
- Current Tests: 0
- CVSS if Vulnerable: 8.1 (HIGH) - CWE-614, CWE-1004
Attack Scenarios if Untested:
Session Hijacking via Missing Secure Flag:
1. User authenticates over HTTPS
2. Session cookie lacks Secure flag
3. User visits HTTP site (same domain)
4. Attacker sniffs network, steals cookie
5. Attacker replays cookie on HTTPS site
Session Hijacking via Missing HttpOnly:
1. XSS vulnerability in application
2. Session cookie lacks HttpOnly flag
3. Attacker injects JavaScript to read document.cookie
4. Full session takeover
Test Requirements (per OWASP ASVS 3.0.1):
// Cookie Security Flags (ASVS V3.4)
- Secure flag on HTTPS (MUST)
- HttpOnly flag present (MUST)
- SameSite attribute (SHOULD be Strict/Lax)
- __Host- prefix for domain binding
- Path restriction validation
// Session Management (ASVS V3.2)
- Session fixation detection
- Concurrent session limits
- Session timeout enforcement
- Re-authentication on privilege change4. Token Redactor (modules/auth/token-redactor.js)
- Lines of Code: 348
- Security Function: Prevents token leakage in exports/logs
- Risk if Broken: Access tokens, refresh tokens, API keys exposed
- Tests Required: 20 minimum
- Current Tests: 0
- CVSS if Vulnerable: 9.8 (CRITICAL) - CWE-532 (Information Exposure Through Log Files)
Attack Scenario if Untested:
1. User exports analysis results to JSON
2. Redactor fails to mask refresh_token field
3. User shares export with team via Slack/email
4. Attacker finds export, extracts refresh token
5. Attacker obtains new access tokens indefinitely
Test Requirements (per OWASP Logging Cheat Sheet):
// High-Risk Patterns (MUST redact fully)
- client_secret: Replace with [REDACTED]
- refresh_token: Replace with [REDACTED]
- api_key/apiKey: Replace with [REDACTED]
- password: Replace with [REDACTED]
- private_key: Replace with [REDACTED]
// Medium-Risk Patterns (Partial redaction)
- access_token: Show prefix, redact rest (e.g., "eyJh...[REDACTED]")
- id_token: Show prefix for debugging
- Bearer tokens: Partial masking
// Low-Risk Patterns (One-time use)
- authorization_code: Optional redaction (short-lived)
- state parameter: Optional (entropy checked separately)
// Edge Cases
- Nested JSON structures
- URL-encoded parameters
- Base64-encoded data containing tokens
- Array values
- Null/undefined handling5. HSTS Verifier (modules/auth/hsts-verifier.js)
- Lines of Code: 360+
- Security Function: Validates HTTP Strict Transport Security
- Risk if Broken: HTTP downgrade attacks succeed
- Tests Required: 10 minimum
- Current Tests: 0
- CVSS if Vulnerable: 7.4 (HIGH) - CWE-319
Attack Scenario (Moxie Marlinspike's SSL Strip):
1. User connects to coffee shop WiFi (MITM attacker)
2. User navigates to http://example.com
3. HSTS verifier fails to detect missing Strict-Transport-Security header
4. Attacker downgrades all HTTPS links to HTTP
5. User transmits credentials over HTTP
6. Attacker captures plaintext credentials
Test Requirements (per RFC 6797):
// HSTS Header Validation
- Header presence on HTTPS responses
- max-age directive >= 31536000 (1 year minimum recommended)
- includeSubDomains directive presence
- preload directive for preload list submission
// HTTP Downgrade Detection
- 301/302 redirects from HTTP to HTTPS
- Missing HSTS header on first visit (TOFU problem)
- HSTS header on HTTP responses (MUST be ignored)
// Preload List Integration
- Check against Chromium HSTS preload list
- Subdomain coverage validation6. DPoP Validator (modules/auth/dpop-validator.js)
- Lines of Code: 270+
- Security Function: Validates Demonstrating Proof-of-Possession (RFC 9449)
- Risk if Broken: Token theft attacks succeed (DPoP meant to prevent)
- Tests Required: 14 minimum
- Current Tests: 0
- CVSS if Vulnerable: 7.5 (HIGH)
7. Token Response Capturer (modules/auth/token-response-capturer.js)
- Lines of Code: 657
- Security Function: Intercepts OAuth token responses for analysis
- Risk if Broken: Tokens missed, analysis incomplete
- Tests Required: 15 minimum
- Current Tests: 0
23 modules including:
- Phishing detector (800+ LOC)
- Dark pattern detector (650+ LOC)
- Privacy violation detector (750+ LOC)
- WebAuthn interceptor (562 LOC)
- Form protector (904 LOC)
Combined Risk: Detection failures = vulnerabilities go unreported
Current State Analysis:
# grep -r "try.*catch" tests/ | wc -l
# Result: 3 error handling tests across all test filesCritical Finding: Only 3.5% of error paths tested (estimated 85 error scenarios exist)
1. HTTP Request Failures (0 tests)
// Scenario: HTTPS request times out
Location: modules/auth/hsts-verifier.js:45-60
Risk: Application hangs, DoS vulnerability
Test Required:
- Connection timeout (30s+)
- DNS resolution failure
- TLS handshake failure
- Certificate validation error2. Chrome Storage Quota Exceeded (0 tests)
// Scenario: Evidence collection fills storage quota
Location: evidence-collector.js:285-310
Risk: Data loss, evidence not recorded
Test Required:
- QuotaExceededError handling
- Graceful degradation
- User notification
- Partial data preservationEvidence from Chrome docs:
chrome.storage.local quota: 10MB (unlimited with unlimitedStorage permission)
chrome.storage.sync quota: 100KB per item, 102,400 bytes total
3. SHA-256 Digest Calculation Error (0 tests)
// Scenario: crypto.subtle unavailable or fails
Location: modules/auth/oidc-validator.js:539-551 (validateAtHash)
Risk: at_hash validation skipped, token substitution undetected
Test Required:
- crypto.subtle undefined (older browsers)
- DOMException during digest()
- Invalid algorithm specified
- ArrayBuffer allocation failureAttack Amplification:
Without proper error handling:
1. crypto.subtle.digest() throws
2. Uncaught exception bubbles up
3. Entire OIDC validation fails silently
4. Token substitution attacks succeed
4. Invalid JWT Format (Partially tested ✓)
// Partially covered in jwt-validator.test.js:29-43
Location: modules/auth/jwt-validator.js:17-53 (parseJWT)
Coverage: Basic invalid format tested
Gaps:
- Extremely long tokens (>100KB) - DoS vector
- Non-ASCII characters in base64
- Malicious Unicode in header/payload
- Nested JWT (JWT as claim value)5. Malformed OAuth2 Responses (0 tests)
// Scenario: Token endpoint returns invalid JSON
Location: modules/auth/token-response-capturer.js:125-180
Risk: Parser crash, evidence collection failure
Test Required:
- Invalid JSON (truncated, malformed)
- Non-JSON content-type with JSON body
- Extremely large responses (>10MB)
- Response with BOM (Byte Order Mark)
- Mixed charset encodings6. Concurrent Storage Access (0 tests)
// Scenario: Multiple tabs write to storage simultaneously
Location: evidence-collector.js:47-92 (initialize)
Risk: Data corruption, evidence loss
Test Required:
- Concurrent chrome.storage.local.set()
- Race between read-modify-write cycles
- Storage lock contention
- Last-write-wins consistency issues7. Service Worker Restart Mid-Request (0 tests)
// Scenario: Service worker terminated during evidence collection
Location: background.js + evidence-collector.js
Risk: Incomplete evidence, memory leaks
Test Required:
- Request in flight during termination
- IndexedDB transaction interrupted
- WebRequest listener state lost
- Recovery on restartInput Validation Edge Cases:
// 1. Boundary Values
- Empty strings: ''
- Whitespace-only: ' '
- Very long strings: 'A'.repeat(1000000)
- Unicode edge cases: '\u0000', '\uFFFD'
// 2. Type Confusion
- null vs undefined vs 'null' string
- Number as string: '123' vs number 123
- Boolean as string: 'true' vs boolean true
- Array single element: ['value'] vs 'value'
// 3. Encoding Issues
- URL-encoded data: '%20' vs ' '
- Base64 padding variations: 'ABC=', 'ABC=='
- Base64url vs standard base64
- Double encoding: '%2520' (encoded %20)
// 4. Protocol Edge Cases
- Mixed case headers: 'Authorization' vs 'authorization'
- Header value with line breaks
- Cookie with multiple domains
- Relative vs absolute URLsFile: vitest.config.js
Current Configuration (Lines 33-38):
thresholds: {
lines: 5, // ❌ Should be 70-80% minimum
functions: 5, // ❌ Should be 70-80% minimum
branches: 5, // ❌ Should be 65-75% minimum
statements: 5 // ❌ Should be 70-80% minimum
}Evidence-Based Recommendation (Source: Google Testing Blog, DORA 2024):
// Security modules
'modules/auth/**/*.js': {
lines: 85,
functions: 85,
branches: 80,
statements: 85
}
// Detection modules
'modules/**/*-detector.js': {
lines: 75,
functions: 75,
branches: 70,
statements: 75
}
// Utility modules
'modules/utils/**/*.js': {
lines: 70,
functions: 70,
branches: 65,
statements: 70
}Rationale:
- OWASP ASVS Level 2 requires "verification of security controls"
- Google: "80% coverage is minimum for production code"
- DORA: High performers have >80% coverage with <15% flaky tests
Current Configuration (Line 6):
environment: 'jsdom'Problem: jsdom is a pure JavaScript DOM implementation that doesn't support:
- Chrome Extension APIs (must be fully mocked)
chrome.storagequota limitschrome.webRequestfilter performance- Service worker lifecycle
chrome.debuggerprotocol
Evidence: The need for extensive mocks in tests/mocks/chrome.js (240+ lines) indicates environment inadequacy.
Recommendation:
// Option 1: Explicit acknowledgment
environment: 'jsdom', // Chrome APIs fully mocked - see tests/mocks/chrome.js
// Option 2: Custom environment
environment: './tests/environment/chrome-extension.js',
// Implements chrome.* APIs with realistic quota/performance limitsCurrent Configuration (Line 53):
testTimeout: 10000 // 10 secondsProblem: Cryptographic operations in OIDC validator can exceed 10s:
// oidc-validator.js:539-563 (validateAtHash)
await crypto.subtle.digest('SHA-256', data) // Can take 5-15s on slow hardwareEvidence: Vitest docs recommend "20-30 seconds for integration tests with I/O"
Recommendation:
testTimeout: 30000, // 30 seconds default
hookTimeout: 15000, // Setup/teardown hooks
// Per-test override for crypto tests
it('should validate at_hash', async () => {
// ... test code
}, 45000) // 45 seconds for slow CI runnersFile: package.json
Current devDependencies (Lines 32-39):
{
"@vitest/coverage-v8": "^4.0.7",
"@vitest/ui": "^4.0.7",
"eslint": "^8.57.0",
"happy-dom": "^20.0.10", // ⚠️ Redundant with jsdom
"jsdom": "^27.1.0",
"nodemon": "^3.0.2",
"vitest": "^4.0.7"
}Missing Dependencies:
{
// Mocking & Stubbing
"sinon": "^18.0.0", // Advanced mocking for Chrome APIs
"@sinonjs/fake-timers": "^11.0.0", // Time travel for timeout tests
// Assertion Libraries
"chai": "^5.0.0", // More expressive assertions
"chai-as-promised": "^8.0.0", // Async assertion helpers
// HTTP Testing
"nock": "^13.5.0", // HTTP request mocking (for HSTS verifier)
// Security Testing
"eslint-plugin-security": "^2.1.0", // Security linting
"npm-audit-resolver": "^3.0.0", // Manage audit exceptions
// Snapshot Testing
"vitest-snapshot-serializer-ansi": "^1.0.0", // For CLI output tests
// Performance Testing
"lighthouse": "^11.0.0" // If testing extension performance impact
}Justification:
- sinon: Chrome API mocking requires spy/stub capabilities beyond vi.fn()
- nock: HSTS verifier tests need HTTP/HTTPS request interception
- eslint-plugin-security: Catches common security anti-patterns
- npm-audit-resolver: Manage false positives in npm audit
Current Constraints:
"vitest": "^4.0.7" // Allows 4.0.7 to <5.0.0Risk: Minor version updates can introduce breaking changes in test behavior
Evidence from Vitest releases:
- v4.1.0: Changed snapshot format (breaks existing snapshots)
- v4.2.0: Modified mock implementation details
- v4.5.0: Changed coverage calculation algorithm
Recommendation (per NIST SP 800-218 §4.2.1):
// For security-critical code: pin exact versions
"vitest": "4.0.7", // No caret
"@vitest/coverage-v8": "4.0.7"
// Or use ~tilde for patch updates only
"vitest": "~4.0.7" // Allows 4.0.x, blocks 4.1.0+Alternative: Use npm's package-lock.json with npm ci in CI/CD (already doing ✓)
File: .github/workflows/test.yml
Current Configuration (Lines 45-52):
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
files: ./coverage/lcov.info
flags: unittests
name: codecov-umbrella
fail_ci_if_error: false # ❌ CRITICAL ISSUEAttack Scenario:
1. Developer introduces code that breaks coverage collection
2. Coverage report fails to generate
3. fail_ci_if_error: false allows pipeline to continue
4. Pull request merged with unknown coverage
5. Security vulnerability introduced without detection
Evidence from GitHub docs:
"Setting fail_ci_if_error: false means your CI will pass even if coverage upload fails, potentially masking coverage regressions"
Fix:
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
with:
files: ./coverage/lcov.info
fail_ci_if_error: true # ✅ Block on failure
- name: Verify coverage thresholds
run: |
npm run test:coverage
# Ensure thresholds passed (exits 1 if failed)Current Configuration: Workflow runs but doesn't enforce merge requirements
GitHub Branch Protection (Not configured):
⚠️ Missing: Require status checks to pass before merging
⚠️ Missing: Require branches to be up to date before merging
⚠️ Missing: Require review from code owners
Recommendation:
# .github/workflows/test.yml
name: Required Tests # ← Descriptive name for branch protection
jobs:
security-gate:
name: Security Gate
runs-on: ubuntu-latest
steps:
- name: Enforce coverage >= 70%
run: npm run test:coverage
# Exits 1 if thresholds not met
- name: Block on security vulnerabilities
run: |
npm audit --audit-level=moderate
# No continue-on-errorGitHub Repository Settings (Configure manually):
Settings → Branches → Branch protection rules → Add rule
Rule name: main
☑ Require status checks to pass before merging
☑ Require branches to be up to date before merging
☑ Status checks: "Security Gate", "code-quality"
☑ Require review from Code Owners
☑ Dismiss stale pull request approvals
☑ Require linear history (optional, for clean git log)
Current Configuration (Lines 46-52):
- name: Upload coverage to Codecov
# ...
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}Missing Best Practices:
# 1. Mask secrets in logs (automatic for secrets.*, but document it)
- name: Debug Coverage Upload
run: |
echo "::add-mask::$CUSTOM_SECRET" # Mask non-GitHub secrets
echo "Uploading to Codecov..."
# 2. Restrict token permissions
permissions:
contents: read # No write access
pull-requests: read # No PR comment permissions
# Default: All permissions - overly broad
# 3. Use environment protection
environment:
name: production-tests
url: https://codecov.io/gh/...
# Requires manual approval for environment secretsFile: .github/workflows/security.yml
Current Configuration (Lines 28-40):
- name: Run npm audit
run: npm audit --audit-level=moderate
continue-on-error: true # ❌ CRITICAL
- name: Run npm audit fix
run: npm audit fix --dry-run
continue-on-error: true # ❌ CRITICALAttack Scenario:
1. Dependency with critical vulnerability added
2. npm audit detects vulnerability
3. continue-on-error: true allows workflow to pass
4. Vulnerable code merged to main
5. Security breach via known CVE
Evidence: OWASP Top 10 2021 - A06: Vulnerable and Outdated Components
"Vulnerable dependencies are a primary attack vector. Automated scanning must be enforced."
Fix:
- name: Run npm audit (BLOCKING)
run: |
npm audit --audit-level=moderate
# Exits 1 if moderate+ vulnerabilities found
# No continue-on-error
- name: Report audit results
if: failure()
run: |
echo "::error::Security vulnerabilities detected by npm audit"
npm audit --json > audit-results.json
- name: Upload audit results
if: failure()
uses: actions/upload-artifact@v4
with:
name: npm-audit-results
path: audit-results.jsonCurrent Configuration (Lines 15, 50, 59):
- uses: actions/checkout@v4 # ✅ Recent
- uses: actions/setup-node@v4 # ✅ Recent
- uses: actions/upload-artifact@v4 # ✅ Recent
- uses: codecov/codecov-action@v4 # ⚠️ Check version
- uses: github/codeql-action/init@v3 # ✅ v3 is latestBest Practice: Pin actions to full commit SHA (GitHub Security Hardening):
# ❌ Vulnerable to tag moving
- uses: actions/checkout@v4
# ✅ Immutable - pinned to SHA
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1Rationale:
- Tags can be moved to malicious commits if repository compromised
- SHAs are immutable in Git
- Use release tag as comment for human readability
Tool: Use https://app.stepsecurity.io/ to generate SHA-pinned workflows
Current State: No Software Bill of Materials (SBOM) created
Recommendation (per NIST SP 800-218):
- name: Generate SBOM
run: |
npm install -g @cyclonedx/cyclonedx-npm
cyclonedx-npm --output-file sbom.json
- name: Upload SBOM
uses: actions/upload-artifact@v4
with:
name: sbom
path: sbom.json
retention-days: 90
- name: Scan SBOM for vulnerabilities
uses: anchore/scan-action@v3
with:
sbom: sbom.json
fail-build: trueSBOM Benefits:
- Track all dependencies (direct + transitive)
- Compliance with EO 14028 (US Federal)
- Supply chain security visibility
- Vulnerability correlation across projects
Principle: Find defects earlier in development cycle to reduce cost
Cost of Defect by Stage (Source: IBM Systems Sciences Institute):
Requirements: $100 to fix
Design: $1,000 to fix
Implementation: $10,000 to fix
Testing: $100,000 to fix
Production: $1,000,000 to fix
Implementation:
# .husky/pre-commit
#!/bin/sh
. "$(dirname "$0")/_/husky.sh"
echo "🔍 Running pre-commit checks..."
# 1. Lint staged files only
npx lint-staged
# 2. Run tests for changed modules only
npm run test:changed
# 3. Check coverage delta (don't allow coverage to decrease)
npm run test:coverage-diff
echo "✅ Pre-commit checks passed"Configuration (package.json):
{
"lint-staged": {
"modules/auth/**/*.js": [
"eslint --fix",
"npm run test:unit -- --changed",
"npm run test:coverage -- --changed"
]
},
"scripts": {
"test:changed": "vitest related HEAD --run",
"test:coverage-diff": "vitest --coverage --changed",
"prepare": "husky install"
},
"devDependencies": {
"husky": "^9.0.0",
"lint-staged": "^15.0.0"
}
}Benefits:
- Catches errors before commit (earliest possible)
- Only tests affected code (fast feedback)
- Prevents coverage regressions
- Low friction (runs automatically)
VS Code Settings (.vscode/settings.json):
{
"vitest.enable": true,
"vitest.commandLine": "npm run test:watch",
"editor.codeActionsOnSave": {
"source.fixAll.eslint": true
},
"files.associations": {
"*.test.js": "javascript"
},
"coverage-gutters.coverageFileNames": [
"coverage/lcov.info"
],
"coverage-gutters.showLineCoverage": true,
"coverage-gutters.showRulerCoverage": true
}Extensions to Install:
{
"recommendations": [
"vitest.explorer", // Run tests from sidebar
"ryanluker.vscode-coverage-gutters", // Show coverage in gutter
"dbaeumer.vscode-eslint", // Inline linting
"ms-vscode.vscode-github-pullrequest" // PR reviews in IDE
]
}Benefits:
- Instant feedback on test status
- Coverage visible while coding
- No context switching to terminal
- Encourages test-first development
Red-Green-Refactor Cycle:
// Step 1: RED - Write failing test first
describe('PKCEValidator', () => {
it('should reject plain code challenge method', () => {
const validator = new PKCEValidator();
const url = 'https://auth.example.com/authorize?code_challenge_method=plain';
const result = validator.verifyPKCE(url);
expect(result.issues).toContainEqual(
expect.objectContaining({
type: 'WEAK_PKCE_METHOD',
severity: 'HIGH'
})
);
});
});
// Step 2: GREEN - Implement minimum code to pass
class PKCEValidator {
verifyPKCE(url) {
const params = new URLSearchParams(new URL(url).search);
const method = params.get('code_challenge_method');
const issues = [];
if (method === 'plain') {
issues.push({
type: 'WEAK_PKCE_METHOD',
severity: 'HIGH',
message: 'plain method is insecure per RFC 7636 §4.2'
});
}
return { issues };
}
}
// Step 3: REFACTOR - Improve code quality
class PKCEValidator {
verifyPKCE(url) {
const params = this.parseURL(url);
return {
issues: [
this.validateChallengeMethod(params),
this.validateChallengeEntropy(params)
].filter(Boolean)
};
}
validateChallengeMethod(params) {
const method = params.get('code_challenge_method');
if (method === 'plain') {
return this.createIssue('WEAK_PKCE_METHOD', 'HIGH',
'plain method is insecure per RFC 7636 §4.2');
}
}
}Benefits:
- Tests drive API design
- 100% coverage by definition
- Prevents over-engineering
- Executable specifications
PR Template (.github/PULL_REQUEST_TEMPLATE.md):
## Test Coverage
- [ ] Unit tests added for new functions
- [ ] Integration tests added for new flows
- [ ] Error handling tests included
- [ ] Edge cases covered (null, undefined, boundary values)
- [ ] Security tests for authentication/authorization changes
- [ ] Coverage increased or maintained (check diff)
## Test Quality
- [ ] Tests are readable (describe/it blocks clear)
- [ ] Tests are isolated (no shared state)
- [ ] Tests are deterministic (no flaky tests)
- [ ] Mocks are appropriate (not over-mocked)
- [ ] Test names follow AAA pattern (Arrange-Act-Assert)
## Security Considerations
- [ ] Sensitive data properly mocked (no real tokens)
- [ ] Attack vectors tested (CSRF, XSS, injection)
- [ ] Cryptographic operations tested for failures
- [ ] Error messages don't leak sensitive info
## Coverage Report
Current coverage: __%
Change: ± __%
Link to coverage report: [Codecov](...)Process:
1. Schedule 2-hour pairing session
2. Navigator: Security expert from team
3. Driver: Module developer
4. Goal: Write tests for critical security module
Agenda:
- 15 min: Review module code, identify attack vectors
- 60 min: Write tests (driver codes, navigator reviews)
- 30 min: Run tests, review coverage
- 15 min: Document findings, create follow-up tickets
Benefits:
- Knowledge transfer (security + testing skills)
- Higher test quality (two perspectives)
- Catch blind spots
- Build testing culture
Structure:
Monthly Test Guild Meeting (1 hour)
- Agenda:
1. Review test metrics (coverage trends, flaky tests)
2. Share testing tips (new patterns, tools)
3. Test code review (pick one test file, improve it)
4. Q&A / open forum
Slack Channel: #testing-excellence
- Share test failures / successes
- Ask for test reviews
- Post testing articles
Metrics to Track:
// Weekly dashboard
{
"coverage": {
"overall": 45.2,
"delta": +2.1, // Trending up ✅
"auth_modules": 78.5
},
"test_count": {
"total": 284,
"delta": +12,
"passing": 282,
"flaky": 2 // Investigate
},
"test_speed": {
"avg_suite_time": "12.3s",
"delta": -0.5, // Faster ✅
"slowest_test": "OIDC at_hash validation (8.2s)"
}
}Objective: Achieve 80% coverage on Tier 1 critical modules
Tasks:
-
PKCE Validator Tests (2 days)
- Create
tests/unit/oauth2-pkce-verifier.test.js - 12 test cases minimum
- Target: 90% coverage
- Create
-
CSRF State Validator Tests (2 days)
- Create
tests/unit/oauth2-csrf-verifier.test.js - 15 test cases minimum
- Target: 90% coverage
- Create
-
Session Security Tests (3 days)
- Create
tests/unit/session-security-analyzer.test.js - 18 test cases minimum
- Target: 85% coverage
- Create
-
Token Redactor Tests (2 days)
- Create
tests/unit/token-redactor.test.js - 20 test cases minimum
- Target: 95% coverage (critical for data leakage)
- Create
Deliverables:
- 4 new test files
- ~65 new tests
- Coverage increase: 2.3% → ~25%
- CI/CD must pass with new coverage thresholds
Success Criteria:
- All Tier 1 modules >= 80% coverage
- All tests pass in CI/CD
- No flaky tests (3 consecutive runs)
- Code review approved by security lead
Objective: Enforce security gates in CI/CD pipeline
Tasks:
-
Update test.yml (1 day)
- Set fail_ci_if_error: true for Codecov - Add coverage threshold enforcement step - Pin actions to commit SHAs - Set explicit permissions
-
Update security.yml (1 day)
- Remove continue-on-error from npm audit - Add SBOM generation - Add dependency provenance checks - Configure CodeQL custom queries
-
Configure Branch Protection (0.5 day)
- Require "Security Gate" status check - Require code owner review - Require up-to-date branches -
Create Pre-Commit Hooks (0.5 day)
- Install husky + lint-staged - Configure pre-commit script - Test on local machine - Document in README
Deliverables:
- Updated workflow files
- Branch protection rules enabled
- Pre-commit hooks configured
- Documentation updated
Success Criteria:
- CI/CD blocks PRs with coverage < 70%
- CI/CD blocks PRs with security vulnerabilities
- Pre-commit hooks run successfully
- Team trained on new process
Objective: Cover remaining high-priority security modules
Tasks:
-
HSTS Verifier Tests (2 days)
- 10 test cases
- HTTP downgrade scenarios
- Preload list integration
-
DPoP Validator Tests (2 days)
- 14 test cases
- RFC 9449 compliance
- Proof-of-possession validation
-
Token Response Capturer Tests (3 days)
- 15 test cases
- Edge cases (large responses, timeouts)
- Race conditions
-
Error Handling Test Suite (3 days)
- Add error tests to existing modules
- Network failures
- Cryptographic failures
- Malformed input
Deliverables:
- 3 new test files
- ~50 new tests
- Error handling coverage: 3.5% → 60%
- Coverage increase: ~25% → ~40%
Success Criteria:
- All Tier 2 modules >= 75% coverage
- Error scenarios covered for critical paths
- No security regression vs Phase 1
Objective: Test phishing, dark pattern, and privacy detectors
Tasks:
-
Phishing Detector Tests (3 days)
- True positive scenarios
- False positive prevention
- Edge cases (internationalized domains)
-
Dark Pattern Detector Tests (2 days)
- UI manipulation detection
- Consent dialog analysis
-
Privacy Violation Detector Tests (2 days)
- GDPR compliance checks
- Cookie consent validation
-
Integration Tests (3 days)
- End-to-end OAuth2 flow
- Full OIDC flow with detection
- Evidence collection persistence
Deliverables:
- 3 new test files
- ~40 new tests
- 2-3 integration test suites
- Coverage increase: ~40% → ~60%
Success Criteria:
- Detection modules >= 70% coverage
- Integration tests pass consistently
- E2E test suite runs in < 5 minutes
Objective: Reach 70% overall coverage, 85% security module coverage
Tasks:
-
Identify Remaining Gaps (1 day)
- Generate coverage report
- List uncovered functions
- Prioritize by risk
-
Write Missing Tests (7 days)
- Focus on red areas in coverage report
- Add tests for utilities
- Test UI modules
-
Refactor for Testability (3 days)
- Extract hard-to-test code
- Reduce coupling
- Add dependency injection
-
Performance Optimization (1 day)
- Parallelize slow tests
- Optimize test setup/teardown
- Target: Full suite < 60 seconds
Deliverables:
- Coverage: 70% overall, 85% auth modules
- Test suite runtime: < 60 seconds
- All modules have at least basic tests
- Flaky test rate: < 1%
Success Criteria:
- Coverage thresholds met in CI/CD
- No failing tests in main branch
- Team confident in test suite
- Zero security regressions
Objective: Maintain and improve test quality
Tasks:
-
Monthly Test Review (2 hours/month)
- Review coverage trends
- Identify flaky tests
- Prioritize new test areas
-
Quarterly Security Audit (1 day/quarter)
- Review attack surface changes
- Add tests for new vulnerabilities
- Update threat model
-
Developer Training (1 day/quarter)
- TDD workshop
- Security testing patterns
- Mock strategy
Ongoing Metrics:
{
"coverage": {
"target": ">= 70%",
"current": "??%",
"trend": "↑"
},
"test_quality": {
"flaky_rate": "< 1%",
"avg_runtime": "< 60s",
"test_count": ">= 500"
},
"security": {
"vulnerabilities": 0,
"security_tests": ">= 150",
"last_audit": "2024-11-06"
}
}# Increase coverage thresholds
git checkout -b test/increase-coverage-thresholds
# Edit vitest.config.js (see Part 3.1)
git commit -m "test: increase coverage thresholds to enforce quality"
git push# Remove continue-on-error and fail_ci_if_error: false
git checkout -b ci/enforce-security-gates
# Edit .github/workflows/*.yml (see Part 4)
git commit -m "ci: enforce security gates in CI/CD pipeline"
git push# Write first critical security tests
git checkout -b test/pkce-validator
# Create tests/unit/oauth2-pkce-verifier.test.js
git commit -m "test: add comprehensive PKCE validator tests"
git push1. Go to GitHub repo → Settings → Branches
2. Add rule for "main"
3. Enable "Require status checks to pass"
4. Select "Security Gate" and "code-quality"
5. Save changes
// tests/unit/[module-name].test.js
import { describe, it, expect, beforeEach } from 'vitest';
import { ModuleName } from '../../modules/[module-path].js';
describe('ModuleName - Security Validation', () => {
let validator;
beforeEach(() => {
validator = new ModuleName();
});
describe('Attack Vector: [Attack Name]', () => {
it('should detect [vulnerability]', () => {
// Arrange: Set up attack scenario
const maliciousInput = '...';
// Act: Run validation
const result = validator.validate(maliciousInput);
// Assert: Verify vulnerability detected
expect(result.issues).toHaveLength(1);
expect(result.issues[0]).toMatchObject({
type: 'VULNERABILITY_TYPE',
severity: 'CRITICAL',
cvss: expect.any(Number)
});
});
it('should reject [insecure pattern]', () => {
// Test implementation
});
it('should accept [secure pattern]', () => {
// Test implementation
});
});
describe('Error Handling', () => {
it('should handle null input gracefully', () => {
expect(() => validator.validate(null)).not.toThrow();
});
it('should handle malformed input', () => {
const result = validator.validate('invalid@#$');
expect(result.valid).toBe(false);
expect(result.error).toBeDefined();
});
});
describe('Edge Cases', () => {
it('should handle empty string', () => {
// Test implementation
});
it('should handle very long input', () => {
const longInput = 'A'.repeat(1000000);
expect(() => validator.validate(longInput)).not.toThrow();
});
});
});// tests/integration/[flow-name].test.js
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { setMockStorageData, resetChromeMocks } from '../mocks/chrome.js';
describe('OAuth2 Flow Integration', () => {
beforeEach(() => {
resetChromeMocks();
setMockStorageData({ /* initial state */ });
});
afterEach(() => {
// Cleanup
});
it('should complete full authorization code flow', async () => {
// Step 1: Authorization request
const authUrl = 'https://auth.example.com/authorize?...';
// Simulate user click
// Step 2: User authenticates
// Simulate callback with code
// Step 3: Token exchange
// Simulate token request
// Step 4: Verify evidence collected
const evidence = await getStoredEvidence();
expect(evidence.flow).toBe('authorization_code');
expect(evidence.issues).toHaveLength(0);
});
it('should detect PKCE missing in flow', async () => {
// Test implementation
});
});describe('Error Scenarios', () => {
it('should handle network timeout', async () => {
// Mock network failure
global.fetch = vi.fn(() =>
Promise.reject(new Error('Network timeout'))
);
const result = await validator.fetchAndValidate(url);
expect(result.error).toBeDefined();
expect(result.error.type).toBe('NETWORK_ERROR');
});
it('should handle quota exceeded', async () => {
// Mock storage quota
chrome.storage.local.set.mockImplementation(() =>
Promise.reject(new Error('QUOTA_BYTES_PER_ITEM'))
);
const result = await saveEvidence(largeData);
expect(result.error).toBe('STORAGE_QUOTA_EXCEEDED');
});
});# Run specific test file
npm run test tests/unit/jwt-validator.test.js
# Run tests matching pattern
npm run test -- --grep="PKCE"
# Run with coverage for single file
npm run test:coverage -- tests/unit/oidc-validator.test.js
# Watch mode for TDD
npm run test:watch
# UI mode for exploration
npm run test:ui
# Run only failed tests
npm run test -- --rerun-failures
# Update snapshots
npm run test -- --update
# Generate coverage report
npm run test:coverage
open coverage/index.html
# Check coverage thresholds only
npm run test:coverage -- --reporter=none
# Parallel execution (default, but explicit)
npm run test -- --threads
# Disable parallel (for debugging)
npm run test -- --no-threads
# Bail on first failure
npm run test -- --bail=1
# Increase timeout for slow tests
npm run test -- --test-timeout=30000This adversarial analysis identified critical security gaps in Hera's testing infrastructure:
- 2.3% coverage vs 80% industry standard for security code
- 11 critical modules untested: PKCE, CSRF, session security, token redaction
- CI/CD security gates disabled: Vulnerabilities allowed to merge
- 85+ error scenarios uncovered: Crash vulnerabilities exploitable
Immediate Priorities:
- Write tests for Tier 1 security modules (Week 1-2)
- Fix CI/CD to block security failures (Week 2)
- Achieve 70% coverage baseline (Week 1-8)
- Establish continuous improvement process (Ongoing)
Risk Mitigation: Following this roadmap reduces security vulnerability risk from HIGH to MEDIUM within 8 weeks, and to LOW with ongoing maintenance.