[REVIEW] sast-config: add polyglot monorepo workspace-scoped SARIF completeness gates #1527

@bnpl7

Skill Being Reviewed

Skill name: sast-config
Skill path: skills/devsecops/sast-config/

False Positive Analysis

Benign-looking SAST maturity result that can be incorrectly scored as broad coverage:

# .github/workflows/codeql.yml
strategy:
  matrix:
    language: [javascript]
paths-ignore:
  - services/**
  - apps/**
  - packages/**

SAST Maturity Assessment Summary
- Active tool: CodeQL
- CWE Top 25 coverage: 18/25 for JavaScript
- CI gate: pass on PR
- Custom rules: 12 Semgrep rules in /security/semgrep

Why this is a false positive:

The skill can report healthy CWE coverage for "JavaScript" while the repository is a polyglot monorepo and the CI workflow only analyzes a narrow path or a single language slice. In the example, ignoring services/, apps/, and packages/ means the assessment covers almost none of the production code. The skill checks whether rules exist and whether CWE rows are mapped, but not whether the analyzed workspace spans all language ecosystems and deployable components.
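The missing check can be sketched as a comparison between a component inventory and what the workflow actually scans. This is a minimal sketch, not the skill's implementation: the `COMPONENTS` inventory, its component paths, and the `coverage_gaps` helper are hypothetical, and `fnmatchcase` only approximates GitHub's glob semantics.

```python
from fnmatch import fnmatchcase

# Hypothetical inventory: component directory -> primary language.
COMPONENTS = {
    "services/payments-api": "go",
    "apps/web-checkout": "javascript",
    "packages/auth-worker": "python",
}


def coverage_gaps(components, scanned_languages, paths_ignore):
    """Return {component_path: reason} for components the scan cannot reach."""
    gaps = {}
    for path, language in components.items():
        # fnmatchcase approximates GitHub glob rules ('*' also matches '/').
        if any(fnmatchcase(path, pattern) for pattern in paths_ignore):
            gaps[path] = "excluded by paths-ignore"
        elif language not in scanned_languages:
            gaps[path] = f"language '{language}' is not in the scan matrix"
    return gaps


if __name__ == "__main__":
    ignored = ["services/**", "apps/**", "packages/**"]
    for path, reason in coverage_gaps(COMPONENTS, {"javascript"}, ignored).items():
        print(f"{path}: {reason}")
```

With the workflow from the example, every component in the inventory is reported as a gap, even though the CWE table for JavaScript looks healthy.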

Coverage Gaps

Missed variant 1: Per-package SARIF uploaded from subdirectory scan only

# turbo / nx monorepo
projects:
  - payments-api (Go)
  - web-checkout (TypeScript)
  - auth-worker (Python)

semgrep ci --config p/ci --subdir apps/web-checkout
# SARIF uploaded as full-repo scan result

Why it should be caught: ASVS mapping can appear complete for TypeScript while Go and Python services remain unscanned. The skill should require a component-to-scan-artifact matrix, not a single global coverage table.
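One way to picture the component-to-scan-artifact matrix is the sketch below. The `Component` and `ScanArtifact` types, the `build_matrix` and `unscanned` helpers, and the `services/...` paths are assumptions for illustration; only `apps/web-checkout` appears in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    path: str       # repo-relative directory (hypothetical layout)
    language: str


@dataclass(frozen=True)
class ScanArtifact:
    tool: str
    scanned_root: str       # directory the scanner was actually pointed at
    languages: frozenset    # languages the run had rules enabled for


def _is_within(path, root):
    """True if path equals root or lies below it; an empty root means the whole repo."""
    root = root.strip("/")
    return root == "" or path == root or path.startswith(root + "/")


def build_matrix(components, artifacts):
    """Map each component name to the tools whose scan covered its path and language."""
    return {
        c.name: [
            a.tool
            for a in artifacts
            if _is_within(c.path, a.scanned_root) and c.language in a.languages
        ]
        for c in components
    }


def unscanned(matrix):
    """Components with no covering scan artifact."""
    return [name for name, tools in matrix.items() if not tools]
```

Fed the three projects above and a single Semgrep run rooted at `apps/web-checkout`, `unscanned` would return the Go and Python services, which a single global coverage table hides.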

Missed variant 2: Generated code and vendor subtree inflate pass rate

Scanned files: 4,812
Generated protobuf/grpc: 4,103
Handwritten source: 709
Gate result: pass (0 findings)

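Why it should be caught: About 85% of the scanned files (4,103 of 4,812) are generated, so a clean gate says little about the 709 handwritten files. Suppressed findings and low-signal generated code can hide missing coverage of handwritten source. The skill should gate on the percentage of handwritten LOC scanned, or on an equivalent build-target coverage measure.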
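A handwritten-LOC gate could look roughly like the following sketch. The pattern list is an assumption based on common protobuf/gRPC and vendoring naming conventions, and `is_generated` and `handwritten_coverage` are hypothetical helper names; a real gate would need the repository's own generated-code markers.

```python
import re

# Assumed conventions for generated or vendored files; extend per repository.
GENERATED_PATTERNS = [
    re.compile(r"\.pb\.go$"),            # protoc-gen-go output, incl. _grpc.pb.go
    re.compile(r"_pb2(_grpc)?\.py$"),    # Python protobuf/gRPC stubs
    re.compile(r"(^|/)vendor/"),
    re.compile(r"(^|/)node_modules/"),
]


def is_generated(path):
    """Heuristic: treat a file as generated/vendored if any pattern matches."""
    return any(pattern.search(path) for pattern in GENERATED_PATTERNS)


def handwritten_coverage(repo_loc, scanned_paths):
    """Fraction of handwritten LOC in the repo that appears in the scan.

    repo_loc: {path: lines_of_code} for every source file in the repo.
    scanned_paths: set of paths the scanner reported as analyzed.
    """
    handwritten = {p: loc for p, loc in repo_loc.items() if not is_generated(p)}
    total = sum(handwritten.values())
    if total == 0:
        return 0.0
    covered = sum(loc for p, loc in handwritten.items() if p in scanned_paths)
    return covered / total
```

The gate would then fail when `handwritten_coverage(...)` drops below a threshold the team chooses, regardless of how many generated files were scanned.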

Missed variant 3: CodeQL autobuild succeeds only for root app while failing silently for nested modules

CodeQL job summary:
- autobuild: success
- extracted: 1 Java database (root build.gradle only)
- modules not extracted: payments-core, ledger-worker

Why it should be caught: Existing reviews mention build completeness, but not monorepo workspace boundary proof. A single successful autobuild should not satisfy coverage for all compiled modules.
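As a rough illustration of workspace boundary proof for a Gradle build, the sketch below compares modules declared in `settings.gradle` against the module directories that extraction actually covered. The parsing is deliberately naive, `declared_modules` and `unextracted_modules` are hypothetical helpers, and how the set of extracted directories is obtained from CodeQL logs is left as an assumption.

```python
import re

_QUOTED = re.compile(r"""['"]([^'"]+)['"]""")


def declared_modules(settings_gradle_text):
    """Collect module directories from simple `include ...` lines.

    Handles `include 'a', 'b'` and `include(":a")`; skips `includeBuild`.
    Assumes default project directories (':' in a project path maps to '/').
    """
    modules = []
    for line in settings_gradle_text.splitlines():
        line = line.strip()
        if re.match(r"include\b", line):
            for name in _QUOTED.findall(line):
                modules.append(name.lstrip(":").replace(":", "/"))
    return modules


def unextracted_modules(settings_gradle_text, extracted_dirs):
    """Declared modules with no evidence of extraction."""
    return [m for m in declared_modules(settings_gradle_text) if m not in extracted_dirs]


if __name__ == "__main__":
    settings = "rootProject.name = 'shop'\ninclude 'payments-core', 'ledger-worker'\n"
    # Only the root build was extracted, so no module directories are present.
    print(unextracted_modules(settings, extracted_dirs=set()))
```

A non-empty result would mean a green autobuild is not sufficient evidence of coverage for the compiled modules.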

Edge Cases

  • Bazel/Gradle composite builds: Extraction may cover targets not shipped to production; skill should map scan targets to release artifacts.
  • Fork PR scans: pull_request_target workflows run in the context of the base repository and check out the base ref by default, so the scan may cover only the base branch while being reported as PR coverage.
  • Baseline suppression files: Repo-wide baseline may hide findings in one package while giving another package a false clean bill of health.
  • Shared ruleset with language filters disabled: One Semgrep config file referenced everywhere, but only Java rules enabled in CI.

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add monorepo completeness gates to Step 1 Discovery and Step 2 CWE coverage validation.

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| Semgrep AppSec platform | Partial | Has project/tag scoping if configured; skill does not require it |
| CodeQL dependency analysis | Partial | Shows extracted languages/LOC if reviewer inspects logs |
| SonarQube monorepo | Partial | Can do per-project gates; skill lacks equivalent requirement |
| GitHub Advanced Security code scanning | Partial | SARIF category metadata exists but skill ignores it |

Overall Assessment

Strengths: Solid CWE/ASVS mapping framework, good discovery patterns, and useful severity-tuning guidance.

Needs improvement: Coverage is evaluated at repo level, not deployable-component level. Polyglot monorepos are the common case for the target audience, so false completeness is likely.

Priority recommendations:

  1. Add a "Workspace Coverage Matrix" output: each production component, language, scanner, last successful scan commit, and LOC extracted.
  2. Treat single-language green CI in a polyglot repo as a High finding until all release components are mapped.
  3. Require SARIF/run logs to prove which paths were included/excluded; do not accept global CWE coverage without path evidence.
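Recommendations 1 and 3 could be combined along the lines of this sketch, which derives per-component path evidence from a SARIF 2.1.0 log. It assumes repo-relative artifact URIs and a caller-supplied component map; `coverage_rows` and `_run_paths` are hypothetical names, and scan commit and LOC columns are omitted for brevity.

```python
def _run_paths(run):
    """Collect file URIs a SARIF run mentions in `artifacts` and result locations."""
    paths = set()
    for artifact in run.get("artifacts", []):
        uri = artifact.get("location", {}).get("uri")
        if uri:
            paths.add(uri)
    for result in run.get("results", []):
        for location in result.get("locations", []):
            uri = (
                location.get("physicalLocation", {})
                .get("artifactLocation", {})
                .get("uri")
            )
            if uri:
                paths.add(uri)
    return paths


def coverage_rows(sarif, components):
    """One row per (run, component) with the number of files seen under it.

    sarif: parsed SARIF document (dict), e.g. from json.load().
    components: {component_name: repo-relative root directory}.
    """
    rows = []
    for run in sarif.get("runs", []):
        tool = run.get("tool", {}).get("driver", {}).get("name", "unknown")
        paths = _run_paths(run)
        for name, root in components.items():
            prefix = root.rstrip("/") + "/"
            files_seen = sum(1 for p in paths if p.startswith(prefix))
            rows.append({"component": name, "tool": tool, "files_seen": files_seen})
    return rows
```

A row with `files_seen` of 0 means the log carries no path evidence for that component. Because the `artifacts` array is optional in SARIF, a clean run may legitimately list no files, in which case run logs would be needed as the evidence instead.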

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: PayPal
