[epic] Image scanning lane — scan Dockerfile / base images / built image content #9

@nirmalgupta

Motivation

Today the scanner targets a source repo (lockfiles, source files, secrets, IaC) and finds CVEs in declared dependencies. Container images built from that repo carry a different attack surface: statically installed binaries (e.g., the scanner binaries we install in our own Dockerfile), OS packages from apt, and Python packages installed with pip at build time. Docker Scout demonstrably finds Go-stdlib criticals inside those binaries that our source-only lane never sees (e.g., CVE-2025-68121 inside gitleaks 8.30.1, caught in Scout's review of leverj/security-scan:v0.2.1).

The chicken-and-egg problem: when security-scan runs against a source repo, the image typically hasn't been built yet. So "scan the image" has to mean one of: (a) scan without building, (b) opt-in build-and-scan, or (c) scan an already-published image.

This epic adds three coordinated modes so users can pick the cost/coverage tradeoff that fits their pipeline.

Modes

(A) Dockerfile audit — default on, no image required

Scan the Dockerfile itself for posture issues: missing USER, hardcoded secrets in ENV/ARG, ADD from URL, copying sensitive paths, root-only WORKDIR, exposed :22, apt-get without --no-install-recommends, etc.

Implementation: most of this is already in Trivy's --scanners misconfig. Verify our runners/trivy.py invocation includes the misconfig scanner and that Dockerfiles in the cloned tree are picked up. If not, enable it. Map findings to category: "iac" (which the project board already supports).
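
The check described above could look roughly like the following. This is a minimal sketch, not the actual contents of runners/trivy.py: the function names and the finding dict shape are hypothetical, and the JSON keys (Results, Type, Misconfigurations, ID, Title, Severity, Status) reflect my understanding of Trivy's JSON report and should be verified against the Trivy version we pin.

```python
from pathlib import Path
from typing import Any


def build_trivy_fs_command(
    repo_dir: Path,
    scanners: tuple[str, ...] = ("vuln", "secret", "misconfig"),
) -> list[str]:
    """Build a `trivy fs` argv that includes the misconfig scanner."""
    return [
        "trivy", "fs",
        "--scanners", ",".join(scanners),
        "--format", "json",
        "--quiet",
        str(repo_dir),
    ]


def dockerfile_findings(report: dict[str, Any]) -> list[dict[str, str]]:
    """Map Dockerfile misconfigurations in a parsed report to category "iac".

    The output dict shape is hypothetical; the real normalizer may differ.
    """
    findings: list[dict[str, str]] = []
    for result in report.get("Results") or []:
        if result.get("Type") != "dockerfile":
            continue  # other IaC types are handled elsewhere
        for item in result.get("Misconfigurations") or []:
            if item.get("Status") == "PASS":
                continue  # only report failed checks
            findings.append({
                "category": "iac",
                "rule_id": str(item.get("ID", "")),
                "title": str(item.get("Title", "")),
                "severity": str(item.get("Severity", "UNKNOWN")),
                "path": str(result.get("Target", "")),
            })
    return findings
```

Keeping the argv builder separate from the subprocess call makes it easy to assert in a unit test that misconfig is actually in the scanner list.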

(B) Base-image scan — default on, no build required

Parse every FROM line in the repo's Dockerfile(s), pull each base image (cacheable across runs), and run trivy image <ref> against each. Findings get category: "dependency" or a new category: "image".

This catches "your project's images are built on a vulnerable base" (the same insight that drove our v0.2.2 base bump from python:3.13-slim → python:3.14-slim) without needing to build the project's actual image.

Cost: one image pull per FROM reference, ~100MB–1GB each, cacheable. First run is slow; subsequent runs reuse the local image.
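
The FROM extraction for mode B could be sketched as below. The function names are hypothetical, and the skipping rules are assumptions about what we would want: ignore scratch, ignore references to earlier build stages, and ignore references containing unresolved ARG variables (which cannot be pulled without evaluating the build args). It also assumes each FROM instruction fits on one line.

```python
import re

# FROM [--platform=...] <ref> [AS <stage>]
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<ref>\S+)(?:\s+AS\s+(?P<stage>\S+))?",
    re.IGNORECASE,
)


def extract_base_images(dockerfile_text: str) -> list[str]:
    """Return the distinct pullable base image refs, in order of appearance."""
    stage_names: set[str] = set()
    refs: list[str] = []
    for line in dockerfile_text.splitlines():
        match = _FROM_RE.match(line)
        if match is None:
            continue
        ref = match.group("ref")
        pullable = (
            ref.lower() != "scratch"                # empty base, nothing to scan
            and ref.lower() not in stage_names      # refers to an earlier stage
            and "$" not in ref                      # unresolved ARG (assumption: skip)
        )
        if pullable and ref not in refs:
            refs.append(ref)
        stage = match.group("stage")
        if stage:
            stage_names.add(stage.lower())
    return refs


def build_trivy_image_command(ref: str) -> list[str]:
    """Build a `trivy image` argv for one base image reference."""
    return ["trivy", "image", "--format", "json", "--quiet", ref]
```

Deduplicating refs keeps the cost at one pull per distinct base even when several Dockerfiles or stages share it.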

(C) Built-image scan — opt-in

Either:

  • built_image.ref: <published-tag> — pull and scan an already-published image; or
  • built_image.build_locally: true → docker build . in the cloned tree (with the host docker socket mounted; security implications noted below) and scan the resulting local image.

This is the only mode that catches binary-embedded CVEs — exactly the Scout-grade gap we hit with leverj/security-scan:v0.2.2. Findings get category: "image".

Cost is real: a docker build is minutes + multi-GB; a docker pull is whatever the image weighs. Opt-in because not every project has a Dockerfile, and the ones that do don't all want this on every run.

Config schema sketch

image_scan:
  dockerfile: true            # mode A — Trivy misconfig over Dockerfiles in the tree
  base_images: true           # mode B — scan each `FROM` reference
  built_image:                # mode C — opt-in
    enabled: false
    ref: null                 # if set: pull + scan this image (no build)
    build_locally: false      # if true and ref unset: docker build the cloned tree and scan
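
One possible shape for the validation in secscan/config.py is sketched below. The class and function names are hypothetical, and two rules are assumptions beyond the schema above: unknown keys are rejected, and built_image.enabled: true with neither ref nor build_locally set is treated as a configuration error. The precedence (ref wins over build_locally) follows the comments in the schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class BuiltImageConfig:
    enabled: bool = False
    ref: Optional[str] = None
    build_locally: bool = False


@dataclass(frozen=True)
class ImageScanConfig:
    dockerfile: bool = True
    base_images: bool = True
    built_image: BuiltImageConfig = field(default_factory=BuiltImageConfig)


def _get_bool(raw: dict[str, Any], key: str, default: bool) -> bool:
    value = raw.get(key, default)
    if not isinstance(value, bool):
        raise ValueError(f"image_scan: {key} must be a boolean")
    return value


def parse_image_scan(raw: Optional[dict[str, Any]]) -> ImageScanConfig:
    """Validate the image_scan block; a missing block yields the defaults."""
    raw = raw or {}
    unknown = set(raw) - {"dockerfile", "base_images", "built_image"}
    if unknown:
        raise ValueError(f"image_scan: unknown keys {sorted(unknown)}")
    built_raw = raw.get("built_image") or {}
    ref = built_raw.get("ref")
    if ref is not None and (not isinstance(ref, str) or not ref.strip()):
        raise ValueError("image_scan: built_image.ref must be a non-empty string")
    built = BuiltImageConfig(
        enabled=_get_bool(built_raw, "enabled", False),
        ref=ref,
        build_locally=_get_bool(built_raw, "build_locally", False),
    )
    if built.enabled and built.ref is None and not built.build_locally:
        raise ValueError("image_scan: built_image needs ref or build_locally")
    return ImageScanConfig(
        dockerfile=_get_bool(raw, "dockerfile", True),
        base_images=_get_bool(raw, "base_images", True),
        built_image=built,
    )


def built_image_action(cfg: BuiltImageConfig) -> str:
    """Return "skip", "pull" (ref set), or "build" (build_locally, ref unset)."""
    if not cfg.enabled:
        return "skip"
    return "pull" if cfg.ref else "build"
```

Resolving the sub-mode in one place means runners/built_image.py never has to decide on its own whether a docker build is permitted.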

Sub-tasks

  • Wire runners/trivy.py to enable --scanners misconfig and verify Dockerfile findings flow into the normalizer.
  • New runners/base_images.py: extract FROM lines from Dockerfiles in the tree, run trivy image per base, normalize.
  • New runners/built_image.py: implement the two sub-modes (ref and build_locally). For build_locally, expect a mounted docker socket in the container (-v /var/run/docker.sock:/var/run/docker.sock) and document the security implications.
  • Extend secscan/config.py with the image_scan block + validation.
  • Extend detect.py to recognize Dockerfiles and emit (dockerfile, base_images, built_image) targets when their flags are on.
  • Add a category: "image" (or reuse dependency/iac as appropriate) and update the Category Projects v2 field options + manifest's new_fields.
  • Update cross_validate.py to treat image findings the same as other findings.
  • Tests: synthetic Dockerfile fixture, mocked trivy image responses, ensure non-Dockerfile repos cleanly skip.
  • Manifest changelog + bump config_schema_version to 3.
  • README + skill SKILL.md updates (the ai-skills consumer reads the manifest, so the migration is auto-surfaced).
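
For the detect.py sub-task, target emission could be sketched as follows. Everything here is illustrative: the function names, the filename patterns treated as Dockerfiles, and the rule that built_image is also emitted when a ref is configured even if the tree has no Dockerfile are all assumptions, not settled design.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def is_dockerfile(path: str) -> bool:
    """Match Dockerfile, Dockerfile.<suffix>, and <name>.Dockerfile (assumed patterns)."""
    name = PurePosixPath(path).name.lower()
    return (
        name == "dockerfile"
        or name.startswith("dockerfile.")
        or name.endswith(".dockerfile")
    )


def image_targets(
    repo_paths: Iterable[str],
    *,
    dockerfile: bool = True,
    base_images: bool = True,
    built_image_enabled: bool = False,
    built_image_ref: Optional[str] = None,
) -> list[str]:
    """Return the image-scan targets to run for this repo and config."""
    has_dockerfile = any(is_dockerfile(p) for p in repo_paths)
    targets: list[str] = []
    if dockerfile and has_dockerfile:
        targets.append("dockerfile")
    if base_images and has_dockerfile:
        targets.append("base_images")
    # A published ref can be scanned without a Dockerfile; a local build cannot.
    if built_image_enabled and (built_image_ref or has_dockerfile):
        targets.append("built_image")
    return targets
```

With the defaults, a repo without any Dockerfile yields an empty target list, which is the clean-skip behavior the tests sub-task asks for.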

Security implications (mode C, build_locally)

docker build from a cloned (untrusted) repo is fundamentally a code execution surface — the Dockerfile's RUN lines run with daemon privileges. Mitigations to consider:

  • Default off; require explicit opt-in per-config.
  • Document clearly that mode C "executes" the repo (violating the spec's Never execute repo code invariant for that mode specifically).
  • For mode (C) ref-only sub-mode (pull + scan, no build), the safety profile is the same as Scout itself — fine.
  • Consider an isolated docker daemon (rootless, network-scoped) for build_locally runs.

The spec's Repo execution: Never execute repo code invariant should be amended to note that image_scan.built_image.build_locally: true is the one opt-in exception.

Non-goals

  • Replacing Docker Scout. Scout has its own SBOM pipeline, runtime detection, and grading; we'd be a stripped-down image-scanning lane riding on Trivy's image mode.
  • Multi-platform image scans (build/pull amd64+arm64 per image). v1 of this epic uses the host arch only.
  • Mutating the image / suggesting fixes. We file findings; the existing system fixes them.
  • Image distribution / signing / SBOM management for user images (that's their pipeline's concern).

Related

  • Spec §17 (deferred roadmap) — adds "image scanning lane" alongside DAST/pen-test and live Supabase.
  • Live Supabase Security Advisor epic: [epic] Live Supabase Security Advisor (DB-connected lane) #4
  • Triggered by Docker Scout findings on leverj/security-scan:v0.2.1 and :v0.2.2 (see the v0.2.2 manifest changelog).

Acceptance

  • A project with a Dockerfile gets findings filed for: (a) Dockerfile config issues, (b) CVEs in its base images, and — when opt-in — (c) CVEs in the built/pulled image's full content.
  • A project without a Dockerfile is unaffected (no findings, no errors).
  • All findings flow through the existing fingerprint + Projects v2 + cross-validation pipeline; no special-casing in sync.py.
