Run container as non-root, modernize Dockerfile and CI caching #138
pre-created cache directory, working with or without a volume
- pin base images by tag and digest; use apk --no-cache
- add a BuildKit go-build cache mount and -trimpath to test/build
- add a dev target with a documented live-source workflow
- add OCI image labels and a .dockerignore to shrink the build context
A Dependabot job would work well, but I think Renovate would work even better: it provides the Dependency Dashboard, better PRs, and is customizable. Example here: tenhishadow/dotfiles#55. Let me know your thoughts; I could also implement it there.
Thank you for the contribution, it looks very useful. Cc @dezeroku @Orochimarufan @ilya-zlobintsev for more attention
@anatol @dezeroku @Orochimarufan @ilya-zlobintsev One more thing I prototyped while working on this, sharing it here for the record: a distroless runtime image.
The idea: since the runtime stage really only needs the pacoloco binary, the final image can be distroless:
FROM common AS build
RUN --mount=type=cache,target=/root/.cache/go-build \
    go build -trimpath -tags sqlite_omit_load_extension \
    -ldflags='-s -w -linkmode external -extldflags "-static"' \
    && install -d /rootfs/var/cache/pacoloco
FROM gcr.io/distroless/static-debian13:nonroot AS executable
WORKDIR /pacoloco
COPY --from=build /build/pacoloco .
# pacoloco requires cache_dir to exist; bake it in with the right owner
COPY --from=build --chown=65532:65532 /rootfs/ /
USER 65532:65532
EXPOSE 9129
CMD ["/pacoloco/pacoloco"]
(sqlite_omit_load_extension drops sqlite's dlopen-based extension loading, which can't work in a static binary anyway.)
What makes it a nice fit: distroless/static already ships ca-certificates, tzdata and the nonroot (65532) user out of the box, so no apk add tzdata is needed at all and TZ for the prefetch cron just works. I verified the result end to end: HTTPS mirror fetches, the sqlite prefetch DB, and log timestamps with TZ set all work. The image comes out at ~17 MB with essentially zero scannable surface (trivy: no OS packages, 0 findings).
That said, I decided not to include it in this PR: switching the cgo sqlite driver to static linking is a real change to how the binary is built, and with no shell in the image, docker exec debugging stops working. It felt like too much to bundle with a non-root/CI cleanup. If there's interest, I'm happy to send it as a small follow-up; it's a ~10-line diff on top of this PR.
  steps:
-   - uses: actions/checkout@v3
+   - uses: actions/checkout@master
Why lock it to 'master' instead of v7? I suppose that we don't have any args passed to it so I'd expect it to work with future majors, but it's quite uncommon I'd say.
On a broader note, if we already lock the hashes in Dockerfile then we might also want to do it for GitHub actions, especially with renovate/dependabot in place, WDYT?
@dezeroku Yes, this was intentional.
My reasoning was that actions/checkout is probably one of the most stable GitHub Actions, and since we do not pass any special inputs here, tracking master avoids having to periodically bump the major version just for this step.
Strictly speaking, the strongest supply-chain posture would be pinning GitHub Actions by commit SHA as well and letting Renovate/Dependabot keep them updated, similar to the Dockerfile digest pins.
That said, I agree it is less common than pinning to a major version, so I can switch it to v7 if you prefer.
I see your point, but let's keep the convention of locking it to something tangible, I think.
Either v7 or commit SHAs are A-okay in my opinion.
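To make the distinction in this thread concrete, here is a minimal bash sketch that classifies a `uses:` reference as SHA-pinned or mutable. The helper name is_sha_pinned is hypothetical, and the 40-character hash below is a placeholder, not a real actions/checkout commit.

```shell
#!/usr/bin/env bash
# Sketch: tell a commit-SHA pin apart from a mutable ref (branch or tag).
# is_sha_pinned is a hypothetical helper, not part of any GitHub tooling.

is_sha_pinned() {
  local ref="${1##*@}"               # keep only the text after the last '@'
  [[ "$ref" =~ ^[0-9a-f]{40}$ ]]     # a full-length SHA-1 is 40 lowercase hex chars
}

# The last entry uses a placeholder hash purely for illustration.
for uses in "actions/checkout@master" \
            "actions/checkout@v7" \
            "actions/checkout@0123456789abcdef0123456789abcdef01234567"; do
  if is_sha_pinned "$uses"; then
    echo "pinned:  $uses"
  else
    echo "mutable: $uses"
  fi
done
```

A major-version tag such as v7 still counts as mutable here, since a tag can be moved; it is simply the more conventional choice compared with tracking a branch.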
LGTM in general, just this one comment about hashes to address. As for distroless, the only minor inconvenience I see is that working within the filesystem gets a bit harder in environments where it's not easily accessible to the user. For example, if you run the pod in K8s with a Ceph-backed volume, you might want to manually remove or just look through the files from time to time. This can easily be done with something like a sidecar or dev container, but might be a bit harder for new users adopting it. It's just a small nit in the end, and I don't think we should consider it a blocker for distroless, just something to keep in mind.
Yes, that was my thinking as well. I considered distroless, but decided not to include it in this PR for exactly that reason: debugging and inspecting the runtime filesystem becomes more awkward both in Docker and in Kubernetes, especially for users who do not already have sidecar/dev-container workflows in place. It would also require moving the image closer to a fully static binary setup, so it is a bit more invasive than just changing the final base image. The size difference was not huge either from what I remember, roughly 10 MB, so I mentioned it as a possible future direction but did not want to make this PR larger or less convenient to operate.
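For the plain Docker case, the sidecar idea mentioned above can be approximated with a throwaway container that shares the volumes of the running one. The sketch below only prints the command so it can be reviewed before running. It assumes the cache lives on a volume, that the container is named pacoloco (a hypothetical name), and that an alpine image is acceptable for inspection.

```shell
#!/usr/bin/env bash
# Sketch: build a "docker run --volumes-from" command for inspecting the
# cache of a shell-less container. Dry run only: it prints, it does not execute.

inspect_cmd() {
  local container="$1" path="$2"
  printf 'docker run --rm --volumes-from %s alpine ls -la %s\n' "$container" "$path"
}

# "pacoloco" is an assumed container name; /var/cache/pacoloco is the cache
# path used in the Dockerfile snippet earlier in this thread.
inspect_cmd pacoloco /var/cache/pacoloco
```

This does not replace docker exec for debugging the process itself; it only covers the look-through-the-files case.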
Thank you @tenhishadow for your great contribution. One minor comment: I see that this PR has two fairly distinct features. Could you please squash the "fix" commits into the original two commits to keep the history nice and clean?
- bump all actions to current majors (checkout v7, build-push v7,
metadata v6, upload/download-artifact v7/v8, qemu/buildx/login v4)
- replace the third-party lowercase action with bash ${VAR,,}
- add least-privilege permissions and concurrency cancellation
- share one layer-cache scope per branch+platform between the test and
build jobs, with a default-branch read fallback for new branches
- persist the Dockerfile go-build cache mount across runs via
buildkit-cache-dance
- add dependabot for github-actions and docker base-image digests
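The ${VAR,,} replacement for the lowercase action can be sketched as follows. This is an illustration under assumptions rather than the workflow's actual step: the helper name to_lower and the example image reference are made up, and the expansion needs bash 4 or newer.

```shell
#!/usr/bin/env bash
# Sketch: lowercase a string with bash parameter expansion (bash 4+).
# Repository names in image references must be lowercase, which is why a
# mixed-case owner/repo string needs this before it is used as a tag.

to_lower() {
  printf '%s\n' "${1,,}"   # ${var,,} lowercases every character of var
}

# Hypothetical image reference, used only for the demo.
to_lower "ghcr.io/Example-Org/Pacoloco"
```

Inside a workflow step, the same expansion applies to whatever variable holds the image name, so no third-party action is needed for it.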
Done, I've cleaned up the history and squashed the fix commits into the original feature commits. As a side note, GitHub's "Squash and merge" option usually achieves the same result automatically for the target branch, but I've rewritten the commits as requested.
Thank you very much for the contribution, @tenhishadow. Much appreciated! The changes are merged.
What
Two focused commits: the Docker image now runs as a non-root user, and the build workflow gets current action versions plus caching that actually persists between runs. No changes to pacoloco itself.
Docker image (build(docker))
- apk add --no-cache, -trimpath, and a BuildKit go-build cache mount for the test/build stages: incremental rebuilds only recompile what changed (locally: ~58s → ~2s after a one-line source edit).
- dev target with the full toolchain: bind-mount the source and Go caches, get a fast edit-run loop. Usage is documented at the top of the Dockerfile.
- .dockerignore (build context no longer includes .git, docs, etc).

Stage names (test, build, final) and the binary path /pacoloco/pacoloco are unchanged, so the existing workflow and compose files keep working.

⚠ Behaviour change
Anyone bind-mounting a host directory for the cache needs it writable by uid 65532 (chown 65532:65532 /path/to/cache) or has to override the user (--user "$(id -u):$(id -g)", or user: in compose). Named volumes and volume-less runs are unaffected. This also gives a clear answer to the "which user does the container run as?" question that came up in #118.

CI (ci)
- Actions bumped to current majors: checkout@v7, build-push-action@v7, metadata-action@v6, upload/download-artifact@v7/v8, setup-qemu/buildx/login@v4.
- Replaced ASzc/change-string-case-action; lowercasing is one line of bash now.
- Least-privilege permissions: block and concurrency so a newer push cancels superseded runs.
- The Dockerfile go-build cache mount is persisted across CI runs via reproducible-containers/buildkit-cache-dance. Layer cache only helps when nothing changed; this keeps compilation incremental when the source did change, which is especially noticeable for the QEMU-emulated arm builds.
- .github/dependabot.yml covering github-actions and docker (the latter maintains the Dockerfile digest pins).

Testing
- docker build --target test . passes for the release toolchain; image builds for amd64/arm64/arm-v7.
- /metrics responds, a real package database download through the cache works (HTTPS mirror), the prefetch sqlite DB is created on the volume with correct ownership, TZ is honoured.
- hadolint: clean; trivy image: 0 vulnerabilities; actionlint on the workflow: 0 findings.

Related: #118 (runtime user question).
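For the bind-mount behaviour change described above, a small pre-flight check on the host might look like the sketch below. It uses directory ownership as a simple proxy for "writable by uid 65532", assumes GNU coreutils stat (BSD/macOS stat takes different flags), and the helper name cache_dir_advice is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a bind-mounted cache directory needs a chown.
# Assumes GNU stat (-c '%u'); ownership is only a proxy for writability.

# Print "ok" if DIR is owned by UID, otherwise print the chown command to run.
cache_dir_advice() {
  local dir="$1" uid="$2" owner
  owner="$(stat -c '%u' "$dir")" || return 1
  if [ "$owner" = "$uid" ]; then
    echo "ok"
  else
    echo "chown ${uid}:${uid} ${dir}"
  fi
}

# Demo on a throwaway directory; 65532 is the uid named in the description.
demo_dir="$(mktemp -d)"
cache_dir_advice "$demo_dir" 65532
rmdir "$demo_dir"
```

The alternative from the description, overriding the user with --user "$(id -u):$(id -g)", avoids changing ownership on the host at all.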