@victorlin
Member

@victorlin victorlin commented Jan 6, 2026

Description of proposed changes

This PR contains two prep commits and one main commit. Message from main commit:

Instead of using one long-running job to build 6 images (3 stages x 2 platforms) and 3 manifest lists, use 2 platform-specific jobs to build 3 images (one per stage) and a third job to build the manifest lists.
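
As a rough illustration of the counts above, the following sketch enumerates the platform-specific image references that the two build jobs would push. The image names and the `-amd64`/`-arm64` tag suffixes are taken from the merge steps reviewed later in this PR; the function name `list_platform_images` is hypothetical and not part of the repo.

```shell
# Print the 6 platform-specific image references (3 stages x 2 platforms)
# for a given tag. Hypothetical helper for illustration only.
list_platform_images() {
    tag="$1"
    for image in base-builder-build-platform base-builder-target-platform base; do
        for arch in amd64 arm64; do
            echo "ghcr.io/nextstrain/$image:$tag-$arch"
        done
    done
}
```

Calling `list_platform_images "$TAG"` prints one reference per line. The third job then combines each amd64/arm64 pair into one manifest list, giving 3 manifest lists in total.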

This change comes with two functional improvements:

  1. Most significantly, per-runner disk space usage and build times go down. This is due to both parallelizing the build per platform and removing emulation for the linux/arm64 build. The workflow's time bottleneck is now the linux/arm64 test job, which allows for an easy follow-up improvement (see added comment).

  2. As a byproduct of the platform-specific images being tagged, they can now be retrieved from GitHub REST API results, removing the need to fetch from GHCR's undocumented Docker Registry API.
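
For the second point, a minimal sketch of how tagged images might be listed through the GitHub REST API using the GitHub CLI. This assumes the organization-level package versions endpoint and an authenticated `gh`; whether the workflow uses this exact endpoint is an assumption, and both function names are hypothetical.

```shell
# Build the REST API path for a container package's versions in an org.
# Assumes the package name contains no "/" (otherwise it must be URL-encoded).
versions_endpoint() {
    org="$1"
    package="$2"
    echo "/orgs/$org/packages/container/$package/versions"
}

# List every tag attached to every version of a package.
# Requires an authenticated GitHub CLI with permission to read packages.
list_package_tags() {
    gh api --paginate "$(versions_endpoint "$1" "$2")" \
        --jq '.[].metadata.container.tags[]'
}
```

For example, `list_package_tags nextstrain base | grep -- '-arm64$'` would show only the linux/arm64 tags, under the same assumptions.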

Related issue(s)

Checklist

  • Checks pass

  • linux/arm64 image works on M1 Mac

    Tested with zika-tutorial:

    nextstrain build --image nextstrain/base:branch-victorlin-split-build zika-tutorial
    

  • [ ] Update changelog

This avoids emitting lines such as

    [None] booting buildkit
    [None] load build definition from Dockerfile

Preparing to add a matrix to the build job. Variables should only be set
once and can be done outside of the matrix.
@victorlin victorlin self-assigned this Jan 6, 2026
@victorlin
Member Author

Comparing before and after, the build time dropped from 36 minutes to 11 minutes.

@victorlin victorlin force-pushed the victorlin/split-build branch from 9b4087e to 52477eb Compare January 6, 2026 22:27
Contributor

@joverlee521 joverlee521 left a comment

Nice! Can't say I fully understand the details, but does seem like the added bit of complexity is worth it.

Comment on lines +116 to +135
    - name: Create base-builder-build-platform image
      run: |
        docker buildx imagetools create \
          -t "ghcr.io/nextstrain/base-builder-build-platform:$TAG" \
          "ghcr.io/nextstrain/base-builder-build-platform:$TAG-amd64" \
          "ghcr.io/nextstrain/base-builder-build-platform:$TAG-arm64"
    - name: Create base-builder-target-platform image
      run: |
        docker buildx imagetools create \
          -t "ghcr.io/nextstrain/base-builder-target-platform:$TAG" \
          "ghcr.io/nextstrain/base-builder-target-platform:$TAG-amd64" \
          "ghcr.io/nextstrain/base-builder-target-platform:$TAG-arm64"
    - name: Create base image
      run: |
        docker buildx imagetools create \
          -t "ghcr.io/nextstrain/base:$TAG" \
          "ghcr.io/nextstrain/base:$TAG-amd64" \
          "ghcr.io/nextstrain/base:$TAG-arm64"
Contributor

non-blocking

The general pattern in this repo seems to be to run docker buildx commands in /devel/ scripts. Should these steps be wrapped in a separate script?

Member Author

Good point, but I'll keep it as is for a couple of reasons:

  1. The README instructions for building locally using the existing devel scripts still work fine. The split build + merge-builds pattern is an optimization specific to the GitHub Actions workflow.
  2. Since these commands are simple, keeping them inline and split across 3 steps feels more readable, both in the code and on the GitHub run page. If it ever gets more complex, we can move to a devel wrapper script.
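
For reference, if the steps were ever moved into a devel wrapper, a minimal sketch might look like the following. The function names, the `DRY_RUN` variable, and the idea of a single wrapper are all hypothetical; only the `docker buildx imagetools create` invocation and the image names come from the steps under review.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical convention).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create one manifest list per stage from the platform-specific images.
# Assumes the -amd64 and -arm64 images for the tag were already pushed.
merge_builds() {
    tag="$1"
    for image in base-builder-build-platform base-builder-target-platform base; do
        ref="ghcr.io/nextstrain/$image:$tag"
        run docker buildx imagetools create -t "$ref" "$ref-amd64" "$ref-arm64"
    done
}
```

Running `DRY_RUN=1; merge_builds "$TAG"` prints the three commands without contacting the registry. The trade-off noted above still applies: a loop is shorter, but three named steps are easier to follow on the GitHub run page.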

@victorlin
Member Author

Merging, but happy to walk through details in a dev chat session!

@victorlin victorlin merged commit bf92951 into master Jan 7, 2026
65 checks passed
@victorlin victorlin deleted the victorlin/split-build branch January 7, 2026 18:19
Development

Successfully merging this pull request may close these issues.

  • Build exceeds runner storage limits
  • Build time for linux/arm64 image