Skip to content

[WIP, DRAFT] ci: add shared build infrastructure with S3 artifact caching#6693

Closed
vyavdoshenko wants to merge 7 commits into
mainfrom
bobik/update_ci_to_use_shared_artefacts
Closed

[WIP, DRAFT] ci: add shared build infrastructure with S3 artifact caching#6693
vyavdoshenko wants to merge 7 commits into
mainfrom
bobik/update_ci_to_use_shared_artefacts

Conversation

@vyavdoshenko
Copy link
Copy Markdown
Contributor

Add new CI infrastructure that builds on LARGE runners and shares artifacts via S3. Runs alongside existing ci.yml for validation — no changes to current CI.

Changes:

  • .github/actions/ensure-build/ — composite action: check S3 for existing artifact, build if miss, upload
  • .github/actions/fetch-build/ — composite action: download artifact from S3, extract to /build
  • .github/workflows/ci-shared-build.yml — new workflow (2 build configs: x86 Debug + ARM Release, 3 test jobs)
  • .github/actions/regression-tests/ — add optional dragonfly-path input (backward-compatible)

How it works:

  1. Build jobs run on CI-LARGE-86/CI-LARGE-ARM, use cmake -B /build (fixed path for consistent CTestTestfile.cmake)
  2. Artifacts packaged with tar --dereference | zstd and uploaded to S3 (ci-builds/{sha}/{build-id}/)
  3. Test jobs download from S3, extract to /build, run ctest/pytest as usual
  4. Concurrency groups with cancel-in-progress: false queue builds for the same SHA — second build skips if artifact exists

Prerequisites:

  • S3 bucket with lifecycle rule (5-day TTL on ci-builds/ prefix)
  • CI_BUILD_BUCKET secret in repo settings

@vyavdoshenko vyavdoshenko self-assigned this Feb 20, 2026
Copilot AI review requested due to automatic review settings February 20, 2026 17:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new “shared build” CI path that builds once on LARGE runners, stores build artifacts in S3, and reuses them across multiple downstream test jobs (alongside the existing ci.yml).

Changes:

  • Added .github/actions/ensure-build to build-on-miss and upload a compressed /build artifact to S3.
  • Added .github/actions/fetch-build to download/extract the artifact into /build for test jobs.
  • Added .github/workflows/ci-shared-build.yml plus a backward-compatible dragonfly-path override in the regression-tests action.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
.github/workflows/ci-shared-build.yml New workflow wiring build-once + S3 artifact reuse across CTest/PyTest jobs
.github/actions/ensure-build/action.yml Composite action to detect existing S3 artifact, build/package/upload if missing
.github/actions/fetch-build/action.yml Composite action to download/extract the S3 artifact into /build
.github/actions/regression-tests/action.yml Adds dragonfly-path input to run tests against an absolute binary path

Comment on lines +63 to +66
if [ "${{ inputs.ninja-target }}" = "dragonfly" ]; then
# Minimal package: just the dragonfly binary
tar --dereference -cf - dragonfly | zstd -3 -o /tmp/artifact.tar.zst
else
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tar invocation is missing a -- end-of-options delimiter. If an attacker can cause a file in /build to begin with - (e.g. via a crafted target/output name), tar may interpret it as an option (including dangerous ones like checkpoint actions). Use tar ... -- dragonfly (and consider a fixed allowlist/list-file approach) to avoid option-injection.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to +90
# Build the file list
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done

echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tar --dereference -cf - $FILE_LIST expands an unquoted, whitespace-delimited string and also lacks a -- end-of-options delimiter. This is both fragile (breaks if any path contains whitespace/newlines) and can allow option-injection if any packaged path starts with -. Prefer generating a file list and passing it via tar --null -T - --files-from=- (or similar) and include -- to terminate options.

Suggested change
# Build the file list
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done
echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
# Build the file list and package it safely
ITEM_COUNT=0
for f in $BINARIES $CTEST_FILES $RUNFILES; do
ITEM_COUNT=$((ITEM_COUNT + 1))
done
echo "=== Packaging ${ITEM_COUNT} items ==="
for f in $BINARIES $CTEST_FILES $RUNFILES; do
echo "$f"
done | sort
# Use a NUL-delimited file list to avoid word-splitting and option-injection
{
for f in $BINARIES $CTEST_FILES $RUNFILES; do
printf '%s\0' "$f"
done
} | tar --dereference -cf - --null --files-from=- | zstd -3 -o /tmp/artifact.tar.zst

Copilot uses AI. Check for mistakes.
if [ -n "${{ inputs.dragonfly-path }}" ]; then
mv "${{ inputs.dragonfly-path }}" /var/crash/dragonfly_${timestamp}
else
mv "${GITHUB_WORKSPACE}/${{ inputs.build-folder-name }}/dragonfly" /var/crash/dragonfly_${timestamp}
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When dragonfly-path is not provided, this fallback path hard-codes dragonfly and ignores the dfly-executable input. If callers set dfly-executable to a different binary name, this mv will fail and the debugging artifact won’t be preserved. Use ${{ inputs.dfly-executable }} in the fallback path for consistency with the action’s inputs.

Suggested change
mv "${GITHUB_WORKSPACE}/${{ inputs.build-folder-name }}/dragonfly" /var/crash/dragonfly_${timestamp}
mv "${GITHUB_WORKSPACE}/${{ inputs.build-folder-name }}/${{ inputs.dfly-executable }}" /var/crash/dragonfly_${timestamp}

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +10
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says the new shared-build workflow uses concurrency with cancel-in-progress: false to queue builds for the same SHA, but this workflow sets cancel-in-progress: true at the workflow level. If the intent is to preserve/queue in-flight shared builds (especially for re-runs on the same SHA), consider aligning this setting (or adjusting the concurrency group to include ${{ github.sha }} if you only want cancellation across different SHAs).

Copilot uses AI. Check for mistakes.
@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented Feb 20, 2026

🤖 Augment PR Summary

Summary: Adds an alternative CI workflow that builds once on large runners and reuses those build outputs across multiple test jobs via S3 caching.

Changes:

  • Introduces .github/actions/ensure-build composite action to check S3 for an artifact keyed by SHA + build-id, build with CMake/Ninja on cache miss, and upload a tar.zst artifact.
  • Introduces .github/actions/fetch-build composite action to download and extract the cached artifact into a fixed /build directory.
  • Extends .github/actions/regression-tests with an optional dragonfly-path input to run PyTests against a prebuilt binary (keeping existing behavior as fallback).
  • Adds .github/workflows/ci-shared-build.yml to run an x86 Debug full build + CTest/regression jobs and an ARM Release minimal build + large regression job, in parallel to the existing CI.

Technical Notes: Uses GitHub OIDC (configure-aws-credentials) to assume an AWS role, stores artifacts under ci-builds/{sha}/{build-id}/, and uses per-SHA concurrency groups to avoid duplicate builds.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

if [ -n "${{ inputs.dragonfly-path }}" ]; then
mv "${{ inputs.dragonfly-path }}" /var/crash/dragonfly_${timestamp}
else
mv "${GITHUB_WORKSPACE}/${{ inputs.build-folder-name }}/dragonfly" /var/crash/dragonfly_${timestamp}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy binary on a self hosted runner hard-codes /dragonfly instead of using the dfly-executable input, so it may fail to collect the crashing binary when a non-default executable name is used.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.


echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using tar --dereference here can end up archiving symlink targets outside the build tree (if the build generates such links), which could unintentionally upload host/workspace files into the S3 artifact.

Severity: medium

Other Locations
  • .github/actions/ensure-build/action.yml:65

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

run: |
mkdir -p /build
cd /build
zstd -d /tmp/artifact.tar.zst --stdout | tar xf -
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracting a tarball from S3 with tar xf - has the usual risk of path traversal/overwrites if the artifact contents are ever corrupted or attacker-controlled; do we want a guardrail to ensure extraction stays within /build?

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

@romange
Copy link
Copy Markdown
Collaborator

romange commented Feb 22, 2026

Note that using s3 for caching will increase our egress costs on AWS. Also, not sure how much time you will save there.

@vyavdoshenko
Copy link
Copy Markdown
Contributor Author

Note that using s3 for caching will increase our egress costs on AWS. Also, not sure how much time you will save there.

How much time we can save, we will measure when we have a couple of first builds. Let's try it out.
We have to find a place to store the artifacts. If you think that S3 is not a good place, please give me another direction to consider. It was the first idea for me to reuse AWS S3 for this task as well.

Copilot AI review requested due to automatic review settings February 23, 2026 11:53
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

export AWS_MAX_ATTEMPTS=3
S3_KEY="ci-builds/${GITHUB_SHA}/${{ inputs.build-id }}/artifact.tar.zst"
echo "Downloading s3://${{ inputs.s3-bucket }}/${S3_KEY}"
aws s3 cp --no-progress "s3://${{ inputs.s3-bucket }}/${S3_KEY}" /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the S3 download fails (network issue, incorrect bucket, missing artifact), the script will continue to the extraction step which will fail with a confusing error. Consider adding explicit error handling after the aws s3 cp command to check if the download succeeded and provide a clear error message if the artifact is missing or the download failed.

Suggested change
aws s3 cp --no-progress "s3://${{ inputs.s3-bucket }}/${S3_KEY}" /tmp/artifact.tar.zst
if ! aws s3 cp --no-progress "s3://${{ inputs.s3-bucket }}/${S3_KEY}" /tmp/artifact.tar.zst; then
echo "ERROR: Failed to download artifact from s3://${{ inputs.s3-bucket }}/${S3_KEY}. Check the bucket name, key, and AWS credentials." >&2
exit 1
fi
if [ ! -s /tmp/artifact.tar.zst ]; then
echo "ERROR: Artifact file /tmp/artifact.tar.zst is missing or empty after download." >&2
exit 1
fi

Copilot uses AI. Check for mistakes.
Comment on lines +87 to +95
# Build the file list
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done

echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variables BINARIES, CTEST_FILES, and RUNFILES are not quoted when used in the for loop and tar command. If any filenames contain spaces or special characters, this will cause the script to fail or behave incorrectly. Wrap variable expansions in quotes to handle filenames with spaces safely.

Suggested change
# Build the file list
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done
echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
# Build the file list as newline-separated entries to safely handle spaces
FILE_LIST="/tmp/artifact_file_list.txt"
: > "$FILE_LIST"
# Always include the main binary
printf '%s\n' "$BINARIES" >> "$FILE_LIST"
# Append test binaries, CTest files, and runfiles if present
if [ -n "$TEST_BINS" ]; then
printf '%s\n' "$TEST_BINS" >> "$FILE_LIST"
fi
if [ -n "$CTEST_FILES" ]; then
printf '%s\n' "$CTEST_FILES" >> "$FILE_LIST"
fi
if [ -n "$RUNFILES" ]; then
printf '%s\n' "$RUNFILES" >> "$FILE_LIST"
fi
ITEM_COUNT=$(wc -l < "$FILE_LIST")
echo "=== Packaging $ITEM_COUNT items ==="
sort "$FILE_LIST"
tar --dereference -cf - --files-from="$FILE_LIST" | zstd -3 -o /tmp/artifact.tar.zst

Copilot uses AI. Check for mistakes.
s3-bucket: ${{ secrets.CI_BUILD_BUCKET }}
- uses: ./.github/actions/regression-tests
with:
dfly-executable: dragonfly
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both dfly-executable and dragonfly-path inputs are specified, but dragonfly-path overrides the dfly-executable + build-folder-name combination (per lines 111-115 in regression-tests/action.yml). The dfly-executable input on line 234 is redundant and confusing. Consider removing it when dragonfly-path is provided, or add a comment explaining that dragonfly-path takes precedence.

Suggested change
dfly-executable: dragonfly

Copilot uses AI. Check for mistakes.
Comment on lines +31 to +39
S3_KEY="ci-builds/${GITHUB_SHA}/${{ inputs.build-id }}/artifact.tar.zst"
echo "s3_key=${S3_KEY}" >> $GITHUB_OUTPUT

if aws s3api head-object --bucket "${{ inputs.s3-bucket }}" --key "${S3_KEY}" > /dev/null 2>&1; then
echo "found=true" >> $GITHUB_OUTPUT
echo "::notice::Artifact found for ${GITHUB_SHA}/${{ inputs.build-id }}, skipping build"
else
echo "found=false" >> $GITHUB_OUTPUT
echo "::notice::No artifact for ${GITHUB_SHA}/${{ inputs.build-id }}, building from source"
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build-id input is used directly in the S3 key path construction without validation. If build-id contains path traversal characters like '../', it could lead to artifacts being written to unexpected S3 locations. While the current workflow only passes hardcoded values like 'ubuntu20-gcc14-Debug', consider adding validation to ensure build-id contains only safe characters (alphanumeric, hyphens, underscores) to prevent potential misuse if this action is reused elsewhere.

Suggested change
S3_KEY="ci-builds/${GITHUB_SHA}/${{ inputs.build-id }}/artifact.tar.zst"
echo "s3_key=${S3_KEY}" >> $GITHUB_OUTPUT
if aws s3api head-object --bucket "${{ inputs.s3-bucket }}" --key "${S3_KEY}" > /dev/null 2>&1; then
echo "found=true" >> $GITHUB_OUTPUT
echo "::notice::Artifact found for ${GITHUB_SHA}/${{ inputs.build-id }}, skipping build"
else
echo "found=false" >> $GITHUB_OUTPUT
echo "::notice::No artifact for ${GITHUB_SHA}/${{ inputs.build-id }}, building from source"
BUILD_ID="${{ inputs.build-id }}"
if [[ ! "$BUILD_ID" =~ ^[A-Za-z0-9_-]+$ ]]; then
echo "::error::Invalid build-id '$BUILD_ID'. Allowed characters are: letters, digits, hyphen (-), underscore (_)."
exit 1
fi
S3_KEY="ci-builds/${GITHUB_SHA}/${BUILD_ID}/artifact.tar.zst"
echo "s3_key=${S3_KEY}" >> $GITHUB_OUTPUT
if aws s3api head-object --bucket "${{ inputs.s3-bucket }}" --key "${S3_KEY}" > /dev/null 2>&1; then
echo "found=true" >> $GITHUB_OUTPUT
echo "::notice::Artifact found for ${GITHUB_SHA}/${BUILD_ID}, skipping build"
else
echo "found=false" >> $GITHUB_OUTPUT
echo "::notice::No artifact for ${GITHUB_SHA}/${BUILD_ID}, building from source"

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +2
name: Ensure Build
description: >
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions that the S3 bucket requires a lifecycle rule with 5-day TTL on the ci-builds/ prefix, but this prerequisite is not documented in the workflow file or action. Consider adding a comment at the top of this file documenting the required S3 bucket configuration, including the lifecycle policy, so that future maintainers or anyone setting up this infrastructure in another repository knows what's required.

Copilot uses AI. Check for mistakes.
find /build -maxdepth 1 -name '*_test' -type f -exec chmod +x {} \;

echo "=== Artifact extracted to /build ==="
ls -lh /build/dragonfly
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dragonfly binary does not exist after extraction, the chmod command will fail silently with '2>/dev/null || true', but ls -lh will then fail with a non-zero exit code and terminate the action. Consider checking for the file's existence before attempting to list it, or handle the error case explicitly.

Suggested change
ls -lh /build/dragonfly
if [ -f /build/dragonfly ]; then
ls -lh /build/dragonfly
else
echo "dragonfly binary not found at /build/dragonfly"
fi

Copilot uses AI. Check for mistakes.
Comment on lines +194 to +195
dfly-executable: dragonfly
dragonfly-path: /build/dragonfly
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both dfly-executable and dragonfly-path inputs are specified, but dragonfly-path overrides the dfly-executable + build-folder-name combination (per lines 94-98 in regression-tests/action.yml). The dfly-executable input on line 194 is redundant and confusing. Consider removing it when dragonfly-path is provided, or add a comment explaining that dragonfly-path takes precedence.

Copilot uses AI. Check for mistakes.
run-only-on-ubuntu-latest: true
build-folder-name: build
filter: "(not slow) and (not opt_only)"
s3-bucket: ""
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The s3-bucket input is set to an empty string, which disables S3 snapshot tests in the regression-tests action (see line 88 condition 'if: inputs.s3-bucket != ""'). However, the test-large-arm job on line 240 uses the S3_REGTEST_BUCKET secret. For consistency, either both regression test jobs should use the same S3 bucket configuration, or there should be a comment explaining why test-regression skips S3 tests while test-large-arm includes them.

Suggested change
s3-bucket: ""
s3-bucket: ${{ secrets.S3_REGTEST_BUCKET }}

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 23, 2026 12:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.

Comment on lines +41 to +50
- uses: ./.github/actions/ensure-build
with:
build-id: ubuntu20-gcc14-Debug
ninja-target: src/all
cmake-args: >-
-DCMAKE_BUILD_TYPE=Debug -GNinja
-DWITH_AWS=OFF -DWITH_GCP=OFF -DWITH_UNWIND=OFF -DWITH_GPERF=OFF
-DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++
-DCMAKE_CXX_FLAGS="-Werror -no-pie"
s3-bucket: ${{ secrets.CI_BUILD_BUCKET }}
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The S3 bucket name is provided via secrets.CI_BUILD_BUCKET at lines 50, 84, 110, 191, 231, 240, but there's no validation that this secret is set before attempting S3 operations. If the secret is not configured, the workflow will fail with unclear error messages during the S3 operations.

Consider adding a validation step early in the build jobs:

- name: Validate S3 configuration
  shell: bash
  run: |
    if [ -z "${{ secrets.CI_BUILD_BUCKET }}" ]; then
      echo "::error::CI_BUILD_BUCKET secret is not configured"
      exit 1
    fi

This would make troubleshooting easier for repositories that fork this workflow.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,246 @@
name: ci-shared-build
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow is named "ci-shared-build" at line 1, but the file is named ci-shared-build.yml. While the workflow name at lines 162, 204, 245 correctly references "shared-build-*", there's inconsistency in terminology.

The PR title uses "shared build infrastructure", the workflow name is "ci-shared-build", and artifact names use "shared-build-*-logs". Consider standardizing on one term:

  • Either "ci-shared-build" everywhere (workflow file name already matches)
  • Or "shared-build" everywhere (artifact names already match)

This is a minor naming inconsistency but could cause confusion when searching logs or artifacts.

Suggested change
name: ci-shared-build
name: shared-build

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +23
- name: Install dependencies
shell: bash
run: |
apt-get update -qq && apt-get install -y -qq zstd > /dev/null
pip3 install -q awscli 2>/dev/null || pip3 install -q --break-system-packages awscli
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The apt-get update and apt-get install commands at lines 22, 30 redirect output to suppress noise, but don't check for failures. If the package installation fails (e.g., network issues, repository problems), subsequent steps will fail with cryptic "command not found" errors for zstd or aws.

Consider checking exit codes:

apt-get update -qq && apt-get install -y -qq zstd > /dev/null || {
  echo "::error::Failed to install dependencies"
  exit 1
}

This is particularly important for zstd which is critical for artifact extraction.

Copilot uses AI. Check for mistakes.
Comment on lines +76 to +77
# Collect test binaries (top-level *_test files)
TEST_BINS=$(find . -maxdepth 1 -name '*_test' -type f | sed 's|^\./||')
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The packaging logic at lines 76-80 uses find . -maxdepth 1 -name '*_test' -type f to collect test binaries, but doesn't check if the files are actually executable. If CMake produces a non-executable file matching *_test (e.g., a data file or script), it would be included but not made executable later.

While the extract step at lines 43-44 in fetch-build/action.yml does chmod +x on these files, it would be more robust to filter by executable bit during collection:

TEST_BINS=$(find . -maxdepth 1 -name '*_test' -type f -executable | sed 's|^\./||')

This ensures only executable test binaries are packaged.

Suggested change
# Collect test binaries (top-level *_test files)
TEST_BINS=$(find . -maxdepth 1 -name '*_test' -type f | sed 's|^\./||')
# Collect test binaries (top-level executable *_test files)
TEST_BINS=$(find . -maxdepth 1 -name '*_test' -type f -executable | sed 's|^\./||')

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +32
concurrency:
group: ensure-build-${{ github.sha }}-ubuntu20-gcc14-Debug
cancel-in-progress: false
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The job-level concurrency group with cancel-in-progress: false at lines 30-32 creates a queue for builds with the same SHA+build-id, while the workflow-level concurrency at lines 8-10 has cancel-in-progress: true. This creates a potential race condition:

  1. PR updated → workflow-level cancels previous run
  2. New run starts, but job-level concurrency queues it
  3. If the cancelled run's build job is still uploading to S3, the check at line 40 might see found=false, start a duplicate build, then later fail/corrupt when both try to upload

Consider either:

  • Using cancel-in-progress: true at job level (wastes compute but simpler)
  • Adding a lock/coordination mechanism in S3 (e.g., upload to temp key, then atomic rename)
  • Document that this is acceptable risk for a WIP/DRAFT workflow

Copilot uses AI. Check for mistakes.
Comment on lines +63 to +96
- name: Package artifact
if: steps.check-s3.outputs.found == 'false'
shell: bash
run: |
cd /build

if [ "${{ inputs.ninja-target }}" = "dragonfly" ]; then
# Minimal package: just the dragonfly binary
tar --dereference -cf - dragonfly | zstd -3 -o /tmp/artifact.tar.zst
else
# Full package: dragonfly + test binaries + CTestTestfile tree + runfiles
BINARIES="dragonfly"

# Collect test binaries (top-level *_test files)
TEST_BINS=$(find . -maxdepth 1 -name '*_test' -type f | sed 's|^\./||')
if [ -n "$TEST_BINS" ]; then
BINARIES="$BINARIES $TEST_BINS"
fi

# Collect CTestTestfile.cmake files (needed for ctest -L DFLY)
CTEST_FILES=$(find . -name 'CTestTestfile.cmake' | sed 's|^\./||')

# Collect runfiles directories (test data symlinks resolved by --dereference)
RUNFILES=$(find . -name '*.runfiles' -type d | sed 's|^\./||')

# Build the file list
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done

echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The artifact packaging uses zstd -3 at lines 71, 96, which is a low compression level. While this is faster than higher levels, it results in larger artifacts and longer S3 upload/download times.

For CI artifacts that are downloaded multiple times (one upload, multiple downloads by test jobs), using a higher compression level like zstd -10 could reduce total network transfer time despite slower compression. The trade-off depends on artifact size and network speed.

Current approach prioritizes build speed over network efficiency. For a DRAFT/WIP workflow, this is acceptable. Consider benchmarking with higher compression levels in production to optimize total CI time.

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +39
dragonfly-path:
required: false
type: string
description: "Absolute path to dragonfly binary. If set, overrides build-folder-name/dfly-executable."
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new dragonfly-path input at lines 36-39 is marked as required: false with type: string, but the implementation at lines 94-98, 111-115, 202-206 checks if [ -n "${{ inputs.dragonfly-path }}" ] which will always be true even if the input is an empty string.

In GitHub Actions, when an optional input is not provided, it evaluates to an empty string, not null. The current check works correctly, but it's worth noting that explicitly passing an empty string (dragonfly-path: "") would be treated differently than omitting the parameter.

This is not a bug in practice (users won't explicitly pass empty strings), but for robustness, consider documenting that the parameter should be omitted rather than set to empty string when not needed.

Copilot uses AI. Check for mistakes.
Comment on lines +89 to +96
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done

echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tar command uses unquoted variable expansion for $FILE_LIST, which could fail if any file names contain spaces or special characters. Although CTestTestfile.cmake and .runfiles paths are unlikely to have spaces, this is a maintainability risk.

Consider using an array and proper quoting:

FILE_ARRAY=()
for f in $BINARIES $CTEST_FILES $RUNFILES; do
  FILE_ARRAY+=("$f")
done
tar --dereference -cf - "${FILE_ARRAY[@]}" | zstd -3 -o /tmp/artifact.tar.zst
Suggested change
FILE_LIST=""
for f in $BINARIES $CTEST_FILES $RUNFILES; do
FILE_LIST="$FILE_LIST $f"
done
echo "=== Packaging $(echo $FILE_LIST | wc -w) items ==="
echo "$FILE_LIST" | tr ' ' '\n' | sort
tar --dereference -cf - $FILE_LIST | zstd -3 -o /tmp/artifact.tar.zst
FILE_ARRAY=()
for f in $BINARIES $CTEST_FILES $RUNFILES; do
if [ -n "$f" ]; then
FILE_ARRAY+=("$f")
fi
done
echo "=== Packaging ${#FILE_ARRAY[@]} items ==="
printf '%s\n' "${FILE_ARRAY[@]}" | sort
tar --dereference -cf - "${FILE_ARRAY[@]}" | zstd -3 -o /tmp/artifact.tar.zst

Copilot uses AI. Check for mistakes.
Comment on lines +25 to +32
- name: Download artifact from S3
shell: bash
run: |
export AWS_MAX_ATTEMPTS=3
S3_KEY="ci-builds/${GITHUB_SHA}/${{ inputs.build-id }}/artifact.tar.zst"
echo "Downloading s3://${{ inputs.s3-bucket }}/${S3_KEY}"
aws s3 cp --no-progress "s3://${{ inputs.s3-bucket }}/${S3_KEY}" /tmp/artifact.tar.zst
ls -lh /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The S3 download step has no error handling if the artifact doesn't exist. While this might be intentional (expecting the build job to have completed successfully), it will produce a cryptic error message if the artifact is missing due to build job failure or race condition.

Consider adding explicit error handling:

if ! aws s3 cp --no-progress "s3://${{ inputs.s3-bucket }}/${S3_KEY}" /tmp/artifact.tar.zst; then
  echo "::error::Failed to download artifact from S3. Build may have failed or artifact is missing."
  exit 1
fi

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +40
- name: Extract artifact
shell: bash
run: |
mkdir -p /build
cd /build
zstd -d /tmp/artifact.tar.zst --stdout | tar xf -
rm -f /tmp/artifact.tar.zst
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tar extraction uses tar xf - without verification flags. If the downloaded artifact is corrupted or incomplete (network issue, S3 issue), this could silently extract a partial/corrupt build leading to confusing test failures.

Consider adding verification:

if ! zstd -d /tmp/artifact.tar.zst --stdout | tar xf - --warning=failed-read; then
  echo "::error::Failed to extract artifact. The archive may be corrupted."
  exit 1
fi

Copilot uses AI. Check for mistakes.
@vyavdoshenko vyavdoshenko deleted the bobik/update_ci_to_use_shared_artefacts branch April 16, 2026 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants