[CI Failure Doctor] CI Failure Investigation - Run #34708 #14811

@github-actions

🏥 CI Failure Investigation - Run #34708

Summary

Integration: Workflow Cache finished running its integration suite, but actions/upload-artifact failed with a 403 while finalizing the generated artifact. As a result, the downstream canary_go coverage job could not download the cache results and reported the cache tests as missing.

Failure Details

Root Cause Analysis

Integration: Workflow Cache writes test-result-integration-Workflow Cache.json and immediately uploads it, but Azure returned "Failed to FinalizeArtifact: Received non-retryable error: Failed request: (403) Forbidden" while actions/upload-artifact@... was finalizing the blob (log around 2026-02-10T16:06:33.6820072Z). Because the artifact never finished uploading, the canary_go job had no JSON file to read for that test group, so compare-test-coverage.sh treated the four cache tests as never executed.

Failed Jobs and Errors

  • Integration: Workflow Cache - the actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 step (finalizing test-result-integration-Workflow Cache) failed with 403 Forbidden.
  • canary_go - ./scripts/compare-test-coverage.sh all-tests.txt executed-tests.txt errored with "❌ FAILURE: Found 4 tests that are NOT being executed in CI", listing TestCacheMemoryMultipleIntegration, TestCacheMemoryRestoreOnly, TestCacheMemoryWithThreatDetection, and TestCacheSupport. Those names belong to the Workflow Cache group, whose artifact never reached the coverage job.

Investigation Findings

  1. The cache tests themselves succeeded—go test produced JSON output—but artifact finalization aborted with 403, so the JSON file was never stored.
  2. canary_go depends on each integration job’s artifact to know which tests ran. A missing artifact is interpreted as missing tests, so the coverage comparison script fails and reports the four cache tests as not executed as soon as it cannot find the Workflow Cache artifact.
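
The mechanism in finding 2 can be illustrated with a small sketch. The internals of compare-test-coverage.sh are not shown in this report, so the set-difference logic below is an assumption about how such a comparison typically works, and the function name find_unexecuted, the group name "Example Other Group", and the test name TestExampleOtherGroup are hypothetical.

```python
def find_unexecuted(all_tests, executed_by_group):
    """Return tests that appear in the full list but in no group's results.

    Assumed model of the comparison: each integration group contributes the
    names of the tests it ran; a group whose artifact is missing contributes
    nothing, so its tests look unexecuted.
    """
    executed = set()
    for names in executed_by_group.values():
        executed.update(names)
    return sorted(set(all_tests) - executed)


# The four cache tests named in the failure, plus one hypothetical test
# from another group whose artifact did arrive.
all_tests = [
    "TestCacheMemoryMultipleIntegration",
    "TestCacheMemoryRestoreOnly",
    "TestCacheMemoryWithThreatDetection",
    "TestCacheSupport",
    "TestExampleOtherGroup",  # hypothetical
]

# The "Workflow Cache" group is absent because its artifact never uploaded.
executed_by_group = {
    "Example Other Group": ["TestExampleOtherGroup"],  # hypothetical group
}

missing = find_unexecuted(all_tests, executed_by_group)
for name in missing:
    print(name)
```

Under this model the four cache tests are reported even though they ran and passed, which matches the symptom described above.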

Recommended Actions

  • Re-run the workflow so that the artifact upload can finish successfully and the coverage job can consume test-result-integration-Workflow Cache. If the 403 was transient, everything will pass.
  • If the 403 errors recur, capture the failing upload logs and consider retrying the upload or adopting a more resilient artifact upload strategy (e.g., a retry wrapper, or a custom upload step that logs enough detail to troubleshoot 403s).
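
As a rough illustration of the retry wrapper idea, here is a minimal sketch with exponential backoff. It does not call actions/upload-artifact or any real upload API: UploadError, upload_with_retry, and the upload callable are hypothetical names introduced for this example. Note that the log labels the 403 as non-retryable from the action's point of view, so whether a step-level retry would actually help is itself an assumption.

```python
import time


class UploadError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed upload."""

    def __init__(self, status):
        super().__init__(f"upload failed with HTTP {status}")
        self.status = status


def upload_with_retry(upload, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call upload() until it succeeds or the attempts are exhausted.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Re-raises the last UploadError if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return upload()
        except UploadError as err:
            last_error = err
            if attempt < attempts:
                # Exponential backoff: 1x, 2x, 4x ... the base delay.
                sleep(base_delay * 2 ** (attempt - 1))
    raise last_error
```

The sleep parameter is injectable so the wrapper can be exercised without real waiting; in a workflow, the upload callable would wrap whatever command performs the upload.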

Prevention Strategies

  • Detect missing artifacts before running compare-test-coverage.sh and either skip the comparison or surface a clear error so the root cause (missing artifact) is obvious instead of just missing-test names.
  • Add retries around upload-artifact steps that feed downstream collectors so transient storage/auth errors don’t cascade into coverage failures.
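
The first prevention strategy could look something like the pre-check below, run before compare-test-coverage.sh. This is a sketch under assumptions: the file name pattern test-result-integration-<group>.json follows the name quoted in the root cause analysis, but the flat download directory layout, the function name missing_artifacts, and the list of expected groups are hypothetical.

```python
from pathlib import Path


def missing_artifacts(artifact_dir, expected_groups):
    """Return the groups whose result file is absent from artifact_dir.

    Assumes each group's results are downloaded as
    test-result-integration-<group>.json directly under artifact_dir.
    """
    root = Path(artifact_dir)
    return [
        group
        for group in expected_groups
        if not (root / f"test-result-integration-{group}.json").is_file()
    ]


def check_or_explain(artifact_dir, expected_groups):
    """Build a clear error message naming missing artifacts, or return None."""
    missing = missing_artifacts(artifact_dir, expected_groups)
    if not missing:
        return None
    return (
        "Missing test-result artifacts for: "
        + ", ".join(missing)
        + ". The upload likely failed; coverage comparison was not run."
    )
```

If check_or_explain returns a message, the coverage job can print it and stop, so the report points at the failed upload rather than at a list of supposedly unexecuted tests.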

AI Team Self-Improvement

When you wire up coverage/metric jobs, explicitly verify that every actions/upload-artifact step succeeded before letting downstream jobs consume its data; treat 403 FinalizeArtifact responses as fatal and re-run the upload or the workflow.

Historical Context

No open [CI Failure Doctor] issue currently matches run #34708. The only open issue with that label (#14809) describes a different timeout during the frontmatter hash test, so this 403 artifact upload + coverage failure is a new occurrence.

AI generated by CI Failure Doctor

To add this workflow in your repository, run gh aw add githubnext/agentics/workflows/ci-doctor.md@ea350161ad5dcc9624cf510f134c6a9e39a6f94d. See usage guide.

  • expires on Feb 11, 2026, 4:19 PM UTC
