
[Self-Heal] Add self-scheduling auto-repair workflow#39

Draft
badMade wants to merge 1 commit into main from add-self-healing-pipeline-465456819745700463

@badMade badMade commented May 17, 2026

Owner

This PR introduces a self-healing automation pipeline that repairs the codebase (formatting issues, outdated test snapshots), runs healthchecks to verify the fixes, and opens auto-generated Pull Requests.

Key Features:

  • Telemetry-Based Scheduling: scripts/compute_schedule.mjs analyzes recent commit frequency to dynamically scale the pipeline schedule (e.g., from weekly to daily runs) by modifying .github/self-heal-schedule.yml and .github/workflows/self-heal.yml.
  • Reactive & Manual Triggers: Integrated workflow_run triggers for failed ci workflows, alongside workflow_dispatch for manual control.
  • Idempotent Repair Pipeline: scripts/self_heal.mjs applies standard repair steps (npm ci, npx prettier -w ., npx vitest run -u), runs a healthcheck after each step to find the minimum fix needed, and exits 0 only if the healthcheck passes and the repair produced a diff.
  • Guardrails: Excludes selfheal-* branches to avoid loops, prevents opening duplicate PRs, automatically closes stale self-heal PRs, explicitly checks for meaningful diffs, limits which files can be staged, and runs a security scan to prevent accidentally staging keys or secrets.
  • Documentation added via SELF_HEAL_SETUP.md.
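The "minimum fix" behaviour described in the Idempotent Repair Pipeline bullet could look roughly like the sketch below. This is not the code from scripts/self_heal.mjs: `findMinimumFix`, `runStep` and `isHealthy` are hypothetical names, and the up-front health check before any step runs is an assumption. Only the three step commands come from the description above.

```javascript
// Hypothetical sketch of a "minimum fix" repair loop.
// The step commands are taken from the PR description; everything else is assumed.
const REPAIR_STEPS = ["npm ci", "npx prettier -w .", "npx vitest run -u"];

// runStep(step) applies one repair step; isHealthy() returns true when the
// healthcheck passes. Both are injected so the loop itself stays side-effect free.
function findMinimumFix(steps, runStep, isHealthy) {
  const applied = [];
  // Assumption: if the repo is already healthy, no step is applied at all.
  if (isHealthy()) return { healthy: true, applied };
  for (const step of steps) {
    runStep(step);
    applied.push(step);
    // Stop at the first step that restores health, so later steps never run.
    if (isHealthy()) return { healthy: true, applied };
  }
  return { healthy: false, applied };
}

// Usage with stand-ins: health is restored as soon as prettier has run,
// so the snapshot update step is never applied.
const ran = [];
const result = findMinimumFix(
  REPAIR_STEPS,
  (step) => ran.push(step),
  () => ran.includes("npx prettier -w ."),
);
console.log(result.healthy, result.applied);
```

Injecting the step runner and the healthcheck keeps the loop testable without touching the working tree; a real script would presumably pass functions that shell out to the commands.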

PR created automatically by Jules for task 465456819745700463 started by @badMade

Added self-healing functionality for formatting, snapshot, and configuration drift. Includes automated scheduling based on telemetry and reactive run configurations on CI failures.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
@google-labs-jules


👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an automated self-healing CI pipeline designed to maintain codebase health by automatically fixing formatting issues and updating test snapshots. Key additions include scripts for health checks, automated repairs, and a dynamic scheduling system that adjusts based on commit activity. Review feedback highlighted several areas for improvement: handling shallow clones in the telemetry logic to ensure accurate scheduling, fixing a logging bug in the health check script where output was suppressed during failures, and adjusting exit codes in the self-heal script to prevent healthy builds from being marked as failures in CI.

Comment on lines +32 to +33
// 1. Telemetry: PR/Commit frequency over the last 14 days
const commits = getGitOutput(`git log --since="14 days ago" --format="%aI"`);


medium

This telemetry logic relies on git log to determine commit frequency. By default, many CI environments (like GitHub Actions) perform a shallow clone with a fetch-depth of 1. In such cases, git log will only return the most recent commit, causing the script to always compute a 'Dormant' or 'Low-churn' schedule. It is recommended to check for a shallow clone and warn the user, or ensure the workflow is configured with fetch-depth: 0 in the checkout step.

  // 1. Telemetry: PR/Commit frequency over the last 14 days
  const isShallow = getGitOutput("git rev-parse --is-shallow-repository")[0] === "true";
  if (isShallow) {
    console.warn("Warning: Shallow clone detected. Telemetry may be inaccurate. Ensure fetch-depth is set to 0 in your checkout step.");
  }
  const commits = getGitOutput(`git log --since="14 days ago" --format="%aI"`);
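To make the failure mode concrete, here is a minimal sketch of how a commit count over the 14-day window might map to a schedule, and why a shallow clone skews it. The function name `pickSchedule`, the thresholds, the cron strings, and the "Active" and "Unknown" tiers are all assumptions for illustration; only the 14-day window and the "Dormant" and "Low-churn" names appear in this thread.

```javascript
// Hypothetical tier selection; thresholds and cron expressions are assumed,
// not taken from scripts/compute_schedule.mjs.
const WINDOW_DAYS = 14;

// commitDates: ISO 8601 strings, e.g. the lines printed by `git log --format="%aI"`.
function pickSchedule(commitDates, { isShallow = false, now = new Date() } = {}) {
  if (isShallow) {
    // History is truncated, so the count says nothing about real activity.
    return { tier: "Unknown", cron: null, commits: null };
  }
  const cutoff = now.getTime() - WINDOW_DAYS * 24 * 60 * 60 * 1000;
  const commits = commitDates.filter((iso) => Date.parse(iso) >= cutoff).length;
  if (commits === 0) return { tier: "Dormant", cron: "0 6 1 * *", commits }; // monthly
  if (commits < 10) return { tier: "Low-churn", cron: "0 6 * * 1", commits }; // weekly
  return { tier: "Active", cron: "0 6 * * *", commits }; // daily
}

// With fetch-depth 1 only the newest commit is visible, so an unguarded count
// can never exceed 1 no matter how busy the repository really is.
const now = new Date("2026-05-17T00:00:00Z");
console.log(pickSchedule(["2026-05-16T10:00:00+00:00"], { now }).tier);
console.log(pickSchedule(["2026-05-16T10:00:00+00:00"], { now, isShallow: true }).tier);
```

Returning an explicit "Unknown" tier instead of a warning is one possible design choice: it lets the caller keep the previous schedule rather than downgrade it on bad data.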

Comment thread scripts/healthcheck.mjs
Comment on lines +14 to +19
execSync(command, { stdio: "ignore" });
return true;
} catch (error) {
console.error(`Command failed: ${command}`);
if (error.stdout) console.error(error.stdout.toString());
if (error.stderr) console.error(error.stderr.toString());


medium

When stdio is set to "ignore" in execSync, the stdout and stderr properties on the resulting error object will be null. This makes the logging on lines 18-19 ineffective, as no output will be captured to display when a command fails. To fix this while keeping the output silent on success, you can use stdio: "pipe".

Suggested change
- execSync(command, { stdio: "ignore" });
+ execSync(command, { stdio: "pipe" });
  return true;
  } catch (error) {
  console.error(`Command failed: ${command}`);
  if (error.stdout) console.error(error.stdout.toString());
  if (error.stderr) console.error(error.stderr.toString());
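As a small illustration of the difference, the sketch below formats a failure message from an error object shaped like the one execSync throws. `describeFailure` is a hypothetical helper, not part of scripts/healthcheck.mjs, and the error objects in the usage lines are hand-built stand-ins rather than real execSync errors.

```javascript
// Hypothetical helper: builds the log text for a failed command.
// With stdio: "pipe" the error carries captured stdout/stderr;
// with stdio: "ignore" both properties are null and nothing can be shown.
function describeFailure(command, error) {
  const lines = [`Command failed: ${command}`];
  for (const stream of ["stdout", "stderr"]) {
    const text = error[stream] ? error[stream].toString().trim() : "";
    if (text) lines.push(`${stream}: ${text}`);
  }
  if (lines.length === 1) {
    // Makes the silent case visible instead of printing nothing.
    lines.push('(no output captured; was stdio set to "ignore"?)');
  }
  return lines.join("\n");
}

// Stand-in for an error thrown with stdio: "pipe".
console.error(describeFailure("npx vitest run", { stdout: null, stderr: Buffer.from("1 test failed\n") }));
// Stand-in for an error thrown with stdio: "ignore".
console.error(describeFailure("npx vitest run", { stdout: null, stderr: null }));
```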

Comment thread scripts/self_heal.mjs
Comment on lines +94 to +99
if (healthPassed && diffExists) {
process.exit(0);
} else {
console.error("Pipeline finished but healthcheck failed, or no diff found. Exiting 1.");
process.exit(1);
}


medium

The script exits with code 1 if the codebase is already healthy and no changes were made (!diffExists). In a CI environment, this will mark the workflow run as failed, which can lead to unnecessary notifications for maintainers when the repository is in a good state. It is generally better to exit with code 0 if the healthcheck passes, regardless of whether changes were applied, and use a different mechanism (like a step output or checking for a diff in the workflow) to decide if a Pull Request should be created.

  if (healthPassed) {
    if (diffExists) {
      console.log("Repair successful! Changes applied.");
    } else {
      console.log("Codebase is already healthy. No changes needed.");
    }
    process.exit(0);
  } else {
    console.error("Pipeline finished but healthcheck failed. Exiting 1.");
    process.exit(1);
  }
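One way to separate "did the run succeed" from "should a Pull Request be opened" is sketched below. `decideOutcome`, `toStepOutput` and the output name `changes` are hypothetical; the only external convention relied on is that GitHub Actions reads step outputs as `name=value` lines appended to the file named by `$GITHUB_OUTPUT`.

```javascript
// Hypothetical split of the two decisions that the original condition merged.
function decideOutcome({ healthPassed, diffExists }) {
  return {
    // The run only fails when the healthcheck fails.
    exitCode: healthPassed ? 0 : 1,
    // A PR is only worth opening for a healthy repo with real changes.
    openPr: healthPassed && diffExists,
  };
}

// Formats the decision as a `name=value` line for $GITHUB_OUTPUT.
// The output name "changes" is an assumption.
function toStepOutput(outcome) {
  return `changes=${outcome.openPr}`;
}

// Healthy repo, nothing to fix: the run passes and no PR is requested.
const outcome = decideOutcome({ healthPassed: true, diffExists: false });
console.log(outcome.exitCode, toStepOutput(outcome));
```

A later workflow step could then gate PR creation on that output, so a clean repository no longer shows up as a failed run.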

