[Self-Heal] Add self-scheduling auto-repair workflow #39
Added self-healing functionality for formatting, snapshot, and configuration drift. Includes automated scheduling based on telemetry and reactive run configurations on CI failures.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Code Review
This pull request introduces an automated self-healing CI pipeline designed to maintain codebase health by automatically fixing formatting issues and updating test snapshots. Key additions include scripts for health checks, automated repairs, and a dynamic scheduling system that adjusts based on commit activity. Review feedback highlighted several areas for improvement: handling shallow clones in the telemetry logic to ensure accurate scheduling, fixing a logging bug in the health check script where output was suppressed during failures, and adjusting exit codes in the self-heal script to prevent healthy builds from being marked as failures in CI.
// 1. Telemetry: PR/Commit frequency over the last 14 days
const commits = getGitOutput(`git log --since="14 days ago" --format="%aI"`);
This telemetry logic relies on git log to determine commit frequency. By default, many CI environments (like GitHub Actions) perform a shallow clone with a fetch-depth of 1. In such cases, git log will only return the most recent commit, causing the script to always compute a 'Dormant' or 'Low-churn' schedule. It is recommended to check for a shallow clone and warn the user, or ensure the workflow is configured with fetch-depth: 0 in the checkout step.
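To illustrate the failure mode, here is a minimal sketch of how a commit count might map to a schedule tier. The tier names "Dormant" and "Low-churn" come from the comment above; the `pickTier` and `countRecentCommits` functions, the "Active" tier, the thresholds, and the cron strings are assumptions made for illustration and are not taken from scripts/compute_schedule.mjs.

```javascript
// Hypothetical tiers: "Dormant" and "Low-churn" are named in the review;
// the thresholds, the "Active" tier, and the cron values are illustrative only.
function pickTier(commitCount) {
  if (commitCount === 0) return { tier: "Dormant", cron: "0 6 1 * *" }; // monthly
  if (commitCount < 10) return { tier: "Low-churn", cron: "0 6 * * 1" }; // weekly
  return { tier: "Active", cron: "0 6 * * *" }; // daily
}

// Count ISO-8601 author dates (the `%aI` format) that fall inside the window.
function countRecentCommits(isoDates, now, windowDays = 14) {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  return isoDates.filter((d) => new Date(d).getTime() >= cutoff).length;
}

// Twelve commits, one per day, as a full clone would report them.
const now = new Date("2024-06-15T00:00:00Z");
const fullHistory = Array.from({ length: 12 }, (_, i) =>
  new Date(now.getTime() - i * 24 * 60 * 60 * 1000).toISOString()
);

// A clone with fetch-depth 1 only exposes the most recent commit.
const shallowView = fullHistory.slice(0, 1);

console.log(pickTier(countRecentCommits(fullHistory, now)).tier);
console.log(pickTier(countRecentCommits(shallowView, now)).tier);
```

Under these assumed thresholds, the same repository lands in different tiers depending only on how much history the checkout fetched, which is the skew the shallow-clone warning is meant to surface.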
// 1. Telemetry: PR/Commit frequency over the last 14 days
const isShallow = getGitOutput("git rev-parse --is-shallow-repository")[0] === "true";
if (isShallow) {
  console.warn("Warning: Shallow clone detected. Telemetry may be inaccurate. Ensure fetch-depth is set to 0 in your checkout step.");
}
const commits = getGitOutput(`git log --since="14 days ago" --format="%aI"`);

  execSync(command, { stdio: "ignore" });
  return true;
} catch (error) {
  console.error(`Command failed: ${command}`);
  if (error.stdout) console.error(error.stdout.toString());
  if (error.stderr) console.error(error.stderr.toString());
When stdio is set to "ignore" in execSync, the stdout and stderr properties on the thrown error object will be null. This makes the error.stdout and error.stderr logging in the catch block ineffective, because no output is captured to display when a command fails. To fix this while keeping the output silent on success, you can use stdio: "pipe".
  execSync(command, { stdio: "pipe" });
  return true;
} catch (error) {
  console.error(`Command failed: ${command}`);
  if (error.stdout) console.error(error.stdout.toString());
  if (error.stderr) console.error(error.stderr.toString());
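The difference between the two stdio modes can be checked with a small standalone sketch. The `tryCommand` helper and the failing command are hypothetical and exist only for this demonstration. It is written as CommonJS so it runs with a plain `node` invocation; in an ES module such as the PR's .mjs scripts, the `require` line would become an `import`.

```javascript
const { execSync } = require("node:child_process");

// Run a command and report what the thrown error exposes under a given stdio mode.
function tryCommand(command, stdio) {
  try {
    execSync(command, { stdio });
    return { ok: true, stderr: null };
  } catch (error) {
    return {
      ok: false,
      stderr: error.stderr ? error.stderr.toString() : null,
    };
  }
}

// A command that writes to stderr and exits non-zero, using the current Node binary.
const failing = `"${process.execPath}" -e "console.error('boom'); process.exitCode = 1"`;

const ignored = tryCommand(failing, "ignore"); // nothing is captured
const piped = tryCommand(failing, "pipe"); // stderr is captured on the error

console.log("ignore:", ignored.stderr);
console.log("pipe:", piped.stderr);
```

With "pipe", the child's output stays off the console on success but remains available on the error object when the command fails.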
if (healthPassed && diffExists) {
  process.exit(0);
} else {
  console.error("Pipeline finished but healthcheck failed, or no diff found. Exiting 1.");
  process.exit(1);
}
The script exits with code 1 if the codebase is already healthy and no changes were made (!diffExists). In a CI environment, this will mark the workflow run as failed, which can lead to unnecessary notifications for maintainers when the repository is in a good state. It is generally better to exit with code 0 if the healthcheck passes, regardless of whether changes were applied, and use a different mechanism (like a step output or checking for a diff in the workflow) to decide if a Pull Request should be created.
if (healthPassed) {
  if (diffExists) {
    console.log("Repair successful! Changes applied.");
  } else {
    console.log("Codebase is already healthy. No changes needed.");
  }
  process.exit(0);
} else {
  console.error("Pipeline finished but healthcheck failed. Exiting 1.");
  process.exit(1);
}
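One way to realize the "step output" mechanism mentioned above is sketched here. GitHub Actions reads step outputs from `name=value` lines appended to the file named by the `GITHUB_OUTPUT` environment variable. The `decideOutcome` and `publishOutput` helpers and the output name `changed` are assumptions for illustration and are not part of this PR. The block is written as CommonJS so it runs standalone; an ES module would use `import` instead of `require`.

```javascript
const fs = require("node:fs");

// Map the two signals to an exit code and a step output line.
// The exit code depends on health only; the diff is reported separately.
function decideOutcome(healthPassed, diffExists) {
  return {
    exitCode: healthPassed ? 0 : 1,
    // "changed" is a hypothetical output name for the workflow to read.
    outputLine: `changed=${healthPassed && diffExists}`,
  };
}

// Append the output line to the file GitHub Actions exposes via GITHUB_OUTPUT.
// Outside of Actions the variable is unset, so this is a no-op.
function publishOutput(outputLine, env = process.env) {
  if (!env.GITHUB_OUTPUT) return false;
  fs.appendFileSync(env.GITHUB_OUTPUT, `${outputLine}\n`);
  return true;
}

// Healthy repository, nothing to commit: succeed and report changed=false.
const outcome = decideOutcome(true, false);
publishOutput(outcome.outputLine);
process.exitCode = outcome.exitCode;
```

A later workflow step could then gate PR creation on the output, for example with a condition of the form `if: steps.<step-id>.outputs.changed == 'true'`, where the step id is whatever the workflow assigns to the self-heal step.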
This PR introduces a robust, self-healing automation pipeline that automatically repairs the codebase (formatting issues, updated test snapshots), runs healthchecks to verify fixes, and opens auto-generated Pull Requests.
Key Features:
- scripts/compute_schedule.mjs analyzes recent commit frequency to dynamically scale the pipeline schedule (e.g., from weekly to daily runs) by modifying .github/self-heal-schedule.yml and .github/workflows/self-heal.yml.
- workflow_run triggers for failed ci workflows, alongside workflow_dispatch for manual control.
- scripts/self_heal.mjs applies standard repair fixes (npm ci, npx prettier -w ., npx vitest run -u), evaluates a healthcheck after each step to find the minimum fix needed, and exits 0 once the drift has been repaired successfully.
- Guards against selfheal-* branch loops, prevents opening duplicate PRs, automatically closes stale self-heal PRs, explicitly checks for meaningful diffs, limits which files can be staged, and runs a security scan to prevent accidentally staging keys or secrets.
- SELF_HEAL_SETUP.md.

PR created automatically by Jules for task 465456819745700463 started by @badMade
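The "healthcheck after each step" behavior described for scripts/self_heal.mjs can be sketched as a loop that stops at the first step after which the repository is healthy. This is a sketch under stated assumptions, not the script's implementation: `findMinimumFix` is a hypothetical name, and the steps are simulated with plain functions standing in for the real commands so the loop can run without npm.

```javascript
// Run repair steps in order and stop at the first one after which the
// healthcheck passes. `steps` and `healthcheck` are injected so the loop
// can be exercised in isolation.
function findMinimumFix(steps, healthcheck) {
  const applied = [];
  for (const step of steps) {
    step.run();
    applied.push(step.name);
    if (healthcheck()) return { healthy: true, applied };
  }
  return { healthy: false, applied };
}

// Simulated repository state: formatting has drifted, snapshots are current.
const repo = { formatted: false, snapshotsCurrent: true };

// Stand-ins for the real commands named in the PR description.
const steps = [
  { name: "npm ci", run: () => {} },
  { name: "npx prettier -w .", run: () => { repo.formatted = true; } },
  { name: "npx vitest run -u", run: () => { repo.snapshotsCurrent = true; } },
];

const result = findMinimumFix(steps, () => repo.formatted && repo.snapshotsCurrent);
console.log(result.healthy, result.applied);
```

Because the loop returns as soon as the healthcheck passes, the snapshot update is skipped in this scenario, which keeps the resulting diff as small as the drift allows.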