[Self-Heal] Add self-scheduling auto-repair workflow #57
Implements an automated self-healing CI pipeline capable of detecting code drift, resolving common formatting/snapshot/dependency issues idempotently, and proposing repairs via PR. Includes telemetry-based self-scheduling logic.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Code Review
This pull request introduces an automated self-healing CI pipeline, including documentation, ESLint configuration, dependency updates, and scripts to compute optimal schedules, run health checks, and perform repair steps. The review feedback highlights several critical improvements: handling shallow clones in compute_schedule.mjs to prevent inaccurate commit counts, validating regex matches before updating workflow files to avoid silent failures, adding robust error fallback guards in health check logging, implementing a fail-fast mechanism for critical repair steps like npm ci, and correcting the exit code in self_heal.mjs when no code drift is detected to prevent false-alarm workflow failures.
function getCommitCount(days) {
  try {
    const since = new Date(Date.now() - days * 24 * 60 * 60 * 1000).toISOString();
    const count = execSync(`git log --since="${since}" --oneline | wc -l`, { encoding: 'utf8' });
    return parseInt(count.trim(), 10) || 0;
  } catch (e) {
    return 0;
  }
}
In GitHub Actions, the default behavior of actions/checkout is to perform a shallow clone (fetch-depth: 1). In a shallow clone, git log will only see the single fetched commit, causing getCommitCount to return at most 1 and incorrectly classifying active repositories as 'Dormant' or 'Standard'.
Additionally, using git log ... | wc -l is inefficient and less portable across different environments (e.g., Windows development environments without wc).
We can resolve both issues by detecting shallow clones and attempting to unshallow them, and using the more efficient and portable git rev-list --count command.
function getCommitCount(days) {
  try {
    try {
      const isShallow = execSync("git rev-parse --is-shallow-repository", { encoding: "utf8" }).trim() === "true";
      if (isShallow) {
        console.log("Shallow clone detected. Attempting to unshallow for accurate telemetry...");
        execSync("git fetch --unshallow", { stdio: "ignore" });
      }
    } catch (shallowError) {
      console.warn("Warning: Could not unshallow repository history:", shallowError.message);
    }
    const since = new Date(Date.now() - days * 24 * 60 * 60 * 1000).toISOString();
    const count = execSync("git rev-list --count --since=\"" + since + "\" HEAD", { encoding: "utf8" });
    return parseInt(count.trim(), 10) || 0;
  } catch (e) {
    return 0;
  }
}

function updateWorkflowSchedule(newSchedule) {
  const content = fs.readFileSync(WORKFLOW_PATH, 'utf8');
  // Match the cron line with the # AUTO-UPDATED marker
  const updated = content.replace(
    /cron:\s*".*?"\s*# AUTO-UPDATED/,
    `cron: "${newSchedule}" # AUTO-UPDATED`
  );
  fs.writeFileSync(WORKFLOW_PATH, updated, 'utf8');
}
The updateWorkflowSchedule function uses content.replace with a regular expression to update the cron schedule. If the # AUTO-UPDATED marker is missing, formatted differently, or uses single quotes, the replacement will silently fail and write the unmodified content back, while the script still reports success. Adding a check to verify that the regex matches before writing prevents silent failures.
- function updateWorkflowSchedule(newSchedule) {
-   const content = fs.readFileSync(WORKFLOW_PATH, 'utf8');
-   // Match the cron line with the # AUTO-UPDATED marker
-   const updated = content.replace(
-     /cron:\s*".*?"\s*# AUTO-UPDATED/,
-     `cron: "${newSchedule}" # AUTO-UPDATED`
-   );
-   fs.writeFileSync(WORKFLOW_PATH, updated, 'utf8');
- }
+ function updateWorkflowSchedule(newSchedule) {
+   const content = fs.readFileSync(WORKFLOW_PATH, "utf8");
+   const regex = /cron:\s*".*?"\s*# AUTO-UPDATED/;
+   if (!regex.test(content)) {
+     throw new Error("Could not find the 'cron: \"...\" # AUTO-UPDATED' marker in " + WORKFLOW_PATH);
+   }
+   const updated = content.replace(regex, "cron: \"" + newSchedule + "\" # AUTO-UPDATED");
+   fs.writeFileSync(WORKFLOW_PATH, updated, "utf8");
+ }
function runStep(name, command) {
  log(`\n--- Running Repair Step: ${name} ---`);
  try {
    const output = execSync(command, { encoding: 'utf-8', stdio: 'pipe' });
    log(output);
  } catch (error) {
    log(`[WARN] Step ${name} encountered an error:`);
    log(error.stdout || '');
    log(error.stderr || '');
  }

  const isHealthy = runHealthCheck(POST_CHECK_LOG);
  const changed = hasDiff();

  log(`Step result: Healthy=${isHealthy}, Diff=${changed}`);

  if (isHealthy && changed) {
    log('System is healthy and repairs were made. Exiting with success.');
    process.exit(0);
  } else if (isHealthy && !changed) {
    log('System is healthy but no files were modified. Continuing to next potential repair step.');
  } else {
    log('System is still unhealthy. Proceeding to next repair step.');
  }
}
If a critical setup step like npm ci fails, proceeding to run subsequent steps (such as eslint or vitest) is futile and will result in a cascade of confusing module resolution errors. Adding a required parameter to runStep allows the pipeline to fail fast and abort immediately when a critical step fails.
function runStep(name, command, required = false) {
  log("\n--- Running Repair Step: " + name + " ---");
  try {
    const output = execSync(command, { encoding: "utf-8", stdio: "pipe" });
    log(output);
  } catch (error) {
    log("[WARN] Step " + name + " encountered an error:");
    log(error.stdout || "");
    log(error.stderr || "");
    if (required) {
      log("[FATAL] Critical step " + name + " failed. Aborting self-heal pipeline.");
      process.exit(1);
    }
  }
  const isHealthy = runHealthCheck(POST_CHECK_LOG);
  const changed = hasDiff();
  log("Step result: Healthy=" + isHealthy + ", Diff=" + changed);
  if (isHealthy && changed) {
    log("System is healthy and repairs were made. Exiting with success.");
    process.exit(0);
  } else if (isHealthy && !changed) {
    log("System is healthy but no files were modified. Continuing to next potential repair step.");
  } else {
    log("System is still unhealthy. Proceeding to next repair step.");
  }
}

}

// Step 1: Rebuild/reinstall (clean install of tooling + deps)
runStep('Reinstall dependencies', 'npm ci');
Mark the dependency installation step as required so that the self-heal pipeline aborts immediately if npm ci fails, preventing subsequent steps from failing with confusing module resolution errors.
- runStep('Reinstall dependencies', 'npm ci');
+ runStep('Reinstall dependencies', 'npm ci', true);
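One way to make the critical/optional distinction easier to audit is to declare the steps as data and keep the fail-fast rule in a single loop. The sketch below is illustrative only. `REPAIR_STEPS`, `runRepairSteps`, and the injected `exec` callback are hypothetical names, the exact command arguments are assumptions based on the PR description, and the per-step health check from `runStep` is left out for brevity.

```javascript
// Hypothetical step table. The tools come from the PR description; the exact
// arguments and the `required` flags are assumptions.
const REPAIR_STEPS = [
  { name: "Reinstall dependencies", command: "npm ci", required: true },
  { name: "Fix lint errors", command: "npx eslint --fix .", required: false },
  { name: "Update snapshots", command: "npx vitest run -u", required: false },
];

// `exec` is injected so the sequencing logic can be tested without spawning
// processes. In a real script it would wrap child_process.execSync.
function runRepairSteps(steps, exec) {
  const results = [];
  for (const step of steps) {
    try {
      exec(step.command);
      results.push({ name: step.name, ok: true });
    } catch (error) {
      results.push({ name: step.name, ok: false, message: error.message });
      if (step.required) {
        // Fail fast: later steps depend on this one having succeeded.
        return { aborted: true, results };
      }
    }
  }
  return { aborted: false, results };
}
```

With this shape, the caller decides what to do with `aborted` (for example, exit non-zero), and the loop itself stays free of `process.exit` calls.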
function runHealthCheck(logFile) {
  try {
    const output = execSync('node scripts/healthcheck.mjs', { encoding: 'utf-8', stdio: 'pipe' });
    fs.writeFileSync(logFile, output);
    return true;
  } catch (error) {
    fs.writeFileSync(logFile, error.stdout + '\n' + error.stderr);
    return false;
  }
}
If execSync fails to spawn the healthcheck script or if stdout/stderr are not populated on the error object, accessing them directly can result in writing 'undefined undefined' to the log file. Adding fallback guards and logging error.message makes the error logging more robust.
function runHealthCheck(logFile) {
  try {
    const output = execSync("node scripts/healthcheck.mjs", { encoding: "utf-8", stdio: "pipe" });
    fs.writeFileSync(logFile, output);
    return true;
  } catch (error) {
    const stdout = error.stdout || "";
    const stderr = error.stderr || error.message || "";
    fs.writeFileSync(logFile, stdout + "\n" + stderr);
    return false;
  }
}

if (!hasDiff()) {
  log('System is healthy but no code drift was detected. Nothing to repair.');
  process.exit(1);
}
If the repository is already healthy and no code drift is detected, this is a successful state. Exiting with 1 will mark the scheduled GitHub Actions workflow run as failed, triggering false alarm notifications. It should exit with 0 instead.
- if (!hasDiff()) {
-   log('System is healthy but no code drift was detected. Nothing to repair.');
-   process.exit(1);
- }
+ if (!hasDiff()) {
+   log("System is healthy and no code drift was detected. Nothing to repair.");
+   process.exit(0);
+ }
Self-Heal CI Pipeline
This PR introduces an automated self-healing CI pipeline configured via GitHub Actions and Node.js scripts to automatically repair code drift.
Details
- […] ci workflow failure, and manual dispatches.
- […] (npm ci), formatting (eslint --fix, prettier -w), snapshots (vitest run -u), and dependencies (npm update).
- […] scripts/compute_schedule.mjs dynamically adjusts the checking cadence by counting recent commits to determine whether the repo is in a High, Active, Standard, or Dormant activity state.
- […] (!startsWith(github.ref_name, 'selfheal-')), strictly prevents direct pushes to the default branch (uses gh pr create), and runs entropy regex checks against the git diff to avoid committing API tokens or secrets.
Rationale for Initial Schedule
The initial schedule is set to the bootstrap value of 0 0 * * * (Daily). This gives the repository baseline coverage while the telemetry logic waits for adequate commit data to adjust frequency up or down dynamically during active periods.
(Note: Review SELF_HEAL_SETUP.md for full breakdown and reviewer instructions.)
PR created automatically by Jules for task 1263455366753982256 started by @badMade
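The cadence adjustment described under Details could look roughly like the sketch below. Only the four tier names and the daily bootstrap value come from the PR description. The commit thresholds, the other cron strings, and the names `TIERS` and `classifyActivity` are illustrative assumptions, not the contents of scripts/compute_schedule.mjs.

```javascript
// Illustrative tiers, ordered from busiest to quietest.
// Thresholds and non-daily cron strings are assumptions.
const TIERS = [
  { name: "High", minCommits: 50, cron: "0 */6 * * *" },    // every 6 hours
  { name: "Active", minCommits: 15, cron: "0 */12 * * *" }, // every 12 hours
  { name: "Standard", minCommits: 3, cron: "0 0 * * *" },   // daily (bootstrap value)
  { name: "Dormant", minCommits: 0, cron: "0 0 * * 0" },    // weekly, on Sunday
];

function classifyActivity(commitCount) {
  // Treat missing or invalid counts as zero so a tier is always returned.
  const count = Number.isFinite(commitCount) ? Math.max(0, commitCount) : 0;
  // The first tier whose threshold is met wins, because tiers are sorted descending.
  return TIERS.find((tier) => count >= tier.minCommits);
}
```

A caller would pass the result of `getCommitCount(days)` to `classifyActivity` and write the returned `cron` value into the workflow file.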