[Self-Heal] Add self-scheduling auto-repair workflow #54
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
Code Review
This pull request introduces a self-healing automation setup, including scripts for computing schedules, running health checks, and executing self-repair steps, along with associated configuration files and documentation. The review feedback highlights several critical issues: an unrecognized ESLint rule (preserve-caught-error) that will cause linting to fail, an anti-pattern in the healthcheck script that runs npm install, an incorrect non-zero exit code in the self-heal script when the system is healthy, a missing installation step after running typesync, and a rounding error in the schedule script's oscillation guard due to the use of Math.ceil.
  "@typescript-eslint/ban-ts-comment": "off",
  "preserve-caught-error": "off"
}
The rule preserve-caught-error is not a @typescript-eslint rule, and it exists in ESLint core only in recent releases, so whether it is recognized depends on the installed ESLint version. On a version that does not ship the rule, referencing it in the flat configuration risks a configuration error that would break the linting step in your healthchecks and CI. Confirm that the installed ESLint version provides the rule, or remove the entry. If the goal is to configure how unused caught errors are handled, use the caughtErrors option of the standard no-unused-vars rule instead.
"@typescript-eslint/ban-ts-comment": "off"
}

console.log("Checking install...");
if (!run("npm install")) return 1;
Running npm install inside a healthcheck script is an anti-pattern. A healthcheck should be a read-only, idempotent validation of the current codebase state (e.g., running tests, linting, and building). Running npm install can modify package-lock.json or node_modules during validation, which can introduce unexpected side effects or untracked changes. Furthermore, scripts/self_heal.mjs already explicitly manages dependency installation via npm ci and npm update as separate steps.
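A minimal sketch of a read-only healthcheck along these lines, assuming `run` is an injected function that executes a shell command and returns `true` on success (for example a thin wrapper around Node's `child_process`). The specific check commands are illustrative assumptions, not the PR's actual list.

```javascript
// Hypothetical list of validation-only commands; none of them installs or updates packages.
const READ_ONLY_CHECKS = [
  "npx eslint .",
  "npx tsc --noEmit",
  "npx vitest run --passWithNoTests",
];

// Runs each check in order and returns a process-style exit code:
// 0 if every check passes, 1 as soon as one fails.
function healthcheck(run, checks = READ_ONLY_CHECKS) {
  for (const command of checks) {
    console.log(`Checking: ${command}`);
    if (!run(command)) return 1;
  }
  return 0;
}

// Usage with a stub runner that reports every command as successful.
console.log(healthcheck(() => true)); // 0
```

Injecting `run` keeps the check list easy to audit for commands with side effects and lets the function be exercised without touching the working tree.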
if (isHealthyPre && !hasDiff()) {
  console.log("System is healthy and no diffs found. Exiting.");
  process.exit(1); // No action needed, intentional non-zero return for github actions workflow to catch
}
Exiting with a non-zero code (1) when the system is healthy and no diffs are found is an anti-pattern. This will cause scheduled GitHub Actions runs to be marked as failed, triggering false alarm notifications to the team. A healthy system is a successful state and should exit with 0. The workflow can easily check if a PR is needed by running git status --porcelain or using a standard action to detect changes after the script completes.
Suggested change
- if (isHealthyPre && !hasDiff()) {
-   console.log("System is healthy and no diffs found. Exiting.");
-   process.exit(1); // No action needed, intentional non-zero return for github actions workflow to catch
- }
+ if (isHealthyPre && !hasDiff()) {
+   console.log("System is healthy and no diffs found. Exiting.");
+   process.exit(0);
+ }
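As a rough illustration of the change-detection approach mentioned in this comment, the sketch below treats any non-empty output from `git status --porcelain` as "there is something to commit". The helper names are hypothetical, and the output text is passed in as a string so the logic can be tried without a repository.

```javascript
// Hypothetical helper: `porcelainOutput` is the text printed by `git status --porcelain`.
// That command prints one line per changed path and nothing for a clean tree.
function hasWorkingTreeChanges(porcelainOutput) {
  return porcelainOutput.trim().length > 0;
}

// A healthy system with a clean tree is a no-op, which is a success (exit code 0),
// not a failure.
function isNoOp(isHealthyPre, porcelainOutput) {
  return isHealthyPre && !hasWorkingTreeChanges(porcelainOutput);
}

console.log(isNoOp(true, ""));                       // true
console.log(isNoOp(true, " M package-lock.json\n")); // false
```

With this split, the script can exit with 0 in the no-op case and the workflow can decide separately whether a PR is needed.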
{ name: "Step 1: Rebuild/reinstall", command: "npm ci" },
{ name: "Step 2: Lint/format auto-fix", command: "npx eslint --fix . && npx prettier -w ." },
{ name: "Step 3: Snapshot updates", command: "npx vitest run -u --passWithNoTests" },
{ name: "Step 4: Type stubs", command: "npx typesync" },
npx typesync only adds missing @types/* packages to package.json; it does not install them. Step 5 (npm update) may pick them up later, since npm documents that npm update also installs missing packages, but that leaves the type definitions absent from node_modules until Step 5 runs and ties their installation to an unrelated step. Any build or type-check that runs in between can fail. Run npm install immediately after typesync so the new types are installed explicitly.
Suggested change
- { name: "Step 4: Type stubs", command: "npx typesync" },
+ { name: "Step 4: Type stubs", command: "npx typesync && npm install" },
const diffTime = Math.abs(now - lastUpdated);
const diffDays = Math.ceil(diffTime / (1000 * 60 * 60 * 24));
if (diffDays < 3) {
Using Math.ceil on the day difference means that any time difference greater than 2.0 days (e.g., 2 days and 1 second) will be rounded up to 3. As a result, the condition diffDays < 3 will evaluate to false, allowing the schedule to be recomputed after only 2 days instead of the intended 3-day guard window. To ensure a strict 3-day guard, perform a direct fractional day comparison or compare the millisecond timestamps directly.
Suggested change
- const diffTime = Math.abs(now - lastUpdated);
- const diffDays = Math.ceil(diffTime / (1000 * 60 * 60 * 24));
- if (diffDays < 3) {
+ const diffTime = Math.abs(now.getTime() - lastUpdated.getTime());
+ const diffDays = diffTime / (1000 * 60 * 60 * 24);
+ if (diffDays < 3) {
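To make the rounding effect concrete, here is a small worked example comparing the two guards at an elapsed time of 2 days and 1 second. The function names and the sample dates are illustrative assumptions; only the arithmetic mirrors the snippets above.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Original behavior: round the elapsed days up before comparing.
function withinGuardCeil(now, lastUpdated, guardDays = 3) {
  const diffDays = Math.ceil(Math.abs(now - lastUpdated) / MS_PER_DAY);
  return diffDays < guardDays;
}

// Suggested behavior: compare elapsed milliseconds directly, with no rounding.
function withinGuardExact(now, lastUpdated, guardDays = 3) {
  const diffTime = Math.abs(now.getTime() - lastUpdated.getTime());
  return diffTime < guardDays * MS_PER_DAY;
}

const lastUpdated = new Date("2024-01-01T00:00:00Z");
const now = new Date(lastUpdated.getTime() + 2 * MS_PER_DAY + 1000); // 2 days, 1 second later

console.log(withinGuardCeil(now, lastUpdated));  // false: ceil(2.00001...) is 3, and 3 < 3 fails
console.log(withinGuardExact(now, lastUpdated)); // true: still inside the 3-day window
```

The rounded version stops guarding just after the 2-day mark, while the direct comparison keeps guarding until a full 3 days have elapsed.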
Automated self-healing PR to fix code drift and CI failures. Review logs in actions.
Triggers explained:
Checklist:
Initial schedule computation logic evaluates standard telemetry (git log over 30 days) to find the optimal contiguous "quiet" window. Users can override via manual edit to the yml schedule file. See SELF_HEAL_SETUP.md for more details.
Link to Jules chat context: [Agent Request / Chat Context]
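The PR description does not show how the "quiet" window is computed from the git log, so the following is only one plausible sketch. It assumes commit timestamps have already been bucketed into 24 hourly counts, and it picks the contiguous window (wrapping past midnight) with the fewest commits. The function name, the input shape, and the 4-hour default are all hypothetical.

```javascript
// Hypothetical input: hourlyCommitCounts[h] is the number of commits in the last
// 30 days whose timestamp fell in hour h (0-23).
function quietestWindow(hourlyCommitCounts, windowHours = 4) {
  const hours = hourlyCommitCounts.length;
  let best = { startHour: 0, commits: Infinity };
  for (let start = 0; start < hours; start++) {
    let commits = 0;
    for (let offset = 0; offset < windowHours; offset++) {
      // The modulo lets a window wrap around midnight.
      commits += hourlyCommitCounts[(start + offset) % hours];
    }
    // Strict comparison keeps the earliest window when several are tied.
    if (commits < best.commits) best = { startHour: start, commits };
  }
  return best;
}

// Example: steady activity except for a silent stretch from 02:00 to 05:59.
const counts = new Array(24).fill(5);
for (const h of [2, 3, 4, 5]) counts[h] = 0;
console.log(quietestWindow(counts)); // { startHour: 2, commits: 0 }
```

A manual override in the yml schedule file, as the description mentions, would simply bypass whatever this computation returns.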
PR created automatically by Jules for task 2758855860826736662 started by @badMade