🤖 fix: stop idle auto-compaction loop on repeated/model-not-found failures by ammar-agent · Pull Request #3567 · coder/mux

ammar-agent · 2026-06-15T18:13:44Z

Summary

Stops the hourly idle auto-compaction loop from re-attempting a workspace that keeps failing: it now gives up immediately when an attempt fails with a model-not-found error, and otherwise after two consecutive failures of any kind. Previously a persistently failing workspace (e.g. a misconfigured/typo'd compaction model) was re-queued every hour forever, since a failed compaction neither marks the workspace compacted nor refreshes recency, so it stayed eligible indefinitely.

Background

IdleCompactionService re-checks eligibility every hour and re-queues any idle workspace. There was no failure tracking, so a workspace whose compaction always fails would be retried on every cycle with no cap. The most common cause is a compaction model the provider rejects.

A key subtlety: the failure can surface in two different places.

- Pre-stream failures (invalid model string, model unavailable, provider/runtime issues) come back synchronously as the sendMessage Result that executeIdleCompaction already inspects.
- Mid-stream failures (the common model_not_found case for a well-formed but non-existent Anthropic/OpenAI model) are not returned by sendMessage; they arrive later as an async "error" stream event and are only persisted transiently to partial.json, so they can't be recovered from chat history on the next check.

Implementation

- IdleCompactionService now tracks per-workspace consecutive failures and a sticky suppressedWorkspaceIds set. checkEligibility short-circuits suppressed workspaces (reason: "suppressed_after_failures"). A new recordOutcome(workspaceId, outcome) applies the policy: success resets the streak; model_not_found suppresses immediately; any other failure suppresses on the second consecutive occurrence.
- WorkspaceService reports terminal idle-compaction outcomes through a listener:
  - pre-stream failure → reported from executeIdleCompaction (model-not-found classified via formatSendMessageError),
  - mid-stream failure → reported from the existing aiService.on("error") handler when the workspace is mid idle-compaction (read before stopStreamingStatus clears the marker),
  - success → reported from handleStreamCompletion.
  - error / stream-end / stream-abort are mutually exclusive per turn, so a failed turn never also reports success. Aborts are intentionally neutral (not a failure).
- ServiceContainer wires the listener to IdleCompactionService.recordOutcome.

Suppression is in-memory and sticky for the service lifetime; restarting the app (e.g. after fixing the configured compaction model) clears it.
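For readers skimming the policy, here is a minimal sketch of the suppression rule as described above. It is not the code from this PR: the class name IdleCompactionFailureTracker, the isSuppressed helper, the MAX_CONSECUTIVE_FAILURES constant, and the exact outcome shape are assumptions made for illustration. Only recordOutcome, suppressedWorkspaceIds, and the two-failure / model-not-found rules come from the description.

```typescript
// Hypothetical outcome shape; the field names mirror the
// { success, modelNotFound } object mentioned in this PR.
type IdleCompactionOutcome =
  | { success: true }
  | { success: false; modelNotFound: boolean };

// Threshold taken from the PR description (two consecutive failures).
const MAX_CONSECUTIVE_FAILURES = 2;

// Illustrative stand-in for the failure tracking inside IdleCompactionService.
class IdleCompactionFailureTracker {
  private readonly consecutiveFailures = new Map<string, number>();
  private readonly suppressedWorkspaceIds = new Set<string>();

  // An eligibility check would short-circuit when this returns true.
  isSuppressed(workspaceId: string): boolean {
    return this.suppressedWorkspaceIds.has(workspaceId);
  }

  recordOutcome(workspaceId: string, outcome: IdleCompactionOutcome): void {
    if (outcome.success) {
      // Success resets the streak; suppression itself stays sticky here.
      this.consecutiveFailures.delete(workspaceId);
      return;
    }
    const failures = (this.consecutiveFailures.get(workspaceId) ?? 0) + 1;
    this.consecutiveFailures.set(workspaceId, failures);
    // model_not_found suppresses immediately; anything else on the second strike.
    if (outcome.modelNotFound || failures >= MAX_CONSECUTIVE_FAILURES) {
      this.suppressedWorkspaceIds.add(workspaceId);
    }
  }
}
```

Keeping both structures in memory matches the "sticky for the service lifetime" behavior: nothing is persisted, so a restart starts from an empty map and set.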

Validation

- New unit tests in idleCompactionService.test.ts: suppress after two consecutive failures, suppress immediately on model_not_found, success resets the streak, and checkAllWorkspaces no longer queues a suppressed workspace.
- New workspaceService.test.ts tests: executeIdleCompaction reports modelNotFound: true for an invalid compaction model and false for a generic pre-stream failure.
- make static-check and targeted tests pass.

Risks

Low. Behavior is additive and scoped to the background idle-compaction loop. The outcome listener is optional (no-op until wired), and normal interactive sends are untouched. The worst case of a wrong suppression is that a workspace stops auto-compacting on idle until restart; the user can still trigger compaction manually in the meantime.

Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: xhigh • Cost: $1.48

ammar-agent · 2026-06-15T18:13:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8a65f175c2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Address Codex review: a clean provider stream-end does not guarantee the post-stream history compaction succeeded. Drive the idle success/failure outcome from CompactionHandler.handleCompletion (after the summary is persisted, or on a post-stream persistence failure) via a new onIdleCompactionOutcome callback, instead of reporting success on stream-end.

ammar-agent · 2026-06-15T18:36:09Z

Addressed in 083e71d.

The idle-compaction outcome is no longer reported from the provider stream-end. Instead, CompactionHandler.handleCompletion() now reports the terminal outcome via a new onIdleCompactionOutcome(success) callback:

- success is reported only after the compacted summary is actually persisted (right after onCompactionComplete),
- failure is reported on the post-stream persistence paths (performCompaction returns Err, empty summary, raw-JSON summary).

WorkspaceService.createSession forwards that to reportIdleCompactionOutcome (a persistence failure → { success: false, modelNotFound: false }), so these post-stream failures now count toward the suppression threshold. The unconditional success report in handleStreamCompletion was removed. Mid-stream provider errors (no stream-end/no persistence) are still reported from the "error" handler, and the three terminal events are mutually exclusive per turn, so each idle attempt reports exactly one outcome.

Added CompactionHandler tests: success → [true], empty summary → [false], manual (non-idle) compaction → not called.
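A rough sketch of the reporting order described in this comment, for illustration only. The function handleCompletionSketch, its options object, the synchronous persistSummary stand-in for performCompaction, and the looksLikeRawJson heuristic are all assumptions; only the onIdleCompactionOutcome(success) callback and the three failure paths are taken from the comment above. The real handler is presumably asynchronous.

```typescript
// Hypothetical result type standing in for the Ok/Err returned by performCompaction.
type PersistResult = { ok: true } | { ok: false; error: string };

interface CompletionSketchOptions {
  isIdleCompaction: boolean;
  summary: string;
  // Assumed stand-in for the step that persists the compacted summary.
  persistSummary: (summary: string) => PersistResult;
  onIdleCompactionOutcome?: (success: boolean) => void;
}

// Assumed heuristic for a "raw-JSON summary"; the real check may differ.
function looksLikeRawJson(summary: string): boolean {
  const trimmed = summary.trim();
  if (!trimmed.startsWith("{") && !trimmed.startsWith("[")) return false;
  try {
    JSON.parse(trimmed);
    return true;
  } catch {
    return false;
  }
}

function handleCompletionSketch(opts: CompletionSketchOptions): boolean {
  // Manual (non-idle) compactions never report an idle outcome.
  const report = (success: boolean): void => {
    if (opts.isIdleCompaction) opts.onIdleCompactionOutcome?.(success);
  };

  if (opts.summary.trim() === "" || looksLikeRawJson(opts.summary)) {
    report(false); // unusable summary counts as a post-stream failure
    return false;
  }
  const result = opts.persistSummary(opts.summary);
  if (!result.ok) {
    report(false); // persistence failure
    return false;
  }
  report(true); // only after the summary is actually persisted
  return true;
}
```

Each path calls the callback at most once, which is what keeps "exactly one outcome per idle attempt" true on the post-stream side.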

ammar-agent · 2026-06-15T18:36:19Z

@codex review

Please take another look — addressed the post-stream persistence concern.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 083e71d465

Address Codex review: the requireIdle path returns a busy-skip when the workspace becomes active after eligibility but before sendMessage. That is an expected race, not a failure. Skip reporting it as an outcome so two normal user-interaction races cannot suppress idle compaction for a healthy workspace. Factor the busy-skip message into a shared constant.

ammar-agent · 2026-06-15T18:44:34Z

Addressed in c1a1b76.

The requireIdle busy-skip ("Workspace is busy; idle-only send was skipped.") is now treated as neutral — executeIdleCompaction no longer calls recordOutcome for it, so normal user-interaction races can't accumulate toward suppression. I factored the message into a shared IDLE_ONLY_BUSY_SKIP_MESSAGE constant (used by the requireIdle return, the early busy throw, and the skip check) to keep the contract DRY. Extended the busy-skip test to assert no outcome is reported.
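To make the neutral-skip rule concrete, here is a small sketch of how a pre-stream result could be mapped to an outcome. The constant name and message are the ones quoted above; the SendResultSketch type, the string-valued error, the outcomeForPreStreamResult function, and the isModelNotFound classifier (a stand-in for the classification done via formatSendMessageError) are hypothetical.

```typescript
// Message quoted in this comment; shared so the producer and the skip check agree.
const IDLE_ONLY_BUSY_SKIP_MESSAGE = "Workspace is busy; idle-only send was skipped.";

// Assumed, simplified result shape; the real sendMessage Result may carry a richer error.
type SendResultSketch = { success: true } | { success: false; error: string };

type PreStreamFailure = { success: false; modelNotFound: boolean };

// Returns the outcome to report, or null when nothing should be reported yet.
function outcomeForPreStreamResult(
  result: SendResultSketch,
  isModelNotFound: (error: string) => boolean,
): PreStreamFailure | null {
  if (result.success) {
    // The stream started; the terminal outcome is reported later.
    return null;
  }
  if (result.error === IDLE_ONLY_BUSY_SKIP_MESSAGE) {
    // Expected race with user activity: neutral, never counted as a failure.
    return null;
  }
  return { success: false, modelNotFound: isModelNotFound(result.error) };
}
```

Comparing against one shared constant means a reworded message cannot silently turn busy-skips back into counted failures.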

ammar-agent · 2026-06-15T18:44:45Z

@codex review

Please take another look — the requireIdle busy-skip is now treated as neutral.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1a1b76bc4

Address Codex review: an in-flight retry can persist a successful compaction after suppression was set. recordOutcome(success) now also clears the suppressed set, so the workspace self-heals and becomes eligible again.

ammar-agent · 2026-06-15T18:51:42Z

Addressed in 866a615.

recordOutcome now also clears suppressedWorkspaceIds on success, so a later in-flight retry that actually persists a compaction lifts the suppression and the workspace becomes eligible again (self-healing). Added a test: two failures suppress → a subsequent success makes the workspace eligible again.
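The self-healing step amounts to one extra delete on the success path. A minimal sketch, assuming a plain state object: the SuppressionStateSketch interface and the recordSuccess function are hypothetical, and only the suppressedWorkspaceIds name and the clear-on-success behavior come from this comment.

```typescript
// Hypothetical state container; field names echo the PR, the shape is assumed.
interface SuppressionStateSketch {
  consecutiveFailures: Map<string, number>;
  suppressedWorkspaceIds: Set<string>;
}

// Success path only: reset the failure streak and lift any suppression.
function recordSuccess(state: SuppressionStateSketch, workspaceId: string): void {
  state.consecutiveFailures.delete(workspaceId);
  // Without this line, a late in-flight success would leave the
  // workspace suppressed until restart.
  state.suppressedWorkspaceIds.delete(workspaceId);
}
```

Deleting from a Set or Map is a no-op when the key is absent, so the success path needs no extra check for workspaces that were never suppressed.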

ammar-agent · 2026-06-15T18:51:44Z

@codex review

Please take another look — suppression is now lifted on a later success.

chatgpt-codex-connector · 2026-06-15T18:56:58Z

Codex Review: Didn't find any major issues. Breezy!

Reviewed commit: 866a615c91

Stop idle auto-compact loop on repeated/model-not-found failures

8a65f17

chatgpt-codex-connector Bot reviewed Jun 15, 2026

Comment thread src/node/services/workspaceService.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 15, 2026

Comment thread src/node/services/workspaceService.ts Outdated

chatgpt-codex-connector Bot reviewed Jun 15, 2026

Comment thread src/node/services/idleCompactionService.ts

Lift idle-compaction suppression when a later compaction succeeds

866a615

ammario merged commit 9ec890d into main Jun 15, 2026
23 checks passed

ammario deleted the fix/idle-compaction-stop-on-repeated-failure branch June 15, 2026 21:52

2 participants