feat(smithy,smithy-web): dispatch stuck-queue warning (#59) #98
Open
komoreka wants to merge 8 commits into
… detection Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds DispatchHealth interface to api/types.ts, extends DaemonStatusResponse in useDaemon.ts with an optional health field (also tightens poll to 5s), and creates the DispatchHealthBanner component that renders an amber warning when hasStuckQueue is true. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
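A minimal sketch of the types this commit describes, written from the field names in the PR summary. The `isRunning` field and the `shouldShowStuckBanner` helper are hypothetical placeholders for illustration; the real `DaemonStatusResponse` and banner component are not shown in this PR text.

```typescript
// Field names follow the PR summary; the rest of this sketch is illustrative.
interface DispatchHealth {
  hasStuckQueue: boolean;
  readyUnassignedTasks: number;
  availableWorkers: number;
}

// Hypothetical shape: the real response has fields not shown in this PR.
interface DaemonStatusResponse {
  isRunning: boolean; // assumed placeholder field
  health?: DispatchHealth; // optional, so a response without it still type-checks
}

// Hypothetical helper: the banner renders only when the API reports a stuck
// queue and the user has not dismissed it.
function shouldShowStuckBanner(
  status: DaemonStatusResponse | undefined,
  dismissed: boolean,
): boolean {
  if (dismissed) {
    return false;
  }
  return status?.health?.hasStuckQueue === true;
}
```

Keeping `health` optional means a missing field is treated the same as a healthy queue, so the banner stays hidden rather than failing.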
…pages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pper Banner now takes an optional className applied to its outer element so each mount site controls its own page-specific padding. The workspaces route previously wrapped the banner in a div with px-6 pt-4 which left 16px of empty padding when the banner self-hides (queue healthy or dismissed). The wrapper is gone; the banner mounts directly with className="mx-6 mt-4" passed in.
Replaces the per-tick rate-limited warn with state-transition logging:
- Healthy → Stuck: single STUCK warn line with the count and a clear hint.
- Stuck → Healthy: single RESUMED info line confirming dispatch is flowing.
- No periodic reminders. A long-running stuck state does not spam the log.

Drops the stuckWarnTickInterval config option (was 20 ticks, 100s spacing in production). Distinctive STUCK/RESUMED prefixes make the lines findable in a chatty log stream.

Tests:
- Verifies STUCK warn fires on first stuck tick AND does not fire again on subsequent stuck ticks (was the periodic-spam regression risk).
- Verifies RESUMED info fires on the next healthy tick after stuck.
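The transition logging described in this commit could look roughly like the sketch below. The class name `StuckTransitionLogger`, the `Logger` interface, and the exact message wording are assumptions for illustration. Only the `[dispatch] STUCK` / `[dispatch] RESUMED` prefixes and the log-on-transition behavior come from the PR.

```typescript
// Hypothetical snapshot type; field names follow the PR summary.
interface DispatchHealthSnapshot {
  readyUnassignedTasks: number;
  availableWorkers: number;
  stuck: boolean;
}

// Hypothetical minimal logger interface.
interface Logger {
  warn(message: string): void;
  info(message: string): void;
}

class StuckTransitionLogger {
  private wasStuck = false;
  private readonly logger: Logger;

  constructor(logger: Logger) {
    this.logger = logger;
  }

  // Call once per dispatch tick. Logs only when the state flips, so a
  // long-running stuck state produces exactly one warn line.
  onTick(health: DispatchHealthSnapshot): void {
    if (health.stuck && !this.wasStuck) {
      // Message wording is illustrative, not the daemon's actual text.
      this.logger.warn(
        `[dispatch] STUCK: ${health.readyUnassignedTasks} ready task(s), no available workers`,
      );
    } else if (!health.stuck && this.wasStuck) {
      this.logger.info("[dispatch] RESUMED: dispatch is flowing again");
    }
    this.wasStuck = health.stuck;
  }
}
```

Tracking only the previous boolean state removes the need for a tick counter such as the dropped `stuckWarnTickInterval` option.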
This was referenced May 6, 2026
Summary
Closes #59. Detects and surfaces dispatch stuck-queue conditions when workers are unavailable.
- `DispatchDaemon.getDispatchHealth()` reports `readyUnassignedTasks`, `availableWorkers`, `stuck`. Stuck = ready unassigned > 0 AND no available workers (worker = `agentRole === 'worker'`, not disabled, session not terminated).
- A single `[dispatch] STUCK` line on the healthy→stuck transition and a single `[dispatch] RESUMED` on the way back. No periodic warns to keep noise out of busy logs.
- `GET /api/daemon/status` includes a `health` field (`hasStuckQueue`, `readyUnassignedTasks`, `availableWorkers`).
- `smithy-web` shows a dismissible amber banner on the agents and workspaces pages when the API reports a stuck queue, polling every 5s.

Why this design
The original issue suggested a startup-only line, periodic warns, or a `--require-agents` flag. After dogfooding, periodic warns drowned in heartbeats and a startup-only line missed the common case (worker dies mid-session). Transition logging gives one clear line at the moment something changes, and the banner gives a visible signal in the UI without being modal.

Pool-routing observation from the issue is filed separately as #94 to keep this PR scoped.
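For reference, the stuck condition given in the summary can be sketched as a pure function. The `Agent` and `Task` shapes and the name `computeDispatchHealth` are hypothetical. The actual `DispatchDaemon.getDispatchHealth()` reads from the daemon's own stores, which this PR text does not show.

```typescript
// Hypothetical shapes: only agentRole and the three health fields are named in the PR.
interface Agent {
  agentRole: string;
  disabled: boolean;
  sessionTerminated: boolean;
}

interface Task {
  ready: boolean;
  assignee: string | null;
}

interface DispatchHealthResult {
  readyUnassignedTasks: number;
  availableWorkers: number;
  stuck: boolean;
}

function computeDispatchHealth(tasks: Task[], agents: Agent[]): DispatchHealthResult {
  // Ready tasks that nobody has picked up yet.
  const readyUnassignedTasks = tasks.filter(
    (task) => task.ready && task.assignee === null,
  ).length;

  // A worker counts as available when its role is 'worker', it is not
  // disabled, and its session has not terminated.
  const availableWorkers = agents.filter(
    (agent) =>
      agent.agentRole === "worker" && !agent.disabled && !agent.sessionTerminated,
  ).length;

  return {
    readyUnassignedTasks,
    availableWorkers,
    // Stuck = work is waiting AND nobody can take it.
    stuck: readyUnassignedTasks > 0 && availableWorkers === 0,
  };
}
```

An empty queue with no workers is deliberately not stuck under this definition, which keeps an idle daemon from raising the warning.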
Test plan
- `bun test packages/smithy/src/services/dispatch-daemon.bun.test.ts` (4 detection + 2 transition cases)
- `pnpm --filter @stoneforge/smithy test src/server/routes/daemon.test.ts` (3 vitest cases incl. throw path)
- `turbo typecheck` clean
- `sf serve smithy` against a repo with ready tasks and no attached workers, observed STUCK log + amber banner; attached a worker, observed RESUMED log + banner cleared.

🤖 Generated with Claude Code