Description
Agent pools can remain at capacity after the worker or steward session occupying the slot has already terminated. The agent is idle and has no active session, but the pool still reports the agent in activeAgentIds, which blocks new work until pool status is manually refreshed.
Steps to reproduce
- Create an agent pool with maxSize: 1.
- Let the dispatch daemon start a worker or steward session that occupies the pool slot.
- Let that session terminate or be cleaned up as dead.
- Check the pool status and agent/session status.
Expected behavior
When the worker or steward session terminates, the pool slot should be released automatically. The pool should report activeCount: 0, availableSlots: 1, and no active agent IDs when no governed worker/steward sessions are running.
Actual behavior
The pool can remain full even though the occupying agent has no active session.
Observed API state:
{
  "pool": {
    "id": "el-rmw5",
    "config": { "name": "test", "maxSize": 1, "agentTypes": [], "enabled": true },
    "status": {
      "activeCount": 1,
      "availableSlots": 0,
      "activeByType": { "worker:ephemeral": 1 },
      "activeAgentIds": ["el-5d6n"]
    }
  }
}
At the same time:
- GET /api/agents/el-5d6n/status returned hasActiveSession: false and activeSession: null.
- GET /api/sessions?status=running showed no running worker or steward sessions, only the director session.
- The latest el-5d6n session was terminated with terminationReason: "Process no longer alive (PID check)".
Calling POST /api/pools/refresh recomputed the status correctly when no governed sessions were running, which suggests the persisted session state is correct and the pool runtime status/cache is stale.
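The mismatch can be checked mechanically. The following is a minimal sketch, not the project's code: the PoolStatus fields mirror the observed API response, while SessionRecord, findStaleAgentIds, and recomputeStatus are hypothetical names introduced for illustration. It assumes the caller passes only pool-governed worker/steward sessions (so the director session is excluded) and that a refresh amounts to rebuilding the status from running sessions.

```typescript
// Shape of pool.status as seen in the observed API response.
interface PoolStatus {
  activeCount: number;
  availableSlots: number;
  activeByType: Record<string, number>;
  activeAgentIds: string[];
}

// Hypothetical session shape; the real session records may differ.
interface SessionRecord {
  agentId: string;
  agentType: string; // e.g. "worker:ephemeral"
  status: "running" | "terminated";
}

// Return agent IDs the pool still counts as active although no
// running governed session backs them.
function findStaleAgentIds(status: PoolStatus, sessions: SessionRecord[]): string[] {
  const running = sessions
    .filter((s) => s.status === "running")
    .map((s) => s.agentId);
  return status.activeAgentIds.filter((id) => running.indexOf(id) === -1);
}

// Rebuild the status from running sessions only. This is an assumed
// approximation of what a refresh recomputes, not the actual logic.
function recomputeStatus(maxSize: number, sessions: SessionRecord[]): PoolStatus {
  const activeByType: Record<string, number> = {};
  const activeAgentIds: string[] = [];
  for (const s of sessions) {
    if (s.status !== "running") continue;
    if (activeAgentIds.indexOf(s.agentId) !== -1) continue; // count each agent once
    activeAgentIds.push(s.agentId);
    activeByType[s.agentType] = (activeByType[s.agentType] || 0) + 1;
  }
  return {
    activeCount: activeAgentIds.length,
    availableSlots: Math.max(0, maxSize - activeAgentIds.length),
    activeByType,
    activeAgentIds,
  };
}
```

Fed the state from this report (one terminated el-5d6n session and the status shown above), findStaleAgentIds flags el-5d6n as stale and recomputeStatus yields an empty pool with one available slot.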
Operating system
macOS
Node version
20
Package version
1.25.0
Additional context
Code inspection found that AgentPoolService.onAgentSpawned() is called from dispatch paths, while AgentPoolService.onAgentSessionEnded() exists but does not appear to have call sites. Session cleanup paths such as dead-process cleanup mark agents idle and persist terminated session history, but the pool slot is not released automatically.
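One possible shape for a fix is sketched below. PoolSlotTracker is a simplified stand-in, not the real AgentPoolService, and cleanUpDeadSession and its markIdle callback are hypothetical. Only the method names onAgentSpawned and onAgentSessionEnded come from the code inspection above; their real signatures may differ.

```typescript
// Simplified stand-in for the pool's in-memory slot bookkeeping.
class PoolSlotTracker {
  private maxSize: number;
  private active: string[] = [];

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // Plays the role of onAgentSpawned(): claim a slot if one is free.
  onAgentSpawned(agentId: string): boolean {
    if (this.active.indexOf(agentId) !== -1) return true; // already holds a slot
    if (this.active.length >= this.maxSize) return false; // pool is full
    this.active.push(agentId);
    return true;
  }

  // Plays the role of onAgentSessionEnded(): release the slot.
  // Safe to call more than once for the same agent.
  onAgentSessionEnded(agentId: string): void {
    this.active = this.active.filter((id) => id !== agentId);
  }

  status(): { activeCount: number; availableSlots: number; activeAgentIds: string[] } {
    return {
      activeCount: this.active.length,
      availableSlots: this.maxSize - this.active.length,
      activeAgentIds: this.active.slice(),
    };
  }
}

// Hypothetical cleanup path for a session whose process is no longer alive.
function cleanUpDeadSession(
  pool: PoolSlotTracker,
  agentId: string,
  markIdle: (agentId: string) => void,
): void {
  markIdle(agentId); // behaviour described in the report: agent goes idle
  pool.onAgentSessionEnded(agentId); // the call that appears to be missing
}
```

Keeping the release idempotent means every termination path (normal exit, dead-process cleanup, manual stop) can call it without coordinating over which path fires first.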