afx: builder state.db row can vanish while Tower terminal + shellper + worktree survive → uncleanable SPLIT state #1101

@waleedkadous

Summary

A builder's state.db builders row can disappear while its runtime artifacts survive — the Tower terminal session (in global.db terminal_sessions), the shellper + PTY child processes, the .builders/<id>/ worktree, and the local + remote branches all remain. This leaves an uncleanable SPLIT state.

Observed

A bugfix builder whose PR had merged and issue closed:

  • state.db builders row: gone (all afx cleanup lookups return "not found").
  • global.db terminal_sessions: row still present (type=builder, with pid).
  • shellper + bash processes: still alive (cwd = the worktree).
  • worktree + local branch + origin/<branch>: all still present.
  • Tower UI: still lists the builder ALIVE ("Blocked: PR review", no owner).
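The SPLIT state above can be detected by comparing the two databases. Below is a minimal sketch, assuming the rows have already been read into memory. The row shapes (`BuilderRow`, `TerminalSessionRow`, and in particular the `builderId` field) and the function name `findSplitBuilders` are hypothetical; the real schemas in state.db and global.db may differ.

```typescript
// Hypothetical row shapes; the real schemas in state.db and global.db may differ.
interface BuilderRow {
  id: string;
}

interface TerminalSessionRow {
  id: string;
  type: string; // "builder" for builder terminals, per the observation above
  builderId: string; // assumed link back to the builder; not confirmed by the source
  pid: number | null;
}

interface SplitBuilder {
  builderId: string;
  sessionId: string;
  pid: number | null;
}

// Returns builder-type terminal sessions whose builder has no row in state.db.
function findSplitBuilders(
  builders: BuilderRow[],
  sessions: TerminalSessionRow[],
): SplitBuilder[] {
  const known = new Set(builders.map((b) => b.id));
  return sessions
    .filter((s) => s.type === "builder" && !known.has(s.builderId))
    .map((s) => ({ builderId: s.builderId, sessionId: s.id, pid: s.pid }));
}
```

Any entry returned here would correspond to a builder that the Tower UI still lists but that afx cleanup can no longer find.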

Why it matters

  1. Uncleanable: afx cleanup resolves a builder only via the state.builders row (cleanup.ts:196/235, fatal("not found")), and every teardown step is keyed off that row. No row → no sanctioned teardown.
  2. Routing desync (same root): with the row gone, lookupBuilderSpawningArchitect(sender) returns undefined, so resolveAgentInWorkspace falls through to the "non-builder sender → main first" branch, and the builder's afx send architect silently lands on main. The missing row is also why the UI shows "no owner" (owner = spawned_by_architect, which lives in that row). This is the same class of routing fault as #1094 ("afx send: detectCurrentBuilderId silently falls back to bare worktree name on state.db read failure → builder messages misroute to main").
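The routing fall-through in point 2 can be modeled in a few lines. This is a simplified sketch, not the actual resolveAgentInWorkspace code: `resolveLenient`, `resolveStrict`, the `Lookup` type, and the `senderIsBuilder` flag are all hypothetical names, and `lookup` merely stands in for lookupBuilderSpawningArchitect.

```typescript
// Stand-in for lookupBuilderSpawningArchitect: undefined means "no builders row".
type Lookup = (sender: string) => string | undefined;

type Resolution =
  | { kind: "routed"; target: string }
  | { kind: "error"; reason: string };

// Simplified model of the behavior described above: a missing row is
// indistinguishable from a non-builder sender, so the message goes to "main".
function resolveLenient(sender: string, lookup: Lookup): Resolution {
  const architect = lookup(sender);
  return { kind: "routed", target: architect ?? "main" };
}

// Stricter variant (an assumption, not the project's design): if the sender is
// known to be a builder, for example because it runs inside a .builders/<id>/
// worktree, a missing row is reported instead of silently routed to "main".
function resolveStrict(
  sender: string,
  senderIsBuilder: boolean,
  lookup: Lookup,
): Resolution {
  const architect = lookup(sender);
  if (architect !== undefined) return { kind: "routed", target: architect };
  if (senderIsBuilder) {
    return { kind: "error", reason: `no builders row for builder "${sender}"` };
  }
  return { kind: "routed", target: "main" };
}
```

The strict variant trades a silent misroute for a visible error, which would also surface the missing row much earlier.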

Ask

Find and fix whatever removes the builders row while the terminal/shellper/worktree persist (a partial cleanup, a race, or a premature row delete). The row should not be deleted until the runtime artifacts are torn down (or vice-versa — make teardown atomic). Companion issue covers a row-less cleanup escape hatch.
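One way to satisfy the ordering constraint in the ask is to delete the row last, and only after every runtime teardown step has succeeded. The sketch below is an illustration under assumptions, not the existing cleanup.ts logic: `TeardownStep`, `TeardownResult`, and `teardownBuilder` are hypothetical names, and the steps are synchronous for brevity where real ones would likely be async.

```typescript
// Hypothetical teardown step (kill terminal, stop shellper, remove worktree, ...).
interface TeardownStep {
  name: string;
  run: () => void; // throws on failure
}

interface TeardownResult {
  rowDeleted: boolean;
  completed: string[];
  failedStep?: string;
}

// Runs the runtime teardown steps in order and deletes the builders row only
// if all of them succeeded, so a failure leaves the row available for a retry.
function teardownBuilder(
  steps: TeardownStep[],
  deleteRow: () => void,
): TeardownResult {
  const completed: string[] = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step.name);
    } catch {
      // Stop here and keep the row: afx cleanup can still resolve the builder.
      return { rowDeleted: false, completed, failedStep: step.name };
    }
  }
  deleteRow();
  return { rowDeleted: true, completed };
}
```

With this ordering, a partial failure leaves a row that still points at surviving artifacts, rather than surviving artifacts with no row.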

Metadata

Labels: area/tower (Area: Tower server / agent farm CLI)
