Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
id: bugfix-1094
title: afx-send-detectcurrentbuilderi
protocol: bugfix
phase: pr
plan_phases: []
current_plan_phase: null
gates:
pr:
status: pending
requested_at: '2026-06-23T22:08:09.451Z'
iteration: 1
build_complete: false
history: []
started_at: '2026-06-23T21:45:58.428Z'
updated_at: '2026-06-23T22:08:09.451Z'
pr_ready_for_human: true
94 changes: 94 additions & 0 deletions codev/state/bugfix-1094_thread.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# bugfix-1094 — afx send detectCurrentBuilderId silent misroute

Issue #1094. Protocol: BUGFIX (strict). Branch: `builder/bugfix-1094`.

## Investigate

**Bug**: `detectCurrentBuilderId()` (`packages/codev/src/agent-farm/commands/send.ts`)
silently returns the bare worktree dir name (e.g. `bugfix-2461`) on three
unverifiable paths inside a *confirmed* `.builders/<id>/` context:

1. `!existsSync(dbPath)` → `return worktreeDirName` (line 73)
2. `catch { return worktreeDirName }` — DB open failure (line 78-80) ← the real incident (Node ABI mismatch)
3. no matching builder row → `return worktreeDirName` (line 97)

The bare name is non-canonical (`builder-bugfix-2461` is the real id), so
Tower's `lookupBuilderSpawningArchitect(sender)` returns `undefined`, and
`resolveAgentInWorkspace` drops to the "non-builder sender → main first"
branch (`tower-messages.ts:344-347`). Net: builder's `afx send architect`
silently lands on **main** instead of its spawning architect.

Violates global principle #1 "fail fast, NEVER implement fallbacks": a fatal
environmental fault (DB unreadable) is laundered into a plausible-looking
misroute that is very hard to diagnose. Related memory:
reference_hermes_node_shadow_misroute.

**Call graph**: `detectCurrentBuilderId()` is used in exactly one place —
`send()` line 222: `const from = detectCurrentBuilderId() ?? 'architect'`.
Plus the two test files. Blast radius is just `afx send`'s `from` identity.

## Fix plan

1. `detectCurrentBuilderId`: in a confirmed builder context, **throw** a typed,
actionable `BuilderIdResolutionError` on all three unverifiable paths instead
of returning a bare name. DB-open failure gets a message that names the
likely ABI-mismatch cause. Contract becomes: returns canonical id | null
(not a builder); throws when it's a builder but id can't be verified.
2. `send()`: wrap the `from` computation; on throw, `fatal(message)` so afx send
aborts loudly rather than misrouting.
3. Tower guard (defense-in-depth): in `resolveAgentInWorkspace`, when sender
looks like a builder id (`parseAgentName` / `…-<n>` shape) but
`lookupBuilderSpawningArchitect` returns `undefined`, log a warning before
the main-first fallback.
4. Tests: update bugfix-774 tests that encoded the bare-name fallback; add a
DB-open-failure regression test asserting it does NOT return a bare name.

Note: worktree had no node_modules — running `pnpm install` to build/test.

## Fix (done)

Implemented:
- `send.ts`: new `BuilderIdResolutionError` + `describeStateDbOpenFailure` (ABI-mismatch
hint). `detectCurrentBuilderId` now **throws** on all 3 unverifiable paths in a
confirmed builder context (missing db / open failure / no row) instead of returning
a bare name. Call site in `send()` wraps it → `fatal()` on throw.
- `tower-messages.ts`: `looksLikeBuilderId` + guard in `resolveAgentInWorkspace` — warns
before the main-first fallback when a builder-shaped sender has no row (and isn't a
registered architect). Defense-in-depth.

Reproduced bug (unopenable db → old code returned bare `bugfix-2461`). Verified fix
against REAL main state.db: worktree resolves to canonical `builder-bugfix-1094`
(happy path unchanged).

Tests:
- bugfix-774 test: the 2 fallback tests now assert throws; added DB-open-failure
regression + `describeStateDbOpenFailure` message tests.
- new bugfix-1094-tower-guard test: `looksLikeBuilderId` + warn-then-main-fallback.
- send.test.ts: was built on the old silent fallback (ran from worktree CWD, mocked
existsSync→false). Now chdir to tmpdir in beforeEach so `from` resolves to
'architect' deterministically; simplified `getExpectedFrom()`.

Full suite green: 167 files / 3347 tests pass, 0 fail. Net ~217 LOC tracked +
~110 new test file. Within BUGFIX scope.

## PR (gate: pr)

PR #1095 opened. CMAP (`--protocol bugfix --type pr`):
- codex = APPROVE (HIGH, no key issues)
- claude = APPROVE (HIGH, no key issues)
- gemini = NO VERDICT (broken agy review lane — burned session hunting for source
under sandbox restrictions; matches reference_agy_consult_review_broken #1032/1033)

No REQUEST_CHANGES, no defects surfaced. One non-blocking note (looksLikeBuilderId
could match a hypothetical architect name like `custom-42`) — already mitigated by the
`!entry.architects.has(sender)` guard + warning-only nature. No changes made.

Two pre-existing consult tooling gotchas hit (NOT my bug, candidates for follow-up issues):
1. `consult --type pr` needs explicit `--protocol bugfix` or it looks at top-level
`codev/consult-types/pr-review.md` (only `integration-review.md` lives there;
pr-review.md is per-protocol under `protocols/<p>/consult-types/`).
2. consult project auto-detect regex `/\.builders\/[^/]*?-?(\d+)-([^/]+)/` requires a
trailing slug after the issue number; worktree `bugfix-1094` (no slug) fails →
"Multiple projects found". Workaround: `--project-id 1094`.

Requesting pr gate via `porch done`. Waiting for human approval.
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
/**
* Issue #1094 — Tower-side defense-in-depth guard.
*
* When `resolveAgentInWorkspace` is asked to route 'architect' for a `sender`
* that LOOKS like a builder id (canonical `builder-<protocol>-<id>` or the bare
* worktree form `<protocol>-<id>`) but `lookupBuilderSpawningArchitect` returns
* `undefined` (no matching state.db row), affinity routing has been silently
* bypassed. The send still falls through to 'main' (unchanged behavior), but a
* warning is now emitted so the misroute is visible instead of silent.
*
* Also unit-tests the `looksLikeBuilderId` heuristic directly.
*/

import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
import type { WorkspaceTerminals } from '../servers/tower-types.js';

const { mockGetWorkspaceTerminals, mockLookupBuilderSpawningArchitect } = vi.hoisted(() => ({
mockGetWorkspaceTerminals: vi.fn<() => Map<string, WorkspaceTerminals>>(),
mockLookupBuilderSpawningArchitect: vi.fn<(id: string) => string | null | undefined>(),
}));

vi.mock('../servers/tower-terminals.js', () => ({
getWorkspaceTerminals: () => mockGetWorkspaceTerminals(),
}));

vi.mock('../state.js', () => ({
lookupBuilderSpawningArchitect: (id: string) => mockLookupBuilderSpawningArchitect(id),
}));

import { resolveTarget, isResolveError, looksLikeBuilderId } from '../servers/tower-messages.js';

const WS = '/home/user/project';

function mkEntry(architects: Record<string, string>, builders: Record<string, string> = {}): WorkspaceTerminals {
return {
architects: new Map(Object.entries(architects)),
builders: new Map(Object.entries(builders)),
shells: new Map(),
fileTabs: new Map(),
};
}

describe('looksLikeBuilderId — issue #1094', () => {
it('matches canonical and bare builder ids', () => {
expect(looksLikeBuilderId('builder-bugfix-2461')).toBe(true);
expect(looksLikeBuilderId('bugfix-2461')).toBe(true);
expect(looksLikeBuilderId('bugfix-2461-some-slug')).toBe(true);
expect(looksLikeBuilderId('spir-100')).toBe(true);
});

it('does not match architect senders', () => {
expect(looksLikeBuilderId('architect')).toBe(false);
expect(looksLikeBuilderId('arch')).toBe(false);
expect(looksLikeBuilderId('main')).toBe(false);
expect(looksLikeBuilderId('sibling')).toBe(false);
});
});

describe('Tower guard — non-canonical builder sender warns before main fallback', () => {
let warnSpy: ReturnType<typeof vi.spyOn>;

beforeEach(() => {
vi.clearAllMocks();
warnSpy = vi.spyOn(console, 'warn').mockImplementation(() => {});
});

afterEach(() => {
warnSpy.mockRestore();
});

it('warns when a builder-shaped sender has no state.db row, then routes to main', () => {
// Two architects → fast path skipped → lookup consulted. The bare,
// non-canonical sender id is exactly what the old send.ts fallback shipped.
mockGetWorkspaceTerminals.mockReturnValue(
new Map([[WS, mkEntry({ main: 'term-main', triage: 'term-triage' })]]),
);
mockLookupBuilderSpawningArchitect.mockReturnValue(undefined); // no matching row

const result = resolveTarget('architect', WS, 'bugfix-2461');
if (isResolveError(result)) throw new Error(`unexpected: ${result.message}`);
// Behavior unchanged: still falls through to main.
expect(result.terminalId).toBe('term-main');
// But the bypass is now visible.
expect(warnSpy).toHaveBeenCalledTimes(1);
expect(warnSpy.mock.calls[0][0]).toContain('bugfix-2461');
expect(warnSpy.mock.calls[0][0]).toMatch(/issue #1094/);
});

it('does NOT warn for a legitimate non-builder (architect) sender', () => {
mockGetWorkspaceTerminals.mockReturnValue(
new Map([[WS, mkEntry({ main: 'term-main', sibling: 'term-sibling' })]]),
);
mockLookupBuilderSpawningArchitect.mockReturnValue(undefined); // not a builder

const result = resolveTarget('architect', WS, 'architect');
if (isResolveError(result)) throw new Error(`unexpected: ${result.message}`);
expect(result.terminalId).toBe('term-main');
expect(warnSpy).not.toHaveBeenCalled();
});

it('does NOT warn when there is no sender at all', () => {
mockGetWorkspaceTerminals.mockReturnValue(
new Map([[WS, mkEntry({ main: 'term-main', sibling: 'term-sibling' })]]),
);

const result = resolveTarget('architect', WS);
if (isResolveError(result)) throw new Error(`unexpected: ${result.message}`);
expect(result.terminalId).toBe('term-main');
expect(warnSpy).not.toHaveBeenCalled();
});
});
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,11 @@ import { tmpdir } from 'node:os';
import { join } from 'node:path';
import Database from 'better-sqlite3';

import { detectCurrentBuilderId } from '../commands/send.js';
import {
detectCurrentBuilderId,
BuilderIdResolutionError,
describeStateDbOpenFailure,
} from '../commands/send.js';
import { LOCAL_SCHEMA } from '../db/schema.js';

function writeBuilderRow(dbPath: string, id: string, worktree: string): void {
Expand Down Expand Up @@ -77,24 +81,78 @@ describe('detectCurrentBuilderId — issue #774', () => {
expect(detectCurrentBuilderId()).toBe('builder-bugfix-1599');
});

it('falls back to worktree dir name when workspace state.db is missing', () => {
// Issue #1094: in a confirmed builder worktree, an inability to verify the
// canonical id is an ERROR — never a silent bare-name fallback (which would
// misroute `afx send architect` to 'main'). The three unverifiable paths
// below must all throw rather than return the bare worktree dir name.

it('throws (does not return bare name) when workspace state.db is missing', () => {
rmSync(join(workspacePath, '.agent-farm'), { recursive: true });
process.chdir(worktreePath);
expect(detectCurrentBuilderId()).toBe('bugfix-1599');
expect(() => detectCurrentBuilderId()).toThrow(BuilderIdResolutionError);
expect(() => detectCurrentBuilderId()).toThrow(/bugfix-1599/);
});

it('falls back to worktree dir name when no row matches', () => {
it('throws (does not return bare name) when no builder row matches', () => {
// Workspace DB exists but has no row for this worktree.
const wsDb = new Database(join(workspacePath, '.agent-farm', 'state.db'));
wsDb.prepare('DELETE FROM builders').run();
wsDb.close();

process.chdir(worktreePath);
expect(detectCurrentBuilderId()).toBe('bugfix-1599');
expect(() => detectCurrentBuilderId()).toThrow(BuilderIdResolutionError);
expect(() => detectCurrentBuilderId()).toThrow(/no matching builder row/);
});

it('throws (does not return bare name) when state.db cannot be opened — issue #1094', () => {
// The real incident: a Node ABI mismatch made `new Database()` throw, and
// the old `catch { return worktreeDirName }` swallowed it, shipping the
// bare name `bugfix-1599`. Simulate an unopenable DB by replacing the file
// with a directory at the same path (existsSync passes; open throws).
rmSync(join(workspacePath, '.agent-farm'), { recursive: true });
mkdirSync(join(workspacePath, '.agent-farm', 'state.db'), { recursive: true });

process.chdir(worktreePath);

// Must NOT silently return the bare worktree directory name.
let returned: string | null | undefined;
try {
returned = detectCurrentBuilderId();
} catch (err) {
expect(err).toBeInstanceOf(BuilderIdResolutionError);
expect((err as Error).message).toContain('bugfix-1599');
expect((err as Error).message).not.toBe('bugfix-1599');
expect((err as Error).message).toMatch(/issue #1094/);
return;
}
// Reaching here means it returned instead of throwing — fail loudly.
expect(returned).toBeUndefined(); // never reached if it threw; asserts no silent bare-name
throw new Error(`Expected a throw, got a silent return of '${returned}'`);
});

it('returns null when not in a builder worktree', () => {
process.chdir(workspacePath);
expect(detectCurrentBuilderId()).toBeNull();
});
});

describe('describeStateDbOpenFailure — issue #1094 actionable messages', () => {
it('names the better-sqlite3 ABI mismatch and points at reinstalling codev', () => {
const abiErr = new Error(
"The module was compiled against a different Node.js version using NODE_MODULE_VERSION 147. " +
"This version of Node.js requires NODE_MODULE_VERSION 127.",
);
const msg = describeStateDbOpenFailure('/ws/.agent-farm/state.db', 'bugfix-2461', abiErr);
expect(msg).toMatch(/ABI mismatch/i);
expect(msg).toMatch(/reinstall codev/i);
expect(msg).toContain('bugfix-2461');
expect(msg).toMatch(/issue #1094/);
});

it('gives a generic hint for non-ABI open failures', () => {
const msg = describeStateDbOpenFailure('/ws/.agent-farm/state.db', 'bugfix-2461', new Error('disk I/O error'));
expect(msg).toMatch(/corruption|permissions|stale lock/i);
expect(msg).not.toMatch(/ABI mismatch/i);
expect(msg).toContain('bugfix-2461');
});
});
32 changes: 21 additions & 11 deletions packages/codev/src/agent-farm/__tests__/send.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
* Spec 0110: Messaging Infrastructure — Phase 4
*/

import { describe, it, expect, beforeEach, vi } from 'vitest';
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';

// ============================================================================
// Mocks
Expand Down Expand Up @@ -66,6 +66,7 @@ vi.mock('node:fs', async () => {
};
});

import { tmpdir } from 'node:os';
import { send, detectWorkspaceRoot } from '../commands/send.js';
import { fatal } from '../utils/logger.js';

Expand All @@ -74,18 +75,17 @@ import { fatal } from '../utils/logger.js';
// ============================================================================

/**
* Detect the 'from' value the send command will use based on CWD.
* If CWD is inside .builders/<dir>/, `from` will be the builder's canonical ID
* (from state.db) or the dir name. Otherwise 'architect'.
* The 'from' sender identity these tests expect. The suite runs from a CWD
* outside any `.builders/<id>/` worktree (see beforeEach), so
* detectCurrentBuilderId() returns null and send() uses 'architect'.
*
* Builder-id detection (and its #1094 fail-loud behavior when state.db is
* unreadable inside a worktree) is covered by bugfix-774 / bugfix-1094 tests;
* these tests deliberately isolate from it so they exercise send()'s other
* behavior without depending on the physical CWD of the test runner.
*/
function getExpectedFrom(): string {
const cwd = process.cwd();
const match = cwd.match(/\.builders\/([^/]+)/);
if (!match) return 'architect';
// In tests, loadState returns our mock with builders that have worktree paths.
// The worktree dir match will look for match[1] in state.db.
// Our defaultState doesn't match the actual CWD dir, so it falls back to dir name.
return match[1]; // e.g., '0110'
return 'architect';
}

function defaultState() {
Expand All @@ -105,13 +105,23 @@ function defaultState() {
// ============================================================================

describe('send command', () => {
const origCwd = process.cwd();

beforeEach(() => {
// Run from outside any `.builders/<id>/` worktree so the sender identity
// resolves deterministically to 'architect' regardless of where the test
// runner physically lives (it may itself run inside a builder worktree).
process.chdir(tmpdir());
vi.clearAllMocks();
mockIsRunning.mockResolvedValue(true);
mockSendMessage.mockResolvedValue({ ok: true, resolvedTo: 'builder-spir-109' });
mockLoadState.mockReturnValue(defaultState());
});

afterEach(() => {
process.chdir(origCwd);
});

describe('single target send', () => {
it('sends to a builder by full name', async () => {
await send({ builder: 'builder-spir-109', message: 'Hello builder' });
Expand Down
Loading
Loading