fix(T12009): layout-proof CLI entry resolution for gateway auto-start#1097
Merged
Conversation
… + surface spawn failure reason resolveCliEntryPath() used join(dirname(import.meta.url), '..x5', 'dist/cli/index.js') — correct in the source tree (src/cli/lib/ is 3 dirs deep inside the package) but the esbuild bundle inlines ALL source into a single file at <pkg>/dist/cli/index.js. From inside the bundle import.meta.url IS the entry file, so 5 '..' escape the package and land at the npm prefix lib/ directory — producing MODULE_NOT_FOUND on every gateway child spawn. Fix: replace fixed-depth '..' arithmetic with a walk up the directory tree to the nearest ancestor package.json whose name is "@cleocode/cleo", then return <pkgRoot>/dist/cli/index.js. realpathSync resolves symlinked global-bin installs before the walk. Descriptive error surfaces when dist is absent or no matching package is found. Observability: runCockpit previously discarded spawnResult.reason. Now emits " auto-start: <reason>" and the last line of gateway.err when the gateway stays unreachable — that one line would have made this a 1-minute diagnosis. Regression tests: 4 new tests in gateway-auto-start.test.ts cover (a) bundled dist/cli/ layout, (b) symlinked-bin, (c) missing dist → throw, (d) no @cleocode/cleo package → throw. All 26 tests pass. Live proof: bare cleo in a TTY boots the gateway, renders "CLEO Cockpit — Kanban (498 tasks)", port 7777 confirmed accepting. DHQ-096/097 logged to dogfood-harness-question-ledger.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Root cause
resolveCliEntryPath()ingateway-auto-start.tscomputedjoin(dirname(import.meta.url), '..', '..', '..', '..', '..')— valid in the source tree wheresrc/cli/lib/is 3 directories deep inside the package, but the esbuild bundle inlines ALL source into a single file at<pkg>/dist/cli/index.js. From inside the bundle,import.meta.urlIS that entry file, sodirnameis<pkg>/dist/cli/(only 2 hops to the package root). Five..from there escapes the package entirely, landing at the npm prefixlib/directory.Proof from
~/.local/state/cleo/gateway.erron v2026.6.15:A secondary observability gap:
runCockpitdiscardedspawnResult.reasonentirely, so this error was invisible even when looking at the TUI output.Changes
packages/cleo/src/cli/lib/gateway-auto-start.tsresolveCliEntryPath()to walk up the directory tree looking for the nearest ancestorpackage.jsonwithname === "@cleocode/cleo", then return<pkgRoot>/dist/cli/index.js. Throws a descriptive error when the bundle is missing or the package root is not found.realpathSyncon the start path so symlinked global-bin installs (.npm-global/bin/cleo → ../lib/node_modules/@cleocode/cleo/bin/cleo.js) resolve correctly before the walk begins.resolveCliEntryPathaccept an injectablestartUrlparameter (defaults toimport.meta.url) for testability.packages/cleo/src/cli/lib/tui/cockpit.tsauto-start: <reason>to the sink.gateway.err: read<logDir>/gateway.errand emit its last line — that one line would have made T12009 a 1-minute diagnosis.packages/cleo/src/cli/lib/__tests__/gateway-auto-start.test.tsFour new regression tests for
resolveCliEntryPath:dist/cli/layout — resolves correctlydist/cli/index.js— throws descriptive error@cleocode/cleopackage.json found — throws descriptive errorAll 26 tests pass (22 existing + 4 new).
docs/plan/dogfood-harness-question-ledger.mdcleoauto-start dead in packaged installs — FIXED in this PRcleo loginhangs after successful OAuth — OPEN (T12010).changeset/t12009-gateway-autostart-entry-path.mdChangeset documenting the fix.
Live proof
Ran from worktree (
packages/cleo/bin/cleo.js, the fixed build) on this machine:Port 7777 confirmed accepting (
LISTEN 0 511 127.0.0.1:7777). No "not reachable" message. Cleanup confirmed — port 7777 free after test.Meta-lesson
DHQ-086 class:
T11980batteries-included AC was merged without a packaged-install e2e of the new auto-start path. The regression only surfaces in production (npm install → actual dist layout). Future batteries-included spawn features need a packaged-install smoke test as part of the acceptance criteria.🤖 Generated with Claude Code