fix: persist XCTest runner port per device — stop orphaning xcodebuild processes by qwertey6 · Pull Request #3167 · mobile-dev-inc/Maestro

qwertey6 · 2026-04-10T06:08:53Z

Proposed changes

Stop orphaning xcodebuild processes between Maestro runs. Each maestro test invocation was picking a random port from 7001-7128, even when an XCTest runner from a previous run was still listening on a different port. The new run couldn't find a runner on its new port, so it spawned a fresh xcodebuild + runner pair. The old runner got killed by simctl (XCTest only allows one runner per app), but the old xcodebuild process was orphaned and stayed alive forever.

Over many runs, this accumulated orphaned xcodebuild processes that held simulator resources, eventually causing connection failures, timeouts, and the symptoms reported in #1299, #2932, and various flaky-test reports.

Impact

- Saves ~5 seconds per warm run by reusing the existing runner (the isChannelAlive() short-circuit in restartXCTestRunner finally works the way it was meant to)
- No more orphaned xcodebuild processes accumulating in the process table
- Cross-platform: file-based store, no lsof/process scanning, works on Windows
- Builds on the runner-reuse work in #3139 ("perf: 4x faster startup: skip unnecessary driver reinstall, cache build products"), which added the isChannelAlive() skip but could not benefit from it because the port changed on every run

Root cause

TestCommand.selectPort() returns a random port from 7001..7128 for every invocation. The previous run's runner is still listening on its old port (e.g., 7106). The new run picks a new port (e.g., 7117) and:

1. Calls restartXCTestRunner() for port 7117
2. isChannelAlive() checks port 7117 → returns false (old runner is on 7106)
3. Spawns a new xcodebuild test-without-building on port 7117
4. simctl kills the old runner (only one XCTest runner per app at a time)
5. The old xcodebuild is now waiting for its dead runner, orphaned forever

After N runs, there are N orphaned xcodebuild processes. Verified locally by running ps aux | grep xcodebuild after a few maestro test invocations.

Fix

Persist the XCTest runner port per device to ~/.maestro/xctest-ports/<deviceId>. On the next invocation, read the saved port and probe it with isPortListening() (a short-timeout Socket.connect()). If something is listening, reuse the port — the existing isChannelAlive() check in restartXCTestRunner() then short-circuits the entire reinstall path. If nothing is listening, the saved port is stale; pick a new random port and update the file.
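The reuse-or-repick decision can be sketched roughly as follows. This is an illustrative Java sketch, not the code in this PR: the class name `PortReuseSketch`, the `selectPort` signature, and the timeout parameter are assumptions; only the probe idea (a short-timeout `Socket.connect()`) and the 7001..7128 range come from the description above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.OptionalInt;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of the probe-then-reuse logic; names are illustrative.
final class PortReuseSketch {
    static final int MIN_PORT = 7001;
    static final int MAX_PORT = 7128;

    /** Returns true if something accepts a TCP connection on localhost:port within timeoutMs. */
    static boolean isPortListening(int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("localhost", port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: treat the saved port as stale.
            return false;
        }
    }

    /** Reuses the saved port if a listener answers on it, otherwise picks a new random port. */
    static int selectPort(OptionalInt savedPort, int timeoutMs) {
        if (savedPort.isPresent() && isPortListening(savedPort.getAsInt(), timeoutMs)) {
            return savedPort.getAsInt();
        }
        return ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
    }

    public static void main(String[] args) {
        // With no saved port, this always falls through to a random pick.
        System.out.println(selectPort(OptionalInt.empty(), 200));
    }
}
```

The sketch does not check whether the freshly picked random port is free, and it leaves writing the chosen port back to the per-device file to the caller.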

Verification

Run	State	Time	Port	Processes
1 (cold)	Fresh start	13s	7106	2 (xcodebuild + runner)
2 (warm)	Reuse	8s	7106 (same)	2 (no orphans)
3 (warm)	Reuse	7s	7106 (same)	2 (no orphans)
4 (after pkill)	Recover	11s	7009 (new)	2 (fresh)

Parallel runs on two simulators each get their own port file and don't interfere.

Why not pick the same port deterministically (e.g., hash deviceId → port)?

Considered. Two issues:

- Hash collisions across devices on the same machine
- The user might have the port in use by something else

The persistent file is more flexible: the port is "sticky" once chosen, but each device picks an actually-available port at first start.
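To put a rough number on the collision concern: with 128 ports (7001..7128) and a hash that spreads device IDs uniformly, this is a birthday-problem calculation. The sketch below is illustrative only; `hashedPort` is a hypothetical mapping that this PR does not use, and the uniform-hash assumption is an idealization.

```java
// Hypothetical sketch: what a deterministic deviceId -> port mapping would look like,
// and how likely two devices are to land on the same port.
final class PortHashCollisionSketch {
    static final int MIN_PORT = 7001;
    static final int PORT_COUNT = 128; // 7001..7128 inclusive

    /** Illustrative deterministic mapping (not what the PR implements). */
    static int hashedPort(String deviceId) {
        return MIN_PORT + Math.floorMod(deviceId.hashCode(), PORT_COUNT);
    }

    /** Probability that at least two of `devices` uniformly hashed devices share a port. */
    static double collisionProbability(int devices) {
        if (devices > PORT_COUNT) {
            return 1.0; // pigeonhole: more devices than ports
        }
        double allDistinct = 1.0;
        for (int i = 0; i < devices; i++) {
            allDistinct *= (double) (PORT_COUNT - i) / PORT_COUNT;
        }
        return 1.0 - allDistinct;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 4, 8}) {
            System.out.printf("%d devices: %.4f%n", n, collisionProbability(n));
        }
    }
}
```

Under these assumptions two devices collide with probability 1/128, and four devices with probability of about 4.6%, which is high enough to matter on machines that run several simulators in parallel.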

Why not scan running processes to find the port (lsof)?

- Not cross-platform (Windows has no lsof)
- Fragile parsing
- Maestro CLI runs in sandboxes that may restrict process inspection

Cross-platform safety

XCTestPortStore uses java.io.File only — no Unix-specific calls. Works on Windows, Linux, and macOS.
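For readers who want a feel for what a file-per-device store involves, here is a minimal sketch. It is not the XCTestPortStore from this PR: the class name `PortFileStoreSketch` and its `load`/`save` methods are assumptions, and it uses `java.nio.file.Files` for reading and writing as a convenience, which is equally portable across Windows, Linux, and macOS.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.OptionalInt;

// Hypothetical sketch of a per-device port store: one small text file per device ID.
final class PortFileStoreSketch {
    private final File dir; // e.g. ~/.maestro/xctest-ports

    PortFileStoreSketch(File dir) {
        this.dir = dir;
    }

    /** Reads the saved port for a device; empty if the file is missing or unparsable. */
    OptionalInt load(String deviceId) {
        File file = new File(dir, deviceId);
        if (!file.isFile()) {
            return OptionalInt.empty();
        }
        try {
            String text = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8).trim();
            return OptionalInt.of(Integer.parseInt(text));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty(); // treat a corrupt file like a missing one
        }
    }

    /** Writes (or overwrites) the saved port for a device. */
    void save(String deviceId, int port) throws IOException {
        dir.mkdirs();
        if (!dir.isDirectory()) {
            throw new IOException("Cannot create port store directory: " + dir);
        }
        byte[] bytes = Integer.toString(port).getBytes(StandardCharsets.UTF_8);
        Files.write(new File(dir, deviceId).toPath(), bytes);
    }
}
```

Treating an unreadable or malformed file as "no saved port" keeps the failure mode identical to a cold start, so a corrupt file costs one fresh runner launch at most.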

Depends on: #3166 → #3165 → #3141 → #3140 → #3139 → #3138 (stacked PR chain)

Issues fixed

Builds on #3139 to make the driver-reuse optimization actually work across CLI invocations. Likely contributes to fixing #1299 and the broader class of "iOS driver hangs" reports.

iOS simulators share the host's localhost, causing port collisions when multiple Maestro processes target different sims simultaneously. Session tracking was per-platform, so two processes on different devices would interfere with each other's sessions.

Changes:
- Per-device session tracking: SessionStore keys are now "{platform}_{deviceId}_{sessionId}" instead of "{platform}_{sessionId}"
- Add --driver-host-port CLI flag for explicit XCTest server port
- Auto-select available ports with isPortAvailable() check
- Refactor SessionStore from singleton to injectable class (DI)
- Add shouldCloseSession(platform, deviceId) for per-device shutdown instead of global activeSessions().isEmpty()
- Add cross-process file locking to KeyValueStore (~/.maestro/sessions)
- Append PID to debug log directory to prevent parallel race
- Enable useJUnitPlatform() in maestro-cli (was missing)
- Add SessionStoreTest with 8 tests covering isolation and lifecycle

Verified: 3 iOS simulators + Android emulator running simultaneously, all passing. Both --driver-host-port (explicit) and auto-port-selection work correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Default --reinstall-driver to false: reuse a healthy running driver instead of killing and reinstalling on every run (~40s saved on iOS)
- XCTestDriverClient checks isChannelAlive() before reinstalling; if the user explicitly passes --reinstall-driver, honor it
- Cache extracted iOS build products per-device in ~/.maestro/build-products/<deviceId>/ with SHA-256 hash validation: skips extraction when source matches cache, re-extracts on upgrade
- Reduce XCTest status check HTTP read timeout from 100s to 3s
- Remove Thread.sleep(1000) heartbeat delay hack (no longer needed with per-device session tracking)

Single device: ~52s → ~10-12s. Three devices parallel: ~54s → ~18s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
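The SHA-256 cache validation mentioned in this commit follows a common pattern: hash the source, compare with the hash stored next to the cache, and extract only on mismatch. A minimal sketch of that pattern is below; `BuildCacheHashSketch`, `sha256Hex`, and `needsExtraction` are illustrative names, not Maestro's actual API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of hash-validated caching; names are illustrative.
final class BuildCacheHashSketch {
    /** Lower-case hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when there is no cached hash or the source no longer matches it. */
    static boolean needsExtraction(byte[] sourceArchive, String cachedHash) {
        return cachedHash == null || !cachedHash.equals(sha256Hex(sourceArchive));
    }
}
```

A caller would store the returned hash alongside the extracted products after a successful extraction, so an upgraded archive (different bytes, different hash) triggers re-extraction on the next run.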

- iOS XCTest runner: add isVersionMatch() to XCTestInstaller interface. LocalXCTestInstaller compares SHA-256 hash of build products against a .running-hash marker written at startup. restartXCTestRunner now checks both isChannelAlive() AND isVersionMatch(), so stale runners from a previous Maestro version are replaced automatically.
- Android driver: add isDriverVersionCurrent() that hashes the bundled maestro-app.apk and maestro-server.apk, compares against stored hash in ~/.maestro/android-driver-hash. On mismatch, APKs are reinstalled even when reinstallDriver=false.
- App binary cache (clearAppState): getCachedAppBinary now compares Info.plist of cached vs installed app. Stale cache from app updates is detected and refreshed before reinstall. Per-device cache dirs (~/.maestro/app-cache/<deviceId>/) prevent parallel races.
- Add XCTestDriverClientTest (4 tests) and LocalSimulatorUtilsTest (3 tests) covering version mismatch, reuse, and cache behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

On iOS, waitForAppToSettle has two tiers: a server-side screenshot hash check (Tier 1, hardcoded 3000ms) and a client-side hierarchy comparison fallback (Tier 2). The per-command waitToSettleTimeoutMs config only controlled Tier 2, so even waitToSettleTimeoutMs: 100 would still burn up to 3 seconds in Tier 1.

Fix: use waitToSettleTimeoutMs as the total settle budget. Tier 1 runs with this timeout, and any remaining time goes to Tier 2:
- swipe with waitToSettleTimeoutMs: 500 → capped at 500ms total
- default (no config) → unchanged 3000ms behavior

Wikipedia e2e flow with tuned timeouts: 25s vs 53s baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
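The budget split described in this commit amounts to simple arithmetic: Tier 1 gets the whole budget as its timeout, and Tier 2 gets whatever Tier 1 did not use. The sketch below only illustrates that arithmetic; the class and method names are hypothetical, and how Maestro actually measures Tier 1's elapsed time is not shown in this PR.

```java
// Hypothetical sketch of a shared settle budget across two tiers.
final class SettleBudgetSketch {
    static final long DEFAULT_SETTLE_TIMEOUT_MS = 3000;

    /** Tier 1 runs with the configured budget, or the 3000ms default when none is set. */
    static long tier1TimeoutMs(Long configuredTimeoutMs) {
        return configuredTimeoutMs != null ? configuredTimeoutMs : DEFAULT_SETTLE_TIMEOUT_MS;
    }

    /** Tier 2 gets the unused remainder of the budget, never a negative value. */
    static long tier2TimeoutMs(long totalBudgetMs, long tier1ElapsedMs) {
        return Math.max(0, totalBudgetMs - tier1ElapsedMs);
    }

    public static void main(String[] args) {
        long budget = tier1TimeoutMs(500L);
        // If Tier 1 gave up after 200ms, Tier 2 may use the remaining 300ms.
        System.out.println(tier2TimeoutMs(budget, 200));
    }
}
```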

On iOS, when a React Native Pressable has accessibilityLabel set, the child Text content is collapsed into the parent's accessibility label. The element's title and value are empty, so Maestro's `text` attribute was always empty for these elements, which made `tapOn: "<text>"` fail to find buttons that are clearly visible to users.

The element IS reachable via `tapOn { label: ... }` or by regex against accessibilityText, but those are unintuitive workarounds. Users see text on screen and expect `tapOn: "that text"` to work; that's the entire point of the selector.

Fix: in mapViewHierarchy, fall back to element.label (the iOS accessibility label) when both title and value are empty. accessibilityText still uses element.label as its canonical source, so the existing Filters.textMatches accessibilityText fallback continues to work.

This also indirectly fixes the "tap doesn't fire onPress" symptom: when matching by accessibilityText regex, Maestro might select a parent View wrapping the Pressable, leading to coordinate taps in the wrong area. With text populated on the Pressable itself, normal element ranking picks the correct deepest match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
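The fallback rule in this commit can be illustrated with a few lines. This is a sketch only: `resolveText` is a hypothetical helper, and the precedence of title over value is an assumption, since the commit message only states that label is used when both are empty.

```java
// Hypothetical sketch of the text fallback; title-before-value ordering is assumed.
final class TextFallbackSketch {
    static String resolveText(String title, String value, String label) {
        if (title != null && !title.isEmpty()) {
            return title;
        }
        if (value != null && !value.isEmpty()) {
            return value;
        }
        // Both title and value are empty: fall back to the accessibility label.
        return label == null ? "" : label;
    }
}
```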

The XCTest runner sometimes binds its HTTP server to ::1 (IPv6) only. Maestro CLI was hardcoded to connect via 127.0.0.1 (an IPv4 literal), which cannot reach an IPv6-only socket. Result: every HTTP call fails with "Connection refused" even though the runner is alive and curl can reach it via localhost.

Fix: replace 127.0.0.1 with localhost in three places:
- MaestroSessionManager.defaultXctestHost
- LocalXCTestInstaller constructor default
- LocalXCTestInstaller.xcTestDriverStatusCheck (was hardcoded)

okhttp's default Dns resolver returns all addresses for localhost (both 127.0.0.1 and ::1) and tries them in order on connection failure, so this works regardless of which address family the runner binds to.

This is the same root cause as mobile-dev-inc#1299 (open since July 2023).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
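The difference between the IPv4 literal and the hostname is easy to observe with the JDK resolver, which is what OkHttp's system DNS is generally understood to delegate to. The sketch below just lists the candidate addresses for each; `LocalhostResolutionSketch` is an illustrative name, and the exact addresses returned for localhost depend on the machine's hosts configuration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch: compare what a client can try for "127.0.0.1" versus "localhost".
final class LocalhostResolutionSketch {
    /** All addresses the JDK resolver returns for the given host. */
    static InetAddress[] candidates(String host) throws UnknownHostException {
        return InetAddress.getAllByName(host);
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String host : new String[] {"127.0.0.1", "localhost"}) {
            System.out.println(host + ":");
            for (InetAddress address : candidates(host)) {
                // An IP literal yields exactly one address; a hostname may yield several.
                System.out.println("  " + address.getHostAddress());
            }
        }
    }
}
```

On a typical dual-stack machine the localhost entry lists both an IPv4 and an IPv6 loopback address, which is what lets a client fall through to ::1 when the runner is bound there.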

Each Maestro CLI invocation was picking a new random port for the XCTest server, even when a runner from a previous run was still listening on a different port. The new run couldn't find a runner on its new port, so it spawned a fresh xcodebuild + runner pair. The old runner got killed by simctl (XCTest only allows one runner per app), but the old xcodebuild process was orphaned and stayed alive forever.

Over many runs, this accumulated orphaned xcodebuild processes that held simulator resources, eventually causing connection failures and timeouts.

Fix: persist the XCTest runner port to ~/.maestro/xctest-ports/<deviceId> after a successful start. On the next invocation, read the saved port and probe it with isPortListening(). If something is listening, reuse the port; isChannelAlive() (one level up) will then short-circuit the entire reinstall path. If nothing is listening, the saved port is stale; pick a new random port and update the file.

XCTestPortStore is a small file-based per-device store. Cross-platform (no lsof/process scanning), so works on Windows.

Verified locally with 4 sequential runs on the same simulator:
- Run 1 (cold): 13s, port 7106 saved
- Run 2 (warm, reuse): 8s, same port, 2 processes
- Run 3 (warm, reuse): 7s, same port, 2 processes
- Run 4 (after pkill): 11s, new port 7009, fresh runner

Parallel runs on two simulators get independent port files and don't interfere with each other.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

qwertey6 and others added 7 commits April 4, 2026 15:37

qwertey6 mentioned this pull request Apr 13, 2026

fix: use localhost (not 127.0.0.1) for XCTest server connection (#1299) #3166

Open

qwertey6 wants to merge 7 commits into mobile-dev-inc:main from ReverentPeer:pr/7-persist-xctest-port
