fix: persist XCTest runner port per device — stop orphaning xcodebuild processes#3167
Open
qwertey6 wants to merge 7 commits into mobile-dev-inc:main
iOS simulators share the host's localhost, causing port collisions when
multiple Maestro processes target different sims simultaneously. Session
tracking was per-platform, so two processes on different devices would
interfere with each other's sessions.
Changes:
- Per-device session tracking: SessionStore keys are now
"{platform}_{deviceId}_{sessionId}" instead of "{platform}_{sessionId}"
- Add --driver-host-port CLI flag for explicit XCTest server port
- Auto-select available ports with isPortAvailable() check
- Refactor SessionStore from singleton to injectable class (DI)
- Add shouldCloseSession(platform, deviceId) for per-device shutdown
instead of global activeSessions().isEmpty()
- Add cross-process file locking to KeyValueStore (~/.maestro/sessions)
- Append PID to debug log directory to prevent parallel race
- Enable useJUnitPlatform() in maestro-cli (was missing)
- Add SessionStoreTest with 8 tests covering isolation and lifecycle
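The per-device key layout and the shutdown check listed above can be sketched as follows. This is a minimal, in-memory illustration, not the PR's SessionStore: the class name PerDeviceSessionStore and the open/close methods are hypothetical, and only the key format and the shouldCloseSession(platform, deviceId) signature come from the change list.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a per-device session store.
final class PerDeviceSessionStore {
    private final Set<String> activeKeys = ConcurrentHashMap.newKeySet();

    // Key layout from the change list: "{platform}_{deviceId}_{sessionId}".
    static String key(String platform, String deviceId, String sessionId) {
        return platform + "_" + deviceId + "_" + sessionId;
    }

    void open(String platform, String deviceId, String sessionId) {
        activeKeys.add(key(platform, deviceId, sessionId));
    }

    void close(String platform, String deviceId, String sessionId) {
        activeKeys.remove(key(platform, deviceId, sessionId));
    }

    // True when no session remains for this platform/device pair, even if
    // other devices still have open sessions. Assumes one device ID is never
    // an underscore-delimited prefix of another (e.g. "A" and "A_B").
    boolean shouldCloseSession(String platform, String deviceId) {
        String prefix = platform + "_" + deviceId + "_";
        return activeKeys.stream().noneMatch(k -> k.startsWith(prefix));
    }
}
```

With two simulators open, closing the last session on one device makes shouldCloseSession return true for that device only, which is the isolation the global activeSessions().isEmpty() check could not provide.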
Verified: 3 iOS simulators + Android emulator running simultaneously,
all passing. Both --driver-host-port (explicit) and auto-port-selection
work correctly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
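The commit does not show how isPortAvailable() is implemented. One common approach, sketched here as an assumption, is to try binding a listening socket on loopback and treat a bind failure as "in use". The class name PortProbe and the firstAvailable helper are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Hypothetical helper; the bind-based probe is an assumed technique.
final class PortProbe {
    // A port counts as available if a listening socket can be bound to it
    // on the loopback address. The socket is closed again immediately.
    static boolean isPortAvailable(int port) {
        try (ServerSocket socket = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Returns the first available port in [from, to], or -1 if none is free.
    static int firstAvailable(int from, int to) {
        for (int port = from; port <= to; port++) {
            if (isPortAvailable(port)) {
                return port;
            }
        }
        return -1;
    }
}
```

A probe like this is inherently racy: another process can take the port between the check and the real bind, so callers still need to handle a failed start.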
- Default --reinstall-driver to false: reuse a healthy running driver
instead of killing and reinstalling on every run (~40s saved on iOS)
- XCTestDriverClient checks isChannelAlive() before reinstalling; if the
user explicitly passes --reinstall-driver, honor it
- Cache extracted iOS build products per-device in
~/.maestro/build-products/<deviceId>/ with SHA-256 hash validation:
skips extraction when source matches cache, re-extracts on upgrade
- Reduce XCTest status check HTTP read timeout from 100s to 3s
- Remove Thread.sleep(1000) heartbeat delay hack (no longer needed with
per-device session tracking)
Single device: ~52s → ~10-12s. Three devices parallel: ~54s → ~18s.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
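The hash-validated cache described above can be illustrated with a small sketch. It assumes the source hash is stored in a marker file next to the cached products; the class name BuildProductCache, the marker name .source-hash, and the use of a plain file copy in place of real archive extraction are all hypothetical simplifications.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a SHA-256 validated per-device cache.
final class BuildProductCache {
    // Hex-encoded SHA-256 of the file contents.
    static String sha256Hex(Path file) throws IOException {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(Files.readAllBytes(file));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the cache was (re)populated, false on a cache hit.
    static boolean ensureExtracted(Path source, Path cacheDir) throws IOException {
        Path marker = cacheDir.resolve(".source-hash"); // assumed marker name
        String current = sha256Hex(source);
        if (Files.exists(marker) && Files.readString(marker).trim().equals(current)) {
            return false; // source unchanged: skip extraction
        }
        Files.createDirectories(cacheDir);
        // Stand-in for archive extraction: the sketch only copies the file.
        Files.copy(source, cacheDir.resolve(source.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        Files.writeString(marker, current);
        return true;
    }
}
```

Calling ensureExtracted twice with an unchanged source populates the cache once and skips the second time; changing the source (as an upgrade would) triggers a refresh.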
- iOS XCTest runner: add isVersionMatch() to XCTestInstaller interface.
LocalXCTestInstaller compares SHA-256 hash of build products against a
.running-hash marker written at startup. restartXCTestRunner now checks
both isChannelAlive() AND isVersionMatch(), so stale runners from a
previous Maestro version are replaced automatically.
- Android driver: add isDriverVersionCurrent() that hashes the bundled
maestro-app.apk and maestro-server.apk, compares against stored hash in
~/.maestro/android-driver-hash. On mismatch, APKs are reinstalled even
when reinstallDriver=false.
- App binary cache (clearAppState): getCachedAppBinary now compares
Info.plist of cached vs installed app. Stale cache from app updates is
detected and refreshed before reinstall. Per-device cache dirs
(~/.maestro/app-cache/<deviceId>/) prevent parallel races.
- Add XCTestDriverClientTest (4 tests) and LocalSimulatorUtilsTest
(3 tests) covering version mismatch, reuse, and cache behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
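Taken together with the earlier --reinstall-driver change, the reuse decision reduces to a three-input rule. The sketch below states that rule as a pure function; the class and method names are hypothetical, and the real restartXCTestRunner may order or combine these checks differently.

```java
// Hypothetical condensation of the runner-reuse decision.
final class RunnerReuse {
    // Reuse the running driver only when the user did not force a reinstall,
    // the channel answers, and the runner was built from the current version.
    static boolean canReuse(boolean reinstallRequested, boolean channelAlive, boolean versionMatch) {
        if (reinstallRequested) {
            return false; // an explicit --reinstall-driver is always honored
        }
        return channelAlive && versionMatch;
    }
}
```

Under this rule, a live runner left over from an older Maestro version (channelAlive true, versionMatch false) is replaced rather than reused.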
On iOS, waitForAppToSettle has two tiers: a server-side screenshot hash
check (Tier 1, hardcoded 3000ms) and a client-side hierarchy comparison
fallback (Tier 2). The per-command waitToSettleTimeoutMs config only
controlled Tier 2, so even waitToSettleTimeoutMs: 100 would still burn
up to 3 seconds in Tier 1.
Fix: use waitToSettleTimeoutMs as the total settle budget. Tier 1 runs
with this timeout, and any remaining time goes to Tier 2:
- swipe with waitToSettleTimeoutMs: 500 → capped at 500ms total
- default (no config) → unchanged 3000ms behavior
Wikipedia e2e flow with tuned timeouts: 25s vs 53s baseline.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
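The budget split can be written out as simple arithmetic. This sketch assumes Tier 2 receives whatever Tier 1 left unused and never a negative amount; the class name SettleBudget and both method names are hypothetical, and only the 3000ms default comes from the commit message.

```java
// Hypothetical sketch of the total-budget arithmetic for waitForAppToSettle.
final class SettleBudget {
    static final long DEFAULT_TIMEOUT_MS = 3000; // unchanged default from the commit message

    // Total settle budget: the per-command value if configured, else the default.
    static long totalBudget(Long waitToSettleTimeoutMs) {
        return waitToSettleTimeoutMs != null ? waitToSettleTimeoutMs : DEFAULT_TIMEOUT_MS;
    }

    // Time left for Tier 2 after Tier 1 has used part of the budget.
    static long tier2Budget(long totalMs, long tier1ElapsedMs) {
        return Math.max(0, totalMs - tier1ElapsedMs);
    }
}
```

For example, with waitToSettleTimeoutMs: 500 and a Tier 1 check that takes 320ms, Tier 2 gets at most 180ms, so the command settles within 500ms in total.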
On iOS, when a React Native Pressable has accessibilityLabel set, the
child Text content is collapsed into the parent's accessibility label.
The element's title and value are empty, so Maestro's `text` attribute
was always empty for these elements — making `tapOn: "<text>"` fail to
find buttons that are clearly visible to users.
The element IS reachable via `tapOn { label: ... }` or by regex against
accessibilityText, but those are unintuitive workarounds. Users see text
on screen and expect `tapOn: "that text"` to work — that's the entire
point of the selector.
Fix: in mapViewHierarchy, fall back to element.label (the iOS
accessibility label) when both title and value are empty. accessibilityText
still uses element.label as its canonical source, so the existing
Filters.textMatches accessibilityText fallback continues to work.
This also indirectly fixes the "tap doesn't fire onPress" symptom: when
matching by accessibilityText regex, Maestro might select a parent View
wrapping the Pressable, leading to coordinate taps in the wrong area.
With text populated on the Pressable itself, normal element ranking
picks the correct deepest match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
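The fallback can be sketched as a small pure function. The commit only states that element.label is used when both title and value are empty; the precedence of title over value shown here, and the names TextFallback and resolveText, are assumptions made for illustration.

```java
// Hypothetical sketch of the text fallback in mapViewHierarchy.
final class TextFallback {
    // Assumed precedence: title, then value, then the accessibility label.
    static String resolveText(String title, String value, String label) {
        if (title != null && !title.isEmpty()) {
            return title;
        }
        if (value != null && !value.isEmpty()) {
            return value;
        }
        return label != null ? label : "";
    }
}
```

For a Pressable whose title and value are empty and whose label is "Sign in", the resolved text is "Sign in", so a plain text selector can match it.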
The XCTest runner sometimes binds its HTTP server to ::1 (IPv6) only.
Maestro CLI was hardcoded to connect via 127.0.0.1 (an IPv4 literal),
which cannot reach an IPv6-only socket. Result: every HTTP call fails
with "Connection refused" even though the runner is alive and curl can
reach it via localhost.
Fix: replace 127.0.0.1 with localhost in three places:
- MaestroSessionManager.defaultXctestHost
- LocalXCTestInstaller constructor default
- LocalXCTestInstaller.xcTestDriverStatusCheck (was hardcoded)
okhttp's default Dns resolver returns all addresses for localhost (both
127.0.0.1 and ::1) and tries them in order on connection failure, so
this works regardless of which address family the runner binds to.
This is the same root cause as mobile-dev-inc#1299 (open since July 2023).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
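The idea of resolving a hostname to every address and trying each in turn can be shown with plain java.net, independent of okhttp. This is an illustrative sketch, not the client code in the PR: the class name LocalhostDial and the method firstReachable are hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical sketch: try every address a hostname resolves to.
final class LocalhostDial {
    // Returns the first resolved address that accepts a TCP connection on
    // the given port, or null if none does.
    static InetAddress firstReachable(String host, int port, int timeoutMs) throws UnknownHostException {
        for (InetAddress address : InetAddress.getAllByName(host)) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(address, port), timeoutMs);
                return address;
            } catch (IOException e) {
                // Connection refused or timed out: fall through to the next address.
            }
        }
        return null;
    }
}
```

With an IPv4 literal such as 127.0.0.1 only one address is ever tried, which is why a runner bound to ::1 was unreachable before the change.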
Each Maestro CLI invocation was picking a new random port for the XCTest
server, even when a runner from a previous run was still listening on a
different port. The new run couldn't find a runner on its new port, so
it spawned a fresh xcodebuild + runner pair. The old runner got killed
by simctl (XCTest only allows one runner per app), but the old
xcodebuild process was orphaned and stayed alive forever.
Over many runs, this accumulated orphaned xcodebuild processes that held
simulator resources, eventually causing connection failures and timeouts.
Fix: persist the XCTest runner port to ~/.maestro/xctest-ports/<deviceId>
after a successful start. On the next invocation, read the saved port
and probe it with isPortListening(). If something is listening, reuse
the port; isChannelAlive() (one level up) will then short-circuit the
entire reinstall path. If nothing is listening, the saved port is stale;
pick a new random port and update the file.
XCTestPortStore is a small file-based per-device store. Cross-platform
(no lsof/process scanning), so works on Windows.
Verified locally with 4 sequential runs on the same simulator:
- Run 1 (cold): 13s, port 7106 saved
- Run 2 (warm, reuse): 8s, same port, 2 processes
- Run 3 (warm, reuse): 7s, same port, 2 processes
- Run 4 (after pkill): 11s, new port 7009, fresh runner
Parallel runs on two simulators get independent port files and don't
interfere with each other.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
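A minimal sketch of the persist-and-probe flow follows. It is not the PR's XCTestPortStore: the class name PortStoreSketch, the load/save/selectPort methods, and the 200ms probe timeout are hypothetical. It also saves the port at selection time for brevity, whereas the commit persists it only after a successful runner start.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;
import java.util.function.IntSupplier;

// Hypothetical file-based store: one small file per device holding a port.
final class PortStoreSketch {
    private final Path dir;

    PortStoreSketch(Path dir) {
        this.dir = dir;
    }

    // Reads the saved port, or empty if the file is missing or malformed.
    OptionalInt load(String deviceId) {
        try {
            return OptionalInt.of(Integer.parseInt(Files.readString(dir.resolve(deviceId)).trim()));
        } catch (IOException | NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    void save(String deviceId, int port) throws IOException {
        Files.createDirectories(dir);
        Files.writeString(dir.resolve(deviceId), Integer.toString(port));
    }

    // Short-timeout connect probe: true if something accepts on the port.
    static boolean isPortListening(int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("localhost", port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Reuse the saved port while it is live; otherwise take a fresh one and
    // overwrite the stale entry (simplification: saved before the runner starts).
    int selectPort(String deviceId, IntSupplier freshPort) throws IOException {
        OptionalInt saved = load(deviceId);
        if (saved.isPresent() && isPortListening(saved.getAsInt(), 200)) {
            return saved.getAsInt();
        }
        int port = freshPort.getAsInt();
        save(deviceId, port);
        return port;
    }
}
```

Because each device has its own file, two simulators never read or overwrite each other's port, which matches the parallel-run behavior described above.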
Proposed changes
Stop orphaning xcodebuild processes between Maestro runs. Each `maestro test` invocation was picking a random port from 7001-7128, even when an XCTest runner from a previous run was still listening on a different port. The new run couldn't find a runner on its new port, so it spawned a fresh `xcodebuild` + runner pair. The old runner got killed by simctl (XCTest only allows one runner per app), but the old `xcodebuild` process was orphaned and stayed alive forever.
Over many runs, this accumulated orphaned `xcodebuild` processes that held simulator resources, eventually causing connection failures, timeouts, and the symptoms reported in #1299, #2932, and various flaky-test reports.
Impact
- [...] `isChannelAlive()` short-circuit in `restartXCTestRunner` finally works the way it was meant to)
- [...] `xcodebuild` zombies filling up the process table
- [...] `lsof`/process scanning, works on Windows
- [...] `isChannelAlive()` skip but couldn't use it because the port changed every run)
Root cause
`TestCommand.selectPort()` returns a random port from `7001..7128` for every invocation. The previous run's runner is still listening on its old port (e.g., 7106). The new run picks a new port (e.g., 7117) and:
- [...] `restartXCTestRunner()` for port 7117
- `isChannelAlive()` checks port 7117 → returns false (old runner is on 7106)
- [...] `xcodebuild test-without-building` on port 7117
- [...] `xcodebuild` is now waiting for its dead runner, orphaned forever
After N runs, there are N orphaned `xcodebuild` processes. Verified locally by running `ps aux | grep xcodebuild` after a few `maestro test` invocations.
Fix
Persist the XCTest runner port per device to `~/.maestro/xctest-ports/<deviceId>`. On the next invocation, read the saved port and probe it with `isPortListening()` (a short-timeout `Socket.connect()`). If something is listening, reuse the port; the existing `isChannelAlive()` check in `restartXCTestRunner()` then short-circuits the entire reinstall path. If nothing is listening, the saved port is stale; pick a new random port and update the file.
Verification
Parallel runs on two simulators each get their own port file and don't interfere.
Why not pick the same port deterministically (e.g., hash deviceId → port)?
Considered. Two issues:
The persistent file is more flexible: the port is "sticky" once chosen, but each device picks an actually-available port at first start.
Why not scan running processes to find the port (`lsof`)?
Cross-platform safety
`XCTestPortStore` uses `java.io.File` only, with no Unix-specific calls. Works on Windows, Linux, and macOS.
Issues fixed
Builds on #3139 to make the driver-reuse optimization actually work across CLI invocations. Likely contributes to fixing #1299 and the broader class of "iOS driver hangs" reports.