Wrap ObjC auto-async calls in explicit checked continuations #1044
`addIceCandidate` (Transport) and `startCapture` (CameraCapturer) are invoked through the compiler-synthesized async overloads of the completion-handler ObjC methods. That overload emits an *unsafe* continuation thunk whose symbol is keyed only by the imported signature, so it is shared across modules. When an app statically links/merges modules built in mixed Swift language modes (Swift 5 and Swift 6), the unsafe thunk emitted by a Swift-5 module wins symbol deduplication over the checked thunk this SDK (Swift 6) intends to use. At runtime these two resume paths are downgraded to unsafe, and under concurrent peer-connection negotiation the unsafe continuation corrupts the task allocator, producing EXC_BAD_ACCESS on connect (addIceCandidate) and camera start (startCapture). Calling the completion-handler API explicitly inside `withCheckedThrowingContinuation` means the shared auto-async thunk is never generated or referenced, so there is no symbol for dedup to collapse and the resume stays checked regardless of how consuming modules are built. This mirrors the explicit continuations already used for `setRemoteDescription` / `setLocalDescription` / `createOffer`.
The addIceCandidate site lives inside the `_iceCandidatesQueue` `[weak self]` closure. Moving the call into a nested withCheckedThrowingContinuation closure requires explicit `self.` to make capture semantics explicit under the Swift 6 language mode.
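The pattern described above can be sketched roughly as follows. This is a minimal illustration and not the SDK's code: `IceCandidate`, `FakePeerConnection`, and `Transport.addCandidate` are hypothetical stand-ins, and the completion-handler method is written in plain Swift rather than imported from ObjC.

```swift
import Foundation

// Hypothetical stand-ins for the WebRTC types used by the SDK.
struct IceCandidate: Sendable { let sdp: String }
struct CandidateError: Error {}

final class FakePeerConnection: Sendable {
    // Mirrors the shape of an error-only completion-handler method
    // such as `addIceCandidate:completionHandler:`.
    func add(_ candidate: IceCandidate,
             completionHandler: @escaping @Sendable (Error?) -> Void) {
        DispatchQueue.global().async {
            // Assumption for the sketch: an empty SDP counts as a failure.
            completionHandler(candidate.sdp.isEmpty ? CandidateError() : nil)
        }
    }
}

final class Transport: Sendable {
    private let pc = FakePeerConnection()

    func addCandidate(_ candidate: IceCandidate) async throws {
        // Call the completion-handler method directly and resume a
        // checked continuation by hand, exactly once on every path.
        try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Void, Error>) in
            self.pc.add(candidate) { error in
                if let error {
                    continuation.resume(throwing: error)
                } else {
                    continuation.resume()
                }
            }
        }
    }
}
```

Because the continuation is checked, a missed or duplicated resume is reported at runtime instead of silently corrupting state.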
Thanks @elesahich. Confirmed the root cause, reproduced it, and pinned the affected range. The fix is right; a few things before merge.
Root cause
swiftlang/swift#81846: calling an imported ObjC completion-handler method via its […]
Reproduced + affected range
Repro: an ObjC […]
Affected on Swift 6.1.x–6.2.3 (Xcode 16.3–26.x); fixed in 6.3 (Xcode 26.4+), with no 6.2 backport.
Suggestions
pblazej
left a comment
Happy to merge after addressing the above 🎉
Address review on livekit#1044: cover all imported ObjC completion-handler async-overload call sites, not just the two WebRTC ones.
- CameraCapturer.stopCapture -> capturer.stopCapture (void completion)
- MacOSScreenCapturer.startCapture -> stream.startCapture
- MacOSScreenCapturer.stopCapture -> stream.stopCapture
- MacOSScreenCapturer.sources -> SCShareableContent.getExcludingDesktopWindows
- InAppCapturer.startCapture -> RPScreenRecorder.startCapture(handler:completionHandler:)
Each now calls the completion-handler API inside an explicit checked continuation so the synthesized async-bridge thunk is never emitted. Also add the AGENTS.md 'Concurrency and State' rule and a .changes/objc-async-checked-continuation entry.
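For the void-completion case mentioned for `stopCapture`, the non-throwing `withCheckedContinuation` is enough, since there is no error to propagate. A minimal sketch, assuming a hypothetical `FakeCapturer` type and `stop(_:)` wrapper that are not part of the SDK:

```swift
import Foundation

// Hypothetical stand-in for a capturer whose stop method reports
// completion with a plain `() -> Void` callback.
final class FakeCapturer: Sendable {
    func stopCapture(completionHandler: @escaping @Sendable () -> Void) {
        DispatchQueue.global().async { completionHandler() }
    }
}

// Hypothetical wrapper: suspends until the completion handler fires.
func stop(_ capturer: FakeCapturer) async {
    // Nothing can fail here, so the non-throwing variant is used.
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        capturer.stopCapture {
            continuation.resume()
        }
    }
}
```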
Thanks for the review. The repro table and affected range were helpful. Addressed all of it:
- All auto-bridge sites wrapped (pushed): added the remaining five alongside the two WebRTC ones. Each calls the completion-handler API inside an explicit checked continuation, so no synthesized async-bridge thunk is emitted.
- AGENTS.md: added the rule under "Concurrency and State".
- Toolchain: the crashing app was built with Xcode 26.2 (Swift 6.2.3), in the affected range. Our app statically merges ~170 Swift-5-mode modules with this SDK (Swift 6), which is the mixed-mode coalescing condition. The wrapping stays harmless once we reach 6.3.
The synthesized async overload returns a non-Sendable SCShareableContent; wrapping it in a manual continuation surfaces a region-isolation 'sending content risks data races' error (the compiler bridge handles this), and the extra lines push sources(for:) past SwiftLint's function_body_length. It is a cold macOS enumeration path, so leave it on the async overload for now. The error-only and void macOS/ReplayKit bridges remain wrapped.
Reverting MacOSScreenCapturer to pristine. Two macOS-only frictions:
- getExcludingDesktopWindows returns non-Sendable SCShareableContent; a manual continuation trips region-isolation 'sending' checks.
- Wrapping stream.startCapture tips the already-long startCapture() past SwiftLint's function_body_length (54 > 50).
These are cold macOS screen-capture paths. Keeping the WebRTC + camera + in-app (ReplayKit) sites wrapped here; the macOS SCStream / SCShareableContent sites can come as a focused fast-follow.
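One possible way around the non-Sendable friction, offered only as an assumption and not as the approach the thread settles on, is to copy Sendable data out of the result inside the completion handler and resume the continuation with that copy. In the sketch below, `ShareableSnapshot`, `fetchSnapshot`, and `fetchWindowIDs` are hypothetical stand-ins; the real `SCShareableContent` API is not used.

```swift
import Foundation

// Hypothetical non-Sendable result type, standing in for something
// like SCShareableContent.
final class ShareableSnapshot {
    var windowIDs: [UInt32]
    init(windowIDs: [UInt32]) { self.windowIDs = windowIDs }
}

struct SnapshotError: Error {}

// Hypothetical completion-handler API with the (Result?, Error?) shape.
func fetchSnapshot(completionHandler: @escaping @Sendable (ShareableSnapshot?, Error?) -> Void) {
    DispatchQueue.global().async {
        completionHandler(ShareableSnapshot(windowIDs: [1, 2, 3]), nil)
    }
}

// Hypothetical wrapper that returns only Sendable data.
func fetchWindowIDs() async throws -> [UInt32] {
    try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<[UInt32], Error>) in
        fetchSnapshot { snapshot, error in
            if let snapshot {
                // Copy out a Sendable value; the snapshot object itself
                // never crosses the continuation boundary.
                continuation.resume(returning: snapshot.windowIDs)
            } else {
                continuation.resume(throwing: error ?? SnapshotError())
            }
        }
    }
}
```

The trade-off is that callers no longer receive the original object, so this only fits call sites that need a derived, Sendable view of the result.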
Update on the remaining sites:
Wrapped the iOS-relevant bridges. I backed the three […]
Both are cold macOS screen-capture paths; happy to take them as a focused fast-follow (the SCShareableContent one likely needs a small […]).
Toolchain (from before): the crashing app builds with Xcode 26.2 / Swift 6.2.3, statically merged with ~170 Swift-5-mode modules, which is the mixed-mode coalescing condition. CI is green except […]
pblazej
left a comment
Looks great — approving; just to close the loop, can you confirm the crash no longer reproduces once the app is rebuilt with Swift 6.3 (Xcode 26.4+)?
Thanks for approving. On the 6.3 confirmation: I can't verify it directly yet; our toolchain is pinned at Xcode 26.2 / Swift 6.2.3 and we don't have 26.4+ here. That said, your repro table already shows the bare (unwrapped) call is clean on 6.3.0, which matches the swiftlang/swift#81846 fix, and the explicit continuations here stay correct on 6.3, so the wrapping is a harmless no-op there rather than a behavior change. I'll confirm on-device once we move to Xcode 26.4+ and report back.
Summary
Wraps two compiler-synthesized ObjC auto-async calls in explicit `withCheckedThrowingContinuation`:
- `Transport._iceCandidatesQueue` → `_pc.add(iceCandidate.toRTCType())` (`addIceCandidate:completionHandler:`)
- `CameraCapturer.startCapture()` → `capturer.startCapture(with:format:fps:)` (`startCaptureWithDevice:format:fps:completionHandler:`)
Both now call the completion-handler API directly inside an explicit checked continuation, matching the pattern already used in this file for `setRemoteDescription` / `setLocalDescription` / `createOffer`.
Motivation
When an app statically links/merges modules built in mixed Swift language modes (some Swift 5, some Swift 6), the compiler-generated unsafe continuation thunk for an imported ObjC completion-handler method is a signature-keyed shared symbol. The unsafe thunk emitted by a Swift-5 module can win symbol deduplication over the checked thunk this SDK (built in Swift 6) intends to use. At runtime the affected resume paths are silently downgraded to unsafe.
Under concurrent peer-connection negotiation, the downgraded unsafe continuation corrupts the Swift task allocator, surfacing as `EXC_BAD_ACCESS (SIGSEGV)`.
We hit this reliably on real devices (iOS 18 and iOS 26) on `connect` (`addIceCandidate`) and on enabling the camera (`startCapture`). The two sites here were the only remaining `_pc.` / `capturer.` calls going through the auto-async overload; every other peer-connection call in `Transport` already uses an explicit continuation, which is why those were unaffected.
Why this fix
Calling the completion-handler API explicitly means the shared auto-async thunk is never generated or referenced, so there is no symbol for cross-module dedup to collapse. The resume stays checked regardless of how downstream modules are built — no behavior change for callers, just a safer continuation path.
Testing
Verified on device (iPhone 12 / iOS 18, iPhone 17 Pro / iOS 26): connect succeeds and camera enables without the crash; previously both reproduced consistently.