Wrap ObjC auto-async calls in explicit checked continuations#1044

Merged
pblazej merged 5 commits into
livekit:mainfrom
channel-io:channel/objc-async-checked-continuation
Jun 18, 2026

Conversation

@elesahich

Contributor

Summary

Wraps two compiler-synthesized ObjC auto-async calls in explicit withCheckedThrowingContinuation:

  • Transport._iceCandidatesQueue → _pc.add(iceCandidate.toRTCType()) (addIceCandidate:completionHandler:)
  • CameraCapturer.startCapture() → capturer.startCapture(with:format:fps:) (startCaptureWithDevice:format:fps:completionHandler:)

Both now call the completion-handler API directly inside an explicit checked continuation, matching the pattern already used in this file for setRemoteDescription / setLocalDescription / createOffer.
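
For readers unfamiliar with the pattern, here is a minimal sketch of wrapping an error-only completion-handler API in an explicit checked continuation. `FakePeerConnection`, its `add(_:completionHandler:)` method, and `addCandidate(_:to:)` are hypothetical stand-ins written in pure Swift so the example runs on its own; they are not the LiveKit or WebRTC types, and the SDK's actual code may differ.

```swift
import Foundation

// Hypothetical stand-in for an imported ObjC completion-handler API such as
// addIceCandidate:completionHandler:. Not the real WebRTC type.
final class FakePeerConnection: Sendable {
    struct Failure: Error {}

    // Calls the handler with nil on success, or with an error for an empty candidate.
    func add(_ candidate: String, completionHandler: @escaping @Sendable (Error?) -> Void) {
        DispatchQueue.global().async {
            completionHandler(candidate.isEmpty ? Failure() : nil)
        }
    }
}

// Calls the completion-handler variant directly inside an explicit checked
// continuation instead of awaiting a synthesized async overload.
func addCandidate(_ candidate: String, to pc: FakePeerConnection) async throws {
    try await withCheckedThrowingContinuation { (continuation: CheckedContinuation<Void, Error>) in
        pc.add(candidate) { error in
            if let error {
                continuation.resume(throwing: error)
            } else {
                continuation.resume()
            }
        }
    }
}
```

The continuation is resumed exactly once on every path, which is the invariant a checked continuation verifies at runtime.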

Motivation

When an app statically links/merges modules built in mixed Swift language modes (some Swift 5, some Swift 6), the compiler-generated unsafe continuation thunk for an imported ObjC completion-handler method is a signature-keyed shared symbol. The unsafe thunk emitted by a Swift-5 module can win symbol deduplication over the checked thunk this SDK (built in Swift 6) intends to use. At runtime the affected resume paths are silently downgraded to unsafe.

Under concurrent peer-connection negotiation, the downgraded unsafe continuation corrupts the Swift task allocator, surfacing as EXC_BAD_ACCESS (SIGSEGV):

UnsafeContinuation.resume(returning:)
  ← @objc completion handler block (NSError?) -> ()
  ← LiveKitWebRTC
... StackAllocator::getSlabForAllocation / flagAsAndEnqueueOnExecutor

We hit this reliably on real devices (iOS 18 and iOS 26) on connect (addIceCandidate) and on enabling the camera (startCapture). The two sites here were the only remaining _pc./capturer. calls going through the auto-async overload; every other peer-connection call in Transport already uses an explicit continuation, which is why those were unaffected.

Why this fix

Calling the completion-handler API explicitly means the shared auto-async thunk is never generated or referenced, so there is no symbol for cross-module dedup to collapse. The resume stays checked regardless of how downstream modules are built — no behavior change for callers, just a safer continuation path.

Testing

Verified on device (iPhone 12 / iOS 18, iPhone 17 Pro / iOS 26): connect succeeds and camera enables without the crash; previously both reproduced consistently.

`addIceCandidate` (Transport) and `startCapture` (CameraCapturer) are
invoked through the compiler-synthesized async overloads of the
completion-handler ObjC methods. That overload emits an *unsafe*
continuation thunk whose symbol is keyed only by the imported
signature, so it is shared across modules.

When an app statically links/merges modules built in mixed Swift
language modes (Swift 5 and Swift 6), the unsafe thunk emitted by a
Swift-5 module wins symbol deduplication over the checked thunk this
SDK (Swift 6) intends to use. At runtime these two resume paths are
downgraded to unsafe, and under concurrent peer-connection negotiation
the unsafe continuation corrupts the task allocator, producing
EXC_BAD_ACCESS on connect (addIceCandidate) and camera start
(startCapture).

Calling the completion-handler API explicitly inside
`withCheckedThrowingContinuation` means the shared auto-async thunk is
never generated or referenced, so there is no symbol for dedup to
collapse and the resume stays checked regardless of how consuming
modules are built. This mirrors the explicit continuations already used
for `setRemoteDescription` / `setLocalDescription` / `createOffer`.
@CLAassistant

CLAassistant commented Jun 17, 2026


CLA assistant check
All committers have signed the CLA.

The addIceCandidate site lives inside the `_iceCandidatesQueue`
`[weak self]` closure. Moving the call into a nested
withCheckedThrowingContinuation closure requires explicit `self.` to
make capture semantics explicit under the Swift 6 language mode.
@pblazej

pblazej commented Jun 17, 2026

Contributor

Thanks @elesahich — confirmed the root cause, reproduced it, and pinned the affected range. The fix is right; a few things before merge.

Root cause

swiftlang/swift#81846: calling an imported ObjC completion-handler method via its
synthesized async overload emits a weak, signature-keyed bridge thunk. Swift 5
mode emits the unchecked variant, Swift 6 the checked one — under the same
symbol name.
A mixed Swift 5/6 static link coalesces them → resume through the
wrong layout → EXC_BAD_ACCESS in swift_continuation_resumeImpl. An explicit
withCheckedContinuation emits no such thunk, so wrapping fixes it.

Reproduced + affected range

Repro: an ObjC ()->() completion-handler method called through its async bridge
from both a .swiftLanguageMode(.v5) caller and a .v6 caller, statically linked.

Swift                    auto (unwrapped)    wrapped
6.1.2 / 6.2.0 / 6.2.3    SIGSEGV             clean
6.3.0                    clean               clean

Affected on Swift 6.1.x–6.2.3 (Xcode 16.3–26.x); fixed in 6.3 (Xcode 26.4+) — no 6.2 backport.
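
A package manifest along the following lines is one way to set up the mixed-mode condition described above. This is only a sketch: the package and target names (`ThunkRepro`, `ObjCLib`, `CallerV5`, `CallerV6`, `Repro`) are hypothetical and are not taken from the actual repro project. It assumes `ObjCLib` holds the Objective-C completion-handler method and that each Swift target awaits its synthesized async overload.

```swift
// swift-tools-version: 6.0
import PackageDescription

// Hypothetical layout: one ObjC target and two Swift callers built in
// different language modes, all linked into a single executable.
let package = Package(
    name: "ThunkRepro",
    targets: [
        // Objective-C sources exposing a completion-handler method.
        .target(name: "ObjCLib"),
        // Swift 5 language mode caller.
        .target(
            name: "CallerV5",
            dependencies: ["ObjCLib"],
            swiftSettings: [.swiftLanguageMode(.v5)]
        ),
        // Swift 6 language mode caller.
        .target(
            name: "CallerV6",
            dependencies: ["ObjCLib"],
            swiftSettings: [.swiftLanguageMode(.v6)]
        ),
        // Package targets are linked statically into the executable by default.
        .executableTarget(name: "Repro", dependencies: ["CallerV5", "CallerV6"]),
    ]
)
```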

Suggestions

  1. Wrap all auto-bridge consumption sites, not just two — every await into an
    imported ObjC completion method is exposed. There are 7:

    • Transport.swift:71 _pc.add(iceCandidate) ✅ in PR
    • CameraCapturer.swift:288 capturer.startCapture(...) ✅ in PR
    • CameraCapturer.swift:308 capturer.stopCapture() ❌ (the ()->() shape; the workaround in #1016, "Camera publish SIGBUS in UnsafeContinuation.resume from LKRTCCameraVideoCapturer completion on macOS 26.1", included it)
    • MacOSScreenCapturer.swift:121 stream.startCapture()
    • MacOSScreenCapturer.swift:143 stream.stopCapture()
    • MacOSScreenCapturer.swift:452 SCShareableContent.excludingDesktopWindows(...)
    • InAppCapturer.swift:42 RPScreenRecorder.shared().startCapture { … }

    Fine to take the four Apple-framework ones in a fast-follow if you'd rather keep this WebRTC-scoped.

  2. Add an AGENTS.md rule under "Concurrency and State":

    Until the minimum supported compiler is Swift 6.3, wrap calls to imported Objective-C completion-handler methods in an explicit withCheckedThrowingContinuation instead of awaiting their synthesized async overload, since the bare auto-bridge hits a mixed Swift 5/6 thunk-coalescing crash (swiftlang/swift#81846, "Swift 6.1 runtime crash when calling @objc async protocol method in target with mixed Swift 5 and Swift 6 dependencies") fixed in 6.3.

  3. Add a .changes entry .changes/objc-async-checked-continuation:

    patch type="fixed" "Wrap ObjC completion-handler async calls in explicit checked continuations (fixes mixed Swift 5/6 continuation-bridge EXC_BAD_ACCESS)"
    
  4. Confirm the build toolchain — the trigger is the compiler the app is built
    with, not the device OS. Which Xcode did you build the crashing app with?
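
Since the trigger is the compiler version rather than the device OS, a compile-time check is one way to record which side of the fix a binary was built on. The sketch below uses Swift's `#if compiler(>=…)` condition; the constant name `builtWithAffectedCompiler` is hypothetical and this is a diagnostic aid only, not something the PR adds. It does not distinguish compilers older than the affected 6.1 range.

```swift
// Evaluated at compile time against the compiler that builds this file,
// independent of the OS the binary later runs on.
#if compiler(>=6.3)
let builtWithAffectedCompiler = false
#else
let builtWithAffectedCompiler = true
#endif

print("Built with a pre-6.3 compiler: \(builtWithAffectedCompiler)")
```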

Linked issues

Relates to #1016, #1022, #1036.

@pblazej pblazej left a comment

Contributor

Happy to merge after addressing the above 🎉

Address review on livekit#1044: cover all imported ObjC completion-handler
async-overload call sites, not just the two WebRTC ones.

- CameraCapturer.stopCapture -> capturer.stopCapture (void completion)
- MacOSScreenCapturer.startCapture -> stream.startCapture
- MacOSScreenCapturer.stopCapture -> stream.stopCapture
- MacOSScreenCapturer.sources -> SCShareableContent.getExcludingDesktopWindows
- InAppCapturer.startCapture -> RPScreenRecorder.startCapture(handler:completionHandler:)

Each now calls the completion-handler API inside an explicit checked
continuation so the synthesized async-bridge thunk is never emitted.

Also add the AGENTS.md 'Concurrency and State' rule and a
.changes/objc-async-checked-continuation entry.
@elesahich

elesahich commented Jun 18, 2026

Contributor Author

Thanks for the review — the repro table and affected range were helpful. Addressed all of it:

All auto-bridge sites wrapped (pushed): added the remaining five alongside the two WebRTC ones —

  • CameraCapturer.swift capturer.stopCapture() (void completion → withCheckedContinuation)
  • MacOSScreenCapturer.swift stream.startCapture()
  • MacOSScreenCapturer.swift stream.stopCapture()
  • MacOSScreenCapturer.swift SCShareableContent.getExcludingDesktopWindows(...)
  • InAppCapturer.swift RPScreenRecorder.shared().startCapture(handler:completionHandler:)

Each calls the completion-handler API inside an explicit checked continuation, so no synthesized async-bridge thunk is emitted.
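
For the void-completion shape, the non-throwing `withCheckedContinuation` is enough because there is no error to forward. A minimal sketch, using a hypothetical `FakeCapturer` stand-in rather than the real capturer type:

```swift
import Foundation

// Hypothetical stand-in for a void-completion API with the ()->() shape,
// such as a stopCapture method that takes only a completion handler.
final class FakeCapturer: Sendable {
    func stopCapture(completionHandler: @escaping @Sendable () -> Void) {
        DispatchQueue.global().async { completionHandler() }
    }
}

// Suspends until the completion handler fires, then resumes exactly once.
func stop(_ capturer: FakeCapturer) async {
    await withCheckedContinuation { (continuation: CheckedContinuation<Void, Never>) in
        capturer.stopCapture {
            continuation.resume()
        }
    }
}
```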

AGENTS.md — added the rule under "Concurrency and State".

.changes/objc-async-checked-continuation — added.

Toolchain — the crashing app was built with Xcode 26.2 (Swift 6.2.3), in the affected range. Our app statically merges ~170 Swift-5-mode modules with this SDK (Swift 6), which is the mixed-mode coalescing condition. The wrapping stays harmless once we reach 6.3.

The synthesized async overload returns a non-Sendable SCShareableContent;
wrapping it in a manual continuation surfaces a region-isolation
'sending content risks data races' error (the compiler bridge handles
this), and the extra lines push sources(for:) past SwiftLint's
function_body_length. It is a cold macOS enumeration path, so leave it on
the async overload for now. The error-only and void macOS/ReplayKit
bridges remain wrapped.

Reverting MacOSScreenCapturer to pristine. Two macOS-only frictions:
- getExcludingDesktopWindows returns non-Sendable SCShareableContent;
  a manual continuation trips region-isolation 'sending' checks.
- Wrapping stream.startCapture tips the already-long startCapture()
  past SwiftLint's function_body_length (54 > 50).

These are cold macOS screen-capture paths. Keeping the WebRTC + camera
+ in-app (ReplayKit) sites wrapped here; the macOS SCStream /
SCShareableContent sites can come as a focused fast-follow.
@elesahich

Contributor Author

Update on the remaining sites:

Wrapped the iOS-relevant bridges — Transport.addIceCandidate, CameraCapturer.startCapture / stopCapture, and InAppCapturer (ReplayKit).

I backed the three MacOSScreenCapturer sites out for now:

  • getExcludingDesktopWindows returns a non-Sendable SCShareableContent; a manual continuation surfaces a region-isolation "sending 'content' risks data races" error that the synthesized bridge handles for us.
  • Wrapping SCStream.startCapture tips the already-long startCapture() past SwiftLint's function_body_length (54 > 50).

Both are cold macOS screen-capture paths — happy to take them as a focused fast-follow (the SCShareableContent one likely needs a small nonisolated(unsafe) hand-off, and startCapture() a short extract or an explicit disable). Let me know if you'd prefer them in this PR instead.

Toolchain (from before): the crashing app builds with Xcode 26.2 / Swift 6.2.3, statically merged with ~170 Swift-5-mode modules — the mixed-mode coalescing condition.

CI is green except Build & Test (iOS Simulator, iPhone 16 Pro, OS 18.5), which fails in RpcClientTests with RpcError(1502, "Response timeout") — unrelated to this change (the RoomTests suite passes, the iPhone 17 Pro / iOS 26.5 sim passes, and the same job passed on the first revision). I can't re-run it from the fork; a re-run should clear it.

@pblazej pblazej left a comment

Contributor

Looks great — approving; just to close the loop, can you confirm the crash no longer reproduces once the app is rebuilt with Swift 6.3 (Xcode 26.4+)?

@elesahich

Contributor Author

Thanks for approving. On the 6.3 confirmation — I can't verify it directly yet; our toolchain is pinned at Xcode 26.2 / Swift 6.2.3 and we don't have 26.4+ here.

That said, your repro table already shows the bare (unwrapped) call is clean on 6.3.0, which matches the swiftlang/swift#81846 fix, and the explicit continuations here stay correct on 6.3 — so the wrapping is a harmless no-op there rather than a behavior change. I'll confirm on-device once we move to Xcode 26.4+ and report back.

@pblazej pblazej merged commit d681bb0 into livekit:main Jun 18, 2026
41 of 42 checks passed
@elesahich elesahich deleted the channel/objc-async-checked-continuation branch June 18, 2026 06:45
Development

Successfully merging this pull request may close these issues.

EXC_BAD_ACCESS in Room.publisherShouldNegotiate / Room.fullConnectSequence on every connect (2.14.0 + 2.14.1, macOS arm64)

3 participants