delay sendCall/answerCall/unhold until after WebRTC ADM init, allow manual setupAudioSession #120

Merged
rex-iotum merged 9 commits into develop from always-allow-programatic-unmute on Jun 16, 2026

@rex-iotum rex-iotum commented Jun 10, 2026

Collaborator

Delay didActivateAudioSession until WebRTC init:

Bug:

  • didActivateAudioSession was telling JS that audio was ready when all it had done was tell WebRTC to spin up via [RTCAudioSession sharedInstance].isAudioEnabled = YES;
  • This lets us act on the audio too early (for example, mute/unmute), which can cause crashes and race conditions.

Fix:

  • Listen for the existing WebRTC audioSessionDidStartPlayOrRecord callback so that we know the WebRTC audio engine is actually starting.
  • This is still not the ideal callback, since it signals "starting" rather than "started", but there does not appear to be a better one to listen for.

Allow manual setupAudioSession

Bug:

  • getUserMedia (gUM) could be called before WebRTC had initialized the audio device.

Fix:

  • Manually call setupAudioSession as part of entering the InCallPage.

Copilot AI review requested due to automatic review settings June 10, 2026 15:55
@rex-iotum rex-iotum force-pushed the always-allow-programatic-unmute branch from 159a8de to 380dfdf Compare June 10, 2026 15:56

Copilot AI left a comment

Pull request overview

This PR adjusts the iOS CallKit/WebRTC integration in cordova-plugin-callkit to (1) prevent duplicate CallKit reconciliation callbacks from being misclassified as UI-driven events, and (2) delay notifying JS that audio is ready until WebRTC’s audio engine has actually started.

Changes:

  • Defers pendingActivateAudioSessionEmits processing from didActivateAudioSession to WebRTC’s audioSessionDidStartPlayOrRecord callback.
  • Introduces delayed cleanup of callbackMap entries to avoid duplicate CallKit callbacks being treated as UI-initiated.
  • Updates the plugin header to conform to RTCAudioSessionDelegate.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/ios/CordovaCall.m | Adds WebRTC audio-start gating for deferred JS emits and delays callbackMap cleanup to handle duplicate CallKit callbacks. |
| src/ios/CordovaCall.h | Adds WebRTC audio session delegate conformance/imports for the new callback path. |

Comment thread src/ios/CordovaCall.m Outdated
Comment thread src/ios/CordovaCall.m
Comment thread src/ios/CordovaCall.h Outdated
Comment thread src/ios/CordovaCall.m
Comment thread src/ios/CordovaCall.m

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread src/ios/CordovaCall.m Outdated
Comment thread src/ios/CordovaCall.m
@rex-iotum rex-iotum force-pushed the always-allow-programatic-unmute branch from aa1ffee to 6060f48 Compare June 10, 2026 18:11
@rex-iotum rex-iotum changed the title from "Always allow programatic unmute, delay didActivateAudioSession until WebRTC init" to "delay sendCall/answerCall/unhold until after WebRTC ADM init, allow manual setupAudioSession" Jun 11, 2026
@rex-iotum rex-iotum requested a review from Copilot June 11, 2026 02:09

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/ios/CordovaCall.m Outdated

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/ios/CordovaCall.m Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@jerry2013

@copilot explain what this PR does in the context of video meeting and phone call, listing the call flows and what is blocking what for js side usage.

Copilot AI commented Jun 16, 2026

This PR fixes two race conditions in the iOS CallKit ↔ WebRTC integration. Here's what it does in the context of video/voice meetings:


Background

When a call is made or answered via CallKit, iOS fires didActivateAudioSession to signal that the system audio session is active. Previously the plugin immediately resolved the JS promise or emitted the JS event at this point. The bug is that didActivateAudioSession only hands off control to WebRTC (isAudioEnabled = YES), which then asynchronously spins up its Audio Device Module (ADM). So JS was notified before WebRTC was actually ready.


Call flows

Outgoing call (JS calls plugin.sendCall(...))

| Step | What happens |
| --- | --- |
| 1 | JS calls sendCall → plugin asks CallKit to start the call |
| 2 | CallKit fires performStartCallAction → plugin calls setupAudioSession, stores the call data in pendingActivateAudioSessionEmits |
| 3 | CallKit fires didActivateAudioSession → plugin enables WebRTC audio (isAudioEnabled = YES) and waits |
| 4 | WebRTC fires audioSessionDidStartPlayOrRecord (ADM is now truly running) → only here does the plugin resolve the JS sendCall promise |

JS is now safe to call getUserMedia, mute, unmute, hold, etc.
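
The outgoing flow above can be modelled in a few lines of JavaScript. This is only a sketch of the gating idea, not the plugin's Objective-C code: the class AudioReadyGate and its queueEmit method are hypothetical, and only pendingActivateAudioSessionEmits, didActivateAudioSession, and audioSessionDidStartPlayOrRecord are names taken from this thread.

```javascript
// Hypothetical JS model of the native gating described in the table.
class AudioReadyGate {
  constructor() {
    this.pendingActivateAudioSessionEmits = []; // notifications waiting to reach JS
    this.webrtcAudioEnabled = false;
  }

  // Step 2: performStartCallAction queues the notification instead of sending it.
  queueEmit(emit) {
    this.pendingActivateAudioSessionEmits.push(emit);
  }

  // Step 3: CallKit activated the session; enable WebRTC audio and keep waiting.
  didActivateAudioSession() {
    this.webrtcAudioEnabled = true; // stands in for isAudioEnabled = YES
  }

  // Step 4: WebRTC reports play/record is starting; flush the queue in order.
  audioSessionDidStartPlayOrRecord() {
    const emits = this.pendingActivateAudioSessionEmits;
    this.pendingActivateAudioSessionEmits = [];
    emits.forEach((emit) => emit());
  }
}

const gate = new AudioReadyGate();
const log = [];
gate.queueEmit(() => log.push('sendCall resolved'));
gate.didActivateAudioSession();
log.push('audio session active, JS not notified yet');
gate.audioSessionDidStartPlayOrRecord();
console.log(log);
```

In this model the queued notification runs only in the last step, which mirrors the ordering the table describes.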

Incoming call (user taps Answer in CallKit UI)

| Step | What happens |
| --- | --- |
| 1 | User taps Answer → CallKit fires performAnswerCallAction → plugin calls setupAudioSession, queues an answer deferred emit |
| 2 | didActivateAudioSession → WebRTC ADM starts, JS not notified yet |
| 3 | audioSessionDidStartPlayOrRecord → plugin emits the answer event to JS listeners |

JS can now safely start WebRTC media
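
On the JS side, the incoming flow might be consumed roughly as follows. This is a sketch under assumptions: plugin.on(eventName, handler) is an assumed listener API and startMediaOnAnswer is a hypothetical helper; the thread only says that an answer event reaches JS listeners.

```javascript
// Start media only once the plugin reports the answered call.
// `plugin.on` is an assumed listener API; adapt it to the plugin's real one.
function startMediaOnAnswer(plugin, getUserMedia) {
  return new Promise((resolve, reject) => {
    plugin.on('answer', () => {
      // Per the flow above, the answer event is emitted only after
      // audioSessionDidStartPlayOrRecord, so the microphone request
      // should no longer race with ADM startup.
      getUserMedia({ audio: true }).then(resolve, reject);
    });
  });
}
```

In an app this could be called as startMediaOnAnswer(plugin, (c) => navigator.mediaDevices.getUserMedia(c)); passing getUserMedia in as a parameter keeps the helper easy to exercise without a device.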

Outgoing call from iOS Recents (UI-initiated, no prior JS call)

Same as outgoing above, except at step 4 the plugin emits a sendCall event to registered JS listeners instead of resolving a promise.
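
Because the same readiness signal arrives either as a resolved promise (JS-initiated) or as an event (Recents-initiated), an app may want one handler for both. A minimal sketch, assuming a plugin.on listener API, assuming each call is notified through exactly one of the two paths, and assuming the promise and the event carry comparable call data; wireOutgoingCall is a hypothetical helper:

```javascript
// Route both outgoing-call paths to a single "audio is ready" handler.
function wireOutgoingCall(plugin, onAudioReady) {
  // Path 2: call started from iOS Recents; no JS promise exists,
  // so readiness arrives as a sendCall event (assumed listener API).
  plugin.on('sendCall', (data) => onAudioReady(data));

  // Path 1: call started from JS; readiness is the resolved promise.
  return async function placeCall(name, id, sessionId) {
    const data = await plugin.sendCall(name, id, sessionId);
    onAudioReady(data);
    return data;
  };
}
```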

Unhold

Same deferred pattern — the unhold confirmation reaches JS only after audioSessionDidStartPlayOrRecord, ensuring audio is truly active before the app tries to resume media.


What was blocking what (the bugs)

Before this PR:

  • didActivateAudioSession immediately notified JS → JS called gUM / mute / unmute on a WebRTC ADM that hadn't finished initialising → crashes and race conditions.

After this PR:

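  • JS is gated on audioSessionDidStartPlayOrRecord, so the ADM has begun starting before JS acts on audio. As the PR description notes, this callback signals "starting" rather than "started", but it is the closest signal available.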

The setupAudioSession command

The second change exposes setupAudioSession as an explicit JS-callable command. When the app navigates to the in-call page, JS can call plugin.setupAudioSession() to pre-configure the AVAudioSession (voice-processing mode, etc.) before WebRTC's getUserMedia runs. Without this, gUM could race with audio session setup and configure the session in the wrong mode.
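
A sketch of the page-enter ordering this describes, assuming plugin.setupAudioSession() returns a promise; enterInCallPage is a hypothetical helper and getUserMedia is passed in so the ordering is explicit. Awaiting here is the conservative choice for guaranteeing that setup runs before gUM.

```javascript
// Configure the audio session first, then ask WebRTC for the microphone.
async function enterInCallPage(plugin, getUserMedia) {
  await plugin.setupAudioSession();      // pre-configure AVAudioSession
  return getUserMedia({ audio: true });  // gUM runs only after setup returned
}
```

In an app, getUserMedia would typically be (c) => navigator.mediaDevices.getUserMedia(c).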

Copilot AI commented Jun 16, 2026

Yes — and that's precisely the contract this PR establishes.

Why awaiting matters

sendCall now resolves its promise only after audioSessionDidStartPlayOrRecord fires, which is the plugin's signal that the WebRTC Audio Device Module (ADM) has started. That resolved promise is the signal that audio is safe to touch.

If JS fires subsequent actions without awaiting:

| Action | What can go wrong without await |
| --- | --- |
| mute / unmute | Goes straight to a CXSetMutedCallAction + performSetMutedCallAction, which operates on the audio device. If the ADM hasn't finished spinning up, this can produce a race / crash inside WebRTC's audio pipeline. |
| hold | Similar: puts the audio device into a suspended state before it has fully started. |
| unhold | Is itself gated: performSetHeldCallAction feeds into pendingActivateAudioSessionEmits and is deferred until audioSessionDidStartPlayOrRecord anyway, so it's self-protecting, but the JS promise still won't resolve until audio is active again. |
| endCall | Relatively safe since it tears down instead of configuring audio, but it can still race with the ADM startup and leave cleanup in a bad state. |

The expected JS pattern

await plugin.sendCall(name, id, sessionId);   // ← audio is now ready
await plugin.mute(sessionId);                  // safe

Calling mute (or any audio-affecting action) before sendCall resolves is not safe because the ADM initialization that sendCall's resolution guarantees hasn't happened yet. The PR effectively turns the resolved promise into an audio-readiness gate, so yes — JS must await it.
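
If callers cannot be trusted to await in the right order, the gate can be enforced in a thin wrapper. This is a hypothetical sketch: CallAudio is not part of the plugin, and the sendCall/mute signatures are taken from the two-line pattern above.

```javascript
// Wrapper that refuses to forward mute until sendCall has resolved.
class CallAudio {
  constructor(plugin) {
    this.plugin = plugin;
    this.ready = null; // promise returned by the most recent sendCall
  }

  sendCall(name, id, sessionId) {
    this.ready = this.plugin.sendCall(name, id, sessionId);
    return this.ready;
  }

  async mute(sessionId) {
    if (!this.ready) {
      throw new Error('mute called before sendCall');
    }
    await this.ready; // wait for the audio-readiness gate
    return this.plugin.mute(sessionId);
  }
}
```

With this wrapper, a mute issued right after sendCall is simply delayed until the promise resolves instead of reaching the native side early.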

Copilot AI commented Jun 16, 2026

setupAudioSession is fully synchronous and idempotent, so concurrent or un-awaited calls are benign in practice, but there are a few subtle points worth knowing:


What actually happens with two concurrent/un-awaited calls

  1. Both calls run synchronously on the Cordova command thread. setupAudioSession (the internal method) has no async work — it calls AVAudioSession setters and updates the RTCAudioSessionConfiguration, then returns immediately. The Cordova wrapper resolves the JS promise right after. So both calls complete in-order on the same thread with no interleaving.

  2. The @synchronized(self) guard on _audioSessionStateSaved means the first call captures the host app's pre-call audio state and sets the flag; the second call sees the flag already set and skips the snapshot. This is correct — you only want to snapshot once per call lifetime so teardownAudioSession can restore to the right state.

  3. The AVAudioSession setters themselves are idempotent — calling setCategory:PlayAndRecord + setMode:VoiceChat twice in a row just applies the same configuration twice. iOS doesn't error or reset; it's a no-op on the second call.

  4. The RTCAudioSessionConfiguration update is likewise just overwriting the same values a second time — harmless.


Net result: not awaiting setupAudioSession before calling it again is safe — the second call is effectively a no-op. The only practical risk would be calling setupAudioSession and teardownAudioSession concurrently (e.g., a call ending racing with setup), but that's a separate concern unrelated to double-setup.
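
The snapshot-once behaviour described above can be illustrated with a small JavaScript model. It is a sketch only: AudioSessionModel is hypothetical, its saved field plays the role of the _audioSessionStateSaved flag together with the stored snapshot, and the host-app state values are made up for the example.

```javascript
// Hypothetical model of setup/teardown with a one-time snapshot.
class AudioSessionModel {
  constructor(hostState) {
    this.current = { ...hostState };
    this.saved = null; // null means no snapshot has been taken yet
  }

  setupAudioSession() {
    if (this.saved === null) {
      this.saved = { ...this.current }; // snapshot the host state only once
    }
    // Applying the same call configuration again changes nothing.
    this.current = { category: 'PlayAndRecord', mode: 'VoiceChat' };
  }

  teardownAudioSession() {
    if (this.saved !== null) {
      this.current = this.saved; // restore the pre-call state
      this.saved = null;
    }
  }
}

const session = new AudioSessionModel({ category: 'Playback', mode: 'Default' });
session.setupAudioSession();
session.setupAudioSession(); // second call skips the snapshot
session.teardownAudioSession();
console.log(session.current); // { category: 'Playback', mode: 'Default' }
```

Without the guard, the second setup would snapshot the in-call configuration and teardown would restore the wrong state.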

The await discipline matters more for sendCall, whose promise gates actual ADM readiness. setupAudioSession can safely be called once at page-enter without needing to await it before the rest of the call setup proceeds.

@rex-iotum rex-iotum merged commit cbdebdf into develop Jun 16, 2026
7 checks passed