
Fix/realtime turn detection fixes (only Gemini and Openai) #470

Merged
dangusev merged 5 commits into main from fix/realtime-turn-detection-fixes
Apr 6, 2026

Conversation

Collaborator

@dangusev dangusev commented Apr 6, 2026

Realtime models were not emitting interruption events for other plugins to react to.

In this PR:

  • Added "interrupted" field to RealtimeAudioOutputDoneEvent to signify interruption
  • Handle interruptions in Gemini and OpenAI Realtime implementations

Summary by CodeRabbit

  • New Features

    • Real-time conversations now emit and track an "interrupted" state for audio completion, allowing interruption-aware flows.
    • Default model set for OpenAI integration to simplify setup.
  • Bug Fixes

    • Voice activity detection reacts faster to silence for snappier turns.
    • Video frame delivery now respects configured frame rate.
    • Interruptions from input or server-side cancellations properly trigger interruption handling and logging.
  • Chores

    • Consolidated realtime audio configuration warnings and disabling of incompatible settings.


coderabbitai bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a167621e-d47f-42eb-b82d-db44fb3f53d5

📥 Commits

Reviewing files that changed from the base of the PR and between be272a3 and b728c8c.

📒 Files selected for processing (3)
  • agents-core/vision_agents/core/agents/agents.py
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py
✅ Files skipped from review due to trivial changes (1)
  • plugins/openai/vision_agents/plugins/openai/openai_llm.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py

📝 Walkthrough

The realtime audio flow now signals interruptions explicitly. RealtimeAudioOutputDoneEvent gains an interrupted: bool field, the core emitter and handlers accept an interrupted argument, and the Gemini and OpenAI plugins now detect interruptions and emit done events with interrupted set. Related realtime behavior was adjusted alongside.
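As a rough illustration of the contract change in this walkthrough, the sketch below models the event and the emitter with stand-in classes. Only the names RealtimeAudioOutputDoneEvent, interrupted, response_id, and _emit_audio_output_done_event come from this PR; the RealtimeSketch class, its subscribers list, and the remaining structure are assumptions made for the example, not the library's actual code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RealtimeAudioOutputDoneEvent:
    """Stand-in for the real event; only the fields discussed here are modelled."""

    response_id: Optional[str] = None
    # New field from this PR. Defaulting to False keeps existing emitters valid.
    interrupted: bool = False


class RealtimeSketch:
    """Hypothetical, simplified emitter (not the library's Realtime class)."""

    def __init__(self) -> None:
        self.subscribers: List[Callable[[RealtimeAudioOutputDoneEvent], None]] = []

    def _emit_audio_output_done_event(
        self, response_id: Optional[str] = None, interrupted: bool = False
    ) -> None:
        # Propagate the flag into the emitted event so subscribers can tell
        # a natural completion from an interruption.
        event = RealtimeAudioOutputDoneEvent(
            response_id=response_id, interrupted=interrupted
        )
        for callback in self.subscribers:
            callback(event)


received: List[RealtimeAudioOutputDoneEvent] = []
realtime = RealtimeSketch()
realtime.subscribers.append(received.append)

realtime._emit_audio_output_done_event(response_id="resp_1")
realtime._emit_audio_output_done_event(response_id="resp_2", interrupted=True)
```

Because the field defaults to False, subscribers written before this change would keep working and simply never see an interrupted completion unless a plugin sets the flag.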

Changes

| Cohort / File(s) | Summary |
|---|---|
| Event System: agents-core/vision_agents/core/llm/events.py | Normalized imports and added interrupted: bool = False to RealtimeAudioOutputDoneEvent, changing the event payload contract. |
| Core Realtime Handler: agents-core/vision_agents/core/llm/realtime.py | Added interrupted: bool = False parameter to _emit_audio_output_done_event(...) and propagated it into emitted events. |
| Gemini Plugin: plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py | Lowered VAD silence threshold (500 → 250 ms), pass fps=float(self.fps) to frame handler, and handle server_content.interrupted by calling interrupt(), emitting an interrupted done event, and marking message handled. |
| OpenAI Realtime Plugin: plugins/openai/vision_agents/plugins/openai/openai_realtime.py | On input_audio_buffer.speech_started actively call interrupt() and emit audio-output-done with interrupted=True; add explicit handling/logging for responses with status == "cancelled". |
| Agent Configuration: agents-core/vision_agents/core/agents/agents.py | Consolidated realtime audio config validation: if stt/tts/turn_detection are set, warn and disable all by assigning None/False consistently. |
| OpenAI LLM Init: plugins/openai/vision_agents/plugins/openai/openai_llm.py | Made model parameter optional with default "gpt-5.4" in OpenAILLM.__init__. |
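The Agent Configuration change listed above can be pictured roughly as follows. This is a minimal sketch under stated assumptions: AgentSketch, its llm_is_realtime flag, and the warning text are hypothetical, and the sketch resets every setting to None even though the summary mentions None/False. The actual validation in agents.py may differ.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("agent_config_sketch")


class AgentSketch:
    """Hypothetical stand-in for the agent; only the validation step is modelled."""

    def __init__(
        self,
        llm_is_realtime: bool,
        stt: Optional[Any] = None,
        tts: Optional[Any] = None,
        turn_detection: Optional[Any] = None,
    ) -> None:
        self.stt = stt
        self.tts = tts
        self.turn_detection = turn_detection
        if llm_is_realtime:
            self._disable_incompatible_audio_settings()

    def _disable_incompatible_audio_settings(self) -> None:
        # Collect every setting that conflicts with realtime mode, warn once,
        # then disable all of them in one place.
        configured = [
            name
            for name in ("stt", "tts", "turn_detection")
            if getattr(self, name) is not None
        ]
        if not configured:
            return
        logger.warning(
            "Realtime mode handles audio itself; ignoring: %s", ", ".join(configured)
        )
        self.stt = None
        self.tts = None
        self.turn_detection = None


realtime_agent = AgentSketch(llm_is_realtime=True, stt=object(), turn_detection=object())
classic_agent = AgentSketch(llm_is_realtime=False, stt="my-stt")
```

Doing the check in a single helper means one warning covers all conflicting settings, instead of separate branches that could drift apart.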

Sequence Diagram(s)

sequenceDiagram
  participant UserAudio as User Audio
  participant Plugin as Realtime Plugin
  participant Realtime as Realtime Core
  participant Events as EventEmitter/Subscribers

  UserAudio->>Plugin: audio frame / speech start
  Plugin->>Realtime: detect interruption (speech_started or server_content.interrupted)
  Realtime->>Realtime: call interrupt()
  Realtime->>Events: _emit_audio_output_done_event(response_id, interrupted=true)
  Events->>Subscribers: deliver RealtimeAudioOutputDoneEvent(interrupted=true)
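The flow in the diagram can be approximated in plain Python. The event type input_audio_buffer.speech_started and the "cancelled" status are taken from the change summary for the OpenAI plugin; the RealtimePluginSketch class, the dict-shaped events, and the calls list are assumptions for illustration, and the real plugin may structure its handlers differently.

```python
from typing import Any, Dict, List, Optional


class RealtimePluginSketch:
    """Hypothetical plugin that records what it would do for each server event."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.done_events: List[Dict[str, Any]] = []

    def interrupt(self) -> None:
        # Stand-in for stopping audio playback in the real plugin.
        self.calls.append("interrupt")

    def _emit_audio_output_done_event(
        self, response_id: Optional[str] = None, interrupted: bool = False
    ) -> None:
        self.done_events.append(
            {"response_id": response_id, "interrupted": interrupted}
        )

    def handle_server_event(self, event: Dict[str, Any]) -> None:
        event_type = event.get("type")
        if event_type == "input_audio_buffer.speech_started":
            # The user started talking over the model: stop output and tell
            # subscribers the audio finished because of an interruption.
            self.interrupt()
            self._emit_audio_output_done_event(interrupted=True)
        elif event_type == "response.done":
            response = event.get("response", {})
            if response.get("status") == "cancelled":
                # Server-side cancellation is only logged in this sketch.
                self.calls.append("log_cancelled")


plugin = RealtimePluginSketch()
plugin.handle_server_event({"type": "input_audio_buffer.speech_started"})
plugin.handle_server_event(
    {"type": "response.done", "response": {"id": "resp_1", "status": "cancelled"}}
)
```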

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

The microphone gasps, a thin throat snapped—
a Boolean like a nail driven through night.
Sound dies and leaves its small, white flag;
the loop closes on a quiet, precise cruelty.
We name the cut and call the silence finished.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main objective of the changeset: implementing fixes for realtime turn detection with interruption event handling in Gemini and OpenAI plugins. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
plugins/openai/vision_agents/plugins/openai/openai_realtime.py (1)

311-312: Optional: include response/session identifiers in cancellation log.

Adding IDs here would make interruption/cancellation traces easier to correlate across providers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/openai/vision_agents/plugins/openai/openai_realtime.py` around lines
311 - 312, The cancellation log at the response_done_event.status branch lacks
identifiers; update the logger.debug call in the response handling (where
response_done_event.response.status == "cancelled") to include the response and
session identifiers (e.g., response_done_event.response.id and any session id
available on response_done_event such as response_done_event.session_id or
response_done_event.session.id) so traces can be correlated across providers,
ensuring you handle missing attributes safely when formatting the log message.
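One way to act on this nitpick is sketched below. It builds the log message with getattr so that missing attributes fall back to "unknown" instead of raising. The helper name describe_cancellation is hypothetical, and the session_id / session.id attributes are only the candidates suggested in the prompt above; whether they exist on the real event object is an assumption.

```python
import logging
from types import SimpleNamespace
from typing import Any

logger = logging.getLogger("openai_realtime_sketch")


def describe_cancellation(response_done_event: Any) -> str:
    """Build a cancellation log message, tolerating missing attributes."""
    response = getattr(response_done_event, "response", None)
    response_id = getattr(response, "id", None) or "unknown"

    # The session id location is an assumption: try both candidate attributes.
    session = getattr(response_done_event, "session", None)
    session_id = (
        getattr(response_done_event, "session_id", None)
        or getattr(session, "id", None)
        or "unknown"
    )
    return f"Response cancelled (response_id={response_id}, session_id={session_id})"


# Example event that has a response id but no session information.
event = SimpleNamespace(response=SimpleNamespace(id="resp_1", status="cancelled"))
message = describe_cancellation(event)
logger.debug(message)
```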

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ee505b35-51d9-49c7-aa09-833a52d625db

📥 Commits

Reviewing files that changed from the base of the PR and between bd67d66 and be272a3.

📒 Files selected for processing (4)
  • agents-core/vision_agents/core/llm/events.py
  • agents-core/vision_agents/core/llm/realtime.py
  • plugins/gemini/vision_agents/plugins/gemini/gemini_realtime.py
  • plugins/openai/vision_agents/plugins/openai/openai_realtime.py

@dangusev dangusev merged commit 51cc71b into main Apr 6, 2026
6 checks passed
@dangusev dangusev deleted the fix/realtime-turn-detection-fixes branch April 6, 2026 19:12