Feat/realtime voice vad p2#18
Merged
Transforms push-to-talk into a continuous conversation mode.

Frontend (aiChat.ts):
- AudioManager: reduce the ScriptProcessorNode buffer from 4096 to 2048 samples (128 ms/frame) for finer VAD granularity; add RMS-based silence detection with a ~0.8 s window (6 frames); add barge-in detection for when the user speaks while the AI is playing TTS; add an isMuted flag for per-frame gating
- VocaTaAIChat: startAudioCall() now starts recording immediately (GPT-voice style); listening restarts automatically after TTS ends; registers a VAD silence callback (automatic audio_end) and a barge-in callback; adds the muteMic()/unmuteMic()/micMuted API

Frontend (ChatPage.vue):
- Phone button starts the call and recording immediately (no mic click needed)
- Mic button is now a mute/unmute toggle (red when muted)
- voiceStatusText updated: "正在聆听..." (Listening...) | "麦克风已静音" (Microphone muted) | "AI 回答中" (AI is responding)
- Removed the old push-to-talk hint text
- Removed the VAD polling interval (now internal to AudioManager)

Backend (XunfeiWebSocketSttClient.java):
- vad_eos: 3000→1000 ms (frontend VAD handles ~0.8 s; the server is the fallback)

TTFA improvement: ~4-6 s → ~2.2 s
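The RMS-based silence detection described above can be sketched roughly as follows. This is a minimal illustration, not the code in aiChat.ts: the names `SilenceDetector`, `rms`, and `processFrame` are hypothetical, and the `RMS_THRESHOLD` value is an assumption. The 128 ms/frame figure implies a 16 kHz sample rate (2048 / 16000 s), so 6 quiet frames give a 768 ms window.

```typescript
// Hypothetical sketch; names and the threshold are assumptions,
// not taken from aiChat.ts.
const FRAME_SIZE = 2048;     // samples per frame; 128 ms at an assumed 16 kHz
const SILENCE_FRAMES = 6;    // 6 x 128 ms = 768 ms, i.e. the ~0.8 s window
const RMS_THRESHOLD = 0.01;  // assumed value; tune for the real microphone

// Root-mean-square level of one frame of PCM samples in [-1, 1].
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  frame.forEach((s) => {
    sum += s * s;
  });
  return Math.sqrt(sum / frame.length);
}

class SilenceDetector {
  private silentFrames = 0;
  private heardSpeech = false;
  private onSilence: () => void;

  constructor(onSilence: () => void) {
    this.onSilence = onSilence;
  }

  // Feed one frame. Fires onSilence once when speech has been heard and is
  // followed by SILENCE_FRAMES consecutive quiet frames.
  processFrame(frame: Float32Array, isMuted = false): void {
    if (isMuted) return; // per-frame gating: muted frames are ignored
    if (rms(frame) >= RMS_THRESHOLD) {
      this.heardSpeech = true;
      this.silentFrames = 0;
      return;
    }
    if (!this.heardSpeech) return; // leading silence does not end an utterance
    this.silentFrames += 1;
    if (this.silentFrames >= SILENCE_FRAMES) {
      this.heardSpeech = false;
      this.silentFrames = 0;
      this.onSilence();
    }
  }
}
```

In a setup like the one the PR describes, `onSilence` would be the place to send `audio_end`; requiring speech before counting silence keeps an idle microphone from ending an utterance that never started.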
Pull request overview
This PR upgrades the realtime voice chat experience toward “continuous listening” with client-side VAD and barge-in support, plus a UI mute control and related copy/docs updates.
Changes:
- Web: Move VAD state handling into `AudioManager`, add mic mute UI/state, and adjust voice status text.
- Web: Add VAD-based auto-stop on silence and barge-in trigger wiring for interrupting AI speech.
- Server/Docs: Tune Xunfei STT `vad_eos` and rewrite README to emphasize realtime voice interaction.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vocata-web/src/views/ChatPage.vue | Updates voice UI to support mute state and removes UI-side VAD polling. |
| vocata-web/src/utils/aiChat.ts | Implements VAD logic in AudioManager, continuous mode auto-restart, and barge-in callbacks. |
| vocata-server/src/main/java/com/vocata/ai/stt/impl/XunfeiWebSocketSttClient.java | Reduces STT end-of-speech timeout (vad_eos). |
| README.md | Replaces long-form intro with a more product/experience-oriented README and updated links. |
        return this.isMuted
      }

      async playAudio(audioBuffer: ArrayBuffer): Promise<void> {
        try {
Comment on lines +1180 to +1185
    this.audioManager.setBargeInCallback(() => {
      // Log message: "Barge-in: user interjected, interrupting AI"
      console.log('🎤 Barge-in:用户插话,打断 AI')
      this.audioManager.clearQueue()
      // Send audio_start → the server triggers handleBargeIn when it is in the SPEAKING state
      this.wsClient?.startAudioRecording()
    })
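The callback above runs once barge-in has been detected; the detection step itself is not shown in this excerpt. One plausible sketch is below, assuming the detector receives a per-frame RMS level and a flag saying whether TTS is playing. The class name `BargeInDetector`, both constants, and their values are hypothetical and not taken from aiChat.ts.

```typescript
// Hypothetical barge-in detector; threshold and frame count are assumptions.
const BARGE_IN_RMS = 0.03;   // assumed: set above the silence threshold to resist speaker echo
const BARGE_IN_FRAMES = 2;   // assumed: consecutive loud frames required before firing

class BargeInDetector {
  private loudFrames = 0;
  private onBargeIn: () => void;

  constructor(onBargeIn: () => void) {
    this.onBargeIn = onBargeIn;
  }

  // Call once per audio frame with that frame's RMS level.
  processFrame(frameRms: number, aiIsPlaying: boolean): void {
    if (!aiIsPlaying) {
      this.loudFrames = 0; // barge-in only applies while TTS is playing
      return;
    }
    this.loudFrames = frameRms >= BARGE_IN_RMS ? this.loudFrames + 1 : 0;
    // Fire exactly once per loud run, when the run first reaches the limit.
    if (this.loudFrames === BARGE_IN_FRAMES) this.onBargeIn();
  }
}
```

Requiring more than one consecutive loud frame trades a little latency (one extra 128 ms frame here) for fewer false interruptions caused by the AI's own audio leaking into the microphone.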