Feat/realtime voice stt p0 p1 by ailuckly · Pull Request #17 · ailuckly/VocaTa

ailuckly · 2026-04-15T17:07:01Z

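📌 Changes

e.g. Add a user login API
e.g. Fix the error message shown when login fails

✅ Testing

Passes when run locally
Self-tested
CI pipeline passes

PR submission reminders:

Make sure the commit subject follows the Conventional Commits specification (feat/fix/docs/style/refactor/test/chore)
Make sure the code has passed local tests
Make sure no sensitive information (passwords, keys, etc.) has been committed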

xunfei.stt.* was only defined in test/prod profiles, causing XunfeiWebSocketSttClient.isAvailable() to return false on the local profile and silently fall back to QiniuSttClient. Moving the env-var mappings to application.yml makes them available across all profiles. Also updates .env.example with XUNFEI_STT_* fields and corrects AI_STT_PROVIDER default to xunfei.

Three issues fixed:

1. Authorization missing base64 encoding: Xunfei API requires the authorization string to be base64-encoded before URL-encoding. Previously it was only URL-encoded, causing HTTP 401 on every connection.
2. Invalid business parameter: `max_rg` is not a valid Xunfei IAT field. Removed it along with other non-standard params (rlang, vinfo, speex_size) that caused "param validate error" responses.
3. wpgs mode returns only the last segment in the final frame: Added ConcurrentSkipListMap<sn, text> to accumulate recognition segments across frames. Handles pgs=rpl (replacement) and pgs=apd (append) so the final isFinal=true result contains the complete utterance.
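The segment accumulation in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `WpgsAccumulator`, its method names, and the `rg` argument (assumed here to be an inclusive `[from, to]` range of earlier `sn` values that a `pgs=rpl` frame replaces) are assumptions made for the example.

```java
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical helper: accumulates wpgs segments keyed by sequence number (sn). */
final class WpgsAccumulator {
    // A sorted map keeps segments in sn order no matter when they arrive.
    private final ConcurrentSkipListMap<Integer, String> segments = new ConcurrentSkipListMap<>();

    /**
     * @param sn   sequence number of this segment
     * @param pgs  "apd" to append, "rpl" to replace earlier segments
     * @param rg   for "rpl": assumed inclusive [from, to] range of sn values to drop; may be null
     * @param text recognized text carried by this segment
     */
    void accept(int sn, String pgs, int[] rg, String text) {
        if ("rpl".equals(pgs) && rg != null && rg.length == 2 && rg[0] <= rg[1]) {
            // Drop the segments that this frame supersedes.
            segments.subMap(rg[0], true, rg[1], true).clear();
        }
        segments.put(sn, text);
    }

    /** Concatenates all surviving segments in sn order. */
    String fullText() {
        return String.join("", segments.values());
    }
}
```

With this shape, the final frame would report `fullText()` rather than only the text of its own segment.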

Two related fixes:

1. saveMessage blocks Netty IO thread: Mono.fromCallable with a blocking JDBC call was running on the reactive pipeline thread (Netty IO), causing doFinally to be delayed. This left audioSink non-null when the user started the second recording, resulting in a silent "already in progress" rejection. Fixed by adding .subscribeOn(Schedulers.boundedElastic()).
2. Server error silently deadlocks frontend recording: When the server rejected audio_start with an error, handleError() only logged it. The frontend stayed in voiceState='recording' with no active server pipeline, so no STT results ever came back. Fixed by calling stopRecording() on any server error received while recording.
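The idea behind the first fix, moving a blocking call off the event-loop thread onto a pool meant for blocking work, can be illustrated with plain JDK classes. The PR itself uses Reactor's `.subscribeOn(Schedulers.boundedElastic())`; the sketch below swaps in `CompletableFuture` with a dedicated executor as a stand-in, and every name in it (`BlockingOffloadSketch`, `newBlockingPool`, the `blocking-pool` thread name) is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for offloading a blocking save to a separate pool. */
final class BlockingOffloadSketch {
    /** Pool reserved for blocking work; threads are named so they are easy to spot. */
    static ExecutorService newBlockingPool() {
        return Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "blocking-pool");
            t.setDaemon(true);
            return t;
        });
    }

    /**
     * Runs the (simulated) blocking save on the given pool instead of the caller's thread.
     * Returns the name of the thread that did the work, so the offload is observable.
     */
    static CompletableFuture<String> saveMessage(String content, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            // A real implementation would perform the blocking JDBC insert of `content` here.
            return Thread.currentThread().getName();
        }, pool);
    }
}
```

The caller's thread returns immediately with a future, so cleanup callbacks on that thread are not held up by the database call.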

Copilot

Pull request overview

This PR improves the realtime voice STT experience by enhancing client-side error handling and adding/adjusting server-side support for Xunfei WebSocket STT, including config wiring and response parsing suited for wpgs dynamic correction.

Changes:

Web: stop recording when a server error is received to avoid a stuck recording state.
Server: add Xunfei STT/TTS configuration entries and improve Xunfei WebSocket STT streaming response handling (wpgs segment accumulation + auth encoding update).
Server: run blocking message persistence on boundedElastic to avoid blocking reactive threads.
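The "auth encoding update" listed above is described in the commit message as base64-encoding the authorization string before URL-encoding it. A minimal sketch of just that step, assuming the unencoded authorization string has already been built elsewhere (the class and method names are hypothetical):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper showing the encoding order: base64 first, then URL-encode. */
final class AuthParamSketch {
    static String encodeAuthorization(String authorizationOrigin) {
        // Step 1: base64-encode the raw authorization string.
        String base64 = Base64.getEncoder()
                .encodeToString(authorizationOrigin.getBytes(StandardCharsets.UTF_8));
        // Step 2: URL-encode the result so '+', '/' and '=' survive in a query string.
        return URLEncoder.encode(base64, StandardCharsets.UTF_8);
    }
}
```

Skipping step 1 and only URL-encoding the raw string is the behavior the commit message reports as producing HTTP 401.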

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vocata-web/src/utils/aiChat.ts | Stops recording on server error to prevent the UI/audio pipeline from getting stuck. |
| vocata-server/src/main/resources/application.yml | Adds Xunfei STT/TTS configuration section. |
| vocata-server/src/main/java/com/vocata/ai/stt/impl/XunfeiWebSocketSttClient.java | Updates Xunfei STT WebSocket parsing for wpgs and adjusts authorization encoding. |
| vocata-server/src/main/java/com/vocata/ai/pipeline/StreamingPipelineOrchestrator.java | Schedules blocking DB work on `boundedElastic`. |
| .env.example | Adds Xunfei STT env vars and switches example STT provider to xunfei. |

Copilot · 2026-04-15T17:10:31Z

+    app-id: ${XUNFEI_STT_APP_ID:your-xunfei-stt-app-id}
+    api-key: ${XUNFEI_STT_API_KEY:your-xunfei-stt-api-key}
+    secret-key: ${XUNFEI_STT_SECRET_KEY:your-xunfei-stt-secret-key}


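This sets default placeholder values (your-xunfei-stt-...) for the xunfei.stt.* properties. However, XunfeiWebSocketSttClient#isAvailable() currently excludes only the placeholder strings "your-xunfei-app-id/api-key/secret-key". As a result, when only AI_STT_PROVIDER=xunfei is set and no real credentials are provided, appId/apiKey/secretKey fall back to these defaults and are wrongly treated as "configured", and the subsequent connection to Xunfei (iFLYTEK) fails. Suggestion: either remove the defaults (leave them empty so the application can clearly detect the missing configuration), or keep the placeholder check in isAvailable() consistent with the defaults here.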

Suggested change

-    app-id: ${XUNFEI_STT_APP_ID:your-xunfei-stt-app-id}
-    api-key: ${XUNFEI_STT_API_KEY:your-xunfei-stt-api-key}
-    secret-key: ${XUNFEI_STT_SECRET_KEY:your-xunfei-stt-secret-key}
+    app-id: ${XUNFEI_STT_APP_ID:}
+    api-key: ${XUNFEI_STT_API_KEY:}
+    secret-key: ${XUNFEI_STT_SECRET_KEY:}
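The second option in the review comment, keeping the availability check in step with the configured defaults, could look roughly like the sketch below. It is not the actual `isAvailable()` implementation: the class name is hypothetical, and it assumes every placeholder begins with `your-`, which holds for both the old (`your-xunfei-app-id`) and new (`your-xunfei-stt-app-id`) values quoted above.

```java
import java.util.Locale;

/** Hypothetical credential check that treats empty values and placeholders as "not configured". */
final class CredentialCheckSketch {
    static boolean isConfigured(String value) {
        if (value == null || value.isBlank()) {
            return false; // covers the empty-default variant (${XUNFEI_STT_APP_ID:})
        }
        // Assumption: all placeholder defaults start with "your-".
        return !value.trim().toLowerCase(Locale.ROOT).startsWith("your-");
    }

    static boolean isAvailable(String appId, String apiKey, String secretKey) {
        return isConfigured(appId) && isConfigured(apiKey) && isConfigured(secretKey);
    }
}
```

A prefix check like this covers both placeholder families at once, at the cost of rejecting a real credential that happened to start with `your-`.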

Copilot · 2026-04-15T17:10:32Z

+                            int code = response.path("code").asInt(-1);
+                            if (code != 0) {
+                                String msg = response.path("message").asText("未知错误");
+                                logger.error("🎤【科大讯飞WebSocket STT】API错误 code={}: {}", code, msg);
+                                SttResult errorResult = new SttResult();
+                                errorResult.setText("API错误: " + msg);
+                                errorResult.setConfidence(0.0);
+                                errorResult.setFinal(true);
+                                Map<String, Object> errMeta = new HashMap<>();
+                                errMeta.put("error", msg);
+                                errMeta.put("provider", "XunfeiWebSocketSTT");
+                                errorResult.setMetadata(errMeta);
+                                sink.next(errorResult);
+                                return WebSocket.Listener.super.onText(webSocket, data, last);
+                            }


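When response.code != 0, this branch only calls sink.next(errorResult) and then returns from onText; it does not close the WebSocket, stop the heartbeat, or terminate the sink. The audio stream therefore keeps sending frames, which can cause repeated errors and wasted resources, and the caller may never receive a completion signal. Suggestion: in this branch, clear heartbeatActive, send a close on webSocket, and call sink.complete() or sink.error(...) (and make sure no further audio frames are sent).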
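One way to structure the teardown this comment asks for is a single "active" flag that is flipped exactly once, with the close and sink termination hanging off that transition. The sketch below is an assumption-laden illustration: `SttSessionSketch` and its members are hypothetical, and the WebSocket close and sink error are abstracted as a `Runnable` and a `Consumer` (in the real client they would presumably be something like `webSocket.sendClose(...)` and `sink.error(...)`).

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical session wrapper that tears down exactly once on an API error. */
final class SttSessionSketch {
    // Doubles as the "keep sending heartbeats/audio" flag in this sketch.
    private final AtomicBoolean active = new AtomicBoolean(true);
    private final Runnable closeSocket;        // stand-in for closing the WebSocket
    private final Consumer<Throwable> failSink; // stand-in for terminating the result sink

    SttSessionSketch(Runnable closeSocket, Consumer<Throwable> failSink) {
        this.closeSocket = closeSocket;
        this.failSink = failSink;
    }

    /** Senders should check this before pushing another audio frame or heartbeat. */
    boolean isActive() {
        return active.get();
    }

    /** Returns true if processing may continue; on the first error, closes and fails the sink. */
    boolean onResponse(int code, String message) {
        if (code == 0) {
            return active.get();
        }
        if (active.compareAndSet(true, false)) {
            closeSocket.run();
            failSink.accept(new IllegalStateException("STT API error " + code + ": " + message));
        }
        return false;
    }
}
```

Because the flag flips only once, repeated error frames do not trigger a second close or a second terminal signal.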

Copilot · 2026-04-15T17:10:32Z

+                            JsonNode dataNode = response.path("data");
+                            int frameStatus = dataNode.path("status").asInt(-1);
+                            JsonNode result = dataNode.path("result");
+


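With the parsing logic inlined here, the class's original parseWebSocketResponse(...) is no longer called (there are no other references to it in the repository either). That leaves dead code behind and increases the risk of maintenance overhead and logic drift. Suggestion: delete the method, or reuse it to parse the response and build the wpgs segment merging on top of it.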

Copilot · 2026-04-15T17:10:32Z

    private Mono<Message> saveMessage(Long conversationId, String content, SenderType senderType, Long userId) {
-        return Mono.fromCallable(() -> {
-            Message message = new Message();
+        return Mono.fromCallable(() -> {            Message message = new Message();


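The opening of the Mono.fromCallable lambda has been squeezed onto a single line ({ Message message...), which looks like an accidentally deleted line break/indentation. This can trip formatter or Checkstyle/lint rules and hurts readability. Suggestion: move the statement after { onto its own line and indent it according to the project's usual format.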

Suggested change

-        return Mono.fromCallable(() -> { Message message = new Message();
+        return Mono.fromCallable(() -> {
+            Message message = new Message();

ailuckly added 3 commits April 16, 2026 00:59

Copilot AI review requested due to automatic review settings April 15, 2026 17:07

ailuckly merged commit ad98175 into develop Apr 15, 2026
2 checks passed

Copilot started reviewing on behalf of ailuckly April 15, 2026 17:07

Copilot AI reviewed Apr 15, 2026

ailuckly deleted the feat/realtime-voice-stt-p0-p1 branch May 7, 2026 12:03
