fix: correct Qiniu STT endpoint and robust response parsing by ailuckly · Pull Request #14 · ailuckly/VocaTa

ailuckly · 2026-04-12T12:06:27Z

Summary

Fix default STT endpoint from openai.qiniu.com/v1 → api.qnaigc.com/v1 (the correct host for voice/asr API)
Improve parseAsrResponse to handle multiple response formats (data.text, data.result.text, top-level text) and correctly set error metadata so isSttError() filter in Orchestrator properly detects and discards error results before LLM
Add subscribeOn(boundedElastic) to uploadAudioToQiniu to avoid blocking Reactor thread during sync file upload
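The parsing change can be pictured with a minimal sketch. It is not the project's `parseAsrResponse`: the `Result` type, the error message strings, and the choice to treat blank text as an error are assumptions made for illustration. Only the three response shapes and the `error` metadata key come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; class, field, and message names are assumptions. */
final class AsrParseSketch {

    /** Hypothetical stand-in for the project's STT result type. */
    static final class Result {
        String text = "";
        final Map<String, Object> metadata = new HashMap<>();
    }

    /** Tries data.text, then data.result.text, then top-level text. */
    static Result parse(Map<String, Object> response) {
        Result result = new Result();
        Object error = response.get("error");
        if (error != null) {
            // A top-level error is recorded in metadata so callers can filter on it.
            result.metadata.put("error", error.toString());
            return result;
        }
        String text = null;
        Object data = response.get("data");
        if (data instanceof Map) {
            Map<?, ?> d = (Map<?, ?>) data;
            text = nonBlank(d.get("text"));
            Object inner = d.get("result");
            if (text == null && inner instanceof Map) {
                text = nonBlank(((Map<?, ?>) inner).get("text"));
            }
        }
        if (text == null) {
            text = nonBlank(response.get("text"));
        }
        if (text == null) {
            // Assumption: anything without usable text is flagged as an error.
            result.metadata.put("error", "no recognizable text in ASR response");
        } else {
            result.text = text;
        }
        return result;
    }

    /** True when the result carries error metadata. */
    static boolean isSttError(Result r) {
        return r.metadata.containsKey("error");
    }

    /** Returns the value as a non-blank string, or null. */
    private static String nonBlank(Object value) {
        if (value instanceof String && !((String) value).trim().isEmpty()) {
            return (String) value;
        }
        return null;
    }
}
```

Keeping the error in metadata rather than in the text field lets a downstream filter separate failures from recognitions without inspecting message contents.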

Root cause

The QINIU_STT_ENDPOINT was never set on the staging server, so it fell back to the default https://openai.qiniu.com/v1 instead of the correct https://api.qnaigc.com/v1, causing all /voice/asr calls to fail with "access denied".
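The fallback behavior described here can be sketched in plain Java. The project presumably resolves this value through its `application.yml` configuration; the `SttEndpointResolver` class below is hypothetical and only illustrates why an unset variable silently selects the default.

```java
import java.util.Map;

/** Hypothetical helper; the real project resolves this through configuration. */
final class SttEndpointResolver {
    // Default host as stated in this PR.
    static final String DEFAULT_ENDPOINT = "https://api.qnaigc.com/v1";

    /** Returns the configured endpoint, or the default when the variable is unset or blank. */
    static String resolve(Map<String, String> env) {
        String configured = env.get("QINIU_STT_ENDPOINT");
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ENDPOINT;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        // Prints the endpoint that would be used in the current environment.
        System.out.println(resolve(System.getenv()));
    }
}
```

With this shape, a server that never sets `QINIU_STT_ENDPOINT` gets whatever the default is, so a wrong default fails on every call rather than intermittently.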

Test plan

Voice chat: verify STT recognition triggers and returns recognized text in stt_result events
Verify STT errors no longer get sent to LLM as user input

- QiniuSttClient: add error metadata to failed STT results so they can be distinguished from valid recognition results - StreamingPipelineOrchestrator: filter out STT error results before forwarding to LLM pipeline. Errors still sent to client for display.

Docker layer caching caused deployments to skip recompilation even when source code changed via git pull. Split into explicit build --no-cache + up --wait to ensure fresh builds on every deploy.

- Change default QINIU_STT_ENDPOINT from openai.qiniu.com to api.qnaigc.com (the correct API host for qnaigc voice/asr) - Fix parseAsrResponse to handle multiple response formats (data.text, data.result.text, top-level text) and correctly set error metadata so isSttError() filter in Orchestrator can detect API errors - Add subscribeOn(boundedElastic) to uploadAudioToQiniu to avoid blocking Reactor thread during synchronous file upload

…t/realtime-voice-stt-p0-p1

Copilot

Pull request overview

This PR fixes Qiniu STT integration issues by updating the default ASR endpoint, improving ASR response parsing/error signaling, and adjusting execution to avoid blocking Reactor threads during audio upload.

Changes:

Update default Qiniu STT endpoint to https://api.qnaigc.com/v1.
Make parseAsrResponse handle multiple response shapes and attach error metadata for downstream filtering.
Run synchronous upload work on boundedElastic and tweak staging deploy build steps.
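The idea behind moving the upload off the Reactor thread can be shown with a standard-library analogy. This sketch swaps in `CompletableFuture` and an `ExecutorService` in place of Reactor's `subscribeOn(Schedulers.boundedElastic())`, so it is not the PR's code; `blockingUpload` is a hypothetical stand-in for the synchronous upload.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Standard-library analogy for offloading blocking work; not the PR's Reactor code. */
final class BlockingOffloadSketch {

    /** Hypothetical blocking upload; returns the name of the thread it ran on. */
    static String blockingUpload(byte[] audio) {
        try {
            Thread.sleep(50); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Thread.currentThread().getName();
    }

    /** Runs the blocking call on the given pool so the caller's thread stays free. */
    static CompletableFuture<String> uploadAsync(byte[] audio, ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(() -> blockingUpload(audio), blockingPool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            String worker = uploadAsync(new byte[0], pool).join();
            System.out.println("upload ran on " + worker
                    + ", caller is " + Thread.currentThread().getName());
        } finally {
            pool.shutdown();
        }
    }
}
```

In Reactor the equivalent shape is typically `Mono.fromCallable(...)` followed by `subscribeOn(Schedulers.boundedElastic())`, which moves the blocking call onto a scheduler intended for such work.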

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `vocata-server/src/main/resources/application.yml` | Updates default STT endpoint configuration. |
| `vocata-server/src/main/java/com/vocata/ai/stt/impl/QiniuSttClient.java` | Adds error metadata and broadens ASR response parsing; offloads upload to boundedElastic. |
| `vocata-server/src/main/java/com/vocata/ai/pipeline/StreamingPipelineOrchestrator.java` | Filters STT error results (via metadata) out of LLM path. |
| `.github/workflows/cd-staging.yml` | Splits compose build/up and forces no-cache builds. |


Copilot · 2026-04-12T12:09:57Z

@@ -144,13 +144,13 @@ public Flux<PipelineEvent> processVoiceMessage(String conversationUuid,
                            .filter(r -> r.getText() != null && !r.getText().trim().isEmpty())


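sttFlux first filters on non-empty text (line 144), but some STT error branches currently set text to an empty string (for example, the error and unknown-format branches of Qiniu's parseAsrResponse). This makes the actual behavior inconsistent with the comment stating that error results let the frontend know what happened, and the voice pipeline may end silently when STT fails (with no STT or Error event emitted). Suggest letting error results through (for example, have the filter accept results for which isSttError(r) is true, or have the STT side set a displayable non-empty text for errors).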

Suggested change

- .filter(r -> r.getText() != null && !r.getText().trim().isEmpty())
+ .filter(r -> isSttError(r) || (r.getText() != null && !r.getText().trim().isEmpty()))
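The effect of the suggested predicate can be checked with a small sketch that uses `java.util.stream` in place of the orchestrator's `Flux`. The `SttResult` type and the `forClient` and `forLlm` helpers are hypothetical; only the predicate itself and the name `isSttError` are taken from the suggestion.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative sketch of the two-stage filtering; type and helper names are assumptions. */
final class SttFilterSketch {

    /** Hypothetical stand-in for the project's STT result type. */
    static final class SttResult {
        final String text;
        final Map<String, Object> metadata;

        SttResult(String text, Map<String, Object> metadata) {
            this.text = text;
            this.metadata = metadata;
        }

        String getText() {
            return text;
        }
    }

    /** Assumption: an error is signalled by an "error" key in the metadata. */
    static boolean isSttError(SttResult r) {
        return r.metadata != null && r.metadata.containsKey("error");
    }

    /** Results shown to the client: errors plus non-empty recognitions. */
    static List<SttResult> forClient(List<SttResult> results) {
        return results.stream()
                .filter(r -> isSttError(r)
                        || (r.getText() != null && !r.getText().trim().isEmpty()))
                .collect(Collectors.toList());
    }

    /** Results forwarded to the LLM: whatever reached the client, minus errors. */
    static List<SttResult> forLlm(List<SttResult> results) {
        return forClient(results).stream()
                .filter(r -> !isSttError(r))
                .collect(Collectors.toList());
    }
}
```

With the original predicate, an error result whose text is empty would be dropped before the client stage; with the suggested one it reaches the client and is removed only on the LLM path.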

Copilot · 2026-04-12T12:09:57Z

+            // 检查顶层错误字段 ({"error": "..."} 或 {"error": {...}})
+            if (response.containsKey("error") && response.get("error") != null) {
+                String errorMessage = response.get("error").toString();
                logger.error("七牛云ASR API错误: {}", errorMessage);
+                metadata.put("error", errorMessage);
+                result.setText("");
+                result.setConfidence(0.0);
            } else if (response.containsKey("data")) {


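When ASR returns a top-level error or an unknown format, result.text is set to "" here (lines 469-470 and 494-495). In the current orchestrator, STT results are first filtered on non-empty text, so these errors are dropped outright: the frontend sees no error message, and the downstream flow may complete with no input. Suggest keeping metadata.error while also setting text to a displayable error message (or having the upstream generate and forward an error event based on metadata.error).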

Copilot · 2026-04-12T12:09:58Z

+        // 格式1: data.text
+        if (data.containsKey("text")) {
+            String t = (String) data.get("text");
+            return StringUtils.hasText(t) ? t : "";
+        }


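extractTextFromData returns an empty string when the field exists but its content is blank (line 522), while the caller parseAsrResponse treats text != null as success and assigns a confidence of 0.95 (lines 476-479). This misclassifies an empty result as a success and records a high confidence. Suggest treating blank text as unrecognized (return null, or have the caller use StringUtils.hasText(text) to decide success) and keeping the confidence at 0 when nothing is recognized.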
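One way to apply this suggestion is sketched below. The method name mirrors the one under review, but the body and the `confidenceFor` helper are assumptions, and only the `data.text` shape is handled to keep the example short.

```java
import java.util.Map;

/** Illustrative sketch: blank text counts as "not recognized". Helper names are assumptions. */
final class BlankTextSketch {

    /** Returns data.text when it has visible content, otherwise null. */
    static String extractTextFromData(Map<String, Object> data) {
        Object value = data.get("text");
        if (value instanceof String && !((String) value).trim().isEmpty()) {
            return (String) value;
        }
        return null; // missing, non-string, or blank
    }

    /** High confidence only when text was actually recognized (0.95 is the value cited in the review). */
    static double confidenceFor(String text) {
        return text != null ? 0.95 : 0.0;
    }
}
```

Returning null for blank input keeps the caller's existing `text != null` check meaningful, so the success branch and its confidence value apply only to real recognitions.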

Copilot · 2026-04-12T12:09:58Z

        echo "[2/5] deploy compose stack"
        echo "当前提交: $GIT_SHA"
-        docker compose --env-file "$ENV_FILE" up -d --build --wait
+        docker compose --env-file "$ENV_FILE" build --no-cache


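The staging deploy flow now runs docker compose build --no-cache on every deploy (line 71), which discards the image layer cache entirely and significantly increases build and deploy time as well as disk and network load on the server. If the goal is only to pick up the latest code, building with the cache is usually more appropriate. Suggest removing --no-cache, or using a no-cache build only on manual triggers or when an explicit parameter enables it.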

Suggested change

- docker compose --env-file "$ENV_FILE" build --no-cache
+ docker compose --env-file "$ENV_FILE" build

…ture

Xunfei IAT requires raw PCM (audio/L16;rate=16000) but MediaRecorder outputs webm/opus containers. Switch to AudioWorklet-compatible ScriptProcessorNode approach:

- Create dedicated 16kHz AudioContext for recording
- Capture Float32 samples, convert to Int16 PCM in-line
- Send 4096-frame chunks (~256ms) directly as binary WebSocket frames
- Remove MediaRecorder, handleAudioChunk, handleMediaRecorderStop
- Update audio_start format field from 'mediarecorder' to 'pcm'
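The Float32-to-Int16 conversion and the chunk-duration figure in this commit can be illustrated as follows. The commit's actual code runs in the browser; this Java sketch only shows the arithmetic, and the `PcmSketch` class, the little-endian byte order, and the asymmetric scaling are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative sketch of Float32 to 16-bit PCM conversion; names and byte order are assumptions. */
final class PcmSketch {

    /** Converts samples in [-1.0, 1.0] to signed 16-bit PCM bytes (little-endian assumed). */
    static byte[] floatToPcm16(float[] samples) {
        ByteBuffer out = ByteBuffer.allocate(samples.length * 2).order(ByteOrder.LITTLE_ENDIAN);
        for (float s : samples) {
            // Clamp first so out-of-range input cannot overflow the 16-bit range.
            float clamped = Math.max(-1.0f, Math.min(1.0f, s));
            // Asymmetric scaling: -1.0 maps to -32768 and 1.0 maps to 32767.
            int v = clamped < 0 ? Math.round(clamped * 32768f) : Math.round(clamped * 32767f);
            out.putShort((short) v);
        }
        return out.array();
    }

    /** Duration of one chunk in milliseconds. */
    static double chunkMillis(int frames, int sampleRate) {
        return 1000.0 * frames / sampleRate;
    }

    public static void main(String[] args) {
        System.out.println(chunkMillis(4096, 16000) + " ms per chunk");
        System.out.println(floatToPcm16(new float[4096]).length + " bytes per chunk");
    }
}
```

At 16 kHz, 4096 frames last 4096 / 16000 = 0.256 s, which matches the ~256ms figure in the commit message, and each chunk occupies 8192 bytes as 16-bit mono PCM.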

ailuckly added 4 commits April 12, 2026 13:02

fix: use --no-cache in staging CD to prevent stale Docker builds

018d42d


Merge branch 'develop' of https://github.com/ailuckly/VocaTa into fea…

5c978ab

…t/realtime-voice-stt-p0-p1

Copilot AI review requested due to automatic review settings April 12, 2026 12:06

Copilot started reviewing on behalf of ailuckly April 12, 2026 12:07

Copilot AI reviewed Apr 12, 2026


ailuckly merged commit 74b4e80 into develop Apr 12, 2026
5 checks passed

ailuckly merged 5 commits into develop from feat/realtime-voice-stt-p0-p1
