Spoken commands fail with the error "No Voice detected" (and Home Assistant logs show stt-no-text-recognized) when the Wyoming STT engine is configured to use the Sherpa-ONNX library (e.g., Parakeet-TDT models).
The wake word is detected perfectly, and the VAD (Voice Activity Detection) triggers correctly in the logs, but no text is transcribed.
Technical Details:
App Version: Ava v0.4.5 (knoop7 fork)
STT Engine: rhasspy/wyoming-whisper:latest
Library: sherpa (specifically sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8)
Behavior:
Switching back to the standard faster-whisper library with the medium-int8 model resolves the issue immediately.
The brownard/Ava fork works correctly with the Sherpa library, suggesting an audio encoding or streaming protocol mismatch in the knoop7 audio pipeline when communicating with Sherpa-based Wyoming servers.
- type: stt-vad-start
  timestamp: "2026-04-08T07:45:41.956267+00:00"
- type: stt-vad-end
  timestamp: "2026-04-08T07:45:42.835689+00:00"
- type: error
  data:
    code: stt-no-text-recognized
    message: No text recognized
It seems the audio stream format or chunking provided by this fork is not compatible with the stricter requirements of the Sherpa library.
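To help narrow this down, here is a rough sketch of a check that could be run on audio chunks captured from both forks. It assumes the STT server expects 16 kHz, 16-bit, mono PCM, which is the usual convention for Home Assistant voice pipelines but is not confirmed for the Sherpa backend. `ChunkInfo` and `check_stream` are hypothetical names made up for illustration; they are not part of Ava, Wyoming, or sherpa-onnx.

```python
from dataclasses import dataclass

# Assumed target format (not confirmed for the Sherpa backend):
# 16 kHz sample rate, 2 bytes per sample, 1 channel.
EXPECTED_RATE = 16000
EXPECTED_WIDTH = 2
EXPECTED_CHANNELS = 1


@dataclass
class ChunkInfo:
    """Hypothetical record of one captured audio chunk and its declared format."""
    rate: int        # sample rate in Hz
    width: int       # bytes per sample
    channels: int    # number of channels
    payload: bytes   # raw PCM bytes


def check_stream(chunks):
    """Return a list of human-readable problems found in a captured chunk stream."""
    problems = []
    total_bytes = 0
    for i, chunk in enumerate(chunks):
        declared = (chunk.rate, chunk.width, chunk.channels)
        if declared != (EXPECTED_RATE, EXPECTED_WIDTH, EXPECTED_CHANNELS):
            problems.append(
                f"chunk {i}: unexpected format "
                f"{chunk.rate} Hz, {chunk.width} bytes/sample, {chunk.channels} ch"
            )
        # A chunk that splits a sample in half shifts every following sample.
        frame_size = chunk.width * chunk.channels
        if frame_size <= 0 or len(chunk.payload) % frame_size != 0:
            problems.append(
                f"chunk {i}: {len(chunk.payload)} bytes is not a whole number of frames"
            )
        total_bytes += len(chunk.payload)
    if total_bytes == 0:
        problems.append("stream contains no audio data")
    return problems
```

If the chunks captured from the knoop7 fork report problems here and the ones from brownard/Ava do not, that would point at the format or chunking rather than at the Sherpa model itself.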