Custom Component replacement for Home Assistant
This custom component replaces the official Microsoft TTS integration for Home Assistant, which has not been updated or maintained for a long time and is now legacy.
If you already have the original Microsoft TTS integration configured via configuration.yaml, you must remove that configuration.
IMPORTANT: Before removing it, save:
- Your API key
- The server region (e.g., westeurope, eastus)
Remove lines similar to these from your configuration.yaml:
tts:
  - platform: microsoft
    api_key: YOUR_API_KEY
    region: YOUR_REGION
Click this badge to install Microsoft Text-to-Speech (TTS) via HACS
Manual
Copy the custom_components folder to your Home Assistant configuration directory (where the configuration.yaml file is located).
The final structure should be:
config/
├── custom_components/
│ └── microsoft/
│ ├── __init__.py
│ ├── config_flow.py
│ ├── const.py
│ ├── manifest.json
│ └── tts.py
└── configuration.yaml
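Before restarting, the layout above can be checked with a small script. This is only a convenience sketch: the helper name `missing_component_files` is hypothetical and not part of the integration, and the `/config` path in the usage line is an assumption that depends on how Home Assistant is installed.

```python
from pathlib import Path

# File names taken from the directory structure listed above.
EXPECTED_FILES = (
    "__init__.py",
    "config_flow.py",
    "const.py",
    "manifest.json",
    "tts.py",
)


def missing_component_files(config_dir: str) -> list[str]:
    """Return the expected files that are absent under custom_components/microsoft."""
    component_dir = Path(config_dir) / "custom_components" / "microsoft"
    return [name for name in EXPECTED_FILES if not (component_dir / name).is_file()]


if __name__ == "__main__":
    # "/config" is an assumed location; pass your own configuration directory.
    missing = missing_component_files("/config")
    print("All files present" if not missing else f"Missing: {missing}")
```

An empty result means every file from the listing is in place.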
Restart Home Assistant completely to load the new custom component.
After restarting Home Assistant, click this badge to configure Microsoft Text-to-Speech (TTS)
Manual
- Go to Settings → Devices & Services → Integrations
- Click the + Add Integration button
- Search for "Microsoft Text-to-Speech (TTS)"
- Follow the guided configuration process
- Enter the API key and server region that you saved previously
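For context on why both values are needed: the Azure Speech REST endpoint is addressed per region and authenticated with the key. The sketch below only assembles a request and does not send it. The function name `build_tts_request`, the `User-Agent` value, and the chosen output format are illustrative assumptions; the integration's own request may differ.

```python
def build_tts_request(
    region: str,
    api_key: str,
    ssml: str,
    output_format: str = "audio-16khz-32kbitrate-mono-mp3",
) -> tuple[str, dict[str, str], bytes]:
    """Assemble URL, headers and body for an Azure text-to-speech REST call."""
    # The region selects the host, e.g. westeurope or eastus.
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        # The API key authenticates the request.
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        # Placeholder application name (an assumption for this sketch).
        "User-Agent": "home-assistant-tts-sketch",
    }
    return url, headers, ssml.encode("utf-8")
```

A key issued for one region is generally rejected by another region's endpoint, which is why both values have to be saved together.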
This integration now supports streaming text-to-speech for reduced latency in voice assistant pipelines. When used with LLM conversation agents:
- Sentence-by-sentence synthesis: Audio is generated and played as soon as each sentence is complete, rather than waiting for the entire response
- 50-70% latency reduction: Users hear the first sentence while the LLM is still generating subsequent text
- Multi-language support: Intelligent sentence detection for 140+ languages including:
  - Latin scripts (English, Italian, Spanish, etc.)
  - CJK languages (Chinese, Japanese, Korean)
  - Arabic and Urdu
  - Indic scripts (Hindi, Bengali, Marathi, etc.)
- Full SSML support: Maintains all voice customization options (voice, rate, pitch, volume, style, role) in streaming mode
- SSML sanitization: Special characters in the input text (such as &, < and >) are escaped before the text is embedded in SSML.
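To illustrate how voice options and sanitization fit together, here is a minimal sketch of building an SSML document with escaped text. It is not the integration's code: `build_ssml` is a hypothetical helper, only `rate` and `pitch` are shown, and the voice name in the usage note is just an example.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text: str,
    voice: str,
    language: str = "en-US",
    rate: Optional[str] = None,
    pitch: Optional[str] = None,
) -> str:
    """Wrap text in a speak/voice/prosody structure, escaping special characters."""
    # escape() converts &, < and > so the text cannot break the XML.
    body = escape(text)

    # Only include prosody attributes that were actually provided.
    prosody = {key: value for key, value in (("rate", rate), ("pitch", pitch)) if value}
    if prosody:
        attrs = " ".join(f"{key}={quoteattr(value)}" for key, value in prosody.items())
        body = f"<prosody {attrs}>{body}</prosody>"

    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )
```

For example, `build_ssml("Tom & Jerry", "en-US-JennyNeural", rate="+10%")` yields a document in which the ampersand appears as `&amp;`, so the result stays well-formed XML.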
The streaming implementation uses the async_stream_tts_audio method introduced in Home Assistant's TTS architecture:
- Text accumulation: Incoming text chunks from the LLM are accumulated until a sentence boundary is detected
- Sentence synthesis: Each complete sentence is synthesized independently using the Azure TTS REST API
- Audio streaming: Audio chunks are streamed to Home Assistant as they arrive from Azure
- Immediate playback: Home Assistant begins playback without waiting for the complete response
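The text accumulation step above can be sketched as an async generator that buffers incoming chunks and emits a sentence as soon as a boundary appears. This is a simplified illustration, not the integration's implementation: `iter_sentences` is a hypothetical name, and the boundary rule covers only a handful of terminators rather than the full multi-language detection.

```python
import asyncio
import re
from collections.abc import AsyncIterator

# Simplified boundary rule (an assumption for this sketch):
# - Latin-style terminators count only when followed by whitespace,
#   so "3." arriving before "14" is not split.
# - CJK, Arabic/Urdu and Devanagari terminators count on their own.
_BOUNDARY = re.compile(r"[.!?](?=\s)|[。！？؟۔।]")


async def iter_sentences(chunks: AsyncIterator[str]) -> AsyncIterator[str]:
    """Yield complete sentences from a stream of arbitrary text chunks."""
    buffer = ""
    async for chunk in chunks:
        buffer += chunk
        # Emit every complete sentence currently in the buffer.
        while match := _BOUNDARY.search(buffer):
            sentence = buffer[: match.end()].strip()
            buffer = buffer[match.end():]
            if sentence:
                yield sentence
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


async def _demo() -> None:
    async def fake_llm() -> AsyncIterator[str]:
        # Stand-in for text arriving from a conversation agent.
        for part in ["Hello wor", "ld. How are", " you? Fine"]:
            yield part

    async for sentence in iter_sentences(fake_llm()):
        print(sentence)


if __name__ == "__main__":
    asyncio.run(_demo())
```

Each yielded sentence could then be synthesized on its own, which is what allows playback to begin before the full response has been generated.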
Note: Streaming requires Home Assistant 2024.2+.
- Home Assistant 2024.2 or later
- Azure Cognitive Services Speech API key
Developed by @pajeronda
Integration based on:
This project is released under the MIT License. See LICENSE for details.
- API Usage: This integration requires an active Microsoft Azure account and a valid API key. Use of the Azure Cognitive Services API is subject to Microsoft's terms of service.
- Trademarks: Microsoft and related logos are registered trademarks of Microsoft Corp. This project is an unofficial integration developed by @pajeronda and is not affiliated with, sponsored by, or endorsed by Microsoft Corp.