Replies: 1 comment
sometimes remote, but I'll be doing the VPN (tailscale)
Key constraints/ingredients that matter for “peer-to-peer voice” in LVA:
- Audio runs through the sound card on PulseAudio / PipeWire-Pulse, with optional module-echo-cancel AEC.
- Each device has a device_id + per-device topics.

Here are several viable P2P voice directions (ordered from “most practical” to “most minimal”):
Option A: WebRTC (aiortc) + MQTT signaling (recommended)
What it is: Real-time full-duplex audio using WebRTC’s battle-tested jitter buffering, Opus codec, congestion control, and encryption (SRTP).
Signaling: Use your existing MQTT broker to exchange SDP offers/answers + ICE candidates.
Pros
Cons
How it fits LVA
- Add a CALLING/IN_CALL state (or keep it parallel, but it’s usually easier to treat as a state).
- While IN_CALL, you likely disable wake word (or require push-to-talk) to avoid accidental triggers.
MQTT topic sketch
- lva/{target_id}/call/invite (payload: {call_id, from_id, sdp_offer, timestamp, codec_prefs})
- lva/{from_id}/call/answer (payload: {call_id, sdp_answer})
- lva/{from_id}/call/ice and lva/{target_id}/call/ice (payload: {call_id, candidate})
- lva/{peer_id}/call/bye (payload: {call_id, reason})
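The invite/answer leg of this topic scheme can be sketched end to end. The `DummyBroker` below is a hypothetical in-memory stand-in for the MQTT broker (in the real system this would be paho-mqtt against your existing broker); the handler names and the `<offer-sdp>` placeholders are assumptions, not project code:

```python
import json

class DummyBroker:
    """In-memory stand-in for an MQTT broker: topic -> list of callbacks."""
    def __init__(self):
        self.subs = {}

    def subscribe(self, topic, cb):
        self.subs.setdefault(topic, []).append(cb)

    def publish(self, topic, payload: dict):
        for cb in self.subs.get(topic, []):
            cb(topic, json.dumps(payload))

def invite_topic(target_id):   # lva/{target_id}/call/invite
    return f"lva/{target_id}/call/invite"

def answer_topic(from_id):     # lva/{from_id}/call/answer
    return f"lva/{from_id}/call/answer"

broker = DummyBroker()
log = []

# Callee side: on invite, publish an SDP answer back to the caller.
def on_invite(topic, raw):
    msg = json.loads(raw)
    log.append(("invite", msg["call_id"]))
    broker.publish(answer_topic(msg["from_id"]),
                   {"call_id": msg["call_id"], "sdp_answer": "<answer-sdp>"})

# Caller side: record the answer so the WebRTC session can proceed.
def on_answer(topic, raw):
    log.append(("answer", json.loads(raw)["call_id"]))

broker.subscribe(invite_topic("kitchen"), on_invite)
broker.subscribe(answer_topic("office"), on_answer)

# Office rings the kitchen.
broker.publish(invite_topic("kitchen"), {
    "call_id": "c1", "from_id": "office",
    "sdp_offer": "<offer-sdp>", "timestamp": 0, "codec_prefs": ["opus"],
})
```

The ICE and bye topics follow the same publish/subscribe pattern, keyed by call_id.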
Security
- Device allowlist in config.json + optionally a shared secret/HMAC on signaling messages so random MQTT clients can’t ring every device.
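The shared-secret/HMAC idea needs nothing beyond the standard library. A sketch, assuming the secret is distributed via config.json and the signature rides in a sig field (both are assumptions about naming, not settled design):

```python
import hmac
import hashlib
import json

SECRET = b"shared-secret-from-config"  # hypothetical: loaded from config.json

def sign(payload: dict) -> dict:
    """Return a copy of the signaling payload with an HMAC-SHA256 signature."""
    body = json.dumps(payload, sort_keys=True).encode()
    signed = dict(payload)
    signed["sig"] = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return signed

def verify(message: dict) -> bool:
    """Recompute the HMAC over everything except sig and compare in constant time."""
    msg = dict(message)
    sig = msg.pop("sig", "")
    body = json.dumps(msg, sort_keys=True).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

Sorting keys before hashing keeps the signature stable regardless of dict ordering; hmac.compare_digest avoids timing side channels on the comparison.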
Option B: RTP/Opus over UDP + MQTT signaling (lighter than WebRTC)
What it is: You run your own RTP session (Opus frames) between peers; MQTT still does call setup.
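Running your own RTP session means packing the fixed 12-byte RTP header yourself before each Opus frame. A minimal sketch with struct; payload type 96 (a dynamic type) for Opus and 48 kHz timestamp units (960 per 20 ms frame) follow common Opus-over-RTP practice:

```python
import struct

def rtp_header(seq: int, timestamp: int, ssrc: int, payload_type: int = 96) -> bytes:
    """Minimal RTP header: version 2, no padding/extension/CSRC, marker clear."""
    byte0 = 2 << 6                # V=2 in the top two bits
    byte1 = payload_type & 0x7F   # marker bit (MSB) left clear
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def parse_seq(packet: bytes) -> int:
    """Pull the 16-bit sequence number back out (for jitter-buffer ordering)."""
    return struct.unpack("!H", packet[2:4])[0]

# One 20 ms Opus frame -> timestamp advances by 960 at 48 kHz.
pkt = rtp_header(seq=7, timestamp=960, ssrc=0xDEADBEEF) + b"<opus frame>"
```

This is exactly the part of Option B you get for free with WebRTC: sequence numbering, timestamping, and the reorder/jitter logic that consumes them are all on you here.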
Pros
Cons
When it’s attractive
Option C: “Intercom broadcast” (multicast or fanout) for announcements
What it is: One device sends a short live stream (or recorded snippet) to many devices (kitchen announcement mode).
Pros
Cons
Fit
Option D: Use HA as rendezvous / relay (not truly P2P)
What it is: HA coordinates calls and can even relay media (or run a voice server).
Pros
Cons
My suggested path (fastest to “good”, least regrets)
Milestone 1 (simple + useful): Half-duplex intercom (push-to-talk / hold-to-talk).
Milestone 2: Upgrade to full duplex “calls” using WebRTC (Option A).
This sequencing keeps you from sinking time into homegrown jitter buffers + echo problems before you’ve proven the product feel.
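The state handling both milestones imply can stay tiny. A sketch of the CALLING/IN_CALL states mentioned under Option A, with wake word gated off outside IDLE (class and method names are assumptions, not existing LVA code):

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    CALLING = auto()   # invite sent, awaiting answer
    IN_CALL = auto()   # media flowing

class CallStateMachine:
    """Tracks call state; wake word listens only while idle."""
    def __init__(self):
        self.state = State.IDLE

    @property
    def wakeword_enabled(self) -> bool:
        return self.state is State.IDLE

    def start_call(self):
        if self.state is State.IDLE:
            self.state = State.CALLING

    def call_connected(self):
        if self.state is State.CALLING:
            self.state = State.IN_CALL

    def hang_up(self):
        self.state = State.IDLE
```

The same machine serves Milestone 1 (push-to-talk maps to a short-lived IN_CALL) and Milestone 2, so upgrading the transport doesn't rework the state logic.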
Brainstorm: UX + project integration ideas
Voice commands (routed locally, not via HA):
MQTT-controlled calls so HA dashboards or automations can trigger:
- call/invite to a specific device group

Priority rules:
- If a call arrives while RESPONDING (TTS playback), decide: auto-duck volume, interrupt, or reject the call.

LED states:
Echo strategy:
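The priority rule above (what to do when a call lands mid-TTS) collapses to one policy function that automations can also exercise. The action names and the urgent caller priority are hypothetical, just to make the decision table concrete:

```python
def on_incoming_call(current_state: str, caller_priority: str = "normal") -> str:
    """Decide how to handle an incoming call given the device's current state.

    Returns one of 'accept', 'duck', or 'reject' (hypothetical action names):
    'duck' lowers TTS volume and rings softly instead of interrupting.
    """
    if current_state == "RESPONDING":            # mid-TTS playback
        return "accept" if caller_priority == "urgent" else "duck"
    if current_state == "IN_CALL":               # already on a call
        return "reject"
    return "accept"                              # idle/listening: just ring
```

Keeping this as a pure function makes the policy trivially unit-testable and easy to expose as a config knob later.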
If you want one concrete “north star” architecture to build toward: WebRTC audio-only with aiortc, MQTT signaling, a device allowlist, and an IN_CALL state that temporarily disables wakeword/STT. That keeps everything Python-native and fits your current event-driven design.

Tell me whether your target is LAN-only intercom or sometimes remote (outside the house), and I’ll narrow the design to the best option and propose the exact new modules / config schema entries / MQTT topics to add.