Summary
Currently, our AI capabilities are split across two disconnected services:
- /ai-service: Handles text-based chat.
- /voice-agent: Handles real-time, voice-call-style interactions.
This separation creates a maintenance bottleneck. Updates to the system prompt, tool definitions, or model configurations must be duplicated across both repositories, leading to logic drift and inconsistent user experiences. We need to unify these into a single integrated logic layer.
Objectives
The goal is to create a Single Source of Truth for the agent's intelligence while allowing for "modality-specific" outputs (e.g., text vs. voice + emotion).
- Unified Configuration: Share a single system_prompt.md and config.yaml across both interfaces.
- Shared Toolset: Ensure both services access the same function-calling registry.
- Maintain Voice Flair: Preserve the voice agent's ability to process and output emotion/SSML tags, ensuring the LLM understands when it is "speaking" vs. "typing."
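The objectives above can be sketched as a small shared-core module that both services import. Everything here is illustrative: the `agent_core/` path, the `MODALITY_ADDENDA` table, and the `load_config` helper are hypothetical names, not existing code. The idea is that the base prompt and config live in one place, and each interface only adds a thin modality-specific addendum so the model knows whether it is "speaking" or "typing."

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical shared-core layout: both /ai-service and /voice-agent would
# import this module instead of keeping their own copies of the prompt.
BASE_PROMPT = Path("agent_core/system_prompt.md")

# Modality-specific addenda preserve the "speaking vs. typing" distinction
# without forking the base prompt into two drifting copies.
MODALITY_ADDENDA = {
    "text": "You are replying in a chat window. Respond in plain text/markdown.",
    "voice": (
        "You are speaking aloud on a live call. Keep sentences short and "
        "annotate emotion with SSML tags where appropriate."
    ),
}

@dataclass
class AgentConfig:
    modality: str       # "text" or "voice"
    system_prompt: str  # base prompt + modality addendum

def load_config(modality: str) -> AgentConfig:
    """Assemble the unified config for the requested modality."""
    if modality not in MODALITY_ADDENDA:
        raise ValueError(f"unknown modality: {modality!r}")
    # Fall back to an empty base so the sketch runs without the repo layout.
    base = BASE_PROMPT.read_text() if BASE_PROMPT.exists() else ""
    prompt = f"{base}\n\n{MODALITY_ADDENDA[modality]}".strip()
    return AgentConfig(modality=modality, system_prompt=prompt)
```

Under this shape, a prompt or tool change lands once in `agent_core/` and both services pick it up on deploy, which is the drift-prevention property the unification is after.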