43 changes: 43 additions & 0 deletions app/ai/voice/agents/breeze_buddy/agent/__init__.py
@@ -65,6 +65,13 @@
from app.ai.voice.agents.breeze_buddy.services.telephony.base_provider import (
    VoiceCallProvider,
)
from app.ai.voice.agents.breeze_buddy.stt.fallback import (
    ALERT_STT_TERMINAL_FAILURE,
    STT_FALLBACK_SLACK_TAG,
    record_stt_failure,
    send_templated_alert,
)
from app.services.service_health import service_health_monitor
Comment on lines +68 to +74
Copilot AI Apr 30, 2026

Several newly added imports appear unused in this module (ALERT_STT_TERMINAL_FAILURE, STT_FALLBACK_SLACK_TAG, send_templated_alert, and service_health_monitor unless used elsewhere). If they aren’t used later in the file, they should be removed to avoid confusion and keep imports accurate.

Suggested change
-from app.ai.voice.agents.breeze_buddy.stt.fallback import (
-    ALERT_STT_TERMINAL_FAILURE,
-    STT_FALLBACK_SLACK_TAG,
-    record_stt_failure,
-    send_templated_alert,
-)
-from app.services.service_health import service_health_monitor
+from app.ai.voice.agents.breeze_buddy.stt.fallback import record_stt_failure

Copilot uses AI. Check for mistakes.
from app.ai.voice.agents.breeze_buddy.template import TemplateContext
from app.ai.voice.agents.breeze_buddy.template.builder import FlowConfigBuilder
from app.ai.voice.agents.breeze_buddy.template.context import with_context
@@ -603,6 +610,42 @@ async def on_pipeline_error(task, error):
{"processor": str(processor), "error": error_msg},
)

# Detect STT errors by processor name keywords
processor_str = str(processor).lower()
stt_keywords = (
    "stt",
    "soniox",
    "deepgram",
    "transcri",
    "google",
    "sarvam",
)
is_stt_error = any(kw in processor_str for kw in stt_keywords)
Comment on lines +613 to +623
Copilot AI Apr 30, 2026

The STT error detection keywords include the substring "google", which is likely to match non‑STT processors like GoogleLLM... and incorrectly treat LLM failures as STT failures (triggering STT alerts and ending the call). Tighten this check to only match actual STT processors/providers (e.g., soniox, deepgram, stt, etc.) or use the processor’s class/type rather than substring matching.

Suggested change
-# Detect STT errors by processor name keywords
-processor_str = str(processor).lower()
-stt_keywords = (
-    "stt",
-    "soniox",
-    "deepgram",
-    "transcri",
-    "google",
-    "sarvam",
-)
-is_stt_error = any(kw in processor_str for kw in stt_keywords)
+# Detect STT errors using STT-specific processor identifiers only.
+# Avoid broad provider substrings like "google", which can also match
+# non-STT processors such as Google-backed LLM components.
+processor_str = str(processor).lower()
+processor_type_str = processor.__class__.__name__.lower()
+stt_keywords = (
+    "stt",
+    "soniox",
+    "deepgram",
+    "transcri",
+    "sarvam",
+    "speech",
+)
+is_stt_error = any(
+    kw in processor_str or kw in processor_type_str
+    for kw in stt_keywords
+)


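The comment above mentions classifying by the processor's class or type instead of by substring. A minimal sketch of that idea follows. `STTServiceBase`, `SonioxSTT`, and `GoogleLLM` are hypothetical stand-ins invented for illustration; the real code would check against whatever base class the project's STT processors actually share.

```python
class STTServiceBase:
    """Hypothetical stand-in for the base class shared by STT processors."""


class SonioxSTT(STTServiceBase):
    """Hypothetical STT processor."""


class GoogleLLM:
    """Hypothetical non-STT processor whose name contains 'google'."""


def is_stt_processor(processor: object) -> bool:
    """Classify by type, so provider names inside str(processor) do not matter."""
    return isinstance(processor, STTServiceBase)


def matches_keyword(processor: object, keywords: tuple) -> bool:
    """The substring approach from the diff, kept here only for comparison."""
    text = str(processor).lower()
    return any(kw in text for kw in keywords)
```

With these stand-ins, `matches_keyword(GoogleLLM(), ("google",))` is true even though the processor is not an STT service, while `is_stt_processor(GoogleLLM())` is false.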
if not is_stt_error:
Copilot AI Apr 30, 2026

service_health_monitor is imported but never used in on_pipeline_error. As a result, non‑STT pipeline errors (TTS/LLM/telephony) aren’t recorded into the service health circuits, so auto‑pause can’t trigger from those failures. Consider calling service_health_monitor.record_pipeline_error(...) for non‑STT errors before returning.

Suggested change
-if not is_stt_error:
+if not is_stt_error:
+    try:
+        service_health_monitor.record_pipeline_error(
+            processor=str(processor),
+            error_message=str(error_msg),
+        )
+    except Exception as health_err:
+        logger.warning(
+            f"Failed to record non-STT pipeline error in service health monitor: {health_err}"
+        )

    return

logger.warning(f"STT error detected from processor: {processor}")

# Record failure in fallback system (once per call, Soniox only)
if self.stt_provider == "soniox" and not self._stt_failure_recorded:
    self._stt_failure_recorded = True
    try:
        await record_stt_failure(
            error_msg=str(error_msg)[:200],
            call_sid=self.call_sid or "",
            context="mid-call",
        )
    except Exception as fb_err:
        logger.warning(f"STT fallback record_failure failed: {fb_err}")

# Alert and end call — no mid-call swap in Phase 1
fire_and_forget(self._send_mid_call_stt_alert())
try:
    await task.queue_frames([EndFrame()])
except Exception:
    pass
Comment on lines +613 to +647

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if fire_and_forget is defined or imported anywhere in the codebase
echo "=== Searching for fire_and_forget definition/import ==="
rg -n "fire_and_forget" --type=py

echo ""
echo "=== Checking EndFrame import in agent/__init__.py ==="
rg -n "from pipecat.frames.frames import" app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Checking stt_provider attribute initialization ==="
rg -n "self\.stt_provider" app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Checking _stt_failure_recorded attribute initialization ==="
rg -n "self\._stt_failure_recorded" app/ai/voice/agents/breeze_buddy/agent/__init__.py

Repository: juspay/clairvoyance

Length of output: 2978


🏁 Script executed:

#!/bin/bash
# Get first 100 lines to see all imports
echo "=== First 100 lines (imports section) ==="
head -100 app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Check if fire_and_forget is defined in agent/__init__.py ==="
rg -n "def fire_and_forget" app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Search for __init__ method and surrounding context ==="
rg -n "def __init__" app/ai/voice/agents/breeze_buddy/agent/__init__.py

Repository: juspay/clairvoyance

Length of output: 3709


🏁 Script executed:

#!/bin/bash
# Read the __init__ method body (lines 116-200 to capture most of it)
echo "=== __init__ method body (lines 116-250) ==="
sed -n '116,250p' app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Search for fire_and_forget definition anywhere in file ==="
grep -n "def fire_and_forget\|fire_and_forget.*=" app/ai/voice/agents/breeze_buddy/agent/__init__.py

echo ""
echo "=== Search for any fire_and_forget function or import in entire file ==="
grep -n "fire_and_forget" app/ai/voice/agents/breeze_buddy/agent/__init__.py

Repository: juspay/clairvoyance

Length of output: 5457


🏁 Script executed:

#!/bin/bash
# Verify the exact line range (613-647) matches the code snippet shown in review
echo "=== Lines 613-647 to verify against review snippet ==="
sed -n '613,647p' app/ai/voice/agents/breeze_buddy/agent/__init__.py

Repository: juspay/clairvoyance

Length of output: 1466


Critical: Undefined names and attributes will cause runtime errors.

The code contains multiple undefined names and attributes that will raise NameError and AttributeError at runtime:

  1. fire_and_forget (line 643) — not imported or defined anywhere in the file
  2. EndFrame (line 645) — not imported from pipecat.frames.frames
  3. self.stt_provider (line 631) — not initialized in __init__
  4. self._stt_failure_recorded (line 632) — not initialized in __init__

The silent except Exception: pass (lines 646-647) will also swallow any failure to queue the EndFrame without logging it.

Proposed fixes

Add missing import:

-from pipecat.frames.frames import LLMMessagesAppendFrame, TTSSpeakFrame
+from pipecat.frames.frames import EndFrame, LLMMessagesAppendFrame, TTSSpeakFrame

Import or define fire_and_forget:

+import asyncio
+
+def fire_and_forget(coro):
+    """Schedule coroutine without awaiting."""
+    asyncio.create_task(coro)

Initialize attributes in __init__ after line 195 (error tracking section):

         # Error tracking
         self.errors: List[Dict[str, Any]] = []
+
+        # STT fallback tracking
+        self.stt_provider: Optional[str] = None
+        self._stt_failure_recorded: bool = False

Replace silent exception:

-            try:
-                await task.queue_frames([EndFrame()])
-            except Exception:
-                pass
+            try:
+                await task.queue_frames([EndFrame()])
+            except Exception as e:
+                logger.warning(f"Failed to queue EndFrame: {e}")
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 639-639: Do not catch blind exception: Exception

(BLE001)


[error] 643-643: Undefined name fire_and_forget

(F821)


[error] 645-645: Undefined name EndFrame

(F821)


[error] 646-647: try-except-pass detected, consider logging the exception

(S110)


[warning] 646-646: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/ai/voice/agents/breeze_buddy/agent/__init__.py` around lines 613 - 647,
The block handling STT errors uses undefined symbols and uninitialized
attributes: import or define fire_and_forget and EndFrame (from
pipecat.frames.frames) so calls to
fire_and_forget(self._send_mid_call_stt_alert()) and EndFrame() resolve;
initialize self.stt_provider and self._stt_failure_recorded in the class
__init__ (e.g., set default provider string and False) so checks in that block
won't raise AttributeError; and replace the silent "except: pass" around await
task.queue_frames([EndFrame()]) with a specific exception handler that logs the
error (use logger.warning or logger.exception) to avoid swallowing failures.
Ensure references to record_stt_failure and _send_mid_call_stt_alert remain
unchanged.


@self.transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
    logger.info(f"Client connected: {client}")
9 changes: 9 additions & 0 deletions app/ai/voice/agents/breeze_buddy/managers/calls.py
@@ -69,6 +69,7 @@
)
from app.services.gcp.storage.storage import upload_file_to_gcs
from app.services.redis.client import get_redis_service
from app.services.service_health import service_health_monitor


async def _get_lead_config(lead: LeadCallTracker) -> Optional[CallExecutionConfig]:
@@ -456,6 +457,14 @@ async def process_backlog_leads():
await release_lock_on_lead_by_id(locked_lead.id)
continue

# Check global service health pause (circuit breaker pattern)
if await service_health_monitor.is_globally_paused():
    logger.info(
        f"Skipping lead {locked_lead.id} - calls are globally paused due to service health"
    )
    await release_lock_on_lead_by_id(locked_lead.id)
    continue
Comment on lines +460 to +466
Copilot AI Apr 30, 2026

The global pause check happens only after acquiring a per-lead DB lock. When calls are paused, this will still lock and unlock every backlog lead on each run, adding avoidable DB load. Consider checking is_globally_paused() once near the start of process_backlog_leads() (before querying/locking leads) and returning early (or sleeping) when paused.


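The early-return suggestion above can be sketched as follows. This is a simplified illustration, not the project's `process_backlog_leads()`: the three callables are hypothetical parameters standing in for the health monitor check, the lead query, and the per-lead processing.

```python
import asyncio
from typing import Awaitable, Callable, List


async def process_backlog(
    is_paused: Callable[[], Awaitable[bool]],
    fetch_leads: Callable[[], Awaitable[List[str]]],
    handle_lead: Callable[[str], Awaitable[None]],
) -> int:
    """Check the global pause once, before any lead is queried or locked."""
    if await is_paused():
        # Paused: skip the query and every per-lead lock/unlock round trip.
        return 0
    leads = await fetch_leads()
    for lead_id in leads:
        await handle_lead(lead_id)
    return len(leads)
```

A single up-front check will not notice a pause that begins partway through a run, so keeping the per-lead check as a second guard may still be worthwhile.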
customer_phone = (locked_lead.payload or {}).get(
"customer_mobile_number"
)
19 changes: 19 additions & 0 deletions app/core/config/dynamic.py
@@ -341,3 +341,22 @@ async def OUTBOUND_RATE_LIMIT_WINDOW_SECONDS() -> int:
async def OUTBOUND_RATE_LIMIT_BLOCK_ENABLED() -> bool:
    """Returns OUTBOUND_RATE_LIMIT_BLOCK_ENABLED from Redis"""
    return await get_config("OUTBOUND_RATE_LIMIT_BLOCK_ENABLED", False, bool)


# --- Service Health Monitoring Configuration ---
async def ENABLE_SERVICE_HEALTH_MONITORING() -> bool:
    """Returns ENABLE_SERVICE_HEALTH_MONITORING from Redis.

    When True, service health monitoring is active and will auto-pause
    calls when upstream service failures exceed thresholds.
    """
    return await get_config("ENABLE_SERVICE_HEALTH_MONITORING", True, bool)
Comment on lines +350 to +353
Copilot AI Apr 30, 2026

ENABLE_SERVICE_HEALTH_MONITORING defaults to True. Since enabling this feature can pause outbound calls automatically, consider defaulting this flag to False and requiring an explicit rollout via Redis/DevCycle to avoid unexpected behavior in environments where the config key isn’t set yet.

Suggested change
-    When True, service health monitoring is active and will auto-pause
-    calls when upstream service failures exceed thresholds.
-    """
-    return await get_config("ENABLE_SERVICE_HEALTH_MONITORING", True, bool)
+    When False (default), service health monitoring is disabled unless
+    explicitly enabled via Redis/DevCycle rollout.
+    When True, service health monitoring is active and will auto-pause
+    calls when upstream service failures exceed thresholds.
+    """
+    return await get_config("ENABLE_SERVICE_HEALTH_MONITORING", False, bool)



async def SERVICE_HEALTH_AUTO_RESUME_MINUTES() -> int:
    """Returns SERVICE_HEALTH_AUTO_RESUME_MINUTES from Redis.

    Number of minutes with no errors before auto-resuming calls
    after a circuit breaker opens.
    """
    return await get_config("SERVICE_HEALTH_AUTO_RESUME_MINUTES", 15, int)
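The docstring describes resuming once a quiet period has passed after a circuit opens. The service health module itself is not part of this diff, so the following is only a rough sketch of that behaviour under stated assumptions: the `Circuit` class, its `failure_threshold` of 3, and the explicit `now` argument are all invented for illustration, and only the 15-minute default comes from the config above.

```python
from dataclasses import dataclass


@dataclass
class Circuit:
    failure_threshold: int = 3            # hypothetical threshold
    auto_resume_seconds: float = 15 * 60  # mirrors the 15-minute default
    failures: int = 0
    opened: bool = False
    last_error_at: float = 0.0

    def record_error(self, now: float) -> None:
        """Count an error and open the circuit once the threshold is reached."""
        self.failures += 1
        self.last_error_at = now
        if self.failures >= self.failure_threshold:
            self.opened = True

    def is_paused(self, now: float) -> bool:
        """Stay paused until no error has been seen for the resume window."""
        if not self.opened:
            return False
        if now - self.last_error_at >= self.auto_resume_seconds:
            # Quiet for long enough: close the circuit and reset the count.
            self.opened = False
            self.failures = 0
            return False
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a real monitor would more likely read a clock and persist this state in Redis.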
5 changes: 5 additions & 0 deletions app/main.py
@@ -65,6 +65,8 @@
from app.schemas import (
    AutomaticVoiceUserConnectRequest,
)
from app.services.fallback import initialize_fallback_tasks
from app.services.service_health import initialize_service_health_tasks
from app.services.langfuse.tasks.task import initialize_langfuse_tasks
from app.services.redis import (
    close_redis_connections,
@@ -167,6 +169,9 @@ async def lifespan(_app: FastAPI):
# Initialize Langfuse tasks (if configured)
await initialize_langfuse_tasks(_background_scheduler)

# Initialize STT fallback reset tasks
await initialize_fallback_tasks(_background_scheduler)

Copilot AI Apr 30, 2026

initialize_service_health_tasks is imported but never invoked in the lifespan startup, so the service health monitoring background task will never be registered (and the import is currently unused). If the monitor is intended to run, it should be initialized alongside the fallback/langfuse tasks when the background scheduler is created.

Suggested change
+# Initialize service health monitoring tasks
+await initialize_service_health_tasks(_background_scheduler)

### Register new tasks here
Comment on lines +172 to 175

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Missing call to initialize_service_health_tasks.

initialize_service_health_tasks is imported on line 69 but never invoked. The service health background task (check_and_reset_circuits) will not be registered, meaning circuits will never auto-reset based on the configured schedule.

🐛 Proposed fix to register service health tasks
             # Initialize STT fallback reset tasks
             await initialize_fallback_tasks(_background_scheduler)

+            # Initialize service health check tasks
+            await initialize_service_health_tasks(_background_scheduler)
+
             ### Register new tasks here
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Initialize STT fallback reset tasks
-await initialize_fallback_tasks(_background_scheduler)
-### Register new tasks here
+# Initialize STT fallback reset tasks
+await initialize_fallback_tasks(_background_scheduler)
+# Initialize service health check tasks
+await initialize_service_health_tasks(_background_scheduler)
+### Register new tasks here
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/main.py` around lines 172 - 175, The import
initialize_service_health_tasks is never invoked, so the service health job
(check_and_reset_circuits) isn’t registered; add a call to await
initialize_service_health_tasks(_background_scheduler) (similar to the existing
await initialize_fallback_tasks(_background_scheduler)) in the initialization
sequence (e.g., immediately after initialize_fallback_tasks) so the background
scheduler registers the check_and_reset_circuits task.


# Start the scheduler only if tasks are registered