Nelson-PROIA/call-companion

Call Companion Agent

An AI-powered assistant for live phone conversations that suggests contextually relevant questions in real-time and provides comprehensive post-call analysis. Perfect for cold calls, market research, customer interviews, and sales discovery.

Features

  • 🎯 3 Call Modes: Computer audio (full duplex), microphone only, or Twilio phone calls
  • 🤖 Real-time AI Question Suggestions: GPT-4o analyzes the conversation and suggests the 1-2 most relevant questions
  • 📝 Live Transcript Display: See conversations unfold with speaker identification
  • 📊 Post-Call Synthesis: Comprehensive AI analysis with insights, answered questions, pain points, and next steps
  • 🎨 Modern UI: Built with shadcn/ui components and Tailwind CSS v4
  • 🌗 Light/Dark Theme: Comfortable viewing in any environment
  • 🔌 Provider-Agnostic Architecture: Easily swap Call, STT, and LLM providers without code changes
  • ⚙️ 4-Step Configuration Wizard: Intuitive setup with visual progress tracking

Quick Start

Prerequisites

Installation

# Clone the repository
git clone <your-repo-url>
cd cold-calls-agent

# Install dependencies
npm install

# Create environment file
cp .env.example .env.local

# Edit .env.local and add your OpenAI API key
OPENAI_API_KEY=sk-your-actual-key-here

# Start development server
npm run dev

# Open http://localhost:3000

Demo Mode

The app works out of the box with just an OpenAI API key: it uses simulated conversations to demonstrate the AI suggestion system, so no phone integration or STT service is needed for testing.

How It Works

Configuration Wizard (4 Steps)

  1. Call Provider - Choose Twilio and configure phone settings
  2. Speech-to-Text Provider - Select Gladia and set language/diarization options
  3. AI/LLM Provider - Choose OpenAI GPT-4o and configure temperature/tokens
  4. Call Context - Define topic, goal, target profile, and questions with importance scores

During the Call

  • Live transcript appears as conversation progresses
  • AI analyzes recent context (last 5 messages)
  • Suggests 1-2 most relevant questions based on:
    • Conversation flow
    • Question importance scores (0.5-1.0)
    • Previously asked questions
    • Call context and goals
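
The selection step above can be sketched as a simple rank-and-filter pass; in the real flow the ranked candidates and recent transcript are sent to GPT-4o. Names below are illustrative, not the project's actual code:

```typescript
// Illustrative sketch: rank unasked questions by importance score and
// keep at most two. Names are assumptions, not the project's actual code.
interface ScoredQuestion {
  question: string
  score: number
}

function rankCandidates(
  questions: ScoredQuestion[],
  alreadyAsked: Set<string>,
  maxSuggestions = 2,
): ScoredQuestion[] {
  return questions
    .filter((q) => !alreadyAsked.has(q.question)) // skip previously asked questions
    .sort((a, b) => b.score - a.score)            // highest importance first
    .slice(0, maxSuggestions)                     // cap at 1-2 suggestions
}
```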

After the Call

Post-call synthesis includes:

  • Overall summary
  • Key insights extracted
  • Answered questions with their responses
  • Unanswered high-priority questions
  • Pain points and opportunities identified
  • Recommended next steps

Call Modes

1. Computer Audio

  • Captures both microphone and system audio output
  • Perfect for Zoom, Google Meet, Teams calls
  • Requires screen/audio sharing permission
  • Uses Web Audio API for stream mixing

2. Microphone Only

  • Captures only microphone input
  • Simple device selection
  • Lower resource usage
  • Ideal for phone calls on speakerphone

3. Twilio Phone Call

  • Makes actual outbound calls via Twilio
  • Enter phone number in international format
  • Optional custom Twilio credentials
  • Audio forwarded through Twilio Media Streams

Configuration Examples

JSON Upload

Upload a .json file with this structure:

{
  "context": {
    "topic": "SaaS product discovery",
    "goal": "Understand pain points and product-market fit",
    "target_profile": "Engineering managers at tech companies"
  },
  "questions": [
    { "question": "What are your biggest pain points?", "score": 0.95 },
    { "question": "How do you currently solve this?", "score": 0.90 },
    { "question": "What's your team size?", "score": 0.70 }
  ]
}
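
A hypothetical loader for this file could validate the shape and score range before the config is used. Field names mirror the example above; the helper itself is an assumption, not part of the codebase:

```typescript
// Assumed shape of the uploaded JSON config (field names mirror the example).
interface CallConfig {
  context: { topic: string; goal: string; target_profile: string }
  questions: { question: string; score: number }[]
}

// Minimal validation sketch: checks required fields and score range before use.
function parseCallConfig(raw: string): CallConfig {
  const data = JSON.parse(raw) as CallConfig
  if (!data.context?.topic || !data.context?.goal) {
    throw new Error('context.topic and context.goal are required')
  }
  for (const q of data.questions ?? []) {
    if (q.score < 0.5 || q.score > 1.0) {
      throw new Error(`score out of range [0.5, 1.0]: ${q.question}`)
    }
  }
  return data
}
```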

Question Score Guidelines

  • 0.9-1.0 (Critical/Red): Must-ask questions, core pain points, qualification criteria
  • 0.8-0.89 (High/Blue): Important context, product fit indicators, next steps
  • 0.7-0.79 (Medium/Accent): Nice-to-have information, background context
  • 0.5-0.69 (Low/Gray): Optional details, future considerations
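
These buckets can be expressed as a small mapping; a sketch, with labels and colors taken from the guideline above:

```typescript
// Sketch of the tier buckets above; labels match the guideline,
// the function itself is illustrative.
function scoreTier(score: number): 'critical' | 'high' | 'medium' | 'low' {
  if (score >= 0.9) return 'critical' // red
  if (score >= 0.8) return 'high'     // blue
  if (score >= 0.7) return 'medium'   // accent
  return 'low'                        // gray
}
```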

Tech Stack

  • Framework: Next.js 15 (App Router) with React 18
  • Language: TypeScript (strict mode)
  • Styling: Tailwind CSS v4 with parameterized theme variables
  • UI Library: shadcn/ui components exclusively
  • Font: JetBrains Mono (developer-friendly)
  • AI: OpenAI GPT-4o (swappable via provider pattern)
  • STT: Gladia (swappable via provider pattern)
  • Telephony: Twilio (swappable via provider pattern)

Architecture

Provider Pattern

The system uses an interface-based architecture that allows swapping providers without touching business logic.

Three Provider Types:

  1. ICallProvider - Phone infrastructure (Twilio, Vonage, Plivo, etc.)
  2. ISTTProvider - Speech-to-text (Gladia, Whisper, AssemblyAI, etc.)
  3. ILLMProvider - AI/LLM (OpenAI, Anthropic, local models, etc.)
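
An approximate shape of these three contracts (simplified sketches; the method names are assumptions — see lib/interfaces/ for the real definitions):

```typescript
// Simplified sketches of the three provider contracts.
// Method names are assumptions; see lib/interfaces/ for the real definitions.
interface Message {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface ILLMProvider {
  getName(): string
  complete(messages: Message[], config?: { temperature?: number }): Promise<{ content: string }>
}

interface ISTTProvider {
  getName(): string
  startStream(onTranscript: (text: string, speaker?: string) => void): Promise<void>
  stopStream(): Promise<void>
}

interface ICallProvider {
  getName(): string
  startCall(phoneNumber: string): Promise<{ callId: string }>
  endCall(callId: string): Promise<void>
}
```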

Adding New Providers:

  1. Create provider class in lib/providers/your-provider.provider.ts
  2. Implement the appropriate interface (ICallProvider, ISTTProvider, or ILLMProvider)
  3. Create config component in components/providers/YourProviderConfig.tsx
  4. Register in the Step component (e.g., STTProviderStep.tsx)
  5. Add environment variables to .env.local

No changes needed in business logic, API routes, or services!

Directory Structure

cold-calls-agent/
├── app/
│   ├── api/
│   │   ├── suggestions/route.ts        # GPT-4o question selection
│   │   ├── call-synthesis/route.ts     # Post-call AI analysis
│   │   ├── twilio-webhook/route.ts     # Twilio phone integration
│   │   └── stt-stream/route.ts         # Real-time speech-to-text
│   ├── layout.tsx                      # Root layout with theme provider
│   ├── page.tsx                        # Main application UI
│   └── globals.css                     # Theme variables & base styles
├── components/
│   ├── ui/                             # shadcn components (11 files)
│   ├── CallInterface.tsx               # Live transcript display
│   ├── QuestionSuggestions.tsx         # AI suggested questions panel
│   ├── ContextConfig.tsx               # Configuration form with file upload
│   ├── CallSynthesis.tsx               # Post-call summary modal
│   ├── ConfigurationWizard.tsx         # Multi-step setup wizard
│   ├── ThemeToggle.tsx                 # Light/dark theme switcher
│   └── providers/                      # Provider config components
│       ├── TwilioCallProviderConfig.tsx
│       ├── GladiaSTTConfig.tsx
│       └── OpenAILLMConfig.tsx
├── lib/
│   ├── interfaces/                     # Abstract interfaces (OOP contracts)
│   │   ├── call-provider.interface.ts
│   │   ├── stt-provider.interface.ts
│   │   └── llm-provider.interface.ts
│   ├── providers/                      # Concrete implementations
│   │   ├── twilio.provider.ts
│   │   ├── gladia.provider.ts
│   │   └── openai.provider.ts
│   ├── services/                       # Business logic (uses interfaces)
│   │   ├── question-suggester.service.ts
│   │   └── call-analyzer.service.ts
│   └── provider-factory.ts             # Factory pattern for provider creation
├── examples/
│   └── demo-transcript.ts              # Demo simulation for development
└── types/
    └── index.ts                        # All TypeScript type definitions

Adding New Providers

To add a new provider (e.g., Anthropic for LLM):

1. Create Provider Implementation (lib/providers/anthropic.provider.ts):

import Anthropic from '@anthropic-ai/sdk'
import { ILLMProvider, Message, LLMConfig, LLMResponse } from '@/lib/interfaces/llm-provider.interface'

export class AnthropicProvider implements ILLMProvider {
  private client: Anthropic

  constructor(apiKey?: string) {
    this.client = new Anthropic({ apiKey: apiKey || process.env.ANTHROPIC_API_KEY })
  }

  getName(): string { return 'Anthropic' }

  async complete(messages: Message[], config?: LLMConfig): Promise<LLMResponse> {
    // Implementation
  }
}

2. Create Config Component (components/providers/AnthropicLLMConfig.tsx):

import { Button } from '@/components/ui/button'
import { Card } from '@/components/ui/card'

export function AnthropicLLMConfig({ config, onChange, onContinue }) {
  return (
    <Card className="p-6">
      {/* Provider-specific UI controls */}
      <Button onClick={onContinue}>Continue</Button>
    </Card>
  )
}

3. Register in Step Component (components/LLMProviderStep.tsx):

const llmProviders = [
  { id: 'openai', name: 'OpenAI' },
  { id: 'anthropic', name: 'Anthropic' },  // Add here
]

const llmProviderForms = {
  openai: OpenAILLMConfig,
  anthropic: AnthropicLLMConfig,  // Add here
}

4. Update Types (types/index.ts):

export type LLMProviderType = 'openai' | 'anthropic'

5. Register in Factory (lib/provider-factory.ts):

case 'anthropic':
  return new AnthropicProvider()

6. Set Environment (.env.local):

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...

No changes needed in API routes, services, or business logic!
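
Putting the factory step together, a minimal sketch (the structure is assumed; see lib/provider-factory.ts for the real version — the stand-in objects below take the place of the actual provider classes):

```typescript
// Minimal factory sketch: maps the configured provider type to an instance.
// Stand-in objects replace the real provider classes for illustration.
type LLMProviderType = 'openai' | 'anthropic'

interface NamedProvider {
  getName(): string
}

function createLLMProvider(type: LLMProviderType): NamedProvider {
  switch (type) {
    case 'openai':
      return { getName: () => 'OpenAI' }    // stand-in for new OpenAIProvider()
    case 'anthropic':
      return { getName: () => 'Anthropic' } // stand-in for new AnthropicProvider()
  }
}
```

Because callers depend only on the returned interface, adding the 'anthropic' case here is the only wiring change the factory needs.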

Environment Variables

Create .env.local:

# Required for AI features
OPENAI_API_KEY=sk-your-openai-api-key

# Optional - for real-time transcription
GLADIA_API_KEY=your-gladia-api-key

# Optional - for Twilio phone calls
TWILIO_ACCOUNT_SID=your-twilio-account-sid
TWILIO_AUTH_TOKEN=your-twilio-auth-token

# Provider selection (defaults shown)
LLM_PROVIDER=openai
STT_PROVIDER=gladia
CALL_PROVIDER=twilio

API Routes

/api/suggestions - Question Selection

Analyzes the conversation and suggests the 1-2 most relevant questions.

Key Parameters:

  • Temperature: 0.4 (focused but adaptive)
  • Context window: Last 5 conversation turns
  • Max suggestions: 2

Customization (app/api/suggestions/route.ts):

temperature: 0.4      // Adjust for creativity vs consistency
transcript.slice(-5)  // Change context window size
suggestions.slice(0, 2) // Modify max suggestions

/api/call-synthesis - Post-Call Analysis

Generates comprehensive call summary.

Key Parameters:

  • Temperature: 0.3 (factual analysis)
  • High-priority threshold: 0.8 (questions with score >= 0.8)
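
The threshold above amounts to a simple filter over the tracked questions (a sketch with assumed field names):

```typescript
// Sketch of the high-priority cut-off: questions scoring >= 0.8 that were
// never answered get surfaced in the synthesis. Field names are assumptions.
interface TrackedQuestion {
  question: string
  score: number
  answered: boolean
}

function unansweredHighPriority(questions: TrackedQuestion[], threshold = 0.8): string[] {
  return questions
    .filter((q) => !q.answered && q.score >= threshold)
    .map((q) => q.question)
}
```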

Output:

{
  overallSummary: string
  keyInsights: string[]
  answeredQuestions: Array<{question: string, answer: string}>
  unansweredHighPriorityQuestions: string[]
  painPointsAndOpportunities: string[]
  nextSteps: string[]
}

Development

Scripts

npm run dev          # Start dev server with Turbopack
npm run build        # Production build
npm start            # Start production server
npm run lint         # Run ESLint

Styling Guidelines

Based on project preferences:

  • ✅ Use shadcn/ui components exclusively
  • ✅ No !important in CSS
  • ✅ No inherit keyword
  • ✅ No semicolons in CSS declarations
  • ✅ Parameterized colors for light/dark themes
  • ✅ CSS-only animations (no JavaScript-based)
  • ✅ JetBrains Mono font for developer aesthetic

Adding Components

'use client'  // If interactive

import { Button } from '@/components/ui/button'
import { Card } from '@/components/ui/card'

export function MyComponent() {
  return (
    <Card className="bg-background text-foreground">
      <Button className="bg-primary hover:bg-primary/90">
        Click Me
      </Button>
    </Card>
  )
}

Production Deployment

Vercel (Recommended)

# Build the project
npm run build

# Deploy to Vercel
vercel deploy

# Set environment variables in Vercel dashboard

Production Setup

For real calls (beyond demo mode):

  1. Twilio Configuration

    • Get a Twilio phone number
    • Configure webhook URL: https://your-domain.com/api/twilio-webhook
    • Enable Media Streams
  2. Gladia Integration

    • The /api/stt-stream endpoint requires WebSocket support
    • Options: Custom Node.js server, separate WebSocket service, or Vercel Edge Runtime (experimental)
  3. Environment Variables

    • Set all required API keys in production environment
    • Use .env.example as reference
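
For step 1, the webhook typically answers Twilio with TwiML that forks call audio into a Media Stream. A sketch of building that response (the URL is a placeholder for your deployment; see Twilio's Media Streams docs for the full element set):

```typescript
// Sketch: build a TwiML response that forks call audio into a Media Stream.
// The stream URL is a placeholder; adapt it to your deployment.
function buildMediaStreamTwiML(streamUrl: string): string {
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<Response>',
    '  <Start>',
    `    <Stream url="${streamUrl}" />`,
    '  </Start>',
    '</Response>',
  ].join('\n')
}
```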

API Costs (Estimated)

  • OpenAI GPT-4o: ~$0.02-0.10 per call
  • Gladia: $0.144 per hour ($0.072 per 30-min call)
  • Twilio: ~$0.01-0.02 per minute

Troubleshooting

No AI Suggestions

  1. Verify OPENAI_API_KEY in .env.local
  2. Restart dev server after adding key
  3. Check browser console for errors
  4. Ensure call is started

Build Errors

# Clean install
rm -rf node_modules .next
npm install
npm run build

Slow AI Responses

  1. Reduce max tokens in /api/suggestions/route.ts
  2. Limit context window (fewer recent messages)
  3. Check OpenAI API status
  4. Consider switching to faster model

License

MIT License - see LICENSE file for details.


Built with Next.js 15, GPT-4o, and shadcn/ui
