A FastAPI-based AI agent that provides customer service support for Koozie Group using Vertex AI's Gemini 2.5 Flash model with streaming responses for minimal latency.
- ✅ Streaming Responses: Token-by-token streaming for voice applications (minimizes latency)
- ✅ Koozie Context: Full product catalog and support information loaded into every request
- ✅ Two Endpoints: `/chat` (streaming) and `/chat/sync` (non-streaming)
- ✅ Hot Reload Development: Docker Compose setup for rapid testing
- ✅ GCP Ready: Cloud Build configuration for automated deployment
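The token-by-token streaming above can be illustrated with a small generator that encodes model tokens into the SSE wire format `/chat` emits. This is a simplified sketch, not the actual `main.py` implementation; the hard-coded token list stands in for the Vertex AI streaming iterator.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Encode a stream of text tokens as Server-Sent Events.

    Each event carries a JSON payload with the token text and a
    `done` flag; a final empty event signals end of stream.
    """
    for token in tokens:
        yield f'data: {json.dumps({"text": token, "done": False})}\n\n'
    yield f'data: {json.dumps({"text": "", "done": True})}\n\n'


# In the real service these tokens would come from the Vertex AI
# streaming response rather than a hard-coded list.
events = list(sse_events(["We", " offer", " a wide"]))
```

In FastAPI, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.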
Create a `.env` file (see `.env.example`):

```env
GCP_PROJECT_ID=heyai-backend
GCP_REGION=us-central1
GCP_PROJECT_NUMBER=127756525541
VERTEX_AI_LOCATION=us-central1
VERTEX_AI_MODEL=gemini-2.5-flash
```

Prerequisites:

- Docker and Docker Compose installed
- GCP credentials configured (via `gcloud auth application-default login` or a service account)
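`main.py` presumably reads these variables through environment lookups; a minimal sketch of that pattern, using the variable names from the `.env` example above (the fallback values here are illustrative assumptions, not confirmed from the source):

```python
import os

# Read configuration from the environment, falling back to the
# values shown in .env.example (fallbacks are illustrative only).
config = {
    "project_id": os.getenv("GCP_PROJECT_ID", "heyai-backend"),
    "location": os.getenv("VERTEX_AI_LOCATION", "us-central1"),
    "model": os.getenv("VERTEX_AI_MODEL", "gemini-2.5-flash"),
}
```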
```bash
# Start the development server with hot reload
docker-compose -f docker-compose.dev.yml up

# The server will be available at http://localhost:8080
```

```bash
# Make the test script executable
chmod +x test_endpoints.sh

# Run tests
./test_endpoints.sh
```

Or test manually:
```bash
# Health check
curl http://localhost:8080/health

# Streaming chat (for voice apps)
curl -sN -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is a Koozie?"}'

# Synchronous chat (for testing)
curl -X POST http://localhost:8080/chat/sync \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me about your pens."}'
```

### `GET /health`

Health check endpoint. Returns server status and configuration.
Response:

```json
{
  "status": "healthy",
  "project_id": "heyai-backend",
  "location": "us-central1",
  "model": "gemini-2.5-flash",
  "context_loaded": true,
  "vertex_ai_initialized": true
}
```

### `POST /chat`

Streaming chat endpoint. Returns Server-Sent Events (SSE) with tokens as they're generated.
Request:

```json
{
  "message": "What products do you offer?",
  "conversation_history": [
    {"role": "user", "content": "Hello"},
    {"role": "model", "content": "Hi! How can I help you?"}
  ]
}
```

Response: a Server-Sent Events stream

```
data: {"text": "We", "done": false}
data: {"text": " offer", "done": false}
data: {"text": " a wide", "done": false}
...
data: {"text": "", "done": true}
```
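On the client side, a stream like this can be consumed by parsing each `data:` line and concatenating tokens until `done` is true. The sketch below covers the parsing only (a real voice client would read lines from the HTTP response as they arrive; `collect_sse_message` is a hypothetical helper, not part of this repo):

```python
import json
from typing import Iterable


def collect_sse_message(lines: Iterable[str]) -> str:
    """Reassemble the full reply from a /chat SSE stream."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines between events
        event = json.loads(line[len("data:"):].strip())
        if event.get("done"):
            break
        parts.append(event.get("text", ""))
    return "".join(parts)


# The example stream from the section above:
stream = [
    'data: {"text": "We", "done": false}',
    'data: {"text": " offer", "done": false}',
    'data: {"text": " a wide", "done": false}',
    'data: {"text": "", "done": true}',
]
message = collect_sse_message(stream)
```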
### `POST /chat/sync`

Synchronous chat endpoint. Returns the complete response in a single JSON payload.

Request: Same as `/chat`

Response:

```json
{
  "status": "success",
  "message": "We offer a wide range of promotional products..."
}
```

Deployment prerequisites:

- GCP project with Vertex AI API enabled
- Artifact Registry repository created
- Cloud Build trigger configured
The `cloudbuild.yaml` is configured to:
- Build Docker image
- Push to Artifact Registry
- Deploy to Cloud Run
Simply push to your repository and the Cloud Build trigger will handle deployment.
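A `cloudbuild.yaml` covering those three steps typically looks something like the sketch below. The image path and the Artifact Registry repository name `koozie-repo` are assumptions for illustration; the actual file in this repo may differ.

```yaml
steps:
  # Build the Docker image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/koozie-repo/koozie-agent-service', '.']
  # Push to Artifact Registry
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'us-central1-docker.pkg.dev/$PROJECT_ID/koozie-repo/koozie-agent-service']
  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args: ['run', 'deploy', 'koozie-agent-service',
           '--image', 'us-central1-docker.pkg.dev/$PROJECT_ID/koozie-repo/koozie-agent-service',
           '--region', 'us-central1']
images:
  - 'us-central1-docker.pkg.dev/$PROJECT_ID/koozie-repo/koozie-agent-service'
```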
To deploy manually:

```bash
# Build and push the image
gcloud builds submit --config cloudbuild.yaml

# Or deploy directly to Cloud Run
gcloud run deploy koozie-agent-service \
  --source . \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars="GCP_PROJECT_ID=heyai-backend,VERTEX_AI_LOCATION=us-central1,VERTEX_AI_MODEL=gemini-2.5-flash"
```

Project structure:

```
test-agent/
├── main.py                  # FastAPI server with Vertex AI integration
├── context.txt              # Koozie Group product catalog and support info
├── requirements.txt         # Python dependencies
├── Dockerfile               # Production container
├── Dockerfile.dev           # Development container
├── docker-compose.dev.yml   # Hot reload development setup
├── cloudbuild.yaml          # GCP Cloud Build configuration
├── .env.example             # Environment variable template
├── .gitignore               # Git ignore rules
└── test_endpoints.sh        # Test script
```
- The server loads `context.txt` at startup and includes it in every request via system instructions
- The streaming endpoint uses Server-Sent Events (SSE) for real-time token delivery
- GCP credentials are automatically detected via Application Default Credentials
- The service is configured for Cloud Run deployment with auto-scaling