Dunkin Voice Chat Assistant

Dunkin Voice Chat Assistant is an Inspire Brands–themed, voice-driven ordering experience that showcases Microsoft best practices for Azure OpenAI GPT-4o Realtime, Azure AI Search, and Azure Container Apps. The experience emulates a Dunkin crew member who can search the official menu, hold multilingual conversations, and keep orders in sync across devices.

As guests speak, real-time transcription, translation, and order management provide a transparent view of every choice, from signature lattes to bakery classics. The UI applies Dunkin's vibrant design language so stakeholders can picture how voice AI augments drive-thru, curbside, and kiosk flows.

Beyond coffee fans, this sample demonstrates how Microsoft’s Responsible AI guidance plus Azure-first tooling enable inclusive, hands-free interactions for franchise teams, accessibility scenarios, and mixed fleet deployments across the Inspire Brands portfolio.

Acknowledgment

This project extends the VoiceRAG Repository, adapting its Microsoft-first architecture for a Dunkin scenario. Review the original pattern in this blog post. For the upstream README, see voice_rag_README.md.

Special thanks to John Carroll for the original coffee-chat-voice-assistant that inspired this sample. This fork updates to the latest OpenAI models and stirs in a few extra flavors—call it my own brew of the solution.

Visual Demonstrations

app screenshot

Features

  • Dunkin-specific conversational AI: GPT-4o Realtime is constrained to verified menu data through Azure AI Search so it always sounds like a Dunkin crew member.
  • Retrieval-Augmented Generation (RAG): Azure OpenAI tool-calling plus semantic hybrid search keep recommendations grounded with pricing, nutrition, and add-on guidance.
  • Real-Time Transcription + Translation: Multilingual guests receive accurate transcripts in their language of choice with instant pivots between English, Spanish, Mandarin, French, and more.
  • Live Order Synchronization: Function calls update the shared cart so kiosk screens, mobile devices, and drive-thru headsets stay aligned without race conditions.
  • Audio Output + Accessibility: Browser audio playback mirrors what a guest would hear over drive-thru speakers, supporting screenless or low-vision ordering.
  • Durable Session Tokens: Every realtime conversation emits a session token plus per-turn identifiers so transcripts can map back to telemetry, QA findings, or Azure logs.
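
As a rough illustration of the Live Order Synchronization bullet, the sketch below shows one way a function call could update a shared cart atomically. `OrderState`, its `update` method, and the `add`/`remove` action names are hypothetical and are not taken from this repository; the actual backend may structure its order tool differently.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class OrderState:
    """Hypothetical in-memory cart shared by every device in one session."""

    items: dict[str, int] = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def update(self, action: str, item: str, quantity: int = 1) -> dict[str, int]:
        """Apply one tool call atomically and return a snapshot to broadcast."""
        if action not in ("add", "remove"):
            raise ValueError(f"unknown action: {action}")
        if quantity < 1:
            raise ValueError("quantity must be positive")
        with self._lock:
            current = self.items.get(item, 0)
            delta = quantity if action == "add" else -quantity
            remaining = max(current + delta, 0)
            if remaining:
                self.items[item] = remaining
            else:
                # Drop the line item entirely once its quantity reaches zero.
                self.items.pop(item, None)
            # Return a copy so callers cannot mutate the shared cart.
            return dict(self.items)


cart = OrderState()
cart.update("add", "Iced Latte", 2)
snapshot = cart.update("remove", "Iced Latte", 1)
print(snapshot)  # {'Iced Latte': 1}
```

Holding the lock for the whole read-modify-write step means two devices updating the same item cannot interleave, and every client can render the returned snapshot.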

Architecture Diagram

The RTClient in the frontend captures the audio input and streams it to the Python backend. There, an RTMiddleTier object relays the conversation to the Azure OpenAI real-time API and provides a tool for searching Azure AI Search.
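
To make the middle-tier role concrete, here is a minimal sketch of how a relay might decide what to do with each event coming back from the model. It is not the repository's RTMiddleTier code: `route_server_event` and the tool registry are hypothetical, and the event names (`response.output_item.done`, `conversation.item.create`, `response.create`) reflect my understanding of the Realtime API and should be checked against the current reference.

```python
import json
from typing import Callable

# Hypothetical tool registry: tool name -> function taking parsed arguments.
ToolFn = Callable[[dict], str]


def route_server_event(
    event: dict, tools: dict[str, ToolFn]
) -> tuple[list[dict], list[dict]]:
    """Return (messages_for_browser, messages_for_model) for one server event."""
    item = event.get("item", {})
    is_tool_call = (
        event.get("type") == "response.output_item.done"
        and item.get("type") == "function_call"
    )
    if not is_tool_call:
        # Audio, transcripts, and other events pass straight to the browser.
        return [event], []

    tool = tools.get(item.get("name", ""))
    if tool is None:
        output = json.dumps({"error": "unknown tool"})
    else:
        output = tool(json.loads(item.get("arguments") or "{}"))

    # The tool ran server-side, so its result goes back to the model and
    # the browser never sees the raw tool call.
    reply = {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": item.get("call_id"),
            "output": output,
        },
    }
    return [], [reply, {"type": "response.create"}]


def search_menu(args: dict) -> str:
    """Stub standing in for an Azure AI Search query."""
    return json.dumps({"hits": [args["query"]]})


call = {
    "type": "response.output_item.done",
    "item": {
        "type": "function_call",
        "name": "search",
        "call_id": "call_1",
        "arguments": json.dumps({"query": "latte"}),
    },
}
to_browser, to_model = route_server_event(call, {"search": search_menu})
```

Keeping the search tool on the server means search credentials never reach the browser, which is the main reason to put a middle tier between the client and the model.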

Diagram of real-time RAG pattern

This repository includes infrastructure as code and a Dockerfile to deploy the app to Azure Container Apps, but it can also be run locally as long as Azure AI Search and Azure OpenAI services are configured.

Getting Started

You have a few options for getting started with this template. The quickest is GitHub Codespaces, since it sets up all the tools for you. You can also use a VS Code dev container or set everything up locally.

GitHub Codespaces

You can run this repo virtually by using GitHub Codespaces, which opens a web-based VS Code in your browser:

  1. In your forked GitHub repository, select Code ➜ Codespaces ➜ Create codespace on main.
  2. Choose a machine type with at least 8 cores (the 32 GB option provides the smoothest dev experience).
  3. After the container finishes provisioning, open a new terminal and proceed to deploying the app.

VS Code Dev Containers

You can run the project in your local VS Code Dev Container using the Dev Containers extension:

  1. Start Docker Desktop (install it if not already installed).
  2. Clone your GitHub repository locally (see Local environment).
  3. Open the folder in VS Code and choose Reopen in Container when prompted (or run the Dev Containers: Reopen in Container command).
  4. After the container finishes building, open a new terminal and proceed to deploying the app.

Local environment

  1. Install the required tools by running the prerequisites script:
# Make the script executable
chmod +x ./scripts/install_prerequisites.sh
 
# Run the script
./scripts/install_prerequisites.sh

The script installs the Azure CLI, signs you in, and verifies Docker availability for you.

Alternatively, manually install Azure Developer CLI, Node.js, Python >=3.11, Git, and Docker Desktop.

  2. Clone your GitHub repository (git clone https://github.com/swigerb/dunkin-chat-voice-assistant.git).
  3. Proceed to the next section to deploy the app.

Ingesting Menu Items into Azure AI Search

From JSON

If you have a JSON file containing the menu items for your café, you can use the provided Jupyter notebook to ingest the data into Azure AI Search.

Steps (JSON)

  1. Open the menu_ingestion_search_json.ipynb notebook.
  2. Follow the instructions to configure Azure OpenAI and Azure AI Search services.
  3. Prepare the JSON data for ingestion.
  4. Upload the prepared data to Azure AI Search.

This notebook demonstrates how to configure Azure OpenAI and Azure AI Search services, prepare the JSON data for ingestion, and upload the data to Azure AI Search for hybrid semantic search capabilities.
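
The "prepare the JSON data" step could look roughly like the sketch below, which flattens a nested menu into one flat document per item. The input shape (`menu` -> `category` -> `items`) and the output field names are assumptions for illustration; the notebook's actual schema may differ.

```python
import re


def to_search_documents(menu: dict) -> list[dict]:
    """Flatten a nested menu into one search document per menu item."""
    docs = []
    for section in menu.get("menu", []):
        for item in section.get("items", []):
            # Build a key from letters, digits, dashes, and underscores only.
            raw_key = f"{section['category']}-{item['name']}"
            key = re.sub(r"[^A-Za-z0-9_-]+", "-", raw_key).strip("-").lower()
            docs.append(
                {
                    "id": key,
                    "category": section["category"],
                    "name": item["name"],
                    "description": item.get("description", ""),
                    "price": float(item["price"]),
                }
            )
    return docs


# Hypothetical sample data, not taken from the repository.
sample = {
    "menu": [
        {
            "category": "Hot Drinks",
            "items": [{"name": "Caramel Latte", "price": "4.29"}],
        }
    ]
}
documents = to_search_documents(sample)
print(documents[0]["id"])  # hot-drinks-caramel-latte
```

A list in this form is what the `azure-search-documents` SDK's `SearchClient.upload_documents` accepts, provided the index defines matching fields.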

Link to JSON Ingestion Notebook

From PDF

If you have a PDF file of a café's menu that you would like to use, you can use the provided Jupyter notebook to extract text from the PDF, parse it into structured JSON format, and ingest the data into Azure AI Search.

Steps (PDF)

  1. Open the menu_ingestion_search_pdf.ipynb notebook.
  2. Follow the instructions to extract text from the PDF using OCR.
  3. Parse the extracted text using GPT-4o into structured JSON format.
  4. Configure Azure OpenAI and Azure AI Search services.
  5. Prepare the parsed data for ingestion.
  6. Upload the prepared data to Azure AI Search.

This notebook demonstrates how to extract text from a menu PDF using OCR, parse the extracted text into structured JSON format, configure Azure OpenAI and Azure AI Search services, prepare the parsed data for ingestion, and upload the data to Azure AI Search for hybrid semantic search capabilities.
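
Model output is not guaranteed to be clean JSON, so the parsing step benefits from a defensive check before ingestion. The sketch below is one possible approach, not the notebook's code: `parse_menu_response` and the `REQUIRED_FIELDS` tuple are hypothetical, and it assumes the model was asked to return a JSON array of items.

```python
import json

FENCE = "`" * 3  # Markdown code fence the model may wrap around its JSON
REQUIRED_FIELDS = ("name", "price")  # assumed minimum schema per menu item


def parse_menu_response(text: str) -> list[dict]:
    """Parse a model reply into menu items, dropping incomplete entries."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag)
        # and everything from the closing fence onward.
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    items = json.loads(cleaned)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of menu items")
    valid = []
    for item in items:
        complete = isinstance(item, dict) and all(
            item.get(name) not in (None, "") for name in REQUIRED_FIELDS
        )
        if complete:
            valid.append(item)
    return valid


reply = FENCE + 'json\n[{"name": "Bagel", "price": 2.5}, {"name": ""}]\n' + FENCE
print(parse_menu_response(reply))  # [{'name': 'Bagel', 'price': 2.5}]
```

Rejecting incomplete items at this point keeps OCR noise out of the search index, where it would otherwise surface in the assistant's recommendations.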

Link to PDF Ingestion Notebook

Running the App Locally

You have two options for running the app locally for development and testing:

Option 1: Direct Local Execution (Recommended for Development)

Run this app locally using the provided start scripts:

  1. Create an app/backend/.env file with the necessary environment variables. You can use the provided sample file as a template:

    cp app/backend/.env-sample app/backend/.env

    Then, fill in the required values in the app/backend/.env file.

  2. Run this command to start the app:

    Windows:

    pwsh .\scripts\start.ps1

    Linux/Mac:

    ./scripts/start.sh
  3. The app will be available at http://localhost:8000
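
Before running the start script, it can help to confirm that the .env file is fully populated. The following is a small sketch under stated assumptions: the variable names in `REQUIRED` are guesses based on typical Azure OpenAI and Azure AI Search settings, so check app/backend/.env-sample for the real list.

```python
from pathlib import Path

# Assumed names; consult app/backend/.env-sample for the actual variables.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(text: str, required: tuple[str, ...] = REQUIRED) -> list[str]:
    """Return the required names that are absent or empty."""
    values = parse_env(text)
    return [name for name in required if not values.get(name)]


env_path = Path("app/backend/.env")
if env_path.exists():
    print("Missing settings:", missing_settings(env_path.read_text()))
```

An empty list means every assumed setting has a value; anything listed still needs to be filled in.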

Option 2: Docker-based Local Execution

For testing in an isolated container environment:

  1. Make sure you have an .env file in the app/backend/ directory as described above.

  2. Run the Docker build script:

    # Make the script executable
    chmod +x ./scripts/docker-build.sh
    
    # Run the build script
    ./scripts/docker-build.sh

    This script automatically handles:

    • Verifying/creating frontend environment variables
    • Building the Docker image using app/frontend/.env for Vite settings
    • Running the container with your backend configuration
  3. Navigate to http://localhost:8000 to use the application.

Alternatively, you can manually build and run the Docker container:

# Ensure frontend Vite settings exist (edit values as needed)
# cp ./app/frontend/.env-sample ./app/frontend/.env

# Build the Docker image
docker build -t coffee-chat-app \
  -f ./app/Dockerfile ./app

# Run the container with your environment variables
docker run -p 8000:8000 --env-file ./app/backend/.env coffee-chat-app:latest
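
The container may take a few seconds to start serving after docker run returns control. As a convenience, a readiness check along these lines can poll the port mapped above. `http_ok` and `wait_for_app` are hypothetical helpers, and the sketch assumes the app answers a plain GET on its root URL.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if the URL answers with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_for_app(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll the URL until the probe succeeds or attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_app("http://localhost:8000")` returns True once the app responds and False if it never does within the attempt budget. The `probe` parameter exists so the polling logic can be tested without a running container.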

Deploying to Azure

To deploy the app to Azure:

  1. Make sure you have an .env file set up in the app/backend/ directory. You can copy the sample file:

    cp app/backend/.env-sample app/backend/.env
  2. Run the deployment script with minimal parameters:

    # Make the script executable
    chmod +x ./scripts/deploy.sh
    
    # Run the deployment with just the app name (uses all defaults)
    ./scripts/deploy.sh <name-of-your-app>

    The script will automatically:

    • Look for backend environment variables in ./app/backend/.env
    • Look for or create frontend environment variables in ./app/frontend/.env
    • Use the Dockerfile at ./app/Dockerfile
    • Use the Docker context at ./app
  3. For more control, you can specify custom paths:

    ./scripts/deploy.sh \
      --env-file /path/to/custom/backend.env \
      --frontend-env-file /path/to/custom/frontend.env \
      --dockerfile /path/to/custom/Dockerfile \
      --context /path/to/custom/context \
      <name-of-your-app>
  4. After deployment completes, your app will be available at the URL displayed in the console.

License

This project is licensed under the MIT License. You may use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software, provided that the copyright notice and permission notice from the MIT License are included in all copies or substantial portions of the software. Refer to the LICENSE file for the complete terms.

Contributing

Contributions are welcome! Please review CONTRIBUTING.md for environment setup, branching guidance, and the pre-flight test checklist before opening an issue or submitting a pull request.

Disclaimer

All trademarks and brand references belong to their respective owners.

The diagrams, images, and code samples in this repository are provided AS IS for proof-of-concept and pilot purposes only and are not intended for production use.

These materials are provided without warranty of any kind and do not constitute an offer, commitment, or support obligation on the part of Microsoft. Microsoft does not guarantee the accuracy or completeness of any information contained herein.

MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, including but not limited to warranties of merchantability, fitness for a particular purpose, or non-infringement.

Use of these materials is at your own risk.
