Dunkin Voice Chat Assistant is an Inspire Brands–themed, voice-driven ordering experience that showcases Microsoft best practices for Azure OpenAI GPT-4o Realtime, Azure AI Search, and Azure Container Apps. The experience emulates a Dunkin crew member who can search the official menu, hold multilingual conversations, and keep orders in sync across devices.
As guests speak, real-time transcription, translation, and order management provide a transparent view of every choice, from signature lattes to bakery classics. The UI applies Dunkin's vibrant design language so stakeholders can picture how voice AI augments drive-thru, curbside, and kiosk flows.
Beyond coffee fans, this sample demonstrates how Microsoft’s Responsible AI guidance plus Azure-first tooling enable inclusive, hands-free interactions for franchise teams, accessibility scenarios, and mixed fleet deployments across the Inspire Brands portfolio.
This project extends the VoiceRAG Repository, adapting its Microsoft-first architecture for a Dunkin scenario. Review the original pattern in this blog post. For the upstream README, see voice_rag_README.md.
Special thanks to John Carroll for the original coffee-chat-voice-assistant that inspired this sample. This fork updates to the latest OpenAI models and stirs in a few extra flavors—call it my own brew of the solution.
- Dunkin-specific conversational AI: GPT-4o Realtime is constrained to verified menu data through Azure AI Search so it always sounds like a Dunkin crew member.
- Retrieval-Augmented Generation (RAG): Azure OpenAI tool-calling plus semantic hybrid search keep recommendations grounded with pricing, nutrition, and add-on guidance.
- Real-Time Transcription + Translation: Multilingual guests receive accurate transcripts in their language of choice with instant pivots between English, Spanish, Mandarin, French, and more.
- Live Order Synchronization: Function calls update the shared cart so kiosk screens, mobile devices, and drive-thru headsets stay aligned without race conditions.
- Audio Output + Accessibility: Browser audio playback mirrors what a guest would hear over drive-thru speakers, supporting screenless or low-vision ordering.
- Durable Session Tokens: Every realtime conversation emits a session token plus per-turn identifiers so transcripts can map back to telemetry, QA findings, or Azure logs.
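To make the live order synchronization idea concrete, here is a minimal sketch of a shared cart that applies each add or remove request atomically. The `SharedCart` class, its `apply` method, and the `version` counter are hypothetical names chosen for illustration, and a `threading.Lock` stands in for whatever coordination the real backend uses.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class SharedCart:
    """Hypothetical in-memory cart shared by every device in one session."""

    items: dict[str, int] = field(default_factory=dict)
    version: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def apply(self, item: str, quantity_delta: int) -> dict:
        """Apply one add/remove request atomically and return a snapshot."""
        with self._lock:
            new_quantity = self.items.get(item, 0) + quantity_delta
            if new_quantity > 0:
                self.items[item] = new_quantity
            else:
                # Removing at least as many as are in the cart drops the line item.
                self.items.pop(item, None)
            # Every change bumps the version so clients can discard stale snapshots.
            self.version += 1
            return {"version": self.version, "items": dict(self.items)}


cart = SharedCart()
cart.apply("Iced Latte", 2)
snapshot = cart.apply("Iced Latte", -1)
print(snapshot)
```

Because each snapshot carries a monotonically increasing version, a kiosk or mobile client that receives updates out of order can keep only the newest one.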
The RTClient in the frontend captures audio input and sends it to the Python backend. The backend uses an RTMiddleTier object to interface with the Azure OpenAI realtime API and includes a tool for searching Azure AI Search.
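The tool-calling step in that flow can be pictured as a small dispatcher: the model names a tool and supplies JSON arguments, and the middle tier runs the matching Python function and returns JSON text. This is only a sketch under assumptions. `search_menu`, `TOOLS`, and `handle_tool_call` are hypothetical names, not the actual RTMiddleTier interface, and the in-memory `MENU` list with placeholder values stands in for the Azure AI Search index.

```python
import json
from typing import Callable

# Hypothetical stand-in for the search index; values are placeholders,
# not real menu data.
MENU = [
    {"name": "Iced Latte", "category": "espresso"},
    {"name": "Glazed Donut", "category": "bakery"},
]


def search_menu(query: str) -> list[dict]:
    """Return menu entries whose name contains the query (case-insensitive)."""
    needle = query.lower()
    return [item for item in MENU if needle in item["name"].lower()]


# Registry mapping the tool names the model may call to Python functions.
TOOLS: dict[str, Callable[..., object]] = {"search_menu": search_menu}


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return JSON text."""
    tool = TOOLS.get(name)
    if tool is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    arguments = json.loads(arguments_json)
    return json.dumps({"result": tool(**arguments)})


print(handle_tool_call("search_menu", '{"query": "latte"}'))
```

Keeping the registry explicit means the model can only trigger functions that were deliberately exposed, which is one way to keep answers grounded in the menu data.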
This repository includes infrastructure as code and a Dockerfile to deploy the app to Azure Container Apps, but it can also be run locally as long as Azure AI Search and Azure OpenAI services are configured.
You have a few options for getting started with this template. The quickest way is GitHub Codespaces, since it will set up all the tools for you, but you can also use a VS Code dev container or set everything up locally.
You can run this repo virtually by using GitHub Codespaces, which opens a web-based VS Code in your browser:
- In your forked GitHub repository, select Code ➜ Codespaces ➜ Create codespace on main.
- Choose a machine type with at least 8 cores (the 32 GB option provides the smoothest dev experience).
- After the container finishes provisioning, open a new terminal and proceed to deploying the app.
You can run the project in your local VS Code Dev Container using the Dev Containers extension:
- Start Docker Desktop (install it if not already installed).
- Clone your GitHub repository locally (see Local environment).
- Open the folder in VS Code and choose Reopen in Container when prompted (or run the Dev Containers: Reopen in Container command).
- After the container finishes building, open a new terminal and proceed to deploying the app.
1. Install the required tools by running the prerequisites script:
# Make the script executable
chmod +x ./scripts/install_prerequisites.sh
# Run the script
./scripts/install_prerequisites.sh
The script installs the Azure CLI, signs you in, and verifies Docker availability for you.
Alternatively, manually install Azure Developer CLI, Node.js, Python >=3.11, Git, and Docker Desktop.
2. Clone your GitHub repository (git clone https://github.com/swigerb/dunkin-chat-voice-assistant.git)
3. Proceed to the next section to deploy the app.
If you have a JSON file containing the menu items for your café, you can use the provided Jupyter notebook to ingest the data into Azure AI Search.
- Open the menu_ingestion_search_json.ipynb notebook.
- Follow the instructions to configure Azure OpenAI and Azure AI Search services.
- Prepare the JSON data for ingestion.
- Upload the prepared data to Azure AI Search.
This notebook demonstrates how to configure Azure OpenAI and Azure AI Search services, prepare the JSON data for ingestion, and upload the data to Azure AI Search for hybrid semantic search capabilities.
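As a rough illustration of the "prepare the JSON data" step, the sketch below flattens a nested menu into one flat document per item, each with a key-safe `id`. The input shape (`categories` containing `items`) and the output field names are assumptions made for this example; the notebook's actual schema may differ.

```python
import json
import re


def slugify(text: str) -> str:
    """Make a key-safe identifier: lowercase letters, digits, and dashes."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def to_search_documents(menu: dict) -> list[dict]:
    """Flatten a nested menu (assumed shape) into one document per item."""
    docs = []
    for category in menu["categories"]:
        for item in category["items"]:
            docs.append({
                "id": slugify(f'{category["name"]}-{item["name"]}'),
                "category": category["name"],
                "name": item["name"],
                "description": item.get("description", ""),
            })
    return docs


# Placeholder sample input, not real menu data.
sample = json.loads("""
{"categories": [{"name": "Espresso",
                 "items": [{"name": "Iced Latte",
                            "description": "Espresso over ice with milk"}]}]}
""")
documents = to_search_documents(sample)
print(documents)
```

A list shaped like `documents` is what you would typically hand to an upload call such as `SearchClient.upload_documents` from the `azure-search-documents` package, once the index defines matching fields.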
Link to JSON Ingestion Notebook
If you have a PDF file of a café's menu that you would like to use, you can use the provided Jupyter notebook to extract text from the PDF, parse it into structured JSON format, and ingest the data into Azure AI Search.
- Open the menu_ingestion_search_pdf.ipynb notebook.
- Follow the instructions to extract text from the PDF using OCR.
- Parse the extracted text using GPT-4o into structured JSON format.
- Configure Azure OpenAI and Azure AI Search services.
- Prepare the parsed data for ingestion.
- Upload the prepared data to Azure AI Search.
This notebook demonstrates how to extract text from a menu PDF using OCR, parse the extracted text into structured JSON format, configure Azure OpenAI and Azure AI Search services, prepare the parsed data for ingestion, and upload the data to Azure AI Search for hybrid semantic search capabilities.
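Model-generated JSON can be malformed or incomplete, so it may be worth checking the parsed output before ingestion. The sketch below is one possible check, not part of the notebook: `validate_parsed_menu` and the `REQUIRED_FIELDS` tuple are hypothetical, and the required fields should be adjusted to whatever schema your index uses.

```python
import json

# Hypothetical schema: fields every parsed menu item is expected to carry.
REQUIRED_FIELDS = ("name", "category")


def validate_parsed_menu(raw_json: str) -> tuple[list[dict], list[str]]:
    """Split model output into usable items and human-readable problems."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [], [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return [], ["expected a JSON array of menu items"]
    valid, problems = [], []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            problems.append(f"item {index}: not an object")
            continue
        # A field counts as missing when it is absent or empty.
        missing = [name for name in REQUIRED_FIELDS if not item.get(name)]
        if missing:
            problems.append(f"item {index}: missing {', '.join(missing)}")
        else:
            valid.append(item)
    return valid, problems


# Placeholder input standing in for the text returned by the model.
raw = '[{"name": "Glazed Donut", "category": "Bakery"}, {"name": "Mystery Item"}]'
items, issues = validate_parsed_menu(raw)
print(items)
print(issues)
```

Only the items that pass would move on to the preparation and upload steps; the problem list gives you something concrete to review against the source PDF.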
Link to PDF Ingestion Notebook
You have two options for running the app locally for development and testing:
Run this app locally using the provided start scripts:
- Create an app/backend/.env file with the necessary environment variables. You can use the provided sample file as a template:

      cp app/backend/.env-sample app/backend/.env

  Then fill in the required values in the app/backend/.env file.
- Run this command to start the app:

  Windows:

      pwsh .\scripts\start.ps1

  Linux/Mac:

      ./scripts/start.sh

- The app will be available at http://localhost:8000
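Before starting the app, it can help to confirm that every setting listed in the sample file has a value in your own .env. The helper below is a hypothetical sketch and not part of the repository's scripts. It handles only simple KEY=VALUE lines and treats every key in the sample as required, which may be stricter than the app actually needs.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(sample_text: str, env_text: str) -> list[str]:
    """List keys that the sample defines but the .env omits or leaves empty."""
    env = parse_env(env_text)
    return [key for key in parse_env(sample_text) if not env.get(key)]


if __name__ == "__main__":
    backend = Path("app/backend")
    sample_file, env_file = backend / ".env-sample", backend / ".env"
    if sample_file.exists() and env_file.exists():
        gaps = missing_settings(sample_file.read_text(), env_file.read_text())
        print("Missing or empty settings:", gaps or "none")
    else:
        print("Expected app/backend/.env-sample and app/backend/.env")
```

Run it from the repository root so the relative paths resolve.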
For testing in an isolated container environment:
- Make sure you have an .env file in the app/backend/ directory as described above.
- Run the Docker build script:

      # Make the script executable
      chmod +x ./scripts/docker-build.sh
      # Run the build script
      ./scripts/docker-build.sh

  This script automatically handles:
  - Verifying/creating frontend environment variables
  - Building the Docker image using app/frontend/.env for Vite settings
  - Running the container with your backend configuration
- Navigate to http://localhost:8000 to use the application.
Alternatively, you can manually build and run the Docker container:
# Ensure frontend Vite settings exist (edit values as needed)
# cp ./app/frontend/.env-sample ./app/frontend/.env
# Build the Docker image
docker build -t coffee-chat-app \
-f ./app/Dockerfile ./app
# Run the container with your environment variables
docker run -p 8000:8000 --env-file ./app/backend/.env coffee-chat-app:latest
To deploy the app to a production environment in Azure:
- Make sure you have an .env file set up in the app/backend/ directory. You can copy the sample file:

      cp app/backend/.env-sample app/backend/.env

- Run the deployment script with minimal parameters:

      # Make the script executable
      chmod +x ./scripts/deploy.sh
      # Run the deployment with just the app name (uses all defaults)
      ./scripts/deploy.sh <name-of-your-app>

  The script will automatically:
  - Look for backend environment variables in ./app/backend/.env
  - Look for or create frontend environment variables in ./app/frontend/.env
  - Use the Dockerfile at ./app/Dockerfile
  - Use the Docker context at ./app
- For more control, you can specify custom paths:

      ./scripts/deploy.sh \
        --env-file /path/to/custom/backend.env \
        --frontend-env-file /path/to/custom/frontend.env \
        --dockerfile /path/to/custom/Dockerfile \
        --context /path/to/custom/context \
        <name-of-your-app>

- After deployment completes, your app will be available at the URL displayed in the console.
This project is licensed under the MIT License. You may use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software, provided that the copyright notice and permission notice from the MIT License are included in all copies or substantial portions of the software. Refer to the LICENSE file for the complete terms.
Contributions are welcome! Please review CONTRIBUTING.md for environment setup, branching guidance, and the pre-flight test checklist before opening an issue or submitting a pull request.
All trademarks and brand references belong to their respective owners.
The diagrams, images, and code samples in this repository are provided AS IS for proof-of-concept and pilot purposes only and are not intended for production use.
These materials are provided without warranty of any kind and do not constitute an offer, commitment, or support obligation on the part of Microsoft. Microsoft does not guarantee the accuracy or completeness of any information contained herein.
MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, including but not limited to warranties of merchantability, fitness for a particular purpose, or non-infringement.
Use of these materials is at your own risk.
- OpenAI Realtime API Documentation
- Azure OpenAI Documentation
- Azure AI Services Documentation
- Azure AI Search Documentation
- Azure AI Services Tutorials
- Azure AI Community Support
- Azure AI GitHub Samples
- Azure AI Services API Reference
- Azure AI Services Pricing
- Azure Developer CLI Documentation
- Azure Developer CLI GitHub Repository

