A real-time voice-interactive chatbot powered by ESP32, Gemini AI, Google Web Speech API, and dual OLED animations. Inspired by Iron Man's J.A.R.V.I.S, this project listens for a trigger word and responds intelligently with animated eye expressions.
- 🎤 Voice Activated: Always listening for the trigger word `Jarvis`.
- 💬 Real-Time Q&A: Captures speech, converts it to text using the Google Web Speech API (STT), and gets responses from Gemini AI.
- 👀 OLED Animation: Dual OLED eyes with a blinking animation that resumes smoothly during AI processing.
- 🧠 Smart Display:
  - After trigger word: shows `Yes` and `Sir!` for 2s.
  - Displays the user's question and Gemini's response for 15s.
  - Auto-interrupts if the trigger word is used again.
- ❌ Exit Commands: Recognizes `Thanks Buddy`, `exit`, or `bye` and displays `See you` and `Soon` before shutdown.
- 🛠️ Error Handling: Displays `Sorry, Say again!` when input is unclear and immediately retries.
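The trigger word, exit commands, and unclear-input handling listed above amount to classifying each transcript returned by the speech-to-text step. The sketch below is one possible way to do that, not the project's actual code: the function names `normalize` and `classify` are hypothetical, and it assumes exit commands must match the whole utterance so that a question containing the word "exit" is not mistaken for a shutdown request.

```python
import re

TRIGGER_WORD = "jarvis"
EXIT_PHRASES = ("thanks buddy", "exit", "bye")


def normalize(transcript):
    """Lowercase the text and drop punctuation so matching is forgiving."""
    return re.sub(r"[^a-z0-9\s]", "", transcript.lower()).strip()


def classify(transcript):
    """Return 'unclear', 'exit', 'trigger', or 'question' for one transcript.

    Assumption: the STT step hands back None or an empty string when it
    could not understand the audio.
    """
    if transcript is None:
        return "unclear"
    text = normalize(transcript)
    if not text:
        return "unclear"
    # Exit commands must be the entire utterance (an assumption, see above).
    if text in EXIT_PHRASES:
        return "exit"
    # The trigger word may appear anywhere, e.g. "Hey, Jarvis!".
    if TRIGGER_WORD in text.split():
        return "trigger"
    return "question"
```

For example, `classify("Thanks, Buddy!")` yields `"exit"` because punctuation and case are removed before matching, while `classify("How do I exit Vim?")` is treated as a question.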
- ESP32 Dev Board: main controller
- OLED Displays x2: 1.3-inch I2C SSD1306 (left + right)
- Microphone (I2S): captures voice input
- Python Backend: handles STT, Gemini, and serial I/O
- Breadboard
- Jumper Wires
-> Microphone (I2S) to ESP32
| Microphone Pin | ESP32 GPIO |
|---|---|
| VCC | 3.3 V |
| GND | GND |
| SCK | GPIO 14 |
| WS | GPIO 15 |
| SD | GPIO 32 |
-> OLED Displays (I2C) to ESP32
| OLED Pin | ESP32 GPIO |
|---|---|
| VCC | 3.3 V |
| GND | GND |
| SCL (OLED 1) | GPIO 22 |
| SDA (OLED 1) | GPIO 21 |
| SCL (OLED 2) | GPIO 16 |
| SDA (OLED 2) | GPIO 17 |
-> Google Web Speech API (STT) -> Gemini AI via API call
- Always listens for "Jarvis"
- Displays "Yes" + "Sir!"
- Listens for your question
- Sends to Gemini AI
- Shows Q & A on OLEDs
- Returns to idle eye blinking
- Handles exit or unclear commands
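The workflow above can be sketched as a small loop. This is an illustrative outline under stated assumptions rather than the repository's `main.py`: `run_session` and its three callables (`listen`, `ask_model`, `show`) are hypothetical stand-ins for the microphone/STT step, the Gemini call, and the serial link to the OLEDs. It also assumes the question goes to the left display and the answer to the right, and it leaves out the 2s and 15s display timers.

```python
TRIGGER_WORD = "jarvis"
EXIT_PHRASES = {"thanks buddy", "exit", "bye"}


def run_session(listen, ask_model, show, max_turns=20):
    """Run the listen/answer loop until an exit phrase is heard.

    listen()          -> transcript string, or None if speech was unclear
    ask_model(text)   -> answer string (e.g. from Gemini)
    show(left, right) -> put one string on each OLED
    Returns "exit" on an exit command, "timeout" after max_turns.
    """
    awake = False  # True once the trigger word has been heard
    for _ in range(max_turns):
        heard = listen()
        if not heard:
            if awake:
                # Unclear input after the trigger word: ask again and retry.
                show("Sorry,", "Say again!")
            continue
        # Strip punctuation and case before matching.
        text = "".join(
            ch for ch in heard.lower() if ch.isalnum() or ch.isspace()
        ).strip()
        if text in EXIT_PHRASES:
            show("See you", "Soon")
            return "exit"
        if TRIGGER_WORD in text.split():
            show("Yes", "Sir!")
            awake = True
            continue
        if awake:
            show(heard, ask_model(heard))
            awake = False  # back to idle blinking until the next trigger
    return "timeout"
```

Passing the three steps in as functions keeps the loop testable without a microphone, an API key, or a connected board: a scripted `listen` and a list-appending `show` are enough to exercise every branch.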
- Clone the Repository:
  - `git clone https://github.com/Vignesh-D-31/esp32-chatbot.git`
  - `cd esp32-chatbot`
- Set Up the Python Environment: Make sure you have Python 3.8 or later installed, then install the required dependencies with `pip install -r requirements.txt`.
  - ✅ Ensure your `GOOGLE_APPLICATION_CREDENTIALS` and Gemini API key are configured in your environment.
- Upload Arduino Code to ESP32:
  - Open `chatbot_stt_gemini.ino` in the Arduino IDE.
  - Select your ESP32 board from Tools > Board.
  - Connect your ESP32 via USB and select the appropriate COM port.
  - Click Upload.
- Connect Your Hardware: Connect the following components, using the pin mapping in the ⚡Connections section of README.md:
  - Two OLEDs via I2C (left and right eye)
  - Microphone for capturing the user's voice
- Run the Python Interface: `python main.py`. This script handles:
  - Trigger word detection
  - Speech-to-Text (Google Web Speech API)
  - Response generation (Gemini AI)
  - Sending text to the ESP32 via Serial
  - ℹ️ Make sure the ESP32 is connected to your computer via USB and the correct COM port is set in the Python script.
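To illustrate the last step, sending text to the ESP32 over Serial, here is a minimal sketch. The wire format is an assumption, not something the repository documents: one newline-terminated line per update, with the left and right display strings separated by `|`. The helper names `encode_frame` and `send_to_esp32` are hypothetical, and the Arduino sketch would need to parse the same format.

```python
def encode_frame(left, right):
    """Pack two display strings into one line of bytes: b"left|right\\n".

    The "|" separator and newline terminator are assumed conventions, so
    both characters are replaced with spaces inside the text itself.
    """
    clean = [
        part.replace("|", " ").replace("\n", " ").strip()
        for part in (left, right)
    ]
    return ("|".join(clean) + "\n").encode("utf-8")


def send_to_esp32(port, left, right):
    """Write one frame to `port` and return the number of bytes sent.

    `port` can be any object with write() and flush(), such as an open
    pyserial `serial.Serial` connection or an in-memory buffer.
    """
    frame = encode_frame(left, right)
    port.write(frame)
    port.flush()
    return len(frame)
```

With pyserial installed, the port could be opened as `serial.Serial("COM3", 115200, timeout=1)`; the port name and baud rate here are placeholders and must match your system and the value used in the Arduino sketch.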
-> Idle State:
-> When inputs are not given:
-> When exit commands are given:
-> Sample Questions and Responses:








