Aura AI – ESP32-S3 + Local Brain

Aura is a local-first AI companion powered by an ESP32-S3-Box-3 for audio I/O and a Python backend ("Brain") for speech + intelligence.


⭐ Features

  • Wake‑word based interaction ("Aura", "Hey Aura", etc.)
  • Local wake‑word + STT (Whisper small)
  • Conversational AI via LLM (Ollama / Llama 3.2)
  • Speech output via Microsoft Edge TTS
  • Music playback with categories (Rap / Item / Relax / Travel / Random)
  • Smart Home triggers via tag system
  • Cute UI Face with blinking, lipsync, idle sleep
  • Low-latency duplex streaming via WebSockets

🧩 System Architecture

┌───────────────┐      PCM Audio       ┌───────────────┐
│ ESP32‑S3 BOX  │ ───────────────────▶ │  Aura Brain   │
│ (Microphone)  │                      │ (Python API)  │
└──────┬────────┘      TTS Audio       └──────┬────────┘
       │ ◀────────────────────────────────────┘
       │
       ▼
 ┌──────────────┐
 │ UI + Speaker │
 └──────────────┘
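The PCM leg of this loop amounts to streaming fixed-size audio frames over the socket. A minimal sketch of the framing, assuming 16-bit mono samples and a 20 ms frame at 16 kHz (illustrative values, not taken from the firmware):

```python
import struct

FRAME_SAMPLES = 320  # 20 ms at 16 kHz -- an assumed frame size, not from the firmware

def frame_pcm(samples):
    """Split 16-bit PCM samples into fixed-size binary frames,
    one WebSocket message per frame."""
    frames = []
    for i in range(0, len(samples), FRAME_SAMPLES):
        chunk = samples[i:i + FRAME_SAMPLES]
        frames.append(struct.pack(f"<{len(chunk)}h", *chunk))  # little-endian int16
    return frames

frames = frame_pcm([0] * 800)
print(len(frames), len(frames[0]))  # 3 frames; a full frame is 640 bytes
```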

📂 Folder Structure

AURA-AI/
├── main/
│   ├── aura_firmware.c
│   ├── CMakeLists.txt
│   └── idf_component.yml
├── managed_components/
├── partitions.csv
├── sdkconfig
└── aura_brain/              # Python backend
    ├── assets/
    │   └── songs/           # MUSIC GOES HERE -- add your music files to these folders
    │       ├── rap/
    │       ├── item_songs/
    │       ├── relax mixed_genre/
    │       ├── travel/
    │       └── random/
    ├── model/
    ├── server.py
    ├── dashboard.html
    ├── requirements.txt
    └── dependencies.txt

🎧 IMPORTANT: MUSIC FOLDERS

To enable music playback, you must place audio files in:

aura_brain/assets/songs/

Supported formats:

  • .mp3
  • .wav
  • .m4a
  • .flac

Categories used by the system:

| Category Tag | Folder Name |
| --- | --- |
| PLAY_RAP | `rap/` (e.g. Baadshah tracks) |
| PLAY_ITEM | `item_songs/` |
| PLAY_RELAX | `relax mixed_genre/` |
| PLAY_TRAVEL | `travel/` |
| PLAY_RANDOM | any folder / mixed |

If folders do not exist, create them manually.
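The folders can also be created from Python; this sketch uses the folder names from the tree above:

```python
from pathlib import Path

# Category folders as listed in the repository tree
CATEGORIES = ["rap", "item_songs", "relax mixed_genre", "travel", "random"]

base = Path("aura_brain/assets/songs")
for name in CATEGORIES:
    (base / name).mkdir(parents=True, exist_ok=True)  # no-op if it already exists

print(sorted(p.name for p in base.iterdir()))
```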


🔧 Setup – Aura Brain (Python)

1. Install Dependencies

You must have Python 3.9+.

pip install -r requirements.txt
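A quick way to fail fast if the interpreter is too old for the Brain's dependencies:

```python
import sys

# requirements.txt assumes Python 3.9 or newer
assert sys.version_info >= (3, 9), "Aura Brain requires Python 3.9+"
print("Python", ".".join(map(str, sys.version_info[:3])), "is new enough")
```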

2. Install Whisper + FFmpeg

Whisper model used: small.en

FFmpeg must be on your PATH, or the binaries can be placed in the working directory as:

ffmpeg.exe
ffprobe.exe
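A small helper to confirm Python can see FFmpeg either on PATH or in the working directory (`find_ffmpeg` is an illustrative name, not part of the project):

```python
import shutil
from pathlib import Path

def find_ffmpeg():
    """Return the ffmpeg location: PATH first, then the working
    directory where ffmpeg.exe may have been dropped on Windows."""
    on_path = shutil.which("ffmpeg")
    if on_path:
        return on_path
    local = Path("ffmpeg.exe")
    return str(local) if local.exists() else None

print(find_ffmpeg() or "ffmpeg not found -- install it or copy ffmpeg.exe here")
```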

3. Install Ollama + Model

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2

4. Run Server

python server.py

Server starts at:

ws://<your-ip>:8000/ws/audio
Find your machine's IP by running `ipconfig` in a terminal and using the Wi-Fi adapter's IPv4 address.

---

⚙️ Setup – ESP32 Firmware

1. Dependencies (ESP-IDF)

Install ESP-IDF v5.x:

```sh
git clone https://github.com/espressif/esp-idf.git
```
2. Configure WiFi and Server in aura_firmware.c

Edit:

#define WIFI_SSID      "<YOUR_WIFI>"
#define WIFI_PASS      "<YOUR_PASSWORD>"
#define BRAIN_SERVER_URI "ws://<IP>:8000/ws/audio"

3. Build + Flash

idf.py build
idf.py flash
idf.py monitor

🧠 Keyword + Tag Mapping

| User Intent | LLM Tag Output |
| --- | --- |
| "Turn on lights" | `{{LIGHT_ON}}` |
| "Play rap" | `{{PLAY_RAP}}` |
| "Stop music" | `{{STOP_MUSIC}}` |

The LLM is instructed to emit these tags via a system prompt that is already included in the code.
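Stripping tags out of the reply before it reaches TTS can be done with a regex; `split_tags` is an illustrative name, and the double-brace format follows the table above:

```python
import re

TAG_PATTERN = re.compile(r"\{\{([A-Z_]+)\}\}")

def split_tags(llm_output):
    """Return (spoken_text, tags): pull {{TAG}} markers out of the reply
    so they trigger actions instead of being spoken aloud."""
    tags = TAG_PATTERN.findall(llm_output)
    text = TAG_PATTERN.sub("", llm_output).strip()
    return text, tags

print(split_tags("Sure, turning them on now. {{LIGHT_ON}}"))
# ('Sure, turning them on now.', ['LIGHT_ON'])
```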


📦 requirements.txt (Python)

Typical entries (already included):

fastapi
uvicorn
websockets
pydub
faster-whisper
edge-tts
glob2
ollama

📁 dependencies.txt (Firmware)

Tracks the ESP-IDF, LVGL, BSP, and codec component versions.

Example:

ESP-IDF v5.3
LVGL v8.x
esp-box-3-bsp
esp_codec_dev
esp_websocket_client

Place this beside your server.py.


📝 Developer Notes

  • Wake-word handled via STT phrase matching
  • LLM streaming with sentence-split buffering
  • Traffic control prevents full-duplex overlap
  • UI state machine drives ears + eyes + mouth
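The sentence-split buffering mentioned above can be sketched as a generator that releases complete sentences while tokens are still streaming (the function name and sentence-boundary regex are illustrative, not the project's actual code):

```python
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def stream_sentences(token_chunks):
    """Buffer streamed LLM tokens and yield each complete sentence as
    soon as it ends, so TTS can start before the full reply arrives."""
    buf = ""
    for chunk in token_chunks:
        buf += chunk
        parts = SENTENCE_END.split(buf)
        yield from parts[:-1]      # everything but the trailing fragment
        buf = parts[-1]            # keep the unfinished sentence
    if buf.strip():
        yield buf.strip()          # flush whatever remains at end of stream

print(list(stream_sentences(["Hello", " there. How", " are you?"])))
# ['Hello there.', 'How are you?']
```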

🎬 Demo

Watch it on YouTube: Aura Ai Companion


🙌 Credits

Created by Krishna Chauhan as the Aura submission for the 2025 Circuit Digest competition.

Uses:

  • ESP32‑S3‑Box‑3 (Hardware)
  • Whisper (STT)
  • Edge‑TTS (Speech)
  • Llama 3.2 via Ollama (LLM)
  • LVGL (UI)
