Rakonda/local-speech-to-speech-chatbot
Speech to speech chatbot

This project is a voice-based chatbot that uses Vosk for speech recognition, Piper for text-to-speech, and a local Large Language Model (LLM) for generating responses, run entirely on-device via the Python gpt4all package.

  • The chatbot persona is named Emma (modifiable).
  • The default user name is Pedro (modifiable).
  • These roles and other settings are easily changed in chatbot.py or the LLM context prompt.
  • The script has been tested on Windows (other OSes may require adjustments).

How It Works

  1. You speak as Pedro (default, can be customized) into your microphone.
  2. Audio is captured and transcribed to English text using the Vosk en_US model.
  3. The recognized text is sent to the on-CPU LLM (using the local gpt4all Python package running a GGUF model). No API calls are made.
  4. Emma (the AI) generates a response using the specified LLM model and replies aloud with a natural voice.
  5. Piper TTS voices Emma's reply using the en_US female voice (current voice: en_US-libritts-high.onnx + corresponding .json metadata). You can switch voices by updating paths/URLs in chatbot.py and setup_models.py.
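The loop above can be sketched in Python. The helper names, prompt format, and model paths below are illustrative assumptions, not the exact code in chatbot.py (external libraries are imported lazily inside the functions that need them):

```python
# Illustrative capture -> transcribe -> generate -> speak pipeline.
import json
import subprocess

PERSONA, USER = "Emma", "Pedro"  # assumed defaults; both are modifiable

def build_prompt(user_text: str) -> str:
    """Wrap the transcribed text in a simple role-based prompt."""
    return f"{USER}: {user_text}\n{PERSONA}:"

def transcribe(audio_chunks, model_path="vosk-models/en_US"):
    """Feed raw 16 kHz PCM chunks to Vosk and return the final transcript."""
    from vosk import Model, KaldiRecognizer  # requires the vosk package
    rec = KaldiRecognizer(Model(model_path), 16000)
    for chunk in audio_chunks:
        rec.AcceptWaveform(chunk)
    return json.loads(rec.FinalResult()).get("text", "")

def generate(prompt, model="Phi-3-mini-4k-instruct-q4.gguf"):
    """Run the local GGUF model on CPU via gpt4all; no network calls."""
    from gpt4all import GPT4All  # requires the gpt4all package
    return GPT4All(model, model_path="gguf-models").generate(prompt, max_tokens=200)

def speak(text, voice="piper-voice/en_US-libritts-high.onnx"):
    """Pipe text through the Piper CLI to synthesize a WAV reply."""
    subprocess.run(["piper", "--model", voice, "--output_file", "reply.wav"],
                   input=text.encode())
```

A single turn is then roughly `speak(generate(build_prompt(transcribe(chunks))))`.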

Chatbot Modes

There are two supported scripts/modes:

  • chatbot.py — Non-streaming mode: waits for full LLM response before vocalizing.
  • chatbot-stream.py — Streaming mode: generates and speaks responses incrementally as they're produced by the LLM, for a more natural, responsive feel.
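Streaming mode hinges on chunking the LLM's token stream into speakable units. A minimal sketch (the real chunking logic in chatbot-stream.py may differ) buffers tokens and yields each sentence as soon as it is complete, so Piper can start speaking before generation finishes:

```python
import re

def sentences(token_stream):
    """Buffer LLM tokens and yield complete sentences as they appear,
    so TTS can begin speaking before the full response is generated."""
    buf = ""
    for tok in token_stream:
        buf += tok
        while True:
            m = re.search(r"[.!?](\s|$)", buf)  # sentence boundary
            if not m:
                break
            yield buf[:m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():          # flush any trailing fragment
        yield buf.strip()
```

For example, the token stream `["Hello", " there.", " How are", " you?"]` yields `"Hello there."` first, then `"How are you?"`.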

Model Information (LLM)

  • Uses a GGUF format model:
    gguf-models/Phi-3-mini-4k-instruct-q4.gguf
  • This model must be downloaded (see below) and made available at the specified location before starting the chatbot.

Performance & Lag Notice

Since all AI generation runs on your CPU, expect some lag (hundreds of milliseconds to multiple seconds) both after you finish speaking (while converting speech to text) and while waiting for Emma's response. The delay depends on your CPU speed and the complexity of your question.
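To see where the lag actually comes from on your machine, you can time each stage with a small helper (not part of the project; a stdlib-only sketch):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(stage):
    """Print how long a pipeline stage (STT, LLM, TTS) takes."""
    t0 = time.perf_counter()
    yield
    print(f"{stage}: {time.perf_counter() - t0:.2f}s")
```

Usage: wrap each stage, e.g. `with timed("LLM"): reply = model.generate(prompt)`, and compare the printed timings.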

Customization

  • Change Persona or Greetings: Edit the greeting text, prompt, or role names in chatbot.py.
  • Use a Different LLM/Model:
    • Change MODEL_NAME in chatbot.py to match any installed model.
    • Ensure you have downloaded and installed the correct GGUF model. Update path in your script or via environment variables as needed.
    • For alternate GGUF models, update the file in gguf-models/.
  • Change Vosk or Piper Models:
    • Download/replace the models (see setup_models.py for URLs and local paths).
    • For non-English or alternate voices, update these paths and ensure the files match your target language or persona.
  • Troubleshooting Tips:
    • If the chatbot does not speak, check your audio input/output devices and model files.
    • If you get errors about missing GGUF models, check the paths and model download.
    • To test alternate configurations, update the relevant parameters in chatbot.py and re-run the bot.
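The customization points above boil down to a handful of constants near the top of the script. The names below are hypothetical (check your copy of chatbot.py for the actual spelling):

```python
# Hypothetical configuration constants; adjust names to match chatbot.py.
MODEL_NAME = "gguf-models/Phi-3-mini-4k-instruct-q4.gguf"  # LLM to load
PERSONA_NAME = "Emma"    # the assistant's role in the prompt
USER_NAME = "Pedro"      # your role in the prompt
PIPER_VOICE = "piper-voice/en_US-libritts-high.onnx"       # TTS voice

SYSTEM_PROMPT = (
    f"You are {PERSONA_NAME}, a friendly assistant. "
    f"You are talking to {USER_NAME}. Keep replies short."
)
```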

Quick Start (Recommended)

Set up everything with one automated script:

python setup_env.py

This will:

  1. Create a new Python virtual environment in .venv (if one doesn't exist)
  2. Install dependencies from requirements.txt
  3. Download all required speech and voice models via setup_models.py
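The three steps above roughly correspond to the following stdlib-only sketch (the real setup_env.py may differ in details):

```python
# Sketch of an automated setup script: create venv, install deps, fetch models.
import os
import subprocess
import venv

def venv_python(venv_dir=".venv"):
    """Return the interpreter path inside the venv for the current OS."""
    if os.name == "nt":                               # Windows layout
        return os.path.join(venv_dir, "Scripts", "python.exe")
    return os.path.join(venv_dir, "bin", "python")    # macOS/Linux layout

def setup(venv_dir=".venv"):
    if not os.path.isdir(venv_dir):                   # 1. create .venv
        venv.EnvBuilder(with_pip=True).create(venv_dir)
    py = venv_python(venv_dir)
    subprocess.check_call([py, "-m", "pip", "install",
                           "-r", "requirements.txt"]) # 2. install deps
    subprocess.check_call([py, "setup_models.py"])    # 3. download models
```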

Then activate the environment:

  • On Windows:
    .venv\Scripts\activate
  • On macOS/Linux:
    source .venv/bin/activate

Now you can run the chatbot!

# Non-streamed LLM (full response, more delay)
python chatbot.py

# Streamed LLM (Emma speaks while thinking)
python chatbot-stream.py

  • Speak into your microphone. The bot transcribes your speech, generates a reply using the GGUF-based LLM, and speaks the response as "Emma."
  • Press Ctrl+C to stop.

Requirements

  • Python 3.8+
  • Working microphone & speakers
  • No API server or backend needed. Local CPU-only inference.

Notes

  • Model files for Vosk, Piper, and the GGUF LLM are auto-downloaded into the vosk-models/, piper-voice/, and gguf-models/ folders.
  • If you wish to use different voice/language models, update the paths/URLs in setup_models.py and chatbot.py.
  • You may need to adapt device settings or paths for other operating systems or hardware.
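The auto-download in setup_models.py presumably amounts to something like the sketch below; the URL in the example is a placeholder (the real download links live in setup_models.py):

```python
# Sketch of a model fetcher: download into a folder, unzip archives.
import os
import urllib.request
import zipfile

def local_path(url: str, dest_dir: str) -> str:
    """Where a downloaded file lands: dest_dir / last URL segment."""
    return os.path.join(dest_dir, url.rsplit("/", 1)[-1])

def fetch(url: str, dest_dir: str) -> str:
    """Download url into dest_dir unless already present; unzip .zip files."""
    os.makedirs(dest_dir, exist_ok=True)
    path = local_path(url, dest_dir)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    if path.endswith(".zip"):
        with zipfile.ZipFile(path) as z:
            z.extractall(dest_dir)
    return path
```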

Manual Setup (Advanced/Optional)

If you prefer to run the steps manually, they mirror what setup_env.py automates:

  1. Create and activate .venv
  2. pip install -r requirements.txt
  3. python setup_models.py
  4. Run python chatbot.py (or python chatbot-stream.py)

Enjoy your private, local offline AI assistant!

About

A local speech-to-speech chatbot that listens to user input, transcribes it, generates intelligent responses, and replies using speech synthesis—all processed locally without requiring internet connectivity.
