
Breath Assist - Voice-Activated AI Assistant

A 100% local, voice-activated AI assistant that lets you speak commands to your computer, from building apps to system administration. Think HAL from 2001: A Space Odyssey, but running entirely on local resources.

Features

  • 100% Local Processing: No cloud services required
  • Voice Activation: Listen for wake word and respond to voice commands
  • Voice Activity Detection: Automatically detects when you stop speaking
  • Conversational AI: Natural language interaction with local models
  • System Administration: Execute safe system commands
  • Development Tools: Build applications based on verbal instructions
  • File Operations: Read, write, list, and manage files
  • Git Integration: Perform git operations
  • Network Operations: Safe network commands
  • Tool Creation: Create new tools as needed
  • Web Browsing: Open browsers to specific URLs
  • Time & Date: Provide accurate system time and date
  • Memory System: Remember important information with user confirmation

Architecture

┌────────────┐
│  Microphone│
└─────┬──────┘
      ▼
┌────────────┐
│ pw-record  │  (PipeWire)
└─────┬──────┘
      ▼
┌────────────┐
│ Whisper    │  (STT)
└─────┬──────┘
      ▼
┌───────────────────────────┐
│ Intent / Reasoning Model  │  (Ollama: qwen3 / deepseek / llama)
└─────┬─────────────────────┘
      ▼
┌───────────────────────────┐
│ Tool Router (YOURS)       │  ← THIS IS THE CORE
│ - bash                    │
│ - fs                      │
│ - git                     │
│ - build                   │
│ - network                 │
│ - self-create tools       │
└─────┬─────────────────────┘
      ▼
┌────────────┐
│ Piper TTS  │
└────────────┘
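The tool-router stage marked as the core of the pipeline above can be sketched as a simple dispatch table: the reasoning model's parsed intent names a tool, and the router invokes the matching handler. This is a minimal illustrative sketch, not the project's actual implementation; the function and key names are assumptions.

```python
def tool_bash(args):
    # In the real assistant this would execute a vetted shell command.
    return f"bash: would run {args!r}"

def tool_fs(args):
    # In the real assistant this would read/write/list files.
    return f"fs: would access {args!r}"

# Dispatch table mapping tool names (as in the diagram) to handlers.
TOOLS = {
    "bash": tool_bash,
    "fs": tool_fs,
}

def route(intent):
    """Dispatch a parsed intent like {'tool': 'bash', 'args': 'ls'}."""
    handler = TOOLS.get(intent.get("tool"))
    if handler is None:
        return f"unknown tool: {intent.get('tool')}"
    return handler(intent.get("args"))
```

New tools (including self-created ones) are just new entries in the table, which is what makes this stage easy to extend.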

Prerequisites

You need the following system components installed:

Audio Processing

  • pw-record and arecord for audio capture
  • sox for audio processing
  • Appropriate audio drivers for your microphone

AI Models & Frameworks

  • Ollama with a supported model (e.g., qwen3:latest, deepseek-r1:latest)
  • Whisper (either Python whisper package or whisper.cpp)
  • Piper TTS for text-to-speech

Dependencies

  • Python 3.10+
  • Required Python packages (see requirements.txt)

Installation

1. Clone the Repository

git clone https://github.com/yourusername/breath-assist.git
cd breath-assist

2. Initialize Submodules

This project uses submodules for AI frameworks. Initialize them:

git submodule init
git submodule update

3. Setup Ollama

Make sure Ollama is running:

ollama serve

Pull a supported model:

ollama pull qwen3:latest
# or
ollama pull deepseek-r1:latest
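Once Ollama is serving a model, the assistant can talk to it over the local HTTP API. A minimal sketch using only the standard library is shown below; it targets Ollama's `/api/generate` endpoint with the default URL from the Configuration section. The function names here are illustrative.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://127.0.0.1:11434"  # default from the Configuration section

def build_generate_request(prompt, model="qwen3:latest"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload).encode("utf-8")

def ask(prompt, model="qwen3:latest"):
    """Send a prompt to the local Ollama server and return its response text."""
    req = urllib.request.Request(
        f"{OLLAMA_BASE_URL}/api/generate",
        data=build_generate_request(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream` set to `False` the server returns one JSON object instead of a stream of chunks, which keeps the client simple.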

4. Install Dependencies

pip install -r requirements.txt

5. Configure Audio

Ensure your microphone is accessible via arecord:

arecord -l  # List available audio devices
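Beyond listing devices, it can help to verify that the required audio binaries from the Prerequisites section are on `PATH` before starting the assistant. A small preflight check, as a sketch:

```python
import shutil

# Audio tools named in the Prerequisites section.
REQUIRED_TOOLS = ["pw-record", "arecord", "sox"]

def missing_tools(tools):
    """Return the subset of command names not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing audio tools:", ", ".join(missing))
```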

Usage

Basic Usage

Start the voice assistant:

./scripts/start_assistant.sh

Or with a custom wake word:

./scripts/start_assistant.sh "hal"

The assistant will listen for the wake word followed by commands:

  • "computer, what is the time?" → Gets current time
  • "computer, what is the date?" → Gets current date
  • "computer, list files in current directory" → Lists directory contents
  • "computer, open firefox and go to google.com" → Opens Firefox to Google
  • "computer, build a web app called myblog" → Creates a web application
  • "computer, my name is Andrew" → Remembers your name (with confirmation)
  • "computer, what is my name?" → Retrieves stored name

Memory Features

The assistant can store important information with your confirmation:

  • When you say "my principles are..." it will ask "Should I remember this? Please say yes remember to confirm."
  • If you confirm, it will ask for a key like "user.principles"
  • Then it will preview "Storing user.principles = ... Say confirm to save."
  • Confirm again to store the information permanently
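The confirm-before-save flow above amounts to a small state machine: a proposed key/value pair sits in a pending slot until the user confirms. A minimal sketch, with names that are illustrative rather than the project's actual API:

```python
class MemoryStore:
    """Sketch of the two-step memory flow: propose, then confirm to save."""

    def __init__(self):
        self.saved = {}
        self.pending = None  # (key, value) awaiting confirmation

    def propose(self, key, value):
        """Stage a fact and ask the user to confirm, as in the preview step."""
        self.pending = (key, value)
        return f"Storing {key} = {value}. Say confirm to save."

    def confirm(self):
        """Persist the pending fact; nothing is saved without this step."""
        if self.pending is None:
            return "Nothing to confirm."
        key, value = self.pending
        self.saved[key] = value
        self.pending = None
        return f"Saved {key}."
```

Keeping the write behind an explicit `confirm()` call is what prevents the assistant from memorizing things the user never agreed to store.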

Available Commands

System & File Operations

  • ls, ps, top, df, du, free, uptime, uname, etc.
  • File operations: read, write, list, delete, move
  • Git commands: git status, git commit, etc.

Development

  • Build operations: Python, Node.js, Rust, Go apps
  • App creation: Web applications and more
  • Tool creation: Generate custom shell scripts

Web & Network

  • Browse to websites
  • Network operations: curl, wget, ping, etc.

Configuration

The assistant can be configured via environment variables:

  • OLLAMA_MODEL: Specify which Ollama model to use (default: qwen3:latest)
  • OLLAMA_BASE_URL: Specify Ollama server URL (default: http://127.0.0.1:11434)
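In code, these variables are typically read with the documented defaults as fallbacks:

```python
import os

# Fall back to the documented defaults when the variables are unset.
OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3:latest")
OLLAMA_BASE_URL = os.environ.get("OLLAMA_BASE_URL", "http://127.0.0.1:11434")
```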

Troubleshooting

Audio Issues

  • Make sure your microphone is properly configured
  • Check arecord -l to ensure the device is detected
  • Verify audio permissions

Ollama Issues

  • Ensure Ollama server is running with ollama serve
  • Check that your chosen model is pulled with ollama pull [model]

Performance

  • The system uses whisper.cpp for faster speech-to-text processing
  • Using smaller models (like qwen3:4b instead of 7b) can improve response times

Contributing

Contributions are welcome! Please submit a pull request or open an issue for bugs and feature requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Ollama for local LLM serving
  • Whisper.cpp for fast speech-to-text
  • Piper TTS for local text-to-speech
  • Various Python libraries that make this project possible
