A 100% local, voice-activated AI assistant that lets you speak to your computer to execute commands: building apps, system administration, and more. Similar to HAL from 2001: A Space Odyssey, but using only local resources.
- 100% Local Processing: No cloud services required
- Voice Activation: Listen for wake word and respond to voice commands
- Voice Activity Detection: Automatically detects when you stop speaking
- Conversational AI: Natural language interaction with local models
- System Administration: Execute safe system commands
- Development Tools: Build applications based on verbal instructions
- File Operations: Read, write, list, and manage files
- Git Integration: Perform git operations
- Network Operations: Safe network commands
- Tool Creation: Create new tools as needed
- Web Browsing: Open browsers to specific URLs
- Time & Date: Provide accurate system time and date
- Memory System: Remember important information with user confirmation
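As an illustration of the voice activity detection feature, a minimal approach is an RMS energy threshold with a silence timeout. This is a sketch under assumed frame size and threshold values, not this project's actual implementation:

```python
import array
import math

def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Return True if the RMS energy of a 16-bit PCM frame exceeds the threshold."""
    samples = array.array("h", frame)
    if not samples:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold

def utterance_ended(frames: list, silence_frames: int = 20) -> bool:
    """Treat speech as finished after `silence_frames` consecutive quiet frames."""
    tail = frames[-silence_frames:]
    return len(tail) == silence_frames and not any(is_speech(f) for f in tail)
```

Real VADs (e.g. WebRTC VAD or Silero) are far more robust to noise; an energy threshold just conveys the idea of "detect when you stop speaking."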
```
┌─────────────┐
│ Microphone  │
└──────┬──────┘
       ▼
┌─────────────┐
│  pw-record  │  (PipeWire)
└──────┬──────┘
       ▼
┌─────────────┐
│   Whisper   │  (STT)
└──────┬──────┘
       ▼
┌───────────────────────────┐
│ Intent / Reasoning Model  │  (Ollama: qwen3 / deepseek / llama)
└──────┬────────────────────┘
       ▼
┌───────────────────────────┐
│ Tool Router (YOURS)       │  ← THIS IS THE CORE
│  - bash                   │
│  - fs                     │
│  - git                    │
│  - build                  │
│  - network                │
│  - self-create tools      │
└──────┬────────────────────┘
       ▼
┌─────────────┐
│  Piper TTS  │
└─────────────┘
```
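The pipeline above can be sketched as a single loop with pluggable stages. The stubs here stand in for the real pw-record, Whisper, Ollama, and Piper integrations; this is an architectural sketch, not the project's code:

```python
from typing import Callable

def assistant_loop(record: Callable[[], bytes],
                   transcribe: Callable[[bytes], str],
                   reason: Callable[[str], str],
                   route: Callable[[str], str],
                   speak: Callable[[str], None],
                   turns: int = 1) -> None:
    """One pass per turn: capture audio, transcribe, pick an action, run it, speak the result."""
    for _ in range(turns):
        audio = record()          # pw-record in the real system
        text = transcribe(audio)  # Whisper STT
        action = reason(text)     # Ollama intent/reasoning model
        result = route(action)    # tool router: bash, fs, git, build, ...
        speak(result)             # Piper TTS
```

Injecting each stage as a callable keeps the loop testable without a microphone or a running model server.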
You need the following system components installed:
- `pw-record` and `arecord` for audio capture
- `sox` for audio processing
- Appropriate audio drivers for your microphone
- Ollama with a supported model (e.g., `qwen3:latest`, `deepseek-r1:latest`)
- Whisper (either the Python `whisper` package or whisper.cpp)
- Piper TTS for text-to-speech
- Python 3.10+
- Required Python packages (see requirements.txt)
Clone the repository:

```sh
git clone https://github.com/yourusername/breath-assist.git
cd breath-assist
```

This project uses submodules for AI frameworks. Initialize them:

```sh
git submodule init
git submodule update
```

Make sure Ollama is running:

```sh
ollama serve
```

Pull a supported model:

```sh
ollama pull qwen3:latest
# or
ollama pull deepseek-r1:latest
```

Install the Python dependencies:

```sh
pip install -r requirements.txt
```

Ensure your microphone is accessible via arecord:

```sh
arecord -l   # List available audio devices
```

Start the voice assistant:

```sh
./scripts/start_assistant.sh
```

Or with a custom wake word:

```sh
./scripts/start_assistant.sh "hal"
```

The assistant will listen for the wake word followed by commands:
- "computer, what is the time?" → Gets current time
- "computer, what is the date?" → Gets current date
- "computer, list files in current directory" → Lists directory contents
- "computer, open firefox and go to google.com" → Opens Firefox to Google
- "computer, build a web app called myblog" → Creates a web application
- "computer, my name is Andrew" → Remembers your name (with confirmation)
- "computer, what is my name?" → Retrieves stored name
The assistant can store important information with your confirmation:
- When you say "my principles are..." it will ask "Should I remember this? Please say yes remember to confirm."
- If you confirm, it will ask for a key like "user.principles"
- Then it will preview "Storing user.principles = ... Say confirm to save."
- Confirm again to store the information permanently
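The two-step confirmation flow above can be sketched as a tiny state machine. This is illustrative only: the prompts are paraphrased from the steps above, and the real assistant presumably persists memory to disk rather than an in-process dict:

```python
class MemoryFlow:
    """Sketch of the confirm-twice memory flow: 'yes remember', then 'confirm'."""

    def __init__(self):
        self.state = "idle"
        self.pending = None   # (key, value) awaiting confirmation
        self.store = {}       # stand-in for persistent memory

    def offer(self, key: str, value: str) -> str:
        self.state = "awaiting_remember"
        self.pending = (key, value)
        return "Should I remember this? Please say yes remember to confirm."

    def hear(self, reply: str) -> str:
        reply = reply.strip().lower()
        if self.state == "awaiting_remember" and reply == "yes remember":
            self.state = "awaiting_confirm"
            key, value = self.pending
            return f"Storing {key} = {value}. Say confirm to save."
        if self.state == "awaiting_confirm" and reply == "confirm":
            key, value = self.pending
            self.store[key] = value
            self.state = "idle"
            return f"Saved {key}."
        # Anything else cancels the pending write.
        self.state = "idle"
        return "Cancelled."
```

Requiring an explicit second confirmation after the preview means a misheard phrase can never silently write to memory.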
- System commands: `ls`, `ps`, `top`, `df`, `du`, `free`, `uptime`, `uname`, etc.
- File operations: read, write, list, delete, move
- Git commands: `git status`, `git commit`, etc.
- Build operations: Python, Node.js, Rust, Go apps
- App creation: Web applications and more
- Tool creation: Generate custom shell scripts
- Browse to websites
- Network operations: `curl`, `wget`, `ping`, etc.
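One way to enforce a safe-command policy like the one above is a first-token allowlist. The specific list and checks here are illustrative, not this project's actual policy:

```python
import shlex

# Illustrative allowlist drawn from the commands listed above.
SAFE_COMMANDS = {"ls", "ps", "top", "df", "du", "free", "uptime", "uname",
                 "git", "curl", "wget", "ping"}

def is_allowed(command_line: str) -> bool:
    """Allow only commands whose first token is allowlisted; reject shell metacharacters."""
    if any(ch in command_line for ch in ";|&$`"):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(tokens) and tokens[0] in SAFE_COMMANDS
```

Rejecting metacharacters outright blocks chaining tricks like `ls; rm -rf /` that would otherwise slip past a first-token check.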
The assistant can be configured via environment variables:
- `OLLAMA_MODEL`: Specify which Ollama model to use (default: `qwen3:latest`)
- `OLLAMA_BASE_URL`: Specify Ollama server URL (default: `http://127.0.0.1:11434`)
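Reading those variables with their documented fallbacks might look like this; `load_config` is a hypothetical helper, not this project's API:

```python
import os

def load_config(env=os.environ) -> dict:
    """Read assistant settings, falling back to the README's documented defaults."""
    return {
        "model": env.get("OLLAMA_MODEL", "qwen3:latest"),
        "base_url": env.get("OLLAMA_BASE_URL", "http://127.0.0.1:11434"),
    }
```

Passing the environment mapping as a parameter keeps the function easy to test with a plain dict.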
- Make sure your microphone is properly configured
- Check `arecord -l` to ensure the device is detected
- Verify audio permissions
- Ensure the Ollama server is running with `ollama serve`
- Check that your chosen model is pulled with `ollama pull [model]`
- The system uses whisper.cpp for faster speech-to-text processing
- Using smaller models (like qwen3:4b instead of 7b) can improve response times
Contributions are welcome! Please submit a pull request or open an issue for bugs and feature requests.
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama for local LLM serving
- Whisper.cpp for fast speech-to-text
- Piper TTS for local text-to-speech
- Various Python libraries that make this project possible