A highly modular, production-ready voice assistant that combines speech recognition, Retrieval-Augmented Generation (RAG), and natural language processing for intelligent conversations.
The monolithic v8/v9 codebase has been refactored into focused, reusable modules:
- `rag_system.py` - Complete RAG pipeline
  - `KnowledgeBase` - Document management
  - `EmbeddingManager` - OpenAI embeddings
  - `FAISSIndex` - Vector similarity search
  - `RAGRetriever` - Context retrieval
  - `RAGSystem` - Main orchestrator
- `prompt_manager.py` - Intelligent prompt generation
  - `SystemPrompt` - Core assistant personality
  - `PromptTemplate` - Reusable prompt templates
  - `PromptGenerator` - Context-aware prompt generation
  - `ConversationManager` - Chat history management
  - `RAGPromptBuilder` - RAG-specific prompt construction
- `kb_manager.py` - Knowledge base utilities
  - `KnowledgeBaseManager` - File and chunk management
  - `ContentOrganizer` - Section-based organization
- `S2S_v10.py` - Refactored voice assistant
  - Clean, modular design
  - Dependency injection
  - Better error handling
- Wake word detection ("Hey Bruce") via Porcupine
- Real-time speech recognition (Google Speech-to-Text)
- Natural text-to-speech output (pyttsx3)
- Vector-based retrieval: FAISS for fast similarity search
- Smart confidence levels: HIGH (≥0.85), MEDIUM (≥0.75), LOW (<0.75)
- Context-aware responses: Different prompts for different confidence levels
- Fallback to general knowledge: Seamless degradation when no relevant docs found
- Conversation memory: Maintains context across exchanges
- Special commands: Time, date, day queries answered instantly
- Conversation control: Natural goodbye/exit detection
- Timeout management: 60-second idle timeout before disconnect
- Comprehensive logging
- Error handling and recovery
- Resource cleanup
- Configuration management
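The conversation-control feature above (natural goodbye/exit detection) can be sketched as a simple phrase check. This is an illustrative sketch only; the phrase list and the actual detection logic in `S2S_v10.py` may differ:

```python
# Hypothetical sketch of goodbye/exit detection; the real phrase list
# and matching logic in S2S_v10.py may differ.
GOODBYE_PHRASES = {"goodbye", "bye", "see you", "exit", "quit", "that's all"}

def is_goodbye(text: str) -> bool:
    """Return True if the user's utterance signals the end of the conversation."""
    normalized = text.lower().strip().rstrip("!.?")
    return any(phrase in normalized for phrase in GOODBYE_PHRASES)
```

A substring check like this keeps detection forgiving of transcription noise ("okay goodbye then" still matches).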
```
STS_Robot/
├── S2S_v10.py                       # Main modular assistant
├── rag_system.py                    # Complete RAG pipeline
├── prompt_manager.py                # Prompt generation system
├── kb_manager.py                    # Knowledge base utilities
│
├── S2S_v8.py / S2S_v9.py            # Legacy versions (reference)
├── .env                             # Environment variables (API keys)
├── requirements.txt                 # Python dependencies
│
├── info.txt                         # Knowledge base (your content)
├── embeddings.npy                   # Pre-computed document embeddings
├── faiss_index.idx                  # FAISS vector index
├── doc_chunks.pkl                   # Pickled document chunks
│
├── Hey-Bruce_en_windows_v3_0_0.ppn  # Porcupine wake word model
└── README.md                        # This file
```
```shell
pip install -r requirements.txt
```

Create a `.env` file in the project directory:

```
PORCUPINE_API_KEY=your_porcupine_api_key_here
GROQ_API_KEY=your_groq_api_key_here
OPEN_API=your_openai_api_key_here
```

Get API Keys:
Create or update `info.txt` with your content:

```
Your knowledge base content here...
Topics, facts, information that Bruce should know about.
```

Optionally organize with sections:

```
--- History ---
Your history content...
--- Technology ---
Your tech content...
--- General ---
Other information...
```
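Section markers of the form `--- Name ---` can be parsed with a small helper like the following. This is an illustrative sketch; `ContentOrganizer` in `kb_manager.py` may implement section handling differently:

```python
import re

def parse_sections(text: str) -> dict:
    """Split knowledge-base text into {section_name: content} using
    '--- Name ---' marker lines. Content before any marker is filed
    under 'General' (an assumption of this sketch)."""
    sections = {}
    current = "General"
    buffer = []
    for line in text.splitlines():
        match = re.fullmatch(r"---\s*(.+?)\s*---", line.strip())
        if match:
            if buffer:
                sections[current] = "\n".join(buffer).strip()
            current = match.group(1)
            buffer = []
        else:
            buffer.append(line)
    if buffer:
        sections[current] = "\n".join(buffer).strip()
    return sections
```

Keeping each section on one topic helps retrieval, since chunks then stay topically coherent.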
```shell
python S2S_v10.py
```

The assistant will:
- Load or create the RAG index on first run
- Listen for "Hey Bruce"
- Start conversation when wake word detected
- Continue until you say goodbye or timeout occurs
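The load-or-create step in the first item boils down to a cache check: building embeddings is slow, so the index is persisted and reused. A minimal sketch of that decision (function and return values are illustrative, not the actual `S2S_v10.py` API):

```python
import os

def load_or_build_index(index_path="faiss_index.idx"):
    """Sketch of first-run logic: reuse a cached FAISS index when present,
    otherwise build (and persist) a new one. Returns which path was taken."""
    if os.path.exists(index_path):
        # Fast path: a previous run already saved the index
        return "loaded"
    # Slow path: chunk info.txt, embed each chunk, build and save the index
    return "built"
```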
```
User Question
     ↓
Generate Embedding (OpenAI)
     ↓
FAISS Similarity Search (k=3)
     ↓
Calculate Confidence Score
     ↓
Select Prompt Strategy (HIGH/MEDIUM/LOW)
     ↓
Build Context-Aware Prompt
     ↓
LLM Response (Groq Gemma2-9B)
     ↓
Add to Conversation History
     ↓
Text-to-Speech Output
```
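The flow above can be expressed as a straight-line function over the components. Here each argument is a callable stub standing in for a real service (OpenAI embeddings, FAISS, Groq, pyttsx3); the names and signatures are illustrative, not the actual `S2S_v10.py` interfaces:

```python
def answer_question(question, embed, search, llm, history, speak):
    """Sketch of one turn of the query flow; callables stand in for
    the real components so the pipeline shape stays visible."""
    query_embedding = embed(question)              # Generate embedding (OpenAI)
    chunks, score = search(query_embedding, k=3)   # FAISS similarity search
    # Select the prompt strategy from the confidence score
    if score >= 0.85:
        context = chunks[:2]                       # HIGH: top 2 relevant chunks
    elif score >= 0.75:
        context = chunks[:1]                       # MEDIUM: top chunk only
    else:
        context = []                               # LOW: general knowledge
    prompt = "\n".join(context + [question])       # Build context-aware prompt
    reply = llm(prompt, history)                   # LLM response (Groq Gemma2-9B)
    history.append(("user", question))             # Add to conversation history
    history.append(("assistant", reply))
    speak(reply)                                   # Text-to-speech output
    return reply
```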
| Confidence | Threshold | Strategy | Use Case |
|---|---|---|---|
| HIGH | ≥0.85 | Use top 2 relevant chunks | Factual queries with good matches |
| MEDIUM | ≥0.75 | Use top chunk with uncertainty note | Partial matches |
| LOW | <0.75 | General knowledge only | Novel questions |
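The thresholds in the table map directly to a small selector; a sketch consistent with the values above (the actual method lives in `rag_system.py` as `get_confidence_level`):

```python
def confidence_level(score: float) -> str:
    """Map a similarity score to the prompt strategy tier from the table."""
    if score >= 0.85:
        return "HIGH"    # use top 2 relevant chunks
    if score >= 0.75:
        return "MEDIUM"  # top chunk, with an uncertainty note
    return "LOW"         # fall back to general knowledge
```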
You: "Hey Bruce!"
Bruce: "Hi, I'm Bruce. How can I help you today?"
You: "What time is it?"
Bruce: "The current time is 02:30 PM on Monday, January 6, 2025."
You: "Tell me about [topic from your knowledge base]"
Bruce: [Retrieves relevant information and provides answer]
You: "What's 2+2?"
Bruce: [Uses general knowledge since no match in KB]
You: "Goodbye"
Bruce: "Goodbye! Say 'Hey Bruce' if you need me again."
Edit `prompt_manager.py` → `PromptGenerator.TEMPLATES`:

```python
TEMPLATES = {
    "HIGH": PromptTemplate(
        "high_confidence",
        """Your custom prompt here...""",
    ),
    # ... more templates
}
```

Edit `prompt_manager.py` → `SystemPrompt.SYSTEM_CONTENT`:

```python
SYSTEM_CONTENT = """You are [your custom personality]..."""
```

Option 1: Direct file edit

```shell
vim info.txt  # Edit directly
```

Option 2: Programmatic update

```python
from kb_manager import KnowledgeBaseManager

manager = KnowledgeBaseManager()
manager.append_to_knowledge_base("New information here...")
```

In `S2S_v10.py` → `_initialize_rag()`:
```python
self.rag_system = RAGSystem(
    kb_path="info.txt",           # Change if needed
    faiss_path="faiss_index.idx",
)
```

Or in `rag_system.py` → `RAGRetriever.retrieve()`:

```python
def retrieve(self, query_embedding, k=5):  # Change k to retrieve more/fewer chunks
    ...
```

To check knowledge base statistics:

```python
from kb_manager import KnowledgeBaseManager

manager = KnowledgeBaseManager()
manager.display_stats()
```

Output:
```
==================================================
Knowledge Base Statistics
==================================================
Exists: True
Size: 25.34 KB
Characters: 25,948
Words: 4,287
Lines: 182
==================================================
```
To test retrieval directly:

```python
from rag_system import RAGSystem

rag = RAGSystem()
rag.initialize()
results = rag.retrieve_context("test query", k=3)
for chunk, score in results:
    print(f"Score: {score:.3f} | {chunk[:100]}...")
```

The system logs to console with timestamps:
```
2025-01-06 14:30:45 - INFO - All required environment variables present
2025-01-06 14:30:46 - INFO - All components initialized successfully
2025-01-06 14:30:47 - INFO - Voice Assistant started. Listening for 'Hey Bruce'...
```
- Never commit the `.env` file to version control
- Use strong API keys from official providers
- Regenerate keys if exposed
- First run will take longer (RAG indexing)
- Subsequent runs use cached index
- Keep knowledge base organized for better retrieval
- Monitor API usage for cost
- Regularly update knowledge base
- Test new prompts before deployment
- Review logs for errors
- Keep dependencies updated
- Create the `.env` file with the required keys
- Verify key names match exactly
- Ensure microphone is working
- Reduce background noise
- Check the `recognizer` settings in `S2S_v10.py`
- Check internet connection (API calls)
- Verify GPU availability if using local embeddings
- Review knowledge base size
- Verify `info.txt` exists and has content
- Check that embeddings were generated
- Review similarity scores in logs
```python
from rag_system import RAGSystem

rag = RAGSystem()
rag.initialize()
results = rag.retrieve_context(query="your question", k=3)
confidence = rag.get_confidence_level(score=0.82)
```

```python
from prompt_manager import PromptGenerator

prompt = PromptGenerator.generate(
    confidence_level="HIGH",
    question="What is X?",
    context="Answer: X is...",
)
```

```python
from prompt_manager import ConversationManager

conv = ConversationManager(max_history=4)
conv.add_message("user", "Hello")
conv.add_message("assistant", "Hi there!")
history = conv.get_history()
conv.clear()
```

- Multi-language support
- Local embedding model (eliminate OpenAI dependency)
- Web UI dashboard
- Persistent conversation logging
- User-specific knowledge bases
- Advanced RAG with re-ranking
- Voice activity detection
- Custom wake word training
[Your License Here]
- Picovoice Porcupine: Wake word detection
- OpenAI: Embeddings and API
- Groq: Fast LLM inference
- FAISS: Vector similarity search
- LangChain: Text splitting utilities
For issues, questions, or suggestions:
- Check the Troubleshooting section
- Review logs for error messages
- Verify environment configuration
- Test individual components in isolation
Made with ❤️ - Your Intelligent Voice Assistant