A local AI chat application built for the Gemma 3n Challenge, featuring advanced conversation management, multiple backend support, and specialized AI tutoring capabilities.
This project is developed for the Gemma 3n Challenge, showcasing innovative applications of Google's Gemma models in a local environment with advanced features like:
- Multi-modal AI interactions
- Persistent conversation management
- Speech-to-text integration
- AI-powered tutoring system
- Model Context Protocol (MCP) implementation
- Multiple Backend Support: Ollama and LlamaCpp integration
- Gemma Model Optimization: Specialized configurations for Gemma models
- Model Context Protocol: Advanced AI model communication
- Speech Integration: Voice input/output for natural interactions
- AI Tutoring: Intelligent tutoring system with conversation context
- Persistent Chat History: SQLite-based conversation storage
- RESTful API: Full programmatic access
- Web Interface: Clean, responsive chat UI
- Process Management: Robust application lifecycle handling
- Comprehensive Logging: Detailed system monitoring
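The persistent chat history feature can be illustrated with a minimal sketch of an SQLite-backed message store. The schema and function names here are illustrative only, not the app's actual persistence layer:

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a conversation store. Schema is a sketch only."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               conversation_id TEXT,
               role TEXT,            -- 'user' or 'model'
               content TEXT,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn

def save_message(conn, conversation_id, role, content):
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()

def load_history(conn, conversation_id):
    """Return the conversation as a list of role/content dicts, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY rowid",
        (conversation_id, ),
    )
    return [{"role": r, "content": c} for r, c in rows]

conn = open_store()
save_message(conn, "demo", "user", "Hello, Gemma!")
save_message(conn, "demo", "model", "Hi! How can I help?")
print(load_history(conn, "demo"))
```

Because SQLite ships with Python's standard library, this approach keeps all conversation data in a single local file with no extra dependencies.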
Choose your preferred backend:
# Full installation (recommended)
./install.sh
# Ollama backend only
./install-ollama.sh
# LlamaCpp backend
./install-llamacpp.sh

Download and configure Gemma models:
# Download GGUF models
./download-gguf.sh
# Check available models
cat models.txt

Edit config.json for your setup:
- Model selection
- Backend preferences
- API endpoints
- Speech settings
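A configuration loader for these settings could look like the sketch below. The keys and defaults are assumptions for illustration; the real config.json may use different names:

```python
import json
from pathlib import Path

# Illustrative defaults only -- the actual config.json keys may differ.
DEFAULTS = {
    "backend": "ollama",                      # or "llamacpp"
    "model": "gemma-3n",
    "api_endpoint": "http://localhost:11434",
    "speech": {"enabled": False},
}

def load_config(path="config.json"):
    """Merge user settings over defaults; a missing file falls back to defaults."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg
```

Merging over defaults means a user only needs to specify the settings they want to change.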
python app.py

Project structure:

├── app.py               # Main Flask application
├── api/                 # Core API modules
│   ├── backend.py       # AI backend management
│   ├── chat.py          # Chat functionality
│   ├── conversations.py # Conversation persistence
│   ├── mcp.py           # Model Context Protocol
│   ├── speech.py        # Speech integration
│   └── tutoring.py      # AI tutoring system
├── config/              # Configuration management
├── services/            # Business logic services
├── utils/               # Utility functions
├── persistence/         # Data layer
├── static/              # Web assets
├── templates/           # HTML templates
├── logs/                # Application logs
└── llama_models/        # Local model storage
This application is optimized for Gemma models with:
- Custom prompt templates tailored for Gemma's instruction format
- Context window optimization for longer conversations
- Parameter tuning specific to Gemma's architecture
- Memory-efficient inference for local deployment
See models.txt for the complete list of supported Gemma variants.
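As a sketch of what a Gemma-tailored prompt template involves: Gemma's published instruction format wraps each turn in `<start_of_turn>`/`<end_of_turn>` markers and uses the roles `user` and `model` (not `assistant`). The helper below is illustrative; verify the exact markers against your model's tokenizer:

```python
def format_gemma_prompt(messages):
    """Render a list of {'role', 'content'} dicts in Gemma's turn format."""
    parts = []
    for msg in messages:
        # Gemma uses 'user' and 'model' roles, so map 'assistant' to 'model'.
        role = "model" if msg["role"] in ("assistant", "model") else "user"
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(parts)

prompt = format_gemma_prompt([{"role": "user", "content": "Explain recursion."}])
```

The trailing open `model` turn is what prompts the model to generate its reply.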
The tutoring.py module implements:
- Personalized learning paths
- Knowledge gap identification
- Interactive problem-solving
- Progress tracking
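One way knowledge-gap identification and progress tracking can work is sketched below: record per-topic answer accuracy and flag topics below a mastery threshold. This is a hypothetical illustration, not the actual tutoring.py logic:

```python
from dataclasses import dataclass, field

@dataclass
class LearnerProfile:
    """Illustrative progress tracker: topic -> (correct_count, total_attempts)."""
    attempts: dict = field(default_factory=dict)

    def record(self, topic, correct):
        c, t = self.attempts.get(topic, (0, 0))
        self.attempts[topic] = (c + int(correct), t + 1)

    def knowledge_gaps(self, threshold=0.7):
        # Topics whose accuracy falls below the mastery threshold.
        return [topic for topic, (c, t) in self.attempts.items() if c / t < threshold]
```

A tutoring loop can then steer its next questions toward whatever `knowledge_gaps()` returns.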
The chat-manager.sh script provides:
- Conversation export/import
- Chat analytics
- Model performance comparison
- Batch processing capabilities
The mcp.py implementation enables:
- Dynamic context management
- Multi-turn conversation optimization
- Memory-efficient long conversations
- Context-aware responses
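The core idea behind memory-efficient long conversations can be sketched as a token-budgeted context trimmer: keep the newest turns that fit the budget and drop the oldest. Token counts are approximated here by whitespace splitting; the real mcp.py may work differently:

```python
def trim_context(messages, max_tokens=2048):
    """Keep the most recent messages whose total (approximate) token
    count fits within max_tokens, preserving chronological order."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest turns first
        cost = len(msg["content"].split())
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order
```

A production version would use the model tokenizer's real counts and might summarize dropped turns instead of discarding them outright.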
- Python 3.8+
- CUDA-compatible GPU (recommended)
- 8GB+ RAM for model inference
pip install -r requirements.txt

# Run with different backends
python app.py --backend ollama
python app.py --backend llamacpp

The application includes several optimizations for Gemma models:
- Dynamic batching for improved throughput
- Memory pooling for efficient GPU usage
- Context caching for faster responses
- Model quantization support (GGUF format)
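Context caching, in its simplest form, amounts to memoizing inference so identical prompts skip recomputation. The sketch below uses `functools.lru_cache`; `run_model` is a hypothetical stand-in for the real backend call, not part of this project:

```python
from functools import lru_cache

CALLS = {"n": 0}  # counts how often the (stand-in) model actually runs

def run_model(prompt):
    """Hypothetical stand-in for the real Ollama/LlamaCpp backend call."""
    CALLS["n"] += 1
    return f"response to: {prompt}"

@lru_cache(maxsize=128)
def cached_generate(prompt):
    # Repeated identical prompts are served from the cache.
    return run_model(prompt)
```

A real context cache would likely key on a prompt prefix (the shared conversation history) rather than the full prompt, but the caching principle is the same.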
- Local-first approach: All processing happens on your machine
- No external API calls for model inference
- Secure conversation storage with SQLite
- API key management through secure configuration
- Research Reference: See learn_lm_paper.pdf for technical background
- API Documentation: Available in the docs/ directory
- Change Log: See CHANGELOG.md for version history
This project demonstrates:
- Innovation: Novel application of Gemma models in education
- Technical Excellence: Robust, scalable architecture
- User Experience: Intuitive chat interface with voice support
- Performance: Optimized local inference pipeline
This project is licensed under the terms specified in LICENSE.
This is a challenge submission project. For questions or collaboration opportunities, please refer to the challenge guidelines.
Built with ❤️ for the Gemma 3n Challenge