This repository was archived by the owner on Jul 5, 2025. It is now read-only.

FOUNDATIONAIBASED/OLLAMINI

OLLAMINI - Android AI Server App

A powerful Android application that transforms your device into a local AI server. OLLAMINI downloads AI models directly from various sources and runs them using a custom native C++ AI runner with GPU acceleration support.

🚀 Features

  • Model Management: Download and manage AI models from Hugging Face and other sources
  • Local AI Server: Run models locally with a custom C++ implementation
  • GPU Acceleration: Native GPU support for faster inference
  • Chat Interface: Interactive chat with AI models
  • Server Control: Start/stop the local AI server
  • Statistics: Monitor server performance and model usage
  • Settings: Customize app behavior and server configuration
  • Documentation: Built-in help and usage guides

🏗️ Architecture

Core Components

  1. Model Downloader: Downloads .gguf models directly from URLs
  2. Native AI Runner: Custom C++ implementation for model inference
  3. Local Server: HTTP API server for external access
  4. Android UI: Jetpack Compose interface for management

Data Flow

Internet → Model Download → Local Storage → Native AI Runner → HTTP API → External Clients

📱 Screenshots

  • Home: Server status and quick actions
  • Models: Browse, download, and manage AI models
  • Chat: Interactive conversations with AI models
  • Server: Start/stop and configure the local server
  • Statistics: Performance metrics and usage data
  • Settings: App configuration and preferences
  • Documentation: Help and usage guides

🛠️ Technical Stack

  • Language: Kotlin
  • UI Framework: Jetpack Compose
  • Database: Room with SQLite
  • Networking: Retrofit for API calls
  • Native Code: C++ with JNI for AI inference
  • Background Services: Android WorkManager
  • GPU Support: OpenCL integration

📋 Requirements

  • Android 8.0 (API 26) or higher
  • 4GB+ RAM recommended
  • 2GB+ free storage for models
  • GPU with OpenCL support (optional, for acceleration)

🚀 Quick Start

  1. Install the app from the APK
  2. Grant permissions for storage and network access
  3. Browse models in the Models tab
  4. Download a model (e.g., Llama 2 7B)
  5. Start the server from the Server tab
  6. Chat with the AI or use the HTTP API

🔧 Configuration

Model Sources

The app uses a JSON file to define available models with human-readable sizes:

[
  {
    "id": "llama2-7b",
    "name": "Llama 2 7B",
    "description": "Meta's Llama 2 7B parameter model",
    "size": "4GB",
    "download_url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf",
    "parameters": "7B",
    "model_files": {
      "model_url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf",
      "model_size": "4GB",
      "params_url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/params",
      "params_size": "1KB",
      "config_url": "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/config.json",
      "config_size": "2KB"
    }
  }
]

Top-Level Fields:

  • id: Unique identifier for the model (required)
  • name: Display name shown in the app (required)
  • description: Model description and details (optional)
  • size: Human-readable model size like "4GB", "1.5GB", "500MB" (required)
  • download_url: Main download URL for the model (required)
  • parameters: Model parameter count like "7B", "13B", "14B" (required)

Model Files (Optional):

  • model_files.model_url: Main .gguf model file URL
  • model_files.model_size: Human-readable size of the model file
  • model_files.params_url: Model parameters file URL (optional)
  • model_files.params_size: Size of params file (e.g., "1KB")
  • model_files.config_url: Model configuration file URL (optional)
  • model_files.config_size: Size of config file (e.g., "2KB")
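Given the schema above, entries can be sanity-checked before they are added to the JSON file. A minimal Python sketch (field names are taken from the example entry; the validator itself is a hypothetical helper, not code shipped with the app):

```python
import json

# Required top-level fields, per the schema described above.
REQUIRED_FIELDS = {"id", "name", "size", "download_url", "parameters"}

def validate_model_entry(entry: dict) -> list:
    """Return a list of problems with a model entry (empty list = valid)."""
    problems = [f"missing required field: {f}"
                for f in sorted(REQUIRED_FIELDS - entry.keys())]
    files = entry.get("model_files")
    # If the optional model_files block is present, it should at least
    # point at the main .gguf file.
    if files is not None and "model_url" not in files:
        problems.append("model_files present but missing model_url")
    return problems

entry = json.loads("""
{
  "id": "llama2-7b",
  "name": "Llama 2 7B",
  "size": "4GB",
  "download_url": "https://example.com/model.gguf",
  "parameters": "7B"
}
""")
print(validate_model_entry(entry))  # []
```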

Size Format Support:

  • Bytes: "1024B"
  • Kilobytes: "1KB", "1.5KB"
  • Megabytes: "500MB", "1.5MB"
  • Gigabytes: "4GB", "1.5GB"
  • Terabytes: "1TB"
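The human-readable sizes above can be converted to byte counts with a small parser. A sketch (assuming binary, 1024-based multipliers; the app's actual convention is not documented):

```python
import re

# Binary (1024-based) multipliers — an assumption; the app may use
# decimal (1000-based) units instead.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def parse_size(text: str) -> int:
    """Parse a human-readable size like '4GB' or '1.5KB' into bytes."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(B|KB|MB|GB|TB)", text.strip().upper())
    if not m:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = m.groups()
    return int(float(value) * _UNITS[unit])

print(parse_size("1.5KB"))  # 1536
```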

Server Settings

  • Port: Default 8080 (configurable)
  • Host: 0.0.0.0 (listens on all interfaces, so the server is reachable from other devices on the local network)
  • API Endpoints: RESTful interface for model interaction

🔌 API Usage

Once the server is running, you can interact with models via HTTP (use the device's IP address in place of localhost when calling from another machine):

# Generate text
curl -X POST http://localhost:8080/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2-7b", "prompt": "Hello, how are you?"}'

# Chat conversation
curl -X POST http://localhost:8080/chat \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2-7b", "message": "Tell me a joke"}'
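The same calls can be made from any HTTP client. A minimal Python sketch using only the standard library (the endpoint paths and request fields are taken from the curl examples above; the response format is an assumption, since the server's JSON shape is not documented):

```python
import json
import urllib.request

# Replace localhost with the device's IP when calling from another machine.
SERVER = "http://localhost:8080"

def post_json(path: str, payload: dict) -> bytes:
    """POST a JSON payload to the OLLAMINI server; return the raw response body."""
    req = urllib.request.Request(
        SERVER + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Generate text (uncomment with the server running):
# post_json("/generate", {"model": "llama2-7b", "prompt": "Hello, how are you?"})
# Chat conversation:
# post_json("/chat", {"model": "llama2-7b", "message": "Tell me a joke"})
```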

🎯 Use Cases

  • Personal AI Assistant: Run AI models locally for privacy
  • Development Testing: Test AI integrations without cloud costs
  • Offline AI: Use AI capabilities without internet connection
  • Educational: Learn about AI models and inference
  • Prototyping: Quick AI model testing and experimentation

🔒 Privacy & Security

  • Local Processing: All AI inference happens on your device
  • No Cloud Dependencies: Models run entirely locally
  • Data Privacy: Your conversations stay on your device
  • Network Control: Choose which devices can access your AI server

🛡️ Permissions

  • Storage: Download and store AI models
  • Network: Access model repositories and serve HTTP API
  • WiFi State: Detect network configuration for server access

📊 Performance

  • Model Loading: Optimized for Android devices
  • Memory Management: Efficient RAM usage for large models
  • GPU Acceleration: Optional OpenCL support for faster inference
  • Background Processing: Non-blocking model operations

🔄 Updates

  • Model Updates: Automatic version checking
  • App Updates: Regular feature and security updates
  • Community Models: Easy addition of new model sources

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Hugging Face: Model hosting and distribution
  • TheBloke: GGUF model conversions
  • Meta: Llama 2 models
  • Mistral AI: Mistral models
  • Microsoft: Phi and Orca models

📞 Support

  • Issues: Report bugs on GitHub
  • Discussions: Ask questions in GitHub Discussions
  • Documentation: Check the in-app help section

OLLAMINI - Your personal AI server, powered by local inference.

About

PROJECT IS ON HOLD
