Generate a complete local AI stack in the browser — private, powerful, no cloud required.
Pick your services · Choose your models · Download a ready-to-use docker-compose.yml
🎯 What is this?
Setting up a local AI stack means juggling Ollama, Open WebUI, n8n, vector databases, and more — before a single container is even running. Finding the right images, configuring ports, wiring services together, and deciding which models fit your hardware can take hours.
ai-stack-wizard solves exactly that. Pick the services you need, select your Ollama models, enter your RAM to get tailored recommendations — and download a fully wired docker-compose.yml in seconds. No installation, no build system, no backend — a single HTML file.
✨ Features
| Feature | Description |
|---|---|
| 🧠 RAM-aware model picker | Select your RAM and get instant model recommendations |
| ✅ 12 pre-configured AI services | Ollama, n8n, Open WebUI, Flowise, LiteLLM and more |
| ✅ Ollama model auto-pull | Selected models are pulled automatically on container start |
| ✅ Pre-wired networking | All services communicate via a shared `ai-stack` Docker network |
| ✅ Live YAML preview | Syntax highlighting directly in the browser |
| ✅ One-click download | Finished docker-compose.yml ready to deploy |
| ✅ No backend required | A single HTML file — open locally or host on GitHub Pages |
| ✅ Fully configurable | Ports, paths, passwords, API keys — everything adjustable |
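To illustrate the pre-wired networking and auto-pull features, here is a minimal sketch of the kind of docker-compose.yml the wizard produces. The service names, images, ports, and volume paths below are illustrative defaults, not the wizard's exact output:

```yaml
# Illustrative sketch only; the wizard generates the real file for you.
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    networks:
      - ai-stack

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      # Services reach each other by name on the shared network
      - OLLAMA_BASE_URL=http://ollama:11434
    networks:
      - ai-stack

networks:
  ai-stack:

volumes:
  ollama-data:
```

Because every service joins the shared `ai-stack` network, containers address each other by service name (`http://ollama:11434`) instead of `localhost`.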
📦 Included Services
🦙 Core Inference — Ollama
| Service | Description |
|---|---|
| Ollama | Run large language models locally with a simple REST API. Models are pulled automatically on startup. |
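As a quick sanity check once the stack is running, Ollama's REST API can be queried directly. The default port 11434 and the model tag `llama3.2` below are assumptions; use whatever port and models you configured in the wizard:

```bash
# Generate a completion from a local model via Ollama's REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```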
💬 Chat Interface — Open WebUI, AnythingLLM
| Service | Description |
|---|---|
| Open WebUI | ChatGPT-like interface for your local Ollama models |
| AnythingLLM | All-in-one: chat, RAG pipelines, AI agents & workspaces |
⚡ Workflow Automation — n8n
| Service | Description |
|---|---|
| n8n | Automate anything with AI-powered workflows. Pre-wired to talk to Ollama directly. |
🧠 AI Agents & RAG — Flowise, Qdrant, SearXNG
| Service | Description |
|---|---|
| Flowise | Build LLM apps, RAG pipelines & AI agents with a visual editor |
| Qdrant | High-performance vector database for semantic search & RAG |
| SearXNG | Private, self-hosted web search engine for AI agents — no tracking |
🔀 AI Gateway — LiteLLM
| Service | Description |
|---|---|
| LiteLLM | Unified OpenAI-compatible API proxy for all your models — local and remote |
🎙️ Audio & Speech — faster-whisper
| Service | Description |
|---|---|
| faster-whisper | Local speech-to-text transcription API — meetings, voice notes, podcasts |
📄 Document AI — Paperless-ngx
| Service | Description |
|---|---|
| Paperless-ngx | Document management with OCR and full-text search — wired to local AI |
🧠 Ollama Model Guide
| Model | RAM | Best for |
|---|---|---|
| Llama 3.2 3B | ~8 GB | Fast, lightweight everyday tasks |
| Mistral 7B | ~8 GB | Excellent reasoning, very efficient |
| Llama 3.2 8B | ~16 GB | Best balance of quality and speed |
| Gemma 3 9B | ~16 GB | Coding & structured tasks |
| Qwen 2.5 14B | ~16 GB | Multilingual & code generation |
| DeepSeek R1 8B | ~16 GB | Math & complex reasoning |
| Phi-4 14B | ~16 GB | Microsoft's compact powerhouse |
| Llama 3.1 70B | ~48 GB | Near-GPT-4 quality |
| Nomic Embed | ~4 GB | Text embeddings for RAG pipelines |
| LLaVA 13B | ~16 GB | Vision — describe and analyze images |
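The RAM-aware picker boils down to filtering this table against your available memory. A minimal sketch in Python — the model names and thresholds simply mirror the table above and are not the wizard's actual code or data:

```python
# Approximate RAM requirements in GB, mirroring the guide above.
# Illustrative only; the wizard's real data may differ.
MODEL_RAM_GB = {
    "Llama 3.2 3B": 8,
    "Mistral 7B": 8,
    "Llama 3.2 8B": 16,
    "Gemma 3 9B": 16,
    "Qwen 2.5 14B": 16,
    "DeepSeek R1 8B": 16,
    "Phi-4 14B": 16,
    "Llama 3.1 70B": 48,
    "Nomic Embed": 4,
    "LLaVA 13B": 16,
}

def recommend(available_ram_gb: int) -> list[str]:
    """Return models whose approximate RAM requirement fits the budget."""
    return [m for m, need in MODEL_RAM_GB.items() if need <= available_ram_gb]

print(recommend(8))  # lightweight models only, e.g. Mistral 7B, Nomic Embed
```

With 8 GB you get the small models and the embedding model; Llama 3.1 70B only appears once roughly 48 GB is available.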
🚀 Getting Started
Option 1 — Open directly in the browser
```bash
# Clone the repository
git clone https://github.com/bitalchemy-io/ai-stack-wizard
cd ai-stack-wizard

# Open index.html in your browser
open index.html      # macOS
xdg-open index.html  # Linux
```
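Once you have downloaded the generated docker-compose.yml from the wizard, deployment is the usual Compose routine (the commands below assume Docker Compose v2 is installed):

```bash
# Start the whole stack in the background
docker compose up -d

# Watch Ollama pull the selected models on first start
docker compose logs -f ollama

# Tear everything down (add -v to also remove volumes)
docker compose down
```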