Every AI SaaS sends your data to someone else's server. Not this one.
KeepAI is a privacy-first, production-ready backend that runs large language models (Llama 3, Mistral, CodeLlama, etc.) on your own infrastructure: no cloud dependency, no data leaks, no API usage fees.
"The best AI is the one that respects your privacy."
| Feature | KeepAI | OpenAI API | Other Backends |
|---|---|---|---|
| Data Privacy | 100% local | Data leaves your infra | Usually cloud |
| Cost | Free (your hardware) | Per-token billing | SaaS fees |
| Models | Any Ollama model | GPT only | Limited choices |
| Auth & RBAC | Built-in JWT + RBAC | Not included | Rarely included |
| Structured Output | JSON mode | Supported | Usually missing |
| Database | PostgreSQL | No persistence | |
| Clean Architecture | Hexagonal | N/A | |
- Local LLM Inference: Run Llama 3, Mistral, CodeLlama, DeepSeek, and 100+ models locally via Ollama
- JWT Authentication: Register, login, and token-based auth out of the box
- Role-Based Access Control: Database-driven permissions (Admin/User roles, granular permissions)
- Structured JSON Extraction: Extract invoices, contracts, and forms as validated JSON
- PostgreSQL Persistence: Async SQLAlchemy + Alembic migrations
- Docker Ready: One command to start everything
- Clean Architecture: Hexagonal/ports-and-adapters pattern, fully testable
- Observability: Structured JSON logging, request tracking
- Tested: pytest + async tests + CI pipeline
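
To make the database-driven RBAC item more concrete, here is a minimal sketch of a permission check. The role names, permission strings, and the in-memory `ROLE_PERMISSIONS` table are hypothetical stand-ins for rows that would live in PostgreSQL; this is not KeepAI's actual schema.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a role -> permissions table stored in the database.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "admin": {"prompts:create", "prompts:read_own", "prompts:read_all", "users:list"},
    "user": {"prompts:create", "prompts:read_own"},
}


@dataclass(frozen=True)
class User:
    email: str
    role: str


def has_permission(user: User, permission: str) -> bool:
    """Return True if the user's role grants the given permission."""
    return permission in ROLE_PERMISSIONS.get(user.role, set())


def require_permission(user: User, permission: str) -> None:
    """Raise PermissionError when the user's role lacks the permission."""
    if not has_permission(user, permission):
        raise PermissionError(f"{user.email} lacks permission {permission}")
```

Because the mapping is data rather than code, granting a role a new permission would be a database update, not a redeploy.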

With Docker:

    git clone https://github.com/yoosuf/KeepAI.git
    cd KeepAI
    docker compose up --build -d
    docker compose exec ollama ollama pull llama3

That's it. Your AI backend runs at http://localhost:8000, with Swagger docs at http://localhost:8000/docs.
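
The containers can take a moment to come up, so a script may want to wait before sending requests. The sketch below polls the `/health` endpoint; it assumes the endpoint answers with HTTP 200 once the app is ready, which this README does not state explicitly. The function names and retry defaults are illustrative.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_healthy(
    url: str = "http://localhost:8000/health",
    fetch: Callable[[str], int] = http_status,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Poll the health endpoint until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` after `docker compose up` returns `True` as soon as the backend responds. The `fetch` parameter exists so the loop can be tested without a running server.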

Without Docker (manual setup):

    git clone https://github.com/yoosuf/KeepAI.git
    cd KeepAI
    python -m venv venv && source venv/bin/activate
    pip install -r requirements.txt
    # Set up PostgreSQL and Ollama, then:
    ./entrypoint.sh

Once the server is running, register a user, log in, and send a prompt:

    # Register
    curl -X POST http://localhost:8000/api/v1/auth/register \
      -H "Content-Type: application/json" \
      -d '{"email": "demo@example.com", "password": "demo1234"}'

    # Login
    TOKEN=$(curl -s -X POST http://localhost:8000/api/v1/auth/login \
      -F "username=demo@example.com" \
      -F "password=demo1234" | python3 -c "import sys,json; print(json.load(sys.stdin)['access_token'])")

    # Generate AI response
    curl -X POST http://localhost:8000/api/v1/prompts \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $TOKEN" \
      -d '{"prompt_text": "Explain quantum computing in 3 sentences."}'

Extract structured data from invoices, contracts, forms, and legal documents.

    curl -X POST "http://localhost:8000/api/v1/extract-invoice?text_content=Invoice%20%23999%20from%20TechCorp.%20Date:%202026-01-15.%202%20Laptops%20at%20%241000%20each.%20Total:%20%242000." \
      -H "Authorization: Bearer $TOKEN"

Other use cases:

- Process patient records, clinical notes, and medical documents on-premises; HIPAA-friendly by design.
- Extract clauses, parties, and obligations from contracts without sending sensitive data to third parties.
- Ask questions in English and get SQL queries; the structured JSON pattern makes this straightforward.
- Run AI assistants with full data privacy for sensitive research data.
- Deploy behind your firewall with role-based access for different teams.
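
The invoice example above passes the document text as a hand-encoded `text_content` query parameter. A small helper can do that percent-encoding for arbitrary text. This is a sketch: the function names are illustrative, and only the endpoint path and parameter name come from the curl example.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE_URL = "http://localhost:8000"


def build_extract_invoice_url(text: str, base_url: str = BASE_URL) -> str:
    """Return the extract-invoice URL with the text percent-encoded."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"text_content": text}, quote_via=quote)
    return f"{base_url}/api/v1/extract-invoice?{query}"


def read_text_content(url: str) -> str:
    """Decode text_content back out of a URL, useful for checking the encoding."""
    return parse_qs(urlsplit(url).query)["text_content"][0]
```

The resulting URL can be used with the same `Authorization: Bearer $TOKEN` header as the curl call.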
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | /health | None | Health check |
| POST | /api/v1/auth/register | None | Register new user |
| POST | /api/v1/auth/login | None | Login, get JWT token |
| POST | /api/v1/prompts | JWT | Send prompt to LLM |
| GET | /api/v1/prompts | JWT | List your prompts |
| GET | /api/v1/prompts/{id} | JWT | Get prompt details |
| POST | /api/v1/extract-invoice | JWT | Extract JSON from text |
| GET | /api/v1/admin/users | Admin | List all users |
| GET | /api/v1/admin/all-prompts | Admin | List all prompts |
Full API docs: API_DOCUMENTATION.md or live at /docs (Swagger).
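
The register, login, and prompt endpoints can also be called from Python with only the standard library. The sketch below builds the requests separately from sending them. It assumes the login endpoint accepts URL-encoded form data as well as multipart forms (as FastAPI's `OAuth2PasswordRequestForm` does) and that responses are JSON; the helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"


def register_request(email: str, password: str) -> urllib.request.Request:
    """Build the JSON registration request."""
    body = json.dumps({"email": email, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/auth/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def login_request(email: str, password: str) -> urllib.request.Request:
    """Build the login request (assumption: URL-encoded form data is accepted)."""
    body = urllib.parse.urlencode({"username": email, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/auth/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def prompt_request(token: str, prompt_text: str) -> urllib.request.Request:
    """Build an authenticated prompt request carrying the JWT as a bearer token."""
    body = json.dumps({"prompt_text": prompt_text}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/prompts",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)
```

With the backend running, `token = send(login_request("demo@example.com", "demo1234"))["access_token"]` followed by `send(prompt_request(token, "Explain quantum computing in 3 sentences."))` mirrors the curl flow.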
Architecture at a glance:

    ┌─────────────────────┐
    │     KeepAI App      │
    │  Router → Service   │
    │      → Interface    │
    │         ↓           │
    │  ┌──────────────┐   │
    │  │  PostgreSQL  │   │
    │  │  (asyncpg)   │   │
    │  └──────────────┘   │
    │         ↓           │
    │  ┌──────────────┐   │
    │  │  Ollama API  │   │
    │  │  (local LLM) │   │
    │  └──────────────┘   │
    └─────────────────────┘

Clean Architecture / Hexagonal: Router (presentation) → Service (application) → LLMInterface (port) → OllamaClient (adapter).
Each layer is independently testable and swappable. To swap Ollama for OpenAI, Anthropic, or any other API, change one file.
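
The port/adapter split can be sketched as follows. `LLMInterface` and `OllamaClient` are the names used above, but the `generate` method, the `PromptService` and `EchoLLM` classes, and the synchronous style are assumptions made for brevity (the real code base is async). The adapter targets Ollama's `/api/generate` endpoint on its default port.

```python
import json
import urllib.request
from typing import Protocol


class LLMInterface(Protocol):
    """Port: the only thing the service layer knows about an LLM."""

    def generate(self, prompt: str) -> str: ...


class OllamaClient:
    """Adapter for a local Ollama server (URL and model are assumed defaults)."""

    def __init__(self, base_url: str = "http://localhost:11434", model: str = "llama3") -> None:
        self.base_url = base_url
        self.model = model

    def generate(self, prompt: str) -> str:
        payload = json.dumps(
            {"model": self.model, "prompt": prompt, "stream": False}
        ).encode()
        request = urllib.request.Request(
            f"{self.base_url}/api/generate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=120) as response:
            return json.load(response)["response"]


class EchoLLM:
    """Test double: satisfies the port without any network access."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class PromptService:
    """Application layer: depends on the port, never on a concrete adapter."""

    def __init__(self, llm: LLMInterface) -> None:
        self._llm = llm

    def answer(self, prompt_text: str) -> str:
        if not prompt_text.strip():
            raise ValueError("prompt_text must not be empty")
        return self._llm.generate(prompt_text)
```

Swapping providers then means writing one more class with a `generate` method and passing it to `PromptService`; the tests can keep using `EchoLLM`.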
- JWT Auth + RBAC
- PostgreSQL persistence
- Structured JSON extraction
- Docker Compose
- Streaming responses (SSE)
- Chat history & conversations
- Web UI (React + Monaco Editor)
- Multi-model routing
- RAG (Retrieval-Augmented Generation)
- Code generation agents
- API key management
- Rate limiting
- Usage analytics dashboard
We love contributions! Check out our Contributing Guidelines and Code of Conduct.
Ways to contribute:
- Report bugs via GitHub Issues
- Suggest features
- Improve documentation
- Submit pull requests
Found a vulnerability? Please read our Security Policy for reporting instructions.
Key security features:
- JWT tokens with configurable expiry
- bcrypt password hashing
- Database-driven RBAC
- No cloud dependencies: your data stays yours
- Non-root user in Docker
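
To illustrate what "JWT tokens with configurable expiry" means in practice, here is a standard-library sketch of HS256 signing and verification. It is for explanation only: this README does not say which JWT library KeepAI uses, the claim names and default lifetime are assumptions, and a real deployment should rely on a maintained library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding and decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def create_token(subject: str, secret: str, expires_in: int = 1800,
                 now: Optional[float] = None) -> str:
    """Create an HS256-signed token that expires after `expires_in` seconds."""
    issued = int(time.time() if now is None else now)
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": subject, "exp": issued + expires_in}).encode())
    signature = hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    # The signature is always recomputed with HS256, regardless of the header.
    expected = hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

The `now` parameter only exists so expiry can be tested deterministically; callers would normally omit it.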
Distributed under the MIT License. See LICENSE for details.
Yoosuf Mohamed (mayoosuf@gmail.com)
Project Link: https://github.com/yoosuf/KeepAI
⭐ Star this project if you find it useful! ⭐
Built with ❤️ for the open-source community