A local AI server written in Rust. Kronk provides an OpenAI-compatible API on a single port, automatically managing backend lifecycles — starting models on demand, routing requests, and unloading idle models to save resources.
Think of it as your own local Ollama or LM Studio server, but for llama.cpp and ik_llama backends.
Tip
Get up and running: kronk model pull bartowski/OmniCoder-8B-GGUF && kronk serve
Windows: Download the installer from Releases, or:
cargo install --git https://github.com/danielcherubini/kronk kronkLinux (Debian/Ubuntu):
sudo dpkg -i kronk_*.debLinux (Fedora/RHEL):
sudo rpm -i kronk-*.rpmkronk model pull bartowski/OmniCoder-8B-GGUFKronk downloads all available quants, detects your GPU VRAM, and suggests optimal context sizes.
kronk serveThat's it. Kronk starts an OpenAI-compatible server on http://localhost:11434. When a request comes in for a model, Kronk automatically starts the right backend, waits for it to be ready, and forwards the request.
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "bartowski/OmniCoder-8B-GGUF",
"messages": [{"role": "user", "content": "Hello!"}]
}'Models are unloaded after 5 minutes of inactivity (configurable with --idle-timeout).
# Install and start (run as admin / sudo)
kronk service install
kronk service start
# After that, no admin needed
kronk service stop
kronk service start
kronk statusNote
For debugging individual backends, you can still use kronk run <server-name> to run a single server in the foreground.
kronk serve [--host H] [--port P] [--idle-timeout S] Start the server
kronk status Show status of all servers
kronk service install Install as a system service
kronk service start Start the service
kronk service stop Stop the service
kronk service remove Remove the service
kronk model pull <repo> Pull a model from HuggingFace
kronk model ls List installed models
kronk model ps Show running model processes
kronk model create <name> Create a server from an installed model
kronk model rm <model> Remove an installed model
kronk model scan Scan for untracked GGUF files
kronk model search <query> Search HuggingFace for GGUF models
kronk config show Print the current configuration
kronk config edit Open config file in editor
kronk config path Show the config file path
kronk logs [name] View logs (defaults to proxy logs)
kronk run <name> [--ctx N] Run a single backend (for debugging)
Kronk manages LLM backend installations (llama.cpp, ik_llama) with automatic version tracking and updates:
kronk backend install llama_cpp # Install latest llama.cpp
kronk backend install ik_llama # Install latest ik_llama (builds from source)
kronk backend install llama_cpp --version b8407 # Install specific version
kronk backend install llama_cpp --build # Force build from source
kronk backend update <name> # Update to latest version
kronk backend list # List installed backends
kronk backend remove <name> # Remove a backend
kronk backend check-updates # Check for updates- llama.cpp: Downloads pre-built binaries for your platform, or builds from source with GPU support
- ik_llama: Always builds from source (no pre-built binaries available)
- Linux/macOS: Backends in
~/.config/kronk/backends/ - Windows: Backends in
%APPDATA%\kronk\backends\ - Version tracking in
~/.config/kronk/backend_registry.toml(Linux/macOS) or%APPDATA%\kronk\backend_registry.toml(Windows)
The installer detects your GPU and prompts you to select acceleration:
- CUDA (NVIDIA) — CUDA cores for faster inference
- Vulkan (AMD/Intel/NVIDIA) — Cross-platform GPU acceleration
- Metal (Apple Silicon) — macOS GPU acceleration
- ROCm (AMD) — AMD GPU support on Linux
- CPU — Fallback when no GPU is available
Kronk auto-generates a config on first run:
- Windows:
%APPDATA%\kronk\config\config.toml - Linux:
~/.config/kronk/config.toml
[backends.llama_cpp]
path = "C:\\path\\to\\llama-server.exe"
health_check_url = "http://localhost:8080/health"
[models.my-model]
backend = "llama_cpp"
model = "bartowski/OmniCoder-8B-GGUF"
quant = "Q4_K_M"
profile = "coding"
enabled = true
[proxy]
host = "0.0.0.0"
port = 11434
idle_timeout_secs = 300
startup_timeout_secs = 120
[supervisor]
restart_policy = "always"
max_restarts = 10
restart_delay_ms = 3000
health_check_interval_ms = 5000The [models.*] key (e.g. my-model) is the alias used by clients in "model": "my-model". You can define multiple models. When kronk serve is running, request any enabled model and its backend will start automatically. Backend ports are auto-assigned — you don't need to configure them.
Model cards are stored in ~/.config/kronk/configs/<company>--<model>.toml and contain quant info, context settings, and sampling presets.
~/.config/kronk/
├── config.toml Main configuration
├── profiles/ Sampling presets (editable)
│ ├── coding.toml
│ ├── chat.toml
│ ├── analysis.toml
│ └── creative.toml
├── configs/ Model cards
│ └── bartowski--OmniCoder-8B.toml
├── models/ GGUF model files
│ └── bartowski/OmniCoder-8B/*.gguf
└── logs/ Service logs
kronk servestarts an OpenAI-compatible API server on a single port (default 11434)- When a request arrives with
"model": "my-model", kronk looks up the config key in[models.*] - If the backend isn't running, kronk auto-assigns a free port, starts the backend with the right GGUF file, and waits for it to become healthy
- The request is forwarded to the backend and the response is streamed back
- After
idle_timeout_secsof inactivity, the backend is shut down to free resources
- Windows: Native Service Control Manager via the
windows-servicecrate.kronk service installregisters kronk as a Windows Service that auto-starts on boot. No NSSM or wrappers needed. - Linux: Generates and manages systemd user units.
kronk service installcreates the unit file, enables it, and starts the service.
kronk service install automatically adds an inbound firewall rule for port 11434. kronk service remove cleans it up.
kronk/
├── crates/
│ ├── kronk-core/ # Config, process supervisor, platform abstraction
│ ├── kronk-cli/ # CLI binary (clap)
│ └── kronk-mock/ # Mock LLM backend for testing
├── installer/ # Inno Setup script (Windows installer)
├── modelcards/ # Community model cards
├── .github/workflows/ # CI/CD release pipeline
└── README.md
git clone https://github.com/danielcherubini/kronk.git
cd kronk
cargo build --releaseThe binary is at target/release/kronk.exe (Windows) or target/release/kronk (Linux).
- TUI Dashboard —
kronk-tuicrate with ratatui - System tray — Windows tray icon for quick service toggle
- Tauri GUI — Lightweight desktop frontend for non-CLI users
Kronk is built with modern Rust and follows these core crates:
- kronk-core — Core logic, process supervision, config management, platform abstractions
- kronk-cli — Command-line interface with clap, user prompts with inquire
- kronk-mock — Mock backend for testing and development
Key dependencies include:
tokio— Async runtime with process managementclap— CLI parsingserde/toml— Configuration serializationtracing— Structured loggingreqwest/hf-hub— HTTP client and HuggingFace integrationsysinfo— System resource monitoringindicatif— Progress bars for downloadsdirectories— Platform-specific config paths