AI Model Gateway is a self-hosted LLM gateway for teams that want provider routing, configuration publishing, telemetry, benchmarks, and day-2 operations in one compact Go runtime.
It is not trying to be another hosted model marketplace. The project is optimized for local control: one supervisor command, separate data/control/telemetry planes, OpenAI-compatible client entry points, provider health visibility, safe config rollout, and an admin UI that behaves like an operations console.
| Need | Link |
|---|---|
| Try the packaged runtime first | Release archive install |
| Try with Docker Compose | Docker Compose deployment |
| Decide in one short pass | 15-minute evaluation path |
| Check adoption fit before wiring clients | LLM gateway adoption checklist |
| Point local tools at the gateway | Client integrations |
| Route Codex CLI traffic | Codex CLI gateway page |
| Route Claude Code traffic | Claude Code gateway page |
| Verify fallback behavior | Provider fallback demo |
| Review quality and security before adoption | Review evidence |
| Compare gateway options | LLM gateway comparison |
| Read the product page | Website |
| Evaluate in Chinese | Chinese self-hosted LLM gateway page |
| Share with a relevant community | Share kit / Chinese share kit |
Fastest trial: download the packaged runtime, verify SHA256SUMS.txt, then run the 15-minute path or fallback demo. If you prefer containers, use the Docker Compose path below; CI also builds the runtime image on every main-branch push. If it fits your self-hosted LLM infrastructure needs, star the repository so more operators can find it.
The Admin UI exposes the operational surface behind the gateway: provider health, routing state, telemetry, benchmark signals, config publishing, diagnostics, request logs, and update/rollback workflows. Start with the 15-minute evaluation path, review the quality and security evidence, or use the release archive install path if you want the shortest packaged trial.
AI Model Gateway is most useful when you are running LLM traffic for a team and need operational control rather than a hosted model marketplace:
- You want OpenAI, Anthropic, and Responses-compatible clients to enter through one local gateway.
- You need provider routing, fallback, rate limiting, and cache behavior that you can inspect and change.
- You want config changes to go through preview, diff, publish, audit, and rollback instead of editing a live proxy file.
- You need request logs, latency, cost, provider health, benchmarks, diagnostics, and replay in one admin surface.
- You want updates and rollback to use manifest-verified bundles instead of replacing binaries by hand.
The fastest path is the packaged release archive. It avoids rebuilding the runtime and verifies the download with SHA256SUMS.txt.
| Platform | Archive |
|---|---|
| Linux x64 | ai-model-gateway-linux-amd64.tar.gz |
| Linux arm64 | ai-model-gateway-linux-arm64.tar.gz |
| Windows x64 | ai-model-gateway-windows-amd64.zip |
| macOS arm64 | ai-model-gateway-darwin-arm64.tar.gz |
| Checksums | SHA256SUMS.txt |
Follow the release archive install path for checksum verification, local config setup, runtime directories, temporary local tokens, and aigw supervise startup.
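
For readers who want to see what the checksum step amounts to, here is a minimal Go sketch. It assumes SHA256SUMS.txt uses the common sha256sum layout (`<hex digest>  <filename>` per line), which this README does not confirm. The helper names `expectedSum` and `verifyArchive` are hypothetical and are not part of aigw.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// expectedSum returns the hex digest listed for name in sha256sum-style text.
// Assumption: one "<hex digest>  <filename>" entry per line.
func expectedSum(sums, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(sums))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		// sha256sum marks binary-mode entries with a leading '*'.
		if strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// verifyArchive returns an error unless data hashes to the digest listed for name.
func verifyArchive(sums, name string, data []byte) error {
	want, ok := expectedSum(sums, name)
	if !ok {
		return fmt.Errorf("%s is not listed in the checksum file", name)
	}
	digest := sha256.Sum256(data)
	if got := hex.EncodeToString(digest[:]); got != want {
		return fmt.Errorf("checksum mismatch for %s: got %s, want %s", name, got, want)
	}
	return nil
}

func main() {
	// Stand-in bytes; a real check would read the downloaded archive from disk.
	data := []byte("example archive bytes")
	digest := sha256.Sum256(data)
	sums := hex.EncodeToString(digest[:]) + "  ai-model-gateway-linux-amd64.tar.gz\n"

	fmt.Println(verifyArchive(sums, "ai-model-gateway-linux-amd64.tar.gz", data) == nil)               // true
	fmt.Println(verifyArchive(sums, "ai-model-gateway-linux-amd64.tar.gz", []byte("tampered")) == nil) // false
}
```

In practice the platform's own tool (for example `sha256sum -c` on Linux) does the same comparison; the sketch only makes the logic explicit.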
Container trial:
cp configs/config.example.yaml configs/config.yaml
cat > deploy/secrets.env <<'EOF'
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0
ADMIN_TOKEN=change-me-admin-token
VIEWER_TOKEN=change-me-viewer-token
EOF
docker compose -f deploy/docker-compose.yaml up -d
curl http://127.0.0.1:18081/-/health

See Docker Compose deployment for logs, published ports, and provider-key setup.
If you prefer to audit or modify the code before running it, build from source:
git clone https://github.com/SSC-STUDIO/Ai-Model-Gateway.git
cd Ai-Model-Gateway
go build -o ./dist/aigw ./cmd/aigw
go build -o ./dist/gatewayd ./cmd/gatewayd
go build -o ./dist/controld ./cmd/controld
go build -o ./dist/telemetryd ./cmd/telemetryd
cp configs/config.example.yaml configs/config.yaml
mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0 \
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0 \
ADMIN_TOKEN=change-me-admin-token \
VIEWER_TOKEN=change-me-viewer-token \
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

Then open http://localhost:18080/admin and check http://localhost:18080/-/health.
| Goal | Start here |
|---|---|
| Share a visual overview | Project website |
| Share copy-ready links and short posts | Share kit |
| Share with Chinese developer communities | Chinese share kit |
| Evaluate from a Chinese landing page | Chinese self-hosted LLM gateway page |
| Try the packaged runtime | Release archive install path |
| Try with Docker Compose | Docker Compose deployment |
| Build from source locally | Installation guide |
| Decide quickly whether to spend more time | 15-minute evaluation path |
| Match it to your team's workflow | Use cases |
| Evaluate whether self-hosting fits | Self-hosted LLM gateway checklist |
| Check adoption fit from a shareable page | LLM gateway adoption checklist |
| Review CI, tests, runtime smoke, and maturity evidence | Quality evidence |
| Review auth, secret, SSRF, telemetry, and deployment trust boundaries | Security and trust model |
| Understand project direction | Project roadmap |
| Compare LLM gateway options | LLM gateway comparison guide |
| Start from a gateway comparison search page | LLM gateway comparison page |
| Start from a self-hosted gateway search page | Self-hosted LLM gateway page |
| Start from an OpenAI-compatible gateway search page | OpenAI-compatible LLM gateway page |
| Start from a client integration search page | LLM client integrations page |
| Start from an adoption checklist search page | LLM gateway adoption checklist |
| Start from a Codex CLI gateway search page | Codex CLI gateway page |
| Start from a Claude Code gateway search page | Claude Code gateway page |
| Start from an OpenAI and Anthropic gateway search page | OpenAI Anthropic gateway page |
| Start from a provider fallback search page | LLM provider fallback gateway page |
| Understand config publish and rollback | Config publish and rollback |
| Operate provider fallback and health | Provider fallback and health operations |
| Run a provider fallback proof | Provider fallback demo |
| Connect an OpenAI-compatible upstream | OpenAI-compatible upstreams |
| Point clients at the gateway | Client integrations |
| Run it as an operations service | Deployment guide |
| Control it from scripts or terminals | CLI guide |
| Point local AI tools at the gateway | aigw clients |
| Debug startup, routing, or admin access | Troubleshooting |
| Ask what to improve next | Maintainer discussion |
Self-Hosted Gateway | OpenAI-Compatible Gateway | Adoption Checklist | Client Integrations | Codex CLI Gateway | Claude Code Gateway | OpenAI-Compatible Upstreams | OpenAI Anthropic Gateway | Provider Fallback Gateway | Gateway Comparison Page | Quality Evidence | Security Model | Docs | Roadmap | Promotion Kit | 100-Star Campaign
The AI gateway space already has strong projects:
| Project type | Good at | AI Model Gateway difference |
|---|---|---|
| LiteLLM / Portkey-style gateways | Broad provider coverage, unified APIs, spend controls, guardrails | Adds a native three-plane runtime, local config publishing, rollback, diagnostics, and an ops-first admin console |
| Helicone-style observability | Request tracing, analytics, evaluations, experiments | Treats observability as one part of the gateway lifecycle instead of the whole product |
| OpenRouter-style hosted routers | Fast access to many public models through a hosted broker | Keeps routing, keys, telemetry, and policy inside the user's own environment |
| Kong / Envoy AI Gateway stacks | Enterprise gateway ecosystem, plugins, Kubernetes-native traffic policy | Focuses on LLM-specific operations with fewer moving parts and a small self-hosted binary set |
See docs/differentiation.md for positioning notes and docs/llm-gateway-comparison.md for a practical selection guide.
| Capability | What it does |
|---|---|
| Multi-protocol gateway | OpenAI Chat Completions, Anthropic Messages, and OpenAI Responses compatibility |
| Provider routing | Model matching, provider-level routing, fallback policy, and loop detection |
| Rate limiting | Token-bucket limits by API key, IP, and model |
| Request cache | In-memory LRU cache with configurable TTL and item count |
| SSRF protection | DNS pinning, private IP detection, and allowlist support |
| Config publishing | Authoring YAML -> compiled snapshot -> zero-interruption publish and rollback |
| Provider health operations | Health-aware weighted routing, probes, cooldown state, and incident checks |
| Telemetry plane | Async event ingestion, cost aggregation, timeseries projections, and query APIs |
| Audit log | Searchable control-plane operation history |
| Admin UI | Overview, Monitoring, Benchmark, Ops, Config, and Logs workspaces |
| Benchmarking | Multi-model comparison with exact, judge, JSON, tool, and stream scoring modes |
| Local CLI | Runtime status, preflight, diagnostics, provider probes, config diff, publish history, and rollback |
The runtime uses one operator entry point and three internal planes:
         external supervisor / systemd / k8s
                          |
                   +--------------+
                   |     aigw     |
                   |  supervise   |
                   +------+-------+
                          |
       +------------------+------------------+
       |                                     |
+------+-------+                     +-------+------+
| Data Plane   |                     | Control Plane|
|  gatewayd    |                     |  controld    |
|   :18080     |                     |   :18081     |
+------+-------+                     +-------+------+
       |                                     |
       +------------------+------------------+
                          |
                   +------+-------+
                   |  Telemetry   |
                   |  telemetryd  |
                   |   IPC only   |
                   +--------------+
- aigw is the local operations entry point for supervise, doctor, bundle verification, upgrades, and rollback workflows.
- gatewayd handles client inference traffic, compatible API routes, health checks, and telemetry event emission.
- controld serves the admin APIs and owns authoring config, compile/publish/rollback, audit, probing, and benchmark workflows.
- telemetryd ingests events over IPC and exposes projections through the control plane.
The old gateway / gateway.exe launcher has been removed. Production deployments should run aigw supervise under an external supervisor.
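
The capability table describes config publishing as a compiled snapshot with zero-interruption publish and rollback. One common way to get that property in Go is an atomic pointer swap over immutable snapshots. The sketch below illustrates that general pattern under stated assumptions; it is not taken from the project's code, and `Snapshot`, `Store`, and the field names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Snapshot is an immutable compiled config revision (fields are illustrative).
type Snapshot struct {
	Revision string
	Routes   map[string]string // model name -> provider name
}

// Store hands the active snapshot to request handlers and keeps
// published revisions so an earlier one can be reactivated.
type Store struct {
	active  atomic.Pointer[Snapshot]
	mu      sync.Mutex // guards history
	history []*Snapshot
}

// Current returns the active snapshot without taking a lock.
func (s *Store) Current() *Snapshot { return s.active.Load() }

// Publish records snap and makes it active. Requests that already loaded
// the previous snapshot keep using it until they finish.
func (s *Store) Publish(snap *Snapshot) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.history = append(s.history, snap)
	s.active.Store(snap)
}

// Rollback reactivates a previously published revision.
func (s *Store) Rollback(revision string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, snap := range s.history {
		if snap.Revision == revision {
			s.active.Store(snap)
			return nil
		}
	}
	return errors.New("unknown revision: " + revision)
}

func main() {
	store := &Store{}
	store.Publish(&Snapshot{Revision: "rev-001", Routes: map[string]string{"gpt-4": "openai-demo"}})
	store.Publish(&Snapshot{Revision: "rev-002", Routes: map[string]string{"gpt-4": "backup-demo"}})
	fmt.Println(store.Current().Revision) // rev-002

	if err := store.Rollback("rev-001"); err != nil {
		fmt.Println("rollback failed:", err)
	}
	fmt.Println(store.Current().Revision) // rev-001
}
```

Because readers only ever load a pointer to a complete snapshot, they never observe a half-applied config, which is the property a "zero-interruption" publish needs.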
Open the admin console at http://localhost:18080/admin after the runtime starts.
| Workspace | Primary jobs |
|---|---|
| Overview | Gateway health, time windows, runtime state, provider health |
| Monitoring | Traffic, latency, cost, model usage, and pricing visibility |
| Benchmark | Model capability comparison, scoring, and promotion signals |
| Ops | Runtime status, provider probes, audit log, diagnostics, replay |
| Config | YAML/JSON/visual config editing, publish history, revision diff |
| Logs | Request search, error filtering, and CSV export |
Screenshots: Overview, Monitoring, Ops (mobile), Benchmark (mobile).
go build -o ./dist/aigw ./cmd/aigw
go build -o ./dist/gatewayd ./cmd/gatewayd
go build -o ./dist/controld ./cmd/controld
go build -o ./dist/telemetryd ./cmd/telemetryd
go build -o ./dist/gateway-cli ./cmd/gateway-cli

Generate and verify a release manifest before packaging:
./dist/aigw bundle build -root . -out aigw-manifest.json
./dist/aigw bundle verify -root . -manifest aigw-manifest.json

Copy-Item .\configs\config.example.yaml .\configs\config.yaml
$env:ADMIN_BOOTSTRAP_TOKEN = "<32+ chars>"
$env:COOKIE_SIGNING_KEY = "<32+ chars>"
$env:ADMIN_TOKEN = "<admin token>"
$env:VIEWER_TOKEN = "<viewer token>"

config.yaml is the operator authoring config. The daemon bootstrap files are separate:
- configs/gatewayd.json
- configs/controld.json
- configs/telemetryd.json
mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

curl http://127.0.0.1:18080/-/health
curl http://127.0.0.1:18080/v1/models
curl http://127.0.0.1:18081/admin

Local development note: if another live service already owns 127.0.0.1:18080, stop it or move the development runtime to different ports before starting the three-plane runtime.
aigw clients can print environment snippets or update local tool config for Codex, Claude Code, and OpenClaw. See client integrations for generic OpenAI SDK, curl, and local tool setup notes.
./dist/aigw clients print -config-dir configs
./dist/aigw clients apply -config-dir configs -api-key "<API key>"
./dist/aigw clients apply -dry-run

gateway-cli is the remote management CLI:
# Config
./dist/gateway-cli config show
./dist/gateway-cli config preview configs/config.yaml
./dist/gateway-cli config diff --file configs/config.yaml
# Runtime
./dist/gateway-cli runtime status
./dist/gateway-cli runtime preflight
# Audit and diagnostics
./dist/gateway-cli audit 50
./dist/gateway-cli diagnostics
./dist/gateway-cli secrets check
# Provider probes
./dist/gateway-cli probe model gpt-4 openai-demo
./dist/gateway-cli provider list
./dist/gateway-cli provider test openai
# Telemetry
./dist/gateway-cli telemetry events
# Publish management
./dist/gateway-cli publish history
./dist/gateway-cli publish rollback rev-001

Useful options:
- -server url: control-plane URL, default http://127.0.0.1:18081
- -token token: admin token, or use ADMIN_TOKEN
- -format text|json|csv: output format
# Go tests
go test ./... -count=1
# Focused Go tests with coverage
go test ./internal/gateway/... -count=1 -cover
# Admin unit/component tests
npm --prefix web/admin test
# Admin production build
npm --prefix web/admin run build
# Admin Playwright audit
npm --prefix web/admin run test:e2e

| Path | Purpose |
|---|---|
| cmd/aigw/ | Operations entry point |
| cmd/gatewayd/ | Data-plane daemon |
| cmd/controld/ | Control-plane daemon |
| cmd/telemetryd/ | Telemetry daemon |
| cmd/gateway-cli/ | Remote management CLI |
| internal/control/ | Control-plane API, compiler, publisher |
| internal/gateway/ | Snapshot runtime, API handlers, telemetry client |
| internal/telemetry/ | Event log, projections, query layer |
| internal/contracts/ | Cross-plane RPC and transport contracts |
| internal/infra/ | Shared auth, config loader, pricing, and infrastructure |
| internal/proxy/ | SSRF-safe proxy helpers |
| web/admin/ | Admin SPA built with Preact and Vite |
| configs/ | Example and bootstrap configuration |
| docs/ | Architecture, installation, deployment, and operations docs |
| scripts/ | Deployment and verification helpers |
- Do not replace gatewayd, controld, or telemetryd independently during a normal upgrade. Ship one manifest-verified bundle.
- Run individual daemons only for advanced debugging. Production paths should use aigw supervise.
- Linux deployments can use deploy/aigw.service or aigw service print. Windows deployments should wrap aigw.exe supervise with NSSM or an equivalent service manager.
- Differentiation
- Project Roadmap
- Evaluate In 15 Minutes
- LLM Gateway Comparison Guide
- Self-Hosted LLM Gateway Checklist
- LLM Gateway Adoption Checklist
- Config Publish and Rollback
- Architecture
- Installation
- Deployment
- CLI Guide
- Troubleshooting
- API Messages Endpoint
- Chinese Model Integration
- Changelog
- Contributing
- Code of Conduct
- Security Policy
MIT. See LICENSE.



