AI Model Gateway

AI Model Gateway is a self-hosted LLM gateway for teams that want provider routing, configuration publishing, telemetry, benchmarks, and day-2 operations in one compact Go runtime.

It is not trying to be another hosted model marketplace. The project is optimized for local control: one supervisor command, separate data/control/telemetry planes, OpenAI-compatible client entry points, provider health visibility, safe config rollout, and an admin UI that behaves like an operations console.

Start Here

Need	Link
Try the packaged runtime first	Release archive install
Try with Docker Compose	Docker Compose deployment
Decide in one short pass	15-minute evaluation path
Check adoption fit before wiring clients	LLM gateway adoption checklist
Point local tools at the gateway	Client integrations
Route Codex CLI traffic	Codex CLI gateway page
Route Claude Code traffic	Claude Code gateway page
Verify fallback behavior	Provider fallback demo
Review quality and security before adoption	Review evidence
Compare gateway options	LLM gateway comparison
Read the product page	Website
Chinese evaluation entry point	Self-hosted LLM gateway page (Chinese)
Share with a relevant community	Share kit / Chinese share kit

Fastest trial: download the packaged runtime, verify SHA256SUMS.txt, then run the 15-minute path or fallback demo. If you prefer containers, use the Docker Compose path below; CI also builds the runtime image on every main-branch push. If it fits your self-hosted LLM infrastructure needs, star the repository so more operators can find it.

See It First

The Admin UI exposes the operational surface behind the gateway: provider health, routing state, telemetry, benchmark signals, config publishing, diagnostics, request logs, and update/rollback workflows. Start with the 15-minute evaluation path, review the quality and security evidence, or use the release archive install path if you want the shortest packaged trial.

Who Should Use It

AI Model Gateway is most useful when you are running LLM traffic for a team and need operational control rather than a hosted model marketplace:

You want OpenAI, Anthropic, and Responses-compatible clients to enter through one local gateway.
You need provider routing, fallback, rate limiting, and cache behavior that you can inspect and change.
You want config changes to go through preview, diff, publish, audit, and rollback instead of editing a live proxy file.
You need request logs, latency, cost, provider health, benchmarks, diagnostics, and replay in one admin surface.
You want updates and rollback to use manifest-verified bundles instead of replacing binaries by hand.

Try It Quickly

The fastest path is the packaged release archive. It avoids rebuilding the runtime, and you can verify the download against SHA256SUMS.txt.

Platform	Archive
Linux x64	`ai-model-gateway-linux-amd64.tar.gz`
Linux arm64	`ai-model-gateway-linux-arm64.tar.gz`
Windows x64	`ai-model-gateway-windows-amd64.zip`
macOS arm64	`ai-model-gateway-darwin-arm64.tar.gz`
Checksums	`SHA256SUMS.txt`

Follow the release archive install path for checksum verification, local config setup, runtime directories, temporary local tokens, and aigw supervise startup.

Container trial:

cp configs/config.example.yaml configs/config.yaml
cat > deploy/secrets.env <<'EOF'
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0
ADMIN_TOKEN=change-me-admin-token
VIEWER_TOKEN=change-me-viewer-token
EOF
docker compose -f deploy/docker-compose.yaml up -d
curl http://127.0.0.1:18081/-/health

See Docker Compose deployment for logs, published ports, and provider-key setup.

If you prefer to audit or modify the code before running it, build from source:

git clone https://github.com/SSC-STUDIO/Ai-Model-Gateway.git
cd Ai-Model-Gateway
go build -o ./dist/aigw ./cmd/aigw
go build -o ./dist/gatewayd ./cmd/gatewayd
go build -o ./dist/controld ./cmd/controld
go build -o ./dist/telemetryd ./cmd/telemetryd
cp configs/config.example.yaml configs/config.yaml
mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0 \
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0 \
ADMIN_TOKEN=change-me-admin-token \
VIEWER_TOKEN=change-me-viewer-token \
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

Then open http://localhost:18080/admin and check http://localhost:18080/-/health.

Choose A Setup Path

Goal	Start here
Share a visual overview	Project website
Share copy-ready links and short posts	Share kit
Share with Chinese developer communities	Chinese share kit
Evaluate from a Chinese landing page	Chinese self-hosted LLM gateway page
Try the packaged runtime	Release archive install path
Try with Docker Compose	Docker Compose deployment
Build from source locally	Installation guide
Decide quickly whether to spend more time	15-minute evaluation path
Match it to your team's workflow	Use cases
Evaluate whether self-hosting fits	Self-hosted LLM gateway checklist
Check adoption fit from a shareable page	LLM gateway adoption checklist
Review CI, tests, runtime smoke, and maturity evidence	Quality evidence
Review auth, secret, SSRF, telemetry, and deployment trust boundaries	Security and trust model
Understand project direction	Project roadmap
Compare LLM gateway options	LLM gateway comparison guide
Start from a gateway comparison search page	LLM gateway comparison page
Start from a self-hosted gateway search page	Self-hosted LLM gateway page
Start from an OpenAI-compatible gateway search page	OpenAI-compatible LLM gateway page
Start from a client integration search page	LLM client integrations page
Start from an adoption checklist search page	LLM gateway adoption checklist
Start from a Codex CLI gateway search page	Codex CLI gateway page
Start from a Claude Code gateway search page	Claude Code gateway page
Start from an OpenAI and Anthropic gateway search page	OpenAI Anthropic gateway page
Start from a provider fallback search page	LLM provider fallback gateway page
Understand config publish and rollback	Config publish and rollback
Operate provider fallback and health	Provider fallback and health operations
Run a provider fallback proof	Provider fallback demo
Connect an OpenAI-compatible upstream	OpenAI-compatible upstreams
Point clients at the gateway	Client integrations
Run it as an operations service	Deployment guide
Control it from scripts or terminals	CLI guide
Point local AI tools at the gateway	`aigw clients`
Debug startup, routing, or admin access	Troubleshooting
Ask what to improve next	Maintainer discussion

Why This Exists

The AI gateway space already has strong projects:

Project type	Good at	AI Model Gateway difference
LiteLLM / Portkey-style gateways	Broad provider coverage, unified APIs, spend controls, guardrails	Adds a native three-plane runtime, local config publishing, rollback, diagnostics, and an ops-first admin console
Helicone-style observability	Request tracing, analytics, evaluations, experiments	Treats observability as one part of the gateway lifecycle instead of the whole product
OpenRouter-style hosted routers	Fast access to many public models through a hosted broker	Keeps routing, keys, telemetry, and policy inside the user's own environment
Kong / Envoy AI Gateway stacks	Enterprise gateway ecosystem, plugins, Kubernetes-native traffic policy	Focuses on LLM-specific operations with fewer moving parts and a small self-hosted binary set

See docs/differentiation.md for positioning notes and docs/llm-gateway-comparison.md for a practical selection guide.

Core Capabilities

Capability	What it does
Multi-protocol gateway	OpenAI Chat Completions, Anthropic Messages, and OpenAI Responses compatibility
Provider routing	Model matching, provider-level routing, fallback policy, and loop detection
Rate limiting	Token-bucket limits by API key, IP, and model
Request cache	In-memory LRU cache with configurable TTL and item count
SSRF protection	DNS pinning, private IP detection, and allowlist support
Config publishing	Authoring YAML -> compiled snapshot -> zero-interruption publish and rollback
Provider health operations	Health-aware weighted routing, probes, cooldown state, and incident checks
Telemetry plane	Async event ingestion, cost aggregation, timeseries projections, and query APIs
Audit log	Searchable control-plane operation history
Admin UI	Overview, Monitoring, Benchmark, Ops, Config, and Logs workspaces
Benchmarking	Multi-model comparison with exact, judge, JSON, tool, and stream scoring modes
Local CLI	Runtime status, preflight, diagnostics, provider probes, config diff, publish history, and rollback

Architecture

The runtime uses one operator entry point and three internal planes:

                  external supervisor / systemd / k8s
                                |
                         +--------------+
                         | aigw         |
                         | supervise    |
                         +------+-------+
                                |
             +------------------+------------------+
             |                                     |
      +------+-------+                     +-------+------+
      | Data Plane   |                     | Control Plane|
      | gatewayd     |                     | controld     |
      | :18080       |                     | :18081       |
      +------+-------+                     +-------+------+
             |                                     |
             +------------------+------------------+
                                |
                         +------+-------+
                         | Telemetry    |
                         | telemetryd   |
                         | IPC only     |
                         +--------------+

aigw is the local operations entry point for supervise, doctor, bundle verification, upgrades, and rollback workflows.
gatewayd handles client inference traffic, compatible API routes, health checks, and telemetry event emission.
controld serves the admin APIs and owns authoring config, compile/publish/rollback, audit, probing, and benchmark workflows.
telemetryd ingests events over IPC and exposes projections through the control plane.
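
The compile, publish, and rollback split between controld and gatewayd can be illustrated with an atomic pointer swap. The sketch below is an assumption about how such a handoff could work in a single process, not a description of the project's cross-plane protocol; `Snapshot`, `Store`, and the `backup-demo` provider name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Snapshot is an immutable compiled config revision (fields are illustrative).
type Snapshot struct {
	Revision string
	Routes   map[string]string // model name -> provider name
}

// Store holds the active snapshot plus the publish history used for rollback.
type Store struct {
	active  atomic.Pointer[Snapshot] // read without locks by request handlers
	mu      sync.Mutex               // guards history
	history []*Snapshot
}

// Current returns the snapshot that new requests should use.
func (s *Store) Current() *Snapshot { return s.active.Load() }

// Publish records snap in history and makes it active in one atomic step.
func (s *Store) Publish(snap *Snapshot) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.history = append(s.history, snap)
	s.active.Store(snap)
}

// Rollback re-activates a previously published revision.
func (s *Store) Rollback(revision string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, snap := range s.history {
		if snap.Revision == revision {
			s.active.Store(snap)
			return nil
		}
	}
	return errors.New("unknown revision: " + revision)
}

func main() {
	var s Store
	s.Publish(&Snapshot{Revision: "rev-001", Routes: map[string]string{"gpt-4": "openai-demo"}})
	s.Publish(&Snapshot{Revision: "rev-002", Routes: map[string]string{"gpt-4": "backup-demo"}})
	fmt.Println("active:", s.Current().Revision)
	if err := s.Rollback("rev-001"); err != nil {
		fmt.Println("rollback failed:", err)
	}
	fmt.Println("active:", s.Current().Revision)
}
```

Because readers only ever load a pointer to an immutable value, a request that started on the old snapshot finishes on it while new requests see the new one, which is the usual basis for publishing without interrupting traffic.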

The old gateway / gateway.exe launcher has been removed. Production deployments should run aigw supervise under an external supervisor such as systemd or Kubernetes.

Admin UI

Open the admin console at http://localhost:18080/admin after the runtime starts.

Workspace	Primary jobs
Overview	Gateway health, time windows, runtime state, provider health
Monitoring	Traffic, latency, cost, model usage, and pricing visibility
Benchmark	Model capability comparison, scoring, and promotion signals
Ops	Runtime status, provider probes, audit log, diagnostics, replay
Config	YAML/JSON/visual config editing, publish history, revision diff
Logs	Request search, error filtering, and CSV export

Screenshots

[Screenshots: Overview, Monitoring, Ops (mobile), Benchmark (mobile)]

Quick Start

Build

go build -o ./dist/aigw        ./cmd/aigw
go build -o ./dist/gatewayd    ./cmd/gatewayd
go build -o ./dist/controld    ./cmd/controld
go build -o ./dist/telemetryd  ./cmd/telemetryd
go build -o ./dist/gateway-cli ./cmd/gateway-cli

Generate and verify a release manifest before packaging:

./dist/aigw bundle build -root . -out aigw-manifest.json
./dist/aigw bundle verify -root . -manifest aigw-manifest.json

Configure

Copy-Item .\configs\config.example.yaml .\configs\config.yaml
$env:ADMIN_BOOTSTRAP_TOKEN = "<32+ chars>"
$env:COOKIE_SIGNING_KEY = "<32+ chars>"
$env:ADMIN_TOKEN = "<admin token>"
$env:VIEWER_TOKEN = "<viewer token>"

config.yaml is the operator authoring config. The daemon bootstrap files are separate:

configs/gatewayd.json
configs/controld.json
configs/telemetryd.json

Run

mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

Verify

curl http://127.0.0.1:18080/-/health
curl http://127.0.0.1:18080/v1/models
curl http://127.0.0.1:18081/admin

Local development note: if another live service already owns 127.0.0.1:18080, stop it or move the development runtime to different ports before starting the three-plane runtime.

Point Local AI Tools At The Gateway

aigw clients can print environment snippets or update local tool config for Codex, Claude Code, and OpenClaw. See client integrations for generic OpenAI SDK, curl, and local tool setup notes.

./dist/aigw clients print -config-dir configs
./dist/aigw clients apply -config-dir configs -api-key "<API key>"
./dist/aigw clients apply -dry-run
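
For code that talks to the gateway directly instead of through a configured tool, the following sketch builds an OpenAI-style Chat Completions request. It assumes the data plane listens on http://127.0.0.1:18080, serves the conventional `/v1/chat/completions` path, and accepts a bearer token; the path and auth scheme are inferred from the OpenAI convention rather than confirmed by this README, and `newChatRequest` is an illustrative helper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// newChatRequest builds an OpenAI-style Chat Completions request aimed at the
// gateway. baseURL is the data-plane address, e.g. http://127.0.0.1:18080.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	// Assumption: the gateway exposes the standard OpenAI path.
	url := strings.TrimRight(baseURL, "/") + "/v1/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	req, err := newChatRequest("http://127.0.0.1:18080", "<API key>", "gpt-4", "Say hello")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// To send it against a running gateway: resp, err := http.DefaultClient.Do(req)
}
```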

CLI Examples

gateway-cli is the remote management CLI:

# Config
./dist/gateway-cli config show
./dist/gateway-cli config preview configs/config.yaml
./dist/gateway-cli config diff --file configs/config.yaml

# Runtime
./dist/gateway-cli runtime status
./dist/gateway-cli runtime preflight

# Audit and diagnostics
./dist/gateway-cli audit 50
./dist/gateway-cli diagnostics
./dist/gateway-cli secrets check

# Provider probes
./dist/gateway-cli probe model gpt-4 openai-demo
./dist/gateway-cli provider list
./dist/gateway-cli provider test openai

# Telemetry
./dist/gateway-cli telemetry events

# Publish management
./dist/gateway-cli publish history
./dist/gateway-cli publish rollback rev-001

Useful options:

-server url: control-plane URL, default http://127.0.0.1:18081
-token token: admin token, or use ADMIN_TOKEN
-format text|json|csv: output format

Tests

# Go tests
go test ./... -count=1

# Focused Go tests with coverage
go test ./internal/gateway/... -count=1 -cover

# Admin unit/component tests
npm --prefix web/admin test

# Admin production build
npm --prefix web/admin run build

# Admin Playwright audit
npm --prefix web/admin run test:e2e

Repository Layout

Path	Purpose
`cmd/aigw/`	Operations entry point
`cmd/gatewayd/`	Data-plane daemon
`cmd/controld/`	Control-plane daemon
`cmd/telemetryd/`	Telemetry daemon
`cmd/gateway-cli/`	Remote management CLI
`internal/control/`	Control-plane API, compiler, publisher
`internal/gateway/`	Snapshot runtime, API handlers, telemetry client
`internal/telemetry/`	Event log, projections, query layer
`internal/contracts/`	Cross-plane RPC and transport contracts
`internal/infra/`	Shared auth, config loader, pricing, and infrastructure
`internal/proxy/`	SSRF-safe proxy helpers
`web/admin/`	Admin SPA built with Preact and Vite
`configs/`	Example and bootstrap configuration
`docs/`	Architecture, installation, deployment, and operations docs
`scripts/`	Deployment and verification helpers
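
As a rough picture of what SSRF-safe helpers such as those under `internal/proxy/` typically check, the sketch below rejects IP literals that point at loopback, private, link-local, multicast, or unspecified ranges. It uses only the standard `net/netip` package; `isBlocked` and `checkIP` are hypothetical names, and the project's real helpers (including DNS pinning and allowlists) may work differently.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isBlocked reports whether addr is a loopback, private, link-local,
// multicast, or unspecified address that an upstream URL should not reach.
func isBlocked(addr netip.Addr) bool {
	if !addr.IsValid() {
		return true
	}
	addr = addr.Unmap() // treat ::ffff:10.0.0.1 like 10.0.0.1
	return addr.IsLoopback() ||
		addr.IsPrivate() ||
		addr.IsLinkLocalUnicast() ||
		addr.IsMulticast() ||
		addr.IsUnspecified()
}

// checkIP parses an IP literal and returns an error if it is not public.
func checkIP(s string) error {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return fmt.Errorf("not an IP literal: %w", err)
	}
	if isBlocked(addr) {
		return fmt.Errorf("%s is not a public address", addr)
	}
	return nil
}

func main() {
	for _, s := range []string{"8.8.8.8", "10.0.0.5", "127.0.0.1", "169.254.169.254", "::1"} {
		fmt.Printf("%-16s blocked=%v\n", s, checkIP(s) != nil)
	}
}
```

A check like this only helps if the connection is then made to the same address that was checked. Resolving a host name once and dialing the verified IP is the general idea behind DNS pinning.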

Operations Constraints

Do not replace gatewayd, controld, or telemetryd independently during a normal upgrade. Ship one manifest-verified bundle.
Run individual daemons only for advanced debugging. Production paths should use aigw supervise.
Linux deployments can use deploy/aigw.service or aigw service print. Windows deployments should wrap aigw.exe supervise with NSSM or an equivalent service manager.

Documentation

Architecture, installation, deployment, and operations guides live under docs/.

License

MIT. See LICENSE.
