SSC-STUDIO/Ai-Model-Gateway

AI Model Gateway

AI Model Gateway is a self-hosted LLM gateway for teams that want provider routing, configuration publishing, telemetry, benchmarks, and day-2 operations in one compact Go runtime.

It is not trying to be another hosted model marketplace. The project is optimized for local control: one supervisor command, separate data/control/telemetry planes, OpenAI-compatible client entry points, provider health visibility, safe config rollout, and an admin UI that behaves like an operations console.

Start Here

| Need | Link |
| --- | --- |
| Try the packaged runtime first | Release archive install |
| Try with Docker Compose | Docker Compose deployment |
| Decide in one short pass | 15-minute evaluation path |
| Check adoption fit before wiring clients | LLM gateway adoption checklist |
| Point local tools at the gateway | Client integrations |
| Route Codex CLI traffic | Codex CLI gateway page |
| Route Claude Code traffic | Claude Code gateway page |
| Verify fallback behavior | Provider fallback demo |
| Review quality and security before adoption | Review evidence |
| Compare gateway options | LLM gateway comparison |
| Read the product page | Website |
| Chinese evaluation entry point | Chinese self-hosted LLM gateway page |
| Share with a relevant community | Share kit / Chinese share kit |

Fastest trial: download the packaged runtime, verify SHA256SUMS.txt, then run the 15-minute path or fallback demo. If you prefer containers, use the Docker Compose path below; CI also builds the runtime image on every main-branch push. If it fits your self-hosted LLM infrastructure needs, star the repository so more operators can find it.

See It First

[Screenshot: AI Model Gateway Admin overview]

The Admin UI exposes the operational surface behind the gateway: provider health, routing state, telemetry, benchmark signals, config publishing, diagnostics, request logs, and update/rollback workflows. Start with the 15-minute evaluation path, review the quality and security evidence, or use the release archive install path if you want the shortest packaged trial.

Who Should Use It

AI Model Gateway is most useful when you are running LLM traffic for a team and need operational control rather than a hosted model marketplace:

  • You want OpenAI, Anthropic, and Responses-compatible clients to enter through one local gateway.
  • You need provider routing, fallback, rate limiting, and cache behavior that you can inspect and change.
  • You want config changes to go through preview, diff, publish, audit, and rollback instead of editing a live proxy file.
  • You need request logs, latency, cost, provider health, benchmarks, diagnostics, and replay in one admin surface.
  • You want updates and rollback to use manifest-verified bundles instead of replacing binaries by hand.

Try It Quickly

The fastest path is the packaged release archive. It avoids building the runtime from source, and you can verify the download against SHA256SUMS.txt.

| Platform | Archive |
| --- | --- |
| Linux x64 | ai-model-gateway-linux-amd64.tar.gz |
| Linux arm64 | ai-model-gateway-linux-arm64.tar.gz |
| Windows x64 | ai-model-gateway-windows-amd64.zip |
| macOS arm64 | ai-model-gateway-darwin-arm64.tar.gz |
| Checksums | SHA256SUMS.txt |

Follow the release archive install path for checksum verification, local config setup, runtime directories, temporary local tokens, and aigw supervise startup.

Container trial:

cp configs/config.example.yaml configs/config.yaml
cat > deploy/secrets.env <<'EOF'
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0
ADMIN_TOKEN=change-me-admin-token
VIEWER_TOKEN=change-me-viewer-token
EOF
docker compose -f deploy/docker-compose.yaml up -d
curl http://127.0.0.1:18081/-/health

See Docker Compose deployment for logs, published ports, and provider-key setup.

If you prefer to audit or modify the code before running it, build from source:

git clone https://github.com/SSC-STUDIO/Ai-Model-Gateway.git
cd Ai-Model-Gateway
go build -o ./dist/aigw ./cmd/aigw
go build -o ./dist/gatewayd ./cmd/gatewayd
go build -o ./dist/controld ./cmd/controld
go build -o ./dist/telemetryd ./cmd/telemetryd
cp configs/config.example.yaml configs/config.yaml
mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
ADMIN_BOOTSTRAP_TOKEN=change-me-32-characters-minimum-0 \
COOKIE_SIGNING_KEY=change-me-32-characters-minimum-0 \
ADMIN_TOKEN=change-me-admin-token \
VIEWER_TOKEN=change-me-viewer-token \
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

Then open http://localhost:18080/admin and check http://localhost:18080/-/health.

Choose A Setup Path

| Goal | Start here |
| --- | --- |
| Share a visual overview | Project website |
| Share copy-ready links and short posts | Share kit |
| Share with Chinese developer communities | Chinese share kit |
| Evaluate from a Chinese landing page | Chinese self-hosted LLM gateway page |
| Try the packaged runtime | Release archive install path |
| Try with Docker Compose | Docker Compose deployment |
| Build from source locally | Installation guide |
| Decide quickly whether to spend more time | 15-minute evaluation path |
| Match it to your team's workflow | Use cases |
| Evaluate whether self-hosting fits | Self-hosted LLM gateway checklist |
| Check adoption fit from a shareable page | LLM gateway adoption checklist |
| Review CI, tests, runtime smoke, and maturity evidence | Quality evidence |
| Review auth, secret, SSRF, telemetry, and deployment trust boundaries | Security and trust model |
| Understand project direction | Project roadmap |
| Compare LLM gateway options | LLM gateway comparison guide |
| Start from a gateway comparison search page | LLM gateway comparison page |
| Start from a self-hosted gateway search page | Self-hosted LLM gateway page |
| Start from an OpenAI-compatible gateway search page | OpenAI-compatible LLM gateway page |
| Start from a client integration search page | LLM client integrations page |
| Start from an adoption checklist search page | LLM gateway adoption checklist |
| Start from a Codex CLI gateway search page | Codex CLI gateway page |
| Start from a Claude Code gateway search page | Claude Code gateway page |
| Start from an OpenAI and Anthropic gateway search page | OpenAI Anthropic gateway page |
| Start from a provider fallback search page | LLM provider fallback gateway page |
| Understand config publish and rollback | Config publish and rollback |
| Operate provider fallback and health | Provider fallback and health operations |
| Run a provider fallback proof | Provider fallback demo |
| Connect an OpenAI-compatible upstream | OpenAI-compatible upstreams |
| Point clients at the gateway | Client integrations |
| Run it as an operations service | Deployment guide |
| Control it from scripts or terminals | CLI guide |
| Point local AI tools at the gateway | aigw clients |
| Debug startup, routing, or admin access | Troubleshooting |
| Ask what to improve next | Maintainer discussion |

More Links

Self-Hosted Gateway | OpenAI-Compatible Gateway | Adoption Checklist | Client Integrations | Codex CLI Gateway | Claude Code Gateway | OpenAI-Compatible Upstreams | OpenAI Anthropic Gateway | Provider Fallback Gateway | Gateway Comparison Page | Quality Evidence | Security Model | Docs | Roadmap | Promotion Kit | 100-Star Campaign

Why This Exists

The AI gateway space already has strong projects:

| Project type | Good at | AI Model Gateway difference |
| --- | --- | --- |
| LiteLLM / Portkey-style gateways | Broad provider coverage, unified APIs, spend controls, guardrails | Adds a native three-plane runtime, local config publishing, rollback, diagnostics, and an ops-first admin console |
| Helicone-style observability | Request tracing, analytics, evaluations, experiments | Treats observability as one part of the gateway lifecycle instead of the whole product |
| OpenRouter-style hosted routers | Fast access to many public models through a hosted broker | Keeps routing, keys, telemetry, and policy inside the user's own environment |
| Kong / Envoy AI Gateway stacks | Enterprise gateway ecosystem, plugins, Kubernetes-native traffic policy | Focuses on LLM-specific operations with fewer moving parts and a small self-hosted binary set |

See docs/differentiation.md for positioning notes and docs/llm-gateway-comparison.md for a practical selection guide.

Core Capabilities

| Capability | What it does |
| --- | --- |
| Multi-protocol gateway | OpenAI Chat Completions, Anthropic Messages, and OpenAI Responses compatibility |
| Provider routing | Model matching, provider-level routing, fallback policy, and loop detection |
| Rate limiting | Token-bucket limits by API key, IP, and model |
| Request cache | In-memory LRU cache with configurable TTL and item count |
| SSRF protection | DNS pinning, private IP detection, and allowlist support |
| Config publishing | Authoring YAML -> compiled snapshot -> zero-interruption publish and rollback |
| Provider health operations | Health-aware weighted routing, probes, cooldown state, and incident checks |
| Telemetry plane | Async event ingestion, cost aggregation, timeseries projections, and query APIs |
| Audit log | Searchable control-plane operation history |
| Admin UI | Overview, Monitoring, Benchmark, Ops, Config, and Logs workspaces |
| Benchmarking | Multi-model comparison with exact, judge, JSON, tool, and stream scoring modes |
| Local CLI | Runtime status, preflight, diagnostics, provider probes, config diff, publish history, and rollback |

Architecture

The runtime uses one operator entry point and three internal planes:

                  external supervisor / systemd / k8s
                                |
                         +--------------+
                         | aigw         |
                         | supervise    |
                         +------+-------+
                                |
             +------------------+------------------+
             |                                     |
      +------+-------+                     +-------+------+
      | Data Plane   |                     | Control Plane|
      | gatewayd     |                     | controld     |
      | :18080       |                     | :18081       |
      +------+-------+                     +-------+------+
             |                                     |
             +------------------+------------------+
                                |
                         +------+-------+
                         | Telemetry    |
                         | telemetryd   |
                         | IPC only     |
                         +--------------+
  • aigw is the local operations entry point for supervise, doctor, bundle verification, upgrades, and rollback workflows.
  • gatewayd handles client inference traffic, compatible API routes, health checks, and telemetry event emission.
  • controld serves the admin APIs and owns authoring config, compile/publish/rollback, audit, probing, and benchmark workflows.
  • telemetryd ingests events over IPC and exposes projections through the control plane.

The old gateway / gateway.exe launcher has been removed. Production deployments should run aigw supervise under an external supervisor such as systemd or Kubernetes.

Admin UI

Open the admin console at http://localhost:18080/admin after the runtime starts.

| Workspace | Primary jobs |
| --- | --- |
| Overview | Gateway health, time windows, runtime state, provider health |
| Monitoring | Traffic, latency, cost, model usage, and pricing visibility |
| Benchmark | Model capability comparison, scoring, and promotion signals |
| Ops | Runtime status, provider probes, audit log, diagnostics, replay |
| Config | YAML/JSON/visual config editing, publish history, revision diff |
| Logs | Request search, error filtering, and CSV export |

Screenshots

[Screenshots: Admin overview, Admin monitoring, Admin ops (mobile), Admin benchmark (mobile)]

Quick Start

Build

go build -o ./dist/aigw        ./cmd/aigw
go build -o ./dist/gatewayd    ./cmd/gatewayd
go build -o ./dist/controld    ./cmd/controld
go build -o ./dist/telemetryd  ./cmd/telemetryd
go build -o ./dist/gateway-cli ./cmd/gateway-cli

Generate and verify a release manifest before packaging:

./dist/aigw bundle build -root . -out aigw-manifest.json
./dist/aigw bundle verify -root . -manifest aigw-manifest.json

Configure

# PowerShell shown here; on Linux or macOS, use cp and export instead.
Copy-Item .\configs\config.example.yaml .\configs\config.yaml
$env:ADMIN_BOOTSTRAP_TOKEN = "<32+ chars>"
$env:COOKIE_SIGNING_KEY = "<32+ chars>"
$env:ADMIN_TOKEN = "<admin token>"
$env:VIEWER_TOKEN = "<viewer token>"

config.yaml is the operator authoring config. The daemon bootstrap files are separate:

  • configs/gatewayd.json
  • configs/controld.json
  • configs/telemetryd.json

Run

mkdir -p .gateway-runtime/telemetry .gateway-runtime/gateway .gateway-runtime/control
./dist/aigw supervise -runtime-root .gateway-runtime -config-dir configs -bin-dir ./dist

Verify

curl http://127.0.0.1:18080/-/health
curl http://127.0.0.1:18080/v1/models
curl http://127.0.0.1:18081/admin

Local development note: if another live service already owns 127.0.0.1:18080, stop it or move the development runtime to different ports before starting the three-plane runtime.

Point Local AI Tools At The Gateway

aigw clients can print environment snippets or update local tool config for Codex, Claude Code, and OpenClaw. See client integrations for generic OpenAI SDK, curl, and local tool setup notes.

./dist/aigw clients print -config-dir configs
./dist/aigw clients apply -config-dir configs -api-key "<API key>"
./dist/aigw clients apply -dry-run

CLI Examples

gateway-cli is the remote management CLI:

# Config
./dist/gateway-cli config show
./dist/gateway-cli config preview configs/config.yaml
./dist/gateway-cli config diff --file configs/config.yaml

# Runtime
./dist/gateway-cli runtime status
./dist/gateway-cli runtime preflight

# Audit and diagnostics
./dist/gateway-cli audit 50
./dist/gateway-cli diagnostics
./dist/gateway-cli secrets check

# Provider probes
./dist/gateway-cli probe model gpt-4 openai-demo
./dist/gateway-cli provider list
./dist/gateway-cli provider test openai

# Telemetry
./dist/gateway-cli telemetry events

# Publish management
./dist/gateway-cli publish history
./dist/gateway-cli publish rollback rev-001

Useful options:

  • -server url: control-plane URL, default http://127.0.0.1:18081
  • -token token: admin token, or use ADMIN_TOKEN
  • -format text|json|csv: output format

Tests

# Go tests
go test ./... -count=1

# Focused Go tests with coverage
go test ./internal/gateway/... -count=1 -cover

# Admin unit/component tests
npm --prefix web/admin test

# Admin production build
npm --prefix web/admin run build

# Admin Playwright audit
npm --prefix web/admin run test:e2e

Repository Layout

| Path | Purpose |
| --- | --- |
| cmd/aigw/ | Operations entry point |
| cmd/gatewayd/ | Data-plane daemon |
| cmd/controld/ | Control-plane daemon |
| cmd/telemetryd/ | Telemetry daemon |
| cmd/gateway-cli/ | Remote management CLI |
| internal/control/ | Control-plane API, compiler, publisher |
| internal/gateway/ | Snapshot runtime, API handlers, telemetry client |
| internal/telemetry/ | Event log, projections, query layer |
| internal/contracts/ | Cross-plane RPC and transport contracts |
| internal/infra/ | Shared auth, config loader, pricing, and infrastructure |
| internal/proxy/ | SSRF-safe proxy helpers |
| web/admin/ | Admin SPA built with Preact and Vite |
| configs/ | Example and bootstrap configuration |
| docs/ | Architecture, installation, deployment, and operations docs |
| scripts/ | Deployment and verification helpers |

Operations Constraints

  • Do not replace gatewayd, controld, or telemetryd independently during a normal upgrade. Ship one manifest-verified bundle.
  • Run individual daemons only for advanced debugging. Production paths should use aigw supervise.
  • Linux deployments can use deploy/aigw.service or aigw service print. Windows deployments should wrap aigw.exe supervise with NSSM or an equivalent service manager.

Documentation

See docs/ for architecture, installation, deployment, and operations guides.

License

MIT. See LICENSE.