An OpenAI- and Anthropic-compatible proxy server that routes your requests through multiple free-tier AI providers. If one provider is down or rate-limited, the proxy automatically tries the next one -- so your application keeps working without any code changes.
You configure a list of AI providers (Google Gemini, Qwen, OpenRouter, xAI Grok, Anthropic, or any OpenAI-compatible service), assign priorities, and the proxy handles the rest. Your app talks to one local endpoint, and ai-free-swap finds a working provider behind the scenes. Both OpenAI SDK and Anthropic SDK clients work out of the box.
Deploy your own instance directly from this repo -- no fork needed -- using the one-click deploy buttons for Render and Railway. Both platforms auto-generate a `SERVER_API_KEY` to protect your instance. Add at least one provider API key (e.g., `GEMINI_API_KEY`) and you're live.

See the Cloud Hosting Guide for more options (Fly.io, VPS, Oracle Cloud free tier).
- [How It Works](#how-it-works)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Supported Providers](#supported-providers)
- [Using with Applications](#using-with-applications)
- [Command-Line Options](#command-line-options)
- [Securing the Proxy](#securing-the-proxy)
- [Troubleshooting](#troubleshooting)
- [Running Tests](#running-tests)
## How It Works

```
Your App                ai-free-swap                      Providers
    |                         |                               |
    |-- POST /v1/chat/... --->|                               |
    |-- POST /v1/messages --->|                               |
    |                         |-- try Gemini (priority 1) --->|
    |                         |<-- error / rate limited ------|
    |                         |                               |
    |                         |-- try Qwen (priority 2) ----->|
    |                         |<-- success -------------------|
    |                         |                               |
    |<-- response ------------|                               |
```
- Your application sends a request to the proxy -- either OpenAI format (`/v1/chat/completions`) or Anthropic format (`/v1/messages`).
- The proxy tries providers in priority order (lowest number = highest priority).
- Within the same priority level, providers are tried in random order.
- If a provider fails, the proxy automatically tries the next one (see the sketch after this list).
- The response is returned in the same format the client used -- your app doesn't need to know which provider actually handled the request.
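
The failover logic amounts to a nested loop: cycle up to `keep_cycles` times over the priority groups in ascending order, shuffle the backends within each group, and return the first successful response. A minimal sketch, not the actual implementation (`call_backend` is a hypothetical stand-in for whatever forwards the request upstream):

```python
import random
from typing import Any, Callable

def route_request(priority_groups: list[tuple[int, list[Any]]],
                  call_backend: Callable[[Any], Any],
                  keep_cycles: int = 1) -> Any:
    """Try backends in ascending priority order, shuffled within each group."""
    for _ in range(keep_cycles):
        for _, backends in sorted(priority_groups, key=lambda g: g[0]):
            candidates = list(backends)
            random.shuffle(candidates)           # spread load across accounts/keys
            for backend in candidates:
                try:
                    return call_backend(backend)  # first success wins
                except Exception:
                    continue                      # failed -> try next backend
    raise RuntimeError("All configured providers failed")  # surfaced as HTTP 503
```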
## Quick Start

### 1. Install

```bash
pip install .
```

### 2. Get API keys

Sign up for free API keys from one or more providers:
| Provider | Sign Up |
|---|---|
| Google Gemini | https://aistudio.google.com/apikey |
| Alibaba Qwen | https://modelstudio.console.alibabacloud.com (International)<br>https://dashscope.console.aliyun.com/ (Chinese) |
| OpenRouter | https://openrouter.ai/keys |
| xAI Grok | https://console.x.ai/ |
| Anthropic | https://platform.claude.com/settings/workspaces/default/keys/ |
| Groq | https://console.groq.com/keys |
| Mistral | https://admin.mistral.ai/organization/api-keys |
| Cloudflare AI | https://dash.cloudflare.com/?to=/:account/ai/workers-ai |
| Hugging Face | https://huggingface.co/settings/tokens |
| Cohere | https://dashboard.cohere.com/welcome/ |
| Deepseek | https://platform.deepseek.com/api_keys |
| Z.ai | https://z.ai/manage-apikey/apikey-list |
| Moonshot (Kimi) | https://platform.kimi.ai/console/api-keys |
| SiliconFlow | https://cloud.siliconflow.com/account/ak |
| MiniMax | https://platform.minimax.io/user-center/payment/token-plan |
These are just examples -- any service with an OpenAI-compatible API works (DeepSeek, GLM, Groq, Together, local Ollama, etc.). See Custom OpenAI-Compatible Providers.
### 3. Configure

```bash
cp config.yaml.example config.yaml
```

Open `config.yaml` and add your API keys. At minimum, you need one provider:

```yaml
keep_cycles: 1
model_name: "aifree"

server:
  host: "0.0.0.0"
  port: 8000

providers:
  - priority: 1
    backends:
      - provider: gemini
        api_key: "your-gemini-api-key-here"
        model: "gemini-2.5-flash"
```

### 4. Run

```bash
ai-free-swap --config config.yaml
```

The server starts at http://localhost:8000.

### 5. Test
```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "aifree",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

## Installation

Requires Python 3.11 or later.
### From source

```bash
pip install .
# Or in development mode
pip install -e .
```

After installation, the `ai-free-swap` command is available system-wide.
### Docker

```bash
docker build -t ai-free-swap .
docker run -p 8000:8000 \
  -v /path/to/your/config.yaml:/app/config.yaml \
  -e GEMINI_API_KEY="your-key" \
  ai-free-swap
```

### Without installing

```bash
pip install -r requirements.txt
python -m ai_free_swap --config config.yaml
```

## Configuration

The proxy is configured with a YAML file. Copy the example to get started:
```bash
cp config.yaml.example config.yaml
```

### Global settings

| Setting | Default | Description |
|---|---|---|
| `keep_cycles` | `1` | How many times to cycle through all providers before giving up. Set to 2 or 3 if providers have intermittent failures. |
| `model_name` | `"aifree"` | The model name shown in `/v1/models`. Clients can use this name or any backend model name directly. |
| `show_provider` | `true` | When `true`, responses include a `provider_name` field showing which provider handled the request. Set to `false` to hide this. |
| `model_routing` | `"any"` | How to handle the model name from client requests. `"any"` (default) ignores the client model and uses all providers in priority order -- best for proxy use cases where clients send arbitrary model names. `"match"` routes to backends whose configured model matches the request, falling back to all providers if no match is found -- useful when you configure multiple distinct models and want clients to choose. |
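
The `model_routing` distinction is easiest to see in code. A simplified sketch of the two modes, assuming matching compares the request's model name against each backend's configured `model` (this is not the proxy's actual source):

```python
def select_backends(requested_model: str, backends: list[dict],
                    model_routing: str = "any") -> list[dict]:
    """Pick candidate backends for a request based on the model_routing setting."""
    if model_routing == "any":
        return backends                  # ignore the client's model name entirely
    # "match" mode: prefer backends configured with the requested model...
    matching = [b for b in backends if b.get("model") == requested_model]
    # ...falling back to all providers if nothing matches
    return matching or backends
```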
### Server settings

```yaml
server:
  host: "0.0.0.0"   # Listen address ("0.0.0.0" = all interfaces, "127.0.0.1" = local only)
  port: 8000        # Listen port
  api_key: ""       # Optional: require this key from clients (see "Securing the Proxy")
```

### Providers

Providers are organized into priority groups. Each group has a priority number and a list of backends:
```yaml
providers:
  - priority: 1                     # Tried first
    backends:
      - provider: gemini            # Provider type (or any label if base_url is set)
        api_key: "..."              # API key for this provider
        model: "gemini-2.5-flash"   # Model to use
  - priority: 2                     # Tried if all priority 1 backends fail
    backends:
      - provider: qwen
        api_key: "..."
        model: "qwen-flash"
```

Each backend supports these fields:
| Field | Required | Description |
|---|---|---|
| `provider` | Yes | Provider type (see Supported Providers) or any name you choose if `base_url` is set. |
| `api_key` | Yes | API key. Supports `${ENV_VAR}` syntax. |
| `model` | Yes | Model identifier to use with this provider. |
| `name` | No | Friendly name for this backend. Shown in logs and in the `provider_name` response field. |
| `base_url` | No | Override the provider's API URL. Required for custom/self-hosted providers. |
| `extra` | No | Provider-specific options (e.g., `timeout`, `default_max_tokens`). |
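
For instance, a backend with per-provider options might look like the following. The `timeout` and `default_max_tokens` keys are the examples named in the table above; treat the exact option names and units as provider-specific (the seconds unit here is an assumption):

```yaml
backends:
  - provider: gemini
    api_key: "${GEMINI_API_KEY}"
    model: "gemini-2.5-flash"
    extra:
      timeout: 30               # request timeout (assumed: seconds)
      default_max_tokens: 1024  # applied when the client omits max_tokens
```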
### Environment variables

Instead of putting API keys directly in the config file, you can reference environment variables with `${VAR_NAME}` syntax:
```yaml
backends:
  - provider: gemini
    api_key: "${GEMINI_API_KEY}"
    model: "gemini-2.5-flash"
```

The variable names are up to you -- use whatever makes sense:
```yaml
backends:
  - provider: deepseek
    api_key: "${MY_DEEPSEEK_KEY}"
    model: "deepseek-chat"
    base_url: "https://api.deepseek.com/v1"
```

Then set the variables before starting:
```bash
export GEMINI_API_KEY="your-actual-key"
export MY_DEEPSEEK_KEY="your-deepseek-key"
ai-free-swap --config config.yaml
```
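
The substitution itself is simple; conceptually it is one regex pass over each string in the loaded config. A sketch of the idea (assuming, as one plausible choice, that unset variables are left untouched):

```python
import os
import re

def expand_env_vars(value: str) -> str:
    """Replace ${VAR_NAME} references with values from the environment."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: os.environ.get(m.group(1), m.group(0)),  # keep literal if unset
        value,
    )

# expand_env_vars("${GEMINI_API_KEY}") -> the key exported above
```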
### Priorities

The priority number determines the order providers are tried:

- Lower numbers are tried first (priority 1 before priority 2).
- Within the same priority, backends are tried in random order -- this distributes load across multiple accounts or keys.
- If all backends in a priority group fail, the proxy moves to the next group.
- After all groups are exhausted, the cycle repeats up to `keep_cycles` times.
Example: distribute load across three Gemini keys, fall back to Qwen:
```yaml
keep_cycles: 2   # Try everything twice before giving up

providers:
  - priority: 1
    backends:
      - provider: gemini
        name: "gemini-key-1"
        api_key: "${GEMINI_KEY_1}"
        model: "gemini-2.5-flash"
      - provider: gemini
        name: "gemini-key-2"
        api_key: "${GEMINI_KEY_2}"
        model: "gemini-2.5-flash"
      - provider: gemini
        name: "gemini-key-3"
        api_key: "${GEMINI_KEY_3}"
        model: "gemini-2.5-flash"
  - priority: 2
    backends:
      - provider: qwen
        api_key: "${QWEN_KEY}"
        model: "qwen-flash"
```

### Custom OpenAI-Compatible Providers

Any service with an OpenAI-compatible API works. Set `base_url` and use whatever name you want for `provider` -- the name is just a label for logs:
```yaml
providers:
  - priority: 1
    backends:
      # DeepSeek
      - provider: deepseek
        api_key: "${DEEPSEEK_KEY}"
        model: "deepseek-chat"
        base_url: "https://api.deepseek.com/v1"

      # GLM (Zhipu AI)
      - provider: glm
        api_key: "${GLM_KEY}"
        model: "glm-4-flash"
        base_url: "https://open.bigmodel.cn/api/paas/v4"

      # Groq
      - provider: groq
        api_key: "${GROQ_KEY}"
        model: "llama-3.3-70b-versatile"
        base_url: "https://api.groq.com/openai/v1"

      # Together AI
      - provider: together
        api_key: "${TOGETHER_KEY}"
        model: "meta-llama/Llama-3.3-70B-Instruct-Turbo"
        base_url: "https://api.together.xyz/v1"

      # Local Ollama
      - provider: ollama
        api_key: "unused"
        model: "llama3.2"
        base_url: "http://localhost:11434/v1"

      # LM Studio
      - provider: lmstudio
        api_key: "unused"
        model: "local-model"
        base_url: "http://localhost:1234/v1"
```

### Full example

A complete configuration combining several providers and fallback tiers:

```yaml
keep_cycles: 1
model_name: "aifree"
show_provider: true
model_routing: "any"   # "any" = ignore client model, "match" = route by model name

server:
  host: "0.0.0.0"
  port: 8000
  api_key: ""

providers:
  - priority: 1
    backends:
      - provider: gemini
        name: "gemini-flash-1"
        api_key: "${GEMINI_API_KEY_1}"
        model: "gemini-2.5-flash"
      - provider: gemini
        name: "gemini-flash-2"
        api_key: "${GEMINI_API_KEY_2}"
        model: "gemini-2.5-flash"
  - priority: 2
    backends:
      - provider: qwen
        api_key: "${DASHSCOPE_API_KEY}"
        model: "qwen-flash"
  - priority: 3
    backends:
      - provider: openrouter
        api_key: "${OPENROUTER_API_KEY}"
        model: "google/gemini-2.5-flash:free"
      - provider: openrouter
        api_key: "${OPENROUTER_API_KEY}"
        model: "meta-llama/llama-4-scout:free"
  - priority: 4
    backends:
      - provider: grok
        api_key: "${XAI_API_KEY}"
        model: "grok-3-mini"
  - priority: 5
    backends:
      - provider: anthropic
        api_key: "${ANTHROPIC_API_KEY}"
        model: "claude-sonnet-4-6"
```

## Supported Providers

These providers have built-in base URLs and work with just an API key:
| Provider | `provider` value | Models (examples) |
|---|---|---|
| Google Gemini | `gemini` | `gemini-2.5-flash`, `gemini-2.5-flash-lite` |
| Alibaba Qwen | `qwen` | `qwen-flash` |
| OpenRouter | `openrouter` | `google/gemini-2.5-flash:free`, `meta-llama/llama-4-scout:free` |
| xAI Grok | `grok` | `grok-3-mini` |
| OpenAI | `openai` | `gpt-4o-mini`, `gpt-4o` |
| Anthropic | `anthropic` | `claude-sonnet-4-6`, `claude-haiku-4-5` |
Any other OpenAI-compatible service works too -- just set `base_url`.
See Custom OpenAI-Compatible Providers
for examples with DeepSeek, GLM, Groq, Together, Ollama, LM Studio, and more.
## Using with Applications

### OpenAI SDK (Python)

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="unused",  # or your proxy api_key if set
)

response = client.chat.completions.create(
    model="aifree",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

With streaming:
```python
stream = client.chat.completions.create(
    model="aifree",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### Anthropic SDK (Python)

```python
from anthropic import Anthropic
client = Anthropic(
    base_url="http://localhost:8000",
    api_key="unused",  # or your proxy api_key if set
)

response = client.messages.create(
    model="aifree",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.content[0].text)
```

With streaming:
```python
with client.messages.stream(
    model="aifree",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")
```

**Using the `ANTHROPIC_BASE_URL` environment variable:**
Many tools that use the Anthropic SDK (Claude Code, aider, etc.) read the `ANTHROPIC_BASE_URL` environment variable. Set it to point at your proxy:

```bash
export ANTHROPIC_BASE_URL="http://localhost:8000"
export ANTHROPIC_API_KEY="your-proxy-key"  # or any non-empty string if proxy has no api_key

# Now any tool using the Anthropic SDK will go through your proxy
claude  # Claude Code
aider   # aider with --model claude-sonnet-4-6
```

### curl

```bash
# Non-streaming (OpenAI format)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "aifree", "messages": [{"role": "user", "content": "Hi"}]}'

# Streaming (OpenAI format)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "aifree", "messages": [{"role": "user", "content": "Hi"}], "stream": true}'

# Non-streaming (Anthropic format)
curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-proxy-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model": "aifree", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hi"}]}'

# With proxy authentication (OpenAI style)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-proxy-key" \
  -d '{"model": "aifree", "messages": [{"role": "user", "content": "Hi"}]}'
```

### OpenAI-compatible tools

ai-free-swap works as a drop-in replacement for the OpenAI API. In most tools, you just need to change two settings:
- **API Base URL:** `http://localhost:8000/v1`
- **Model:** `aifree` (or any backend model name you configured)
This works with tools like aider, cline, open-hands, continue, LangChain, LlamaIndex, Open Interpreter, and any other application that supports custom OpenAI endpoints.
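
As a concrete example, here is how the two settings map onto LangChain's OpenAI chat client (a sketch; assumes the `langchain-openai` package and a proxy running locally with no `api_key` set):

```python
from langchain_openai import ChatOpenAI

# Point LangChain at the proxy instead of api.openai.com
llm = ChatOpenAI(
    model="aifree",                       # the proxy's model name
    base_url="http://localhost:8000/v1",  # the proxy endpoint
    api_key="unused",                     # or your proxy api_key if set
)

print(llm.invoke("Hello!").content)
```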
### Anthropic-compatible tools

ai-free-swap also works as a drop-in replacement for the Anthropic Messages API. Set the base URL to your proxy:
- **`ANTHROPIC_BASE_URL`:** `http://localhost:8000`
- **`ANTHROPIC_API_KEY`:** your proxy `api_key` (or any non-empty string)
This works with Claude Code, aider (Anthropic mode), and any other tool that uses the Anthropic SDK with a configurable base URL.
## Command-Line Options

```
ai-free-swap [options]

Options:
  --config, -c PATH    Path to config file (default: config.yaml)
  --host HOST          Override the host from config
  --port PORT          Override the port from config
  --log-level LEVEL    Set log verbosity: debug, info, warning, error
                       (default: info)
```

Examples:

```bash
# Use a custom config and port
ai-free-swap -c my-config.yaml --port 9000

# Enable debug logging to see provider routing decisions
ai-free-swap --log-level debug

# Run as a Python module
python -m ai_free_swap --config config.yaml
```

## Securing the Proxy

If the proxy is accessible on a network (not just localhost), set an API key:
```yaml
server:
  api_key: "your-secret-proxy-key"
```

Clients must then include this key in requests. Both authentication methods are supported:
```bash
# OpenAI-style: Authorization header
curl http://your-server:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "aifree", "messages": [{"role": "user", "content": "Hi"}]}'

# Anthropic-style: x-api-key header
curl http://your-server:8000/v1/messages \
  -H "x-api-key: your-secret-proxy-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "aifree", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hi"}]}'
```

The `/health` endpoint is always public (no key required).
"Error loading config" -- Check that config.yaml exists and is valid
YAML. If using ${ENV_VAR} syntax, make sure the environment variables are set.
"All configured providers failed" (503) -- Check that your API keys are
valid. Run with --log-level debug to see which providers were tried and why
each failed. Increase keep_cycles to retry more times.
"Model 'xyz' is not configured" (400) -- The model name in your request
doesn't match any configured backend. Use "aifree" to use any available
provider, or check your config for the exact model names.
Server not reachable -- Check the port isn't already in use. In Docker,
make sure you used -p 8000:8000. If host is 127.0.0.1, the server only
accepts local connections -- change to 0.0.0.0.
More documentation:

- API Reference -- full endpoint documentation, request/response formats, streaming protocol, error codes
- Cloud Hosting Guide -- deploy to Render, Railway, Fly.io, or any VPS with step-by-step instructions
## Running Tests

```bash
# Install test dependencies
pip install -e ".[test]"

# Run all tests
python -m pytest tests/ -v

# Run a specific test file
python -m pytest tests/test_router.py -v
```