Local Ollama Cloud proxy for Claude Code and Codex. It adapts GLM-5.2 for
max reasoning, 1M-class context aliases, Anthropic-compatible requests, and
OpenAI Responses API traffic.
The adapter is intentionally small: it forwards OpenAI-compatible and Anthropic-compatible HTTP traffic to a local Ollama server, rewriting only the metadata that these clients cannot express correctly.
- Claude Code can select `glm-5.2:cloud[1m]` so its local context budget is 1M-class. The adapter forwards the valid Ollama model name `glm-5.2:cloud`.
- Claude Code can use a cheaper Haiku-class model such as `glm-4.7:cloud` for lightweight tasks while keeping Sonnet/Opus on GLM-5.2.
- Codex can expose `xhigh` as the user-selectable top reasoning level. The adapter forwards the Ollama GLM-compatible `max`.
- Normal `high`, `medium`, `low`, and `none` reasoning levels are left alone.
Claude Code ─┐
             ├─ http://127.0.0.1:11435 ── http://127.0.0.1:11434 ── Ollama Cloud
Codex ───────┘
Default rewrites:
- `model: "glm-5.2"` -> `model: "glm-5.2:cloud"`
- `model: "glm-5.2:cloud[1m]"` -> `model: "glm-5.2:cloud"`
- `reasoning.effort: "xhigh"` -> `reasoning.effort: "max"` on Responses API
- `reasoning_effort: "xhigh"` -> `reasoning_effort: "max"` on Chat Completions
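The default rewrites listed above can be sketched in Go roughly as follows. This is a minimal illustration, not the code in `internal/proxy`: the names `modelAliases`, `reasoningMap`, and `rewritePayload` are hypothetical, and the tables are hard-coded here for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical tables mirroring the documented default rewrites.
var modelAliases = map[string]string{
	"glm-5.2":           "glm-5.2:cloud",
	"glm-5.2:cloud[1m]": "glm-5.2:cloud",
}

var reasoningMap = map[string]string{"xhigh": "max"}

// rewritePayload applies the alias and reasoning maps to a JSON request body.
// It returns the re-encoded body and the names of the mutations it applied.
func rewritePayload(body []byte) ([]byte, []string, error) {
	var payload map[string]any
	if err := json.Unmarshal(body, &payload); err != nil {
		return nil, nil, err
	}
	var mutations []string

	// Model alias: only exact matches in the table are rewritten.
	if model, ok := payload["model"].(string); ok {
		if target, found := modelAliases[model]; found {
			payload["model"] = target
			mutations = append(mutations, "rewrite_model")
		}
	}

	// Responses API shape: nested reasoning.effort.
	if reasoning, ok := payload["reasoning"].(map[string]any); ok {
		if effort, ok := reasoning["effort"].(string); ok {
			if mapped, found := reasoningMap[effort]; found {
				reasoning["effort"] = mapped
				mutations = append(mutations, "rewrite_reasoning")
			}
		}
	}

	// Chat Completions shape: top-level reasoning_effort.
	if effort, ok := payload["reasoning_effort"].(string); ok {
		if mapped, found := reasoningMap[effort]; found {
			payload["reasoning_effort"] = mapped
			mutations = append(mutations, "rewrite_reasoning")
		}
	}

	out, err := json.Marshal(payload)
	return out, mutations, err
}

func main() {
	in := []byte(`{"model":"glm-5.2:cloud[1m]","reasoning":{"effort":"xhigh"}}`)
	out, mutations, err := rewritePayload(in)
	if err != nil {
		panic(err)
	}
	// Prints: {"model":"glm-5.2:cloud","reasoning":{"effort":"max"}} [rewrite_model rewrite_reasoning]
	fmt.Println(string(out), mutations)
}
```

Levels such as `high` or `low` are absent from the map, so they pass through unchanged. Note that decoding into a generic map re-encodes the whole body (keys come out sorted and numbers pass through `float64`), so a real proxy may prefer a narrower edit.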
The service logs method, path, status, and mutation names only. It does not log prompts or response bodies.
- The default service binds to `127.0.0.1`; do not expose it publicly without a separate authentication and network security plan.
- Do not commit API keys, provider tokens, `.env` files, local Claude/Codex credential files, prompt logs, response logs, or session transcripts.
- `.gitignore` excludes common local secrets, logs, build output, and temporary files.
- The optional pre-commit setup includes staged `gitleaks` scanning.
- `cmd/code-ollama-adapter`: Go binary entrypoint.
- `internal/proxy`: request forwarding and payload rewrite logic.
- `systemd/code-ollama-adapter.service`: Linux systemd unit template.
- `launchd/ai.openclaw.code-ollama-adapter.plist`: macOS launchd template.
- `scripts/install.sh`: build and install for Linux/macOS.
- `scripts/uninstall.sh`: stop and remove service metadata.
- `config/codex-ollama-cloud.config.toml`: Codex profile example.
- `.agents/skills/configure-code-ollama-adapter`: Agent workflow for safely configuring and verifying this local setup.
- `.agents/skills/setup-codex-claude-ollama`: Agent workflow for helping users configure Codex and Claude Code from scratch.
This repo includes a small Codex skill for agent-assisted setup:
setup-codex-claude-ollama.
Ask an agent to use $setup-codex-claude-ollama when you want it to configure
Codex and Claude Code on a machine, preserve existing settings, back up config
files, set Codex xhigh, set Claude's 1M-class context alias, start the
adapter, and verify both clients.
Use $configure-code-ollama-adapter for narrower project maintenance and
repair after the adapter is already installed. Use the README for human
maintenance.
git clone git@github.com:hxy91819/code-ollama-adapter.git
cd code-ollama-adapter
go test ./...
go build -o bin/code-ollama-adapter ./cmd/code-ollama-adapter

CI runs on Ubuntu and macOS and checks:
- `gofmt`
- `go test ./...`
- binary build
- install/uninstall script help output
Optional local hygiene:
pre-commit install
pre-commit run --all-files

The pre-commit setup expects gitleaks and golangci-lint to be installed. It
runs staged secret scanning, basic file hygiene checks, gofmt, go vet,
go test, and a small Go lint set.
~/code-ollama-adapter/bin/code-ollama-adapter \
--host 127.0.0.1 \
--port 11435 \
--upstream http://127.0.0.1:11434 \
--model-alias glm-5.2 \
--model-alias 'glm-5.2:cloud[1m]' \
--model-target glm-5.2:cloud \
--reasoning-map xhigh=max \
--default-reasoning-effort max

Health check:
curl http://127.0.0.1:11435/health

Linux systemd:
cd ~/code-ollama-adapter
scripts/install.sh
systemctl status code-ollama-adapter.service

The installer builds bin/code-ollama-adapter, installs a root-owned executable
under /usr/local/lib/code-ollama-adapter/<service-name>/, and writes a systemd
unit that executes that installed binary. The service does not execute a
mutable checkout binary, and distinct service names get distinct binary paths.
macOS launchd:
cd ~/code-ollama-adapter
scripts/install.sh
launchctl print "gui/$(id -u)/ai.openclaw.code-ollama-adapter"

When run as root on macOS, the installer writes to
/Library/LaunchDaemons; otherwise it writes to ~/Library/LaunchAgents.
The launchd plist is generated with the installed binary path.
Linux:
cd ~/code-ollama-adapter
scripts/uninstall.sh

macOS:
cd ~/code-ollama-adapter
scripts/uninstall.sh

Uninstall removes service metadata only. It does not delete the project tree or the installed binary.
codexo uses:
codex --profile ollama-cloud --dangerously-bypass-approvals-and-sandbox

The profile should point the Ollama Cloud provider at the adapter:
model = "glm-5.2"
model_provider = "ollama_cloud"
model_reasoning_effort = "xhigh"
model_catalog_json = "/absolute/path/to/ollama-cloud-glm-5.2.json"
[model_providers.ollama_cloud]
name = "Ollama Cloud via Code Ollama Adapter"
base_url = "http://127.0.0.1:11435/v1"
wire_api = "responses"

The model catalog should include xhigh in supported_reasoning_levels. This
setup defaults Codex to xhigh, and the service also injects max when Codex
omits reasoning from the wire request. Use high explicitly when you want
normal high reasoning, and xhigh for Ollama GLM max:
codex exec --profile ollama-cloud \
-c model_reasoning_effort='"xhigh"' \
--dangerously-bypass-approvals-and-sandbox \
'Reply exactly OK.'

Claude Code should use the adapter as its Anthropic-compatible base URL and a
display model with the [1m] suffix for Sonnet/Opus:
ANTHROPIC_BASE_URL=http://127.0.0.1:11435
ANTHROPIC_DEFAULT_SONNET_MODEL='glm-5.2:cloud[1m]'
ANTHROPIC_DEFAULT_HAIKU_MODEL='glm-4.7:cloud'
ANTHROPIC_DEFAULT_OPUS_MODEL='glm-5.2:cloud[1m]'

The adapter strips [1m] before forwarding to Ollama. Claude Code still budgets
Sonnet/Opus sessions as 1M-class models. glm-4.7:cloud is intentionally left
without [1m]; it is a cheaper Haiku-class default with a smaller context
window.
Request-level smoke:
curl -sS http://127.0.0.1:11435/v1/responses \
-H 'content-type: application/json' \
-d '{"model":"glm-5.2","input":"Reply exactly OK.","max_output_tokens":16,"reasoning":{"effort":"xhigh"}}' \
| jq '{model, reasoning}'

Expected result includes:
{"reasoning":{"effort":"max"}}

Client smoke:
claude -p /context --model 'glm-5.2:cloud[1m]'
claude -p 'Reply exactly OK.' --model 'glm-4.7:cloud'
codex exec --profile ollama-cloud \
-c model_reasoning_effort='"xhigh"' \
--dangerously-bypass-approvals-and-sandbox \
'Reply exactly OK.'

Expected log markers:
- `rewrite_model`: a client alias was mapped to `glm-5.2:cloud`.
- `rewrite_reasoning`: `xhigh` was mapped to `max`.
- Keep rewrite rules explicit and narrow.
- Add a test in `internal/proxy/transform_test.go` for every new rewrite rule.
- Keep the service listening on `127.0.0.1` unless there is a concrete need to expose it.
- Do not log request or response bodies.
- After editing behavior, run `go test ./...`, reinstall/restart the service, and run at least one live smoke through Codex or Claude Code.
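A per-rule test can stay small if it is table-driven, covering both the value that should change and the values that must pass through. The sketch below shows the shape under stated assumptions: `mapEffort` is a hypothetical stand-in for one rewrite rule, and the checks run from `main` so the example is standalone. In the repository the same table would more naturally live in a `testing`-package test.

```go
package main

import "fmt"

// mapEffort is a hypothetical stand-in for a single rewrite rule:
// only "xhigh" is changed; every other level passes through untouched.
func mapEffort(effort string) (string, bool) {
	if effort == "xhigh" {
		return "max", true
	}
	return effort, false
}

// effortCase pairs an input level with the expected output and
// whether a rewrite should have been reported.
type effortCase struct {
	in      string
	want    string
	changed bool
}

// failedCases runs every case and returns a description of each mismatch.
func failedCases(cases []effortCase) []string {
	var failures []string
	for _, c := range cases {
		got, changed := mapEffort(c.in)
		if got != c.want || changed != c.changed {
			failures = append(failures,
				fmt.Sprintf("%q: got (%q, %v), want (%q, %v)", c.in, got, changed, c.want, c.changed))
		}
	}
	return failures
}

func main() {
	cases := []effortCase{
		{in: "xhigh", want: "max", changed: true},
		{in: "high", want: "high", changed: false},
		{in: "medium", want: "medium", changed: false},
		{in: "low", want: "low", changed: false},
		{in: "none", want: "none", changed: false},
	}
	for _, f := range failedCases(cases) {
		fmt.Println("FAIL", f)
	}
	fmt.Println("cases:", len(cases), "failures:", len(failedCases(cases)))
}
```

Listing the pass-through levels explicitly helps keep a new rule narrow: widening the rule by accident then fails a case instead of going unnoticed.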
Earlier prototypes used these names:
- project directory: `ollama-cloud-code-proxy`
- service: `ollama-cloud-code-proxy.service`
- binary: `ollama-cloud-code-proxy`
- older service: `claude-ollama-proxy.service`
The current canonical name is code-ollama-adapter. Keep only one service
active on port 11435.