Skip to content

Duy-Nguyen-2006/GLM-API

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

GLM-API

Z.ai to OpenAI-Compatible API Proxy with Tool Calling Support.

Convert Z.ai's free web chat into a fully OpenAI-compatible API endpoint, enabling coding agents like Cline, Roo-Code, and OpenCode to use GLM models with function/tool calling.

Features

  • OpenAI-Compatible API - Drop-in replacement for OpenAI's /v1/chat/completions
  • Tool Calling Support - Emulated function calling via XML injection + multi-format parser
  • CAPTCHA Bypass - Playwright-based browser automation bypasses Aliyun invisible CAPTCHA
  • Streaming Support - SSE streaming responses (chunked)
  • Thinking Models - Supports reasoning_content for GLM thinking models
  • Web Dashboard - Monitor status and test API from browser
  • Multiple Models - GLM-4-Flash, GLM-4.5, GLM-5, GLM-5.1, and more

Quick Start

Prerequisites

  • Node.js 18+
  • A Z.ai account (free at chat.z.ai)

Setup

  1. Clone and install:
git clone https://github.com/toan9tranphu/GLM-API.git
cd GLM-API
npm install
npx playwright install chromium
  1. Get your Z.ai JWT token:

    • Log in to chat.z.ai
    • Open browser DevTools → Application → Cookies → token
    • Copy the JWT value
  2. Start the server:

ZAI_TOKEN=your_jwt_token_here API_KEY=sk-your-key node server.js

Configure Cline

In Cline settings:

  • API Provider: OpenAI Compatible
  • Base URL: http://localhost:8000/v1
  • API Key: sk-your-key (same as API_KEY env var)
  • Model: glm-4-flash (or any supported model)

API Endpoints

Method Endpoint Description
GET /health Health check (no auth)
GET /v1/models List available models
POST /v1/chat/completions Chat completion (OpenAI format)
GET / Web dashboard

Supported Models

  • glm-4-flash (fastest, free)
  • glm-4.5, glm-4.5-air
  • glm-4.6, glm-4.6-v
  • glm-4.7
  • glm-5, glm-5-turbo
  • glm-5.1

Add -thinking suffix for reasoning models (e.g., glm-5.1-thinking).

Tool Calling

GLM-API uses emulated tool calling: tool definitions are injected into the system prompt as XML, and the model's XML response is parsed into OpenAI tool_calls format. This works reliably with GLM models.

Example Request

curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4-flash",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
          "type": "object",
          "properties": {"location": {"type": "string"}},
          "required": ["location"]
        }
      }
    }],
    "stream": false
  }'

Example Response

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\":\"Tokyo\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}

Environment Variables

Variable Default Description
ZAI_TOKEN (required) Z.ai JWT token cookie
API_KEY sk-glm-api API key for authentication
PORT 8000 Server port
HEADLESS true Run browser headless
TIMEOUT 120000 Request timeout (ms)

Architecture

Cline/Roo-Code → GLM-API (Express) → Playwright → chat.z.ai
                      ↓
              Parse SSE Response
                      ↓
              Extract Tool Calls
                      ↓
          OpenAI-Compatible Response
  • Browser: Playwright controls a real Chromium browser that bypasses Z.ai's CAPTCHA
  • SSE Capture: Intercepts chat.z.ai/api/v2/chat/completions responses
  • Parser: Multi-format tool call parser (XML, JSON, inline, natural language)
  • Adapter: Converts Z.ai's custom SSE format to OpenAI format

Limitations

  • Sequential only: One request at a time (browser is single-session)
  • ~10-30s latency: Depends on model and response length
  • Token expiry: JWT tokens expire; re-login needed periodically
  • Emulated tool calling: Not native; depends on model following XML format

License

MIT

About

Z.ai to OpenAI-Compatible API Proxy with Tool Calling Support

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors