Agile-V/local-llm-proxy


Local LLM Proxy for Agile-V Studio

A local HTTPS-to-HTTP proxy that lets you use local or network LLM backends with Agile-V Studio without running into mixed-content browser restrictions.

Browser (HTTPS)  --->  Proxy (HTTPS:8443)  --->  LLM Backend (HTTP)
     |                      |                         |
  studio.agile-v.org    192.168.x.x:8443       e.g. Ollama :11434
                                               or LLM Proxy :3015

The Problem

Agile-V Studio runs over HTTPS. When you configure a local LLM (like Ollama) or a network LLM proxy as your backend, the browser blocks the request because it's HTTP. This is a mixed-content violation that can't be bypassed with CSP headers.

This proxy sits on your local network, terminates TLS with a self-signed certificate, and forwards requests to your HTTP backend. The browser sees an HTTPS connection and allows it.

Features

  • Zero dependencies: runs on Node.js built-ins only (https, http, fs)
  • Works with any LLM backend: Ollama, llama.cpp, LocalAI, LM Studio, vLLM, LiteLLM, or any OpenAI/Anthropic-compatible API
  • Token tracking: built-in dashboard and JSON API for monitoring usage
  • CORS handling: supports both OpenAI and Anthropic request headers
  • Streaming: pipes responses through without buffering (SSE/chunked)
  • Persistent stats: token counts survive restarts

Quick Start

1. Copy and edit the config

cp config.example.json config.json

Edit config.json to point at your LLM backend:

{
  "proxy": {
    "port": 8443,
    "allowedOrigins": "*"
  },
  "target": {
    "host": "127.0.0.1",
    "port": 11434
  }
}

See Configuration for all options.

2. Generate a certificate (one-time)

./generate-cert.sh

3. Start the proxy

Node.js:

node proxy.js

Docker:

docker compose up -d

4. Verify it works

# Ollama backend (native API, lists installed models)
curl -sk https://localhost:8443/api/tags

# Anthropic-compatible backend
curl -sk https://localhost:8443/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-api-key: dummy" \
  -d '{"model":"claude-haiku-4-5-20251001","max_tokens":20,"messages":[{"role":"user","content":"Hi"}]}'

Trust the Certificate

The certificate is self-signed. For browser fetch() requests to work (which is how Agile-V Studio's browser proxy calls your LLM), the certificate must be trusted at the system or browser level.

Simply clicking through a browser warning is not sufficient: fetch() will still reject untrusted certificates.

macOS (recommended)

sudo security add-trusted-cert -d -r trustRoot \
  -k /Library/Keychains/System.keychain certs/cert.pem

Works for Chrome and Safari. Firefox uses its own certificate store (see below).

Firefox

Firefox ignores the system keychain. Two options:

Option 1: Open about:config in Firefox and set security.enterprise_roots.enabled to true. Firefox will then use the system keychain.

Option 2: Manually import under Settings > Privacy & Security > Certificates > View Certificates > Import > select certs/cert.pem > check "Trust this CA".

Linux (Debian/Ubuntu)

sudo cp certs/cert.pem /usr/local/share/ca-certificates/llm-proxy.crt
sudo update-ca-certificates

Windows

certutil -addstore "Root" certs\cert.pem

Token Tracking

The proxy automatically extracts token usage from LLM responses. Works with both API formats:

  • OpenAI: prompt_tokens, completion_tokens, total_tokens
  • Anthropic: input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens
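
The two field sets above can be normalized into one shape. The following is a minimal sketch, not the proxy's actual code: the function name extractUsage is hypothetical, and counting Anthropic cache tokens as prompt tokens is an assumption. It expects an already-parsed, non-streaming JSON response with a top-level usage object.

```javascript
// Hypothetical helper: normalize token usage from either API format.
// Returns { prompt, completion, total } or null if no usage is present.
function extractUsage(responseBody) {
  const usage = responseBody && responseBody.usage;
  if (!usage || typeof usage !== 'object') return null;

  // OpenAI-style field names
  if ('prompt_tokens' in usage || 'completion_tokens' in usage) {
    const prompt = usage.prompt_tokens || 0;
    const completion = usage.completion_tokens || 0;
    return { prompt, completion, total: usage.total_tokens || prompt + completion };
  }

  // Anthropic-style field names.
  // Assumption: cache creation/read tokens are counted as prompt tokens.
  if ('input_tokens' in usage || 'output_tokens' in usage) {
    const prompt = (usage.input_tokens || 0)
      + (usage.cache_creation_input_tokens || 0)
      + (usage.cache_read_input_tokens || 0);
    const completion = usage.output_tokens || 0;
    return { prompt, completion, total: prompt + completion };
  }

  return null;
}
```

Streamed (SSE) responses would need the usage object picked out of the event stream instead, which this sketch does not cover.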

Dashboard

Open in your browser:

https://<proxy-ip>:8443/_proxy/dashboard

Shows total token usage, breakdown by model and endpoint, and recent requests. Auto-refreshes every 5 seconds.

Stats API

# Get token stats as JSON
curl -sk https://localhost:8443/_proxy/stats

# Reset all stats
curl -sk -X POST https://localhost:8443/_proxy/reset

Example response:

{
  "uptime": "2h 15m",
  "requests": 42,
  "completions": 38,
  "errors": 0,
  "tokens": {
    "prompt": 12500,
    "completion": 8300,
    "total": 20800,
    "formatted": "20.8k"
  },
  "byModel": {
    "claude-haiku-4-5-20251001": { "requests": 20, "total": 10400 },
    "qwen3:8b-q4_K_M": { "requests": 18, "total": 10400 }
  }
}

Stats are persisted to stats.json every 30 seconds and on shutdown.
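
One way to implement this kind of persistence with Node.js built-ins is sketched below. The helper names (saveStats, loadStats, startPersistence), the temp-file-then-rename write, and the fallback shape of an empty stats object are assumptions for illustration, not taken from proxy.js.

```javascript
const fs = require('fs');

// Hypothetical helper: write stats to a temp file, then rename it into place,
// so a crash mid-write does not leave a truncated stats file behind.
function saveStats(statsFile, stats) {
  const tmp = `${statsFile}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(stats, null, 2));
  fs.renameSync(tmp, statsFile);
}

// Hypothetical helper: load previous stats, or start from zero if the file
// is missing or unreadable.
function loadStats(statsFile) {
  try {
    return JSON.parse(fs.readFileSync(statsFile, 'utf8'));
  } catch (err) {
    return { requests: 0, tokens: { prompt: 0, completion: 0, total: 0 } };
  }
}

// Save every intervalSeconds and once more on SIGINT/SIGTERM.
function startPersistence(statsFile, getStats, intervalSeconds) {
  const timer = setInterval(() => saveStats(statsFile, getStats()), intervalSeconds * 1000);
  timer.unref(); // the timer alone should not keep the process alive
  const flushAndExit = () => {
    saveStats(statsFile, getStats());
    process.exit(0);
  };
  process.once('SIGINT', flushAndExit);
  process.once('SIGTERM', flushAndExit);
  return timer;
}
```

With the defaults from config.json, this would be called as startPersistence('./stats.json', () => stats, 30).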

Configuration

The proxy is configured via config.json. Copy the example to get started:

cp config.example.json config.json

config.json

{
  "proxy": {
    "port": 8443,
    "allowedOrigins": "*"
  },
  "target": {
    "host": "127.0.0.1",
    "port": 11434
  },
  "tls": {
    "cert": "./certs/cert.pem",
    "key": "./certs/key.pem"
  },
  "tracking": {
    "enabled": true,
    "statsFile": "./stats.json",
    "persistInterval": 30,
    "historySize": 100
  }
}

Section    Key               Default               Description
proxy      port              8443                  HTTPS port the proxy listens on
proxy      allowedOrigins    "*"                   Allowed CORS origins (comma-separated or *)
target     host              "127.0.0.1"           Target LLM backend host
target     port              11434                 Target LLM backend port (Ollama default)
tls        cert              "./certs/cert.pem"    Path to TLS certificate
tls        key               "./certs/key.pem"     Path to TLS private key
tracking   enabled           true                  Enable/disable token tracking
tracking   statsFile         "./stats.json"        Where to persist token stats
tracking   persistInterval   30                    Save stats to disk every N seconds
tracking   historySize       100                   Number of recent requests to keep in memory

Common configurations

Local Ollama (default, no changes needed):

{
  "target": { "host": "127.0.0.1", "port": 11434 }
}

Network LLM proxy (e.g. LiteLLM, custom API proxy):

{
  "target": { "host": "192.168.178.90", "port": 3015 }
}

Ollama on another machine:

{
  "target": { "host": "192.168.178.42", "port": 11434 }
}

Restrict CORS to your Agile-V Studio instance:

{
  "proxy": { "allowedOrigins": "https://studio.agile-v.org,https://localhost:3000" }
}
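
To illustrate how a comma-separated allowedOrigins value could be matched against a request's Origin header, here is a minimal sketch. The function names are hypothetical, the allowed methods are an assumption, and the header list is shortened to the headers named in Troubleshooting. The proxy's real matching rules may differ.

```javascript
// Hypothetical helper: decide the Access-Control-Allow-Origin value.
// Returns '*', the matching origin, or null if the origin is not allowed.
function resolveAllowedOrigin(requestOrigin, allowedOrigins) {
  if (allowedOrigins.trim() === '*') return '*';
  const allowed = allowedOrigins.split(',').map((o) => o.trim()).filter(Boolean);
  return allowed.includes(requestOrigin) ? requestOrigin : null;
}

// Hypothetical helper: build CORS response headers for a request.
function corsHeaders(requestOrigin, allowedOrigins) {
  const origin = resolveAllowedOrigin(requestOrigin, allowedOrigins);
  if (!origin) return {}; // no CORS headers, so the browser blocks the response
  return {
    'Access-Control-Allow-Origin': origin,
    'Access-Control-Allow-Methods': 'GET, POST, OPTIONS', // assumption
    'Access-Control-Allow-Headers': 'Content-Type, Authorization, x-api-key, anthropic-version',
    'Vary': 'Origin', // responses differ per origin, so caches must key on it
  };
}
```

Echoing back the single matching origin is needed because Access-Control-Allow-Origin accepts only one origin or *, never a list.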

Disable token tracking:

{
  "tracking": { "enabled": false }
}

Custom config file

node proxy.js --config /path/to/my-config.json

Environment variable overrides

Environment variables take precedence over config.json:

TARGET_HOST=10.0.0.5 TARGET_PORT=8080 node proxy.js

Env Variable       Overrides
PROXY_PORT         proxy.port
TARGET_HOST        target.host
TARGET_PORT        target.port
CERT_FILE          tls.cert
KEY_FILE           tls.key
ALLOWED_ORIGINS    proxy.allowedOrigins
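
The precedence described above (defaults, then config.json, then environment variables) can be sketched as follows. The function names resolveConfig and configPathFromArgs are hypothetical, and the tracking section is left out for brevity. This is an illustration of the layering, not the code in proxy.js.

```javascript
// Defaults as listed in the configuration table (tracking omitted for brevity).
const DEFAULTS = {
  proxy: { port: 8443, allowedOrigins: '*' },
  target: { host: '127.0.0.1', port: 11434 },
  tls: { cert: './certs/cert.pem', key: './certs/key.pem' },
};

// Hypothetical helper: read the path given after --config, else use config.json.
function configPathFromArgs(argv) {
  const i = argv.indexOf('--config');
  return i !== -1 && argv[i + 1] ? argv[i + 1] : './config.json';
}

// Hypothetical helper: layer file config over defaults, then env over both.
function resolveConfig(fileConfig, env) {
  const cfg = {
    proxy: { ...DEFAULTS.proxy, ...fileConfig.proxy },
    target: { ...DEFAULTS.target, ...fileConfig.target },
    tls: { ...DEFAULTS.tls, ...fileConfig.tls },
  };
  // Environment variables are strings, so ports are converted to numbers.
  if (env.PROXY_PORT) cfg.proxy.port = Number(env.PROXY_PORT);
  if (env.ALLOWED_ORIGINS) cfg.proxy.allowedOrigins = env.ALLOWED_ORIGINS;
  if (env.TARGET_HOST) cfg.target.host = env.TARGET_HOST;
  if (env.TARGET_PORT) cfg.target.port = Number(env.TARGET_PORT);
  if (env.CERT_FILE) cfg.tls.cert = env.CERT_FILE;
  if (env.KEY_FILE) cfg.tls.key = env.KEY_FILE;
  return cfg;
}
```

A caller would typically pass the parsed contents of configPathFromArgs(process.argv) as fileConfig and process.env as env.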

Setup with Agile-V Studio

1. Start the proxy

Run the proxy on a machine that can reach your LLM backend over HTTP. This can be the same machine (for Ollama) or any machine on your network.

2. Trust the certificate

See "Trust the Certificate" above. Without this step, the browser will reject fetch() requests to the proxy.

3. Configure the endpoint

In Agile-V Studio, go to:

  • Project Settings > LLM Configuration, or
  • User Settings > LLM Configuration

Select "Custom Local LLM" as the provider and set the endpoint to:

https://<proxy-ip>:8443/v1

For example:

https://192.168.178.56:8443/v1

Important: The endpoint should end with /v1. The proxy forwards the full path, so the browser calls https://192.168.178.56:8443/v1/messages and the proxy forwards to http://<target>/v1/messages.
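
The path mapping described above can be shown with Node's built-in URL class. The helper name toTargetUrl is hypothetical. The sketch only illustrates that the path and query string pass through unchanged while scheme, host, and port are swapped for the target's.

```javascript
// Hypothetical helper: map an incoming request path to the backend URL.
// target is the "target" section of config.json ({ host, port }).
function toTargetUrl(target, requestPath) {
  return new URL(requestPath, `http://${target.host}:${target.port}`).toString();
}

// Example: the browser calls https://<proxy-ip>:8443/v1/messages,
// so the proxy sees the path "/v1/messages".
console.log(toTargetUrl({ host: '127.0.0.1', port: 11434 }, '/v1/messages'));
```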

4. Done

The browser connects to the proxy over HTTPS, and the proxy forwards requests to your LLM backend over HTTP. No more mixed-content errors. Monitor token usage at /_proxy/dashboard.

Files

proxy.js              # Proxy server (Node.js, zero deps)
config.example.json   # Example configuration (copy to config.json)
config.json           # Your configuration (not in git)
generate-cert.sh      # Certificate generator (OpenSSL)
Dockerfile            # Docker image (node:20-alpine)
docker-compose.yml    # One-command Docker start
stats.json            # Persisted token stats (auto-generated, not in git)
certs/                # Generated certificates (not in git)
  cert.pem            # TLS certificate
  key.pem             # Private key
  combined.pem        # Both in one file

How It Works

  1. The proxy starts an HTTPS server with the self-signed certificate
  2. Requests to /_proxy/* are handled internally (stats, dashboard, reset)
  3. All other requests are forwarded 1:1 to the HTTP target
  4. CORS headers are set automatically, supporting both OpenAI and Anthropic-specific headers (anthropic-version, x-api-key, etc.)
  5. For completion endpoints (/chat/completions, /v1/messages, /api/generate, /api/chat), the response is parsed to extract token usage
  6. Response bodies are streamed (piped), not buffered, so long LLM responses and SSE streams pass through as they arrive
  7. Connection errors to the target return a clean JSON 502 response
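
The steps above can be sketched with Node.js built-ins. This is a simplified illustration under stated assumptions, not the contents of proxy.js: classifyRequest and forward are hypothetical names, matching completion endpoints by path suffix is an assumption, and the exact JSON shape of the 502 body is invented for the example.

```javascript
const http = require('http');

// Completion endpoints named in step 5, matched by path suffix (assumption).
const TRACKED_SUFFIXES = ['/chat/completions', '/v1/messages', '/api/generate', '/api/chat'];

// Hypothetical helper: decide how a request is handled.
// 'internal' = served by the proxy, 'tracked' = forwarded and parsed for
// token usage, 'forward' = forwarded unchanged.
function classifyRequest(url) {
  const pathname = url.split('?')[0];
  if (pathname.startsWith('/_proxy/')) return 'internal';
  if (TRACKED_SUFFIXES.some((suffix) => pathname.endsWith(suffix))) return 'tracked';
  return 'forward';
}

// Forward one request to the HTTP target and pipe the response back unbuffered.
function forward(req, res, target) {
  const upstream = http.request(
    {
      host: target.host,
      port: target.port,
      method: req.method,
      path: req.url, // full path and query string are passed through 1:1
      headers: { ...req.headers, host: `${target.host}:${target.port}` },
    },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(res); // streaming: chunks are relayed as they arrive
    }
  );

  upstream.on('error', (err) => {
    if (res.headersSent) {
      res.destroy(); // response already started, nothing clean left to send
      return;
    }
    res.writeHead(502, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      error: `Cannot connect to http://${target.host}:${target.port}`,
      detail: err.message,
    }));
  });

  req.pipe(upstream); // stream the request body to the target
}
```

A server built on this would pass the handler to https.createServer({ cert, key }, ...) with the certificate and key read from the paths in the tls section, and call forward(req, res, config.target) for every request that classifyRequest does not mark as internal.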

Troubleshooting

"Cannot connect to http://..."

The LLM backend is not running or not reachable:

# Test the target directly
curl http://<target-host>:<target-port>/

"NetworkError when attempting to fetch resource" (Firefox)

Firefox uses its own certificate store. See "Trust the Certificate > Firefox" above.

"CORS request did not succeed"

Two possible causes:

  1. Certificate not trusted: The browser rejects the TLS connection before CORS is even checked. Solution: trust the certificate at the system level.

  2. Header not allowed: Your LLM client sends a header that is not listed in Access-Control-Allow-Headers, so the browser's preflight check fails. The proxy already allows: Content-Type, Authorization, x-api-key, anthropic-version, and various x-stainless-* headers.

"Blocked loading mixed active content"

The endpoint in Agile-V Studio still points directly to http://... instead of the proxy (https://...). Change the endpoint to https://<proxy-ip>:8443/v1.

Docker: LLM not reachable

Inside a Docker container, localhost refers to the container itself. The docker-compose.yml sets TARGET_HOST=host.docker.internal which resolves to the host machine on Docker Desktop (macOS/Windows).

On Linux, use network_mode: host:

services:
  llm-proxy:
    build: .
    network_mode: host
    environment:
      - TARGET_HOST=127.0.0.1

Renew the certificate

./generate-cert.sh

This overwrites the existing files. Restart the proxy and update the trust in your keychain or browser afterward.

License

MIT

