Releases · llm0ai/llm0

28 May 21:11

mrmushfiq

v0.2.0

459c125

v0.2.0 Latest

Latest

What's Changed

moved mrmshfiq/llm0-gateway to llm0ai/llm0
chore: rename module to github.com/llm0ai/llm0 by @mrmushfiq in #1

Full Changelog: v0.1.3...v0.2.0

Contributors

mrmushfiq

Assets 2

25 May 20:29

mrmushfiq

v0.1.3

9dfd7ed

v0.1.3

Summary

Patch release: cleaner Ollama streaming, consistent cost display, better request validation, and SDK examples in the README.

Fixed

Filter empty role-only SSE chunks from Ollama streams (OLLAMA_FILTER_EMPTY_CHUNKS, default true)
Validate empty model/messages before calling upstream (catches smart-quote JSON paste bugs)
Round cost_usd to 6 decimals so headers, JSON body, and logs match

Added

README: Python and Node examples (single request + simple agent loop with per-user headers)

Upgrade

```bash
git pull
docker compose up -d --build gateway
docker compose up -d --force-recreate gateway ( if gateway was running during the build )
```
Full changelog: CHANGELOG.md
EOF

Assets 2

20 Apr 15:52

mrmushfiq

v0.1.2

e1754c3

v0.1.2

Patch release. Redis durability fix + config-propagation documentation corrections. No schema changes, no env var changes, no API changes.

Fixed

Redis AOF persistence actually enabled in docker-compose.yml. The README and design doc both stated AOF was on; the compose file never set it, and there was no data volume, so docker compose down (or an OOM restart) silently wiped every spend counter. The redis service now runs with --appendonly yes --appendfsync everysec and a dedicated redis_data named volume.
Config-propagation docs corrected. README.md previously stated that per-project settings (monthly_cap_usd, rate_limit_per_minute, cache_enabled, semantic_cache_enabled, semantic_threshold) propagate within CUSTOMER_LIMIT_CACHE_TTL_SECONDS (default 60s). That is wrong — they ride the Redis apikey:* auth cache, which uses CACHE_TTL_SECONDS (default 3600s / 1 hour). CUSTOMER_LIMIT_CAE_TTL_SECONDS governs only the in-process customer_limits cache for end-user spend/request caps.

Added (docs only)

CUSTOMER_LIMIT_CACHE_TTL_SECONDS now documented in the env var table.
Updated CACHE_TTL_SECONDS description to reflect its dual role (exact-match cache TTL and API-key auth cache TTL).

Upgrade notes

```bash
git pull
docker compose down
docker compose up -d
```

The new `redis_data` volume starts empty.
Nothing else needs to be rebuilt: the gateway Go binary and the embedding image are unchanged.

Full diff: v0.1.1...v0.1.2

Assets 2

20 Apr 07:54

mrmushfiq

v0.1.1

8d20214

v0.1.1 — First public release

First public release of llm0-gateway.
An OpenAI-compatible LLM gateway with automatic failover, two-tier caching (exact + semantic), SSE streaming, per-customer spend caps, and scheduled maintenance workers. Runs locally via Docker Compose or go run and fronts four providers (OpenAI, Anthropic, Gemini, local Ollama) behind a single /v1/chat/completions endpoint.

Highlights

Four providers, one endpoint — prefix-based routing, drop-in OpenAI client compat
Automatic failover — 429 / 5xx / 404 / timeout / network, configurable FAILOVER_MODE
Streaming (SSE) across all four providers with trailing metadata frames
Two-tier caching — Redis (hot, <2 ms) + Postgres (warm) for exact match; pgvector HNSW for semantic (0.954 similarity hits in ~41 ms, $0 cost)
Per-customer spend caps — daily/monthly USD, block or downgrade on overflow
Scheduled workers — monthly speche cleanup, log retention, reconciliation
1400+ RPS sustained on the hot path with p50/p99 cache-hit at ~11ms / 16 ms and rejection took 2 ms

Note on versioning

The first tag was accidentally pushed as v1.0.0 and has been withdrawn. v0.1.1 is the first public release. Versions before 1.0 reflect pre-stable status — the HTTP surface is intended to stay OpenAI-compatible, but operational semantics may shift in patch releases.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

What's Changed

Contributors

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Summary

Fixed

Added

Upgrade

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Fixed

Added (docs only)

Upgrade notes

Uh oh!

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Highlights

Note on versioning

Links

Uh oh!

Releases: llm0ai/llm0

v0.2.0

What's Changed

Contributors

Uh oh!

v0.1.3

Summary

Fixed

Added

Upgrade

Uh oh!

v0.1.2

Fixed

Added (docs only)

Upgrade notes

Uh oh!

v0.1.1 — First public release

Highlights

Note on versioning

Links

Uh oh!