
fix: accept OpenAI-chat shape under /gemini and /openai prefixes (#1) #3

Merged
prasadus92 merged 1 commit into main from fix/openai-chat-prefix-tolerance
Jun 2, 2026


prasadus92 commented Jun 2, 2026

Problem

Reported in #1: an OpenAI-chat client (Hermes, or any client using an openai_chat transport) configured with a /gemini base URL fails with:

HTTP 404: Error code: 404 - {'detail': 'Not Found'}

even though curl against the native Gemini route works fine.

Root cause

OpenAI-style clients build their final request URL by appending a fixed suffix onto base_url:

  • /chat/completions for inference
  • /models, /v1/models, /v1/models/{id} for model discovery

So a base_url of http://127.0.0.1:8787/gemini makes the client call POST /gemini/chat/completions. But /gemini is the proxy's native generateContent route prefix, which has no chat/completions handler. The OpenAI-compat handler was only mounted at /openai/v1/chat/completions, /v1/chat/completions, and /chat/completions. The proxy log in the issue shows exactly this: every /gemini/chat/completions POST and every /gemini/.../models discovery probe returning 404.

curl worked because it used the native shape (/gemini/v1beta/models/...:generateContent); the OpenAI-chat client uses a different shape against the same prefix.
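The failure mode can be reproduced in a few lines. This is a minimal sketch, not the proxy's code: the route set is copied from the description above, while `build_chat_path` and `matched_before_fix` are hypothetical helpers that only imitate how an OpenAI-style client appends its fixed suffix to the path part of base_url.

```python
# Paths where the OpenAI-compat chat handler was mounted before this PR,
# as listed in the root-cause description.
OLD_CHAT_ROUTES = {
    "/openai/v1/chat/completions",
    "/v1/chat/completions",
    "/chat/completions",
}


def build_chat_path(base_path: str) -> str:
    """Imitate an OpenAI-style client: append the fixed suffix to the base path."""
    return base_path.rstrip("/") + "/chat/completions"


def matched_before_fix(base_path: str) -> bool:
    """True if the resulting path hit a mounted chat route before the fix."""
    return build_chat_path(base_path) in OLD_CHAT_ROUTES


for base in ("", "/openai/v1", "/openai", "/gemini"):
    outcome = "matched" if matched_before_fix(base) else "404"
    print(f"{base or '(root)'} -> {build_chat_path(base)}: {outcome}")
```

Under these assumptions, a /gemini base path produces /gemini/chat/completions, which is not in the pre-fix route set, so the request falls through to a 404.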

Fix

Mount the OpenAI-compat chat handler and the model-discovery endpoints under the /gemini and /openai prefixes as well as the bare root. Gemini already routes through Vertex's OpenAI-compat endpoint inside _handle_openai, which dispatches on the model named in the request body rather than on the URL prefix, so the prefix the client appends its suffix to no longer matters.
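To illustrate why the prefix becomes irrelevant, here is a rough sketch of body-keyed dispatch. It is not the actual _handle_openai: `pick_upstream`, the upstream labels, and the rule that a bare gemini-* name is rewritten to google/gemini-* are all assumptions made for illustration (the last one is inferred from the regression test forwarding model: google/gemini-2.5-flash).

```python
def pick_upstream(path: str, body: dict) -> tuple[str, str]:
    """Choose an upstream from the request body's model; `path` is ignored on purpose.

    Hypothetical rule set: anything that looks like a Gemini model goes to
    Vertex's OpenAI-compat endpoint under a google/ prefixed name, and
    everything else goes to a placeholder upstream.
    """
    model = body["model"]
    if model.startswith("google/"):
        return ("vertex-openai-compat", model)
    if model.startswith("gemini"):
        return ("vertex-openai-compat", f"google/{model}")
    return ("other-upstream", model)


body = {"model": "gemini-2.5-flash", "messages": []}
results = {
    path: pick_upstream(path, body)
    for path in (
        "/gemini/chat/completions",
        "/gemini/v1/chat/completions",
        "/openai/chat/completions",
    )
}
print(results)
```

Because the decision reads only the body, every path that reaches the handler yields the same upstream and model.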

Newly accepted shapes:

| Method | Paths added |
| --- | --- |
| POST | /gemini/chat/completions, /gemini/v1/chat/completions, /openai/chat/completions |
| GET | /models, /openai/v1/models, /openai/models, /gemini/v1/models, /gemini/models, /api/v1/models, plus /{openai,gemini}/v1/models/{id} probes |

These join the existing /{,openai/}v1/chat/completions, /chat/completions, and /v1/models routes, matching the project's existing "accept client URL variants" approach.
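The full set of accepted list and chat paths after this change can be written as a prefix-by-suffix product. The sketch below only enumerates paths from the description; the constant names are illustrative, it does not show how the proxy registers its routes, and the /{openai,gemini}/v1/models/{id} probes are left out for brevity.

```python
from itertools import product

PREFIXES = ("", "/openai", "/gemini")
CHAT_SUFFIXES = ("/chat/completions", "/v1/chat/completions")
MODEL_SUFFIXES = ("/models", "/v1/models")

# Every chat path accepted after the fix: three pre-existing plus three new.
chat_paths = {prefix + suffix for prefix, suffix in product(PREFIXES, CHAT_SUFFIXES)}

# Model-list paths: the same product, plus the standalone /api/v1/models alias.
model_list_paths = {
    prefix + suffix for prefix, suffix in product(PREFIXES, MODEL_SUFFIXES)
} | {"/api/v1/models"}

print(sorted(chat_paths))
print(sorted(model_list_paths))
```

Expressing the routes this way makes it easy to see that each client prefix now accepts both the bare and the /v1 suffix.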

Docs

  • README Hermes recipe + examples/hermes-config-snippet.yaml previously recommended the broken /gemini + openai_chat pairing. Updated to point the openai_chat Gemini provider at /openai (clearest match for the shape) and note that /gemini and the bare root now work too.
  • Added a troubleshooting entry for the 404 {'detail': 'Not Found'} symptom.
  • Endpoints table + status list updated.

Tests

  • test_gemini_prefix_accepts_openai_chat_shape: regression test for #1 ("Not able to use proxy for gemini-2.5-flash"). The three new chat paths return 200 and forward to Vertex's OpenAI-compat endpoint with model: google/gemini-2.5-flash.
  • test_model_discovery_under_client_prefixes: discovery probes resolve under each client prefix; unknown-model probe still 404s.

Full suite: 22 passed, ruff check + ruff format --check clean.

Closes #1

OpenAI-chat clients (e.g. Hermes) build their request URL by appending
/chat/completions onto base_url, and probe /v1/models for discovery. A
user pointing such a client at the /gemini base therefore called
/gemini/chat/completions, which had no handler under the native
generateContent route prefix and returned 404 {"detail":"Not Found"}.

Mount the OpenAI-compat handler and the model-discovery endpoints under
the /gemini and /openai prefixes as well as the bare root. Gemini already
routes through Vertex's OpenAI-compat endpoint inside _handle_openai
(keyed off the request body's model), so the URL prefix the client
happens to use no longer matters.

Also corrects the README recipe and Hermes example, which previously
recommended the broken /gemini + openai_chat pairing.

Fixes #1
prasadus92 force-pushed the fix/openai-chat-prefix-tolerance branch from 4b36107 to a6ed129 on June 2, 2026 13:29
prasadus92 merged commit 969af2f into main on Jun 2, 2026
2 checks passed
prasadus92 deleted the fix/openai-chat-prefix-tolerance branch on June 2, 2026 13:33

Development

Successfully merging this pull request may close these issues.

Not able to use proxy for gemini-2.5-flash

1 participant