docs(vertexai): replace incorrect model identifiers with auto-discovery note by major · Pull Request #5406 · llamastack/llama-stack

major · 2026-04-01T16:03:51Z

What does this PR do?

The Vertex AI provider description listed model identifiers with a vertex_ai/ prefix (e.g. vertex_ai/gemini-2.5-flash) that never matched the resource names returned by the Vertex AI API (publishers/google/models/gemini-2.5-flash), causing server startup failures when users followed the documentation.

Since the provider rewrite in #4951, models are dynamically discovered from the Vertex AI project at startup via list_models(). The hardcoded model list was stale and misleading. This replaces it with a note that models are auto-discovered.

Closes #5403

How model discovery works after #4951

- Auto-discovery (default): list_models() queries the Vertex AI API at startup and registers all available models automatically. No user configuration is needed. The routing table prefixes each model with the provider ID, so publishers/google/models/gemini-2.5-flash becomes vertexai/publishers/google/models/gemini-2.5-flash.
- Short-form model IDs work too: API requests can use vertexai/gemini-2.5-flash as shorthand. The google-genai SDK normalizes the name internally.
- Restricting models via allowed_models: users who want only a subset can set allowed_models in the provider config, using the full API resource names (e.g. publishers/google/models/gemini-2.5-flash).
- The old vertex_ai/gemini-2.5-flash format was never valid as a provider_resource_id: it failed check_model_availability() against the API-populated cache.
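
The naming rules above can be sketched in a few lines of Python. This is an illustration only, not llama-stack or google-genai code: the helper names routed_model_id, expand_short_form, and filter_allowed are hypothetical, and the short-form expansion assumes the publishers/google/models/ prefix seen in the discovered IDs. The real normalization happens inside the google-genai SDK.

```python
from typing import Iterable, List, Optional

PROVIDER_ID = "vertexai"  # provider ID as it appears in the model IDs in this PR
RESOURCE_PREFIX = "publishers/google/models/"  # assumed prefix of Vertex AI resource names


def routed_model_id(resource_name: str, provider_id: str = PROVIDER_ID) -> str:
    """Prefix an API resource name with the provider ID, like the routing table does."""
    return f"{provider_id}/{resource_name}"


def expand_short_form(model_id: str, provider_id: str = PROVIDER_ID) -> str:
    """Hypothetical helper: expand a short-form ID to the full routed ID."""
    prefix = f"{provider_id}/"
    if not model_id.startswith(prefix):
        # e.g. the old "vertex_ai/..." format ends up here
        raise ValueError(f"model ID does not start with {prefix!r}: {model_id!r}")
    rest = model_id[len(prefix):]
    if rest.startswith(RESOURCE_PREFIX):
        return model_id  # already in long form
    return f"{prefix}{RESOURCE_PREFIX}{rest}"


def filter_allowed(discovered: Iterable[str], allowed_models: Optional[List[str]]) -> List[str]:
    """Keep only discovered resource names listed in allowed_models (None keeps all)."""
    if allowed_models is None:
        return list(discovered)
    allowed = set(allowed_models)
    return [name for name in discovered if name in allowed]
```

In this sketch, an allowed_models entry written in the old vertex_ai/ style matches nothing, because the comparison is against the full resource names returned by the API.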

Verified with a live Vertex AI project

Ran llama stack run with remote::vertexai against a real GCP project. Server auto-discovered 13 models at startup:

$ curl -s http://localhost:8321/v1/models | python3 -c "import sys,json; [print(m['id']) for m in json.load(sys.stdin)['data']]"
vertexai/publishers/google/models/gemini-1.5-pro-002
vertexai/publishers/google/models/gemini-2.0-flash-001
vertexai/publishers/google/models/gemini-2.0-flash
vertexai/publishers/google/models/gemini-2.0-flash-lite-001
vertexai/publishers/google/models/gemini-2.5-flash-preview-04-17
vertexai/publishers/google/models/gemini-2.5-pro-exp-03-25
vertexai/publishers/google/models/gemini-2.5-pro
vertexai/publishers/google/models/gemini-2.5-flash
vertexai/publishers/google/models/gemini-2.5-flash-lite
vertexai/publishers/google/models/gemini-2.5-pro-tts
vertexai/publishers/google/models/gemini-2.5-flash-tts
vertexai/publishers/google/models/gemini-live-2.5-flash-native-audio
vertexai/publishers/google/models/gemini-3.1-flash-image-preview
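
For scripted checks, the same listing can be done without the shell one-liner. The sketch below assumes only what the command above shows: a server on http://localhost:8321 whose /v1/models response has a "data" list of objects with an "id" field. The function names are hypothetical.

```python
import json
from typing import List
from urllib.request import urlopen


def extract_model_ids(payload: dict) -> List[str]:
    """Return the "id" of every entry in the payload's "data" list."""
    return [model["id"] for model in payload["data"]]


def fetch_model_ids(base_url: str = "http://localhost:8321") -> List[str]:
    """Query /v1/models on a running server and return the model IDs."""
    with urlopen(f"{base_url}/v1/models") as response:
        return extract_model_ids(json.load(response))


if __name__ == "__main__":
    # Requires a running server; prints one model ID per line.
    for model_id in fetch_model_ids():
        print(model_id)
```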

Chat completion with short-form ID works:

$ curl -s http://localhost:8321/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "vertexai/gemini-2.5-flash", "messages": [{"role": "user", "content": "Say hello in exactly 5 words."}]}' \
  | python3 -m json.tool
{
    "id": "chatcmpl-...",
    "choices": [{"message": {"role": "assistant", "content": "Hello there, my dear friend."}, "finish_reason": "stop", ...}],
    "model": "vertexai/gemini-2.5-flash",
    ...
}
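
The same request can be issued from Python with the standard library. This is a minimal sketch that mirrors the curl call above (same endpoint, header, and JSON body). The helper names are hypothetical, and the response is assumed to have the shape shown in the output above.

```python
import json
from urllib.request import Request, urlopen


def build_chat_request(model: str, prompt: str,
                       base_url: str = "http://localhost:8321") -> Request:
    """Build the POST request for /v1/chat/completions without sending it."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str) -> str:
    """Send the request to a running server and return the assistant's reply text."""
    with urlopen(build_chat_request(model, prompt)) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires a running server with the Vertex AI provider configured.
    print(chat("vertexai/gemini-2.5-flash", "Say hello in exactly 5 words."))
```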

Test Plan

Documentation-only change. Verified provider_codegen.py regenerates remote_vertexai.mdx cleanly and all pre-commit hooks pass.

$ uv run ./scripts/provider_codegen.py
Processing provider registry
  Processing provider registry...

$ SKIP=actionlint git commit ...
Provider Codegen.............................................................Passed
ruff (legacy alias)..........................................................Passed
ruff format..................................................................Passed
mypy.........................................................................Passed
markdownlint.............................................(no files to check)Skipped

…ry note

The provider description listed model identifiers with a vertex_ai/ prefix (e.g. vertex_ai/gemini-2.5-flash) that never matched the resource names returned by the Vertex AI API, causing registration failures at startup. Replace the stale hardcoded list with a note that models are automatically discovered from the project.

Signed-off-by: Major Hayden <major@mhtx.net>

major requested review from ashwinb, bbrowning, cdoern, ehhuang, franciscojavierarceo, leseb, mattf and raghotham as code owners April 1, 2026 16:03

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Apr 1, 2026

major marked this pull request as draft April 1, 2026 16:22

major force-pushed the fix/5403 branch 2 times, most recently from b4150ac to 2dd9824 Compare April 1, 2026 16:24

major marked this pull request as ready for review April 1, 2026 17:40

major force-pushed the fix/5403 branch from 67d9b16 to 551a437 Compare April 1, 2026 19:49

major force-pushed the fix/5403 branch from 551a437 to b4c18f3 Compare April 1, 2026 19:51

major closed this Apr 1, 2026

major reopened this Apr 1, 2026

Merge remote-tracking branch 'origin/main' into fix/5403

2765768

major wants to merge 2 commits into llamastack:main from major:fix/5403
