docs(vertexai): replace incorrect model identifiers with auto-discovery note #5406
Open
major wants to merge 2 commits into llamastack:main
b4150ac to 2dd9824
…ry note

The provider description listed model identifiers with a `vertex_ai/` prefix (e.g. `vertex_ai/gemini-2.5-flash`) that never matched the resource names returned by the Vertex AI API, causing registration failures at startup. Replace the stale hardcoded list with a note that models are automatically discovered from the project.

Signed-off-by: Major Hayden <major@mhtx.net>
What does this PR do?
The Vertex AI provider description listed model identifiers with a `vertex_ai/` prefix (e.g. `vertex_ai/gemini-2.5-flash`) that never matched the resource names returned by the Vertex AI API (`publishers/google/models/gemini-2.5-flash`), causing server startup failures when users followed the documentation.

Since the provider rewrite in #4951, models are dynamically discovered from the Vertex AI project at startup via `list_models()`. The hardcoded model list was stale and misleading. This PR replaces it with a note that models are auto-discovered.

Closes #5403
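The mismatch can be illustrated with a minimal sketch. It assumes the discovery cache is a plain collection of resource names exactly as the API returns them, that availability is an exact-match lookup, and that `allowed_models` filters by exact name. The helper names (`is_available`, `routed_id`, `apply_allowed_models`) and the second model name are hypothetical and are not taken from the llama-stack code base.

```python
from typing import Iterable, List, Optional

# Illustrative only: these helpers are hypothetical, not llama-stack APIs.
PROVIDER_ID = "vertexai"

# Stand-in for the cache that list_models() populates at startup.
# The second entry is a made-up placeholder name.
DISCOVERED = [
    "publishers/google/models/gemini-2.5-flash",
    "publishers/google/models/example-model",
]


def is_available(provider_resource_id: str, cache: Iterable[str]) -> bool:
    """Assumed behavior: exact match against the discovered resource names."""
    return provider_resource_id in set(cache)


def routed_id(provider_id: str, provider_resource_id: str) -> str:
    """Prefix a discovered resource name with the provider ID."""
    return f"{provider_id}/{provider_resource_id}"


def apply_allowed_models(
    cache: Iterable[str], allowed: Optional[Iterable[str]]
) -> List[str]:
    """Assumed behavior: keep only names listed in allowed_models, if set."""
    if allowed is None:
        return list(cache)
    allowed_set = set(allowed)
    return [name for name in cache if name in allowed_set]


if __name__ == "__main__":
    # The documented identifier never appears in the discovered names.
    print(is_available("vertex_ai/gemini-2.5-flash", DISCOVERED))
    # The full API resource name does.
    print(is_available("publishers/google/models/gemini-2.5-flash", DISCOVERED))
    print(routed_id(PROVIDER_ID, DISCOVERED[0]))
```

Under these assumptions the old `vertex_ai/...` identifier can never pass the lookup, whereas the full resource name passes and is then exposed under the provider-ID prefix.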
How model discovery works after #4951
- `list_models()` queries the Vertex AI API at startup and registers all available models automatically. No user configuration is needed.
- The routing table prefixes each model with the provider ID, so `publishers/google/models/gemini-2.5-flash` becomes `vertexai/publishers/google/models/gemini-2.5-flash`.
- `vertexai/gemini-2.5-flash` is accepted as a shorthand. The `google-genai` SDK normalizes the name internally.
- `allowed_models`: Users who want a subset can set `allowed_models` in the provider config using the full API resource names (e.g. `publishers/google/models/gemini-2.5-flash`).
- The `vertex_ai/gemini-2.5-flash` format was never valid as a `provider_resource_id` and failed `check_model_availability()` against the API-populated cache.

Verified with a live Vertex AI project
Ran `llama stack run` with `remote::vertexai` against a real GCP project. Server auto-discovered 13 models at startup:

Chat completion with short-form ID works:
Test Plan
Documentation-only change. Verified that `provider_codegen.py` regenerates `remote_vertexai.mdx` cleanly and that all pre-commit hooks pass.