chore(pricing): Update vertex-ai pricing by siddharthsambharia-portkey · Pull Request #550 · Portkey-AI/models

siddharthsambharia-portkey · 2026-03-17T12:15:04Z

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

Change Type	Count
➕ Models added	3
🔄 Models updated (merged)	14

➕ New Models

gemini-2.5-pro-tts
gemini-2.5-flash-tts
veo-3.1-lite-generate-001

🔄 Updated Models

gemini-3.1-pro-preview
gemini-3-pro-image-preview
gemini-3.1-flash-image-preview
gemini-3.1-flash-lite-preview
gemini-3-flash-preview
gemini-2.5-pro
veo-3.1-generate-001
veo-3.0-generate-001
veo-3.0-fast-generate-001
text-embedding-005
text-embedding-large-exp-03-07
text-multilingual-embedding-002
textembedding-gecko
multimodalembedding

Model-to-Pricing-Page Mapping

Model ID	Publisher / Section	Source	Notes
`gemini-3.1-pro-preview`	Google – Gemini 3.1	API	Standard (≤200K) rates; Gemini 3 search grounding $14/1,000 queries
`gemini-3-pro-preview`	Google – Gemini 3	API	Same pricing as Gemini 3.1 Pro ($2/$12/M)
`gemini-3-pro-image-preview`	Google – Gemini 3	API	Image output token $120/M; batch included
`gemini-3.1-flash-image-preview`	Google – Gemini 3.1	API	Image output token $60/M; batch included
`gemini-3.1-flash-lite-preview`	Google – Gemini 3.1	API	Input $0.25/M, output $1.50/M
`gemini-3-flash-preview`	Google – Gemini 3	API	Input $0.50/M, output $3/M
`gemini-2.5-pro`	Google – Gemini 2.5	API	Standard (≤200K) rates; long-context tier noted
`gemini-2.5-flash`	Google – Gemini 2.5	API	Standard rates
`gemini-2.5-flash-preview-09-2025`	Google – Gemini 2.5	API	Matches gemini-2.5-flash pricing
`gemini-2.5-flash-lite`	Google – Gemini 2.5	API	Input $0.10/M, output $0.40/M
`gemini-2.5-flash-lite-preview-09-2025`	Google – Gemini 2.5	API	Matches gemini-2.5-flash-lite pricing
`gemini-2.5-flash-image`	Google – Gemini 2.5	API	Image output token $30/M; batch included
`gemini-2.5-computer-use-preview-10-2025`	Google – Gemini 2.5	API	Matched Computer Use row; same pricing as 2.5 Pro
`gemini-2.0-flash-001`	Google – Gemini 2.0	API	Canonical ID with -001 suffix per ID rules
`gemini-2.0-flash-lite-001`	Google – Gemini 2.0	API	Canonical ID with -001 suffix per ID rules
`gemini-2.5-pro-tts`	Google – Gemini 2.5	API – price not found	TTS model; no pricing row found; added with price 0
`gemini-2.5-flash-tts`	Google – Gemini 2.5	API – price not found	TTS model; no pricing row found; added with price 0
`imagen-4.0-ultra-generate-001`	Google – Imagen	API	$0.06/image
`imagen-4.0-generate-001`	Google – Imagen	API	$0.04/image
`imagen-4.0-fast-generate-001`	Google – Imagen	API	$0.02/image
`imagen-3.0-generate-002`	Google – Imagen	API	$0.04/image
`imagen-3.0-capability-001`	Google – Imagen	API	Capability model mapped to imagen-3.0-generate pricing ($0.04/image)
`imagen-3.0-capability-002`	Google – Imagen	API	Capability model mapped to imagen-3.0-generate pricing ($0.04/image)
`veo-3.1-generate-001`	Google – Veo	API	$0.40/sec (Video+Audio 720p/1080p); duration=8s, count=1
`veo-3.1-fast-generate-001`	Google – Veo	API	$0.15/sec (Video+Audio 720p/1080p)
`veo-3.1-lite-generate-001`	Google – Veo	API	$0.05/sec (Video+Audio 720p)
`veo-3.0-generate-001`	Google – Veo	API	$0.40/sec (Video+Audio 720p/1080p)
`veo-3.0-fast-generate-001`	Google – Veo	API	$0.15/sec (Video+Audio)
`veo-2.0-generate-001`	Google – Veo	API	$0.50/sec (Video 720p)
`gemini-embedding-001`	Google – Embedding	API	$0.00015/1K tokens (online)
`text-embedding-005`	Google – Embedding	API	$0.000025/1K chars → converted to $0.0001/1K tokens
`text-embedding-large-exp-03-07`	Google – Embedding	API	Same pricing as text-embedding-005 (no dedicated row)
`text-multilingual-embedding-002`	Google – Embedding	API	$0.000025/1K chars → converted to $0.0001/1K tokens
`textembedding-gecko`	Google – Embedding	API	Legacy; uses standard text embedding pricing
`multimodalembedding`	Google – Embedding	API	Text component: $0.0002/1K chars → $0.0001/1K tokens
`gemini-embedding-2-preview`	Google – Embedding	API – price not found	Gemini Embedding 2 Unified Multimodal (Preview); added with price 0 pending confirmed pricing
`claude-opus-4-6`	Anthropic – Claude	API	@default stripped; Global pricing $5/$25/M; cache included
`claude-sonnet-4-6`	Anthropic – Claude	API	@default stripped; Global pricing $3/$15/M; cache included
`claude-opus-4-5@20251101`	Anthropic – Claude	API	Pinned version; Global pricing $5/$25/M; cache included
`claude-sonnet-4-5@20250929`	Anthropic – Claude	API	Pinned version; Global pricing $3/$15/M; cache included
`claude-haiku-4-5@20251001`	Anthropic – Claude	API	Pinned version; Global pricing $1/$5/M; cache included
`claude-opus-4-1@20250805`	Anthropic – Claude	API	Pinned version; Uniform pricing $15/$75/M; cache included
`claude-opus-4@20250514`	Anthropic – Claude	API	Pinned version; Uniform pricing $15/$75/M; cache included
`claude-sonnet-4@20250514`	Anthropic – Claude	API	Pinned version; Uniform pricing $3/$15/M; cache included
`gpt-oss-120b-maas`	OpenAI	API	MaaS model; $0.09/$0.36/M; batch included
`llama-3.3-70b-instruct-maas`	Meta – Llama	API	Llama 3.3 70B row; $0.72/$0.72/M; batch included
`llama-4-maverick-17b-128e-instruct-maas`	Meta – Llama	API	Llama 4 Maverick row; $0.35/$1.15/M; batch included
`mistral-small-2503`	Mistral	API	Mistral Small 3.1 row; $0.10/$0.30/M
`mistral-medium-3`	Mistral	API	$0.40/$2.00/M
`codestral-2`	Mistral	API	Codestral 2 row; $0.30/$0.90/M
`deepseek-r1-0528-maas`	DeepSeek	API	DeepSeek-R1 0528 row; $1.35/$5.40/M; batch included
`deepseek-v3.1-maas`	DeepSeek	API	DeepSeek-V3.1 row; $0.60/$1.70/M; batch included
`deepseek-v3.2-maas`	DeepSeek	API	DeepSeek-V3.2 row; $0.56/$1.68/M; batch included
`qwen3-235b-a22b-instruct-2507-maas`	Qwen	API	Qwen3-235B row; $0.22/$0.88/M; batch included
`qwen3-coder-480b-a35b-instruct-maas`	Qwen	API	Qwen3-Coder-480B row; $0.22/$1.80/M; batch included
`qwen3-next-80b-a3b-instruct-maas`	Qwen	API	Qwen3-Next-80B-Instruct row; $0.15/$1.20/M
`qwen3-next-80b-a3b-thinking-maas`	Qwen	API	Qwen3-Next-80B-Thinking row; $0.15/$1.20/M
`kimi-k2-thinking-maas`	Kimi / Moonshot	API	Kimi-K2-Thinking row; $0.60/$2.50/M
`minimax-m2-maas`	MiniMax	API	MiniMax-M2 row; $0.30/$1.20/M
`glm-4.7-maas`	ZAI.org – GLM	API	GLM-4.7 row; $0.60/$2.20/M
`glm-5-maas`	ZAI.org – GLM	API	GLM-5 row; $1.00/$3.20/M
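
The notes above rely on a few pieces of arithmetic: per-character embedding rates converted to per-token rates, per-million-token input/output rates, and per-second video rates. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the 4-characters-per-token ratio is an assumption inferred from the `text-embedding-005` row ($0.000025/1K chars → $0.0001/1K tokens); it is not confirmed as the pricing agent's actual conversion rule.

```python
from decimal import Decimal

# Assumption: 4 characters per token, inferred from the table's
# "$0.000025/1K chars -> $0.0001/1K tokens" conversion.
CHARS_PER_TOKEN = Decimal(4)
TOKENS_PER_MILLION = Decimal(1_000_000)


def per_1k_chars_to_per_1k_tokens(price_per_1k_chars: Decimal) -> Decimal:
    """Convert a per-1K-character rate to a per-1K-token rate."""
    return price_per_1k_chars * CHARS_PER_TOKEN


def token_cost(input_tokens: int, output_tokens: int,
               input_per_m: Decimal, output_per_m: Decimal) -> Decimal:
    """Cost of one request given per-million-token input/output rates."""
    return (Decimal(input_tokens) * input_per_m
            + Decimal(output_tokens) * output_per_m) / TOKENS_PER_MILLION


def video_cost(price_per_second: Decimal, duration_s: int, count: int = 1) -> Decimal:
    """Cost of generating `count` videos of `duration_s` seconds each."""
    return price_per_second * duration_s * count


if __name__ == "__main__":
    # text-embedding-005: per-1K-chars rate converted to per-1K-tokens
    print(per_1k_chars_to_per_1k_tokens(Decimal("0.000025")))
    # gemini-3-flash-preview: 1M input tokens and 500K output tokens
    print(token_cost(1_000_000, 500_000, Decimal("0.50"), Decimal("3")))
    # veo-3.1-generate-001: duration=8s, count=1
    print(video_cost(Decimal("0.40"), 8, 1))
```

With the `veo-3.1-generate-001` row, $0.40/sec at duration=8s, count=1 works out to $3.20 per request. `Decimal` is used instead of `float` so that small rates such as 0.000025 are not subject to binary rounding.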

Excluded Models (not in output)

Model	Publisher	Reason
`gemini-live-2.5-flash-native-audio`	Google	Global exclude: `-live-` streaming
`jamba-large-1.6`	AI21	`has_deploy: true` without `-maas` (self-deploy)
`clip-vit-base-patch32`, `openclip`	OpenAI	Non-generative (CLIP vision)
`whisper-large`	OpenAI	Audio transcription (global exclude)
`gpt-oss`	OpenAI	`has_deploy: true` without `-maas` (self-deploy)
`sam3`, `faster-r-cnn`, `mask-r-cnn`, `retinanet`	Meta	Non-generative CV/segmentation
`xlm-roberta-large`, `roberta-large`	Meta	Non-generative NLP
`llama-guard`, `prompt-guard`	Meta	Guard models (global exclude)
`llama2`, `llama3`, `llama3_1`, `llama3-2`, `llama3-3`, `llama4`, `codellama-7b-hf`, `llama-2-quantized`	Meta	`has_deploy: true` without `-maas` (self-deploy)
`nllb`, `imagebind`	Meta	Non-generative (translation / multimodal understanding)
`codestral-2501-self-deploy`	Mistral	`has_deploy: true` + "self-deploy" in name
`mistral-ocr-2505`	Mistral	OCR model (global exclude)
`ministral-3`, `mistral-large-3`	Mistral	`has_deploy: true` without `-maas` (self-deploy)
`mistral` (mistral-ai), `mixtral`	Mistral	`has_deploy: true` without `-maas` (self-deploy)
`deepseek-r1`, `deepseek-v3`, `deepseek-v3-1`, `deepseek-v3-2`	DeepSeek	`has_deploy: true` without `-maas` (self-deploy)
`deepseek-ocr`, `deepseek-ocr-2`, `deepseek-ocr-maas`	DeepSeek	OCR model (global exclude)
`qwq`, `qwen3`, `qwen3-5`, `qwen3-vl`, `qwen2`, `qwen3-embedding`, `qwen3-coder`, `qwen3-next`, `qwen3-coder-next`	Qwen	`has_deploy: true` without `-maas` (self-deploy)
`qwen-image`	Qwen	Explicit policy exclusion
`kimi-k2`, `kimi-k2-5`	Kimi	`has_deploy: true` without `-maas` (self-deploy)
`minimax-m2`	MiniMax	`has_deploy: true` without `-maas` (self-deploy)
`glm-4.7`, `glm-5`, `glm-4.5`	ZAI.org	`has_deploy: true` without `-maas` (self-deploy)
`glm-image`	ZAI.org	Explicit policy exclusion
`glm-ocr`	ZAI.org	OCR model (global exclude)
Various non-generative Google models	Google	Non-generative ML/CV/NLP (imagegeneration, virtual-try-on, bert, gemma, PaLM, etc.)
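
The exclusion reasons in the table above follow a small set of rules: explicit policy exclusions, global name-based excludes, and self-deploy entries (`has_deploy: true` without a `-maas` suffix). A rough sketch of how such a filter could look is below. The `CatalogModel` class, the marker list, and the rule order are assumptions made for illustration; only the `has_deploy` flag and the reasons themselves come from the table. The "non-generative" category is not modeled here because the table does not say how it is detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogModel:
    """Hypothetical catalog entry; only `has_deploy` is named in the PR."""
    model_id: str
    has_deploy: bool = False


# Assumed lists, reconstructed from the reasons given in the table.
EXPLICIT_EXCLUDES = {"qwen-image", "glm-image"}
GLOBAL_EXCLUDE_MARKERS = ("-live-", "ocr", "guard", "whisper", "lyria")


def exclusion_reason(model: CatalogModel) -> Optional[str]:
    """Return why a model is excluded, or None if it should be kept."""
    model_id = model.model_id.lower()
    if model_id in EXPLICIT_EXCLUDES:
        return "explicit policy exclusion"
    for marker in GLOBAL_EXCLUDE_MARKERS:
        if marker in model_id:
            return f"global exclude: {marker}"
    if model.has_deploy and "self-deploy" in model_id:
        return "self-deploy in name"
    if model.has_deploy and not model_id.endswith("-maas"):
        return "has_deploy without -maas (self-deploy)"
    return None


if __name__ == "__main__":
    for entry in (
        CatalogModel("gemini-live-2.5-flash-native-audio"),
        CatalogModel("gpt-oss", has_deploy=True),
        CatalogModel("gpt-oss-120b-maas", has_deploy=True),
        CatalogModel("deepseek-ocr-maas"),
    ):
        print(entry.model_id, "->", exclusion_reason(entry))
```

Checking the global markers before the `-maas` rule matters for entries such as `deepseek-ocr-maas`, which carries the `-maas` suffix but is still excluded as an OCR model.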

Pricing-Page-Only Models (not added — not in API)

Model	Publisher	Notes
Gemini 1.5 Flash / 1.5 Pro / 1.0 Pro	Google	On pricing page but NOT returned by get_vertex_models
Llama 3.1 405B / Llama 4 Scout	Meta	On pricing page but NOT returned by get_vertex_models
gpt-oss-20b	OpenAI	On pricing page but NOT returned by get_vertex_models
Lyria 3 Pro / Lyria 3 / Lyria 2	Google	On pricing page; excluded per global lyria-* rule (music generation)
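
The "pricing-page-only" category amounts to a set difference: anything listed on the pricing page but absent from the API model list is reported and not added. A small sketch under that reading is below. The function name and the plain-string IDs are hypothetical; the return shape of `get_vertex_models` is not shown in this PR, so the sketch takes an already-extracted list of IDs instead of calling it.

```python
from typing import Iterable, List, Tuple


def split_by_api_presence(
    pricing_page_ids: Iterable[str],
    api_ids: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Split pricing-page models into (also in API, pricing-page-only).

    Order follows the pricing page so the report stays stable.
    """
    api = set(api_ids)
    in_api: List[str] = []
    page_only: List[str] = []
    for model_id in pricing_page_ids:
        (in_api if model_id in api else page_only).append(model_id)
    return in_api, page_only


if __name__ == "__main__":
    # Illustrative inputs only; real IDs would come from the pricing page
    # and from the API model listing.
    page = ["gpt-oss-120b-maas", "gpt-oss-20b"]
    api = ["gpt-oss-120b-maas", "gemini-2.5-pro"]
    in_api, page_only = split_by_api_presence(page, api)
    print("priced and in API:", in_api)
    print("pricing page only:", page_only)
```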

Generated by Pricing Agent on 2026-04-04

siddharthsambharia-portkey added 30 commits March 17, 2026 17:45

chore(pricing): Update vertex-ai pricing

a1a3f5f

chore(pricing): Update vertex-ai pricing

53b3f5d

chore(pricing): Update vertex-ai pricing

52dbf8e

chore(pricing): Update vertex-ai pricing

f19c6a3

chore(pricing): Update vertex-ai pricing

a6e1035

chore(pricing): Update vertex-ai pricing

91c6f2a

chore(pricing): Update vertex-ai pricing

d32f719

chore(pricing): Update vertex-ai pricing

6a7c7e8

chore(pricing): Update vertex-ai pricing

916ddaf

chore(pricing): Update vertex-ai pricing

fa02c68

chore(pricing): Update vertex-ai pricing

7320d33

chore(pricing): Update vertex-ai pricing

3604db1

chore(pricing): Update vertex-ai pricing

d31b801

chore(pricing): Update vertex-ai pricing

a267566

chore(pricing): Update vertex-ai pricing

04933eb

chore(pricing): Update vertex-ai pricing

2dd50e4

chore(pricing): Update vertex-ai pricing

21a3a64

chore(pricing): Update vertex-ai pricing

244cd8b

chore(pricing): Update vertex-ai pricing

623bbde

chore(pricing): Update vertex-ai pricing

c7e7113

chore(pricing): Update vertex-ai pricing

5cda0eb

chore(pricing): Update vertex-ai pricing

3b130f0

chore(pricing): Update vertex-ai pricing

271a047

chore(pricing): Update vertex-ai pricing

8867d9d

chore(pricing): Update vertex-ai pricing

bdf8d15

chore(pricing): Update vertex-ai pricing

23b51be

chore(pricing): Update vertex-ai pricing

81c0fd3

chore(pricing): Update vertex-ai pricing

ebf58b5

chore(pricing): Update vertex-ai pricing

6745bf8

chore(pricing): Update vertex-ai pricing

62dc55d

siddharthsambharia-portkey added 14 commits March 29, 2026 23:43

chore(pricing): Update vertex-ai pricing

6bdcbde

chore(pricing): Update vertex-ai pricing

c92c59c

chore(pricing): Update vertex-ai pricing

bc2a8ec

chore(pricing): Update vertex-ai pricing

94276ac

chore(pricing): Update vertex-ai pricing

6eb1b42

chore(pricing): Update vertex-ai pricing

f1c7a23

chore(pricing): Update vertex-ai pricing

d828299

chore(pricing): Update vertex-ai pricing

16c465d

chore(pricing): Update vertex-ai pricing

c28df48

chore(pricing): Update vertex-ai pricing

14f78ba

chore(pricing): Update vertex-ai pricing

293eaea

chore(pricing): Update vertex-ai pricing

ecf0a84

chore(pricing): Update vertex-ai pricing

9bb72fa

chore(pricing): Update vertex-ai pricing

04e1595

siddharthsambharia-portkey wants to merge 44 commits into main from pricing-update/vertex-ai
