feat(vercel-provider): add OrqAiImageModel with /images/generations +…#129

Merged
thedevtoni merged 1 commit into orq-ai:main from hamza-adami:feat/vercel-provider-image-edits on May 15, 2026

@hamza-adami
Contributor

… /images/edits support

Replace direct use of OpenAICompatibleImageModel for image generation with a dedicated OrqAiImageModel that targets the Orq AI proxy correctly across all upstream providers (OpenAI, fal, Google Imagen, Leonardo, etc.).

Problems fixed

  1. OpenAICompatibleImageModel hardcodes response_format: "b64_json" in the request body. Newer OpenAI image models (gpt-image-1 and later) reject this parameter with Unknown parameter: 'response_format', and the Orq proxy forwards that error verbatim. Image generation through Orq was broken for any gpt-image-1+ model.

  2. The upstream model's response schema only accepts data[].b64_json. Several Orq-routed upstreams (fal, Leonardo, dall-e-2 default) return the image under data[].url instead — either as a data URI or as an http URL. These responses failed schema validation with AI_TypeValidationError: data[0].b64_json: Invalid input: expected string, received undefined.

  3. The upstream model did not support image editing via Orq. Calling generateImage({ prompt: { images: [...] } }) had no effect and Orq's /images/edits endpoint was unreachable through this provider.
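
To make problem 2 concrete, here is a minimal sketch of the difference between a check that requires `b64_json` and one that accepts either field. The names `ImageEntry`, `isStrictEntry`, and `isRelaxedEntry` are hypothetical and are not taken from this PR or from the AI SDK; the actual schemas are defined with a schema library, and requiring at least one of the two fields is an assumption of this sketch.

```typescript
// Hypothetical entry shape, modelled on { b64_json?, url?, revised_prompt? }.
interface ImageEntry {
  b64_json?: string;
  url?: string;
  revised_prompt?: string;
}

// Strict check: mirrors the behaviour described in problem 2,
// where b64_json must be present as a string.
function isStrictEntry(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  return typeof (value as ImageEntry).b64_json === "string";
}

// Relaxed check (assumption of this sketch): the image may arrive
// under either b64_json or url.
function isRelaxedEntry(value: unknown): value is ImageEntry {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as ImageEntry;
  return typeof entry.b64_json === "string" || typeof entry.url === "string";
}
```

A URL-only entry such as `{ url: "https://example.com/a.png" }` fails the strict check and passes the relaxed one, which is the class of response that previously surfaced as `AI_TypeValidationError`.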

Changes

  • New OrqAiImageModel implementing ImageModelV3 (specificationVersion "v3", maxImagesPerCall = 10), wired up via provider.imageModel(id) in createOrqAiProvider.

  • Request: drops response_format so each underlying upstream uses its own default — gpt-image-1 returns base64 inline, dall-e-* defaults to a URL, fal/Leonardo return data URIs. User-supplied extras (quality, style, background, etc.) pass through via providerOptions.orq.

  • Response schema accepts every shape documented in orq-node's CreateImageData ({ b64_json?, url?, revised_prompt? }) and an optional usage envelope (input_tokens, output_tokens, total_tokens).

  • normalizeImageEntry handles three response shapes uniformly:

    • b64_json field -> returned as-is
    • url data URI -> strips the prefix
    • url http(s) URL -> fetches and base64-encodes

    Always returns base64 strings, matching the ImageModelV3 contract.

  • New /images/edits branch (multipart form-data) activates when files != null && files.length > 0. Each file is converted to a File with a .png / .jpg / .webp filename so Orq's middleware doesn't reject the part as application/octet-stream. The mask parameter is dropped with a warning because Orq's edit schema does not carry a mask field.

  • Inline copy of defaultOpenAICompatibleErrorStructure because @ai-sdk/openai-compatible exports the type but not the value. Orq's proxy returns OpenAI-shaped error envelopes, so the schema matches 1:1.

  • createOrqAiProvider now throws if apiKey is empty, and the URL builder no longer wraps the path in a redundant new URL(...).
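
The three-way normalization described for normalizeImageEntry could look roughly like the sketch below. This is not the code from the PR: the helper names `stripDataUriPrefix`, `bytesToBase64`, and the injectable `fetchBytes` parameter are hypothetical, and the sketch assumes data URIs carry a base64 payload.

```typescript
interface ImageEntry {
  b64_json?: string;
  url?: string;
}

// Hypothetical injection point so the download step can be replaced in tests.
type FetchBytes = (url: string) => Promise<Uint8Array>;

// Returns everything after the first comma of a data URI.
// Assumes the payload is base64-encoded (";base64" form).
function stripDataUriPrefix(dataUri: string): string {
  const comma = dataUri.indexOf(",");
  if (comma === -1) throw new Error("Malformed data URI");
  return dataUri.slice(comma + 1);
}

// Encodes raw bytes as base64 using the global btoa.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Default download step: uses the global fetch.
const defaultFetchBytes: FetchBytes = async (url) => {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Image download failed with status ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
};

// Always resolves to a base64 string, whichever shape the upstream returned.
async function normalizeImageEntry(
  entry: ImageEntry,
  fetchBytes: FetchBytes = defaultFetchBytes,
): Promise<string> {
  if (typeof entry.b64_json === "string") return entry.b64_json;
  if (typeof entry.url === "string") {
    if (entry.url.startsWith("data:")) return stripDataUriPrefix(entry.url);
    return bytesToBase64(await fetchBytes(entry.url));
  }
  throw new Error("Image entry has neither b64_json nor url");
}
```

Passing the download step as a parameter keeps the first two branches free of network access, so they can be exercised without a live upstream.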

Spec compliance verified against:

  • @ai-sdk/provider's ImageModelV3 type (return shape: images, warnings, response { timestamp, modelId, headers }, optional usage).
  • Upstream OpenAICompatibleImageModel in vercel/ai for body / schema / warning conventions.
  • orq-ai/orq-node's CreateImageRequestBody, CreateImageEditRequestBody, CreateImageData, and CreateImageUsage for body + response shapes.
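
As an illustration of the request change, the sketch below builds a generation body that never sets `response_format` on its own and spreads `providerOptions.orq` extras into the payload. The function name `buildGenerationBody` and the `GenerateOptions` type are hypothetical, and the exact set of fields the real model sends is not confirmed by this description.

```typescript
// Hypothetical, simplified view of the call options used in this sketch.
interface GenerateOptions {
  prompt: string;
  n: number;
  size?: string;
  providerOptions?: { orq?: Record<string, unknown> };
}

// Builds a JSON body for /images/generations without response_format,
// so each upstream falls back to its own default output shape.
function buildGenerationBody(
  modelId: string,
  options: GenerateOptions,
): Record<string, unknown> {
  return {
    model: modelId,
    prompt: options.prompt,
    n: options.n,
    // Only include size when the caller set it.
    ...(options.size !== undefined ? { size: options.size } : {}),
    // User-supplied extras (quality, style, background, ...) pass through.
    ...(options.providerOptions?.orq ?? {}),
  };
}
```

For example, calling it with `providerOptions: { orq: { quality: "high" } }` yields a body containing `quality` but no `response_format` key.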

@hamza-adami hamza-adami force-pushed the feat/vercel-provider-image-edits branch 2 times, most recently from a38e929 to ac5a10f Compare May 15, 2026 11:40
…ross Orq upstreams

Replace direct use of `OpenAICompatibleImageModel` with a dedicated
`OrqAiImageModel` that targets the Orq AI proxy correctly across all
upstream providers (OpenAI, fal, Google Imagen, Leonardo, etc.).

Problems fixed
--------------
1. `OpenAICompatibleImageModel`'s `/images/generations` branch hardcodes
   `response_format: "b64_json"` in the request body. Newer OpenAI image
   models (gpt-image-1 and later) reject this parameter with
   `Unknown parameter: 'response_format'`, and the Orq proxy forwards
   that error verbatim. Image generation through Orq was broken for any
   gpt-image-1+ model.

2. The upstream response schema only accepts `data[].b64_json` as a
   required string. Several Orq-routed upstreams (fal, Leonardo, dall-e
   default) return the image under `data[].url` instead, either as a data
   URI or as an http URL. These responses failed schema validation with
   `AI_TypeValidationError: data[0].b64_json: Invalid input: expected
   string, received undefined`. This affected both the generation and the
   edit branches because they share one schema.

3. The upstream `/images/edits` branch forwards a `mask` field if present
   in the call options. Orq's `CreateImageEditRequestBody` schema (see
   orq-node) does not define `mask`, so requests carrying it are rejected
   server-side. The new model drops `mask` with a warning instead.

Note: image *editing* was technically reachable on `main` via the
inherited upstream class and worked for the common gpt-image-1 path
(because the edit branch does not send `response_format` and gpt-image-1
always returns `b64_json`). It was broken for dall-e-2 with default
`response_format: url` and for any call that included a `mask`. This PR
makes editing robust across all Orq-routed models, not just gpt-image-1.
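
A minimal sketch of the "drop `mask` with a warning" behaviour from problem 3 follows. The names `buildEditFields`, `EditOptions`, and `SketchWarning` are hypothetical, and the warning shape is a stand-in for illustration rather than the SDK's own warning type.

```typescript
// Stand-in warning shape for this sketch (not the SDK's type).
interface SketchWarning {
  type: "other";
  message: string;
}

// Hypothetical, simplified edit call options.
interface EditOptions {
  prompt: string;
  n?: number;
  size?: string;
  mask?: unknown;
}

// Collects the text fields for an /images/edits request. A mask, if
// present, is never forwarded; a warning is recorded instead.
function buildEditFields(
  modelId: string,
  options: EditOptions,
): { fields: Record<string, string>; warnings: SketchWarning[] } {
  const warnings: SketchWarning[] = [];
  const fields: Record<string, string> = {
    model: modelId,
    prompt: options.prompt,
  };
  if (options.n !== undefined) fields.n = String(options.n);
  if (options.size !== undefined) fields.size = options.size;
  if (options.mask != null) {
    warnings.push({
      type: "other",
      message: "mask is not part of the Orq image edit schema and was dropped",
    });
  }
  return { fields, warnings };
}
```

Returning the warning alongside the fields lets the caller surface it in the result instead of failing the request server-side.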

Changes
-------
- New `OrqAiImageModel` implementing `ImageModelV3` (specificationVersion
  `"v3"`, `maxImagesPerCall = 10`), wired up via `provider.imageModel(id)`
  in `createOrqAiProvider`.

- `/images/generations` branch drops `response_format` so each underlying
  upstream uses its own default — gpt-image-1 returns base64 inline,
  dall-e-* defaults to a URL, fal/Leonardo return data URIs. User-supplied
  extras (quality, style, background, etc.) pass through via
  `providerOptions.orq`.

- Response schema accepts every shape documented in orq-node's
  `CreateImageData` (`{ b64_json?, url?, revised_prompt? }`) plus an
  optional `usage` envelope (`input_tokens`, `output_tokens`,
  `total_tokens`).

- `normalizeImageEntry` handles three response shapes uniformly:
    * `b64_json` field -> returned as-is
    * `url` data URI -> strips the prefix
    * `url` http(s) URL -> fetches and base64-encodes
  Always returns base64 strings, matching the `ImageModelV3` contract.

- `/images/edits` branch (multipart form-data) activates when
  `files != null && files.length > 0`. Each file is converted to a
  `File` with a `.png` / `.jpg` / `.webp` filename so Orq's middleware
  doesn't reject the part as `application/octet-stream`. The `mask`
  parameter is dropped with a warning because Orq's edit schema does not
  carry a mask field.

- `defaultOrqAiErrorStructure` is an inline copy of the upstream OpenAI
  error envelope. `@ai-sdk/openai-compatible` exports the type but not
  the value, so we replicate the schema. Orq's proxy returns OpenAI-shaped
  error envelopes, so this matches 1:1.

- `createOrqAiProvider` now throws if `apiKey` is empty, and the URL
  builder no longer wraps the path in a redundant `new URL(...)`.

- Adds `zod` as a peer dependency, matching the upstream
  `@ai-sdk/openai-compatible` pattern (`^3.25.76 || ^4.1.8`).
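
The filename handling in the `/images/edits` bullet can be sketched as a lookup from media type to extension. The helper name `fileNameFor`, the `image-<index>` naming scheme, and the `.png` fallback for unknown types are all assumptions of this sketch, not details confirmed by the PR.

```typescript
// Media types named in the PR, mapped to their filename extensions.
const EXTENSION_BY_MEDIA_TYPE: Record<string, string> = {
  "image/png": "png",
  "image/jpeg": "jpg",
  "image/webp": "webp",
};

// Picks a filename with a recognizable image extension so the multipart
// part is not treated as application/octet-stream.
// The "png" fallback is an assumption of this sketch.
function fileNameFor(mediaType: string, index: number): string {
  const extension = EXTENSION_BY_MEDIA_TYPE[mediaType] ?? "png";
  return `image-${index}.${extension}`;
}
```

The resulting name would then be passed as the second argument when constructing each `File` that is appended to the multipart form.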

Spec compliance verified against:
- `@ai-sdk/provider`'s `ImageModelV3` type (return shape: `images`,
  `warnings`, `response { timestamp, modelId, headers }`, optional `usage`).
- Upstream `OpenAICompatibleImageModel@2.0.35` in `vercel/ai` for body /
  schema / warning conventions.
- `orq-ai/orq-node`'s `CreateImageRequestBody`,
  `CreateImageEditRequestBody`, `CreateImageData`, and `CreateImageUsage`
  for body + response shapes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hamza-adami hamza-adami force-pushed the feat/vercel-provider-image-edits branch from ac5a10f to e6f2772 Compare May 15, 2026 11:54
@thedevtoni thedevtoni merged commit 8abaafa into orq-ai:main May 15, 2026
1 of 2 checks passed