##### Using Publicly Available ModelCar Catalog registry

You have several options for deploying models to your OpenShift AI cluster.
We recommend using **[ModelCar](https://kserve.github.io/website/docs/model-serving/storage/providers/oci#using-modelcars)**
because it removes the need to manually download models from Hugging Face Hub,
upload them to S3, or manage access permissions. With ModelCar, you can package
models as OCI images and pull them at runtime or precache them. This simplifies
versioning, improves traceability, and integrates cleanly into CI/CD workflows.
ModelCar images also ensure reproducibility and maintain versioned model releases.

You can deploy your own model using a ModelCar container, which packages all
model files into an OCI container image. To learn more about ModelCar containers,
read the article **[Build and deploy a ModelCar container in OpenShift AI](https://developers.redhat.com/articles/2025/01/30/build-and-deploy-modelcar-container-openshift-ai)**.
It explains the benefits of ModelCar containers, how to build a ModelCar image,
and how to deploy it with OpenShift AI.
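As a minimal sketch of the packaging step described above, a ModelCar Containerfile can be as small as a base image plus a copy of the model files; KServe's modelcar support looks for the model under `/models` inside the image. The base image and local directory name here are illustrative, not taken from the linked article:

```dockerfile
# Minimal ModelCar image sketch (illustrative base image and paths).
# KServe expects the model files under /models inside the OCI image.
FROM registry.access.redhat.com/ubi9/ubi-micro:latest

# Copy a locally downloaded model (e.g. fetched from Hugging Face Hub)
# into the directory the serving runtime will read from.
COPY ./my-downloaded-model /models
```

You would then build and push the image with your usual OCI tooling (for example `podman build` and `podman push`) and reference it from the model server with an `oci://` storage URI.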

For additional patterns and prebuilt ModelCar images, explore the Red Hat AI
Services ModelCar Catalog registry.
!!! tip "Use Any Other Available Model from the ModelCar Catalog registry"

    **You can use any model from the ModelCar Catalog registry in a similar way.**
    For example, for the `Granite-3.3-8B-Instruct` model, you can use the publicly
    available container image from the **Quay.io** registry: **[quay.io/redhat-ai-services/modelcar-catalog:granite-3.3-8b-instruct](https://quay.io/repository/redhat-ai-services/modelcar-catalog?tag=granite-3.3-8b-instruct)**.

    The **[Granite-3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct)**
    model is an 8-billion-parameter, 128K context-length language model fine-tuned
    for improved reasoning and instruction-following capabilities. It is built
    on top of the `Granite-3.3-8B-Base` model.

    To create a connection for the `Granite-3.3-8B-Instruct` model, use the
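For illustration, an `InferenceService` that pulls this ModelCar image at runtime via an `oci://` storage URI might look like the following sketch. The resource name, model format, and GPU limit are assumptions for this example, not values from this guide:

```yaml
# Hypothetical InferenceService referencing the ModelCar image by oci:// URI.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-3-3-8b-instruct   # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM                # assumed serving runtime format
      storageUri: oci://quay.io/redhat-ai-services/modelcar-catalog:granite-3.3-8b-instruct
      resources:
        limits:
          nvidia.com/gpu: "1"     # illustrative GPU request
```

With this pattern, swapping models is just a matter of changing the image tag in `storageUri`.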

Additionally, you may find it helpful to read **[Optimize and deploy LLMs for production with OpenShift AI](https://developers.redhat.com/articles/2025/10/06/optimize-and-deploy-llms-production-openshift-ai)**.

##### Using Model Catalog

Recent versions of RHOAI include support for the **Model Catalog**, enabling users
to easily discover, evaluate, and deploy generative AI models from a centralized
interface. This feature provides access to models from multiple providers such
as Red Hat, IBM, Meta, NVIDIA, Mistral AI, and Google, with built-in benchmarking
based on open-source evaluation datasets to compare performance and quality. It
simplifies the workflow by allowing data scientists and AI engineers to select
suitable models, register them in a model registry, and deploy them directly to
a serving runtime.

Models can be deployed directly from the model catalog to streamline the deployment
process. For more details, refer to the [**official documentation**](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/3.0/html-single/working_with_model_registries/index#deploying-a-model-from-the-model-catalog_model-registry).

## Setting up Single-model Server and Deploy the model