Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 53 additions & 0 deletions demo/clusters/openshift/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# Running on Red Hat OpenShift

This document covers OpenShift-specific requirements for deploying the driver.

## OpenShift Version

Ensure you have OpenShift 4.19 or later.
For installation methods, see [OpenShift documentation](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/installation_overview/ocp-installation-overview).

## `TechPreviewNoUpgrade` Feature Set

On OpenShift 4.20 (Kubernetes 1.33) and earlier, enable DRA through the `TechPreviewNoUpgrade` feature set (includes the `DynamicResourceAllocation` feature gate), either during installation or post-install.
See [Enabling features using FeatureGates](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/nodes/working-with-clusters#nodes-cluster-enabling-features-about_nodes-cluster-enabling).

**WARNING:** Enabling `TechPreviewNoUpgrade` disables future cluster upgrades and is irreversible.

## Scheduler Profile

On OpenShift 4.19, enable DRA in the cluster scheduler:

```console
$ oc patch --type merge -p '{"spec":{"profile": "HighNodeUtilization", "profileCustomizations": {"dynamicResourceAllocation": "Enabled"}}}' scheduler cluster
```

## NVIDIA GPU Operator

Install the NVIDIA GPU Operator for OpenShift.
Ensure you have version 25.10.0 or later.
For installation steps, see [NVIDIA GPU Operator on Red Hat OpenShift Container Platform](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html).

Ensure CDI (Container Device Interface) mode is enabled in the `ClusterPolicy`:

```yaml
cdi:
enabled: true
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the default now with GPU Operator 25.10.0. We can probably still leave this step in, right? (I suppose that's your point here... making damn sure :-)).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, people may change it to 'false' to keep the old behavior, i.e. before CDI support was introduced on OpenShift. Obviously, the driver won't work correctly with cdi: false.


## NVIDIA GPU Driver Location

On OpenShift, the location of the driver provided by the GPU operator differs from the default.
Set the following value when installing the Helm chart:

```yaml
nvidiaDriverRoot: /run/nvidia/driver
Comment thread
empovit marked this conversation as resolved.
```

Example using `--set`:

```console
$ helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
<...>
--set nvidiaDriverRoot=/run/nvidia/driver
```