OpenShift Validated Reference Design for Enterprise AI

IMPORTANT NOTE: This project is a community-driven reference design and is NOT officially supported by Red Hat. It is provided as-is for validation and reference purposes. For production deployments, please consult Red Hat documentation and support channels.

Overview

This reference configuration uses the kube-compare tool to validate a complete Red Hat OpenShift AI deployment, checking that every required component is present and configured as expected for production-ready Enterprise AI workloads.

The reference validates the following infrastructure layers:

Foundation Operators (mandatory): Node Feature Discovery (NFD) and Kernel Module Management (KMM)
Storage Backends (conditional): Logical Volume Manager Storage (LVMS) or alternative storage solutions
GPU Infrastructure (requires at least one): NVIDIA GPU Operator or AMD GPU Operator
OpenShift AI Platform (mandatory): OpenShift AI Operator, Service Mesh, Serverless, and Authorino

Architecture

The reference uses the kube-compare v2 reference format, which applies a logical matching strategy to each group of components:

allOf: Mandatory components that must all be present (Foundation Operators, OpenShift AI Platform)
allOrNoneOf: All-or-nothing validation for optional stacks (Storage, GPU vendors); a stack must be either fully present or entirely absent, so a partial deployment is reported as a failure

This approach enforces architectural requirements while allowing configuration flexibility for different infrastructure scenarios.
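To make the two strategies concrete, here is a minimal shell sketch of their pass/fail semantics. It is an illustration only, not kube-compare's implementation: the function names `all_of`, `all_or_none_of`, and `report`, and the idea of counting matched resources, are assumptions made for this example.

```shell
# Illustrative only: models the pass/fail rule of each strategy by counting
# how many of a group's expected resources were found in the cluster.
# $1: resources found, $2: resources expected.

# allOf passes only when every expected resource is present.
all_of() {
  [ "$1" -eq "$2" ]
}

# allOrNoneOf passes when the group is complete or entirely absent,
# and fails on a partial deployment.
all_or_none_of() {
  [ "$1" -eq 0 ] || [ "$1" -eq "$2" ]
}

# Print PASS or FAIL for a named group (hypothetical helper).
# $1: group name, $2: strategy function, $3: found, $4: expected.
report() {
  name=$1
  strategy=$2
  found=$3
  expected=$4
  if "$strategy" "$found" "$expected"; then
    echo "$name: PASS"
  else
    echo "$name: FAIL ($found of $expected present)"
  fi
}
```

For example, `report LVMS all_or_none_of 1 3` prints `LVMS: FAIL (1 of 3 present)`, while the same call with 0 or 3 resources found prints `LVMS: PASS`.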

Structure

.
├── metadata.yaml              # Main kube-compare configuration
├── common/
│   ├── operators/            # Foundation operators (NFD, KMM)
│   ├── platform/             # OpenShift AI platform components
│   └── storage/              # Storage backend configurations (LVMS)
├── vendors/
│   ├── nvidia/               # NVIDIA GPU operator
│   └── amd/                  # AMD GPU operator
└── docs/
    ├── foundation-operators.md    # Foundation layer coding standards
    ├── storage-backends.md        # Storage configuration guidelines
    ├── gpu-infrastructure.md      # GPU vendor selection and setup
    └── openshift-ai.md            # Platform component requirements

Prerequisites

Red Hat OpenShift 4.20 or later
kube-compare CLI tool installed
Cluster admin access for validation
GPU hardware (NVIDIA or AMD) for AI/ML acceleration
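
The software prerequisites above can be checked before running a comparison. The following is a rough sketch under stated assumptions: `version_at_least`, `check_prerequisites`, and `REQUIRED_VERSION` are hypothetical names, the cluster version is read from the `ClusterVersion` resource named `version`, and GPU hardware is not checked at all.

```shell
# Succeeds when version $1 is at least version $2, comparing major.minor
# numerically (so 4.9 is correctly treated as older than 4.20).
version_at_least() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  want_major=${2%%.*}
  want_rest=${2#*.}
  want_minor=${want_rest%%.*}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
    return $?
  fi
  [ "$have_minor" -ge "$want_minor" ]
}

# Check the CLI, plugin, cluster version, and permissions for the current
# oc login. REQUIRED_VERSION defaults to the minimum listed above.
check_prerequisites() {
  required=${REQUIRED_VERSION:-4.20}
  command -v oc >/dev/null 2>&1 ||
    { echo "oc not found in PATH" >&2; return 1; }
  oc cluster-compare --help >/dev/null 2>&1 ||
    { echo "kube-compare plugin not available to oc" >&2; return 1; }
  current=$(oc get clusterversion version \
    -o jsonpath='{.status.desired.version}') || return 1
  version_at_least "$current" "$required" ||
    { echo "OpenShift $current is older than $required" >&2; return 1; }
  oc auth can-i '*' '*' --all-namespaces >/dev/null 2>&1 ||
    { echo "current user does not have cluster-wide admin rights" >&2; return 1; }
  echo "prerequisites OK (OpenShift $current)"
}
```

After sourcing the snippet, run `check_prerequisites` while logged in to the target cluster. The version comparison is deliberately numeric per field, because a plain string comparison would rank 4.9 above 4.20.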

Usage

Compare Against Live Cluster

Compare your running OpenShift cluster against the reference configuration:

oc cluster-compare \
  -r https://raw.githubusercontent.com/leo8a/openshift-reference-enterprise-ai/refs/heads/main/metadata.yaml

Compare with Local must-gather Files

Validate using must-gather output for offline analysis:

oc cluster-compare \
  -r https://raw.githubusercontent.com/leo8a/openshift-reference-enterprise-ai/refs/heads/main/metadata.yaml \
  -f "must-gather.local.*/*/cluster-scoped-resources","must-gather.local.*/*/namespaces" \
  -R
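
The two invocations differ only in the `-f` and `-R` arguments, so they can be wrapped in one helper. This is a convenience sketch, not part of the repository: `run_compare`, `REFERENCE`, and `DRY_RUN` are hypothetical names, and only the `-r`, `-f`, and `-R` flags shown above are used.

```shell
# Reference to validate against. Defaults to the URL used above; set
# REFERENCE to a local metadata.yaml to test a modified checkout.
REFERENCE="${REFERENCE:-https://raw.githubusercontent.com/leo8a/openshift-reference-enterprise-ai/refs/heads/main/metadata.yaml}"

# Run oc cluster-compare against the live cluster, or against must-gather
# output when a directory (or quoted glob) is passed as $1.
# With DRY_RUN=1 the command is printed instead of executed.
run_compare() {
  gather_dir=${1:-}
  set -- oc cluster-compare -r "$REFERENCE"
  if [ -n "$gather_dir" ]; then
    # The globs stay quoted so that oc, not the shell, expands them.
    set -- "$@" \
      -f "$gather_dir/*/cluster-scoped-resources,$gather_dir/*/namespaces" \
      -R
  fi
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

Calling `run_compare` with no argument compares against the live cluster, and `run_compare "must-gather.local.*"` reproduces the offline command. Setting `DRY_RUN=1` first prints the command so it can be reviewed before it runs.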

Validation Strategy

The reference enforces the following validation logic:

Foundation Operators: Both NFD and KMM must be present
Storage: LVMS stack must be complete if deployed, or alternative storage must be available
GPU Infrastructure: At least one GPU vendor operator (NVIDIA or AMD) must be fully deployed
OpenShift AI Platform: All four components (Red Hat OpenShift AI (RHOAI) Operator, Service Mesh, Serverless, Authorino) must be present
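
The GPU rule combines two conditions: each vendor stack is all-or-nothing, and at least one vendor must be complete. The sketch below shows one way to read that combination. It is an interpretation, not the logic in metadata.yaml: `component_state` and `gpu_layer_ok` are hypothetical names, and treating a partial vendor stack as a failure even when the other vendor is complete is an assumption drawn from the allOrNoneOf description in the Architecture section.

```shell
# Classify one vendor stack as "absent", "complete", or "partial".
# $1: expected resources found, $2: resources expected.
component_state() {
  if [ "$1" -eq 0 ]; then
    echo absent
  elif [ "$1" -eq "$2" ]; then
    echo complete
  else
    echo partial
  fi
}

# The GPU layer passes when no vendor stack is partial and at least one
# vendor stack is complete.
# Arguments: nvidia_found nvidia_expected amd_found amd_expected
gpu_layer_ok() {
  nvidia=$(component_state "$1" "$2")
  amd=$(component_state "$3" "$4")
  if [ "$nvidia" = partial ] || [ "$amd" = partial ]; then
    return 1
  fi
  [ "$nvidia" = complete ] || [ "$amd" = complete ]
}
```

Under this reading, `gpu_layer_ok 3 3 0 2` succeeds (NVIDIA complete, AMD absent), while `gpu_layer_ok 0 3 0 2` (no vendor) and `gpu_layer_ok 3 3 1 2` (AMD partially deployed) both fail.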

Customization

Refer to the documentation in docs/ for detailed coding standards and customization guidelines. Typical customizations:

Modify component versions in subscription manifests
Adjust storage configurations for your infrastructure
Configure GPU settings for specific hardware
Customize DataScienceCluster settings for workload requirements

Contributing

Contributions are welcome! Please ensure:

All changes follow the coding standards documented in docs/
New components include appropriate validation strategy (allOf, allOrNoneOf, etc.)
Component descriptions clearly explain purpose and dependencies
Changes are tested with kube-compare validation

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
