IMPORTANT NOTE: This project is a community-driven reference design and is NOT officially supported by Red Hat. It is provided as-is for validation and reference purposes. For production deployments, please consult Red Hat documentation and support channels.
This reference configuration uses the kube-compare tool to validate a complete Red Hat OpenShift AI deployment, ensuring all required components are properly configured for production-ready Enterprise AI workloads.
The reference validates the following infrastructure layers:
- Foundation Operators (mandatory): Node Feature Discovery (NFD) and Kernel Module Management (KMM)
- Storage Backends (conditional): Logical Volume Manager Storage (LVMS) or alternative storage solutions
- GPU Infrastructure (requires at least one): NVIDIA GPU Operator or AMD GPU Operator
- OpenShift AI Platform (mandatory): OpenShift AI Operator, Service Mesh, Serverless, and Authorino
The reference uses kube-compare V2 format with logical matching strategies:
- allOf: Mandatory components that must all be present (Foundation Operators, OpenShift AI Platform)
- allOrNoneOf: Complete stack validation, where either all components are present or none are (Storage, GPU vendors)
This approach enforces architectural requirements while allowing configuration flexibility for different infrastructure scenarios.
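The two strategies can be read as simple predicates over a group of expected components. The sketch below is a minimal Python illustration of that reading, not kube-compare's implementation; the function names and component names are hypothetical and chosen only for the example.

```python
def all_of(expected, present):
    """allOf: every expected component must be present."""
    return set(expected) <= set(present)


def all_or_none_of(expected, present):
    """allOrNoneOf: valid only if the group is fully present or fully absent."""
    expected = set(expected)
    found = expected & set(present)
    return len(found) in (0, len(expected))


# Hypothetical component names, for illustration only.
foundation = ["nfd", "kmm"]
lvms_stack = ["lvms-operator", "lvm-cluster"]

# A cluster with both foundation operators but only half of the LVMS stack.
cluster = {"nfd", "kmm", "lvms-operator"}

foundation_ok = all_of(foundation, cluster)
storage_ok = all_or_none_of(lvms_stack, cluster)
print(foundation_ok, storage_ok)  # True False
```

A partially deployed group is the case allOrNoneOf is meant to catch: the storage check fails here even though one LVMS component was found.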
.
├── metadata.yaml                   # Main kube-compare configuration
├── common/
│   ├── operators/                  # Foundation operators (NFD, KMM)
│   ├── platform/                   # OpenShift AI platform components
│   └── storage/                    # Storage backend configurations (LVMS)
├── vendors/
│   ├── nvidia/                     # NVIDIA GPU operator
│   └── amd/                        # AMD GPU operator
└── docs/
    ├── foundation-operators.md     # Foundation layer coding standards
    ├── storage-backends.md         # Storage configuration guidelines
    ├── gpu-infrastructure.md       # GPU vendor selection and setup
    └── openshift-ai.md             # Platform component requirements

- Red Hat OpenShift 4.20 or later
- kube-compare CLI tool installed
- Cluster admin access for validation
- GPU hardware (NVIDIA or AMD) for AI/ML acceleration
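The version prerequisite is easy to get wrong if versions are compared as strings or floats, since 4.9 would then sort after 4.20. The following is a minimal Python sketch of a numeric comparison plus a PATH check for the oc binary. The helper names are hypothetical, and the version string is assumed to come from whatever source reports your cluster version.

```python
import shutil

MINIMUM = (4, 20)  # Red Hat OpenShift 4.20, per the prerequisites


def meets_minimum(version, minimum=MINIMUM):
    """Compare the major.minor part of a version string numerically."""
    major, minor = version.split(".")[:2]
    return (int(major), int(minor)) >= minimum


def oc_available():
    """Return True if an oc binary is on PATH (plugins are not checked)."""
    return shutil.which("oc") is not None


print(meets_minimum("4.20.3"))  # True
print(meets_minimum("4.9.12"))  # False: 9 < 20 when compared as integers
```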
Compare your running OpenShift cluster against the reference configuration:
oc cluster-compare \
-r https://raw.githubusercontent.com/leo8a/openshift-reference-enterprise-ai/refs/heads/main/metadata.yaml
Validate using must-gather output for offline analysis:
oc cluster-compare \
-r https://raw.githubusercontent.com/leo8a/openshift-reference-enterprise-ai/refs/heads/main/metadata.yaml \
-f "must-gather.local.*/*/cluster-scoped-resources","must-gather.local.*/*/namespaces" \
-R
The reference enforces the following validation logic:
- Foundation Operators: Both NFD and KMM must be present
- Storage: LVMS stack must be complete if deployed, or alternative storage must be available
- GPU Infrastructure: At least one GPU vendor operator (NVIDIA or AMD) must be fully deployed
- OpenShift AI Platform: All four components (RHOAI, Service Mesh, Serverless, Authorino) must be present
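Taken together, these rules amount to a per-layer check. The Python below is an illustrative model only: the component identifiers are hypothetical placeholders rather than the resource names used in metadata.yaml, and the "alternative storage" branch is not modeled because its detection is not described here.

```python
# Hypothetical component identifiers for each layer.
FOUNDATION = {"nfd", "kmm"}
LVMS = {"lvms-operator", "lvm-cluster"}
GPU_VENDORS = {
    "nvidia": {"nvidia-gpu-operator", "nvidia-cluster-policy"},
    "amd": {"amd-gpu-operator", "amd-device-config"},
}
PLATFORM = {"rhoai", "service-mesh", "serverless", "authorino"}


def validate(present):
    """Return a dict mapping each layer name to True (pass) or False (fail)."""
    present = set(present)

    def complete(group):
        return group <= present

    def absent(group):
        return not (group & present)

    vendors = list(GPU_VENDORS.values())
    return {
        "foundation": complete(FOUNDATION),
        # LVMS must be complete if any part of it is deployed.
        "storage": complete(LVMS) or absent(LVMS),
        # No vendor may be partially deployed, and at least one must be complete.
        "gpu": all(complete(v) or absent(v) for v in vendors)
        and any(complete(v) for v in vendors),
        "platform": complete(PLATFORM),
    }


# Example: an NVIDIA-only cluster without LVMS passes every layer.
report = validate(FOUNDATION | PLATFORM | GPU_VENDORS["nvidia"])
print(report)
```

Modeling the GPU layer as two conditions keeps the "at least one vendor" requirement separate from the "no partial stack" requirement, so a cluster with a complete NVIDIA stack and a stray AMD component still fails.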
Refer to the documentation in docs/ for detailed coding standards and customization guidelines:
- Modify component versions in subscription manifests
- Adjust storage configurations for your infrastructure
- Configure GPU settings for specific hardware
- Customize DataScienceCluster settings for workload requirements
Contributions are welcome! Please ensure:
- All changes follow the coding standards documented in docs/
- New components include an appropriate validation strategy (allOf, allOrNoneOf, etc.)
- Component descriptions clearly explain purpose and dependencies
- Changes are tested with kube-compare validation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.