Releases: dell/omnia
Omnia 2.1.0.0-rc1
Key Features
- Added InfiniBand OFED support for Slurm cluster nodes.
- Added PowerVault block storage support for the Slurm controller node.
- Upgraded the Image Builder (x86_64 and aarch64) base from AlmaLinux 8 to AlmaLinux 10.
- Deprecated Docker Hub delivery of the Omnia core container; building the Omnia core image locally is now the only supported method.
Documentation
- Omnia v2.1.0.0-rc1 Documentation: https://omnia.readthedocs.io/en/v2.1.0.0-rc1/index.html
Omnia 2.0.0.0
Key Features
- Support for Podman Containers
  - Omnia Core – Orchestrates HPC cluster operations
  - Omnia Auth – Provides LDAP-based authentication
  - OpenCHAMI – Powers diskless provisioning workflows
  - Pulp Repository Service – Hosts local repositories for air-gapped deployments
- Repository Management
  - Pulp-based local repository service deployed as a Podman container for faster, network-independent provisioning
- Authentication Service
  - Integrated LDAP server within Omnia Auth for centralized authentication
- Telemetry Collection and Monitoring
  - iDRAC Telemetry, LDMS Telemetry, and air-gapped telemetry support
- Kubernetes Cluster High Availability
  - Built-in HA failover for Kubernetes control plane nodes
- Role-Based Provisioning
  - Functional group-based provisioning with automated role assignment and OS image customization
- Stateless Boot
  - Stateless provisioning for RHEL 10 using OpenCHAMI
- Automatic CUDA Installation
  - GPU nodes provisioned with CUDA for HPC workloads
- Security Enhancements
  - Credentials encrypted using industry-standard algorithms
- Platform Support
  - Supports x86_64 and aarch64 architectures
- Input Template and Validator
  - Predefined templates and early validation to reduce errors and accelerate provisioning
Documentation
- Omnia v2.0.0.0 Documentation: https://omnia.readthedocs.io/en/v2.0.0.0/index.html
Omnia 1.7.1
New Features:
- Platform enablement of AMD 17G PowerEdge servers - R6725, R7725, R6715, R7715
- New operating system support - Ubuntu 24.04
- Enablement of Intel Gaudi 3 accelerator on Ubuntu 24.04 & 22.04 OS
- Enablement of NVIDIA accelerators - L40s, H100 NVL, H200 SXM
- NVIDIA Collective Communications Library (NCCL) 2.25.1 on nodes running Ubuntu 24.04 OS
- NVIDIA GPU operator (25.3.0) on nodes running Ubuntu 24.04 OS
- Support for ROCm Communication Collectives Library (RCCL) 2.21.5 on nodes with AMD accelerators
- Support for RoCE configuration with Calico network plugin
- Ability to add external nodes (with pre-loaded OS and internet connectivity) to a Kubernetes (K8s) cluster
- Addition of Multus-CNI plugin (4.1.4) and Whereabouts plugin (0.8.0) for Kubernetes (K8s)
- Ability to configure additional NICs and update kernel parameters during compute node provisioning
- Upgrade support on Omnia Infrastructure Manager (OIM) from v1.7 to v1.7.1
- Software stack updates:
  - Intel Gaudi driver - 1.19.2
  - Kubernetes - 1.31.4
  - Kubespray - 2.27
  - CSI PowerScale driver - 2.13.0
  - NVIDIA CUDA - 12.8
  - NVIDIA vLLM - 0.7.2
  - AMD ROCm - 6.3.1
  - Grafana - 11.4.1
  - BCM RoCE - 232.1.133.2
  - Jinja - 3.1.6
Documentation Enhancements:
- Grouping by OS version in the "Software Installed by Omnia" subsection
- Addition of the "Unsupported packages based on cluster OS" subsection
Omnia 1.7.1-rc2
- Kubernetes version 1.31.4 support
- Ubuntu 24.04 support with netplan configuration
- IP rule assignment playbook integration with server spec update
- Server spec update & kernel parameters update while provisioning
- Security vulnerability fix - jinja2 updated to version 3.1.5
- Xilinx device plugin downgraded to version 1.2.0
- Fixed broken links for Rocky OS.
Omnia 1.7.1-rc1
What’s New in this pre-release:
- Kubernetes version 1.31.4 support
- Ubuntu 24.04 support
- Security vulnerability fix - jinja2 updated to version 3.1.5
- Xilinx device plugin downgraded to version 1.2.0
- Fixed broken links for Rocky OS.
Omnia 1.7
Note: A fix for the broken symlink in Omnia 1.7 that blocked deployments on Rocky Linux clusters is available in Omnia 1.7.1-rc2 (pre-release) and later releases. We recommend that Rocky Linux users upgrade to Omnia 1.7.1-rc2 or later for smooth cluster deployments.
What’s New in this release:
- Refresh for XE9680 with AMD MI300X accelerators and PowerSwitch Z9864F-based network architecture
- Pre-enablement for XE9680 with Intel Gaudi 3 accelerators
- NVIDIA container toolkit for NVIDIA accelerators
- Installation of Kubernetes stack v1.29
- Sample playbook for a pre-trained Generative AI model - Llama 3.1
- CSI drivers for Kubernetes to access PowerScale storage
- Internal OpenLDAP server configuration as a proxy server
- Corporate proxy on RHEL, Rocky Linux, and Ubuntu clusters
- Omnia execution within a virtual environment with Python 3.11 and Ansible 9.5.1
- Setting OS kernel command-line parameters using the server_spec_update utility
- Revamped Omnia documentation featuring OS-specific install guides, deployment-flow diagram, and other enhancements
Omnia 1.6.1
This patch release fixes the following issue:
- The dependent package ‘libssl1.1_1.1.1f-1ubuntu2.22_amd64’ required by Omnia 1.6 is no longer available for Ubuntu 22.04 OS.
Note:
- With Omnia 1.6.1, new cluster deployments will encounter a TLS CA certificate error with OpenLDAP due to changes in the dependent package ‘openldaptoolbox’. To resolve this, we recommend using Omnia 1.7 for new cluster deployments with OpenLDAP.
- A critical security vulnerability in the cryptography software used by Omnia versions 1.6.1 and earlier has been resolved in Omnia 1.7 by updating the cryptography software to version 44.0.0. We recommend that users upgrade to Omnia 1.7.
Omnia 1.6
This release has been deprecated because the dependent package ‘libssl1.1_1.1.1f-1ubuntu2.22_amd64’ is no longer available for Ubuntu 22.04.
Note: Before running the local repo in an Omnia 1.6 production environment on Ubuntu 22.04, apply the fix by following the Omnia 1.6.1 upgrade flow.
Omnia has been enhanced to offer:
- Hardware Enablement
  - Enablement for AI workloads on XE9680 with AMD MI300X GPUs
- OS enablement
- Enablement for AI
  - Install GPU device plugin for Kubernetes
    - GPU device plugin for AMD
    - GPU device plugin for NVIDIA
- Additional Features
  - One-off utility to add or remove a node
  - HPC/AI cluster inventory partitioning
    - CPU inventory
    - AMD GPU inventory
    - NVIDIA GPU inventory
Omnia 1.5.1
This patch release makes the following changes:
- Installation of Kubernetes 1.16 and 1.19 is deprecated.
- Spark Operator support is deprecated.
- Omnia now installs Kubernetes 1.26.
Kubeflow is not supported on v1.5.1 due to the Kubernetes upgrade.
Omnia 1.4.3.1
This release adds support for the following features:
- Hardware Support: Intel E810 NIC, ConnectX-5/6 NICs.
- The Omnia GitHub repository now hosts a “genesis” image with this functionality baked in for initial bootup.
- Host aliasing for Scheduler and IPA authentication.
- Login and Manager Node access from both public and private NICs.
- Validation check enhancements:
  - Checks are rearranged to occur as early as possible.
  - Checks are isolated when running smaller playbooks.
- Added a Benchmark Install Guide: OneAPI for Intel, MPI AOCC HPL for AMD.