Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Overview
The H-series virtual machines (VMs) are the latest HPC offerings on Azure. HB-series VMs offer 60-core AMD EPYC processors and are optimized for running applications with high memory-bandwidth requirements, such as explicit finite element analysis, fluid dynamics, and weather modeling. The HC-series VMs have 44-core Intel Xeon Skylake processors and are optimized for applications requiring intensive CPU calculations, like molecular dynamics and implicit finite element analysis. HB and HC VMs feature 100-Gb/s EDR InfiniBand and support the latest MPI types and versions. For more information on how to scale HPC applications on HB and HC VMs, see the Scaling HPC Applications Guide.
Azure CycleCloud supports the new H-series VMs, but for the best experience and performance, follow the guidelines and best practices in this article.
CentOS 7.6 HPC Marketplace image
The CentOS 7.6 HPC Marketplace image contains all of the drivers to enable the InfiniBand interface as well as precompiled versions of all of the common MPI variants installed in /opt. For details on what the image offers, see this blog post.
To use the CentOS 7.6 HPC image when creating your cluster, check the Custom Image box on the Advanced Settings parameter and enter the value OpenLogic:CentOS-HPC:7.6:latest
.
To support the older H16r VM series and keep cluster head nodes locked to the same version of CentOS, the default "Cycle CentOS 7" image in the Base OS dropdown deploys CentOS 7.4. While this version works for most VM series, HB and HC VMs require CentOS 7.6 or newer and a different Mellanox driver.
Disable SElinux in CycleCloud versions earlier than 7.7.4
By default, SElinux only considers /root
and /home
to be valid paths for home directories. If users have home directories outside of these paths, SElinux blocks SSH from using any SSH keypairs in the user's home directory. In CycleCloud clusters, you create user home directories in /shared/home
. While CycleCloud versions newer than 7.7.4 automatically set the /shared/home
path as a valid SElinux homedir context, older versions don't support this feature. To make sure SSH works properly for users on the cluster, disable SElinux in the cluster template:
[[node defaults]]
[[[configuration]]]
cyclecloud.selinux.policy = permissive
Running MPI jobs with Slurm
MPI jobs running on HB or HC VMs need to run in the same virtual machine scale set. To ensure proper autoscale placement of VMs for MPI jobs running with Slurm, set the following attribute in your cluster template:
[[nodearray execute]]
Azure.SingleScaleset = true
Azure.MaxScalesetSize = 300
Azure.Overprovision = true
Getting pkeys for use with OpenMPI and MPICH
Some MPI variants require you to specify the InfiniBand PKEY when running the job. Use the following Bash function to determine the PKEY:
get_ib_pkey()
{
key0=$(cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/0)
key1=$(cat /sys/class/infiniband/mlx5_0/ports/1/pkeys/1)
if [ $(($key0 - $key1)) -gt 0 ]; then
export IB_PKEY=$key0
else
export IB_PKEY=$key1
fi
export UCX_IB_PKEY=$(printf '0x%04x' "$(( $IB_PKEY & 0x0FFF ))")
}