Poor Cluster Shared Volume read/write performance when node is not CSV owner

CoreJN 1 Reputation point
2022-05-20T09:45:42.163+00:00

We have a 5-node Windows Server 2016 Failover Cluster set up using an HPE Nimble as shared storage. We're using the cluster for Hyper-V. All virtual machine VHDXs are stored on the cluster shared volume (CSV).

We're having problems with disk performance inside VMs when the VM is running on a node that does not own the CSV.

When transferring files via SMB between VMs that are both running on the node which owns the CSV, speeds are between 1.5GB/s and 2GB/s. If you move CSV ownership away from that node, speeds drop to ~100MB/s.

It looks as though the storage traffic is going via the 1Gb network, through the owner node and then into the SAN. From what I understand, this shouldn't be the case unless the CSV is in redirected mode. (I've not confirmed this with Wireshark or anything yet; I'm working on that.)
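
As a rough sanity check on that suspicion, the observed throughput can be compared with the theoretical ceiling of a 1 Gbit/s link. The sketch below is only back-of-the-envelope arithmetic: the function name is made up for illustration, protocol overhead is ignored, and it assumes the path in question is a single 1 Gbit/s connection, which has not been confirmed by a capture.

```python
def link_ceiling_mb_per_s(link_gbit_per_s: float) -> float:
    """Theoretical ceiling of a link in decimal MB/s, ignoring protocol overhead."""
    bits_per_second = link_gbit_per_s * 1_000_000_000
    return bits_per_second / 8 / 1_000_000


observed_mb_per_s = 100  # approximate figure reported when the node is not the CSV owner
ceiling = link_ceiling_mb_per_s(1)  # assumption: a single 1 Gbit/s link
utilisation = observed_mb_per_s / ceiling

print(f"1 Gbit/s ceiling: {ceiling:.0f} MB/s, observed is {utilisation:.0%} of it")
```

A figure of about 100MB/s sits just under the 125MB/s ceiling of a gigabit link, which is consistent with (but does not prove) the traffic crossing the 1Gb network rather than the SAN path.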

I've run the command Get-ClusterSharedVolumeState, which returned the following:

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : HyperV03
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv06
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : hyperv05
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv04
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv02
StateInfo : Direct
VolumeFriendlyName : VM-CSV
VolumeName : \\?\Volume{9323278e-8374-474c-b9e7-1097305c0d1f}\

According to this output, every node reports the CSV in Direct mode, so redirection doesn't appear to be the cause of the issue.
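
For anyone wanting to check a longer paste of this output without reading it by eye, here is a minimal sketch that parses the text and lists any node whose StateInfo is not Direct. It assumes the output has been saved as plain text in the `Key : Value` form shown in the post; `parse_records` and `sample` are illustrative names, and the sample holds only two of the five records with the VolumeName field left out.

```python
def parse_records(text: str) -> list[dict[str, str]]:
    """Split 'Key : Value' text into one dict per node record."""
    records, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines, including stray ones inside a record
        key = key.strip()
        if key in current:  # a repeated key marks the start of the next record
            records.append(current)
            current = {}
        current[key] = value.strip()
    if current:
        records.append(current)
    return records


# Two records copied from the post (VolumeName omitted for brevity).
sample = """
BlockRedirectedIOReason : NotBlockRedirected
FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : HyperV03
StateInfo : Direct

BlockRedirectedIOReason : NotBlockRedirected

FileSystemRedirectedIOReason : NotFileSystemRedirected
Name : Cluster Disk 1
Node : Hyperv06
StateInfo : Direct
"""

not_direct = [r["Node"] for r in parse_records(sample) if r.get("StateInfo") != "Direct"]
print(not_direct or "all nodes report Direct")
```

Starting a new record on a repeated key, rather than on a blank line, keeps the parser tolerant of the stray blank lines that can creep into pasted output.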

Can anyone think of another reason why this might be happening?

Connections to the SAN have all been set up using the HPE Windows Toolkit, which configures the MPIO settings and various other bits for you. We've confirmed that every node can reach the expected transfer speeds of 1GB/s+, but only while that node owns the CSV.

Windows Server 2016
Hyper-V
Windows Server Clustering

1 answer

  1. CK LIM 6 Reputation points
    2022-08-03T02:20:26.733+00:00

    I have the same issue, and we are using Windows Server 2019 Datacenter with a 5-node cluster.

    I asked the vendor to request a patch/fix from Microsoft, but they replied that this is by design and expected behaviour, which I do not agree with. I still want the vendor to pursue a fix/patch with Microsoft.

    1 person found this answer helpful.
