Microsoft-Windows-FailoverClustering event id: 5120

Article
2015-05-20

Question

_{Wednesday, May 20, 2015 7:30 AM}

Hello All,

Every few days we receive the following message from our Hyper-V Cluster.

Cluster Shared Volume 'disk 1' ('disk 1') has entered a paused state because of '(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

We are running Windows Server 2012 R2 with all patches up to 01-05-2015.

All the VMs on the CSV volume stay online and there is no interruption within the VMs. Does anybody have an idea how I can resolve this message?

All replies (5)

_{Thursday, May 21, 2015 2:52 AM ✅Answered}

First off, make sure you have all the recommended cluster hotfixes... we've done several fixes for this event scenario.
https://support.microsoft.com/en-us/kb/2920151

Here are some blog posts to check out:
http://blogs.msdn.com/b/clustering/archive/2014/02/26/10503497.aspx

as well as this one:
http://blogs.msdn.com/b/clustering/archive/2014/12/08/10579131.aspx

Hope that helps,
Elden

_{Wednesday, May 20, 2015 7:39 AM}

This can be caused by VSS pausing I/O while it prepares a snapshot. If you have a backup job (or any other activity that can trigger creation of a snapshot) configured to run at the time this error is logged, then you can safely ignore it.

Gleb.

_{Wednesday, May 20, 2015 8:07 AM}

Gleb, thanks for the answer.

We take snapshots of the volumes through our SAN (Dell EqualLogic); the first snapshot is taken at 22:30. The error was logged at 19:15.
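
The check Gleb describes, comparing the event time with the snapshot schedule, can be sketched in a few lines of Python. This is only an illustration: the function name `near_snapshot`, the 15-minute window, and the calendar date are assumptions; only the 19:15 and 22:30 times come from this thread.

```python
from datetime import datetime, timedelta


def near_snapshot(event_time, snapshot_times, window=timedelta(minutes=15)):
    """Return True if event_time lies within `window` of any scheduled snapshot."""
    return any(abs(event_time - s) <= window for s in snapshot_times)


# The date is a placeholder; the thread only gives times of day.
day = datetime(2015, 5, 19)
event = day.replace(hour=19, minute=15)        # when event 5120 was logged
snapshots = [day.replace(hour=22, minute=30)]  # first SAN snapshot of the evening

print(near_snapshot(event, snapshots))  # False
```

With more than three hours between the two, the event does not line up with the snapshot schedule. The sketch does not handle schedules that wrap around midnight.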

_{Thursday, May 21, 2015 12:24 PM}

The c000020c error code means "STATUS_CONNECTION_DISCONNECTED", and the troubleshooting approach for this event depends on that code, because several types of issue can cause a CSV to auto-pause (so you can disregard suggestions for other instances of this event that have a different error in their description). The "disconnect" error may indicate a connectivity problem, though that may be just a side effect of a different problem. www.eventid.net has a few suggestions, some for this specific error. In one example, the culprit was an error in the switch VLAN configuration.
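
Since the troubleshooting path depends on the status code, it can help to pull the code out of the event text and name it. The following is a minimal Python sketch, not an official tool: the function name `decode_pause_reason` and the regular expression are assumptions based on the message quoted in the question, and the table lists only a handful of NTSTATUS values (only c000020c appears in this thread).

```python
import re

# A few NTSTATUS values for illustration; extend the table as needed.
NTSTATUS_NAMES = {
    0xC000020C: "STATUS_CONNECTION_DISCONNECTED",
    0xC00000B5: "STATUS_IO_TIMEOUT",
    0xC00000C4: "STATUS_UNEXPECTED_NETWORK_ERROR",
    0xC000026E: "STATUS_VOLUME_DISMOUNTED",
}

# The top two bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_pause_reason(message):
    """Extract the hex status code from an event 5120 message and name it."""
    match = re.search(r"because of '[^']*\(([0-9a-fA-F]{8})\)'", message)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return {
        "code": f"0x{code:08X}",
        "name": NTSTATUS_NAMES.get(code, "unknown"),
        "severity": SEVERITY[code >> 30],
    }


msg = ("Cluster Shared Volume 'disk 1' ('disk 1') has entered a paused state "
       "because of '(c000020c)'. All I/O will temporarily be queued until a "
       "path to the volume is reestablished.")
print(decode_pause_reason(msg))
```

Grouping logged 5120 events by the decoded name makes it easier to separate the "disconnect" cases from auto-pauses that have a different cause.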

_{Wednesday, November 18, 2015 3:16 PM}

What I have found is that once you reach a certain number of VMs in your environment, it becomes more important to place your VMs on the proper hosts and make those hosts the coordinators for the corresponding LUNs. For example, if you have a VM on HOST1, then HOST1 should be the owner of the CSV where the VHDX is stored.

I have 5 clustered hosts with 20 LUNs connected via iSCSI over 10GbE, and I noticed random 5120 errors once the number of VMs grew past 100. The storage was not being taxed, yet we were still getting the errors in the event log. At first the CSVs would not lock, so it wasn't a big issue. But once we increased the number of VMs to 150+, the 5120 errors became constant and we started getting locking issues that would take the CSVs offline and take down the VMs on whichever LUN was experiencing the locking issue.

My hosts are all 2012 R2 and fully patched as of Nov. 1, 2015, and all the CSV updates from MS did not resolve the 5120 issue. Placing each VM on the host that is the coordinator of the CSV where its VHDX resides resolved the issue for us.
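
To find VMs that break the placement rule described above, you can compare each VM's host with the owner of the CSV that holds its VHDX. The following Python sketch assumes the inventory has already been exported from the cluster into two dictionaries; the function name `misplaced_vms` and all host, VM, and volume names are hypothetical.

```python
def misplaced_vms(vm_placement, csv_owner):
    """List VMs whose host is not the coordinator of the CSV holding their VHDX.

    vm_placement: {vm_name: (host, csv_name)}
    csv_owner:    {csv_name: owner_host}
    """
    mismatches = []
    for vm, (host, csv) in sorted(vm_placement.items()):
        owner = csv_owner.get(csv)
        # Skip CSVs with no known owner rather than guessing.
        if owner is not None and owner != host:
            mismatches.append((vm, host, csv, owner))
    return mismatches


# Hypothetical inventory, for illustration only.
csv_owner = {"Volume1": "HOST1", "Volume2": "HOST2"}
vm_placement = {
    "VM-A": ("HOST1", "Volume1"),  # aligned
    "VM-B": ("HOST2", "Volume1"),  # runs on HOST2, CSV owned by HOST1
    "VM-C": ("HOST2", "Volume2"),  # aligned
}

for vm, host, csv, owner in misplaced_vms(vm_placement, csv_owner):
    print(f"{vm} runs on {host} but {csv} is owned by {owner}")
```

Each reported VM is a candidate for moving to the owning host, or for moving CSV ownership to the VM's host.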
