Question
Monday, March 21, 2016 5:29 PM
Hi,
We are running a Hyper-V cluster on Windows Server 2012 R2. It is a 4-node cluster built on Dell PowerEdge M620 with QLogic QMD 8262-k CNA. We have set up a converged network design for this cluster, which means host traffic also passes through the Hyper-V switch: https://technet.microsoft.com/en-ca/library/dn550728.aspx#BKMK_Example
The problem is that I am observing intermittent packet loss on both hosts and VMs. Because of this packet loss, the cluster breaks every now and then. VMQ is disabled on both the physical cards and the NIC team.
I ran some monitoring and observed that the dropped packet count jumps suddenly around the time the cluster complains of missed heartbeat signals (Event ID 1650). Is this number of dropped packets normal? What other tracing can be run to see what is causing this amount of packet drop?
Thanks,
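One way to make the "drops jump around the time of the heartbeat events" observation concrete is to line up counter samples against event timestamps. The following is only a minimal sketch in Python: the sample values, the event time, the threshold, and the function names `find_drop_spikes` and `events_near_spikes` are all made up for illustration, and it assumes the dropped-packet counter has already been exported as (timestamp, cumulative count) pairs by whatever monitoring tool is in use.

```python
from datetime import datetime, timedelta


def find_drop_spikes(samples, threshold):
    """Return (timestamp, delta) for every sampling interval in which the
    cumulative dropped-packet counter grew by more than `threshold`."""
    spikes = []
    for (_, c_prev), (t_cur, c_cur) in zip(samples, samples[1:]):
        delta = c_cur - c_prev
        if delta > threshold:
            spikes.append((t_cur, delta))
    return spikes


def events_near_spikes(spikes, event_times, window):
    """Map each event time to the spikes that fall within `window` of it."""
    return {
        ev: [s for s in spikes if abs(s[0] - ev) <= window]
        for ev in event_times
    }


# Hypothetical data: cumulative dropped-packet counts sampled every 30 s.
t0 = datetime(2000, 1, 1, 12, 0, 0)
samples = [
    (t0, 100),
    (t0 + timedelta(seconds=30), 102),
    (t0 + timedelta(seconds=60), 950),
    (t0 + timedelta(seconds=90), 953),
]
# Hypothetical time of one missed-heartbeat event from the cluster log.
event_times = [t0 + timedelta(seconds=65)]

spikes = find_drop_spikes(samples, threshold=100)
matches = events_near_spikes(spikes, event_times, window=timedelta(seconds=30))
for ev, near in matches.items():
    print(ev, near)
```

If every heartbeat event has a spike inside the window and quiet periods have none, that supports the correlation; it does not by itself say where in the path the packets are dropped.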
All replies (8)
Wednesday, March 23, 2016 4:19 PM ✅ Answered
Thanks guys,
@Leo, yes, pings also fail when the packet loss occurs. Unfortunately, the power management setting did not have any effect. I will keep the thread updated on any progress.
@Cloud-Ras, yes, the vSwitch is on a NIC team. The NIC team is created with Switch Independent and Dynamic settings. I tested other load-balancing modes, but that didn't have any effect either.
I guess time to call MS support.
Monday, March 21, 2016 6:09 PM
Hello,
Did you install up-to-date NIC drivers and all Windows updates? In theory, the VMQ issue is particularly related to 1 Gb NICs, so it shouldn't be the cause of your issue.
Monday, March 21, 2016 6:12 PM
This is not normal.
First - check the NIC firmware and drivers and be sure they are in alignment. I have seen many discussions where a particular driver version works with a specific firmware version, etc. Only the OEM knows.
The OEM should also have recommendations for the configuration.
Second - I am not directly familiar with the drivers, so be sure that there is no power savings enabled on the physical NIC.
You stated you disabled VMQ. That is commonly a solution. There are times when a reboot is also required.
The other thing that can be happening is at the physical layer; don't count it out. Ensure that DNS is clean, routing tables are updated, nothing is flapping, etc.
Brian Ehlert
http://ITProctology.blogspot.com
Learn. Apply. Repeat.
Monday, March 21, 2016 6:33 PM
Thanks for the prompt suggestions, guys.
The NIC firmware and driver are up to date, and all the latest Microsoft patches are installed. I am not sure about specific compatibility between the firmware and the driver; I'll try to look it up.
The reason I am more focused on the Hyper-V switch is that, in order to isolate the issue, I removed one NIC from the team on two hosts, bound an IP from a different subnet to these NICs, and ran a continuous ping. I didn't observe a single lost packet on this interface.
Power management was indeed enabled. I have turned this off. Will observe the behavior from now on.
Thanks,
Tuesday, March 22, 2016 7:41 AM
Hi NewtoCloud,
Have you run cluster validation, and did it report any useful information?
With all the NICs configured in the NIC team, run ping with the -t flag. When heartbeat signals are missed, are any ping packets lost?
Have you tried to recreate the NIC team and virtual switch?
>>Will observe the behavior from now on.
If there is any update, you are welcome to share it.
Best Regards,
Leo
Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Support, contact [email protected].
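For the continuous ping test suggested above, a saved `ping -t` log can be summarized instead of watched by eye. This is a minimal sketch in Python, assuming the English-locale Windows ping output format ("Reply from ...: bytes=..." and "Request timed out."); the function name `summarize_ping_log` and the sample lines (using documentation-range addresses) are hypothetical.

```python
def summarize_ping_log(lines):
    """Count replies and losses in saved `ping -t` output and track the
    longest run of consecutive lost packets."""
    replies = lost = run = longest_run = 0
    for line in lines:
        text = line.strip()
        if text.startswith("Reply from") and "bytes=" in text:
            # A normal echo reply ends any current loss burst.
            replies += 1
            run = 0
        elif text == "Request timed out." or "Destination host unreachable" in text:
            lost += 1
            run += 1
            longest_run = max(longest_run, run)
        # Any other line (banner, blank line, statistics) is ignored.
    total = replies + lost
    loss_pct = 100.0 * lost / total if total else 0.0
    return {
        "replies": replies,
        "lost": lost,
        "loss_pct": loss_pct,
        "longest_run": longest_run,
    }


# Made-up sample log for illustration only.
sample = [
    "Pinging 192.0.2.10 with 32 bytes of data:",
    "Reply from 192.0.2.10: bytes=32 time<1ms TTL=128",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.0.2.10: bytes=32 time=1ms TTL=128",
    "Reply from 192.0.2.20: Destination host unreachable.",
]
summary = summarize_ping_log(sample)
print(summary)
```

The longest run of consecutive losses is worth tracking separately from the overall percentage, since a short burst of back-to-back losses is more likely to line up with missed heartbeats than the same number of losses spread out over time.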
Tuesday, March 22, 2016 2:46 PM
Hi Sir,
Do you run your vSwitch on top of a NIC team? If you do, check whether the team uses Address Hash, Hyper-V Port, or Dynamic load balancing, and check which mode is best for your switch.
I would always recommend LACP, although it is the most difficult to set up, depending on what switch you have.
The most commonly used NIC Teaming configuration is "Switch Independent" with "Dynamic", but it is not the most recommended solution.
Thursday, March 24, 2016 5:40 AM
Hi NewtoCloud,
If you create the team on the host without adding a virtual switch, is the connection always stable?
>>I guess time to call MS support.
More in-depth investigation can be done so that you would get a more satisfying explanation and solution to this issue.
Best Regards,
Leo
Monday, June 24, 2019 1:23 PM
Hi,
Did you ever resolve this issue? We are facing exactly the same thing and can't find the source of the problem.
Thanks.