Question
Sunday, May 13, 2018 7:27 PM
Hello,
I have researched online but could not find anything relevant. A Hyper-V managed Windows Server 2016 instance is crashing and throws multiple events of:
"The IO operation at logical block address 0x###### for Disk 0 "
On the Hyper-V host, after the VM has crashed, there has been a pattern of an SMBClient 30611 error:
"
Failed to reconnect a persistent handle.
Error: The account is not authorized to login from this station.
FileId: 0x200000E0265CBEE:0x20E000000A9
CreateGUID: {b3d6066e-563c-11e8-a949-0002c937dda1}
Path: \networked\path\to\instance.vhdx
Reason: 201
Previous reconnect error: STATUS_SUCCESS
Previous reconnect reason: The reason is not specified
Guidance:
A persistent handle allows transparent failover on Windows File Server clusters. This event has many causes and does not always indicate an issue with SMB. Review online documentation for troubleshooting information.
"
Followed by several 30906 errors:
"
A request on persistent/resilient handle failed because the handle was invalid or it exceeded the timeout.
Status: The transport connection is now disconnected.
Type: Write (and Read)
Path: \networked\path\to\instance.vhdx
Restart count: 0
Guidance:
After retrying a request on a Continuously Available (Persistent) handle or a Resilient handle, the client was unable to reconnect the handle. This event is the result of a handle recovery failure. Review other events for more details.
"
Then the server crashed. If someone has any ideas or could point me in a direction to recover more logs, that would be super.
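For reference, this is roughly how I have been pulling these events on the host (a quick sketch; I am assuming they land in the standard SmbClient/Connectivity channel, so adjust the log name if yours show up elsewhere):

# Pull recent 30611/30906 entries from the SMB client connectivity log
Get-WinEvent -FilterHashtable @{
    LogName = 'Microsoft-Windows-SmbClient/Connectivity'
    Id      = 30611, 30906
} -MaxEvents 50 | Format-List TimeCreated, Id, Message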
Thanks!
Monday, May 14, 2018 6:28 AM
Hi,
Based on my knowledge, error 30611 usually means the SMB client failed to resume a continuously available (CA) handle on a CA file share resource. And "Error: The account is not authorized to login from this station" may happen when it connects to another storage location.
May I ask whether there is any external storage connected to your Hyper-V host?
And where is your VM located? On a local drive on the host, or on external storage connected to the Hyper-V host?
Event 153 "The IO operation at logical block address 0x###### for Disk 0" is an error associated with the storage subsystem. Please run chkdsk on the volume and see whether it finds and corrects any errors.
https://blogs.msdn.microsoft.com/ntdebugging/2013/04/30/interpreting-event-153-errors/
In addition, please try moving all the VM-related data to another drive and check the results.
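For example, inside the guest (a rough sketch; I am assuming the affected volume is C:, so adjust the drive letter, and note that chkdsk will schedule the scan for the next boot if the volume is in use):

# Scan the volume for file system errors and fix what it finds
chkdsk C: /f

# List recent event 153 entries logged by the disk driver in the System log
Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'disk'; Id = 153 } |
    Format-List TimeCreated, Message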
Best Regards,
Mary
Please remember to mark the replies as answers if they help.
If you have feedback for TechNet Subscriber Support, contact [email protected].
Tuesday, May 15, 2018 4:08 PM
Hello,
The VHDX is located on our S2D cluster for our Azure Pack environment. The cluster is used by over 50 guests, and our logs show that only this VM is logging the IO retry events. So I am leaning toward something being wrong with this Domain Controller VM.
Is there something s2d specific I could check that would be related?
Thanks!
Wednesday, May 16, 2018 1:43 AM
Hi,
You may run Get-StorageHealthReport to check the storage health report.
https://docs.microsoft.com/en-us/powershell/module/storage/get-storagehealthreport?view=win10-ps
And also run Get-VirtualDisk | fl, Get-PhysicalDisk | fl, and Get-StoragePool to check whether the S2D pool is healthy.
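For example, on one of the cluster nodes (a sketch of the checks above; Cluster* matches the default S2D subsystem name, adjust it if yours differs):

# Overall storage health report for the S2D subsystem
Get-StorageSubSystem Cluster* | Get-StorageHealthReport

# Per-object health status
Get-VirtualDisk  | Format-List FriendlyName, HealthStatus, OperationalStatus
Get-PhysicalDisk | Format-List FriendlyName, HealthStatus, OperationalStatus
Get-StoragePool  | Format-List FriendlyName, HealthStatus, OperationalStatus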
Best Regards,
Mary
Please remember to mark the replies as answers if they help.
If you have feedback for TechNet Subscriber Support, contact [email protected].
Thursday, May 17, 2018 1:19 PM
Hello,
Failover cluster manager shows that all the disks in the pool are healthy.
Friday, May 18, 2018 1:24 AM
Hi,
>our logs show that it is only this VM that is eventing IO logic retries. So I am leaning toward something is wrong with this Domain Controller VM.
May I ask what the current status of this VM is?
If you restart the VM, does it run in a healthy state?
As said above, event 30611 can occur if the volume for the file share does not have the Resume Key filter attached, or for similar reasons.
Applications with continuously available file handles opened on file shares on the affected volumes will have those handles closed. The application administrator will need to take recovery action to reestablish them. This could include restarting virtual machines, reattaching databases, or restarting the application.
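As a quick sanity check on the file server node hosting the share (a sketch; these are standard SmbShare cmdlets):

# Confirm the share is continuously available (persistent handles only apply to CA shares)
Get-SmbShare | Select-Object Name, Path, ContinuouslyAvailable

# List the handles currently open against the server's shares
Get-SmbOpenFile | Select-Object ClientComputerName, ClientUserName, Path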
Best Regards,
Mary
Please remember to mark the replies as answers if they help.
If you have feedback for TechNet Subscriber Support, contact [email protected].
Monday, May 21, 2018 1:35 PM
The current status is that it is powered off; after the crash it was not booted into safe mode.
Post crash/boot everything was "working", but the VM spat out additional IO retries with no consistent pattern: the recurrences were 5 and then 2 days apart, at different times of day.
All the applications and services were 'restarted' when the VM powered on, yes? What do you mean by reattaching databases? The VM uses a networked SQL server.
Tuesday, May 29, 2018 9:53 PM
We think it has something to do with the VHDX and not with our S2D cluster, as otherwise other VMs would be having this issue.
Have any ideas that I could look into?
Friday, July 20, 2018 12:42 PM
We are seeing 30906 events on our Hyper-V servers as well; however, our storage is located on SOFS clusters. VMs are hanging and cannot fail over to another host, since the configuration .xml file for the VMs is locked. Only turning off the host solves the issue, since that frees the locked .xml.
Have you been able to solve it or find a workaround for it? What steps did you take?
Jan
Monday, August 27, 2018 3:51 AM
We are seeing VMs on our Hyper-V hosts crash with SMB error 30611 followed by 30906 as well. Our storage is on SOFS.
VMs are hanging and cannot be migrated off. The only solution is to restart the Hyper-V host so the hanging VMs can be started on other nodes.
A solution from anyone would be greatly appreciated.
Thanks
Qui
Monday, August 27, 2018 9:11 AM
Hello Qui,
We solved the issue by turning SMB Multichannel off on the Hyper-V hosts:
Set-SmbClientConfiguration -EnableMultiChannel $false
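To double-check that the setting took effect on a host:

Get-SmbClientConfiguration | Select-Object EnableMultiChannel

# Should return nothing for new sessions once multichannel is off;
# existing connections may linger until they are re-established
Get-SmbMultichannelConnection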
After changing this setting we did not experience the problem anymore.
Microsoft will release a hotfix for the issue in September (it needs to be installed on the SOFS nodes).
Hope this helps!
Jan
Tuesday, August 28, 2018 4:51 AM
Hi Jan,
Thanks for the tip, I will try it out.
Do you know why multichannel is causing the problem?
Thanks again
Qui
Wednesday, August 29, 2018 8:24 PM
Would you happen to have the MS link describing this issue?
Thursday, August 30, 2018 7:00 AM
There's a bug in the SMB header ChannelSequence field. A public fix will be released on August 18.
I cannot elaborate on this but will keep you posted when the hotfix is publicly available.
Jan
Thursday, August 30, 2018 7:28 PM
Do you mean September 18?
Friday, August 31, 2018 5:47 AM
Yes, sorry, September 18 it is.
Jan
Thursday, September 20, 2018 4:09 PM
So now can you say what it is?
Sunday, September 23, 2018 4:36 PM
I am interested in this SMB bug as well. After googling around I was unable to find anything specific regarding a bug or an upcoming hotfix.
Thursday, November 8, 2018 1:20 AM
>There's a bug in the SMB header ChannelSequence field. A public fix will be released on August 18.
>I cannot elaborate on this but will keep you posted when the hotfix is publicly available.
>Jan
Still looking for information on this. Has a hotfix been released?
Thursday, November 8, 2018 7:05 AM
The hotfix is integrated in the September 2018 update; we have installed it on our production servers and the issue is solved.
Jan