Replication Health - Time Duration

Question

_{Monday, December 18, 2017 1:12 PM | 2 votes}

Hello everyone!

I'm having a hard time with VM Replica on Hyper-V 2012 R2. Everything was fine until I had to expand the VHDX on the VM.

This VM is a DC and file server running Windows 2012 R2.

I get the errors every 6 hours, which is the interval I set up in the replication settings.

The Replication Health on the primary Hyper-V host shows "Time duration since the last successful application consistent checkpoint has exceeded the warning limit for the virtual machine ********"

Primary Hyper-V errors:

Error 18012 (Checkpoint Operation failed)

Error 33676 (Replication operation for virtual machine '*****' failed: The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. (0x800423F4). (Virtual machine ID *****) (Primary server: '*****', Replica server: '*****'))

Warning 32024 (Hyper-V failed to generate VSS snapshot set for virtual machine '*****': The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. (0x800423F4). VSS snapshot set generation can fail if backup operation is in progress. (Virtual machine ID *****))

VM errors (errors logged by Windows inside the VM):

Error 489 (lsass (728) An attempt to open the file "\\?\Volume{*****}\Windows\NTDS\ntds.dit" for read only access failed with system error 32 (0x00000020): "The process cannot access the file because it is being used by another process. ". The open file operation will fail with error -1032 (0xfffffbf8).)

Warning 8229 (A VSS writer has rejected an event with error 0x800423f4, The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. Changes that the writer made to the writer components while handling the event will not be available to the requester. Check the event log for related events from the application hosting the VSS writer.

Operation:
PostSnapshot Event

Context:
   Execution Context: Writer
   Writer Class Id: {*****}
   Writer Name: NTDS
   Writer Instance ID: {*****}
   Command Line: C:\Windows\system32\lsass.exe
   Process ID: 728 )

Error 2 ( The VSS writer NTDS failed with status 11 and writer specific failure code 0x800423F4. )
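
Editorial note: the numeric codes in the events above line up with each other, which a minimal Python sketch can show. This is only an illustration under stated assumptions, not part of any Microsoft tool. The symbolic names in `KNOWN_CODES` are recalled from the public Windows SDK headers and should be verified against the official documentation; the helper names `to_signed32`, `split_hresult` and `describe` are hypothetical.

```python
# Symbolic names are assumptions recalled from the public Windows SDK headers
# (vswriter.h, winerror.h, esent.h); verify them before relying on this table.
KNOWN_CODES = {
    ("hresult", 0x800423F4): "VSS_E_WRITERERROR_NONRETRYABLE",
    ("win32", 32): "ERROR_SHARING_VIOLATION",
    ("ese", -1032): "JET_errFileAccessDenied",
    ("vss_writer_state", 11): "VSS_WS_FAILED_AT_POST_SNAPSHOT",
}


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def split_hresult(hresult: int) -> dict:
    """Split an HRESULT into its failure bit, 11-bit facility and 16-bit code."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),
        "facility": (hresult >> 16) & 0x7FF,
        "code": hresult & 0xFFFF,
    }


def describe(kind: str, value: int) -> str:
    """Look up a symbolic name, or return 'unknown' if the code is not listed."""
    return KNOWN_CODES.get((kind, value), "unknown")


if __name__ == "__main__":
    # 0xfffffbf8 in event 489 is the same number as -1032, read as signed 32-bit.
    print(to_signed32(0xFFFFFBF8))
    print(describe("ese", to_signed32(0xFFFFFBF8)))
    # The HRESULT reported by Hyper-V and by the NTDS writer.
    print(split_hresult(0x800423F4))
    print(describe("hresult", 0x800423F4))
```

Read this way, the events describe one chain: lsass could not open ntds.dit because of a sharing violation (system error 32), and the NTDS writer then failed during the PostSnapshot event (status 11) with the non-retryable writer error that Hyper-V reports.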

Any help will be appreciated.

Thank you!

Leônidas - Analista de Suporte - MCSA Windows 2012

All replies (34)

_{Tuesday, December 26, 2017 8:56 AM ✅Answered}

Hi Leonidas,

Please check the following thread and try the method in my reply in that case, check if it suits your scenario too:

https://social.technet.microsoft.com/Forums/windowsserver/en-US/4ce4b170-23a8-497b-b283-9d396f9276ba/hyper-v-vss-replication-error?forum=winserverhyperv

Best Regards,

Anne

Please remember to mark the replies as answers if they help.
If you have feedback for TechNet Subscriber Support, contact [email protected].

_{Tuesday, December 19, 2017 8:58 AM}

Hi Leonidas,

1. In my opinion, it will be quicker to re-create the replication between the primary server and the replica server than to troubleshoot the original broken replication.

2. On Server 2012 R2, the VHD of a VM with replication configured can be resized. Please check the following article for detailed information:

https://blogs.technet.microsoft.com/virtualization/2013/11/14/online-resize-of-virtual-disks-attached-to-replicating-virtual-machines/

Best Regards,

Anne

_{Tuesday, December 19, 2017 11:09 AM}

Hi Anne,

I forgot to mention that this has been happening for a while and I've already tried to totally delete the replication and start it from scratch, but it did not solve my problem.

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Tuesday, December 19, 2017 11:52 AM}

Hey,

I have the same problem.
I also re-created the replication for the problematic VMs, but the problem came back.

Eli

_{Wednesday, December 20, 2017 10:01 AM}

Hi Leonidas,

1. Do you mean that re-creating the replication still does not resolve the issue, and are the error messages you provided above the ones logged after re-creating it?

2. What is the primary server, and does it run well apart from replication?

3. We are trying to reproduce the issue in our lab and will report the result as soon as we make progress.

Best Regards,

Anne

_{Wednesday, December 20, 2017 11:18 AM}

Hi Anne

1. Yes, these messages are all the same. After expanding the disks on the primary server, replication stopped due to the disk size difference. Instead of expanding the disks on the replica, I deleted the replication and re-created it, and then these messages started. I've re-created the replication twice so far and the problem persists.

2. Not sure I understand your question. I have 2 Hyper-V servers; each has 2 VMs that are replicated to the other one every 30 seconds. I have this problem on only one VM.

3. Thank you.

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Friday, December 22, 2017 11:13 AM}

Hello,

I suddenly got the exact same problem. I suspect KB4054519, which was installed at the time the problems started. Maybe something was patched (to death) regarding VSS and Hyper-V.

Anne, I would very much appreciate it if you would first check your internal support channels for possible explanations before continuing with us.

Cheers Greg

_{Sunday, December 24, 2017 10:42 AM}

Hello,

We also have the problem with two VMs (one Server2012R2 VM and one SUSE Linux SLES11 VM)

"Time duration since the last successful application consistent checkpoint has exceeded the warning limit"

The other VMs on the same 2012R2 Hyper-V machine do not have this problem.

We also twice removed the replication for the two VMs, deleted all files on the replica server and started a new replication, but the problem remains.

Does someone have a solution?

Best Regards,

Bareld

_{Monday, December 25, 2017 3:15 AM}

Hi,

Please check if the following hotfix could be of help:

https://support.microsoft.com/en-us/help/3046826/you-cannot-upgrade-hyper-v-integration-components-or-back-up-windows-v

Please note: before installing this hotfix, make sure the KB2919355 update has been installed on the computer.

Best Regards,

Anne

_{Monday, December 25, 2017 12:09 PM}

Hey Anne,

I ran the hotfix, and I made sure that the KB2919355 update is installed on the computer.

But I am getting this message:

The update is not applicable to your computer.

I have Server 2012R2 64bit.

Best Regards,

Eli.

_{Monday, December 25, 2017 1:37 PM}

Hello Anne,

I doubt that the hotfix is the solution to the problem.

The hotfix is for a system with the following symptoms:
"Integration services setup always prompts for upgrade, even though the components are upgraded successfully for a Windows 8 or Windows Server 2012 virtual machine that has SR-IOV enabled. The Best Practice Analyzer (BPA) reports the integration services version as old or out-of-date."

We do not have this issue.

Each hotfix has the text:
This hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that are experiencing this specific problem.

So according to the website we should not install it.

We also find it very strange that this problem suddenly happens with only two of our VMs and not with the other VMs on the same Hyper-V host.

But we tried to install the hotfix anyway, and yes, just as Eli reported, we also get the message

"The update is not applicable to your computer"

We are also running Server 2012R2 (64 bit).

Best Regards,

Bareld

_{Monday, December 25, 2017 9:42 PM}

Same here,

The update is not applicable to your computer

Windows 2012 R2 x64.

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Monday, December 25, 2017 11:35 PM}

Hello,

Same problem here. The problematic VM is DC.

This issue started just after installing KB4054519, KB4054522 and KB4052978.

Is this the same for you?

_{Tuesday, December 26, 2017 9:57 AM}

Hello Anne,

I don't think that the issue is related to the "integration services", since we have the issue both on a VM with 2012R2 (running as DC) and on a Linux VM (SLES11). All other VMs do not have the issue.

There must be something else.

Best Regards,

Bareld

_{Tuesday, December 26, 2017 11:55 AM}

Hey,

I removed KB4054519, KB4054522 and KB4052978, and the problem still exists...

This issue is happening on 4 of the 12 VMs that I replicate, and 3 of those 4 VMs are DCs.

Best Regards,

Eli.

_{Wednesday, December 27, 2017 7:56 AM}

We have the same issues on our production 2012R2 Hyper-V servers after last week's Windows updates.

Please help to fix it ASAP, it's not acceptable.

Replication warnings for:

DCs, web servers, FTP servers and others (Windows and Linux machines). Approximately 50% of the VMs show warnings; all others are OK.

_{Wednesday, December 27, 2017 11:08 PM}

Hi Anne, I've proceeded with the Integration Services downgrade and apparently it has solved the issue. No more errors are being logged.

Now the problematic VM has Integration Services 6.3.9600.16384.

Question:

How long will I have to keep this version and not be able to upgrade?

Thank you!

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Thursday, December 28, 2017 2:02 AM}

Hi Leonidas,

We have escalated this issue to a more specialized team. If there is any useful information, we will report back as soon as possible.

Best Regards,

Anne

_{Friday, December 29, 2017 1:53 PM}

Hi there,

Any progress on this issue?

I have one 2012R2 VM with the same issue, but its IntegrationServicesVersion has been 6.3.9600.16384 the whole time. On the primary, Replication Health is Normal, but on the secondary, Replication Health shows the warning: "Time duration since the last successful application consistent checkpoint has exceeded the warning limit for the virtual machine". After 8 hours it returns to normal, and the next day the same issue occurs again. It all started on 13.12.2017.

Thank you,

Jakub

_{Friday, December 29, 2017 1:59 PM}

Hello Anne,

We also downgraded the integration services on the Server2012R2 VM (running as DC), but that did not solve the problem. Our SLES11 Linux VM also still has the problem.

Do you have other suggestions?

Best Regards,

Bareld

_{Friday, December 29, 2017 2:29 PM}

Hello,

Of course this has nothing to do with integration services inside a VM, at least not directly. The problem appeared suddenly, on both Linux and Windows systems. Possibly some integration component is broken on the host / Hyper-V side, but not on the VM side.

_{Tuesday, January 2, 2018 12:24 AM}

Hi Anne, any news on this? Today, after an update, the problem reappeared. I noticed Integration Services had been updated and I had to downgrade it again. Apparently, every time an update is installed I'll have to roll back the Integration Services.

Waiting.

Thank you!

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Monday, January 8, 2018 9:16 AM}

Hi there,

Still no progress on this issue?

Best regards,

Jakub Srna

_{Monday, January 8, 2018 12:09 PM}

Hello Jakub,

Obviously not. I'm also waiting. But let me ask you one question:

Does this behaviour appear in your case during backup or without any backup? We have this issue during backup. During backup, the event log entry per se is quite a normal error, and it never used to lead to a warning VM state. Now it does, and our automated alerts are going off. That's our specific problem, for example.

Cheers Greg

_{Monday, January 8, 2018 3:09 PM}

Hi Greg,

This behaviour only happens when shadow copy tries to snapshot a VM for VM replica. During the backup process I do not get this error.

Leônidas - Analista de Suporte - MCSA Windows 2012

_{Thursday, January 25, 2018 9:57 AM}

We have similar problems.

Local Hyper-V, server type DC - 2012R2 with the latest updates incl. Integration Services, replicated to Azure RM:

DC Event Viewer: The VSS writer NTDS failed with status 11 and writer specific failure code 0x800423F4
Since the last updates, replication warning in the local Hyper-V Manager: Could not retrieve replication health data.

Azure Error:

Error ID: 70171
Error Message: No application consistent recovery point available for the VM in the last 240 minutes.
Possible causes: 1. Replication is progressing slowly or is not progressing as expected. 2. The target storage account is not provisioned with sufficient throughput or IOPs to handle the volume of replication data.
Recommendation: Ensure that: 1. Application-consistent snapshot frequency is configured to a valid value in the replication policy. 2. There is sufficient network bandwidth available between the HyperV server and Azure. 3. Look for any associated events in the site recovery events table and resolve them if any.

All this started about a month ago; before that everything was working... And local backup is working without any problems.

By the way, a W2012R2 DC running directly in Azure doesn't have this error (it's not replicated; it's backed up with Azure Backup Agent).
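
Editorial note: the Azure message above amounts to an age check on the newest application-consistent recovery point. A minimal Python sketch of that check follows. It is an assumption about how such a check could be written, not the actual Site Recovery logic; the function name `recovery_point_is_stale` and the timestamps are hypothetical, and only the 240-minute limit comes from the message itself.

```python
from datetime import datetime, timedelta


def recovery_point_is_stale(last_point: datetime, now: datetime,
                            limit_minutes: int = 240) -> bool:
    """Return True if the newest application-consistent recovery point is
    older than the limit (240 minutes in the Azure message above)."""
    return now - last_point > timedelta(minutes=limit_minutes)


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    now = datetime(2018, 1, 25, 9, 57)
    print(recovery_point_is_stale(datetime(2018, 1, 25, 7, 0), now))
    print(recovery_point_is_stale(datetime(2018, 1, 25, 1, 0), now))
```

If the VSS snapshot inside the guest fails every time, as the NTDS writer error suggests, no new application-consistent point is created and a check like this will eventually trip regardless of bandwidth or storage throughput.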

_{Thursday, January 25, 2018 12:28 PM}

It seems the problem is solved with the latest updates for Windows, Integration Services and AzureSiteRecoveryProvider on the Hyper-V hosts.

A new initial replication is also needed.

Edited: The fix doesn't help; the error reappears after several hours...

_{Monday, February 5, 2018 10:03 AM}

I fixed this problem by rebuilding the replication.
But under "Configure Recovery History" you have to choose "Only the latest recovery point".
Otherwise, every time your VM tries to create a recovery point, you will get an error.
We have to wait until Microsoft fixes this problem with an update.

_{Saturday, February 17, 2018 3:59 PM}

Hello,

I have the same problem on 2 Hyper-V hosts.

a) On both, the VMs come from an older Hyper-V platform (2008R2 or 2012) upgraded to 2012R2.

b) After the host upgrade, I also upgraded Integration Services on each VM.

c) The problem seems to appear particularly on VMs with SQL installed, like DC or Exchange, but not exclusively.

Regards,

_{Monday, February 19, 2018 2:44 PM}

Hi All,

@Greg: it appears without backup every day.

I guess there is still no progress on this issue, right?

Thank you.

Best regards,

Jakub

_{Wednesday, February 21, 2018 10:58 AM}

Hi Anne,

Still no answer from the Microsoft professional team? It has been almost 2 months now...

Regards,

Frederic.

_{Friday, March 9, 2018 10:09 AM}

Hello Anne,

It is now 3 months since this issue appeared. Still no progress from the Microsoft research team?

thank you for your help,

Frederic

MS Partner - France

_{Tuesday, March 27, 2018 2:51 PM | 1 vote}

Hi Anne,

Still no progress? It's kind of frustrating...

Thank you for your reply.

Best regards,

Jakub

_{Wednesday, March 28, 2018 7:31 AM}

Hi All,

I think KB4072650 is part of the solution. It's an update for the "integration services". All my servers with 2012 or 2016 are now back in the "normal" replication state, but older servers are not OK yet. For offline ones, I needed to fail them over twice.

Sebastien

Last updated on 2017-12-18
