SQL MP Alert - Database Backup Failed To Complete

Question

Thursday, May 16, 2019 9:32 AM

Hey all,

We frequently get the "Database Backup Failed To Complete" alert, which is based on a rule that checks for Event ID 3041 on the SQL Server.

The problem is that the backup jobs do fail, but they retry, sometimes after a few minutes and sometimes after a couple of hours, and 99% of the time they succeed on the 2nd or 3rd attempt. This means we get alerts we don't need; we only need an alert if the backup still fails after the 3rd attempt.

I thought about creating an event-count monitor, but in Event Viewer I see that one event is logged per database rather than per DB Engine. So a single failed run on a DB engine triggers as many "Event ID 3041" entries as there are databases on that SQL Server.
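To illustrate the counting problem, here is a minimal Python sketch of the logic, not an OM monitor definition. It assumes the timestamps of the Event ID 3041 entries for one DB engine have already been collected, and it treats events that follow each other within a short gap as a single failed attempt. The function names, the 5-minute gap, and the threshold of 3 are all hypothetical choices for illustration.

```python
from datetime import datetime, timedelta

def count_failed_attempts(event_times, burst_gap=timedelta(minutes=5)):
    """Count failed backup attempts for one DB engine.

    Events that follow the previous event within burst_gap are assumed to
    belong to the same attempt (one event per database in that run).
    """
    attempts = 0
    previous = None
    for current in sorted(event_times):
        # A new attempt starts when the gap to the previous event is large.
        if previous is None or current - previous > burst_gap:
            attempts += 1
        previous = current
    return attempts

def should_alert(event_times, max_attempts=3, burst_gap=timedelta(minutes=5)):
    """Alert only once the number of failed attempts reaches max_attempts."""
    return count_failed_attempts(event_times, burst_gap) >= max_attempts

# Hypothetical sample: three databases fail at 01:00, two fail again at 01:30.
sample = [
    datetime(2019, 5, 16, 1, 0, 0),
    datetime(2019, 5, 16, 1, 0, 5),
    datetime(2019, 5, 16, 1, 0, 10),
    datetime(2019, 5, 16, 1, 30, 0),
    datetime(2019, 5, 16, 1, 30, 4),
]
print(count_failed_attempts(sample))
print(should_alert(sample))
```

With this grouping, the number of databases on the server no longer inflates the count; only separate runs do.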

Sorry if the explanation is confusing; let me know if you need clarification.

Has anyone had this issue or solved it?

Cheers

All replies (4)

Monday, May 20, 2019 8:35 AM ✅ Answered

"Database Backup Failed To Complete" is a rule that generates an alert whenever a backup fails.
To monitor whether the backup job itself succeeded, you may consider the following:

  1. Enable the MSSQL Agent job discovery, which is disabled by default.
  2. Every discovered MSSQL Agent job has a Last Run Status monitor, which checks whether the job's last run was successful.

https://social.technet.microsoft.com/Forums/ie/en-US/88f1c2a7-2abf-44e3-aad4-04ae74aa57e6/monitoring-sql-backups?forum=operationsmanagergeneral

Roger


Thursday, May 16, 2019 12:31 PM

I've had this issue before, and I don't know of anything that has been shown to resolve it.

What I could suggest is to disable it and create a new one that detects the event, then add a diagnostic or recovery task that checks whether the backup worked on a retry and auto-resolves the monitor afterwards. This would have to be a custom action, though.
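The check such a custom task would perform could look roughly like the Python sketch below. It only shows the decision logic and assumes the failure time and the finish times of successful backups for the same database are already available (for example, read from the backup history in msdb). The function name and the 3-hour grace period are hypothetical, not part of any management pack.

```python
from datetime import datetime, timedelta

def still_failing(failure_time, success_times, now, grace=timedelta(hours=3)):
    """Return True only if the failure should remain an open alert.

    The failure counts as recovered when any successful backup finished
    after it. An unrecovered failure is reported only once the grace
    period (time allowed for retries) has passed.
    """
    recovered = any(finished > failure_time for finished in success_times)
    return (not recovered) and (now - failure_time >= grace)

# Hypothetical example: failure at 01:00, a retry succeeded at 01:40.
failure = datetime(2019, 5, 16, 1, 0)
successes = [datetime(2019, 5, 15, 1, 2), datetime(2019, 5, 16, 1, 40)]
print(still_failing(failure, successes, now=datetime(2019, 5, 16, 5, 0)))
```

A task built on this idea would close the alert when the function returns False and leave it open otherwise.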

Website: www.walshamsolutions.com Technical Blog: https://www.walshamsolutions.com/technical-blog Personal Blog: https://www.walshamsolutions.com/personal-blog Twitter: Dwalshampro


Friday, May 17, 2019 6:28 AM

Hi,

Firstly, I know little about SQL, so I am just sharing some thoughts. Based on the description, it seems we want to detect a "permanent" backup error rather than a "temporary" one (which a retry can fix). If that is the case, detecting the error by Event ID alone is not sufficient. We can ask the SQL team whether there is another way to detect that a backup has failed even after the retries, and then add that detection method in OM to get the desired result.

Hope the above information helps.

Regards,

Alex Zhu



Saturday, June 15, 2019 8:51 PM

Hi,

Did you resolve this one? We would appreciate your feedback. Thank you in advance!

Regards,

(Please take a moment to "Vote as Helpful" and/or "Mark as Answer" where applicable. This helps the community, keeps the forums tidy, and recognizes useful contributions. Thanks!) Blog: https://blog.pohn.ch/ Twitter: @StoyanChalakov