SQL Server database backups help companies avoid data losses and recover from database corruptions quickly. You can create alerts to let the database administration team know about failing backup jobs. However, it is also equally important to keep a track of backup failures and work on the root causes. In this article, we will learn how to analyze backup failures
The backup job might fail because of network or hardware or even due to human errors. Though hardware and human errors can be controlled to an extent, network related errors are hard to control. In the case of network related errors, we will end up working closely with the network team. We would probably try to understand what happens during a specific period that interrupts the backup processes. When the backup path is a shared network folder, there are high chances that our backup job will fail. This might also happen whenever backup happens for big databases.
Log and work on it
With the following script, we are going to read the pattern of backup failures. Please note, the final log table would show the log data and the message. You can easily modify it to show the server name too.
DECLARE @Tserver TABLE ( cserver VARCHAR(200) ) INSERT INTO @Tserver VALUES ('SERVER1') DECLARE @Tcmd TABLE ( ccmd VARCHAR(2000) ) DECLARE @Tlog TABLE ( cdate DATETIME, cinfo VARCHAR(200), cmessage VARCHAR(2000) ) INSERT INTO @Tcmd SELECT 'exec (''EXEC sp_readerrorlog 0,1,''''Backup failed'''''') at [' + cserver + ']' FROM @Tserver DECLARE c1 CURSOR FOR SELECT * FROM @Tcmd DECLARE @cmd VARCHAR(2000) OPEN c1 FETCH next FROM c1 INTO @cmd WHILE @@FETCH_STATUS = 0 BEGIN PRINT @cmd INSERT INTO @Tlog (cdate, cinfo, cmessage) EXEC(@cmd) FETCH next FROM c1 INTO @cmd END CLOSE c1 DEALLOCATE c1 SELECT * FROM @Tlog
How it works
Before executing this script, you must create linked servers. This script connects to each linked server, reads the error log based on parameters that were set in the script and then updates the log table.
3 values are passed as parameters to the sp_readerrorlog. 0 is to indicate that the latest log must be read. 1 is to indicate that the SQL Error log must be read and not the Agent log. The third is search string. Below are common messages that are related to a SQL Server backup failure
|BACKUP failed to complete the command %.*ls. Check the backup application log for detailed messages.|
|Backup To URL failed to write status messages to the Windows Event Log.|
|Backup failed because there is a mismatch in file metadata for file %d.|
|%s: Backup device ‘%s’ failed to %s. Operating system error %s.|
|Log backup for database “%.*ls” on a secondary replica failed because the last backup LSN (0x%ls) from the primary database is greater than the current local redo LSN (0x%ls). No log records need to be backed up at this time. Retry the log-backup operation later.|
Please do regular database integrity checks to avoid taking backup of corrupted SQL Server databases.
Neil Varley is a data recovery expert in DataNumen, Inc., which is the world leader in data recovery technologies, including Outlook repair and excel recovery software products. For more information visit www.datanumen.com