Heavy Duty LiteSpeed Log Shipping

Part 4: Playing Nice

By Bill Wunder


Occasionally, activities will cause the rate of transaction log writes to increase dramatically in what I’ll call spikes or surges. Spikes are brief periods when a burst of write activity hits the log device and then, just as suddenly, ends, with log activity quickly settling back to a normal rate. Spikes of log activity can occur during data imports to large tables in a batch process, or more likely while processing data into a permanent table just after an import into a staging database. Surges of log activity begin as quickly as spikes, but the high logging rate is sustained for a longer period of time. Surges can occur for the same reasons as spikes but usually involve larger tables or datasets.


There are a number of approaches used to reduce contention and log surging when large tables are loaded or maintained in a batch process. One is to iterate through the changes, affecting one row or a limited number of rows in each iteration. This is most effective, in terms of managing log growth, for a database operating in the “simple” recovery model. In the “simple” recovery model, the log will be truncated on checkpoint back to the beginning of the oldest active transaction. Keeping each transaction small through a looping algorithm ensures that the complete table doesn’t need to be processed as a single transaction. This technique is generally not useful for controlling log growth in the log shipping scenario, because you cannot back up logs if the “simple” recovery model has been used since the last full backup. Still, it may be a valid approach, even for a database under log shipping, to reduce the duration of blocking in a highly contentious database.
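To make the looping idea concrete, here is a minimal sketch of a batched delete in the SQL Server 2000 style. The table dbo.ActivityLog, the LogDate column, the 90-day cutoff, and the 5000-row batch size are all hypothetical; the point is that each iteration commits a small transaction that a checkpoint can truncate past under the “simple” recovery model.

```sql
-- Hypothetical example: purge old rows in small batches so no single
-- transaction ever covers the whole table.
SET ROWCOUNT 5000                 -- limit each DELETE to 5000 rows (SQL 2000 idiom)

DECLARE @rows int
SET @rows = 5000

WHILE @rows > 0
BEGIN
    -- each pass is its own small, quickly committed transaction
    DELETE dbo.ActivityLog
    WHERE LogDate < DATEADD(dd, -90, GETDATE())

    SET @rows = @@ROWCOUNT        -- 0 when nothing is left to delete
END

SET ROWCOUNT 0                    -- restore unlimited row processing
```

In the “full” recovery model this loop doesn’t shrink the log backups at all, but the short transactions still keep blocking brief, which is why it can remain worthwhile under log shipping.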


When importing data or creating indexes, the “bulk-logged” recovery model is useful to greatly reduce the amount of logging. If you have a database with a significant maintenance window, this recovery model can easily be used to keep log shipping backup files to a reasonable size. Unlike the “simple” recovery model, the “bulk-logged” recovery model can be used for a database under log shipping, though it does add some risk and eliminates the point-in-time log recovery that is available with a database in the “full” recovery model. In general I’d recommend avoiding the “bulk-logged” recovery model for a database under log shipping in 24x7 operating conditions. Even though you’ll use significantly less log space during an import, you’ll still face the contention and data integrity issues associated with atomic mass changes to large tables. If a bcp or BULK INSERT is active against a large table, the table’s data will be continually changing and/or unavailable during the import. Often you will have some tables that can’t be locked for a long duration and some tables that can’t be in an inconsistent state during import processing, as mandated by your particular 24x7 business model. In most cases a staging database is needed to import the data before it is more elegantly integrated into the live dataset. Sometimes you may even need to switch between two tables when an import occurs, moving from the old data to the new data in an instant. In such cases the “bulk-logged” recovery model doesn’t offer much help. Conversely, a CREATE INDEX on a large table may be the most valid time to use the “bulk-logged” recovery model while under log shipping, though only if the CREATE INDEX is a rarely occurring activity. The CREATE INDEX will still block other access to the table regardless of the recovery model used, but the “bulk-logged” recovery model will greatly reduce the logging these operations generate.
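A minimal sketch of that rare CREATE INDEX case might look like the following. The database name, index name, and backup path are assumptions for illustration; the essential steps are switching to “bulk-logged” only for the duration of the operation, returning to “full”, and taking a log backup right away so the log shipping chain carries the minimally logged work forward.

```sql
-- Hypothetical: minimize logging for a rarely run CREATE INDEX
-- on a log shipped database.
ALTER DATABASE MyAppDB SET RECOVERY BULK_LOGGED

CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId)

ALTER DATABASE MyAppDB SET RECOVERY FULL

-- Back up the log immediately. Note that point-in-time recovery is not
-- possible within a log backup that contains a bulk-logged operation.
BACKUP LOG MyAppDB TO DISK = 'E:\LogShip\MyAppDB_post_index.trn'
```

Keep in mind that the log backup containing the bulk-logged work can still be large, because it must include the data extents the operation touched.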


For maintenance of existing indexes, archive/delete processing, looping algorithms that, even though they don’t create long periods of blocking, can generate a huge amount of transaction log activity, and all the other conditions that result in log growth surges, a method to manage log growth in a log shipping environment is necessary. It is important to manage this log growth to avoid running out of disk space on the log device, and to avoid an extended backlog of log backups creating an uncomfortable latency between the state of the source database and the state of the destination database. Sure, you can configure an 8GB database with a 32GB log device, but if your SLA says you can fail over to your warm spare with less than 10 minutes of lost data and you’re in the midst of shipping and restoring a 30GB backup when you need to fail over, it’s going to cost you.


A technique that is working well for me is to exploit the SQLAgent job scheduler using the msdb-based system stored procedures sp_start_job and sp_stop_job. I’ll describe how I use this technique with particular reference to index maintenance. As a working example I have modified one of the earlier scripts I placed in the sswug.com Script Library to show the changes necessary to achieve log space management with a long running, log intensive process. The original script is still posted as “admin db – Index (fragmentation) Maintenance”. I have modified that earlier version to include the technique I’m about to describe and posted the updated version as “admin db - index (fragmentation) maintenance for log shipping”. Please review both scripts to get a full understanding of how this technique works.


The script jumps from about 300 lines to about 600 lines with the additions, so there is quite a lot that has to happen. Basically, the procedure dbo.IndexMaintenanceForDB should be set up in the SQLAgent job scheduler to run at an appropriate time (i.e. when the server is not working its hardest). When a database in the “full” recovery model is defragmented, the maintenance job will create a second stored procedure named dbo.DefragLargeTables in the admin database that is basically a clone of the defrag logic of dbo.IndexMaintenanceForDB (only creating it, of course, if it doesn’t already exist), and also a second SQLAgent job that calls dbo.DefragLargeTables, named “w Defrag large <the DB Name> tables”. The new job does not get a schedule. Instead, when dbo.IndexMaintenanceForDB begins defragging the large tables in the database it will not directly execute a DBCC. Rather, it will call sp_start_job to execute “w Defrag large <the DB Name> tables”. sp_start_job is effectively an asynchronous operation, so once the second job is started, dbo.IndexMaintenanceForDB will sleep for a while, then periodically wake up and check the current used space in the log file for the database being maintained by looking in master.dbo.sysperfinfo. If the current log used size exceeds a threshold you have set as a parameter to dbo.IndexMaintenanceForDB, the “w Defrag large <the DB Name> tables” job will be stopped by a call to sp_stop_job, and dbo.IndexMaintenanceForDB will wait until the log size drops below the threshold before restarting the job. This works here because DBCC INDEXDEFRAG is an online operation that can be stopped at any time with no loss of work already done and no resulting inconsistencies in the data. To use the technique for other operations you’ll want to be sure you write those operations to meet the same standards: there must be no loss of work and no inconsistencies in the data.
In most cases that will mean your process takes longer to run, with the advantages that it can be stopped and started at any time and that it will not cause long-running contention or blocking for other consumers of a particular table’s data.
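The full implementation lives in the posted script, but a condensed, hypothetical sketch of the throttle loop may help make the flow concrete. The database name, threshold value, job name format, and the one-minute polling delay are assumptions for illustration; the sysperfinfo counter query assumes a default instance, where the performance object is named “SQLServer:Databases”.

```sql
-- Hypothetical sketch of the start/poll/stop loop inside a procedure
-- like dbo.IndexMaintenanceForDB. Not the author's exact code.
DECLARE @DBName sysname, @JobName sysname
DECLARE @MaxLogUsedMB decimal(10,2), @LogUsedMB decimal(10,2)

SET @DBName = 'MyAppDB'                              -- assumed database
SET @MaxLogUsedMB = 1024.0                           -- assumed threshold (MB)
SET @JobName = 'w Defrag large ' + @DBName + ' tables'

-- sp_start_job returns immediately; the defrag job runs asynchronously
EXEC msdb.dbo.sp_start_job @job_name = @JobName

WHILE 1 = 1    -- the real procedure also exits when the job completes
BEGIN
    WAITFOR DELAY '00:01:00'                         -- sleep, then re-check

    -- current log used space for the maintained database, in MB
    SELECT @LogUsedMB = cntr_value / 1024.0
    FROM master.dbo.sysperfinfo
    WHERE object_name = 'SQLServer:Databases'
      AND counter_name = 'Log File(s) Used Size (KB)'
      AND instance_name = @DBName

    IF @LogUsedMB > @MaxLogUsedMB
    BEGIN
        -- safe to interrupt: DBCC INDEXDEFRAG loses no completed work
        EXEC msdb.dbo.sp_stop_job @job_name = @JobName
        -- here the procedure would wait for log backups to bring used
        -- space back under the threshold, then call sp_start_job again
        -- (omitted for brevity)
    END
END
```

The design choice worth noting is that the monitor and the worker are separate jobs: sp_stop_job can kill the defrag mid-flight precisely because DBCC INDEXDEFRAG commits its work in small, interruptible units.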


If you haven’t already, you can download the Heavy Duty LiteSpeed Log Shipping stored procedures and save them all to a location that you have permission to access from the network, using the file names indicated in each procedure’s download script. Then follow the steps outlined in Heavy Duty LiteSpeed Log Shipping Part 2 for deploying the log shipping job on a test server.

Discover the usage and processing actions of a deployed log shipping installation in the article Heavy Duty LiteSpeed Log Shipping Part 3. Then if you have surging log growth conditions you can use the information presented here to help you manage log growth. There is one more topic I’d like to cover in the next article before we leave this discussion of Heavy Duty SQL LiteSpeed Log Shipping: fail over strategies. Hope you check it out.