Retrofitting Table Level Delete/Archive Strategies - Moderately Complex Storage Models

by Bill Wunder


Welcome back! This article continues a discussion of resolving blocking and contention problems created by applications with an unbounded table growth design. Be sure to review the opening article in this series, Unbounded Table Growth by Design, for a more complete discussion, as well as the second installment, Classifications of Logging Tables, which defines classifications loosely based on how, and for how long, data is used by the application.


In the last episode, Short Term Storage Models, we looked in some depth at a local partitioning solution useful as an alternative to trying to remove and/or archive the oldest rows from a table while an application is busily adding new rows. In that Saturday Night Live scenario we had determined that once the data aged beyond a certain time our application had no interest in querying that data again. To tackle such a problem we looked at a solution that used twin tables and regularly rotated those two tables into a view that was used by the application. This technique allowed us to stabilize the storage requirements at about the size of the data set that remained interesting, and provided a very small window of only a second or two where the data delete/archival processing would compete with the application for access to the table: the time necessary to execute an ALTER VIEW statement. And in that scenario the moment in time when the contention occurred was easily managed, so we could further limit that contention by performing the ALTER VIEW during off or low hours of operation. The view took on the name originally used in the application for the table, so no code changes were necessary at the application layer. The twin tables were given an A and B suffix to satisfy those pesky uniqueness rules in sysobjects. Now let's build on that discussion and consider a data requirement where the application's interest in aging rows is limited not only by time but also by the state of each row.
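As a refresher, the flip-flop mechanics can be sketched in a few statements. The table and column names here are hypothetical stand-ins, not the names used in the actual demo script:

```sql
-- Twin tables satisfy the uniqueness rules in sysobjects with A/B suffixes.
CREATE TABLE ActivityLogA (RowId int IDENTITY, LoggedAt datetime, Detail varchar(100))
CREATE TABLE ActivityLogB (RowId int IDENTITY, LoggedAt datetime, Detail varchar(100))
GO
-- The view takes the name the application already uses for the table.
CREATE VIEW ActivityLog AS SELECT RowId, LoggedAt, Detail FROM ActivityLogA
GO
-- At rollover, the only contention with the application is the second or
-- two needed to repoint the view at the other twin.
ALTER VIEW ActivityLog AS SELECT RowId, LoggedAt, Detail FROM ActivityLogB
GO
```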


In a frequently occurring subset of logging table implementations we find a situation where the rows must progress to a completion. Once they attain that final state they are no longer subject to modification by the application, though occasionally a "data doctor" might need to adjust rows at the final state. Consider, for example, the order line row that must be either shipped, backordered, or cancelled before it can be considered complete. Once a final state is attained - and as well, once a predefined interval for delete/archival processing has passed - the application has lost its interest in the logged row in its current form. To continue our example, the order line row may be of interest to the sales force in terms of usage patterns, or to the inventory analyst in terms of forecasting or order levels, but it is no longer interesting to the warehouse or shipping room. At this point the order line history table or the data warehouse may want to consume the order line, but the order line table is going to be happier and better behaved if we can elegantly get it out of the OLTP order line universe yet maintain a "near-line" availability. One thing obvious here is the added need to persist incomplete rows in the active data set. In an unbounded table growth model this will almost always translate to a need to scan a table or index on a volatile yet poorly selective state column, as well as a more helpful index to qualify rows for delete/archival processing. If an application is successful - useful and busy - such a process can be a serious performance and contention problem, and the larger the active data set the worse the problem. In such cases it is often pleasantly effective to manage the data set size by limiting the active set by age, as we did in the Saturday Night Live model, based on a well chosen interval, and also to examine the partition that is losing focus at each rollover to move the interesting (i.e. incomplete) rows into the newly activated table of the A/B pair. This is what we will consider the Middle of the Road model. We need to do more than simply flip-flop the views that shroud the partitioning tables, but we are able to establish a looser relationship between the current partition and the aged partition than a formal partitioned view, and we won't have the level of contention created by mass deletes and high volume inserts on the same table. This means each time we roll over to the other underlying table of the A/B pair we'll likely still need to do that index scan to identify the rows in the table being moved to the background - though with shared locks rather than exclusive locks - and copy those rows to the table about to take the lead. Yet we'll have the benefit of an unavoidably smaller, more predictable in size data set to scan, which will help minimize the duration and reduce the level of contention between the delete/archival process and the application's normal work.
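To make the carry-forward concrete, here is a minimal sketch assuming a hypothetical order line pair and a Status column where 'S' (shipped), 'B' (backordered), and 'C' (cancelled) are the final states:

```sql
-- Copy still-incomplete rows from the table losing focus into the table
-- about to take the lead; the scan takes only shared locks, so readers
-- are unaffected and writers are blocked only briefly.
INSERT OrderLineLogB (OrderId, LineNo, Status, LoggedAt)
SELECT OrderId, LineNo, Status, LoggedAt
FROM OrderLineLogA (holdlock)
WHERE Status NOT IN ('S', 'B', 'C')
```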


The issue of determining whether to move rows into the table that is about to become current, or to take the more expensive route of adding a check constraint and creating a full fledged partitioned view, is the most important decision point in any local partition retrofit. If there is an efficient way to avoid the check constraint it must be fully considered, for the simple reason that creating that check constraint carries risk and requires extended exclusive rollover processing time. If you create the constraint with NOCHECK, meaning you skip the verification that existing data satisfies the check expression, there is a risk that some data violates the rule and you WILL NOT be able to create an updateable partitioned view. If you let bad data past the constraint at creation your application can end up dead in the water and you'll have a highly stressful data cleansing search and destroy mission to face. If you check the data against the constraint's expression you'll have a bit of a wait on your hands. I strongly encourage a Middle of the Road approach that does not require a check constraint whenever possible.
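For comparison, the two ways to add the range constraint, using illustrative names and an assumed datetime partitioning column:

```sql
-- WITH NOCHECK: fast, but existing data is never verified. If any row
-- violates the expression, the view over the pair will not be updateable,
-- and you may not find out until the application fails.
ALTER TABLE OrderLineLogA WITH NOCHECK
  ADD CONSTRAINT ck_OrderLineLogA_LoggedAt CHECK (LoggedAt < '20040101')

-- WITH CHECK: SQL Server scans every existing row to verify the
-- expression, which can mean a long wait on a large table.
ALTER TABLE OrderLineLogB WITH CHECK
  ADD CONSTRAINT ck_OrderLineLogB_LoggedAt CHECK (LoggedAt >= '20040101')
```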


A couple of side notes


In some cases the check constraint created with NOCHECK can provide improved select performance even though it disallows updatability. Furthermore, if you shift rows that still need to accept DML operations to the table becoming the current table, it is necessary to consider those rows when establishing the value to use for the check constraint.


Without adding unnecessary complexity to the discussion, it is worthwhile to point out that the strategy used here is as effective with a table hierarchy as it is with a single table. For instance, it is at least as likely that an order will remain active until all order lines have reached a final state. In that case the state of the order as a whole would be the driving consideration as to whether the order lines were eligible for removal from the active set. I have successfully deployed a local partitioning strategy in at least three high volume table hierarchies. A bit more thought at the beginning of the effort is required for sure, but the results are quite satisfactory. That's about all I'll say about local partitioning strategies for multiple tables here. If there is interest in that topic please let me know and I'll gladly put together a description of such an endeavor.




To keep the practices established in the Saturday Night Live model consistent with those here in the Middle of the Road model, we will stay with the established common methodology for this partitioning retrofit. Whether working on the simplest model or the most stringent, we want a conversion script that will morph the single table model into the desired partitioned model, a rollback script capable of returning the converted data model to a single table model, and a rollover script that can flip-flop the partitions at will as the table grows and the data ages. The conversion and rollback scripts can be stored procedures or simple run-once scripts. The rollover script I strongly suggest be made a well documented stored procedure so that the dependencies involved in the partitioned table design remain visible to all developers. In these examples I will follow the convention of creating a stored procedure for each operation, if for no other reason than to make it easier for you to deploy the examples into a test environment as you follow the discussions. In addition, each script will create the common database used for all examples if it is not found on the SQL Server, so that each test case can be run independently.


For all scripts we will make use of the INFORMATION_SCHEMA views. The INFORMATION_SCHEMA views are your friends, and Microsoft has made a commitment toward supporting these views in future releases. If you have additional questions concerning these views be sure to consult the Books Online documentation.
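For instance, one quick way to learn which twin currently backs the view (names here are illustrative):

```sql
-- Which base table does the application's view currently select from?
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.VIEW_TABLE_USAGE
WHERE VIEW_NAME = 'OrderLineLog'
```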


For this Middle of the Road example we will use the same table structure as was used in the Saturday Night Live example. We will create a new copy of that table with a different name in the same test database where the simpler model was implemented. We want to have adequate columns to demonstrate the necessary techniques, but probably not as many columns as you may have to deal with in a real world retrofit of an unbounded table growth design. I think it will be immeasurably useful if you can find a test or development SQL Server 2000 to run the scripts as we move along, though I will also present test cycle result sets so that you can see what is happening if for some reason you cannot run the demonstration.


The complete script for this article can be found at Local Partition Demo - Middle of The Road. If you select this link now the script should bring up a new browser window so you can toggle between the article and the script. As with the Saturday Night Live model the script first creates the LocalPertitionDemo database and adds the two example roles.


The conversion script here is very much the same as in the first example. The real difference is that here the UNION view covering both tables, and the view covering the not-currently-in-use table of the partition pair, are more likely to be mandatory and used by the application. That implies that it is more likely that some stored procedures used by the application will need to be modified to reference the correct view. The conversion script makes the existing single table become the "A" version of the view-shrouded table and then creates a clone of that table for the "B" version. Once both tables exist, the script creates the views. Note the use of exec() to create the views, since the CREATE VIEW statement must be in its own batch. (You could also use sp_executesql.)
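A rough sketch of the conversion steps, with hypothetical names standing in for the ones in the demo script:

```sql
-- The existing single table becomes the "A" twin...
EXEC sp_rename 'OrderLineLog', 'OrderLineLogA'

-- ...and an empty structural clone becomes the "B" twin. Indexes,
-- constraints, and defaults must be re-created on the clone separately.
SELECT * INTO OrderLineLogB FROM OrderLineLogA WHERE 1 = 0

-- CREATE VIEW must be alone in its batch, hence exec().
EXEC ('CREATE VIEW OrderLineLog AS SELECT * FROM OrderLineLogA')
EXEC ('CREATE VIEW OrderLineLogOther AS SELECT * FROM OrderLineLogB')
EXEC ('CREATE VIEW OrderLineLogAll AS
       SELECT * FROM OrderLineLogA UNION ALL SELECT * FROM OrderLineLogB')
```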


Like the conversion script, the rollback script is very similar to the one used in the simplest model. We need to determine which of the twin tables is currently in use and move it out from underneath the view. Notice how easily this is accomplished by a quick check in the INFORMATION_SCHEMA. It is somewhat more perilous to roll back a Middle of the Road local partition because there is an increased possibility that duplicate rows may exist in the two tables. This is definitely something to watch for, especially if you try to roll back after a failed rollover. Under normal operating conditions the rollback script is more likely to behave as expected. (But then why would you need the rollback script if things are operating normally?)
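The heart of the rollback can be sketched like this (illustrative names again):

```sql
-- Find the twin currently behind the view...
DECLARE @current sysname
SELECT @current = TABLE_NAME
FROM INFORMATION_SCHEMA.VIEW_TABLE_USAGE
WHERE VIEW_NAME = 'OrderLineLog'

-- ...then restore the single table model under the original name.
DROP VIEW OrderLineLog
EXEC sp_rename @current, 'OrderLineLog'
-- Check for duplicates before folding back any rows from the other twin.
```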

The rollover script is a stored procedure that will determine the name of the current table and the other table, then atomically truncate the other table and alter the view or views appropriately. The rollover script is where the Middle of the Road model deviates significantly from the Saturday Night Live model, as it transfers data between tables. This procedure could also perform or initiate the aggregation or archival activity using the newly defined other table. Notice that we make an effort to use sp_executesql rather than exec() since this procedure will be used regularly and will likely be run by an automated task, while I chose not to type the extra few lines in the conversion script and the rollback script because those scripts will likely only be needed one time and I'll be there to see the Query Analyzer output - really just a little laziness on my part, for sure. Like the Saturday Night Live rollover script, this script executes the dynamic command string in a transaction so that the "holdlock, tablock" hint has the effect of stopping other writes but allowing reads while the ALTER VIEW happens. Unlike the Saturday Night Live rollover script, here the dynamic string includes a DML statement, within that transaction, to copy rows we wish to remain active from the table losing focus to the table gaining focus.
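In outline, the rollover procedure might look like the following sketch. The names and the Status test are assumptions for illustration; the real script works against its own table structure:

```sql
CREATE PROCEDURE RolloverOrderLineLog AS
DECLARE @current sysname, @other sysname, @sql nvarchar(4000)

-- Which twin is the view pointing at right now?
SELECT @current = TABLE_NAME
FROM INFORMATION_SCHEMA.VIEW_TABLE_USAGE
WHERE VIEW_NAME = 'OrderLineLog'
SET @other = CASE @current WHEN 'OrderLineLogA'
                  THEN 'OrderLineLogB' ELSE 'OrderLineLogA' END

-- One transaction: empty the stale twin, carry incomplete rows forward
-- under holdlock/tablock (readers continue, writers wait), then flip the
-- view. ALTER VIEW needs its own batch, so it runs through exec().
SET @sql = N'BEGIN TRAN
  TRUNCATE TABLE ' + @other + N'
  INSERT ' + @other + N' (OrderId, LineNo, Status, LoggedAt)
  SELECT OrderId, LineNo, Status, LoggedAt
  FROM ' + @current + N' (holdlock, tablock)
  WHERE Status NOT IN (''S'', ''B'', ''C'')
  EXEC (''ALTER VIEW OrderLineLog AS SELECT * FROM ' + @other + N''')
COMMIT TRAN'
EXEC sp_executesql @sql
GO
```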

That provides all the components we need to create and manage this partitioning method. You may wish to use the SQL Agent to initiate the nightly rollover, or you may want the application to be involved. This will be driven by how much impact the rollover blocking has on the application. Only good testing in your environment can provide the ultimate answer to that question. The size of the table being aged to the background, the quality of the index used to identify rows that need to be rolled over to the table becoming current, and the number of rows that will be rolled over will all have an impact on the duration of this process.

The script concludes with a test script to verify all other scripts by executing a conversion, a rollover after a little data is added, and a rollback. The results of executing the script are included after the test script so you can observe the results without actually running the test on your test box; however, I heartily encourage you to give it a try.

If you follow the test script through the output you can see the structure and names of the views and tables as well as the behavior of the data in this Middle of the Road script. Once you get a real conversion to the point of successfully executing a similar test plan in the unit testing environment, you'll be ready to apply the conversion and a few rollovers in the integrated testing/development environment to verify that the conversion and rollover processing plays well with your application(s).

For those tables where you are removing a large chunk of rows on a regular basis and experiencing unbearable contention between the application and the row removal, this process can become a dear friend.

That brings us to the final level of partitioning we will consider. First another break to give everyone a chance to play with this example.

Hope to see you in a few days and we'll look over the most complex approach to locally partitioned retrofits.