Resolving Blocking Problems

By Bill Wunder

Blocking is a good thing! Right?

We’re not talking that nasty old deadlocking, just plain old maintain-the-specified-isolation-level blocking.

If it weren’t for resource locking – and the resulting blocking conditions – multiple processes could not be efficiently and accurately manage in the database. Ideally, a process gets its locks, quickly does its SQL business while any other processes are briefly blocked should they have an interest in a locked row/page/extent/table, then releases the locks and all processes go on about their business. No human is even aware of any blocking that may have occurred but everyone gets the data they expected.

Ah the perfect world. It’s a lovely place when you can find it.

Inevitably reality will set in and we will have to deal with database blocking that has become a business problem. In reality there will be blocking problems along with the irrevocable benefits. Most often there is a problem when users see the effects of blocking and when there is a problem they will let you know about it. Sometimes they must simply wait a noticeable amount of time and thus develop a perception that the system is “slow”. Other times the blocking persists beyond the threshold of the query or connection timeout so the action is aborted and the user may even observe an error message. Most likely the error message will be a “connection timeout” at the application. What constitutes a noticeable amount of wait time or a reasonable connection timeout setting can vary from shop to shop, application to application, and query to query. For some applications a hesitation of only a second or two can not only degrade performance perception but can queue a large number of queries and actually produce a cascading performance impact that can take a significant amount of time to clear after the initial database block is released. For others, a block may go harmlessly unnoticed for minutes or even hours. The moment of unacceptable slowness for a connection - that duration when an error message would be favorable to a continuation of the pause of the blocked user’s process – is really the duration that the query/connection timeout should be set. However - again pulling us back to reality - it is not uncommon to find query/connection timeout thresholds set to “never”. In fact, SQL Server defaults the query timeout to “never” and depends on the application connection to establish the appropriate setting.

So what can we DBAs provide in terms of lock administration? Implicitly we can gain an understanding of how the Storage Engine manages concurrency, how to avoid and reduce blocking, and how effectively each particular application in our shop that uses a database with a blocking problem is designed. If we go into a blocking problem resolution effort with good fundamentals in these areas we are in an ideal place to get the job done. SQL Server Books Online is a good place to begin a study of the SQL Server Locking Architecture. Of particular note are the articles “Understanding and Avoiding Blocking” and “Locking Architecture.“ There are a number of web sites and books available with more in-depth discussions of locking. I recommend consulting as many sources as you can dig up. It is a fairly complex topic and the more voices you can add to your understanding the higher quality will be the knowledge.

Almost always blocking is going to be an application issue and resolution will come from a change in application behavior. In some cases a revision of indexing is the cure, in other cases a more sweeping application revision is needed. This means that the developers will expend more energy in the resolution phase of the problem and the DBA will expend more energy in identification and control phase of a blocking problem investigation. As a DBA you need to be able to lay out the information to the developer: A is blocking B and here is why. Perhaps together you can brainstorm the solution possibilities. Then the developer must implement and test the revision to assure the behavior of the application.

The first thing to be done to control a block has to be to understand the block. Catching a block in situ provides the best opportunity to fully understand any block, determine if it’s a block you can live with or a block that is unacceptable, and then if necessary master the blocking condition. A good test of the level of understanding of a block is the ability to reproduce the block at will. Interestingly, once you can make a block happen intentionally you can control the block. Once the lock is controlled it can be manipulated as necessary. Being in control of the block is the key to effective lock administration.

SQL Server offers some tools that can help: sp_who, and sp_lock can both provide a point in time view of a block condition. Sometimes Profiler can prove useful though be aware that locking information can produce some voluminous output.

Catching a block in action can sometimes be a challenge. I find using a stored procedure I’ve put together in the SQL Agent Job Scheduler to monitor the system for blocks very effective. I use the procedure GetBlockingDetails in a SQL Agent job scheduled to run every minute. Once a minute it scans sysprocesses for any blocked spids. If it doesn’t find any it’s done until the next minute. If it finds a block it identifies the lead blocking spid and logs it to a table and then waits 30 seconds (change the delay periods as necessary ti suit your needs) and checks again to see if the block is persisting. If so, it captures sysprocesses data for all blocked spids involved in the block to a table and shoots off an email letting me know there is an interesting blocking episode happening. After another 30 seconds it checks that the block is ongoing and if so it shoots me off an email telling me the block is a doozey. Then the procedure goes into a loop checking for the block once a minute until it sees the block has cleared at which time it sends me one more email and then starts watching for the next block. This has proven very helpful for me.

What works best in your shop and for your applications is know best by you. Hope you found some useful information and at least a few building blocks for becoming the ultimate lock administrator wherever you are.

Bill