Are They Really Using All Those Stored Procedures?

By Bill Wunder

 

Follow me on a few twists and turns (OK, even a few grumbles) through SQL Server’s memory pool and I’ll take you to a little "set it up and go" tool that will give you a remarkably accurate indication of which of those piles and piles of stored procedures on your system are every really getting used and the degree to which each is really called.

 

SQL Server memory space has evolved over time into a virtual black box. In the early generations of the product memory configuration and management was quite the voodoo science. It was necessary to tell the server how much memory went to procedure cache and it was possible to define how much of the reaming would be distributed among the various types of data cache buffers and pools. The number of configuration options was a bit daunting, the measurement techniques were obscure and/or expensive, and the documentation was woefully inadequate. Interestingly, manifestations of memory leaks were an every day occurrence. The more work the SQL Server did – and for the most part irregardless of which memory tweaks were in place – the more often the SQL Server either crashed or needed to be started or needed to be rebooted because memory management  wasn’t doing it’s job.

 

Fortunately, the recent releases of SQL Server are much better at memory management. The improvements have led to an almost complete elimination of manually configurable memory options and to a much more stable primary storage environment. Under the covers, though, the memory pool retains much the same architectural layout as the legacy releases. There is still a fixed amount of memory allocated for SQL Server executable code address space with the remaining majority going to cache. SQL Server executable code address space includes the kernel and the server’s internal road map(s) to the protocols, .dlls and providers currently in use by the SQL Server. This piece of memory still seems a bit susceptible to leaks that can bring a server down. (See my earlier article “Automation and T-SQL” for more on this topic). The SQL Server –g start up option is an offshoot of Microsoft’s attempts to address these problems. Personally, I’d rather avoid memory leaks when possible and reboot regularly than resort to intentionally stealing chunks of the memory pool. Don’t really think Microsoft has helped us much with the –g command line option, Must be one of those business risks driving it? (i.e. CYA technology). The memory pool – or cache - is the more interesting segment of a SQL Server’s working set. In this area reside the procedure, data and log caches as well as per connection dynamic data structures and the system data structures. Basically, these types of cache are left to talk amongst themselves to decide who gets what amount of the available pool.

 

It’s no surprise that data cache gets most of available pool. At least it better if you want decent performance from the SQL Server.  Obviously, connection information and system information (including sysprocesses, syslocks, sysperfinfo, syscacheobjects and the other virtual tables must be able to claim the space they need so their memory requirements directly impact the amount available for data, though the expectation would be that they do not to exact a terribly heavy cost. Procedure and log cache I would expect to remain a pretty constant fraction of data cache space size once the server has acquiesced after a start-up. Log data is probably going to disk as quickly as possible and likely gets a top priority thus holding down the space needs of that memory queue. Procedure cache is being managed through an LRU aging algorithm.

 

Here’s where it gets interesting. SQL Server uses the virtual table syscacheobjects to track LRU statistics and age the procedure cache. If you query syscacheobjects you will find a sort of index of the current state of the cache. The data objects in cache can change dramatically so trying to see what tables are in cache on a busy server is not practical. You can, however, easily look in syscacheobjects and see which embedded or ad-hoc query statements have recently been executed if you are trying to track down a problem where that information is useful. You can also look in this table to see which stored procedures are currently in use by checking for an “execution context” cacheobjtype or see which have recently been used by checking for “compiled object” cacheobjtype. I have found it useful to use that latter bit of information as a tool to identify which procedures are actually getting called on my SQL Servers.

 

If you examine the code in the stored procedure GetProceduresFoundInCache you’ll notice that it is a self contained utility - intended for creation in the admin database if course. The procedure creates the historical tracking table it uses and even a mini-stored procedure that it uses to track when the compiled objects in cache have been changed. It infers that cache has changed by regularly looking for this mini-proc in syscacheobjects every minute as it is executed from the SQL Agent job that you set up to call it. The mini-proc is effective because it should be at the lowest rung in the LRU algorithm at all times. It always has a use count of 1 which is the minimum and does not reference any other object in cache. If GetProceduresFoundInCache finds the mini-proc it does nothing. If the mini-proc is gone, GetProceduresFoundInCache captures a snapshot list stored procedures currently in cache and updates the table ProceduresFoundInCache from that snapshot.

 

Querying ProceduresFoundInCache can tell you when a procedure entered the table, when it was last called, and the number of times that it was found in cache when GetProceduresFoundInCache determined a sample was necessary. By Left Joining ProceduresFoundInCache to sysobjects in other user databases you can easily list those procedures that were never found in cache.

 

There is no written guarantee that every stored procedure will be cached, so after you have identified those that don't appear in syscacheobjects you can do a couple of things to see if a procedure just isn't ever cached. You can execute it and see if gets cached or you can add an insert into a tracking table to each identified unused procedure and leave it in the database for a comfortable period of time. The advantage to this latter approach is that you could even log the application name and host machine that referenced a procedure. Could be you'll uncover a complete application process that is unneeded or unknown.

 

The biggest problem I have with this tool is the high level of developer denial surrounding the amount of their hard work that isn’t being used.

 

Bill