Making the best of sp_monitor

By Bill Wunder

 

One challenge that befalls every DBA is keeping an eye on system performance and general system activity trends over time. Here is a no cost, low effort way to track CPU, network, and disk activity trends over time. Using this simple method you can begin collecting data today and in as little as a few weeks have a set of daily data points that ups can use to describe the changes over time your SQL Server is seeing. Now I know that sounds a little like one of those metabamagic weight loss infomercials, but stick with me and I think you’ll be able to tell the real deal from the gimmick.

 

The sort of historical trend tracking we’ll consider can be useful to determine if the time to “throw hardware at it” is approaching or to identify cyclical usage patterns, or even identify a degraded level of adequate indexing support for a SQL Servers query load. The alternatives can be significantly more complex, expensive in terms of system overhead and even dollars, time consuming, and subject to low quality if neglected. With the tool I’ll describe here you can examine a relatively small and easy to understand result set or even easily load that result set into an Excel chart when you need some pretty pictures for the boss to justify a capital expenditure. Once you spend the hour necessary to read the article and set up the data collection SQL Agent job data collection will continue with no additional effort into a narrow table that will grow one row per day. So easy and low maintenance I think you’ll agree that there is no reason not to implement this too today.

 

System Statistical Functions

Long before Microsoft gave us the sysperfinfo table there were a set of system populated global variables that could provide some basic statistical information concerning CPU, network, and disk subsystem activity. These variables are all basically accumulators or ever increasing values that begin at 0 each time the SQL Server is started and continually grow over time. Note that the values are nondeterministic. The set of global system statistical function variables include:

 

            @@connections - number of connections, or attempted connections    

            @@cpu_busy - time in milliseconds that the CPU has spent working

            @@idle - time in milliseconds that the CPU has spent working

            @@pack_received - number of input packets read from the network

            @@pack_sent - number of output packets written to the network

            @@packet_errors - number of network packet errors that have occurred

            @@io_busy - time in milliseconds performing input and output operations

            @@total_read - number of disk reads (not cache reads)

            @@total_write - number of disk writes

            @@total_errors - number of disk read/write errors

            @@timeticks - number of microseconds per tick

 

sp_monitor

The good news is that there is a system stored procedure sp_monitor that that collects, adds meaning, and stores the values in these system statistical variables. It does some very cool formatting that will show you the sample interval value (each time sp_monitor is executed constitutes a new sample interval) and even a percentage value for the cpu_busy, io_busy, and idle values.

 

You really could just go about the business of capturing these values daily into your own table and using the code from sp_monitor as a template for your own stored procedure. In the interest of keeping the tracking tool more transparent to other folks that might be interested in the data I have found that sticking with sp_monitor is useful.

 

spt_monitor

The bad news is sp_monitor does not keep a history. Each time the system stored procedure is executed it uses the values stored in the one row table master.dbo.spt_monitor to calculate the current interval values and then overwrites those values with new numbers just captured from the global system statistical function variables. This means that if you run sp_monitor every few seconds you can tell if the CPU is pegged from Query Analyzer – a sometimes very useful feature – but it also means that spt_monitor is useless for racking the statistics over time.

 

To rectify this missing feature of spt_monitor I have created a procedure that wraps a call to sp_monitor and an archival operation of the spt_monitor data to my own table in the admin database. I call this procedure from a SQL Agent scheduled job once a day. Then I have a stored procedure that duplicates the formatting of the sp_monitor output except that it includes specified rows (a date range) from the history table rather than the single row from spt_monitor. The code necessary to create the wrapping procedure and the reporting procedure can be found in my script sp_monitor history and historical performance report stored procedure.

 

One final word of caution about sp_monitor: after a SQL Server has been running a while some of the system statistical function variables will return a value that causes an overflow error to occur. The problem seems to stem from a situation where the system statistical function variable overflows its 4 bytes of integer storage space and “resets” itself to zero. You will get some unhappy math results in the system procedure sp_monitor when this happens.

 

Other possibilities

The system table sysperfinfo in the master database provides accumulated values for many SQL Server Performance counters. This is the same data that Perfmon uses. Some is at the server level and some is at the database level. It can get a little tricky to figure out how to use some of this data. If you are trying to track a particular problem that has clues in performance data you may want to take a look at this table. See my script sysperfinfo size and performance trend analysis for some ideas on how you might use this system table.

 

Additionally, fn_virtualfilestats can help you track IO at the file level. With this new function you can actually track IO down to the database level. This can be very useful if you have many databases on a server and you need to figure out which one(s) are causing the problematic disk queuing or which ones to move to the new server.

 

Hopefully you’re getting the idea that there are a number of tools already in your SQL Server available to help you administer your SQL Server.

 

Bill