Rotating Header Image

Statistics… Peeling Away More Of The Onion

So this post is a continuation of sorts of my last post on Updating Statistics before or after an index rebuild. I asked myself the question “How often should you run statistic maintenance in general? and do I have to run any statistic maintenance if I have auto update stats enabled?” OK I didn’t REALLY ask myself the question but I thought it would be an interesting blog post LOL. The general concensus out there seems to be something like this: If 20% of my data has changes auto update stats will update my statistics. (Some quotes on the topic below)

“SQL server starts Auto update stats when data modification for a table is more than 20 %. ”

“Statistics will only be updated by Auto Update when SQL Server thinks it needs to be done (which is typically when 20% of the rows have changed). If you are manually updating statistics pretty frequently, AutoUpdate may not fire very often, if at all.”

“However, you might not know that approximately 20% of a table needs to change before triggering an auto_update statistics run for the given table. If the table is really big, it’s possible that you’re simply not changing enough data.”

“It will be nice if we can change the 500+20% formula so we can have the auto update stats more often / less often”

So I thought why not put that to the test and see what is happening. Let’s first make sure that auto update statistics is enabled on our working database:

SELECT DATABASEPROPERTY(DB_NAME(), 'IsAutoUpdateStatistics') AS 'IsAutoUpdateStatisticsEnabled'

Make sure this value is set to 1 otherwise this test wont really mean much to you.

Now let’s set ourselves up a table to work with:

SET NOCOUNT ON 

DECLARE @RowID INT 
SET 
@RowID 

CREATE TABLE tblSQLServerPediaUsers
  
ID INT
  
FirstName NVARCHAR(50),
  
LastName NVARCHAR(50),
  
Address1 NVARCHAR(50),
  
City NVARCHAR(50)
  
CONSTRAINT PK_tblSQLServerPediaUsers PRIMARY KEY CLUSTERED (ID)) 

WHILE @RowID <> 10000 
  
BEGIN 
     SET 
@RowID @RowID 
     
INSERT INTO tblSQLServerPediaUsers(IDFirstNameLastNameAddress1,City
     
SELECT @RowID'John' 'Smith''123 SQLServerPedia Way'
'NHL City'
  
END

CREATE INDEX IX_City ON tblSQLServerPediaUsers(City)

And just for tracking purposes let’s see what our statistics look like:

DBCC SHOW_STATISTICS ('tblSQLServerPediaUsers', PK_tblSQLServerPediaUsers) WITH STAT_HEADER
DBCC SHOW_STATISTICS ('tblSQLServerPediaUsers', IX_City) WITH STAT_HEADER

OK so no statistics for our Primary Key (Clustered Index) and up to date statistics with a full scan for our non clustered index. OK so taking what seems to be general public acceptance that if you change 20% or more of the table data that auto update statistics should kick off let’s start a profiler trace and in your events selection under Performance let’s check off Auto Stats. This will tell us if/when auto stats gets executed. So let’s update more than 20% of the data in the table (while the profiler trace is running):

UPDATE tblSQLServerPediaUsers
SET    
City = 'Edmonton',
  
FirstName = 'Ales',
  
LastName = 'Hemsky',
  
Address1 = 'Rexall Place'
WHERE ID IN( SELECT TOP 2499 ID
              
FROM tblSQLServerPediaUsers
              
ORDER BY NEWID())

And if we look at the profiler trace we see Auto Stats ran but if we look at the TextData we see:

Updated: tblSQLServerPediaUsers.PK_tblSQLServerPediaUsers

Hmmmm the index IX_City didn’t appear to get updated. I updated more than 20% of the table and auto stats did happen but it only happened for my Primary Key. Let’s check our statistics just to be sure:

DBCC SHOW_STATISTICS ('tblSQLServerPediaUsers', PK_tblSQLServerPediaUsers) WITH STAT_HEADER
DBCC SHOW_STATISTICS ('tblSQLServerPediaUsers', IX_City) WITH STAT_HEADER

We see the Primary Key now has statistics (as expected) and we also confirmed that the IX_City did not get updated. Let’s see how this affects querying the tblSQLServerPediaUsers table (make sure to have “show execution plan” turned on as well):

SET STATISTICS IO ON
GO
SELECT FirstName, LastName FROM tblSQLServerPediaUsers WHERE City = 'Edmonton'

Execution Plan shows that our index (IX_City) is used and the Logical Reads is at 5011. So I guess all is right in the world right? Just for fun let’s see what kind of stats we would get if we forced the same query to use the primary key (clustered index)

SET STATISTICS IO ON
GO
SELECT FirstName, LastName FROM tblSQLServerPediaUsers WITH(INDEX=PK_tblSQLServerPediaUsers) WHERE City = 'Edmonton'

Execution Plan shows that our forcing the query to use the Primary Key (PK_tblSQLServerPediaUsers) worked but wait a sec… the logical reads is at 127. In our previous run why didn’t the query engine use the primary key instead. Let’s look at what the profiler trace has in it:

Using the IX_City Index:
CPU 47
READS 5011
WRITES 0
DURATION 61

Forcing the query to use PK_tblSQLServerPediaUsers Clustered Index:
CPU 15
READS 127
WRITES 0
DURATION 43

CPU is lower, reads is lower, and duration is lower. So why didn’t the query engine use PK_tblSQLServerPediaUsers instead of IX_city? This is because the auto update stats did not update ALL the statistics. Let’s run an sp_updatestats:

EXEC sp_updatestats

Now let’s run the original query again (without forcing the query to use the primary key):

SET STATISTICS IO ON
GO
SELECT FirstName, LastName FROM tblSQLServerPediaUsers WHERE City = 'Edmonton'

Look at that… the query now uses the primary key instead of using the IX_City index.

So back to our original question of “Do I have to update my statistics if I have auto update stats enabled” My answer stays the same (yes) but now I have a working example to argue my point of view with.

Disclaimer (lol): I’m the first to admit that I don’t know absolutely everything there is to know about the query engine so please feel free to comment and provide some further information/details that I might be leaving out, missing, or TOTALLY wrong about.

Enjoy!!

Post to Twitter Post to Delicious Post to Digg Post to StumbleUpon

4 Comments

  1. Murali says:

    Hi Colin,

    I was under belief that whenever we rebuild the indexes , statistics get updated. is this right or wrong

    Regards
    Murali

  2. Murali

    Thanks for the comment and to answer your question statistics on indexes get updated when you rebuild your index. Statistics on non-indexed columns do not get updated when you rebuild your index.

    My last post http://benchmarkitconsulting.com/colin-stasiuk/2009/02/11/update-statistics-before-or-after-an-index-rebuild/ has a working example to show you how this happens.

    In my example here I built my index but then modified a large percentage of the data in the table (without rebuilding my index again) so when I didn’t update my statistics the query engine chose a less efficient plan to execute my query. Then once I updated my statistics it chose a more efficient way to execute.

    Thanks again for the comment/question

  3. Warlalten says:

    Hello, I can’t understand how to add your blog in my rss reader
    ————————
    sponsored link: http://pedeno.ru/

Leave a Reply

Twitter links powered by Tweet This v1.6.1, a WordPress plugin for Twitter.