Mastering SQL Server Performance Tuning

SQL Server Performance: Understanding and Managing Statistics

Effective database performance tuning relies heavily on accurate statistics. Statistics provide the query optimizer with information about the distribution of data in your tables and indexes, enabling it to create efficient execution plans.

What are Statistics?

SQL Server uses statistics to estimate the number of rows that a query will process. These estimates are crucial for the query optimizer to choose the most efficient way to access data. Statistics are based on histograms, density vectors, and other data distribution information.
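To see what the optimizer actually works with, `DBCC SHOW_STATISTICS` displays the contents of a single statistics object. The sketch below assumes the `Production.Customers` table and the `Stat_Customers_City` statistics object used as examples later in this tutorial; substitute your own names.

```sql
-- Show the header, density vector, and histogram for one statistics object.
DBCC SHOW_STATISTICS ('Production.Customers', Stat_Customers_City);

-- Restrict the output to the histogram only.
DBCC SHOW_STATISTICS ('Production.Customers', Stat_Customers_City)
WITH HISTOGRAM;
```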

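Key components include the header (metadata such as the last update time, the table row count, and the number of rows sampled), the density vector (a measure of how unique the values are for each leading set of columns), and the histogram (the distribution of values in the first column, stored in up to 200 steps).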

Why are Statistics Important?

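Out-of-date or missing statistics can lead to inaccurate row estimates, which in turn produce inefficient execution plans (for example, a scan where a seek would be cheaper, or a poorly chosen join strategy) and slower queries.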

Regularly updating and maintaining statistics is vital for consistent query performance.

Creating and Updating Statistics

SQL Server can automatically create and update statistics, but manual intervention is sometimes necessary.

Automatic Statistics Management:

The `AUTO_CREATE_STATISTICS` and `AUTO_UPDATE_STATISTICS` database options control this behavior. It's generally recommended to keep these enabled.
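As a minimal sketch, the current settings can be read from `sys.databases` and changed with `ALTER DATABASE`. The example targets the current database through `DB_NAME()` and `ALTER DATABASE CURRENT`, which assumes SQL Server 2012 or later; on older versions, name the database explicitly.

```sql
-- Inspect the statistics-related options for the current database.
SELECT
    name,
    is_auto_create_stats_on,
    is_auto_update_stats_on,
    is_auto_update_stats_async_on
FROM sys.databases
WHERE name = DB_NAME();

-- Enable automatic creation and automatic updates of statistics.
ALTER DATABASE CURRENT SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS ON;
```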

Manual Creation of Statistics:

You can manually create statistics using the `CREATE STATISTICS` statement.

CREATE STATISTICS Stat_Customers_City
ON Production.Customers (City)
WITH FULLSCAN;
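To confirm that the statistics object exists and to see which columns it covers, one option is to join `sys.stats` to `sys.stats_columns` and `sys.columns`. The table name below is the one from the example above.

```sql
-- List every statistics object on the table together with its columns.
SELECT
    s.name            AS StatisticsName,
    s.auto_created,   -- 1 = created automatically by the optimizer
    s.user_created,   -- 1 = created with CREATE STATISTICS
    c.name            AS ColumnName,
    sc.stats_column_id
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON sc.object_id = s.object_id
   AND sc.stats_id  = s.stats_id
JOIN sys.columns AS c
    ON c.object_id = sc.object_id
   AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID(N'Production.Customers')
ORDER BY s.name, sc.stats_column_id;
```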

Manual Update of Statistics:

Use the `UPDATE STATISTICS` statement to refresh statistics. `FULLSCAN` provides the most accurate but resource-intensive update, while `SAMPLE` uses a subset of data.

-- Update using a sample
UPDATE STATISTICS Production.Customers Stat_Customers_City
WITH SAMPLE 50 PERCENT;

-- Update using a full scan
UPDATE STATISTICS Production.Customers Stat_Customers_City
WITH FULLSCAN;

-- Update all statistics for a table
UPDATE STATISTICS Production.Customers;
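After an update, `STATS_DATE` offers a quick way to verify when each statistics object was last refreshed. For a database-wide refresh, `sp_updatestats` is one option; it runs `UPDATE STATISTICS` against the tables in the current database and skips statistics that have had no row modifications. The sketch below reuses the example table name.

```sql
-- Check when each statistics object on the table was last updated.
SELECT
    s.name AS StatisticsName,
    STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'Production.Customers');

-- Refresh statistics across the current database.
EXEC sp_updatestats;
```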

Managing Statistics

Monitoring and maintaining statistics is an ongoing process.

Identifying Stale Statistics:

You can query system views to find statistics that may need updating. A common indicator is the `modification_counter` in `sys.dm_db_stats_properties` relative to the number of rows.

SELECT
    s.name AS StatisticsName,
    OBJECT_NAME(sp.object_id) AS TableName,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    sp.steps,
    sp.unfiltered_rows,
    sp.modification_counter
FROM
    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE
    sp.modification_counter > 0
    AND sp.rows > 0
    AND CAST(sp.modification_counter AS DECIMAL(18,2)) / sp.rows > 0.20 -- Example threshold: more than 20% of rows modified
ORDER BY
    sp.last_updated ASC; -- Oldest statistics first
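Once stale statistics have been identified, a possible next step is to generate the corresponding `UPDATE STATISTICS` commands for review rather than running them blindly. This is a sketch only: the 20% threshold is the same illustrative value as above, and the output is text to inspect and run manually.

```sql
-- Build one UPDATE STATISTICS command per stale statistics object on user tables.
SELECT
    N'UPDATE STATISTICS '
    + QUOTENAME(OBJECT_SCHEMA_NAME(s.object_id)) + N'.'
    + QUOTENAME(OBJECT_NAME(s.object_id)) + N' '
    + QUOTENAME(s.name) + N';' AS UpdateCommand
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
  AND sp.rows > 0
  AND sp.modification_counter > 0.20 * sp.rows; -- Illustrative threshold
```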

Deleting Statistics:

While generally not recommended unless you have a specific reason, you can delete statistics using the `DROP STATISTICS` statement.

DROP STATISTICS Production.Customers.Stat_Customers_City;
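`DROP STATISTICS` raises an error if the named object does not exist, so a script that may run more than once can check `sys.stats` first. A minimal sketch using the same example names:

```sql
-- Drop the statistics object only if it is present on the table.
IF EXISTS (
    SELECT 1
    FROM sys.stats
    WHERE object_id = OBJECT_ID(N'Production.Customers')
      AND name = N'Stat_Customers_City'
)
BEGIN
    DROP STATISTICS Production.Customers.Stat_Customers_City;
END;
```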

Advanced Topics

Filtered Statistics: Create statistics on a subset of rows in a table, useful for queries that frequently filter on specific conditions.

CREATE STATISTICS Stat_Orders_Active
ON Sales.Orders (OrderDate)
WHERE Status = 'Active';
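To check that the filter was recorded as intended, `sys.stats` exposes the `has_filter` and `filter_definition` columns. The second query is a hypothetical example of a predicate that matches the filter; whether the optimizer actually uses the filtered statistics depends on the query (parameterized predicates, for instance, may prevent it).

```sql
-- Show filtered statistics on the table and their filter predicates.
SELECT
    s.name AS StatisticsName,
    s.has_filter,
    s.filter_definition
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID(N'Sales.Orders')
  AND s.has_filter = 1;

-- A query whose predicate matches the filter (date literal is illustrative).
SELECT COUNT(*)
FROM Sales.Orders
WHERE Status = 'Active'
  AND OrderDate >= '20240101';
```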

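Auto-Update Threshold: Automatic updates are triggered once enough rows have changed. At database compatibility level 130 and higher (SQL Server 2016 and later), the threshold for a table with n rows is the smaller of 500 + 0.20 * n and SQRT(1000 * n); older versions use the fixed 500 + 20% rule for tables with more than 500 rows. The `AUTO_UPDATE_STATISTICS_ASYNC` option does not change this threshold. It controls whether a query waits for the statistics update to finish or compiles with the existing statistics while the update runs in the background. The `STATS_STREAM` option is documented as unsupported and should not be used for tuning.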
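To get a feel for when an automatic update fires, the documented thresholds can be evaluated for a given row count. The sketch below assumes a hypothetical table of one million rows and the formulas Microsoft documents: 500 + 20% of rows for older compatibility levels, and the smaller of that value and SQRT(1000 * rows) at compatibility level 130 and higher.

```sql
-- Hypothetical row count; replace with the row count of your own table.
DECLARE @n BIGINT = 1000000;

SELECT
    500 + 0.20 * @n    AS LegacyThreshold,   -- 500 + 20% of rows
    SQRT(1000.0 * @n)  AS DynamicThreshold;  -- SQRT(1000 * rows)
```

For one million rows, the legacy rule requires about 200,500 modifications, while the dynamic rule triggers after roughly 31,600, which is why large tables refresh far more often under the newer behavior.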

By understanding and actively managing statistics, you can significantly improve the performance of your SQL Server database. Explore these concepts further to optimize your queries.