Question:           

What is parameter sniffing, and is it a bad thing?

Answer:

Parameter sniffing is actually a very good thing, most of the time.  When it does go bad, it’s usually extremely bad.  I’ll define, demonstrate, and provide some ways to avoid the bad examples of parameter sniffing.

So what is parameter sniffing?  The optimizer uses statistics histograms to estimate the cardinality of a given query.  One of the things the optimizer does is evaluate the values of the parameters being passed in.  This is “parameter sniffing.”

When the query or stored procedure containing a parameter is executed, the value initially supplied for the parameter is used when creating the execution plan.  Subsequent calls to this query/SP will use this execution plan stored in cache.  So, if the parameter value is one with high selectivity and represents the data as a whole, this is good.  However, if a value is passed that is not representative of the data, the query may suffer in performance.

The best way to understand this is by creating an example.   Using the AdventureWorks20XX database, a good example table is the Sales.SalesOrderDetail.  I’ll perform a query to list the ProductID’s distribution, create a stored procedure with one input parameter, and then execute the procedure to demonstrate good, and bad, parameter sniffing.  Finally, I’ll discuss some of the most popular ways to avoid bad parameter sniffing scenarios.

Query the Sales.SalesOrderDetail table to get a count distribution for the ProductID values:

SELECT SOD.ProductID
            , COUNT(*) as [Count]
FROM   SALES.SalesOrderDetail SOD
GROUP BY SOD.ProductID
ORDER BY COUNT(*) DESC;
GO

This returns:

ProductID           Count

870                       4688
712                       3382
873                       3354
many more rows
943                       6
942                       5
897                       2

So, from this query, we can see that ProductID 897 has only two entries, while ProductID 870 has 4,688 entries!  That’s quite a difference, let’s see if it matters.

Let’s create a stored procedure to query the Sales.SalesOrderDetail table on ProductID:

CREATE PROCEDURE DBO.TEST (@prod_id int)
AS
SELECT *
FROM [Sales].[SalesOrderDetail] SOD
WHERE SOD.ProductID = @prod_id;

Now, let’s execute it and pass the parameter value of 897:

EXEC DBO.TEST @prod_id = 897;  — 2 records

key-lookup plan

The execution plan reveals a key lookup.  For this query, the plan is good.  Remember though, key lookups require a lot of IO, so this may not be the best plan in all cases.

Looking at the Select operator properties, we see that the parameter – @prod_id was set to 897 when compiled as we would expect.

Select operator properties

Now, let’s take a look at an execution with a different parameter value – 870, and for grins, lets also turn Statistics IO on:

– CLEAR THE CACHE
DBCC FREEPROCCACHE
GO
SET STATISTICS IO ON;
GO
EXEC DBO.TEST @prod_id = 870;  — 4688 records
GO
SET STATISTICS IO OFF;
GO

Whoa!  While the optimizer used the existing execution plan in cache (key lookup), take a look at the logical reads it performed – 14,300!

(4688 row(s) affected)
Table ‘SalesOrderDetail’. Scan count 1, logical reads 14380, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The Select operator properties shows us that the compiled value for the plan generation was 897, but the runtime value was, of course, 870.

Select operator properties - high logical reads

This would be an example of parameter sniffing gone bad.  Executing the stored procedure with a value that has a high selectivity (2 records in our case) will cause the optimizer to choose an execution plan that may not perform well when other parameter values are entered.

In this example the compiled value of 897 produced the key lookup plan.  When the parameter value of 870 was entered, the optimizer said, “Oh, here’s an existing plan that uses a key lookup, cool, I’ll use it!”  And boom!  Instead of doing a key lookup of 2 records, we are now doing a key lookup for 4688 records!  The Statistics IO values returned shows the high IO due to the use of the same plan.

So, before we discuss what can be done to avoid this type of issue, let’s ask one more question:  What if I recompile the plan?  Okay, let’s do that and see what happens.

To cause a manual recompile of the stored procedure, we alter the stored procedure:

ALTER PROCEDURE [dbo].[TEST] (@prod_id int)
AS
SELECT *
FROM [Sales].[SalesOrderDetail] SOD
WHERE SOD.ProductID = @prod_id
OPTION (RECOMPILE);

Okay, so now when dbo.TEST is executed, it will recompile and use the newly passed parameter value to generate an execution plan.  Let’s execute it and check this functionality.

EXEC DBO.TEST @PROD_ID = 870  — 4688 records

Clustered Index Scan - option recompile

Now we see the newly generated execution plan based upon the parameter value of 897.  This plan uses a Clustered Index Scan operation.  Let’s take another look at the stored procedure execution using Statistics IO:

(4688 row(s) affected)

Table ‘SalesOrderDetail’. Scan count 1, logical reads 1242, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As you might expect, the logical reads has dropped dramatically, lowing the IO thus improving performance for this parameter value.

The downfall of this type of operation, where we recompile each time the stored procedure is called, is the fact that recompiles have overhead, CPU cycles, etc.  So is this the only way to avoid the case of bad parameter sniffing?  Nope.

Solutions:

Options are available to assist with the parameter sniffing issue.  Of course, we could use the example above where we recompile the statement for each execution, but there are other alternatives.

  • OPTION (RECOMPILE) – already presented
  • OPTION (OPTIMIZE FOR (@VARIABLE = VALUE))
  • OPTION (OPTIMIZE FOR (@VARIABLE = UNKNOWN))
  • Use local variables

The OPTION (OPTIMIZE FOR (@VARIABLE = VALUE)) tells the optimizer to compile the statement using the “VALUE” vs. using whatever is passed as a parameter.  This option may be useful when calling the stored procedure over-and-over with a particular value.

Here is an example of Optimize for VALUE:

CREATE PROCEDURE DBO.TEST (@prod_id int)
       AS
       SELECT *
       FROM [Sales].[SalesOrderDetail] SOD
       WHERE SOD.ProductID = @prod_id
OPTION (OPTIMIZE for (@prod_id = 897));

If we execute the SP and pass value 870:

EXEC DBO.TEST @prod_id = 870;  — 4688 records
GO

We see the following plan:

Key Lookup Plan Option Value

The execution of the SP essentially ignores the value of 870 and uses the plan for 897 instead.  In this case, we wouldn’t want to use the value of 870 often.

The OPTION (OPTIMIZE FOR (@VARIABLE = UNKNOWN)) allows the optimizer to use statistical densities rather than statistical histograms.  All parameters use the same estimated record count.  Some query plans may not be optimal because densities are not as accurate as histograms.  When this option is used, parameter sniffing is turned off.

Here is an example of Optimize for Unknown:

CREATE PROCEDURE DBO.TEST (@prod_id int)
       AS
       SELECT *
       FROM [Sales].[SalesOrderDetail] SOD
       WHERE SOD.ProductID = @prod_id
OPTION (OPTIMIZE for (@prod_id = UNKNOWN));

If we execute this SP, passing both the 870 and 897 values, we see something interesting:

Clustered Index Scan - option recompile

Both values use the same plan – Clustered Index Scan.  Looking at the XML output of the plan, we see the StatementEstRows under tag: StmtSimple:

<StmtSimple StatementCompId=”3″ StatementEstRows=”456.079″

This estimate rowcount is the same for both parameter values passed.  As the definition states:  “the same estimated row count” is used for all parameters.

Use Local Variables – when you assign the parameter passed to a local variable in the stored procedure code to a local variable, the optimizer uses the same number of estimated rows for each parameter.

When we use this option, we are effectively “turning off” parameter sniffing.

Here is an example of local variable usage:

CREATE PROCEDURE DBO.TEST2 (@prod_id int)
       AS
       DECLARE @PRODUCT_ID INT = @prod_id  — assign param to local variable
       SELECT *
       FROM [Sales].[SalesOrderDetail] SOD
       WHERE SOD.ProductID = @PRODUCT_ID

Summary

In this post I’ve described parameter sniffing, and shown that it is a normal operation performed by the optimizer.  Using the knowledge gained in these examples, and one or more of the solutions presented, you should be able to identify and remedy any suspect stored procedures.  The takeaway from this is to have alternatives when troubleshooting stored procedures where certain parameter values give less-than-optimal results.

Resources

Microsoft MSDN – SQL Server:
https://msdn.microsoft.com/library/mt590198.aspx

Microsoft MSDN – Stored Procedures:
https://msdn.microsoft.com/en-us/library/ms187926.aspx

Microsoft MSDN – Statistics and Plan Cache:
https://msdn.microsoft.com/en-us/library/cc293624.aspx
https://msdn.microsoft.com/en-us/library/ms190397.aspx

Book:
Microsoft SQL Server 2014: Query Tuning & Optimization, Benjamin Nevarez, McGraw-Hill, 2015.