SQL Server has been around over 30 years, and I’ve been working with SQL Server for almost as long. I’ve seen a lot of changes over the years (and decades!) and versions of this incredible product. In these posts I’ll share with you how I look at some of the features or aspects of SQL Server, sometimes along with a bit of historical perspective.
Check out Kalen's recent blogs on Problematic Operators here.
Plans for SQL server diagnostics can be expensive to create, as the query optimizer has to be able to find a good plan for any legal query that is submitted. The optimizer evaluates multiple join order, multiple indexes. and various types of join and grouping algorithms, depending on your query and the tables involved. If the same query is rerun, SQL Server can save a lot of resources by reusing an existing plan. But it’s not always possible to reuse an existing plan, and it is not always a good thing to do so. In the next two articles, we’ll look at when a plan is reused and when it is recompiled.
First, we’ll look at the different flavors of plans and the metadata view that I use most often to look at what’s in my plan cache. I have written a view of my own that provides the information that I find most useful most often. SQL Server caches six different types of query plans, but only two are normally used for plan cache tuning. These are COMPILED PLAN and COMPILED PLAN STUB. My view filters out all but these two types of cache objects. COMPILED PLANS come in three varieties: AD HOC, PREPARED, and PROC. I will comment on all three kinds.
Even if we’re just looking at COMPILED PLANS, there are still a lot of plans in cache that usually need to be ignored, as they are generated by SQL Server itself. These include plans looking for filestream or full-text search indexes or internal queries working with In-memory OLTP. So, my view adds filters to try and wean out most of the plans I’m not interested in. You can download a script to build this view, called sp_cacheobjects, from here.
Even with all the filters my view uses, there are still some of SQL Server’s own internal queries in cache; I usually clear the plan cache out frequently when doing testing in this area. The simplest way to clear ALL the plans from cache is with the command: DBCC FREEPROCCACHE.
Adhoc Compiled Plans
The simplest type of plan is Adhoc. This is used for basic queries that don’t fit into another category. If you’ve downloaded my script and created my sp_cacheobjects view, you can run the following. Any version of the AdventureWorks database should work. This script makes a copy of a table and builds a couple of unique indexes on it. It also massages the SubTotal amount to remove any decimal digits.
DROP TABLE IF EXISTS newsales;
-- Make a copy of the Sales.SalesOrderHeader table
SELECT * INTO dbo.newsales
SET SubTotal = cast(cast(SubTotal as int) as money);
CREATE UNIQUE index newsales_ident
CREATE INDEX IX_Sales_SubTotal ON newsales(SubTotal);
-- Adhoc query plan reuse
-- adhoc query
SELECT * FROM dbo.newsales
WHERE SubTotal = 4;
SELECT * FROM sp_cacheobjects;
In my output, you see two Adhoc plans. One is for the SELECT statement from the newsales table, and the other is for the SELECT from my sp_cacheobjects view. Because the plan is cached, if the EXACT same query is run again, the same plan can be reused and you will see the usecounts value increase. However, there is a catch. In order for an Adhoc plan to be reused, the SQL string must be absolutely exactly the same. If you change any characters in the SQL, the query is not recognized as the same query, and a new plan is generated. If I even add a space, include the comment, or a new line break, it’s not the same string. If I change the case, that means there are different ASCII code values, thus not the same string.
You can try this out yourself by running different variations of my first SELECT statement from the newsales table. You’ll see a different row in the cache for each one. After I ran several variations–changing the number I was searching for, changing the case, adding the comment, and a new line, I see the following in cache. The SELECT from my view is being reused, but everything else has a usecounts value of 1.
One additional requirement for Adhoc query plan reuse is that the session running the query must have the same SET options in effect. There is another column in the output you can see to the right of the query text, called SETOPTS. This is a bit string with a bit for each relevant SET option. If you change one of the options, for example, SET ANSI_NULLS OFF, the bit string will change and the same plan with the original bit string cannot be reused.
Prepared Compiled Plans
The second type of cached compiled plan is a PREPARED plan. If your query meets a certain set of requirements. It can actually be automatically parameterized. It shows up in the metadata as PREPARED and the SQL string shows a parameter marker. Here’s an example:
The PREPARED plan shows the parameter marker as @1 and does not include the actual value. Notice there is a row for an ADHOC query with an actual value of 5555, but that is actually only a ‘shell’ of the real query. It does not cache the entire plan but only the query and a few identifying details, to help the query processor find the parameterized plan in the cache. Notice the size (pagesused) is much smaller than the PREPARED plan.
The default parameterization mode, called SIMPLE parameterization, is extremely strict about what plans can be parameterized. It is really only the simplest of queries that are parameterizable by default. Queries that contain JOIN, GROUP BY, OR, and many other relatively common query constructs prevent a query from being parameterized. In addition to not having any of these constructs, the most important thing for SIMPLE parameterization is that the query is SAFE. This means that there is only one possible plan no matter what values are passed for any parameters. (Of course, a query without any parameters could also be SAFE.) My query is looking for an exact match on the column SalesOrderID, which has a unique index on it. So the existing nonclustered index could be used to find any matching row. No matter what value I use, 55555 or something else, there will never be more than one row which means the plan will still be good.
In my Adhoc query plan example, I was looking for matching values for SubTotal. Some SubTotal values occur a few times or not at all, so a nonclustered index would be good. But other values could occur many times, so the index would NOT be useful. Thus, the query plan is not SAFE and the query cannot be parameterized. That is why we saw an Adhoc plan for my first example.
IF you have queries with JOIN or other disallowed constructs, you can tell SQL Server to be more aggressive in parameterizing by changing a database option:
ALTER DATABASE AdventureWorks2016 SET parameterization FORCED;
Setting your database to FORCED parameterization means SQL Server will parameter a whole lot more queries, including those with JOIN, GROUP BY, OR, etc. But is also means that SQL Server can parameterize a query that is not SAFE. It may come up with a plan that is good when only a few rows are returned, and then reuse the plan when many rows are returned. This can end up with very suboptimal performance.
One final option for a prepared plan is when you explicitly prepare a plan. This behavior is usually invoked through an application with SQLPrepare and SQLExecute APIs. You specify what is the query with parameter markings, you specify the data types and you specify the specific values to use. The same query can then be run again with different specific values and the existing plan will be used. Although using explicitly prepared plans can be could for those cases where SQL Server is not parameterizing and you wish it would, it doesn’t prevent SQL Server from using a plan that is NOT good for subsequent parameters. You need to test your queries with many different input values, and make sure you get the expected performance if and when a plan is reused.
The metadata (e.g. my sp_cacheobjects view) just shows PREPARED for all three type of plans: FORCED and SIMPLE autoparameterization and EXPLICIT parameterization.
Proc Compiled Plans
The final objtype value for Compiled Plans is for a stored procedure, which is shown as Proc. When possible, stored procedures are the best choice for reusable code, due to their ease of management from the server itself, but that doesn’t mean they are guaranteed to always give the best performance. Just like using the FORCED parameterization option (and also the explicit parameterization), stored procedures use ‘parameter sniffing’. This means the first parameter value passed in determines the plan. If subsequent executions perform fine with the same plan, then parameter sniffing is not a problem and can actually be beneficial because it saves us the cost of recompiling and reoptimizing. However, if subsequent executions with different values shouldn’t use the original plan, then we have problem. I’ll show you an example of parameter sniffing causing a problem
I’ll create a stored procedure based on the newsales table we used earlier. The procedure will have a single query, that filters based on the SalesOrderID column, on which we built a nonclustered index. The query will be based on an inequality, so for some values the query can return just a few rows, and use the index, and for other values, the query can return a LOT of rows. In other words, the query is not SAFE.
DROP PROC IF EXISTS get_sales_range;
CREATE PROC get_sales_range
SELECT * FROM dbo.newsales
WHERE SalesOrderID < @num;
I’ll use the option SET STATISTICS IO ON to see how much work is being done when the procedure is executed. First, I’ll execute it with a parameter that only returns a few rows:
SET STATISTICS IO ON
EXEC get_sales_range 43700;
The STATISTICS IO value reports that it took 43 logical reads to return 41 rows. This is normal for a nonclustered index. Now we execute the procedure again with a much bigger value.
EXEC get_sales_range 66666;
SELECT * FROM sp_cacheobjects;
This time, we see that SQL Server used a whole lot more reads:
In fact, a table scan on the newsales table only takes 843 reads, so this is far worse performance than a table scan. The sp_cacheobjects view shows us that the PROC plan has been reused for this second execution. This is an example of when parameter sniffing is NOT a good thing.
So, what can we do when parameter sniffing is a problem? In the next post, I’ll tell you when SQL Server comes up with a new plan, and doesn’t reuse old ones. We’ll look at how you can force (or encourage) recompilation, and also, we’ll see when SQL Server automatically recompiles your queries.