Archive for the ‘SQL Server 2008’ Category

Sunday T-SQL Tip: Inline vs. Multi-statement Table Valued Functions

Sunday, October 23rd, 2011

One of the biggest challenges for the developers who are not familiar with T-SQL is understanding the conceptual difference between client side and T-SQL functions. T-SQL functions look very similar to the functions developed withhigh-level programming languages. While encapsulation and code reuse are very important patterns there, it could hurt database code badly.

There are 2 kinds of functions in Microsoft SQL Server that can return table result set. The good one and the bad one. Unfortunately the bad one is much easier to use and understand for people who used ti work with high-level languages.

Let’s take a look. First, let’s create 2 tables and populate them with the data. Don’t put much attention how good is the data and how logically correct are the statements - we’re talking about performance here.

Now let’s create the multi-statement function here and run it. As you can see, total execution time is 176 millisecond in my environment.

Now let’s do inline function. We need to change the original select statement and use cross apply here. Looks more complex but at the end - execution time is 106 milliseconds - about 40 percent faster.

Now let’s check the execution plans - as you can see - first plan (multi-statement) is very simple - CI scan + aggregate. Second (inline) introduces much more complicated execution plan. Also it worth to notice that SQL Server shows that second plan takes all the cost.

How could it happen? How less expensive and simpler plan could run slower? The answer is that SQL Server lies - it does not show multi-statement function executions there at all. Let’s run the profiler and start to capture SP:Starting event.

As you can see - multi-statement function introduces SP call for each row processed. Think about all overhead related with that. Inline functions are working similarly to C++ inline functions - those are “embedded” to the execution plan and don’t carry any SP calls overhead.

So the bottom line - don’t use multi-statement functions if possible. I’m going to start the set of the posts related with CTEs - and will show how you can convert very complex multi-statement functions to inline ones.

Source code is available for download

Update (2011-12-18):
As Chirag Shah mentioned in comments, my example above is not 100% valid. I demonstrated the difference between Inline TVF and Scalar Multi-Statement function. So let’s correct that and and run the test again. (Image is clickable)

As you can see, results are even worse. The main point I want to stress - as long as UDF body has begin/end keywords, SQL Server treats them similarly to stored procedures. And that hurts.

Source code has been updated to include the last example

Sunday T-SQL Tip: Merge into CTE as the Target

Sunday, October 9th, 2011

If you spent some time working with Microsoft SQL Server 2008, you should be aware of the Merge statement. This statement is not only allowing you to insert/update/delete data as the part of the single statement (which helps with locking and performance), it also gives you an ability to intercept column values from the source rowset - something you cannot do with regular OUTPUT clause of insert, update and delete statement.

Today I’d like to show you another hidden beauty of this statement - ability to use CTE as the Target. Basically it gives you an ability to execute merge against subset of the data from the table. There are quite a few cases where it could be beneficial - let’s think about the situation when you need to synchronize target with the source that contains data only for subset of target rows. Confusing? Let’s think about one real life example.

Let’s think about order entry system and assume that you want to have a cache and store the information about last 15 orders per customer in the system. Let’s create the table and populate it with some data.

In this example orders are sorted by ID and partitioned by customers - so bigger ID means more recent orders. As you can see - you have 100 customers with 15 orders each in the cache.

Let’s assume that every day you get the data about the new orders placed into the system. This data contains the orders for subset of the customers (obviously some customers don’t place orders that day). It could also have the orders from the new customers that you don’t have in the cache. Let’s create the table:

As you can see, in this example we added 10 orders per customer for 21 old customers (CustomerIds from 80 to 100) as well as added 10 new customers (CustomerIds from 101 to 110).

What do we want to have at the end is to update the cache for existing customers (delete first 10 old orders) and add new customers to the cache. Obviously we don’t want to touch customers who did not submit any orders during the day.

Merge statement would work perfectly here. Although if we use Data table as the target, we will have hard time to differentiate the customers who didn’t submit any data. Fortunately we can put CTE that filters out customers who don’t have any orders today and use it as the target. Let’s take a look:

So, first CTE - SourceData - does the trick - it filters out everybody who don’t have the new orders. This would be our Target. Now let’s prepare the Source - first what we need to do is to combine data from the cache with the new data - MergedData CTE does that. As result of this CTE we’ll have all old and new orders combined for the customers who submits the orders today. Next - we need to determine most recent 15 orders - basically let’s sort MergedData (use ROW_NUMBER()) based on ID in descending order. Here is SortedData CTE. And now we can use first 15 rows per customer from this CTE as the Source.

The trick is what to do next - if there is the order in SourceData that is not in the Source (top 15 from SortedData) - it means order is old and we need to delete it from the cache. “When not matched by source” does that. If order is in the Source but not in the cache - we need to insert it (”when not matched by Target“). Obviously if order is in the both places, we should ignore it. And now, if you think about SourceData CTE which is the Target for the merge - it makes perfect sense. In case if you use the dbo.Data table there - all orders from the customers who did not submit data today would not be matched by Source and would be deleted. So CTE as the Target takes care of it.

If you look at the data, you’d see that new customers (CustomerID > 100) have 10 rows in the cache with ID starting at 16. Old customers who submitted data today (CustomerID: 80..100) have last 15 orders - with ID from 11 to 25. And old customers data (CustomerID < 80) is intact.

Source code is available for download

Locking in Microsoft SQL Server (Part 10 - What isolation level should I choose?)

Monday, September 26th, 2011

Wrapping up.. Only remaining question is “What isolation level is good for me”? And there is no right answer to that question. As usual, it depends. I’ll try to summarize a few things but again, think about specific details of your system and make decision based on them, not on generic advices below.

When you start to think what isolation level should you use in your system, there are 2 questions you need to answer. First, what is the blueprint of your system - is it OLTP or Data Warehouse type system? OLTP systems usually handles operational activity of the company and serves high volume of short identical queries and short transactions. Data Warehouse blueprint described reporting type system with low volume of long transactions and complex queries. In OLTP systems data is constantly changing, in Data Warehouse systems it rarely the case - data usually updates on the batches based on some schedule.

As the example, let’s think about online store. The database that handles customer facing shopping cart web site is OLTP. There are always new orders coming, existing order updating, customers check the status of existing orders, search the article lists, etc - those are done within short transactions and simple queries. Assuming same time company executives want to run some reports, see some trends and other analysis - there is the good chance that it would be another database for that purpose. This database would be optimized for complex reporting and data in that database could be refreshed every night or perhaps on the weekly basis. That database is Data Warehouse.

In real life, of course, it’s rarely the case. There is always some reporting activity against OLTP database but in any case, you can see what blueprint fits better. And it leads to the second question you need to answer: “How much data consistency do I really need?”. And even if the answer “I don’t need any consistency” is quite popular, it rarely the case in the real life. So let’s dive a little bit more in “no-consistency” mode.

No-consistency in terms of transaction isolation levels mean read uncommitted. Either directly as “set transaction isolation level” or with (NOLOCK) hints. There are some cases when you can decide to use that isolation level - for example, in our system we have a few transaction entities where data has been inserting to our system and never ever updating/deleting after that. Same time, clients are constantly downloading the data and we don’t really care if clients get the data from uncommitted transaction. This is the case when we can downgrade to read uncommitted isolation level to reduce the blocking for the client downloading sessions.

But if you think about that example - it’s rather exception than the rule. In most part of the systems you want to have consistency. Think about reporting for executives - how happy they would be if you provided them incorrect data or, even better, if they get different results running the same report twice? And the real problem that in the real life people often switches to read uncommitted isolation level to solve the blocking issues. Right, it could help in the cases when select queries (readers) are blocked by update queries (writers). But it’s rather masking the problem than solving it. As we already know, in most part of the cases, locking and blocking issues triggered by non-optimized queries. So, if you had a chance, you’d better spend some time on query optimization. Of course, this advice would not help if you are in fire-drill mode and need to fix the blocking issues in production system but in such case there is another option that could work even better - use read committed snapshot instead of read uncommitted. In that mode writers also don’t block readers and same time give you statement level consistency. Of course there are some performance implications but in most part of the cases you can live with them.

Read committed, which is also default isolation level, in most part of the cases (again, assuming the system is more or less optimized) - this is the good choice especially for OLTP systems. It gives you acceptable compromise between consistency and concurrency. Again, there is blocking involved but when system is optimized - it’s minimal. Speaking of higher isolation levels - repeatable reads and serializable - those are typically bad choice for OLTP. For reporting and data warehouse systems those could be acceptable but same time for such systems optimistic isolation levels (read committed snapshot and snapshot) are better.

And speaking of optimistic isolation levels. For Data Warehouse type systems - use them. I don’t think about any single reason why you would like to avoid them in such systems. For OLTP - consider them. If you can live with performance overhead - it could be the good choice. Be careful - don’t forget about extra 14 bytes and don’t use fillfactor = 100 though.

So, the bottom line. For Data Warehouse - use optimistic isolation levels whenever possible. Only case with Data Warehouse systems when I would suggest to consider different options is when data in the system updates on the real time. And even in such case give optimistic isolation levels the try. For OLTP - if you can use optimistic isolation levels - use them (start with read committed snapshot). If not, use read committed and optimize the queries. Don’t use read uncommitted and (nolock) hints unless you don’t care at all about consistency. And definitely don’t use read uncommitted to reduce blocking. This is the bad choice.

Last, but not least, don’t forget that you can, and often need to use multiple different isolation levels in the system. Use them wisely! :)

Part 11 - Deadlocks due multiple updates of the same row

Table of content

Locking in Microsoft SQL Server (Part 9 - Optimistic transaction isolation levels - TANSTAAFL!)

Thursday, September 8th, 2011

Two week ago we discussed 2 “optimistic” transaction isolation levels - Snapshot and Read Committed Snapshot. We already saw how those isolation levels can provide us transaction and statement level consistency reducing blocking issues same time. Sounds too good to be true..

So does it make sense to switch to optimistic isolation levels? Well, the answer is typical - “it depends”. There are a few things you need to keep in mind before you make the decision. Let’s talk about them from both - DBA and Database Developer standpoints.

First of all, as we already know, optimistic isolation levels use tempdb for the version store. When you modified the row, one or more old versions of the row is stored in tempdb. This leads to the higher tempdb load as well as to the larger tempdb size. Would it be the issue for your system? Again, it depends. As the bare minimum you should reserve enough space for tempdb and closely monitor the load. There are a few performance counters under <SqlInstance>:\Transactions section that related to the version store - they will show you version store size, generation and cleanup rates and a few other parameters. Those are very useful and I recommend to add them to your baseline.

Second thing you need to keep in mind is that SQL Server needs to store additional 14 bytes pointer to the version store in the. It’s not only increasing the row size, it also can introduce page splits. Let’s take a look. Let’s use the same database from the last blog post - you can download it from the last post.

Let’s rebuild the index (with default FILLFACTOR=100) and look at the index statistics (click on the image to open it in the different window). As you can see, it has 0% fragmentation. Row size is 215 bytes.

Now let’s run transaction in snapshot isolation level and update Value column. This is integer (fixed width column) so this update by itself should not increase row size. If you look at the index statistics now, you can see that there were heavy page splits and row size increased by 14 bytes. Those 14 bytes is the pointer to the version store.

The bottom line - if you use optimistic isolation level, don’t use 100% fillfactor for the indexes.

Speaking of development challenges - well, it’s become a little bit more interesting. First potential program is referential integrity based on triggers. Let’s take a look. Let’s create 2 tables - MasterData and DetailData and after insert trigger on DetailData. In this trigger let’s check that master data exists and rollback transaction in case of referential integrity violation. Let’s test that:

Now let’s move to more complex scenario and 2 sessions. First, let’s start transaction in the 1st session and check if we have MasterData row.

Let’s keep transaction open and in the second session let’s delete master data row. As you see everything is just fine.

Next, lets come back to the first session and insert detail row that references the master row - as you can see there is no errors but referential integrity has been compromised.

It happens because inside the trigger we still are in the context of the old transaction where we reference old version of MasterData row from the version store. This could be easily fixed in the trigger by using (READCOMMITTED) query hint but of course you should remember it and modify the code. It worth to mention that regular referential integrity based on foreign keys uses read committed isolation level by default.

Another issue is the update of the same row. Let’s take a look. first let’s reset Master and Detail table data. Now let’s start transaction in the first session and query the data

Next, let’s update data row in another transaction from another session

And now let’s try to update the same row from the first session and commit the transaction.

As you can see, behavior is completely different from the regular pessimistic isolation levels - it raises the exception. So obviously the client application needs to catch those errors and either notify users or implement some sort of retry logic to handle it.

And finally let’s look at the different update behavior in snapshot isolation mode. Let’s start the transaction assuming we have 2 rows in the table

Next, in another session let’s run update that changes DetailDataId to 2 for the first row.

Now in the first session let’s do the opposite action and check results


As you see, because of the row versioning it simply swaps the values. It would be completely different with regular pessimistic isolation levels when one session would be blocked and next update either 0 or 2 rows (depend on what session acquires the lock first). The bottom line - if you move your application to snapshot isolation level, you need to test how it would behave in environment with the multiple users. Otherwise you’d have a few nice side effects.

So to summarize - optimistic isolation levels are great but you have to keep a few things in mind:

  • Extra tempdb load
  • Possible page splits/fragmentations due bigger row size
  • Referential integrity based on triggers does not work unless read committed hint is used
  • There are different behaviors for updates when multiple sessions update the same rows and when scan is involved.

Source Code is available for download

Part 10 - What isolation level should I choose?

Table of content

Locking in Microsoft SQL Server (Part 8 - Optimistic transaction isolation levels)

Thursday, August 25th, 2011

There was some time since we discussed locking and blocking in Microsoft SQL Server. One big area we did not cover is optimistic transaction isolation levels. So what exactly are those?

As you remember, standard “pessimistic” isolation levels - read uncommitted, read committed and repeatable read - all of them can have anomalies. Dirty and non-repeatable reads, ghost rows and duplicated reads. Higher isolation levels reduce anomalies in price of reduced concurrency. Most restrictive - Serializable isolation level guarantees no anomalies although with such level concurrency greatly suffers. Before SQL Server 2005 it was virtually impossible to have consistent reporting in the systems with heavy reading and writing activities.

Obviously in quite competitive market Microsoft had to do something to solve that problem. As the result, one of the features Microsoft introduced in SQL Server 2005 was new “optimistic” transaction isolation levels. With those isolation levels readers use row versions rather than locking. When you start the transaction and modify some rows, old version of the rows are copied to the version store in tempdb. Actual row holds 14 bytes pointer to the version store.

There are 2 different isolation levels - well, actually one and half. First one is called Read Committed Snapshot and not really the level but more or less the mode of Read Committed. This level implements statement level consistency - readers would not be blocked by writers but in transaction scope it could have same anomalies as regular read committed (with the exception of duplicated reads). Let’s take a look. If you don’t have test database, create it first. First, let’s enable read committed snapshot isolation level. It worth to mention that when you switch the mode, you should not have any users connected to the database.

Next, let’s create the table and populate it with some data

Now let’s start one session and update one of the rows. As you can see, this session starts in read committed isolation level

Let’s look at the locks we have in the system. As you can see, we have exclusive lock on the row. This is expected and should happen with any isolation level. Click to the image to open it in the different window

Now let’s start another session and try to select the same row.

And here is the big difference with regular read committed. Instead of being blocked as when we have pessimistic isolation level, it returns old version of the row with placeholder equal ‘a’. So far, so good. Now let’s run another update.

As you can see, with read committed snapshot we still have writers acquiring U and X locks - so those statements would be blocked as with regular pessimistic isolation levels. What does it really mean for us? If your system suffers because of the blocking between readers and writers, read committed snapshot helps in such situation - blocking would be reduced. This is especially good if you need to deal with 3rd party vendors databases - you can change the database option and (partially) solve the problem. Although read committed snapshot would not help if writers block each other. you’ll still have typical update and exclusive locks behavior.

One other thing I already mentioned - read committed snapshot does not provide you consistency on the transaction scope. Consistency is guaranteed only on the statement scope. Let’s check that:

Now let’s commit the session 1 (I don’t put screen shot here) and run select statement again. We’re on the same transaction context.

As you can see, select returns new (updated) row value. Again, no transaction level consistency in such case. If you need full consistency on transaction level, you need to use SNAPSHOT isolation level. With such isolation level SQL Server stores multiple “old” versions of the row - as long as there are transactions that can reference them. When session references the row, it reads the version that was valid on the moment when transaction started. It has “snapshot” of the data for the transaction.

Let’s take a look. First, you need to enable that option on database level. Again, you need to be on single user mode in order to do so.

Next, let’s run same update statement and keep transaction uncommitted.

Now let’s go to another session, start transaction in snapshot isolation level and see what happens.

First, let’s select of the row updated by another session that holds X lock. As you can see - same behavior with read committed snapshot - it returns old value.

Now let’s try to update another row with table scan. As you can see - no blocking due U/X lock incompatibility now. Both rows have been updated, 2 X locks held

Now let’s commit transaction in the first session and run select in the second session again. It still returns the data that you had at the time when transaction was just started. Even if that row has been modified.

And now let’s delete that row in the first session.

If you run same select in the second session - it still returns that row.

As you can see, this is consistency on the transaction level. SQL Server guarantees that data within the transaction would be exactly the same during transaction lifetime. Even if it looks similar to Serializable isolation level, there is the key difference. With Serializable level you “lock” the data on the moment you access it. With Snapshot, it “locked” on the moment when transaction started. And obviously, this is the Heaven for reporting - consistent data with no blocking.

Too good to be true. Yes, there are a few things you need to keep in mind from both DBA and Development prospective. And we will talk about them next time.

Source code is available for download

Part 9 - Optimistic transaction isolation levels - TANSTAAFL! 

Also take a look at Part 15 - When Transaction Starts. It gives some additional details about snapshot isolation level

Table of content

Locking in Microsoft SQL Server (Part 7 - Read Committed - duplicate readings)

Thursday, August 4th, 2011

One of very interesting phenomenon we have not discussed earlier when we covered lock types and isolation levels  is duplicate readings. As we already know, read committed isolation level protects us from reading of modified and uncommitted data but the scope of protection is “this row/right now”. So what happens if row updates during the time when query is executed.

Let’s take a look. First, as usual, let’s create the table and populate it with 100,000 rows:

As you can see, even if we don’t have any unique constraints, that table has unique ID and Value columns (equal to each other) with values from 1 to 100,000. We can even run SQL query and see that there is no duplications.

Now let’s start another session and lock one row:

As you can see, we don’t commit transaction, so exclusive (X) lock is in play. Now let’s start another session, create temporary table and simply copy data from our main table there. When we run this statement in read committed mode, it would be blocked because there it tries to acquire shared (S) lock on the row that held (X) lock from the previous session.


Now let’s come back to the previous session and update another row (value column this time) and commit transaction.

As you can see, first (X) lock has been released and our blocked session resumes and finishes copying the data. And now, let’s see what we have in this table -

Surprise! We have 1 extra row - the one the first session updated. This row appears twice - with old and new values in Values column. In order to understand what happened, let’s take a look at the execution plan for insert/select statement.

As you can see, plan uses index scan. Let’s think what happened - select read first 999 rows (with ID from 1 to 999) and was blocked by (X) lock from the different transaction. Next, that transaction updated Value column for the row with ID = 1. Because Value is the index key, that index row has been moved in the index. As result, when select resumes, it read this row (with ID = 1) second time during remaining part of the index scan. Fun?

With read uncommitted another factor could come in play - sometime SQL Server can decide to use allocation scan. There are a lot of factors that can lead to that decision - but if it happens and if you’re lucky enough to have page split during query execution, you can easily have rows to be read/processed more than once.

Fortunately neither of those cases happen quite often. I’d say I saw duplicate readings in production systems maybe 2-3 times during my career. But it could give you a lot of grey hairs - think what if #Temp table had primary key on ID and you would get primary key violation on the data that guaranteed to be unique? If you want to be protected from that you have to use optimistic isolation levels. I (promise another time) is going to blog about them next time :)

Source code is available for download

Part 8 - Optimistic transaction isolation levels

Table of content

Sunday T-SQL Tip (on Thursday): Set vs. Select

Thursday, July 21st, 2011

It’s been a while since I posted any Sunday T-SQL Tips. Well, let’s do it today even if it’s Thursday :). I feel a little bit guilty stopping to blog about locking and blocking. I promise, I will blog about optimistic isolation levels in 2 weeks.

Today I want to talk about 2 very similar operators: SET and SELECT. If you ask T-SQL Developer what to use, in most part of cases she/he answers that it does not really matter. Well, it’s correct - up to degree. Those operators work in the same way unless they work differently. Let’s take a look.

First, let’s create a table and populate it with 50 rows.

Now, let’s look at the similarities. Let’s query the row. As you can see, both operators work the same way. Plain and perfect.

Now let’s change the query and try to select the row that does not exist in the table.

And here is the first major difference - SET operator “sets” variable to null. SELECT kept old value. It makes sense, if you think about it - SET assignes result from subquery that did not return any data. But more interesting behavior is when query tries to select multiple rows. Let’s run select first.

As you can see, it processes 9 rows and at the end variable has the value from the last one. Let’s check SET operator:

As you can see - it failed. Again, makes sense if you think about it from subquery standpoint.

Let’s think about it from the practical angle. Assuming you need to query some row from Users table based on Name. If today Name is unique (and has unique constraint/index on it) it does not really matter what to use SET or SELECT. Although, what will happen with the code that uses SELECT if in one day, name stops to be unique? Obviously it introduces hard-to-find bugs/side effects - system just picks up one of the rows. Using SET in such case helps - code simply fails. Obviously it will require troubleshooting but at the end it will take much less time to pinpoint the problem. Of course, you can check @@RowCount (and it’s the only choice if you need to select/assign more than 1 variable as part of the query), but with 1 variable SET is more elegant (based on my opinion).

Same time, when I need to assign constants to the multiple variables, I prefer to use select. Less typing and probably a couple microseconds of performance improvement due less operators.

Code can be downloaded here

Ad-hoc SQL and plan cache

Thursday, July 7th, 2011

Let’s take a break from locking and blocking issues and talk about Ad-hoc sql. Everybody knows that it’s bad. Most popular horror story is security hole introduced by SQL Injection. There are a lot of articles available on such subject - just google it. But today I’d like to show another problem related with that implementation - recompilations and plan cache flood. Let’s take a look.

First, let’s create our favorite Data table and populate it with some data.

Second, let’s clear the plan cache and make sure that there are nothing there that accesses Data table.

As you see, clean and clear. Now let’s have some fun and introduce really bad client side code. This code basically runs 1000 ad-hoc selects against that table.

Let’s run it and query the plan cache again. As you can see, now we have 1000 cached plans.

So let’s think about it for a minute - we forced SQL Server to compile 1000 different queries (which is quite expensive in terms of CPU usage), but besides that we just grab about 25MB of the server memory. Huh? Well, of course, those one-time-use plans would be removed from the cache rather sooner than later, but think what if you have the system that constantly runs those ad-hoc queries.. ?Alternatively, if we use parameters, you can see that there will be only 1 plan in the cache that was (re)used 1000 times. Don’t forget to clear cache before you run the second code snippet.

So what are the options if we cannot change the client code? First, of course, is forced parametrization. Although this is completely different can-of-worms and can introduce a lot of side effects related with parameter sniffing and bad plans. It deserves own blog post sometime later.

Another option is enable “Optimize for Ad hoc workloads” option. This is server side option that available in SQL Server 2008/2008R2 only. With this option enabled, SQL Server does not cache complete query plan at the first run - it generates plan stub (which is basically the small hash) instead. Next, when query runs the second time, SQL Server recompiles the query and replaces the stub with the actual plan. Let’s see that. First, let’s enable this option either in server properties window or via the script.

Next, let’s clear the cache and run the first code snipped again. As you can see, we still have 1000 objects in the plan cache, but each objects uses only 232 bytes. In case of complex queries/plans, the memory usage difference could be dramatic.

If you run it the second time, you’ll see that stubs were replaced with actual plans.

What are the downsides of this option? First of all, queries would be compiled twice. So it would introduce addition CPU load. Another issue that this is server-side option, so you cannot control it on the database level. In case if you have OLTP and Data-Warehouse databases hosted on the same server, that additional recompilation could be very expensive for the complex data-warehouse queries. But still, for OLTP and bad client code this option is usually useful.

Of course, the best advice is to get rid of ad-hoc sql at all!

Code is available for download   

Locking in Microsoft SQL Server (Part 6 - How to troubleshoot deadlocks)

Thursday, June 23rd, 2011

As we already saw, the reasons why we have blocking issues and deadlocks in the system are pretty much the same. They occur because of non-optimized queries. So, not surprisingly, troubleshooting techniques are very similar. Let’s have a quick look. We’ll use the same scripts I used last time.

The simplest approach is to use SQL Profiler. There is “Deadlock graph” event in the “Locks” event group you can use. Click on the picture to open it in the different window.

Let’s start the trace and trigger deadlock.

As you can see, it shows you very nice picture. There are 2 sessions (ovals) involved. Those sessions compete for the page locks (squares). You can see what locks each session held and you can even track it down to the resources (but that rarely needed). You can even see the statements when you move the mouse over the session oval and wait for the tool tip.

In context menu for “deadlock graph” line in the grid above, you have “Extract event data” menu command that can save this information as the file.

You can open it as the graph in management studio or, technically, simply look at XML which is extremely familiar:

As you can see it’s way more detailed in compare with graphical representation. It’s also extremely familiar with blocking process report - and you can use same technique and query sys.dm_exec_sql_text if you need to obtain sql text from handle. I demonstrated how to do that in post related with blocking troubleshooting.

In case, if you don’t want to use SQL Profiler, there are 2 options you can use. The first one is enabling trace flag 1222 with DBCC TRACEON(1222,-1) command. When you have it enabled, SQL Server put deadlock graph XML to SQL Server log.

Another option is using extended events (SQL Server 2008/2008R2). Again, it’s very powerful method although requires some initial work to set it up. As with the blocking, I’m leaving it out of scope for now.

How to deal with deadlocks? Of course, the best thing is not to have deadlocks at the first place. Again, golden rule is to optimize the queries. Meanwhile, if you need to control (up to degree) what session will be terminated, you can use SET DEADLOCK PRIORITY option. There are 21 priority levels available. When 2 sessions deadlocked, the session with the lower deadlock priority level would be chosen as the victim. In case of the same priority level (default case), SQL Server chooses the session that is less expensive to rollback.

If session is chosen as the victim, it would be aborted with error code 1205. In such case client (or T-SQL code) can catch the exception and re-run the query. But again, the best way is avoiding deadlocks at the first place.

Part 7 - Read Committed - duplicate readings 

Table of content

Locking in Microsoft SQL Server (Part 5 - Why do we have deadlocks)

Thursday, June 9th, 2011

We already know why blocking occurs in the system and how to detect and troubleshoot the blocking issues. Today I’d like us to focus on the deadlocks.First, what is the deadlock? Classic deadlock occurs when 2 processes compete for the resources and waiting on each other. Let’s take a look on that. Click on each picture below to open them in the different window. Let’s assume that Orders table has the clustered index on ID column.Let we have the session 1 that starts transaction and updates the row from the Order table. We already know that Session 1 acquires X lock on this row (ID=1) and hold it till end of transaction

Now let’s start session 2 and have it update another row in the Orders table. Same situation - session acquires and holds X lock on the row with (ID = 2).

Now if session 1 tries to acquire X lock on the row with ID = 2 , it cannot do that because Session 2 already held X lock there. So, Session 1 is going to be blocked till Session 2 rollback or commit the transaction

It worth to mention that it would happen in any transaction isolation level except SNAPSHOT that handles such conditions differently. So read uncommitted does not help with the deadlocks at all.

Simple enough. Unfortunately it rarely happens in the real life. Let’s see it in the small example using the same dbo.Data table like before. Let’s start 2 sessions and update 2 separate rows in the table (acquire X locks).Session 1:

Session 2:

Now let’s run 2 selects.Session 1:

Session 2:

Well, as you can see, it introduces deadlock.

To understand why it happens, let’s take a look at the execution plan of the select statement:

As you can see - there is the table scan. So let’s think what happened

  1. First of all, we have 2 X locks on the different rows acquired by both sessions.
  2. When session 1 ran select, it introduced table scan. In read committed, repeatable read or serializable isolation levels, readers issue shared S locks, so when session 1 tried to read the row with ID = 40000 (X lock held by session 2) - it was blocked.
  3. Same thing happens with session 2 - it’s blocked on the row with ID = 1 (X lock held by session 1).

So this is the classic deadlock even if there are no data updates involved. It worth to mention that read uncommitted transaction isolation level would not introduce deadlock - readers in this isolation level do not acquire S locks. Although you’ll have deadlock in the case if you change select to update even in read uncommitted level. Otimistic isolation levels also behave differently and we will talk about it later.So as you can see, in the real life as other blocking issues deadlocks happen due non-optimized queries.

Next week we will talk how to detect and deal with deadlocks. Source code is available for download

Part 6 - How to troubleshoot deadlocks

Table of content