Locking in Microsoft SQL Server (Part 10 – What isolation level should I choose?)

Wrapping up… The only remaining question is “What isolation level is good for me?” And there is no single right answer to that question. As usual, it depends. I’ll try to summarize a few things, but again, think about the specific details of your system and make the decision based on them, not on the generic advice below.

When you start to think about what isolation level you should use in your system, there are two questions you need to answer. First, what is the blueprint of your system – is it an OLTP or a Data Warehouse type system? OLTP systems usually handle the operational activity of the company and serve a high volume of short, similar queries and short transactions. The Data Warehouse blueprint describes a reporting-type system with a low volume of long transactions and complex queries. In OLTP systems the data is constantly changing; in Data Warehouse systems that is rarely the case – data is usually updated in batches on some schedule.

As an example, let’s think about an online store. The database that handles the customer-facing shopping cart web site is OLTP. There are always new orders coming in, existing orders being updated, customers checking the status of existing orders, searching the product listings, etc. – all of that is done with short transactions and simple queries. At the same time, company executives want to run reports, see trends and do other analysis – there is a good chance there would be another database for that purpose. That database would be optimized for complex reporting, and its data could be refreshed every night or perhaps on a weekly basis. That database is the Data Warehouse.

In real life, of course, it’s rarely that clean. There is always some reporting activity against the OLTP database, but in any case you can see which blueprint fits better. And that leads to the second question you need to answer: “How much data consistency do I really need?” Even though the answer “I don’t need any consistency” is quite popular, it is rarely true in real life. So let’s dive a little deeper into the “no-consistency” mode.

No consistency in terms of transaction isolation levels means READ UNCOMMITTED – either set directly with “set transaction isolation level” or applied with (NOLOCK) hints. There are some cases when you can decide to use that isolation level. For example, in our system we have a few transactional entities where data is inserted into the system and never updated or deleted after that. At the same time, clients are constantly downloading that data, and we don’t really care if a client gets data from an uncommitted transaction. This is a case where we can downgrade to the READ UNCOMMITTED isolation level to reduce blocking for the client downloading sessions.
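Just to illustrate, here is a minimal sketch of both ways of downgrading such a download query (the dbo.TransactionData table and its columns are hypothetical):

declare @ClientId int = 1;

-- option 1: downgrade the whole session
set transaction isolation level read uncommitted;

select TransactionId, ClientId, Amount, CreatedOn
from dbo.TransactionData
where ClientId = @ClientId;

-- option 2: downgrade a single table reference with a hint
select TransactionId, ClientId, Amount, CreatedOn
from dbo.TransactionData with (nolock)
where ClientId = @ClientId;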

But if you think about that example, it’s the exception rather than the rule. In most systems you want to have consistency. Think about reporting for executives – how happy would they be if you provided them with incorrect data or, even better, if they got different results running the same report twice? The real problem is that, in real life, people often switch to the READ UNCOMMITTED isolation level to solve blocking issues. True, it can help in cases where select queries (readers) are blocked by update queries (writers). But it masks the problem rather than solving it. As we already know, in most cases locking and blocking issues are triggered by non-optimized queries. So, if you have a chance, you’d be better off spending some time on query optimization. Of course, this advice does not help if you are in fire-drill mode and need to fix blocking issues in a production system, but in that case there is another option that can work even better – use READ COMMITTED SNAPSHOT instead of READ UNCOMMITTED. In that mode writers also don’t block readers, and at the same time it gives you statement-level consistency. Of course, there are some performance implications, but in most cases you can live with them.
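READ COMMITTED SNAPSHOT is enabled at the database level. A sketch, assuming a hypothetical database named OrderEntryDb (the ALTER needs exclusive access to the database, so WITH ROLLBACK IMMEDIATE is used to roll back any open transactions that would otherwise block the change):

alter database OrderEntryDb
set read_committed_snapshot on
with rollback immediate;

After that, queries running under the default READ COMMITTED isolation level start using row versioning automatically – no application changes are required.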

READ COMMITTED, which is also the default isolation level, is in most cases (again, assuming the system is more or less optimized) a good choice, especially for OLTP systems. It gives you an acceptable compromise between consistency and concurrency. Again, there is blocking involved, but when the system is optimized it’s minimal. Speaking of the higher isolation levels – REPEATABLE READ and SERIALIZABLE – those are typically a bad choice for OLTP. For reporting and data warehouse systems they can be acceptable, but for such systems the optimistic isolation levels (READ COMMITTED SNAPSHOT and SNAPSHOT) are better.

And speaking of optimistic isolation levels: for Data Warehouse type systems – use them. I can’t think of a single reason why you would want to avoid them there. For OLTP – consider them. If you can live with the performance overhead, they can be a good choice. Be careful though – don’t forget about the extra 14 bytes of version store pointer added to every modified row, and don’t use FILLFACTOR = 100.
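A sketch of the typical setup, again assuming the hypothetical OrderEntryDb database and a dbo.Orders table:

-- allow sessions to use the SNAPSHOT isolation level
alter database OrderEntryDb set allow_snapshot_isolation on;

-- leave free space on the pages for the 14-byte version pointers
-- to reduce page splits once rows start being versioned
alter index all on dbo.Orders rebuild with (fillfactor = 90);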

So, the bottom line. For Data Warehouse systems – use optimistic isolation levels whenever possible. The only case with Data Warehouse systems where I would suggest considering other options is when data in the system is updated in real time. And even then, give the optimistic isolation levels a try. For OLTP – if you can use optimistic isolation levels, use them (start with READ COMMITTED SNAPSHOT). If not, use READ COMMITTED and optimize the queries. Don’t use READ UNCOMMITTED and (NOLOCK) hints unless you don’t care about consistency at all. And definitely don’t use READ UNCOMMITTED to reduce blocking. That is the wrong choice.

Last, but not least, don’t forget that you can, and often need to, use several different isolation levels in the same system. Use them wisely! 🙂
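For example, nothing stops an order-entry session from running under the default READ COMMITTED while a reporting session in the same database opts into SNAPSHOT. A sketch, assuming ALLOW_SNAPSHOT_ISOLATION has been enabled and using the hypothetical dbo.Orders table from above:

-- reporting session: sees a consistent snapshot of the data as of the moment the transaction starts
set transaction isolation level snapshot;
begin tran;
    select OrderId, Total from dbo.Orders;
commit;

-- order-entry session: plain READ COMMITTED (or READ COMMITTED SNAPSHOT at the database level)
set transaction isolation level read committed;
update dbo.Orders set Total = Total + 5 where OrderId = 1;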

Part 11 – Deadlocks due to multiple updates of the same row

Table of contents

11 thoughts on “Locking in Microsoft SQL Server (Part 10 – What isolation level should I choose?)”

  1. Nobody

    I’m hearing this advice (don’t use REPEATABLE READ or SERIALIZABLE, use the minimal isolation level possible) repeated over and over again. But what is it based on? IMHO, following it blindly leads to ‘databases’ containing some garbage instead of data. The worst part of it is that to really choose the ‘minimal isolation level possible’ you must know ALL possible (in the future, too) transactions in your system and their interactions. Even in simple cases like your example
    “we have a few transactional entities where data is inserted into the system and never updated or deleted after that” it can be nontrivial to choose it (what if one of the ‘entities’ rolls back or unexpectedly closes the connection?). Why should I take all these risks and burdens instead of using the SERIALIZABLE isolation level exclusively?

    1. Dmitri Korotkevitch Post author

      Well, the choice of isolation level should depend on the level of consistency you need. But you cannot make this decision without taking the system’s concurrency behavior into consideration. Yes, SERIALIZABLE will provide you the best consistency, but the system would suffer from blocking, especially in OLTP. Is it worth it? In busy OLTP systems you would have a hard time implementing it, and you’d have very few reasons for doing so.

      Let’s consider READ COMMITTED, for example. Yes, you can have the phantom and non-repeatable read phenomena in that isolation level. But how often do OLTP transactions read the same data multiple times? Even in reporting scenarios it does not happen very often. Of course, there are also missing rows and duplicated reads that can happen due to data movement; but again, those conditions are very rare and usually occur during range scans.

      And of course, you have the optimistic isolation levels if you need consistency and can handle the extra tempdb load. Those isolation levels are a much better choice from a concurrency standpoint compared to REPEATABLE READ and SERIALIZABLE. In fact, they are even “more consistent” – for example, SNAPSHOT “freezes” the data at the time the transaction starts, while SERIALIZABLE “freezes” it at the time of data access. E.g., if you read data from two tables sequentially under SERIALIZABLE, you would see changes made to the second table by other sessions before the SERIALIZABLE session accessed that table. That is impossible with SNAPSHOT.

      1. Nobody

        Thanks for the swift answer.

        > Yes, SERIALIZABLE will provide you the best consistency, but the system would suffer
        > from blocking, especially in OLTP.
        I’d say that only SERIALIZABLE _guarantees_ consistency.

        > Is it worth it?
        If I need data and not garbage, it is.

        > In busy OLTP systems you would have a hard time implementing it,
        > and you’d have very few reasons for doing so.
        The reason is simple: I need 2+2 = 4 (not 7-8) 😉 in some systems. Of course, if my data is implicitly garbage (like a forum message store), I wouldn’t care (I’ve heard that
        StackOverflow uses READ UNCOMMITTED exclusively).

        Quite the contrary, the _only_ reason not to use SERIALIZABLE is performance. So why should I start by not using it at all?

        > Let’s consider READ COMMITTED, for example. Yes, you can have the phantom and
        > non-repeatable read phenomena in that isolation level.
        > But how often do OLTP transactions read the same data multiple times?
        How is this question related to consistency at all? You don’t need to read data multiple times
        to get inconsistent results at this level. For example, imagine that you have 2 tables: 1) present on-hand stock of items and 2) inventory transactions (past movements). Suppose you try to get the on-hand stock at some past date using them (reading the on-hand stock first and then subtracting the transactions), and someone interrupts your transaction’s scan, inserts an inventory transaction in the past, changes the on-hand stock (while you are waiting for him to commit), and commits his transaction. As soon as he releases all his locks, your scan proceeds and reads his inventory transaction… and you are screwed (you get an on-hand value that NEVER existed in the database)!
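        A simplified sketch of that scenario (table names, dates and values are all hypothetical), using a delay instead of the actual blocking:

        create table dbo.StockOnHand(ItemId int, Qty int);
        create table dbo.InventoryTransactions(ItemId int, TranDate date, Qty int);

        insert into dbo.StockOnHand(ItemId, Qty) values(1, 10);
        insert into dbo.InventoryTransactions(ItemId, TranDate, Qty) values(1, '2012-01-15', 10);

        -- session 1: "stock as of 2012-01-16" = current stock minus later movements
        set transaction isolation level read committed;
        begin tran;
            declare @Stock int;
            select @Stock = Qty from dbo.StockOnHand where ItemId = 1; -- reads 10

            /* Run in another session during the delay:
            insert into dbo.InventoryTransactions(ItemId, TranDate, Qty) values(1, '2012-01-20', 5);
            update dbo.StockOnHand set Qty = 15 where ItemId = 1;
            */
            waitfor delay '0:01:00';

            select @Stock - isnull(sum(Qty), 0) as StockAsOf20120116
            from dbo.InventoryTransactions
            where ItemId = 1 and TranDate > '2012-01-16'; -- now also sees the +5 row: returns 5
        commit;

        Both consistent states of the database give 10 for that date; the mixed read returns 5, a value that never existed.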

        > Even in reporting scenarios it does not happen very often.
        It happens in reporting scenarios every day. I’ve seen the behavior described above in many long-running reports. In one case, every new execution of such a report for a date a year ago, where no changes had been made, brought different totals EVERY time.

        > Of course, there are also missing rows and duplicated reads that can
        > happen due to data movement; but again, those conditions are
        > very rare and usually occur during range scans.
        What if I tell you they are ubiquitous? Have you got any proof?

        > In fact, they are even “more consistent” – for example, SNAPSHOT “freezes”
        > the data at the time the transaction starts, while SERIALIZABLE “freezes” it at the time of data access.
        Stop, what? 8-(
        Ever heard of Write Skew, for example?

        > E.g., if you read data from two tables sequentially under SERIALIZABLE, you would see
        > changes made to the second table by other sessions before
        > the SERIALIZABLE session accessed that table. That is impossible with SNAPSHOT.
        So what? Both behaviors are normal. Do you understand what “atomic” in ACID really means?
        It means that from the RDBMS’s point of view, transactions (however long-running in reality) happen _instantly_! And if transactions overlap, every SQL engine is free to “reorder” them any way it wants. So, if you started a transaction 2 hours ago and then started a second one, the system is free to act as if the second transaction happened first.

        1. Dmitri Korotkevitch Post author

          First, thank you for the interesting discussion!

          SERIALIZABLE (with on-disk tables in SQL Server) does not guarantee transaction-level consistency. It is implemented based on locks, which are not acquired until data is accessed. Run the following:
          create table dbo.Orders(OrderId int, Total money);
          create table dbo.OrderItems(OrderId int, OrderItemId int, Total money);

          insert into dbo.Orders(OrderId, Total) values(1,10)
          insert into dbo.OrderItems(OrderId, OrderItemId, Total)
          values(1,1,5),(1,2,5);

          set transaction isolation level snapshot
          begin tran
          select * from dbo.OrderItems

          /* Run the following in another session during delay:
          insert into dbo.OrderItems(OrderId, OrderItemId, Total)
          values(2,3,20)
          */
          waitfor delay '0:01:00'

          select sum(Total) from dbo.OrderItems
          commit

          You’ll get 10/30 as the results, which is inconsistent. SNAPSHOT, on the other hand, would give you a snapshot of the data as of the time the transaction starts, which is consistent. Yes, there is the possibility of write skew, but I would consider any “multi-row” rules to belong to the business logic domain. You should handle them in the code rather than rely on the database engine.

          I am not arguing with the fact that lower isolation levels are inconsistent and subject to anomalies. However, based on my experience, they are good enough even for reporting activity. Customers who need better consistency can use optimistic isolation levels or build separate databases/warehouses for reporting. And I am stating that it is impossible to use SERIALIZABLE in OLTP systems or in systems with mixed workloads due to the severe blocking it introduces.

          PS. Before we continue our discussion, can we at least define what we are arguing about? 🙂

          1. Nobody

            I’m arguing about your advice on using isolation levels.
            I’d advise ‘use SERIALIZABLE until it hurts’.

            > SERIALIZABLE (with on-disk tables in SQL Server) does not guarantee
            > transaction-level consistency.
            No, it DOES, and if it doesn’t in ANY case, then it’s a BUG that should be reported!

            > Run the following:
            Did you ever try your example? I ran it (replacing ‘set transaction isolation level snapshot’ with ‘set transaction isolation level SERIALIZABLE’) and got 10/10 (the second session was blocked, of course).

            > Yes, there is the possibility of write skew, but I would consider any
            > “multi-row” rules to belong to the business logic domain.
            Why not just use SERIALIZABLE and avoid drawing such artificial distinctions at all?

            > You should handle them in the code rather than rely on the database engine.
            The problem is that you can’t reliably ‘handle them in the code’, because
            the anomalies that can be introduced by snapshot isolation are much more subtle; ‘write skew’ is only an example that is frequently cited and easy to explain. And, I repeat, you can’t be sure that using non-SERIALIZABLE transactions causes no consistency problems without analyzing all their possible interactions.

            > However, based on my experience, they are good
            > enough even for reporting activity.
            Your ‘experience’ isn’t statistics, you know? I’ve seen many cases where it isn’t true (and have already written to you about some).

            > And I am stating that it is impossible to use SERIALIZABLE in OLTP
            > systems or in systems with mixed workloads
            > due to the severe blocking it introduces.
            Do you know what isolation level is used in the TPC benchmarks mimicking OLTP workloads (like TPC-C or TPC-E)? Are you telling me that those results are “impossible”? 😉

          2. Dmitri Korotkevitch Post author

            I’m afraid we are going nowhere. We have different opinions and experience and will not be able to convince each other. Perhaps it is time to stop this discussion.

            By the way, I am sorry – I posted the wrong code when I tried to show what I meant by SERIALIZABLE not being consistent at the transaction level. It should be this one:
            create table dbo.Orders(OrderId int, Total money);
            create table dbo.OrderItems(OrderId int, OrderItemId int, Total money);

            insert into dbo.Orders(OrderId, Total) values(1,10)
            insert into dbo.OrderItems(OrderId, OrderItemId, Total)
            values(1,1,5),(1,2,5);

            set transaction isolation level serializable
            begin tran
            -- only dbo.Orders is read here, so serializable locks protect dbo.Orders only;
            -- nothing protects dbo.OrderItems yet
            select * from dbo.Orders

            /* Run the following in another session during delay:
            insert into dbo.OrderItems(OrderId, OrderItemId, Total)
            values(2,3,20)
            */
            waitfor delay '0:01:00'

            -- this select sees the row committed by the other session after the transaction started
            select sum(Total) from dbo.OrderItems
            commit

  2. Nobody

    > I’m afraid we are going nowhere. We have different opinions and experience and
    > will not be able to convince each other. Perhaps it is time to stop this discussion.

    So, not accepting your opinion is ‘going nowhere’? 😉

    It’s not a matter of ‘opinion’ or ‘experience’. If a DB programmer accepts your advice, she’ll most likely just write crappy code and not even know about it (until it hurts). With mine, she’ll be safe.

    The _only_ reason I see not to use SERIALIZABLE is performance. This is just premature optimization — the root of all evil. 😉

    We can stop this discussion if you like, but I didn’t understand your example.
    What did you try to demonstrate?

    Your second transaction:
    “insert into dbo.OrderItems(OrderId, OrderItemId, Total) values(2,3,20)”
    is just plain wrong now; it brings the database into an inconsistent state by itself.
    What does it prove about any isolation level?

    1. Dmitri Korotkevitch Post author

      > So, not accepting your opinion is ‘going nowhere’? 😉

      > It’s not a matter of ‘opinion’ or ‘experience’. If a DB programmer accepts your
      > advice, she’ll most likely just write crappy code and not even know about it
      > (until it hurts). With mine, she’ll be safe.

      With yours, he or she would write a system that suffers from blocking and would be unusable even under a moderate load and number of users. Do not tell me that it is possible to develop a system with volatile data and a decent number of concurrent users using the SERIALIZABLE level. At least not in SQL Server, which uses locks to enforce SERIALIZABLE rules.

      > The _only_ reason I see not to use SERIALIZABLE is performance. This is just
      > premature optimization — the root of all evil. 😉

      Let’s not confuse performance and concurrency. These are completely different topics, and concurrency is not about optimization.

      In either case, I think that designing for “spherical cows or horses in a vacuum” is a bigger evil than premature optimization. It is impossible not to take product-specific features and behavior into consideration when you design a system. You will never implement a successful system otherwise.

      >We can stop this discussion if you like, but I didn’t understand your example.
      >What did you try to demonstrate?

      I am trying to demonstrate that the SERIALIZABLE isolation level does not provide you with isolation where you see the data in the database as of the moment the transaction started. Serializable rules are enforced only after you access the data in the context of the transaction, which violates ACID.

      > Your second transaction:
      > “insert into dbo.OrderItems(OrderId, OrderItemId, Total) values(2,3,20)”
      > is just plain wrong now; it brings the database into an inconsistent state by itself.
      > What does it prove about any isolation level?

      Let’s not talk about the logical context based on the table names here. It is about the letter I in ACID.

      In either case, what benefits does SERIALIZABLE give you over SNAPSHOT in your scenario?

  3. Nobody

    > With yours, he or she would write a system that suffers from blocking and
    > would be unusable even under a moderate load and number of users.
    I don’t think so. Only IF she finds that the system is unusable due to blocking does she need to do something (like correct the indexing or even go to a lower isolation level). BTW, define ‘moderate load and number of users’.

    > Do not tell me that it is possible to develop a system with volatile data and a decent
    > number of concurrent users using the SERIALIZABLE level.
    > At least not in SQL Server, which uses locks to enforce SERIALIZABLE rules.
    Is this a decent load?
    If not, what is?

    > Let’s not confuse performance and concurrency.
    > These are completely different topics, and concurrency is not about optimization.
    I don’t understand that. All else being equal, concurrency determines how many TPS or business transactions per minute you can execute, and that IS performance.

    > I am trying to demonstrate that the SERIALIZABLE isolation level does not provide
    > you with isolation where you see the data in the database
    > as of the moment the transaction started.
    Because it shouldn’t.

    > Serializable rules are enforced only after you access the data
    > in the context of the transaction, which violates ACID.

    No, it ‘violates’ only _your_ understanding of ACID.
    Nowhere does ACID promise you consistency at ANY ‘moment inside the transaction’, because there are NO such ‘moments’. A transaction is an _atomic_ unit.

    This is from the definition of the serializable isolation level in the SQL-92 standard:
    “A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.”

    So _this_ is “letter I in ACID”.

    > In either case, what benefits does SERIALIZABLE give you over SNAPSHOT in your scenario?
    What scenario?

    Anyway, _only_ SERIALIZABLE guarantees that if my transaction executes correctly when run alone, it will execute correctly in ANY mix of concurrent transactions. Giving up this unique property at the beginning of a system’s development, for nothing, is ridiculous, IMHO.

