Daily Archives: November 25, 2010

Identity vs. GUID as the clustered index

As we remember from the last week – there are 3 requirements for the clustered index. It needs to be unique, static and narrow. Let’s think about identity and GUID – what is more appropriate for that.

  1. Uniqueness. Both – Identity and GUID are unique. Identity is unique in the scope of the table. GUID is unique in the scope of the database.
  2. Static. Well, there is no particular reasons to modify either of those values. So they are static
  3. Narrow. This is the key difference. Identity is typically int (4 bytes) or BigInt (8 bytes). GUID is 16 bytes. As we remember, it also means that non-clustered rows would be 8-12 bytes bigger with GUID. Identity is the clear winner in such case

Let’s add one other factor into consideration.

Identity is increasing monotonously. It means that SQL Server fills data pages one-by-one. That’s typically not a bad thing – it introduces high fillfactor and reduces the fragmentation unless you have the table with really high number of inserts. In such case the system will have contention in the extents allocation. The number varies based on the multiple factors – row size, hardware, etc. I would not worry about that unless I have at least a few hundred inserts per second. In such case I would like to have the clustered index that distributes the load across entire table. (And let’s be realistic – in such case I want to have clustered index that covers most critical queries).

GUID could behave similarly to Identity when it generated with NewSequentialId() or random when generated with NewId(). This is probably the most important difference besides the size. So technically, you can alter the behavior with one default constraint, although again, I’m not sure that GUID is a good choice for the transactional tables anyway.

So use GUID if:

  1. You need to have database-wide uniqueness and don’t want to use any other workarounds
  2. Table is small and row size does not really matter
  3. Want to be able to alter value distribution and as result write pattern (again, this is questionable solution)

Otherwise use identity.