If you worked with SQL Server for a while, you should have noticed how landscape changed over the years. We are dealing with the different problems now. Five years ago, majority of the issues I saw were related to non-optimized queries. There were the huge scans with a lot of physical I/O and bad performance.
You do not see them as often nowadays. It is very cheap to solve the problems by upgrading the server. Put a couple hundred GBs of RAM and cache all the data; add more CPUs and problems magically disappear. The root-cause has not been fixed but who cares?
Surprisingly, there is one category of the issues that did not went away – concurrency. It even becomes worse. Modern servers handle more users and problems that did not exist with 50 concurrent users may put the server to its knees with 5000 users. I’ve been constantly involved in the troubleshooting of various concurrency issues and, in fact, I see more and more of them overtime.
Troubleshooting of the blocking and concurrency issues is, in the nutshells, a simple process. You need to identify the processes involved in blocking conditions or deadlocks and analyze why those processes acquire the locks on the same resources. In majority of cases, you need to analyze queries and their execution plans identifying possible inefficiencies that led to excessive number of locks being acquired.
Collecting this information is not a trivial task. The information is exposed through DMVs (you can download the set of scripts here); however, it requires you to run the queries at time when blocking occurred. Fortunately, SQL Server allows you to capture blocking and deadlock conditions with the blocked process report and deadlock graph, analyzing them later.
There is the caveat though. Neither blocked process report nor deadlock graph provide you execution plans of the statements. Nor do they always include affected statements in the plain text. You may need to query plan cache and other DMVs to get this information and longer you wait lesser is the chance that the information is available. Moreover, SQL Server may generate enormous number of blocked process reports in cases of prolonged blocking and complex blocking chains, which complicates the analysis.
This analysis may become very time consuming, especially if you need to deal with the large number of servers. Over the years, I have created the set of routines, which simplify it for me. I have been thinking to publish my scripts for a while, but I’ve never had time to polish them enough for public consumption. Until now – and I am very happy to share my collection with all of you. So allow me introduce the Blocking Monitoring Framework, which I am using with majority of my servers!
This framework is using Event Notifications. It captures blocked process report and deadlock graphs and parses them at time when event occurred and all data is still available in the system. All information is persisted in the set of tables for the further analysis.
The first version is available for download. I also promise you that I am going to support and enhance it in the future publishing the new versions on the regular basis.
Please feel free to contact me in case of any questions. I would also appreciate if you provide me any blocked process reports and deadlock graphs that framework was unable to parse. I will address the issues as quickly as I could.