
Writing Triggers in the Right Way

The only correct way of writing triggers is not writing them. I will repeat: the only correct way of writing triggers is not writing them. Triggers negatively affect performance. AFTER triggers introduce fragmentation and increase tempdb load due to version store usage. Triggers run in the context of the transaction, increase the time locks are held, and contribute to locking and blocking in the system. However, in some cases we have to use triggers, and it is important to develop them correctly.

SQL Server fires triggers at the statement level rather than the row level. For example, if an UPDATE statement modifies 10 rows, an AFTER UPDATE and/or INSTEAD OF UPDATE trigger fires once rather than 10 times. Therefore, a trigger must handle situations when the inserted and/or deleted virtual tables contain multiple rows. Let’s look at an example and create a table with an AFTER INSERT trigger, as shown below:

create table dbo.Data 
( 
	ID int not null, 
	Value varchar(32) 
) 
go 

create trigger trgData_AI on dbo.Data 
after insert 
as 
	/* Some code */ 
	declare 
		@ID int 
	-- note: this assignment assumes inserted contains exactly one row
	set @ID = (select ID from inserted) 
	/* Some code */

Everything works fine when you insert a single row. However, a multi-row insert fails. For illustration, the following statement (the values are arbitrary) inserts two rows at once:
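insert into dbo.Data(ID, Value) 
values 
	(1, 'A'), 
	(2, 'B') -- inserted now contains two rows

The statement fails with the exception shown below.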

Msg 512, Level 16, State 1, Procedure trgData_AI, Line 9
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
The statement has been terminated.
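The correct approach is to write the trigger body as a set-based statement that handles any number of rows in inserted. Here is a minimal sketch; the UPDATE below is a hypothetical stand-in for whatever the real per-row logic would be:

alter trigger trgData_AI on dbo.Data 
after insert 
as 
	/* Some code */ 
	-- a single set-based statement processes all inserted rows at once 
	-- (uppercasing Value is a hypothetical action, for illustration only) 
	update d 
	set Value = upper(d.Value) 
	from dbo.Data d 
		join inserted i on i.ID = d.ID 
	/* Some code */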

There is another consequence of the statement-level nature of triggers: they fire even if the DML statement did not modify any rows. Whether or not this introduces issues in the system depends on the implementation. However, it always leads to unnecessary performance overhead.

Let’s look at an example. The code below creates another table and changes the trigger in a way that mimics the behavior of audit functionality, which is frequently implemented with triggers.

create table dbo.Audit 
( 
	ID int not null identity(1,1), 
	OnDate datetime2(0) not null 
		constraint DEF_Audit_OnDate  
		default getutcdate(), 
	Message varchar(64) not null 
)
go

alter trigger trgData_AI on dbo.Data 
after insert 
as 
begin
	declare 
		@Msg varchar(64) = 'Triggers are bad. And @@ROWCOUNT=' + convert(varchar(10),@@rowcount) 

	insert into dbo.Audit(Message) values(@Msg); 
end

Now let’s run an INSERT statement that does not insert any rows into the table.

insert into dbo.Data(ID, Value) 
	select 1, 'ABC'  
	where 1 = 0

If you check the contents of the Audit table, you will see that the trigger fired:

01. Content of Audit table

So the first important conclusion: every trigger must check the @@ROWCOUNT variable as the very first statement in its implementation.

SET NOCOUNT ON should be the second action the trigger performs. Without it, SQL Server returns the number of affected rows for every statement in the trigger in addition to the original DML statement. Some client libraries rely on a single message in the output and do not work correctly when multiple messages are returned. Therefore, the first two statements in the trigger should look like those shown below.

alter trigger trgData_AI on dbo.Data 
after insert 
as 
begin
	if @@ROWCOUNT = 0
		return
	set nocount on

	/* Some Code Here */
end
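As a quick check (assuming the audit insert from the earlier version of the trigger now sits where the placeholder comment is), rerunning the zero-row insert from above no longer adds a record to the Audit table:

insert into dbo.Data(ID, Value) 
	select 1, 'ABC'  
	where 1 = 0

-- the trigger exits on the @@ROWCOUNT check before reaching the audit insert
select count(*) as AuditRows from dbo.Audit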

Finally, there is another caveat. While the implementation above works for INSERT, UPDATE, and DELETE statements, it is not the case with MERGE. @@ROWCOUNT in that case represents the total number of rows affected by the MERGE statement rather than by the individual insert, update, or delete action that fired the trigger. Let’s prove it with an example.

create table dbo.Data2(Col int not null)
go

create trigger trg_Data_AI on dbo.Data2
after insert 
as
	select 
		'After Insert' as [Trigger]
		,@@RowCount as [RowCount]
		,(select count(*) from inserted) as [Inserted Cnt]
		,(select count(*) from deleted) as [Deleted Cnt]
go

create trigger trg_Data_AU on dbo.Data2
after update
as
	select 
		'After Update' as [Trigger]
		,@@RowCount as [RowCount]
		,(select count(*) from inserted) as [Inserted Cnt]
		,(select count(*) from deleted) as [Deleted Cnt]
go

create trigger trg_Data_AD on dbo.Data2
after delete
as
	select 
		'After Delete' as [Trigger]
		,@@RowCount as [RowCount]
		,(select count(*) from inserted) as [Inserted Cnt]
		,(select count(*) from deleted) as [Deleted Cnt]
go

Now let’s run a MERGE statement, as shown below:

merge into dbo.Data2 as Target
using (select 1 as [Value]) as Source
on Target.Col = Source.Value
when not matched by target then
	insert(Col) values(Source.Value)
when not matched by source then
	delete
when matched then
	update set Col = Source.Value;

The table is empty; therefore, only the insert action is executed. However, as you can see below, all three triggers fired. It is worth mentioning that the previous code example demonstrates the very bad practice of returning result sets from triggers. You should avoid doing that in production code.

02. Triggers and MERGE statement

The right way to deal with this situation is to check the contents of the inserted and deleted tables, as shown below.

alter trigger trg_Data_AI on dbo.Data2
after insert 
as
	if @@rowcount = 0 
		return
	set nocount on
	if exists(select * from inserted)
		/* Some Code Here */      
go

alter trigger trg_Data_AU on dbo.Data2
after update
as
	if @@rowcount = 0 
		return
	set nocount on
	if	exists(select * from inserted) and 
		exists(select * from deleted)
		/* Some Code Here */      
go

alter trigger trg_Data_AD on dbo.Data2
after delete
as
	if @@rowcount = 0 
		return
	set nocount on
	if	exists(select * from deleted)
		/* Some Code Here */
go
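To see these checks in action, you can rerun the MERGE statement from above. The table now contains the row with Col = 1, so the statement takes the "when matched" path and only the update action affects a row:

merge into dbo.Data2 as Target
using (select 1 as [Value]) as Source
on Target.Col = Source.Value
when not matched by target then
	insert(Col) values(Source.Value)
when not matched by source then
	delete
when matched then
	update set Col = Source.Value;

-- all three triggers still fire, but only trg_Data_AU passes its exists()
-- checks: inserted and deleted each contain the single updated row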

I hope these tips will help you write triggers in the most optimal way. However, I will repeat: the best way of writing triggers is not writing them at all.

Unwinding Table Spools

One of the questions I am often asked during various SQL Server events is what exactly the Table Spool operator does in an execution plan. I will try to answer it today.

Spool operators are, in a nutshell, internal in-memory or on-disk caches/temporary tables. SQL Server often uses spools for performance reasons, caching the results of complex subexpressions that need to be used multiple times during query execution.

Let’s look at an example and create a table that stores some sales information, as shown below:

create table dbo.Orders
(
    OrderID int not null,
    CustomerId int not null,
    Total money not null,
    constraint PK_Orders
    primary key clustered(OrderID)
);

;with N1(C) as (select 0 union all select 0) -- 2 rows
,N2(C) as (select 0 from N1 as T1 cross join N1 as T2) -- 4 rows
,N3(C) as (select 0 from N2 as T1 cross join N2 as T2) -- 16 rows
,N4(C) as (select 0 from N3 as T1 cross join N3 as T2) -- 256 rows
,Nums(Num) as (select row_number() over (order by (select null)) from N4)
    insert into dbo.Orders(OrderId, CustomerId, Total)
        select Num, Num % 10 + 1, Num
        from Nums;

Now let’s run a query that returns the list of orders along with the total sales amount on a per-customer basis.

select OrderId, CustomerID, Total
	,Sum(Total) over(partition by CustomerID) as [Total Customer Sales] 
from dbo.Orders

As you can see in the execution plan below, SQL Server scans the table, sorts the data by CustomerID, and uses a Table Spool operator to cache the results. This allows SQL Server to access the cached data later and avoid the expensive sort operation.

01. Execution Plan with Spool operators

Even though the Table Spool operator is shown in the execution plan multiple times, it is essentially the same spool/cache. SQL Server builds it the first time and reuses its data later.

Technically speaking, there are two different logical spool operators: Eager Spool and Lazy Spool. The only difference between them is how the data is populated. With an Eager Spool, SQL Server fetches all rows as soon as the spool is called. With a Lazy Spool, SQL Server fetches rows on demand, when they are needed.
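You can check which variant the optimizer chose by looking at the spool’s logical operation in the execution plan. A quick sketch with the query from above:

-- include the actual execution plan (or use SET STATISTICS XML ON)
set statistics xml on

select OrderId, CustomerID, Total
	,Sum(Total) over(partition by CustomerID) as [Total Customer Sales] 
from dbo.Orders

set statistics xml off
-- in the plan XML, each spool's LogicalOp attribute shows either
-- "Lazy Spool" or "Eager Spool"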

SQL Server also uses spools for Halloween Protection when modifying data. Halloween Protection helps to avoid situations when data modifications affect what data needs to be updated. The classic example of such a situation is shown below. Without Halloween Protection, the INSERT statement would fall into an infinite loop, reading the rows it has been inserting.

create table dbo.HalloweenProtection
(
	Id int not null identity(1,1),
	Data int not null
);

insert into dbo.HalloweenProtection(Data)
	select Data from dbo.HalloweenProtection;

As you can see in the execution plan of the INSERT statement, SQL Server uses a Table Spool operator to cache the data from the table as it was before the insert started, which avoids an infinite loop during execution.

02. Halloween Protection and Table Spools.

Halloween Protection has a very interesting side effect when we are talking about multi-statement user-defined functions (both scalar and table-valued). Using multi-statement functions is a bad practice by itself; however, creating them without the SCHEMABINDING option is even worse. That option forces SQL Server to analyze whether a user-defined function performs data access, which allows it to avoid extra Halloween Protection-related Spool operators in the execution plan.

Let’s look at an example: we will create two user-defined functions and use them in the WHERE clause of UPDATE statements.

create function dbo.ShouldUpdateData(@Id int)
returns bit
as
begin
	return (1)
end
go

create function dbo.ShouldUpdateDataSchemaBound(@Id int)
returns bit
with schemabinding
as
begin
	return (1)
end
go

update dbo.HalloweenProtection
set Data = 0
where dbo.ShouldUpdateData(ID) = 1;

update dbo.HalloweenProtection
set Data = 0
where dbo.ShouldUpdateDataSchemaBound(ID) = 1;

Neither of the functions accesses data, and therefore neither can introduce the Halloween effect. However, SQL Server does not know that in the case of the non-schema-bound function, and it adds a Spool operator to the execution plan, as shown below.

03. Execution Plans for user-defined functions

Bottom line – you should always make functions schema-bound when you create them.
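If you want to audit existing code, the sys.sql_modules catalog view exposes a flag for this. A quick check with the two functions above:

select object_name(object_id) as [Module], is_schema_bound
from sys.sql_modules
where object_id in
	(object_id('dbo.ShouldUpdateData'), object_id('dbo.ShouldUpdateDataSchemaBound'))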

Spool temporary tables usually appear as worktables in the I/O statistics for queries. You should analyze table spool-related reads during query performance tuning. While spools can improve query performance, unnecessary spools introduce overhead. You can often remove them by creating appropriate indexes on the tables.
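For example, with the dbo.Orders query from above, SET STATISTICS IO ON reports the spool’s worktable activity:

set statistics io on

select OrderId, CustomerID, Total
	,Sum(Total) over(partition by CustomerID) as [Total Customer Sales] 
from dbo.Orders

-- the messages output includes a line similar to:
-- Table 'Worktable'. Scan count ..., logical reads ...
-- those reads come from the table spool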