Monthly Archives: November 2017

Splitting and Merging Non-Empty Partitions in Columnstore Indexes

Last week we discussed how to implement Sliding Window pattern in the tables with columnstore indexes. As I promised, today we are going to focus on partition management in such tables.

As you will remember, SQL Server does not allow you to split and merge non-empty columnstore partitions. I personally do not consider it as the huge limitation – perhaps because I rarely have reasons to do it in columnstore tables. However, there are still some cases when it may be required.

As one of examples, think about a table, partitioned by date, which may be populated with incorrect data from the future. Perhaps due to some issues in ETL processes. Some rows may be placed to incorrect (future) partition, which would prevent its split.

The common, by the book approach recommends dropping columnstore index, splitting or merging partitions and recreating the index afterwards. As you can imagine, it would lead to extremely inefficient process with huge amount of unnecessary overhead on large tables. After all, you have to drop and recreate columnstore index, converting table to Heap, while just subset of the partitions needs to be rebuilt. Fortunately, you can minimize the overhead with simple workaround:

Switch partition(s) to split or merge to the separate staging table
Split or merge partition(s) in the main table. You can do that because partitions will be empty after the previous step
Drop columnstore index in the staging table, split/merge partition(s) there and recreate the index afterwards
Switch partition(s) back from staging to the main table.

Let’s look at the process in details. I am going to recreate the tables I used last week with the script below.

create partition function pfOrders(datetime2(0))
as range right for values
('2016-11-01','2016-12-01','2017-01-01','2017-02-01'
,'2017-03-01','2017-04-01','2017-05-01','2017-06-01'
,'2017-07-01','2017-08-01','2017-09-01','2017-10-01'
,'2017-11-01','2017-12-01','2018-01-01');

create partition scheme psOrders 
as partition pfOrders
all to ([Primary]);

create table dbo.Orders
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        Placeholder char(100),
)
on psOrders(OrderDate);

create table dbo.OrderLineItems
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        OrderLineItemId int not null,
        Placeholder char(100),
)
on psOrders(OrderDate);
go

-- Left-most and right-most are empty
;with N1(C) as (select 0 union all select 0) -- 2 rows
,N2(C) as (select 0 from N1 as t1 cross join N1 as t2) -- 4 rows
,N3(C) as (select 0 from N2 as t1 cross join N2 as t2) -- 16 rows
,N4(C) as (select 0 from N3 as t1 cross join N3 as t2) -- 256 rows
,N5(C) as (select 0 from N4 as t1 cross join N4 as t2) -- 65,536 rows
,Ids(Id) as (select row_number() over (order by (select null)) from N5)
insert into dbo.Orders(OrderDate, OrderId)
        select dateadd(day,Id % 390,'2016-11-01'), ID
        from Ids;

insert into dbo.OrderLineItems(OrderDate,OrderId, OrderLineItemId)
        select OrderDate, OrderId, OrderId
        from dbo.Orders;
go

create clustered columnstore index CCI_Orders on dbo.Orders
on psOrders(OrderDate);

create clustered columnstore index CCI_OrderLineItems on dbo.OrderLineItems
on psOrders(OrderDate);
go

select 
        t.object_id, i.index_id, i.name as [Index], p.partition_number,
        p.[Rows], p.data_compression_desc, 
        fg.name as [Filegroup],
        sum(a.total_pages) as TotalPages, 
        sum(a.used_pages) as UsedPages, 
        sum(a.data_pages) as DataPages,
        sum(a.total_pages) * 8 as TotalSpaceKB, 
        sum(a.used_pages) * 8 as UsedSpaceKB, 
        sum(a.data_pages) * 8 as DataSpaceKB
from 
        sys.tables t with (nolock) join sys.indexes i with (nolock) on
                t.object_id = i.object_id
        join sys.partitions p with (nolock) on 
                i.object_id = p.object_id AND i.index_id = p.index_id
        join sys.allocation_units a with (nolock) on 
                p.partition_id = a.container_id
        join sys.filegroups fg with (nolock) on 
                a.data_space_id = fg.data_space_id
where
        t.object_id = object_id(N'dbo.Orders')
group by
        t.object_id, i.index_id, i.name, p.partition_number,
        p.[Rows], p.data_compression_desc, fg.name
order by 
        i.index_id, p.partition_number;

Figure 1 demonstrates data distribution across partitions in the dbo.Orders table. As you can see, we have 13 non-empty partitions started from November 2016 up to end of November 2017.

01.Initial Data Distribution

Let’s assume that we would like to merge November and December 2016 together, which are partitions 2 and 3 in the tables.

As the first step, we will create the staging tables. They need to be partitioned in the same way as the main tables. It is better to use separate partition function to decouple merge and split operations in main and staging tables from each other. Remember that SQL Server would hold schema modification (Sch-M) lock on all tables that use partition function for the duration of merge and split operations.

Partition function on the staging tables does not need to include all partitions from the main tables. It just need to include partition(s) you are about it split or merge and two adjacent partitions on the left and right sides. This will guarantee that partitions in the main and staging tables are aligned before and after the operation. It would also support partition switch without requirement of creating extra CHECK constraints on the staging tables. As you can see in the code below, pfOrdersStaging function consists of four partitions:

Everything prior November 2016. Will be empty.
November 2016
December 2016
Everything on or after January 1st 2017. Will be empty

create partition function pfOrdersStaging(datetime2(0))
as range right for values
('2016-11-01','2016-12-01','2017-01-01');

create partition scheme psOrdersStaging
as partition pfOrdersStaging
all to ([Primary]);

create table dbo.OrdersStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create table dbo.OrderLineItemsStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        OrderLineItemId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);

After the staging tables are created, we can switch two partitions from the main tables there.

alter table dbo.Orders switch partition 2 
to dbo.OrdersStaging partition 2;

alter table dbo.OrderLineItems switch partition 2 
to dbo.OrderLineItemsStaging partition 2;

alter table dbo.Orders switch partition 3
to dbo.OrdersStaging partition 3;

alter table dbo.OrderLineItems switch partition 3
to dbo.OrderLineItemsStaging partition 3;

Now we can merge the partitions in the main tables. SQL Server will allow us to proceed because they are empty after the switch.

alter partition function pfOrders() merge range ('2016-12-01');

Finally, we can merge partitions in our staging tables and switch them back to the main tables as shown below.

drop index CCI_OrdersStaging on dbo.OrdersStaging;
drop index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging;
go

-- Offline with Sch-M
alter partition function pfOrdersStaging()
merge range ('2016-12-01');
go

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);
go

-- Switching partitions back
alter table dbo.OrdersStaging switch partition 2 
to dbo.Orders partition 2;

alter table dbo.OrderLineItemsStaging switch partition 2 
to dbo.OrderLineItems  partition 2;

As you can see in Figure 2, now partition 2 stores the data for both November and December of 2016.

02.Data Distribution After Merge

We can use the same approach for partition split. Code below demonstrates how to split November 2017 partition using 2017-11-15 as the new range value.

-- Spliting November 2017 partition (#13)
-- Recreating Staging objects first
drop table if exists dbo.OrdersStaging;
drop table if exists dbo.OrderLineItemsStaging;
if exists(select * from sys.partition_schemes where name = 'psOrdersStaging') drop partition scheme psOrdersStaging;
if exists(select * from sys.partition_functions where name = 'pfOrdersStaging') drop partition function pfOrdersStaging;
go

create partition function pfOrdersStaging(datetime2(0))
as range right for values
('2017-11-01','2017-12-01');

create partition scheme psOrdersStaging
as partition pfOrdersStaging
all to ([Primary]);

create table dbo.OrdersStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create table dbo.OrderLineItemsStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        OrderLineItemId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);

-- Switching partitions to the staging table
alter table dbo.Orders switch partition 13
to dbo.OrdersStaging partition 2;

alter table dbo.OrderLineItems switch partition 13
to dbo.OrderLineItemsStaging partition 2;
go

-- Splitting partition in the main table 
alter partition scheme psOrders 
next used [PRIMARY];

alter partition function pfOrders()
split range ('2017-11-15');
go

-- Now spllitting partition in the Staging table
drop index CCI_OrdersStaging on dbo.OrdersStaging;
drop index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging;
go

-- Offline with Sch-M
alter partition scheme psOrdersStaging 
next used [PRIMARY];

alter partition function pfOrdersStaging()
split range ('2017-11-15');
go

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);

-- Switching partitions back
alter table dbo.OrdersStaging switch partition 2 
to dbo.Orders partition 13;

alter table dbo.OrdersStaging switch partition 3 
to dbo.Orders partition 14;

alter table dbo.OrderLineItemsStaging switch partition 2 
to dbo.OrderLineItems  partition 13;

alter table dbo.OrderLineItemsStaging switch partition 3
to dbo.OrderLineItems  partition 14;

As you can see, with just a few extra operators, we were able to remove the overhead of rebuilding columnstore index on entire table.

Unfortunately, this is offline approach. The data in the main tables would become inconsistent as soon as you switch partitions to the staging tables. Fortunately, if data is static, you can address it with some coding.

Instead of switching partitions to the staging tables, you can copy the data there. After split or merge is done in the staging tables, you can empty source partitions by switching them to another, temporary tables, and switch data back from the staging tables.

The code below illustrates this approach.

-- If data in the table is static 
-- Let's merge November-December 2016 and January 2017 
drop table if exists dbo.OrdersStaging;
drop table if exists dbo.OrderLineItemsStaging;
if exists(select * from sys.partition_schemes where name = 'psOrdersStaging') drop partition scheme psOrdersStaging;
if exists(select * from sys.partition_functions where name = 'pfOrdersStaging') drop partition function pfOrdersStaging;
go

create partition function pfOrdersStaging(datetime2(0))
as range right for values
('2016-11-01','2017-01-01','2017-02-01');

create partition scheme psOrdersStaging
as partition pfOrdersStaging
all to ([Primary]);

create table dbo.OrdersStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create table dbo.OrderLineItemsStaging
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        OrderLineItemId int not null,
        Placeholder char(100),
)
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);

-- Copying data from Main to Staging tables
insert into dbo.OrdersStaging(OrderDate,OrderId,Placeholder)
        select OrderDate, OrderId, Placeholder
        from dbo.Orders
        where $Partition.pfOrders(OrderDate) in (2,3);

insert into dbo.OrderLineItemsStaging(OrderDate,OrderId,OrderLineItemId,Placeholder)
        select OrderDate, OrderId, OrderLineItemId, Placeholder
        from dbo.OrderLineItems
        where $Partition.pfOrders(OrderDate) in (2,3);
go

-- Merging partitions in Staging tables
drop index CCI_OrdersStaging on dbo.OrdersStaging;
drop index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging;
go

alter partition function pfOrdersStaging()
merge range ('2017-01-01');
go

create clustered columnstore index CCI_OrdersStaging on dbo.OrdersStaging
on psOrdersStaging(OrderDate);

create clustered columnstore index CCI_OrderLineItemsStaging on dbo.OrderLineItemsStaging
on psOrdersStaging(OrderDate);
go

-- Creating temporary table to switch original
-- partitions from the main table
create partition function pfOrdersTmp(datetime2(0))
as range right for values
('2016-11-01','2017-01-01','2017-02-01');

create partition scheme psOrdersTmp
as partition pfOrdersTmp
all to ([Primary]);

create table dbo.OrdersTmp
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        Placeholder char(100),
)
on psOrdersTmp(OrderDate);

create clustered columnstore index CCI_OrdersTmp on dbo.OrdersTmp
on psOrdersTmp(OrderDate);

create table dbo.OrderLineItemsTmp
(
        OrderDate datetime2(0) not null,
        OrderId int not null,
        OrderLineItemId int not null,
        Placeholder char(100),
)
on psOrdersTmp(OrderDate);

create clustered columnstore index CCI_OrderLineItemsTmp on dbo.OrderLineItemsTmp
on psOrdersTmp(OrderDate);

-- Final steps. Let's do it in transaction
-- All operations are on metadata level
set xact_abort on
begin tran
        -- Switching original partitions out
        alter table dbo.Orders switch partition 2 
        to dbo.OrdersTmp partition 2;

        alter table dbo.OrderLineItems switch partition 2 
        to dbo.OrderLineItemsTmp partition 2;

        alter table dbo.Orders switch partition 3
        to dbo.OrdersTmp partition 3;

        alter table dbo.OrderLineItems switch partition 3
        to dbo.OrderLineItemsTmp partition 3;

        -- Merge
        alter partition function pfOrders()
        merge range ('2017-01-01');

        -- Switching partitions from the staging table
        alter table dbo.OrdersStaging switch partition 2 
        to dbo.Orders partition 2;

        alter table dbo.OrderLineItemsStaging switch partition 2 
        to dbo.OrderLineItems  partition 2;
commit;
go

-- Dropping temporary tables
drop table dbo.OrderLineItemsTmp;
drop table dbo.OrdersTmp;
drop partition scheme psOrdersTmp;
drop partition function pfOrdersTmp;

As you can see, this implementation is relatively simple and considering that data in columnstore indexes is often static, you may be able to utilize it to perform partition maintenance transparently to the users.

Source code is available for download.

Implementing Sliding Windows Data Purge Pattern with Columnstore Indexes

7 Replies

It’s been a while since my last blog post. Many things happened, including another great PASS Summit where I presented the session of Data Partitioning (by the way, you can download the slides and demos from my Presentation page). As usual, I ran out of time 🙂

As many of you know, Data Partitioning is very big and complex subject, which is impossible to cover even with half-day session. One of the topics I left out of scope is how it plays with columnstore indexes. Even though partitioned columnstore tables are similar to B-Tree ones, there are some differences between them.

The biggest difference resides in partition function split and merge behavior. With B-Tree indexes, you can split and merge non-empty partitions. SQL Server would split or merge the data automatically, granted with the schema-modification (Sch-M) table lock held in place. Other sessions would be unable to access the table but at least split and merge would work.

This is not the case with columnstore indexes where you would get the error when you try to split or merge non-empty partitions. There are several reasons for this limitation. Without diving very deep into columnstore index internals, I could say that the only option of doing split of merge internally is rebuilding columnstore index on affected partitions. Fortunately, you can split and merge empty columnstore partitions, which allow you to workaround the limitation and also implement Sliding Window pattern and use partitioning to purge the data.

The Sliding Window pattern stands for scenario when we want to retain the data for some period of time. Consider the situation when we need to keep last 12 months of data. In the beginning of each month, we may purge the data that is older than 12 months, basically having a window on the data that slides purging the oldest data, based on a given schedule.

Data partitioning is great in this scenario. It allows to keep the purge process on metadata- and minimally logged-levels by switching the old-data partition to staging table truncating it afterwards. This helps to avoid log-intensive DELETE operations and reduce blocking in the system.

Let’s look at implementation. First, I would create partition function and scheme considering that now is November 2017 and we would need to retain 12-month of data starting November 2016.

create partition function pfOrders(datetime2(0))
as range right for values
('2016-11-01','2016-12-01','2017-01-01','2017-02-01'
,'2017-03-01','2017-04-01','2017-05-01','2017-06-01'
,'2017-07-01','2017-08-01','2017-09-01','2017-10-01'
,'2017-11-01','2017-12-01','2018-01-01');

create partition scheme psOrders 
as partition pfOrders
all to ([Primary]);

This code will create 16 partitions. The left-most partition will host the data prior 2016-11-01 and it will be empty. The two right-most partitions will host the data for December 2017 and everything starting on 2018-01-01. They also will be empty at this point.

There is the simple reason why we want to have two right-most partitions empty. Let’s assume that we will run the monthly purge process in December and December’s partition will already store some data. As part of the purge process, we would like to create the new partition for the “next month” data. It is beneficial to perform this and split partition function on empty partition. With B-Tree tables, it is nice to have implementation. It allows to avoid any data scans and movements reducing the time for schema modification lock being held. For columnstore indexes, this is actually the must have part of design – you would be unable to split non-empty partition at all.

If you ever implemented Sliding Window pattern with B-Tree indexes, you would be aware of empty right-most partition. However, you might notice that in our example, there is also empty left-most partition present. As I already mentioned, columnstore indexes would not allow you to merge non-empty partitions and you need to have an extra empty partition to perform the merge after old data is purged.

Let’s look at the process in details. As the first step, I will create two tables that will share the partition function and scheme. We will populate them with some data and create the columnstore indexes on the tables.

create table dbo.Orders
(
	OrderDate datetime2(0) not null,
	OrderId int not null,
	Placeholder char(100),
)
on psOrders(OrderDate);

create table dbo.OrderLineItems
(
	OrderDate datetime2(0) not null,
	OrderId int not null,
	OrderLineItemId int not null,
	Placeholder char(100),
)
on psOrders(OrderDate);

-- Left-most and right-most are empty
;with N1(C) as (select 0 union all select 0) -- 2 rows
,N2(C) as (select 0 from N1 as t1 cross join N1 as t2) -- 4 rows
,N3(C) as (select 0 from N2 as t1 cross join N2 as t2) -- 16 rows
,N4(C) as (select 0 from N3 as t1 cross join N3 as t2) -- 256 rows
,N5(C) as (select 0 from N4 as t1 cross join N4 as t2) -- 65,536 rows
,Ids(Id) as (select row_number() over (order by (select null)) from N5)
insert into dbo.Orders(OrderDate, OrderId)
	select dateadd(day,Id % 390,'2016-11-01'), ID
	from Ids;

insert into dbo.OrderLineItems(OrderDate,OrderId, OrderLineItemId)
	select OrderDate, OrderId, OrderId
	from dbo.Orders;

create clustered columnstore index CCI_Orders on dbo.Orders
on psOrders(OrderDate);

create clustered columnstore index CCI_OrderLineItems on dbo.OrderLineItems
on psOrders(OrderDate);

The next listing returns data distribution and allocation information for dbo.Orders table.

select 
	t.object_id, i.index_id, i.name as [Index], p.partition_number,
	p.[Rows], p.data_compression_desc, 
	fg.name as [Filegroup],
	sum(a.total_pages) as TotalPages, 
	sum(a.used_pages) as UsedPages, 
	sum(a.data_pages) as DataPages,
	sum(a.total_pages) * 8 as TotalSpaceKB, 
	sum(a.used_pages) * 8 as UsedSpaceKB, 
	sum(a.data_pages) * 8 as DataSpaceKB
from 
	sys.tables t with (nolock) join sys.indexes i with (nolock) on
		t.object_id = i.object_id
	join sys.partitions p with (nolock) on 
		i.object_id = p.object_id AND i.index_id = p.index_id
	join sys.allocation_units a with (nolock) on 
		p.partition_id = a.container_id
	join sys.filegroups fg with (nolock) on 
		a.data_space_id = fg.data_space_id
where
	t.object_id = object_id(N'dbo.Orders')
group by
	t.object_id, i.index_id, i.name, p.partition_number,
	p.[Rows], p.data_compression_desc, fg.name
order by 
	i.index_id, p.partition_number;

Figure 1 illustrates the output of the listing. As you can see, data is distributed across 13 partitions with left-most and two right-most partitions empty.

01. Data Distribution And Allocation Information

Let’s try to split non-empty partition, just for the demo purposes

alter partition scheme psOrders next used [Primary];

alter partition function pfOrders()
split range ('2017-11-15');

As expected, it would not work with the error message shown below.

02.Split Non-Empty Partition

The merge of non-empty partition would also fail.

03.Merge Non-Empty Partition

Fortunately, we can split an empty partition. Again, let’s assume that we want to start purging process in December. I would insert a couple rows to December’s partitions making them non-empty.

insert into dbo.Orders(OrderDate, OrderId) values('2017-12-02',100000);
insert into dbo.OrderLineItems(OrderDate,OrderId, OrderLineItemId) values('2017-12-02',100000,100000);

Let’s create another empty partition by splitting partition function. The right-most partition is still empty and operation would succeed. As result, we will have separate empty partitions for January, 2018 and another partition that will store all data starting 2018-02-01. We will split the right-most empty partition again during our January’s 2018 purge process.

alter partition scheme psOrders next used [Primary];

alter partition function pfOrders()
split range ('2018-02-01');

The process of purging old data would consist of several steps. First, we need to create the staging tables to which we will switch old November 2016 partitions. Those tables need to be empty, have exactly the same schema with the main tables and reside on the same filegroup.

create table dbo.OrdersTmp
(
	OrderDate datetime2(0) not null,
	OrderId int not null,
	Placeholder char(100),
) on [Primary];

create clustered columnstore index CCI_OrdersTmp on dbo.OrdersTmp
on [Primary];

create table dbo.OrderLineItemsTmp
(
	OrderDate datetime2(0) not null,
	OrderId int not null,
	OrderLineItemId int not null,
	Placeholder char(100),
)
on [Primary];

create clustered columnstore index CCI_OrderLineItemsTmp on dbo.OrderLineItemsTmp
on [Primary];

After tables were created, we can switch November 2016 partitions there truncating staging tables afterwards. Remember that in the main tables, November 2016 is the second left-most partition.

alter table dbo.Orders switch partition 2 
to dbo.OrdersTmp;

alter table dbo.OrderLineItems switch partition 2 
to dbo.OrderLineItemsTmp;

truncate table dbo.OrdersTmp;
truncate table dbo.OrderLineItemsTmp;

If you look at dbo.Orders allocation information again, you would see that two left-most partitions are now empty as shown in Figure 4

04. Data Distribution And Allocation Information After Purge

Now they can be merged, which will complete our implementation.

alter partition function pfOrders()
merge range('2016-11-01');

As you can see, implementation of Sliding Window pattern with columnstore indexes is very similar to B-Tree tables. The only differences are:

You must have empty right-most partition pre-allocated to perform the split. I’d like to reiterate that even though it is not required with B-Tree indexes, such empty partition would reduce I/O overhead and table locking during split operation there.
You must have another empty left-most partition to perform the merge. This is not required nor needed with B-Tree indexes.

I’d like to thank Ned Otter (blog) for his suggestions and feedback. They convinced me to blog on the subject.

Source code is available for download.

Next (2017-11-29): Splitting and Merging Non-Empty Partitions in Columnstore Indexes