Partitioning Basics – Part 3 – Switching Data

This post follows on from Partitioning Basics – Part 2

In this final part, I want to go through how partitions can be used to archive data from a table. Firstly a table to archive the data from the primary table needs to be created:-

CREATE TABLE dbo.[DemoPartitionedTable_Archive]
(DemoID			INT IDENTITY(1,1),
 SomeData		SYSNAME,
 CaptureDate	DATE,

 CONSTRAINT [PK_DemoPartitionedTable_Archive] PRIMARY KEY CLUSTERED 
	(DemoID ASC, CaptureDate ASC) 

) ON DEMO;

I haven’t created this table on the partition scheme but it could be done because, according to Microsoft:-

“When a table and its indexes are aligned, then SQL Server can move partitions in and out of partitioned tables more effectively, because all related data and indexes are divided with the same algorithm.”

Reference:- Partitioned Tables and Indexes in SQL Server 2005

So if the archive table was on the partition scheme it would be “Aligned”. This means that moving data in and out would be more efficient. The only reason I haven’t done this here is because this is a basic demo.

Before the switch is performed the data in the partitions needs to be checked:-

SELECT 
	t.name AS TableName, i.name AS IndexName, p.partition_number, 
	r.value AS BoundaryValue, p.rows
FROM 
	sys.tables AS t
INNER JOIN
	sys.indexes AS i ON t.object_id = i.object_id
INNER JOIN
	sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id 
INNER JOIN
	sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
INNER JOIN
	sys.partition_functions AS f ON s.function_id = f.function_id
LEFT OUTER JOIN 
	sys.partition_range_values AS r ON f.function_id = r.function_id AND r.boundary_id = p.partition_number
WHERE 
	t.name = 'DemoPartitionedTable'
AND 
	i.type <= 1
ORDER BY p.partition_number;

Here’s the data in the partitions:-

Partitions4

In order to switch the data from partition 1 in the primary table to the archive table, the following script needs to be run:-

ALTER TABLE dbo.[DemoPartitionedTable]
SWITCH PARTITION 1
TO dbo.[DemoPartitionedTable_Archive];
GO

Which makes the partitions look like this:-

Partitions6

Partition 1 has 0 rows and could be merged. The following script will merge the partition:-

ALTER PARTITION FUNCTION DemoPartitionFunction()
    MERGE RANGE ('2014-05-29');
GO

So now the partitions are:-

Partitions Final

The partition has had its data switched out to the archive table and then was merged into the above partition. Using the scripts in these three post will allow you to effectively manage partitioning in SQL Server.

I hope this series of posts has been helpful as an introduction into partitioning. Please let me know if you have any comments or questions.

Partitioning Basics – Part 2 – Splitting/Merging Partitions

This post follows on from the previous post Partitioning Basics – Part 1

Let’s have a look at the partitions setup in part 1, the following script will show the partition information:-

SELECT 
	t.name AS TableName, i.name AS IndexName, p.partition_number, 
	r.value AS BoundaryValue, p.rows
FROM 
	sys.tables AS t
INNER JOIN
	sys.indexes AS i ON t.object_id = i.object_id
INNER JOIN
	sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id 
INNER JOIN
	sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
INNER JOIN
	sys.partition_functions AS f ON s.function_id = f.function_id
LEFT OUTER JOIN 
	sys.partition_range_values AS r ON f.function_id = r.function_id AND r.boundary_id = p.partition_number
WHERE 
	t.name = 'DemoPartitionedTable'
AND 
	i.type <= 1
ORDER BY p.partition_number;

Partitions1

These partitions will hold data until the 11th of June, after which all records in the table will be put into the end partition (number 16). This means that the end partition will end up holding a lot of records. In order to prevent this, a new partition needs to be created.

To do this, firstly the partition scheme needs to be told which filegroup the next partition will use:-

ALTER PARTITION SCHEME DemoPartitionScheme
	NEXT USED [Demo];
GO

Then a SPLIT command is run against the partition function:-

ALTER PARTITION FUNCTION DemoPartitionFunction()
	SPLIT RANGE ('2014-06-12');
GO

And now when the data checking script is run, the results are:-

Partitions2

Partition 16 now has a boundary value of 2014-06-12 and a new partition (17) has been created.

But how can a partition be removed? There’s a really simple MERGE command that performs this operation:-

ALTER PARTITION FUNCTION DemoPartitionFunction()
	MERGE RANGE ('2014-05-28');
GO

And when the data checking script is run now, the results are:-

Partitions3

So the data in the partition with the boundary value 2014-05-28 has been merged into the partition with the boundary of 2014-05-29. The problem now is that the last partition has data from the partition that was just merged into it as well as the existing data that was already there. In order to prevent this from happening, the data from the partition that is about to be merged can be switched into another table before the MERGE command is executed.

I will cover this in the next part of this blog series.

Partitioning Basics – Part 1 – Creating Partitions

Partitioned tables can be a quick and efficient way to (amongst other things) archive data. In the next couple of posts I will run through the basics of partitioning. Starting with setting up a partitioned table and loading some data.

Let’s create a demo database:-

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE name = 'DEMO')
	BEGIN
		DROP DATABASE [Demo];
	END

CREATE DATABASE [Demo]
	ON PRIMARY 
(NAME     = 'Demo', 
 FILENAME = 'C:\SQLServer\Data\Demo.MDF'), 
	FILEGROUP [DEMO] DEFAULT 
(NAME     = 'Demo_Data', 
 FILENAME = 'C:\SQLServer\Data\Demo_Data.NDF')
	LOG ON 
(NAME    = 'Demo_Log', 
FILENAME = 'C:\SQLServer\Logs\Demo_log.ldf')
GO

OK, so the first thing to do is create a partition function. This function defines the number of partitions we will intiailly have. It also specifies the boundaries of each partition.

USE [Demo];
GO

IF NOT EXISTS(SELECT 1 FROM sys.partition_functions WHERE name = 'DemoPartitionFunction')
CREATE PARTITION FUNCTION DemoPartitionFunction (DATE)
AS RANGE RIGHT 
FOR VALUES (DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 7),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 6),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 5),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 4),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 3),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 2),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 1),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), 0),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -1),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -2),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -3),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -4),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -5),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -6),
			DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -7));
GO

So this statement has said that my partition function is going to be using dates as boundaries. What I’ve done is setup a bunch of partitions for one week in the past and one week in the future. The function sets the right “side” of the partition as the boundary, basically specifying which side of the boundary value the partition extends. The best way of thinking about it is like this…..

Today’s date is 2014-06-04. So the boundary values created by running the above script are:-

1. 2014-05-28
2. 2014-05-29
3. 2014-05-30
4. 2014-05-31
5. 2014-06-01
6. 2014-06-02
7. 2014-06-03
8. 2014-06-04
9. 2014-06-05
10. 2014-06-06
11. 2014-06-07
12. 2014-06-08
13. 2014-06-09
14. 2014-06-10
15. 2014-06-11

Say x is a value in the table then the partitions would evaluate as:-

1. 2014-05-28 > x
2. 2014-05-28 <= x < 2014-05-29
3. 2014-05-29 <= x < 2014-05-30
4. 2014-05-30 <= x < 2014-05-31
5. 2014-05-31 <= x < 2014-06-01
6. 2014-06-01 <= x < 2014-06-02
7. 2014-06-02 <= x < 2014-06-03
8. 2014-06-03 <= x < 2014-06-04
9. 2014-06-04 <= x < 2014-06-05
10. 2014-06-05 <= x < 2014-06-06
11. 2014-06-06 <= x < 2014-06-07
12. 2014-06-07 <= x < 2014-06-08
13. 2014-06-08 <= x < 2014-06-09
14. 2014-06-09 <= x < 2014-06-10
15. 2014-06-10 <= x < 2014-06-11
16. 2014-06-11 <= x

So in these partitions, the data in the partition is always less than the value of that partition’s boundary.

After the partition function, a partition scheme needs to be created:-

IF NOT EXISTS(SELECT 1 FROM sys.partition_schemes WHERE name = 'DemoPartitionScheme')
CREATE PARTITION SCHEME DemoPartitionScheme
AS PARTITION DemoPartitionFunction
ALL TO (DEMO);

The partition scheme maps the data to a partition function and specifies where the data is going to be stored. The [ALL TO (DEMO)] part of the statement says that all partitions are going to be mapped to the DEMO filegroup.

Now that the function and scheme have been created we can create a table on the partitions:-

CREATE TABLE dbo.[DemoPartitionedTable]
(DemoID			INT IDENTITY(1,1),
 SomeData		SYSNAME,
 CaptureDate	DATE,

 CONSTRAINT [PK_DemoPartitionedTable] PRIMARY KEY CLUSTERED 
	(DemoID ASC, CaptureDate ASC)

) ON DemoPartitionScheme(CaptureDate);

Instead of specifying that the table be placed on a filegroup, the statement has [ON DemoPartitionScheme(CaptureDate)] at the end. This means that the table is on the partition scheme, using the column CaptureDate to determine which rows go on which partition.

Let’s insert some data into the table:-

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -8));
GO 457

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -7));
GO 493

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -6));
GO 486

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo';,DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -5));
GO 413

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -4));
GO 473

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -3));
GO 461

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -2));
GO 422

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()), -1));
GO 461

INSERT INTO dbo.[DemoPartitionedTable]
(SomeData,CaptureDate)
VALUES
('Demo',DATEADD(dd, DATEDIFF(dd, 0, GETUTCDATE()),0));
GO 273

This will insert data into today’s partition and the partitions for the last 8 days. This can be checked by running the following script:-

SELECT 
	t.name AS TableName, i.name AS IndexName, p.partition_number, 
	r.value AS BoundaryValue, p.rows
FROM 
	sys.tables AS t
INNER JOIN
	sys.indexes AS i ON t.object_id = i.object_id
INNER JOIN
	sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id 
INNER JOIN
	sys.partition_schemes AS s ON i.data_space_id = s.data_space_id
INNER JOIN
	sys.partition_functions AS f ON s.function_id = f.function_id
LEFT OUTER JOIN 
	sys.partition_range_values AS r ON f.function_id = r.function_id AND r.boundary_id = p.partition_number
WHERE 
	t.name = 'DemoPartitionedTable'
AND 
	i.type <= 1
ORDER BY p.partition_number;

This will show each partition and the number of rows in it:-

Partitions1

Remember that the boundary values are the upper limit, no value in the partitions will match that value due to the partition function being specified as RANGE RIGHT.

I’ll continue this post later in the month, when I’ll talk about creating new partitions and merging partitions.