Efficiently Adding New Partitions to a Row Partitioned Teradata Table: a Comparison of Alter Table vs Insert Select

Roland Wenzlofsky

April 21, 2023

minutes reading time

This article explores the various methods for adding partitions to a Row Partitioned Table and determines the most efficient approach. Previously, we conducted a similar study to analyze the resource impact of modifying the structure of an unpartitioned table. During that study, we evaluated the effectiveness of INSERT…SELECT versus ALTER TABLE. For more information, please refer to our previous report.

Teradata ALTER Table vs. INSERT INTO: Which Method is Efficient?

Partitioning can be accomplished in two ways: by duplicating data into a new table or by utilizing an ALTER TABLE statement. Before proceeding, assessing the necessary resources for both methods is important. To do this, we will generate a test table with row partitions and populate it with 100,000 records.

As is customary, I will substantiate my assertions with tests. The test configuration employs a table comprising approximately 100,000 rows.

The date column in our test table spans from 1900-01-01 to 2022-12-31, resulting in an empty NO RANGE partition.

CREATE MULTISET TABLE PPI_Test
(
PK INTEGER NOT NULL,
COL_A DECIMAL(18,2),
TheDate DATE FORMAT 'YYYY-MM-DD'
) PRIMARY INDEX (PK) PARTITION BY (RANGE_N(TheDate BETWEEN DATE'1900-01-01' AND DATE'2022-12-31' EACH INTERVAL '1' DAY, NO RANGE));

We add daily partitions for the year 2023 using ALTER TABLE statement:

ALTER TABLE PPI_Test MODIFY PRIMARY INDEX ADD RANGE BETWEEN DATE '2023-01-01' AND DATE '2023-12-31' EACH INTERVAL '1' DAY;

Creating new partitions is a streamlined process as they do not require any row redistribution. The sole exception would be if there were an abundance of rows in the NO RANGE partition that would need to be redistributed to the newly created day partitions. However, this is not applicable in this scenario. As a result, the ALTER TABLE command incurs only 110 IOs and necessitates no spool space.

Now, let’s tackle the identical task with INSERT…SELECT statements:

CREATE MULTISET TABLE PPI_Test_Copy
(
PK INTEGER NOT NULL,
COL_A DECIMAL(18,2),
TheDate DATE FORMAT 'YYYY-MM-DD'
) PRIMARY INDEX (PK) PARTITION BY (RANGE_N(TheDate BETWEEN DATE'1900-01-01' AND DATE'2023-12-31' EACH INTERVAL '1' DAY, NO RANGE));

INSERT INTO PPI_Test_Copy 
SELECT * FROM PPI_Test;

Upon revisiting the QueryLog, we examine the pivotal figures. The utilization of the INSERT…SELECT statement necessitates 228 IOs and compels the whole PPI_Test table to be deposited in the spool.

What occurs if rows need to be redistributed from the NO RANGE partition to new partitions? Prior to appending new partitions for 2023, we inserted numerous rows into our table with a date between January 1st and December 31st of 2023 for testing purposes. The same issue arises when a DROP RANGE is performed, necessitating the movement of rows to the NO RANGE partitions.

The logical input/output operations in our test configuration rose to 952, signifying an 865% increase compared to the test configuration that lacked modifications for the NO RANGE partition. Additionally, this procedure uses some spool space, although typically not as much as the INSERT INTO…SELECT method.

Here are the summarized results of our 3 test setups:

Method of adding partitions to PPI table	Logical IOs
ALTER TABLE without NO RANGE rows	110
ALTER TABLE with NO RANGE rows	952
INSERT INTO … SELECT into table copy	248

Summary

The ALTER TABLE method is generally more efficient. Still, it’s important to note that the number of IOs can significantly increase if there is a pre-populated NO RANGE partition and many rows need to be moved.

Efficiency is not the only factor to consider in a production environment. When using ALTER TABLE, an exclusive lock is placed on the table, which cannot be interrupted. Therefore, it is crucial to prevent significant data redistribution resulting from filled NO RANGE partitions, as it could potentially immobilize the entire Teradata system.

It is not recommended to rely solely on ALTER TABLE as the most efficient method without considering various factors affecting resource usage. Prior to repartitioning a significant table, it is advisable to conduct comprehensive testing.

Take a brief survey and receive the book for free Get instant access

Share0

Tweet0

Share0

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

Thanks Roland.
Is there anything like automatic partition roll out in Teradata/latest versions of Teradata?
I mean, the partitioning extends by itself for upcoming Years/months/dates etc?

Efficiently Adding New Partitions to a Row Partitioned Teradata Table: a Comparison of Alter Table vs Insert Select

Summary

Roland Wenzlofsky

Fast multi-file export of Teradata query results using only Teradata SQL Assistant

The Teradata AMP Worker Task

Boost Your Teradata Performance – The Critical Role of NOT NULL Declarations

Optimizing Teradata SQL Queries by Avoiding Full Table Scans and Utilizing Secondary Indexes