How Teradata 16 may improve the UNION ALL Performance

0
582

Teradata 16Teradata 16 – UNION ALL

Starting with Teradata 16, the optimizer has an additional option to handle sets of rows which are combined with “UNION ALL” in views and derived tables.

Up to Version 15.10, the sets of rows which are combined with “UNION ALL” are always moved into one common spool before any other operation takes place. Such an operation can be a join to another table or an aggregation step. Here is an example of a join operation:

REPLACE VIEW TestView AS
(
SELECT PK, COL1 FROM TheTableA
UNION ALL
SELECT PK, COL1 FROM TheTableB
) ;

SELECT *
FROM
TestView T01
INNER JOIN
TestTable T02
ON
T01.PK = T02.PK
WHERE
T01.COL2 = 100;

The Optimizer of Teradata 15.10 (and below) will in a first step create a common spool, and apply the join step between this spool and the table “TestTable” in a subsequent step.

No reduction of rows will take place at the time when the common spool of the “UNION ALL” branches is built. Any join preparation step, such as a redistribution on the join columns has to be done with the spool built from all branches which are combined with “UNION ALL”.

The Optimizer of Teradata 16 has an additional option, in which the Join can be split into several parts:

Each branch of the “UNION ALL” can be joined independently (in our example to table “TestTable”) The result of all join steps is moved into a final spool.

Just for presentation purposes, the query would look like this:

SELECT * FROMTheTableA T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK;
WHERE T01.COL2 = 100;

SELECT * FROMTheTableB T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK;
WHERE T01.COL2 = 100;

This new option may use significantly fewer resources if each join can effectively reduce the number of selected rows.

Obviously, the possible reduction in resource usage increases with the size of the tables which are combined with “UNION ALL” and joined with another table.

Similarly, the Optimizer of Teradata 16 can now apply aggregation steps on each “UNION ALL” branch separately. In a subsequent step, the global aggregation is done.

Up to Teradata 15.10, the aggregation took place after all branches of the “UNION ALL” have been located into a common spool.

I think the biggest benefit is given if we count the number of rows in a view or derived table which consists of several branches which are combined with “UNION ALL”. Up to Teradata 15.10, a common spool was built and the rows had to be counted, one by one.

Starting with Teradata 16, the Optimizer uses the same optimization technique which exists for single tables since Teradata 14.10: It simply counts the rows on the Cylinder Index for each of the “UNION ALL” combined tables and sums up the result.

All above shown techniques can be used by the Optimizer to improve performance based on the fact that by pushing the aggregation or join into each branch of the “UNION ALL”, spool sizes can be reduced.

As always, the Optimizer heavily relies on statistics in order to decide if these new techniques should be used.

Our Reader Score
[Total: 9    Average: 4.9/5]
How Teradata 16 may improve the UNION ALL Performance written by Roland Wenzlofsky on September 30, 2017 average rating 4.9/5 - 9 user ratings

LEAVE A REPLY

Please enter your comment!
Please enter your name here