Teradata 16 – UNION ALL
Starting with Teradata 16, the Optimizer has an additional option to handle sets of rows combined with “UNION ALL” in views and derived tables.
Up to Version 15.10, the sets of rows combined with “UNION ALL” are permanently moved into one common spool before any other operation occurs. Such an operation can be a join to another table or an aggregation step. Here is an example of a join operation:
REPLACE VIEW TestView AS
(
SELECT PK, COL1 FROM TheTableA
UNION ALL
SELECT PK, COL1 FROM TheTableB
) ;
SELECT *
FROM
TestView T01
INNER JOIN
TestTable T02
ON
T01.PK = T02.PK
WHERE
T01.COL2 = 100;
The Optimizer of Teradata 15.10 (and below) will, in the first step, create a common spool and apply the join step between this spool and the table “TestTable” in a subsequent step.
No reduction of rows will take place at the time when the common spool of the “UNION ALL” branches is built. Any join preparation step, such as a redistribution on the join columns, must be done with the spool built from all branches combined with “UNION ALL”.
The Optimizer of Teradata 16 has an additional option in which the join can be split into several parts:
Each branch of the “UNION ALL” can be joined independently (in our example to table “TestTable”). The result of all join steps is moved into a final spool.
Just for presentation purposes, the query would look like this:
SELECT * FROMTheTableA T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK;
WHERE T01.COL2 = 100;
SELECT * FROMTheTableB T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK;
WHERE T01.COL2 = 100;
This new option may use significantly fewer resources if each join can effectively reduce the number of selected rows.
The possible reduction in resource usage increases with the size of the tables, which are combined with “UNION ALL” and joined with another table.
Similarly, the Optimizer of Teradata 16 can apply aggregation steps on each “UNION ALL” branch separately. In a subsequent step, the global aggregation is done.
Up to Teradata 15.10, the aggregation occurred after all branches of the “UNION ALL” were located in a common spool.
I think the biggest benefit is if we count the number of rows in a view or derived table consisting of several branches combined with “UNION ALL”. Up to Teradata 15.10, a common spool was built, and the rows had to be counted.
Starting with Teradata 16, the Optimizer has used the same optimization technique for single tables since Teradata 14.10: It simply counts the rows on the Cylinder Index for each of the “UNION ALL” combined tables and sums up the result.
The Optimizer can use all the above-shown techniques to improve performance because by pushing the aggregation or joining into each branch of the “UNION ALL”, spool sizes can be reduced.
As always, the Optimizer heavily relies on statistics to decide whether these new techniques should be used.