Optimizing Join And Aggregation With Teradata UNION ALL Views

Teradata UNION ALL Introduction

Teradata introduced a new Optimizer feature for managing sets of rows combined by UNION ALL in views and derived tables.

In earlier versions of Teradata, rows combined with Teradata UNION ALL were consolidated into a shared spool before further operations like joining another table or performing an aggregation step. Consider the following example of a join operation:

REPLACE VIEW TestView AS
(
SELECT PK, COL1 FROM TheTableA
UNION ALL
SELECT PK, COL1 FROM TheTableB
) ;

SELECT *
FROM
TestView T01
INNER JOIN
TestTable T02
ON
T01.PK = T02.PK
WHERE
T01.COL2 = 100;

The Teradata 15.10 Optimizer creates a common spool and performs a join operation with the “TestTable”.

The common UNION ALL spool will not reduce rows during construction. All join preparation steps, including redistribution on join columns, must be performed using the spool created from the combination of all UNION ALL branches.

The Optimizer in Teradata 16 enables the division of a join into several segments and offers an additional choice.

Each Teradata UNION ALL branch can be joined separately to the “TestTable”. The final outcome of these join processes is stored in a final spool.

To illustrate the equivalent queries, the Teradata 16 Optimizer effectively executes:

SELECT * FROM TheTableA T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK
WHERE T01.COL2 = 100;

SELECT * FROM TheTableB T01
INNER JOIN
TestTable T02
ON T01.PK = T02.PK
WHERE T01.COL2 = 100;

This option can save resources by effectively decreasing the selected row count in every join.

The potential for decreased resource consumption rises as tables become larger and are merged using UNION ALL before being joined with an additional table.

The Teradata 16 Optimizer can apply aggregation to each branch of UNION ALL separately and then execute global aggregation in a subsequent step.

Before Teradata 16, aggregation occurred once all UNION ALL branches had been consolidated into a shared spool.

The primary benefit lies in counting the rows of a view or derived table that includes multiple branches merged with UNION ALL. Before Teradata 16, a common spool was created first, which required a full row count.

Starting with Teradata 14.10, the Optimizer has implemented a uniform optimization method for individual tables. This entails computing the row count on the Cylinder Index for each table linked by a UNION ALL operation and subsequently summing up the results. This technique remains in use in Teradata 16.

The Optimizer can utilize the aforementioned techniques to enhance performance by reducing spool sizes by pushing aggregation or joining into each UNION ALL branch.

The Optimizer heavily relies on statistics to determine the feasibility of implementing these methods.

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

Optimizing Join and Aggregation with Teradata UNION ALL Views

Teradata UNION ALL Introduction

📊 Data Platform Migration Survey

Stay Ahead in Data Warehousing

Leave a Comment Cancel reply