The Idea Behind This Trick for Teradata UNION ALL
What if you need to apply a UNION ALL operation to distinct columns within a single table? Typically, the process would involve:
SELECT CloseDate FROM Customer
UNION ALL
SELECT OpenDate FROM Customer
;
This method has a drawback:
To query the Customer table with just one full table scan and no need for UNION ALL, I sought out an alternative approach that produces an identical output.
I have uncovered a potential solution that may be of use:
SELECT CASE WHEN t02.NBR = 1 THEN t01.CloseDate WHEN t02.NBR = 2 THEN t01.OpenDate ELSE NULL END FROM Customer t01 CROSS JOIN ( SELECT x AS NBR FROM (SELECT 1 AS x) x UNION ALL SELECT x AS NBR FROM (SELECT 2 AS x) x ) t02 WHERE t02.NBR IN (1,2) ;
This query only requires one complete table scan and significantly reduces IOs, particularly for massive tables.
NumResultRows | TotalIOCount | AMPCpuTime |
13464826170,00 | 5521013,00 | 14177,29 |
13464826170,00 | 10146110,00 | 14460,84 |
The IOs have significantly decreased from 10146110.00 to 5521013.00, nearly a 50% reduction.
The Same Trick for UNION
DISTINCT can simulate UNION through column selection.
SELECT DISTINCT CASE WHEN t02.NBR = 1 THEN t01.CloseDate WHEN t02.NBR = 2 THEN t01.OpenDate ELSE NULL END FROM Customer t01 CROSS JOIN ( SELECT x AS NBR FROM (SELECT 1 AS x) x UNION ALL SELECT x AS NBR FROM (SELECT 2 AS x) x ) t02 WHERE t02.NBR IN (1,2) ;
The following query utilizes the UNION feature:
SELECT CloseDate FROM Customer
UNION
SELECT OpenDate FROM Customer
;
I cannot provide specific figures as the UNION variant terminated due to a “no more spool space” error, likely caused by the query’s skewness.
The outcome is expected to be comparable.
Additionally, refer to: