fbpx

UNION and UNION ALL

The Teradata UNION statement combines the result set of at least two queries; in addition, UNION removes row duplicates. Each of the select statements must pick the same number of columns, the columns must be selected in the same order, and the datatypes of the column of all select statements must be compatible (more about this later). Below is an example:

SELECT col1 FROM TheTable1 UNION SELECT col1 FROM TheTable2;
SELECT col1 FROM TheTable1 UNION ALL SELECT col1 FROM TheTable2; 

If the datatypes for the same column of different select statements don’t match exactly, the one of the very first select statement will define the datatype for all rows of the column; if needed, Teradata does an implicit casting:

CREATE TABLE Customer 
(
CustomerId INTEGER,
LastName CHAR(100)
) PRIMARY INDEX (CustomerId);
CREATE TABLE Customer_Hist
(
CustomerId INTEGER,
LastName CHAR(200)
) PRIMARY INDEX (CustomerId);
SELECT LastName FROM Customer 
UNION ALL 
SELECT LastName FROM Customer_Hist


The last names of longer than 100 characters are truncated from the Customer_Hist table in the example above. An error that is difficult to detect because the query runs without errors; incompatible data types lead to runtime errors.

Teradata UNION ALL vs. UNION Performance

teradata union

The performance of UNION ALL is always better than that of UNION.

Since Teradata UNION only returns distinct rows, all select statements’ rows must be sorted into a common spool. For this purpose, it might be necessary to redistribute the rows because the rows of all result sets must have the same primary index for the sort step. Both the redistribution of the rows and the subsequent sorting require many IOs for large tables. Therefore we only should use UNION if we need to remove row duplicates. If there are no row duplicates in the data, we should always avoid the additional sort (and eventually redistribution) step using UNION ALL.

Significance of the Character Set

We have already discussed how different data types can affect the UNION or UNION ALL result set. The same is true for the character set. If in the example shown below, the column “col1” of table “Table1” is defined as a character set LATIN, but column “col1” of table “Table2” is defined as a character set UNICODE, a runtime error will occur if table “Table2” contains UNICODE characters in column “col1”, because the character set of the first select statement defines the character set of the result set:

SELECT col1 FROM Table1 UNION SELECT col1 FROM Table2;

INTERSECT / INTERSECT ALL

teradata intersect

The INTERSECT operator returns as the result row duplicate of two select statements ( all rows identical in both select statements).

SELECT col1,col2 FROM TheTable1 
INSERTSECT 
SELECT col1,col2 FROM TheTable2;

Similar to Teradata UNION ALL, INTERSECT ALL returns duplicates without filtering them.

Similar to UNION ALL, INTERSECT ALL returns duplicates without filtering them. Again, it is recommended to use INTERSECT ALL instead of INTERSECT for better performance (the reasons are the same as for UNION ALL above).

MINUS / EXCEPT

teradata minus

The MINUS/EXCEPT operator returns a result set containing all rows of the first select statement that are not present in the second select statement; MINUS and EXCEPT are two different names for the same functionality:

SELECT col1,col2 FROM TheTable1 MINUS SELECT col1,col2 FROM TheTable2;
SELECT col1,col2 FROM TheTable1 EXCEPT SELECT col1,col2 FROM TheTable2; 

MINUS is often used to determine whether two tables are completely identical. For this, the MINUS operator must be executed two times, as shown below:

SELECT col1,col2 FROM TheTable1 MINUS SELECT col1,col2 FROM TheTable2;
SELECT col1,col2 FROM TheTable2 EXCEPT SELECT col1,col2 FROM TheTable1; 

We must also check whether both tables to be compared are populated. Otherwise, MINUS might provide equality of the tables, although it is not given.

Precedence of SET Operators

The set operators (UNION, UNION ALL, EXCEPT, MINUS) all have the same priority. When using SET operators, it is recommended that we always use parentheses to ensure the desired priority.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

  • How to connect for Teradata freelancing projects. pls help

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >