The Teradata UNION ALL statement takes all rows from the intermediate results and puts them together into the result set. The UNION statement additionally removes duplicate rows before returning the result set.
UNION and UNION ALL
Both statements, the Teradata UNION and the Teradata UNION ALL, combine the result sets of two or more SQL statements:
FROM TheTable1SELECT col1 FROM TheTable1 UNION ALL SELECT col1 FROM TheTable2; UNION SELECT col1 FROM TheTable2 ;
All statements combined with UNION have to return the same number of columns, and the data types of all columns across the participating select statements have to match.
If they don't match, the data types of the very first SQL select statement will be the relevant ones and not match columns of the remaining statements are implicitly cast to the same data type the very first select statement has.
Keep this in mind, especially if your column is a character data type, as this could cause hidden truncation of text columns – a problem that is tough to discover.
UNION ALL vs. UNION Performance
UNION ALL always gives better performance than UNION.
For UNION, The AMPs must sort the resulting rows of the spools of all queries in order to have and recognize duplicates side by side.
Sorting is expensive and makes the difference in performance between UNION and UNION ALL.
Only use UNION if you need to remove row duplicates. If there are no row duplicates in the data, always avoid the sort step by using UNION ALL.
Impact of Character Set
With the UNION operator, it should also be noted that the character set also has an influence on the query. Let me give you an example:
FROM TheTable1 UNION SELECT col1 FROM TheTable2 ;
If the table “Table2” in column “col1” contains characters that are not contained in the column “col1” of table “Table1”, this may cause an error.
The risk is high if the table “Table1” is CHARACTER SET LATIN and the table “Table2” is CHARACTER SET UNICODE.
If the table “Table2” contains UNICODE characters, this leads to an error, since the CHARACTER SET of the table “Table1” determines the character set of the total result.
INTERSECT / INTERSECT ALL
The Intersect operator returns as Ergenis row duplicate of two subqueries, that is, all rows that are identical in both subqueries.
FROM TheTable1 INSERTSECT SELECT col1,col2 FROM TheTable2 ;
Similar to UNION ALL, INTERSECT ALL returns row duplicates.
For performance reasons, INTERSECT should only be used if duplicates must be removed. INTERSECT ALL should be the preferred operator
MINUS / EXCEPT
The MINUS/EXCEPT operator returns as result all rows of the first query which are not present in the second query:
FROM TheTable1SELECT col1,col2 FROM TheTable1 EXCEPT SELECT col1,col2 FROM TheTable2; MINUS SELECT col1,col2 FROM TheTable2 ;
It doesn't matter if you use MINUS or EXCEPT, both operators return the same result.
MINUS can be used to determine whether 2 tables are identical. For this the MINUS operator must be executed in both directions:
FROM TheTable1SELECT col1,col2 FROM TheTable2 EXCEPT SELECT col1,col2 FROM TheTable1; MINUS SELECT col1,col2 FROM TheTable2 ;
In addition, you must first check whether both tables to be compared are actually populated, otherwise MINUS provides equality of the tables although this is not given.
Precedence of SET Operators
The remaining set operators (UNION, UNION ALL, EXCEPT, MINUS) all have the same priority. When using SET operators, it is therefore recommended that you always use parentheses to ensure the desired priority.