Updated 2015-07-07Teradata Comparison Operators and Performance are interrelated to some extent.
Choosing the proper comparison operator can make the difference between a well performing and a poor performing query.
But why should the selected comparison operators have an impact on performance?
Statistics are the most important input to the Teradata Optimizer, and they have a huge quality impact on the execution plan and how aggressive or conservative it will be.
Teradata stores statistics on column level. While it is not possible to collect statistics on a more detailed level, this does not mean that column statistics can’t be used for a substring of a column. Teradata 14.10 has much more possibilities to use available statistics than earlier versions.
Her is one example:
Imagine a SQL statement with a WHERE condition on a specified column. Let’s assume a code stored in a character column named “ORIGIN_CD,” and we would like to retrieve all rows where the column starts with an ‘A.’
You have several options to implement this SQL statement; each one may result in a different execution plan with different access path, runtime and resource usage characteristics:
SELECT * FROM <TABLE> WHERE ORIGIN_CD LIKE ‘A%’;
SELECT * FROM <TABLE> WHERE SUBSTR(ORIGIN_CD,1,1) = ‘A’;
Both examples apply a function on top of the column “ORIGIN_CD.” Depending on your version of Teradata, these functions may be preventing the Optimizer from using column statistics. Avoid any function on a WHERE or JOIN condition column. Functions applied on join columns point to a bad design or a defunct data model.
If you can’t avoid to apply functions on WHERE and JOIN conditions, try different solutions as they might bring a significant gain in performance.