There are two major problems, when using functions on join columns.
- The Teradata Optimizer cannot make use of statistics; String manipulation functions like SUBSTR(), TRIM(), the LIKE function, COALESCE() and CASE WHEN statements force the optimizer to use heuristics to make its estimations.In case you cannot avoid using a function, try at least different strategies, and figure out which function allows for the best execution planNevertheless, using functions in your joins, means that you have a bad physical data model. Either it’s technically outdated, inaccurate on the logical data model, or it has to serve two masters.
Ask yourself: Why there is no separate column that stores the essential information as you need it?
- Using functions on your join columns forces the optimizer to do a full table scan, slowing down your queries
Not each function is equal in its negative impact on performance. I experienced different results for applications of LIKE ‘a%’ versus SUBSTR(column,1,1) = ‘a’ .
You have to experiment a little bit with the various alternatives and chose the one that impairs performance the least. The best approach, though, is to avoid the usage of functions on any join column by design.