Applying column joins in Teradata using functions may indicate a flawed data model, potentially violating the first normal form.
SELECT A.column1, A.column2, B.column1, B.column2
FROM tableA A
JOIN tableB B
ON SUBSTRING(A.join_column, 1, 3) = B.join_column;
This article investigates the adverse impact on query performance that abnormalities can cause in Teradata join scenarios and suggests potential solutions to mitigate such consequences.
What is the effect of functions on the join performance?
Applying functions such as SUBSTR(), TRIM(), COALESCE(), CASE statements, and even the LIKE function to join columns can hinder the optimizer’s ability to use existing statistics. Consequently, the optimizer must estimate the number of rows based on heuristics, which may result in inefficient query execution.
What should I do if I cannot prevent the use of functions?
While fixing the underlying data model is the optimal solution for Teradata joins, alternative methods can help reduce the negative impact of using functions on join columns. One approach is creating volatile tables and normalizing columns before joining the operation. However, remember that these techniques are workarounds for an inadequate data model.
What other disadvantages do functions on join columns have?
Most functions compel the optimizer to perform a full table scan, which can be resource-intensive and time-consuming.
Are there functions that have no negative impact on performance?
Certain functions, such as the LIKE function with the ‘%’ pattern matching parameter at the end, allow the optimizer to use histograms for statistics.
SELECT * FROM Customer WHERE LastName LIKE ‘Wen%’;
Nevertheless, various methods can be employed to enhance implementation plans, such as using “LIKE ‘a%'” instead of “SUBSTR(column,1,1) = ‘a'”.
An inadequate data model mainly causes the problems addressed in this article. Attention to the data model should be the top priority since temporary solutions are seldom effective in achieving long-term success. Allocating resources for data model enhancement will advance performance and lessen complexities in Teradata joins.