Joins as a Main Target for Teradata Tuning
Joining tables is one of the most expensive operations of an SQL statement. This is because each join requires the rows to be joined on a common AMP.
If the two tables do not have the same primary index, it is necessary to copy the rows of one or both tables. This can cause a huge number of IOs in large tables.
For each join, the Teradata Optimizer can choose between different types of joins, each suitable for a specific initial scenario. The main goal of the Teradata Optimizer is to minimize resource consumption (IOs, CPU consumption, etc.) for the entire execution plan.
General Join Optimization Considerations
The optimizer makes use of available statistics in order to determine the optimal join plan.
The optimizer‘s cost estimations for each join are based on the following factors: The expected cardinality of the output spool (after the join) and the expected cardinality of the input spools after WHERE conditions have been applied.
The selected column lists of each input spool and the average row size. Usually, only the selected columns of each input spool are taken over into the resulting spool.
The only exception is if a permanent table is joined directly without spooling.
In such a case all columns of this table are taken over into the resulting spool.
Based on the above observations we can draw the following conclusions regarding performance:
The important statistics for each join are the ones on the join columns and on any available WHERE condition.
Whenever possible we should apply WHERE conditions in our queries in order to reduce the cardinality of the input tables.
further, we should only select the columns required in the result set and avoid „SELECT * FROM“ queries.