Teradata – Why is SELECT * considered harmful?
Did you know that the columns selected in an SQL select statement can impact query performance?
The following example will prove above statement.
The example query is doing an INNER JOIN between two tables:
The primary indexes of both tables are not matching the join columns, but this is a must to join 2 tables together. The rows of one or both tables have to be relocated (duplicated or redistributed by the row hash of the join columns).
In our example we added a NUSI to the join column (REF_PK) of the second table (TheTable2).
The first query selects all columns from both tables:
Both tables are redistributed by the hash code of the join column . Both tables are full table scannen. The retrieve step estimations are considered being with ” high confidence”.
The next plan matches the rows from both tables using a hash join with 10 partitions.
Improving The Teradata Column List
The second query selects only one column:
There are several improvements in the execution plan of the second query :
Below you can see the resulting reduction in resource usage: