This is what I always teach in my performance Teradata SQL tuning workshops without ever disappointing my clients:
The most important optimization task is to give complete and correct statistics about tables and indexes (secondary indexes, join indexes, hash indexes, etc.).
Detecting missing statistics is easy. In the SQL Assistant executes the following statement:
DIAGNOSTIC HELPSTATS ON FOR SESSION;
Above statement turns on the optimizer’s statistic hints, and adds information about required statistics to the explain output.
A list of missing statistics is displayed at the end of the output. These are the statistics that the optimizer could use when creating the execution plan. Add the statistics one after the other and check the execution plan again after each change.
There are several ways to find outdated statistics. The easiest way is to split the SQL statement and to test each part separately, by comparing the estimated row count (as shown in the explain output) with the real number of records being returned by the query.
This approach is particularly suitable if the SQL statement cannot be executed in a reasonable time. Here is one example:
SELECT t01.* FROM table_1 t01 INNER JOIN table_2 t02 ON t01.PK = t02.PK
WHERE t01.column_1 = 1 AND t02.column_a = 2;
Above query can be split into two parts for testing:
SELECT * FROM table_1 WHERE column_1 = 1;
SELECT * FROM table_2 WHERE column_a = 2;
Execute and EXPLAIN both queries, and compare the number of rows returned with the number of rows returned by the EXPLAIN output.
If there is a big difference, it could be obsolete statistics or similar data distribution problems.