This is what I always teach in my performance Teradata SQL tuning workshops without ever disappointing my clients:
The most important optimization task is to give complete and correct statistics about tables and indexes (secondary indexes, join indexes, hash indexes, etc.).
Detecting missing statistics is easy. In the SQL Assistant executes the following statement:
DIAGNOSTIC HELPSTATS ON FOR SESSION;
Above statement turns on the optimizer’s statistic hints, and adds information about required statistics to the explain output.
At the end of the output, a list of missing statistics is shown. These are the statistics the optimizer could use when building the execution plan. Add Statistics one by one, and re-check the execution plan after each change.
Several methods exist to detect stale statistics. The easiest way is to split the SQL statement and to test each part separately, by comparing the estimated row count (as shown in the explain output) with the real number of records being returned by the query.
This approach is particularly suitable if the SQL statement cannot be executed in a reasonable time. Here is one example:
SELECT t01.* FROM table_1 t01 INNER JOIN table_2 t02 ON t01.PK = t02.PK
WHERE t01.column_1 = 1 AND t02.column_a = 2;
Above query can be split into two parts for testing:
SELECT * FROM table_1 WHERE column_1 = 1;
SELECT * FROM table_2 WHERE column_a = 2;
Execute and EXPLAIN both queries, and compare the number of returned rows with the number of rows the explain output delivers.
If there is a big difference, it could be related to stale statistics or similar problems with data distribution.