The Teradata Summary Collect Statistics Feature
Summary statistics are a very useful feature, which is new in Teradata Release 14. Contrary to traditional statistics, which are collected on columns or indexes, summary statistics are table level statistics:
COLLECT SUMMARY STATISTICS ON <TABLE>;
Table level statistics are the information about the number of rows per table, the average data block sizes, and the average row sizes.
The number of rows per table is a critical measure for the Teradata Optimizer to estimate the cost of a full table scan. So, how was this information made available in previous releases of Teradata?
In earlier versions of Teradata, this was achieved by collecting column statistics on the dummy “PARTITION” and the Primary Index columns.
Summary statistics which are defined on a table will be refreshed each time the statistics on any column of the table are updated. If you want to avoid the waste of resources, the best practice is to update all column statistics at once instead of doing it column by column:
The excellent solution is this one:
COLLECT STATISTICS ON <TABLE>; — all column statistics are refreshed, summary statistics are only updated once.
Although less intense on resources than traditional column/index statistics, this should still be avoided:
COLLECT STATISTICS ON <TABLE> <COLUMN_1>; — first collection of summary statistics
COLLECT STATISTICS ON <TABLE> <COLUMN_2>; — needless 2nd collection of summary statistics
Conclusion: Summary statistics collection is a very useful feature to refresh the number of rows information per table quickly.