Summary statistics are a handy feature, which is new in Teradata Release 14. Contrary to traditional statistics, which are collected on columns or indexes, summary statistics are table-level statistics:
COLLECT SUMMARY STATISTICS ON <TABLE>;
Table level statistics include the number of rows per table, the average data block sizes, and the average row sizes.
The number of rows per table is a critical measure for the Teradata Optimizer to estimate the cost of a full table scan. So, how was this information made available in previous releases of Teradata?
In earlier versions of Teradata, this was achieved by collecting column statistics on the dummy “PARTITION” and the Primary Index columns.
Summary statistics defined on a table will be refreshed each time the statistics on any column of the table are updated. If you want to avoid wasting resources, the best practice is to update all column statistics at once instead of column by column:
The excellent solution is this one:
COLLECT STATISTICS ON <TABLE>; — all column statistics are refreshed. Summary statistics are only updated once.
Although less intense on resources than traditional column/index statistics, this should still be avoided:
COLLECT STATISTICS ON <TABLE> <COLUMN_1>; — first collection of summary statistics
COLLECT STATISTICS ON <TABLE> <COLUMN_2>; — needless 2nd collection of summary statistics
Conclusion: Summary statistics collection is a handy feature to refresh the number of rows information per table quickly.