Release 14.00 introduces significant enhancements for Teradata statistics, which merit a detailed explanation.
Familiarizing yourself with the new syntax is crucial, as it is now the standard. This article will delve into the benefits of the updated syntax, which is accessible with Teradata 14.00.
COLLECT STATISTICS COLUMN (A), COLUMN(B), …, COLUMN(Z) ON TABLE_NAME;
The new syntax allows for grouping multiple collect statistics requests for a single table into a singular request. While the old syntax remains an option, it is not advised due to the notable enhancements in performance and resource utilization.
When utilizing the new syntax to group statistics collections on columns, the Teradata optimizer strives to optimize your request in order to minimize table scans. For instance:
COLLECT STATISTICS COLUMN(A,B), COLUMN(A), COLUMN(B);
In the example, the optimizer requires a single table scan to generate statistics for column A, column B, and columns (A, B). Technically, the statistics for (A, B) are initially computed, and subsequently, the spool created during this computation is reused to calculate the statistics for A and B.
While this method may not always be used, it can lead to significant enhancements in performance and resource utilization. It is important to note that utilizing the old syntax would result in three table scans. This performance and resource usage improvement can be perceived upon implementation or after the second refresh of the relevant statistics.
Rewriting your current statistical statements in the new syntax is highly recommended.