Collect Statistics in Teradata – The Evaluation
After collecting every combination considered necessary and helpful, you can check the result of the collected statistics on a table by looking at
- the time it took to collect and the then prevailing circumstances
- the collection results
Consider the lengthier collection time when planning maintenance and scheduling, even within regular or optimal conditions. Simplify and maintain the collection method, particularly for smaller tables that require only a few seconds.
If there are collected combinations with similar result sets, some may not provide additional information about the table content.
Examine this excerpt from a sequence of collectible pairings:
14/03/10 09:01:35 462,452,442 L_ID,C_DT
14/03/10 09:02:48 462,455,048 L_ID,C_DT,CLASS
Under full sampling, 462,452,442 unique combinations of L_ID and C_DT (more on later). Adding CLASS results in nearly the same set of combinations, indicating that there is only one CLASS for almost every L_ID and C_DT variety. Including the triple does not add value, as L_ID and C_DT cover almost every row alone. Therefore, we can eliminate this combination from our collection and save time and resources in the future.
There is still an advantage in such an “over-collection”: if the result comes as a surprise, it can be the one trigger that starts a data quality investigation earning you the fame of finding it first!