Collect Statistics in Teradata – Part 2

 Collect Statistics in Teradata – The Evaluation

After collecting on every combination considered necessary and helpful, you can check the result of the collected statistics on a table by looking at

  • the time it took to collect and the then prevailing circumstances
  • the collection results

If the collection took longer than expected under normal or even ideal conditions, keep that in mind when you come to maintenance and scheduling. Else, keep it simple and keep up the collect combination, especially if the table is tiny and all is just a matter of seconds.

If there are collect combinations whose result sets are very close, this is an indicator that some of them do not give any surplus information on the table content.

Look at this extract from a series of collect combinations:

14/03/10  09:01:35           462,452,442   L_ID,C_DT

14/03/10  09:02:48           462,455,048   L_ID,C_DT,CLASS

L_ID and C_DT amount to 462,452,442 distinct combinations under full sampling (of that later). Adding CLASS results in the almost the same set of combinations. So there must be one and only one CLASS for almost every combination of L_ID and C_DT. Collecting on the triple adds no value as we are almost down to every single row based on L_ID and C_DT alone. We can remove this combination from our collect combinations and so save some time and resources in the future.

There is still an advantage in such an “over-collection”: if the result comes as a surprise, it can be the one trigger that starts a data quality investigation earning you the fame of being the one who found it first!

See also:
Teradata Collect Statistics Using Sample
Collect Stats in Teradata
Teradata Statistics Basics
All you need to know about Teradata Statistics

Paul Timar
 

Paul Timar is a graduated Sociologist and Economist who turned Data Warehouse Specialist. He has more than 7 years of experience in working with Oracle and Teradata Data Warehouses in the fields of Social Security Data and Banking. His latest focus is on performance optimization.

>