Collect Statistics in Teradata – Evaluation

Paul Timar

March 12, 2014

minutes reading time

Collect Statistics in Teradata – The Evaluation

After collecting every combination considered necessary and helpful, you can check the result of the collected statistics on a table by looking at

  • the time it took to collect and the then prevailing circumstances
  • the collection results

If the collection took longer than expected under normal or even ideal conditions, keep that in mind when you come to maintenance and scheduling. Keep it simple and keep up the collect combination, especially if the table is tiny and all is just a matter of seconds.

If there are collect combinations whose result sets are very close, some do not give any surplus information on the table content.

Look at this extract from a series of collect combinations:

14/03/10  09:01:35           462,452,442   L_ID,C_DT

14/03/10  09:02:48           462,455,048   L_ID,C_DT,CLASS

L_ID and C_DT amount to 462,452,442 distinct combinations under full sampling (of that later). We are adding CLASS results in almost the same set of combinations. So there must be one and only one CLASS for nearly every variety of L_ID and C_DT. Collecting on the triple adds no value as we are almost down to every row based on L_ID and C_DT alone. We can remove this combination from our collect combinations and save some time and resources in the future.

There is still an advantage in such an “over-collection”: if the result comes as a surprise, it can be the one trigger that starts a data quality investigation earning you the fame of finding it first!

See also:
Teradata Collect Statistics Using Sample
Collect Stats in Teradata
Teradata Statistics Basics
All you need to know about Teradata Statistics

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like