For indexed columns (PI, USI, NUSI) without statistics, Teradata calculates them at runtime. However, these statistics only contain a few key figures, such as the table cardinality and distinct values. This process is called dynamic AMP sampling or earlier also dynamic AMP sampling. Dynamic AMP samples are stored in the table header of each AMPs

Read More

When Teradata introduced several new features with version 14.10, one of them, based on the so-called object use counts (OUC), caught our attention. This feature can significantly improve our query performance by optimizing the process of calculating extrapolated statistics. Up to version 13.10 of Teradata, the changes made by DML statements are not logged; the

Read More

Collecting statistics is one of the most important tools available to us for performance tuning. However, we have to think about many things because collecting statistics is always connected to resource consumption. The main considerations we need to make are: On which columns should statistics be collected? What kind of statistics should be collected (full

Read More

Introduction to Teradata Sample Statistics In this article, I will show you when, how, and why you can/should use Teradata Sample Statistics. When using sample statistics, the columns must have as many different values as possible. For example, a UPI fulfills this requirement, but the NUPI can also be used if it has as many

Read More

Heuristics are applied whenever there are no collected statistics available for a nonindexed column. Heuristics are rules of thumb to estimate the number of selected data rows. They are used in WHERE condition predicates and to join planning. Heuristics estimate the number of selected data rows to be a certain percentage of the table cardinality.

Read More

Biased value histogram entries store exact cardinalities. Statistics that are up to date should reflect the number of rows returned in the retrieving step of a biased value. In different words: The Optimizer can quickly determine the number of rows being retrieved by looking up the histogram’s biases value. Unfortunately,  this is not always the

Read More

In this article, we will analyze a statistics problem in detail. Step by step we will figure out why estimations for our example query are wrong and how we can fix them. The first situation is the following: We are running an explain on below query, and get the following execution plan: SELECT * FROM TheDatabase.TMP_STATS WHERE

Read More

Teradata Statistics Histograms – A Short Introduction Most of us are familiar with the optimizer’s confidence levels for statistics. Today I was surprised by the fact that “high confidence” doesn’t mean automatically that we will have a correct or almost correct estimation (given that the collected statistics are not stale). Although I still hope my observations could

Read More

The Optimizer usually is doing a good job when it comes to the usage of statistics. Nevertheless, sometimes it pays off to take a closer look at the execution plan and the Optimizer’s cardinality estimations. Since Teradata 14.10, I got used to including the SHOW STATISTICS statement into my considerations, as the output metrics can

Read More

Today I attended an excellent presentation about the Teradata statistics improvements presented by Thomas Mechtler, a very experienced Senior Consultant at Teradata Austria. I carved out some of the major points I wanted to share with you in this article. In the past, Teradata statistics maintenance was as tricky as it is crucial, but this changed a

Read More