When assessing the accuracy of statistics in a table, individuals often rely solely on the collection date. However, this may not always provide enough information. For example, static lookup tables may retain accurate Teradata statistics even if they have not been recently updated. Conversely, some tables change frequently, with certain fields exhibiting broader value ranges while others remain unchanged with a limited number of codes.
To validate your table’s statistics, use a WHERE condition to execute a simple query on the desired column. For instance:
SELECT * FROM TABLE WHERE TEST_COLUMN = 'X';
Compare the estimated record count, visible in Teradata Administrator, with the returned record count. If the discrepancy exceeds a single-digit percentage relative to the table’s overall size, the statistics likely require updating.
Remember that the optimizer’s estimates may not always be exact to the row level. Nonetheless, the disparity between the expected and actual record count should not be substantial.
Related Services
⚡ Need Help Optimizing Your Data Platform?
We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.
Explore Our Services →📋 Considering a Move From Teradata?
Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.
Free Migration Assessment →
Hi Alan,
One example comes to my mind:
In case the Teradata system configuration changes (for example, by adding nodes) and rows of a PPI Table are distributed to new partitions.
This happens for example if you are using the HASHING functions of Teradata in the PARTITIONING definition of your table DDL:
HASHAMP(HASHBUCKET(HASHROW(column)))
As a bigger system has more AMPS, rows are distributed differently to the partitions and the statistics become stale and have to be recollected.
This is an example, where the table content stayed the same but statistics became stale.
I was once advised to recollect statistics on all the static ‘lookup’ tables held in a database, on a monthly basis, ‘just in case’ the statistics needed to be refreshed…needless to say, I never saw a reason to do this as it never had any impact. Would there ever be a case where statistics become ‘stale’ and therefore ignored by the optimizer in such circumstances?