Evaluating the Accuracy of Table Statistics: Why Collection Date May Not Be Enough

When assessing the accuracy of statistics in a table, individuals often rely solely on the collection date. However, this may not always provide enough information. For example, static lookup tables may retain accurate Teradata statistics even if they have not been recently updated. Conversely, some tables change frequently, with certain fields exhibiting broader value ranges while others remain unchanged with a limited number of codes.

To validate your table’s statistics, use a WHERE condition to execute a simple query on the desired column. For instance:

SELECT * FROM TABLE WHERE TEST_COLUMN = 'X';

Compare the estimated record count, visible in Teradata Administrator, with the returned record count. If the discrepancy exceeds a single-digit percentage relative to the table’s overall size, the statistics likely require updating.

Remember that the optimizer’s estimates may not always be exact to the row level. Nonetheless, the disparity between the expected and actual record count should not be substantial.

Related Services

⚡ Need Help Optimizing Your Data Platform?

We cut data platform costs by 30–60% without hardware changes. 25+ years of hands-on tuning experience.

Explore Our Services →

📋 Considering a Move From Teradata?

Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.

Free Migration Assessment →

📊 Data Platform Migration Survey

Help us map where the industry is heading. Results are public — see what others chose.

1. What is your current data platform?

2. Where are you migrating to (or evaluating)?

Migrating FROM
Migrating TO

Thanks for voting! Share this with your network.

Follow me on LinkedIn for daily insights on data warehousing and platform migrations.

Stay Ahead in Data Warehousing

Get expert insights on Teradata, Snowflake, BigQuery, Databricks, Microsoft Fabric, and modern data architecture — delivered to your inbox.

2 thoughts on “Evaluating the Accuracy of Table Statistics: Why Collection Date May Not Be Enough”

  1. Hi Alan,

    One example comes to my mind:

    In case the Teradata system configuration changes (for example, by adding nodes) and rows of a PPI Table are distributed to new partitions.

    This happens for example if you are using the HASHING functions of Teradata in the PARTITIONING definition of your table DDL:

    HASHAMP(HASHBUCKET(HASHROW(column)))

    As a bigger system has more AMPS, rows are distributed differently to the partitions and the statistics become stale and have to be recollected.

    This is an example, where the table content stayed the same but statistics became stale.

    Reply
  2. I was once advised to recollect statistics on all the static ‘lookup’ tables held in a database, on a monthly basis, ‘just in case’ the statistics needed to be refreshed…needless to say, I never saw a reason to do this as it never had any impact. Would there ever be a case where statistics become ‘stale’ and therefore ignored by the optimizer in such circumstances?

    Reply

Leave a Comment

DWHPro

Expert network for enterprise data platforms. Senior consultants, project teams built for your challenge — across Teradata, Snowflake, Databricks, and more.

📍Vienna, Austria & Jacksonville, Florida

Quick Links
Services Team Teradata Book Blog Contact Us
Connect
LinkedIn → [email protected]
Newsletter

Join 4,000+ data professionals.
Weekly insights on Teradata, Snowflake & data architecture.