How the Number of Rows per Data Block Affects Teradata NUSI Selectivity: A Case Study

Roland Wenzlofsky

May 5, 2023

minutes reading time

Teradata NUSI Selectivity and Data-Block Density

The goal of this article is to show how the number of rows per base-table data block impacts the selectivity threshold for Non-Unique Secondary Indexes (NUSI) in Teradata.

Understanding this correlation is critical when analyzing query plans and tuning indexing strategies.
The number of qualifying rows that make a NUSI efficient depends not only on data distribution and statistics, but also on how many rows fit into a single data block.

Example Setup

Our test table NUSI_SELECTIVITY contains about 270 K rows.
The column PK (used as the Primary Index) ensures even AMP distribution, while COL_A is used for row qualification.

CREATE MULTISET TABLE INDEXING.NUSI_SELECTIVITY
(
 PK INTEGER,
 COL_A CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC
) PRIMARY INDEX ( PK );

SELECT COUNT(*) FROM INDEXING.NUSI_SELECTIVITY;
-- 269,378

We then created a Non-Unique Secondary Index (NUSI) on COL_A:

CREATE INDEX (COL_A) ON INDEXING.NUSI_SELECTIVITY;

Each row uses 14 bytes (4 bytes for PK, 10 bytes for COL_A).
We populated the table as follows:

INSERT INTO INDEXING.NUSI_SELECTIVITY
SELECT 
  PK,
  CASE WHEN PK <= 19.332 THEN 'A' ELSE 'B' END
FROM 
  INDEXING.SOME_TABLE;

To guarantee deterministic optimizer behavior, we collected statistics:

COLLECT STATS ON INDEXING.NUSI_SELECTIVITY COLUMN(COL_A);

Tip: Always collect statistics on indexed columns when analyzing NUSI thresholds; random AMP sampling can otherwise distort the plan.

When Teradata Uses the NUSI

We now query all rows with COL_A = 'A':

SELECT * FROM INDEXING.NUSI_SELECTIVITY  WHERE COL_A = 'A';

At 19 332 qualifying rows (~7.1 %), the optimizer still uses the NUSI:

EXPLAIN SELECT * FROM INDEXING.NUSI_SELECTIVITY WHERE COL_A = 'A';

(Plan excerpt shows “by way of index #4”)

When we slightly increase the selection to 19 333 rows, the optimizer switches to a full-table scan (FTS) because it now estimates that scanning all base-table blocks is cheaper than following NUSI rowIDs.

DELETE FROM INDEXING.NUSI_SELECTIVITY;
INSERT INTO INDEXING.NUSI_SELECTIVITY
SELECT 
  PK,
  CASE WHEN PK <= 19.333 THEN 'A' ELSE 'B' END
FROM 
  INDEXING.SOME_TABLE;

COLLECT STATS ON INDEXING.NUSI_SELECTIVITY COLUMN(COL_A);

EXPLAIN SELECT * FROM INDEXING.NUSI_SELECTIVITY WHERE COL_A = 'A';

Why the Optimizer Switches

Teradata reads and moves data blocks, not individual rows.
When each qualifying row resides in a different data block, NUSI access becomes as costly as scanning the table once sequentially.

Optimizer logic in short:
NUSI = beneficial only if fewer base-table blocks must be touched than during a single FTS.

With 19 332 rows, each data block likely contains several qualifying rows.
At 19 333 rows, the additional block I/O crosses the optimizer’s internal threshold.

Selectivity Calculation

The selectivity of this query is:

19 332  ÷  269 378  =  7.1 %

If less than 7 % of the rows qualify, the NUSI is favored; beyond that, a full scan wins.
This value depends entirely on row length, compression, and data-block size — not a fixed 10 % rule as often quoted online.

Changing Row Size and Block Density

To test the effect of row size, we duplicated the table with a larger CHAR(20) column:

CREATE MULTISET TABLE INDEXING.NUSI_SELECTIVITY_COPY, NO FALLBACK ,
 NO BEFORE JOURNAL,
 NO AFTER JOURNAL,
 CHECKSUM = DEFAULT,
 DEFAULT MERGEBLOCKRATIO
 (
  PK INTEGER,
  COL_A CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC
 )
UNIQUE PRIMARY INDEX ( PK );

The longer rows reduce the number of rows per data block.
Repeating the test showed that 24 136 rows (~9 %) could still use the NUSI before the optimizer switched to FTS.

Observation:
Fewer rows per block → higher selectivity threshold → NUSI stays useful longer.

Changing the character set from LATIN to UNICODE (2 bytes per character) would have a similar effect.
Even schema design choices like data type or encoding can alter the optimizer’s access strategy.

The Covering-NUSI Surprise

Finally, we tested whether selecting only the indexed column (COL_A)—making the NUSI covering—would change the decision:

EXPLAIN SELECT COL_A FROM INDEXING.NUSI_SELECTIVITY_COPY WHERE COL_A = 'A';

Surprisingly, even in this case, the optimizer still performed a full-table scan under Teradata 14.10.

Although documentation suggested that covering NUSIs are more likely to be used, this behavior didn’t occur in the tested release.
Later Vantage versions (16 and 17) are generally smarter at exploiting covering secondary indexes, but the same I/O cost model still applies — if most data blocks must be touched, FTS remains cheaper.

Key Takeaways for Teradata

Teradata’s optimizer decides between NUSI and FTS based on data-block I/O, not row count.
The number of rows per block (driven by row length and compression) shifts the threshold.
Common rules such as “use NUSI below 10 % selectivity” are oversimplified.
A small schema change can alter the optimizer’s decision path.

The Snowflake Perspective

Snowflake doesn’t have NUSI or user-defined secondary indexes.
Instead, it relies on micro-partition pruning and metadata filtering to achieve the same purpose.

Each micro-partition stores min/max values and null counts for every column.
When a WHERE clause is applied, Snowflake only scans micro-partitions whose value range overlaps the predicate.
Pruning efficiency depends on how well the data was loaded and clustered, not on row length or block density.
Re-clustering can be triggered automatically using the Clustering Service to restore optimal pruning.

Equivalent concept:
Teradata’s NUSI selectivity ≈ Snowflake’s micro-partition pruning efficiency.

Migration Guidelines: Teradata → Snowflake

Concept	Teradata	Snowflake	Migration Tip
NUSI	Secondary access path via rowIDs	No secondary index; pruning via metadata	Drop NUSIs; rely on micro-partition metadata
Selectivity Basis	Number of base-table data blocks	Range of micro-partition values	Load data in sorted order to improve pruning
Impact of Row Size	Fewer rows per block = higher selectivity threshold	Irrelevant (columnar storage)	Data type changes affect storage, not pruning
Covering Index	May reduce base-table lookup	Not applicable	Use result caching or materialized views
Tuning Approach	Design indexes per workload	Optimize clustering keys and reclustering frequency	Use `SYSTEM$CLUSTERING_INFORMATION` to measure efficiency

Key Takeaway:
While Teradata indexing decisions depend on block density and selectivity, Snowflake achieves the same goal automatically through micro-partition pruning.
Migrating removes the need for NUSI design — but shifts the focus to data-load order and clustering maintenance.

Summary

Teradata NUSI performance depends on how many data blocks contain qualifying rows.
Row length and block density directly influence optimizer selectivity thresholds.
Snowflake eliminates NUSIs but introduces a new tuning dimension: micro-partition clustering.
Understanding both helps translate physical design strategies from on-prem Teradata to the cloud.

Don’t forget to watch our video course on Teradata indexing techniques.

Also, check out:
The Teradata USI compendium
The NUSI compendium: Teradata’s Secondary Index

Featured Image without Sidebar

Take a brief survey and receive the book for free Get instant access

Previous Article

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

Thank you very much. It is a big honor to hear this from Tera-Tom himself 😉

I am a big fan of your Teradata books I remember, the first one I bought was “Tera-Tom on Database Administration” for V2R6 which helped me a lot to better understand the Teradata architecture.

Thanks again,

Roland

Roland,

Another brilliant article bringing to light important information to know when one wants to become a Teradata expert. Thank you again for another home run!

Tom Coffing

You might also like

Understanding Skew in Teradata and Snowflake

Why the Same UPDATE Means Something Entirely Different in Teradata and Snowflake

Teradata Hybrid: Bridge or Destination?

Teradata vs. Snowflake: Why GROUP BY Performance Differs at Scale