Teradata NUSI Selectivity and Data-Block Density
The goal of this article is to show how the number of rows per base-table data block impacts the selectivity threshold for Non-Unique Secondary Indexes (NUSI) in Teradata.
Understanding this correlation is critical when analyzing query plans and tuning indexing strategies.
The number of qualifying rows that make a NUSI efficient depends not only on data distribution and statistics, but also on how many rows fit into a single data block.
Example Setup
Our test table NUSI_SELECTIVITY
contains about 270 K rows.
The column PK
(used as the Primary Index) ensures even AMP distribution, while COL_A
is used for row qualification.
CREATE MULTISET TABLE INDEXING.NUSI_SELECTIVITY
(
PK INTEGER,
COL_A CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC
) PRIMARY INDEX ( PK );
SELECT COUNT(*) FROM INDEXING.NUSI_SELECTIVITY;
-- 269,378
We then created a Non-Unique Secondary Index (NUSI) on COL_A
:
CREATE INDEX (COL_A) ON INDEXING.NUSI_SELECTIVITY;
Each row uses 14 bytes (4 bytes for PK
, 10 bytes for COL_A
).
We populated the table as follows:
INSERT INTO INDEXING.NUSI_SELECTIVITY
SELECT
PK,
CASE WHEN PK <= 19.332 THEN 'A' ELSE 'B' END
FROM
INDEXING.SOME_TABLE;
To guarantee deterministic optimizer behavior, we collected statistics:
COLLECT STATS ON INDEXING.NUSI_SELECTIVITY COLUMN(COL_A);
Tip: Always collect statistics on indexed columns when analyzing NUSI thresholds; random AMP sampling can otherwise distort the plan.
When Teradata Uses the NUSI
We now query all rows with COL_A = 'A'
:
SELECT * FROM INDEXING.NUSI_SELECTIVITY WHERE COL_A = 'A';
At 19 332 qualifying rows (~7.1 %), the optimizer still uses the NUSI:
EXPLAIN SELECT * FROM INDEXING.NUSI_SELECTIVITY WHERE COL_A = 'A';
(Plan excerpt shows “by way of index #4”)
When we slightly increase the selection to 19 333 rows, the optimizer switches to a full-table scan (FTS) because it now estimates that scanning all base-table blocks is cheaper than following NUSI rowIDs.
DELETE FROM INDEXING.NUSI_SELECTIVITY;
INSERT INTO INDEXING.NUSI_SELECTIVITY
SELECT
PK,
CASE WHEN PK <= 19.333 THEN 'A' ELSE 'B' END
FROM
INDEXING.SOME_TABLE;
COLLECT STATS ON INDEXING.NUSI_SELECTIVITY COLUMN(COL_A);
EXPLAIN SELECT * FROM INDEXING.NUSI_SELECTIVITY WHERE COL_A = 'A';
Why the Optimizer Switches
Teradata reads and moves data blocks, not individual rows.
When each qualifying row resides in a different data block, NUSI access becomes as costly as scanning the table once sequentially.
Optimizer logic in short:
NUSI = beneficial only if fewer base-table blocks must be touched than during a single FTS.
With 19 332 rows, each data block likely contains several qualifying rows.
At 19 333 rows, the additional block I/O crosses the optimizer’s internal threshold.
Selectivity Calculation
The selectivity of this query is:
19 332 ÷ 269 378 = 7.1 %
If less than 7 % of the rows qualify, the NUSI is favored; beyond that, a full scan wins.
This value depends entirely on row length, compression, and data-block size — not a fixed 10 % rule as often quoted online.
Changing Row Size and Block Density
To test the effect of row size, we duplicated the table with a larger CHAR(20)
column:
CREATE MULTISET TABLE INDEXING.NUSI_SELECTIVITY_COPY, NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
PK INTEGER,
COL_A CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC
)
UNIQUE PRIMARY INDEX ( PK );
The longer rows reduce the number of rows per data block.
Repeating the test showed that 24 136 rows (~9 %) could still use the NUSI before the optimizer switched to FTS.
Observation:
Fewer rows per block → higher selectivity threshold → NUSI stays useful longer.
Changing the character set from LATIN to UNICODE (2 bytes per character) would have a similar effect.
Even schema design choices like data type or encoding can alter the optimizer’s access strategy.
The Covering-NUSI Surprise
Finally, we tested whether selecting only the indexed column (COL_A
)—making the NUSI covering—would change the decision:
EXPLAIN SELECT COL_A FROM INDEXING.NUSI_SELECTIVITY_COPY WHERE COL_A = 'A';
Surprisingly, even in this case, the optimizer still performed a full-table scan under Teradata 14.10.
Although documentation suggested that covering NUSIs are more likely to be used, this behavior didn’t occur in the tested release.
Later Vantage versions (16 and 17) are generally smarter at exploiting covering secondary indexes, but the same I/O cost model still applies — if most data blocks must be touched, FTS remains cheaper.
Key Takeaways for Teradata
- Teradata’s optimizer decides between NUSI and FTS based on data-block I/O, not row count.
- The number of rows per block (driven by row length and compression) shifts the threshold.
- Common rules such as “use NUSI below 10 % selectivity” are oversimplified.
- A small schema change can alter the optimizer’s decision path.
The Snowflake Perspective
Snowflake doesn’t have NUSI or user-defined secondary indexes.
Instead, it relies on micro-partition pruning and metadata filtering to achieve the same purpose.
- Each micro-partition stores min/max values and null counts for every column.
- When a
WHERE
clause is applied, Snowflake only scans micro-partitions whose value range overlaps the predicate. - Pruning efficiency depends on how well the data was loaded and clustered, not on row length or block density.
- Re-clustering can be triggered automatically using the Clustering Service to restore optimal pruning.
Equivalent concept:
Teradata’s NUSI selectivity ≈ Snowflake’s micro-partition pruning efficiency.
Migration Guidelines: Teradata → Snowflake
Concept | Teradata | Snowflake | Migration Tip |
---|---|---|---|
NUSI | Secondary access path via rowIDs | No secondary index; pruning via metadata | Drop NUSIs; rely on micro-partition metadata |
Selectivity Basis | Number of base-table data blocks | Range of micro-partition values | Load data in sorted order to improve pruning |
Impact of Row Size | Fewer rows per block = higher selectivity threshold | Irrelevant (columnar storage) | Data type changes affect storage, not pruning |
Covering Index | May reduce base-table lookup | Not applicable | Use result caching or materialized views |
Tuning Approach | Design indexes per workload | Optimize clustering keys and reclustering frequency | Use SYSTEM$CLUSTERING_INFORMATION to measure efficiency |
Key Takeaway:
While Teradata indexing decisions depend on block density and selectivity, Snowflake achieves the same goal automatically through micro-partition pruning.
Migrating removes the need for NUSI design — but shifts the focus to data-load order and clustering maintenance.
Summary
- Teradata NUSI performance depends on how many data blocks contain qualifying rows.
- Row length and block density directly influence optimizer selectivity thresholds.
- Snowflake eliminates NUSIs but introduces a new tuning dimension: micro-partition clustering.
- Understanding both helps translate physical design strategies from on-prem Teradata to the cloud.
Don’t forget to watch our video course on Teradata indexing techniques.
Also, check out:
The Teradata USI compendium
The NUSI compendium: Teradata’s Secondary Index
Thank you very much. It is a big honor to hear this from Tera-Tom himself 😉
I am a big fan of your Teradata books I remember, the first one I bought was “Tera-Tom on Database Administration” for V2R6 which helped me a lot to better understand the Teradata architecture.
Thanks again,
Roland
Roland,
Another brilliant article bringing to light important information to know when one wants to become a Teradata expert. Thank you again for another home run!
Tom Coffing