Teradata Hash Index: Minimizing Disk IOs and Optimizing Rows Access

Roland Wenzlofsky

April 23, 2023

minutes reading time

The Teradata Hash Index aims to reduce disk IOs by providing an alternative access path to rows, as with all Teradata index structures.

Our objective can be accomplished through the utilization of three primary impacts:

  1. The storage of a vertical subset (columns) of the table in the hash index structure
  2. Select a better Primary Index to avoid costly redistribution activities to join preparation.
  3. The Ordering of rows by value to support range scans

The Teradata Hash Index can cover either all or a portion of the selected columns.

If the Teradata Hash Index does not cover the selected columns, the corresponding base table rows can be retrieved in a subsequent step. The ROWIDs, which point to the base table rows, are carried by each index row. The ROWIDs extracted from the Hash Index are placed into a spool and then utilized to access the base table rows. Base table access can be skipped in the first case.

The aforementioned particulars bear a resemblance to the design of Teradata Secondary Indexes. However, Teradata Hash Indexes permit the creation of a personalized Primary Index, thereby facilitating the distribution of rows among the AMPs in accordance with the demands of one’s workload. Secondary Indexes do not afford this flexibility.

Teradata Hash Index Design

A Teradata hash index can store rows on the same AMP as the base table row through its Primary Index, which is identical to the Primary Index of the hash index. This is akin to a Non-Unique Secondary Index (NUSI).

The system automatically maintains the Teradata Hash Indexes, as with any index. However, updating, inserting, or deleting data incurs overhead, as the base table and index structures require maintenance.

Accurate statistics are essential for the Primary Index columns of Hash Indexing to enable the Teradata Optimizer to estimate Hash Index usage costs accurately.

When deciding between a Single Table Join Index and a Hash Index, it is important to note that although the Hash Index can be seen as a variant of the STJI, there are distinct dissimilarities.

The most significant constraints when compared to a Single Table Join Index include:

  • A Hash Index cannot have a Partitioned Primary Index
  • A Hash Index cannot have a Non-Unique Secondary Index.
  • Hash Indexes cannot be specified for NOPI or column‑partitioned base tables as they are designed around the Teradata hashing algorithm.
  • A hash index cannot be column partitioned
  • A hash index must have a Primary Index; a Single Table Join Index can be created with or without a primary index if a table is column-partitioned (as column stores on Teradata never have a Primary Index)

This is an example of creating a Hash Ordered Index:


The “by” clause defines the columns utilized for data distribution, similar to the Primary Index. However, it is important to note that the columns used for data distribution must be included in the Hash Index column list. To improve fault tolerance, we can utilize FALLBACK protection in the same manner as it is used for base tables.

The Hash Index row’s second copy will be saved on a backup AMP. Without fallback protection, Hash Index use and updates on the base table are impossible when the primary AMP is unavailable.

However, it is important to note that implementing fallback protection requires twice the storage capacity for the Hash Index structure.

In conclusion, Hash Indexes can be used as Single Table Join Indexes, but their performance is not significantly different. Creating a Hash Index is also easier due to its simpler syntax than a Single Table Join Index.

Please refer to the official Teradata documentation by following this link:

The Teradata Hash Index

More on indexing here: Teradata Indexing

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like