fbpx

The Teradata Hash Index Guide

By Roland Wenzlofsky

May 28, 2014


As it is valid for all Teradata index structures, the goal of the Teradata Hash Index is to minimize disk IOs by offering an alternative access path to the rows.

We can achieve this goal by using three main effects:

  1. The storage of a vertical subset (columns) of the table in the hash index structure
  2. The selection of a better Primary Index to avoid costly redistribution activities to join preparation.
  3. The Ordering of rows by value, to support range scans

The Teradata Hash Index can be designed to cover all selected columns or partially cover the columns.

In the first case, base table access can be skipped altogether, but if the Hash Index does not cover the selected columns, the base table rows can be obtained in a subsequent step.  Each index row carries the ROWID, which points to the related base table row (The ROWIDs from the Hash Index are extracted into a spool. In a subsequent step, the spool is used to access the base table rows).

The details above seem similar to the way Teradata Secondary Indexes are designed. But as Hash Indexes allow us to define their own Primary Index, rows can be distributed across the AMPs to fit your workload requirements. One can’t do this with Secondary Indexes.

Like a Non-Unique Secondary Index (NUSI), the rows of a hash index can be stored on the same AMP like the related base table row, simply by choosing the base table’s Primary Index the Primary Index of the hash index.

As valid for any index, Teradata Hash Indexes are maintained automatically by the system. There is overhead involved in UPDATE, INSERT and DELETE statements as the base table and the index structures have to be maintained.

Statistics are crucial and have to be collected at least on the Primary Index columns of the Hash Indexing to allow the Teradata Optimizer to estimate the Hash Index usage costs correctly.

Although the Teradata Hash Index can be viewed as a specialized form of a Single Table Join Index (STJI), some differences exist, which should be considered when choosing between the usage of a Single Table Join Index and the Hash index.

Maybe the most outstanding limitations compared with a Single Table Join Index are:

– A Hash Index cannot have a Partitioned Primary Index
– A Hash Index cannot have a Non-Unique Secondary Index.
– Hash Indexes cannot be specified for NOPI or column‑partitioned base tables as they are designed around the Teradata hashing algorithm.
– A hash index cannot be column partitioned
–  A hash index must have a Primary Index; a Single Table Join Index can be created with or without a primary index if a table is column-partitioned (as column stores on Teradata never have a Primary Index)

Here is an example for creating a Hash Ordered Hash Index:

CREATE HASH INDEX HASH_IDX (COL1, SimilaryCOL2) ON MYTABLE
BY (COL1) ORDER BY HASH

Similarly, you can create a Value Ordered Hash Index, which might be useful for retrieving rows containing ranges of values in the considered column(s):
CREATE HASH INDEX HASH_IDX (COL1, COL2) ON MYTABLE
BY (COL1) ORDER BY VALUE (COL2);
In both cases, the “by” clause defines the columns used for data distribution (like the Primary Index), but keep in mind that the columns used for data distribution have to be part of the columns which make up the Hash Index column list.
If we want to increase fault tolerance, FALLBACK protection can be used in the same way as it is used for base tables.
The second copy of each Hash Index row will be stored on a fallback AMP. In case the primary AMP is down, and you don’t have fallback protection, you cannot use the Hash Index anymore, and updates on the base table are not allowed until the AMP is coming online again.

Nevertheless, keep in mind that fallback protection doubles the space needed to store the Hash Index structure.

The conclusion is that Hash Indexes are restricted form a Single Table Join Indexes, and I do not assume a big difference in performance. As an advantage, the syntax for creating a Hash Index is more straightforward than the one used for a Single Table Join index.

More on indexing here: Teradata Indexing

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"rgb(255, 0, 0)"}},"gradients":[]}}]}__CONFIG_colors_palette__
GET OUR TERADATA BOOK

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>