fbpx

There are several articles on the internet about Teradata Hashing and how physical data rows are evenly distributed across all AMPs and sorted by ROWID.

There is almost no information about the value ordering available. The only on-hand details I know is the official Teradata documentation giving a vague statement like “value order is a way of allowing for optimized range scans of values.” The majority of the technical descriptions terminate this topic with an example, showing how to use WHERE  conditions on a range of dates. This is not enough for a firm understanding of this subject.

I think it is time to shed some light on value ordering.

In principle, we have two possibilities to use value order:

Historically only the non-unique secondary index allowed for value order. Starting with the release of Teradata 12, value ordering for join indexes was introduced.

How Teradata Value Ordering is implemented

Those of you reading our physical design posts will know how data retrieval works on a Teradata system:

  1. The Parsing Engine calculates a hash value (the ROWHASH) over the Primary Index column(s) of the rows.
  2. The so-called hashmap is consulted to identify the responsible AMP for each row.
  3. The AMP checks its Master Index to determine the cylinder carrying the row(s)
  4. The AMP checks the Cylinder Index to determine the disk sector(s) taking the row(s) (a sequence of sectors makes up a data block)

Usually, in the case of Primary Index hashing, the Master Index and the Cylinder Index carry information about where each searched ROWHASH can be found (i.e., on which cylinder and in which sectors the rows are stored). Furthermore, row pointers within a data block are maintained, sorted by ROWID.

Value ordering means that Master Index and Cylinder Index store information about where a 32-bit value (not the ROWHASH) is located. Furthermore, the data block contains a row pointer array sorted by this 32-bit value(s).

Value ordering is exclusively allowed for 32-bit integers (which includes the “date” data type, stored internally as a 32-bit integer). The reason is that the physical structures (row pointer array, Master Index dictionary, Cylinder Index dictionary) used by the fundamental hashing technique have been reused without any design changes.

Let me make some critical remarks here: Value ordering is always an All-AMP operation. Although the Join Index allows us to define any Primary Index, it is only used for data distribution in this case. The second option, the NUSI, anyway requires always a full table scan by design.

Value ordering of a Join Index gives the same advantages one would have from a single partition PPI table but with less space usage overhead.

Value order index access is always All-AMP access with a subsequent search of a 32-bit value.

I hope it is clear now why the value ordering method’s strength lies in the “order” of the 32-bit integer column in each of the physical structures (Master Index, Cylinder Index, Data Block).

More on indexing here: Teradata Indexing

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>