Numerous online articles discuss Teradata Hashing, which evenly distributes physical data rows across all AMPs and sorts them based on ROWID.
Limited information is available regarding value orders. The official Teradata documentation vaguely states that it allows for optimized range scans of values. Technical descriptions typically conclude with an example of using WHERE conditions on different dates, which is insufficient for a comprehensive comprehension of this topic.
It is time to clarify the value order.
In principle, we have two possibilities for using value order:
In the past, only the non-distinct secondary index permitted value order. However, Teradata 12’s release made value ordering available for join indexes.
How Teradata Value Ordering is implemented
If you’ve been following our physical design posts, you already understand the process of data retrieval on a Teradata system.
- The Parsing Engine calculates a hash value (the ROWHASH) over the rows’ Primary Index column(s).
- The so-called hashmap is consulted to identify the responsible AMP for each row.
- The AMP checks its Master Index to determine the cylinder carrying the row(s)
- The AMPÂ checks the Cylinder Index to determine the disk sector(s), taking the row(s) (a sequence of sectors that makes up a data block)
Primary Index hashing utilizes the Master and Cylinder Index to locate the ROWHASH. These indices provide information on the cylinder and sectors where the rows are stored. Additionally, ROWID-sorted row pointers are maintained within data blocks.
Value order is the process by which the Master Index and Cylinder Index contain information about the location of a 32-bit value (as opposed to the ROWHASH). In addition, the data block has a row pointer array that is sorted based on this 32-bit value.
The “date” data type, which is stored as a 32-bit integer, is the only acceptable format for value order. This is because the basic hashing technique relies on the same physical structures (such as the Master Index and Cylinder Index dictionaries and the row pointer array) and has not undergone any design alterations.
I have some critical remarks to make. Firstly, it is important to note that value ordering is exclusively an All-AMP operation. Secondly, while the Join Index permits the definition of any Primary Index, it is solely utilized for data distribution. Thirdly, the NUSI option is inherently designed to necessitate a complete table scan.
A Join Index’s value ordering offers the same benefits as a single partition PPI table while using less space.
Indexing for value order requires All-AMP access and a subsequent search of a 32-bit value.
The strength of the value ordering method is determined by the sequence of the 32-bit integer column in each physical structure, namely the Master Index, Cylinder Index, and Data Block.
Learn more about indexing by visiting Teradata Indexing.