Teradata Physical Storage: Understanding The Technical Details For Performance Optimization

Performance optimization on Teradata depends on a solid understanding of how the system works internally. Many of us know the standard optimization techniques, but resolving complex performance issues often requires an in-depth understanding of how data is physically stored.

Until recently, data storage relied solely on hard disks. The trend is now shifting towards solid-state disks, although hard drives are expected to remain in use for a considerable time.

Teradata systems currently employ both solid-state disks and hard drives for data storage.

Teradata organizes disk storage into cylinders, which are divided into sectors. Data blocks occupy a range of sectors within a cylinder. This layout is the starting point of our journey into the system's inner workings.

Data records are stored in blocks allocated to specific cylinders on the AMP’s disk.

Teradata Physical Storage – How can a table row be accessed?

The Parsing Engine sends the AMPs requests that contain the internal TABLEID, which uniquely identifies a table in a Teradata system. To locate a data record on the disk, the AMP uses two indexes: the master index and the cylinder index.

The Teradata Master Index

The master index records which cylinders hold a table's rows, so the AMP does not have to read all cylinders to find a single data row, for example during primary index access. For each cylinder, the master index contains the lowest ROWID and the highest ROWHASH stored on it, which lets the AMP rule out every cylinder that cannot contain the requested row.

The master index maintains a sub-index of available cylinders for write operations such as inserts and updates.

The master index is cached in the FSG cache and readily accessible on every AMP for data retrieval.
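The lookup described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Teradata's internal implementation: the names `CylinderEntry` and `find_cylinder` are hypothetical, and a key is simplified to a `(table_id, row_hash)` pair, leaving out the uniqueness part of the ROWID.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Tuple

Key = Tuple[int, int]  # simplified key: (table_id, row_hash)


class CylinderEntry(NamedTuple):
    """Hypothetical master index entry for one cylinder."""
    cylinder: int   # cylinder number (illustrative)
    low_key: Key    # lowest key stored on the cylinder
    high_key: Key   # highest key stored on the cylinder


def find_cylinder(master_index: List[CylinderEntry],
                  table_id: int, row_hash: int) -> Optional[int]:
    """Return the cylinder that may hold the row, or None.

    Assumes master_index is sorted by low_key and that the key ranges
    of the cylinders do not overlap.
    """
    key = (table_id, row_hash)
    lows = [entry.low_key for entry in master_index]
    # Last entry whose lowest key is not greater than the searched key.
    i = bisect_right(lows, key) - 1
    if i < 0:
        return None
    entry = master_index[i]
    return entry.cylinder if key <= entry.high_key else None


master_index = [
    CylinderEntry(cylinder=7, low_key=(100, 0), high_key=(100, 4999)),
    CylinderEntry(cylinder=12, low_key=(100, 5000), high_key=(100, 9999)),
    CylinderEntry(cylinder=3, low_key=(200, 0), high_key=(200, 8000)),
]
print(find_cylinder(master_index, table_id=100, row_hash=6000))
```

Because only the lowest and highest key per cylinder are kept, the structure stays small and a single range comparison is enough to exclude a cylinder.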

The Teradata Cylinder Index

After querying the master index, the AMP knows which cylinders contain the table records. It then uses the cylinder index to pinpoint the sectors of the cylinder that hold the data blocks containing the required rows. To make this possible, the cylinder index stores the first and last sector of every data block of a table.

Because the cylinder index is larger than the master index, it may not fit completely in the FSG cache.

As in the master index, the lowest ROWID and the highest ROWHASH are stored, here for each data block, so that the AMP only accesses the sectors that can contain the required records. The cylinder index also maintains a sub-index of all free disk sectors.

Once the AMP determines the first and last sector of the data block containing the necessary data record(s), the block can be transferred to the FSG cache for performing select, update, insert, or delete operations.
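As a rough illustration of this step, the following sketch models a cylinder index as a list of block descriptors and returns the sector range of the data block that can hold a given row. The names `BlockDescriptor` and `locate_block` and all sector numbers are assumptions made for the example, and the ROWID is reduced to its row hash part.

```python
from typing import List, NamedTuple, Optional


class BlockDescriptor(NamedTuple):
    """Hypothetical cylinder index entry describing one data block."""
    table_id: int
    low_row_hash: int    # lowest row hash in the block (simplified ROWID)
    high_row_hash: int   # highest row hash in the block
    first_sector: int    # first sector occupied by the block
    last_sector: int     # last sector occupied by the block


def locate_block(cylinder_index: List[BlockDescriptor],
                 table_id: int, row_hash: int) -> Optional[range]:
    """Return the sectors to read for the matching data block, or None."""
    for block in cylinder_index:
        if (block.table_id == table_id
                and block.low_row_hash <= row_hash <= block.high_row_hash):
            # Only these sectors need to be moved into the cache.
            return range(block.first_sector, block.last_sector + 1)
    return None


cylinder_index = [
    BlockDescriptor(100, 5000, 6999, first_sector=0, last_sector=127),
    BlockDescriptor(100, 7000, 9999, first_sector=128, last_sector=200),
]
sectors = locate_block(cylinder_index, table_id=100, row_hash=7500)
print(sectors)
```

A linear scan keeps the sketch short. Since the descriptors are ordered by key, a real lookup could just as well use a binary search.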

The Binary Search

The AMP will locate the requested data record(s) by performing a binary search on the data block(s).

The binary search algorithm efficiently locates a row in a sorted array of row IDs by iteratively comparing the desired row ID value with the row ID value of the central pointer array element.

The elements of the pointer array point to the table rows. If the row ID of the central element matches the row ID being searched for, the row is read from the position that element points to. If the searched row ID is smaller than the central element's row ID, the algorithm continues to the left of the central element; if it is greater, it continues to the right. The algorithm repeats these steps until the row ID is found or there are no more elements to search, which means the row is not stored in the data block.
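The steps above can be written as a short Python function. This is a minimal sketch under simplifying assumptions: the pointer array is modeled as a list of `(row_id, position)` pairs sorted by row ID, a row ID is a `(row_hash, uniqueness)` tuple, and `find_row` is a hypothetical name. The actual in-block layout is not shown here.

```python
from typing import List, Optional, Sequence, Tuple

RowId = Tuple[int, int]  # (row_hash, uniqueness value)


def find_row(pointer_array: List[Tuple[RowId, int]],
             rows: Sequence[str], row_id: RowId) -> Optional[str]:
    """Binary search over a pointer array sorted by row ID.

    Each pointer array element holds a row ID and the position of the
    row inside the block. Returns the row, or None if it is not there.
    """
    lo, hi = 0, len(pointer_array) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        mid_row_id, position = pointer_array[mid]
        if mid_row_id == row_id:
            return rows[position]   # follow the pointer to the row
        if row_id < mid_row_id:
            hi = mid - 1            # continue in the left half
        else:
            lo = mid + 1            # continue in the right half
    return None                     # no elements left: row not in block


# Rows in physical order; only the pointer array is sorted by row ID.
rows = ["carol", "alice", "bob"]
pointer_array = [((10, 1), 1), ((20, 1), 2), ((20, 2), 0)]
print(find_row(pointer_array, rows, (20, 2)))
```

Each comparison halves the remaining search range, so a block with n rows needs at most about log2(n) + 1 comparisons. The rows themselves do not have to be physically sorted, because only the pointer array is kept in row ID order.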

In part two of “The Ultimate Teradata Physical Storage Guide” series, we’ll examine the data block structure, record storage within the blocks, and Teradata’s management of varying block size needs.

We will demonstrate Teradata’s data block maintenance techniques, which involve adding, reducing, and splitting blocks as necessary to reduce fragmentation and optimize system speed.
