The Teradata Shared Nothing Architecture


The Teradata Architecture

All database systems work in a similar way: data rows travel from storage to main memory.

Data rows can only be processed by the CPU when they are in main memory. Data on disk by itself is useless!

Storage, memory, and CPU form one unit.

Teradata has a shared-nothing architecture, consisting of many of these units.

CPU, main memory, and storage are the physical components. Teradata executes many processes on each physical CPU to increase the degree of parallelism. These processes are the so-called AMPs (Access Module Processors).

AMPs are the workers doing the most important job: They are moving data between the disk and the main memory.

A typical Teradata system executes hundreds of AMPs at the same time, all of them doing their share of work in parallel with all other AMPs.

Each AMP owns two memory types (Teradata Intelligent Memory and the FSG cache). These two kinds of memory are used by the AMP to intelligently cache data from the persistent storage owned by that AMP.

Both types of memory are exclusive to each AMP; no other AMP can use the assigned memory. The AMP reads data rows from disk into memory, possibly changes them, and writes them back.

The Teradata Intelligent Memory contains the most frequently used data, while the FSG cache contains the data used by active requests.

But not only memory is assigned exclusively to an AMP. Each AMP owns a particular logical storage unit and is the only one that can read from and write to it.

There is one exception to the rule: in the case of an AMP failure, a backup AMP can take over that AMP's work until the failed AMP is functional again.

The virtual storage system manages the storage of each AMP. While in earlier times only hard disks were available for storage, current Teradata systems additionally offer solid state drives. The virtual storage system is the interface between an AMP's logical storage unit and the physical storage device.

As modern Teradata systems often use a mix of solid state disks and hard drives, Teradata Virtual Storage also monitors the usage of data and moves frequently used data to faster storage devices and less frequently used data to the slower devices.
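This temperature-based placement can be sketched in a few lines of Python. It is only a toy illustration of the idea, not Teradata's actual algorithm: the cylinder names, the access counts, and the fixed `HOT_THRESHOLD` cutoff are all invented for the example.

```python
from collections import Counter

# Hypothetical access statistics per storage cylinder (made-up numbers)
access_counts = Counter({"cylinder_A": 950, "cylinder_B": 12, "cylinder_C": 480})

HOT_THRESHOLD = 100  # invented cutoff; the real system ranks data continuously

# Frequently used ("hot") data goes to SSD, rarely used ("cold") data to HDD
placement = {cyl: ("ssd" if hits >= HOT_THRESHOLD else "hdd")
             for cyl, hits in access_counts.items()}
```

The real system works with finer-grained temperature statistics and migrates data gradually in the background, but the net effect is the same: hot data ends up on fast devices.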

A network, the BYNET, connects all AMPs. It allows the Parsing Engine to communicate with the AMPs (giving them instructions), and it allows the AMPs to communicate with each other.

The Parsing Engine is responsible for checking the syntax of our request, ensuring that the required permissions are available, and creating the so-called execution plan (we will talk about execution plans in the next lesson).

Furthermore, the optimizer is part of the parsing engine software. We will talk a lot about the optimizer in the upcoming lectures, as it makes a lot of use of statistics.

AMPs do all the updates, deletes, and inserts on a table (and, as a side note, AMPs even handle some special tasks, such as character set conversions).

Teradata people like talking about AMPs, but usually they mean the unit of AMP, memory, and disk.

Teradata distributes the data rows of a table across all AMPs (more precisely: across the disks of all AMPs).

Distribution is done to share the workload and is a defining characteristic of any parallel database system. I repeat, as it's an important concept: the hashing algorithm spreads the rows of each table across all AMPs, or in other words, each AMP holds rows of each table.
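The distribution scheme can be simulated in Python. Note the assumptions: Teradata's hashing algorithm is proprietary, so MD5 stands in for it here, the four-AMP system is a toy (real systems run hundreds of AMPs), and the customer ids are invented example data.

```python
import hashlib

N_AMPS = 4  # toy system; real configurations run hundreds of AMPs

def row_hash(primary_index_value):
    """Stand-in for Teradata's proprietary hashing algorithm."""
    digest = hashlib.md5(str(primary_index_value).encode()).hexdigest()
    return int(digest, 16)

def target_amp(primary_index_value):
    """Map a row hash to one of the AMPs (heavily simplified hash map)."""
    return row_hash(primary_index_value) % N_AMPS

# Distribute the rows of a table across all AMPs by their primary index value
amps = {amp: [] for amp in range(N_AMPS)}
for customer_id in range(1, 101):
    amps[target_amp(customer_id)].append(customer_id)
```

After the loop, every AMP holds a share of the table's 100 rows, which is exactly the property the hashing algorithm is meant to deliver.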

The Primary Index

The primary index is a mechanism defining the location of each data row; it may consist of a single column or multiple columns.

The values of the primary index columns may be unique or non-unique.

Don’t confuse the primary index with the primary key of a table. The primary key is a concept of logical data modeling; the primary index defines the AMP (and therefore the logical storage device) for each data row.

While the primary key is used to uniquely identify each object stored in a table, the main idea behind the primary index choice is to use the parallel system in the most efficient way.

Each data row has to pass the hashing algorithm. The hashing algorithm calculates the so-called row hash from the primary index columns.

The input order of the primary index columns doesn’t matter:

f(a,b) = f(b,a)
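How Teradata achieves this order independence internally is not documented here, but the property itself is easy to sketch: a hash becomes commutative in its inputs if the column values are brought into a canonical order before hashing. MD5 again stands in for the real algorithm.

```python
import hashlib

def row_hash(*pi_columns):
    # Toy commutative hash: sorting the column values first makes the
    # result independent of input order, mirroring f(a,b) = f(b,a)
    key = "|".join(sorted(str(c) for c in pi_columns))
    return int(hashlib.md5(key.encode()).hexdigest(), 16)
```

With this construction, `row_hash("2024-01-01", 4711)` and `row_hash(4711, "2024-01-01")` yield the same value.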

Data rows are distributed based on the calculated row hash.

The so-called hash map assigns each row hash to one particular AMP.
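The hash map can be pictured as a large lookup table of hash buckets, each bucket owned by exactly one AMP. The sketch below is a simplification: older Teradata systems used 2^16 hash buckets (newer releases use more), and the round-robin bucket assignment here merely stands in for the map the system actually maintains.

```python
N_AMPS = 4
N_BUCKETS = 65536  # older Teradata systems used 2**16 hash buckets

# Toy hash map: every bucket is owned by exactly one AMP (round-robin here;
# the real map is maintained by the system and rebuilt when AMPs are added)
hash_map = [bucket % N_AMPS for bucket in range(N_BUCKETS)]

def amp_for(row_hash):
    bucket = (row_hash >> 16) & 0xFFFF  # high-order bits select the bucket
    return hash_map[bucket]
```

Because the map is static for a given configuration, the same row hash always resolves to the same AMP.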

When data rows are stored (inserted, updated, or deleted) or retrieved, it is the owning AMP that does this work.

The same input values always map to the same row hash value (as long as the system configuration is unchanged).

Sometimes different primary index values map to the same row hash value; this is called a “hash collision.”
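A collision is easiest to see with a deliberately weak toy hash. The real 32-bit row hash collides far more rarely, but the principle is the same, which is why the AMP must still compare the actual primary index values when retrieving a row.

```python
def weak_row_hash(value):
    # Deliberately weak hash (sum of character codes) to make collisions
    # visible; this is NOT how Teradata hashes rows
    return sum(ord(c) for c in str(value)) % 1000

# "ab" and "ba" are different values, yet they produce the same hash:
collision = weak_row_hash("ab") == weak_row_hash("ba")
```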

Hash collisions can negatively impact performance.

The idea of hashing is to achieve an even distribution of data rows across all AMPs.

In the best case, each AMP has the same amount of data rows, as this ensures that the parallel system works in the most efficient way.
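How close a table comes to this best case can be quantified. The simple skew measure below (maximum AMP load divided by average load) is an illustration of the idea, not Teradata's exact skew formula.

```python
def skew_factor(rows_per_amp):
    """Max AMP load relative to the average load; 1.0 = perfectly even."""
    avg = sum(rows_per_amp) / len(rows_per_amp)
    return max(rows_per_amp) / avg

even  = skew_factor([25, 25, 25, 25])  # ideal distribution
skewed = skew_factor([70, 10, 10, 10])  # one AMP carries most of the work
```

In the skewed case, the busiest AMP holds 2.8 times the average load, and since a parallel step is only as fast as its slowest AMP, the whole request runs at that AMP's pace.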

But there is a disadvantage to this approach: to join the rows of two tables, the matching rows must be located on the same AMP.

In other words: “The primary index of both tables has to match the join columns.”

If this is not the case, the data rows of one or both tables have to be redistributed by the join columns, or Teradata duplicates all data rows to all AMPs. Whichever strategy is used to co-locate the rows, it is a quite expensive step.
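The redistribution strategy can be sketched as follows. The table contents, column names, and the MD5-based `target_amp` function are all invented for the example; real redistribution happens in spool space over the BYNET.

```python
import hashlib

N_AMPS = 4

def target_amp(value):
    # MD5 stands in for Teradata's proprietary row hash
    return int(hashlib.md5(str(value).encode()).hexdigest(), 16) % N_AMPS

def redistribute(rows, join_col):
    """Spool each row to the AMP owning the hash of its join-column value."""
    spool = {amp: [] for amp in range(N_AMPS)}
    for row in rows:
        spool[target_amp(row[join_col])].append(row)
    return spool

# Hypothetical tables whose primary indexes do NOT match the join column
orders    = [{"customer_id": 7, "amount": 100}, {"customer_id": 7, "amount": 50}]
customers = [{"customer_id": 7, "name": "Smith"}]

order_spool    = redistribute(orders, "customer_id")
customer_spool = redistribute(customers, "customer_id")
```

After redistribution, all rows with `customer_id = 7` from both tables sit on the same AMP and can be joined locally, without further data movement.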

Another limitation of hashing is that direct access to a data row is only possible if the predicate of a WHERE condition matches the primary index definition. Otherwise, in the absence of other indexes, the full table has to be scanned to find the requested data row.
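The difference between the two access paths can be made concrete. Again, MD5 replaces the proprietary hash and the customer ids are made up; the point is that the indexed lookup visits exactly one AMP, while the scan touches every row on every AMP.

```python
import hashlib

N_AMPS = 4

def target_amp(v):
    return int(hashlib.md5(str(v).encode()).hexdigest(), 16) % N_AMPS

# Distribute a toy table of customer ids across the AMPs
amps = {a: [] for a in range(N_AMPS)}
for customer_id in range(1, 101):
    amps[target_amp(customer_id)].append(customer_id)

def lookup_by_pi(pi_value):
    # Direct access: the hash identifies the single AMP that owns the row
    return [r for r in amps[target_amp(pi_value)] if r == pi_value]

def full_table_scan(predicate):
    # No usable index: every AMP has to scan all of its rows
    return [r for rows in amps.values() for r in rows if predicate(r)]
```

Both calls find customer 42, but `lookup_by_pi(42)` inspects only one AMP's rows, whereas `full_table_scan(lambda r: r == 42)` reads all 100 rows.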

We have to consider three criteria when selecting the primary index:

– A Good Access Path
– Even Distribution of Rows across all AMPs
– Low Volatility of the Primary Index Columns

A good access path means achieving optimal retrieve and join performance.

In retrieve steps, the primary index is the most efficient way to pick up data rows.

Join steps are fast if the join columns of both tables are the same, making them an excellent primary index candidate.

The second important criterion for primary index choice is even row distribution, to use the parallel architecture in an optimal way.

Often we can’t optimize for both mentioned goals at the same time. We may have to design the primary index for a fast access path, accepting that the data distribution is not ideal.

That’s perfectly fine, as long as we are aware of this fact and know how to deal with the resulting issues.

Finally, the volatility of the primary index values should be small. When the primary index value of a row changes, the row is sent through the hashing function again and redistributed to its new AMP.
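This re-hashing on update can be sketched as follows; the customer keys are hypothetical and MD5 again replaces the real hash function.

```python
import hashlib

N_AMPS = 4

def target_amp(pi_value):
    return int(hashlib.md5(str(pi_value).encode()).hexdigest(), 16) % N_AMPS

# An UPDATE changes the row's primary index value from one key to another
old_amp = target_amp("customer-1001")
new_amp = target_amp("customer-2002")

# The row must be re-hashed; if the new hash maps to a different AMP,
# the row is physically moved across the BYNET
row_moves = old_amp != new_amp
```

Frequent primary index updates therefore mean frequent row movement, which is why low volatility is the third selection criterion.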

Ideally, we would like to have a non-volatile primary index with an even row distribution and a fast access path (indexed access).

I hope with this short introduction to the Teradata Architecture we laid a solid foundation to understand the meaning of statistics in the upcoming lectures.

See also:
First the PDM, then Compression – Measures to reduce Resource Usage
Teradata Denormalization – Don’t trust your intuition!
The Primary Index Choice

If you have any questions about all this, please ask in the comments! I’ll be paying close attention and answering as many as I can. Thank you for reading. Whatever this blog has become, I owe it all to you.
Written by Roland Wenzlofsky on October 15, 2014.

