The Teradata AMP review
Detailed explanations of the general shared-nothing design of the Teradata RDBMS are easy to find, but articles about how the individual parts of the system (Parsing Engine, AMP, BYNET, etc.) operate are rare or shallow.
In this article, we take a closer look at the Teradata AMP and its role in the Teradata shared-nothing architecture.
Each Teradata AMP is a Unix process responsible for its own share of the data. The parsing engine (PE) assigns this share by applying the well-known hashing algorithm, which is closely tied to the concept of the primary index.
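To make the link between primary index and AMP more tangible, here is a minimal sketch of hash-based row distribution. It is not Teradata's actual hashing algorithm: the MD5-based row_hash function, the bucket count, the modulo mapping and the four-AMP system size are all assumptions chosen for illustration only.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4          # hypothetical system size
NUM_BUCKETS = 65536   # hypothetical number of hash buckets


def row_hash(primary_index_value) -> int:
    """Stand-in hash function (MD5); NOT the real Teradata algorithm."""
    digest = hashlib.md5(repr(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def amp_for(primary_index_value, num_amps: int = NUM_AMPS) -> int:
    """Map a primary index value to a hash bucket, then to an AMP number."""
    bucket = row_hash(primary_index_value) % NUM_BUCKETS
    return bucket % num_amps  # assumed, simplified bucket-to-AMP mapping


def distribution(values, num_amps: int = NUM_AMPS) -> Counter:
    """Count how many rows each AMP would receive."""
    return Counter(amp_for(v, num_amps) for v in values)


if __name__ == "__main__":
    # Many distinct primary index values spread across the AMPs.
    print(distribution(range(1000)))
    # A single repeated value lands on one AMP only (skew).
    print(distribution([7] * 1000))
```

The same primary index value always produces the same hash and therefore always lands on the same AMP. This is why a primary index with few distinct values concentrates the rows on a few AMPs.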
Whenever the parsing engine receives a workload request (SQL statement, load utility workload, etc.), it creates an execution plan for each AMP.
Each AMP, in turn, can assign the steps of the incoming request to any of up to 80 available AMP worker tasks. Parallelism does not end at the AMP level: each AMP can execute up to 80 atomic tasks in parallel.
Most types of workload require more than one AMP worker task. While SQL statements are usually light on AMP worker task usage (1 or 2 AMP worker tasks per AMP), Teradata load utilities can consume many AMP worker tasks at the same time.
For example, a FastLoad needs 3 AMP worker tasks in the first phase (where it collects the data in 64 KB blocks) and one AMP worker task in the second phase (where it writes the records into the target data blocks, sorted by row hash).
The large number of AMP worker tasks consumed by load utilities is the main reason why the number of load utilities allowed to run simultaneously was historically limited.
As you may correctly assume, AMP worker tasks are a crucial and sensitive resource. To keep AMPs from running out of worker tasks (and ending up blocked as a result), each AMP reserves 24 AMP worker tasks for internal work. This strategy helps to avoid blocking and deadlock-like situations.
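The numbers quoted so far allow a rough back-of-the-envelope calculation, sketched below. The figures (80 worker tasks per AMP, 24 reserved, 3 and 1 per FastLoad phase, up to 2 per SQL statement) come from the text. The function names and the idea of a simple per-AMP budget check are assumptions for illustration; real systems enforce their limits through configuration and workload management, not through this arithmetic.

```python
TOTAL_AWTS_PER_AMP = 80      # figure from the text
RESERVED_AWTS_PER_AMP = 24   # figure from the text
AVAILABLE_AWTS = TOTAL_AWTS_PER_AMP - RESERVED_AWTS_PER_AMP

# AMP worker tasks needed per AMP, using the figures quoted in the text.
AWT_COST = {
    "sql": 2,               # upper end of the "1 or 2" range
    "fastload_phase1": 3,
    "fastload_phase2": 1,
}


def awts_needed(workload: dict) -> int:
    """Total worker tasks per AMP for a mix such as {"sql": 10}."""
    return sum(AWT_COST[kind] * count for kind, count in workload.items())


def fits(workload: dict) -> bool:
    """True if the mix stays within the unreserved worker tasks."""
    return awts_needed(workload) <= AVAILABLE_AWTS


def max_concurrent(kind: str) -> int:
    """How many jobs of one kind fit into the unreserved worker tasks."""
    return AVAILABLE_AWTS // AWT_COST[kind]


if __name__ == "__main__":
    mix = {"fastload_phase1": 5, "sql": 10}
    print(awts_needed(mix), fits(mix))
    print(max_concurrent("fastload_phase1"))
```

Under these assumptions, 56 worker tasks per AMP remain for non-reserved work, so a handful of load jobs in their first phase already takes a noticeable share of the budget.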
AMP worker tasks are grouped together and specialized to do certain work. For example, one group is responsible for taking over steps assigned from the parsing engine (new work), another group for continued work (such as aggregation steps) and so on.
There is even a particular type of AMP worker task that is available only for tactical workload and thus acts as a “data highway” for tactical queries.
The most important fact is:
If an AMP runs out of AMP worker tasks, message queues are used to hold requests temporarily. But because the message queue size per AMP is limited, there may come a critical moment when no more messages can be put into the queue.
This situation is called flow control mode, and the affected AMPs will simply refuse to accept new requests from the parsing engine.
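The sequence described here (worker tasks exhausted, then the queue fills, then new requests are refused) can be illustrated with a small toy model. The Amp class, its method names and the queue limit are hypothetical; the sketch only mirrors the behavior described in the text and says nothing about how Teradata implements it internally.

```python
from collections import deque


class Amp:
    """Toy model of one AMP: a pool of worker tasks plus a bounded queue."""

    def __init__(self, worker_tasks: int = 56, queue_limit: int = 20):
        self.total_tasks = worker_tasks
        self.free_tasks = worker_tasks
        self.queue_limit = queue_limit   # hypothetical limit
        self.queue = deque()

    @property
    def in_flow_control(self) -> bool:
        """No free worker task and no room left in the message queue."""
        return self.free_tasks == 0 and len(self.queue) >= self.queue_limit

    def submit(self, step) -> str:
        """Try to hand a step to this AMP."""
        if self.free_tasks > 0:
            self.free_tasks -= 1
            return "running"
        if len(self.queue) < self.queue_limit:
            self.queue.append(step)
            return "queued"
        return "rejected"                # flow control: request refused

    def finish_one(self) -> None:
        """A worker task completes and picks up queued work, if any."""
        if self.queue:
            self.queue.popleft()         # the task stays busy with this step
        elif self.free_tasks < self.total_tasks:
            self.free_tasks += 1


if __name__ == "__main__":
    amp = Amp(worker_tasks=2, queue_limit=1)
    for step in range(4):
        print(step, amp.submit(step))
    print("flow control:", amp.in_flow_control)
```

Once a running step finishes, the freed worker task drains the queue first, which is why an AMP leaves flow control only gradually as work completes.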