The Teradata AMP and its Worker Tasks (AWT)

Roland Wenzlofsky

September 1, 2020

What is a Teradata AMP?

AMP is the abbreviation for Access Module Processor. Each Teradata AMP is a Linux process responsible for handling its individual share of the data. Physical rows are assigned to AMPs by a hashing algorithm applied to the Primary Index value; with a well-chosen Primary Index, the rows of a table are distributed evenly across all AMPs. Distributing data to virtual processes in this way is typical for shared-nothing systems like Teradata. Each AMP has its own main memory and its own logical disk. Each Teradata node typically runs hundreds of AMPs simultaneously.
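The hash-based assignment can be illustrated with a small Python sketch. It does not reproduce Teradata's actual hashing algorithm: MD5 stands in for the row hash function, and the names NUM_AMPS, NUM_BUCKETS, row_hash, and amp_for_row, as well as the simple modulo mapping from bucket to AMP, are assumptions made for illustration only. The sketch shows why many distinct Primary Index values spread evenly, while a single repeated value lands on one AMP.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8          # hypothetical system size
NUM_BUCKETS = 65536   # hypothetical number of hash buckets


def row_hash(primary_index_value: str) -> int:
    """Stand-in row hash: first 4 bytes of MD5 (NOT Teradata's real function)."""
    digest = hashlib.md5(primary_index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def amp_for_row(primary_index_value: str) -> int:
    """Map a Primary Index value to a bucket, then the bucket to an AMP."""
    bucket = row_hash(primary_index_value) % NUM_BUCKETS
    return bucket % NUM_AMPS  # simplified bucket-to-AMP mapping (assumption)


def distribution(values) -> Counter:
    """Count how many rows each AMP would receive."""
    return Counter(amp_for_row(v) for v in values)


# Many distinct values: rows end up on every AMP in similar numbers.
unique_keys = [f"customer-{i}" for i in range(80_000)]
print(sorted(distribution(unique_keys).items()))

# One repeated value: every row hashes to the same AMP (skew).
skewed_keys = ["same-value"] * 80_000
print(sorted(distribution(skewed_keys).items()))
```

Because the same value always produces the same hash, the AMP that owns a row can be computed directly from its Primary Index value, without asking the other AMPs.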

What are the tasks of a Teradata AMP?

Each Teradata AMP is capable of performing all tasks related to its rows. This includes reading and storing rows, aggregating rows, performing joins, locking, sorting rows, managing disk space, and converting and formatting output data when a result set is created.

What is a Teradata AMP Worker Task (AWT)?

Each AMP can assign a plan step it receives to one of its 80 available AMP worker tasks. Parallelism therefore does not end at the AMP level: each AMP can execute up to 80 tasks in parallel.

Teradata AMP Worker Tasks are specialized and can perform different types of tasks, depending on their assignment. Some take on new work; other AWTs are reserved to continue work that has been started. Finally, some AWTs are exclusively reserved for tactical workloads.

Different workload types require different numbers of AMP worker tasks. For example, a SQL statement typically needs 1-2 AMP worker tasks per AMP, while a Teradata FastLoad needs 3 AMP worker tasks per AMP in the first phase (acquisition phase) and 1 AMP worker task per AMP in the apply phase. This is why the number of FastLoad and MultiLoad jobs running in parallel is limited to a small two-digit range (the default setting can be adjusted as needed).

As AMP worker tasks are a crucial resource, each AMP reserves 24 AMP worker tasks for internal work. This helps to avoid blocking and deadlock-like situations.
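The figures above allow a rough back-of-the-envelope calculation. The following Python sketch is only an upper bound under a simplifying assumption that is not stated in the article: that every worker task outside the internal reserve could be given to load jobs in their acquisition phase. The function name max_concurrent_jobs is hypothetical, and the real limit is a system setting, not the result of this division.

```python
TOTAL_AWTS_PER_AMP = 80         # AMP worker tasks per AMP (from the article)
RESERVED_FOR_INTERNAL = 24      # reserved for internal work (from the article)
FASTLOAD_AWTS_ACQUISITION = 3   # per AMP in the acquisition phase (from the article)


def max_concurrent_jobs(total: int, reserved: int, awts_per_job: int) -> int:
    """Upper bound on parallel jobs if all unreserved AWTs went to these jobs."""
    available = total - reserved
    return available // awts_per_job


max_fastloads = max_concurrent_jobs(
    TOTAL_AWTS_PER_AMP, RESERVED_FOR_INTERNAL, FASTLOAD_AWTS_ACQUISITION
)
print(max_fastloads)  # (80 - 24) // 3 = 18
```

Even this optimistic bound is a small two-digit number, and it would leave no worker task free for ordinary SQL, which is one way to see why load utilities have to be limited.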

What happens when all AMP Worker Tasks of an AMP are in use?

If an AMP does not have a worker task of the required type available, the incoming work is collected in a message queue. Since the message queue has only limited memory, the AMP sends a message to the parsing engine when the queue is full. The parsing engine then aborts the task on all AMPs and tries again after a short time (in the millisecond range). This happens in a loop until the AMP has a free AMP worker task for this specific task. With each attempt, the waiting time until the next attempt increases. This state is called flow control mode and usually means that the performance of the entire Teradata system is affected.
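The retry loop described above can be modeled with a toy simulation in Python. Everything in it is an assumption made for illustration: the class Amp, the function dispatch_with_retry, the doubling of the waiting time (the article only says the wait increases), and the max_attempts cap. Waiting is only recorded, not actually slept, and the abort on all other AMPs is not modeled.

```python
from collections import deque


class Amp:
    """Toy model of one AMP: a pool of worker tasks plus a bounded queue."""

    def __init__(self, free_awts: int, queue_capacity: int):
        self.free_awts = free_awts
        self.queue = deque()
        self.queue_capacity = queue_capacity

    def submit(self, step: str) -> str:
        if self.free_awts > 0:
            self.free_awts -= 1
            return "running"
        if len(self.queue) < self.queue_capacity:
            self.queue.append(step)
            return "queued"
        return "rejected"  # queue full: the AMP reports back to the parsing engine

    def finish_one(self) -> None:
        """A running step ends; a queued step, if any, takes over its worker task."""
        if self.queue:
            self.queue.popleft()
        else:
            self.free_awts += 1


def dispatch_with_retry(amp, step, finishes_after,
                        initial_wait_ms=1.0, factor=2.0, max_attempts=20):
    """Retry a rejected step with growing waits; return (status, recorded waits)."""
    waits = []
    wait = initial_wait_ms
    for attempt in range(1, max_attempts + 1):
        status = amp.submit(step)
        if status != "rejected":
            return status, waits
        waits.append(wait)   # simulated wait before the next attempt
        wait *= factor       # each retry waits longer (doubling is an assumption)
        if attempt >= finishes_after:
            amp.finish_one()  # simulate other work completing on the AMP
    return "gave up", waits


amp = Amp(free_awts=0, queue_capacity=1)
amp.submit("earlier step")  # fills the one-slot queue
print(dispatch_with_retry(amp, "new step", finishes_after=3))
# ('queued', [1.0, 2.0, 4.0])
```

In this run the new step is rejected three times, with simulated waits of 1, 2, and 4 milliseconds, and is accepted on the fourth attempt once earlier work has left the queue.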

Are all Teradata AMPs always involved in a request?

Teradata AMPs can be used individually, in groups, or all together to complete a task. This applies to retrieve steps as well as to join steps and others. Single-AMP operations include, for example, accessing rows via the Primary Index or a USI. All AMPs are required for full table scans and for the majority of joins; the exceptions are group-AMP and single-AMP joins.

What happens if an AMP crashes?

The rows of a Teradata AMP are stored redundantly. FALLBACK protection (in which a backup AMP replaces the crashed AMP) is always activated on modern Teradata Intelliflex systems. AMPs can also migrate to another node if an entire node crashes. In addition, it is common practice to have a hot standby node that is immediately ready for use in case of a problem.
