The Teradata AMP and its Worker Tasks (AWT)

Roland Wenzlofsky

September 1, 2020


What is a Teradata AMP?

AMP is the abbreviation for Access Module Processor. Each Teradata AMP is a Linux process responsible for handling its share of the data. Rows are assigned to AMPs by a hashing algorithm applied to the primary index; with a well-chosen primary index, the rows of a table are distributed evenly across all AMPs. Distributing data across virtual processes in this way is typical for shared-nothing systems like Teradata. Each AMP has its own main memory and its own logical disk. Each Teradata node typically runs hundreds of AMPs simultaneously.
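
The idea of hash-based row distribution can be illustrated with a minimal Python sketch. It is not Teradata's actual algorithm: MD5 stands in for the real row-hash function, a plain modulo replaces the real hash-to-AMP mapping, and `NUM_AMPS`, `amp_for_row`, and `rows_per_amp` are hypothetical names chosen for this example.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # hypothetical system size, chosen only for illustration


def amp_for_row(primary_index_value: str, num_amps: int = NUM_AMPS) -> int:
    """Map a primary index value to an AMP number.

    Assumption: MD5 is only a stand-in for Teradata's own row-hash
    function, and the modulo step replaces the real hash map lookup.
    """
    digest = hashlib.md5(primary_index_value.encode("utf-8")).digest()
    row_hash = int.from_bytes(digest[:4], "big")  # 32-bit stand-in row hash
    return row_hash % num_amps


def rows_per_amp(values, num_amps: int = NUM_AMPS) -> Counter:
    """Count how many rows each AMP would own for the given index values."""
    return Counter(amp_for_row(v, num_amps) for v in values)


if __name__ == "__main__":
    # Many distinct values spread out across the AMPs.
    customer_ids = [f"customer-{i}" for i in range(10_000)]
    for amp, count in sorted(rows_per_amp(customer_ids).items()):
        print(f"AMP {amp}: {count} rows")

    # A single repeated value always lands on the same AMP (skew).
    print(rows_per_amp(["same-value"] * 100))
```

The same value always hashes to the same AMP, which is what makes direct row access possible, and it is also why a column with few distinct values leads to uneven distribution.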

What are the tasks of a Teradata AMP?

Each Teradata AMP is capable of performing all tasks related to its rows. This includes reading and storing rows, aggregating, performing joins, locking, sorting, disk space management, and output data conversion and formatting when a result set is created.

What is a Teradata AMP Worker Task (AWT)?

Each AMP can assign a plan step it receives to one of its 80 available AMP worker tasks. Parallelism therefore does not end at the AMP level: each AMP can execute up to 80 tasks in parallel.

Teradata AMP Worker Tasks are specialized and can perform different tasks depending on their assignment. Some take on new work; other AWTs are reserved to continue work that has been started. Finally, some AWTs are exclusively reserved for tactical workloads.

Different workload types require different numbers of AMP worker tasks. A SQL statement typically needs 1-2 AMP worker tasks per AMP, while a Teradata FastLoad needs 3 AMP worker tasks per AMP in the first phase (acquisition phase) and 1 AMP worker task per AMP in the apply phase. The number of FastLoad and MultiLoad jobs running in parallel is limited to a small two-digit number (the default setting can be adjusted as needed).

As AMP worker tasks are a crucial resource, each AMP reserves 24 AMP worker tasks for internal work. Reservation of tasks helps to avoid blocking and deadlock-like situations.
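
The figures above allow a rough back-of-the-envelope calculation, sketched here in Python. The constants come from the text; the simplification that the reserved tasks are unavailable for new work and that demand is a plain sum per AMP is an assumption of this sketch, and the names `AWTS_PER_UNIT`, `awts_needed`, and `fits` are hypothetical.

```python
from typing import Mapping

TOTAL_AWTS_PER_AMP = 80     # from the text
RESERVED_AWTS_PER_AMP = 24  # from the text

# AWTs per AMP for one unit of each workload type (figures from the text;
# the two SQL entries represent the lower and upper end of "1-2").
AWTS_PER_UNIT = {
    "sql_light": 1,
    "sql_heavy": 2,
    "fastload_acquisition": 3,
    "fastload_apply": 1,
}


def unreserved_awts() -> int:
    """AWTs per AMP left after the reserved ones are set aside."""
    return TOTAL_AWTS_PER_AMP - RESERVED_AWTS_PER_AMP


def awts_needed(workload: Mapping[str, int]) -> int:
    """Sum the AWTs per AMP that a mix of concurrent work would occupy.

    Assumption: demand is modelled as a simple sum, which ignores timing.
    """
    return sum(AWTS_PER_UNIT[kind] * count for kind, count in workload.items())


def fits(workload: Mapping[str, int]) -> bool:
    """True if the workload mix fits into the unreserved AWTs of one AMP."""
    return awts_needed(workload) <= unreserved_awts()


if __name__ == "__main__":
    mix = {"sql_heavy": 20, "fastload_acquisition": 5}
    print(unreserved_awts(), awts_needed(mix), fits(mix))
```

With 20 heavier SQL statements and 5 FastLoads in the acquisition phase, the sketch arrives at 55 of 56 unreserved tasks in use, which shows how quickly a handful of load jobs can consume the pool.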

What happens when all AMP Worker Tasks of an AMP are in use?

If an AMP has no free worker task of the required type, the incoming work is placed in a message queue. Because the message queue has limited memory, the AMP notifies the parsing engine when the queue is full. The parsing engine then aborts the task on all AMPs and tries again after a short time (in the millisecond range). The retries continue in a loop until the AMP has a free AMP worker task for this specific work, and the waiting time increases with each attempt. This state is called flow control mode and usually means that the performance of the entire Teradata system is affected.
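
The retry behaviour described above can be modelled with a toy Python sketch. It is a simplified illustration, not Teradata's implementation: the class `AmpSketch`, the function `dispatch_with_retry`, the doubling of the wait time, and the attempt limit are all assumptions made for this example, and the waits are only recorded rather than slept.

```python
from collections import deque


class AmpSketch:
    """Toy model of one AMP: some free worker tasks plus a bounded queue."""

    def __init__(self, free_awts: int, queue_limit: int):
        self.free_awts = free_awts
        self.queue_limit = queue_limit
        self.queue = deque()

    def offer(self, step: str) -> str:
        """Try to place a step: run it, queue it, or reject it."""
        if self.free_awts > 0:
            self.free_awts -= 1
            return "running"
        if len(self.queue) < self.queue_limit:
            self.queue.append(step)
            return "queued"
        return "rejected"  # queue full: the sender has to retry later

    def release(self) -> None:
        """Finish one task; the freed AWT goes to the oldest queued step."""
        if self.queue:
            self.queue.popleft()
        else:
            self.free_awts += 1


def dispatch_with_retry(amp, step, first_wait_ms=1.0, growth=2.0,
                        max_attempts=10, on_wait=None):
    """Offer a step until the AMP accepts it; return the waits in ms.

    Assumption: the wait doubles per attempt. The text only says that
    the waiting time increases. `on_wait` simulates time passing.
    """
    waits = []
    wait_ms = first_wait_ms
    for _ in range(max_attempts):
        if amp.offer(step) != "rejected":
            return waits
        waits.append(wait_ms)
        if on_wait is not None:
            on_wait(amp)
        wait_ms *= growth
    raise RuntimeError(f"step {step!r} not accepted after {max_attempts} attempts")


if __name__ == "__main__":
    amp = AmpSketch(free_awts=1, queue_limit=1)
    amp.offer("step-a")  # takes the only free worker task
    amp.offer("step-b")  # fills the queue
    # Each simulated wait lets one task finish on the AMP.
    print(dispatch_with_retry(amp, "step-c", on_wait=lambda a: a.release()))
```

In this model a step counts as accepted once it is either running or queued; a rejected step is what triggers the growing wait between attempts.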

Are all Teradata AMPs always involved in a request?

No. Teradata AMPs can work individually, in groups, or all together to complete a task. This applies to retrieve steps as well as to join steps and others. Examples of single-AMP operations are accessing rows via the Primary Index or the USI. All AMPs are required for full table scans and most joins, but there are also group-AMP and single-AMP joins.

What happens if an AMP crashes?

Access to the rows of a Teradata AMP is redundant. FALLBACK protection (in which a backup AMP replaces the crashed AMP) is permanently activated on modern Teradata IntelliFlex systems. AMPs can also migrate to another node if an entire node crashes. In addition, it is common practice to keep a hot standby node immediately ready for use in case of a problem.
