The Teradata Architecture
The Teradata Architecture explained
To understand how queries execute on a Teradata system, you need to understand its parallel architecture and how the components of the system interact with each other.
The Teradata architecture described here is the same for each relevant release of Teradata. Additional features introduced over time are not covered, as they do not add to the general understanding of the system architecture.
I will not go into great detail here. If you are interested in more detailed information, I recommend reading the Teradata manuals, which you can download for free. I will limit this post to the information needed to understand parallelism on Teradata. The components we need to look at are:
– Parsing Engine (PE)
– Access Module Processor (AMP)
– Virtual Disks
The Parsing Engine
The parsing engine is a process responsible for the following tasks:
– Session handling (session authorization, log on, log off)
– Parsing the SQL statement (syntax checking, looking up the used database objects)
– Creation of the execution plan
– Dispatching the execution plan to the AMPs
– Some other tasks, such as character set conversion, which are not relevant to understanding how SQL statements are executed
The Parsing Engine is the piece of software that communicates with the client applications (SQL Assistant, BTEQ, etc.).
Each Teradata system has several Parsing Engines running at the same time. The number of Parsing Engines can be increased dynamically by the system if needed. Each Parsing Engine can only handle a limited number of sessions.
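Because each Parsing Engine can only serve a limited number of sessions, the number of Parsing Engines a system needs grows with the number of concurrent sessions. The following is a minimal sketch of that arithmetic, not a Teradata API. The function name and the per-engine limit passed in are my own assumptions for illustration; the real limit depends on your system.

```python
def parsing_engines_needed(sessions: int, sessions_per_pe: int) -> int:
    """Return the smallest number of Parsing Engines that can hold
    `sessions` concurrent sessions.

    `sessions_per_pe` is a hypothetical per-engine session limit;
    look up the real value for your system.
    """
    if sessions < 0:
        raise ValueError("sessions must not be negative")
    if sessions_per_pe <= 0:
        raise ValueError("sessions_per_pe must be positive")
    # Ceiling division using integers only.
    return -(-sessions // sessions_per_pe)


if __name__ == "__main__":
    # Assumed limit of 120 sessions per Parsing Engine, for illustration.
    print(parsing_engines_needed(250, 120))
```

With the assumed limit of 120, 250 concurrent sessions no longer fit on two Parsing Engines, so a third one is required.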
The Teradata AMP
Each Teradata AMP is a process that handles a portion of the database system on its own.
AMPs do the work on a Teradata system, and each AMP manages all work related to the rows assigned to it.
The main tasks of an AMP are storing rows, retrieving rows, sorting, and aggregation.
Each AMP returns its result set to the Parsing Engine (via the BYNET, see the details below). As we already know, the Parsing Engine communicates directly with the client tools and passes the complete result set to the client software, such as SQL Assistant.
Each AMP has its own memory and handles exactly one virtual disk. We are not interested in the concrete details of the implementation here, as they are not needed to understand the system and how its components interact with each other.
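The division of labor between AMPs and the Parsing Engine can be sketched as a small simulation: every AMP aggregates only its own rows, and the Parsing Engine combines the partial results into one result set. This is a toy model under my own assumptions, not Teradata code. In particular, I assign rows to AMPs by hashing a key column with CRC32, which is a simplification that this post does not cover, and all function and column names are hypothetical.

```python
import zlib
from collections import Counter


def amp_for(key: str, num_amps: int) -> int:
    """Pick the AMP that owns a row (assumed: CRC32 hash of the key)."""
    return zlib.crc32(key.encode("utf-8")) % num_amps


def store_rows(rows, num_amps):
    """Give every AMP its own private list of rows (its 'virtual disk')."""
    amps = [[] for _ in range(num_amps)]
    for row in rows:
        amps[amp_for(row["customer"], num_amps)].append(row)
    return amps


def amp_local_sum(amp_rows):
    """Work done by a single AMP: aggregate only the rows it owns."""
    totals = Counter()
    for row in amp_rows:
        totals[row["customer"]] += row["amount"]
    return totals


def parsing_engine_collect(partial_results):
    """Work done by the Parsing Engine: merge the AMPs' result sets."""
    result = Counter()
    for partial in partial_results:
        result.update(partial)  # adds the amounts per customer
    return dict(result)


def total_per_customer(rows, num_amps=4):
    amps = store_rows(rows, num_amps)
    # On a real system the AMPs run in parallel; here they run in a loop.
    partials = [amp_local_sum(amp_rows) for amp_rows in amps]
    return parsing_engine_collect(partials)


if __name__ == "__main__":
    sales = [
        {"customer": "alice", "amount": 10},
        {"customer": "bob", "amount": 5},
        {"customer": "alice", "amount": 20},
        {"customer": "carol", "amount": 7},
    ]
    print(total_per_customer(sales))
```

No AMP ever looks at another AMP's rows. That independence is what allows the work to run in parallel.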
AMPs and Parallelism
AMPs are the foundation for parallelism on Teradata. The goal of each Teradata system is to distribute work evenly across all available AMPs. If we achieve this, every AMP contributes its share of the work, and the system makes full use of its parallelism.
Finally, the BYNET is the network that links the Parsing Engines and the AMPs. All communication between them travels over it: the Parsing Engine sends the execution plan to the AMPs, and the AMPs send their result sets back.
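To make "evenly distributed" measurable, here is a small sketch that counts rows per AMP and compares the busiest AMP with the average. The ratio I call `skew_factor` is my own illustrative metric, not an official Teradata figure, and the CRC32-based row assignment is again an assumption made for the example.

```python
import zlib
from collections import Counter


def rows_per_amp(keys, num_amps):
    """Count how many rows land on each AMP (assumed: CRC32 hash of the key)."""
    counts = Counter()
    for key in keys:
        amp = zlib.crc32(str(key).encode("utf-8")) % num_amps
        counts[amp] += 1
    # Include AMPs that received no rows at all.
    return [counts[amp] for amp in range(num_amps)]


def skew_factor(counts):
    """Busiest AMP divided by the average AMP.

    1.0 means a perfectly even distribution; a value equal to the
    number of AMPs means a single AMP holds all rows.
    """
    average = sum(counts) / len(counts)
    if average == 0:
        return 1.0  # no rows at all, so nothing is skewed
    return max(counts) / average


if __name__ == "__main__":
    unique_keys = range(10_000)          # many distinct values
    single_key = ["same"] * 10_000       # one value repeated
    print(skew_factor(rows_per_amp(unique_keys, 8)))
    print(skew_factor(rows_per_amp(single_key, 8)))
```

With a single repeated key, every row lands on the same AMP, so one AMP does all the work while the others sit idle, and the factor equals the number of AMPs. With many distinct keys, the factor should stay close to 1.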