Teradata Query Parallelism: A closer look at how Teradata’s shared-nothing architecture and parallel processing capabilities deliver exceptional performance.

A query on a Teradata system runs in parallel at every step, whether for joining, sorting, or aggregating data.

What distinguishes Teradata is that parallelism applies to every step of query processing. This follows from an architecture that was designed for a high degree of parallelism from the start, at a time when most of its components were still hardware-based. It gives Teradata a significant advantage over database systems where parallelism was added later.

The architecture of Teradata has enabled the implementation of additional features to enhance parallelism over time. Without it, many advancements would not have been achievable.

Parallel Execution across the AMPs

AMPs (Access Module Processors) are autonomous virtual processes that carry out their tasks independently of one another. Every SQL request is first divided into subtasks, which are dispatched to the AMPs. Each AMP completes its subtask on its own portion of the data and returns a partial result. Once all AMPs have finished, the partial results are combined and the final result is returned.
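The fan-out described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Teradata code: the four "AMPs" are plain lists, the thread pool stands in for the AMPs working at the same time, and the names `amp_rows`, `amp_partial_sum`, and `run_query` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows stored on one AMP.
amp_rows = [
    [120, 80, 45],   # AMP 0
    [200, 15],       # AMP 1
    [60, 60, 60],    # AMP 2
    [10, 300],       # AMP 3
]

def amp_partial_sum(rows):
    """Subtask run on one AMP: aggregate only the locally stored rows."""
    return sum(rows), len(rows)

def run_query(amps):
    """Dispatch the subtask to every AMP, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(amps)) as pool:
        partials = list(pool.map(amp_partial_sum, amps))
    total = sum(partial[0] for partial in partials)
    count = sum(partial[1] for partial in partials)
    return total, count, total / count

total, count, average = run_query(amp_rows)
print(total, count, average)  # 950 10 95.0
```

No AMP ever looks at another AMP's rows. Only the small partial results (a sum and a count per AMP) have to be brought together at the end.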

Parallelism at the AMP level is a primary reason for the high performance of a properly used Teradata system.

AMPs are not specialized: every AMP can perform any of the tasks listed below.

Every Teradata system runs many AMPs.

The Tasks of a Teradata AMP

  • Reading of Rows
  • Writing of Rows
  • Row Locking
  • Sorting
  • Aggregating
  • Index creation and maintenance
  • Maintaining the transaction log
  • Backup and Recovery
  • Bulk and Transactional Loading

The Components of a Teradata AMP

Every AMP has its own dedicated resources:

  • Logical storage unit
  • Memory
  • CPU

In Teradata’s shared-nothing architecture, these resources belong exclusively to one AMP and are not shared with any other AMP. Because nothing is shared, expanding the system with additional hardware allows performance to grow close to linearly.

The Teradata Primary Index – Hash Partitioning

Hash partitioning enables parallelism by distributing the rows of a table evenly among the AMPs, so that each AMP does roughly the same amount of work.

Hash partitioning distributes large data sets efficiently, but it balances the work only if the rows are spread evenly. When a few AMPs receive far more rows than the others (skew), these AMPs become bottlenecks, and the whole step has to wait until the most heavily loaded AMP has finished.

Addressing skew is a primary concern for a performance tuner working on a Teradata system.
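To make skew measurable, the following sketch computes a skew factor from the number of rows per AMP. The formula (1 minus average over maximum, as a percentage) is one common convention among Teradata practitioners and is used here as an assumption; the row counts and the function name `skew_factor` are invented for the example.

```python
def skew_factor(rows_per_amp):
    """Return skew in percent.

    0 means a perfectly even distribution; values close to 100 mean
    that a single AMP holds almost all rows.
    """
    busiest = max(rows_per_amp)
    if busiest == 0:
        return 0.0
    average = sum(rows_per_amp) / len(rows_per_amp)
    return (1 - average / busiest) * 100

even = [250, 250, 250, 250]     # hypothetical row counts on 4 AMPs
skewed = [700, 100, 100, 100]   # same total, but one AMP is overloaded

print(skew_factor(even))               # 0.0
print(round(skew_factor(skewed), 1))   # 64.3
```

Both tables hold 1,000 rows, but in the skewed case one AMP has to process 700 of them while the other three sit idle for most of the step.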

For hash partitioning, one or more columns of each table are designated as the Primary Index. A hash value is calculated from these columns, and this value determines the AMP on which a given row is stored.

CREATE TABLE Customer
(
    Customer_ID BIGINT NOT NULL,
    Lastname    VARCHAR(500),
    Firstname   VARCHAR(500)
)
UNIQUE PRIMARY INDEX (Customer_ID);
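How a Primary Index value ends up on a particular AMP can be imitated with a short Python sketch. Note the assumptions: MD5 and a plain modulo are stand-ins chosen for illustration (Teradata uses its own hashing algorithm and a map from hash buckets to AMPs), and `NUM_AMPS` and `amp_for` are hypothetical names.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size

def amp_for(primary_index_value, num_amps=NUM_AMPS):
    """Map a Primary Index value to an AMP number.

    MD5 plus modulo is only a stand-in for Teradata's own hashing
    algorithm and its hash-bucket-to-AMP map.
    """
    digest = hashlib.md5(str(primary_index_value).encode("utf-8")).digest()
    row_hash = int.from_bytes(digest[:4], "big")
    return row_hash % num_amps

# Unique Customer_ID values spread almost evenly across the AMPs ...
by_id = Counter(amp_for(customer_id) for customer_id in range(1, 10001))

# ... while a column with only two distinct values can reach at most two AMPs.
by_name = Counter(amp_for(name) for name in ["Smith"] * 9000 + ["Jones"] * 1000)

print(sorted(by_id.items()))
print(sorted(by_name.items()))
```

The same value always hashes to the same AMP, which is why a unique column such as Customer_ID is a good Primary Index candidate and a column with few distinct values is not.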

Teradata Pipelining Steps

Pipelining further enhances query parallelism by allowing steps to begin before their predecessors are complete.

A request consists of multiple subtasks, as previously mentioned. These subtasks can include, for instance:

  • Read all rows of a table (simple step)
  • Update a subset of table rows (simple step)
  • Read two tables, redistribute them, and join them (complex step)

Steps can have different complexity, and AMPs may also need to interact with each other.

With pipelining, even a complex step benefits: the join can start while the rows of the involved tables are still being redistributed.
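Python generators give a rough feel for pipelining: the consumer receives each row as soon as the producer emits it. The sketch below is an analogy under simplifying assumptions (one producer, one consumer, an in-memory lookup as the "join"), and the function and variable names are invented for the example.

```python
def redistribute(rows, log):
    """Producer step: hand rows on one at a time instead of all at once."""
    for row in rows:
        log.append(("redistributed", row[0]))
        yield row

def join(orders, customers_by_id, log):
    """Consumer step: join each order as soon as it arrives."""
    for customer_id, amount in orders:
        log.append(("joined", customer_id))
        yield customers_by_id[customer_id], amount

customers = {1: "Miller", 2: "Smith"}    # hypothetical customer table
orders = [(1, 50), (2, 75), (1, 20)]     # hypothetical order rows

log = []
result = list(join(redistribute(orders, log), customers, log))

print(result)  # [('Miller', 50), ('Smith', 75), ('Miller', 20)]
print(log)
```

The log alternates between "redistributed" and "joined" entries, which shows that the join does not wait for the redistribution to finish.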


Teradata Parallel Multi-Steps

Pipelining applies to dependent steps, where the output of the predecessor is passed on to the successor as soon as it becomes available. Teradata offers an additional layer of parallelism on top of that.

Independent steps can be executed simultaneously.
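Which steps may run at the same time follows from their dependencies. The sketch below groups the steps of a made-up plan into "waves" in which all steps are independent of each other. The plan, the step names, and the function `parallel_waves` are assumptions for illustration and do not reflect how the Teradata optimizer represents its plans.

```python
def parallel_waves(dependencies):
    """Group steps into waves; all steps within one wave are independent.

    `dependencies` maps each step to the set of steps it must wait for.
    """
    remaining = {step: set(deps) for step, deps in dependencies.items()}
    done = set()
    waves = []
    while remaining:
        # A step is ready once everything it depends on has completed.
        ready = sorted(step for step, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("circular dependency between steps")
        waves.append(ready)
        done.update(ready)
        for step in ready:
            del remaining[step]
    return waves

# Hypothetical plan: two retrieves that do not depend on each other,
# followed by a join and an aggregation.
plan = {
    "retrieve_customer": set(),
    "retrieve_orders": set(),
    "join": {"retrieve_customer", "retrieve_orders"},
    "aggregate": {"join"},
}

print(parallel_waves(plan))
# [['retrieve_customer', 'retrieve_orders'], ['join'], ['aggregate']]
```

The two retrieve steps land in the same wave and can therefore be executed simultaneously, while the join has to wait for both.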


The Teradata BYNET

Teradata operates on a shared-nothing architecture where work is executed in parallel by the AMPs.

To facilitate communication among AMPs, the BYNET network serves as the medium for exchanging messages and data.

The Tasks of the BYNET

BYNET is not just an ordinary network; it possesses unique features tailored specifically for Teradata.

  • Message delivery: guarantees that messages arrive at the target AMP
  • Coordination of multiple AMPs working on the same step
  • Sorting of the final result set when it is sent to the client
  • Minimizing the number of AMPs needed for a step
  • Congestion control to avoid overloading the network

Message Passing & the BYNET

To understand what the BYNET does, we need to introduce another virtual process: the Parsing Engine, which generates the execution plan of a request. The BYNET is the link between the Parsing Engine and the AMPs.

Messages can be sent from the Parsing Engine via BYNET to the AMPs, but BYNET is also responsible for the AMP-to-AMP communication.

BYNET can transmit messages to all or a selected group of AMPs.

Sorting the Final Answer Set

What sets Teradata apart from others?

The result set is sorted in parallel, with pre-sorting at each level (AMP, Node, BYNET, Parsing Engine) to prevent costly sorting at the end.

  • Each AMP locally sorts its data (this is done in parallel)
  • Each Node takes one buffer of data from all its AMPs and sorts it (buffer by buffer by AMP)
  • The BYNET passes one buffer per Node to the Parsing Engine, which does the final sort.
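The multi-level sort above is essentially a merge of pre-sorted streams. The following sketch imitates it with `heapq.merge` from the Python standard library. It is a simplification: rows are merged one by one rather than buffer by buffer, the layout of two nodes with two AMPs each is invented, and `sorted_answer_set` is a hypothetical name.

```python
import heapq

# Hypothetical layout: 2 nodes, each with 2 AMPs holding unsorted rows.
nodes = [
    [[42, 7, 19], [3, 88, 21]],   # node 0: AMP 0, AMP 1
    [[64, 1, 30], [55, 12, 9]],   # node 1: AMP 2, AMP 3
]

def sorted_answer_set(nodes):
    node_streams = []
    for amps in nodes:
        # Level 1: every AMP sorts its own rows (in parallel on a real system).
        amp_streams = [sorted(rows) for rows in amps]
        # Level 2: the node merges the pre-sorted streams of its AMPs.
        node_streams.append(heapq.merge(*amp_streams))
    # Level 3: the node streams are merged into the final answer set.
    return list(heapq.merge(*node_streams))

print(sorted_answer_set(nodes))
# [1, 3, 7, 9, 12, 19, 21, 30, 42, 55, 64, 88]
```

Because every input stream is already sorted, each higher level only has to compare the current head of each stream. No level ever needs to sort the complete result set from scratch.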

These are the essential elements of Teradata’s shared-nothing architecture.
