Teradata Explain Statement: A Guide to Optimizing SQL Performance

Roland Wenzlofsky

April 21, 2023

minutes reading time

What is the Teradata Explain Statement?

Optimizing performance can be challenging, often requiring significant effort to identify the root issues. Thankfully, Teradata provides the EXPLAIN statement as a robust tool for assessing the execution of SQL statements. In Teradata, SQL statements are processed in a series of distinct tasks that correspond to specific components of the statement, such as retrieving rows, joining tables, or aggregating results. By utilizing our knowledge of how Teradata handles these tasks and scrutinizing the execution plan furnished by EXPLAIN, we can swiftly pinpoint weak points within a query and utilize this data to enhance performance.

The EXPLAIN statement can be intimidating for novices. At first, the result may seem disorderly and incomprehensible. Nonetheless, deciphering the steps is easier than it seems. We must evaluate each step’s effect on performance and whether Teradata has made ideal selections, such as choosing an inappropriate join strategy.

Our analysis will entail a thorough evaluation of every Execution Plan element, identification of crucial operations, and support in identifying potential issues.

The Execution Plan comprises retrieval, joining, aggregation, and other tasks. Nonetheless, these three tasks make up 95% of all steps; therefore, attention will be paid to them. In brief, the Execution Plan entails reading, joining, and aggregating, ultimately creating a final result set. Each step depends on the preceding one’s output.

Decoding the Plan Steps

Teradata’s blocks correspond to distinct operations, including fetching rows, joining tables, and conducting aggregations. In addition to the operation category, we collect information on the number of AMPs involved, including All-AMPs, Single-AMP, or Group-AMP.

Each block’s EXPLAIN output will show something akin to:

AMPs retrieve, and single AMP retrieve steps, AMPs join, and single AMP join steps, and an AMPs step for aggregation.

By identifying the various types of operations, we can analyze and obtain detailed information about each one.

Prior to each transaction, certain preparation steps are necessary, such as redistributing and sorting the rows. Sorting is essential to utilize the most efficient search algorithm, specifically the binary search.

Join Preparation

Redistribution is required as the AMPs hold the rows exclusively for joining. Since the AMPs are separate processes and lack access to other AMPs’ disks, the two tables’ rows to be joined must reside on the same AMP. To accomplish this, the easiest method is rehashing one or both tables in question.

The EXPLAIN result for row movement might show something like:

duplicated on all AMPs, redistributed by the hash code of (new hash column(s)), redistributed by rowkey, etc.

The EXPLAIN output’s sorting section will show something akin to this:

Sort according to hash code, order by row hash, partition by rowkey, etc.

Row retrieval strategy

A vital aspect of all operations is retrieving rows from disks. The Teradata Database offers various access paths to obtain data, such as full table scans, primary and secondary index access, and more.

The EXPLAIN output for row retrieval may contain the following information:

by way of an all-row scan, by way of a rowhash match scan, by way of the primary index, by way of the hash value, etc.

Join Type

The EXPLAIN output will provide information on the selected join strategy if the operation is a join.

using a product join, a single partition hash join, a merge join, a rowkey-based merge join, etc.

Confidence

Each block denotes the level of confidence in its corresponding rows. Although desirable, attaining a high level of confidence is not always possible.

The EXPLAIN output will display a similar format:

high confidence, low confidence, no confidence, and join index confidence

This article explains the essential elements of an EXPLAIN statement, delivering about 80% of the relevant data present in an explanation output.

Take a brief survey and receive the book for free Get instant access

Share0

Tweet0

Share0

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

Hi Roland,

Nice article. Could you please also explain product join indicator ie. what it means and how it affects query performance. What causes high PJI for a query. Does only product join is cause high PJI or there are other factors that cause high PJI?

Thank you

Teradata Explain Statement: A Guide to Optimizing SQL Performance

What is the Teradata Explain Statement?

Decoding the Plan Steps

Join Preparation

Row retrieval strategy

Join Type

Confidence

Roland Wenzlofsky

Fast multi-file export of Teradata query results using only Teradata SQL Assistant

The Teradata AMP Worker Task

Boost Your Teradata Performance – The Critical Role of NOT NULL Declarations

Optimizing Teradata SQL Queries by Avoiding Full Table Scans and Utilizing Secondary Indexes