Demystifying The Teradata Nested Join | DWHPRO

Demystifying The Teradata Nested Join

Teradata can choose from various join methods. The most common ones are Merge Join, Hash Join, Product Join or for PPI tables the Merge Join variants Rowkey Based Merge Join or Sliding Window Merge Join.

Less often, however, we see the Teradata nested join, since specific prerequisites must be met before it can be used.

The nested join is particularly suitable for tactical workloads, as we will see in a moment.

If you are interested in tactical workload tuning, you can learn more about it here: Tactical Workload Tuning with Teradata

What is the Teradata Nested Join?

The Teradata Nested Join is the cheapest join method since access is via indexes.

The following conditions must be fulfilled for Teradata to be able to use a nested join:

  • An “equi-join” is carried out.
  • There is a restriction on the left table (WHERE condition) and for the retrieve step a UPI, USI, or a NUPI with as many different values as possible can be used.
  • On the right table, an index is defined that includes all join columns. It can be a UPI, NUPI, USI, NUSI or global join index (a global join index contains the ROWID of the base table on which it is defined)

Teradata Nested Join Example

The following example is designed to be suitable for a nested join:

The left table has a UPI:

CREATE TABLE LEFT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL,
COL1 INTEGER
) UNIQUE PRIMARY INDEX (COL1);

CREATE TABLE RIGHT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL
) UNIQUE PRIMARY INDEX (ALT_PK);

The following query is executed:

SELECT *
FROM
   LEFT_TABLE l
INNER JOIN
   RIGHT TABLE r
ON
   l.ALT_PK = r.ALT_PK
WHERE 
   COL1=100;

If we take a closer look at the example, we see that both table definitions and the SQL statement are ideal for a nested join:

  • The restriction on the left table is made by the UPI
    WHERE COL1 = 100
  • It's an equijoin
    l.ALT_PK = r.ALT_PK
  • The join is done via the columns which are defined on the left table as an index (in this case the UPI).
    (l.ALT_PK = r.ALT_PK)

In fact, the execution plan shows that the optimizer uses a nested join to execute the SQL statement:

First, we do a single-AMP JOIN step from l by way of  the unique primary index "l.COL1 = 100" with no  residual conditions, which is joined to all partitions of  r by way of the primary index "r.ALT_PK = l.ALT_PK" with no residual conditions.  l and r are joined using a nested join, with a join  condition of ("(1=1)").  The result goes into Spool 1 (one-amp),   which is built locally on the AMPs.  The size of Spool 1 is  estimated with low confidence to be 1 row (41 bytes).

Although the join is done using columns, which are neither the primary index of the left table nor the primary index of the right table, no redistribution of rows or a full table scan is required. If the optimizer were to perform a merge join or hash join, a redistribution of rows would be mandatory in this case (copy or redistribute).

Only one row is accessed and only one AMP is needed. It couldn't be more efficient.

There are several variants of the nested join. All have in common that they are extremely performant. The example just shown was the Local Nested Join. We will now show the details of the different variants.

The Local Nested Join

We have previously said that the left table has a UPI, NUPI or USI which is restricted (with a WHERE condition).

The Local Nested Join is defined even more restrictively: it must be a UPI or USI. Also, no NUSI is allowed on the right table, only UPI, NUPI, and USI. Why this is so we explain in the next section.

Since the UPI or USI on the left table ensures that only one row is selected with the WHERE condition, the following algorithm can be used:

  1. Calculate the rowhash over the join columns of the left table
  2. Based on this rowhash send a message to the AMP which owns the rows of the right table
  3. The AMP owning the rows of the right table returns these rows to the calling AMP
  4. The calling AMP assembles the final rows and returns them into the resulting spool
teradata nested join
The Teradata Local Nested Join

As already mentioned, it is only necessary that the index on the left table is unique. Therefore our query from the previous example would also work for the following setup:

CREATE TABLE LEFT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL,
COL1 INTEGER
) PRIMARY INDEX (PK);
UNIQUE INDEX (COL1); --> USI works as well !
CREATE TABLE RIGHT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL
) UNIQUE PRIMARY INDEX (ALT_PK);
SELECT *
FROM
   LEFT_TABLE l
INNER JOIN
   RIGHT TABLE r
ON
   l.ALT_PK = r.ALT_PK
WHERE 
   COL1=100;
First, we do a two-AMP JOIN step from l by way of unique index # 4 "l.COL1 = 100" with no residual   conditions, which is joined to all partitions of r by  way of the primary index "r.ALT_PK = l.ALT_PK" with no residual conditions.  l and r are joined using a nested join, with a join  condition of ("(1=1)").  The result goes into Spool 1 (one-amp),  which is built locally on the AMPs.  The size of Spool 1 is   estimated with low confidence to be 1 row (41 bytes).

The only difference is the usage of the USI (by way of index # 4) instead of a UPI access.

The Remote Nested Join

The Remote Nested Join can be used in the following situations:

  • The index on the left table used in the WHERE condition is not unique OR
  • The right table index containing the join columns is a NUSI

In this case, the following algorithm is used:

  1. The left table rows are redistributed by the join column rowhash or duplicated to all AMPs.

    Which kind of distribution is required, depends on the type of index available on the right table.

    Whenever the right table index is a NUSI, row duplication to all AMPs is needed as the NUSI index rows are not rowhash distributed and matching rows could be available on any AMP.

    Any other kind of right table index will cause a redistribution of the left table rows by hashing on the join columns.
  2. A join between the left table rows and the right table index is executed.
    The base table ROWIDs are extracted from the index and copied into an intermediate spool.
  3. The intermediate spool is the new left table for a second join step, the so-called RowId Join.
  4. The extracted ROWIDs are used to access the base table rows of the right table.

    Depending on the kind of right table index, this step may need to access the base table rows on a different AMP (USI) or on the same AMP (NUSI – NUSI rows are always co-located together with their base table rows on the same AMP)

Here is an example of a Remote Nested Join (on the right table a NUSI is defined on the join columns)

CREATE TABLE LEFT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL,
COL1 INTEGER
) UNIQUE PRIMARY INDEX (COL1);
CREATE TABLE RIGHT_TABLE
(
PK INTEGER NOT NULL,
ALT_PK INTEGER NOT NULL
) UNIQUE PRIMARY INDEX (PK )
INDEX (ALT_PK);
SELECT *
FROM
   LEFT_TABLE l
INNER JOIN
   RIGHT TABLE r
ON
   l.ALT_PK = r.ALT_PK
WHERE 
   COL1=100;
3) We do a single-AMP RETRIEVE step from l by way of the unique primary index "l.COL1 = 100" with no residual conditions into Spool 2 (all_amps), which is duplicated on all  AMPs.  Then we do a SORT to order Spool 2 by the hash code of (l.ALT_PK).  The size of Spool 2 is estimated with high confidence to be 72 rows (1,800 bytes).  The estimated time for this step is 0.00 seconds. 4) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to r by way of a traversal of index # 4 without accessing the base table extracting row ids only.  Spool 2 and r are joined using a nested join, with a join condition of ("ALT_PK = r.ALT_PK").  The result goes into Spool 3 (all_amps), which is built locally on the AMPs.  Then we do a SORT to order Spool 3 by field Id 1.  The size of Spool 3 is estimated with low confidence to be 1 row (35 bytes).  The estimated time for this step is 0.02 seconds. 5) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to r by way of an  all-rows scan with no residual conditions.  Spool 3 and  r are joined using a row id join, with a join  condition of ("(1=1)").  The result goes into Spool 1 (group_amps),  which is built locally on the AMPs.  The size of Spool 1 is  estimated with low confidence to be 1 row (41 bytes).  The    estimated time for this step is 0.04 seconds.

As can be seen from the execution plan, there are 2 physical joins:

Spool 2 and r are joined using a nested join

Spool 3 and  r are joined using a row id join

Nested Join Performance: Local or Remote?

In principle, the Remote Nested Join is less performant, because the ROWIDs have to be extracted and two physical joins are necessary.

However, since the number of rows is very small (the WHERE condition must be very restrictive even with NUPI, otherwise the optimizer does not make a nested join), the effect on performance is negligible.

Conclusion

The Teradata Nested Join is ideal for tactical workload since it primarily queries single or a few rows. This means that only one or a few AMPs are involved.

It makes sense to analyze requests to see if they are suitable for a nested join and then create corresponding indexes (NUSI, USI).

It may also be possible to create WHERE conditions that do not change the meaning of the SQL statement but allow the use of the nested join.

Teradata Nested Join Without WHERE Condition

A WHERE condition on the left table is not always required.

If the Teradata Optimizer assumes that the left table contains only a few rows, it can execute a Remote Nested Join without a WHERE condition on the left table.

In general, the Teradata Optimizer might consider the usage of a Nested Join if the estimated number of left table rows is tiny (basically, a few rows).

Learning The 4 Principal Teradata Join Methods Is Not Difficult At All! You Just Need A Great Teacher!
Teradata SQL Tuning – 6 Golden Rules you must never forget!
Roland Wenzlofsky
 

Roland Wenzlofsky is a graduated computer scientist and Data Warehouse professional working with the Teradata database system for more than 20 years. He is experienced in the fields of banking and telecommunication with a strong focus on performance optimization.

>