Teradata Join Optimization

1
817

The following case study shows how we can sometimes improve query performance by a slight change of the query text.
The starting point is below SQL statement. The primary index of both tables is PK (one of the tables is a volatile table, but this doesn’t make any difference for our example). One of the tables is partitioned.

 

SELECT *
FROM
Table1 t01
LEFT OUTER JOIN
Table2 t02
ON
t02.PK = (CASE WHEN t01.TheCode <> ‘R’ THEN t01.PK END)
AND DATE ‘2016-02-29’ BETWEEN t02.RSD AND t02.RED
AND t02.TSS = TIMESTAMP ‘9999-12-31 23:59:59’
;

When I was executing this query, it became skewed. What I disliked immediately about this query was the join condition:

t02.PK = (CASE WHEN t01.TheCode <> ‘R’ THEN t01.PK END)

It is never a good idea to apply functions on join conditions. This should be avoided whenever possible. Functions in joins are a hint that something is wrong with the data model (for example, if 1NF is violated)

There exist several disadvantages with expressions on join conditions:

  • The optimizer will not have good statistics available for your expression.
  • No AMP-local merge join (direct merge join) can be done, even if both tables have the same primary index (as in our example).

In our test case, there is an additional disadvantage which is revealed by substituting the join condition into the below SELECT statement:

SELECT
CASE WHEN TheCode <> ‘R’ THEN PK END AS Func,COUNT(*)
FROM Table1
GROUP BY 1
ORDER BY 2 DESC;
Func                                          COUNT(*)
NULL                                        3.438.058
2.000.186.886.985                                 1
2.000.191.425.938                                 1
2.000.179.133.299                                 1
2.000.195.702.747                                 1
2.000.174.021.594                                 1
2.000.190.591.042                                 1

Almost all resulting values of the join condition will become NULL values, i.e. being redistributed to the same AMP!

This is the reason for the high skew when running the query.

Below is the detailed execution plan. Table2 has to be redistributed by the hash code of our complex join expression.

Most of the rows are moved to the same AMP (usually the highest AMP holds the NULL values).

The subsequent join step (hash join) is skewed:

Explain SELECT *
FROM Table1 t01
LEFT OUTER JOIN Table2 t02
ON t02.PK = CASE WHEN t01.TheCode <> ‘R’ THEN t01.PK END
AND DATE ‘2016-02-29’ BETWEEN t02.RSD AND t02.RED
AND t02.TSS = TIMESTAMP ‘9999-12-31 23:59:59’
1) First, we lock TheDatabase.Table22 in view
TheDatabase.Table2 for access.
2) Next, we execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from
MyUser.t01 by way of an all-rows scan with
no residual conditions into Spool 2 (all_amps) (compressed
columns allowed) fanned out into 39 hash join partitions,
which is redistributed by the hash code of ((CASE WHEN
(MyUser.t01.TheCode <> ‘R’) THEN
(MyUser.t01.PK) ELSE (NULL) END
)(BIGINT)) to all AMPs.  The size of Spool 2 is estimated
with high confidence to be 12,380,063 rows (5,434,847,657
bytes).  The estimated time for this step is 21.92 seconds.
2) We do an all-AMPs RETRIEVE step from TheDatabase.Table22
in view TheDatabase.Table2 by way of an all-rows scan
with a condition of (“(TheDatabase.Table22 in view
TheDatabase.Table2.RSD <= DATE ‘2016-02-29’) AND
((TheDatabase.Table22 in view
TheDatabase.Table2.RED >= DATE ‘2016-02-29’) AND
((TheDatabase.Table22 in view
TheDatabase.Table2.TSS = TIMESTAMP ‘9999-12-31
23:59:59.000000’) AND (TheDatabase.Table22 in view
TheDatabase.Table2.VERSION = 0 )))”) into Spool 3
(all_amps) (compressed columns allowed) fanned out into 39
hash join partitions, which is built locally on the AMPs.
The size of Spool 3 is estimated with no confidence to be
21,632,828 rows (3,893,909,040 bytes).  The estimated time
for this step is 6.79 seconds.
3) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an
all-rows scan, which is joined to Spool 3 (Last Use) by way of an
all-rows scan.  Spool 2 and Spool 3 are left outer joined using a
hash join of 39 partitions, with a join condition of (“PK =
(( CASE WHEN (TheCode <> ‘R’) THEN (PK) ELSE (NULL) END ))”).
The result goes into Spool 1 (group_amps), which is built locally
on the AMPs.  The size of Spool 1 is estimated with no confidence
to be 12,380,063 rows (7,985,140,635 bytes).  The estimated time
for this step is 8.22 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.  The total estimated time is 30.15 seconds.

Fortunately, it’s quite easy to rewrite the query in a way to avoid data redistribution and skewing:

SELECT *
FROM
Table1 t01
LEFT OUTER JOIN
Table2 t02
ON
t02.PK = t01.PK
AND DATE ‘2016-02-29’ BETWEEN t02.RSD AND t02.RED
AND t02.TSS = TIMESTAMP ‘9999-12-31 23:59:59’
AND t01.TheCode <> ‘R’

The new version of the query doesn’t require any join expression.  It’s doing straight-forward sliding windows merge join on the primary index of both tables. No data redistribution is required.

Here is the execution plan:

Explain SELECT *
FROM
Table1 t01
LEFT OUTER JOIN
Table2 t02
ON
t02.PK = t01.PK
AND DATE ‘2016-02-29’ BETWEEN t02.RSD AND t02.RED
AND t02.TSS = TIMESTAMP ‘9999-12-31 23:59:59’
AND t01.TheCode <> ‘R’

 

1) First, we lock TheDatabase.Table22 in view
TheDatabase.Table2 for access.
2) Next, we do an all-AMPs JOIN step from
MyUser.t01 by way of a RowHash match scan with
no residual conditions, which is joined to TheDatabase.Table22
in view TheDatabase.Table2 by way of a RowHash match scan with
a condition of (“(TheDatabase.Table22 in view
TheDatabase.Table2.RSD <= DATE ‘2016-02-29’) AND
((TheDatabase.Table22 in view TheDatabase.Table2.RED >=
DATE ‘2016-02-29’) AND ((TheDatabase.Table22 in view
TheDatabase.Table2.TSS = TIMESTAMP ‘9999-12-31
23:59:59.000000’) AND (TheDatabase.Table22 in view
TheDatabase.Table2.VERSION = 0 )))”).
MyUser.t01 and TheDatabase.Table22 are
left outer joined using a sliding-window merge join, with
condition(s) used for non-matching on left table (
“((MyUser.t01.TheCode > ‘R’) OR
(MyUser.t01.TheCode < ‘R’)) AND (NOT
(MyUser.t01.PK IS NULL ))”), with a join
condition of (“TheDatabase.Table22.PK =
MyUser.t01.PK”).  The result goes into
Spool 1 (group_amps), which is built locally on the AMPs.  The
size of Spool 1 is estimated with no confidence to be 12,380,063
rows (7,985,140,635 bytes).  The estimated time for this step is
12.17 seconds.
3) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.  The total estimated time is 12.17 seconds.

Finally, let me show you the improvements in resource usage:

Query Start End Rows IOs CPUTime Spool
New Version 05.04 11:38 05.04 11:39 12379740 268.383,00 136,52 5.150.474.240,00
Old Version 05.04 11:40 05.04 11:46 12379740 437.771,00 178,46 5.152.128.000,00

 

Disk IOs were reduced by about 40%, CPU usage by 25%.

See also:
Teradata Query Rewriting & Partition Elimination

Our Reader Score
[Total: 5    Average: 3.6/5]
Teradata Join Optimization written by Roland Wenzlofsky average rating 3.6/5 - 5 user ratings

1 COMMENT

  1. >>”It is never a good idea to apply functions on join conditions. This should be avoided whenever possible.”

    So true, and so often neglected…

    Cheers.

    Carlos.

LEAVE A REPLY

Please enter your comment!
Please enter your name here