92% less CPU Usage with a Simple Trick – Teradata PPI Tables

Joining partitioned tables can be quite expensive, especially if the partitioning doesn't match or one of the tables is not partitioned. Today I discovered an interesting trick that reduces resource usage when joining two tables with matching Primary Indexes and partitioning.

The test setup consists of the following two tables (each containing about 50,000 rows):

CREATE MULTISET TABLE DWHPro.Sales2, NO FALLBACK,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
(
    SalesId INTEGER NOT NULL,
    Quantity DECIMAL(18,2),
    SalesDate DATE FORMAT 'YYYY-MM-DD'
)
PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N(SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-31' EACH INTERVAL '1' DAY, NO RANGE);

CREATE MULTISET TABLE DWHPro.Sales3, NO FALLBACK,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
(
    SalesId INTEGER NOT NULL,
    Quantity DECIMAL(18,2),
    SalesDate DATE FORMAT 'YYYY-MM-DD'
)
PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N(SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-31' EACH INTERVAL '1' DAY, NO RANGE);
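The article does not show how the tables were loaded. Here is a minimal, hypothetical load sketch (my own assumption, not the original script) that produces roughly one row per calendar day, which matches the row counts in the plans below; the SalesId and Quantity values are generated arbitrarily:

-- Hypothetical load (assumption, not from the article): one row per day,
-- generated from the Teradata system calendar.
INSERT INTO DWHPro.Sales2
SELECT
    ROW_NUMBER() OVER (ORDER BY calendar_date)   -- assumed surrogate key
  , CAST(RANDOM(1, 1000) AS DECIMAL(18,2))       -- assumed dummy quantity
  , calendar_date
FROM sys_calendar.CALENDAR
WHERE calendar_date BETWEEN DATE '1900-01-01' AND DATE '2016-12-31';

-- The second table gets identical rows so that the join finds matches
INSERT INTO DWHPro.Sales3
SELECT * FROM DWHPro.Sales2;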

First, we execute the following query, whose WHERE condition covers all partitions (see the DDLs above). We could also leave out the WHERE condition and would get the same result:

SELECT
    t01.SalesId,
    t02.SalesId
FROM Sales2 t01
INNER JOIN Sales3 t02
    ON t01.SalesId = t02.SalesId
   AND t01.SalesDate = t02.SalesDate
WHERE t01.SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-31';

The execution plan below shows that both tables are redistributed by the row hash of (SalesId, SalesDate), and a subsequent merge join is done:

1) First, we lock a distinct DWHPRO."pseudo table" for read on a
RowHash to prevent global deadlock for DWHPRO.t02.
2) Next, we lock a distinct DWHPRO."pseudo table" for read on a
RowHash to prevent global deadlock for DWHPRO.t01.
3) We lock DWHPRO.t02 for read, and we lock DWHPRO.t01 for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from 42734 partitions of
DWHPRO.t02 with a condition of ("(DWHPRO.t02.SalesDate >=
DATE '1900-01-01') AND (DWHPRO.t02.SalesDate <= DATE
'2016-12-31')") into Spool 2 (all_amps), which is
redistributed by the hash code of (DWHPRO.t02.SalesId,
DWHPRO.t02.SalesDate) to all AMPs. Then we do a SORT to
order Spool 2 by row hash. The size of Spool 2 is estimated
with high confidence to be 42,734 rows (897,414 bytes). The
estimated time for this step is 0.34 seconds.
2) We do an all-AMPs RETRIEVE step from 42734 partitions of
DWHPRO.t01 with a condition of ("(DWHPRO.t01.SalesDate <=
DATE '2016-12-31') AND (DWHPRO.t01.SalesDate >= DATE
'1900-01-01')") into Spool 3 (all_amps), which is
redistributed by the hash code of (DWHPRO.t01.SalesId,
DWHPRO.t01.SalesDate) to all AMPs. Then we do a SORT to
order Spool 3 by row hash. The size of Spool 3 is estimated
with high confidence to be 42,734 rows (897,414 bytes). The
estimated time for this step is 0.34 seconds.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
RowHash match scan, which is joined to Spool 3 (Last Use) by way
of a RowHash match scan. Spool 2 and Spool 3 are joined using a
merge join, with a join condition of ("(SalesId = SalesId) AND
(SalesDate = SalesDate)"). The result goes into Spool 1
(group_amps), which is built locally on the AMPs. The size of
Spool 1 is estimated with low confidence to be 42,734 rows (
1,239,286 bytes). The estimated time for this step is 0.31
seconds.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.66 seconds.

In the next step, we change the query by moving the upper bound of the WHERE condition back by one day (from '2016-12-31' to '2016-12-30'), so that the last partition is no longer covered:

SELECT
    t01.SalesId,
    t02.SalesId
FROM Sales2 t01
INNER JOIN Sales3 t02
    ON t01.SalesId = t02.SalesId
   AND t01.SalesDate = t02.SalesDate
WHERE t01.SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-30';

The resulting explain plan reveals that the Optimizer switches to a rowkey-based merge join, avoiding the costly row redistribution:

1) First, we lock a distinct DWHPRO."pseudo table" for read on a
RowHash to prevent global deadlock for DWHPRO.t02.
2) Next, we lock a distinct DWHPRO."pseudo table" for read on a
RowHash to prevent global deadlock for DWHPRO.t01.
3) We lock DWHPRO.t02 for read, and we lock DWHPRO.t01 for read.
4) We do an all-AMPs JOIN step from 42733 partitions of DWHPRO.t02 by
way of a RowHash match scan with a condition of (
"(DWHPRO.t02.SalesDate >= DATE '1900-01-01') AND
(DWHPRO.t02.SalesDate <= DATE '2016-12-30')"), which is joined to
42733 partitions of DWHPRO.t01 by way of a RowHash match scan with
a condition of ("(DWHPRO.t01.SalesDate <= DATE '2016-12-30') AND
(DWHPRO.t01.SalesDate >= DATE '1900-01-01')"). DWHPRO.t02 and
DWHPRO.t01 are joined using a rowkey-based merge join, with a join
condition of ("(DWHPRO.t01.SalesId = DWHPRO.t02.SalesId) AND
(DWHPRO.t01.SalesDate = DWHPRO.t02.SalesDate)"). The result goes
into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with low confidence to be 42,734
rows (1,239,286 bytes). The estimated time for this step is 0.28
seconds.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.28 seconds.

Afterwards, I compared the resource usage of the following two statements. Both deliver the same result, but the first uses the standard merge join, while the second is designed to "force" the Optimizer into the rowkey-based merge join:

SELECT -- Standard merge join
    t01.SalesId,
    t02.SalesId
FROM Sales2 t01
INNER JOIN Sales3 t02
    ON t01.SalesId = t02.SalesId
   AND t01.SalesDate = t02.SalesDate
WHERE t01.SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-31';
SELECT -- Forcing the Optimizer into the rowkey-based merge join
    t01.SalesId,
    t02.SalesId
FROM Sales2 t01
INNER JOIN Sales3 t02
    ON t01.SalesId = t02.SalesId
   AND t01.SalesDate = t02.SalesDate
WHERE t01.SalesDate BETWEEN DATE '1900-01-01' AND DATE '2016-12-30'

UNION ALL

SELECT
    t01.SalesId,
    t02.SalesId
FROM Sales2 t01
INNER JOIN Sales3 t02
    ON t01.SalesId = t02.SalesId
   AND t01.SalesDate = t02.SalesDate
WHERE t01.SalesDate = DATE '2016-12-31';
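The article doesn't state how the resource figures below were collected. Assuming DBQL query logging is enabled on the system, a comparison along these lines would deliver the numbers (a sketch only, using the standard DBC.QryLogV view):

-- Sketch (assumption): read the DBQL figures for the two test statements
SELECT
    QueryText
  , TotalIOCount                                     -- logical I/Os
  , AMPCPUTime                                       -- CPU seconds
  , SpoolUsage / (1024 * 1024 * 1024.00) AS SpoolGB  -- spool in GB
FROM DBC.QryLogV
WHERE UserName = USER
  AND QueryText LIKE '%Sales2%'
ORDER BY StartTime DESC;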

As expected, the rowkey-based merge join is much cheaper and has a much better runtime:

                            Logical IOs   CPU Seconds   Spool (GB)
RowKey-Based Merge Join           91.00          0.02         0.10
Traditional Merge Join           272.00          0.24         1.80
Reduction in Resources              67%           92%          94%
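For reference, the reduction percentages are computed as (traditional - rowkey) / traditional; for CPU this is (0.24 - 0.02) / 0.24, roughly 92%, which is the figure in the title.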

Conclusion:

This can be a great trick whenever you want to retrieve all partitions of two PPI tables being joined together. If the WHERE condition covers all partitions (or is left out entirely), you may end up with a traditional merge join. The rewrite shown above lets you force the Optimizer into the much cheaper rowkey-based merge join.

While I don't understand why the Optimizer chooses a standard merge join when we select all partitions, this article shows that there is a lot of potential for optimizing joins by manually rewriting your queries!

UPDATE: It looks like the Optimizer avoids the rowkey-based merge join in some cases if the tables are very small. There seems to be an Optimizer rule along the lines of "if the tables are small and all partitions are retrieved, go for a traditional merge join." This trick is probably not needed for big tables, as the Optimizer will choose the rowkey-based merge join by default anyway.
