Faster than Lightning: Teradata SQL Tuning


It is time to publish another case study, one that impressively demonstrates the impact query rewriting can have on performance.

Our object of study is the query below, which initially ran for about 40 minutes.

As usual, I added all missing statistics and ensured that none were stale: something a SQL tuning specialist should always do first.
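For readers who want to follow along, collecting such statistics can look like the sketch below. The column choices are illustrative, taken from the query further down; these are not the exact statements from this tuning session:

```sql
-- Illustrative sketch only: collect statistics on the filter and join
-- columns used by the query below (adjust to your own schema).
COLLECT STATISTICS COLUMN (CALL_DATE) ON CALL_DETAILS;
COLLECT STATISTICS COLUMN (BILLING_ENGINE_CD, BILLED_IND) ON CALL_DETAILS;
COLLECT STATISTICS COLUMN (CALL_DATE, REPORTING_GROUP_CD1) ON REPORTING_GROUP;
```

Fresh statistics on exactly these columns give the Optimizer realistic row estimates for the partition filter and the joins that follow.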

Next, I analyzed the SQL statement itself. It creates a report based on the call detail records of a telecom company, selecting the total minutes of use (MOU) for one specific day.

The call detail record (CDR) table is partitioned by CALL_DATE, so I expected partition elimination when CALL_DATE is used as a predicate in the WHERE clause.
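For readers unfamiliar with row partitioning, a date-partitioned table typically looks like the sketch below. This is a plausible, simplified definition, not the actual DDL of the CDR table; the column list and date range are assumptions:

```sql
-- Hypothetical, simplified DDL; the real CDR table has many more columns.
CREATE TABLE CALL_DETAILS
( PHONE_ID        INTEGER
, CALL_DATE       DATE NOT NULL
, MINUTES_OF_USE  DECIMAL(18,2)
)
PRIMARY INDEX (PHONE_ID)
PARTITION BY RANGE_N (
  CALL_DATE BETWEEN DATE '2010-01-01' AND DATE '2020-12-31'
  EACH INTERVAL '1' DAY
);
```

With a definition like this, a WHERE predicate such as CALL_DATE = DATE '2015-05-21' lets the Optimizer read a single partition instead of scanning the whole table.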

Indeed, I found an execution step performing partition elimination, but not in the best-performing way: a product join against the huge call detail table, enhanced by dynamic partition elimination.

Further analysis revealed that the filtering of the call detail records happened only at the very end of the execution plan, in the HAVING clause.

By moving the filter on CALL_DATE into the WHERE clause of the driving table, I got the Optimizer to apply static partition elimination at the very beginning of its execution plan.

This tiny change reduced the runtime of the query to a few seconds. Remember, the original query finished after 40 minutes!

A real Teradata SQL tuning home run!

Conclusion:

Always apply filters as early as possible in the SQL statement, i.e., on the driving table, so they take effect in the data retrieval steps. Although the Optimizer does a lot of query rewriting, it cannot recognize every opportunity.
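Stripped down to its essence, the rewrite looks like this. A minimal sketch of the pattern, not the full report query:

```sql
-- Before: the date filter sits in the HAVING clause, so every partition
-- of CALL_DETAILS is read and aggregated before the filter applies.
SELECT CALL_DATE, SUM(MINUTES_OF_USE) AS MOU
FROM CALL_DETAILS
GROUP BY CALL_DATE
HAVING CALL_DATE = DATE '2015-05-21';

-- After: the same filter in the WHERE clause enables static partition
-- elimination; only the partition for 2015-05-21 is read.
SELECT CALL_DATE, SUM(MINUTES_OF_USE) AS MOU
FROM CALL_DETAILS
WHERE CALL_DATE = DATE '2015-05-21'
GROUP BY CALL_DATE;
```

Both versions return the same result here because CALL_DATE is a grouping column, and that equivalence is exactly what makes the rewrite safe.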
This is the badly performing query:

SELECT
TOT.PHONE_ID
, TOT.COUNTRY_CD
, TOT.BILLING_ENGINE_CD
, TOT.PRICEPLAN_GROUP_CD
, TOT.RATING_CODE
, TOT.CUSTOMER_ID
, TOT.REPORTING_GROUP_CD1
, TOT.ROAMING_IND
, TOT.CORRECTIONS_IND
, TOT.CALL_DATE
, TOT.BILL_PERIOD_CD
, 'A6' AS PROVIDER_CD
, SUM(TOT.MINUTES_OF_USE) AS MINUTES_OF_USE
, TOT.BILLED_IND
, TOT.PREPAID_IND
FROM
(SELECT
RG.PHONE_ID
, RG.COUNTRY_CD
, RG.BILLING_ENGINE_CD
, RG.PRICEPLAN_GROUP_CD
, RG.RATING_CODE
, RG.CUSTOMER_ID
, RG.REPORTING_GROUP_CD1
, RG.ROAMING_IND
, RG.CORRECTIONS_IND
, RG.REPORTING_GROUP_CD2
, RG.CALL_DATE
, RG.BILL_PERIOD_CD
, RG.MINUTES_OF_USE
, RG.BILLED_IND
, RG.PREPAID_IND
FROM
(SELECT
CD.PHONE_ID
, CD.COUNTRY_CD
, CD.BILLING_ENGINE_CD
, CD.PRICEPLAN_GROUP_CD
, CD.RATING_CODE
, CD.CUSTOMER_ID
, CD.REPORTING_GROUP_CD1
, CD.ROAMING_IND
, CD.CORRECTIONS_IND
, CD.REPORTING_GROUP_CD2
, CD.CALL_DATE
, CD.MINUTES_OF_USE
, RG.BILL_PERIOD_CD
, CD.BILLED_IND
, CD.PREPAID_IND
FROM
CALL_DETAILS CD
LEFT JOIN
(SELECT BILL_PERIOD_CD, CALL_DATE FROM REPORTING_GROUP GROUP BY 1,2) RG
ON CD.CALL_DATE = RG.CALL_DATE
WHERE
BILLING_ENGINE_CD IN ('AMDOCS')
AND BILLED_IND = 'N'
AND REPORTING_GROUP_CD2 IS NOT NULL
) RG

LEFT JOIN
REPORTING_GROUP R1
ON RG.CALL_DATE = R1.CALL_DATE
AND RG.REPORTING_GROUP_CD1 = R1.REPORTING_GROUP_CD1
AND RG.BILL_PERIOD_CD = R1.BILL_PERIOD_CD

LEFT JOIN
REPORTING_GROUP R2
ON RG.CALL_DATE = R2.CALL_DATE
AND RG.REPORTING_GROUP_CD2 = R2.REPORTING_GROUP_CD1
AND RG.BILL_PERIOD_CD = R2.BILL_PERIOD_CD
WHERE NOT (R1.REPORTING_GROUP_CD1 IS NULL OR R2.REPORTING_GROUP_CD1 IS NULL)
) TOT
GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,14,15
HAVING TOT.CALL_DATE = DATE '2015-05-21';

Here is the execution plan of the badly performing query:

1) First, we lock CALL_DB.REPORTING_GROUP for access,
and we lock CALL_DB.CALL_DETAILS for access.
2) Next, we do an all-AMPs SUM step to aggregate from
CALL_DB.REPORTING_GROUP by way of an all-rows
scan with no residual conditions, grouping by field1 (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD
,CALL_DB.REPORTING_GROUP.CALL_DATE).  Aggregate
Intermediate Results are computed globally, then placed in Spool 3.
The size of Spool 3 is estimated with high confidence to be 12
rows (612 bytes).  The estimated time for this step is 0.02
seconds.

3) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (used to materialize view, derived
table or table function RG) (all_amps) (compressed columns
allowed), which is built locally on the AMPs.  The size of Spool 1
is estimated with high confidence to be 12 rows (444 bytes).  The
estimated time for this step is 0.01 seconds.

4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by
way of an all-rows scan into Spool 8 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs.  Then we
do a SORT to order Spool 8 by the hash code of (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD,
CALL_DB.REPORTING_GROUP.CALL_DATE).  The size
of Spool 8 is estimated with high confidence to be 1,080 rows
(31,320 bytes).  The estimated time for this step is 0.01
seconds.

2) We do an all-AMPs RETRIEVE step from
CALL_DB.REPORTING_GROUP by way of an
all-rows scan with no residual conditions locking for access
into Spool 9 (all_amps) (compressed columns allowed), which
is built locally on the AMPs.  Then we do a SORT to order
spool 9 by the hash code of (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD,
CALL_DB.REPORTING_GROUP.CALL_DATE).  The size
of Spool 9 is estimated with high confidence to be 4,277 rows
(149,695 bytes).  The estimated time for this step is 0.01
seconds.

5) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of a
RowHash match scan, which is joined to Spool 9 (Last Use) by way
of a RowHash match scan.  Spool 8 and Spool 9 are joined using a
merge join, with a join condition of ("(CALL_DATE = CALL_DATE) AND
(BILL_PERIOD_CD = BILL_PERIOD_CD)").  The result
goes into Spool 10 (all_amps) (compressed columns allowed), which
is duplicated on all AMPs.  Then we do a SORT to partition by
rowkey.  The size of Spool 10 is estimated with low confidence to
be 384,930 rows (19,631,430 bytes).  The estimated time for this
step is 0.03 seconds.

6) We execute the following steps in parallel.

1) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way
of an all-rows scan, which is joined to
CALL_DB.CALL_DETAILS with a condition of (
"(CALL_DB.CALL_DETAILS.BILLED_IND = 'N') AND (CALL_DB.CALL_DETAILS.BILLING_ENGINE_CD = 'AMDOCS ')").

Spool 10 and CALL_DB.CALL_DETAILS are joined
using a product join, with a join condition of (
"(CALL_DB.CALL_DETAILS.CALL_DATE = CALL_DATE) AND
((CALL_DB.CALL_DETAILS.CALL_DATE = CALL_DATE) AND
(CALL_DB.CALL_DETAILS.REPORTING_GROUP_CD2 = REPORTING_GROUP_CD1
))") enhanced by dynamic partition elimination.  The input
table CALL_DB.CALL_DETAILS will not be cached in
memory, but it is eligible for synchronized scanning.  The
result goes into Spool 11 (all_amps) (compressed columns
allowed), which is built locally on the AMPs.  The size of
Spool 11 is estimated with low confidence to be 136,515 rows
(21,159,825 bytes).  The estimated time for this step is 4.15
seconds.

2) We do an all-AMPs RETRIEVE step from
CALL_DB.REPORTING_GROUP by way of an
all-rows scan with no residual conditions locking for access
into Spool 12 (all_amps) (compressed columns allowed), which
is duplicated on all AMPs.  The size of Spool 12 is estimated
with high confidence to be 384,930 rows (13,857,480 bytes).

The estimated time for this step is 0.02 seconds.

7) We do an all-AMPs JOIN step from Spool 11 (Last Use) by way of an
all-rows scan, which is joined to Spool 12 (Last Use) by way of an
all-rows scan.  Spool 11 and Spool 12 are joined using a single
partition hash join, with a join condition of (
"(BILL_PERIOD_CD = BILL_PERIOD_CD) AND ((CALL_DATE
= CALL_DATE) AND ((CALL_DATE = CALL_DATE) AND
((BILL_PERIOD_CD = BILL_PERIOD_CD) AND ((CALL_DATE
= CALL_DATE) AND (REPORTING_GROUP_CD1 = REPORTING_GROUP_CD1)))))").
The result goes into Spool 7 (all_amps) (compressed columns allowed), which is
built locally on the AMPs.  The size of Spool 7 is estimated with
low confidence to be 136,515 rows (18,702,555 bytes).  The
estimated time for this step is 0.04 seconds.

8) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by
way of an all-rows scan, grouping by field1 (
CALL_DB.CALL_DETAILS.PHONE_ID
, CALL_DB.CALL_DETAILS.COUNTRY_CD
, CALL_DB.CALL_DETAILS.BILLING_ENGINE_CD
, CALL_DB.CALL_DETAILS.PRICEPLAN_GROUP_CD
, CALL_DB.CALL_DETAILS.RATING_CODE
, CALL_DB.CALL_DETAILS.CUSTOMER_ID
, CALL_DB.CALL_DETAILS.REPORTING_GROUP_CD1
, CALL_DB.CALL_DETAILS.ROAMING_IND
, CALL_DB.CALL_DETAILS.CORRECTIONS_IND
, CALL_DB.CALL_DETAILS.CALL_DATE
, CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD
,'A6' ,CALL_DB.CALL_DETAILS.BILLED_IND
, CALL_DB.CALL_DETAILS.PREPAID_IND).  Aggregate

Intermediate Results are computed globally, then placed in Spool 13.
The size of Spool 13 is estimated with low confidence to be
102,387 rows (31,637,583 bytes).  The estimated time for this step
is 0.17 seconds.
9) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by way of
an all-rows scan with a condition of ("CALL_DATE = DATE
'2015-05-21'") into Spool 5 (group_amps), which is built locally
on the AMPs.  The size of Spool 5 is estimated with low confidence
to be 102,387 rows (14,846,115 bytes).  The estimated time for
this step is 0.02 seconds.

10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.

-> The contents of Spool 5 are sent back to the user as the result of
statement 1.  The total estimated time is 4.44 seconds.

 

This is the tuned query:

EXPLAIN

SELECT
TOT.PHONE_ID
, TOT.COUNTRY_CD
, TOT.BILLING_ENGINE_CD
, TOT.PRICEPLAN_GROUP_CD
, TOT.RATING_CODE
, TOT.CUSTOMER_ID
, TOT.REPORTING_GROUP_CD1
, TOT.ROAMING_IND
, TOT.CORRECTIONS_IND
, TOT.CALL_DATE
, TOT.BILL_PERIOD_CD
, 'A6' AS PROVIDER_CD
, SUM(TOT.MINUTES_OF_USE) AS MINUTES_OF_USE
, TOT.BILLED_IND
, TOT.PREPAID_IND

FROM
(SELECT
RG.PHONE_ID
, RG.COUNTRY_CD
, RG.BILLING_ENGINE_CD
, RG.PRICEPLAN_GROUP_CD
, RG.RATING_CODE
, RG.CUSTOMER_ID
, RG.REPORTING_GROUP_CD1
, RG.ROAMING_IND
, RG.CORRECTIONS_IND
, RG.REPORTING_GROUP_CD2
, RG.CALL_DATE
, RG.BILL_PERIOD_CD
, RG.MINUTES_OF_USE
, RG.BILLED_IND
, RG.PREPAID_IND

FROM
(SELECT
CD.PHONE_ID
, CD.COUNTRY_CD
, CD.BILLING_ENGINE_CD
, CD.PRICEPLAN_GROUP_CD
, CD.RATING_CODE
, CD.CUSTOMER_ID
, CD.REPORTING_GROUP_CD1
, CD.ROAMING_IND
, CD.CORRECTIONS_IND
, CD.REPORTING_GROUP_CD2
, CD.CALL_DATE
, CD.MINUTES_OF_USE
, RG.BILL_PERIOD_CD
, CD.BILLED_IND
, CD.PREPAID_IND

FROM
CALL_DETAILS CD
LEFT JOIN
(SELECT BILL_PERIOD_CD, CALL_DATE FROM REPORTING_GROUP GROUP BY 1,2) RG
ON CD.CALL_DATE = RG.CALL_DATE
WHERE
BILLING_ENGINE_CD IN ('AMDOCS')
AND BILLED_IND = 'N'
AND REPORTING_GROUP_CD2 IS NOT NULL

AND CD.CALL_DATE = DATE '2015-05-21' -- the optimized place of the filter

) RG

LEFT JOIN
REPORTING_GROUP R1
ON RG.CALL_DATE = R1.CALL_DATE
AND RG.REPORTING_GROUP_CD1 = R1.REPORTING_GROUP_CD1
AND RG.BILL_PERIOD_CD = R1.BILL_PERIOD_CD
LEFT JOIN
REPORTING_GROUP R2
ON RG.CALL_DATE = R2.CALL_DATE
AND RG.REPORTING_GROUP_CD2 = R2.REPORTING_GROUP_CD1
AND RG.BILL_PERIOD_CD = R2.BILL_PERIOD_CD
WHERE NOT (R1.REPORTING_GROUP_CD1 IS NULL OR R2.REPORTING_GROUP_CD1 IS NULL)

) TOT
GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,14,15;

This is the execution plan of the tuned query:

1) First, we lock CALL_DB.REPORTING_GROUP for access,
and we lock CALL_DB.CALL_DETAILS for access.
2) Next, we do an all-AMPs SUM step to aggregate from a single
partition of CALL_DB.REPORTING_GROUP with a
condition of ("CALL_DB.REPORTING_GROUP.CALL_DATE =
DATE '2015-05-21'") with a residual condition of (
"CALL_DB.REPORTING_GROUP.CALL_DATE = DATE
'2015-05-21'"), grouping by field1 (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD
,CALL_DB.REPORTING_GROUP.CALL_DATE).  Aggregate
Intermediate Results are computed globally, then placed in Spool 3.

The size of Spool 3 is estimated with low confidence to be 4 rows
(204 bytes).  The estimated time for this step is 0.01 seconds.
3) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1 (used to materialize view, derived
table or table function RG) (all_amps) (compressed columns
allowed), which is built locally on the AMPs.  The size of Spool 1
is estimated with low confidence to be 4 rows (148 bytes).  The
estimated time for this step is 0.01 seconds.

4) We execute the following steps in parallel.

1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by
way of an all-rows scan with a condition of ("RG.CALL_DATE =
DATE '2015-05-21'") into Spool 8 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs.  Then we
do a SORT to order Spool 8 by the hash code of (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD).

The size of Spool 8 is estimated with low confidence to be
360 rows (10,440 bytes).  The estimated time for this step is
0.01 seconds.

2) We do an all-AMPs RETRIEVE step from a single partition of
CALL_DB.REPORTING_GROUP with a condition of
("CALL_DB.REPORTING_GROUP.CALL_DATE = DATE
'2015-05-21'") with a residual condition of (
"CALL_DB.REPORTING_GROUP.CALL_DATE = DATE
'2015-05-21'") locking for access into Spool 9 (all_amps)
(compressed columns allowed), which is built locally on the
AMPs.  Then we do a SORT to order Spool 9 by the hash code of (
CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD).

The size of Spool 9 is estimated with high confidence to be
1,591 rows (55,685 bytes).  The estimated time for this step
is 0.00 seconds.

3) We do an all-AMPs RETRIEVE step from a single partition of
CALL_DB.REPORTING_GROUP with a condition of
("CALL_DB.REPORTING_GROUP.CALL_DATE = DATE
'2015-05-21'") with a residual condition of (
"CALL_DB.REPORTING_GROUP.CALL_DATE = DATE
'2015-05-21'") locking for access into Spool 10 (all_amps)
(compressed columns allowed), which is duplicated on all AMPs.

The size of Spool 10 is estimated with high confidence to be
143,190 rows (5,011,650 bytes).  The estimated time for this
step is 0.01 seconds.

4) We do an all-AMPs RETRIEVE step from a single partition of
CALL_DB.CALL_DETAILS with a condition of (
"CALL_DB.CALL_DETAILS.CALL_DATE = DATE
'2015-05-21'") with a residual condition of (
"(NOT (CALL_DB.CALL_DETAILS.REPORTING_GROUP_CD2 IS NULL))
AND ((CALL_DB.CALL_DETAILS.CALL_DATE = DATE
'2015-05-21') AND ((CALL_DB.CALL_DETAILS.BILLED_IND = 'N') AND
(CALL_DB.CALL_DETAILS.BILLING_ENGINE_CD = 'AMDOCS
')))") into Spool 11 (all_amps) (compressed columns allowed),
which is built locally on the AMPs.  The size of Spool 11 is
estimated with low confidence to be 212,693 rows (27,224,704
bytes).  The estimated time for this step is 0.06 seconds.

5) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of a
RowHash match scan, which is joined to Spool 9 (Last Use) by way
of a RowHash match scan.  Spool 8 and Spool 9 are joined using a
merge join, with a join condition of ("(BILL_PERIOD_CD =
BILL_PERIOD_CD) AND (CALL_DATE = CALL_DATE)").  The result
goes into Spool 12 (all_amps) (compressed columns allowed), which
is duplicated on all AMPs.  The size of Spool 12 is estimated with
low confidence to be 143,190 rows (7,302,690 bytes).  The
estimated time for this step is 0.02 seconds.

6) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of an
all-rows scan, which is joined to Spool 11 (Last Use) by way of an
all-rows scan.  Spool 10 and Spool 11 are joined using a single
partition hash join, with a join condition of ("(REPORTING_GROUP_CD1 =
REPORTING_GROUP_CD1) AND (CALL_DATE = CALL_DATE)").  The result goes into
Spool 13 (all_amps) (compressed columns allowed), which is built
locally on the AMPs.  The size of Spool 13 is estimated with low
confidence to be 654,535 rows (94,907,575 bytes).  The estimated
time for this step is 0.11 seconds.

7) We do an all-AMPs JOIN step from Spool 12 (Last Use) by way of an
all-rows scan, which is joined to Spool 13 (Last Use) by way of an
all-rows scan.  Spool 12 and Spool 13 are joined using a single
partition hash join, with a join condition of ("(REPORTING_GROUP_CD2 =
REPORTING_GROUP_CD1) AND ((CALL_DATE = CALL_DATE) AND ((CALL_DATE = CALL_DATE) AND ((BILL_PERIOD_CD = BILL_PERIOD_CD) AND
((CALL_DATE = CALL_DATE) AND ((CALL_DATE = CALL_DATE) AND
(BILL_PERIOD_CD = BILL_PERIOD_CD))))))").  The
result goes into Spool 7 (all_amps) (compressed columns allowed),
which is built locally on the AMPs.  The size of Spool 7 is
estimated with low confidence to be 473,348 rows (64,848,676
bytes).  The estimated time for this step is 0.10 seconds.

8) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by
way of an all-rows scan, grouping by field1 (
CALL_DB.CALL_DETAILS.PHONE_ID,CALL_DB.CALL_DETAILS.COUNTRY_CD
,CALL_DB.CALL_DETAILS.BILLING_ENGINE_CD
,CALL_DB.CALL_DETAILS.PRICEPLAN_GROUP_CD
,CALL_DB.CALL_DETAILS.RATING_CODE
,CALL_DB.CALL_DETAILS.CUSTOMER_ID
,CALL_DB.CALL_DETAILS.REPORTING_GROUP_CD1
,CALL_DB.CALL_DETAILS.ROAMING_IND
,CALL_DB.CALL_DETAILS.CORRECTIONS_IND
,CALL_DB.CALL_DETAILS.CALL_DATE
,CALL_DB.REPORTING_GROUP.BILL_PERIOD_CD
,'TOT' ,CALL_DB.CALL_DETAILS.BILLED_IND
,CALL_DB.CALL_DETAILS.PREPAID_IND).  Aggregate
Intermediate Results are computed globally, then placed in Spool 14.
The size of Spool 14 is estimated with no confidence to be
355,011 rows (109,698,399 bytes).  The estimated time for this
step is 0.32 seconds.

9) We do an all-AMPs RETRIEVE step from Spool 14 (Last Use) by way of
an all-rows scan into Spool 5 (group_amps), which is built locally
on the AMPs.  The size of Spool 5 is estimated with no confidence
to be 355,011 rows (51,476,595 bytes).  The estimated time for
this step is 0.06 seconds.

10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.

-> The contents of Spool 5 are sent back to the user as the result of
statement 1.  The total estimated time is 0.70 seconds.

 

Faster than Lightning: Teradata SQL Tuning, written by Roland Wenzlofsky on May 21, 2015.

COMMENTS

  1. Hi Roland,

     Nice case study.
     Could you please also include the AMPCPU usage before and after tuning?
     Sometimes elapsed time alone does not give a clear picture.
     Factors like the delay queue or server load can influence the overall runtime of the query.

     Regards,
     Nitin

  2. Hi Nitin, you are absolutely right. What happened with this query is the following: on the Teradata system I am testing, the logging for DBC.DBQLOGTBL filters out all requests consuming fewer than 100 CPU cycles. For the optimized SQL there was not a single entry in the query log; only the original query was available. That is why I don't have these measurements.
