Teradata GROUP BY Query Optimization

Teradata applies two very effective join optimization methods to decision support workloads, which usually involve many aggregations: Early GROUP BY and Partial GROUP BY.

Teradata GROUP BY: Early and Partial Aggregation

Early and Partial GROUP BY are transformations that allow group operations to be moved in front of one or more join operations.

Early GROUP BY executes the whole group statement before the join. Partial GROUP BY splits the aggregation, doing the first part before the join and the second part after it.

Both techniques reduce the resource usage of the join and of the aggregation after the join. With Early GROUP BY, no aggregation at all has to be done after the join.

The Early Aggregation technique is used to reduce the number of rows as early as possible.
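To make the idea concrete, here is a hand-written sketch of what Early GROUP BY amounts to. The tables sales and store and their columns are hypothetical and are not part of the original example. The sketch also assumes that store.store_id is unique, so each aggregated row matches at most one store row. The optimizer performs this kind of transformation internally; the manual rewrite is shown only for illustration.

```sql
-- Hypothetical tables (assumptions for illustration only):
--   sales(store_id, amount)  : large fact table
--   store(store_id, ...)     : store_id assumed to be unique

-- Written form: join first, aggregate afterwards.
SELECT s.store_id, SUM(s.amount) AS total_amount
FROM sales s, store st
WHERE s.store_id = st.store_id
GROUP BY s.store_id;

-- Hand-written equivalent of Early GROUP BY:
-- the complete aggregation runs before the join,
-- so no aggregation is left to do after it.
SELECT agg.store_id, agg.total_amount
FROM (
    SELECT store_id, SUM(amount) AS total_amount
    FROM sales
    GROUP BY store_id
) AS agg, store st
WHERE agg.store_id = st.store_id;
```

The join in the second form only has to process one row per store_id instead of every sales row, which is where the savings come from.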

Whether the optimizer applies these techniques depends on the estimated join costs.

Consider the following SQL query:

SELECT SUM(t01.a)
FROM t01,t02,t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d
;

This is an example of a query where the Teradata optimizer can push the GROUP BY into relation t01, partially grouping t01 by columns {t01.b, t01.d}.

The grouped spool can then be joined with relation t02. In a last step, the resulting spool is joined with t03, and a final GROUP BY is applied.
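The plan described above can be approximated by a hand-written query. This is only a sketch of the logic: the derived table alias p and the column alias sum_a are made up for illustration, and the optimizer works on spools internally rather than rewriting the SQL text.

```sql
-- Partial GROUP BY, written out by hand.
-- p and sum_a are illustrative names, not from the original query.
SELECT SUM(p.sum_a)
FROM (
    -- First part of the aggregation, before any join:
    -- group t01 by the join column b and the final grouping column d.
    SELECT b, d, SUM(a) AS sum_a
    FROM t01
    GROUP BY b, d
) AS p, t02, t03
WHERE p.b = t02.b
  AND t02.c = t03.c
-- Second part of the aggregation, after the joins.
GROUP BY p.d
;
```

Column b has to stay in the partial grouping because it is still needed for the join with t02. The final GROUP BY on d then combines the partial sums of all b values that belong to the same d.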

Although Early and Partial GROUP BY add complexity to the creation of the execution plan (as they offer more join options to the optimizer), they are widely used approaches for early data reduction in modern database systems.

Partial grouping can be easily detected by taking a look at the explain plan:

  • Explicitly stated: “We do an all-AMPs partial SUM step” or
  • Before the join takes place:  “we do a SORT/GROUP”

As always, to improve the chance that the optimizer applies these techniques, you have to collect statistics on all join and aggregation columns.
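For the example query, collecting the statistics and checking the plan could look like the following sketch. The choice of single-column statistics is an assumption made for illustration, and the exact COLLECT STATISTICS syntax can differ between Teradata releases.

```sql
-- Statistics on the join and grouping columns of the example query.
COLLECT STATISTICS COLUMN (b) ON t01;  -- join column
COLLECT STATISTICS COLUMN (d) ON t01;  -- grouping column
COLLECT STATISTICS COLUMN (b) ON t02;  -- join column
COLLECT STATISTICS COLUMN (c) ON t02;  -- join column
COLLECT STATISTICS COLUMN (c) ON t03;  -- join column

-- Show the execution plan and look for the partial SUM step
-- or for a SORT/GROUP before the join.
EXPLAIN
SELECT SUM(t01.a)
FROM t01,t02,t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d
;
```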

Teradata GROUP BY Query Optimization, written by Roland Wenzlofsky on April 3, 2015.