Teradata applies two very effective join optimization methods to decision support workloads, which usually involve many aggregations: Early GROUP BY and Partial GROUP BY.

Teradata GROUP BY - Early and Partial Aggregation

Early and Partial GROUP BY are transformations that allow grouping operations to be moved ahead of one or more join operations.

Early GROUP BY executes the entire aggregation before the join. Partial GROUP BY splits the aggregation: the first part is performed before the join and the second part after it.

This technique reduces the resource usage of the join and of the aggregation that follows it (with Early GROUP BY, no aggregation at all is needed after the join).
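As a rough illustration, the Early GROUP BY case can be written out by hand. The sketch below assumes a hypothetical two-table query in which t02.b is unique (for example, a unique primary index), so the join cannot duplicate rows of t01 and the whole aggregation can run before the join. The derived-table alias t01_agg and the column name sum_a are made up for this example; the optimizer performs the transformation internally and does not require such a rewrite.

```sql
-- Original form: join first, aggregate afterwards.
SELECT t01.b, SUM(t01.a)
FROM t01, t02
WHERE t01.b = t02.b
GROUP BY t01.b;

-- Hand-written equivalent of Early GROUP BY (assumes t02.b is unique):
-- t01 is fully aggregated first, so the join sees one row per value of b
-- and no aggregation is needed after the join.
SELECT t01_agg.b, t01_agg.sum_a
FROM (
    SELECT b, SUM(a) AS sum_a
    FROM t01
    GROUP BY b
) AS t01_agg, t02
WHERE t01_agg.b = t02.b;
```

If t02.b were not unique, each match would multiply the sums, which is why the full aggregation cannot always be moved in front of the join.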

The Early Aggregation technique reduces the number of rows as early as possible.

Whether the optimizer applies this technique depends on the estimated join costs.

Consider the following SQL query:

SELECT SUM(t01.a)
FROM t01, t02, t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d;

This query shows how the Teradata optimizer can partially group t01 on the columns {t01.b, t01.d} by pushing the GROUP BY into relation t01. Column t01.b has to be part of the partial grouping because it is still needed for the join with t02.

The grouped spool is then joined with t02. Finally, the resulting spool is joined with t03, and the final GROUP BY on t01.d is applied.
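Written out by hand, the partial grouping described above corresponds roughly to the following query. It is only a sketch of the logical transformation: the derived-table alias t01_agg and the column name sum_a are hypothetical, and the optimizer works on spools internally rather than rewriting the SQL text.

```sql
-- Step 1: partial aggregation on t01, keeping the join column b
--         and the final grouping column d.
-- Step 2: join the reduced rows with t02 and then with t03.
-- Step 3: the final aggregation adds up the partial sums per d.
SELECT SUM(t01_agg.sum_a)
FROM (
    SELECT b, d, SUM(a) AS sum_a
    FROM t01
    GROUP BY b, d
) AS t01_agg, t02, t03
WHERE t01_agg.b = t02.b
  AND t02.c = t03.c
GROUP BY t01_agg.d;
```

The result matches the original query because every t01 row with the same (b, d) pair finds the same join partners, so summing the pre-aggregated values after the join gives the same total as summing the individual rows.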

Early and Partial GROUP BY, despite their complexity in execution plan creation, are commonly employed methods in modern database systems for data reduction. These approaches provide additional join options to the optimizer.

Partial grouping can be detected in the EXPLAIN plan in one of two ways:

  • Explicitly stated: “We do an all-AMPs partial SUM step” or
  • Before the join takes place: “we do a SORT/GROUP”
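To check this for the example query, the EXPLAIN request modifier can be placed in front of the statement. The plan text that comes back depends on the Teradata release, the available statistics, and the data demographics, so no output is shown here.

```sql
-- Returns the optimizer's plan instead of executing the query.
-- Look for a partial SUM step or a SORT/GROUP step ahead of the join steps.
EXPLAIN
SELECT SUM(t01.a)
FROM t01, t02, t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d;
```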

Collect statistics on all columns involved in joins and aggregations to increase the likelihood that the optimizer uses these techniques.
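For the example query, this could look like the statements below. They are a sketch using the COLUMN ... ON form of COLLECT STATISTICS found in newer Teradata releases; older releases may expect the form COLLECT STATISTICS ON t01 COLUMN (b). The multi-column statistic on (b, d) is an assumption about what helps the optimizer estimate the number of partial groups, and which statistics are worthwhile depends on the actual tables.

```sql
-- t01: join column b, grouping column d, and the combination used
--      by the partial GROUP BY.
COLLECT STATISTICS COLUMN (b), COLUMN (d), COLUMN (b, d) ON t01;

-- t02: join columns towards t01 and t03.
COLLECT STATISTICS COLUMN (b), COLUMN (c) ON t02;

-- t03: join column towards t02.
COLLECT STATISTICS COLUMN (c) ON t03;
```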

  • Srivignesh KN says:

    Thanks for explaining. How would Early GROUP BY be classified, and how can it be identified in the EXPLAIN plan?
