Teradata GROUP BY Query Optimization

By Roland Wenzlofsky

April 3, 2015


Teradata applies two very effective join optimization methods to decision support workloads, which usually involve many aggregations:

Teradata GROUP BY: Early and Partial Aggregation

Early and Partial GROUP BY are transformations that move grouping operations ahead of one or more join operations.

Early GROUP BY executes the entire aggregation before the join. Partial GROUP BY splits the aggregation: the first part is done before the join, and the second part after it.

Both transformations reduce the resource usage of the join and of the aggregation after the join (with Early GROUP BY, no aggregation at all is needed after the join).
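To make the Early GROUP BY case concrete, here is a minimal hand-written sketch of what the transformation achieves. It assumes two hypothetical tables t01 and t02 in which t02.b is unique, so the join cannot multiply rows of t01. The uniqueness assumption and the alias names g and sum_a are illustrative only, and the optimizer performs this rewrite internally rather than through a derived table.

```sql
-- Original form: join first, aggregate afterward.
SELECT t01.b, SUM(t01.a)
FROM t01, t02
WHERE t01.b = t02.b
GROUP BY t01.b
;

-- Hand-written equivalent (assumption: t02.b is unique):
-- the complete aggregation runs before the join,
-- and no aggregation is needed after it.
SELECT g.b, g.sum_a
FROM (
    SELECT b, SUM(a) AS sum_a
    FROM t01
    GROUP BY b
) AS g, t02
WHERE g.b = t02.b
;
```

If t02.b were not unique, each aggregated row could match several rows of t02, and a second aggregation after the join would be required. That is the Partial GROUP BY case.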

The Early Aggregation technique is used to reduce the number of rows as early as possible.

Whether the optimizer applies this technique depends on the estimated join costs.

Consider the following SQL query:

SELECT SUM(t01.a)
FROM t01,t02,t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d
;

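The above SQL is an example where the Teradata optimizer can push the GROUP BY into relation t01, partially grouping t01 by columns {t01.b, t01.d}. Column t01.b has to be part of the partial grouping because it is still needed for the join with t02.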

The grouped spool can then be joined with relation t02. In the last step, the resulting spool is joined with t03, followed by a final GROUP BY.
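The steps just described can be sketched as a hand-written rewrite of the query. This is only an illustration of the logic, not the plan the optimizer actually builds; the derived table alias p and the column alias sum_a are hypothetical names.

```sql
-- Partial GROUP BY written out by hand:
-- step 1 pre-aggregates t01 by the join column b and the grouping column d,
-- step 2 joins the reduced spool with t02 and t03,
-- step 3 applies the final GROUP BY on d.
SELECT SUM(p.sum_a)
FROM (
    SELECT b, d, SUM(a) AS sum_a
    FROM t01
    GROUP BY b, d
) AS p, t02, t03
WHERE p.b = t02.b AND t02.c = t03.c
GROUP BY p.d
;
```

The result matches the original query because every t01 row within one (b, d) group joins with exactly the same rows of t02 and t03, so summing the pre-aggregated values over the join gives the same totals per value of d.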

Although Early and Partial GROUP BY make the creation of the execution plan more complex (they give the optimizer more join options to evaluate), they are widely used in modern database systems for early data reduction.

Partial grouping can be detected by taking a look at the explain plan:

  • Explicitly stated: “We do an all-AMPs partial SUM step” or
  • Before the join takes place:  “we do a SORT/GROUP”
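
To check for these phrases, the explain plan can be requested by putting the EXPLAIN modifier in front of the query. The example reuses the query from above. The wording and the order of the plan steps depend on the Teradata release, the table demographics, and the available statistics, so no output is shown here.

```sql
-- Returns the plan text instead of executing the query.
EXPLAIN
SELECT SUM(t01.a)
FROM t01,t02,t03
WHERE t01.b = t02.b AND t02.c = t03.c
GROUP BY t01.d
;
```
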
To improve the optimizer’s chances of applying these techniques, collect statistics on all join and aggregation columns.
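
For the example query, this could look like the following sketch. It assumes the tables t01, t02, and t03 exist with the columns used in the query. Which additional statistics are worthwhile (for example, multi-column statistics) depends on the actual data and is not covered here.

```sql
-- Join columns
COLLECT STATISTICS ON t01 COLUMN (b);
COLLECT STATISTICS ON t02 COLUMN (b);
COLLECT STATISTICS ON t02 COLUMN (c);
COLLECT STATISTICS ON t03 COLUMN (c);

-- Grouping column
COLLECT STATISTICS ON t01 COLUMN (d);
```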


Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he has been building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

  • Srivignesh KN says:

    Thanks for explaining. How would Early GROUP BY be classified, and how could it be identified in the explain plan?
