Teradata Optimization with Partial Group by in Joins
What is Partial Group By?
The idea behind PARTIAL GROUP BY is simple: Aggregations that can be performed before the join (without changing the semantics of the query) reduce the amount of data that has to be redistributed or duplicated to all AMPs during the join preparation.
Obviously, the less distinct values the GROUP BY columns have, the greater the savings.
A distinction is made between Early GROUP BY and Partial GROUP BY. The only difference is that Early GROUP BY performs all aggregations before the join, Partial GROUP BY performs part of the aggregation before the join and part after the join. However, this is the only difference.
Here is an example of a query that is predestined for PARTIAL GROUP BY:
FROM table1 INNER JOIN table2 ON table1.key = table2.key
GROUP BY 1;
In this case, it makes no difference whether the join occurs first and then the aggregation or vice versa. Depending on the number of rows per table and the different values in the column “fact” of both tables, the performance can be very different.