Understanding Partial Group By: Reducing Join Costs with Aggregation Optimization

Roland Wenzlofsky

April 28, 2023


What is Partial Group By?

Joins are costly. Before PARTIAL GROUP BY was introduced, the join was always performed first, and the aggregation was applied to the join result afterwards.

PARTIAL GROUP BY reduces the amount of data that must be redistributed or duplicated to all AMPs during join preparation by performing aggregations before the join, without altering the query’s semantics.

The savings increase as the number of unique values in the GROUP BY columns decreases.
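To make the effect of the number of distinct values concrete, here is a minimal sketch in Python. It is a toy model, not Teradata's cost model: it assumes that aggregating before the join leaves one row per distinct GROUP BY value, and it ignores how rows are spread across AMPs. The function name rows_to_move and the sample data are made up for illustration.

```python
def rows_to_move(keys):
    """Return (rows without pre-aggregation, rows with pre-aggregation).

    Toy assumption: aggregating first leaves one row per distinct key,
    so only that many rows would need to be redistributed or duplicated.
    """
    return len(keys), len(set(keys))


# 1,000 rows with only 2 distinct GROUP BY values: large saving.
few_distinct = ["A"] * 500 + ["B"] * 500

# 1,000 rows with 1,000 distinct GROUP BY values: no saving at all.
many_distinct = [f"K{i}" for i in range(1000)]

for name, keys in (("few distinct", few_distinct), ("many distinct", many_distinct)):
    before, after = rows_to_move(keys)
    print(f"{name}: {before} rows without, {after} rows with pre-aggregation")
```

With few distinct values, almost all rows collapse before the join preparation step; with a nearly unique column, aggregating first saves nothing and only adds work.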

Early GROUP BY and Partial GROUP BY differ only in when the aggregation takes place relative to the join. Early GROUP BY completes all aggregations before the join, whereas Partial GROUP BY performs part of the aggregation before the join and the rest after it.

The following query is a candidate for PARTIAL GROUP BY:

SELECT table1.key, SUM(table1.fact), SUM(table2.fact)
FROM table1 INNER JOIN table2 ON table1.key = table2.key
GROUP BY 1;

In this scenario, the result is the same whether the aggregation runs before or after the join. Performance, however, can vary with the number of rows and the number of distinct values in the “key” column of each table.
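The claim that the query's result does not depend on the order of join and aggregation can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module purely as a convenient SQL engine; it says nothing about how the Teradata optimizer builds its plan. The sample rows and the two hand-written rewrites (aggregating both tables first, and aggregating only table2 first) are assumptions made for illustration. Note that the pre-aggregated side has to carry a row count so the sums can be scaled correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key INTEGER, fact INTEGER);
    CREATE TABLE table2 (key INTEGER, fact INTEGER);
    INSERT INTO table1 VALUES (1, 10), (1, 20), (2, 5), (3, 7);
    INSERT INTO table2 VALUES (1, 100), (1, 200), (1, 300), (2, 50), (4, 9);
""")

# The query from the article: join first, aggregate afterwards.
JOIN_FIRST = """
    SELECT table1.key, SUM(table1.fact), SUM(table2.fact)
    FROM table1 INNER JOIN table2 ON table1.key = table2.key
    GROUP BY 1 ORDER BY 1
"""

# Hand-written rewrite: aggregate both tables completely before the join.
# Each sum is multiplied by the other side's row count per key.
AGGREGATE_BOTH_FIRST = """
    SELECT a.key, a.sum_fact * b.cnt, b.sum_fact * a.cnt
    FROM (SELECT key, SUM(fact) AS sum_fact, COUNT(*) AS cnt
          FROM table1 GROUP BY key) AS a
    INNER JOIN (SELECT key, SUM(fact) AS sum_fact, COUNT(*) AS cnt
                FROM table2 GROUP BY key) AS b
      ON a.key = b.key
    ORDER BY 1
"""

# Hand-written rewrite: aggregate only table2 before the join and
# finish the aggregation after the join.
AGGREGATE_ONE_SIDE_FIRST = """
    SELECT table1.key, SUM(table1.fact * b.cnt), SUM(b.sum_fact)
    FROM table1
    INNER JOIN (SELECT key, SUM(fact) AS sum_fact, COUNT(*) AS cnt
                FROM table2 GROUP BY key) AS b
      ON table1.key = b.key
    GROUP BY 1 ORDER BY 1
"""

join_first = conn.execute(JOIN_FIRST).fetchall()
both_first = conn.execute(AGGREGATE_BOTH_FIRST).fetchall()
one_side_first = conn.execute(AGGREGATE_ONE_SIDE_FIRST).fetchall()

print(join_first)
print(join_first == both_first == one_side_first)
```

All three statements return the same rows, but the join in the rewritten versions processes fewer rows: for key 1, the original join produces 2 x 3 = 6 rows, while the fully pre-aggregated version joins a single row from each side.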


Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Vienna, Austria's capital, he has been building and tuning some of the largest Teradata data warehouses in the European financial and telecommunication sectors for more than 20 years. He has worked as a developer, designer, business analyst, and project manager. He therefore knows the problems and obstacles that make many data warehouse projects fail, as well as the tricks and tips that will help you succeed.

I tried it and it reduces the runtime by 50%. Btw are you offering also TD performance trainings?

Roland Wenzlofsky says:
Yes, just write me an email to [email protected]
