Teradata SQL and Advanced GROUPING Functions

What advanced GROUPING methods are there in Teradata SQL?

– GROUP BY GROUPING SETS
– GROUP BY ROLLUP
– GROUP BY CUBE

What is GROUP BY CUBE used for?

One or more columns are used to group across multiple dimensions.
If grouped by n columns, the result is 2 to the power of n rows, as shown in the graph: teradata sql cube

What is GROUP BY ROLLUP used for?

This functionality is used to group columns over a single dimension calculating multiple levels of detail.

What is GROUP BY GROUPING SET used for?

Groupes over one dimension without ROLLUP or several dimensions without CUBE

What is a GROUPING SET?

A set of columns over which we group.

What is an empty GROUPING SET?

It is represented by a left and right bracket “()” and returns a grand total

The advanced GROUPING functions will be demonstrated through the following example. The foundation of this demonstration is a table consisting of the flight count for each aircraft and date:

CREATE MULTISET TABLE Flights
(
   PLANE AS BIGINT NOT NULL,
   FLIGHTDATE AS DATE NOT NULL,
   NR_FLIGHTS INTEGER NOT NULL
) PRIMARY INDEX (PLANE);

Below you can see our test data:

PlaneFlightDateFlights
12019-02-265
22019-02-262
32019-02-263
12019-02-271
22019-02-272
32019-02-275
12019-02-285
22019-02-281
32019-02-281
12019-03-015
22019-03-013
32019-03-012
12019-03-021
22019-03-021
32019-03-021

Teradata SQL – GROUP BY GROUPING SETS

Data can be grouped on a singular dimension without ROLLUP or on multiple dimensions using CUBE. Consider this illustration:

SELECT 
    PLANE, 
    EXTRACT(MONTH FROM FLIGHTDATE) AS TheMonth,  
    EXTRACT(YEAR FROM FLIGHTDATE) AS TheYear, 
    SUM(NR_FLIGHTS) AS NR_FLIGHTS
FROM 
    FLIGHTS
GROUP BY GROUPING SETS (PLANE,TheMonth,TheYear)
ORDER BY 1 DESC,2 DESC,3 DESC;

PlaneTheMonthTheYearNR_FLIGHTS
3NULLNULL12
2NULLNULL9
1NULLNULL12
NULL3NULL13
NULL2NULL25
NULLNULL201938

GROUP BY GROUPING SETS determines the number of flights for each plane, monthly and annually.

Details about GROUP BY GROUPING SET

  • All columns of GROUPING sets must reference column names. The referencing cannot be done using the position as it is possible in simple GROUP BY (GROUP BY 1,2).
  • Each GROUPING SET may contain a maximum of 190 columns.
  • Large objects cannot be used in a GROUPING SET (CLOB, BLOB). They must first be cast into another data type. CLOB can be cast to CHAR or VARCHAR. BLOB can be cast to BYTE or VARBYTE.

Teradata SQL – GROUP BY ROLLUP

Categorizes data along a singular dimension with multiple levels of detail. An illustration is presented below:

SELECT  
    PLANE, 
    EXTRACT(MONTH FROM FLIGHTDATE) AS TheMonth, 
    EXTRACT(YEAR FROM FLIGHTDATE) AS TheYear, 
    SUM(NR_FLIGHTS) AS NR_FLIGHTS 
 FROM 
    FLIGHTS
GROUP BY ROLLUP (Plane,TheMonth,TheYear) 
ORDER BY 1 DESC,2 DESC,3 DESC;
PlaneTheMonthTheYearNR_FLIGHTS
3320193
33NULL3
3220199
32NULL9
3NULLNULL12
2320194
23NULL4
2220195
22NULL5
2NULLNULL9
1320196
13NULL6
12201911
12NULL11
1NULLNULL17
NULLNULLNULL38

GROUP BY ROLLUP calculates the number of flights for each plane, each distinct month, each month per year, and the total.

Details about GROUP BY ROLLUP

  • Several GROUPING SETS can be defined in one SQL. In our example, we use only one (Plane, TheMonth, TheYear)
  • Large objects cannot be used in a GROUPING SET (CLOB, BLOB). They must first be cast into another data type. CLOB can be cast to CHAR or VARCHAR. BLOB can be cast to BYTE or VARBYTE.
  • All columns of GROUPING sets must reference column names. The referencing cannot be done using the position as in simple GROUP BY (GROUP BY 1,2).
  • GROUP BY ROLLUP on a single column corresponds to a simple GROUP BY statement
  • Each GROUPING SET may contain a maximum of 190 columns.

Teradata SQL – GROUP BY CUBE

Organizes data across multiple dimensions. For instance:

SELECT 
    PLANE, 
    EXTRACT(MONTH FROM FLIGHTDATE) AS TheMonth, 
    EXTRACT(YEAR FROM FLIGHTDATE) AS TheYear,
    SUM(NR_FLIGHTS) AS NR_FLIGHTS 
FROM 
    FLIGHTS
GROUP BY CUBE (Plane,TheMonth,TheYear)
ORDER BY 1 DESC,2 DESC,3 DESC;
PlaneTheMonthTheYearNR_FLIGHTS
3320193
33NULL3
3220199
32NULL9
3NULL201912
3NULLNULL12
2320194
23NULL4
2220195
22NULL5
2NULL20199
2NULLNULL9
1320196
13NULL6
12201911
12NULL11
1NULL201917
1NULLNULL17
NULL3201913
NULL3NULL13
NULL2201925
NULL2NULL25
NULLNULL201938
NULLNULLNULL38

Details about GROUP BY CUBE

  • Several GROUPING SETS can be defined in one Teradata SQL. In our example, we use only one (Plane, TheMonth, TheYear)
  • All columns of GROUPING sets must reference column names. The referencing cannot be done using the position as in simple GROUP BY (GROUP BY 1,3).
  • Large objects cannot be used in a GROUPING SET (CLOB, BLOB). They must first be cast into another data type. CLOB can be cast to CHAR or VARCHAR. BLOB can be cast to BYTE or VARBYTE.
  • Each GROUPING SET may contain a maximum of 8 columns

Teradata’s advanced grouping options facilitate multi-level grouping, resulting in improved performance.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>