fbpx

Teradata UNION ALL Enhancements

By Roland Wenzlofsky

April 20, 2018


Teradata UNION ALL – The New Features of Release 16

Teradata 16 introduces several new features for queries containing UNION ALL. The usage of these new features can improve query performance, but will not always be applied.

The optimizer will decide to execute the query without using these features if cost estimations for the traditional execution plan are lower. Like always, it’s essential to support the optimizer with correct statistics.

Before Teradata 16, the execution plan for a join between a derived table or view containing UNION ALL looks like this:

  • Create a common spool from all branches of the UNION ALL
  • Do the usual join preparation steps (rehashing, duplicating, etc.) for this spool.
  • Join the spool (left table) with the right table

Until Teradata 15.10, the optimizer will not apply any reduction (WHERE conditions, etc.) before all branches of the statement containing UNION ALL are placed in a common spool.

Starting with Teradata 16, the optimizer can execute queries containing UNION ALL in a different way:

  • Join spool of first UNION ALL branch (left table) against the right table
  • Join spool of second UNION ALL branch (left table) against the right table
  • Join spool of last UNION ALL branch (left table) against the right table
  • Merge the join results into a common spool

This approach will consume fewer resources than the original one if the partial join steps lead to smaller result spools for each partial join.

Likewise, aggregations can be applied on every single UNION ALL branch instead of spooling all branches into a common spool before doing the aggregation (which was the only available option for the optimizer until Teradata 15.10). Again, the performance improvement can be significant if the partial aggregations lead to small spool files.

Another enhancement was introduced for SELECT COUNT(*) queries, which are containing UNION ALL.

Instead of moving all branches into a common spool and counting the number of rows one by one, the Optimizer can now use the cylinder index (the same technique used for single tables) to count each branch’s rows very efficiently and sums the result up afterward.

Unfortunately, these new enhancements cannot be used for UNION queries. Dou you know why? Leave your answer in the comment section.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"rgb(255, 0, 0)"}},"gradients":[]}}]}__CONFIG_colors_palette__
GET OUR TERADATA BOOK

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

  • Avatar
    Patrick Ericx says:

    Table1:
    CLIENTNAME
    Roland

    Table2:
    CLIENTNAME
    Roland
    Jiri
    Patrick

    UNION ALL: will return 1 row from Table1 + 3 rows from Table2 = 4 rows. This is deterministic, and every table separate will always return the same number of rows, so here you can push the calculations down to the branches.
    UNION: as explained by Jiri, will remove duplicates. So only looking to table 2, you don’t know how many rows will be returned. You need to know the context from the other branches.

  • I think the answer is: UNION eliminates duplicate rows. There’s no way of telling whether there’re going to be any duplicates in the UNIONed set just by looking to Cylinder index.

    🙂

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >