December 15

2 comments

How Teradata sorts the Result Set

By Roland Wenzlofsky

December 15, 2016

sorting, tuning

Sorting in a parallel database system is expensive, as it requires all rows available in one place. The rows of a Teradata table are distributed evenly across all Amps.  Sorting would require moving all rows to one AMP. This would be a skewed,  not scalable operation. 

Teradata uses another sort approach. Here is how it works:

At first, the table rows are sorted on each AMP where they reside. This is always the last step of the execution plan. A higher number of AMPs permits a higher sort performance as all AMPs are sorting in parallel. The sorted rows are put into a spool table.

After AMPs finished their local sort step, the local spools are available to be returned to the requesting client. For performance reasons, Teradata creates a buffer and informs the client that the result set is available.

The client fetches as many rows as needed (keep in mind, that most times we “abort” our request after having a certain amount of rows available in SQL Assistant).

Each fetch request causes each AMP to move its top row into the buffer, where they are merged globally in sort order. Subsequently,  each AMP puts its next row into the buffer where they are merged by the sort order. This continues until the buffer is full and can is sent via the attached network to the client. The merge process itself is done by the BYNET software (which handles the merge buffer on its own)

The whole process continues as long as the client fetches rows from the result set or no more rows are available.

There are two advantages of this approach:

  1. The AMP local pre-sorting step is done in parallel on all AMPs and sort performance increases with the system size (good scalability)
  2. Only the rows being requested by the client have to be sorted in a last global merge step.
    Usually, when the whole table is requested from the client (such as SQL Assistant), we cancel the request after having a certain amount of rows on our screen. It would not make any sense to do the global sort for millions of rows.

Roland Wenzlofsky


Roland Wenzlofsky is a graduated computer scientist and Data Warehouse professional working with the Teradata database system for more than 20 years. He is experienced in the fields of banking and telecommunication with a strong focus on performance optimization.

You might also like

  • BYNET will merge-sort only the rows which are fetched by the client. No sort of all rows is required in advance.

    Reply

  • Nicely Explained.
    One Question: generally in SQL assistant fetch row limit is set to 2000 or let say we mention SAMPLE 2000 unless you ask for full fetch.

    So in that case when we are fetching 2000 rows, BYNET will sort the whole result or will give only first 2000 rows from ALL AMPS in node

    Reply

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    Never miss a good story!

     Subscribe to our newsletter to keep up with the latest trends!

    >