Teradata ORDER BY and Performance
To employ a basic sorting algorithm, all rows must be present in one location for sorting. However, this is not feasible in Teradata, where numerous AMPs retain a segment of each table. Transporting all the rows to one AMP for sorting purposes would result in a non-scalable and bottlenecked process.
Teradata optimizes the shared-nothing architecture, performing parallel sorting at each level without redistributing rows. Sorting constitutes the final step in the execution plan.
Here we present the algorithm employed in the ORDER BY statement of Teradata:
- At first, the table rows are sorted on each AMP where they reside. A higher number of AMPs permits a higher sort performance as all AMPs are sorted in parallel. The sorted rows are put into a spool table.
- After AMPs finish their local sort step, Teradata will return the local spools to the requesting client. For performance reasons, Teradata creates a global buffer per node and on the parsing engine level and informs the client that the result set is available.
- The client fetches as many rows as needed (recall that we often abort requests after having a specific number of rows available in SQL Assistant).
- Each fetch request causes each AMP to move its top row into the buffer, which is merged globally in sort order. Subsequently,  each AMP puts its next row into the buffer where the sort order merges them. This process continues until the global buffer is full and sent via the attached network to the client. The merge process itself is done by the BYNET software (which handles the merge buffer on its own)
- The above-described process continues if the client fetches rows from the result set or no more rows are available.
Teradata uses a sorting algorithm to arrange data in a specific order.
This sorting algorithm’s exceptional performance stems from its possession of the following qualities:
- The AMP local pre-sorting step is done in parallel on all AMPs, and sort performance increases with the system size (good scalability)
- Teradata must sort only the client-requested rows in the last global merge step. Usually, when the client requests the whole table (such as SQL Assistant), we cancel the request after having a small number of rows on our screen. It would not make sense to do the global sort for millions of rows that never are noticed by the user.
BYNET will merge-sort only the rows which are fetched by the client. No sort of all rows is required in advance.
Nicely Explained.
One Question: generally in SQL assistant fetch row limit is set to 2000 or let say we mention SAMPLE 2000 unless you ask for full fetch.
So in that case when we are fetching 2000 rows, BYNET will sort the whole result or will give only first 2000 rows from ALL AMPS in node
Only the 2000 rows are sorted in the global buffer.