December 28

0 comments

Teradata DBQL Performance Tuning

By Roland Wenzlofsky

December 28, 2019

tuning

What Are The Most Important Performance Indicators?

In this article, I will show you by which key figures you can use to recognize if a query should be optimized.

The selected Teradata performance indicators illuminate the query from different angles and give a good overview of whether a query should be improved and what the problem is.

All columns listed in the calculations are available in the table DBC.DBQLOGTBL.

The Product Join Indicator

As the name suggests, this indicator is used to determine if there are product joins in a query.

It is not about desired product joins (e.g. cross join with a calendar), as they sometimes occur, but about people who consume large amounts of resources.

Unwanted product joins can be recognized by the fact that the ratio of CPU to IO consumption is high.

Values in the three-digit range are often reached here. The reason is the high number of comparisons of rows executed by a product join.

This is how the Product Join Indicator is calculated:

CASE WHEN TotalIOCount = 0 THEN 0 ELSE (AMPCPUTime * 1000)/TotalIOCount END

If the Product Join Indicator is 3 or higher it is worth taking a look at the execution plan and looking for a product join.

However, a high Product Join Indicator can also have other causes:

  • Aggregations of large amounts of data that pass through the memory. Since the aggregation takes place in memory, hardly any IOs are required, but a lot of CPU seconds.
  • Duplicate rows check when there are many hash collisions
  • Extensive string operations as they are often generated in the queries of BI tools.
  • The use of Multivalue Compression also tends to increase the Product Join Indicator. Since more rows can be packed into a data block when using MVC, fewer IOs are needed to copy the data into memory. This also applies to systems that do not have dedicated hardware for Block Level Compression. BLC needs CPU for decompressing.
teradata PJI
Product Join Indicator detects high CPU with low IO

The Indicator For Unnecessary IOs

Here it is exactly the opposite of the product join indicator.

Queries or steps are identified that take many IOs compared to the CPU seconds.

A high value for this indicator usually means that large amounts of data are read or copied.

This is often a sign that either the primary indexes are not used optimally or secondary indexes are missing.

Here is how this Indicator is calculated:

CASE WHEN AMPCPUTime = 0 THEN 0 ELSE TotalIOCount/(AMPCPUTime * 1000) END

As with the product join indicator, for values above 3, you should take a closer look at the execution plan.

teradata DBQLOGTBL
Wasted IO indicator detects low CPU with high IO usage

The CPU and IO Skew Indicators

This is one of the most important indicators as it identifies skewed workload.

As you know, skew is one of the main causes of poor performance on a Teradata system.

With this indicator, we can search for skewed tables, joins that are skewed, etc.

The calculation for CPU skew is done like this:

CASE WHEN (AMPCPUTime / (HASHAMP()+1) =0 THEN 0 
ELSE MaxAmpCPUTime/(AMPCPUTime / (hashamp()+1) 
END

The calculation for IO skew is done like this:

CASE WHEN (TotalIOCount / (HASHAMP()+1) =0 THEN 0 
ELSE MaxAmpIO/(TotalIOCount / (HASHAMP()+1) 
END 

Values above 2 should be the trigger to take a closer look at the execution plan.

The CPU Impact Indicator

This indicator goes one step further and calculates the effect of skewing on the exploitation of the parallelism of the system.

The indicator is calculated in this way:

MaxAmpCPUTime * (HASHAMP()+1)

Here the basic rule applies: the higher the value compared to the other queries the worse.

High CPU or IO Consumption in General

These are the most important key figures in Performance Tuning.

The great thing about it is:

A query that is executed on the same data always uses the same resources per execution. These metrics are therefore much more appropriate than the runtime of queries which depends on the rest of the workload.

The Query Response Time

As already mentioned, the response time of a query is not a very good indicator for tuning.

Nevertheless, the goal is, of course, to have the fastest possible queries at the end. Even the business user will not care how many CPU seconds a query takes. The result must be available at 8 o'clock in the morning.

At the end that is all that counts. Therefore we should not completely disregard the average runtime.

Armed with this set of metrics, you can now search for queries that consume the most resources on your Teradata system. Have fun!

More about the most important topics in Teradata Performance Tuning can be found here:

Roland Wenzlofsky


Roland Wenzlofsky is a graduated computer scientist and Data Warehouse professional working with the Teradata database system for more than 20 years. He is experienced in the fields of banking and telecommunication with a strong focus on performance optimization.

You might also like

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

Never miss a good story!

 Subscribe to our newsletter to keep up with the latest trends!

>