On this website, you will find many articles about Teradata SQL Tuning. We describe in detail the tools which are available to optimize queries. In this article, I want to show you the best way to proceed when you need to optimize a query. The presented method is only one of many possibilities. Teradata tuning has a lot to do with experience but also luck.
The tuning process usually starts with business users or DBAs complaining about slow queries. If you have read our other articles carefully, the runtime of queries is a legitimate point of criticism. Still, for us as performance tuners, it is about reducing the resources needed for a slow query and minimizing the skew. Resource consumption is primarily about disk IOs. Required CPU seconds are also an important metric, but as long as the Teradata system is not CPU bound, the number of IOs plays a more significant role. As we know, a computer can process data only when it is available in the main memory. The copying of data blocks into the main memory is the bottleneck. IOs and skew are where we must primarily start.
The first decision point in which direction we need to optimize is to understand if we are dealing with a tactical or strategic workload. The kind of workload makes a big difference. Strategic workload processes large amounts of data, which is more tolerated when the processing takes some time. Fluctuations in runtime triggered by competing workloads are also typically not a problem. The main goal of strategic workload is to reduce the overall average runtime.
The situation is different for the tactical workload. Here, the expectations of business users are different. They usually expect short and, above all, stable runtimes.
Therefore, the tactical workload is more challenging to optimize since we have to reduce the runtime here and simultaneously ensure stable runtimes. Consequently, we will focus on this in the further course of this article.
Tactical queries are highly sensitive to concurrent workloads. Therefore, it is helpful to get a first overview by executing the tactical queries scheduled over a more extended period covering all exceptional cases. Doing a scheduled run over some period should give a first overview of how the queries behave under load.
From practical experience, I can say that the same query that runs on an unloaded system takes 10 seconds on a heavily loaded system and can run for an hour. If this is the case, it is worth taking a closer look at workload management. Tactical queries should never be delayed by TASM. It is also essential to prevent queries from being blocked because other processes use the required database objects. LOCKING FOR ACCESS is very helpful here.
The optimization can begin once the issues mentioned above have been eliminated as disruptive factors. We must ask ourselves: “How do we achieve stable runtimes?” The answer is simple: It can only be achieved by removing full table scans in the execution plan. We need direct access to the data in every step. So we want to see accesses “by way of the primary index” and single-AMP or group-AMP joins. If we create secondary indexes or join indexes, their main goal is direct access to the data.
Now, this sounds easier than it is. The problem with Teradata is that dropping statistics or changing data demographics can already change the execution plan and negate our optimizations. Often, the statistics landscape is out of our control.
That’s why achieving Primary Index access or access via a USI is even more essential. Since this is the most efficient access, the Optimizer prefers this even if no statistics are available.
No matter the workload we are dealing with, we always start collecting and updating statistics when we optimize specific queries. We want to avoid optimizing an execution plan that changes as soon as missing statistics are collected.
Once we have finished the first step, the next step is to monitor the query in real-time. Viewpoint is the tool for this. We look for steps that get stuck and have a high skew factor. However, skew optimization is often no longer an issue after collecting statistics.
There are many causes of skew and possible solutions. For more information, I recommend the following article:
Finally, it’s time to look at the individual steps in the Execution Plan and look for All-AMP full table scans or All-AMP joins. We need to solve these problem points to achieve stable runtimes.
Step-by-step testing of the Execution Plan is independent of the type of workload. Step by step, we need to analyze where problems might arise. There is no magic trick or shortcut.
We should pay special attention to joins after a step where the estimated result rows are more significant than the actual ones. Here the wrong join method can often lead to bad runtimes. Product joins are especially susceptible to misestimation and hash joins if Teradata can’t hold the hash table in memory. Conversely, it is usually not a problem if the number of actual rows is less than the estimated number. Because then, the Optimizer has typically chosen a more conservative join method which, of course, has no problem with a lower number of rows.
You can read details about join methods in the following article:
Search for tuning opportunities for tactical workload
Now that we’ve covered general considerations, it’s time to review our tuning options for the tactical workload. We will create a checklist here that will be very helpful for your daily tuning activities.
Have both tables joined the same PI and Partitioning, and is it matching the join columns?
Aligning PI and Partitioning to the join columns ensures that AMP-local joining via merge join or rowkey-based merge join (partitioned tables) is possible. Such a setup is beneficial not only for tactical but also for strategic workloads.
Did we place a USI on the foreign table of a relation to support the usage of nested joins?
Suppose Teradata can query the main table using UPI or USI, and we provide a USI, UPI, or NUPI on the other table which matches the join columns. In that case, Teradata can read the row in the main table using UPI or USI (single or dual AMP access). Then the hash value is calculated from the join column(s), and the AMP that holds the corresponding row is instructed to return the searched row(s). So we have a join using only 3 or 4 AMPs, depending on which indexes we created.
Should we create single join indexes with a PI different from the base table?
A single table join index with different PI can be helpful in single-AMP queries if Teradata can use the new PI in retrieving steps or joins. The join index can also be created as a global join index by not including all required columns in the index but only the ROWIDs of the base table. If necessary, the join index and the ROWIDs can be used to access the base table to retrieve the required columns.
The advantage of a single table join index over a NUSI is that duplicates are allowed, and the rows can still be accessed using ROWHASH. A single table join index has further advantages: ROWHASH locking and Group-AMP operations are possible. The NUSI, on the other hand, always requires table locks and almost always all-AMP access (single-AMP access when the NUSI matches the primary index columns of its base table).
Are there statistics on all tables on the WHERE conditions and the join columns?
Statistics are always vital but especially so here. So the Optimizer will be able to use a join index even if these statistics are missing but may not perform group-AMP access but all-AMP access.
Do all row partitioned tables contain all partition columns in the primary index?
If the PI can contain all partition columns, no partition probing is necessary to determine within which partitions the PI value is available. The design then ensures that the value can only be in one partition. So make sure that all partition columns are always part of the PI. If possible, define a unique primary index.
If putting all partition columns into the PI is not possible, then you should at least define a WHERE condition on the partition columns to prevent probing.
Other strategies to avoid probing:
If the Primary Index of the partitioned table is a NUPI, still having unique values, we can put a USI matching the PI columns.
If the Primary Index of the partitioned table’s values is not unique, we can put a global join index matching the PI columns.
With both options, we can directly access the rows from the index as both types of indexes point straight to the physical rows, and Teradata doesn’t need to probe partitions to look up the row from the base table.
If we can’t use any of these strategies, there is still the option to keep the number of partitions small. The fewer partitions, the fewer have to be checked by probing.
Can a sparse join index be created that restricts to a specific partition?
It is often the case that we have partitioned base tables by date. Therefore, we could restrict our sparse join index to include only the current date partition (WHERE RefDate = DATE’2022-06-30′).
Another trick with the sparse join index is choosing the PI to include only a few values (fewer than AMPs). This way, the Optimizer can limit the query to a few AMPs (group-AMP).
Did we optimize the All-AMP Steps?
We cannot consistently achieve an execution plan that is complete without All-AMP. We should optimize all remaining All-AMP steps. Here are some ideas:
– Remove all unnecessary steps (e.g., joins that are not needed).
– NUSI accesses and row partitioning can prevent full table scans.
– Use join indexes with few columns as it reduces the IOs.
– Split the query into more minor queries.
– LOCKING FOR ACCESS ensures that others do not block us.
Should we apply multivalue compression?
If our Teradata system is not CPU bound, it may be helpful to apply multivalue compression. Especially if Teradata can keep small tables in the main memory, this can give a performance advantage because fewer IOs are needed. Multivalue compression can be an accompanying measure but is not one of our primary tools.
Multivalue compression makes sense for wide columns and columns with few distinct values.
Can small tables be swapped out to sparse maps?
Sparse maps are a relatively new feature in Teradata to keep small tables entirely on an AMP. This can have tremendous advantages for tactical workloads. First, single AMP accesses are possible even if the entire table is read. Second, it can make query times more stable compared to all-AMP access since individual AMPs that are overloaded will no longer become a bottleneck if our small table is not on those AMPs. Finally, our workload can be prioritized as a tactical workload by TASM.
You can read more details about sparse maps for tactical workload tuning here:
I think that was quite a lot of ideas on how you can systematically proceed to optimize your query, here at an example of a tactical workload.
If you are in the situation to optimize a tactical query, then do the following:
Look at the output of the EXPLAIN statement. Step by step, we have to analyze what kind of workload is executed (join step, retrieve step) and how many AMPs are involved. If we find an All-AMP step, we need to go through our checklist from before and consider in detail which of the listed tips could help turn the All-AMP step into a Group-AMP or even Single-AMP step. Have fun.