Tactical Workload Tuning on Teradata
Tactical workload tuning is a particular discipline on Teradata. Tactical workload requires a unique skill set and tools.
In this article, I will show you the skills needed to optimize tactical workload.
You will learn how to use the proper tools and the “tactical workload skill set.”
The Difference between Tactical and Decision Support Workload
Decision support workload typically involves full table scans (FTS) and all-row joins -usually between big tables.
Indexing (NUSI, USI, etc.) can improve performance, but more often all table rows are accessed.
Here is an example of a decision support query:
SELECT CustomerGroupName, SUM(Revenue)
t01.GroupCd = t02.GroupCd
GROUP BY 1;
The tactical workload is characterized by single row access or the access to a limited number of rows. Here is an example for a tactical query:
WHERE SalesId = 100;
Optimized tactical queries need indexed access paths, such as UPI, NUPI, NUSI, USI, Join Indexes or a single row partition. The primary goal is to have an execution plan with each step only delivering one or a few rows.
The tactical selection criteria of your query have to be applied as early as possible in the execution plan.
While this allows avoiding deadlocks entirely, serialization might not be useable in any case for performance reasons.
The solution I used was to rewrite the most executed statements, increasing the number of changes to the database. By keeping the number of changes small, I was able to prevent deadlocks.
With a proper query design, tactical query execution is stable, and performance is not decreasing if table cardinality is growing, while decision support queries usually become slower with a growing amount of data:
Tactical query performance does not improve with the number of Teradata nodes. Adding more nodes to your Teradata system will not have any impact on the performance of your single tactical query. Nevertheless, adding nodes to your system allows more tactical queries to be executed at the same time.
The performance of decision support queries improves with the number of nodes. Assuming a perfect data distribution (no skewing), doubling the number of nodes can speed up your query by a factor of two (nevertheless, this are pure numbers, no system is working 100% in parallel and without any skewing effects).
The Tactical Workload Toolkit
Besides optimizing the retrieve steps with indexed access to single rows (USI, UPI, NUSI, Join Index, etc.), we have to optimize the joins. Join optimization means to minimize the number of AMPs involved in the join operations.
For example, the execution plan of below query shows that the selected row is accessed from only one AMP (single-AMP join step):
Above query will always perform identical (given the execution plan is not changing), and this is what we want to achieve in tactical workload tuning: A stable direct access to the data.
While above example was built around a merge join, the most suitable join type for the tactical workload is the Nested Join.
When designing your physical data model, you should consider a design which allows the usage of Nested Joins.
Business users often need to retrieve information based on the natural key of a table and not the surrogate keys. It‘s always a good idea to think about keeping the natural keys in your physical data model when dealing with tactical workload!
Here is an example of a physical data model which allows the usage of Nested Joins.
CREATE TABLE Sales
ProductId INTEGER NOT NULL,
SalesId INTEGER NOT NULL
) PRIMARY INDEX (ProductId );
CREATE TABLE Products
ProductId INTEGER NOT NULL,
) PRIMARY INDEX (ProductId )
UNIQUE INDEX (ProductCode);
In our example above, the data is accessed via the natural key “ProductCode” and not the surrogate key (ProductId). Therefore, we use a USI to create a direct data access path via the natural key:
SELECT * FROM Products p INNER JOIN Sales s ON p.prod_id = s.prod_id WHERE ProductCode = ‚Private‘
As you can see, the tactical query is resolved with a Nested Join and the USI on column ProductCode.
Forcing all Rows to one AMP
Another interesting tool we have to execute tactical queries against little tables is to force all rows to one AMP, by adding a column to the small table, populating it with just one value. This column has to be the primary index of the table.
Populating the primary index column with just one value will force all table rows to the same AMP, and if the table is sufficiently small, even into the same data block.
If everything works like expected, this results in a singe-AMP retrieve step instead of an All-AMP scan.
If there are many concurrent tactical queries executed against the table at the same time, the entire table will stay in memory, avoiding any access to the disk drive.
If you think about using this approach for several tables, use different values for the primary index of the “dummy” column, to avoid an overload of a single AMP.
A Case Study: Tactical Workload and Join Indexes
Recently, I was involved in an optimization project for a web-frontend tool with Teradata as the backend database.
During the development process, nobody had considered that the queries executed against the database are tactical workload. Most of the workload the tool generated was full table scans.
The web tool had been tested on a development system, but when it went into production, and dozens of people started to work in parallel, together with all the other workload, performance severely. The real problem was not the size of the involved tables, as all tables were quite small.
The real problem was a mix of end-user expectations when navigating in the web frontend, and the setup of the workload management, which did not classify the tool queries as tactical workload.
Users expected an average response time of just a few seconds, which could not be achieved with full table scans on a loaded system (I understood the users. Who wants to wait 30 seconds to save the information of just one client?)
Run times for the main activities were at best 1-2 seconds but increased to a couple of minutes on a busy system. Furthermore, workload delays added to the overall response times.
I began to add unique secondary indexes, but as the tables were tiny, the optimizer decided that a full table scan of the base table would be cheaper than the indexed access.
Finally, it turned out that the usage of join indexes was the solution. As the tool required different access paths to the data, I created several 1:1 join indexes per table, for each of the access paths, only altering the primary index of the join index.
When adding join indexes, we have to keep an eye on the impact of an update, insert and delete statement. In my case, it was no problem, as tables were small.
What I experienced was another issue: The huge number of join indexes caused a lot of deadlocks.
There are several ways to overcome this problem. We can, for example, serialize the user session activities, by using a “serialization” table. Each session has to get a write lock for each activity on this”serialization” table before it can apply any changes to the data.
In my project, I was able to minimize deadlocks by rewriting the critical pieces of code, reducing the number of changes (updates, deletes, inserts) done on the tables.
The last hint: Don't forget to collect statistics on the join indexes. Otherwise, they may not be used!
Teradata Golden Tuning Tipps 2017 – Take your Skills to the next Level!
6 Golden Rules for Teradata SQL Tuning
The Teradata Join Index Guide – We leave no Questions unanswered!
Dramatically improve Performance with the Teradata Nested Join