What is the tactical workload in Teradata?
Tactical workload tuning is a specialized discipline in Teradata that demands distinct skills and tools.
This article will demonstrate the necessary skills for optimizing tactical workload.
You will acquire expertise in utilizing appropriate tools and mastering the “tactical workload skill set.”
The Difference between Tactical and Decision Support Workload
Decision support workloads typically require full table scans and all-row joins, often between large tables. While indexing (such as NUSI or USI) can enhance performance, it results in more frequent access to all table rows.
Here is an example of a query for decision support:
SELECT CustomerGroupName, SUM(Revenue)
FROM
Sales t01
INNER JOIN
CustomerGroup t02
ON
t01.GroupCd = t02.GroupCd
GROUP BY 1;
Tactical queries typically involve accessing a single row or a limited number of rows. An illustration of such a query is:
SELECT *
FROM Sales
WHERE SalesId = 100;
Efficient tactical queries require indexed access paths, such as UPI, NUPI, NUSI, USI, Join Indexes, or a partition containing only one row. The ultimate objective is to produce an execution plan that retrieves solely one or a few rows at each stage.
Apply your query’s tactical selection criteria as early as possible in the execution plan to avoid deadlocks. However, serialization may not be feasible for optimal performance. I optimized by rewriting frequently executed statements, minimizing database changes, and preventing deadlocks. A well-designed query ensures stable tactical execution and maintains performance even as table cardinality grows. In contrast, decision support queries tend to slow down with increasing data volume.
Increasing the number of Teradata nodes does not enhance tactical query performance. However, it enables multiple tactical queries to be executed concurrently.
The efficiency of decision support queries increases proportionally with the number of nodes. In the absence of data skewing, doubling the number of nodes can result in a twofold acceleration of your query. However, it should be noted that perfect parallelism and zero skewing are unattainable theoretical concepts in practice.
The Tactical Workload Toolkit
To optimize retrieval processes, we must use indexed access for individual rows (such as USI, UPI, NUSI, Join Index, etc.) and minimize the number of AMPs employed in join operations.
The query’s execution plan reveals that the chosen row is retrieved from a single AMP (single-AMP join step).
SELECT * FROM Sales s INNER JOIN SalesDetails sd ON s.SalesId = sd.SalesId WHERE s.SalesId = 100; 1)
First, we do a single-AMP JOIN step from Indexing.t01 using the primary index "Sales.s.SalesId = 100"
with no residual conditions, which is joined to Sales.SalesDetails by way of the primary index "Sales.sd.SalesId = 100"
with a residual condition of (“Sales.SalesDetails.SalesId = 100"). Sales.Sales and Sales.SalesDetails
are joined using a merge join, with a join condition of ("Sales.Sales.SalesId = Sales.SalesDetails.SalesId").
The result goes into Spool 1 (one-amp), built locally on that AMP.
The size of Spool 1 is estimated with low confidence to be 151 rows.
Consistent performance can be achieved in tactical workload tuning by ensuring stable and direct data access, assuming the execution plan remains unchanged.
The Nested join is the optimal join type for the tactical workload.
When creating your physical data model, consider a design that enables the utilization of Nested Joins.
Business users frequently require data retrieval using natural keys instead of surrogate keys. Retaining the natural keys in the physical data model for tactical operations is advisable.
This physical data model employs Nested Joins.
CREATE TABLE Sales
(
ProductId INTEGER NOT NULL,
SalesId INTEGER NOT NULL
) PRIMARY INDEX (ProductId );
CREATE TABLE Products
(
ProductId INTEGER NOT NULL,
ProductCode CHAR(10)
) PRIMARY INDEX (ProductId )
UNIQUE INDEX (ProductCode);
In the aforementioned example, the data is accessed through the “ProductCode” key instead of the surrogate key, “ProductId.” Hence, a unique secondary index (USI) is employed to establish a direct access route for the data via the natural key.
SELECT * FROM Products p INNER JOIN Sales s ON p.prod_id = s.prod_id WHERE ProductCode = ‚Private‘
1)First, we do a two-AMP JOIN step from Sales.Products by way of unique index # 4
“Sales.p.ProductCode = ‘Private’" with no residual conditions, which is joined to Indexing.s by
The way of unique index # 4 “Sales.s.ProductId = Sales.p.ProductId" with no residual conditions.
Sales.p and Sales.s are joined using a nested join, with a join condition of ("(1=1)").
The result goes into Spool 1 (one-amp), which is built locally on the AMPs.
The tactical query is resolved through a Nested Join utilizing the USI on the ProductCode column.
Forcing all Rows to one AMP
To perform tactical queries on small tables, we can use a tool that involves forcing all rows onto a single AMP. To achieve this, we add a column to the small table and populate it with a single value, ensuring that this column is set as the table’s primary index. This is an effective method for executing tactical queries on smaller datasets.
Assigning a single value to the primary index column will compel all rows in the table to be directed to a single AMP. In the event that the table is diminutive enough, they may even be placed into a solitary data block.
If all goes as planned, this yields a one-AMP retrieval instead of performing a scan on all AMPs.
Simultaneously executing numerous tactical queries on the table will keep the entire table in memory, negating the need for disk drive access.
When utilizing this approach for multiple tables, it is advisable to employ distinct primary index values for the “dummy” column to prevent overburdening a single AMP.
A Case Study: Tactical Workload and Join Indexes
I recently participated in optimizing a web frontend tool that employs Teradata as its backend database.
During development, the tactical nature of the queries executed against the database went unrecognized. The tool primarily generated workloads consisting of full table scans.
The web tool underwent testing on a development system. However, numerous individuals commenced concurrent and excessive workloads upon transitioning to production. The primary concern pertained not to the magnitude of the tables in question, given that they were all relatively diminutive.
The core issue was a combination of user expectations when using the website interface and the workload management configuration, which failed to categorize tool queries as tactical tasks.
Meeting users’ expectations for a quick response time, typically within a few seconds, proved unattainable under heavy system loads due to the inefficiency of full table scans. As a user, it is understandable that waiting 30 seconds to save a single client’s information is less than ideal.
The primary queries had optimum run times of 1-2 seconds, although they could extend to several minutes in a high-volume environment. Additionally, workloads experienced delays that contributed to the overall response time.
Although I introduced distinctive secondary indexes, since the tables were petite, the optimizer concluded that carrying out a full table scan of the primary table would be a more cost-effective alternative to the indexed approach.
Using join indexes proved to be the ultimate solution. Since the tool necessitated distinct data access paths, I established multiple 1:1 join indexes for every table to accommodate each access path, modifying only the primary index of the join index.
It is important to monitor the effects of update, insert, and delete statements when adding join indexes. However, this was not an issue for me since the tables were small.
My experience involved numerous join indexes leading to frequent deadlocks.
To address this issue, various solutions are available. One such approach involves serializing the user session activities through a dedicated “serialization” table. Before modifying any data, each session must acquire a write lock for each activity on this table.
To mitigate deadlocks in my project, I revamped the crucial code segments and curtailed the frequency of alterations (updates, deletes, inserts) performed on the tables.
Collect statistics on the join indexes to ensure their utilization.
Explore these resources for optimizing Teradata SQL performance:
6 Golden Rules for Teradata SQL Tuning, The Teradata Join Index Guide – Comprehensive Answers, and Boost Performance with the Teradata Nested Join.
Feel free to inquire in the comments section if you require any clarification regarding the aforementioned information. I will diligently monitor the comments and endeavor to respond to as many as possible. Thank you for perusing this blog; its evolution is solely due to your support.
Sorry. There is a small correction
5.1. if you want to modify some rows and tables in one transaction then it is good to get the most restrictive lock for participated rows in all tables involved into the transaction at the most granular level(rowid is the best) at first the very first step. This prevents deadlocks.
Hello.
Thank you for good article. It easy to read and contains helpful material.
Let me say some suggestions:
1. The point below does not concern the article theme but it is about query example in the text.
The query
SELECT CustomerGroupName, SUM(Revenue)
FROM
Sales t01
INNER JOIN
CustomerGroup t02
ON
t01.GroupCd = t02.GroupCd
GROUP BY 1;
looks to be bad, because there is a grouping by text field which should be avoided as much as possible in Teradata. In short, each grouping operation requires building some auxiliary table in the memory. The more column in the grouping clause and the wider these columns the more resources are required for building that auxiliary table to support grouping operation.
If the grouping column has variable text type, then Teradata reserve maximum possible size for this type. For example, if there is a grouping by the CustomerGroupName column which is VARCHAR(400) UNICODE16 type then Teradata reserves 800 bytes per each grouped row even then the actual value of the column is very small ( for example for “abc” is only 6 bytes). This leads to performance problems.
It is usually easy to solve it, just by switching grouping column to ID
SELECT MIN(CustomerGroupName) AS CustomerGroupName
, SUM(Revenue)
FROM Sales t01
INNER JOIN CustomerGroup t02 ON t01.GroupCd = t02.GroupCd
GROUP BY t01.GroupCd;
2. It is an excellent practice to use bind variables for tactical queries. This helps to reuse query plans and reduce the count of internal locks.
3. It is good to set up a lot of criteria and rules in TDWM for tactical queries to prevent moving tactical queries into non-tactical workload groups.
4. For small batch operations( like updating 10 rows by PI) it is good to think about multistatement requests or macros, they reduce context switching.
5. There are some additional best practices for locking for tactical queries(described in Teradata request processing book), like this:
5.1. if you want to modify some rows and tables in one transaction then it is good to get the most restrictive lock for all rows in tables at first the first step. This prevents deadlocks.
5.2.if an application may do a dirty read in some query(non-tactical), let it make dirty read, and reduce the count of the lock.
….
Thank you.
Best regards,
Aleksei Svitin.
Hi Aleksei,
Thanks for sharing your ideas. Especially point 3 is very important in my experience. A few months ago I worked on an optimization project. It was mainly about tactical requests. I had the situation that TDWM was not set up correctly to tackle the tactical workload.
All techniques I had applied would have been a huge improvement for these tactical queries, but workload management delayed them like non-tactical queries. This lead to the stupid situation, that my optimized queries, accessing single rows via a single table join index, had a runtime of about on second but a workload delay time of a few minutes.
Another interesting technique for tuning ALL-AMP tactical Queries.
Let consider a scenario where we have a web-portal accessing a few small semantic aggregated/summarised tables and we have a tight SLA on response time. Most of the portal queries doing an all-amp retrieve steps from these tables and there are no opportunities for indexing or any sort of tuning.
A technique here which can useful is by adding a deliberate skew by selecting highly non-unique primary index column/s. So in case 100 AMPs system, instead of data being evenly distributed on all 100 AMPs, using highly non-unique primary index column/s distribute the data only on 25 AMPs.
In the first case, rows are retrieved from all 100 AMPs reading large number data blocks.
In the second case, rows are retrieved only from 25 AMPs reading few number data blocks, hence reducing the number of data blocks to read.
By adding deliberate skew for smalls tables, significant savings in response time can be achieved for All-AMP Tactical Queries.
Hi Siddesh,
Thanks for sharing this technique with us. I assume by forcing all rows of a tiny lookup table to one AMP (all rows having the same Primary Index), there is another advantage: No deadlocks for rowhash locking can happen?
Good work as usual. Small but important typo: You created a USI on ProductCode, not a NUSI.
Thanks Paul, I corrected it.