Teradata Workload Management Basics
Everyone who has some experience with Teradata tuning will have heard the term “workload.” This article will deal with the definition of this term and explain in detail what it is and why it's helpful to map each request (query or load operation) to a workload.
What is Teradata Workload?
A workload is a group of requests with common characteristics. The grouping of requests into workloads has some advantages over the handling of single requests.
It's more efficient to manage groups of requests than to handle each request individually, as common request characteristics allow to apply standard settings and rules to the group at once.
Grouping requests into workloads give us a better overview of who is using the system and how many resources are used. Armed with this knowledge is is much easier to assign priorities to performance tuning tasks based on workloads (for example, by assigning the highest priority to tune activities for your critical, tactical workload).
Another important concept behind workloads is the assignment of resources (CPU, IOs) and preferences (for example, the frequency and duration of CPU usage) on a group level. Business needs usually define how critical a joint group of requests are. By putting these requests into a workload, common priorities and resource shares can be assigned. The Teradata priority scheduler is taking care of resources and priorities assigned to each workload.
Different releases of Teradata systems use different implementations of priority scheduling. While the Teradata 14 priority scheduler is closely connected with the SLES 11 Linux scheduler and its priority hierarchies, previous Teradata releases running on SLES 10 implement a different way of priority scheduling.
Nevertheless, these details are not important to understand the concept of workloads and workload classification. We will talk about the Teradata priority scheduler in one of our next articles related to workload management.
Whenever a request matches all the characteristics defined for an individual workload, it will be assigned to this workload and the workload specific rules and settings will be applied to the request. This process is called workload classification and the central concept of adding resources and limits to requests.
All workloads are managed in the ViewPoint “Workload Designer.”
Workload Characteristics are used to assign each Request to a Workload
Workload characteristics can be any out of the following types below (each characteristic on its own or combined with other features can be used to classify requests into workloads), but be aware that the lists below may not be complete as each release of Teradata adds new features.
1. Request source:
Request source defines where a request was initiated. The request source(s) can be any of the following:
- Account name
- Account string
- Client IP address
- Client ID
A query will be classified into a workload if a request source matches or does not match the request sources which are defined for this workload. Several request sources can be set for a workload:
Username = ‘ETLUSER1'
Client IP address = 192.168.0.54
If several request sources are defined for a workload they are AND combined (in above example this means that the login user has to be ‘ETLUSER1' and must connect from the client with IP address 192.168.0.54) to be classified into the considered workload. We are assuming that both request sources are used to include them into the workload.
As a general rule: Up to Teradata 13.00, several characteristics have been combined by OR. Since Teradata 13.10 characteristics are coupled with the “AND.” This rule is valid for all type of features (not only for request sources).
Hint: All of these request sources can be found in the DBC.DBQLOGTBL table.
2. Request target:
Each request will be executed against one or several database objects, such as databases, tables, view, macros, stored procedures, and UDFs.
By defining request targets in your workload characteristics, it's possible to assign queries to a workload if certain database objects are accessed.
Since Teradata 14, it's possible to define additional subcriteria for each request target, considering in which way the objects are used. But subcriteria are not allowed for UDFs.
Some examples for subcriteria are:
– Add query or exclude query from workload if the database object is accessed with a full table scan (FTS).
– Add query to workload if the database object is accessed with a particular join type (all or no joins, all or no product joins, all or no unconstrained product
– Add a query to workload if a minimum or a maximum number of rows for each step is estimated.
Several request targets can be combined in a workload definition (AND combined). If subcriteria are used, they are all AND combined.
3. Query Characteristics:
By detailed analysis of each request, a more detailed workload classification is possible. These are the query attributes which are used to classify a query to a workload:
– Join Behaviour: product join, unconstraint product join, no product join, no unconstraint product join, any join, no join.
– Type of Statement: DDL, DML, SQL Select, Collect Statistics.
– AMP Limits: Defines if All-AMP operations should be included or excluded into/from this workload.
– Estimated Processing Time: Adds a query to the workload if the estimated processing time is between the minimum and maximum estimation defined.
– Step Row Count: Adds a query to the workload if the result rows of each step are between the minimum and maximum values specified.
– Final Row Count: Adds a query to the workload if the result rows of the final result set are between the minimum and maximum values defined.
– Full Table Scan: Adds or excludes queries doing a full table scan (FTS).
– Estimated memory usage: Include or exclude request if it exceeds a defined memory usage
4. Utility Usage as Classification Criteria
The workload management offers the possibility to classify load tasks (Fastload,Multiload, , FastExport) and backup/restore tasks into separate workloads.
5. The Queryband as Classification Criteria
By adding a queryband to your session or transaction, this queryband can be used to assign or exclude queries with this queryband to/from a workload.
You will need these classification criter if requests are coming from a single or common request source but you need to classify them into different workloads.
Basic Workload Definition Rules
Currently (Teradata 14.10) up to 250 workloads can be defined. Requests which do not match any workload will be assigned to the WD-Default workload, which is available on each Teradata system and cannot be deleted (and it counts against the 250 workload limit).
Start your classification on request source level. If your users are logging on with a standard user, consider adding queryband classification. Querybands will help you to classify your workload on a very detailed level. You should only add additional classification criteria (request target, query characteristics) if needed.
Keep in mind, that an enormous amount of classification rules (include lists, exclude lists), makes maintenance difficult. Furthermore, a lot of standards will be hard to understand and confusing.
Limit as well the number of workloads. All workloads are interacting with each other. A huge number of workloads makes it difficult to understand the overall impact of each workload on the whole system performance.This was the first article of the DWHPro workload management series. Feel free to comment and inform us about errors or things we should change.