
Teradata SQL Tuning – 6 Golden Rules you must never forget!

1. Ensuring the Completeness and Correctness of Statistics

Teradata SQL Tuning begins by providing the optimizer with the statistics it needs. This must always be done as a first step. Statistics influence the execution plan. It is not only important to define all required statistics, but also to keep them as up-to-date as possible.

There is a simple reason why I put statistics first: Often performance problems are solved simply by changing the execution plan.

Which columns require which type of statistics (full statistics collection, random-AMP samples, or sample statistics) depends primarily on the data demographics.

Random-AMP sampling

Random AMP sampling is usually sufficient for columns with many different values. The table should contain significantly more rows than there are AMPs in the system.
The following applies to random-AMP sampling: The more distinct the column values are, the better the estimate will be.

If the column values are skewed, there is a risk that the random-AMP sample will be taken from an AMP that is not representative of the data. Skewed data leads to overestimation or underestimation of the number of rows.

Good candidates for Random-AMP sampling are unique indices, i.e., UPI and USI.

Random-AMP sampling only takes place on indexed columns.

Therefore, you should always collect statistics on non-indexed columns that are used in WHERE conditions. Whether full or sample statistics should be collected depends on how unique the column values are.

Fully collected Statistics

Fully collected statistics are required for skewed column values and for columns with a small number of distinct values (NUPI and NUSI).
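A minimal sketch of both variants, using hypothetical table and column names:

COLLECT STATISTICS COLUMN (Customer_Status) ON MyDatabase.Customer;          -- full statistics on a skewed, low-cardinality column
COLLECT STATISTICS USING SAMPLE COLUMN (Order_Number) ON MyDatabase.Orders;  -- sample statistics on a nearly unique column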

Statistics for PPI Tables

The Teradata Optimizer has unique statistic requirements for PPI tables.

The following statistics should additionally be collected on PPI tables:

Dummy Column “PARTITION”

Statistics on the dummy column “PARTITION” tell the optimizer how many partitions are empty and how many rows each partition contains.

Dummy column “PARTITION” + PI column

These statistics are required if the partitioning columns are not part of the primary index. In this case, the same primary index value can exist in different partitions. Statistics on the dummy column “PARTITION” plus the PI columns allow the optimizer to estimate the costs of sliding-window and rowkey-based merge joins and of dynamic partition elimination steps.
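As a sketch, this is what those additional collections could look like for a hypothetical PPI table Sales_PPI whose primary index column is Sales_Id:

COLLECT STATISTICS COLUMN (PARTITION) ON MyDatabase.Sales_PPI;
COLLECT STATISTICS COLUMN (PARTITION, Sales_Id) ON MyDatabase.Sales_PPI;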

The statement below can be used to determine which additional statistics the optimizer would like to have for a SQL statement:

DIAGNOSTIC HELPSTATS ON FOR SESSION;

This statement displays a list of suggested statistics at the end of the Execution Plan (given by the EXPLAIN statement) and the Optimizer’s opinion of their value (confidence levels).

By gradually adding these statistics, you can test their influence on the execution plan.

Identify Stale Statistics

There are several ways to identify stale statistics. The easiest way is to split the SQL statement and test each partial statement individually. Each part is checked simply by comparing the estimated number of rows (as shown in the EXPLAIN output) with the actual number of rows returned by the query.

The above-described approach is particularly suitable if the entire SQL statement does not execute in a reasonable time.

Here’s an example:

SELECT t01.*
FROM
<Tablename1> t01
INNER JOIN
<Tablename2> t02
ON
t01.PK = t02.PK
WHERE
t01.<Column name> = 1 AND t02.<Column name> = 2;

The above query can be divided into two parts for testing:

SELECT * FROM <Tablename1> WHERE <Column name> = 1;
SELECT * FROM <Tablename2> WHERE <Column name> = 2;

If you execute both sub-queries, and the number of rows returned differs significantly from the estimate in EXPLAIN, the statistics may be obsolete.

2. Teradata SQL Tuning and the Primary Index Choice

Teradata SQL Tuning is not complete without choosing the best possible Primary Index. Design your queries so that the primary index can be used in the joins.

All join columns must be part of the primary index. If only one column of the primary index is missing from the join condition, the result is a different row hash (the order in which the columns are defined in the primary index doesn’t matter).

Still, the join condition can contain additional columns. These are then applied as residual conditions after the rows have been located via row hash.

To execute a join between two tables, the rows of both tables must be co-located on the same AMP.

This is true when they have the same primary index. In this case, the optimizer can use a join strategy that requires no redistribution of rows.

If for some reason you cannot change the primary indexes, or if you need a specific primary index for a particular SQL statement, create a volatile table (or a global temporary table) with the same structure and content as the original table but with a different primary index.

This is the only way to ensure that all AMPs start and finish their work simultaneously. This is exactly what parallel data processing is all about.
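A minimal sketch of such a helper table, assuming a hypothetical table MyDatabase.Sales and Customer_Id as the primary index needed for the join:

CREATE VOLATILE TABLE Sales_VT AS
(
SELECT * FROM MyDatabase.Sales
) WITH DATA
PRIMARY INDEX (Customer_Id)
ON COMMIT PRESERVE ROWS;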

There is something else to consider when selecting the primary index: since every change of a primary index column value means a new row hash, the affected rows must be redistributed.
This can still be acceptable if it happens only occasionally and affects only a few rows. However, if the primary index is very volatile, it has a negative effect on performance.

3. Teradata SQL Tuning with Indexing & Partitioning

Using indexes or partitioning is another way to improve query performance.

While secondary indexes in Teradata are stored in a sub-table and therefore require extra space, partitioning is just another way to structure the rows on the mass storage device.

When it is the best option to use partitioning and when indexing is better depends on the workload type. Sometimes a combined usage of partitioning and indexes is the best approach (i.e. creating an index on a partitioned table).

Tactical queries normally access one or a few data rows. The USI is therefore often an optimal choice if the access can’t be covered with a primary index. Also, a join index that represents a copy of the base table and only changes the primary index can be helpful here.
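As a sketch, such a single-table join index, which merely redistributes a copy of selected base table columns on a different primary index, could look like this (hypothetical names):

CREATE JOIN INDEX MyDatabase.Transactions_JI AS
SELECT Transaction_Id, Account_Id, Transaction_Amount
FROM MyDatabase.Transactions
PRIMARY INDEX (Account_Id);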

Partitioned tables are often used for strategic queries in which a series of data (for example, the sales for the current year) is aggregated.

Disadvantages of Indexing

When we work with indexing techniques, we need to keep an eye on the entire data warehouse architecture and decide whether our solution fits. Indexes can have a negative impact on the ETL process for several reasons.

Loading tools such as Fastload require secondary indexes and join indexes to be removed before loading.

The index sub-tables have to be managed by Teradata. Insert, Delete and Update statements require that in addition to the actual table, the index sub-table must be maintained.

If potentially useful indexes are not used by the optimizer and are not helpful for the overall PDM design, drop them immediately. Otherwise you’re wasting space and resources.

4. Query Rewriting

The performance of a query can often be improved by rewriting the query.

Personally, I prefer to consider an SQL statement as a black box and to limit optimizations to technical methods first.

Here are a few examples:

  • EXISTS instead of IN
  • Modularizing a large SQL statement
  • UNION ALL instead of UNION
  • DISTINCT instead of GROUP BY
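As an example of the first rewrite, a minimal sketch with hypothetical tables Customer and Sales:

-- IN with a subquery
SELECT c.*
FROM Customer c
WHERE c.Customer_Id IN (SELECT Customer_Id FROM Sales);

-- Rewritten with EXISTS
SELECT c.*
FROM Customer c
WHERE EXISTS (SELECT 1 FROM Sales s WHERE s.Customer_Id = c.Customer_Id);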

As long as I don’t have to understand the content and business logic of a query, I do not need to contact its author. Purely technical optimizations are usually not that risky.

Only when I reach the limits of purely technical optimization do I begin to deal with business logic. Understanding business logic often opens up additional optimization opportunities.

Rewriting of queries often solves performance problems, even when all other techniques have failed.

5. Teradata SQL Tuning with Real-Time Monitoring

Teradata SQL tuning requires watching the query while it is running. Monitoring a query in Viewpoint at runtime helps to identify the critical steps.

In Viewpoint you should look for the following steps:

  • Steps that take a long time to finish and consume a huge amount of resources.
  • Steps that are skewed, especially if the skew factor reaches 99%, as this means that all work is done by only one AMP or a few AMPs.

Analyzing the bad query

We have to think about the root cause of the bad query step. Here are a few ideas:

  • Does the base table have a skewed primary index (static skew)?
  • Does the redistribution of rows cause skew (dynamic skew)?
  • Are massive amounts of data redistributed because of poor join preparation due to missing or stale statistics?
  • Are there many hash collisions during the execution of the INSERT statement?

Stale and missing statistics typically lead to incorrect decisions in join preparation (copying a table to all AMPs instead of rehashing) and to the use of incorrect join methods (for example, product join instead of merge join).

6. Comparison of Resource Usage

Another important task in Teradata SQL Tuning is measuring resource consumption before and after the optimization of the query.

Query run times are misleading because they can be affected by the simultaneous workload of other queries over which we have no control.

In performance tuning, we prefer to use absolute key figures that are independent of what else is running on our Teradata system.

Below is a query that gives you detailed insight into the resource usage of the SQL statement to be optimized. To execute it, you must have read access to the table DBC.DBQLOGTBL (or related views):

Do not forget to give each version of the query to be optimized its own query band so that you can uniquely identify it in the table DBC.DBQLOGTBL:

SET QUERY_BAND = 'Version=1;' FOR SESSION;
SELECT
   AMPCPUTIME,
   (FIRSTRESPTIME-STARTTIME DAY(2) TO SECOND(6)) RUNTIME,
   SPOOLUSAGE/1024**3 AS SPOOL_IN_GB,
   CAST(100-((AMPCPUTIME/(HASHAMP()+1))*100/NULLIFZERO(MAXAMPCPUTIME)) AS INTEGER) AS CPU_SKEW,
   MAXAMPCPUTIME*(HASHAMP()+1) AS CPU_IMPACT,
   AMPCPUTIME*1000/NULLIFZERO(TOTALIOCOUNT) AS LHR
FROM
   DBC.DBQLOGTBL
WHERE
     QUERYBAND = 'Version=1;'
;

The above query will return the following measures:

  • The total CPU Usage
  • The Spool Space needed
  • The LHR (ratio between CPU and IO usage)
  • The CPU Skew
  • The Skew Impact on the CPU

Our goal is to decrease total CPU usage, consumed spool space, and skewing on the CPU. The LHR is optimally around 1.00.

See also:
Teradata Golden Tuning Tipps 2017 – Take your Skills to the next Level!
Teradata Tuning – Query Rewriting
Tactical Workload Tuning on Teradata
The Primary Index Choice
Tutorial Point Tuning Ideas


Teradata TO_CHAR

The Teradata TO_CHAR Function

The Teradata TO_CHAR function can be used to:

  • Convert numeric expressions to character strings
  • Convert dates, times, interval data types, or timestamps to character strings

Here is an example of a numeric expression argument. It can be any of these data types: BYTEINT, SMALLINT, INTEGER, BIGINT, DECIMAL/NUMERIC, FLOAT/REAL/DOUBLE PRECISION, or NUMBER

The second argument is a format string (more about format strings can be found in the Teradata documentation). It has to be CHAR or VARCHAR.

SELECT TO_CHAR(123456789,'999999999');

Hint: If the query returns only ‘#’ characters, the conversion failed because the value does not fit into the given format:

SELECT TO_CHAR(123456789,'99');
--> ###

Here is an example of a date expression argument. The first argument has to be one of the following data types: DATE, TIME, TIME WITH TIME ZONE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, or an Interval data type. The second argument has to be CHAR or VARCHAR:

SELECT TO_CHAR(DATE'2019-11-09', 'DD-MON-YYYY' );
--> '09-NOV-2019'

Teradata TO_CHAR and Statistic Estimations

Of course, we are all excited when Teradata provides new features with every release that make our lives easier. However, it is important not to forget that the use of functions always carries the risk that the optimizer has no statistics available.

Here is an example that shows exactly such a case. First, we will apply TO_CHAR on a date and take a look at the execution plan:

SELECT Col_1, Col_2 FROM
DBNAME.TableName
WHERE TO_CHAR(ENT_END_TD, 'mm-dd-yyyy') = '12-31-9999'
ORDER BY 1;

COLLECT STATS COLUMN TO_CHAR(ENT_END_TD, 'mm-dd-yyyy') AS COL_DATE_UDF
ON DBNAME.TableName;

The below excerpt from the execution plan illustrates that in step 2 the optimizer has “No Confidence” despite having the statistics collected on the column used in the WHERE condition. The estimated number of rows is 1, which is wrong (the correct result is 2 rows):

Step 2:

Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(TD_SYSFNLIB.To_Char(CAST((date_col in view viewing.viewnam.ENT_END_TD) as TIMESTAMP(6) WITH TIME ZONE) AT TIME ZONE INTERVAL '0:00' HOUR TO MINUTE, 'mm-dd-yyyy')) ='12-31-9999'") into Spool 1 (group_amps). which is built locally on the amps. Then we do a SORT to order Spool 1 by the sort key in spool field1. The Size of spool 1 is estimated with no confidence to be 1 row.

Let’s rewrite the query, and use a simple CAST:

SELECT Col_1, Col_2 FROM
DBNAME.TableName
WHERE CAST(CAST(ENT_END_TD AS DATE FORMAT 'MM-DD-YYYY') AS CHAR(10)) = '12-31-9999'
ORDER BY 1;

COLLECT STATS COLUMN CAST(CAST(ENT_END_TD AS DATE FORMAT 'MM-DD-YYYY') AS CHAR(10)) AS COL_DATE_UDF
ON DBNAME.TableName;

The corresponding step of the new execution plan:

Next we do an all-amp operations RETRIEVE step from dbname.tablename in view viewed.view_nm by way of an all row scan with a condition of "(…)" into Spool 1 (group_amps), which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1. The size of Spool 1 is estimated with High Confidence to be 2 rows.

As we can see, the optimizer can now rate this step as “high confidence” because it has the statistics at its disposal. The TO_CHAR function from our first query has made it impossible to use statistics.

Conclusion

User-Defined Functions (UDF), such as the Teradata TO_CHAR function, should only be used if we cannot achieve the same result with “simple” means. Functions usually mean (with some exceptions like UPPERCASE) that existing statistics cannot be used!


9 Teradata Tuning Tips to Increase Performance by 50%

What are the Goals of SQL Tuning?

The goal of Teradata SQL tuning is to cut resource usage. There are two measures we have to pay attention to:

  • Disk IOs
  • CPU seconds

These are absolute values. They are not influenced by any concurrent system activity and stay stable for a particular execution plan.

Don’t use execution times as your optimization target. Many irrelevant factors will affect run times:

Session blocking, workload delays, heavy concurrent workload, etc.

The most expensive task for any RDBMS is to move data from the mass storage devices to memory.

Many of the techniques & ideas described below reduce the number of transferred data blocks (from disk to memory and vice versa). Some of them help to cut CPU consumption.

9 Teradata Tuning Tips to increase performance by at least 50%

1. Ensure Completeness and Correctness of Teradata Statistics

The most important optimization task is to aid the Teradata Optimizer with complete and correct statistics.

We have to pay particular attention to 3 basic situations and always collect full statistics:

  1. Non-indexed columns used in predicates:

    Missing statistics force the Optimizer to do a very inaccurate heuristic estimation.
  2. Skewed indexed columns:

    Random AMP sampling on skewed index results in wrong estimates and execution plans.
  3. Small tables:

    Estimates for small tables are poor when there are fewer table rows than AMPs. In this case, most AMPs will not contain any rows. A random-AMP sample taken from an AMP with no or only a few rows gives a wrong estimation.

By following these three rules, many problems related to missing statistics will vanish.

The goal of our Teradata SQL tuning activities is to find the right balance between good query plans and the time needed to maintain useful statistical information.

Discovering Missing Statistics

The simplest way to discover missing statistics is by turning on diagnostics:

DIAGNOSTIC HELPSTATS ON FOR SESSION;

The above statement adds statistics recommendations at the end of each execution plan (EXPLAIN output):

EXPLAIN
SELECT * FROM <Table> WHERE Column = 'value';

Test each recommendation separately. Don’t add all recommendations at once, but wisely choose the ones that improve query performance.

Ensure that no plan step has “no confidence”. Steps with “no confidence” are a sure sign of heuristic estimates on non-indexed columns, something to avoid at all costs!

Detection of Stale Statistics

The Optimizer is doing an excellent job of detecting stale statistics and extrapolating.

Before release 14.10, Teradata detected table growth by comparing two random-AMP samples: one taken during statistics collection, the other during query execution.

Since Teradata Release 14.10, deleted and inserted rows can be tracked (the UDI counts). For extrapolation, the Optimizer will prefer UDI counts over random AMP sample comparison.

Still, the best way to ensure up to date statistics is to collect them often.

If we need to find stale statistics, we have to compare estimations against the real row counts. There are two places where we can find this information:

  • The Execution Plan (by placing the EXPLAIN modifier in front of our query)
  • The Statistics Histogram (by executing the SHOW STATISTICS VALUES ON <TABLE> statement)

The statistics histograms give us almost the same information the Optimizer has available when it creates the execution plan:

  • The timestamp of the last statistics collection
  • Collected & Random-AMP Sampling  row counts
  • Inserted and deleted rows since the last statistics collection
  • Maximum column values
  • Minimum column values
  • Biased column values

Why did I say “almost the same information”? Estimates improve each time the Optimizer gains new insight; this insight can come from earlier join steps, single-table predicates, and aggregations.

Furthermore, different methods of extrapolation will adjust the estimations shown in the histograms.

If you need the estimations after extrapolation you can use the following statement:

SHOW CURRENT STATISTICS VALUES ON <Table>;

Here is a “single table predicate” example. It demonstrates the usage of derived estimations:

SELECT * FROM <Table1> t01 INNER JOIN <Table2> t02 ON t01.Key = t02.Key WHERE t01.Column IN (1,2,3);

The retrieve step estimation for <Table1> is 3 distinct values. This information flows into the joining step. If the average number of rows per value for both tables is about 1 (unique data), the resulting spool for the join is three rows.

A straightforward approach to detect stale statistics is this:

  • Decompose the query into single retrieve and join steps
  • Figure out why step estimations and real numbers don’t match
    Statistics could be outdated, or they could be missing. It might even be that the estimation is a result of the way the optimizer works.
    Here is an example:
SELECT * FROM <Table> WHERE Column = 1;

As value 1 is not available in <Column>, the Optimizer will estimate the result set to be the average number of rows per value. I guess many of you expected zero rows?

More information is available here: Teradata Statistics Basics

My last piece of advice:

Keep a copy of both the execution plans – before and after the change – to see the impact of your change.

2. The Primary Index Choice

When it comes to Primary Index choice, we must create a fair balance, weighing opposing requirements against each other:

Even data distribution and join performance.

Two tables can only be joined directly (AMP-locally) if they have the same primary index. In other words: the rows of both tables need to be on the same AMP.

Design your queries in a way that the Primary Index can be used for joining as much as possible, as this is the cheapest way of joining.

AMP-local joining is also possible if the join condition includes additional columns that are not part of the primary index. But if the join condition does not include all primary index columns, the rows of one or both tables have to be relocated (and maybe sorted).

If you need a different primary index to improve performance, you can use volatile tables or temporary tables. Create them with the same structure and content as the original tables but with the needed primary index. The use of temporary tables is particularly useful if your query consists of many join steps.

3. Teradata Indexing & Partitioning

Indexes give the Optimizer more data access paths. They improve highly selective retrieve actions. Unfortunately, indexes consume permanent disk space, and they require maintenance when the underlying base table changes.

Recommendation:

If indexes are not used by the Optimizer, and not useful in the PDM design, drop them immediately. They only will waste space and resources.

A unique secondary index (USI) allows direct row access (like the primary index). A non-unique secondary index (NUSI) requires a full index subtable scan on all AMPs.

The NUSI can be an advantage over base table access if the index subtable is smaller than the base table. Covering NUSIs are more useful than non-covering ones: no base table lookups are needed, and no related costs arise.

The difference between an index and partitioning is that indexes are stored in sub-tables, while partitioning is just another way of structuring the base table. Partitioning allows the Optimizer to limit access to the data blocks of the relevant partitions. One advantage of partitioning is that partition elimination is always applied, whereas index usage has preconditions. For example, a NUSI will not be used without statistics collected on the index columns.
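A minimal sketch of a row-partitioned table with hypothetical names; a query that restricts Sales_Date can then eliminate all but the touched partitions:

CREATE TABLE MyDatabase.Sales_PPI
(
Sales_Id INTEGER NOT NULL,
Sales_Date DATE NOT NULL,
Amount DECIMAL(18,2)
)
PRIMARY INDEX (Sales_Id)
PARTITION BY RANGE_N (Sales_Date BETWEEN DATE '2019-01-01' AND DATE '2019-12-31' EACH INTERVAL '1' MONTH);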

Still, there is a significant advantage of indexing: We create the index; we check if it’s used. We drop it if it’s not used. Have you ever partitioned a 200 Terabyte table for test reasons? I guess this is not what we like to do.

Another disadvantage of partitioning comes to my mind:

Join performance worsens if partitions of both tables don’t match or one of the tables is not partitioned.

Whenever working with partitioning, you have to keep the data warehouse architecture in mind. Decide if your solution fits into it. Partitioned tables use different join techniques. If tables have different partitions, this has an adverse impact on join performance!

Conclusion: There is no one-size-fits-all method. You have to check what works best for you!

4. Query Rewriting

Often query performance improves when the query is rewritten. Here are some ideas:

  • DISTINCT instead of GROUP BY depending on the number of different values
  • UNION ALL instead of UNION (getting rid of a sort step)
  • Splitting a skewed query into two parts: one handling the skewed values, the other dealing with the rest (see the sketch after this list)
  • Adding WHERE conditions to allow for partition elimination
  • Removing unreferenced columns and expressions from the select list (this can help to eliminate joins)
  • Converting outer joins into inner joins
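A minimal sketch of the skew split mentioned above, assuming a hypothetical dummy value Customer_Id = -1 that is heavily skewed:

SELECT t01.Transaction_Id, t02.Customer_Name
FROM Transactions t01
INNER JOIN Customer t02 ON t01.Customer_Id = t02.Customer_Id
WHERE t01.Customer_Id <> -1

UNION ALL

SELECT t01.Transaction_Id, t02.Customer_Name
FROM Transactions t01
INNER JOIN Customer t02 ON t02.Customer_Id = -1
WHERE t01.Customer_Id = -1;

The second branch joins the skewed rows against a single customer row, which the optimizer can duplicate cheaply instead of redistributing all skewed rows to one AMP.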

Query rewriting can improve performance, sometimes even when all other techniques have failed.

Query rewriting is a very powerful way to improve performance. Still, it often requires understanding the business logic (“can I replace this left join with an inner join?”).

It is possible to rewrite a query in a purely technical way. Still,  understanding the business logic of a query reveals more tuning opportunities.

See also our article on query rewriting here: Teradata Tuning – Query Rewriting

5. Physical Data Model Design

I was not sure whether I should add the physical data model to this list, for one reason: often we can’t make any significant changes to the physical database design. Too many consumers on top of the database (such as reporting tools and data marts) would require a redesign.

Still, if we can improve the physical model, this is one of the most productive changes.

The best advice I can give:

Keep your core model normalized and denormalize in your data marts. Many of you will not agree; I can live with that. My experience is that early denormalization causes bad performance, especially when done without an apparent reason.

Check all columns of all tables to see whether the information stored can be broken down further. Also ensure that join columns have the same data types and character sets. Columns that contain more than one piece of information force the user to join on expressions. Most likely, the Optimizer will then use neither statistics nor the primary index for joining.

Ensure that all primary key columns are defined as NOT NULL. Add default values where appropriate. If you store codes, there is no reason to use UNICODE. It will just waste space.

Apply Multi Value compression on all tables: More rows can fit into each data block. Data blocks are the smallest unit transferred between disk and memory. More rows per data block lead to less disk IOs and better performance.
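A minimal sketch of multi-value compression on a hypothetical table:

CREATE TABLE MyDatabase.Customer
(
Customer_Id INTEGER NOT NULL,
Country_Code CHAR(2) CHARACTER SET LATIN COMPRESS ('US','DE','AT'),
Status_Code VARCHAR(10) COMPRESS ('ACTIVE','INACTIVE')
)
PRIMARY INDEX (Customer_Id);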

Please consider that the above advice can only be a loose collection of ideas about how to fix a broken data model.

6. Make Use of Teradata-Specific Features

There are several unique optimization opportunities which you should consider:

  • Using MULTISET tables can decrease Disk IOs
  • Use Multi-Statement-Requests, as this avoids the usage of the transient journal and does block optimization
  • Use CLOB columns instead of VARCHAR if you seldom select the column. Teradata stores CLOBs in sub-tables
  • DELETE instead of DROP tables
  • Use ALTER TABLE instead of INSERT…SELECT into a copy
  • Use MERGE INTO instead of INSERT & UPDATE (see the sketch below)
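For the last point, a minimal sketch with hypothetical tables (in Teradata, the ON clause must match the primary index of the target table):

MERGE INTO MyDatabase.Customer AS tgt
USING MyDatabase.Customer_Stage AS src
ON (tgt.Customer_Id = src.Customer_Id)
WHEN MATCHED THEN
UPDATE SET Status_Code = src.Status_Code
WHEN NOT MATCHED THEN
INSERT (Customer_Id, Status_Code) VALUES (src.Customer_Id, src.Status_Code);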

7. Real-Time Monitoring

Observing a query while it’s running allows detecting the critical steps. Most people use Viewpoint for real-time monitoring. I prefer another tool called dbmon. The author is a guy from Teradata Austria (I hate the slowness of Viewpoint).

Bad SQL performance is either caused by:

  • Skewed steps or
  • Stale and missing statistics. They fool the Optimizer into creating wrong join decisions: Doing product joins instead of merge joins, duplicating instead of rehashing

That’s my way of real-time monitoring:

I wait for a step that is either skewed or in which the estimated input rows to a join don’t make sense (such as two spools with millions of rows each being joined with a product join). I concentrate my optimization on steps of this type.

If the skew is the issue, I will analyze the join column value skew. If estimations are wrong, I will go over the statistics and make sure that they are up to date.

Issues with statistics can be fixed quickly. Skew issues can be quite stubborn. Query rewriting is always my last option unless I find something foolish and easy to repair.

8. Comparison of Resource Usage

Always measure resource usage before and after the optimization. As I said earlier: query run times are not a reliable measure!

Here is a SQL query you can use in your daily work to extract appropriate steps from the query log. You have to set a different QUERYBAND for each query version you are running to be able to distinguish them. You need “select” access to “DBC.DBQLOGTBL”.

SET QUERY_BAND = 'Version=1;' FOR SESSION;
SELECT
   AMPCPUTIME,
   (FIRSTRESPTIME-STARTTIME DAY(2) TO SECOND(6)) RUNTIME,
   SPOOLUSAGE/1024**3 AS SPOOL_IN_GB,
   CAST(100-((AMPCPUTIME/(HASHAMP()+1))*100/NULLIFZERO(MAXAMPCPUTIME)) AS INTEGER) AS CPU_SKEW,
   MAXAMPCPUTIME*(HASHAMP()+1) AS CPU_IMPACT,
   AMPCPUTIME*1000/NULLIFZERO(TOTALIOCOUNT) AS LHR
FROM
   DBC.DBQLOGTBL
WHERE
     QUERYBAND = 'Version=1;'
;

The query will return:

  • The total CPU Usage
  • The Spool Space needed
  • The LHR (ratio between CPU and IO usage)
  • The CPU Skew
  • The Skew Impact on the CPU

The goal is to cut total CPU usage, consumed spool space and skew.

9. Tactical Workload

Tactical workload requires a very particular skillset. The best tuner will fail if he doesn’t recognize that he is dealing with a tactical workload. I have seen complete projects failing because developers ignored this important fact.

The skillset required differs from the one needed for the strategic workload.

I strongly recommend reading this post, which explains all the details:

Tactical Workload Tuning on Teradata

Teradata SQL Queries: 6 Effective Tricks to optimize them

1. Avoid multiple Joins to the same Table

Each join means either a full table scan or index access. In all your Teradata SQL queries, avoid multiple joins to the same table if possible. Often multiple joins can be bypassed by subqueries, the use of volatile tables, or ordered analytic functions.

Here is an example of how to use a pivot in a subquery to prevent multiple access to the same table:

SELECT
	A.SUBSCRIBER_ID,
 	B.VALUE AS INCOMING,
 	C.VALUE AS OUTGOING
FROM 
	SUBSCRIBERS A
	LEFT JOIN  CALLS B ON A.SUBSCRIBER_ID = B.SUBSCRIBER_ID AND B.INOUT_CD  = 'IN'
	LEFT JOIN  CALLS C ON A.SUBSCRIBER_ID = C.SUBSCRIBER_ID AND C.INOUT_CD  = 'OUT'
WHERE
	B.CALL_DATE BETWEEN DATE '2019-01-01' AND DATE '2019-12-31'
	AND C.CALL_DATE BETWEEN DATE '2019-01-01' AND DATE '2019-12-31';

This can be rewritten with a single pass over the CALLS table:

SELECT
	A.SUBSCRIBER_ID,
 	B.INCOMING,
 	B.OUTGOING
FROM 
	SUBSCRIBERS A
	LEFT JOIN  
	(
		SELECT 
			SUBSCRIBER_ID,
			MAX(CASE WHEN INOUT_CD = 'IN' THEN VALUE END) AS INCOMING,
			MAX(CASE WHEN INOUT_CD = 'OUT' THEN VALUE END) AS OUTGOING
		FROM
			CALLS 
		WHERE
			CALL_DATE BETWEEN DATE '2019-01-01' AND DATE '2019-12-31'
			AND INOUT_CD IN ('IN', 'OUT')
		GROUP BY 1
	) B
ON A.SUBSCRIBER_ID = B.SUBSCRIBER_ID;

2. Avoid functions in Equi-Join Conditions

Most functions prevent ROWHASH access from being used when applied to a join column. These types of transformations are a sign of poor data modeling or poor data quality. If you solve the cause, then there is no need to use functions in joins.

a.SUBSCRIBER_ID = TRIM(LEADING '0' FROM b.SUBSCRIBER_ID)
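A minimal sketch of fixing the cause instead, assuming a hypothetical staging table: cleanse the key once while loading, so later joins can use plain equality and row-hash access:

INSERT INTO MyDatabase.Calls (Subscriber_Id, Call_Date, Value_Amt)
SELECT TRIM(LEADING '0' FROM Subscriber_Id), Call_Date, Value_Amt
FROM Stage.Calls_Load;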

3. Avoid the usage of UNION

A simple UNION causes a sorting and deduplication of the rows. This is a very expensive process. If it is ensured that the result sets cannot provide overlapping results, UNION ALL should always be used.

Instead of: 

SELECT   CAST ('FIRST'  AS VARCHAR (15)), PK FROM TheTable
UNION 
SELECT   CAST ('SECOND'  AS VARCHAR (15)), PK FROM TheTable;

Use:

SELECT   CAST ('FIRST'  AS VARCHAR (15)), PK FROM TheTable
UNION ALL
SELECT   CAST ('SECOND'  AS VARCHAR (15)), PK FROM TheTable;

4. Avoid GROUP BY over Expressions

Avoid grouping over expressions. For performance reasons, it is better to use the underlying columns of the expression as grouping criteria; e.g., if you need to group by concatenated columns, list the individual columns in GROUP BY rather than the concatenated expression.

SELECT  COALESCE (SUBSCRIBER_ID ,0) || COALESCE(DESCRIPTION ,'') FROM THETABLE 
GROUP BY 
COALESCE (SUBSCRIBER_ID,0) || COALESCE(DESCRIPTION ,'');

Change to:

SELECT  COALESCE (SUBSCRIBER_ID ,0) || COALESCE(DESCRIPTION ,'') FROM THETABLE 
GROUP BY 
COALESCE (SUBSCRIBER_ID,0) , COALESCE(DESCRIPTION ,'');

5. Divide large Teradata SQL queries into smaller ones

A large single query can hit resource limits, especially spool space. To avoid this, try to split the query into smaller chunks. Good splitting points are UNION (ALL) statements; these can be replaced by multiple INSERT statements. Another splitting point could be JOINs.
Consider materializing complex sub-queries in a volatile table first.
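A minimal sketch of the UNION ALL split, with hypothetical table names:

-- One large request that may hit the spool limit:
INSERT INTO MyDatabase.Target_Table
SELECT Subscriber_Id, Call_Date, Value_Amt FROM MyDatabase.Calls_2018
UNION ALL
SELECT Subscriber_Id, Call_Date, Value_Amt FROM MyDatabase.Calls_2019;

-- Split into two smaller requests:
INSERT INTO MyDatabase.Target_Table
SELECT Subscriber_Id, Call_Date, Value_Amt FROM MyDatabase.Calls_2018;

INSERT INTO MyDatabase.Target_Table
SELECT Subscriber_Id, Call_Date, Value_Amt FROM MyDatabase.Calls_2019;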

6. Consider the creation of additional statistics and indexes

It’s all about statistics.
Optimization of SQL statements makes absolutely no sense if we do not have proper statistics defined.

But: proper statistics do not necessarily mean that we have to collect statistics at all.

We can let the optimizer do this, as it always samples statistics on indexes if they are not explicitly collected.
Still, we have to be sure that the optimizer has the correct picture of the data demographics for each table and column.

The very first thing you should do is turn on diagnostics, as this will give you useful information about missing statistics:

DIAGNOSTIC HELPSTATS ON FOR SESSION;

Afterward, run the EXPLAIN statement, and at the end of the output you will find a list of statistics the optimizer is suggesting.

Not all suggested statistics may make sense in your scenario but they give you a good starting point.

Be careful: It’s not about adding all the statistics that are suggested!
What you should definitely look for are suggestions on columns that are used in JOIN and WHERE conditions (especially if they are not indexed).
Add them step by step, re-check the EXPLAIN output after each addition, and look for changes in the plan.
If there are none, drop the statistics again. Do not add them all at once, as you will no longer be able to tell which ones are actually useful!

If the diagnostics suggest statistics on a column or a combination of columns that is unique or almost unique, go for sample statistics only (COLLECT STATISTICS USING SAMPLE COLUMN (<Column>) ON <Table>;), as they are much faster and need fewer resources.

Check the plan for deviations between the estimated rows and the number of rows in the related table. This could be a hint that there are aged statistics on the table.

SHOW STATISTICS VALUES ON <TheTable>; is your best friend to get detailed insight into the statistics currently used by the Optimizer. It will even show you the skewed values of each column, which is valuable information when designing your query.


Teradata Optimization with Partial Group by in Joins

What is Partial Group By?

Joins are very expensive. Before there was PARTIAL GROUP BY, the join was first executed, then the result of the join was aggregated.

The idea behind PARTIAL GROUP BY is simple: Aggregations that can be performed before the join (without changing the semantics of the query) reduce the amount of data that has to be redistributed or duplicated to all AMPs during the join preparation.

Obviously, the fewer distinct values the GROUP BY columns have, the greater the savings.

A distinction is made between Early GROUP BY and Partial GROUP BY: Early GROUP BY performs all aggregations before the join, while Partial GROUP BY performs part of the aggregation before the join and part after it. This is the only difference.

Here is an example of a query that is predestined for PARTIAL GROUP BY:

SELECT table1.key,SUM(table1.fact),SUM(table2.fact)
FROM table1 INNER JOIN table2 ON table1.key = table2.key
GROUP BY 1;

In this case, it makes no difference to the result whether the join is done first and then the aggregation, or vice versa. Depending on the number of rows per table and the number of distinct values in the grouping column, the performance can, however, differ considerably.


Teradata Multiple Joins – A Query Tuning Showcase

GUEST POST BY ARTEMIY KOZYR

1. Summary

The purpose of this showcase is to demonstrate how to optimize a statement with multiple JOINs: an elegant way of tuning that leads the Teradata Optimizer to the optimal JOIN strategy and to using data redistribution instead of duplication where appropriate.

Whenever you have complex logic with lots of joins inside, try to decompose it and identify the parts that perform badly. Gather the execution plan and track query performance and resource usage to determine the JOIN sequence that achieves the best performance.

The results are just overwhelming. Performing the JOINs in the optimal way helped to reduce resource utilization many times over:

In several cases, it turns out to be too complicated to tune a query with lots of joins and logic inside, so it might be useful to decompose it into several consecutive steps. It then gets easier in terms of performance monitoring, transparency, and the execution plan.

2. Heavy load Statement with lots of JOINs inside

Here is the real example of the query running in our production environment:

  • Heavy query run daily which consumes lots of resources;
  • Takes place at ETL Staging area (pre-detailed layer);
  • Source system rows are enriched with surrogate keys and ETL metadata;

Inside, 12 LEFT JOINs take place.

The cumulative staging table (425M+ rows) is JOINed to Surrogate Keys tables (the largest is 125M+ rows).

Take a look at the source code:

LOCKING ROW FOR ACCESS

SELECT
_S.*,
_SK_ACC_DEV.TGT_ID,
_SK_PTY_CLIENT.TGT_ID,
_SK_AGRMNT.TGT_ID,
_SK_PTY_VSP.TGT_ID,
_SK_PTY_OSB.TGT_ID,
_SK_PROD.TGT_ID,
_SK_ACC_DEV_REASON_TYPE.TGT_ID,
_SK_DRN.TGT_ID,
_SK_UCFC.TGT_ID,
_SK_CONTR_SUBTYPE.TGT_ID,
_SK_CARD_BIN.TGT_ID

FROM SANDBOX.S0160000010008_CARD _S

LEFT JOIN SANDBOX.K_AGRMNT_NK07 AS _SK_AGRMNT
ON (_SK_AGRMNT.NK07_AGREEMENT_ID_W4 IS NOT NULL
AND _SK_AGRMNT.NK07_AGREEMENT_ID_W4=_S.CONTRACT_ID_W4 )

LEFT JOIN SANDBOX.K_PTY_NK12 AS _SK_PTY_CLIENT
ON (_SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID IS NOT NULL
AND _SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID=_S.CARD_CLIENT_ID_W4 )

LEFT JOIN SANDBOX.K_PROD_NK03 AS _SK_PROD
ON (_SK_PROD.NK03_PRODUCT_ID_W4=_S.PRODUCT_ID_W4 )

LEFT JOIN SANDBOX.K_ACCESS_DEVICE_NK10 AS _SK_ACC_DEV
ON (_SK_ACC_DEV.NK10_CARD_NUMBER IS NOT NULL
AND _SK_ACC_DEV.NK10_CARD_NUMBER=_S.CARD_NUMBER )

LEFT JOIN SANDBOX.S0160000010001_BRANCH _B
ON (_S.OSB_TB_NUMBER=_B.TB_OLD_NUMBER
AND _S.OSB_NUMBER=_B.OSB_NUMBER )

LEFT JOIN SANDBOX.K_PTY_NK13 AS _SK_PTY_VSP
ON _SK_PTY_VSP.NK13_VSP_FULL_NUMBER = _B.TB_NUMBER

LEFT JOIN SANDBOX.K_PTY_NK11 AS _SK_PTY_OSB
ON _SK_PTY_OSB.NK11_TB_NUMBER = _B.TB_NUMBER

LEFT JOIN SANDBOX.K_ACCESS_DEVICE_RSN_TYPE_NK01 AS _SK_ACC_DEV_REASON_TYPE
ON (_SK_ACC_DEV_REASON_TYPE.NK01_W4_STATUS_ID=_S.STATUS_ID
AND _SK_ACC_DEV_REASON_TYPE.NK01_W4_STATUS_ID IS NOT NULL)

LEFT JOIN SANDBOX.K_ACCESS_DEVICE_CLASS_VAL_NK05 AS _SK_DRN
ON (_SK_DRN.NK05_DRN=_S.DRN
AND _SK_DRN.NK05_DRN IS NOT NULL )

LEFT JOIN SANDBOX.K_ACCESS_DEVICE_CLASS_VAL_NK03 AS _SK_UCFC
ON (_SK_UCFC.NK03_UCFC=_S.UCFC
AND _SK_UCFC.NK03_UCFC IS NOT NULL )

LEFT JOIN SANDBOX.K_CONTR_SUBTYPE#WAY4_NK01 AS _SK_CONTR_SUBTYPE
ON (_SK_CONTR_SUBTYPE.NK01_CONTR_SUBTYPE_ID=_S.CONTR_SUBTYPE_ID
AND _SK_CONTR_SUBTYPE.NK01_CONTR_SUBTYPE_ID IS NOT NULL )

LEFT JOIN SANDBOX.K_ACCESS_DEVICE_CLASS_VAL_NK04 AS _SK_CARD_BIN
ON (_SK_CARD_BIN.NK04_CARD_BIN=_S.CARD_BIN
AND _SK_CARD_BIN.NK04_CARD_BIN IS NOT NULL );

3. Poor performing query needs to be reviewed

 

The query has shown poor performance indicators and tends to be among the top badly-performing statements according to the Query Log.

In recent weeks it has only grown in resource consumption, so it needs to be reviewed and tuned.

The resources are consumed in a non-optimal way. This single session utilizes 40-45% of the CPU time of the whole ETL process for this source system!

Take a look at the query performance:

4. Examine the execution plan

It is always a tricky question how to tune a query and improve its performance. What should we start with?

First of all, we should dig deeper into the execution plan and try to identify the part of the query that is performing badly.

Here we see the JOIN to the clients’ surrogate key (SK) table. The SK table consists of 125M+ rows, and its data is distributed evenly across all AMPs. Yet we see that the join is performed with data duplication on all AMPs!

10) We do an all-AMPs RETRIEVE step from a single partition of
SANDBOX._SK_PTY_CLIENT with a condition of … into Spool 26 (all_amps)
which is duplicated on all AMPs with hash fields …

 Then we do a SORT to order Spool 26 by row hash. The result spool file will not be cached in memory. The size of Spool 26 is estimated with no confidence to be 230,896,636,992 rows (10,852,141,938,624 bytes). The estimated time for this step is 16 minutes and 22 seconds. 

5. Approach: Statistics, JOIN INDEX, Sequence of JOINs

  1. Statistics gathering

First of all, make sure proper statistics are in place. Are there statistics missing on the joining columns? If so, collect the necessary ones:

COLLECT STATISTICS COLUMN (PARTITION
, INFO_SYSTEM_INST_CD
, INFO_SYSTEM_TYPE_CD
, NK12_WAY4_CLIENT_ID)
ON SANDBOX.K_PTY_NK12 ;

HELP STATISTICS SANDBOX.K_PTY_NK12;

HELP STATISTICS SANDBOX.S016_0008_CARD;

But still, Teradata Optimizer decides to perform table duplication and does not change the execution plan.

  2. Introducing JOIN INDEX

 

Should we proceed to create Join Index – the Teradata-specific way of materializing views?

I have tried a couple of different JI definitions, but unfortunately, the Teradata Optimizer would not use them for specific reasons. Furthermore, a join index would add significant maintenance overhead, so it might not be the best option.

CREATE JOIN INDEX SANDBOX.JI_CC0160000010008_CARD
AS
SELECT
_S.CARD_ID_W4
, _S.CARD_CLIENT_ID_W4
, _SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID
, _SK_PTY_CLIENT.TGT_ID
FROM SANDBOX.S0160000010008_CARD /*SANDBOX.S016_0008_CARD*/ _S
INNER JOIN SANDBOX.K_PTY_NK12 AS _SK_PTY_CLIENT
ON (_SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID IS NOT NULL
AND _SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID=_S.CARD_CLIENT_ID_W4 );

  3. JOIN sequence manipulation

Finally, let us try to force a particular sequence of joining the tables and see the results. So how do you do this? You can simply decompose this large query into two consecutive ones:

1) First, perform all the JOINs except the one causing bad performance

2) JOIN the intermediate result to the last large table

How is the sequence of joins determined? It depends on Teradata cost-based optimization. There are no HINTs like you have in Oracle to force the optimizer to execute the statement the way you want.

By dividing the initial statement into two consecutive ones, we force Teradata into a specific join order. This approach brought the best results.

So, you simply comment out the JOIN condition and add it in the second step:

/* LEFT JOIN SANDBOX.K_PTY_NK12 AS _SK_PTY_CLIENT
ON (_SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID IS NOT NULL
AND _SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID=_S.CARD_CLIENT_ID_W4) */
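Schematically, the two consecutive steps could look like this (hypothetical work table CARD_STEP1; the column list is abbreviated, and only the join to K_PTY_NK12 is deferred to the second step):

CREATE VOLATILE TABLE CARD_STEP1 AS
(
SELECT _S.CARD_ID_W4, _S.CARD_CLIENT_ID_W4, _SK_AGRMNT.TGT_ID AS SK_AGRMNT_ID
/* ... all other columns and surrogate keys except the one from K_PTY_NK12 ... */
FROM SANDBOX.S0160000010008_CARD _S
LEFT JOIN SANDBOX.K_AGRMNT_NK07 AS _SK_AGRMNT
ON _SK_AGRMNT.NK07_AGREEMENT_ID_W4 = _S.CONTRACT_ID_W4
) WITH DATA
PRIMARY INDEX (CARD_CLIENT_ID_W4)
ON COMMIT PRESERVE ROWS;

SELECT s1.*, _SK_PTY_CLIENT.TGT_ID
FROM CARD_STEP1 s1
LEFT JOIN SANDBOX.K_PTY_NK12 AS _SK_PTY_CLIENT
ON _SK_PTY_CLIENT.NK12_WAY4_CLIENT_ID = s1.CARD_CLIENT_ID_W4;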

6. Improved performance with data redistribution

Here is the new EXPLAIN Plan excerpt for the second step of query utilizing data redistribution and performing of Hash JOIN:

3) We do an all-AMPs RETRIEVE step from a single partition of SANDBOX._SK_PTY_CLIENT with a condition of … into Spool 25 (all_amps). Spool 25 is redistributed by hash code to all AMPs. The size of Spool 25 is estimated with low confidence to be 127,257,826 rows. The estimated time for this step is 0.58 seconds.

11) We do an all-AMPs JOIN step from Spool 2, which is joined to Spool 23. Spool 25 is used as the hash table and Spool 23 is used as the probe table in a right outer join using a classical hash join of 2 partitions with a join condition of … The size of Spool 28 is estimated with low confidence to be 387,595,663 rows (815,113,679,289 bytes). The estimated time for this step is 16.71 seconds.

The results are just great. The performance indicators improved multiple times over, unlocking the full Teradata capacity.

Here are the performance indicators BEFORE and AFTER the optimization. I have grouped the two consecutive steps under one number for convenience.

Artemiy Kozyr is a Data Engineer at Sberbank, Moscow, with a Master’s degree in CS.
He has 5 years of experience in Data Warehousing, ETL, and Visualization for financial institutions.

Contacts:

[email protected]
http://linkedin.com/in/artemiykozyr/
