What are the Goals of SQL Tuning?
Performance tuning of a database system always means reducing resource usage. Three metrics are important: skew, IOs, and CPU seconds. From our experience, the biggest problems in Teradata are the number of IOs and skew. Transferring rows from mass storage to main memory is time-consuming. Teradata cannot transfer single rows, only data blocks that contain a certain number of rows. Only in main memory can the required rows be located within the data blocks.
CPU consumption can be a problem on a CPU-bound system. In practice, however, I have rarely experienced this limitation.
The three metrics mentioned are also our optimization metrics. We could also optimize by the runtime of our queries but this makes no sense in many respects. Runtimes are very dependent on external parameters over which we have no influence. The most important of these are concurrent workloads. Even though at the end of the day runtimes are what our users care about, we will still base our optimization on IOs, CPU seconds, and skew. This should automatically result in a better runtime.
These metrics have a big advantage: they are absolute. For the same execution plan, we will always see the same resource consumption for the same query.
1. Ensure Completeness and Correctness of Teradata Statistics
An essential optimization task is to support the Teradata Optimizer with complete and correct statistics.
We have to pay particular attention to 3 basic situations and always collect complete statistics:
- Non-indexed columns used in predicates:
Missing statistics force the Optimizer to fall back on wildly inaccurate heuristic estimations.
- Skewed indexed columns:
Random-AMP sampling on a skewed index results in wrong estimates and bad execution plans.
- Small tables:
Estimations for small tables are unreliable when there are fewer table rows than AMPs. In this case, most AMPs will not contain any rows. A dynamic AMP sample taken from an AMP holding no rows will give a wrong estimation.
Following these three rules will make many problems related to missing statistics vanish.
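The three situations above can be covered with ordinary COLLECT STATISTICS statements. A minimal sketch, using hypothetical table and column names (Sales_Db.Customer, Sales_Db.Currency, and their columns are illustrative, not from the original text):

```sql
-- Non-indexed column used in WHERE predicates:
COLLECT STATISTICS COLUMN (CustomerStatus) ON Sales_Db.Customer;

-- Skewed indexed column (full statistics instead of random-AMP sampling):
COLLECT STATISTICS COLUMN (CountryCode) ON Sales_Db.Customer;

-- Small table with fewer rows than AMPs – collect on the primary index:
COLLECT STATISTICS COLUMN (CurrencyCode) ON Sales_Db.Currency;
```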
Our Teradata SQL tuning activities balance good query plans and time to ensure useful statistical information.
Discovering Missing Statistics
The most straightforward way to discover missing statistics is by turning on diagnostics:
DIAGNOSTIC HELPSTATS ON FOR SESSION;
The above statement adds statistics recommendations at the end of each explained statement:
EXPLAIN SELECT * FROM <Table> WHERE Column = 'value';
We have to check each recommendation individually. We don’t add all suggestions at once but choose those that improve query performance.
Ensure that no plan step has only “no confidence”. Steps with “no confidence” are a sign of heuristics on non-indexed columns. Something we must avoid at all costs!
Detection of Stale Statistics
The Optimizer is doing an excellent job of detecting stale statistics and extrapolating.
Before release 14.10, Teradata detected table growth by comparing two random AMP samples. One was taken during statistics collection, the other during query execution.
Since Teradata Release 14.10, deleted and inserted rows can be tracked (the UDI counts). The Optimizer will prefer UDI counts over random AMP sample comparison for extrapolation.
Still, the best way to ensure up-to-date statistics is to collect them often.
We must compare estimations against the actual row count to find stale statistics. There are two places where we can use this information:
- The Execution Plan (by placing the EXPLAIN modifier in front of our query)
- The Statistics Histogram (by executing the SHOW STATISTICS VALUES ON <TABLE> statement)
The statistics histograms give us almost the same information the Optimizer has when it creates the execution plan:
- The timestamp of the last statistics collection
- Collected & Random-AMP Sampling row counts
- Inserted and deleted rows since the last statistics collection
- Maximum column values
- Minimum column values
- Biased column values
Why did I say “almost the same information”? Statistics improve each time the Optimizer gains new insights. These can come from earlier join steps, single-table predicates, and aggregations.
Furthermore, different extrapolation methods will adjust the estimations shown in the histograms.
If you need the estimations after extrapolation, you can use the following statement:
SHOW CURRENT STATISTICS VALUES ON <Table>;
Here is a “single table predicate” example. It demonstrates the usage of derived estimations:
SELECT * FROM <Table1> t01 INNER JOIN <Table2> t02 ON t01.Key = t02.Key WHERE t01.Column IN (1,2,3);
The retrieve step estimation for <Table1> is 3 distinct values. This information flows into the joining step. If the average number of rows per value in both tables is about 1 (unique data), the resulting spool for the join is three rows.
A straightforward approach to detecting stale statistics is this:
- Decompose the query into single retrieve and join steps
- Figure out why step estimations and real numbers don’t match
Statistics could be outdated, or they could be missing. It might even be that the estimation results from how the Optimizer works.
Here is an example:
SELECT * FROM <Table> WHERE Column = 1;
If value 1 is not present in <Table>, the Optimizer will still estimate the result set to be the average number of rows per value. I guess many of you expected zero rows?
More information is available here: Teradata Statistics Basics
My last piece of advice:
Keep a copy of both the execution plans – before and after the change – to see your change’s impact.
2. The Primary Index Choice
When it comes to Primary Index choice, we must create a fair balance, weighing opposing requirements against each other:
Even data distribution and join performance.
Two tables can only be joined directly if they have the same primary index. In other words: the rows of both tables need to be located on the same AMP.
Design your queries so that the Primary Index is used for joining as much as possible, as this is the cheapest way of joining.
AMP-local joining is still possible if the join condition includes additional columns that are not part of the primary index. But if the join condition does not contain all primary index columns, the rows of one or both tables must be relocated (and maybe sorted).
You can use volatile or temporary tables if you need a different primary index to improve performance. Create them with the same structure and content as the original tables but with the required primary index. Using temporary tables is beneficial if your query consists of many join steps.
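A minimal sketch of this technique, using hypothetical names (Sales_Db.Sales, Sales_Db.Customer, and the columns are illustrative): the volatile copy is distributed by the join column, so the subsequent join is AMP-local.

```sql
-- Sales is assumed to be distributed by SalesId, but we join on CustomerId.
CREATE VOLATILE TABLE Sales_Tmp AS
(
    SELECT SalesId, CustomerId, Revenue
    FROM   Sales_Db.Sales
)
WITH DATA
PRIMARY INDEX (CustomerId)
ON COMMIT PRESERVE ROWS;

-- No redistribution of Sales_Tmp is needed for this join:
SELECT c.CustomerId, SUM(t.Revenue)
FROM   Sales_Tmp t
JOIN   Sales_Db.Customer c
ON     t.CustomerId = c.CustomerId
GROUP BY 1;
```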
3. Teradata Indexing & Partitioning
Indexes give the Optimizer more data access paths. They improve highly selective retrieve actions. Unfortunately, indexes consume permanent disk space and require maintenance when the underlying base table changes.
If the Optimizer does not use an index and it is not valuable for the PDM design, drop it immediately. It will only waste space and resources.
A unique secondary index (USI) allows direct row access (like the primary index). The non-unique secondary index (NUSI) requires a full index subtable scan on all AMPs.
The NUSI can be an advantage over base table access if the index subtable is smaller than the base table. Covering NUSIs are more valuable than non-covering ones: no base table lookups are needed, and no extra costs are created.
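A short sketch of both index types, with hypothetical names (Sales_Db.Orders and its columns are illustrative). Note the statistics collection on the NUSI column, without which the Optimizer will not consider the index:

```sql
-- USI: direct row access, similar to the primary index:
CREATE UNIQUE INDEX (OrderNo) ON Sales_Db.Orders;

-- NUSI: scanned on all AMPs; collect statistics so the Optimizer can use it:
CREATE INDEX (OrderDate) ON Sales_Db.Orders;
COLLECT STATISTICS COLUMN (OrderDate) ON Sales_Db.Orders;
```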
The difference between an index and partitioning is that indexes are sub-tables, while partitioning is another way of structuring the base table. Partitioning allows the Optimizer to limit access to the data blocks of a partition. The advantage of partitioning is that partition elimination is always applied; index usage has preconditions. For example, the NUSI will not be used without statistics collected on the index columns.
Still, indexing is a significant advantage: We create the index; we check if it’s used. We drop it if it’s not used. Have you ever partitioned a 200 Terabyte table for test reasons? I guess this is not what we like to do.
Another disadvantage of partitioning comes to my mind:
Join performance worsens if partitions of both tables don’t match or one of the tables is not partitioned.
You have to keep the data warehouse architecture in mind when working with partitioning and decide whether your solution fits into it. Partitioned tables use different join techniques; mismatched partitioning hurts join performance!
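To make partition elimination concrete, here is a minimal sketch with hypothetical names (Sales_Db.Sales_P and its columns are illustrative, as is the date range):

```sql
-- Table partitioned by day on the date column:
CREATE TABLE Sales_Db.Sales_P
(
    SalesId    INTEGER NOT NULL,
    SalesDate  DATE    NOT NULL,
    Revenue    DECIMAL(18,2)
)
PRIMARY INDEX (SalesId)
PARTITION BY RANGE_N (
    SalesDate BETWEEN DATE '2020-01-01' AND DATE '2024-12-31'
    EACH INTERVAL '1' DAY
);

-- A WHERE condition on the partitioning column enables partition elimination,
-- so only the data blocks of January 2024 are read:
SELECT SUM(Revenue)
FROM   Sales_Db.Sales_P
WHERE  SalesDate BETWEEN DATE '2024-01-01' AND DATE '2024-01-31';
```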
Conclusion: There is no one-size-fits-all method. You have to check what works best for you!
4. Query Rewriting
Often query performance improves when the query is rewritten. Here are some ideas:
- DISTINCT instead of GROUP BY depending on the number of different values
- UNION ALL instead of UNION (getting rid of a sort step)
- Splitting a skewed query into two parts. One for handling the skewed values, the other to deal with the rest
- Adding WHERE conditions to allow for partition elimination
- Removing unreferenced columns and expressions from the select list (could help to cut joins)
- Converting outer joins into inner joins
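As one example from the list above, splitting a skewed query into two parts can be sketched as follows. The names (Sales_Db.Sales, Sales_Db.Customer) and the skewed value -1 are hypothetical:

```sql
-- Assumption: CustomerId = -1 is a heavily skewed "unknown customer" value.

-- Part 1: the non-skewed values join normally:
SELECT s.SalesId, c.CustomerName
FROM   Sales_Db.Sales s
JOIN   Sales_Db.Customer c
ON     s.CustomerId = c.CustomerId
WHERE  s.CustomerId <> -1

UNION ALL

-- Part 2: the skewed value is handled separately, without a join:
SELECT s.SalesId, 'Unknown' AS CustomerName
FROM   Sales_Db.Sales s
WHERE  s.CustomerId = -1;
```

UNION ALL is used deliberately here: the two parts are disjoint, so the sort step a plain UNION would add is unnecessary.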
Query rewriting allows for improving performance, sometimes even when all other techniques fail.
Query rewriting is a compelling way to improve performance. Still, it often requires understanding the business logic (“can I replace this left join with an inner join?”).
It is possible to rewrite a query in a purely technical way. Still, understanding the business logic of a query reveals more tuning opportunities.
5. Physical Data Model Design
I was unsure if I should add the physical data model to this list for one reason. Often we can’t make any significant changes in the physical database design. Too many consumers on top of the database (such as reporting tools and data marts) would require a redesign.
Still, if we can improve the physical model, this is one of the most effective changes.
The best advice I can give:
Keep your core model normalized and denormalize in your data marts. Many of you will not agree; I can live with that. My experience is that early denormalization causes bad performance, especially when done without apparent reason.

Check all columns of all tables to see whether the stored information can be broken down further, and ensure that join columns have the same data types and character sets. Columns containing more than one piece of information force the user to join on expressions. Most likely, the Optimizer will then use neither statistics nor the primary index for joining.
Ensure that all primary key columns are defined as NOT NULL. Add default values where appropriate. If you store codes, there is no reason to use UNICODE. It will just waste space.
Apply Multi Value compression on all tables: More rows can fit into each data block. Data blocks are the smallest unit transferred between disk and memory. More rows per data block lead to fewer disk IOs and better performance.
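The advice on NOT NULL, LATIN character sets for codes, and multi-value compression can be combined in one table definition. A minimal sketch with hypothetical names and compression values:

```sql
CREATE TABLE Sales_Db.Customer
(
    CustomerId   INTEGER NOT NULL,
    -- Codes don't need UNICODE; LATIN halves the storage:
    CountryCode  CHAR(2) CHARACTER SET LATIN NOT NULL DEFAULT 'XX'
                 COMPRESS ('DE','US','FR','IT'),
    -- Multi-value compression on the most frequent values:
    CustStatus   VARCHAR(10) CHARACTER SET LATIN
                 COMPRESS ('ACTIVE','INACTIVE')
)
UNIQUE PRIMARY INDEX (CustomerId);
```

The compressed values are stored once in the table header instead of in every row, so more rows fit into each data block.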
Please consider that the above advice can only be a loose collection of ideas about fixing a broken data model.
6. Make Use of Teradata-Specific Features
There are several unique optimization opportunities that you should consider:
- Using MULTISET tables can decrease Disk IOs
- Use Multi-Statement Requests, as this avoids the transient journal and allows block optimization
- Use CLOB columns instead of VARCHAR if you seldom select the column. Teradata stores CLOBs in sub-tables
- DELETE instead of DROP tables
- Use ALTER TABLE instead of INSERT…SELECT into a copy
- Use MERGE INTO instead of INSERT & UPDATE
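The last point can be sketched as a classic upsert. The names (Sales_Db.Customer, Stage_Db.Customer_Delta) are hypothetical; one MERGE replaces a separate UPDATE plus INSERT:

```sql
MERGE INTO Sales_Db.Customer AS tgt
USING Stage_Db.Customer_Delta AS src
ON (tgt.CustomerId = src.CustomerId)
WHEN MATCHED THEN
    UPDATE SET CustomerName = src.CustomerName
WHEN NOT MATCHED THEN
    INSERT (CustomerId, CustomerName)
    VALUES (src.CustomerId, src.CustomerName);
```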
7. Real-Time Monitoring
Observing a query while it’s running allows us to detect the critical steps. Most people use Viewpoint for real-time monitoring. I prefer another tool called dbmon, written by a guy from Teradata Austria (I don’t like the slowness of Viewpoint).
Bad SQL performance is caused by one of two things:
- Skewed steps
- Stale or missing statistics that fool the Optimizer into wrong join decisions: product joins instead of merge joins, duplicating instead of rehashing
That’s my way of real-time monitoring:
I wait for a step that is either skewed or in which the estimated input rows to a join don’t make sense (such as two spools with millions of rows joined with a product join). I concentrate my optimization on steps of this type.
I will analyze the join column value skew if the skew is the issue. If estimations are wrong, I will review the statistics and ensure they are up to date.
Issues with statistics can be fixed quickly. Skew matters can be pretty stubborn. Query rewriting is always my last option unless I find something foolish and easy to repair.
8. Comparison of Resource Usage
Always measure resource usage before and after the optimization. As I said earlier: query runtimes are not a reliable measure!
You can use a SQL query to extract appropriate steps from the query log in your daily work. To distinguish them, you must set a different QUERYBAND for each query version you are running. You need “select” access to “DBC.DBQLOGTBL”.
SET QUERY_BAND = 'Version=1;' FOR SESSION;

SELECT
    AMPCPUTIME,
    (FIRSTRESPTIME-STARTTIME DAY(2) TO SECOND(6)) AS RUNTIME,
    SPOOLUSAGE/1024**3 AS SPOOL_IN_GB,
    CAST(100-((AMPCPUTIME/(HASHAMP()+1))*100/NULLIFZERO(MAXAMPCPUTIME)) AS INTEGER) AS CPU_SKEW,
    MAXAMPCPUTIME*(HASHAMP()+1) AS CPU_IMPACT,
    AMPCPUTIME*1000/NULLIFZERO(TOTALIOCOUNT) AS LHR
FROM
    DBC.DBQLOGTBL
WHERE
    QUERYBAND = 'Version=1;';
The query will return:
- The total CPU Usage
- The Spool Space needed
- The LHR (ratio between CPU and IO usage)
- The CPU Skew
- The Skew Impact on the CPU
The goal is to cut total CPU usage, spool space, and skew.
9. Tactical Workload
A tactical workload requires a very particular skillset. The best tuner will fail when not recognizing tactical workload. I have seen complete projects failing because developers ignored this critical fact.
The skillset required for tactical workload is unique and differs from what the strategic workload demands.
I strongly recommend reading this post, which explains all the details: