What is Teradata Query Rewriting?
Teradata query rewriting is part of the Teradata optimization process.
The optimizer replaces your query with a better version, which is assumed to be less resource-intense and faster.
Of course, both queries have to deliver the same result set. To achieve better performance, Teradata utilizes several optimization strategies.
We will show you the most frequently applied optimization techniques.
1. Removing unreferenced columns and expressions from the VIEW select list
Without this enhancement, all VIEW columns would have to be spooled during query execution.
Let’s assume we have the following view definition:
CREATE VIEW MyView AS
SELECT col1, col2, AVG(col3) myavg
FROM table1 t01, table2 t02
WHERE t01.pk=t02.pk GROUP BY 1,2;
We are running the following query:
SELECT MAX(myavg) FROM MyView;
In the above example, view columns col1 and col2 can be eliminated, and only the column myavg will be spooled.
Another example will show you that it is even possible to eliminate spool usage:
SELECT COUNT(*) FROM MyView;
The above query does not reference any view column. The number of rows can be derived from the cylinder index alone!
The elimination of view columns from the spool is a widespread technique the optimizer uses for spool usage reduction.
2. Conversion of outer joins to inner joins
Whenever the optimizer detects that a query can be satisfied with an inner join, the outer join will be replaced to improve join performance:
SELECT DISTINCT(col1)
FROM table1 LEFT OUTER JOIN table2 ON table1.pk=table2.pk
WHERE table2.col2 = 100 ;
The outer join can safely be converted to an inner join in the above example as the WHERE condition filters any non-matching rows.
3. Folding of Views
When applying this enhancement, the optimizer completely removes the reference to views from the execution plan (view folding).
Let’s assume we have the following view definition:
CREATE VIEW MyView AS
SELECT col1, col2, SUM(col3*col4) the_sum
FROM table1, table2
WHERE table1.pk=table2.pk GROUP BY 1,2;
We run the following query:
SELECT col2 FROM MyView WHERE the_sum > 100;
The optimizer would rewrite the above example in the following form:
SELECT col2 FROM table1, table2
WHERE table1.pk=table2.pk
GROUP BY 1,2
HAVING SUM(col3*col4) > 100;
As you can see, the view was eliminated from the execution plan.
4. Predicate Pushdown
With this technique, the optimizer pushes down existing query predicates. In the above example, the WHERE predicate would be applied already in the inner sub-query t01:
SELECT SUM(the_sum) the_overall_sum
FROM (
SELECT col1, col2, SUM(col3*col4) the_sum
FROM table1, table2
WHERE table1.col1=table2.col2
GROUP BY 1, 2 ) t01
WHERE col1 IN (1,2,3,4,5) ;
5. Elimination of SET operator branches
If a SET branch contains an unsatisfiable condition, Â the branch will be eliminated from the query.
Let’s assume we have the example query below:
SELECT * FROM table1 WHERE month_id= 1
UNION ALL
SELECT * FROM table2 WHERE month_id = 1 ;
Further, we assume to have a check constraint on month_id, ensuring that table1 only contains month_id = 1 and table2 contains month_id = 2:
The optimizer can safely rewrite the above query to:
SELECT * FROM table1 WHERE month_id = 1 ;
6. Elimination of Joins
Unneeded Joins are eliminated from the execution plan:
SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.pk=table2.pk ;
The above query can be satisfied by spooling only table1:
SELECT table1.* FROM table1;
As you can see, the optimizer tries its best to improve your queries. I hope you enjoyed this post!
Great article.
Sounds that Optimizer became much better with Version 14.0.
Hi Roland:
In what Teradata version did the ‘Query Rewrite of queries using views’ appear. I did not know that queries using views could be re-worked to eliminate the view-based-query entirely and go straight to the underlying tables – which support the view.