Introduction to Teradata Temporary Tables
Teradata temporary tables come in different types and for various purposes. This article shows you the properties and uses of each one.
1. Derived Tables
- Are materialized by a select statement within a query
- Only exist within the time the query is executed
- Use the spool space of the executing user
- Disappear at the moment the query is finished
Derived tables must consistently be named with an alias. Otherwise, there will be a syntax error!
Here is an example of a derived table (inside the parentheses):
SELECT * FROM (SELECT MAX(Revenue) FROM Invoices) AS InvoiceRevenue (MaxRevenue);
The table is named InvoiceRevenue, and the column from the derived table is called MaxRevenue.
However, there are further possibilities to name the column of the derived table:
SELECT * FROM (SELECT MAX(Revenue) AS MaxRevenue FROM Invoices) AS InvoiceRevenue;
We name the column already within the derived table in the above example.
Which method you use is a matter of taste. The result of the query is the same.
Derived Tables using the WITH Syntax
Here is another method to name a derived table and its columns in advance and reference it later:
WITH InvoiceRevenue(MaxRevenue) AS
(SELECT MAX(Revenue) AS MaxRevenue FROM Invoices)
SELECT * FROM InvoiceRevenue;
However, it would help if you considered that only one WITH statement is allowed per query.
The great thing about this Syntax is that the table is pre-defined and can be easily inserted into the rest of the query.
Here is an example:
WITH InvoiceRevenue(MaxRevenue) AS
(SELECT MAX(Revenue) AS MaxRevenue FROM Invoices)
SELECT *
FROM
InvoiceRevenue t01
INNER JOIN
Customer t02
ON
t01.CustomerId = t02.CustomerId;
It is also essential to understand that the derived table exists only during query execution, not for the entire session.
Here is an example of two queries that are executed in one transaction. The second query will cause an error message:
BT;
WITH InvoiceRevenue(MaxRevenue) AS
(SELECT MAX(Revenue) AS MaxRevenue FROM Invoices)
SELECT *
FROM
InvoiceRevenue t01
INNER JOIN
Customer t02
ON
t01.CustomerId = t02.CustomerId;
SELECT * FROM InvoiceRevenue; --> causes an error
ET;
2. Volatile Tables
- Are created and afterward materialized by adding data with an INSERT/SELECT statement or
- Are created and materialized at the same time with a CREATE VOLATILE TABLE TheTable AS (<QUERY>) WITH DATA PRIMARY INDEX (PI) ON COMMIT PRESERVE ROWS statement
- Use the spool space of the executing user.
- Disappear at the moment the session is logged of
- The table definition is stored in each AMP’s cache
CREATE VOLATILE TABLE Revenue, NO LOG ( CustomerId BIGINT NOT NULL, RevenueAmount DECIMAL(18,3) ) ON COMMIT PRESERVE ROWS;
NO LOG means that no Transient Journal is used.
This is usually unnecessary because it is a temporary table, and you will not need this functionality of a ROLLBACK.
NO LOG is the default setting and should usually not be changed.
ON COMMIT PRESERVE ROWS is not a default setting and must always be specified. Otherwise, the table contents are deleted immediately at the end of the transaction.
The default setting is ON COMMIT DELETE ROWS!
After the volatile table is created, we can load it like any other table:
INSERT INTO Revenue SELECT CustomerId, SUM(Revenue) FROM Transaction;
Why do I need ON COMMIT DELETE ROWS?
Here is a practical example of how to use ON COMMIT DELETE ROWS.
As already mentioned, ON DELETE COMMIT ROWS deletes the data as soon as the corresponding transaction is completed.
We can take advantage of this to, for example, delete a volatile table immediately after it is no longer needed.
BT;
CREATE VOLATILE TABLE Revenue, NO LOG
(
CustomerId BIGINT NOT NULL,
RevenueAmount DECIMAL(18,3)
) ON COMMIT DELETE ROWS;
INSERT INTO Revenue SELECT CustomerId,AVG(Revenue) FROM Transaction;
SELECT * FROM Revenue;
-- The volatile table still exists as the transaction is not ended!
ET; -- Now the volatile table is empty
-- Further Queries...
Additional possibilities to materialize volatile tables
–> Create the table with content CREATE VOLATILE TABLE Revenue AS (SELECT SUM(Amount) AS AMT FROM Revenue_All) WITH DATA ON COMMIT PRESERVE ROWS; | –> Create the table empty CREATE VOLATILE TABLE Revenue AS (SELECT SUM(Amount) ASAMT FROM Revenue_All ) WITH NO DATA ON COMMIT PRESERVE ROWS; |
–> This volatile table contains only column AMT and ceratin table attributes from the table Transaction CREATE VOLATILE TABLE Revenue AS (SELECT SUM(Amount) AS AMT FROM Revenue_All ) WITH DATA ON COMMIT PRESERVE ROWS; | –> This volatile table is a 1:1 copy of the table transaction CREATE VOLATILE TABLE Revenue AS Revenue_All ) WITH DATA ON COMMIT PRESERVE ROWS; |
How can I copy Statistics to the Volatile Table?
CREATE VOLATILE TABLE Revenue AS Revenue_All WITH DATA AND STATISTICS ON COMMIT PRESERVE ROWS;
How do I find all available volatile tables in my session?
All you have to do is execute this command:
HELP VOLATILE TABLE;
3. Global Temporary Tables
- They are created with a CREATE TABLE DDL, which is stored permanently in DBC tables.
- Are materialized with an INSERT/SELECT statement
- Use the temp space of the executing user (not the spool space)
- Disappear when the session is logged off (but the table definition stays stored in the DBC tables!)
- Each user can materialize their copy of a Global Temporary Table
Global Temporary Tables are helpful if the definition is to be used by several users.
Like a Volatile Table, the Global Temporary Table only exists in the session in which we materialized it. However, unlike the Volatile Table, many users can materialize this table (as the definition is stored permanently in DBC tables).
While it is true there can only be WITH clause at the outermost level of a query in Teradata (and ANSI SQL), the WITH clause can specify multiple derived tables separated by commas and the WITH clause can be used in subqueries of a query.
correction: only be WITH –> only be one WITH
Hi
Thank you for your articles.
I would like to put attention what,
in spite of being legal,
“CREATE VOLATILE TABLE AS [query|table] WITH DATA”
statement is a very bad practice.
Data dictionary table is locked while such query is executed (which could be quite long) and it could lead to many locks or deadlocks in the system (which is bad for performance and concurrency).
It’s also true for regular tables. So I would never recommend having CREATE TABLE AS [query|table] statements in production system (however, it’s quite useful for Development or testing purpose).
For production, our best practices is to always create volatile tables by mention all columns explicitly.
Hope, it will be helpful.
With best regards,
Aleksei Svitin.
Thank you very much for pointing this out.
Hi Aleksey, thank you for your comment here. Could you elaborate on this a little. Is there a different/better way to create volatile tables than “CREATE VOLATILE TABLE AS [query|table] WITH DATA” – if this is bad practice, could you share what the best practice looks like?