Cloning of Tables In Teradata

Roland Wenzlofsky

March 27, 2014

minutes reading time


Cloning a table is a very often required task. Probably you all know the traditional way of doing this:

–       Retrieve the DDL statement of the table, which has to be cloned
–       Create a copy of the table structure using this DDL
–       Run an INSERT…SELECT into the empty table

Teradata offers an alternative method with the statement:

CREATE TABLE <TABLENAME> AS <TABLE_TO_CLONE> WITH [NO] DATA [AND STATISTICS]

We can decide if only an empty table should be created based on the original one or if the cloned table should be populated from the existing table. The first option means to leave out the last step of the traditional method outlined above.

There are several benefits to replacing the traditional cloning tables with the CREATE TABLE AS statement.

First, you can transfer the statistics or the definitions from the original table to the cloned table. As you might know, collecting statistics can be time-consuming and resource-consuming. With the new method of cloning a table, no resources are wasted on recollecting statistics on the cloned table.

If you create an empty copy of the original table, not copying the data, it is possible to transfer the statistics definitions. You can imagine what an annoying task it would be to collect the statistics manually on the cloned table.

Besides the benefits mentioned above, the CREATE TABLE AS method allows for creating and populating clone tables from SQL statements. Almost every SQL statement is allowed, including joins, aggregations, etc. Although statistics can be taken over even for SQL-based cloning, I do not exploit this possibility. Why? Because rules for statistics transferral are pretty complex, and as I like to keep things simple, it is not worth internalizing all these rules. Keeping things simple is the primary motivation for using this new clone method.

Here is an example of cloning based on a SQL statement:

CREATE TABLE <TABLE> AS (SELECT * FROM <TABLE_1>) WITH DATA;

Rules for Cloning of Tables in Teradata and column attributes

When cloning tables from SQL statements, several rules apply regarding table and column attributes transfer.

In principle, it is enough to remember the following:

– If you clone a table, all attributes are cloned as well.
If you clone based on a SQL statement, attributes are not cloned.

Although there are exceptions to this rule, remembering this statement will probably be sufficient.

See below which attributes are transferred in table cloning. Green attributes are transferred, and red ones get lost.

Clone Tables

Cloning from a SQL statement and moved attributes (as you can see, just the name and the data type)

Clone with Query

Changing the attributes during cloning

As mentioned before, you can change table and column attributes during the cloning process.

In the case of SQL syntax allows it, you can directly modify the column attribute in the SQL statement like below (we are changing the data type attribute):

CREATE TABLE <TABLENAME> AS
(
SELECT CAST(A AS INTEGER)
FROM <TABLE_1>
) WITH DATA;

Many attributes can’t be changed because this would generate an invalid SQL syntax. In this case, we can use the syntax below (we are adding the NOT NULl constraint to column A):

CREATE TABLE <TABLENAME>  ( A NOT NULL) AS
(
SELECT A
FROM <TABLE_1>
) WITH DATA;

Don’t fall into the primary index trap

There is one rule in cloning that you have to internalize, or you will probably end up sooner or later searching for nasty performance problems:

If you clone from a SQL statement, you always have to define the primary index. Otherwise, Teradata will take the first column of the cloned table and use it as the table’s non-unique primary index. Using the first column could lead to skewed data and performance problems. Always keep an eye on this!

Here is an example where the primary index definition is omitted. Teradata will choose A to be the non-unique primary index of the clone table:

CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> WITH DATA;

You should always give a proper primary index definition, even if it is the first table column:

CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> PRIMARY INDEX (B) WITH DATA;

Conclusion:

The CREATE TABLE AS statement can make your life easier. Although packed with many options, I recommend using only the ones making your life easier without forcing you to internalize dozens of rules.

– Decide if data should be copied
– Determine if the statistics should be transferred, but use it only if cloning a table (no SQL statement cloning)
– Internalize which attributes are copied in case of table and SQL statement cloning. As a rule of thumb: table cloning transfers all attributes, SQL statement cloning moves  none
– Remember the two possibilities to change attributes during cloning

See also:
The Fundamentals of FastLoading on Teradata

  • Avatar
    Ilkka Kuivalainen says:

    Hello,

    Regarding creating tables from sql statements, is there any downside from locks on system tables (DBC)? Looking at the explain plans I understand that some dbc tables are locked on table level for the whole duration of executing the select statement. Is it like this, and if yes, wouldn’t that prevent other users from creating/modifying tables etc?

    Regards
    Ilkka

  • Avatar
    Dimitrios says:

    Hello,

    Nice post, really detailed. I totally agree to use for Production Env, the time-wasting approach.Extract DDL, insert data, collect Stats.

    As you said, manually define the datatype/Nullable or not, this thing, a lot of people they forget it, and TD take as default the first returning row of the query. So, in more complex queries where I make a new table or volatile one, some columns might not be nullable or in a different datatype and when I will use these tables for some other work, I will have an impact on performance.

    Ending, just to add one more parameter, that I learned in a bad way, is what if I have Foreign Keys and Triggers defined on my table and I want o make a clone one? Can I use the TD syntax?

  • Avatar
    Nitin Srivastava says:

    Hi Roland

    This is a very good post on Table Cloning.
    The two images in the post very clearly explain the difference between the two approaches.
    One thing which I find missing is about Partition Primary Index.
    In the SQL Statement approach, PPI of Source Table is also not transferred.

    Regards
    Nitin

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >