Cloning of Tables in Teradata
Cloning a table is a very often required task. Probably you all know the traditional way of doing this:
- Retrieve the DDL statement of the table which has to be cloned
- Create a copy of the table structure using this DDL
- Run an INSERT…SELECT into the empty table
Teradata offers an alternative method with the statement:
CREATE TABLE <TABLENAME> AS <TABLE_TO_CLONE> WITH [NO] DATA [AND STATISTICS]
You can decide if only an empty table should be created, based on the original one, or if the cloned table should be populated from the existing table additionally. The first option means to leave out the last step of the traditional method outlined above.
There are several benefits from replacing the traditional method of cloning tables with the CREATE TABLE AS statement.
First of all, you can transfer the statistics or the statistics definitions from the original table to the cloned table. As you might know, the collection of statistics can be a very time and resource consuming task. With the new method of cloning a table, no resources are wasted for the recollection of statistics on the cloned table.
If you create an empty copy of the original table, not copying the data, it is possible to transfer the statistics definitions. You can imagine what an annoying task it would be to collect the statistics manually on the cloned table.
Beside benefits as mentioned above, the CREATE TABLE AS method allows to create and populate clone tables from SQL statements. Almost every kind of SQL statement is allowed, including joins, aggregations and so on. Although statistics can be taken over even for SQL-based cloning, personally I do not exploit this possibility. Why? Because rules for statistics transferral are quite complex and as I like to keep things simple, it is not worth to internalize all these rules. Keeping things simple is the primary motivation for to using this new clone method, right?
Here an example for cloning bases on a SQL statement:
CREATE TABLE <TABLE> AS (SELECT * FROM <TABLE_1>) WITH DATA;
Rules for Cloning of Tables in Teradata and column attributes
When cloning tables or from SQL statements, there are several rules applied regarding the transfer of table and column attributes.
In principle it is enough to remember the following:
- If you clone a table, all attributes are cloned as well.
- In case you clone based on a SQL statement, attributes are not cloned.
Although there are exceptions to this rule, just remembering this statement will probably be sufficient most of the time.
See below which attributes are transferred in table cloning. Green attributes are transferred, red ones get lost.
Cloning from a SQL statement and moved attributes (as you can see, just the name and the data type)
Changing the attributes during cloning
As mentioned before, you can change table and column attributes during the cloning process.
In the case of SQL syntax allows it, you can directly modify the column attribute in the SQL statement like below (we are changing the data type attribute):
CREATE TABLE <TABLENAME> AS
SELECT CAST(A AS INTEGER)
) WITH DATA;
Many attributes cannot be changed because this would generate an invalid SQL syntax. In this case, we can use the syntax below (we are adding the NOT NULl constraint to column A):
CREATE TABLE <TABLENAME> ( A NOT NULL) AS
) WITH DATA;
Don't fall into the primary index trap
There is one rule in cloning which you have to internalize, or you will probably end up sooner or later searching for nasty performance problems:
In case you clone from a SQL statement you always have to define the primary index. Otherwise, Teradata will just take the first column of the cloned table and use it as the table's nonunique primary index. Using the first column could lead to skewed data and performance problems. Always keep an eye on this!
Here is an example where the primary index definition is omitted. Teradata will choose A to be the non-unique primary index of the clone table:
CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> WITH DATA;
You should always give a proper primary index definition, even if it is the first table column:
CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> PRIMARY INDEX (B) WITH DATA;
The CREATE TABLE AS statement can make your life easier. Although packed with many options, I recommend using only the ones making your life easier without forcing you to internalize dozens of rules.
- Decide if data should be copied
- Determine if the statistics should be transferred, but use it only if cloning a table (no SQL statement cloning)
- Internalize which attributes are copied in case of table and SQL statement cloning. As a rule of thumb: table cloning transfers all attributes, SQL statement cloning moves none
- Remember the two possibilities to change attributes during cloning
This is very good post on Table Cloning.
The two images in the post very clearly explain the difference between the two approaches.
One thing which I find missing is about Partition Primary Index.
In the SQL Statement approach , PPI of Source Table is also not transferred.
Nice post , really detailed. I totally agree to use for Production Env , the time - wasting approach.Extract DDL , insert data , collect Stats.
As you said , manually define the datatype/Nullable or not , this thing , a lot of people they forget it , and TD take as default the first returning row of the query . So , in more complex queries where i make a new table or volatile one , some columns might not be nullable or in a different dataype and when i will use these tables for some other work , i will might have impact on performance .
Ending , just to add one more parameter , that i learned on a bad way , is what if i have Foreign Keys and Triggers defined on my table and i want o make a clone one?Can i use the TD syntax?
Regarding creating tables from sql statements, is there any downside from locks on system tables (DBC)? Looking at the explain plans I understand that some dbc tables are locked on table level for the whole duration of executing the select statement. Is it like this, and if yes, wouldn't that prevent other users from creating/modifying tables etc?