Table cloning is a frequently required task. Most likely, you are familiar with the conventional method of performing it.
– Retrieve the DDL statement of the table, which has to be cloned
– Create a copy of the table structure using this DDL
– Run an INSERT…SELECT into the empty table
Teradata provides an alternative approach with the following statement:
CREATE TABLE <TABLENAME> AS <TABLE_TO_CLONE> WITH [NO] DATA [AND STATISTICS]
We can choose to either create an empty table solely from the original or populate the cloned table from the existing one. Opting for the former entails omitting the final step of the conventional approach.
Replacing traditional cloning tables with the CREATE TABLE AS statement offers numerous benefits.
You can transfer statistics and definitions from the original table to the cloned table. Collecting statistics can be time and resource-intensive. Using the cloning method, resources are not wasted on collecting statistics for the cloned table.
You can migrate the statistical definitions by creating a blank replica of the initial table and excluding the data. Manually collecting statistics for the duplicated table would be a tedious undertaking.
In addition to the aforementioned advantages, the CREATE TABLE AS technique permits the formation and population of duplicate tables from SQL statements. This process permits a broad range of SQL statements, such as joins and aggregations. Despite the ability to execute statistics for SQL-based cloning, I choose not to, as the statistics transfer regulations are rather intricate. Therefore, I prefer to abide by simplicity and avoid the need to comprehend these regulations. The central incentive for adopting this novel cloning method is to maintain simplicity.
Here is an example of cloning based on a SQL statement:
CREATE TABLE <TABLE> AS (SELECT * FROM <TABLE_1>) WITH DATA;
Rules for Cloning of Tables in Teradata and column attributes
When duplicating tables from SQL statements, various regulations for transferring table and column attributes are enforced.
Principally, it suffices to recall the subsequent:
– If you clone a table, all attributes are cloned as well.
If you clone based on a SQL statement, attributes are not cloned.
While there may be exceptions, recalling this statement will likely suffice.
Below are the attributes transferred in table cloning. Green attributes are preserved, while red attributes are lost.
Cloning a SQL statement to transfer attributes, including only the name and data type.
Changing the attributes during cloning
Table and column attributes can be modified during cloning.
If permitted by the SQL syntax, you may modify the column attribute directly in the SQL statement as shown below, wherein the data type attribute is being altered:
CREATE TABLE <TABLENAME> AS
(
SELECT CAST(A AS INTEGER)
FROM <TABLE_1>
) WITH DATA;
Certain attributes are unalterable, as modifying them would yield an erroneous SQL syntax. To address this scenario, the following syntax can be applied (in this instance, the NOT NULL constraint is appended to column A):
CREATE TABLE <TABLENAME> ( A NOT NULL) AS
(
SELECT A
FROM <TABLE_1>
) WITH DATA;
Don’t fall into the primary index trap
To avoid unpleasant performance issues, it is essential to internalize the cardinal rule of cloning.
Defining the primary index is necessary when cloning from a SQL statement. Failure to do so will result in Teradata utilizing the first column of the cloned table as the non-unique primary index, which may cause data skewing and performance issues. It is imperative to remain vigilant of this potential problem.
An example of an omitted primary index definition results in Teradata selecting A as the non-unique primary index for the clone table.
CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> WITH DATA;
It is essential to provide a correct primary index definition, even if it is the initial column in the table.
CREATE TABLE <TABLENAME> AS (SELECT A,B FROM <TABLE_1> PRIMARY INDEX (B) WITH DATA;
In conclusion, it can be deduced that redundancies in writing can be eliminated by utilizing a more suitable vocabulary and refining the structure. Therefore, the revised text will be more concise and effective in conveying the intended message.
Simplify your life by using the CREATE TABLE AS statement. While it offers multiple options, I suggest selecting only those that simplify without requiring you to memorize numerous rules.
– Decide if data should be copied
– Determine if the statistics should be transferred, but use it only if cloning a table (no SQL statement cloning)
– Internalize which attributes are copied in the table and SQL statement cloning. As a rule of thumb: table cloning transfers all attributes, and SQL statement cloning moves  none
– Remember the two possibilities to change attributes during cloning
Check out this guide to FastLoading on Teradata: The Fundamentals of FastLoading on Teradata.
[…] See also: Cloning of Tables in Teradata […]
Hello,
Regarding creating tables from sql statements, is there any downside from locks on system tables (DBC)? Looking at the explain plans I understand that some dbc tables are locked on table level for the whole duration of executing the select statement. Is it like this, and if yes, wouldn’t that prevent other users from creating/modifying tables etc?
Regards
Ilkka
Hello,
Nice post, really detailed. I totally agree to use for Production Env, the time-wasting approach.Extract DDL, insert data, collect Stats.
As you said, manually define the datatype/Nullable or not, this thing, a lot of people they forget it, and TD take as default the first returning row of the query. So, in more complex queries where I make a new table or volatile one, some columns might not be nullable or in a different datatype and when I will use these tables for some other work, I will have an impact on performance.
Ending, just to add one more parameter, that I learned in a bad way, is what if I have Foreign Keys and Triggers defined on my table and I want o make a clone one? Can I use the TD syntax?
Hi Roland
This is a very good post on Table Cloning.
The two images in the post very clearly explain the difference between the two approaches.
One thing which I find missing is about Partition Primary Index.
In the SQL Statement approach, PPI of Source Table is also not transferred.
Regards
Nitin