Insert Into Teradata

By Roland Wenzlofsky

August 30, 2020


Changing the structure of a large Teradata table can require considerable resources. In principle, two methods are available: we can modify the DDL using ALTER TABLE, or we can create an empty table with the desired DDL and copy the data using the Teradata INSERT INTO statement. Both methods have their advantages and disadvantages.
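The two methods can be sketched as follows, using the common case of adding a new column; table and column names are purely illustrative:

```sql
-- Method 1: change the DDL in place
ALTER TABLE sales_fact ADD discount_pct DECIMAL(5,2);

-- Method 2: create an empty copy, adjust its DDL,
-- and copy the rows with INSERT INTO ... SELECT
CREATE TABLE sales_fact_new AS sales_fact WITH NO DATA;
ALTER TABLE sales_fact_new ADD discount_pct DECIMAL(5,2);

INSERT INTO sales_fact_new
SELECT s.*, CAST(NULL AS DECIMAL(5,2))  -- new column starts out NULL
FROM sales_fact s;
```

With the second method, the original table remains untouched until we decide to swap or drop it.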

To compare both methods, we created a test scenario. The test table had an average data block size of 127 KB, and we filled it with very short rows so that several hundred rows fit into each data block.

Next, we executed both an ALTER TABLE statement and an INSERT INTO statement. The ALTER TABLE method proved more efficient: it caused 133 I/Os, whereas the INSERT INTO strategy caused 322 I/Os. The ALTER TABLE statement thus required 59% less I/O than the INSERT INTO statement. The required CPU seconds were about the same.

Then we changed the data demography of the table and filled it with very wide rows, leaving room for only 1-2 rows per data block. Again, we executed both an ALTER TABLE statement and an INSERT INTO statement.

Here the ALTER TABLE method showed a significant advantage: it needed 539 I/Os, while the INSERT INTO strategy needed 270,853 I/Os. The ALTER TABLE method required an impressive 99.8% fewer I/Os!

We performed many further comparisons in which we varied either the data block size or the row size (or both simultaneously). The result was obvious: the fewer rows per data block, the greater the I/O savings of the ALTER TABLE method.

Below we can see a table showing the results (logical and physical I/Os):

                 Logical I/Os   Physical I/Os
INSERT BIG ROW        270,853          12,992

However, this does not mean that we should always use ALTER TABLE, as some disadvantages need to be weighed up carefully:

ALTER TABLE can’t be aborted. This may lead to unexpected situations, as the following example shows: if a workload management rule kills all sessions consuming more than 100,000 CPU seconds and the ALTER TABLE reaches this limit, the session fails only after the ALTER TABLE has finished.

ALTER TABLE exclusively locks the table, while the Teradata INSERT INTO statement only needs a read lock on the source table. A long-running ALTER TABLE can therefore delay other workloads for a long time. Finally, it might be considered risky to change the original table directly; the INSERT INTO approach works on a copy of the table, so the risk is lower.

ALTER TABLE adds new columns at the end of the table. If we need a column in a specific position, we have to use the INSERT INTO approach.
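To control the column position, the target table must be created with the full desired layout before copying; again, all names and types here are illustrative:

```sql
-- The new column is placed exactly where we want it,
-- something ALTER TABLE ... ADD cannot do:
CREATE TABLE sales_fact_new
( sales_id     INTEGER NOT NULL
, sales_date   DATE
, discount_pct DECIMAL(5,2)    -- new column in its target position
, amount       DECIMAL(18,2)
) PRIMARY INDEX (sales_id);

INSERT INTO sales_fact_new (sales_id, sales_date, discount_pct, amount)
SELECT sales_id, sales_date, NULL, amount
FROM sales_fact;
```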

Should we use Teradata INSERT INTO or ALTER TABLE?

From a performance tuning point of view, ALTER TABLE is undoubtedly the better alternative. If, in a concrete scenario, we can live with the limitations mentioned above, it is the first choice. Otherwise, it is better to consider optimized forms of the Teradata INSERT INTO statement.
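One such optimization, assuming the target table is empty at the time of the load, is to rely on Teradata's fast path for INSERT ... SELECT into an empty table, which avoids per-row transient journal overhead. A minimal sketch (table names are illustrative):

```sql
-- Create an empty copy and fill it in a single INSERT ... SELECT;
-- an empty target allows Teradata to use the fast path.
CREATE TABLE sales_fact_new AS sales_fact WITH NO DATA;

INSERT INTO sales_fact_new
SELECT * FROM sales_fact;

-- Swap the tables once the copy is verified:
RENAME TABLE sales_fact TO sales_fact_old;
RENAME TABLE sales_fact_new TO sales_fact;
```

Remember to collect statistics on the new table afterwards, as they are not carried over automatically by this approach.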


Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he has been building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail, and all the tricks and tips that will help you succeed.

  • Everything said and done, I found the BEST approach is:
    – clone NewTable with data and statistics
    – copy stats from the old to the new table
      Both of these, even for a TB table, are not bad in terms of CPU. But you now have an online backup.
    – ALTER TABLE on the original
    – conditional BTEQ: if the above ran well, drop the clone
    – that way you do not have to deal with reassigning security again on that table
    – the original table will not be available till the DDL is done. But MOST TB tables are accessed through views – re-point them to the clone till the work is done.
    – if the clone was updated, then the same update needs to be synched into the altered table. But really, for a huge fact table mostly used for OLAP, that is an unlikely scenario – still, the clone will confirm this.
    – perm space will momentarily shoot up. Understanding that this is a temp operation, I don’t mind loaning space for a brief time.
    – make sure it does not contend with Arch jobs in any way

    Is there anything else that I could have missed?
