fbpx

Teradata Stage Table Design

By Roland Wenzlofsky

June 24, 2015


Even though stage tables are only temporary objects, holding intermediate data, a conventional table design can differentiate between a good performing load and a load proceeding at a snail’s pace. The additional efforts are paying off soon.

The correct choice of data types and sizes is essential. Some bytes have to be pushed into Teradata, which determines the load time. Using the suitable loading utility and minimizing the amount of data are primary goals in load optimization.

Multivalue compression and  NoPi tables should be additional tools you should consider.

Below is an example, how a poorly chosen data type can significantly increase your load times. We are loading the same file twice into Teradata (fast load). The only difference is that the first time the textual columns are defined as CHAR, and the second time they are defined as VARCHAR.

The average text length of each column (COl1 to COl10) is a few characters only. The usage of CHAR columns means we are wasting a lot of space (CHAR columns are filled with spaces to the full length).

When using CHAR(1000) as the data type for all columns, the fast load needs more than 10 seconds per 100,000 rows.

CREATE SET TABLE DB.CHAR_TABLE
(
COL1 CHAR(1000),
COL2 CHAR(1000),
COL3 CHAR(1000),
COL4 CHAR(1000),
COL5 CHAR(1000),
COL6 CHAR(1000),
COL7 CHAR(1000),
COL8 CHAR(1000),
COL9 CHAR(1000),
COL10 CHAR(1000)
PRIMARY INDEX (COL1);

**** 19:10:05 Number of recs/msg: 10
**** 19:10:05 Starting to send to RDBMS with record 1
**** 19:10:19 Starting row 100000
**** 19:10:32 Starting row 200000
**** 19:10:49 Starting row 300000
**** 19:10:59 Starting row 400000
**** 19:11:02 Finished sending rows to the RDBMS

If we are changing the data type and are using VARCHAR(1000), the fast load is much faster and only needs about 1 second per 100,000 rows.

CREATE SET TABLE DB.VARCHAR_TABLE
(
COL1 VARCHAR(1000),
COL2 VARCHAR(1000),
COL3 VARCHAR(1000),
COL4 VARCHAR(1000),
COL5 VARCHAR(1000),
COL6 VARCHAR(1000),
COL7 VARCHAR(1000),
COL8 VARCHAR(1000),
COL9 VARCHAR(1000),
COL10 VARCHAR(1000)
)
PRIMARY INDEX (COL1);

**** 19:20:28 Number of recs/msg: 10
**** 19:20:28 Starting to send to RDBMS with record 1
**** 19:20:28 Starting row 100000
**** 19:20:29 Starting row 200000
**** 19:20:29 Starting row 300000
**** 19:20:30 Starting row 400000
**** 19:20:31 Finished sending rows to the RDBMS

Only by changing the data type, the load speed was increased by a factor of 10.

Proper table design becomes especially important when we are dealing with wide rows. Wide rows cause only a few rows to fit into each data block, and as IOs are costly, load times increase significantly.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>