fbpx

The Teradata Utilities

By Roland Wenzlofsky

June 3, 2014


Teradata has several ways of external loading data. The Teradata load utilities can be divided into two groups:

The first group allows us to bring the data as fast as possible into the target table by bypassing the transient journal. The second group uses the transient journal but is slower.

Each group has its advantages and disadvantages. Depending on the requirements, you may choose to load your data either way.

Historically, each load utility has been a separate tool offering its level of usability and syntax. Still, now Teradata provides a standard tool, which is thought to accumulate all different instruments’ functionality: the Teradata Parallel Transporter (TPT). Nevertheless, all specialized tools still exist and can be used.

Many useful new features have been added into TPT, but the new scripting language is inconvenient.

The Teradata Utilities for Loading

The transactional load utilities are called BTEQ and TPump. The ones bypassing the transient journal are the Fastload and the Multiload. They are called bulk load utilities.

This article is not thought to teach you the syntax of these tools. Please take a look at the  Offical Teradata Documentation if you are interested in the syntax. This article aims to give you advice about how and when each tool should be used.

Fastload and Multiload are the most efficient load utilities, as the data is loaded in blocks of 64 kilobytes of data. BTEQ and TPump insert each row separately, using transaction handling (offering rollback functionality).

As no transaction log is used, the following restrictions exist for bulk load utilities:

  • Indexes, triggers, and referential integrity rules have to be removed before loading.
  • Fastload requires the target table to be empty.

The set tables’ loading requires a duplicate row check, but as the data is loaded in blocks, no duplicate row checks can be done. Teradata solves this problem by filtering rows before they reach the database.

Multiload, the second bulk load utility,  is not restricted to loading but can do additional tasks, such as UPDATEs, DELETEs, and UPSERTs. It uses temporary tables and therefore does not reach the load performance of a Fastload. The advantage over Fastloads is that the target table does not have to be empty.

BTEQ was the first Teradata load utility available, and it’s still very useful for loading small amounts of data. Unfortunately, it’s missing features like multiple load sessions and advanced deadlock handling.  Therefore the TPUMP utility was introduced, which implements these features in a better way.

The decision about the required load utility is mainly based on the number of source rows and the number of bytes each data row is consuming.

In principle, the bulk loading tools Fastload and Multiload should be used for the huge amount of data, while BTEQ and TPump are intended to be used for smaller data quantities. TPump offers the possibility to load the data at a particular rate (rows per second), making it an appropriate tool for active warehousing.

Personally, I use BTEQ for loading up to a few thousand records, TPUMP for loading up to 500.000 records, and Fastloads or Multiloads for more than 500.000 records. These numbers above are just a rule of thumb, as full rows may force you to use bulk load utilities already with a smaller number of rows.

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
Buy the Book Teradata Query Performance Tuning

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >