fbpx

Teradata TPump

By Roland Wenzlofsky

September 12, 2020


What is Teradata TPump?

TPump stands for Teradata Parallel Data Pump.

Teradata TPump does not move data in large blocks like Fastload or Multiload. Instead, it loads one row at a time, using row hash locks. Using row hash locks allows TPump to do many concurrent INSERTs and UPDATEs on a table.

Teradata TPump is not designed to bulk load huge amounts of data at once (and as fast as possible) but allows a trickle-feed of data to flow into the database.

The Teradata TPump Features

Real-time loading: Transactional systems need to get the data as soon as possible into the data warehouse. TPump allows performing near real-time updates from source systems into Teradata.

Throttling: When loading with TPump, you can define how many updates should be done per minute. This is called the statement rate.

The statement rate can be even changed during the execution of a TPump load job. You might decide to speed up the load during the nightly batch window and slow it down (decreasing resource consumption) during the day when business users are running their reports.

DML Functionality: TPump can do DML functions (like Multiload). This includes INSERT, UPDATE, and DELETE statements.

Advantages over Bulk Loading with Fastload or Multiload: TPump allows both USI and NUSI. While Fastload requires that the target table is empty, TPump can load, such as the Multiload, into a populated table.

Unlike Fastload and Multiload, row duplicates are allowed. Furthermore, you don’t have to drop triggers or referential integrity before loading.

The most important TPump Limitations

  • No concatenation of input data files is allowed.
  • TPump can’t handle aggregates, arithmetic functions, or exponentiation.
  • The use of the SELECT is not allowed.
  • No more than four IMPORT commands may be used in a single load task. This means that the most four files can be directly read in a single run.
  • TPump performance will decrease if access logging is enabled on the target objects.

Monitoring TPump

Teradata TPump comes with a tool called the TPump Monitor. This tool allows you to check the status of TPump jobs in real-time and change the statement rate of running jobs.

You can start the monitor under LINUX with the following command:

tpumpmon [-h] [TDPID/],[,]

TPump Error Handling

TPump uses one error table per target table. The error tables store a copy of each row, which caused the error.

These are some common errors showing up in TPump loads:

2816: Failed to insert a duplicate row into the TPump Target Table.

The error happened when TPump discovered a duplicate row. Only the first row will be inserted; duplicates will be discarded.

2817: Activity count greater than one for TPump UPDATE/DELETE.

This error happens when there are extra rows when TPump is attempting an UPDATE or DELETE.

2818: Activity count zero for TPump UPDATE or DELETE.

This error shows that the UPDATE or DELETE did not occur.

TPump Restartability

TPump is fully restartable as long as the log table, and error tables are not dropped.

TPump Example Script

.logtable TheDatabase.tpumplog;
.logon DWHPRO,******;
database TheDatabase;
.name test;
.begin load errortable ET_CUST
sleep 5
checkpoint 5
sessions 8
errlimit 4
pack 4

tenacity 4
serialize on;

specify a PRIMAY KEY in the .FILED command */
.layout mylayout;
.field CUSTID * varchar(5) key;
.field CUSTNAME * varchar(30);
.dml label INST;
insert into CUSTOMER
(CUSTID,CUSTOMER_NAME
)
values
(:CUSTID,
:CUSTNAME
);
.import infile CUSTDATA.txt
format vartext ‘,’
layout mylayout
apply INST;
.end load;
.logoff;

The Fundamentals of FastLoading on Teradata

__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"var(--tcb-skin-color-0)"}},"gradients":[]},"original":{"colors":{"62516":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Previous Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"b4fbe":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"b4fbe":{"val":"rgb(241, 99, 52)"}},"gradients":[]},"original":{"colors":{"b4fbe":{"val":"rgb(19, 114, 211)","hsl":{"h":210,"s":0.83,"l":0.45}}},"gradients":[]}}]}__CONFIG_colors_palette__
Next Article
__CONFIG_colors_palette__{"active_palette":0,"config":{"colors":{"62516":{"name":"Main Accent","parent":-1}},"gradients":[]},"palettes":[{"name":"Default Palette","value":{"colors":{"62516":{"val":"rgb(255, 0, 0)"}},"gradients":[]}}]}__CONFIG_colors_palette__
GET OUR TERADATA BOOK

Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata Consultant & Performance Trainer. Born in Austria's capital Vienna, he is building and tuning some of the largest Teradata Data Warehouses in the European financial and telecommunication sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. Therefore, he knows like no other the problems and obstacles that make many data warehouse projects fail and all the tricks and tips that will help you succeed.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

You might also like

>