Teradata offers multiple methods for external data loading, which can be categorized into two groups.
The bulk load utilities facilitate accelerated data transfer to the target table by circumventing the transient journal. The transactional utilities, on the other hand, use the transient journal, which results in a slower process.
Every type has its own set of pros and cons. You can load your data in either manner based on your specific needs.
Traditionally, individual load utilities offered varying levels of usability and syntax. However, Teradata now offers a universal tool, the Teradata Parallel Transporter (TPT), that integrates the functionality of all these distinct utilities. Despite this, the specialized tools remain available for use.
TPT now boasts many useful features, yet its scripting language remains inconvenient.
The Teradata Utilities for Loading
The utilities for transactional load are BTEQ and TPump, while Fastload and Multiload are known as bulk load utilities as they bypass the transient journal.
This article does not aim to teach syntax for these tools. Refer to the Official Teradata Documentation for syntax information. The purpose of this article is to provide guidance on the appropriate use of each tool.
Fastload and Multiload are highly efficient load utilities because they load data in 64-kilobyte blocks. In contrast, BTEQ and TPump insert each row separately, providing transaction handling with a rollback function.
Bulk load utilities have certain restrictions due to their lack of transaction logs.
- Indexes, triggers, and referential integrity rules must be removed before loading.
- Fastload requires the target table to be empty.
To load set tables, a duplicate row check is necessary. However, duplicate row checks cannot be performed since data is loaded in blocks. Teradata addresses this issue by filtering rows before the database entry.
Multiload is a versatile utility that can perform various tasks beyond loading, including UPDATEs, DELETEs, and UPSERTs. Unlike Fastload, Multiload uses temporary tables, resulting in lower load performance. However, Multiload offers the advantage of not requiring an empty target table.
Initially, BTEQ was Teradata’s primary load utility. It remains useful for loading small data quantities but lacks several key features, such as advanced deadlock handling and the ability to accommodate multiple load sessions. TPUMP, which incorporates these features more effectively, was subsequently developed as a solution.
The required load utility is determined primarily by the number of source rows and the byte consumption per data row.
Fastload and Multiload are recommended for handling large amounts of data, while BTEQ and TPump are better suited for smaller amounts. TPump is particularly useful for active warehousing as it allows for controlled data loading at a specific rate (measured in rows per second).
I use BTEQ for small data loads, TPUMP for moderate loads up to 500,000 records, and Fastload or Multiload for larger loads exceeding 500,000 records. However, it’s important to note that the size of the loaded rows may necessitate bulk load utilities even with smaller numbers of records.
Related Services
🔧 Need Expert Database Administration?
Our team brings 25+ years of enterprise DBA experience across Teradata, Snowflake, and Oracle.
Meet Our Team →📋 Considering a Move From Teradata?
Get a personalized migration roadmap in 2 minutes. We have migrated billions of rows from Teradata to Snowflake, Databricks, and more.
Free Migration Assessment →