The Fundamentals of FastLoading on Teradata
What is a Teradata FastLoad?
FastLoad is known for its speed in loading massive amounts of data from flat files into a table.
The FastLoad utility achieves this by skipping the transient journal, which requires the target table to be empty. If the target table already contains data, you need to use MultiLoad instead. It can handle populated tables but is slower than FastLoad.
How FastLoad works
Apart from skipping the transient journal, several other optimizations make FastLoad perform so well.
The FastLoad utility assembles data into 64KB blocks before sending it to the database and can use multiple sessions at the same time, taking advantage of the parallel architecture.
Block-level loading is a huge advantage over BTEQ and TPump, which load data row by row and use the transient journal.
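To make the difference between block-level and row-by-row loading concrete, here is a minimal Python sketch. It is not FastLoad code and not how the utility is implemented; it only simulates grouping records into blocks of at most 64 KB and counts how many transfers each approach would need. The function name `pack_into_blocks` and the 100-byte record size are assumptions made for illustration.

```python
BLOCK_SIZE = 64 * 1024  # 64 KB, the block size mentioned above


def pack_into_blocks(records, block_size=BLOCK_SIZE):
    """Group encoded records into blocks no larger than block_size bytes."""
    blocks = []
    current = []
    current_size = 0
    for record in records:
        if len(record) > block_size:
            raise ValueError("record is larger than the block size")
        # Start a new block when the next record would overflow this one.
        if current and current_size + len(record) > block_size:
            blocks.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    if current:
        blocks.append(current)
    return blocks


# Hypothetical input: 10,000 records of 100 bytes each.
records = [b"x" * 100 for _ in range(10_000)]
blocks = pack_into_blocks(records)

print("row by row: ", len(records), "transfers")
print("block level:", len(blocks), "transfers")
```

With these assumed numbers, 655 records fit into one block, so the 10,000 rows travel in 16 transfers instead of 10,000.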
On smaller Teradata systems, FastLoad will use one session per AMP to maximize parallel processing. Systems with a very large number of AMPs usually limit the number of parallel FastLoad sessions.
FastLoad comes with several limitations that we have to consider in our load design.
The FastLoad Phases
FastLoad consists of two phases. Incoming data is evenly distributed across the AMPs.
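The even distribution across the AMPs can be pictured with a small Python simulation. This is only a sketch under stated assumptions: it assumes that a row is assigned to an AMP by hashing its primary index value, and it uses MD5 as a stand-in because Teradata's own hashing algorithm and hash map are not reproduced here. The function `amp_for` and the constant `NUM_AMPS` are hypothetical names.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # assumed system size, for illustration only


def amp_for(primary_index_value, num_amps):
    """Map a primary index value to an AMP number (stand-in hash, not Teradata's)."""
    digest = hashlib.md5(str(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


# Hypothetical load: 100,000 rows with unique primary index values.
rows_per_amp = Counter(amp_for(key, NUM_AMPS) for key in range(100_000))

for amp in range(NUM_AMPS):
    print("AMP", amp, "received", rows_per_amp[amp], "rows")
```

In this simulation, unique key values spread almost evenly over the AMPs. If many rows shared the same key value, they would all map to the same AMP and the distribution would become skewed.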
The Acquisition Phase
The Application Phase
Restartability of FastLoads
We may need to restart a FastLoad for several reasons: the Teradata system may be restarted in the middle of the load, or the file system containing the data being loaded may go down. A restarted FastLoad will continue where it was interrupted.
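The idea of continuing where the load was interrupted can be sketched in Python as follows. This is a simplified simulation, not FastLoad's actual mechanism: it assumes progress is recorded at periodic checkpoints and that rows written after the last checkpoint are discarded and sent again on restart. The function `load`, the `state` dictionary, and the `fail_at` parameter are hypothetical.

```python
def load(rows, target, state, checkpoint_every=1000, fail_at=None):
    """Append rows to target, recording progress in state['checkpoint']."""
    start = state.get("checkpoint", 0)
    # Discard anything written after the last recorded checkpoint.
    del target[start:]
    for i in range(start, len(rows)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated interruption")
        target.append(rows[i])
        if (i + 1) % checkpoint_every == 0:
            state["checkpoint"] = i + 1
    state["checkpoint"] = len(rows)


rows = list(range(5000))
target, state = [], {}

# First attempt: interrupted before row 2500 is written.
try:
    load(rows, target, state, fail_at=2500)
except RuntimeError:
    pass
resumed_from = state["checkpoint"]

# Restart: picks up at the last checkpoint instead of starting over.
load(rows, target, state)
print("resumed from row", resumed_from, "and loaded", len(target), "rows")
```

In this sketch the restart resumes at row 2000, the last checkpoint before the interruption, so only 500 rows have to be sent twice.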
FastLoads can't be restarted in the following situations:
FastLoads can be restarted in the following situations: