Improve your Teradata Load Performance

Teradata allows only a limited number of bulk loads (FastLoad, MultiLoad) to run concurrently, as they consume a lot of resources. The precise limit depends on your Teradata system, but it is typically in the low two-digit range.

Whenever the utility limit is reached, additional requests end up in a workload queue. On a heavily loaded system, this leads to unstable load behavior: run times for the same load can vary considerably.

Depending on the number and size of the rows, TPump or BTEQ may be a better choice. You should consider these transactional load types for a small number of rows (a few thousand rows only). This way, you avoid wasting utility load slots.
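For illustration, here is a minimal BTEQ sketch of such a transactional load. The logon string and the table names (MyDb.MyTable2 and the staging table MyDb.MyTable2_Stg) are made-up placeholders; a plain INSERT ... SELECT runs as a normal SQL transaction and does not occupy a utility slot:

.LOGON mysystem/loaduser,loadpassword;

/* A plain SQL INSERT runs as a normal transaction
   and does not consume a FastLoad/MultiLoad slot */
INSERT INTO MyDb.MyTable2
SELECT * FROM MyDb.MyTable2_Stg;

.IF ERRORCODE <> 0 THEN .QUIT 1;

.LOGOFF;
.QUIT 0;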

A good idea is to analyze your existing "loading landscape" for opportunities to improve performance. The query log will help you detect FastLoads and MultiLoads that you should switch to BTEQ or TPump.

The SQL below will do the job for you:

SELECT TheTable, UtilityRowCount, StartTime, FirstRespTime, RunTime
FROM
(
   SELECT
      -- QueryText starts with 'BEGIN LOADING '; the table name follows at position 15
      SUBSTR(t01.QueryText,15) AS TABLE_START,
      SUBSTR(TABLE_START,1,MINDEX(TABLE_START,' ')) AS TheTable,
      t02.UtilityRowCount,
      t02.TotalIOCount,
      t01.StartTime,
      t02.FirstRespTime,
      (t02.FirstRespTime - t01.StartTime DAY(2) TO SECOND(6)) AS RunTime,
      (60*60*EXTRACT(HOUR FROM RunTime)
       + 60*EXTRACT(MINUTE FROM RunTime)
       + EXTRACT(SECOND FROM RunTime)) (DECIMAL(18,2)) AS RunTime_Secs
   FROM
      DBC.DBQLogTbl t01
   INNER JOIN
      DBC.DBQLogTbl t02
   ON
      t01.SessionID = t02.SessionID
   WHERE
      t01.StartTime(DATE) BETWEEN DATE-21 AND DATE
      AND t01.StatementType = 'BEGIN Loading'
      AND t02.StartTime(DATE) BETWEEN DATE-21 AND DATE
      AND t02.StatementType = 'END Loading'
) x;
The result set shows all FastLoads (and MultiLoads) together with the number of rows that have been bulk loaded into each target table (I restricted the result set to the last 21 days).

This query helps you detect the bulk loads that you should change to transactional loads (BTEQ, TPump). In the sample output below, "MyTable2" is such a candidate.
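If your loading landscape is large, you can also let the query do the pre-filtering. For example, appending the following before the final semicolon (after ") x") keeps only the small loads; the 10,000-row cutoff is an arbitrary value you should adapt to your system:

WHERE UtilityRowCount < 10000
ORDER BY UtilityRowCount;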

One hint: Make sure that filtering on CPU seconds is turned off for the query log; otherwise, most "BEGIN Loading" statements will be missing, as they consume only small amounts of CPU and will not be recorded in DBC.DBQLogTbl.
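To verify this, you can inspect the active DBQL rules and, if necessary, replace them with unfiltered logging. A sketch only: on older releases the view is DBC.DBQLRules, and your site standards may require per-user rules instead of ON ALL:

/* Show the query logging rules currently in effect */
SELECT * FROM DBC.DBQLRulesV;

/* Replace the existing rule with unfiltered logging (no CPU threshold) */
REPLACE QUERY LOGGING ON ALL;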

 

TheTable   UtilityRowCount   StartTime             FirstRespTime         RunTime
MyTable    4.352.814,00      16.07.2015 00:25:24   16.07.2015 00:25:59   0 00:00:35.450000
MyTable2   100,00            16.07.2015 00:04:25   16.07.2015 04:25:30   0 00:00:05.450000
Written by Roland Wenzlofsky on July 30, 2015

4 COMMENTS

  1. Hi,

    I am not sure whether delays caused by missing utility slots show up in these two columns. Maybe somebody else can comment on this?

    Usually, I pick up the session of my FastLoad from the DBC.LogonOff view. I take the column LogonTime of this view and calculate the difference between the column StartTime of DBC.DBQLogTbl and the column LogonTime of DBC.LogonOff. The difference between the logon and the first activity in the query log seems to be a good indication of how long the FastLoad was waiting for a utility slot.
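    A minimal sketch of that calculation (a sketch only: I am quoting the DBC.LogOnOffV column names from memory, and on your release LogonTime may need to be combined with LogonDate into a timestamp first):

    SELECT
       q.SessionID,
       l.LogonTime,
       q.StartTime,
       -- wait between session logon and first recorded activity
       (q.StartTime - l.LogonTime DAY(2) TO SECOND(6)) AS SlotWaitTime
    FROM DBC.DBQLogTbl q
    INNER JOIN DBC.LogOnOffV l
       ON q.SessionID = l.SessionNo
    WHERE q.StatementType = 'BEGIN Loading'
      AND l.Event = 'Logon';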

    I just checked the documentation, and actually you are right: the delays should be visible in both columns you mentioned. WDDelay shows the workload delays, and Delay should sum up all delays that happened (system- and workload-specific).
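    So checking should be as simple as something like this (DelayTime and WDDelayTime being the corresponding DBQLogTbl column names on recent releases):

    SELECT StartTime, DelayTime, WDDelayTime
    FROM DBC.DBQLogTbl
    WHERE StatementType = 'BEGIN Loading'
      AND (DelayTime > 0 OR WDDelayTime > 0);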

    My approach from above may be wrong, but I am not sure 😉

    Hope this helps.

    Roland

  2. Hello Roland, thank you for this article.

    Using the SQL mentioned above, I found more than 40 tables in our DWH system; however, I also found that there are no utility loads with DelayTime > 0 or WDDelayTime > 0.
    So can I conclude that I do not need to convert those 40 FastLoad/MultiLoad jobs to BTEQ/TPT loads?
    Please advise.
