
The Teradata Transient Journal

By Roland Wenzlofsky

June 28, 2016


Before Teradata changes a data row with an UPDATE or DELETE statement, it takes a backup copy of the entire row. For an INSERT, it is enough to record the row id of the new row.

Each AMP manages its own local transient journal. Journal rows are always kept on the same AMP as the corresponding base table row.

Which AMP a row belongs to is determined by the hashing algorithm Teradata uses to distribute rows evenly across all AMPs.
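
You can see this mapping with Teradata's built-in hash functions. Here is a minimal sketch; the table customers and its primary index column customer_id are made-up names for illustration:

-- Which AMP owns a given primary index value?
SELECT customer_id,
       HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS owning_amp
FROM customers
SAMPLE 5;

-- Rows per AMP: an easy way to spot a skewed table
SELECT HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
       COUNT(*) AS row_cnt
FROM customers
GROUP BY 1
ORDER BY row_cnt DESC;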

The only task of the transient journal is to allow Teradata to roll back failed transactions. In a rollback case, the base table row can be quickly replaced with the backup row from the transient journal.

The transient journal cannot be turned off manually.

This is reasonable, as database integrity has to be ensured. Sometimes, however, we want to avoid its usage for performance reasons.

Especially when doing many changes on a table (such as updating billions of rows), the transient journal can become huge. Because it is stored in the DBC database, we could run out of space there. Running out of space has significant implications for the existing workload (failing sessions, etc.).

Another nasty side effect of the transient journal shows up when updating many rows in an enormous and skewed table. The rollback usually consumes a lot of resources and can hurt overall system performance.

Remember, the transient journal is AMP-local: for any rollback executed on a skewed table, the AMPs holding most of the rows also have to do most of the work. Of course, the issues described above equally apply to DELETE statements.
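
To keep an eye on both problems, you can check the journal's current size and its distribution across the AMPs in the data dictionary. A small sketch, assuming the standard DBC.TableSizeV view, where the transient journal shows up as table TransientJournal in database DBC:

-- Transient journal space per AMP (skewed usage shows up at the top)
SELECT Vproc AS amp_no,
       CAST(CurrentPerm / (1024*1024*1024) AS DECIMAL(18,2)) AS current_gb
FROM DBC.TableSizeV
WHERE DatabaseName = 'DBC'
  AND TableName = 'TransientJournal'
ORDER BY current_gb DESC;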

Luckily, there are several techniques to avoid using the transient journal in the situations described above.

As we said: the only reason for the system to write to the transient journal is to be able to roll back failed (or manually aborted) transactions.

But some kinds of statements can be rolled back quickly without keeping backup copies of rows, and these should be preferred. Here are some ideas:

INSERT into an empty table

An INSERT into an empty table does not need row backups: to roll it back, Teradata only has to remove all rows again.

Here is a simple example of how the transient journal can be skipped by slightly changing our queries:

INSERT INTO target SELECT col FROM table1;
INSERT INTO target SELECT col FROM table2;

In the above example, the first INSERT occurs without the transient journal, but the second one uses it because the target table is no longer empty. We can easily rewrite our queries to avoid the transient journal altogether:

INSERT INTO target SELECT col FROM table1 UNION ALL SELECT col FROM table2;

Replace UPDATE with INSERT INTO … SELECT into a copy of the target table

If you have to deal with updates on huge and skewed tables, you should avoid direct updates:

UPDATE target SET col2 = 1;

Instead, insert the updated rows into an empty copy of the table and swap the tables afterward:

CREATE TABLE copy_of_target AS target WITH NO DATA;  -- empty copy of the target table
INSERT INTO copy_of_target SELECT col1, 1 AS col2 FROM target;  -- fast path: the INSERT target is empty
DROP TABLE target;
RENAME TABLE copy_of_target TO target;
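
The same swap technique helps with large DELETE statements: instead of deleting the unwanted rows, copy only the rows you want to keep into the empty table. A sketch with a hypothetical filter column delete_flag:

CREATE TABLE copy_of_target AS target WITH NO DATA;  -- empty copy, same structure

-- Fast path: the INSERT target is empty, so no row backups are needed
INSERT INTO copy_of_target
SELECT *
FROM target
WHERE delete_flag <> 'Y';  -- hypothetical condition: rows we want to keep

DROP TABLE target;
RENAME TABLE copy_of_target TO target;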

A BTEQ trick

BTEQ allows you to avoid the transient journal for multiple INSERT statements into the same (initially empty) table: put the semicolon of each statement at the beginning of the next line. BTEQ then bundles all statements into one multi-statement request, which Teradata executes as a single transaction; rolling it back only means emptying the table again, so no row backups are needed. The final semicolon on a line of its own closes the request:

INSERT INTO target SELECT * FROM table1   -- no semicolon at the end of this line!
; INSERT INTO target SELECT * FROM table2 -- the semicolon has to be the first character!
;


Roland Wenzlofsky

Roland Wenzlofsky is an experienced freelance Teradata consultant and performance trainer. Born in Vienna, Austria's capital, he has been building and tuning some of the largest Teradata data warehouses in the European financial and telecommunications sectors for more than 20 years. He has played all the roles of developer, designer, business analyst, and project manager. He therefore knows like no other the problems and obstacles that make many data warehouse projects fail, and all the tricks and tips that will help you succeed.

  • Smita BHAGAT says:

    How do I update a huge table when, out of 10 billion records, I only want to update 1 million records?

  • Ashok says:

    Valuable update on TJ….

    Can we manually purge the TJ? There is little to no activity, but it is consuming more than 2.5 TB of space.

    Regards,
    Ashok

  • Johannes Vink says:

    Hi Roland,

    I suspect that a MERGE statement also causes less transient journal activity, or is at least more efficient. The orange book seems to hint at that, but only compares the MultiLoad apply and ANSI MERGE.

    Regards, Johannes

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}

    You might also like

    >