Before Teradata changes a data row (update, insert or delete statement), it takes a backup copy of the entire row.
Each AMP manages its local transient journal. Backup rows are always kept together with the base table row on the same AMP.
The decision about which row belongs to each AMP is built on the hashing algorithm Teradata used to distribute rows evenly.
The only task of the transient journal is to allow Teradata to roll back failed transactions. In a rollback case, the base table row can be quickly replaced with the backup row from the transient journal.
The transient journal cannot be turned off manually.
This is reasonable because database integrity has to be ensured. But sometimes we want to avoid the usage of it for performance reasons.
Especially when doing many changes on a table (such as updating billions of rows), the transient journal can become huge. Because it’s stored in the DBC database, we could run out of space. Running out of space would have significant implications on the existing workload (failing sessions, etc.)
Another nasty side effect of the transient journal shows up when updating many rows in an enormous and skewed table. The rollback usually consumes many resources and might hurt the overall system performance.
Remember, the transient journal is AMP-local: For any rollback executed on a skewed table, the AMPs holding most of the rows will have to do most of the work. Of course, the above-described issues also apply to DELETE statements.
Luckily, there are several techniques to avoid using the transient journal in such a situation described above.
As we said: The only reason for the system to write into the transient journal is to be able to rollback failed (or manually aborted) transactions.
But some transactions can be rolled back quickly without keeping a backup copy of rows and should be preferred. Here are some ideas:
INSERT into an empty table.
Here is a simple example of how the transient journal can be skipped by slightly changing our query.
INSERT INTO target SELECT col FROM table1;
INSERT INTO target SELECT col FROM table2;
In the above example, the first insert will occur without the transient journal, but the second insert uses it. We can easily rewrite our query, to avoid the usage of the transient journal altogether:
INSERT INTO target SELECT col FROM table1 UNION ALL SELECT col FROM table2;
Replace UPDATE with INSERT INTO … SELECT into a copy of the target table
If you have to deal with updates on huge and skewed tables, you should avoid direct updates:
UPDATE target SET col2 = 1;
Instead, you should use:
INSERT INTO copy_of_target SELECT col1, 1 AS col2 FROM target;
DROP TABLE target;
RENAME TABLE copy_of_target TO target;
A BTEQ trick
BTEQ allows you to avoid the transient journal if you put the semicolon for multiple INSERT statements into the same table at the begin of each line:
INSERT INTO target SELECT * FROM table1 — The semicolon starts on the next line!
; INSERT INTO target SELECT * FROM table2 — The semicolon has to be the first character!