What are Deadlocks in Teradata?
Deadlocks occur when each of two transactions holds the lock on a database object the other transaction needs next.
Here is an example:
Transaction 1 locks some rows in table 1, and transaction 2 locks some rows in table 2. The next step of transaction 1 is to lock rows in table 2, and transaction 2 needs to lock rows in table 1.
We just described a deadlock situation. Without deadlock handling, both transactions would wait forever or until Teradata aborts them. On Teradata, deadlocks can happen on one AMP (local deadlock) or across different AMPs (global deadlock).
Luckily, Teradata uses a queuing strategy, serializing locking requests to avoid deadlocks.
There was a change in the naming convention for this locking strategy in Teradata 15.10.
Until Teradata 14.10, this locking strategy was called “pseudo table locks” independently if the lock was on row hash level (for dictionary tables) or table level.
Since Teradata 15.10, table level and partition locking (a new feature) are called “proxy locking”, and rows hash locking on dictionary tables is called “pseudo table locking.”
The strategy is still the same (but what’s new since Teradata 15.10 is partition locking). Just the wording changed.
The “Proxy” or “Pseudo Table” Lock Strategy
Without a proper locking strategy, and two transactions asking for a write lock on the same table, the first transaction could get the lock for the table on some of the AMPs, and the second query could take the locks on another set of AMPs.
None of the transactions would be able to finish its task. Both requests would wait forever (for completeness: a NOWAIT lock modifier is available, which aborts the request if the AMP can’t obtain the lock immediately).
A typical global deadlock situation (involving several AMPs):
The “proxy lock” or “pseudo table” strategy avoids such deadlocks by serializing the requests.
For sure many of you have seen the term “to prevent global deadlock” when explaining a query (the below EXPLAIN plan is from a Teradata 15.10 system):
Explain UPDATE CUSTOMER SET AGE = 40;
1) First, we lock DWHPRO.CUSTOMER for write
on a reserved RowHash to prevent global deadlock.
2) Next, we lock DWHPRO.CUSTOMER for write.
3) We do an all-AMPs UPDATE from DWHPRO.CUSTOMER by way of an
all-rows scan with no residual conditions. The size is estimated
with high confidence to be 100,000 rows. The estimated time for
this step is 0.21 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
The total estimated time is 0.21 seconds.
Teradata Deadlock Prevention in Action
Here is an example that shows the deadlock handling strategy in action:
Two update statements execute simultaneously (‘Update 1’ and ‘Update 2’), and they want to update the same table. Each update intends to have the write lock:
UPDATE Customer SET Age = 40 ;
UPDATE Customer SET Gender = 'M';
“Pseudo table” or “Proxy” locking ensures that each request has to get the pseudo lock on a reserved rowhash before obtaining the required lock.
“proxy” or “pseudo table” locking defines an AMP, the gatekeeper to the locks for each table. The gatekeeper AMP for each table is found by hashing its “table id” value.
Hashing occurs in the same way as primary index hashing. By hashing the “table id”, the gatekeeper AMP is found.
As the hashing algorithm is stable, the ROWHASH for a specific “table id” always is the same. As long as the system configuration is not changed, there is a 1:1 relation between the table and gatekeeper AMP.
In our example, two update statements want a write lock on the same table.
Assuming that “update 1” is slightly faster than “update 2”, it will get the “proxy” or “pseudo table” lock on AMP 2, which is the gatekeeper for the table “Customer.” “Update 2” has to wait in the “gatekeeper” queue of AMP 2 and will be next as soon as “update 2” is finished and has released the write lock.
Teradata Deadlocks and BTEQ
BTEQ can repeat a step that terminated in a deadlock; the error code 2631 must occur, and “.SET RETRY ON” must be active. BTEQ will then repeat the request (but not the whole transaction). This functionality is unavailable in other client applications, and we must achieve it programmatically.
Recommendations for reducing Deadlocks in Teradata
- Use LOCKING FOR ACCESS whenever dirty reads are allowed.
LOCKING Customer FOR ACCESS
SELECT * FROM Customer;
- Avoid BT/ET but use multistatement requests instead to avoid table-level deadlocks.
- For unique index access (UPI, USI), use LOCKING ROW FOR WRITE or EXCLUSIVE before executing a transaction.
- In an application, we suggest putting all exclusive locks on tables and databases needed at the beginning of the transaction.
- If a lock can’t be granted, use the NOWAIT locking modifier option to abort the transaction immediately.
We can’t avoid all kinds of deadlocks with the above-described strategy. It only works if both participating transactions work with table locks. If one or both requests use row-hash locking, deadlocks still can happen.
Furthermore, deadlock detection takes time—Teradata checks by default for global deadlocks every four minutes. Teradata searches for local deadlocks every 30 seconds.
Usually, these durations are ok, but you might want to decrease global deadlock detection intervals in some cases.
One of my clients uses many join indexes, causing many global deadlocks (the join indexes used for primary index access, i.e., row hash locks are employed). The join indexes support tactical workload requests to keep execution times short and stable.
Having global deadlock detection set to four minutes is counterproductive in this case.
What Kind of Deadlocks exists?
AMP-local deadlocks occur on the same AMP, the deadlock detection software checks for such deadlocks every 30 seconds.
Global deadlocks occur across different AMPs and are checked by default and detected by the deadlock detection software every 4 minutes; we can change the interval with the DBS control utility (Parameter DeadLockTimeOut).
Which transaction is aborted in case of a Deadlock?
The most recent transaction will be aborted and rolled back
How can I handle Deadlocks in BTEQ?
If a deadlock occurs in BTEQ, only the request that caused the failure is repeated, not the whole transaction (we need to turn the RETRY feature on).
Here is an example:
SELECT * FROM Customer;
SELECT * FROM Deals;
The semicolon at the end of each line creates two requests. If request two fails, Teradata will not roll back request one.
Rewriting the BTEQ as a multistatement request changes the two requests into one request:
SELECT * FROM Customer
;SELECT * FROM Deals;
With the behavior of BTEQ with the RETRY option turned on, an unexpected situation can occur:
INSERT INTO Customer VALUES (1);
INSERT INTO Customer VALUES (2);
INSERT INTO Customer VALUES (3);
If the last INSERT statement fails, Teradata rolls back the entire transaction. Still, BTEQ will retry (only) the previous INSERT statement again. If it succeeded, it was executed as an implicit transaction. Therefore, the END TRANSACTION fails as INSERT statement number 3 was already committed.
How can I handle Deadlocks in my Applications?
While BTEQ offers the RETRY functionality, you need to resubmit the whole aborted transaction after a deadlock in your applications. Your application must detect error code 2631 and resubmit the transaction.
Is Teradata always respecting the LOCKING Modifier?
Teradata can upgrade the locks if required.
Where can I see the system-wide deadlocks?
The view DBC.ResUsageSPMA provides a column that helps us analyze deadlocks named DBLockDeadlocks for the defined sample interval (for example, 600 seconds)
We influence the LOCKS and can often prevent deadlocks by using LOCKING FOR ACCESS. In this article, you will learn what types of locking there are and how to use them optimally:
More granular types of deadlocks
In addition to deadlocks caused by transactions, there are also deadlocks at the file system level (disk segments).
The view DBC.ResUsageSPMA provides some columns that help us analyze deadlocks for disk segments on the system level. These columns indicate the total number of blocks, deadlocks, and lock requests for each disk segment.
|FileLockBlocks||The number of lock requests that were blocked.|
|FileLockDeadlocks||The number of deadlocks detected on lock requests.|
|FileLockEnters||The number of times a lock was requested.|