Deadlocks occur when each of two transactions holds the lock on a database object the other transaction needs next.
Here is an example:
Transaction 1 locks some rows in table 1, and transaction 2 locks some rows in table 2. The next step of transaction 1 is to lock rows in table 2 and transaction 2 needs to lock rows in table 1.
We just described a deadlock situation. Without deadlock handling, both transactions would wait forever, or until they are aborted. On Teradata, deadlocks can happen on one AMP (local deadlock) or across different AMPs (global deadlock), and for various reasons.
Luckily, Teradata uses a queuing strategy that serializes locking requests to avoid deadlocks.
There was a change in the naming convention for this locking strategy in Teradata 15.10.
Until Teradata 14.10, this locking strategy was called “pseudo table locks”, regardless of whether the lock was at the row-hash level (for dictionary tables) or at the table level.
Since Teradata 15.10, table-level and partition locking (a new feature) are called “proxy locking”, and row-hash locking on dictionary tables is called “pseudo table locking.”
In my opinion, the strategy itself is still the same (what’s new since Teradata 15.10 is partition locking); only the wording changed.
Without a proper locking strategy, if two transactions ask for a write lock on the same table, it could happen that the first transaction gets the lock for the table on some of the AMPs, while the second takes the locks on another set of AMPs.
Neither transaction would be able to finish its task. Both requests would wait forever (for completeness: there is a NOWAIT lock modifier available, which aborts the request if the lock can’t be obtained immediately).
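The NOWAIT modifier is added via the LOCKING clause; a sketch (table and column names are illustrative):

```sql
-- Abort the request immediately instead of queuing
-- if the write lock on Customer cannot be granted right away:
LOCKING TABLE Customer FOR WRITE NOWAIT
UPDATE Customer SET Age = 40;
```

This is useful for tactical requests that would rather fail fast and retry than sit in a lock queue.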
A typical global deadlock situation (involving several AMPs).
The “proxy lock” or “pseudo table” strategy avoids such deadlocks by serializing the requests.
Many of you will have seen the term “to prevent global deadlock” when explaining a query (the explain plan is from a Teradata 15.10 system):
Explain UPDATE CUSTOMER SET AGE = 40;
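The relevant step in the plan typically reads like the following sketch (the exact wording varies slightly by release; the database name is illustrative):

```sql
-- Excerpt from a typical EXPLAIN output:
--   1) First, we lock a distinct MyDB."pseudo table" for write on a
--      RowHash to prevent global deadlock for MyDB.CUSTOMER.
--   2) Next, we lock MyDB.CUSTOMER for write.
```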
Here is an example, which shows the deadlock handling strategy in action:
Two update statements execute at almost the same time (“Update 1” and “Update 2”), and both want to update the same table. Each update needs the write lock:
UPDATE Customer SET Age = 40;
UPDATE Customer SET Gender = 'M';
“Pseudo table” or “proxy” locking ensures that each request has to get the pseudo lock on a reserved rowhash before obtaining the required lock.
For each table, “proxy” or “pseudo table” locking defines an AMP which acts as the gatekeeper to the locks. The gatekeeper AMP for each table is found by hashing its “table id” value.
Hashing happens in the same way as primary index hashing: by hashing the “table id”, the gatekeeper AMP is found.
As the hashing algorithm is stable, the row hash for a specific “table id” is always the same, and as long as the system configuration does not change, there is a 1:1 relation between a table and its gatekeeper AMP.
In our example, there are two update statements which want a write lock on the same table.
Assuming that “update 1” is slightly faster than “update 2”, it will get the “proxy” or “pseudo table” lock on AMP 2, which is the gatekeeper for the table “Customer.” “Update 2” has to wait in the gatekeeper queue of AMP 2 and will be next as soon as “update 1” finishes and releases the write lock.
Not all kinds of deadlocks can be avoided with the above-described strategy. It only works if both participating transactions work with table locks. If one or both requests use row-hash locking, deadlocks can still happen.
Furthermore, deadlock detection takes time. By default, Teradata checks for global deadlocks every four minutes and for local deadlocks every 30 seconds.
Usually, these intervals are fine, but in some particular cases you might want to decrease the global deadlock detection interval.
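The intervals can be changed with the DBS Control utility. Assuming the relevant General field is DeadLockTimeout (the field number differs between releases, so locate it in the DISPLAY GENERAL output first), a session might look like this:

```sql
-- DBS Control utility session (run from the database console):
--   dbscontrol
--   DISPLAY GENERAL                 locate the DeadLockTimeout field
--   MODIFY GENERAL <field#> = 60    e.g. detect global deadlocks every 60 seconds
--   WRITE                           persist the change
```

Lowering the interval means deadlocked tactical requests are aborted and can be retried sooner, at the cost of more frequent detection cycles.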
One of my clients uses a lot of join indexes, which cause many global deadlocks (the join indexes are needed for primary index access, i.e., row-hash locks are used).
The join indexes are utilized by tactical workload requests to keep execution times below a couple of seconds.
Having global deadlock detection set to four minutes is counterproductive in this case.
Historically, all databases were designed for transaction processing. Nowadays, big unstructured data is becoming more important, and the trend is to trade data consistency for performance.
Locking of database objects is a must in OLTP systems to prevent multiple users from changing and accessing the same data at the same time in ways that cause data inconsistencies. In other words, whenever a user is accessing or changing data, the affected objects are locked to ensure integrity.
Lock handling works on two different levels. Each time a database row is accessed, Teradata has to decide, first, how granular the lock will be (the level of the locked object) and, second, which kinds of interactions with that object should be locked.
Teradata can apply locks at the following levels of granularity:
Database lock: all objects located in a database are locked
Table lock: all rows located in a table are locked
Rowhash lock: single or multiple rows in a table are locked
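For example, a single-row access via the primary index needs only a rowhash lock, while a full-table scan locks the whole table (assuming CustomerId is the primary index of Customer; names are illustrative):

```sql
-- Rowhash read lock: only the row(s) with this primary index
-- value are locked, other rows stay available for writers.
SELECT * FROM Customer WHERE CustomerId = 12345;

-- Table-level read lock: the full-table scan locks all rows.
SELECT * FROM Customer;
```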
The Exclusive Lock
Exclusive locks can be applied only to databases or tables. An exclusive lock on a table or database prevents any other user from obtaining any lock on this object. Mainly, you will see Teradata locking exclusively during DDL command execution.
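For instance, a DDL statement like the following takes an exclusive lock on the table for the duration of the change (the column is illustrative):

```sql
-- DDL: Teradata places an exclusive lock on Customer while the
-- table definition changes; no other lock requests are granted,
-- not even access locks.
ALTER TABLE Customer ADD Phone VARCHAR(20);
```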
The Write Lock
Whenever a write lock is placed on an object, other users can obtain only access locks on the same object. Access locks allow dirty reads of the data “as is.” No exclusive locks, read locks, or other write locks can be obtained while any user holds a write lock on the object.
The Read Lock
Whenever a user holds a read lock on an object, no write locks or exclusive locks can be obtained by other users. However, an unlimited number of read locks (and access locks) can be held by other users on the same object at the same time.
The Access Lock
Access locks allow a user to read “inconsistent” data. They work even if another user already holds a write lock on the object. The only exception: not even an access lock is granted while another user holds an exclusive lock on the same object.
Lock handling happens automatically on Teradata all the time, but it is your decision whether to upgrade or downgrade locks.
SELECT * FROM <TABLE>;
By default, this requires a read lock on the table, which ensures data integrity.
Maybe you don’t need 100% accurate data and can live with changes going on while you access the data. In this case, you can explicitly downgrade to an access lock:
LOCKING TABLE <TABLE> FOR ACCESS
SELECT * FROM <TABLE>;
The LOCKING TABLE … FOR ACCESS modifier in the query above allows you to read the rows of <TABLE> even if they are currently write-locked by another user.
Similarly, you could upgrade your lock, say from a read lock to a write lock:
LOCKING TABLE <TABLE> FOR WRITE
SELECT * FROM <TABLE>;
Any other user who now needs a read lock, write lock, or exclusive lock will have to wait until your query finishes. Dirty reads (via an access lock) would still be allowed during that time.