Bug #100752 Confusing latest deadlock information when victim holds no record locks
Submitted: 6 Sep 2:30 Modified: 7 Sep 9:21
Reporter: Jesper Wisborg Krogh Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:8.0.21 OS:Any
Assigned to: CPU Architecture:Any

[6 Sep 2:30] Jesper Wisborg Krogh
Description:
When a victim of a deadlock holds no record locks, the latest deadlock information will still show that it does with the information of the requested lock.

How to repeat:
-- Connection   Processlist ID   Thread ID
-- ---------------------------------------
--          1              170         290
--          2              171         291
--          3              172         292

-- Connection 1
Connection 1> START TRANSACTION;
Query OK, 0 rows affected (0.0003 sec)

Connection 1> SELECT * FROM world.city WHERE ID = 130 FOR SHARE;
+-----+--------+-------------+-----------------+------------+
| ID  | Name   | CountryCode | District        | Population |
+-----+--------+-------------+-----------------+------------+
| 130 | Sydney | AUS         | New South Wales |    3276207 |
+-----+--------+-------------+-----------------+------------+
1 row in set (0.0004 sec)

-- Connection 2
Connection 2> START TRANSACTION;
Query OK, 0 rows affected (0.0002 sec)

-- Blocks
Connection 2> UPDATE world.city
                 SET Population = Population + 1
               WHERE ID = 130;

-- Connection 3
-- Notice that Connection 2 (thread_id = 291) holds no record locks
Connection 3> SELECT thread_id, index_name,
                     lock_type, lock_mode, lock_status, lock_data
                FROM performance_schema.data_locks
               WHERE thread_id IN (290, 291)
               ORDER BY thread_id, object_schema, object_name,
                        index_name, lock_type;
+-----------+------------+-----------+---------------+-------------+-----------+
| thread_id | index_name | lock_type | lock_mode     | lock_status | lock_data |
+-----------+------------+-----------+---------------+-------------+-----------+
|       290 | NULL       | TABLE     | IS            | GRANTED     | NULL      |
|       290 | PRIMARY    | RECORD    | S,REC_NOT_GAP | GRANTED     | 130       |
|       291 | NULL       | TABLE     | IX            | GRANTED     | NULL      |
|       291 | PRIMARY    | RECORD    | X,REC_NOT_GAP | WAITING     | 130       |
+-----------+------------+-----------+---------------+-------------+-----------+
4 rows in set (0.0013 sec)

-- Connection 1
Connection 1> UPDATE world.city
                 SET Population = Population + 1
               WHERE ID = 130;
Query OK, 1 row affected (0.0085 sec)

Rows matched: 1  Changed: 1  Warnings: 0

-- Connection 2
ERROR: 1213: Deadlock found when trying to get lock; try restarting transaction

-- Connection 1
Connection 1> ROLLBACK;
Query OK, 0 rows affected (0.0378 sec)

-- Connection 2
Connection 2> ROLLBACK;
Query OK, 0 rows affected (0.0003 sec)

-- Connection 3
Connection 3> SHOW ENGINE INNODB STATUS\G
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
2020-09-06 12:26:25 0x875c INNODB MONITOR OUTPUT
=====================================
...
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-09-06 12:24:20 0x8650
*** (1) TRANSACTION:
TRANSACTION 541847, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s)
MySQL thread id 171, OS thread handle 31048, query id 49358 localhost ::1 root updating
UPDATE world.city
   SET Population = Population + 1
 WHERE ID = 130

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 1940 page no 7 n bits 248 index PRIMARY of table `world`.`city` trx id 541847 lock_mode X locks rec but not gap waiting
Record lock, heap no 131 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 80000082; asc     ;;
 1: len 6; hex 000000083d95; asc     = ;;
 2: len 7; hex 8100000127079e; asc     '  ;;
 3: len 30; hex 5379646e6579202020202020202020202020202020202020202020202020; asc Sydney                        ; (total 35 bytes);
 4: len 3; hex 415553; asc AUS;;
 5: len 20; hex 4e657720536f7574682057616c65732020202020; asc New South Wales     ;;
 6: len 4; hex 8031fdaf; asc  1  ;;

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1940 page no 7 n bits 248 index PRIMARY of table `world`.`city` trx id 541847 lock_mode X locks rec but not gap waiting
Record lock, heap no 131 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 80000082; asc     ;;
 1: len 6; hex 000000083d95; asc     = ;;
 2: len 7; hex 8100000127079e; asc     '  ;;
 3: len 30; hex 5379646e6579202020202020202020202020202020202020202020202020; asc Sydney                        ; (total 35 bytes);
 4: len 3; hex 415553; asc AUS;;
 5: len 20; hex 4e657720536f7574682057616c65732020202020; asc New South Wales     ;;
 6: len 4; hex 8031fdaf; asc  1  ;;

*** (2) TRANSACTION:
TRANSACTION 541848, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 4 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 170, OS thread handle 12104, query id 49360 localhost ::1 root updating
UPDATE world.city
   SET Population = Population + 1
 WHERE ID = 130

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1940 page no 7 n bits 248 index PRIMARY of table `world`.`city` trx id 541848 lock mode S locks rec but not gap
Record lock, heap no 131 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 80000082; asc     ;;
 1: len 6; hex 000000083d95; asc     = ;;
 2: len 7; hex 8100000127079e; asc     '  ;;
 3: len 30; hex 5379646e6579202020202020202020202020202020202020202020202020; asc Sydney                        ; (total 35 bytes);
 4: len 3; hex 415553; asc AUS;;
 5: len 20; hex 4e657720536f7574682057616c65732020202020; asc New South Wales     ;;
 6: len 4; hex 8031fdaf; asc  1  ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1940 page no 7 n bits 248 index PRIMARY of table `world`.`city` trx id 541848 lock_mode X locks rec but not gap waiting
Record lock, heap no 131 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 4; hex 80000082; asc     ;;
 1: len 6; hex 000000083d95; asc     = ;;
 2: len 7; hex 8100000127079e; asc     '  ;;
 3: len 30; hex 5379646e6579202020202020202020202020202020202020202020202020; asc Sydney                        ; (total 35 bytes);
 4: len 3; hex 415553; asc AUS;;
 5: len 20; hex 4e657720536f7574682057616c65732020202020; asc New South Wales     ;;
 6: len 4; hex 8031fdaf; asc  1  ;;

*** WE ROLL BACK TRANSACTION (1)

Notice in the latest detected deadlock information that "*** (1) HOLDS THE LOCK(S):" and "
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:" are both for the record lock the transaction is waiting for.

Suggested fix:
Don't display record locks in the "HOLDS THE LOCK" part when none are held.
[6 Sep 2:31] Jesper Wisborg Krogh
The full InnoDB monitor output

Attachment: innodb_monitor.txt (text/plain), 8.07 KiB.

[7 Sep 9:21] MySQL Verification Team
Dear Jesper,

Thank you for the report and feedback.
Verified as described. 

regards,
Umesh
[7 Sep 9:57] Jakub Lopuszanski
Hi, thank you for detailed analysis.

There's clearly many ways in which the output could be made more readable to humans. 

However, if you put aside the very unfriendly format, the information provided is actually useful and in some sense (that is if you stretch the meaning of "HOLD") true.

To describe a deadlock cycle involving N transactions, we need to specify at least 2*N lock objects.
Perhaps it's not immediately obvious, that a deadlock cycle involving N transactions actually (often) involves twice as many locks:
- what i-th transaction is trying to acquire
- what lock of (i+1 mod N)-th transaciton is preventing the above acquisition
This is perhaps a bit confusing because the word "lock" itself in InnoDB means "a request for a lock" (so, there are as many of them as there are transactions competing for a given resource) rather than a single thing per resource for which transactions compete (which might be a common way to define this word in say, C++).
So, you can find some pictures in the web where deadlock among two transactions is shown as involving two locks https://twimgs.com/ddj/images/article/2011/0611/thread4.gif, but in InnoDB's parlance, it involves four locks (=four lock requests). In the picture there are two items (resources), two people (transactions), but four hands (=lock requests).

Yes, it may happen that some of these 2*N locks are in fact the same - this is the case in your example - the same "WAITING X,REC_NOT_GAP" lock of 291 is at the same time the one for which transaction 291 is waiting, and the reason transaction 290 can not acquire it's lock.
As you may already know, this is a part of strategy to avoid starvation of waiters - the transaction 290, when trying to acquire a lock, must check for conflicts also with already existing waiters in the queue, such as transaction 291.
So, a good picture for this situation might be a hand of 290 holding a thing, wrapped by a hand of 291 which also tries to grab this thing, and another hand of 290 wrapping both of them trying to get it. Here the middle hand plays two roles: of an unsatisfied, waiting request, and of a reason someone else has to wait.

Yes, the title of the section uses verb "HOLDs", which might be interpreted as "was granted the lock", but this is later "clarified" at the end of the next output line:

  *** (1) HOLDS THE LOCK(S):
  RECORD LOCKS space id 1940 page no 7 n bits 248 index PRIMARY of table `world`.`city` trx id 541847 lock_mode X locks rec but not gap waiting

Note the word "waiting" at the end of the list of "attributes" of this lock object - this "clarifies" that this lock is not yet granted, but waiting, yet still it plays the role of the request of transaction 291 which is preventing the request of transaction 290 from fulfilment. 

Removing this section, as you suggest, is IMHO not a good idea, because then it wouldn't be clear what is the reason that transaction 290 has to wait. (Also, it would break backward compatibility with tools parsing it).

So, yes, these could be made more human-readable, for example by using word "IS ASSOCIATED WITH THE LOCK(S):" or something like this, but IMHO, "HOLDS" is not such a bad word to describe the situation that trx 291 is so firmly attached to this lock, and will not let go, nor allow 290 to bypass it :)

Anyway, I believe we can't fix it for two reasons:
1. Backward compatibility. There are tools which might search for headers like this one to obtain information. 
2. We rather figure out a way to put the deadlock information into performance_schema, than invest in improving these texts.