Bug #101695 | Deadlock behavior | ||
---|---|---|---|
Submitted: | 20 Nov 2020 12:09 | Modified: | 20 Nov 2020 14:17 |
Reporter: | Niranjan R | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S3 (Non-critical) |
Version: | 5.6, 5.7, 8.0 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | deadlocks, Exclusive lock, unique index |
[20 Nov 2020 12:09]
Niranjan R
[20 Nov 2020 14:17]
MySQL Verification Team
Hi Mr. R, Thank you for your bug report. However, this is not a bug. Actually, 8.0.22 has the proper corrected output, which explains everything: ------- TRX HAS BEEN WAITING 7 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 95 page no 5 n bits 72 index usname of table `test`.`t1` trx id 27713 lock_mode X locks rec but not gap waiting Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 70717273; asc pqrs;; 1: len 4; hex 80000002; asc ;; ------------------ TABLE LOCK table `test`.`t1` trx id 27713 lock mode IX RECORD LOCKS space id 95 page no 5 n bits 72 index usname of table `test`.`t1` trx id 27713 lock_mode X locks rec but not gap waiting Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 70717273; asc pqrs;; 1: len 4; hex 80000002; asc ;; ---TRANSACTION 27711, ACTIVE 12 sec 5 lock struct(s), heap size 1200, 5 row lock(s), undo log entries 2 MySQL thread id 11, OS thread handle 123145335353344, query id 33 localhost sinisa TABLE LOCK table `test`.`t1` trx id 27711 lock mode IX RECORD LOCKS space id 95 page no 5 n bits 72 index usname of table `test`.`t1` trx id 27711 lock_mode X locks rec but not gap Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 70717273; asc pqrs;; 1: len 4; hex 80000002; asc ;; Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 32 0: len 4; hex 6c6d6e6f; asc lmno;; 1: len 4; hex 80000002; asc ;; RECORD LOCKS space id 95 page no 4 n bits 72 index PRIMARY of table `test`.`t1` trx id 27711 lock_mode X locks rec but not gap Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0 0: len 4; hex 80000002; asc ;; 1: len 6; hex 000000006c3f; asc l?;; 2: len 7; hex 01000001700d61; asc p a;; 3: len 4; hex 70717273; asc pqrs;; 4: len 4; hex 8000000c; asc ;; RECORD LOCKS space id 95 page no 5 n bits 72 index usname of table `test`.`t1` trx id 27711 lock mode S locks gap before rec Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 70717273; asc pqrs;; 1: len 4; hex 80000002; asc ;; RECORD LOCKS space id 95 page no 5 n bits 72 index usname of table `test`.`t1` trx id 27711 lock mode S Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 4; hex 7778797a; asc wxyz;; 1: len 4; hex 80000003; asc ;; ------------------------------------------------------------------------------------ This is exactly the output to be expected with your test case. You have the same record twice reported. The cause is simple. Once it is locked for the existed name and second time for the new name. Since both sessions are running both DML statements simultaneously, this is the expected behaviour. Not a bug.
[23 Nov 2020 13:55]
Jakub Lopuszanski
Hello, In the bug report you wrote: *** (1) HOLDS THE LOCK(S): ---> SAME RECORD LOCKS space id 5 page no 5 n bits 72 index usname of table `dlock`.`tlock` trx id 225349 lock_mode X waiting ... *** (1) WAITING FOR THIS LOCK TO BE GRANTED: ---> SAME RECORD LOCKS space id 5 page no 5 n bits 72 index usname of table `dlock`.`tlock` trx id 225349 lock_mode X waiting and indeed this is the same lock. But what should not be ignored is the word "waiting" at the end of each of these lines. This means that the transaction did not yet acquired the X-lock - it's still waiting for it. The very misleading part of the output is that the section is titled "HOLDS THE LOCK(S)", while in fact not every lock mentioned in this section is actually "held". It's still in the process of acquisition, as evidenced by the word "waiting". The reason this lock is listed in the section titled "HOLDS THE LOCK(S)" is because for the purpose of deadlock detection algorithm, a request for an X-lock is treated very similarly to already holding an X-lock. And this is done to correctly model the fact, that transactions which request S-lock, have to wait not only for transactions already holding an X-lock, but also for transactions which merely requested it, but still wait for it to be granted - this is done to avoid the problem of "starvation of writers by readers". Another reason (perhaps more important for your case) which is a bit similar, is that a transaction which requests an X-lock, has to wait for its turn, giving priority to transaction which already waits for an X-lock (in mysql 5.6 and 5.7 there's simply a FIFO queue, and in 8.0 there is also CATS algorithm, but both algorithms try to ensure that new-coming transaction waits for those already in queue). This again is done to avoid starvation: this time among writers themselves. So, given that InnoDB forces transactions to wait for a transaction which waits for an X-lock, it might mean that a deadlock is possible, in which one of the edges in the deadlock cycle involves waiting for transaction which still waits for an X-lock. And the simplest way to model it in the deadlock checking algorithm is to treat the "X-lock being waited for" as "already held". This later leads to a bit confusing output seen above. But once you understand the above logic, you can easily interpret the output as saying that: Transaction (2) already holds an S-lock, but asked for an X-lock, which it can't get granted immediately, because to avoid starving transaction (1) which is waiting for an X-lock, Transaction (2) should give priority to the Transaction (1), thus effectively Transaction (2) waits for Transaction (1), which unfortunately can't get granted the X-lock now, because its request for an X-lock is conflicting with the S-lock already held by Transaction (1). Here's the cycle: T2's X-lock request waits for T1's X-lock request, which waits for T2's S-lock to be released. This could be perhaps solved by changing the implementation of "lock escalation logic": when a T2, which already has S-lock tries to escalate the permission to an X-lock, we could perhaps favour T2 to T1 when they fight over the X-lock, because T1 has to wait for T1 anyway (because of the S-lock), so we are not saving anyone from starving by favouring T1, yet we might cause a deadlock like in your case. But this idea would be a feature request, not a bug fix, as current implementation is also correct, if suboptimal.
[30 Nov 2020 12:57]
MySQL Verification Team
Thank you, Jakub.
[12 Jan 2021 13:10]
Jakub Lopuszanski
The scenario from the bug report rewritten as a mysql-test-run script: # This is the original bug report from Niranjan R CREATE TABLE `tlock` ( `id` int NOT NULL AUTO_INCREMENT, `name` varchar(20) DEFAULT NULL, `phnumber` int DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `usname` (`name`) ); INSERT INTO tlock VALUES (1,'abcd',98765),(2,'pqrs',12345),(3,'wxyz',54321); SELECT * FROM tlock; DELIMITER |; CREATE PROCEDURE spammer(t INT) BEGIN WHILE 0<t DO SET t = t - 1; update tlock set name='lmno' where name='pqrs'; update tlock set name='pqrs' where name='lmno'; END WHILE; END| DELIMITER ;| --connect(C1,localhost,root,,test) --send CALL spammer(2000) --connection default CALL spammer(2000); --connection C1 --reap --connection default --disconnect C1 DROP PROCEDURE spammer; DROP TABLE tlock; Note: this fails only sometimes, under congestion, so run it with: $ ./mtr --mtr-port-base=31000 --force --retry=0 --max-test-fail=0 --repeat=100 --parallel=20 --testcase-timeout=1 innodb.lock_escalation_101695 You should see several: mysqltest: At line 34: Query 'reap' failed. ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
[31 Jan 2022 16:46]
Jakub Lopuszanski
A fix for Bug #21356 "Change lock priority so that the transaction holding S-lock gets X-lock first" will improve behaviour in scenarios like this.
[1 Feb 2022 13:33]
MySQL Verification Team
Thank you Jakub.