Bug #68021 Unexplainable InnoDB unique index locks on DELETE + INSERT with same values
Submitted: 3 Jan 2013 16:49 Modified: 28 Sep 2014 10:48
Reporter: Ovais Tariq Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.5.29, 5.5.30, 5.6, 5.7.1 OS:Any
Assigned to: CPU Architecture:Any
Triage: Needs Triage: D3 (Medium)

[3 Jan 2013 16:49] Ovais Tariq
Description:
MySQL server version running is 5.5.29 on CentOS 6.2.

Suppose there is a table with a composite unique key. 

Now when a record is deleted and a subsequent insert is made within the same transaction that contains the same value for the unique key columns that were deleted, then InnoDB places gap lock on the gap after the unique key value and the gap before the unique key value that is inserted by the subsequent insert. This behaviour remains even though I change transaction-isolation=READ-COMMITTED and innodb_locks_unsafe_for_binlog=1. 

However, when a record is deleted and a subsequent insert within the same transaction inserts values for unique key columns that are different from the ones deleted then no excessive locking is performed. This is the behaviour that one expects for the above case as well at least when transaction-isolation=READ-COMMITTED or innodb_locks_unsafe_for_binlog=1.

Relevant my.cnf variables that were used:
log-bin
binlog_format=ROW
transaction-isolation=READ-COMMITTED
innodb_locks_unsafe_for_binlog=1

How to repeat:
-- Prepare test data
CREATE TABLE `ti` (
  `session_ref_id` bigint(16) NOT NULL AUTO_INCREMENT,
  `customer_id` bigint(16) DEFAULT NULL,
  `client_id` int(2) DEFAULT '7',
  `app_id` smallint(2) DEFAULT NULL,
  PRIMARY KEY (`session_ref_id`),
  UNIQUE KEY `uk1` (`customer_id`,`client_id`,`app_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (4000, 8000, 10, 5);
INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (4090, 9000, 10, 5);
INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (6000, 10000, 10, 5);
INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (7000, 14000, 10, 5);

---- DELETE a row with value of unique key `uk` (9000, 10, 5) and INSERT the same row in the same transaction with same value for unique key `uk` columns

-- session 1
session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > DELETE FROM ti WHERE session_ref_id = 4090;
Query OK, 1 row affected (0.00 sec)

session1 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (5000, 9000, 10, 5);
Query OK, 1 row affected (0.00 sec)

-- session 2
session2 > set innodb_lock_wait_timeout=1;
Query OK, 0 rows affected (0.00 sec)

session2 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 8001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 7999, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 9001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 9999, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 10001, 10, 5);
Query OK, 1 row affected (0.00 sec)

You can see that the whole range from customer_id=8000 to customer_id=10000 is locked, this range encapsulates the value customer_id=9000 which is inserted above.

---- DELETE a row with value of unique key `uk` (9000, 10, 5) and INSERT a row in the same transaction with different value for unique key `uk` columns

-- session 1
session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > DELETE FROM ti WHERE session_ref_id = 4090;
Query OK, 1 row affected (0.00 sec)

session1 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (5000, 8999, 10, 5);
Query OK, 1 row affected (0.00 sec)

-- session 2
session2 > set innodb_lock_wait_timeout=1;
Query OK, 0 rows affected (0.00 sec)

session2 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 8001, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 7999, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 9001, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 9999, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 10001, 10, 5);
Query OK, 1 row affected (0.00 sec)

You can see that no gap-locking is performed.
[3 Jan 2013 17:15] Shane Bester
reminded me of http://bugs.mysql.com/bug.php?id=19762
[4 Jan 2013 10:55] Ovais Tariq
If I change the test case a bit, and in session 1 instead of DELTE+INSERT I execute the following:
UPDATE ti SET session_ref_id=5000 WHERE session_ref_id=4090;

Then the gap locking is still repeatable, as in the original test case.
[4 Jan 2013 20:38] Justin Swanhart
In session 1 while session 2 is waiting for the lock during the INSERT:

mysql> select * from information_schema.innodb_locks\G
*************************** 1. row ***************************
    lock_id: 2AF1:7182:4:3
lock_trx_id: 2AF1
  lock_mode: X,GAP
  lock_type: RECORD
 lock_table: `test`.`ti`
 lock_index: `uk1`
 lock_space: 7182
  lock_page: 4
   lock_rec: 3
  lock_data: 9000, 10, 5
*************************** 2. row ***************************
    lock_id: 2AF0:7182:4:3
lock_trx_id: 2AF0
  lock_mode: S
  lock_type: RECORD
 lock_table: `test`.`ti`
 lock_index: `uk1`
 lock_space: 7182
  lock_page: 4
   lock_rec: 3
  lock_data: 9000, 10, 5
2 rows in set (0.00 sec)

mysql> select * from information_schema.innodb_lock_waits\G
*************************** 1. row ***************************
requesting_trx_id: 2AF1
requested_lock_id: 2AF1:7182:4:3
  blocking_trx_id: 2AF0
 blocking_lock_id: 2AF0:7182:4:3
1 row in set (0.00 sec)
[4 Jan 2013 20:41] Justin Swanhart
mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 2AF1
                 trx_state: LOCK WAIT
               trx_started: 2013-01-04 20:37:14
     trx_requested_lock_id: 2AF1:7182:4:3
          trx_wait_started: 2013-01-04 20:37:14
                trx_weight: 3
       trx_mysql_thread_id: 330
                 trx_query: INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 8001, 10, 5)
       trx_operation_state: inserting
         trx_tables_in_use: 1
         trx_tables_locked: 1
          trx_lock_structs: 2
     trx_lock_memory_bytes: 376
           trx_rows_locked: 1
         trx_rows_modified: 1
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
*************************** 2. row ***************************
                    trx_id: 2AF0
                 trx_state: RUNNING
               trx_started: 2013-01-04 20:35:46
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 7
       trx_mysql_thread_id: 329
                 trx_query: select * from information_schema.innodb_trx
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 5
     trx_lock_memory_bytes: 1248
           trx_rows_locked: 5
         trx_rows_modified: 2
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
2 rows in set (0.00 sec)
[8 Jan 2013 9:55] Ovais Tariq
Verbose locking information for the transaction in session 1 holding locks:

---TRANSACTION F23, ACTIVE 25 sec
5 lock struct(s), heap size 1248, 5 row lock(s), undo log entries 2
MySQL thread id 25, OS thread handle 0x7f2c785f9700, query id 230 localhost msandbox
TABLE LOCK table `test`.`ti` trx id F23 lock mode IX
RECORD LOCKS space id 0 page no 338 n bits 72 index `PRIMARY` of table `test`.`ti` trx id F23 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; compact format; info bits 32
 0: len 8; hex 8000000000000ffa; asc         ;;
 1: len 6; hex 000000000f23; asc      #;;
 2: len 7; hex 190000015a0110; asc     Z  ;;
 3: len 8; hex 8000000000002328; asc       #(;;
 4: len 4; hex 8000000a; asc     ;;
 5: len 2; hex 8005; asc   ;;

locked record from PK (record lock, no gap locking, X lock):
session_ref_id: 		4090
DB_TRX_ID (internal field): 	F23
DB_ROLL_PTR (internal field): 	0x190000015a0110
customer_id:			9000
client_id:			10
app_id:				5

RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F23 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 8; hex 8000000000002328; asc       #(;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000000ffa; asc         ;;

locked record from UK `uk1` (record lock, no gap locking):
customer_id:	9000
client_id:	10
app_id:		5
session_ref_id: 4090

RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F23 lock mode S
Record lock, heap no 3 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 8; hex 8000000000002328; asc       #(;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000000ffa; asc         ;;

locked record from UK `uk1` (S lock):
customer_id:	9000
client_id:	10
app_id:		5
session_ref_id: 4090

Record lock, heap no 4 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 8000000000002710; asc       ' ;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000001770; asc        p;;

locked record from UK `uk1`: <-- interestingly no information is printed about whether this is S/X lock and whether the gap_lock bit is set, based on my tests this is also of type 'locks gap before record'
customer_id:	10000
client_id:	10
app_id:		5
session_ref_id: 6000

RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F23 lock mode S locks gap before rec
Record lock, heap no 6 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 8000000000002328; asc       #(;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000001388; asc         ;;

locked record from UK `uk1` (locks gap before record, S lock): <-- this one locks the gap between records (8000, 10, 5) and (9000, 10, 5)
customer_id:	9000
client_id:	10
app_id:		5
session_ref_id: 5000
[8 Jan 2013 12:02] Ovais Tariq
If we modify the test case such that we modify the last record in the table (7000, 14000, 10, 5); then no index record that is bigger than (14000, 10, 5) is allowed to be inserted, because the gap between (14000, 10, 5) and SUPREMUM is locked.

-- session 1
session1 > select * from ti;
+----------------+-------------+-----------+--------+
| session_ref_id | customer_id | client_id | app_id |
+----------------+-------------+-----------+--------+
|           4000 |        8000 |        10 |      5 |
|           4090 |        9000 |        10 |      5 |
|           6000 |       10000 |        10 |      5 |
|           7000 |       14000 |        10 |      5 |
+----------------+-------------+-----------+--------+
4 rows in set (0.00 sec)

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session1 > DELETE FROM ti WHERE session_ref_id = 7000;
Query OK, 1 row affected (0.00 sec)

session1 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (8000, 14000, 10, 5);
Query OK, 1 row affected (0.00 sec)

-- session 2
session2 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session2 > set innodb_lock_wait_timeout=1;
Query OK, 0 rows affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 13999, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 9999, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 14001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 140001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 1400001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

session2 > INSERT INTO ti (session_ref_id, customer_id, client_id, app_id) VALUES (NULL, 14000001, 10, 5);
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

-- Following are the locks held on the unique index `uk1`:
RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F29 lock_mode X locks rec but not gap
Record lock, heap no 5 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 8; hex 80000000000036b0; asc       6 ;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000001b58; asc        X;;

customer_id:	14000
client_id:	10
app_id:		5
session_ref_id: 7000

RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F29 lock mode S
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

supremum record of the UK `uk1` is locked

Record lock, heap no 5 PHYSICAL RECORD: n_fields 4; compact format; info bits 32
 0: len 8; hex 80000000000036b0; asc       6 ;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000001b58; asc        X;;

customer_id:	14000
client_id:	10
app_id:		5
session_ref_id: 7000

RECORD LOCKS space id 0 page no 339 n bits 72 index `uk1` of table `test`.`ti` trx id F29 lock mode S locks gap before rec
Record lock, heap no 6 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 8; hex 80000000000036b0; asc       6 ;;
 1: len 4; hex 8000000a; asc     ;;
 2: len 2; hex 8005; asc   ;;
 3: len 8; hex 8000000000001f40; asc        @;;

customer_id:	14000
client_id:	10
app_id:		5
session_ref_id: 8000
[12 Jan 2013 14:05] Sveta Smirnova
Thank you for the report.

Verified as described. Still can be duplicate of bug #19762 But I think we should fix it finally, because no allowing 2000 IDs like in this case is too much for users to work around.
[12 Jan 2013 14:21] Ovais Tariq
Hi Sveta,

Please take a look at my last comment before yours, it shows that under certain conditions, the gap between the record and the supremum record is locked which means no further inserts to the table in question are possible. This just increases the criticality of the bug :)
[14 Jan 2013 16:12] Sinisa Milivojevic
Ovais,

Every single insert at the end of some table will produce a new insert intention lock at the very end of the table.

This is expected behavior, which will not be changed with a fix for this bug.
[14 Jan 2013 16:25] Sinisa Milivojevic
Ovais,

I have grasped full meaning of your bug. It appears only for secondary index when insert intention lock is at the end of the table.

I am not sure how easy is this to fix, but we shall give it a try.
[15 Jan 2013 10:37] Ovais Tariq
Hi Sinisa,

Indeed transactions do acquire IX locks before writing to a table, but two IX locks are supposed to be compatible as is also shown here http://dev.mysql.com/doc/refman/5.5/en/innodb-lock-modes.html. That is, if I insert a row into a table and the transaction does not commit, then that insert would never block another insert into the same table (provided the second insert is inserting a different row). 

So I do not think what I am seeing is the expected behaviour. This can easily be checked by doing the following using the table from this bug report (note that the inserts below are at the end of the table):

session1 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session 1 > insert into ti values(8000, 15000, 10, 5);
Query OK, 1 row affected (0.00 sec)

session2 > start transaction;
Query OK, 0 rows affected (0.00 sec)

session2 > insert into ti values(9000, 16000, 10, 5);
Query OK, 1 row affected (0.00 sec)

The locks held by both the transactions can be seen as follows:
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 1103, ACTIVE 9 sec
1 lock struct(s), heap size 376, 0 row lock(s), undo log entries 1
MySQL thread id 2, OS thread handle 0x7fe0a4949700, query id 14 localhost msandbox
show engine innodb status
TABLE LOCK table `test`.`ti` trx id 1103 lock mode IX
---TRANSACTION 1102, ACTIVE 68 sec
1 lock struct(s), heap size 376, 0 row lock(s), undo log entries 1
MySQL thread id 1, OS thread handle 0x7fe0a498a700, query id 10 localhost msandbox
TABLE LOCK table `test`.`ti` trx id 1102 lock mode IX

As you can see both the transactions are holding an IX lock, but that did not block each other.
[15 Jan 2013 10:41] Ovais Tariq
Hi Sinisa,

I am not entirely sure that this bug is a consequence of insert intention locks. If you go through my bug report and the locks reported by the transactions you will see that InnoDB does not show an insert intention locks being held.
[19 Feb 2013 18:08] Sinisa Milivojevic
Ovais,

This bug is verified and will be worked upon in not too distant future. Sorry, I can not give you more precise info, as it would violate company's policy. I like that policy, very much, because in a distant past, many promises have been made, which were not kept.
[20 Feb 2013 6:24] Ovais Tariq
Hi Sinisa,

Thanks, that's good enough.
[14 Mar 2013 19:09] Bugs System
Added a changelog entry for 5.5.32, 5.6.12, 5.7.2.

When a transaction is in READ COMMITTED isolation level, gap locks are
still taken in the secondary index when a row is inserted. This occurs
when the secondary index is scanned for duplicates. The function
"row_ins_scan_sec_index_for_duplicate()" always calls the function
"row_ins_set_shared_rec_lock()" with "LOCK_ORDINARY" irrespective of the
transaction isolation level. This fix modifies the
"row_ins_scan_sec_index_for_duplicate()" function to call
"row_ins_set_shared_rec_lock()" with "LOCK_ORDINARY" or
"LOCK_REC_NOT_GAP", based on the transaction isolation level.
[1 Jul 2014 21:05] Inaam Rana
We think this fix needs to be revisited. We have enough information to suspect that this fix has somehow broken the locking code inside InnoDB. It is very hard to come up with a repeatable test case but here is what we have seen in our production systems:
* Have isolation level READ-COMMITTED
* Have a table with unique sec. index
* Have autocommit on
* Delete a row from the table
* In two different threads try to insert a row with same value for unique sec. index column. The insertion should happen very closely followed by the deletion and both insertions should happen concurrently.
* In the cases that we have seen one of the insertion is performed by the deleting thread though I don't think it makes a difference because of autocommit.
* What happens is that both insertions succeed. Thus we get a unique key violation silently. We were able to detect the error when the replicating slave tries to apply both DMLs and spits out the error.

It is some kind of race that happens very rarely but consistently enough to show that there is a bug. We have tried:
* Reverting this patch on some of the clusters
* Moving to REPEATABLE-READ in some other clusters.
We haven't seen this error on any of the cluster with either of the fix mentioned above for over a month.

I have tried to stare at the code but can't find anything obvious. I suggest we reopen this bug and request InnoDB experts to take another look at the patch.
[2 Jul 2014 5:29] Sunny Bains
Opening the bug based on Inaam's request.
[2 Jul 2014 7:27] Annamalai Gurusami
The problem reported by Inaam will be tracked by bug#73170.
[2 Jul 2014 7:28] Annamalai Gurusami
Inaam, Thanks for bug report.
[28 Sep 2014 2:49] zhai weixiang
should this bug be reopened ?
[28 Sep 2014 10:48] Ovais Tariq
Since the fix has been reverted so I think the bug should be reopened
[4 Dec 2014 19:12] Sveta Smirnova
Issue, reported by Inaam handled in bug#73170. No need to reopen this one.
[6 Jul 2015 15:13] Mannoj Kumar
Is this fixed? I'm facing similar issue in 5.6.21 .