MySQL Bugs: #55717: innodb gets row locks on delete marked rows

Bug #55717	innodb gets row locks on delete marked rows
Submitted:	3 Aug 2010 15:09	Modified:	11 Aug 2010 7:44
Reporter:	Mark Callaghan	Email Updates:
Status:	Duplicate	Impact on me:	None
Category:	MySQL Server: InnoDB Plugin storage engine	Severity:	S3 (Non-critical)
Version:	5.1.47, 4.1, 5.0, 5.1, 5.6.99 bzr	OS:	Any
Assigned to:	Assigned Account	CPU Architecture:	Any
Tags:	deleted, lock, purge, rows

Description:
InnoDB gets row locks on rows that have been marked for delete but yet to be removed by purge. 

This makes delete statements slower than needed when deletes are done in PK order and purge is behind or blocked.

This is a source of lock wait timeouts and deadlocks when concurrent sessions delete in PK order.

How to repeat:
drop table if exists t;
create table t (i int primary key auto_increment) engine=innodb;

insert into t values (null), (null), (null), (null);
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;
insert into t select null from t;

-----------

run this in one session to block purge:
begin; select count(*) from t;

-----------

run this in another session, each of these delete ~5900 rows

delete from t where i < 10000;
begin; delete from t where i < 20000;

-----------

run this in another session and find that session has deleted 5905 rows from the number of undo log entries but has 11840 row locks. This gets ~5900 row locks for i < 10000. But these rows have already been deleted.

show innodb status;

---TRANSACTION 1222, ACTIVE 20 sec, process no 27539, OS thread id 1221396800
19 lock struct(s), heap size 6960, 11840 row lock(s), undo log entries 5905
MySQL thread id 5, query id 41 localhost root

------------

Now start another session and run this and then wait for the row lock wait timeout. It gets this for rows that have already been deleted.

begin; delete from t where i < 9000;

I simplified this case a bit.

(1st session)
mysql> CREATE TABLE `t` (`i` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`i`)); ENGINE=InnoDB;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into t values (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);
Query OK, 6 rows affected (0.06 sec)
Records: 6  Duplicates: 0  Warnings: 0

(2nd session)

mysql> delete from t where i = 1; 
Query OK, 1 row affected (0.24 sec)

mysql> begin ; delete from t where i < 5; 
Query OK, 0 rows affected (0.00 sec)

Query OK, 3 rows affected (0.00 sec)

(1st session)

mysql> delete from t where i < 2; 
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

ENGINE InnoDB status shows:

---TRANSACTION 0 1431, ACTIVE 11 sec, process no 14585, OS thread id 140125902714640 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 368, 1 row lock(s)
MySQL thread id 9, query id 205 localhost root updating
delete from t where i < 2
------- TRX HAS BEEN WAITING 11 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 50 n bits 80 index `PRIMARY` of table `test`.`t` trx id 0 1431 lock_mode X waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000002; asc     ;; 1: len 6; hex 000000000595; asc       ;; 2: len 7; hex 0000000033056e; asc     3 n;;

------------------
---TRANSACTION 0 1425, ACTIVE 47 sec, process no 14585, OS thread id 140125902182160
2 lock struct(s), heap size 368, 4 row lock(s), undo log entries 3
MySQL thread id 6, query id 194 localhost root
Trx read view will not see trx with id >= 0 1426, sees < 0 1426

This looks like typical GAP locking. Per: http://dev.mysql.com/doc/refman/5.1/en/innodb-record-level-locks.html 

This works the identically if you insert i = 3-8 where no rows < 2 ever existed in the table.

1st login

mysql> truncate t; 
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t values (3),(4),(5),(6),(7),(8); 
Query OK, 6 rows affected (0.04 sec)
Records: 6  Duplicates: 0  Warnings: 0

2nd login

mysql> begin ; delete from t where i < 5; 
Query OK, 0 rows affected (0.00 sec)

Query OK, 2 rows affected (0.00 sec)

1st login

mysql> delete from t where i between 0 and 2; 
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

This looks like a bug to me. InnoDB is getting locks on rows that have been deleted. In the example makes that clear:
1) block purge
2) delete ~65,000 rows, commit
3) delete ~8,000 rows, don't commit

At this point the session holds 70,000+ row locks.

Why does InnoDB lock rows that are known to be deleted before this transaction starts?

Keep this open as a bug or as a feature request. In either case, this behavior is a huge problem for a few of my production servers. This behavior should be documented.

I do not like the workaround of using READ-COMMITTED as that forces me to use RBR. I have yet to confirm that RBR works in production for my workload.

The other workaround, innodb_locks_unsafe_for_binlog, might work but requires more investigation.

mysql> delete from t where i < 100000;
Query OK, 65536 rows affected (0.26 sec)

mysql> begin; delete from t where i < 140000;
Query OK, 0 rows affected (0.00 sec)
Query OK, 8945 rows affected (0.04 sec)

---TRANSACTION 161D, ACTIVE 17 sec, process no 11223, OS thread id 1221396800
112 lock struct(s), heap size 31160, 74592 row lock(s), undo log entries 8945
MySQL thread id 3, query id 32 localhost root

Thank you for the report.

Verified as described. Regular InnoDB affected as well.

Matthew's test case can be affected by gaps, but if modify it we can get shorter test case:

--source include/have_innodb_plugin.inc
--disable_abort_on_error

CREATE TABLE `t` (`i` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`i`)) ENGINE=InnoDB;
insert into t values (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);
insert into t values (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);
insert into t values (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);
insert into t values (NULL),(NULL),(NULL),(NULL),(NULL),(NULL);

connect (addconroot, localhost, root,,);
connection addconroot;
delete from t where i < 11; 
begin ; delete from t where i < 23; 

connection default;
delete from t where i < 2; 
--vertical_results
show engine innodb status;

connection addconroot;
commit;

connection default;
truncate t; 
insert into t values (5),(6),(7),(8); 

connection addconroot;
begin ;
delete from t where i < 8;

connection default;
delete from t where i between 0 and 2;

This (and Bug #45975) are duplicates of Bug #19762.