MySQL Bugs: #50953: Locked rows change in number, strangely.

Bug #50953	Locked rows change in number, strangely.
Submitted:	5 Feb 2010 18:40	Modified:	6 Feb 2010 8:08
Reporter:	Yoshiaki Tajika (Basic Quality Contributor)	Email Updates:
Status:	Verified	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	5.1.41, 5.1.44, 5.5.99	OS:	Any
Assigned to:		CPU Architecture:	Any

Description:
Locked rows change in number, strangely.

I tried to confirm that bug#41756 was fixed in 5.1.41 or later. 
The error message "unlock row could not find a 4 mode lock on the record" 
was successfully suppressed. But I found strange lock behavior.
I cannot summarize it well, so please see "how to repeat".

How to repeat:
// Start mysqld (I tried with 5.1.41 and 5.1.42 on Windows).
mysqld.exe --console --transaction-isolation=read-committed  --innodb_locks_unsafe_for_binlog=1 --log-bin --binlog-format=ROW --innodb-lock-wait-timeout=10

// Establish new connection #1, create tables, and populate them.
drop table if exists t1, t2;
create table t1(c1 int, c2 int primary key) engine=innodb;
insert t1 values(1,1),(null,2),(1,3),(1,4);
create table t2(c1 int, c2 int) engine=innodb;
insert t2 values (1,2),(1,2);

// Start transaction like below on conn #1.
begin;
select 1 from t1 natural join (select * from t2) as d for update;

// Establish another conn #2, and show innodb status.
// Then you'll find only 1 row is locked in the TRANSACTIONS section.
show engine innodb status \G

// And execute transaction like below on Conn #2.
// Lock timeout occurrs 10 seconds later, then rollback it explicitly.
begin; 
delete from t1;
rollback;

// Go back to conn #1, and rollback.
// After that, restart the same transaction again.
rollback;
begin;
select 1 from t1 natural join (select * from t2) as d for update;

// Go back to conn #2, and show innodb status.
// Here, you'll find 4 rows are locked.
show engine innodb status \G

Why does the number of locked rows change from 1 to 4,
though all transactions were once reset ?

I also tried another mysqld startup options, and got results below.

mysqld.exe --console --transaction-isolation=read-committed  --innodb_locks_unsafe_for_binlog=0 --log-bin --binlog-format=ROW --innodb-lock-wait-timeout=10
==> First time, 1 row is locked. Second time, 4 rows are locked. That is, the same results as above test case.

mysqld.exe --console --transaction-isolation=repeatable-read --innodb_locks_unsafe_for_binlog=0 --log-bin --binlog-format=ROW --innodb-lock-wait-timeout=10
==> First time, 1 row is locked. Second time, 5 rows are locked. Why is it 5 (neither 1 nor 4) ?

Suggested fix:
n/a

Thank you for the report.

Verified as described.

Sveta,

Please check whether this bug is duplicate of, or related to the bug #41756.

Thanks ...

Sinisa,

not, this is not duplicate of bug #41756. Also they could be not related at all, because this one repeatable with earlier versions.

My SQL Optimizer/QP should take a look at this issue.

The root cause seems due to two different plans generated for the same query in scenario described by the reporter.

The query plan before the "delete from t1" call is following:

mysql> explain select 1 from t1 natural join (select * from t2) as d for update;
+----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL |    2 |             |
|  1 | PRIMARY     | t1         | eq_ref | PRIMARY       | PRIMARY | 4       | d.c2 |    1 | Using where |
|  2 | DERIVED     | t2         | ALL    | NULL          | NULL    | NULL    | NULL |    2 |             |
+----+-------------+------------+--------+---------------+---------+---------+------+------+-------------+
3 rows in set (7.82 sec)

The query issued after the "delete from t1" call has following plan:

mysql>  explain select 1 from t1 natural join (select * from t2) as d for update;
+----+-------------+------------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+------------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | PRIMARY     | t1         | ALL  | PRIMARY       | NULL | NULL    | NULL |    1 |                                |
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using where; Using join buffer |
|  2 | DERIVED     | t2         | ALL  | NULL          | NULL | NULL    | NULL |    2 |                                |
+----+-------------+------------+------+---------------+------+---------+------+------+--------------------------------+
3 rows in set (7.93 sec)

In the first query, only one row in t1 is locked, here is the calling stack:

sel_set_rec_lock 
row_search_for_mysql 
ha_innobase::index_read
handler::index_read_map
join_read_key 
sub_select 
evaluate_join_record 
sub_select 
do_select 
JOIN::exec
mysql_select 

Table t1 is no long being scanned in subsequent loop, the temp table is checked instead:

sub_select()
{
  ...
  while (rc == NESTED_LOOP_OK)
  {
    error= info->read_record(info);
    rc= evaluate_join_record(join, join_tab, error);
  }
  ...
}

(gdb) p join_tab->read_record->table->s->path
$120 = {
  str = 0xacbc118 "/home/jyang/backup/repo/mysql-5.1/mysql-test/var/tmp/mysqld.1/#sql_685c_0", length = 73}

For the second query plan, table t1 is being reaccess in above loop, and thus all 4 rows in t1 are being read:

(gdb)  p join_tab->read_record->table->s->path
$129 = {str = 0xac9c9e8 "./test/t1", length = 9}

here is the actual calling stack that locks the 4 rows

sel_set_rec_lock
row_search_for_mysql 
ha_innobase::general_fetch
ha_innobase::rnd_next 
rr_sequential 
sub_select 
do_select
evaluate_join_record <===
sub_select 
do_select 
JOIN::exec
mysql_select

So for this query plan, in the final loop phase, all 4 t1 rows are read. While in the first query plan, the temp table is accessed instead of table t1.

This explains the difference of 1 row locked vs. 4 row locked. The optimizer or query generation team need to look into why two different plan is generate in this scenario.

Also, to explain why there is case that shows 5 rows locked, the one addition row is most likely the InnoDB internal row (supremum):

RECORD LOCKS space id 0 page no 50 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 0 1796 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Change the category to optimizer based on Jimmy's analysis.