Bug #67638 Odd optimizer choice for UPDATE .. ORDER BY ... LIMIT query
Submitted: 19 Nov 2012 17:00 Modified: 26 Nov 2012 17:24
Reporter: Sergey Petrunya Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.6.8 OS:Any
Assigned to: CPU Architecture:Any

[19 Nov 2012 17:00] Sergey Petrunya
Description:
I've got an UPDATE ... ORDER BY ... LIMIT N query with a small value of N=2. It still chooses to do a full table scan and examine 10K rows, instead of using an index.  

MySQL 5.5 will use an index for this example.  Is this change in behaviour intentional? 

How to repeat:
run this as .test file:
--
CREATE TABLE `ten` (
  `a` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `ten` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE TABLE `t10` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into t10 
select 
 A.a+10*B.a+100*C.a + 1000*D.a, 
 A.a+10*B.a+100*C.a + 1000*D.a 
from 
  ten A, ten B, ten C, ten D;

explain update t10 set b=10 order by a limit 2;
flush status;
update t10 set b=10 order by a limit 2;
show status like 'Handler%';

drop table ten, t10;
-- EOF --
and observe

explain update t10 set b=10 order by a limit 2;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t10	ALL	a	NULL	NULL	NULL	10000	Using filesort
flush status;
update t10 set b=10 order by a limit 2;
show status like 'Handler%';
Variable_name	Value
Handler_commit	1
Handler_delete	0
Handler_discover	0
Handler_external_lock	2
Handler_mrr_init	0
Handler_prepare	0
Handler_read_first	1
Handler_read_key	3
Handler_read_last	0
Handler_read_next	0
Handler_read_prev	0
Handler_read_rnd	2
Handler_read_rnd_next	10001
Handler_rollback	0
Handler_savepoint	0
Handler_savepoint_rollback	0
Handler_update	2
Handler_write	0
drop table ten, t10;

Run the same on 5.5 and see that index is used.
[19 Nov 2012 19:52] Erlend Dahl
Moved to the internal bug db for evaluation by the Optimizer team.
[20 Nov 2012 11:38] Umesh Shastry
Thank you for your report.
Looks like this issue has been fixed in 5.6.9-rc which is not yet released for community/download.

#### 5.5.28

[root@node-1 mysql-5.5.28]# bin/mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.28 MySQL Community Server (GPL)

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database defect67638;
Query OK, 1 row affected (0.00 sec)

mysql> use defect67638;
Database changed
mysql> CREATE TABLE `ten` (
    ->   `a` int(11) DEFAULT NULL
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO `ten` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.01 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE `t10` (
    ->   `a` int(11) DEFAULT NULL,
    ->   `b` int(11) DEFAULT NULL,
    ->   KEY `a` (`a`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.01 sec)

mysql>
mysql>
mysql> insert into t10
    -> select
    ->  A.a+10*B.a+100*C.a + 1000*D.a,
    ->  A.a+10*B.a+100*C.a + 1000*D.a
    -> from
    ->   ten A, ten B, ten C, ten D;
Query OK, 10000 rows affected (0.11 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> update t10 set b=10 order by a limit 2;
Query OK, 0 rows affected (0.03 sec)
Rows matched: 2  Changed: 0  Warnings: 0

mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 1     |
| Handler_read_key           | 3     |
| Handler_read_last          | 0     |
| Handler_read_next          | 1     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 2     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 2     |
| Handler_write              | 0     |
+----------------------------+-------+
16 rows in set (0.00 sec)

### mysql-5.6.9-rc - NOT yet released but seems the reported issue is fixed

[root@node-1 mysql-5.6.9]# bin/mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.9-rc-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial)

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
mysql> create database defect67638;
Query OK, 1 row affected (0.00 sec)

mysql> use defect67638;
Database changed
mysql> CREATE TABLE `ten` (
    ->   `a` int(11) DEFAULT NULL
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO `ten` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql>
mysql> CREATE TABLE `t10` (
    ->   `a` int(11) DEFAULT NULL,
    ->   `b` int(11) DEFAULT NULL,
    ->   KEY `a` (`a`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.04 sec)

mysql>
mysql>
mysql> insert into t10
    -> select
    ->  A.a+10*B.a+100*C.a + 1000*D.a,
    ->  A.a+10*B.a+100*C.a + 1000*D.a
    -> from
    ->   ten A, ten B, ten C, ten D;
Query OK, 10000 rows affected (0.22 sec)
Records: 10000  Duplicates: 0  Warnings: 0

mysql>
mysql> explain update t10 set b=10 order by a limit 2;
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra           |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
|  1 | SIMPLE      | t10   | ALL  | a             | a    | 5       | NULL |    2 | Using temporary |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
1 row in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> update t10 set b=10 order by a limit 2;
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql>
mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 2     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 1     |
| Handler_read_key           | 3     |
| Handler_read_last          | 0     |
| Handler_read_next          | 1     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 2     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 2     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.00 sec)

mysql>
[20 Nov 2012 12:35] Erlend Dahl
Not repeatable on latest trunk version (5.6.9).
[21 Nov 2012 7:14] Erlend Dahl
Re-opening since the optimizer team managed to reproduce on head of trunk.
[21 Nov 2012 8:59] Olav Sandstå
I think this change in execution plan is caused by introducing persistent statistics for InnoDB in 5.6 and making this the default way to update the statistic. Before InnoDB had persistent statistics, the statistic data about the table was updated by the user thread doing the insert. With persistent statistics enabled, the statistics is now normally updated by a background thread. 

If I do any of the following changes:

a. add ANALYZE TABLE t10 after inserting data
b. add sleep 10 to the test case after inserting data
c. disable persistent statistics by using --innodb-stats-persistent=0

I get the similar execution as in 5.5 for this query.

Thus, I think the reason for the change in execution plan is that when the thread doing the insert of data and immediately after running the update statement, the statistics for the table has not yet been updated. This causes the server to select a different execution plan. I have not looked into details about what the statistics is when running the update statement or the choices done by the optimizer.
[21 Nov 2012 10:25] Sergey Petrunya
Thanks for the explanation! This property of persistent index statistics is very interesting for anyone doing anything with the query optimizer.
[21 Nov 2012 10:31] Sergey Petrunya
So, I wanted to see what's going on. I've modified the original test case to run an EXPLAIN that would show both rec_per_key and table->stats.records statistics, like this:

...
insert into t10 
select 
 A.a+10*B.a+100*C.a + 1000*D.a, 
 A.a+10*B.a+100*C.a + 1000*D.a 
from 
  ten A, ten B, ten C, ten D;

explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
select sleep(10);
explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
[21 Nov 2012 10:34] Sergey Petrunya
... Then I ran it on 5.6 and I've got:
explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	A	ALL	NULL	NULL	NULL	NULL	10000	Using where
1	SIMPLE	B	ref	a	a	5	test.A.b	5000	NULL
select sleep(10);
sleep(10)
0
explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	A	ALL	NULL	NULL	NULL	NULL	10157	Using where
1	SIMPLE	B	ref	a	a	5	test.A.b	5000	NULL

and on 5.5, I've got:
explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	A	ALL	NULL	NULL	NULL	NULL	10266	
1	SIMPLE	B	ref	a	a	5	test.A.b	1	Using where
select sleep(10);
sleep(10)
0
explain select * from t10 A, t10 B FORCE INDEX (a) where A.b=B.a;
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	A	ALL	NULL	NULL	NULL	NULL	10266	
1	SIMPLE	B	ref	a	a	5	test.A.b	1	Using where
[21 Nov 2012 10:45] Sergey Petrunya
Observations: 

* In 5.6, statistics update doesn't make a huge difference.
#rows went from 10000 to 10157 (that is, from the precise value
to an imprecise one:).  Still, this was enough to change the query plan.

* What about rows=5000 in row B? this is rec_per_key statistics. The real
value is 1, and 5.5 sets it correctly. Is it considered a bug that 5.6
has an estimate that is so much off?  Btw, ANALYZE TABLE changes it to be 
correct.  It seems, ANALYZE is much more important now - without it, statistics is waaaay off.
[26 Nov 2012 15:18] Olav Sandstå
Thanks for sharing your observations. Here is what I think happens
when using the persistent statistics implemented for InnoDB i 5.6:

The first time the query is run is immediately after creating the
table and filling it with data. The filling of data into the table has
triggered a task for a background thread to update statistics for the
table. This has not been done at the time the query is first run.

First run of query: 

The server opens the table and creates table objects for the
query. When the table objects are created, the code retrieves
statistics from the storage engine for the total number of records and
rec_per_key. The estimate for the number of records seems to be based
on an "in memory counter" for the number of records inserted into this
table since there is no persistent statistics for this yet. The
rec_per_key statistics is based on the persistent statistics. At this
time since there is no persistent statistics, it sets rec_per_key =
number of records / 2 (which explains why you see 5000 as the record
estimate for the ref access).

Some time later the background thread in InnoDB updates the persistent
statistics for this table.

Second run of the query:

The table objects are cached so on the second execution of the query
the same table objects as in the first execution are re-used. In this
case we do only update information about the number of records which
is now read from the persistent statistics (which explains the change
from 10000 to 10057). The rec_per_key is not re-read from the storage
engine and thus we re-use the initial (very wrong) rec_per_key
estimate. I might be wrong but it seems like for rec_per_key we do not
update its value for the lifetime of the cached table object.

It seems to be necessary to force a close of open table objects in
order to get the rec_per_key estimate to be updated to use the value
in the persistent statistics in InnoDB. This explains why running
eg. flush table or analyze table "fixes" the problem.
[26 Nov 2012 17:24] Sergey Petrunya
Thanks for the explanations.  This means, optimizer bugs must be handled with greater care in 5.6+ - one has to make sure the table he's using already has got decent index statistics. (and keep in mind that having correct table statistics doesn't mean correct index statistics is good).
[27 Nov 2012 10:41] Vasil Dimov
Hello, the comment in "[21 Nov 8:59] Olav Sandstaa" summarizes the situation very well.