Bug #78412 UPDATE with subquery is NOT optimized the same way as equivalent SELECT
Submitted: 11 Sep 2015 16:17 Modified: 11 Dec 2015 14:13
Reporter: Valeriy Kravchuk Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S2 (Serious)
Version:5.6.24 OS:Any
Assigned to: Paul DuBois CPU Architecture:Any
Tags: explain, materialization, semi-join, UPDATE

[11 Sep 2015 16:17] Valeriy Kravchuk
Description:
It seems UPDATE with simple IN subquery is NOT optimized the same way as equivalent SELECT. Consider this simple example:

mysql> explain select c1 from t1 where t1.uid in (select uid from t2 where t2.sr
id = 1);
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref
 | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
|  1 | SIMPLE      | t2    | ref  | PRIMARY,srid  | srid | 4       | const
 |    1 | Using index |
|  1 | SIMPLE      | t1    | ref  | uid           | uid  | 5       | test.t2.uid
 |    1 | NULL        |
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
2 rows in set (0.00 sec)

mysql> explain update t1 set c1='a' where t1.uid in (select uid from t2 where t2
.srid = 1);
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
| id | select_type        | table | type            | possible_keys | key     |
key_len | ref        | rows | Extra                    |
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
|  1 | PRIMARY            | t1    | index           | NULL          | PRIMARY |
4       | NULL       |    3 | Using where              |
|  2 | DEPENDENT SUBQUERY | t2    | unique_subquery | PRIMARY,srid  | PRIMARY |
8       | func,const |    1 | Using index; Using where |
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
2 rows in set (0.00 sec)

and imagine that the same plan for UPDATE is used with a huge table t1, You end up with a very slow UPDATE that can not be executed faster without explicit rewrite.

How to repeat:
With default optimizer_switch settings for 5.6, create a couple of tables like these:

mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `uid` int(11) DEFAULT NULL,
  `c1` char(100) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `uid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `uid` int(11) NOT NULL DEFAULT '0',
  `srid` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`uid`,`srid`),
  KEY `srid` (`srid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select * from t1;
+----+------+------+
| id | uid  | c1   |
+----+------+------+
|  1 |    1 | a    |
|  2 |    2 | b    |
|  3 |    3 | c    |
+----+------+------+
3 rows in set (0.00 sec)

mysql> select * from t2;
+-----+------+
| uid | srid |
+-----+------+
|   1 |    1 |
+-----+------+
1 row in set (0.00 sec)

Then check execution plans and real Handler_% values for the following SELECT and UPDATE:

mysql> explain select c1 from t1 where t1.uid in (select uid from t2 where t2.sr
id = 1);
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref
 | rows | Extra       |
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
|  1 | SIMPLE      | t2    | ref  | PRIMARY,srid  | srid | 4       | const
 |    1 | Using index |
|  1 | SIMPLE      | t1    | ref  | uid           | uid  | 5       | test.t2.uid
 |    1 | NULL        |
+----+-------------+-------+------+---------------+------+---------+------------
-+------+-------------+
2 rows in set (0.00 sec)

mysql> explain update t1 set c1='a' where t1.uid in (select uid from t2 where t2
.srid = 1);
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
| id | select_type        | table | type            | possible_keys | key     |
key_len | ref        | rows | Extra                    |
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
|  1 | PRIMARY            | t1    | index           | NULL          | PRIMARY |
4       | NULL       |    3 | Using where              |
|  2 | DEPENDENT SUBQUERY | t2    | unique_subquery | PRIMARY,srid  | PRIMARY |
8       | func,const |    1 | Using index; Using where |
+----+--------------------+-------+-----------------+---------------+---------+-
--------+------------+------+--------------------------+
2 rows in set (0.00 sec)

mysql> flush status;
Query OK, 0 rows affected (0.04 sec)

mysql> select c1 from t1 where t1.uid in (select uid from t2 where t2.srid = 1);

+------+
| c1   |
+------+
| a    |
+------+
1 row in set (0.00 sec)

mysql> show status like 'Handler_%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 1     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 4     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 2     |
| Handler_read_last          | 0     |
| Handler_read_next          | 2     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.04 sec)

mysql> flush status;
Query OK, 0 rows affected (0.00 sec)

mysql> update t1 set c1='a' where t1.uid in (select uid from t2 where t2.srid =
1);
Query OK, 0 rows affected (0.11 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> show status like 'Handler_%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 4     |
| Handler_mrr_init           | 0     |
| Handler_prepare            | 2     |
| Handler_read_first         | 1     |
| Handler_read_key           | 4     |
| Handler_read_last          | 0     |
| Handler_read_next          | 0     |
| Handler_read_prev          | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_next      | 4     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_update             | 1     |
| Handler_write              | 0     |
+----------------------------+-------+
18 rows in set (0.00 sec)

Note that a lot more operations were used to execute UPDATE. Now slae this to millions of rows and enjoy...

Suggested fix:
Make sure subqueries are optimized for UPDATE and DELETE in the same way as they are for SELECT (including materialization, semi-join and all other strategies).

Until this is done, please, clearly explain in the manual what optimizations are NOT used for UPDATE and DELETE.
[14 Sep 2015 10:37] Roy Lyseng
Please see feature request 72895.
This is a known issue for single-table UPDATE and DELETE statements. For some queries, it is possible to workaround by converting to equivalent multi-table UPDATE and DELETE statements. For now, converting this into a documentation bug.
[14 Sep 2015 13:26] MySQL Verification Team
Thank you for the bug report.
[11 Dec 2015 14:13] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly.

Given that more complete subquery optimization for UPDATE/DELETE is a feature request, I've added this note to the Subquery Optimization section (e.g.,
http://dev.mysql.com/doc/refman/5.6/en/subquery-optimization.html):

A limitation on UPDATE and DELETE statements that use a subquery to
modify a single table is that the optimizer does not use semi-join or 
materialization subquery optimizations. As a workaround, try
rewriting them as multiple-table UPDATE and DELETE statements that
use a join rather than a subquery.
[13 Jan 2017 9:03] MySQL Verification Team
Bug #84492 marked as duplicate of this one