Bug #97418 MySQL chooses different execution plan in 5.7
Submitted: 29 Oct 2019 15:25 Modified: 28 Jan 4:10
Reporter: Vinodh Krish Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.7, 5.7.28 OS:Any
Assigned to: CPU Architecture:Any
Tags: optimizer; execution plan;, regression

[29 Oct 2019 15:25] Vinodh Krish
Description:
The MySQL 5.7 chooses different execution plan with JOIN in SELECT and UPDATE as follows. But the same pattern is not observed in MySQL 5.6 and doesn't need any workaround in 5.6.

Execution PLAN:
mysql> explain select t1.rec_id from tbl1 t1 INNER JOIN (select a.id,a.rec_id,s_date from tbl2 b , tbl1 a WHERE a.id_value2 = b.t_id and a.id = 6889877970355107670 order by s_date desc) t2 on (t1.id = t2.id and t1.rec_id = t2.rec_id); 
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref                   | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | ref    | PRIMARY       | PRIMARY | 8       | const                 |    6 |   100.00 | Using where |
|  1 | SIMPLE      | t1    | NULL       | eq_ref | PRIMARY       | PRIMARY | 12      | const,vinodh.a.rec_id |    1 |   100.00 | Using index |
|  1 | SIMPLE      | b     | NULL       | eq_ref | PRIMARY       | PRIMARY | 8       | vinodh.a.id_value2    |    1 |   100.00 | Using index |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)

The same inside UPDATE is different and table "b" takes precedence over "a" which causes the query to run for hours:
mysql> explain update tbl1 t1 INNER JOIN (select a.id,a.rec_id,b.s_date from tbl2 b , tbl1 a WHERE a.id_value2 = b.t_id and a.id = 6889877970355107670 order by s_date desc) t2 on (t1.id = t2.id and t1.rec_id = t2.rec_id)  SET t1.rec_id = @ROWNUM:= @ROWNUM+1;
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+-------+----------+----------------+
| id | select_type | table      | partitions | type   | possible_keys | key     | key_len | ref             | rows  | filtered | Extra          |
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+-------+----------+----------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL          | NULL    | NULL    | NULL            |  6002 |   100.00 | NULL           |
|  1 | UPDATE      | t1         | NULL       | eq_ref | PRIMARY       | PRIMARY | 12      | t2.id,t2.rec_id |     1 |   100.00 | NULL           |
|  2 | DERIVED     | b          | NULL       | ALL    | PRIMARY       | NULL    | NULL    | NULL            | 10004 |   100.00 | Using filesort |
|  2 | DERIVED     | a          | NULL       | ref    | PRIMARY       | PRIMARY | 8       | const           |     6 |    10.00 | Using where    |
+----+-------------+------------+------------+--------+---------------+---------+---------+-----------------+-------+----------+----------------+
4 rows in set (0.00 sec)

 
Tested the same in MySQL 5.6, but both SELECT and UPDATE shows similar execution plan and runs faster. From Query trace, it seems the optimizer uses the straight_join which causes the table b to take precedence over table a. please check.

mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
*************************** 1. row ***************************
                            QUERY: update tbl1 t1 INNER JOIN (select a.id,a.rec_id,b.s_date from tbl2 b , tbl1 a WHERE a.id_value2 = b.t_id and a.id = 6889877970355107670 order by s_date desc) t2 on (t1.id = t2.id and t1.rec_id = t2.rec_id)  SET t1.rec_id = @ROWNUM:= @ROWNUM+1
                            TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 2,
        "steps": [
          {
            "expanded_query": "/* select#2 */ select straight_join `a`.`id` AS `id`,`a`.`rec_id` AS `rec_id`,`b`.`s_date` AS `s_date` from `tbl2` `b` join `tbl1` `a` where ((`a`.`id_value2` = `b`.`t_id`) and (`a`.`id` = 6889877970355107670)) order by `b`.`s_date` desc"
          }
        ]
      }
    },
    {
      "derived": {
        "table": " `t2`",
        "select#": 2,
        "materialized": true
      }
    },
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`rec_id` AS `rec_id` from (`tbl1` `t1` join (/* select#2 */ select straight_join `a`.`id` AS `id`,`a`.`rec_id` AS `rec_id`,`b`.`s_date` AS `s_date` from `tbl2` `b` join `tbl1` `a` where ((`a`.`id_value2` = `b`.`t_id`) and (`a`.`id` = 6889877970355107670)) order by `b`.`s_date` desc) `t2` on(((`t1`.`id` = `t2`.`id`) and (`t1`.`rec_id` = `t2`.`rec_id`))))"
          },
          {
            "transformations_to_nested_joins": {
              "transformations": [
                "JOIN_condition_to_WHERE",
                "parenthesis_removal"
              ],
              "expanded_query": "/* select#1 */ select `t1`.`rec_id` AS `rec_id` from `tbl1` `t1` join (/* select#2 */ select straight_join `a`.`id` AS `id`,`a`.`rec_id` AS `rec_id`,`b`.`s_date` AS `s_date` from `tbl2` `b` join `tbl1` `a` where ((`a`.`id_value2` = `b`.`t_id`) and (`a`.`id` = 6889877970355107670)) order by `b`.`s_date` desc) `t2` where ((`t1`.`id` = `t2`.`id`) and (`t1`.`rec_id` = `t2`.`rec_id`))"
            }
          }
        ]
      }
    },

Please let me know if you need any other data.

How to repeat:
Create the tables as follows and load dummy records:

mysql> show create table tbl1\G
*************************** 1. row ***************************
       Table: tbl1
Create Table: CREATE TABLE `tbl1` (
  `id` bigint(16) NOT NULL,
  `rec_id` int(6) NOT NULL,
  `id_value2` bigint(16) DEFAULT NULL,
  PRIMARY KEY (`id`,`rec_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> show create table tbl2\G
*************************** 1. row ***************************
       Table: tbl2
Create Table: CREATE TABLE `tbl2` (
  `t_id` bigint(16) NOT NULL AUTO_INCREMENT,
  `s_date` datetime DEFAULT NULL,
  PRIMARY KEY (`t_id`)
) ENGINE=InnoDB AUTO_INCREMENT=10001 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> show table status like 'tbl1'\G
*************************** 1. row ***************************
           Name: tbl1
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 10100
 Avg_row_length: 53
    Data_length: 540672
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2019-10-29 15:01:08
    Update_time: 2019-10-29 14:37:31
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

mysql> show table status like 'tbl2'\G
*************************** 1. row ***************************
           Name: tbl2
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 10004
 Avg_row_length: 54
    Data_length: 540672
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 10001
    Create_time: 2019-10-29 15:02:11
    Update_time: 2019-10-29 14:38:47
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: 
        Comment: 
1 row in set (0.00 sec)

Check the SELECT and UPDATE query as mentioned above oil both MySQL 5.6 and 5.7

Suggested fix:
As a workaround, Change the query as follows and change the order of the tables a and b to get the right execution plan which runs within seconds.

update tbl1 t1 INNER JOIN (select a.id,a.rec_id,b.s_date from tbl1 a, tbl2 b  WHERE a.id_value2 = b.t_id and a.id = 6889877970355107670 order by s_date desc) t2 on (t1.id = t2.id and t1.rec_id = t2.rec_id)  SET t1.rec_id = @ROWNUM:= @ROWNUM+1;

+----+-------------+------------+------------+--------+---------------+---------+---------+--------------------+------+----------+----------------------------------------------+
| id | select_type | table      | partitions | type   | possible_keys | key     | key_len | ref                | rows | filtered | Extra                                        |
+----+-------------+------------+------------+--------+---------------+---------+---------+--------------------+------+----------+----------------------------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL    | NULL          | NULL    | NULL    | NULL               |    6 |   100.00 | NULL                                         |
|  1 | UPDATE      | t1         | NULL       | eq_ref | PRIMARY       | PRIMARY | 12      | t2.id,t2.rec_id    |    1 |   100.00 | NULL                                         |
|  2 | DERIVED     | a          | NULL       | ref    | PRIMARY       | PRIMARY | 8       | const              |    6 |   100.00 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | b          | NULL       | eq_ref | PRIMARY       | PRIMARY | 8       | vinodh.a.id_value2 |    1 |   100.00 | NULL                                         |
+----+-------------+------------+------------+--------+---------------+---------+---------+--------------------+------+----------+----------------------------------------------+
4 rows in set (0.00 sec)
[29 Oct 2019 15:29] Vinodh Krish
Optimizer plans for the UPDATE and SELECT

Attachment: optimizerplan_for_update.log (application/octet-stream, text), 21.78 KiB.

[29 Oct 2019 15:30] Vinodh Krish
Optimizer plans for the UPDATE and SELECT

Attachment: optimizerplan_for_select.log (application/octet-stream, text), 19.06 KiB.

[31 Oct 2019 12:12] Umesh Shastry
Hello Vinodh,

Thank you for the report.

regards,
Umesh
[31 Oct 2019 12:13] Umesh Shastry
5.6.46, 5.7.28 and 8.0.18 results

Attachment: 97418.results (application/octet-stream, text), 15.24 KiB.

[28 Jan 4:10] Jon Stephens
Documented fix as follows in the MySQL 5.7.30 and 8.0.20 changelogs:

    A multi-table UPDATE statement which updated a table joined to a
    derived table that joined two other tables was not optimized
    properly as it had been in MySQL 5.6, instead being treated as
    if STRAIGHT_JOIN had been used with the subquery creating the
    derived table.

Closed.