Bug #114996 Subquery with "order by" in mysql 8.0 is much slower than in mysql 5.7
Submitted: 14 May 10:09 Modified: 23 May 0:52
Reporter: 镇熙 林 Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.25, 8.0.37 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[14 May 10:09] 镇熙 林
Description:
Subquery with "order by" in mysql 8.0 is much slower than in mysql 5.7.

How to repeat:
drop table if exists t1;
create table t1( id bigint unsigned auto_increment primary key, x varchar(15), a1 varchar(20), key idx1 (x));
set @x:=0;
insert into t1(x,a1) select Ipad(mod(@x:=@x+1,90000),15,'0'),Ipad('0',20,'0') from (select 1 from information_schema.columns limit 1800) a, (select 1 from information_schema.columns limit 1800)b;
create table t2 (id bigint unsigned auto_increment primary key, x varchar(15), b1 varchar(20), unique key uk1 (x) );
drop table if exists t2;
set @x=0;
insert into t2(x,b1) select Ipad(@x:=@x+1,15,'0'),Ipad('0',20,'0') from (select 1 from information_schema.columns limit 300) a, (select 1 from information_schema.columns limit 300)b;

mysql 8.0.25:
explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
|  id  | select_type  | table | partitions | type   | possible_keys | key | key_len | ref            | rows   | filtered | Extra                                                                                   |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
|   1  | SIMPLE       | t2      | NULL        | index | uk1                 | uk1| 63         | NULL        | 89770 | 100.00 | Using where; Using index; Using temporary;Using filesort |
|   1  | SIMPLE       | t1      | NULL        | ref     | PRIMARY,idx1| idx1| 63         | linzx.t2.x   | 1         | 50.00   | Using index condition                                                        |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
-- much slower
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
5 rows in set (4.54 sec)

explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 limit 5;
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
|  id  | select_type  | table | partitions | type   | possible_keys | key | key_len | ref            | rows   | filtered | Extra                                  |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
|   1  | SIMPLE       | t2      | NULL        | index | uk1                 | uk1| 63         | NULL        | 89770 | 100.00 | Using where; Using index  |
|   1  | SIMPLE       | t1      | NULL        | ref     | PRIMARY,idx1| idx1| 63         | linzx.t2.x   | 1         | 50.00   | Using index condition       |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
-- faster without "order by"
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 limit 5;
500 rows in set (0.00 sec)

explain select * from t1 where exists (select /*+ NO_SEMIJOIN() */ 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|  id  | select_type                      | table | partitions  | type    | possible_keys | key           | key_len | ref             | rows      | filtered | Extra             |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|   1  | SIMPLE                            | t1      | NULL        | range | PRIMARY        | PRIMARY | 8           | NULL        | 1613275| 100.00 | Using where  |
|   2  | DEPENDENT SUBQUERY | t2      | NULL        | eq_ref | uk1                 | uk1          | 63         | linzx.t1.x   | 1            | 100.00 | Using index   |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
-- faster with NO_SEMIJOIN hints
select * from t1 where exists (select /*+ NO_SEMIJOIN() */ 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
500 rows in set (0.00 sec)

mysql 5.7.21:
explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|  id  | select_type                      | table | partitions  | type    | possible_keys | key           | key_len | ref             | rows      | filtered | Extra             |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|   1  | SIMPLE                            | t1      | NULL        | range | PRIMARY        | PRIMARY | 8           | NULL        | 1613275| 100.00 | Using where  |
|   2  | DEPENDENT SUBQUERY | t2      | NULL        | ref      | uk1                 | uk1          | 63         | linzx.t1.x    | 1            | 100.00 | Using index   |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
-- no slow issue in 5.7.21
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
500 rows in set (0.00 sec)
[21 May 7:05] MySQL Verification Team
Hello 镇熙 林,

Thank you for the report and feedback.
verified as described.

regards,
Umesh
[23 May 0:52] 镇熙 林
Under what circumstances will this problem be triggered? How to find out in advance? How to avoid it? Thanks!