MySQL Bugs: #114996: Subquery with "order by" in mysql 8.0 is much slower than in mysql 5.7

Bug #114996	Subquery with "order by" in mysql 8.0 is much slower than in mysql 5.7
Submitted:	14 May 2024 10:09	Modified:	24 Jun 2024 8:10
Reporter:	镇熙林	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	8.0.25, 8.0.37	OS:	Any
Assigned to:		CPU Architecture:	Any
Tags:	regression

Description:
Subquery with "order by" in mysql 8.0 is much slower than in mysql 5.7.

How to repeat:
drop table if exists t1;
create table t1( id bigint unsigned auto_increment primary key, x varchar(15), a1 varchar(20), key idx1 (x));
set @x:=0;
insert into t1(x,a1) select Ipad(mod(@x:=@x+1,90000),15,'0'),Ipad('0',20,'0') from (select 1 from information_schema.columns limit 1800) a, (select 1 from information_schema.columns limit 1800)b;
create table t2 (id bigint unsigned auto_increment primary key, x varchar(15), b1 varchar(20), unique key uk1 (x) );
drop table if exists t2;
set @x=0;
insert into t2(x,b1) select Ipad(@x:=@x+1,15,'0'),Ipad('0',20,'0') from (select 1 from information_schema.columns limit 300) a, (select 1 from information_schema.columns limit 300)b;

mysql 8.0.25:
explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
|  id  | select_type  | table | partitions | type   | possible_keys | key | key_len | ref            | rows   | filtered | Extra                                                                                   |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
|   1  | SIMPLE       | t2      | NULL        | index | uk1                 | uk1| 63         | NULL        | 89770 | 100.00 | Using where; Using index; Using temporary;Using filesort |
|   1  | SIMPLE       | t1      | NULL        | ref     | PRIMARY,idx1| idx1| 63         | linzx.t2.x   | 1         | 50.00   | Using index condition                                                        |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+-------------------------------------------------------------+
-- much slower
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
5 rows in set (4.54 sec)

explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 limit 5;
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
|  id  | select_type  | table | partitions | type   | possible_keys | key | key_len | ref            | rows   | filtered | Extra                                  |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
|   1  | SIMPLE       | t2      | NULL        | index | uk1                 | uk1| 63         | NULL        | 89770 | 100.00 | Using where; Using index  |
|   1  | SIMPLE       | t1      | NULL        | ref     | PRIMARY,idx1| idx1| 63         | linzx.t2.x   | 1         | 50.00   | Using index condition       |
+---+-------------+-----+----------+------+---------------+----+--------+----------+-------+-------+----------------------------+
-- faster without "order by"
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 limit 5;
500 rows in set (0.00 sec)

explain select * from t1 where exists (select /*+ NO_SEMIJOIN() */ 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|  id  | select_type                      | table | partitions  | type    | possible_keys | key           | key_len | ref             | rows      | filtered | Extra             |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|   1  | SIMPLE                            | t1      | NULL        | range | PRIMARY        | PRIMARY | 8           | NULL        | 1613275| 100.00 | Using where  |
|   2  | DEPENDENT SUBQUERY | t2      | NULL        | eq_ref | uk1                 | uk1          | 63         | linzx.t1.x   | 1            | 100.00 | Using index   |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
-- faster with NO_SEMIJOIN hints
select * from t1 where exists (select /*+ NO_SEMIJOIN() */ 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
500 rows in set (0.00 sec)

mysql 5.7.21:
explain select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|  id  | select_type                      | table | partitions  | type    | possible_keys | key           | key_len | ref             | rows      | filtered | Extra             |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
|   1  | SIMPLE                            | t1      | NULL        | range | PRIMARY        | PRIMARY | 8           | NULL        | 1613275| 100.00 | Using where  |
|   2  | DEPENDENT SUBQUERY | t2      | NULL        | ref      | uk1                 | uk1          | 63         | linzx.t1.x    | 1            | 100.00 | Using index   |
+---+--------------------------+-----+-----------+-------+---------------+----------+--------+-----------+---------+-------+--------------+
-- no slow issue in 5.7.21
select * from t1 where exists (select 1 from t2 where t2.x=t1.x) and id>2000000 order by id asc limit 5;
500 rows in set (0.00 sec)

Hello 镇熙 林,

Thank you for the report and feedback.
verified as described.

regards,
Umesh

Under what circumstances will this problem be triggered? How to find out in advance? How to avoid it? Thanks!

Posted by developer:
 
Below is the analysis and conclusion of the current behaviour.
 
Issue: MySQL 8.0 is slower than 5.7 for the given customer scenario.
Query: SELECT * FROM test1 WHERE EXISTS (SELECT 1 FROM test2 WHERE x2=x1) AND id > 3 ORDER BY id ASC LIMIT 5;
 
Scenario:
Creates table and add rows.
Runs query before index status available to optimizer leading to inefficient plan.
 
1. Reason for 8.0 plan is different from 5.7.
Mysql 8.0 uses Semijoin strategy to simplify the EXISTS clause leading to plan difference. If semijoin is disabled using below optimizer flag, customer will get the same 5.7 plan on 8.0 as well.
                        optimizer_switch='semijoin=off';
 
2. Reason for the 8.0 plan in the bug scenario with semijoin strategy slow?
In this instance, optimizer does not have the correct index stats to generate proper plan.

Background and Details:
MySQL uses TABLE and TABLE_SHARE objects to represent open table and its corresponding metadata during DDL/DML execution. Any TABLE objects created in a session are cached for usage within the session for performance improvement. And TABLE_SHARE objects containing metadata and index stats for the table are shared across sessions. InnoDB updates the records per key index estimates to the TABLE_SHARE whenever a new open table is done (happens during new TABLE object creation). So, when a new query uses a cached TABLE object, it will have the old records per key estimate from last open table time.
 
In this customer scenario, INSERT into table and SELECT are happening from same connection leading to optimizer reusing the TABLE object created before the records are inserted into the table. Hence optimizer assumes the table is having minimal records (1) for ref access and leading to a plan not efficient for the actual record count.
 
Table Cache Reference: https://dev.mysql.com/doc/refman/8.4/en/table-cache.html
 
How to prevent/rectify this issue?
Any action that will lead to clearing the cache or a new TABLE object creation will resolve this issue. Few are listed below:
Flush TABLE.
ANALYZE TABLE.
ANY DDL on the table.
Use of table twice in a query from same connection. Eg: EXPLAIN SELECT * FROM test1, test1 as t2 LIMIT 1.
Access the same table from a another/second connection. Etc.
 
Can this affect a production environment?
Not really. All the below conditions need to be met to hit this issue.
A drastic change in data distribution in the table to affect the records per key calculation.
There should be another plan that has less cost based to the old records/key value and performs bad enough to cause trouble.
The select should happen from a connection that has stayed open from before the data change happened.
No new connection should access the table after new records per key value is available.
No FLUSH TABLBE, ANALYZE table should happen.
Server TABLE cache should be big enough to have this TABLE object not kicked out.
Etc.
 
The only case I can think of is someone setting up a database and uses the same connection to run such a query. A simple FLUSH TABLE will solve the issue in that case.
A production environment is highly unlikely to encounter this issue. Even if a query hit this rare possibility, next connection should resolve it for all.

Considering above aspects and the current behaviour is as per design, closing this issue as not a bug.

Regards
  Ajo

This specific problem is due to stale stats in internal cache. See https://bugs.mysql.com/bug.php?id=106264

In an instance with fix of Bug#106264, enable the sync of internal cache by

set global innodb_stats_notify_change = 1;

then the join order becomes (t1,t2) and performance regression is gone. Explicit ANALYZE of involved tables after insert also gives (t1,t2).

The problem suggests taking Bug#106264 into next releases.