Bug #117409 join with crosse table sort is not optimized away by index
Submitted: 7 Feb 5:48 Modified: 7 Feb 6:11
Reporter: jia liu Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0.35, 8.0.41 OS:Any
Assigned to: CPU Architecture:Any
Tags: join, sort

[7 Feb 5:48] jia liu
Description:
when there is a sort that across table, there is always a sort.

How to repeat:
create table t1 (pk int auto_increment primary key,k int,v varchar(100),key (k));

create table t2 like t1;

insert into t1 values (null,1,rand());
insert into t1 values (null,2,rand());
insert into t1 values (null,3,rand());
insert into t1 values (null,10,rand());
insert into t1 select null,100,rand() from t1;
insert into t1 select null,100,rand() from t1;
insert into t1 select null,100,rand() from t1;
insert into t1 select null,100,rand() from t1;
insert into t1 select null,100,rand() from t1;
insert into t2 select * from t1;

explain select * from t1,t2 where t1.k=t2.k and t1.k<10 order by t1.k;
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref       | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | range | k             | k    | 5       | NULL      |    3 |   100.00 | Using index condition |
|  1 | SIMPLE      | t2    | NULL       | ref   | k             | k    | 5       | test.t1.k |    1 |   100.00 | NULL                  |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+-----------------------+

-- a sort on single table is fine

explain select * from t1,t2 where t1.k=t2.k and t1.k<10 order by t1.k,t2.k;
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref       | rows | filtered | Extra                                                  |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | range | k             | k    | 5       | NULL      |    3 |   100.00 | Using index condition; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | ref   | k             | k    | 5       | test.t1.k |    1 |   100.00 | NULL                                                   |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
explain select * from t1,t2 where t1.k=t2.k and t1.k<10 order by t1.k,t1.pk,t2.k;
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref       | rows | filtered | Extra                                                  |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | range | k             | k    | 5       | NULL      |    3 |   100.00 | Using index condition; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | ref   | k             | k    | 5       | test.t1.k |    1 |   100.00 | NULL                                                   |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
explain select * from t1,t2 where t1.k=t2.k and t1.k<10 order by t1.k,t1.pk,t2.k,t2.pk;
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref       | rows | filtered | Extra                                                  |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+
|  1 | SIMPLE      | t1    | NULL       | range | k             | k    | 5       | NULL      |    3 |   100.00 | Using index condition; Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | ref   | k             | k    | 5       | test.t1.k |    1 |   100.00 | NULL                                                   |
+----+-------------+-------+------------+-------+---------------+------+---------+-----------+------+----------+--------------------------------------------------------+

-- when sort across table, there is always sorted even if does not need.

-- and some extra test, a single row has some kind of optimize:
explain select * from t1,t2 where t1.k=t2.k and t1.k=10 order by t1.k,t2.k;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t1    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
explain select * from t1,t2 where t1.k=t2.k and t1.k=10 order by t1.k,t1.pk,t2.k;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t1    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+

-- and if more complex, if failed to optimize:
explain select * from t1,t2 where t1.k=t2.k and t1.k=10 order by t1.k,t1.pk,t2.k,t2.pk;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | ref  | k             | k    | 5       | const |    1 |   100.00 | NULL                            |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+

2 rows in set, 1 warning (0.00 sec)

Note (Code 1003): /* select#1 */ select `test`.`t1`.`pk` AS `pk`,`test`.`t1`.`k` AS `k`,`test`.`t1`.`v` AS `v`,`test`.`t2`.`pk` AS `pk`,`test`.`t2`.`k` AS `k`,`test`.`t2`.`v` AS `v` from `test`.`t1` join `test`.`t2` where ((`test`.`t1`.`k` = 10) and (`test`.`t2`.`k` = 10)) order by `test`.`t1`.`k`,`test`.`t1`.`pk`,`test`.`t2`.`pk`
-- the sql after rewrite, shows that t2.k is optimized away, and then lead to a sort?

-- it seems the same if multiple rows
insert into t1 values (null,10,rand());
insert into t2 values (null,10,rand());
explain select * from t1,t2 where t1.k=t2.k and t1.k=10 order by t1.k,t1.pk,t2.k;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t1    | NULL       | ref  | k             | k    | 5       | const |    2 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | NULL       | ref  | k             | k    | 5       | const |    2 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
explain select * from t1,t2 where t1.k=t2.k and t1.k=10 order by t1.k,t1.pk,t2.k,t2.pk;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra                           |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | k             | k    | 5       | const |    2 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | ref  | k             | k    | 5       | const |    2 |   100.00 | NULL                            |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+---------------------------------+

Suggested fix:
some across table sort operation in join is not needed
[7 Feb 6:11] MySQL Verification Team
Hello jia liu, Jean-François,

Thank you for the report and feedback.

regards,
Umesh