Bug #100257 Optimizer chooses wrong index for ORDER BY LIMIT
Submitted: 20 Jul 2020 3:57 Modified: 28 Jul 2020 2:27
Reporter: Matias Sánchez Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:8.0.21 OS:Any
Assigned to: CPU Architecture:Any
Tags: limit, Optimizer, order by

[20 Jul 2020 3:57] Matias Sánchez
Description:
Hi guys! We are in a process of migrating our data to mysql 8 last version, but we encountered a serious performance issue on queries that rely on order by - limit optimization. The queries that make use of a order by - limit clause instead of using the proper indexed columns for fetching fast top rows are executing a full table scan, but this is an optimizer non-so-deterministic behaviour.

How to repeat:
This is from a testcase that in these tables the plan is getting wrong when i add a simple column to the SELECT clause. 

Take a look at this explain plan

EXPLAIN
SELECT table_a.id,
       table_a.appUserId,
       table_a.fecha,
       table_a.failed,
       table_a.successful,
       table_a.warning,
       table_a.statusId,
       table_a.sname,
       table_b.name AS source,
       table_c.iispecid,
       table_c.column_a,
       table_c.column_b,
       table_d.ispecid,
       table_c.column_c
FROM table_a
LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
LEFT JOIN table_c ON (table_a.id = table_c.ipid)
LEFT JOIN table_d ON (table_a.id = table_d.ipid)
ORDER BY fecha
LIMIT 21;

+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+
| id | select_type | table   | partitions | type   | possible_keys | key     | key_len | ref                       | rows | filtered | Extra |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+
|  1 | SIMPLE      | table_a | NULL       | index  | NULL          | fecha   | 8       | NULL                      |   21 |   100.00 | NULL  |
|  1 | SIMPLE      | table_b | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.sourceId |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | table_c | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | table_d | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+

It's using perfectly the date index column.
But when i add a single column to the SELECT now the optimizer choose a full table scan (adding table_d.folderName):

EXPLAIN
SELECT table_a.id,
       table_a.appUserId,
       table_a.fecha,
       table_a.failed,
       table_a.successful,
       table_a.warning,
       table_a.statusId,
       table_a.sname,
       table_b.name AS source,
       table_c.iispecid,
       table_c.column_a,
       table_c.column_b,
       table_d.folderName,
       table_d.ispecid,
       table_c.column_c
FROM table_a
LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
LEFT JOIN table_c ON (table_a.id = table_c.ipid)
LEFT JOIN table_d ON (table_a.id = table_d.ipid)
ORDER BY fecha 
LIMIT 21;

+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+
| id | select_type | table   | partitions | type   | possible_keys | key     | key_len | ref                       | rows    | filtered | Extra                                      |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+
|  1 | SIMPLE      | table_a | NULL       | ALL    | NULL          | NULL    | NULL    | NULL                      | 1973233 |   100.00 | Using temporary; Using filesort            |
|  1 | SIMPLE      | table_b | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.sourceId |       1 |   100.00 | NULL                                       |
|  1 | SIMPLE      | table_c | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |       1 |   100.00 | NULL                                       |
|  1 | SIMPLE      | table_d | NULL       | ALL    | PRIMARY       | NULL    | NULL    | NULL                      |       1 |   100.00 | Using where; Using join buffer (hash join) |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+

This happens in lot on more complex examples but this little one shows easily how sensible is optimizer when changing for a full scan impacting on this order by -limit clauses.
We are on migration from mariadb source, but on Mariadb this is not happening with order by-limit clauses, that's why we rely a lot on them.
Any clue what's could be wrong?? or any workaround without the need of change these queries that are a tons of them, cause it's clear than forcing index on this cases will fix the issue but this is so demanding for our loads.
[20 Jul 2020 4:07] Matias Sánchez
Optimizer trace when using index for order by

Attachment: nice_plan.txt (text/plain), 13.61 KiB.

[20 Jul 2020 4:08] Matias Sánchez
Optimizer trace when not using index for order by (FULL SCAN)

Attachment: wrong_plan.txt (text/plain), 17.20 KiB.

[20 Jul 2020 4:10] Matias Sánchez
These are currently all table count

mysql> select count(*)
    -> from table_a;
+----------+
| count(*) |
+----------+
|  1994959 |
+----------+
1 row in set (0.11 sec)

mysql>
mysql> select count(*)
    -> from table_b;
+----------+
| count(*) |
+----------+
|        8 |
+----------+
1 row in set (0.00 sec)

mysql>
mysql> select count(*)
    -> from table_c;
+----------+
| count(*) |
+----------+
|  1994959 |
+----------+
1 row in set (0.09 sec)

mysql>
mysql> select count(*)
    -> from table_d;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
[20 Jul 2020 4:11] Matias Sánchez
Tested also on 8.0.19 and 8.0.20, and on 5.7.30 and this is happening in all versions.
[20 Jul 2020 13:36] Matias Sánchez
Whenever plan is correctly accessing by index, optimizer trace is showning 

            "reconsidering_access_paths_for_index_ordering": {\
              "clause": "ORDER BY",\
              "steps": [\
              ],\
              "index_order_summary": {\
                "table": "`table_a`",\
                "index_provides_order": true,\
                "order_direction": "asc",\
                "index": "fecha",\
                "plan_changed": true,\
                "access_type": "index"\
              }\
            }\
          },\
          {\

but for some reason this is not an optimizer step at all in the bad plan example shown.
[20 Jul 2020 14:07] MySQL Verification Team
Hi Mr. Sanchez,

Thank you for your bug report.

However, we do not think that this is a bug. This is the intended behaviour.

First of all, you have probably noticed that you are not used full joins, but two semi-joins.

In those circumstances, when there is an option to use a hash join, when you add a single column, which is probably a long string, it is much faster to use hash joins than to search by index. This is because, the added column is not a part of the index. This is probably some VARCHAR, so if you use UTF, it can be quite long. In such cases, hash joins are usually faster and they involve, in case of semi-joins, a full scan of the left-most table.

We are not sure that this should be even documented, since resolution of the semi-joins is not MySQL specific.

Not a bug.
[20 Jul 2020 16:30] Matias Sánchez
Hi, thanks for the fast reply on this matter!! I appreciative. 
This behavior as a must is quite worrying because on different scenarios there are regressions on time of this type of queries, as we are considering migrating from MariaDB to MySQL 8 we are used to this type of queries taking advantage and preferring index for a order by and low limit value. Maybe this happens cause different hash join implementation.
This example I shown is one scenario that added a column with varchar(200) and the semi join/hash join  is a feasible theory, but same happens when adding an id integer as well, the query working correctly is this one

EXPLAIN
SELECT table_a.id,
       table_a.appUserId,
       table_a.fecha,
       table_a.failed,
       table_a.successful,
       table_a.warning,
       table_a.statusId,
       table_a.sname,
       table_b.name AS source,
       table_c.iispecid,
       table_c.column_a,
       table_c.column_b,
       table_d.ispecid,
       table_c.column_c
FROM table_a
LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
LEFT JOIN table_c ON (table_a.id = table_c.ipid)
LEFT JOIN table_d ON (table_a.id = table_d.ipid)
ORDER BY fecha
LIMIT 21;

+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+
| id | select_type | table   | partitions | type   | possible_keys | key     | key_len | ref                       | rows | filtered | Extra |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+
|  1 | SIMPLE      | table_a | NULL       | index  | NULL          | fecha   | 8       | NULL                      |   21 |   100.00 | NULL  |
|  1 | SIMPLE      | table_b | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.sourceId |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | table_c | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |    1 |   100.00 | NULL  |
|  1 | SIMPLE      | table_d | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |    1 |   100.00 | NULL  |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+

But if I add instead table_d.id_error - integer value. 

EXPLAIN
SELECT table_a.id,
       table_a.appUserId,
       table_a.fecha,
       table_a.failed,
       table_a.successful,
       table_a.warning,
       table_a.statusId,
       table_a.sname,
       table_b.name AS source,
       table_c.iispecid,
       table_c.column_a,
       table_c.column_b,
       table_d.id_error,
       table_d.ispecid,
       table_c.column_c
FROM table_a 
LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
LEFT JOIN table_c ON (table_a.id = table_c.ipid)
LEFT JOIN table_d ON (table_a.id = table_d.ipid)
ORDER BY fecha 
LIMIT 21;

Plan changes drastically:

+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+
| id | select_type | table   | partitions | type   | possible_keys | key     | key_len | ref                       | rows    | filtered | Extra                                      |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+
|  1 | SIMPLE      | table_a | NULL       | ALL    | NULL          | NULL    | NULL    | NULL                      | 1979683 |   100.00 | Using temporary; Using filesort            |
|  1 | SIMPLE      | table_b | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.sourceId |       1 |   100.00 | NULL                                       |
|  1 | SIMPLE      | table_c | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | testcase.table_a.id       |       1 |   100.00 | NULL                                       |
|  1 | SIMPLE      | table_d | NULL       | ALL    | PRIMARY       | NULL    | NULL    | NULL                      |       1 |   100.00 | Using where; Using join buffer (hash join) |
+----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+

Also we can check that this new plan is not optimal at all cause when comparing with a forced index access this is result

Optimizer approach:

mysql> SELECT table_a.id,
    ->        table_a.appUserId,
    ->        table_a.fecha,
    ->        table_a.failed,
    ->        table_a.successful,
    ->        table_a.warning,
    ->        table_a.statusId,
    ->        table_a.sname,
    ->        table_b.name AS source,
    ->        table_c.iispecid,
    ->        table_c.column_a,
    ->        table_c.column_b,
    ->        table_d.id_error,
    ->        table_d.ispecid,
    ->        table_c.column_c
    -> FROM table_a
    -> LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
    -> LEFT JOIN table_c ON (table_a.id = table_c.ipid)
    -> LEFT JOIN table_d ON (table_a.id = table_d.ipid)
    -> ORDER BY fecha
    -> LIMIT 21;
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
| id | appUserId | fecha                      | failed | successful | warning | statusId | sname           | source | iispecid | column_a | column_b | id_error | ispecid | column_c |
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
...........................................................
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
21 rows in set (5.35 sec)

Index forced

mysql> SELECT table_a.id,
    ->        table_a.appUserId,
    ->        table_a.fecha,
    ->        table_a.failed,
    ->        table_a.successful,
    ->        table_a.warning,
    ->        table_a.statusId,
    ->        table_a.sname,
    ->        table_b.name AS source,
    ->        table_c.iispecid,
    ->        table_c.column_a,
    ->        table_c.column_b,
    ->        table_d.id_error,
    ->        table_d.ispecid,
    ->        table_c.column_c
    -> FROM table_a  force index(fecha)
    -> LEFT JOIN table_b ON (table_b.id = table_a.sourceId)
    -> LEFT JOIN table_c ON (table_a.id = table_c.ipid)
    -> LEFT JOIN table_d ON (table_a.id = table_d.ipid)
    -> ORDER BY fecha
    -> LIMIT 21;
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
| id | appUserId | fecha                      | failed | successful | warning | statusId | sname           | source | iispecid | column_a | column_b | id_error | ispecid | column_c |
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
..................
+----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+
21 rows in set (0.00 sec)

I'm uploading asap this testcase tables for reproduction if you wish also to test these as well.

Any idea how to workaround this optimizer behavior?, maybe i imagine "lowering priority" of hash join cost model or so to help optimizer solve faster these type queries. 

So grateful! 
Kind regards.
[20 Jul 2020 16:41] Matias Sánchez
Have just transferred testcase as mysql-bug-data-100257.dmp.gz to sftp.orcle.com:/support/incoming . This dump contains all tables with example data i've used to test queries shown on present case.
[21 Jul 2020 0:26] Øystein Grøvlen
A comment in the MySQL code says:

      filesort() and join cache are usually faster than reading in
      index order and not using join cache

So it is intentional that the LIMIT optimization is not used when join buffering is used.  However, as this example show, that join buffering is always faster is not a valid assumption, and I think that this should qualify as a feature request.

A possible work-around is to set optimizer_switch='block_nested_loop=off'.
[21 Jul 2020 12:37] MySQL Verification Team
Hi Mr. Sanchez,

This is strictly a forum for MySQL server bugs. We can not comment on products that are made by third parties.

You can try our server. Then, if you manage to find a combination of optimiser switches and directives that are available with our server, and that work faster then default behaviour, then we could consider making this report a feature request.

Otherwise, a behaviour that I described to you still stands.
[22 Jul 2020 11:55] Steinar Gunderson
Sinisa,

There are no semijoins here. And this not related to covering index scans at all. To me, this looks simply like just another case of our logic for using indexes for interesting orders is too simplistic, like Øystein pointed out.

Note that we do not have a cost model for hash joins; we plan them using the old BNL cost model.
[22 Jul 2020 13:15] MySQL Verification Team
Hi Mr. Sanchez,

After further analysis, it was concluded that there could be a way in which to make this type of queries faster. However, due to current limitations, this is a feature request that will be planned sometimes in the future.

The possible workaround is to use optimizer_switch='block_nested_loop=off'.

Verified as a feature request.
[28 Jul 2020 2:27] Matias Sánchez
Hi! Thanks you so much for the proposed workaround on this issue, after extensive tests on this types of queries in our load we were able to verify that this workaround it's working as expected. Feature request as detailed would be awesome as well, we had impact on this as we rely a lot on left joins with order by + limit optimizations.

Thanks you so much for your help!.

Regards
[28 Jul 2020 12:55] MySQL Verification Team
Hi Mr. Sanchez,

You are welcome .....