Bug #100257 | Optimizer chooses wrong index for ORDER BY LIMIT | ||
---|---|---|---|
Submitted: | 20 Jul 2020 3:57 | Modified: | 28 Jul 2020 2:27 |
Reporter: | Matias Sánchez | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S4 (Feature request) |
Version: | 8.0.21 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | limit, Optimizer, order by |
[20 Jul 2020 3:57]
Matias Sánchez
[20 Jul 2020 4:07]
Matias Sánchez
Optimizer trace when using index for order by
Attachment: nice_plan.txt (text/plain), 13.61 KiB.
[20 Jul 2020 4:08]
Matias Sánchez
Optimizer trace when not using index for order by (FULL SCAN)
Attachment: wrong_plan.txt (text/plain), 17.20 KiB.
[20 Jul 2020 4:10]
Matias Sánchez
These are currently all table count mysql> select count(*) -> from table_a; +----------+ | count(*) | +----------+ | 1994959 | +----------+ 1 row in set (0.11 sec) mysql> mysql> select count(*) -> from table_b; +----------+ | count(*) | +----------+ | 8 | +----------+ 1 row in set (0.00 sec) mysql> mysql> select count(*) -> from table_c; +----------+ | count(*) | +----------+ | 1994959 | +----------+ 1 row in set (0.09 sec) mysql> mysql> select count(*) -> from table_d; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.00 sec)
[20 Jul 2020 4:11]
Matias Sánchez
Tested also on 8.0.19 and 8.0.20, and on 5.7.30 and this is happening in all versions.
[20 Jul 2020 13:36]
Matias Sánchez
Whenever plan is correctly accessing by index, optimizer trace is showning "reconsidering_access_paths_for_index_ordering": {\ "clause": "ORDER BY",\ "steps": [\ ],\ "index_order_summary": {\ "table": "`table_a`",\ "index_provides_order": true,\ "order_direction": "asc",\ "index": "fecha",\ "plan_changed": true,\ "access_type": "index"\ }\ }\ },\ {\ but for some reason this is not an optimizer step at all in the bad plan example shown.
[20 Jul 2020 14:07]
MySQL Verification Team
Hi Mr. Sanchez, Thank you for your bug report. However, we do not think that this is a bug. This is the intended behaviour. First of all, you have probably noticed that you are not used full joins, but two semi-joins. In those circumstances, when there is an option to use a hash join, when you add a single column, which is probably a long string, it is much faster to use hash joins than to search by index. This is because, the added column is not a part of the index. This is probably some VARCHAR, so if you use UTF, it can be quite long. In such cases, hash joins are usually faster and they involve, in case of semi-joins, a full scan of the left-most table. We are not sure that this should be even documented, since resolution of the semi-joins is not MySQL specific. Not a bug.
[20 Jul 2020 16:30]
Matias Sánchez
Hi, thanks for the fast reply on this matter!! I appreciative. This behavior as a must is quite worrying because on different scenarios there are regressions on time of this type of queries, as we are considering migrating from MariaDB to MySQL 8 we are used to this type of queries taking advantage and preferring index for a order by and low limit value. Maybe this happens cause different hash join implementation. This example I shown is one scenario that added a column with varchar(200) and the semi join/hash join is a feasible theory, but same happens when adding an id integer as well, the query working correctly is this one EXPLAIN SELECT table_a.id, table_a.appUserId, table_a.fecha, table_a.failed, table_a.successful, table_a.warning, table_a.statusId, table_a.sname, table_b.name AS source, table_c.iispecid, table_c.column_a, table_c.column_b, table_d.ispecid, table_c.column_c FROM table_a LEFT JOIN table_b ON (table_b.id = table_a.sourceId) LEFT JOIN table_c ON (table_a.id = table_c.ipid) LEFT JOIN table_d ON (table_a.id = table_d.ipid) ORDER BY fecha LIMIT 21; +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+ | 1 | SIMPLE | table_a | NULL | index | NULL | fecha | 8 | NULL | 21 | 100.00 | NULL | | 1 | SIMPLE | table_b | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testcase.table_a.sourceId | 1 | 100.00 | NULL | | 1 | SIMPLE | table_c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testcase.table_a.id | 1 | 100.00 | NULL | | 1 | SIMPLE | table_d | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testcase.table_a.id | 1 | 100.00 | NULL | +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+------+----------+-------+ But if I add instead table_d.id_error - integer value. EXPLAIN SELECT table_a.id, table_a.appUserId, table_a.fecha, table_a.failed, table_a.successful, table_a.warning, table_a.statusId, table_a.sname, table_b.name AS source, table_c.iispecid, table_c.column_a, table_c.column_b, table_d.id_error, table_d.ispecid, table_c.column_c FROM table_a LEFT JOIN table_b ON (table_b.id = table_a.sourceId) LEFT JOIN table_c ON (table_a.id = table_c.ipid) LEFT JOIN table_d ON (table_a.id = table_d.ipid) ORDER BY fecha LIMIT 21; Plan changes drastically: +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+ | 1 | SIMPLE | table_a | NULL | ALL | NULL | NULL | NULL | NULL | 1979683 | 100.00 | Using temporary; Using filesort | | 1 | SIMPLE | table_b | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testcase.table_a.sourceId | 1 | 100.00 | NULL | | 1 | SIMPLE | table_c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testcase.table_a.id | 1 | 100.00 | NULL | | 1 | SIMPLE | table_d | NULL | ALL | PRIMARY | NULL | NULL | NULL | 1 | 100.00 | Using where; Using join buffer (hash join) | +----+-------------+---------+------------+--------+---------------+---------+---------+---------------------------+---------+----------+--------------------------------------------+ Also we can check that this new plan is not optimal at all cause when comparing with a forced index access this is result Optimizer approach: mysql> SELECT table_a.id, -> table_a.appUserId, -> table_a.fecha, -> table_a.failed, -> table_a.successful, -> table_a.warning, -> table_a.statusId, -> table_a.sname, -> table_b.name AS source, -> table_c.iispecid, -> table_c.column_a, -> table_c.column_b, -> table_d.id_error, -> table_d.ispecid, -> table_c.column_c -> FROM table_a -> LEFT JOIN table_b ON (table_b.id = table_a.sourceId) -> LEFT JOIN table_c ON (table_a.id = table_c.ipid) -> LEFT JOIN table_d ON (table_a.id = table_d.ipid) -> ORDER BY fecha -> LIMIT 21; +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ | id | appUserId | fecha | failed | successful | warning | statusId | sname | source | iispecid | column_a | column_b | id_error | ispecid | column_c | +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ ........................................................... +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ 21 rows in set (5.35 sec) Index forced mysql> SELECT table_a.id, -> table_a.appUserId, -> table_a.fecha, -> table_a.failed, -> table_a.successful, -> table_a.warning, -> table_a.statusId, -> table_a.sname, -> table_b.name AS source, -> table_c.iispecid, -> table_c.column_a, -> table_c.column_b, -> table_d.id_error, -> table_d.ispecid, -> table_c.column_c -> FROM table_a force index(fecha) -> LEFT JOIN table_b ON (table_b.id = table_a.sourceId) -> LEFT JOIN table_c ON (table_a.id = table_c.ipid) -> LEFT JOIN table_d ON (table_a.id = table_d.ipid) -> ORDER BY fecha -> LIMIT 21; +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ | id | appUserId | fecha | failed | successful | warning | statusId | sname | source | iispecid | column_a | column_b | id_error | ispecid | column_c | +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ .................. +----+-----------+----------------------------+--------+------------+---------+----------+-----------------+--------+----------+----------+----------+----------+---------+----------+ 21 rows in set (0.00 sec) I'm uploading asap this testcase tables for reproduction if you wish also to test these as well. Any idea how to workaround this optimizer behavior?, maybe i imagine "lowering priority" of hash join cost model or so to help optimizer solve faster these type queries. So grateful! Kind regards.
[20 Jul 2020 16:41]
Matias Sánchez
Have just transferred testcase as mysql-bug-data-100257.dmp.gz to sftp.orcle.com:/support/incoming . This dump contains all tables with example data i've used to test queries shown on present case.
[21 Jul 2020 0:26]
Øystein Grøvlen
A comment in the MySQL code says: filesort() and join cache are usually faster than reading in index order and not using join cache So it is intentional that the LIMIT optimization is not used when join buffering is used. However, as this example show, that join buffering is always faster is not a valid assumption, and I think that this should qualify as a feature request. A possible work-around is to set optimizer_switch='block_nested_loop=off'.
[21 Jul 2020 12:37]
MySQL Verification Team
Hi Mr. Sanchez, This is strictly a forum for MySQL server bugs. We can not comment on products that are made by third parties. You can try our server. Then, if you manage to find a combination of optimiser switches and directives that are available with our server, and that work faster then default behaviour, then we could consider making this report a feature request. Otherwise, a behaviour that I described to you still stands.
[22 Jul 2020 11:55]
Steinar Gunderson
Sinisa, There are no semijoins here. And this not related to covering index scans at all. To me, this looks simply like just another case of our logic for using indexes for interesting orders is too simplistic, like Øystein pointed out. Note that we do not have a cost model for hash joins; we plan them using the old BNL cost model.
[22 Jul 2020 13:15]
MySQL Verification Team
Hi Mr. Sanchez, After further analysis, it was concluded that there could be a way in which to make this type of queries faster. However, due to current limitations, this is a feature request that will be planned sometimes in the future. The possible workaround is to use optimizer_switch='block_nested_loop=off'. Verified as a feature request.
[28 Jul 2020 2:27]
Matias Sánchez
Hi! Thanks you so much for the proposed workaround on this issue, after extensive tests on this types of queries in our load we were able to verify that this workaround it's working as expected. Feature request as detailed would be awesome as well, we had impact on this as we rely a lot on left joins with order by + limit optimizations. Thanks you so much for your help!. Regards
[28 Jul 2020 12:55]
MySQL Verification Team
Hi Mr. Sanchez, You are welcome .....