Bug #69721 | Block Nested Loop making things slower | ||
---|---|---|---|
Submitted: | 11 Jul 2013 4:55 | Modified: | 13 Apr 2020 23:47 |
Reporter: | Will Fong | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S3 (Non-critical) |
Version: | 5.6, 5.6.39, 5.7.21 | OS: | Any |
Assigned to: | Jon Stephens | CPU Architecture: | Any |
Tags: | block_nested_loop, left join, subquery |
[11 Jul 2013 4:55]
Will Fong
[11 Jul 2013 11:51]
MySQL Verification Team
Hello Will, Thank you for the bug report and the test case. Verified as described on recent 5.6.12. Thanks, Umesh
[30 May 2014 15:37]
Doug Cook
This bug has also been affecting us in customer-visible ways. From what we have seen, the optimizer decision to use a block nested loop or not depends on the state of the database; i.e. it is not consistent for a given query on a given database schema with a given set of indices. Thus, performance can be fine in testing, but after a few inserts and deletes in production, the optimizer may decide to use block nested loop, causing customer-facing functions to take minutes, instead of seconds, to execute. We've just seen this with a major customer. Yes, we can, and probably will, disable block nested loop as a workaround, but that may have other undesired performance effects. I have to imagine this is affecting other users in similarly ugly ways as well. Perhaps it is worth reconsidering the priority?
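The workaround mentioned above can be applied per session or server-wide via the optimizer_switch system variable; a minimal sketch (the global form requires the appropriate privilege and only affects new connections):

```sql
-- Disable Block Nested Loop for the current session only
SET SESSION optimizer_switch = 'block_nested_loop=off';

-- Or disable it server-wide (picked up by new connections)
SET GLOBAL optimizer_switch = 'block_nested_loop=off';
```

As noted, turning BNL off can have other, undesired performance effects, so it should be tested before being applied in production.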
[3 Jun 2014 21:23]
MySQL Verification Team
http://bugs.mysql.com/bug.php?id=72854 marked as duplicate of this one.
[8 Sep 2015 23:11]
Sam Ottenhoff
We have also come across this issue where the Block Nested Loop makes a double left-join query much slower. Our product is an open-source learning management system named Sakai, and we find this issue becomes apparent with 500k+ rows in the table and a double left join (a children-type query). We have only found this issue with one query so far:

select privatefor0_.*
from MFR_PRIVATE_FORUM_T privatefor0_
left outer join MFR_TOPIC_T topicsset1_ on privatefor0_.ID=topicsset1_.pf_surrogateKey
left outer join MFR_TOPIC_T childrenfo2_ on topicsset1_.ID=childrenfo2_.pt_surrogateKey
where privatefor0_.OWNER='9e4cc57c-91a1-462d-ae8c-574c357577a4'
and privatefor0_.surrogateKey=12883
order by childrenfo2_.created desc;

We are tracking our issue here: https://jira.sakaiproject.org/browse/SAK-29857
[9 Sep 2015 6:33]
Øystein Grøvlen
@Sam Ottenhoff: In order to say more about your query, we would need to see the query plan (EXPLAIN output). Even more useful would be the optimizer trace for this query, if you could upload it. (See https://dev.mysql.com/doc/internals/en/optimizer-tracing.html for how to obtain it.)

Another question is whether you really need to use LEFT JOIN here. Your query sorts on childrenfo2_.created. Does this mean that for every row in privatefor0_ there will be matching rows in childrenfo2_? If so, it is better to use inner joins, since the optimizer may then be able to choose more efficient join orders.
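For readers unfamiliar with it, obtaining the optimizer trace is a per-session procedure along these lines (a minimal sketch; the placeholder comment stands in for the query under investigation):

```sql
-- Enable optimizer tracing for this session
SET optimizer_trace = 'enabled=on';

-- ... run (or EXPLAIN) the query being investigated here ...

-- Read back the trace; a non-zero MISSING_BYTES_BEYOND_MAX_MEM_SIZE
-- means the trace was truncated and did not all fit in memory
SELECT TRACE, MISSING_BYTES_BEYOND_MAX_MEM_SIZE
  FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```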
[24 Sep 2015 14:09]
Sam Ottenhoff
optimizer-trace
Attachment: trace.txt (text/plain), 16.01 KiB.
[24 Sep 2015 14:11]
Sam Ottenhoff
Okay, here is the EXPLAIN with block nested loop on:

mysql> explain select * from MFR_PRIVATE_FORUM_T privatefor0_
left outer join MFR_TOPIC_T topicsset1_ on privatefor0_.ID=topicsset1_.pf_surrogateKey
left outer join MFR_TOPIC_T childrenfo2_ on topicsset1_.ID=childrenfo2_.pt_surrogateKey
where privatefor0_.OWNER='9e4cc57c-91a1-462d-ae8c-574c357577a4'
and privatefor0_.surrogateKey=12883;
+----+-------------+--------------+-------+--------------------------+-------------------+---------+-------------+--------+----------------------------------------------------+
| id | select_type | table        | type  | possible_keys            | key               | key_len | ref         | rows   | Extra                                              |
+----+-------------+--------------+-------+--------------------------+-------------------+---------+-------------+--------+----------------------------------------------------+
|  1 | SIMPLE      | privatefor0_ | const | OWNER,FKA9EE57548B5E2A2F | OWNER             | 776     | const,const |      1 | NULL                                               |
|  1 | SIMPLE      | topicsset1_  | ref   | FK863DC0BE82FAB29        | FK863DC0BE82FAB29 | 9       | const       |      4 | NULL                                               |
|  1 | SIMPLE      | childrenfo2_ | ALL   | FK863DC0BEFF3B3AE9       | NULL              | NULL    | NULL        | 636617 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+--------------+-------+--------------------------+-------------------+---------+-------------+--------+----------------------------------------------------+
3 rows in set (0.00 sec)

With block nested loop off:

+----+-------------+--------------+-------+--------------------------+--------------------+---------+----------------------------+--------+-------+
| id | select_type | table        | type  | possible_keys            | key                | key_len | ref                        | rows   | Extra |
+----+-------------+--------------+-------+--------------------------+--------------------+---------+----------------------------+--------+-------+
|  1 | SIMPLE      | privatefor0_ | const | OWNER,FKA9EE57548B5E2A2F | OWNER              | 776     | const,const                |      1 | NULL  |
|  1 | SIMPLE      | topicsset1_  | ref   | FK863DC0BE82FAB29        | FK863DC0BE82FAB29  | 9       | const                      |      4 | NULL  |
|  1 | SIMPLE      | childrenfo2_ | ref   | FK863DC0BEFF3B3AE9       | FK863DC0BEFF3B3AE9 | 9       | sakai_ubalt.topicsset1_.ID | 364528 | NULL  |
+----+-------------+--------------+-------+--------------------------+--------------------+---------+----------------------------+--------+-------+
3 rows in set (0.00 sec)
[24 Sep 2015 19:58]
Øystein Grøvlen
@Sam Ottenhoff: Thanks for the trace. Unfortunately, the default maximum size for the trace is too small to capture the entire trace for this query. You will need to increase the setting of optimizer_trace_max_mem_size in order to get the interesting parts. Note that if the column MISSING_BYTES_BEYOND_MAX_MEM_SIZE of the trace table is non-zero, not everything was captured. See https://dev.mysql.com/doc/internals/en/tracing-memory-usage.html for details.

The EXPLAIN output indicates that there are very few distinct values for childrenfo2_.pt_surrogateKey. Is that correct? It says there will be about 364528 hits for each key look-up. That is probably why the optimizer thinks that a table scan with block nested loop is better. If the actual number is far lower, that explains why the chosen plan is not optimal. I think either the statistics are way off, or the distribution of values is significantly skewed.
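Raising the trace buffer as suggested is a one-line session setting; a sketch (the 1 MiB value here is an arbitrary example, not a recommendation):

```sql
-- Raise the per-session trace buffer before tracing; the default
-- is often too small for multi-join queries
SET optimizer_trace_max_mem_size = 1048576;
SET optimizer_trace = 'enabled=on';

-- ... re-run the query, then verify the trace was captured in full:
SELECT MISSING_BYTES_BEYOND_MAX_MEM_SIZE
  FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;
```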
[3 Dec 2016 6:12]
Jervin R
Encountered this same bug on 5.7.14:

mysql [localhost] {msandbox} (test) > show create table a \G
*************************** 1. row ***************************
       Table: a
Create Table: CREATE TABLE `a` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `batchId` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_id_batchId` (`batchId`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=200046 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > show create table b \G
*************************** 1. row ***************************
       Table: b
Create Table: CREATE TABLE `b` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `batchPrepaidCardId` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=200046 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > show create table c \G
*************************** 1. row ***************************
       Table: c
Create Table: CREATE TABLE `c` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `processingResultDetails` varchar(400) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > explain
    -> SELECT SQL_NO_CACHE * FROM a
    -> use index (idx_id_batchId)
    -> inner join b on a.id=b.id
    -> left join c on b.batchPrepaidCardId = c.id
    -> where a.batchid = 1338412 and a.id > 3000 ORDER BY a.id ASC LIMIT 1000\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: range
possible_keys: idx_id_batchId
          key: idx_id_batchId
      key_len: 16
          ref: NULL
         rows: 99880
     filtered: 100.00
        Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: test.a.id
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
   partitions: NULL
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using where; Using join buffer (Block Nested Loop)
3 rows in set, 1 warning (0.01 sec)

Table c has only 1 column - if I drop the column processingResultDetails from that table, it switches into eq_ref. It's supposed to be a PRIMARY KEY access on the JOIN, and the optimizer should know this, which means it should never choose a full table scan?
[3 Dec 2016 6:13]
Jervin R
Sorry, typo - instead of "Table c has only 1 column", table c has only 1 row.
[6 Dec 2016 9:16]
Øystein Grøvlen
@Jervin R: Do you see any performance difference if you turn off block nested loop? (set optimizer_switch='block_nested_loop=off') I would not think that a table scan of a table with just one row should be any slower than a primary key look-up.
[6 Dec 2017 9:56]
Sigalit Hillel-Shelly
This issue affects us too, in a very ugly manner. A query may take 200 seconds instead of 1 second. Is there any ETA for fixing this?
[5 Jan 2018 9:04]
james mayn
Affects me too, on Magento 2 CE.
[22 Feb 2018 5:59]
MySQL Verification Team
-- 5.6.39
- default, i.e. block_nested_loop=on: 30 rows in set (15.60 sec)
- block_nested_loop=off:              30 rows in set (0.14 sec)

-- 5.7.21
- default, i.e. block_nested_loop=on: 30 rows in set (15.61 sec)
- block_nested_loop=off:              30 rows in set (0.14 sec)
[15 Mar 2018 10:32]
James Wilson
We are running MySQL 5.7.21 on Ubuntu 16.04 with SugarCRM 7.9. This is a real problem for us. We host SugarCRM instances on our hosting platform; we are testing MySQL 5.7 with a few instances and would like to upgrade all of them. However, this bug is causing SugarCRM to become unusable. All that is required is a largish table of Contacts or Accounts (the most commonly used record types in Sugar), and you will see this issue each time you click on the main list view for either of those tables. This bug causes the query to take over 200 seconds. We have tried turning block_nested_loop off, and this resolves the issue (the query runs in a few seconds), but turning this switch off introduces a side effect: MySQL starts consuming excessive numbers of tmp files and the system runs out of file handles. The cause of this side effect is unknown (somehow it is related to turning block_nested_loop off), but regardless, we can't turn block_nested_loop off. SugarCRM is a very common application and I am certain that this issue is affecting plenty of other users. We have no way to work around this bug. Please escalate this issue. I note that the duplicate bug was classified as "Severe" while this bug is classified as "Non-Critical"; I believe this bug should be reclassified as Severe.
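For cases like this, where disabling block_nested_loop globally causes side effects, a narrower alternative available in MySQL 5.7 and later is a per-statement optimizer hint; a sketch against placeholder table names (t1, t2 are hypothetical, not from this report):

```sql
-- NO_BNL disables Block Nested-Loop join only for the listed
-- tables and only for this one statement
SELECT /*+ NO_BNL(t1, t2) */ t1.*
FROM t1
LEFT JOIN t2 ON t1.id = t2.t1_id;
```

This limits the change to the offending query, though it does require being able to edit the SQL the application issues.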
[16 Apr 2018 20:20]
Matthew Marum
We have had several SugarCRM instances run into this bug. It is pretty debilitating.
[17 Dec 2018 15:08]
john kirby
Assigning this bug a 'non-critical' severity seems bizarre to me. You have queries that should be almost instantaneous running for 30 seconds. And the bug interferes with the performance of 3rd party apps using perfectly valid SQL.
[9 Sep 2019 17:16]
Rick James
Another example: https://stackoverflow.com/questions/57854233/mysql-chooses-completely-wrong-index
[20 Dec 2019 15:48]
Joe Murray
CiviCRM is an open source project providing CRM for over 10,000 non-profits around the world. We're having trouble with this bug in certain cases where groups of contacts are specified as mailing recipients. We found in one instance that a query took 20 minutes to execute. Disabling block_nested_loop, the same query executes in 5 minutes (still really bad, but 1/4 the time).
[13 Apr 2020 20:58]
Jon Stephens
If this is a documentation issue, the category should have changed to Documentation (otherwise Docs might never see it). Changing category, assigning to myself for resolution. (Please note that solution/workaround info was already provided in developer comment of 31 May 2018. Thanks!)
[13 Apr 2020 23:47]
Jon Stephens
Fixed in mysqldoc rev 65690. Closed.
[14 Apr 2020 2:44]
Mike Treadgold
Hi Jon, sorry, I can't see any comments from the developer from 31 May 2018. Are you able to provide details, or a link to the documentation page, please?
[8 Apr 2021 16:54]
teo teo
Why is this marked as a documentation issue??? Was the actual bug ever fixed?