Bug #89739 MySQL performs slowly due to missing basic optimisations
Submitted: 21 Feb 2018 2:36 Modified: 15 Jan 2020 0:49
Reporter: Yoseph Phillips Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:5.7.21 OS:Any
Assigned to: CPU Architecture:Any

[21 Feb 2018 2:36] Yoseph Phillips
Description:
MySQL performs slowly due to missing basic optimisations

We have clients using MySQL 5.7.18 EE having these issues, and we have also tested on 5.7.21 and the same issues exist.

When we run the explain on mcuh simplified forms of the script in the How to repeat section this is what we see:

# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
'1', 'SIMPLE', 't4', NULL, 'index', 'PRIMARY', 'PRIMARY', '4', NULL, '1', '100.00', 'Using index; Using temporary; Using filesort'
'1', 'SIMPLE', 't1', NULL, 'ALL', NULL, NULL, NULL, NULL, '5', '20.00', 'Using where; Using join buffer (Block Nested Loop)'
'1', 'SIMPLE', 't2', NULL, 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.t1.t2ID', '1', '100.00', 'Using where; Using index'
'1', 'SIMPLE', 't3', NULL, 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.t1.t3ID', '1', '100.00', 'Using where; Using index'

In our actual queries we have huge tables and so this is causing performance issues. As a workaround we have tested when certain parameters are NULL and then done a simpler query instead, however we should not need to do this.

1) Why is it even looking into t2, the ON condition is clearly FALSE and the WHERE condition is clearly TRUE without even knowing anything about data in t2. When we remove these from our queries it saves examing millions of extra rows (as we can see from the slow query log) and makes performance take milliseconds instead of minutes.

2) Likewise for table t3, the ON condition is clearly FALSE and the ORDER BY will never use t3.id. We noticed that performance was significantly improved by changing the ORDER BY to ORDER BY t1.id; 

3) I have not checked how much of a performance impact this is having but for the temporary table t4 it is showing 'Using index; Using temporary; Using filesort'. It is already a temporary table, with only a primary key column so why does it need to use filesort for this and what appears to be another temporary table?

How to repeat:
Run the following script:

DROP TABLE IF EXISTS table_1;

CREATE TABLE table_1(id INT NOT NULL, t2ID INT NOT NULL, t3ID INT NOT NULL, t4ID INT NOT NULL, PRIMARY KEY (id));

INSERT INTO table_1(id, t2ID, t3ID, t4ID) VALUES (1, 1, 1, 1), (2, 2, 2, 2), (3, 3, 3, 3), (4, 4, 4, 4), (5, 5, 5, 5);

DROP TABLE IF EXISTS table_2;

CREATE TABLE table_2(id INT NOT NULL, PRIMARY KEY (id));

INSERT INTO table_2(id) VALUES (1), (2), (3), (4), (5);

DROP TABLE IF EXISTS table_3;

CREATE TABLE table_3(id INT NOT NULL, PRIMARY KEY (id)) ;

INSERT INTO table_3(id) VALUES (1), (2), (3), (4), (5);

DROP TEMPORARY TABLE IF EXISTS temporary_table_4;

CREATE TEMPORARY TABLE temporary_table_4(id INT NOT NULL, PRIMARY KEY (id)); 

INSERT INTO temporary_table_4(id) VALUES (1);

EXPLAIN
SELECT t1.id
FROM table_1 t1
INNER JOIN temporary_t4 t4 ON t4.id = t1.t4ID
LEFT JOIN table_2 t2 ON FALSE AND t2.id = t1.t2ID
LEFT JOIN table_3 t3 ON FALSE AND t3.id = t1.t3ID
WHERE TRUE OR t2.id = 2
ORDER BY if(FALSE, t3.id, NULL), t1.id;
[21 Feb 2018 4:38] MySQL Verification Team
Hi Yoseph Phillips,

This is not a bug. The MySQL optimizer is not ideal but we are working on making it better all the time. 

Your "how to repeat" is rather broad and with so few columns optimizer will certainly not (nor it can) calculate the same path as with large tables. To discuss how to best optimize your particular query you have issues with please contact our MySQL Support team as they can give you consulting on your particular query, how to tweak it (and/or the server settings) to get best out of your system. Since you say you are using EE binaries this means you already have a subscription so this type of consulting is included in your contract.

best regards
Bogdan
[21 Feb 2018 11:39] Yoseph Phillips
I am not saying it is a bug as it is producing the correct results. It is more of an RFE and what should be simple to do with greater benefits compared to some of the other ones the Optimization team has been working on. 

We are using InnoDB and this small test case produces exactly the same execution plan as when we have huge tables with many columns which would be much harder to see what it is doing.

We have already put in a workaround where we check if those ON conditions etc. would be impossible and remove them. 

Basically our code ends up like:
if such and such is false then
  do simplified query
else
  do complex query

however we should not need to do things like that as MySQL could do that for us.

MySQL already often checks for Impossible ON conditions and optimises them away.

This is not the forum for discussing our client's experiences with using the MySQL EE Support Teams.

What I am suggesting is that MySQL executes the query in the description the same way it would execute:

SELECT t1.id
FROM table_1 t1
INNER JOIN temporary_t4 t4 ON t4.id = t1.t4ID
ORDER BY t1.id;

They both give exactly the same results, however this simplified takes milliseconds for us, whereas the complex version takes minutes.
(MySQL should know that FALSE AND anything is FALSE, and TRUE OR anything is TRUE, and then when it sees that the LEFT JOINs are not used anywhere it can safely ignore them).

Number 3 in the description ('Using index; Using
temporary; Using filesort') is not as important to us as these temporary tables tend to be small enough, it just seems strange that MySQL would be using filesort on a table that just has a single PRIMARY KEY column and nothing else. The table is already sorted.

Hopefully that clarifies things further.  

Thanks,
Yoseph
[22 Feb 2018 3:13] MySQL Verification Team
Hi,

It is clear what you wish but it is how optimizer works for now.
I changed the bug to "feature request" and verified it but that's about it. There are only certain places where the impossible conditions are optimized out, it's not done "everywhere" and, as I mention, we do constantly improve the optimizer so it will happen in future.

all best
Bogdan
[15 Jan 2020 0:49] Jon Stephens
Fixed together with BUG#8202 and BUG#97552 in MySQL 8.0.20. See the latter bug report for docs info.

Closed.