Bug #69359 Optimizer fails to optimize expression of the form 'FOO' IS NULL
Submitted: 31 May 2013 2:31 Modified: 25 Oct 2018 0:37
Reporter: Archie Cobbs Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.5.28 OS:Linux
Assigned to: CPU Architecture:Any

[31 May 2013 2:31] Archie Cobbs
Description:
See

  http://stackoverflow.com/questions/16848190/mysql-why-isnt-foo-is-null-optimized-away

for a description. Quoting that here:

 I have two tables Person and Message and the latter has a foreign key to the former. Each table has id as the primary key column, and the Person table also has a column personId which is (uniquely) indexed.

The query below should take advantage of the personId key index, but instead MySQL requires scanning the entire Message table for some reason:

mysql> EXPLAIN SELECT `m`.*
    -> FROM
    ->   `Message` AS `m`
    -> LEFT JOIN
    ->   `Person` AS `p` ON (`m`.`person` = `p`.`id`)
    -> WHERE
    ->   'M002649397' IS NULL OR
    ->   `p`.`personId` = 'M002649397';
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows   | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
|  1 | SIMPLE      | m     | ALL    | NULL          | NULL    | NULL    | NULL           | 273220 |             |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 8       | pcom.m.person  |      1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
2 rows in set (0.00 sec)

But when I comment out the 'M002649397' IS NULL OR clause (which has no effect on the result), the query suddenly gets more efficient:

mysql> EXPLAIN SELECT `m`.*
    -> FROM
    ->   `Message` AS `m`
    -> LEFT JOIN
    ->   `Person` AS `p` ON (`m`.`person` = `p`.`id`)
    -> WHERE
    -> --  'M002649397' IS NULL OR
    ->   `p`.`personId` = 'M002649397';
+----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
| id | select_type | table | type  | possible_keys      | key                | key_len | ref   | rows | Extra       |
+----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
|  1 | SIMPLE      | p     | const | PRIMARY,personId   | personId           | 767     | const |    1 | Using index |
|  1 | SIMPLE      | m     | ref   | FK9C2397E7A0F6ED11 | FK9C2397E7A0F6ED11 | 9       | const |    3 | Using where |
+----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
2 rows in set (0.01 sec)

The bug is that the 'M002649397' IS NULL expression, which is always false, is not being optimized away.

How to repeat:
See description.

Suggested fix:
Optimize away 'FOO' IS NULL expressions.
[31 May 2013 14:16] MySQL Verification Team
Hi,

Please answer couple of questions.

Is 'M002649397' a column or just a string constant ??? It looks to me as it is a second case, but I need your confirmation.

Also, can you check how optimizer behaves when you replace:

'M002649397' IS NULL

with:

0 = 1

Also, try replacing your original condition by the bracketed one, like this:

('M002649397' IS NULL) OR ...

and let me know the result.

Most likely, this is just a small deficiency in the optimizer, which would make it a small, but legitimate feature request.

Many thanks in advance.
[31 May 2013 14:51] Archie Cobbs
1. Yes, 'M002649397' is a string constant, not a column name.

2. Same result in both cases:

mysql> EXPLAIN SELECT `m`.* FROM `Message` AS `m` LEFT JOIN `Person` AS `p` ON (`m`.`person` = `p`.`id`) WHERE ('M002649397' IS NULL) OR   `p`.`personId` = 'M002649397';
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows   | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
|  1 | SIMPLE      | m     | ALL    | NULL          | NULL    | NULL    | NULL           | 273643 |             |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 8       | pcom.m.person  |      1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
2 rows in set (0.00 sec)

mysql> EXPLAIN SELECT `m`.* FROM `Message` AS `m` LEFT JOIN `Person` AS `p` ON (`m`.`person` = `p`.`id`) WHERE (0 = 1) OR   `p`.`personId` = 'M002649397';
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref            | rows   | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
|  1 | SIMPLE      | m     | ALL    | NULL          | NULL    | NULL    | NULL           | 273643 |             |
|  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 8       | pcom.m.person  |      1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+----------------+--------+-------------+
2 rows in set (0.00 sec)
[31 May 2013 18:10] MySQL Verification Team
After careful analysis, this turns out to be a bug.

You can not have one execution plan for the condition:

WHERE (0 = 1) OR p.personId = 'string_constant';

and another execution plan for:

WHERE p.personId = 'string_constant';

because (0 = 1) always results in FALSE, which makes the above two queries 100 % identical.

You can see in the bug report itself that execution plan when (0 = 1) OR is present is much worse then the one where the expression is only the equality of a column to a constant.
[17 Sep 2013 2:16] James Day
This bug is in a development work set that typically results in a fix being included in the server within a few months. As usual, development hiccups and testing can produce unexpected delay so do not count on it until it is actually released.

James Day, MySQL Senior Principal Support Engineer, Oracle
[17 Sep 2013 2:28] James Day
I've also updated the severity from 5 to the 3 that we have actually been using for this bug.

James Day, MySQL Senior Principal Support Engineer, Oracle
[17 Sep 2013 14:12] Archie Cobbs
Thanks for the update.
[25 Oct 2018 0:37] Jon Stephens
Documented fix in the MySQL 8.0.14 changelog as follows:

        The presence of a condition which was always false stopped
        conversion of an outer join to an inner join, resulting in a
        suboptimal query plan.

Closed.
[25 Oct 2018 12:13] MySQL Verification Team
The fact that this bug is fixed in 8.0 only, indicates that it is hard to port it back.

5.5 is not maintained any more, so it does not receive any further changes.