Bug #60026 Optimize for NULL parameter to IN()
Submitted: 9 Feb 2011 0:29 Modified: 9 Feb 2011 7:22
Reporter: Todd Farmer (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version: OS:Any
Assigned to: CPU Architecture:Any

[9 Feb 2011 0:29] Todd Farmer
Description:
The optimizer currently recognizes impossible WHERE conditions when using "WHERE idx_col IN (NULL)":

mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `a` int(11) NOT NULL,
  `b` int(11) NOT NULL,
  `c` int(11) default NULL,
  KEY `a` (`a`),
  KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> explain select * from t1 WHERE b IN (NULL)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra: Impossible WHERE noticed after reading const tables
1 row in set (0.00 sec)

However, it does not recognize that this condition (WHERE col IN (NULL)) will always return no matches:

mysql> explain select * from t1 WHERE c IN (NULL)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 49435
        Extra: Using where
1 row in set (0.00 sec)

This is because IN(NULL) matches nothing - even if the contents of the column are NULL (as is appropriate):

mysql> select count(*) from t1 WHERE c IS NULL;
+----------+
| count(*) |
+----------+
|      100 |
+----------+
1 row in set (0.03 sec)

The optimizer should recognize that this condition cannot be met, and needs to return no results without actually scanning data, which it does currently:

mysql> show status like 'handler_read_rnd_next';
+-----------------------+--------+
| Variable_name         | Value  |
+-----------------------+--------+
| Handler_read_rnd_next | 295300 |
+-----------------------+--------+
1 row in set (0.00 sec)

mysql> select * from t1 WHERE c IN (NULL);
Empty set (0.09 sec)

mysql> show status like 'handler_read_rnd_next';
+-----------------------+--------+
| Variable_name         | Value  |
+-----------------------+--------+
| Handler_read_rnd_next | 344455 |
+-----------------------+--------+
1 row in set (0.00 sec)

How to repeat:
See above

Suggested fix:
Make optimizer aware of IN (NULL) impossible condition.
[9 Feb 2011 7:22] Valeriy Kravchuk
Verified just as described:

C:\Program Files\MySQL\MySQL Server 5.1\bin>mysql -uroot -proot -P3312 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 5.5.8 MySQL Community Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> drop table t1;
Query OK, 0 rows affected (0.03 sec)

mysql> CREATE TABLE `t1` (
    ->   `a` int(11) NOT NULL,
    ->   `b` int(11) NOT NULL,
    ->   `c` int(11) default NULL,
    ->   KEY `a` (`a`),
    ->   KEY `b` (`b`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    -> ;
Query OK, 0 rows affected (0.19 sec)

mysql> insert into t1 values (1,1,1);
Query OK, 1 row affected (0.05 sec)

mysql> explain select * from t1 where b in (NULL);
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
 | Extra                                               |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL
 | Impossible WHERE noticed after reading const tables |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
1 row in set (0.01 sec)

mysql> explain select * from t1 where c in (NULL);
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
 | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    1
 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-------------+
1 row in set (0.03 sec)

Note that for some engines optimizer is able to process this case correctly even when column has no index:

mysql> alter table t1 engine=MyISAM;
Query OK, 1 row affected (0.23 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> explain select * from t1 where c in (NULL);
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
 | Extra                                               |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL
 | Impossible WHERE noticed after reading const tables |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+-----------------------------------------------------+
1 row in set (0.00 sec)