Bug #19372 Optimizer does not use index anymore when WHERE index NOT IN () is added
Submitted: 26 Apr 2006 13:57 Modified: 17 Apr 2007 21:56
Reporter: Jochen Riehm
Status: Closed
Category:Server: Optimizer Severity:S2 (Serious)
Version:5.0.22-BK, 5.0.19 OS:Linux (Debian Linux, Kernel 2.6.8)
Assigned to: Georgi Kodinov Target Version:

[26 Apr 2006 13:57] Jochen Riehm
Description:
In a simple Join on an int-key add WHERE key NOT IN (...) and the index is not used
anymore. Run time is worse by orders of magnitude. See "How to repeat" for details.

How to repeat:
Create table index_where_not_in with attached dump.
Run following queries:
SELECT VERSION();
ANALYZE TABLE index_where_not_in;
EXPLAIN
SELECT COUNT(*)
FROM index_where_not_in f1 
INNER JOIN index_where_not_in f2 
  ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id ) 
WHERE 1
;
EXPLAIN
SELECT COUNT(*)
FROM index_where_not_in f1 
INNER JOIN index_where_not_in f2 
  ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id ) 
WHERE 1
AND f1.mnr_id NOT IN (11920035,784894,785671,111146418,111132393,111159094,111152706) 
;

Result is:
+---------------------+
| VERSION()           |
+---------------------+
| 5.0.19-standard-log |
+---------------------+
1 row in set (0.00 sec)

+---------------------------------+---------+----------+----------+
| Table                           | Op      | Msg_type | Msg_text |
+---------------------------------+---------+----------+----------+
| dbcarsharing.index_where_not_in | analyze | status   | OK       |
+---------------------------------+---------+----------+----------+
1 row in set (0.03 sec)

+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref          
         | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
|  1 | SIMPLE      | f1    | index | sani_index    | sani_index | 8       | NULL         
         | 54850 | Using index              |
|  1 | SIMPLE      | f2    | ref   | sani_index    | sani_index | 4       |
dbcarsharing.f1.mnr_id |     5 | Using where; Using index |
+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
2 rows in set (0.00 sec)

+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows 
| Extra                    |
+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | f1    | index | sani_index    | sani_index | 8       | NULL | 54850
| Using where; Using index |
|  1 | SIMPLE      | f2    | index | sani_index    | sani_index | 8       | NULL | 41138
| Using where; Using index |
+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
2 rows in set (0.01 sec)

EXPLAIN should be the same for both queries.

Suggested fix:
Correct  cost estimate for range to more accurately reflect situation in this case.
[26 Apr 2006 13:57] Jochen Riehm
Table dump of test case

Attachment: index_where_not_in_table.sql.bz2 (application/x-bzip2, text), 190.24 KiB.

[26 Apr 2006 16:23] Valeriy Kravchuk
Thank you for a bug report. Verified just as described on data you uploded, with
5.0.22-BK:

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.0.22    |
+-----------+
1 row in set (0.00 sec)

mysql> ANALYZE TABLE index_where_not_in;
+----------------------------+---------+----------+----------+
| Table                      | Op      | Msg_type | Msg_text |
+----------------------------+---------+----------+----------+
| db19372.index_where_not_in | analyze | status   | OK       |
+----------------------------+---------+----------+----------+
1 row in set (0.02 sec)

mysql> EXPLAIN
    -> SELECT COUNT(*)
    -> FROM index_where_not_in f1
    -> INNER JOIN index_where_not_in f2
    ->   ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id )
    -> WHERE 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: f1
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 54850
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: f2
         type: ref
possible_keys: sani_index
          key: sani_index
      key_len: 4
          ref: db19372.f1.mnr_id
         rows: 5
        Extra: Using where; Using index
2 rows in set (0.00 sec)

mysql> EXPLAIN
    -> SELECT COUNT(*)
    -> FROM index_where_not_in f1
    -> INNER JOIN index_where_not_in f2
    ->   ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id )
    -> WHERE 1
    -> AND f1.mnr_id NOT IN
    -> (11920035,784894,785671,111146418,111132393,111159094,111152706)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: f1
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 54850
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: f2
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 41138
        Extra: Using where; Using index
2 rows in set (0.01 sec)

It is a bug!
[10 Apr 2007 18:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24190

ChangeSet@1.2435, 2007-04-10 19:08:08+03:00, gkodinov@magare.gmz +2 -0
  Bug #19372:
  Added a test case.
  The problem was fixed by the fix for bug #17379.
  The problem was that because of some conditions 
  the optimizer always preferred range or full index
  scan access methods to lookup access methods even
  when the latter were much cheaper.
[15 Apr 2007 18:48] Bugs System
Pushed into 5.1.18-beta
[15 Apr 2007 18:53] Bugs System
Pushed into 5.0.40
[17 Apr 2007 21:56] Paul DuBois
Noted in 5.0.40, 5.1.18 changelogs.