Bug #19372 Optimizer does not use index anymore when WHERE index NOT IN () is added
Submitted: 26 Apr 2006 11:57 Modified: 17 Apr 2007 19:56
Reporter: Jochen Riehm Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.22-BK, 5.0.19 OS:Linux (Debian Linux, Kernel 2.6.8)
Assigned to: Georgi Kodinov

[26 Apr 2006 11:57] Jochen Riehm
Description:
In a simple Join on an int-key add WHERE key NOT IN (...) and the index is not used anymore. Run time is worse by orders of magnitude. See "How to repeat" for details.

How to repeat:
Create table index_where_not_in with attached dump.
Run following queries:
SELECT VERSION();
ANALYZE TABLE index_where_not_in;
EXPLAIN
SELECT COUNT(*)
FROM index_where_not_in f1 
INNER JOIN index_where_not_in f2 
  ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id ) 
WHERE 1
;
EXPLAIN
SELECT COUNT(*)
FROM index_where_not_in f1 
INNER JOIN index_where_not_in f2 
  ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id ) 
WHERE 1
AND f1.mnr_id NOT IN (11920035,784894,785671,111146418,111132393,111159094,111152706) 
;

Result is:
+---------------------+
| VERSION()           |
+---------------------+
| 5.0.19-standard-log |
+---------------------+
1 row in set (0.00 sec)

+---------------------------------+---------+----------+----------+
| Table                           | Op      | Msg_type | Msg_text |
+---------------------------------+---------+----------+----------+
| dbcarsharing.index_where_not_in | analyze | status   | OK       |
+---------------------------------+---------+----------+----------+
1 row in set (0.03 sec)

+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref                    | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
|  1 | SIMPLE      | f1    | index | sani_index    | sani_index | 8       | NULL                   | 54850 | Using index              |
|  1 | SIMPLE      | f2    | ref   | sani_index    | sani_index | 4       | dbcarsharing.f1.mnr_id |     5 | Using where; Using index |
+----+-------------+-------+-------+---------------+------------+---------+------------------------+-------+--------------------------+
2 rows in set (0.00 sec)

+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
| id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows  | Extra                    |
+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | f1    | index | sani_index    | sani_index | 8       | NULL | 54850 | Using where; Using index |
|  1 | SIMPLE      | f2    | index | sani_index    | sani_index | 8       | NULL | 41138 | Using where; Using index |
+----+-------------+-------+-------+---------------+------------+---------+------+-------+--------------------------+
2 rows in set (0.01 sec)

EXPLAIN should be the same for both queries.

Suggested fix:
Correct  cost estimate for range to more accurately reflect situation in this case.
[26 Apr 2006 11:57] Jochen Riehm
Table dump of test case

Attachment: index_where_not_in_table.sql.bz2 (application/x-bzip2, text), 190.24 KiB.

[26 Apr 2006 14:23] Valerii Kravchuk
Thank you for a bug report. Verified just as described on data you uploded, with 5.0.22-BK:

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.0.22    |
+-----------+
1 row in set (0.00 sec)

mysql> ANALYZE TABLE index_where_not_in;
+----------------------------+---------+----------+----------+
| Table                      | Op      | Msg_type | Msg_text |
+----------------------------+---------+----------+----------+
| db19372.index_where_not_in | analyze | status   | OK       |
+----------------------------+---------+----------+----------+
1 row in set (0.02 sec)

mysql> EXPLAIN
    -> SELECT COUNT(*)
    -> FROM index_where_not_in f1
    -> INNER JOIN index_where_not_in f2
    ->   ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id )
    -> WHERE 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: f1
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 54850
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: f2
         type: ref
possible_keys: sani_index
          key: sani_index
      key_len: 4
          ref: db19372.f1.mnr_id
         rows: 5
        Extra: Using where; Using index
2 rows in set (0.00 sec)

mysql> EXPLAIN
    -> SELECT COUNT(*)
    -> FROM index_where_not_in f1
    -> INNER JOIN index_where_not_in f2
    ->   ON ( f1.mnr_id=f2.mnr_id AND f1.id<f2.id )
    -> WHERE 1
    -> AND f1.mnr_id NOT IN
    -> (11920035,784894,785671,111146418,111132393,111159094,111152706)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: f1
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 54850
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: f2
         type: index
possible_keys: sani_index
          key: sani_index
      key_len: 8
          ref: NULL
         rows: 41138
        Extra: Using where; Using index
2 rows in set (0.01 sec)

It is a bug!
[10 Apr 2007 16:08] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/24190

ChangeSet@1.2435, 2007-04-10 19:08:08+03:00, gkodinov@magare.gmz +2 -0
  Bug #19372:
  Added a test case.
  The problem was fixed by the fix for bug #17379.
  The problem was that because of some conditions 
  the optimizer always preferred range or full index
  scan access methods to lookup access methods even
  when the latter were much cheaper.
[15 Apr 2007 16:48] Bugs System
Pushed into 5.1.18-beta
[15 Apr 2007 16:53] Bugs System
Pushed into 5.0.40
[17 Apr 2007 19:56] Paul Dubois
Noted in 5.0.40, 5.1.18 changelogs.