MySQL Bugs: #37328: Optimizer prefers ref access on multiple-column index over simple range access

Bug #37328	Optimizer prefers ref access on multiple-column index over simple range access
Submitted:	11 Jun 2008 4:50	Modified:	22 Oct 2012 8:13
Reporter:	Valeriy Kravchuk	Email Updates:
Status:	Won't fix	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.0.60, 5.0.64	OS:	Any
Assigned to:	Jørgen Løland	CPU Architecture:	Any

Description:
For some reason (incorrect estimation of rows affected?) optimizer prefers to use ref access using multiple-column index over range scan of simple index (or merge of them, but this is another old bug #23322). Look:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3308 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.0.60-enterprise-gpl-nt MySQL Enterprise Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show create table incidents\G
*************************** 1. row ***************************
       Table: incidents
Create Table: CREATE TABLE `incidents` (
  `i_id` int(11) NOT NULL default '0',
  `status_type` smallint(6) NOT NULL default '0',
  `closed` timestamp NULL default NULL,
  `queue_id` int(11) default NULL,
  UNIQUE KEY `incidents$i_id` (`i_id`),
  KEY `incidents$closed` (`closed`),
  KEY `incidents$queue_id` (`queue_id`),
  KEY `incidents$status_type$queue_id` (`status_type`,`queue_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> explain select count(*) from incidents where status_type IN (2) and queue
_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
 and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: incidents
         type: ref
possible_keys: incidents$closed,incidents$queue_id,incidents$status_type$queue_i
d
          key: incidents$status_type$queue_id
      key_len: 2
          ref: const
         rows: 78600
        Extra: Using where
1 row in set (0.06 sec)

mysql> analyze table incidents;
+----------------+---------+----------+----------+
| Table          | Op      | Msg_type | Msg_text |
+----------------+---------+----------+----------+
| test.incidents | analyze | status   | OK       |
+----------------+---------+----------+----------+
1 row in set (0.66 sec)

mysql> explain select count(*) from incidents where status_type IN (2) and queue
_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
 and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: incidents
         type: ref
possible_keys: incidents$closed,incidents$queue_id,incidents$status_type$queue_i
d
          key: incidents$status_type$queue_id
      key_len: 2
          ref: const
         rows: 78779
        Extra: Using where
1 row in set (0.47 sec)

Now, let's check how many rows really satisfy individual conditions:

mysql> select count(*) from incidents where status_type IN (2);
+----------+
| count(*) |
+----------+
|  1406501 |
+----------+
1 row in set (2.95 sec)

mysql> select count(closed) from incidents where queue_id IN (28,11,59,19,9,6,30
,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62);
+---------------+
| count(closed) |
+---------------+
|       1094212 |
+---------------+
1 row in set (2.06 sec)

mysql> select count(*) from incidents where status_type IN (2) and queue_id IN (
28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62);
+----------+
| count(*) |
+----------+
|  1094212 |
+----------+
1 row in set (1.24 sec)

So, rows estimation is incorrect and we have to scan million of rows based on index optimizer selected. Compare to this:

mysql> select count(*) from incidents where closed >= DATE_SUB(now(), INTERVAL 1
 MONTH);
+----------+
| count(*) |
+----------+
|    43648 |
+----------+
1 row in set (0.05 sec)

So, index on `closed` column is really much more appropriate. Lets try to force it and check execution time:

mysql> select count(closed) from incidents force index(incidents$closed) where s
tatus_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46
,17,20,21,49,50,35,7,55,62) and incidents.closed >= DATE_SUB(now(), INTERVAL 1 M
ONTH)\G
*************************** 1. row ***************************
count(closed): 40421
1 row in set (0.13 sec)

mysql> select count(closed) from incidents where status_type IN (2) and queue_id
 IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) an
d incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
count(closed): 40421
1 row in set (9.72 sec)

So, 74 times faster on this laptop. Similar results for select count(*):

mysql> select count(*) from incidents force index(incidents$closed) where status
_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,2
0,21,49,50,35,7,55,62) and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)
\G
*************************** 1. row ***************************
count(*): 40420
1 row in set (0.56 sec)

mysql> select count(*) from incidents where status_type IN (2) and queue_id IN (
28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) and inc
idents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
count(*): 40420
1 row in set (9.53 sec)

17 times faster.

This optimizer behaviour does NOT depend on storage engine used:

mysql> alter table incidents engine=MyISAM;
Query OK, 1418386 rows affected (54.94 sec)
Records: 1418386  Duplicates: 0  Warnings: 0

mysql> analyze table incidents;
+----------------+---------+----------+-----------------------------+
| Table          | Op      | Msg_type | Msg_text                    |
+----------------+---------+----------+-----------------------------+
| test.incidents | analyze | status   | Table is already up to date |
+----------------+---------+----------+-----------------------------+
1 row in set (0.00 sec)

mysql> explain select count(*) from incidents where status_type IN (2) and queue
_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
 and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: incidents
         type: ref
possible_keys: incidents$closed,incidents$queue_id,incidents$status_type$queue_i
d
          key: incidents$status_type$queue_id
      key_len: 2
          ref: const
         rows: 472795
        Extra: Using where
1 row in set (0.00 sec)

mysql> explain select count(closed) from incidents where status_type IN (2) and
queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,5
5,62) and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: incidents
         type: ref
possible_keys: incidents$closed,incidents$queue_id,incidents$status_type$queue_i
d
          key: incidents$status_type$queue_id
      key_len: 2
          ref: const
         rows: 472795
        Extra: Using where
1 row in set (0.02 sec)

mysql> select count(closed) from incidents where status_type IN (2) and queue_id
 IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) an
d incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
count(closed): 40409
1 row in set (4.72 sec)

mysql> select count(closed) from incidents force index (incidents$closed) where
status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,4
6,17,20,21,49,50,35,7,55,62) and incidents.closed >= DATE_SUB(now(), INTERVAL 1
MONTH)\G
*************************** 1. row ***************************
count(closed): 40407
1 row in set (0.17 sec)

Still optimizer selected plan resulted in 27 times slower execution.

How to repeat:
Load dump that is provided. Then:

analyze table incidents;

explain select count(*) from incidents 
where status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G

explain select count(closed) from incidents where status_type IN (2) 
and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G

select count(*) from incidents 
where status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) 
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH);

select count(*) from incidents force index(incidents$closed) 
where status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) 
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH);

select count(closed) from incidents 
where status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) 
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH);

select count(closed) from incidents force index(incidents$closed) 
where status_type IN (2) and queue_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) 
and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH);

Suggested fix:
Fix this. Workarounds with FORCE INDEX are unacceptable in too many cases.

Verified on 5.0.64, with both InnoDB and MyISAM storage engines.

Proper index is used on 5.1.24:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -P3310 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.1.24-rc-community-debug MySQL Community Server - Debug (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> alter table incidents engine=MyISAM;
Query OK, 1418386 rows affected (4 min 19.47 sec)
Records: 1418386  Duplicates: 0  Warnings: 0

mysql> explain select count(*) from incidents where status_type IN (2) and queue
_id IN (28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62)
 and incidents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: incidents
         type: range
possible_keys: incidents$closed,incidents$queue_id,incidents$status_type$queue_i
d
          key: incidents$closed
      key_len: 5
          ref: NULL
         rows: 104886
        Extra: Using where
1 row in set (0.14 sec)

mysql> select count(*) from incidents where status_type IN (2) and queue_id IN (
28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) and inc
idents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
count(*): 40324
1 row in set (1.03 sec)

mysql> select count(*) from incidents where status_type IN (2) and queue_id IN (
28,11,59,19,9,6,30,14,31,36,54,57,58,47,15,46,17,20,21,49,50,35,7,55,62) and inc
idents.closed >= DATE_SUB(now(), INTERVAL 1 MONTH)\G
*************************** 1. row ***************************
count(*): 40319
1 row in set (0.45 sec)

So, this is 5.0 only bug.

Issue already fixed in 5.1+. The fix will not be backported to 5.0

I am using 5.5.30 and 5.6.20, meet this problem also :(