Bug #17259 Query optimizer chooses wrong index
Submitted: 9 Feb 2006 2:23 Modified: 3 Nov 2008 5:27
Reporter: Brad Green Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.15, 6.0.7 OS:Linux (Linux)
Assigned to: CPU Architecture:Any

[9 Feb 2006 2:23] Brad Green
Description:
I'm not certiain exactly what this relates to, but the query optimizer is selecting the worst possible index every time in the query below...

How to repeat:
We have a table described thusly:

CREATE TABLE `users` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `account_id` int(10) unsigned NOT NULL,
  `first_name` varchar(50) default NULL,
  `middle_name` varchar(50) default NULL,
  `last_name` varchar(100) default NULL,
  `home_address_1` varchar(150) default NULL,
  `home_city` varchar(75) default NULL,
  `home_state` char(2) default NULL,
  `home_postal_code` varchar(50) default NULL,
  `home_county` varchar(75) default NULL,
  `home_country` char(3) default NULL,
  `work_address_1` varchar(150) default NULL,
  `work_city` varchar(75) default NULL,
  `work_state` char(2) default NULL,
  `work_postal_code` varchar(50) default NULL,
  `work_county` varchar(75) default NULL,
  `work_country` char(3) default NULL,
  `login` varchar(50) NOT NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `login` (`login`,`account_id`),
  KEY `account_id` (`account_id`),
  KEY `user_home_country_indx` (`home_country`),
  KEY `user_work_country_indx` (`work_country`),
  KEY `user_home_state_indx` (`home_state`),
  KEY `user_work_state_indx` (`work_state`),
  KEY `user_home_city_indx` (`home_city`),
  KEY `user_work_city_indx` (`work_city`),
  KEY `user_first_name_indx` (`first_name`),
  KEY `user_last_name_indx` (`last_name`),
  CONSTRAINT `users_ibfk_1` FOREIGN KEY (`account_id`) REFERENCES `accounts` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

There are 265138 rows in this table, each with account_id = 1.  

The following query always uses the worst possible index (account_id):
select * from users where (home_state = 'ia' or work_state='ia') and account_id = 1;

Here is what I see when I run an EXPLAIN:
mysql> explain select * from users where (home_state = 'ia' or work_state='ia') and account_id = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ref
possible_keys: account_id,user_home_state_indx,user_work_state_indx
          key: account_id
      key_len: 4
          ref: const
         rows: 131627
        Extra: Using where

Only 283 rows should match a query "where (home_state='ia' or work_state='ia') and account_id=1".

We can use a query hint here, but we can't know when we should hint unless we keep statistics on this table ourselves.  Sometimes account_id will be the correct index to use (*_state='ca'), but for the case described above, it is not.

Suggested fix:
Unknown
[15 Feb 2006 14:41] Valeriy Kravchuk
Thank you for a problem report. Verified just as described with 5.0.19-BK:

mysql> CREATE TABLE `users` (
    ->   `id` int(10) unsigned NOT NULL auto_increment,
    ->   `account_id` int(10) unsigned NOT NULL,
    ->   `first_name` varchar(50) default NULL,
    ->   `middle_name` varchar(50) default NULL,
    ->   `last_name` varchar(100) default NULL,
    ->   `home_address_1` varchar(150) default NULL,
    ->   `home_city` varchar(75) default NULL,
    ->   `home_state` char(2) default NULL,
    ->   `home_postal_code` varchar(50) default NULL,
    ->   `home_county` varchar(75) default NULL,
    ->   `home_country` char(3) default NULL,
    ->   `work_address_1` varchar(150) default NULL,
    ->   `work_city` varchar(75) default NULL,
    ->   `work_state` char(2) default NULL,
    ->   `work_postal_code` varchar(50) default NULL,
    ->   `work_county` varchar(75) default NULL,
    ->   `work_country` char(3) default NULL,
    ->   `login` varchar(50) NOT NULL,
    ->   PRIMARY KEY  (`id`),
    ->   UNIQUE KEY `login` (`login`,`account_id`),
    ->   KEY `account_id` (`account_id`),
    ->   KEY `user_home_country_indx` (`home_country`),
    ->   KEY `user_work_country_indx` (`work_country`),
    ->   KEY `user_home_state_indx` (`home_state`),
    ->   KEY `user_work_state_indx` (`work_state`),
    ->   KEY `user_home_city_indx` (`home_city`),
    ->   KEY `user_work_city_indx` (`work_city`),
    ->   KEY `user_first_name_indx` (`first_name`),
    ->   KEY `user_last_name_indx` (`last_name`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.04 sec)

mysql> insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
Query OK, 1 row affected (0.01 sec)
...

mysql> insert into users(account_id, login, home_state, work_state) select 1, r
and(), 'ak', 'ak' from users;
Query OK, 24 rows affected (0.01 sec)
Records: 24  Duplicates: 0  Warnings: 0
...

mysql> select count(*) from users;
+----------+
| count(*) |
+----------+
|     6144 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from users where (home_state = 'ia' or work_state='ia')
and account_id = 1;
+----------+
| count(*) |
+----------+
|        6 |
+----------+
1 row in set (0.03 sec)

mysql> analyze table users;
+------------+---------+----------+----------+
| Table      | Op      | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.users | analyze | status   | OK       |
+------------+---------+----------+----------+
1 row in set (0.02 sec)

mysql> select count(*) from users where account_id = 1;
+----------+
| count(*) |
+----------+
|     6144 |
+----------+
1 row in set (0.01 sec)

mysql> explain select * from users where (home_state = 'ia' or work_state='ia')
 and account_id = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ref
possible_keys: account_id,user_home_state_indx,user_work_state_indx
          key: account_id
      key_len: 4
          ref: const
         rows: 3128
        Extra: Using where
1 row in set (0.00 sec)
[19 Oct 2006 7:09] Valeriy Kravchuk
Here is the entire test case, without "...":

CREATE TABLE `users` (
`id` int(10) unsigned NOT NULL auto_increment,
`account_id` int(10) unsigned NOT NULL,
`first_name` varchar(50) default NULL,
`middle_name` varchar(50) default NULL,
`last_name` varchar(100) default NULL,
`home_address_1` varchar(150) default NULL,
`home_city` varchar(75) default NULL,
`home_state` char(2) default NULL,
`home_postal_code` varchar(50) default NULL,
`home_county` varchar(75) default NULL,
`home_country` char(3) default NULL,
`work_address_1` varchar(150) default NULL,
`work_city` varchar(75) default NULL,
`work_state` char(2) default NULL,
`work_postal_code` varchar(50) default NULL,
`work_county` varchar(75) default NULL,
`work_country` char(3) default NULL,
`login` varchar(50) NOT NULL,
PRIMARY KEY  (`id`),
UNIQUE KEY `login` (`login`,`account_id`),
KEY `account_id` (`account_id`),
KEY `user_home_country_indx` (`home_country`),
KEY `user_work_country_indx` (`work_country`),
KEY `user_home_state_indx` (`home_state`),
KEY `user_work_state_indx` (`work_state`),
KEY `user_home_city_indx` (`home_city`),
KEY `user_work_city_indx` (`work_city`),
KEY `user_first_name_indx` (`first_name`),
KEY `user_last_name_indx` (`last_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) values(1, rand(), 'ia', 'ia');
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;
insert into users(account_id, login, home_state, work_state) select 1, rand(), 'ak', 'ak' from users;

analyze table users;

select count(*) from users where account_id = 1;
explain select * from users where (home_state = 'ia' or work_state='ia')
and account_id = 1\G

I've got in 5.0.27-BK:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ref
possible_keys: account_id,user_home_state_indx,user_work_state_indx
          key: account_id
      key_len: 4
          ref: const
         rows: 3128
        Extra: Using where
1 row in set (0.00 sec)

This estimation of rows is incorrect (yes I know you can not give correct one, but this case is OBVIOUS! All rows in table have account_id=1!).

Now let's try to execute:

select * from users where (home_state = 'ia' or work_state='ia')  and account_id = 1\G
...
*************************** 6. row ***************************
              id: 6
      account_id: 1
      first_name: NULL
     middle_name: NULL
       last_name: NULL
  home_address_1: NULL
       home_city: NULL
      home_state: ia
home_postal_code: NULL
     home_county: NULL
    home_country: NULL
  work_address_1: NULL
       work_city: NULL
      work_state: ia
work_postal_code: NULL
     work_county: NULL
    work_country: NULL
           login: 0.1604477094118
6 rows in set (0.12 sec)

While with a different index forced manually:

mysql> explain select * from users force index (user_home_state_indx) where (ho
me_state = 'ia' or work_state='ia')  and account_id = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ALL
possible_keys: user_home_state_indx
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6257
        Extra: Using where
1 row in set (0.00 sec)

We have even more incorrect estimation of rows (there are only 6 with home_state = 'ia', obviously), and it leads to ignoring this index, while it gives us better execution time (by wall clock):

mysql> select * from users force index (user_home_state_indx) where (home_state
 = 'ia' or work_state='ia')  and account_id = 1\G
...
*************************** 6. row ***************************
              id: 6
      account_id: 1
      first_name: NULL
     middle_name: NULL
       last_name: NULL
  home_address_1: NULL
       home_city: NULL
      home_state: ia
home_postal_code: NULL
     home_county: NULL
    home_country: NULL
  work_address_1: NULL
       work_city: NULL
      work_state: ia
work_postal_code: NULL
     work_county: NULL
    work_country: NULL
           login: 0.1604477094118
6 rows in set (0.05 sec)

mysql> select * from users where (home_state = 'ia' or work_state='ia')  and ac
count_id = 1\G
...
*************************** 6. row ***************************
              id: 6
      account_id: 1
      first_name: NULL
     middle_name: NULL
       last_name: NULL
  home_address_1: NULL
       home_city: NULL
      home_state: ia
home_postal_code: NULL
     home_county: NULL
    home_country: NULL
  work_address_1: NULL
       work_city: NULL
      work_state: ia
work_postal_code: NULL
     work_county: NULL
    work_country: NULL
           login: 0.1604477094118
6 rows in set (0.12 sec)

I know that "wall clock" does not matter for optimizer, but the only reason is a bug/problem in query cost estimation, once again poroved by this bug report. Fix it, eventually, please!
[22 Feb 2007 7:26] Igor Babaev
Unfortunately 5.0 and 5.1 even do not consider index merge access if there is a simple index access.
To make the optimizer use index merge for the reported query one has to disable
the index account_id:

mysql> explain select * from users where (home_state = 'ia' or work_state='ia') and account_id = 1;
+----+-------------+-------+------+------------------------------------------------------+------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys                                        | key        | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+------------------------------------------------------+------------+---------+-------+------+-------------+
|  1 | SIMPLE      | users | ref  | account_id,user_home_state_indx,user_work_state_indx | account_id | 4       | const | 2935 | Using where |
+----+-------------+-------+------+------------------------------------------------------+------------+---------+-------+------+-------------+
1 row in set (0.70 sec)

mysql> explain select * from users ignore index (account_id)
    ->   where (home_state = 'ia' or work_state='ia') and account_id = 1;
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
| id | select_type | table | type        | possible_keys                             | key                                       | key_len | ref  | rows | Extra                                                               |
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
|  1 | SIMPLE      | users | index_merge | user_home_state_indx,user_work_state_indx | user_home_state_indx,user_work_state_indx | 7,7     | NULL |   10 | Using union(user_home_state_indx,user_work_state_indx); Using where |
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select * from users use index (user_home_state_indx,user_work_state_indx)
    ->   where (home_state = 'ia' or work_state='ia') and account_id = 1;
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
| id | select_type | table | type        | possible_keys                             | key                                       | key_len | ref  | rows | Extra                                                               |
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
|  1 | SIMPLE      | users | index_merge | user_home_state_indx,user_work_state_indx | user_home_state_indx,user_work_state_indx | 7,7     | NULL |   10 | Using union(user_home_state_indx,user_work_state_indx); Using where |
+----+-------------+-------+-------------+-------------------------------------------+-------------------------------------------+---------+------+------+---------------------------------------------------------------------+
1 row in set (0.00 sec)

 mysql> select * from users ignore index (account_id)
    ->   where (home_state = 'ia' or work_state='ia') and account_id = 1;
+----+------------+------------+-------------+-----------+----------------+-----------+------------+------------------+-------------+--------------+----------------+-----------+------------+------------------+-------------+--------------+-------------------+
| id | account_id | first_name | middle_name | last_name | home_address_1 | home_city | home_state | home_postal_code | home_county | home_country | work_address_1 | work_city | work_state | work_postal_code | work_county | work_country | login             |
+----+------------+------------+-------------+-----------+----------------+-----------+------------+------------------+-------------+--------------+----------------+-----------+------------+------------------+-------------+--------------+-------------------+
|  1 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.088737075299711 |
|  2 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.29110233419677  |
|  3 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.18930047581838  |
|  4 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.073195407235246 |
|  5 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.79807563945472  |
|  6 |          1 | NULL       | NULL        | NULL      | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | NULL           | NULL      | ia         | NULL             | NULL        | NULL         | 0.7707920063015   |
+----+------------+------------+-------------+-----------+----------------+-----------+------------+------------------+-------------+--------------+----------------+-----------+------------+------------------+-------------+--------------+-------------------+
6 rows in set (0.00 sec)

This deficiency of the optimizer will be fixed in 6.0.
[3 Nov 2008 5:27] Valeriy Kravchuk
This problem still is not fixed in 6.0.7.
[22 Jun 2015 18:36] Morgan Tocker
5.6 is able to use index merge on this query.  The plan changes to:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: index_merge
possible_keys: account_id,user_home_state_indx,user_work_state_indx
          key: user_home_state_indx,user_work_state_indx
      key_len: 7,7
          ref: NULL
         rows: 12
        Extra: Using union(user_home_state_indx,user_work_state_indx); Using where
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (test) > select version();
+-----------+
| version() |
+-----------+
| 5.6.25    |
+-----------+
1 row in set (0.00 sec)