Description:
In some cases optimizer wrongly assumes that index on column used in (column = value) condition is better than primary key used in (pk <= other_value) condition. For example:
mysql> explain select count(*) from market2centralinqueueobject where ordercounter <= 300000 and receivestate = 5;
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| 1 | SIMPLE | market2centralinqueueobject | ref | PRIMARY,IDXA_status |
IDXA_status | 3 | const | 1636 | Using where; Using index |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
1 row in set (0.01 sec)
While:
mysql> select count(*) from market2centralinqueueobject where ordercounter <= 300000;
+----------+
| count(*) |
+----------+
| 1213 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from market2centralinqueueobject where receivestate = 5;
+----------+
| count(*) |
+----------+
| 4081 |
+----------+
1 row in set (0.01 sec)
In practice this often leads to wery slow query execution on large data sets.
FORCE INDEX(PRIMARY) workaround wors, but it is not always possible/easy to change queries (that may be generated by some frontend).
How to repeat:
Create the following table:
CREATE TABLE `market2centralinqueueobject` (
`ackId` bigint(20) default NULL,
`businessEventId` int(11) default NULL,
`countryNumber` smallint(6) default NULL,
`created` varchar(26) default NULL,
`marketNumber` smallint(6) default NULL,
`orderCounter` bigint(20) NOT NULL default '0',
`payload` longblob,
`priority` smallint(6) default NULL,
`receiveState` smallint(6) default NULL,
`wholeId` bigint(20) default NULL,
`zID` bigint(20) default NULL,
PRIMARY KEY (`orderCounter`),
UNIQUE KEY `ackId` (`ackId`,`marketNumber`,`countryNumber`),
KEY `IDX_M2C_PROCESSSELECT` (`marketNumber`,`countryNumber`,`receiveState`,`priority`,`orderCounter`),
KEY `IDXA_businessEventId` (`businessEventId`),
KEY `IDXA_marketNumber` (`marketNumber`),
KEY `IDXA_status` (`receiveState`,`marketNumber`,`countryNumber`,`created`),
KEY `IDXA_zID` (`zID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Populate it with some (semi-random) data (see also file with my data uploaded):
insert into market2centralinqueueobject(ordercounter, receivestate)
values(rand()*1000000, 5);
insert ignore into market2centralinqueueobject(ordercounter, receivestate) select rand()*1000000, receivestate from market2centralinqueueobject;
...
mysql> insert ignore into market2centralinqueueobject(ordercounter, receivestat
e) select rand()*1000000, receivestate from market2centralinqueueobject;
Query OK, 256 rows affected (0.15 sec)
Records: 256 Duplicates: 0 Warnings: 0
mysql> explain select count(*) from market2centralinqueueobject where ordercoun
ter <= 100 and receivestate= 5;
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
| 1 | SIMPLE | market2centralinqueueobject | range | PRIMARY,IDXA_status |
PRIMARY | 8 | NULL | 1 | Using where |
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
1 row in set (0.01 sec)
OK, reasonable index is used for now, let's continue:
insert ignore into market2centralinqueueobject(ordercounter, receivestate) values(0, 3);
insert ignore into market2centralinqueueobject(ordercounter, receivestate) select rand()*1000000, receivestate from market2centralinqueueobject;
insert ignore into market2centralinqueueobject(ordercounter, receivestate) select rand()*1000000, receivestate from market2centralinqueueobject;
...
Let's stop with this number of rows:
mysql> select count(*) from market2centralinqueueobject;
+----------+
| count(*) |
+----------+
| 4089 |
+----------+
1 row in set (0.01 sec)
Now:
mysql> explain select count(*) from market2centralinqueueobject where ordercoun
ter <= 100 and receivestate= 5;
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
| 1 | SIMPLE | market2centralinqueueobject | range | PRIMARY,IDXA_status |
PRIMARY | 8 | NULL | 2 | Using where |
+----+-------------+-----------------------------+-------+---------------------+
---------+---------+------+------+-------------+
1 row in set (0.00 sec)
OK, proper index is used, but:
mysql> explain select count(*) from market2centralinqueueobject where ordercoun
ter <= 300000 and receivestate= 5;
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| 1 | SIMPLE | market2centralinqueueobject | ref | PRIMARY,IDXA_status |
IDXA_status | 3 | const | 1636 | Using where; Using index |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
1 row in set (0.01 sec)
Now, when < 300000 is used optimizer, for some unknown reason, decided that using index that will give all rows but one is better! This is a bug, because
mysql> select count(*) from market2centralinqueueobject where ordercounter <= 300000;
+----------+
| count(*) |
+----------+
| 1213 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from market2centralinqueueobject where receivestate= 5;
+----------+
| count(*) |
+----------+
| 4081 |
+----------+
1 row in set (0.01 sec)
optimizer used index that selects 3+ times more rows than needed! And ANALYZE does not help:
mysql> analyze table market2centralinqueueobject;
+----------------------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------------------------------+---------+----------+----------+
| test.market2centralinqueueobject | analyze | status | OK |
+----------------------------------+---------+----------+----------+
1 row in set (0.03 sec)
mysql> explain select count(*) from market2centralinqueueobject where ordercoun
ter <= 300000 and receivestate= 5;
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
| 1 | SIMPLE | market2centralinqueueobject | ref | PRIMARY,IDXA_status |
IDXA_status | 3 | const | 2362 | Using where; Using index |
+----+-------------+-----------------------------+------+---------------------+-
------------+---------+-------+------+--------------------------+
1 row in set (0.00 sec)
Suggested fix:
Calculate costs properly? Use histograms? Give a preference to PRIMARY KEY, at least, for InnoDB tables?