MySQL Bugs: #69846: ICP does not work on UNIQUE indexes

Bug #69846	ICP does not work on UNIQUE indexes
Submitted:	26 Jul 2013 9:15	Modified:	6 Aug 2013 17:38
Reporter:	S Groot	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Documentation	Severity:	S5 (Performance)
Version:	5.6.12	OS:	Linux
Assigned to:	Bugs System	CPU Architecture:	Any
Tags:	ICP, index condition pushdown

Description:
Index Condition Pushdown is not used when the index is unique.

How to repeat:
CREATE TABLE `icp1` (
  `sensor` int(11) NOT NULL,
  `time` int(11) NOT NULL,
  `value` int(11) DEFAULT NULL,
  PRIMARY KEY (`sensor`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
 explain select * from icp1 where sensor < 10 and time < 10;

result:
|  1 | SIMPLE      | icp1  | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where |

CREATE TABLE `icp2` (
  `sensor` int(11) NOT NULL,
  `time` int(11) NOT NULL,
  `value` int(11) DEFAULT NULL,
  KEY (`sensor`, `time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

explain select * from icp2 where sensor < 10 and time < 10

result:
|  1 | SIMPLE      | icp2  | range | sensor        | sensor | 4       | NULL |    1 | Using index condition |

It's a bit more complicated. It seems ICP is not used for a unique key that is used for clustered index of InnoDB table:

C:\Program Files\MySQL\MySQL Server 5.5\bin>mysql -uroot -proot -P3314 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 44
Server version: 5.6.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE `icp1` (
    ->   `sensor` int(11) NOT NULL,
    ->   `time` int(11) NOT NULL,
    ->   `value` int(11) DEFAULT NULL,
    ->   PRIMARY KEY (`sensor`,`time`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (4.98 sec)

mysql> explain select * from icp1 where sensor < 10 and time < 10;
+----+-------------+-------+-------+---------------+---------+---------+------+-
-----+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  |
rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+-
-----+-------------+
|  1 | SIMPLE      | icp1  | range | PRIMARY       | PRIMARY | 4       | NULL |
   1 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-
-----+-------------+
1 row in set (0.50 sec)

Not used for a primary key.

mysql> CREATE TABLE `icp2` (
    ->   `sensor` int(11) NOT NULL,
    ->   `time` int(11) NOT NULL,
    ->   `value` int(11) DEFAULT NULL,
    ->   KEY (`sensor`, `time`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (1.52 sec)

mysql> explain select * from icp2 where sensor < 10 and time < 10;
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
| id | select_type | table | type  | possible_keys | key    | key_len | ref  | r
ows | Extra                 |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
|  1 | SIMPLE      | icp2  | range | sensor        | sensor | 4       | NULL |
  1 | Using index condition |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
1 row in set (0.14 sec)

Used for non-unique key (in this case InnoDB generates it's own unique index to store data in). Now, let's continue:

mysql> alter table icp2 drop key sensor;
Query OK, 0 rows affected (0.93 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table icp2 add unique key (sensor,time);
Query OK, 0 rows affected (1.66 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from icp2 where sensor < 10 and time < 10;
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-------------+
| id | select_type | table | type  | possible_keys | key    | key_len | ref  | r
ows | Extra       |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-------------+
|  1 | SIMPLE      | icp2  | range | sensor        | sensor | 4       | NULL |
  1 | Using where |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-------------+
1 row in set (0.00 sec)

Not used for the only unique key (as it is used for clustered index where data are stored in this case). Now, let's add other column and explcit primary key on it:

mysql> alter table icp2 add column id int;
Query OK, 0 rows affected (1.22 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table icp2 add primary key(id);
Query OK, 0 rows affected (1.52 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select * from icp2 where sensor < 10 and time < 10;
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
| id | select_type | table | type  | possible_keys | key    | key_len | ref  | r
ows | Extra                 |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
|  1 | SIMPLE      | icp2  | range | sensor        | sensor | 4       | NULL |
  1 | Using index condition |
+----+-------------+-------+-------+---------------+--------+---------+------+--
----+-----------------------+
1 row in set (0.00 sec)

mysql> show create table icp2\G
*************************** 1. row ***************************
       Table: icp2
Create Table: CREATE TABLE `icp2` (
  `sensor` int(11) NOT NULL,
  `time` int(11) NOT NULL,
  `value` int(11) DEFAULT NULL,
  `id` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `sensor` (`sensor`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.03 sec)

So, to put it simple ICP is used only for secondary indexes of InnoDB table.

Probably it is even expected, as reading cluster index record is reading a table itself... I just do not see this explained at http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html

Should be definitely documented !!!

Just to confirm Valeriy's conclusion: ICP is not used for InnoDB's
clustered index.

The main goal of ICP is to reduce the number of full records that need
to be read and thus potentially save on IO operations. For the case
where the index is the InnoDB clustered index, the complete record is
already read into the InnoDB buffer. In this case we will not save any
IO by using ICP. For cases where the index condition will filter out a
major part of records, there could have been some saving in CPU
usage. For cases where the index condition does not filter out any
records or only a small fraction of the records, there might even be a
small overhead by pushing the index condition to the storage engine
and evaluate it there. Thus, the current ICP implementation is
conservative and only uses ICP for the case where there is most to
gain and least likely to introduce CPU overhead.

For a related case, see also Bug#68554.

Did some more research on the problem and thanks for the explanation. I see now why it is not used.

Maybe a stupid question, but currently it seems to be doing a full index-scan when testing the second condition. Isn't it possible to use the b-tree for the sub-ranges that are filtered out by the second condition? That should be able to save a lot of IO.

For the example query in this bug report MySQL will only be able to use the "sensor < 10" part to limit the range scan. MySQL is not able to utilize the second part of the where clause ("time < 10") to reduce the number of records that will be read from the storage engine.

In general for a multi-part index, for MySQL to utilize multiple fields of the index the where condition must have equality predicate on the first field. If the where condition has equality predicates for the first N fields of the index and a non-equality for the N+1 field, then the range scan will be able to utilize the first N+1 fields to reduce the amount of the data read from the index.

Example:

Index covers five fields:

  PRIMARY KEY (a, b, c, d, e)

and WHERE condition looks like:

  a=3 AND B IN (3, 5, 7) AND C=8 AND D < 10 AND E < 10

then the range scan should be able to scan only the part of the index that is covered by "a=3 AND B IN (3, 5, 7) AND C=8 AND D < 10".

yeah I noticed, but isn't this something the optimizer could do? eg resolve the "a < 10" into "a in (1, 3, 5)"

For an explanation why the range scan can only make use of the first keypart, see http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html 

As for translating "a < 10" to "a IN (list_of_values)":

To do this, the optimizer would have two options:
 a) let list_of_values contain all values possible for the datatype
 b) let list_of_values contain all applicable values that exist in the database

a) Although possible in simple cases, this option is not feasible in the general case. Some examples that would be very difficult or impossible to handle: 

 * "my_float_col < X"   => infinite number of floating point values, impossible
 * "my_int_col > 2"     => the list of values would be VERY long
 * "my_int_col < 10000" => same as ^

There are many more counterexamples that makes this approach unfeasible. 

b) In theory, the optimizer could check if there are many values that would be in the range, but that would require reading data. That's costly and not something we would like to do during the optimization phase.

So if you want this behaviour you'll have to write the IN-list yourself.

The following pages have been updated to include the ICP limitation for InnoDB:

http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html
http://dev.mysql.com/doc/refman/5.7/en/index-condition-pushdown-optimization.html

The revised content will appear soon, in the next documentation build.
Thank you for the bug report.