Bug #69846 | ICP does not work on UNIQUE indexes | ||
---|---|---|---|
Submitted: | 26 Jul 2013 9:15 | Modified: | 6 Aug 2013 17:38 |
Reporter: | S Groot | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Documentation | Severity: | S5 (Performance) |
Version: | 5.6.12 | OS: | Linux |
Assigned to: | Bugs System | CPU Architecture: | Any |
Tags: | ICP, index condition pushdown |
[26 Jul 2013 9:15]
S Groot
[26 Jul 2013 13:57]
Valeriy Kravchuk
It's a bit more complicated. It seems ICP is not used for a unique key that is used for clustered index of InnoDB table: C:\Program Files\MySQL\MySQL Server 5.5\bin>mysql -uroot -proot -P3314 test Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 44 Server version: 5.6.12 MySQL Community Server (GPL) Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE TABLE `icp1` ( -> `sensor` int(11) NOT NULL, -> `time` int(11) NOT NULL, -> `value` int(11) DEFAULT NULL, -> PRIMARY KEY (`sensor`,`time`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected (4.98 sec) mysql> explain select * from icp1 where sensor < 10 and time < 10; +----+-------------+-------+-------+---------------+---------+---------+------+- -----+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+-------------+ | 1 | SIMPLE | icp1 | range | PRIMARY | PRIMARY | 4 | NULL | 1 | Using where | +----+-------------+-------+-------+---------------+---------+---------+------+- -----+-------------+ 1 row in set (0.50 sec) Not used for a primary key. mysql> CREATE TABLE `icp2` ( -> `sensor` int(11) NOT NULL, -> `time` int(11) NOT NULL, -> `value` int(11) DEFAULT NULL, -> KEY (`sensor`, `time`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Query OK, 0 rows affected (1.52 sec) mysql> explain select * from icp2 where sensor < 10 and time < 10; +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | r ows | Extra | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ | 1 | SIMPLE | icp2 | range | sensor | sensor | 4 | NULL | 1 | Using index condition | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ 1 row in set (0.14 sec) Used for non-unique key (in this case InnoDB generates it's own unique index to store data in). Now, let's continue: mysql> alter table icp2 drop key sensor; Query OK, 0 rows affected (0.93 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table icp2 add unique key (sensor,time); Query OK, 0 rows affected (1.66 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select * from icp2 where sensor < 10 and time < 10; +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | r ows | Extra | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-------------+ | 1 | SIMPLE | icp2 | range | sensor | sensor | 4 | NULL | 1 | Using where | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-------------+ 1 row in set (0.00 sec) Not used for the only unique key (as it is used for clustered index where data are stored in this case). Now, let's add other column and explcit primary key on it: mysql> alter table icp2 add column id int; Query OK, 0 rows affected (1.22 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table icp2 add primary key(id); Query OK, 0 rows affected (1.52 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select * from icp2 where sensor < 10 and time < 10; +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | r ows | Extra | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ | 1 | SIMPLE | icp2 | range | sensor | sensor | 4 | NULL | 1 | Using index condition | +----+-------------+-------+-------+---------------+--------+---------+------+-- ----+-----------------------+ 1 row in set (0.00 sec) mysql> show create table icp2\G *************************** 1. row *************************** Table: icp2 Create Table: CREATE TABLE `icp2` ( `sensor` int(11) NOT NULL, `time` int(11) NOT NULL, `value` int(11) DEFAULT NULL, `id` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`id`), UNIQUE KEY `sensor` (`sensor`,`time`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.03 sec) So, to put it simple ICP is used only for secondary indexes of InnoDB table. Probably it is even expected, as reading cluster index record is reading a table itself... I just do not see this explained at http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html
[26 Jul 2013 18:28]
MySQL Verification Team
Should be definitely documented !!!
[30 Jul 2013 11:58]
Olav Sandstå
Just to confirm Valeriy's conclusion: ICP is not used for InnoDB's clustered index. The main goal of ICP is to reduce the number of full records that need to be read and thus potentially save on IO operations. For the case where the index is the InnoDB clustered index, the complete record is already read into the InnoDB buffer. In this case we will not save any IO by using ICP. For cases where the index condition will filter out a major part of records, there could have been some saving in CPU usage. For cases where the index condition does not filter out any records or only a small fraction of the records, there might even be a small overhead by pushing the index condition to the storage engine and evaluate it there. Thus, the current ICP implementation is conservative and only uses ICP for the case where there is most to gain and least likely to introduce CPU overhead.
[30 Jul 2013 12:00]
Olav Sandstå
For a related case, see also Bug#68554.
[30 Jul 2013 15:29]
S Groot
Did some more research on the problem and thanks for the explanation. I see now why it is not used. Maybe a stupid question, but currently it seems to be doing a full index-scan when testing the second condition. Isn't it possible to use the b-tree for the sub-ranges that are filtered out by the second condition? That should be able to save a lot of IO.
[31 Jul 2013 10:30]
Olav Sandstå
For the example query in this bug report MySQL will only be able to use the "sensor < 10" part to limit the range scan. MySQL is not able to utilize the second part of the where clause ("time < 10") to reduce the number of records that will be read from the storage engine. In general for a multi-part index, for MySQL to utilize multiple fields of the index the where condition must have equality predicate on the first field. If the where condition has equality predicates for the first N fields of the index and a non-equality for the N+1 field, then the range scan will be able to utilize the first N+1 fields to reduce the amount of the data read from the index. Example: Index covers five fields: PRIMARY KEY (a, b, c, d, e) and WHERE condition looks like: a=3 AND B IN (3, 5, 7) AND C=8 AND D < 10 AND E < 10 then the range scan should be able to scan only the part of the index that is covered by "a=3 AND B IN (3, 5, 7) AND C=8 AND D < 10".
[1 Aug 2013 13:48]
S Groot
yeah I noticed, but isn't this something the optimizer could do? eg resolve the "a < 10" into "a in (1, 3, 5)"
[5 Aug 2013 10:17]
Jørgen Løland
For an explanation why the range scan can only make use of the first keypart, see http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html As for translating "a < 10" to "a IN (list_of_values)": To do this, the optimizer would have two options: a) let list_of_values contain all values possible for the datatype b) let list_of_values contain all applicable values that exist in the database a) Although possible in simple cases, this option is not feasible in the general case. Some examples that would be very difficult or impossible to handle: * "my_float_col < X" => infinite number of floating point values, impossible * "my_int_col > 2" => the list of values would be VERY long * "my_int_col < 10000" => same as ^ There are many more counterexamples that makes this approach unfeasible. b) In theory, the optimizer could check if there are many values that would be in the range, but that would require reading data. That's costly and not something we would like to do during the optimization phase. So if you want this behaviour you'll have to write the IN-list yourself.
[6 Aug 2013 17:38]
Bugs System
The following pages have been updated to include the ICP limitation for InnoDB: http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html http://dev.mysql.com/doc/refman/5.7/en/index-condition-pushdown-optimization.html The revised content will appear soon, in the next documentation build. Thank you for the bug report.