Bug #39653 | find_shortest_key in sql_select.cc does not consider clustered primary keys | ||
---|---|---|---|
Submitted: | 25 Sep 2008 18:26 | Modified: | 20 Jun 2010 22:48 |
Reporter: | Zardosht Kasheff (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.1.25, 5.1.28 | OS: | Any |
Assigned to: | Gleb Shchepa | CPU Architecture: | Any |
Tags: | find_shortest_key, query optimizer |
[25 Sep 2008 18:26]
Zardosht Kasheff
[25 Sep 2008 18:34]
Sveta Smirnova
Thank you for the report. If I force index(b) I get similar results: explain select count(*) from bar force index(b); id 1 select_type SIMPLE table bar type index possible_keys NULL key b key_len 10 ref NULL rows 10 Extra Using index Could you please provide example which demonstrates performance increase if using another key?
[25 Sep 2008 18:40]
Sveta Smirnova
This also can be same issue as described in comment "[17 May 2005 14:27] Igor Babaev" for bug #10512
[25 Sep 2008 18:48]
Zardosht Kasheff
I cannot comment on if this is the same issue as what Igor states in #10512. As for queries that display the problem, take any query that has these properties: 1) Both the primary key and some other key are covering indexes 2) The storage engine has a clustered primary key, like InnoDB The bigger the disparity between the entire row size and the length of the non-primary covering key, the wider the gap in performance.
[26 Sep 2008 6:06]
Sveta Smirnova
Thank you for the feedback. In the example provided you have PRIMARY KEY which is unique index by definition and KEY(b,c) which is not unique. In the same time you ask about quantity of all rows. So it is natural what PRIMARY KEY is used. If change query to explain `select count(*) from bar where b=1;` index (b,c) is used: explain select count(*) from bar where b=1; id 1 select_type SIMPLE table bar type ref possible_keys b key b key_len 5 ref const rows 1 Extra Using where; Using index This is why I ask you to provide correct example.
[26 Sep 2008 12:51]
Zardosht Kasheff
Actually, it is not natural to use the primary key. The fact that the primary key is unique and the key (b,c) is not unique does not matter. The only reason that the primary key is being used is that it is of shorter length than the key (b,c). Here is a way to verify what I say. Alter the table such that we have a key (c) and a primary key of (a,b). If what you say is correct, then the primary should be used. It is not. The key (c) is used because it has shorter length than the primary: mysql> explain select count(*) from bar \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: bar type: index possible_keys: NULL key: PRIMARY key_len: 4 ref: NULL rows: 10 Extra: Using index 1 row in set (0.00 sec) mysql> alter table bar drop primary key, drop index b; Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> alter table bar add primary key(a,b), add index c(c); Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> explain select count(*) from bar \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: bar type: index possible_keys: NULL key: c key_len: 5 ref: NULL rows: 10 Extra: Using index 1 row in set (0.00 sec)
[3 Oct 2008 8:10]
Valeriy Kravchuk
Verified just as described with 5.1.28. I think primary key of InnoDB (and any other engine with clustered primary key) should be considered differently for "index" access path. Entire row length should be taken into account.
[26 Feb 2010 19:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/101713 3358 Gleb Shchepa 2010-02-26 Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys Choosing a shortest index for the covering index scan, the optimizer ignored the fact, that the clustered primary key read involves whole table data. The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices. @ mysql-test/r/innodb_mysql.result Test case for bug #39653. @ mysql-test/t/innodb_mysql.test Test case for bug #39653. @ sql/sql_select.cc Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices.
[5 Mar 2010 7:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/102391 3358 Gleb Shchepa 2010-03-05 Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys Choosing a shortest index for the covering index scan, the optimizer ignored the fact, that the clustered primary key read involves whole table data. The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices. @ mysql-test/r/innodb_mysql.result Test case for bug #39653. @ mysql-test/t/innodb_mysql.test Test case for bug #39653. @ sql/sql_select.cc Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices.
[5 Mar 2010 20:25]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/102478 3372 Gleb Shchepa 2010-03-05 Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys Choosing a shortest index for the covering index scan, the optimizer ignored the fact, that the clustered primary key read involves whole table data. The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices. @ mysql-test/r/innodb_mysql.result Test case for bug #39653. @ mysql-test/t/innodb_mysql.test Test case for bug #39653. @ sql/sql_select.cc Bug #39653: find_shortest_key in sql_select.cc does not consider clustered primary keys The find_shortest_key function has been modified to take into account that fact that a clustered PK has a longest key of possible covering indices.
[26 Mar 2010 8:23]
Bugs System
Pushed into 5.5.4-m3 (revid:alik@sun.com-20100326080914-2pz8ns984e0spu03) (version source revid:alexey.kopytov@sun.com-20100307164059-cri8typa32cypq0l) (merge vers: 5.5.3-m2) (pib:16)
[26 Mar 2010 8:27]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100326081116-m3v4l34yhr43mtsv) (version source revid:alik@sun.com-20100325072612-4sds00ix8ajo1e84) (pib:16)
[26 Mar 2010 8:31]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)
[26 Mar 2010 19:44]
Zardosht Kasheff
Will this patch make it into 5.1?
[6 Apr 2010 7:56]
Bugs System
Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:gshchepa@mysql.com-20100305194555-ts37kuiie7vbh9k6) (merge vers: 5.1.45) (pib:16)
[15 Apr 2010 15:44]
Paul DuBois
Noted in 5.1.46, 5.5.5, 6.0.14 changelogs. While looking for the shortest index for a covering index scan, the optimizer ignored that a clustered primary key read the entire table.
[16 May 2010 3:10]
James Day
Paul, please change the description to: "While looking for the shortest index for a covering index scan, the optimizer didn't consider the full row length for a clustered primary key, as in InnoDB. Secondary covering indexes will now be preferred, making full table scans less likely."
[16 May 2010 20:37]
Paul DuBois
Added performance tag, revised changelog wording per previous comment.
[25 May 2010 16:50]
Dan Nelson
This probably closes bug 19271 also (innodb optimizer prefers primary over secondary indexes for count(*))
[17 Jun 2010 11:45]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:23]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:martin.skold@mysql.com-20100609211156-tsac5qhw951miwtt) (merge vers: 5.1.46-ndb-6.2.19) (pib:16)
[17 Jun 2010 13:10]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 Jul 2010 23:27]
Domas Mituzas
actually, this change can have quite negative impacts. e.g. on a table like this: CREATE TABLE `pagelinks` ( `pl_from` int(8) unsigned NOT NULL DEFAULT '0', `pl_namespace` int(11) NOT NULL DEFAULT '0', `pl_title` varbinary(255) NOT NULL DEFAULT '', UNIQUE KEY `pl_from` (`pl_from`,`pl_namespace`,`pl_title`), KEY `pl_namespace` (`pl_namespace`,`pl_title`,`pl_from`) ) both indexes satisfy 'SELECT * FROM table' query done by mysqldump, but PK is much much hotter, and secondary key is relatively cold, so suddenly after #39653 equal-width secondary key is preferred - this has made our dumps take six days instead of half an hour.
[30 Jul 2010 20:46]
James Day
Domas, I've created bug #55656 for your report, please comment on the suggested fix over there.
[17 Aug 2010 10:53]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/115915 3181 Evgeny Potemkin 2010-08-17 Bug #55656: mysqldump can be slower after bug #39653 fix. After fix for bug#39653 the shortest available secondary index was used for full table scan. Primary clustered key was used only if no secondary index can be used. However, when chosen secondary index includes all fields of the table being scanned it's better to use primary index since the amount of data to scan is the same but the primary index is clustered. Now the find_shortest_key function takes this into account. @ mysql-test/suite/innodb/r/innodb_mysql.result Aadded a test case for the bug#55656. @ mysql-test/suite/innodb/t/innodb_mysql.test Aadded a test case for the bug#55656. @ sql/sql_select.cc Bug #55656: mysqldump can be slower after bug #39653 fix. The find_shortest_key function now prefers clustered primary key if found secondary key includes all fields of the table.
[17 Aug 2010 10:57]
Evgeny Potemkin
Fix for this bug also fixes bug#55204.
[25 Aug 2010 15:47]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/116781 3181 Evgeny Potemkin 2010-08-25 Bug #55656: mysqldump can be slower after bug #39653 fix. After fix for bug#39653 the shortest available secondary index was used for full table scan. Primary clustered key was used only if no secondary index can be used. However, when chosen secondary index includes all fields of the table being scanned it's better to use primary index since the amount of data to scan is the same but the primary index is clustered. Now the find_shortest_key function takes this into account. @ mysql-test/suite/innodb/r/innodb_mysql.result Aadded a test case for the bug#55656. @ mysql-test/suite/innodb/t/innodb_mysql.test Aadded a test case for the bug#55656. @ sql/sql_select.cc Bug #55656: mysqldump can be slower after bug #39653 fix. The find_shortest_key function now prefers clustered primary key if found secondary key includes all fields of the table.
[18 Aug 2016 8:23]
MySQL Verification Team
Bug #19271 marked as duplicate of this