MySQL Bugs: #55656: mysqldump can be slower after bug #39653 fix

Bug #55656	mysqldump can be slower after bug #39653 fix
Submitted:	30 Jul 2010 20:45	Modified:	15 Oct 2010 14:07
Reporter:	James Day	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.1.46, 5.5.5, 6.0.14	OS:	Any
Assigned to:	Evgeny Potemkin	CPU Architecture:	Any
Tags:	find_shortest_key, query optimizer, regression

Description:
The fix for bug #39653 caused the optimiser to prefer secondary covering indexes to primary, apparently in all cases. This was reported to cause a mysqldump to use a covering index on * instead of using the primary key and clustered data. The reported result was a half hour mysqldump changing to six days.

This can badly affect mapping tables like the example given in the 30 July comment at bug #39653, which appears to be from the Wikipedia schema. In such tables you may usefully have all rows in different orders in different covering indexes, but backup will still be faster when done by PK.

How to repeat:
Verified by code examination of the earlier fix.

Suggested fix:
Perhaps where the covering indexes are on * and there is no LIMIT clause, prefer the primary key? That seems like a case where PK preference is safe, since it guarantees a full table scan, which will always be faster by PK than secondary index.

Less sure about doing it also if it's on * and there is a WHERE clause. That could block useful optimisations done by adding secondary keys and doesn't seem so likely to be helpful. May also make backups slower by blocking use of one of the secondary covering indexes for the WHERE.

As I understand it, the issue with the example from Wikipedia, is that the index that is now picked is "less hot".  That is, using it will cause more disk access than using the primary index.  In general, this issue can not be solved without taking into account information about buffering of indexes in the storage engine.  However, I guess for most applications it is a good heuristic to assume that a clustered index is hotter, but there are probably some applications where this is not true.

Reading the comment added to the code with Bug#39653, it seems that fix did not recognize that if a secondary index covers all columns of a table, it will not be any advantage to use it over a clustered index.  Using the clustered index in such cases would solve the problem for the specific case reported in this bug report.

Thinking further about the possible fix, if I understand the original code properly, it would always choose the PK if there was both PK covering and * covering secondary index. Going back to that preference in all * cases seems like the safest choice, particularly for 5.1, which we don't really want to change too greatly this late in its life.

That still preserves the main likely improvement from bug #39653, preferring the shortest covering index.

To make a good choice we'd really need to do something like asking the storage engine and considering where and limit clauses to try to work out which is likely to be best.

There can be substantial benefits from choosing the secondary * covering index over the primary if all rows aren't wanted. That's why Wikipedia has both. It's a page hit rate and working set size optimisation, where it's much faster to read 10,000 index entries with many in the same or adjacent index pages than if they are in 10,000 different pages. The difficulty is knowing when this is true.

In the Wikipedia case it's either all pages that link here or all pages that this place links to and the where clause determines which index is best. It might be a useful heuristic to assume that the equal length index that has where clause entries earlier in the index is the one to use. It's a heuristic that would work for the Wikipedia case and seems likely to do reasonably well in general. But this isn't something that is safe enough to do in 5.1, just 5.5, perhaps.

I originally filed bug #39653. What the original code meant to do, and what the current code, does is assume that the covering index with the smallest key is best for running an index scan.

To better illustrate, let us look at the following example. Suppose I had the following table:

create table foo (a int, b int, c int, d int, e longblob, primary key (a), key (b,a), key (c,b,a));

Note that the primary key is not clustered. Now say we wish to execute:

select sum(a) from foo;

This requires a full index scan, and any of the indexes will do. What MySQL does is look at the three indexes, find the one with the smallest length, and select that one. Because the smallest defined key is the primary, that is what the original code selected. The reasoning was that picking the smallest key required processing less data, and is therefore faster. Similarly, if the schema was the following:

create table foo (a int, b int, c int, d int, e longblob, key (b,a), primary key (c,b,a));

In the above case, the key (b, a) was selected, and NOT the primary key, because (b,a) is shorter than (c,b,a).

The implicit "hotness" of the primary key is not taken into account.

The reason for filing the original bug is that for storage engines that cluster the primary key, the existence of the longblob makes selecting the primary key a really bad choice. If the true intent of find_shortest_key is to select the index that requires the least processing of data, then the clustered primary key must be taken into account.

As far as the problem that Domas ran into, the fix is simple. If a secondary index contains every column, then it should not take precedence over the clustered primary key.

As for the general problem, of taking the "hotness" of an index into account, I think James is right that more reasoning should be deferred to the storage engine. The optimizer should never assume relative costs of a range query, the storage engine should. I filed bug #52162 for this reason (as a feature request).

Also, this is not the first time that the "hotness" of a primary key has come under debate. The following code is commented out of sql_select.cc in make_join_readinfo:
/*
It has turned out that the below change, while speeding things
up for disk-bound loads, slows them down for cases when the data
is in disk cache (see BUG#35850):
// See bug #26447: "Using the clustered index for a table scan
// is always faster than using a secondary index".
if (table->s->primary_key != MAX_KEY &&
table->file->primary_key_is_clustered())
tab->index= table->s->primary_key;
else
*/

In fact, I think it is THIS code that Domas wants fixed, and not find_shortest_key. As the comment states, there was an assumption that using a longer clustered index is faster than a secondary index. The assumption was based on the clustered index "being hotter". This caused others to complain, and it was subsequently backed out in the fix for bug #35850.

I will reiterate what I think the proper solution in the long term should be: the storage engine should calculate all range query costs. The optimizer should not make any assumptions on costs. In the short term, I think changing find_shortest_key to pick a clustered pk over a secondary index that has all the columns would suffice.

Bug #35850 was fixed back in 5.1.25 so it seems too old to really be what Domas is writing about, though it would be good for Domas to confirm that he didn't jump from a pre-5.1.25 version to 5.1.46 or later.

The bug #26447 view that PK always best for a table scan might apply if all or most columns are needed but I've specifically created short secondary indexes to speed up scans of all or many rows for one column by making a short index that is more readily cached than the whole data set. Doesn't seem like a good general rule to prefer the PK if a short and hence easier to cache secondary index can be used, picking the shortest secondary index seems likely to work better. Heikki is right if the workload turns out to be disk-bound even with a short index available to cache.

It was upgrade from 5.1.4x to 5.1.46 - and I have deployed a hotfix build with revert for the change, so it works all fine now on my side.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115915

3181 Evgeny Potemkin	2010-08-17
      Bug #55656: mysqldump can be slower after bug #39653 fix.
      After fix for bug#39653 the shortest available secondary index was used for
      full table scan. Primary clustered key was used only if no secondary index
      can be used. However, when chosen secondary index includes all fields of the
      table being scanned it's better to use primary index since the amount of
      data to scan is the same but the primary index is clustered.
      Now the find_shortest_key function takes this into account.
     @ mysql-test/suite/innodb/r/innodb_mysql.result
        Aadded a test case for the bug#55656.
     @ mysql-test/suite/innodb/t/innodb_mysql.test
        Aadded a test case for the bug#55656.
     @ sql/sql_select.cc
        Bug #55656: mysqldump can be slower after bug #39653 fix.
        The find_shortest_key function now prefers clustered primary key
        if found secondary key includes all fields of the table.

which 5.1 version is this slated for? Will it go into 5.1.48 or a higher release?

sorry meant to say 5.1.50

Dathan, unlikely that it'll be in 5.1.50, though that would be nice if it was practical without delaying 5.1.50 significantly.

5.1.51 looks likely. We might not do a simple rollback but instead both solve the original pair of requests and take care of this side effect. Subject to change, still being looked at by the developers.

Dathan, 5.1.50 is now released, was being prepared for release when you asked, which is why it's not in 5.1.50. Looks like the last patch, prefer the primary key if there is a choice between covering index and PK, will be in the next release. Subject to change, as always. Seems like a simple enough one to apply if you need it now and are comfortable building yourself, see the 17 August patch.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116781

3181 Evgeny Potemkin	2010-08-25
      Bug #55656: mysqldump can be slower after bug #39653 fix.
      After fix for bug#39653 the shortest available secondary index was used for
      full table scan. Primary clustered key was used only if no secondary index
      can be used. However, when chosen secondary index includes all fields of the
      table being scanned it's better to use primary index since the amount of
      data to scan is the same but the primary index is clustered.
      Now the find_shortest_key function takes this into account.
     @ mysql-test/suite/innodb/r/innodb_mysql.result
        Aadded a test case for the bug#55656.
     @ mysql-test/suite/innodb/t/innodb_mysql.test
        Aadded a test case for the bug#55656.
     @ sql/sql_select.cc
        Bug #55656: mysqldump can be slower after bug #39653 fix.
        The find_shortest_key function now prefers clustered primary key
        if found secondary key includes all fields of the table.

Approved with some suggestions for improvement of comments.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116839

3491 Evgeny Potemkin	2010-08-26
      Bug #55656: mysqldump can be slower after bug 39653 fix.
      After fix for bug 39653 the shortest available secondary index was used for
      full table scan. Primary clustered key was used only if no secondary index
      can be used. However, when chosen secondary index includes all fields of the
      table being scanned it's better to use primary index since the amount of
      data to scan is the same but the primary index is clustered.
      Now the find_shortest_key function takes this into account.
     @ mysql-test/suite/innodb/r/innodb_mysql.result
        Added a test case for the bug#55656.
     @ mysql-test/suite/innodb/t/innodb_mysql.test
        Added a test case for the bug#55656.
     @ sql/sql_select.cc
        Bug #55656: mysqldump can be slower after bug #39653 fix.
        The find_shortest_key function now prefers clustered primary key
        if found secondary key includes all fields of the table.

Pushed into mysql-trunk 5.6.1-m4 (revid:alik@sun.com-20100901130501-4g2k86dub29auj8y) (version source revid:alik@sun.com-20100901130012-9bmmvzcnnw6n5rw6) (merge vers: 5.6.1-m4) (pib:21)

Pushed into mysql-next-mr (revid:alik@sun.com-20100901130614-pgop3m80rmutewxn) (version source revid:alik@sun.com-20100901130033-8k19cjn6n2blm3py) (pib:21)

Pushed into mysql-5.5 5.5.7-m3 (revid:alik@sun.com-20100901125952-4hsrosoa0xreionr) (version source revid:alik@sun.com-20100901125952-4hsrosoa0xreionr) (merge vers: 5.5.7-m3) (pib:21)

Noted in 5.1.51, 5.5.7, 5.6.1 changelogs.

After the fix for Bug#39653, the shortest available secondary index
was used for full table scans. The primary clustered key was used
only if no secondary index could be used. However, when the chosen
secondary index includes all columns of the table being scanned, it
is better to use the primary index because the amount of data to scan
is the same but the primary index is clustered. This is now taken
into account.

Pushed into mysql-5.1 5.1.52 (revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (version source revid:sunanda.menon@sun.com-20100928083322-wangbv97uobu7g66) (merge vers: 5.1.52) (pib:21)

Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (merge vers: 5.1.51-ndb-7.0.20) (pib:21)

Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (merge vers: 5.1.51-ndb-6.3.39) (pib:21)

Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (merge vers: 5.1.51-ndb-6.2.19) (pib:21)

Already documented in the 5.1.51 changelog. Reverting to Closed state.