Bug #70617 Default persistent stats can cause unexpected long query times
Submitted: 14 Oct 2013 19:18    Modified: 27 May 2015 8:38
Reporter: Justin Swanhart
Status: Not a Bug
Category: MySQL Server: InnoDB storage engine    Severity: S2 (Serious)
Version: 5.6.13    OS: Any
Assigned to:    CPU Architecture: Any
Tags: analyze, Optimizer, regression, statistics

[14 Oct 2013 19:18] Justin Swanhart
Description:
This is a redacted email from a client.  We had a very long discovery process to find out that 5.6 now enables persistent stats by default.  Since this customer creates new tables for their reports each time they run, the tables have no statistics, and query plans that take HOURS result instead of plans that complete in minutes.

Either removing the persistent stats option or using STRAIGHT_JOIN to bypass the optimizer's join ordering fixes the query.
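Concretely, the workarounds mentioned here map to the following 5.6 statements (a sketch; the table definition is a placeholder for the report tables described above):

```sql
-- Server-wide: revert to 5.5-style transient stats
-- (or set innodb_stats_persistent=OFF in my.cnf)
SET GLOBAL innodb_stats_persistent = OFF;

-- Per table: opt a freshly created report table out of persistent stats
CREATE TABLE report_tmp (
  id INT PRIMARY KEY,
  region VARCHAR(32),
  KEY (region)
) ENGINE=InnoDB STATS_PERSISTENT=0;

-- Or keep persistent stats, but populate them explicitly after loading data
ANALYZE TABLE report_tmp;
```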

How to repeat:
Your hunch that the persistent statistics are the cause of the problem looks promising.  ... When I ran the app I was still experiencing long execution times for the INSERT...SELECT statement.  I then modified the PHP app to run ALTER/ANALYZE TABLE on the arizona_temp_pub_loc_xxxxx table prior to the INSERT...SELECT.  

As we mentioned in our initial email, each time the PHP app runs it creates a new copy of the region_yyyy_yyy_yyy_xxxxx table and populates it with a few thousand records.  So far today I've run the PHP app a few times and the query has been running quickly.  I'm going to do a few more test runs of the PHP app and if the query continues to run fast I will disable persistent stats in my.cnf.  Do you know of any negative performance implications to disabling the persistent stats?

When running the INSERT...SELECT from the command line I was always using an existing, pre-populated version of the region_yyyy_yyy_yyy_xxxx table.  I wasn't recreating it with each execution, so the stats were probably up-to-date and did not need to be recalculated.

I'll keep you updated on the outcome of my remaining tests and if turning persistent stats off solves the problem for good.

Suggested fix:
Don't make major changes to the optimizer stats collection procedures in the middle of a GA release.
[14 Oct 2013 19:22] Justin Swanhart
Change severity.
[15 Oct 2013 9:54] Arnaud Adant
Hi Justin,

Do you have an SQL test case that could be verified in 5.6.13?

Yes, persistent statistics have an impact on performance. Especially if you create a lot of small tables.

Arnaud Adant
[15 Oct 2013 22:35] Justin Swanhart
No - I don't have a test case.  I have hours of lost work.  You admit the change can result in bad plans.

ALTER TABLE .. ADD KEY won't add new stats properly.
Tables have invalid row counts.
Tables that have a lot of rows changed will have invalid statistics and might do index scans instead of full table scans.
Use your imagination and think about how statistics are used.
[17 Oct 2013 13:30] Shane Bester
Sample testcase.  Sometimes EXPLAIN will show a table scan for a few seconds until stats are available.  Is this what you're seeing?

Attachment: bug70617_testcase1.sql (application/octet-stream, text), 1.05 KiB.

[17 Oct 2013 13:47] Shane Bester
In my case I am guessing the background stats thread wakes up every MIN_RECALC_INTERVAL seconds to run.
[17 Oct 2013 14:56] Shane Bester
This testcase should take ~10 seconds to execute.  It waits for stats...

Attachment: bug70617_testcase_poll.sql (application/octet-stream, text), 1.04 KiB.

[17 Oct 2013 15:42] Shane Bester
The crux of the matter is that statistics on new tables and heavily modified tables will be out of date until you run ANALYZE TABLE or wait for the background thread to recalculate them, which happens roughly every 10 seconds.
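A minimal illustration of the situation (a sketch along the lines of the attached testcases; the table name and values are made up):

```sql
CREATE TABLE t (a INT PRIMARY KEY, b INT, KEY (b)) ENGINE=InnoDB;
-- ... bulk-load a few thousand rows here ...

-- Right after the load, the optimizer may still see empty/stale stats:
EXPLAIN SELECT * FROM t WHERE b = 45;  -- may show a full scan at first

ANALYZE TABLE t;                       -- forces an immediate stats recalc

EXPLAIN SELECT * FROM t WHERE b = 45;  -- should now pick the index on b
```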
[18 Oct 2013 20:40] Justin Swanhart
Is this background thread that recalculates stats documented somewhere?
[27 Oct 2013 13:26] Valeriy Kravchuk
Here is my documentation request about the background thread: http://bugs.mysql.com/bug.php?id=70741
[26 Nov 2013 14:13] Vasil Dimov
Hello,

--- cut ---
  [15 Oct 9:54] Arnaud Adant 
...
  Yes, persistent statistics have an impact on performance. Especially if you create a lot of small tables.

  [15 Oct 22:35] Justin Swanhart
...
  You admit the change can result in bad plans.
--- cut ---

No!  

What Arnaud said is that if you execute CREATE TABLE e.g. 1000 times in a row, that may be slower - the CREATE TABLE itself. Not worse execution plans. And improvements have been made to the performance of CREATE TABLE recently.

Now, while it is theoretically possible, so far we haven't found a case where the persistent stats sampling algorithm would result in worse estimates, compared to the old, transient stats algorithm.
[26 Nov 2013 14:35] Vasil Dimov
--- cut ---
  [14 Oct 19:18] Justin Swanhart 
...
  Do you know of any negative performance implications to disabling the persistent stats?
--- cut ---

If you switch off persistent stats, then InnoDB will start using the old stats method, which we call transient stats - the way it works in 5.5 and older. The transient stats have the following problems, which drove us to implement the new persistent stats:

. The transient stats sampling algorithm sometimes comes up with very wrong, or even bogus, estimates. When this happens, it usually results in bogus execution plans.

. The transient stats sampling algorithm's results could vary quite a lot from run to run, which in combination with the point below may result in unexpected execution plan changes, for better or for worse.

. The transient stats are automatically updated on various conditions, sometimes quite unexpectedly for the user, which may cause execution plans to change out of the blue.

. The transient stats are updated in the user thread, so now and then a DML query will execute slower compared to its usual execution time. The persistent stats are updated asynchronously by a background thread, and I think your problems are somewhere around here.

. The transient stats are not saved on disk and are lost upon server restart, which means a possibly different execution plan when the server boots up after the restart.

. The user can't fiddle with the transient stats. If desired, the persistent stats can be viewed _and modified_ in the tables mysql.innodb_table_stats and mysql.innodb_index_stats (use "FLUSH TABLE ...;" to force InnoDB to reread the stats from those two tables).
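For example (assuming a table test.t; the forced row count here is an arbitrary illustration):

```sql
-- Inspect the persisted stats
SELECT * FROM mysql.innodb_table_stats
 WHERE database_name = 'test' AND table_name = 't';
SELECT * FROM mysql.innodb_index_stats
 WHERE database_name = 'test' AND table_name = 't';

-- Modify an estimate by hand, then force InnoDB to reread it
UPDATE mysql.innodb_table_stats
   SET n_rows = 1000000
 WHERE database_name = 'test' AND table_name = 't';
FLUSH TABLE test.t;
```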
[26 Nov 2013 14:54] Vasil Dimov
--- cut ---
  [17 Oct 15:42] Shane Bester

  The crux of the matter is statistics on new tables, and heavily modified tables will be out of date until you run ANALYZE TABLE, or wait for the background thread to calculate stats, which is every ~10 seconds.
--- cut ---

Correct, except with a small addition: usually the background stats recalc will start immediately after the DML that triggered the 10% threshold. But if stats were recalculated not long ago, then there will be an artificial delay to avoid recalculating the stats of this table more often than once per 10 seconds. For example:

Say table t has 100 rows.

11:30:00 idle
11:30:02 7 rows in t are modified
11:30:03 3 rows in t are modified -> the 10% threshold is reached, triggering an immediate background stats recalc (at 11:30:03)
11:30:05 4 rows in t are modified
11:30:07 6 rows in t are modified -> the threshold is reached again, but the background thread notices that it last recalculated at 11:30:03 and it is now 11:30:07, so it waits until 11:30:13 before actually recalculating the stats.
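The scheduling rule in this example can be sketched as a toy simulation (a model of the behavior described, not actual server code; the constant name follows Shane's MIN_RECALC_INTERVAL comment above):

```python
# Toy model of InnoDB's background stats recalc scheduling: a recalc is
# triggered when >=10% of a table's rows have changed, but is throttled to
# at most one recalc per MIN_RECALC_INTERVAL seconds per table.

RECALC_THRESHOLD = 0.10   # fraction of rows changed that triggers a recalc
MIN_RECALC_INTERVAL = 10  # minimum seconds between recalcs of one table

class TableStats:
    def __init__(self, row_count):
        self.row_count = row_count
        self.modified = 0        # rows changed since last recalc
        self.last_recalc = None  # time stats were last recalculated
        self.recalc_times = []   # when recalcs actually ran

    def modify_rows(self, now, n):
        self.modified += n
        if self.modified >= self.row_count * RECALC_THRESHOLD:
            self._schedule_recalc(now)

    def _schedule_recalc(self, now):
        if (self.last_recalc is not None
                and now - self.last_recalc < MIN_RECALC_INTERVAL):
            run_at = self.last_recalc + MIN_RECALC_INTERVAL  # throttled
        else:
            run_at = now                                     # immediate
        self.recalc_times.append(run_at)
        self.last_recalc = run_at
        self.modified = 0

# Replay the timeline above (times as seconds past 11:30:00):
t = TableStats(row_count=100)
t.modify_rows(2, 7)   # 11:30:02
t.modify_rows(3, 3)   # 11:30:03 -> immediate recalc at 11:30:03
t.modify_rows(5, 4)   # 11:30:05
t.modify_rows(7, 6)   # 11:30:07 -> throttled, runs at 11:30:13
print(t.recalc_times) # [3, 13]
```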
[26 Nov 2013 15:00] Vasil Dimov
And now to the actual issue reported here - why would stats and execution plans be relevant at all for a query like:

  replace into `tmp` select * from `rc`;

it should always cause a full scan of the table `rc`, yes? Or maybe I'm missing something here...
[26 Nov 2013 15:09] Shane Bester
the query is: select * from `prod` where `b`='45';
[27 May 2015 8:38] Vasil Dimov
Closing as "not a bug". See my earlier comments for detailed explanations.