Bug #82968 InnoDB's estimate for rows-in-table can return rows=1 due to a race condition
Submitted: 13 Sep 2016 12:44 Modified: 19 Dec 2016 14:05
Reporter: Sergey Petrunya Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.7.15 OS:Any
Assigned to: CPU Architecture:Any

[13 Sep 2016 12:44] Sergey Petrunya
Description:
This is coming from https://jira.mariadb.org/browse/MDEV-10649 .

InnoDB's code that updates table statistics has a race condition,  which may cause an unlucky thread to get the estimate of rows_in_table=1. This may cause the unlucky thread to use an inefficient query plan. (and if this query plan is used for an UPDATE statement, that statement can cause deadlocks, etc).

The code in dict_stats_update() does this:

			dict_table_stats_lock(table, RW_X_LATCH);

			/* Initialize all stats to dummy values before
			copying because dict_stats_table_clone_create() does
			skip corrupted indexes so our dummy object 't' may
			have less indexes than the real object 'table'. */
			dict_stats_empty_table(table);
                     // (*)
			dict_stats_copy(table, t);

The issue is, other threads are reading the table statistics using calls to

  ha_innobase::info(HA_STATUS_VARIABLE | HA_STATUS_NO_LOCK)

HA_STATUS_NO_LOCK means they are not taking any locks, and can read the data while dict_stats_update is at location (*), right after dict_stats_empty_table call.

How to repeat:
* Apply the attached patch (it has the code to pause the execution in dict_stats_update, as well as debug printout statements)

* Run the attached testcase.

* Observe the output like the one attached. Things to note:

The second UPDATE statement reads a lot of rows:

Handler_read_rnd_next   12001

The reason it is doing so is in the error log:

2016-09-13T12:40:35.489144Z 3 [Warning] DEBUG: Full table scan on table t1.  Reason: 'quick == 0'  Records2: 1  Records3: 12337 Optimizer_flag: 523775  Query: UPDATE t1 SET col1 = 1234 WHERE pk IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

"Records2: 1", The range optimizer was invoked with table->stats.records=1 .

Suggested fix:
I'll attach a patch
[13 Sep 2016 12:45] Sergey Petrunya
Patch with debug printouts and execution pausing

Attachment: mdev10649-for-mysql5.7.diff (text/x-patch), 2.52 KiB.

[13 Sep 2016 12:47] Sergey Petrunya
Test file

Attachment: mdev10649-mysql57.test (application/octet-stream, text), 3.61 KiB.

[13 Sep 2016 12:49] Sergey Petrunya
Suggested patch

Attachment: mdev10649-r2.diff (text/x-patch), 2.27 KiB.

[13 Sep 2016 17:47] MySQL Verification Team
Hi Sergey,

Please confirm to me something. A test case should be run in the t/ subdirectory of the mysql-test/ subdirectory within the mysql source distribution. 

Am I right or wrong. BTW, I did apply your patch and it went like a charm.

Also, your e-mail address is somewhat strange to me. BTW, send him my regards .... ;-)
[14 Sep 2016 5:15] Laurynas Biveinis
Bug 78401? (sorry about "Need Feedback" reset)
[14 Sep 2016 8:37] Sergey Petrunya
Sinisa,

Yes, I m using the source distribution, compile the tree myself (use debug build),
put the file into mysql-test/t, and then run it as follows:

./mysql-test-run t/mdev10649.test

there's no .result file, so it just prints the test commands and results to the output.
[14 Sep 2016 8:38] Sergey Petrunya
Example mysql-test execution output

Attachment: mdev10649-mysql57-log.txt (text/plain), 5.43 KiB.

[14 Sep 2016 8:41] Sergey Petrunya
Laurynas, 

I think this issue is different from Bug #78401 (=Bug#82969).

Bug #78401 is a race with dict_index_t::stat* values. This issue a race with dict_table_t::stat_n_rows. 

These are updated in different functions, the "suggested patch" for this issue doesn't fix #78401.
[14 Sep 2016 14:04] MySQL Verification Team
Sergey,

Thank you very much for your bug report. I have repeated it 100 %.

I am sending the output from my test in a separate file.
[14 Sep 2016 14:05] MySQL Verification Team
Output from Sergey's test

Attachment: bug82968.output (application/octet-stream, text), 5.16 KiB.

[5 Oct 2016 12:03] Vasil Dimov
Posted by developer:
 
Fixed in daee89b in mysql-trunk. The fix should be considered to mysql-5.7.
[19 Dec 2016 14:05] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 5.7.18, 8.0.1 release, and here's the changelog entry:

A race condition while updating table statistics could result in an
estimated row count of 1 and an incorrect query execution plan.
[15 Jan 2017 10:52] Jean-François Gagné
Hi,
my understanding is that this is fixed in 5.7.18.
It is not clear to me if 5.6 is affected... shouldn't this also be fixed in 5.6 ?
Thanks,
JFG
[17 Jan 2017 14:58] MySQL Verification Team
Jean-Francois,

You can expect good news on that front, as well.
[26 Jan 2017 8:37] Louis R
Hi,

Just want to confirm we are impacted by this bug with 5.6.34.
[5 Dec 2020 11:13] Viktor Chernov
This fix hasn't been backported to MySQL 5.6.*? Am I right?
[8 Dec 2020 12:56] MySQL Verification Team
This is just to note that this bug  will not be fixed in 5.6, due to the high complexity of the patch.