Bug #50166 Server provides varying results for the same query when run in rapid succession
Submitted: 8 Jan 2010 0:01 Modified: 29 Nov 2010 12:48
Reporter: Patrick Crews Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-codebase OS:Any
Assigned to: CPU Architecture:Any
Tags: index_condition_pushdown, innodb, optimizer_switch

[8 Jan 2010 0:01] Patrick Crews
Description:
The server is producing varying results for the same query when it is run in rapid succession.  Additionally, it appears that the difference between the first and second runs is the lack of duplicate rows (it appears the first result set contains duplicates, while the second does not)

This test is part of the Random Query Generators SelectStability Validator which runs the same query 2x very quickly.

Output:
# 18:54:26 Query:  SELECT  MAX( `col_time_key`) AS field1 , `pk` AS field2 FROM C AS table1 WHERE ( ( table1 . `pk` > 238 AND table1 . `pk` < ( 238 + 151 ) OR ( table1 . `pk`  IN (114, 171) OR table1 . `col_varchar_key` <> 'i' ) ) AND table1 . `pk` <> 1 )  GROUP BY field2; returns different result when executed after a delay of 0 seconds.
# 18:54:26 --- /tmp//randgen9146-1262908466-server0.dump	2010-01-07 18:54:26.000000000 -0500
# 18:54:26 +++ /tmp//randgen9146-1262908466-server1.dump	2010-01-07 18:54:26.000000000 -0500
# 18:54:26 @@ -1,36 +1,18 @@
# 18:54:26  00:00:00	14
# 18:54:26 -00:00:00	14
# 18:54:26 -00:00:00	6
# 18:54:26  00:00:00	6
# 18:54:26  00:39:46	12
# 18:54:26 -00:39:46	12
# 18:54:26  04:56:48	16
# 18:54:26 -04:56:48	16
# 18:54:26 -05:03:03	19
# 18:54:26  05:03:03	19
# 18:54:26  11:32:06	8
# 18:54:26 -11:32:06	8
# 18:54:26 -13:12:11	15
# 18:54:26  13:12:11	15
# 18:54:26  13:47:24	3
# 18:54:26 -13:47:24	3
# 18:54:26 -15:15:04	7
# 18:54:26  15:15:04	7
# 18:54:26  15:19:25	10
# 18:54:26 -15:19:25	10
# 18:54:26 -15:59:13	5
# 18:54:26  15:59:13	5
# 18:54:26  18:32:33	9
# 18:54:26 -18:32:33	9
# 18:54:26  18:38:59	20
# 18:54:26 -18:38:59	20
# 18:54:26 -19:24:11	4
# 18:54:26  19:24:11	4
# 18:54:26  19:35:19	18
# 18:54:26 -19:35:19	18
# 18:54:26 -19:56:05	17
# 18:54:26  19:56:05	17
# 18:54:26  20:25:14	2
# 18:54:26 -20:25:14	2
# 18:54:26 -NULL	13
# 18:54:26  NULL	13

How to repeat:
At present, this is only repeatable via the RQG.  Will try to cook up a better test case during analysis:

perl ./runall.pl --threads=1 --debug --queries=10000 --Validator=SelectStability --engine=Innodb --basedir=/path/mysql-6.0 --mysqld=--init-file=6.sql --grammar=conf/optimizer_no_subquery.yy --mtr-build-thread=91

Contents of 6.sql:
SET GLOBAL OPTIMIZER_SWITCH = 'materialization=off,semijoin=off,loosescan=off,firstmatch=off';

SET GLOBAL debug = '+d,optimizer_innodb_ds_mrr';

SET GLOBAL engine_condition_pushdown = 1;

SET GLOBAL join_cache_level = 1;
[8 Jan 2010 23:45] Patrick Crews
NOTE:  This was easily repeatable with another RQG process (different server / ports) running on the same machine, but running with only the SelectStability process does not allow this to be repeated.
[9 Jan 2010 5:13] Patrick Crews
EXPLAIN SELECT  MAX( `col_time_key`) AS field1 , `pk` AS field2 FROM C AS table1 WHERE ( ( table1 . `pk` > 238 AND table1 . `pk` < ( 238 + 151 ) OR ( table1 . `pk`  IN (114, 171) OR table1 . `col_varchar_key` <> 'i' ) ) AND table1 . `pk` <> 1 )  GROUP BY field2; 
+----+-------------+--------+-------+-------------------------+---------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys           | key     | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+-------------------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | table1 | range | PRIMARY,col_varchar_key | PRIMARY | 4       | NULL |   11 | Using where |
+----+-------------+--------+-------+-------------------------+---------+---------+------+------+-------------+
[12 Jan 2010 17:39] Patrick Crews
Was unable to repeat on 5.0 or 5.1.

Appears to be controlled via engine_condition_pushdown variable.  When this value is set to 1, the bug appears ~7 min into the run (with a separate Transformer RQG run in the background).

With the variable set to 0, the bug does not appear at all.

Test case in-progress.
[12 Jan 2010 20:45] Patrick Crews
Currently unable to duplicate this bug by any means other than running two RQG processes.
To duplicate:

Run the following against a server installation - in my experiments, I am running against a separate 6.0-codebase server with the following workload:
perl ./runall.pl --threads=1 --Validator=Transformer --basedir=/<path>/xform/mysql-6.0 --mtr-build-thread=91 --queries=100000 --grammar=conf/optimizer_subquery.yy --mysqld=--init-file=/<path>/6.sql

Then, I run the following SelectStability process against another 6.0 server.  The failure is showing up ~7 min. into this run:
perl ./runall.pl --threads=1 --debug --queries=10000 --Validator=SelectStability --engine=Innodb --basedir=/<path>/mysql-6.0 --mysqld=--init-file=/<path>/randgen/6.sql --grammar=conf/optimizer_no_subquery.yy --mtr-build-thread=81

I have tried the following perl script to substitute for the SelectStability process, but it is unable to duplicate the bug, regardless of the number of runs / how much background work is being done:
use DBI;
$| = 1;

my $dsn = 'dbi:mysql:host=127.0.0.1:port=10810:user=root:database=test';
my $query = " SELECT  MAX( `col_time_key`) AS field1 , `pk` AS field2 FROM C AS table1 WHERE ( ( table1 . `pk` > 238 AND table1 . `pk` < ( 238 + 151 ) OR ( table1 . `pk`  IN (114, 171) OR table1 . `col_varchar_key` <> 'i' ) ) AND table1 . `pk` <> 1 )  GROUP BY field2 ";

my $dbh = DBI->connect($dsn);
my %counts;
foreach my $trial (1..500000) {
        my $sth = $dbh->prepare($query);
        $sth->execute();
        print localtime()." [$$] trial: $trial; rows ".$sth->rows()."\n";
        $counts{$sth->rows()}++;
        exit if scalar(keys %counts) > 1;
}
[12 Jan 2010 22:09] Patrick Crews
After consultation with dev's, found that using the following in the init-file:
SET GLOBAL debug = '+d,optimizer_innodb_ds_mrr,optimizer_no_icp';

Would remove the bug, regardless of engine_condition_pushdown settings.

It should be noted that I tried an updated 6.0-codebase-bugfixing tree (rather than the ICP test tree) and was unable to duplicate this with any combination of 
@@optimizer_switch=engine_condition_pushdown=on|off,index_condition_pushdown=on|off;