Bug #46305 Wrong and variable result on simple JOIN + XOR + subquery optimizations
Submitted: 20 Jul 2009 12:01 Modified: 12 Jan 2010 14:23
Reporter: Philip Stoev Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:6.0 OS:Any
Assigned to: Tor Didriksen CPU Architecture:Any
Tags: optimizer_switch, semijoin, subquery

[20 Jul 2009 12:01] Philip Stoev
Description:
Even after applying the fixes for bug #46051 and bug #45266, when queries of the following type:

SELECT OUTR .`int_key`
FROM B OUTR2  JOIN C OUTR ON OUTR2 .`datetime_nokey` = OUTR .`datetime_key`  WHERE OUTR .`int_key`  AND OUTR .`int_key`  XOR OUTR .`varchar_key`;

are run concurrently with other queries, they will return a wrong and variable result when executed on Azalea with subquery optimizations turned on, even though the actual query does not contain a subquery.

MRR and join_cache_level do not appear to play a role.

How to repeat:
1. With the RQG, create a background work load:

$ perl runall.pl --basedir=/build/bzr/azalea --threads=1 --queries=1000000 --grammar=conf/subquery_semijoin.yy  

2. Run the following script:

use strict;
use DBI;
$| = 1;

my $dsn = 'dbi:mysql:host=127.0.0.1:port=19306:user=root:database=test';
my $query = "SELECT OUTR .`int_key`  FROM B OUTR2  JOIN C OUTR  ON OUTR2 .`datetime_nokey`  = OUTR .`datetime_key`  WHERE OUTR .`int_key`  AND OUTR .`int_key`  XOR OUTR .`varchar_key`";

my $dbh = DBI->connect($dsn);
my %counts;
foreach my $trial (1..150000) {
        my $sth = $dbh->prepare($query);
        $sth->execute();
        print localtime()." [$$] trial: $trial; rows ".$sth->rows()."\n";
        $counts{$sth->rows()}++;
        exit if scalar(keys %counts) > 1;
}

It will report that the query returns either 1 or 0 rows. 5.1 will consistently return 4 rows.

The problem is also repeatable if the query ends in "XOR OUTR .`int_nokey`", making it perfectly legitimate and warning-free.
[9 Dec 2009 7:43] Tor Didriksen
Currently this grammar (conf/subquery_semijoin.yy) generates queries which
makes the server crash.
A simplified version of the crashing query is

SELECT DISTINCT BIT_XOR( OUTR . `col_datetime_nokey` ) AS X FROM B AS OUTR 
WHERE ( OUTR . `pk` , OUTR . `col_int_nokey` ) IN 
( SELECT  INNR . `col_int_nokey` AS X ,
          INNR . `col_int_key` AS Y FROM BB AS INNR2
   LEFT JOIN CC AS INNR ON ( INNR2 . `col_varchar_key` >= INNR . `col_varchar_nokey` )
)

If I patch in the (not yet approved) patch for Bug#46692 the server 
does not crash for the above query.
So, there is no point in pursuing this bug until Bug#46692 is fixed.
[9 Dec 2009 7:57] Øystein Grøvlen
Bug#46692 does not apply if there are more than one row in each table.
[12 Jan 2010 14:23] Tor Didriksen
The grammar subquery_semijoin.yy has uncovered a series of crash bugs in the server. After the recent bugfixes, I cannot find any more errors.
The indicated query now constently returns 0 rows, both with 6.0-codebase-bugfixing and next-mr-bugfixing.