Bug #38010 Wrong result returned when outer join, semijoin and "use index"
Submitted: 10 Jul 2008 9:36 Modified: 14 Jul 2009 22:36
Reporter: Philip Stoev Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:6.0,5.4 OS:Any
Assigned to: CPU Architecture:Any

[10 Jul 2008 9:36] Philip Stoev
Description:
The following SELECT produces wrong results when executed with semijoin:

SELECT  OUTR . `int_key`
FROM B AS OUTR2
LEFT JOIN C AS OUTR
ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` )
WHERE OUTR . `pk` IN (
 SELECT  INNR . `int_key` AS Y
 FROM BB AS INNR2
 LEFT JOIN CC AS INNR
 ON ( INNR2 . `date_nokey` >= INNR . `date_key` )
);

The result is correct in 5.1, when semijoin is disabled and when the select starts with "SELECT OUTR . `pk`".

How to repeat:
A test case will be uploaded shortly.
[10 Jul 2008 9:43] Philip Stoev
Test case for bug 38010

Attachment: bug38010.test (application/octet-stream, text), 5.10 KiB.

[10 Jul 2008 9:44] Philip Stoev
Please find attached the test case. Innodb is required for this test case in order to prevent myisam-specific optimizations from kicking in.

The last query from the test case should return "6" rather than an empty result set.
[10 Jul 2008 11:32] Susanne Ebrecht
Verified as described by given test.

I tested with 5.0, 5.1 and 6.0 bzr tree. This is only in 6.0.

Let me paste some results:

MySQL 5.1:

mysql> SELECT  OUTR . `int_key`
    -> FROM B AS OUTR2
    -> LEFT JOIN C AS OUTR
    -> ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` )
    -> WHERE OUTR . `pk` IN (
    ->  SELECT  INNR . `int_key` AS Y
    ->  FROM BB AS INNR2
    ->  LEFT JOIN CC AS INNR
    ->  ON ( INNR2 . `date_nokey` >= INNR . `date_key` )
    -> );
+---------+
| int_key |
+---------+
|       6 | 
+---------+

mysql> SELECT  OUTR . `int_key`
    -> FROM B AS OUTR2
    -> LEFT JOIN C AS OUTR
    -> ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` )
    -> ;
+---------+
| int_key |
+---------+
|       6 | 
|       9 | 
+---------+
2 rows in set (0.00 sec)

mysql> SELECT  INNR . `int_key` AS Y
    ->  FROM BB AS INNR2
    ->  LEFT JOIN CC AS INNR
    ->  ON ( INNR2 . `date_nokey` >= INNR . `date_key` )
    -> ;
+------+
| Y    |
+------+
|    7 | 
|    0 | 
|    4 | 
+------+
3 rows in set (0.00 sec)

mysql>  SELECT  OUTR . `int_key` FROM B AS OUTR2 LEFT JOIN C AS OUTR ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` ) where OUTR.pk in ('7','0','4');
+---------+
| int_key |
+---------+
|       6 | 
+---------+
1 row in set (0.00 sec)

Now MySQL 6.0:

mysql> SELECT  OUTR . `int_key`
    -> FROM B AS OUTR2
    -> LEFT JOIN C AS OUTR
    -> ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` )
    -> WHERE OUTR . `pk` IN (
    ->  SELECT  INNR . `int_key` AS Y
    ->  FROM BB AS INNR2
    ->  LEFT JOIN CC AS INNR
    ->  ON ( INNR2 . `date_nokey` >= INNR . `date_key` )
    -> );
Empty set (0.01 sec)

mysql> SELECT  OUTR . `int_key`
    -> FROM B AS OUTR2
    -> LEFT JOIN C AS OUTR
    -> ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` )
    -> ;
+---------+
| int_key |
+---------+
|       6 | 
|       9 | 
+---------+
2 rows in set (0.00 sec)

mysql> SELECT  INNR . `int_key` AS Y
    ->  FROM BB AS INNR2
    ->  LEFT JOIN CC AS INNR
    ->  ON ( INNR2 . `date_nokey` >= INNR . `date_key` )
    -> ;
+------+
| Y    |
+------+
|    7 | 
|    0 | 
|    4 | 
+------+
3 rows in set (0.00 sec)

mysql> SELECT  OUTR . `int_key` FROM B AS OUTR2 LEFT JOIN C AS OUTR ON ( OUTR2 . `varchar_nokey` > OUTR . `varchar_key` ) where OUTR.pk in ('7','0','4');
+---------+
| int_key |
+---------+
|       6 | 
+---------+
1 row in set (0.05 sec)

It seems that the subquery result handling is defect here.
[14 Jul 2009 22:36] Patrick Crews
Ran the attached test many times with various settings for optimizer_switch, optimizer_use_mrr, and engine_condition_pushdown.

Unable to see a difference between 5.1 and azalea.