Bug #51983 OUTER JOINs with WHERE condition may produce incorrect results
Submitted: 12 Mar 2010 10:45 Modified: 18 May 2010 7:29
Reporter: Ole John Aske Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.41, 5.1.46-debug OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[12 Mar 2010 10:45] Ole John Aske
Description:
Adding a WHERE condition to a (complex) query containing LEFT & RIGHT
outer joins may return incorrect result sets.

The occurrence of the error partly depends on the storage engine being used, but
I assume this is more related to different execution plans being produced for
the different engines.

Repeating the same query in the attached Cluster database results in the expected results being returned for this particular query:

w/ WHERE table3 . `pk` IS NULL

+-----+------+------+------+------+------+------+------+------+
| pk1 | pk2  | pk3  | pk4  | pk5  | pk6  | pk7  | pk8  | pk9  |
+-----+------+------+------+------+------+------+------+------+
|   4 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   9 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+-----+------+------+------+------+------+------+------+------+
3 rows in set (0.01 sec)

w/ WHERE table3 . `pk` IS NOT NULL

+-----+------+------+------+------+------+------+------+------+
| pk1 | pk2  | pk3  | pk4  | pk5  | pk6  | pk7  | pk8  | pk9  |
+-----+------+------+------+------+------+------+------+------+
|   7 |    7 |    2 |    7 | NULL | NULL | NULL | NULL |   16 |
|   3 |    3 |   13 |    3 | NULL | NULL | NULL | NULL |   15 |
|  10 |   10 |   10 |   10 | NULL | NULL | NULL | NULL |   18 |
|   5 |    5 |    4 |    5 | NULL |   10 |   10 |   10 |   10 |
|  11 |   11 |    6 |   11 | NULL | NULL | NULL | NULL |   20 |
|   1 |    1 |   11 |    1 | NULL | NULL | NULL | NULL |   14 |
|   2 |    2 |   12 |    2 | NULL | NULL | NULL | NULL |   11 |
|   8 |    8 |    7 |    8 | NULL | NULL | NULL | NULL |   19 |
+-----+------+------+------+------+------+------+------+------+
8 rows in set (0.01 sec)

However there are other queries running correctly on MyISAM and incorrectly on Cluster, see further below. The makes it likely to not be a SE related bug but rather incorrect execution plans generated by the optimizer.

==== Another example ===

SELECT table1.pk pk1, table1.int_key i, table2.pk pk2, table3.pk pk3, 
       table4.pk pk4, table5.pk pk5,
  table6.pk pk6, table7.pk pk7, table8.pk pk8, table9.pk 
  pk9, table10.pk pk10,  table11.pk pk11
 FROM C AS table1 LEFT JOIN C AS table2 ON table1 . `int_key` = table2 . `int_key` LEFT JOIN C AS table3 ON table1 . `pk` = table3 . `pk` LEFT JOIN BB AS table4 LEFT JOIN CC AS table5 ON table4 . `int_key` = table5 . `pk` RIGHT JOIN BB AS table6 LEFT OUTER JOIN C AS table7 ON table6 . `pk` = table7 . `int_key` LEFT JOIN CC AS table8 LEFT JOIN CC AS table9 LEFT JOIN CC AS table10 ON table9 . `int_key` = table10 . `int_key` ON table8 . `int_key` = table10 . `int_key` LEFT JOIN A AS table11 ON table9 . `pk` = table11 . `pk` ON table6 . `int_key` = table10 . `pk` ON table4 . `int_key` = table10 . `pk` ON table3 . `pk` = table8 . `int_key`;

Then add the WHERE condition: 'WHERE table3 . `pk` > table1 . `int_key`'

Analyzing the results for this query is left as a user exercise.

How to repeat:
1) Load the attached mysqldumps myisam.sql and ndb.sql into
   databases spj_myisam and spj_ndb;

2) 
use spj_myisam;

SELECT
  table1.pk pk1, table2.pk pk2, table3.pk pk3,
  table4.pk pk4, table5.pk pk5, table6.pk pk6,
  table7.pk pk7, table8.pk pk8, table9.pk pk9
FROM 
 C AS table1
 LEFT JOIN
 ( C AS table2
   LEFT JOIN 
   ( D AS table3
     RIGHT JOIN 
     ( C AS table4
       LEFT JOIN 
       A AS table5
       ON table4 . `pk` = table5 . `int_key`
     ) ON table3 . `int_key` = table4 . `int_key`
   ) ON table2 . `int_key` = table3 . `int_key`
   LEFT JOIN 
   ( BB AS table6
     LEFT JOIN 
     ( CC AS table7
       RIGHT JOIN 
       D AS table8 
       ON table7 . `pk` = table8 . `pk`
     ) ON table6 . `pk` = table7 . `pk`
   ) ON table3 . `int_key` = table6 . `int_key`
   RIGHT JOIN 
   ( CC AS table9
   ) ON table3 . `int_key` = table9 . `int_key`
 ) ON table1 . `int_key` = table3 . `int_key`;

Will return the resultset:

+-----+------+------+------+------+------+------+------+------+
| pk1 | pk2  | pk3  | pk4  | pk5  | pk6  | pk7  | pk8  | pk9  |
+-----+------+------+------+------+------+------+------+------+
|   1 |    1 |   11 |    1 | NULL | NULL | NULL | NULL |   14 |
|   2 |    2 |   12 |    2 | NULL | NULL | NULL | NULL |   11 |
|   3 |    3 |   13 |    3 | NULL | NULL | NULL | NULL |   15 |
|   4 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   5 |    5 |    4 |    5 | NULL |   10 |   10 |   10 |   10 |
|   6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   7 |    7 |    2 |    7 | NULL | NULL | NULL | NULL |   16 |
|   8 |    8 |    7 |    8 | NULL | NULL | NULL | NULL |   19 |
|   9 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|  10 |   10 |   10 |   10 | NULL | NULL | NULL | NULL |   18 |
|  11 |   11 |    6 |   11 | NULL | NULL | NULL | NULL |   20 |
+-----+------+------+------+------+------+------+------+------+

3)
Then add the WHERE-condition: (Complete query further below)

'WHERE table3.pk IS NULL'

 to select only the 3 NULL rows from t1:

+-----+------+------+------+------+------+------+------+------+
| pk1 | pk2  | pk3  | pk4  | pk5  | pk6  | pk7  | pk8  | pk9  |
+-----+------+------+------+------+------+------+------+------+
|   1 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   2 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   3 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   4 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   6 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   7 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   8 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|   9 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|  10 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
|  11 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+-----+------+------+------+------+------+------+------+------+
10 rows in set (0.00 sec)

Which certainly is not as expected, Actually the entire result set
has changed.

You may also try with 'IS NOT NULL':

+-----+------+------+------+------+------+------+------+------+
| pk1 | pk2  | pk3  | pk4  | pk5  | pk6  | pk7  | pk8  | pk9  |
+-----+------+------+------+------+------+------+------+------+
|   5 |    5 |    4 |    5 | NULL |   10 |   10 |   10 |   10 |
+-----+------+------+------+------+------+------+------+------+

Which only return as single row instead of the 8 rows expected:

============= Complete query ===========

SELECT
  table1.pk pk1, table2.pk pk2, table3.pk pk3,
  table4.pk pk4, table5.pk pk5, table6.pk pk6,
  table7.pk pk7, table8.pk pk8, table9.pk pk9
FROM 
 C AS table1
 LEFT JOIN
 ( C AS table2
   LEFT JOIN 
   ( D AS table3
     RIGHT JOIN 
     ( C AS table4
       LEFT JOIN 
       A AS table5
       ON table4 . `pk` = table5 . `int_key`
     ) ON table3 . `int_key` = table4 . `int_key`
   ) ON table2 . `int_key` = table3 . `int_key`
   LEFT JOIN 
   ( BB AS table6
     LEFT JOIN 
     ( CC AS table7
       RIGHT JOIN 
       D AS table8 
       ON table7 . `pk` = table8 . `pk`
     ) ON table6 . `pk` = table7 . `pk`
   ) ON table3 . `int_key` = table6 . `int_key`
   RIGHT JOIN 
   ( CC AS table9
   ) ON table3 . `int_key` = table9 . `int_key`
 ) ON table1 . `int_key` = table3 . `int_key`
WHERE table3 . `pk` IS NULL;
[12 Mar 2010 10:46] Ole John Aske
Dump of database spj_myisam

Attachment: myisam.sql (text/x-sql), 39.02 KiB.

[12 Mar 2010 10:47] Ole John Aske
Dump of database spj_ndb

Attachment: ndb.sql (text/x-sql), 39.04 KiB.

[12 Mar 2010 10:48] Ole John Aske
In order to reproduce the bug wo/ crashing, you would likely to manually apply the patch for bug #48971.
[14 Mar 2010 9:39] Sveta Smirnova
Thank you for the report.

Verified as described. MyISAM and InnoDB shows same behavior while NDB returns correct results.
[29 Mar 2010 14:34] Martin Hansson
Bug#48971 needs to be fixed before this bug.
[18 May 2010 7:29] Martin Hansson
Bug was fixed along with Bug#52357.