Bug #20297 a INNER JOIN b RIGHT JOIN c returns excessive rows
Submitted: 6 Jun 2006 16:05 Modified: 20 Jul 2006 13:40
Reporter: Axel Schwenke Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S1 (Critical)
Version:4.0.27, 4.1.20, 4.1-BK OS:
Assigned to: Timour Katchaounov CPU Architecture:Any

[6 Jun 2006 16:05] Axel Schwenke
Description:
A three way join like 
  SELECT ... FROM t1 INNER JOIN t2 RIGHT JOIN t3 
returns excessive rows with NULL for table t2 but NOT NULL values for table t1.

This bug is present in latest 4.0 and 4.1 releases, but not in 5.0.

How to repeat:
pick the associated file "joinbug.sql" and load it in mysql:

mysql> \. joinbug.sql
...
+----+----------+---------+---------+------------+----+--------+
| ID | ParentID | Title   | EventID | Date       | ID | Title  |
+----+----------+---------+---------+------------+----+--------+
|  1 |        1 | Titel 1 |       1 | 2006-06-12 |  1 | Club 1 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  1 | Club 1 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  1 | Club 1 |
|  1 |        1 | Titel 1 |    NULL | NULL       |  2 | Club 2 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  2 | Club 2 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  2 | Club 2 |
|  1 |        1 | Titel 1 |    NULL | NULL       |  3 | Club 3 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  3 | Club 3 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  3 | Club 3 |
+----+----------+---------+---------+------------+----+--------+

the correct result would be:
+------+----------+---------+---------+------------+----+--------+
| ID   | ParentID | Title   | EventID | Date       | ID | Title  |
+------+----------+---------+---------+------------+----+--------+
| 1    | 1        | Titel 1 | 1       | 2006-06-12 | 1  | Club 1 |
| NULL | NULL     |         | NULL    |            | 2  | Club 2 |
| NULL | NULL     |         | NULL    |            | 3  | Club 3 |
+------+----------+---------+---------+------------+----+--------+
[6 Jun 2006 16:07] Axel Schwenke
joinbug.sql

Attachment: joinbug.sql (application/octet-stream, text), 2.89 KiB.

[6 Jun 2006 16:08] Axel Schwenke
FYI. This is the SELECT statement:

SELECT *
      FROM cal_events
INNER JOIN cal_dates
        ON cal_events.ID = cal_dates.EventID AND cal_dates.Date = '2006-06-12'
RIGHT JOIN clubs_main
        ON clubs_main.ID = cal_events.ParentID
[6 Jun 2006 21:53] Hartmut Holzgraefe
Isn't this a known issue with OUTER JOIN processing in pre-5.0 versions?
(exactly the fix that breaks query compatibility when it comes to outer joins, too?)
[7 Jun 2006 9:16] Axel Schwenke
Hartmut, /all,

I just checked the manual, but could not find anything about this limitation of OUTER JOINs. If this bug is of the "won't fix" type, this should be clearly documented.
[7 Jun 2006 12:35] Valeriy Kravchuk
It is described (http://dev.mysql.com/doc/refman/5.0/en/join.html) that:

"In versions of MySQL prior to 5.0.1, parentheses in table_references were just omitted and all join operations were grouped to the left."

Hence the result... It was incorrect, but, as far as I understand, there are no plans to change JOIN behaviour for 4.x.y.
[3 Jul 2006 11:57] Valeriy Kravchuk
On 4.1.21-BK I've got:

mysql> SELECT *
    ->       FROM cal_events
    -> INNER JOIN cal_dates
    ->         ON cal_events.ID = cal_dates.EventID AND cal_dates.Date = '2006-06-12'
    -> RIGHT JOIN clubs_main
    ->         ON clubs_main.ID = cal_events.ParentID;
+----+----------+---------+---------+------------+----+--------+
| ID | ParentID | Title   | EventID | Date       | ID | Title  |
+----+----------+---------+---------+------------+----+--------+
|  1 |        1 | Titel 1 |       1 | 2006-06-12 |  1 | Club 1 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  1 | Club 1 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  1 | Club 1 |
|  1 |        1 | Titel 1 |    NULL | NULL       |  2 | Club 2 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  2 | Club 2 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  2 | Club 2 |
|  1 |        1 | Titel 1 |    NULL | NULL       |  3 | Club 3 |
|  2 |        1 | Titel 2 |    NULL | NULL       |  3 | Club 3 |
|  3 |        2 | Titel 3 |    NULL | NULL       |  3 | Club 3 |
+----+----------+---------+---------+------------+----+--------+
9 rows in set (0.03 sec)

Surely the following:

mysql> SELECT * FROM clubs_main LEFT JOIN cal_events ON clubs_main.ID = cal_eve
nts.ParentID INNER JOIN cal_dates ON cal_events.ID = cal_dates.EventID AND cal_
dates.Date = '2006-06-12';
+----+--------+------+----------+---------+---------+------------+
| ID | Title  | ID   | ParentID | Title   | EventID | Date       |
+----+--------+------+----------+---------+---------+------------+
|  1 | Club 1 |    1 |        1 | Titel 1 |       1 | 2006-06-12 |
+----+--------+------+----------+---------+---------+------------+
1 row in set (0.00 sec)

gives different result, but RIGHT JOIN properly rewritten as LEFT JOIN for me looks like:

mysql> SELECT * FROM clubs_main LEFT JOIN (cal_events INNER JOIN cal_dates ON c
al_events.ID = cal_dates.EventID AND cal_dates.Date = '2006-06-12') ON clubs_ma
in.ID = cal_events.ParentID;
+----+--------+----+----------+---------+---------+------------+
| ID | Title  | ID | ParentID | Title   | EventID | Date       |
+----+--------+----+----------+---------+---------+------------+
|  1 | Club 1 |  1 |        1 | Titel 1 |       1 | 2006-06-12 |
|  2 | Club 2 |  1 |        1 | Titel 1 |    NULL | NULL       |
|  3 | Club 3 |  1 |        1 | Titel 1 |    NULL | NULL       |
|  1 | Club 1 |  2 |        1 | Titel 2 |    NULL | NULL       |
|  2 | Club 2 |  2 |        1 | Titel 2 |    NULL | NULL       |
|  3 | Club 3 |  2 |        1 | Titel 2 |    NULL | NULL       |
|  1 | Club 1 |  3 |        2 | Titel 3 |    NULL | NULL       |
|  2 | Club 2 |  3 |        2 | Titel 3 |    NULL | NULL       |
|  3 | Club 3 |  3 |        2 | Titel 3 |    NULL | NULL       |
+----+--------+----+----------+---------+---------+------------+
9 rows in set (0.00 sec)

And we have the same results, it seems. Moreover, on 5.0.24 thewy seems the same and CORRECT:

mysql> SELECT *
    ->       FROM cal_events
    -> INNER JOIN cal_dates
    ->         ON cal_events.ID = cal_dates.EventID AND cal_dates.Date = '2006-
06-12'
    -> RIGHT JOIN clubs_main
    ->         ON clubs_main.ID = cal_events.ParentID;
+------+----------+---------+---------+------------+----+--------+
| ID   | ParentID | Title   | EventID | Date       | ID | Title  |
+------+----------+---------+---------+------------+----+--------+
|    1 |        1 | Titel 1 |       1 | 2006-06-12 |  1 | Club 1 |
| NULL |     NULL | NULL    |    NULL | NULL       |  2 | Club 2 |
| NULL |     NULL | NULL    |    NULL | NULL       |  3 | Club 3 |
+------+----------+---------+---------+------------+----+--------+
3 rows in set (0.01 sec)

mysql> SELECT * FROM clubs_main LEFT JOIN (cal_events INNER JOIN cal_dates ON c
al_events.ID = cal_dates.EventID AND cal_dates.Date = '2006-06-12') ON clubs_ma
in.ID = cal_events.ParentID;
+----+--------+------+----------+---------+---------+------------+
| ID | Title  | ID   | ParentID | Title   | EventID | Date       |
+----+--------+------+----------+---------+---------+------------+
|  1 | Club 1 |    1 |        1 | Titel 1 |       1 | 2006-06-12 |
|  2 | Club 2 | NULL |     NULL | NULL    |    NULL | NULL       |
|  3 | Club 3 | NULL |     NULL | NULL    |    NULL | NULL       |
+----+--------+------+----------+---------+---------+------------+
3 rows in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.24    |
+-----------+
1 row in set (0.00 sec)

So, the difference is in LEFT JOIN processing in 4.1.x vs. 5.0.x (and correct one). I see no obvious explanation of this difference in http://dev.mysql.com/doc/refman/5.0/en/join.html. So, I think, it is a bug in versions 4.x.y (or a documentation request).
[16 Jul 2006 16:08] Timour Katchaounov
The reported problem is a result of a well-known deficiency of
4.1 and prior versions. The correct processing of nested joins
(including outer joins) is a major new feature in 5.0 that is
out of question to be backported to 4.1.

For a detailed description of the difference in processing
nested [outer] joins please read:
http://dev.mysql.com/doc/refman/5.1/en/nested-joins.html

Documentation team, please consider adding some note to the
4.1 documentation that points out that 4.1 doesn't process
nested [outer] joins correctly, and possibly add a pointer
to the 5.0 documentation that explains the differences.
[20 Jul 2006 13:40] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

I've added a note about 5.0 join processing to:

http://dev.mysql.com/doc/refman/4.1/en/join.html