Bug #88562 Join criteria applied to wrong join in some conditions
Submitted: 20 Nov 2017 16:52 Modified: 15 Mar 2018 21:36
Reporter: Patrick Meredith Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Parser Severity:S3 (Non-critical)
Version:5.5/5.6/5.7/8.0/trunk OS:Any
Assigned to: CPU Architecture:Any
Tags: join, natural join, USING

[20 Nov 2017 16:52] Patrick Meredith
Description:
Join criteria here means a clause with "using" or "on".  For instance, a inner join b on a.id = b.id, "on a.id = b.id" is the join criteria.

Because the MySQL yacc grammar tries to enforce join criteria (i.e., left/right join must have criteria, natural join must not) a syntactic ambiguity exists in the parser whereby join criteria can be associated with the wrong join.  

How to repeat:
Define three tables:

mysql> describe a;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | YES  |     | NULL    |       |
| a     | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.01 sec)

mysql> describe b;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | YES  |     | NULL    |       |
| a     | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> describe c;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) | YES  |     | NULL    |       |
| a     | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> select * from a;
+------+------+
| id   | a    |
+------+------+
|    1 |    2 |
|    2 |    3 |
|    3 |    4 |
+------+------+
3 rows in set (0.00 sec)

mysql> select * from b;
+------+------+
| id   | a    |
+------+------+
|    1 |    2 |
|    2 |    3 |
+------+------+
2 rows in set (0.00 sec)

mysql> select * from c;
+------+------+
| id   | a    |
+------+------+
|    1 |    2 |
+------+------+
1 row in set (0.00 sec)

Then issue the following:

mysql> select * from b natural join c using(id);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'using(id)' at line 1
select * from b natural join c on b.id = c.id;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'on b.id = c.id' at line 1

^These are correct, you cannot have join criteria with natural joins.

mysql> select * from a left join b natural join c using(id);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'using(id)' at line 1
mysql> mysql> select * from a left join b natural join c on b.id = c.id;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'mysql> select * from a left join b natural join c on b.id = c.id' at line 1

^These are also correct, you cannot have join criteria with natural joins.

mysql> select * from a join b natural join c using(id);
+------+------+------+
| id   | a    | a    |
+------+------+------+
|    1 |    2 |    2 |
+------+------+------+
1 row in set (0.00 sec)

select * from a join b natural join c on a.id=b.id;
+------+------+------+------+
| id   | a    | id   | a    |
+------+------+------+------+
|    1 |    2 |    1 |    2 |
+------+------+------+------+
1 row in set (0.00 sec)

select * from a join b natural join c on b.id=c.id;
+------+------+------+------+
| id   | a    | id   | a    |
+------+------+------+------+
|    1 |    2 |    1 |    2 |
|    2 |    3 |    1 |    2 |
|    3 |    4 |    1 |    2 |
+------+------+------+------+

^These should be errors also.  They are passing the parser because the join criteria are being parsed as attached to the a join b instead of the natural join, which produces unexpected results.

Suggested fix:
The easiest way to fix this is to allow left/right join to not have criteria and natural join to have criteria in the parser (yacc grammar), and then have a secondary pass on the abstract syntax tree that throws errors if a natural join has criteria or a left/right join does not.  This also gives the ability to have better error messages such as:

"left/right joins require an on or using clause"

and

"natural joins may not have on or using clauses"
[20 Nov 2017 20:21] MySQL Verification Team
Thank you for the bug report.
[15 Mar 2018 21:36] Roy Lyseng
This is not a bug, MySQL interprets these queries according to the SQL standard.

The behaviour becomes clearer when we add parentheses to separate the join clauses. This query:

  select * from a join b natural join c using(id);

with parentheses added becomes:

  select * from a join (b natural join c) using(id);

b and c are operand tables in a natural join. The result of that join is
joined with a on the id column.

It is also interesting to look at this query:

  select * from a join (b natural join c) on b.id=c.id;

There is still a natural join, but the join criterion between a and the natural join does not reference the a table at all, effectively making this a cross join.