Bug #4789 Problem with natural join for more than 2 tables
Submitted: 28 Jul 2004 16:01 Modified: 30 Aug 2005 4:15
Reporter: Fabrice Kinnar Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.0 4.1 OS:Any (all)
Assigned to: Timour Katchaounov

[28 Jul 2004 16:01] Fabrice Kinnar
Description:
MySQL doesn't give the good result when "natural joining" more than 2 tables. For example, this request doesn't work:

SELECT *
FROM (a NATURAL JOIN b) NATURAL JOIN c;

The result given by MySQL is the cartesian product of (a NATURAL JOIN b) and c, it doesn't apply the second join criteria.

This request should give excactly the same result, but this one works fine:

SELECT *
FROM (a NATURAL JOIN b), c
WHERE c.id = a.id;

I Assume that tables a and c have a row called id of the same data type of course. What I do is explicitly write the join criteria.

How to repeat:
Simply create 3 tables that can be logacally joined and have fun :-)
[28 Jul 2004 17:38] Fabrice Kinnar
Things are worst than I thought first!

SELECT * FROM a NATURAL JOIN b
and
SELECT * FROM b NATURAL JOIN a
give, of course, identical results

BUT:
SELECT * FROM a NATURAL JOIN b NATURAL JOIN c
and
SELECT * FROM b NATURAL JOIN a NATURAL JOIN c
give DIFFERENT RESULTS!

It's not logical at all!
[28 Jul 2004 20:11] Dean Ellis
I cannot repeat the problem.  Please submit a complete test case with the SQL to create, populate and query the tables demonstrating this behavior, to ensure that we are testing the same thing you are seeing.
[29 Jul 2004 11:06] Fabrice Kinnar
Ok, here is a script to create and populate a db:

SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT;
SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS;
SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION;
SET NAMES utf8;

/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE=NO_AUTO_VALUE_ON_ZERO */;

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `fkinnar`;
USE `fkinnar`;
CREATE TABLE `b` (
  `idb` int(10) unsigned NOT NULL auto_increment,
  `value` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`idb`)
) TYPE=InnoDB;
INSERT INTO `b` (`idb`,`value`) VALUES (1,1),(2,2),(3,3);
CREATE TABLE `c` (
  `ida` int(10) unsigned NOT NULL default '0',
  `idb` int(10) unsigned NOT NULL default '0'
) TYPE=InnoDB;
INSERT INTO `c` (`ida`,`idb`) VALUES (1,1),(2,1),(3,1),(4,1),(5,2),(3,2),(1,2),(2,3),(4,3);
CREATE TABLE `a` (
  `ida` int(10) unsigned NOT NULL auto_increment,
  `value` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`ida`)
) TYPE=InnoDB;
INSERT INTO `a` (`ida`,`value`) VALUES (1,1),(2,2),(3,3),(4,4),(5,5);
/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT;
SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS;
SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;

Then, just try

select * from a natural join c natural join b;
select * from c natural join a natural join b;

And you'll see results are different!

Fab :-)
[29 Jul 2004 17:35] Fabrice Kinnar
I made a mistake in my first script (I simplified things too much LOL!).
Be careful to give differant names to the "value" fields in a and b...

So:
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `fkinnar`;
USE `fkinnar`;
CREATE TABLE `b` (
  `idb` int(10) unsigned NOT NULL auto_increment,
  `valueb` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`idb`)
) TYPE=InnoDB;
INSERT INTO `b` (`idb`,`valueb`) VALUES (1,1),(2,2),(3,3);
CREATE TABLE `c` (
  `ida` int(10) unsigned NOT NULL default '0',
  `idb` int(10) unsigned NOT NULL default '0'
) TYPE=InnoDB;
INSERT INTO `c` (`ida`,`idb`) VALUES (1,1),(2,1),(3,1),(4,1),(5,2),(3,2),(1,2),(2,3),(4,3);
CREATE TABLE `a` (
  `ida` int(10) unsigned NOT NULL auto_increment,
  `valuea` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`ida`)
) TYPE=InnoDB;
INSERT INTO `a` (`ida`,`valuea`) VALUES (1,1),(2,2),(3,3),(4,4),(5,5);
[29 Jul 2004 18:26] Fabrice Kinnar
I think the error comes from the way the request is analysed by MySQL.

If you try:
SELECT * FROM (c natural join a) join b on (c.idb = b.idb)
The result is correct, yeah! :-)

But if you try:
SELECT * FROM (c natural join a) join b using (idb)
You receive an error that says:
Unknown column 'fkinnar.a.idb' in 'on clause' (Error Nr 1054)

It seems MySQL looks for an idb column within "a" (the table immediately on the left of the join clause?), and of course, it doesn't find it. But in "(c natural join a)" there is such a column!

I know parenthesis are ignored by MySQL when parsing a request, but in that case they are not critical. Even with the normal left to right reading the request should be evaluable... And the request:
SELECT * FROM c natural join a join b on (c.idb = b.idb)
Is legal and perfectly understood by the database engine!

With the error I obtained when I tried to evaluate 
SELECT * FROM (c natural join a) join b using (idb)
I understood the result MySQL gave my for
SELECT * FROM (c natural join a) natural join b
As it didn't find an "idb" column to join "b" to the result of "(c natural join a)", it simply returns the cartesian product of "b" and "(c natural join a)", and didn't show any error message!

Hope it helps,
Fab :-)
[29 Jul 2004 18:29] Fabrice Kinnar
A script to create and populate an example database

Attachment: bug 20040729 1100.sql (text/plain), 1.51 KiB.

[29 Jul 2004 18:40] Dean Ellis
Re-opening for now; thank you for the test case.  Yes, the problem is due to the order/method by which the tables are joined.  ie: with common columns (a,b) in tables A and C, and a common column (a) in table B, then:

FROM A NATURAL JOIN B NATURAL JOIN C will produce results you do not expect because column b is not considered for a join condition.  "Connecting" the tables with the common column with FROM B NATURAL JOIN A NATURAL JOIN C allows A and C to join using their B columns.
[30 Aug 2004 19:15] Timour Katchaounov
An even shorter script to reproduce the problem is:

drop table if exists t1;
drop table if exists t2;
drop table if exists t3;

CREATE TABLE t1 (a int(10), t1_val int(10));
CREATE TABLE t2 (b int(10), t2_val int(10));
CREATE TABLE t3 (a int(10), b int(10));

INSERT INTO t1 VALUES (1,1),(2,2);
INSERT INTO t2 VALUES (1,1),(2,2),(3,3);
INSERT INTO t3 VALUES (1,1),(2,1),(3,1),(4,1);

select * from t1 natural join t2 natural join t3;
select * from t1 natural join t3 natural join t2;
[1 Sep 2004 10:06] Timour Katchaounov
This bug is due to different NATURAL JOIN semantics between that of MySQL and the ANSI
SQL. The query:
  select * from (t1 cross join t2) natural join t3;
is equivalent in MySQL to:
  (t1 IJ t2) USING (common columns for t1,t2)[proper proj]
  IJ t3 USING (common columns for t2 and t3)[ proper proj]
while the standard semantics is:
  (t1 IJ t2) USING (common columns for t1,t2)[proper proj]
  IJ t3 USING (common columns for (t1,t2) and t3)[ proper proj]
where 'IJ' stands for INNER JOIN.

Since fixing this bug would require change of semantics, this behavior will
be fixed in the upcoming 5.0 version and not in the current 4.1.
[1 Sep 2004 14:44] Lenz Grimmer
Setting this bug to "Deferred" instead of "Won't fix", as it will be resolved for 5.0 - please mark it
as "Closed" once the changes have been made.
[23 Aug 2005 19:33] Timour Katchaounov
Fixed in 5.0.12.

The patch above contains only a test, the bug itself is fixed by WL#2486.
[30 Aug 2005 4:15] Mike Hillyer
Added bug number to existing changelog entry for this task.