Bug #470 Problem with multiple "left outer join" and "using" instead of "on"
Submitted: 21 May 2003 6:44 Modified: 25 Aug 2004 0:02
Reporter: Raf Schietekat Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:4.1 OS:Any (All)
Assigned to: Igor Babaev CPU Architecture:Any

[21 May 2003 6:44] Raf Schietekat
Description:
See example in "How to repeat". The lines with "using" don't work, they give the following error:
"ERROR 1054 at line 24: Unknown column 'test.T2.f3' in 'on clause'"
(with 24 the line number of "select" in my "mysql < tem.sql" file).

Just to make sure, if everything about T3 is removed from the select statement (",T3.f5" and the second "left outer join" line, the correct result is displayed even with "using".

The problem seems to be that left outer join is indeed left-associative, but that not all the column names from the result of the first or "innermost" left outer join (f1, f2, f3, f4) are available for the second or "outermost" left outer join, only the ones from the table just textually left of the second left outer join, as if "left outer join" were right-associative, although using parentheses around the first or "innermost" left outer join makes no difference there as an attempted workaround. Isn't this a bug? I've found some things with aliases in the bug search, but there are no aliases here, so, if it is indeed a bug, I think it might be new.

For a workaround, replace the lines with "using" by the lines with "on" (the correct result is displayed).

And a possibly related issue: if the list of columns is replaced with just an asterisk, duplicate columns are displayed, e.g., f1, f2, f3, f2, f4, f3, f5: is that correct (might very well be, but I thought I'd mention it anyway)?

How to repeat:
-- use after first time for easy experimentation (careful!)
  drop table T1;
  drop table T2;
  drop table T3;
create table T1(
  f1 integer,
  f2 integer,
  f3 integer
  );
insert into T1 (f1, f2, f3) values (1, 2, 3);
create table T2(
  f2 integer,
  f4 integer
  );
insert into T2 (f2, f4) values (2, 4);
create table T3(
  f3 integer,
  f5 integer
  );
insert into T3 (f3, f5) values (3, 5);
select
    T1.f1,
    T1.f2,
    T1.f3,
    T2.f4,
    T3.f5
  from
    T1
    left outer join T2 using (f2)
    left outer join T3 using (f3)
    -- left outer join T2 on T1.f2=T2.f2
    -- left outer join T3 on T1.f3=T3.f3
  ;

Suggested fix:
Make all column names available from nested joins.
[23 May 2003 9:53] Raf Schietekat
The problem seems to be in sql_yacc.yy, for rule <using_list>. Joins are not maintained as trees, only as lists, with each table joining the list in <join_table>. Early in <join_table_list>, the result of the left outer join of the first two tables is put in sel->db1 for use in <using_list>, but that really points to the lexically last table (through add_table_to_list()). So when <using_list> creates a new Item_field(sel->db1,sel->table1,$1.str), or similarly if there's more than one field, it effectively attaches itself to the lexically last table in the join subtree, rather than to the conceptual result of that syntactical subtree.

If this analysis is correct so far, a minimal though possibly still incomplete solution would be to have a new function that looks for the correct table in the list, and use its result to get the arguments for the Item_field constructor mentioned above.
[17 May 2004 14:04] Mark Sadler
What happened to the status of this report?  Its blank what does that mean?
I have this problem on 4.1.1-alpha (Linux).

ie. If you have 

SELECT * FROM Table1 INNER JOIN Table2 USING (something) INNER JOIN Table3 USING (a field thats in Table1)

You get Unknown Column.
[17 May 2004 14:45] Sergei Golubchik
it's blank, probably because it was created in the early days before the field "Status" was added. Changing to "Open"
[18 May 2004 22:18] Alexander Keremidarski
Mark,

Can you provide the exact query you use and error message you get?

SELECT * FROM Table1 INNER JOIN Table2 USING (something) INNER JOIN Table3 USING
(a field thats in Table1)

USING can be used only with columns which exist in both tables 

Consider following example:

CREATE TABLE `u1` (
  `a` int(11) default NULL,
  `b` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
 
CREATE TABLE `u2` (
  `a` int(11) default NULL,
  `c` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `u3` (
  `a` int(11) default NULL,
  `c` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(a);
Empty set (0.00 sec)

This works because `a` is presented in all tables.

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(c);
Empty set (0.00 sec)

This works because `c` is presented in both u2 and u3.

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(b);
ERROR 1054 (42S22): Unknown column 'bugs.u2.b' in 'on clause'

This doesn't work because `b` is not presented in u2

Is this what you are reporting?
[19 May 2004 8:41] Raf Schietekat
Alexander, your counterexample obviously wouldn't work because b is not in u3, and probably not because of the (putatively erroneous) reason you offer. In my original report, f3 is in both the result of the innermost join (just not in T2, which is the point of the report) and in the rightmost table of the outermost join. Alternatively, you might explain that (and why) it is significant that a <qualified join> is between two <table reference>s, not between a <table reference> and a <table primary>, disproving the impression of left association.
[19 May 2004 12:38] Sergei Golubchik
You are right, of course.
Unfortunately, we cannot add big behavior-changing patches into the stable version - 4.0. The bug will be fixed in 4.1

Meanwhile you can use ON as a workaround.
[19 May 2004 14:06] Raf Schietekat
But can you also explain the at first sight ambiguous grammar specifically for <qualified join> in the standard (ISO/IEC 9075-2:1999 (E) 7.7 <joined table>), i.e., the other join types specify <table reference> and <table primary>, whereas <qualified join> specifies twice <table reference>: error or intentional, kept or changed in subsequent versions? My report sort of depends on left associativity...
[7 Jun 2004 11:55] Michael Widenius
The problem is that in 4.0 and before MySQL did not really support the concept of braces in the JOIN part. (Braces was just read and ignored as some program like Microsoft Access used braces even when they was not needed).

When using "LEFT JOIN ... USING" MySQL 4.0 only connects the 2 adjacent tables, not the one before in the chain.

We have already code for full brace support for MySQL 4.1 and it should be pushed in the 4.1 tree shortly (it just have to be reviewed first...)
[24 Jun 2004 11:44] Michael Widenius
The fix will be pushed in 5.0 (Sorry, but as the change is huge, we can't put it in 4.1 without risking braking something else).
[25 Aug 2004 0:02] Igor Babaev
This can't be fixed in 4.1 as nested joins are not supported there.
It will be fixed in 5.0.x (5.0.2 already supports nested joins).