Bug #470 Problem with multiple "left outer join" and "using" instead of "on"
Submitted: 21 May 2003 8:44 Modified: 25 Aug 2004 2:02
Reporter: Raf Schietekat
Status: Won't fix
Category:Server Severity:S3 (Non-critical)
Version:4.1 OS:Any (All)
Assigned to: Bugs System Target Version:

[21 May 2003 8:44] Raf Schietekat
Description:
See example in "How to repeat". The lines with "using" don't work, they give the
following error:
"ERROR 1054 at line 24: Unknown column 'test.T2.f3' in 'on clause'"
(with 24 the line number of "select" in my "mysql < tem.sql" file).

Just to make sure, if everything about T3 is removed from the select statement (",T3.f5"
and the second "left outer join" line, the correct result is displayed even with
"using".

The problem seems to be that left outer join is indeed left-associative, but that not all
the column names from the result of the first or "innermost" left outer join (f1, f2, f3,
f4) are available for the second or "outermost" left outer join, only the ones from the
table just textually left of the second left outer join, as if "left outer join" were
right-associative, although using parentheses around the first or "innermost" left outer
join makes no difference there as an attempted workaround. Isn't this a bug? I've found
some things with aliases in the bug search, but there are no aliases here, so, if it is
indeed a bug, I think it might be new.

For a workaround, replace the lines with "using" by the lines with "on" (the correct
result is displayed).

And a possibly related issue: if the list of columns is replaced with just an asterisk,
duplicate columns are displayed, e.g., f1, f2, f3, f2, f4, f3, f5: is that correct (might
very well be, but I thought I'd mention it anyway)?

How to repeat:
-- use after first time for easy experimentation (careful!)
  drop table T1;
  drop table T2;
  drop table T3;
create table T1(
  f1 integer,
  f2 integer,
  f3 integer
  );
insert into T1 (f1, f2, f3) values (1, 2, 3);
create table T2(
  f2 integer,
  f4 integer
  );
insert into T2 (f2, f4) values (2, 4);
create table T3(
  f3 integer,
  f5 integer
  );
insert into T3 (f3, f5) values (3, 5);
select
    T1.f1,
    T1.f2,
    T1.f3,
    T2.f4,
    T3.f5
  from
    T1
    left outer join T2 using (f2)
    left outer join T3 using (f3)
    -- left outer join T2 on T1.f2=T2.f2
    -- left outer join T3 on T1.f3=T3.f3
  ;

Suggested fix:
Make all column names available from nested joins.
[23 May 2003 11:53] Raf Schietekat
The problem seems to be in sql_yacc.yy, for rule <using_list>. Joins are not maintained as
trees, only as lists, with each table joining the list in <join_table>. Early in
<join_table_list>, the result of the left outer join of the first two tables is put in
sel->db1 for use in <using_list>, but that really points to the lexically last table
(through add_table_to_list()). So when <using_list> creates a new
Item_field(sel->db1,sel->table1,$1.str), or similarly if there's more than one field, it
effectively attaches itself to the lexically last table in the join subtree, rather than
to the conceptual result of that syntactical subtree.

If this analysis is correct so far, a minimal though possibly still incomplete solution
would be to have a new function that looks for the correct table in the list, and use its
result to get the arguments for the Item_field constructor mentioned above.
[17 May 2004 16:04] Mark Sadler
What happened to the status of this report?  Its blank what does that mean?
I have this problem on 4.1.1-alpha (Linux).

ie. If you have 

SELECT * FROM Table1 INNER JOIN Table2 USING (something) INNER JOIN Table3 USING (a field
thats in Table1)

You get Unknown Column.
[17 May 2004 16:45] Sergei Golubchik
it's blank, probably because it was created in the early days before the field "Status"
was added. Changing to "Open"
[19 May 2004 0:18] Alexander Keremidarski
Mark,

Can you provide the exact query you use and error message you get?

SELECT * FROM Table1 INNER JOIN Table2 USING (something) INNER JOIN Table3 USING
(a field thats in Table1)

USING can be used only with columns which exist in both tables 

Consider following example:

CREATE TABLE `u1` (
  `a` int(11) default NULL,
  `b` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
 
CREATE TABLE `u2` (
  `a` int(11) default NULL,
  `c` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

CREATE TABLE `u3` (
  `a` int(11) default NULL,
  `c` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(a);
Empty set (0.00 sec)

This works because `a` is presented in all tables.

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(c);
Empty set (0.00 sec)

This works because `c` is presented in both u2 and u3.

mysql> SELECT * FROM u1 INNER JOIN u2 USING(a) INNER JOIN u3 USING(b);
ERROR 1054 (42S22): Unknown column 'bugs.u2.b' in 'on clause'

This doesn't work because `b` is not presented in u2

Is this what you are reporting?
[19 May 2004 10:41] Raf Schietekat
Alexander, your counterexample obviously wouldn't work because b is not in u3, and
probably not because of the (putatively erroneous) reason you offer. In my original
report, f3 is in both the result of the innermost join (just not in T2, which is the
point of the report) and in the rightmost table of the outermost join. Alternatively, you
might explain that (and why) it is significant that a <qualified join> is between two
<table reference>s, not between a <table reference> and a <table primary>, disproving the
impression of left association.
[19 May 2004 14:38] Sergei Golubchik
You are right, of course.
Unfortunately, we cannot add big behavior-changing patches into the stable version - 4.0.
The bug will be fixed in 4.1

Meanwhile you can use ON as a workaround.
[19 May 2004 16:06] Raf Schietekat
But can you also explain the at first sight ambiguous grammar specifically for <qualified
join> in the standard (ISO/IEC 9075-2:1999 (E) 7.7 <joined table>), i.e., the other join
types specify <table reference> and <table primary>, whereas <qualified join> specifies
twice <table reference>: error or intentional, kept or changed in subsequent versions? My
report sort of depends on left associativity...
[7 Jun 2004 13:55] Michael Widenius
The problem is that in 4.0 and before MySQL did not really support the concept of braces
in the JOIN part. (Braces was just read and ignored as some program like Microsoft Access
used braces even when they was not needed).

When using "LEFT JOIN ... USING" MySQL 4.0 only connects the 2 adjacent tables, not the
one before in the chain.

We have already code for full brace support for MySQL 4.1 and it should be pushed in the
4.1 tree shortly (it just have to be reviewed first...)
[24 Jun 2004 13:44] Michael Widenius
The fix will be pushed in 5.0 (Sorry, but as the change is huge, we can't put it in 4.1
without risking braking something else).
[25 Aug 2004 2:02] Igor Babaev
This can't be fixed in 4.1 as nested joins are not supported there.
It will be fixed in 5.0.x (5.0.2 already supports nested joins).