MySQL Bugs: #6709: Subquery: table subquery allows duplicate column names

Bug #6709	Subquery: table subquery allows duplicate column names
Submitted:	18 Nov 2004 22:28	Modified:	4 Oct 2005 10:06
Reporter:	Trudy Pelzer	Email Updates:
Status:	Can't repeat	Impact on me:	None
Category:	MySQL Server	Severity:	S3 (Non-critical)
Version:	5.0.2-alpha-debug	OS:	Linux (SuSE 8.2)
Assigned to:	Oleksandr Byelkin	CPU Architecture:	Any

Description:
The MySQL Reference Manual says:
Subqueries are legal in a SELECT statement's FROM clause. 
The syntax that you'll actually see is: 
SELECT ... FROM (subquery) AS name ...
...  Any columns in the subquery select list must have unique 
names. 

This is in agreement with the SQL Standard, which says:
17) Case:
a) If the i-th <derived column> in the <select list>
specifies an <as clause> that contains a <column name>
CN, then the <column name> of the i-th column of the
result is CN.
b) If the i-th <derived column> in the <select list> 
does not specify an <as clause> and the <value
expression> of that <derived column> is a single column
reference, then the <column name> of the i-th column
of the result is the <column name> of the column
designated by the column reference.
c) Otherwise, the <column name> of the i-th column of
the <query specification> is implementation-dependent.

It also says:
11) No <column name> shall be specified more than once 
in a <derived column> list.

So the Manual is correct. If I write a subquery and the
columns contained therein have duplicate names, then I
must use the AS clause to give them unique names. If
a "column" is the result of an expression and I do not
name it explicitly, then the DBMS must give it some
unique name. Keep in mind that all names must be unique
only for the subquery's scope -- this means that the same
column name can be used in an outer part of the query.

The uniqueness requirement for column names is, however, 
not being enforced; one can write a subquery that contains 
duplicate column names.

How to repeat:
mysql> select * from (select 1 as a, 1 as a) as t1, (select 1 as a, 1 as a) as t2;
+---+---+---+---+
| a | a | a | a |
+---+---+---+---+
| 1 | 1 | 1 | 1 |
+---+---+---+---+
-- This is the incorrect response. The SELECT should fail with a
syntax error: SQLSTATE 42S21 duplicate column name 'a'

Verified with 4.1.8 and 5.0.2

Thank you for bugreport. This bug is fixed now:
query 'select * from (select 1 as a, 1 as a) as t1, (select 1 as a, 1 as a) as t2' failed: 1060: Duplicate column name 'a'

I object to this 'fix', it has caused too many problems for everyone I know. This 'bug' has been fixed, but it means this query doesn't work any more:

select * from table_a join table_b on table_a.a_id = table_b.a_id;

This has caused many problems for all developers I know, and has prevented many people upgrading. It is common to use the same field name in two different tables, as primary and foreign keys, and require all * fields returned. In the example, the column 'a_id' would appear twice but would have the same value so it should still be accessable. This should be an exception and MySQL should not throw an error. The error should be thrown when the columns have different values, or, there should be no error and the fields automaticlly prefixed with the table names. Has anyone else had this problem? Does anyone agree?

Please do not reply to a closed bug report: our bug verification team may not notice it and your feedback may get lost. Submitting a separate bug report in such circumstances would ensure
that your bug is timely verified by an assigned team member and fixed in timely manner.

As to the actual problem, I can't repeat it:
mysql> drop table t1, t2;
Query OK, 0 rows affected (0.01 sec)

mysql> create table t1 (a int, b int, c int);
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t1 (a,b,c) values (1,2,3), (2,3,4), (4,5,6);
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> create table t2 select * from t1;
Query OK, 3 rows affected (0.10 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t1 join t2 on t1.a=t2.a
    -> ;
+------+------+------+------+------+------+
| a    | b    | c    | a    | b    | c    |
+------+------+------+------+------+------+
|    1 |    2 |    3 |    1 |    2 |    3 |
|    2 |    3 |    4 |    2 |    3 |    4 |
|    4 |    5 |    6 |    4 |    5 |    6 |
+------+------+------+------+------+------+
3 rows in set (0.00 sec)

Please, let me stress it again: there is no guarantee whatsoever your response will be addressed
if it's present in this bug report and not in a separate one.

I concur with the objection, it's not a nice "fix".

In particular, this "fixes" things so that I can no longer sort a set of data using a subquery, because the data happens to have two columns of the same name (and there is no way in the SELECT to avoid one of them, since I do not know all of the column names beforehand and can thus not pick them out individually).

Trudy, was there any particular problem that you were trying to fix with this?

(It doesn't seem like there is any reasoning in the SQL Standard either..)

Hi David,

In our efforts to make MySQL more SQL-standard compliant,
I found and reported this bug. The Standard is clear:
"11) No <column name> shall be specified more than once 
in a <derived column> list."
means that there must be a unique name for every column.
I presume that the rule is there to help avoid the
confusion that could be caused by unambiguous column names
in a list, but I can't speak for the authors of the SQL
Standard.

I'm using MySql 5 with Delphi 7 and MyDac components.
I try to get some specific field from this kind of select:
"Select * from Table1 as Tbl1, Table2 as Tbl2 where tbl1.id = tbl2.id_Table1"

Suppose that both table1 and table2 have a field named "MyValue";
Suppose I have to retrieve them...
Recordset obtains that there are 2 fields named "MyValue" instead of table1.MyValue and table2.Myvalue and so it refers to them calling MyValue_1 ad MyValue_2.
Explained by MySql Query Browser, I see that something completes alias giving them the same alias, but most of db, use completing alias with "tablename.fieldname" if there are same fieldname, that I think is very much useful and usable.
In runtime complex query, builded by my application, this is really a problem.
And suppose that this query is a subquery, is it possbile that we have to esplicit write alias for al fields? * became really unuseful
Maybe I'm wrong in something?
Thank you in advance
EM

I'd like to note that at least in 4.1.25 the fix is not working again (regression?), whereas later versions have it.