Bug #18064 GROUP BY with undetermined column definition returns results instead of error
Submitted: 8 Mar 2006 8:55 Modified: 20 Apr 2006 12:24
Reporter: Norbert van Nobelen Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:mysql-standard-5.0.18-linux-i686 OS:Linux (OpenSuSE linux 10)
Assigned to: Georgi Kodinov CPU Architecture:Any

[8 Mar 2006 8:55] Norbert van Nobelen
Description:
An inner query works while it should give an error, the result could become undetermined because of improper validation of the SQL code.

How to repeat:
The following SQL works for unknown reason:

SELECT a.gameID,a.bonusStatus, a.userID
FROM transaction AS a,
(SELECT MIN(a.transactionID) AS transactionID,b.userID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID AND a.type=10
GROUP BY userID) AS b
WHERE a.userID=b.userID;

As you can see, the GROUP BY has an error in it: userID instead of a.userID or b.userID.
Altering the SQL slightly results in the expected error:

SELECT a.gameID,a.bonusStatus, a.userID
FROM transaction AS a,
(SELECT MIN(a.transactionID) AS transactionID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID AND a.type=10
GROUP BY userID) AS b
WHERE a.transactionID=b.transactionID;
ERROR 1052 (23000): Column 'userID' in group statement is ambiguous

Running the inner query alone gives the same error:
SELECT MIN(a.transactionID) AS transactionID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID AND a.type=10
GROUP BY userID;
ERROR 1052 (23000): Column 'userID' in group statement is ambiguous

So in this case altering the WHERE statement in the outer query resulted in the inner query to work or not to work, while it just always should return an error.
The error is also found in mysql-standard-5.0.13-rc-linux-i686-glibc23
I am using the binary distributions downloaded from mysql.com.

Reproduction code:
CREATE TABLE user (userID VARCHAR(20)
) TYPE=InnoDB;

CREATE TABLE transaction (
transactionID INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY, \
userID VARCHAR(20), \
gameID INTEGER, \
bonusStatus INT DEFAULT 0,
FOREIGN KEY (userID) REFERENCES user (userID)
) TYPE=InnoDB;

CREATE INDEX trans_gameID ON transaction (gameID);
CREATE INDEX trans_tid ON transaction (transactionID);
CREATE INDEX trans_userID ON transaction (userID);
CREATE INDEX trans_bonusStatus ON transaction (bonusStatus);
CREATE INDEX user_userID ON user (userID);

Since if only concerns the SQL interpretor, no data is needed. Just run the example SQL and observe the issue.
With data btw it returns results! So it is not that it fails in a later stadium.
[9 Mar 2006 10:37] Hartmut Holzgraefe
verified in 4.1, 5.0 and 5.1, stripped down test case below:

DROP TABLE IF EXISTS user;
CREATE TABLE user 
(
  userID INTEGER
);

DROP TABLE IF EXISTS transaction;
CREATE TABLE transaction (
transactionID INTEGER, 
userID INTEGER
);

SELECT a.userID
FROM transaction AS a,
(SELECT MIN(a.transactionID) AS transactionID,b.userID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID
GROUP BY userID) AS b
WHERE a.userID=b.userID;

SELECT a.userID
FROM transaction AS a,
(SELECT MIN(a.transactionID) AS transactionID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID
GROUP BY userID) AS b
WHERE a.transactionID=b.transactionID;

SELECT MIN(a.transactionID) AS transactionID
FROM transaction AS a,user AS b
WHERE a.userID=b.userID
GROUP BY userID;
[17 Apr 2006 9:41] Georgi Kodinov
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

The behavior you are observing is due to the way MySQL resolves unqualified references in GROUP BY/ORDER BY/HAVING clauses. 
The test case boils down to the following two selects : 

This one throws an error :
SELECT MIN(a.transactionID) AS transactionID
 FROM transaction AS a,user AS b
 WHERE a.userID=b.userID
 GROUP BY userID;

And this one works (no error) :
SELECT MIN(a.transactionID) AS transactionID, b.UserID 
 FROM transaction AS a,user AS b
 WHERE a.userID=b.userID
 GROUP BY userID;

As indicated in section 13.2.7 (http://dev.mysql.com/doc/refman/5.0/en/select.html) of the Reference Manual the select column aliases can be used in GROUP BY, ORDER BY, or HAVING clauses.

Actually the MySQL name resolution for unqualified column references in GROUP BY, ORDER BY, or HAVING works as follows :
1. Searches the resulting column names(or aliases) in the select list
2. Searches the fields of the tables in the FROM clause.

So in this particular case when it resolves the UserID in the ORDER BY clause of the second SELECT it finds a column of that name in the select list (witch is unambiguous).
[20 Apr 2006 12:24] Paul DuBois
Thank you for your bug report. This issue has been addressed in the
documentation. The updated documentation will appear on our website
shortly, and will be included in the next release of the relevant
product(s).