Bug #15643 Can't group on calculated field aliased as other field
Submitted: 9 Dec 2005 18:42 Modified: 6 Sep 2006 19:12
Reporter: Nathan Tanner Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S2 (Serious)
Version:5.0.15/5.0.18 BK/5.0.22 OS:Any (All)
Assigned to: Paul DuBois CPU Architecture:Any

[9 Dec 2005 18:42] Nathan Tanner
Description:
Not sure if this is a real bug or just an "undocumented feature", but when you run the test case queries on 4.0 or 4.1, you get:

test> select 1 as id, count(*) from test group by id;
+----+----------+
| id | count(*) |
+----+----------+
|  1 |        5 |
+----+----------+
1 row in set (0.00 sec)

As of 5.0, you get:

test> select 1 as id, count(*) from test group by id;
+----+----------+
| id | count(*) |
+----+----------+
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
+----+----------+
5 rows in set, 1 warning (0.00 sec)

Running SHOW WARNINGS you get:

  Level: Warning
   Code: 1052
Message: Column 'id' in group statement is ambiguous

That is fine and dandy, but it would be nice if the same results would come up, even if they still generated the warning.

The problem with this is that there is no way to alias a calculated field that same as a field in a table, and then group on that field. It indicates 'id' is ambiguous, but there is no way to dispel the ambiguity by making it group by the calculated 'id' field.

How to repeat:
create table test (id int not null auto_increment primary key);

insert into test values (null),(null),(null),(null),(null);

select 1 as id, count(*) from test group by id;

Suggested fix:
When the GROUP BY statement is ambiguous, default to assume the calculated field like it did in previous versions.
[9 Dec 2005 20:26] MySQL Verification Team
miguel@hegel:~/dbs/5.0> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 5.0.18-debug

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table test (id int not null auto_increment primary key);
Query OK, 0 rows affected (0.01 sec)

mysql> 
mysql> insert into test values (null),(null),(null),(null),(null);
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> 
mysql> select 1 as id, count(*) from test group by id;
+----+----------+
| id | count(*) |
+----+----------+
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
|  1 |        1 |
+----+----------+
5 rows in set, 1 warning (0.00 sec)
[17 Apr 2006 10:48] Georgi Kodinov
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

The change in behaviour is because MySQL name resolution for GROUP BY have been changed so that derived columns do not shadow table columns from the FROM clause.
[22 Aug 2006 21:35] [ name withheld ]
Georgi,

    Although the documentation does note that this is not a bug, there is still a bug here.

    The documenation clearly states at the following address

http://dev.mysql.com/doc/refman/5.1/en/select.html

Excerpt:

"When MySQL resolves an unqualified column or alias reference in an ORDER BY, GROUP BY, or HAVING clause, it first searches for the name in the select_expr values. If the name is not found, it looks in the columns of the tables named in the FROM clause."

    Clearly, Mysql is not searching properly, otherwise the group by would have occurred in the correct fashion along with the warning.  

    I believe this was Nathan's main point.  The documentation clearly aims to keep Mysql 5 compatible with Mysql 4 on this issue with an extra warning issue.  However, this is not happening.

John
Web Programmer
[22 Aug 2006 22:42] Nathan Tanner
Thanks, John, for researching the documentation and supporting my initial argument.

I agree, this should be classified as a bug. At the very least, the documentation should make a special note of pointing out that it no longer "first searches for the name in the select_expr values", but instead "looks in the columns of the tables named in the FROM clause" -- and that this behavior has changed with the introduction of MySQL 5.0.
[22 Aug 2006 23:27] MySQL Verification Team
Thank you for the feedback, however Georgi is right isn't a bug and
I am changing the status to Analyzing for internal discuss about the
doubts you got from our Manual. Just for the record below how MaxDB
behaves:

sqlcli maxdb1=> select 1 as id, count(*) from test group by id
| ID     | EXPRESSION1        |
| ------ | ------------------ |
|      1 |                  1 |
|      1 |                  1 |
|      1 |                  1 |
|      1 |                  1 |
|      1 |                  1 |

5 rows selected (1 msec)

sqlcli maxdb1=>
[6 Sep 2006 19:12] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

I changed this:

When MySQL resolves an unqualified column or alias reference in an ORDER
BY, GROUP BY, or HAVING clause, it first searches for the name in the
select_expr values. If the name is not found, it looks for the columns
of the tables named in the FROM clause.

To this (based on PeterG's suggestion):

MySQL resolves unqualified column or alias references in ORDER BY
clauses by searching in the select_expr values, then in the columns
of the tables in the FROM clause. For GROUP BY or HAVING clauses, it
searches the FROM clause before searching in the select_expr values.
(For GROUP BY and HAVING, this differs from the pre-MySQL 5.0
behavior that used the same rules as for ORDER BY.)