Bug #48703 nontrivial ORDER/GROUP BY expressions: ambiguous statements not detected
Submitted: 11 Nov 2009 21:58 Modified: 7 Jun 2010 16:41
Reporter: Gleb Shchepa Email Updates:
Status: In progress Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0+ OS:Any
Assigned to: Assigned Account CPU Architecture:Any

[11 Nov 2009 21:58] Gleb Shchepa
Description:
Note: this bug is a showstopper for the bug #45640.

If
  1) the SELECT list defines an alias,
  2) and the FROM list refers to a column with the same name
     as the alias above,
  3) and we are trying to group by this name
the server returns an error:

mysql> CREATE TABLE t1 (a INT, b INT);
Query OK, 0 rows affected (0.11 sec)

mysql> SELECT a b, b FROM t1 GROUP BY b;
ERROR 1052 (23000): Column 'b' in group statement is ambiguous

However, if we use some grouping expression on this
name, the server silently ignores the ambiguity:

mysql> SELECT a b, b FROM t1 GROUP BY b + 0;
Empty set (0.00 sec)

Actually, the server chooses a column from the FROM list
instead of error:

mysql> INSERT INTO t1 VALUES (1,9), (2, 8), (3, 7);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT a b, b FROM t1 GROUP BY b + 0;
+------+------+
| b    | b    |
+------+------+
|    3 |    7 |
|    2 |    8 |
|    1 |    9 |
+------+------+
3 rows in set (0.00 sec)

Also see bug #47434 (similar, but not a duplicate).

How to repeat:
CREATE TABLE t1 (a INT, b INT);
SELECT a b, b FROM t1 GROUP BY b + 0;

Suggested fix:
The server should do a deeper analysis of non-trivial
GROUP BY expression and the ambiguous query should stop with
an error message.
[11 Nov 2009 22:09] Gleb Shchepa
Changing the Category to Optimizer since the problem is inside the
the find_order_in_list function: it doesn't do deep analysis of GROUP BY
expressions but analyses/substitutes the topmost items only, so inner
items are unchecked and unchanged.
[11 Nov 2009 23:12] MySQL Verification Team
Thank you for the bug report.

Microsoft Windows [Version 6.0.6002]
Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

c:\dbs>5.0\bin\mysqladmin -uroot --port=3500 create db7

c:\dbs>5.0\bin\mysql -uroot --port=3500 db7
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.89-Win X64 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE t1 (a INT, b INT);
Query OK, 0 rows affected (0.23 sec)

mysql> SELECT a b, b FROM t1 GROUP BY b;
ERROR 1052 (23000): Column 'b' in group statement is ambiguous
mysql> SELECT a b, b FROM t1 GROUP BY b + 0;
Empty set (0.05 sec)

mysql>
[12 Nov 2009 8:44] Gleb Shchepa
Of course, ORDER BY is affected too.
[12 Nov 2009 9:00] Gleb Shchepa
If an ORDER/GROUP BY expression is not found in the SELECT list,
the find_order_in_list function adds this expression to the
JOIN::all_fields list, so after such a transformation
the result of "SELECT a b, b FROM t1 ORDER BY b+0" is equivalent
to two first columns of the "SELECT a b, b, b+0 FROM t1 ORDER BY 3".

The current server resolves names in the SELECT list with columns
from the FROM list (it doesn't try to resolve names in the
SELECT list recursively with aliases from the same SELECT list),
so the transformed query is not ambiguous (but the original
query is).
[12 Nov 2009 9:18] Gleb Shchepa
Typo in the previous commentary: the current server tries to resolve
aliases in the FROM list, otherwise in the SELECT list.