Bug #73332 | ambiguous column error in order by query | ||
---|---|---|---|
Submitted: | 19 Jul 2014 18:58 | Modified: | 6 Aug 2014 15:44 |
Reporter: | Ryan Brothers | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Parser | Severity: | S3 (Non-critical) |
Version: | 5.6.19 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[19 Jul 2014 18:58]
Ryan Brothers
[20 Jul 2014 11:21]
Peter Laursen
Further observations: 1) SELECT 1 AS HOST FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY LENGTH(`host`); -- same error 2) SELECT 1 AS HOST FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY LENGTH('host') -- success 3) SELECT 1 AS USER FROM mysql.user JOIN mysql.db ON (db.host = user.host) -- same error 4) SELECT 1 AS db FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY LENGTH(db); -- success Now, documentation states that an alias may be quoted as an identifier or as a string. Using string-quotes lets the server recognize that the ORDER BY references the alias and not a column (because string-quotes excludes the possiblity of an identifier). But even I think it is very inconsitent that "ORDER BY LENGTH(host)" and "ORDER BY host" do not behave identically. Also I find it extremely weird that 3 and 4) in my examples above return different. "user" and "host" are not reserved words according to http://dev.mysql.com/doc/refman/5.6/en/reserved-words.html - but they still seem to trigger another behaviour in such queries that "non-keyword"-words (such as "db" in case). I have checked with all versions since 5.0.96 that it is the same in them all. -- Peter -- not a MySQL/Oracle person
[20 Jul 2014 11:23]
Peter Laursen
My case 3) got truncated. Here is the right one. SELECT 1 AS USER FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY LENGTH(USER);
[20 Jul 2014 11:39]
Peter Laursen
Adn finally - using an nqoted reserved word retuns a (1064) syntax error as expected. Like here SELECT 1 AS JOIN FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY LENGTH(JOIN); and SELECT 1 AS JOIN FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY JOIN
[21 Jul 2014 10:57]
Sveta Smirnova
Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://dev.mysql.com/doc/ and the instructions on how to report a bug at http://bugs.mysql.com/how-to-report.php When you run "ORDER BY host" alias is used and when you run "ORDER BY LENGTH(host)" MySQL tries to use either db.host or user.host which is ambiguous. You can confirm this if use one of columns which exists only in one of tables and EXPLAIN EXTENDED: mysql> explain extended SELECT 1 AS password FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY length(password); +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+----------------------------------------------+ | 1 | SIMPLE | db | index | PRIMARY | PRIMARY | 420 | NULL | 2 | 100.00 | Using index; Using temporary; Using filesort | | 1 | SIMPLE | user | ref | PRIMARY | PRIMARY | 180 | mysql.db.Host | 1 | 100.00 | NULL | +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+----------------------------------------------+ 2 rows in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select 1 AS `password` from `mysql`.`user` join `mysql`.`db` where (`mysql`.`user`.`Host` = `mysql`.`db`.`Host`) order by length(`mysql`.`user`.`Password`) Compare with: mysql> explain extended SELECT 1 AS password FROM mysql.user JOIN mysql.db ON (db.host = user.host) ORDER BY password; +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | db | index | PRIMARY | PRIMARY | 420 | NULL | 2 | 100.00 | Using index | | 1 | SIMPLE | user | ref | PRIMARY | PRIMARY | 180 | mysql.db.Host | 1 | 100.00 | Using index | +----+-------------+-------+-------+---------------+---------+---------+---------------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec) Note (Code 1003): /* select#1 */ select 1 AS `password` from `mysql`.`user` join `mysql`.`db` where (`mysql`.`user`.`Host` = `mysql`.`db`.`Host`) order by `password`
[21 Jul 2014 11:16]
Peter Laursen
The conclusion here illustrates IMO unseriously bug reports are sometimes treated by MySQL/Oracle, There is nothing in Sveta's last post that we have not already noticed. It is easy to *see* ("*aftersee*") the behavior, obviously. But it is not possible to *foresee* it. Is it completely random? Does some rules apply? Where is the parser behavior with such naming conflict documented? and why does it matter if the alias is wrapped inside a function or not? People should avoid such alias, of course. But still it reveals the primitivity of the MySQL parser in some respects that it had from early days and that have not been improved over 10+ years. IMO this should be verified so that it will be considered in a future parser rewrite aiming at achieving more consistency.
[21 Jul 2014 11:38]
Ryan Brothers
Thanks, I see it's because both the db and user tables have a field called host. Why does MySQL use the alias in "ORDER BY host", but use a field in "ORDER BY LENGTH(host)"? Why is there different behavior there? Should it be consistent in those 2 cases? How could I change my query to order by the length of host, which is a derived field in my query? I tried "ORDER BY `host`", but it still gives the ambiguous column error.
[21 Jul 2014 11:41]
Ryan Brothers
I mean "ORDER BY LENGTH(`host`)" still gives the ambiguous column error.
[21 Jul 2014 11:45]
Sveta Smirnova
Peter, this is not for MySQL who decided how to perform name resolution in this case. See http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt, particularly "6.4 <column reference>" and "6.5 <set function specification>" parts, pay attention for "Outer reference".
[21 Jul 2014 11:47]
Peter Laursen
@Ryan . "ORDER BY LENGTH('host')"(using apostrophes and not backquotes)does not! `host` is still ambigious because both identifiers for database objects and aliases may be `backquoted`. But an identifier cannot be 'singlequoted'.
[21 Jul 2014 12:13]
Ryan Brothers
Peter, From my testing, when I do "ORDER BY LENGTH('host')", it's ordering by the length of the string 'host', which is a constant 4. I just tried the below test which does not give the expected results. The results are coming back in the incorrect order. CREATE TABLE test1 ( host varchar(10) default NULL ); INSERT INTO test1 VALUES ('aaaa'), ('bbb'), ('cc'), ('d'); -- returns incorrect order SELECT * FROM test1 ORDER BY length('host'); -- returns correct order SELECT * FROM test1 ORDER BY length(host);
[22 Jul 2014 13:16]
Peter Laursen
OK! I did not try with data!
[22 Jul 2014 13:19]
Peter Laursen
@Ryan .. if you want more comments/considerations from MySQL supporters, I think you will have to change status from 'Not a Bug' to 'active'.
[6 Aug 2014 11:57]
Hartmut Holzgraefe
The real problem seems to be that in the ORDER BY syntax: ORDER BY {col_name | expr | position} [ASC | DESC], ... Later on in the "SELECT Syntax" page it says: Columns selected for output can be referred to in ORDER BY and GROUP BY clauses using column names, column aliases, or column positions. So the "col_name" in the ORDER BY syntax line can be either a real column name or an alias (and later on it is defined that aliases take precedence) But when using an "expr" the general syntax rules for expressions apply, and these do not know about column aliases. To make the original query work as expected the whole expression parser would need to be modified to be context sensitive whenever seeing something that looks like a column name, and only if the expression is found in a ORDER BY, GROUP BY or HAVING context it may also consider select column aliases as alternatives to consider ... This might be possible, but will probably only lead to other ambiguity side effects, and I wouldn't be surprised if that behavior was in violation with the standard, too. All in all I'd say this is a documentation issue as the manual doesn't state clearly enough that only direct column alias names are supported in the group by list, but that their use in expressions is not ...
[6 Aug 2014 13:32]
Ryan Brothers
Hartmut - thanks for your detailed reply. It looks like it is possible to reference an alias in an ORDER BY expr, but the results aren't correct: SELECT RAND() AS abc FROM mysql.user ORDER BY ABS(abc) This query is a valid query, but the data isn't being sorted correctly. Do you know what it's ordering by?
[6 Aug 2014 13:40]
Peter Laursen
Good question really!
[6 Aug 2014 15:44]
Ryan Brothers
From further testing, the above query with RAND() seems to be a separate, unrelated issue. To summarize what I'm seeing: 1) SELECT 'abc' AS user FROM mysql.user ORDER BY user; 2) SELECT 'abc' AS test1 FROM mysql.user ORDER BY LENGTH(test1); 3) SELECT 'abc' AS user FROM mysql.user ORDER BY LENGTH(user); Query 1 sorts by the alias user. Query 2 sorts by the alias test1. Query 3 sorts by the column name user.
[6 Aug 2014 16:22]
Bear Limvere
This behaviour is very inconsistent and makes using the sort function much harder that it should be. Here is my example: CREATE TABLE test (n INT(11), val VARCHAR(32)); INSERT INTO test SELECT n, CONCAT('a', ROUND(n * RAND(), 4)) AS rand_n FROM common_schema.numbers; SELECT n, val AS alias, n AS val FROM test ORDER BY UPPER(val) LIMIT 5; +------+---------+------+ | n | alias | val | +------+---------+------+ | 0 | a0.0000 | 0 | | 480 | a0.0091 | 480 | | 250 | a0.1293 | 250 | | 1 | a0.2261 | 1 | | 11 | a0.2617 | 11 | +------+---------+------+ 5 rows in set (0.01 sec) SELECT n, val AS alias, n AS val FROM test ORDER BY val LIMIT 5; +------+---------+------+ | n | alias | val | +------+---------+------+ | 0 | a0.0000 | 0 | | 1 | a0.2261 | 1 | | 2 | a0.5227 | 2 | | 3 | a1.8849 | 3 | | 4 | a1.4299 | 4 | +------+---------+------+