Bug #109020 Inconsistent resolution of alias references in order by clause
Submitted: 7 Nov 2022 20:50 Modified: 8 Nov 2022 18:24
Reporter: Jason Fulghum Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: aliases

[7 Nov 2022 20:50] Jason Fulghum
Description:
From testing and from the documentation I've found, when there is ambiguity in the order by clause between an available expression alias and a column name, the alias is preferred. 

However, if the order by clause uses that reference in a function call, the column is preferred over the alias. 

From the documentation in https://dev.mysql.com/doc/refman/8.0/en/select.html , I would expect MySQL to recognize the defined alias in select projections and use it regardless of whether the alias reference was used bare or was used as an argument to a function call: 
"MySQL resolves unqualified column or alias references in ORDER BY clauses by searching in the select_expr values, then in the columns of the tables in the FROM clause. For GROUP BY or HAVING clauses, it searches the FROM clause before searching in the select_expr values."

There is a similar issue reported and verified for MySQL 5.x, although it is specifically talking about use of alias references in aggregation functions. 
- https://bugs.mysql.com/bug.php?id=80802

I've provided simple repro steps below. 

Apologies and thanks in advance if I'm just missing something in the documentation that would explain this behavior!

How to repeat:
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.30    |
+-----------+

mysql> CREATE TABLE `aliastest` (
  `pk` int NOT NULL,
  PRIMARY KEY (`pk`)
);

mysql> insert into aliastest values (-4), (-3), (-2), (-1);

mysql> select * from aliastest;
+----+
| pk |
+----+
| -4 |
| -3 |
| -2 |
| -1 |
+----+

-- When there is ambiguity between an expression alias and a column name, the alias is preferred in the order by clause, as expected. 
mysql> select pk, rand() as pk from aliastest order by pk;
+----+--------------------+
| pk | pk                 |
+----+--------------------+
| -2 | 0.5979187577943492 |
| -4 | 0.6861500737128314 |
| -3 | 0.7312737896398397 |
| -1 |  0.795772450785872 |
+----+--------------------+

-- When the reference is in a function call, the column is used, instead of the alias. This does not seem to match the documented behavior. 
mysql> select pk, rand() as pk from aliastest order by abs(pk);
+----+---------------------+
| pk | pk                  |
+----+---------------------+
| -1 | 0.18510601127995738 |
| -2 |  0.5382127347758159 |
| -3 | 0.13574567077285207 |
| -4 |  0.0640901802704578 |
+----+---------------------+

Suggested fix:
I would expect the order by clause to resolve an unqualified alias reference the same way, regardless of whether is used bare or used as the argument to a function.
[8 Nov 2022 13:11] MySQL Verification Team
Hi Mr. Fulghum,

Thank you for your bug report.

This is truly a missing part in our documentation.

Verified as a documentation bug.
[8 Nov 2022 18:24] Jason Fulghum
Thank you for taking a look and confirming that the documentation needs to be updated to explain this behavior. Much appreciated!🙏

We're looking forward to seeing the doc updates so we can better understand the behavior/rules. That will be a big help so we can implement the alias handling logic correctly in the go-mysql-server project. Thank you!
[9 Nov 2022 12:18] MySQL Verification Team
Hi,

You are truly welcome .....