Bug #79549 Regression: “unknown column” for outer computed values used inside a subquery
Submitted: 7 Dec 2015 21:14 Modified: 16 Jan 2016 15:15
Reporter: Chris Adams Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.7.9 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[7 Dec 2015 21:14] Chris Adams
Description:
After upgrading to MySQL 5.7.9 some reporting queries which previously worked on 5.6.23 now fail. 

The query below was used to generate a timeline listing the number of items and associated files released over time and aliased the `DATE(first_published)` call to avoid recalcuting the same value in multiple subqueries (only one is shown below for brevity).

How to repeat:
mysql> SELECT 
    ->     DATE(first_published) AS pub_date,
    ->     COUNT(*) AS item_count,
    ->     (SELECT COUNT(*) FROM core_file INNER JOIN core_item ON item_id = core_item.id WHERE DATE(first_published) = pub_date) AS file_count
    -> FROM core_item
    ->     GROUP BY pub_date
    ->     ORDER BY pub_date;
ERROR 1054 (42S22): Unknown column 'pub_date' in 'where clause'

Rewriting that query to an alternate form produces the same results and works on both versions:

SELECT 
        release_dates.pub_date, 
        COUNT(*) AS item_count,
        (SELECT COUNT(*) FROM core_file INNER JOIN core_item ON item_id = core_item.id WHERE DATE(first_published) = release_dates.pub_date) AS file_count
    FROM core_item c_i 
        INNER JOIN (SELECT DISTINCT DATE(first_published) AS pub_date FROM core_item WHERE first_published IS NOT NULL ORDER BY pub_date) release_dates
            ON DATE(c_i.first_published) = release_dates.pub_date
    GROUP BY release_dates.pub_date
    ORDER BY release_dates.pub_date;
[7 Dec 2015 21:26] MySQL Verification Team
Thank you for the bug report. Please provide the create table statement to run the queries against. Thanks.
[8 Dec 2015 6:27] MySQL Verification Team
I have a suspicion that this is due to the fix in 5.7.8:

"References to select list columns of the outer query from the HAVING clause of a correlated subquery in the inner query should, but did not, return an error, resulting in a server exit. (Bug #19823076)"

https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-8.html
[8 Dec 2015 8:39] Artem S.
i have a same problem.
my query: 

SELECT
    `id` as `t`, 
    `title`, 
    `once`,
    (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` and (`version`='41924') ) as `count`,
    (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` ) as `count_in_order`
FROM 
    `orders_filetypes`
WHERE
    `required`=1 
ORDER BY `sort`, `title`

it works fine on mysql 5.6.22 but not works in 5.7.9 because error "#1054 - Unknown column 't' in 'where clause'"

that is not convertation as DATE(string), that is just rename column.
[8 Dec 2015 9:46] MySQL Verification Team
Thank you for the feedback.

mysql 5.7 > SELECT
    ->     `id` as `t`,
    ->     `title`,
    ->     `once`,
    ->     (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` and (`version`='41924') ) as `count`,
    ->     (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` ) as `count_in_order`
    -> FROM
    ->     `orders_filetypes`
    -> WHERE
    ->     `required`=1
    -> ORDER BY `sort`, `title`;
ERROR 1054 (42S22): Unknown column 't' in 'where clause'

+-------------------------+---------------------------------------+
| Variable_name           | Value                                 |
+-------------------------+---------------------------------------+
| innodb_version          | 5.7.11                                |
| protocol_version        | 10                                    |
| slave_type_conversions  |                                       |
| tls_version             | 1                                     |
| version                 | 5.7.11                                |
| version_comment         | Source distribution PULL: 2015-DEC-05 |
| version_compile_machine | x86_64                                |
| version_compile_os      | Win64                                 |
+-------------------------+---------------------------------------+
8 rows in set (0.08 sec)

mysql> SELECT
    ->     `id` as `t`,
    ->     `title`,
    ->     `once`,
    ->     (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` and (`version`='41924') ) as `count`,
    ->     (select count(`id`) from `orders_files` where `order_id`='116815' and `type`=`t` ) as `count_in_order`
    -> FROM
    ->     `orders_filetypes`
    -> WHERE
    ->     `required`=1
    -> ORDER BY `sort`, `title`;
Empty set (0.00 sec)

mysql> SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.7.6                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.7.6-m16                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Win64                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)
[8 Dec 2015 10:20] Artem S.
mysql> SHOW VARIABLES LIKE "%VERSION%";
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.7.9                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.7.9-log                    |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
7 rows in set (0.00 sec)

Is that a bug? Do you will fix it in new versions or that as it should be?
[8 Dec 2015 11:52] Roy Lyseng
Hi Artem,

your problem is simple to fix by changing the outer references to include the table names.
This is your query rewritten to be compliant with the SQL standard (if I got it right):

SELECT id, title,  once,
              (select count(id) from orders_files
               where order_id = '116815' and
                          type = of.id and
                          version = '41924') as `count`,
              (select count(id) from orders_files
               where order_id = '116815' and
                          type = of.id ) as `count_in_order`
FROM orders_filetypes AS of
WHERE required = 1 
ORDER BY sort, title;
[8 Dec 2015 12:59] Roy Lyseng
Hi Chris,

Aliased expressions from the SELECT list are only allowed in the ORDER BY clause, according to standard SQL.
As an extension MySQL also allows them in GROUP BY and HAVING.
However, they are invalid in the WHERE clause and in other parts of the SELECT list. For more information see http://dev.mysql.com/doc/refman/5.7/en/problems-with-alias.html.

The common workaround for this problem is to use a derived table, which is possible in 5.7 without loss of performance.
This should be a proper workaround for your problem:

SELECT pub_date,
       COUNT(*) AS item_count,
       (SELECT COUNT(*)
        FROM core_file INNER JOIN core_item ON item_id = core_item.id
        WHERE DATE(first_published) = pub_date) AS file_count
FROM (SELECT DATE(first_published) AS pub_date FROM core_item) AS dt
GROUP BY pub_date
ORDER BY pub_date;
[8 Dec 2015 13:49] Chris Adams
Roy: thanks for answering that authoritatively. Would it be possible to get a better message or perhaps something in the 5.7 upgrade notes? I found this while upgrading a fairly large internal application and a little searching suggests that this won't be uncommon as people migrate.
[8 Dec 2015 14:07] Artem S.
Hello, Roy!

Thank you for explaining, but I know how to solve this query and anyone else.
One query - is not a problem, but I have a million same and different queries in our web program, that prevents me to find all of them.

I think that finding and changing one of config parameter is very much easier to solve this problem. If that parameter exists, of course :)

I read about Aliased expressions (that they can't be used in WHERE clause), but then the question arises: 
1. Why is it stopped working? 
2. Why standard SQL was changed? (or it did not previously been given to the standard?) 
3. Does exists an option of the configuration that query will working again OR how to find all such requests that doesn't working?
[9 Dec 2015 15:35] Roy Lyseng
Hi Artem,

first, if I dare say, you have not chosen the best solution for your problem...

But over to your questions:

I read about Aliased expressions (that they can't be used in WHERE clause), but then the question arises: 
1. Why is it stopped working?

A. It stopped working because we were screening the MySQL extensions to the SQL standard. And when examining some crashing bugs due to corner cases that were using references to aliased expressions, it was decided that this extension was poorly defined and could be handled by other means. But the crashing bug was using a subquery in the WHERE clause, not in the SELECT list.

2. Why standard SQL was changed? (or it did not previously been given to the standard?)

A: This construction has never been part of standard SQL, it was a MySQL extension to the standard. The standard has never allowed references to aliases, except within the ORDER BY clause.

3. Does exists an option of the configuration that query will working again OR how to find all such requests that doesn't working?

A. We have reconsidered how to handle this problem, and will try to revert the decision on not supporting aliases in subqueries in SELECT list. Thus, we are reopening the bug.

Here is a longer background for the original decision:

Contrary to references to aliases in subqueries in the WHERE clause (and in GROUP BY, for that matter), there is no reason (except standard compliance) that we should not allow references to aliases in the SELECT list, since they should be available in the same phase of query execution. But the support in 5.6 was quite arbitrary:

Given this: create table t1(a int, b int),

Alias in SELECT list is not valid:

  select a+b as c,c+1 from t1;
  ERROR 1054 (42S22): Unknown column 'c' in 'field list'

But within a subquery, reference to c is valid:

  select a+b as c,(select c+1) from t1;

And subquery must be after definition of alias:

  select (select c+1),a+b as c from t1;
  ERROR 1247 (42S22): Reference 'c' not supported (forward reference in item list)

So, it is easy to say that support for references to aliases in SELECT list was rather ad-hoc. Nevertheless, we will try to reimplement the old solution, but with no attempt at cleaning up the obvious holes in the support for this feature. But referencing aliases in subqueries in the WHERE clause will not be reimplemented.
[10 Dec 2015 9:22] Guilhem Bichot
Artem,

Just to expand on the last Roy's sentence "referencing aliases in subqueries *in the WHERE clause* will not be reimplemented."

He meant, when subqueries are located in the WHERE clause of the query.

Given this: create table t1(a int, b int),
this will remain illegal:

  select a+b as c from t1 where c=1;
  select a+b as c from t1 where (select c)=1;

Indeed, the universal workflow of SQL is that it starts with a join of tables, applies the WHERE, then GROUP BY, then calculates the SELECT list expressions. So, in WHERE, the name "c" doesn't exist. "a" and "b" however do exist, as they're columns of the join of tables, so this is valid:
  select a+b as c from t1 where a+b=1;

Imagine, if aliases were allowed in WHERE, we could have such absurdity:

  select sum(a+b) as c from t1 where c=1;

i.e. filtering every row based on if the sum of them is equal to 1...

See also:
https://wiki.windward.net/Wiki/09.Knowledge_Base/SQL_Cheat_Sheet_%E2%80%93_by_Joe_Celko
[15 Dec 2015 21:47] Manhao Chen
@Roy,

you mentioned that the following is valid..
But within a subquery, reference to c is valid:

  select a+b as c,(select c+1) from t1;

but when i test with 5.7.10 with the following, i run into error Unknown column 't1' in 'field list'

SELECT	123 AS t1, (SELECT t1) AS t2

any ideas?
[16 Dec 2015 7:11] Roy Lyseng
Manhao,

this is the exact problem that this bug report is addressing. But we will restore the support in a coming release.
[16 Dec 2015 7:17] Roy Lyseng
Notice that error messages may be a bit confusing.

Let's say we have

  SELECT a+b AS c, (SELECT x FROM t2 WHERE t2.a=c) FROM t1;

The error message for this query might say Unknown column 'c' in 'WHERE clause'

However, the validity of alias references is based on where the subquery is placed, in this case in the field list, whereas the error message correctly reports that the reference is placed in the WHERE clause of the subquery.
[16 Jan 2016 15:15] Paul DuBois
Noted in 5.7.11 changelog.

MySQL 5.7.8 prohibited references to select list columns of the outer
query from the HAVING clause of a correlated subquery in the inner
query because they are not permitted by standard SQL. However,
because this is a frequently used extension, it is once again
permitted.