Bug #100094 Referencing a column in HAVING clause yields to an error
Submitted: 2 Jul 2020 21:25 Modified: 3 Jul 2020 14:40
Reporter: Thodoris Sotiropoulos Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version: OS:Ubuntu
Assigned to: CPU Architecture:Any

[2 Jul 2020 21:25] Thodoris Sotiropoulos
Description:
Referencing a column in HAVING clause yields to an "Unknown column 'x' in 'having clause'" error, even though the column appears in the GROUP BY clause.

According to the documentation:

"The SQL standard requires that HAVING must reference only columns in the GROUP BY clause or columns used in aggregate functions. However, MySQL supports an extension to this behavior, and permits HAVING to refer to columns in the SELECT list and columns in outer subqueries as well."

How to repeat:
SELECT count (*)
FROM t1
INNER JOIN t2 ON `t1`.`c0` = `t2`.`c1`
GROUP BY `t1`.`c0`, `t2`.`c0`
HAVING count (*) >= `t1`.`c0`;
[2 Jul 2020 21:40] Thodoris Sotiropoulos
It seems that the issue happens when the GROUP BY clause contains two different columns whose basename is the same.
[3 Jul 2020 12:24] MySQL Verification Team
Γεια σας κύριε Sotiropoulos,

Thank you for your bug report.

However, this is not a bug. You have a basic SQL error.

count(*) is not a column in GROUP BY.

Not a bug.
[3 Jul 2020 12:42] Thodoris Sotiropoulos
Thank you for your reply.

I miss something here, when GROUP BY contains columns whose basenames are different (i.e., `t1`.`c0`, `t2`.`c1`), the query is executed without errors

SELECT count (*)
FROM t1
INNER JOIN t2 ON `t1`.`c0` = `t2`.`c1`
GROUP BY `t1`.`c0`, `t2`.`c1`
HAVING count (*) >= `t1`.`c0`;

the query above is executed without errors

On the other hand, when GROUP BY contains different columns whose basename is the same (i.e., `t1`.`c0`, `t2`.`c0`), the query produces the error "Unknown column 't1.c0' in 'having clause'.

SELECT count (*)
FROM t1
INNER JOIN t2 ON `t1`.`c0` = `t2`.`c1`
GROUP BY `t1`.`c0`, `t2`.`c0`
HAVING count (*) >= `t1`.`c0`;

I don't understand why this happens, since HAVING refers to a column included in the GROUP BY clause (i.e., `t1`.`c0`), and an aggregate function mentioned in SELECT (i.e., count(*))
[3 Jul 2020 12:48] MySQL Verification Team
Hi,

First of all, you should be using our latest release, id est 8.0.21.

Second, for this purpose you are better off setting filtering conditions instead of HAVING.
[3 Jul 2020 13:24] Thodoris Sotiropoulos
Hi again

I am using MySQL 8.0.20 and the error occurs.

I inevitably use HAVING because the predicate condition contains an aggregate function. Therefore, I cannot use WHERE.
[3 Jul 2020 13:32] MySQL Verification Team
Hi,

Thiis deserves a deeper look.

Please, send us the entire test case.
[3 Jul 2020 14:09] Thodoris Sotiropoulos
I have the following SQL script that sets up two simple tables, namely t1 and t2.

DROP TABLE IF EXISTS t2;
DROP TABLE IF EXISTS t1;

CREATE TABLE t1 (
  id integer,
  c0 integer,
  c1 integer,
  PRIMARY KEY (id)
);
INSERT INTO t1(id, c0, c1) VALUES (1, 1, 2);
INSERT INTO t1(id, c0, c1) VALUES (2, 2, 2);

CREATE TABLE t2 (
  id integer,
  c0 integer,
  c1 integer,
  PRIMARY KEY (id),
  FOREIGN KEY (c1) REFERENCES t1(id) ON DELETE CASCADE
);

INSERT INTO t2(id, c0, c1) VALUES (1, 2, 1);
INSERT INTO t2(id, c0, c1) VALUES (2, 2, 2);

Then, I perform two queries.

Query 1
========

SELECT count(*)
FROM t1 INNER JOIN t2 ON (`t1`.`c0` = `t2`.`c1`)
GROUP BY `t1`.`c0`, `t2`.`c1`
HAVING count(*) >= `t1`.`c0`;

I get

+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

However, when I run the second query
Query 2
=======

SELECT count(*)
FROM t1
INNER JOIN t2 ON (`t1`.`c0` = `t2`.`c1`)
GROUP BY `t1`.`c0`, `t2`.`c0`
HAVING count(*) >= `t1`.`c0`;

I get an ERROR: "ERROR 1054 (42S22): Unknown column 't1.c0' in 'having clause'"

Observe the difference between the first and the second query. The only thing that it changes is the columns included in the GROUP BY clause.

I cannot understand why I get this error when executing Query 2. The HAVING clause in the second query refers to:
  * A column mentioned in the GROUP BY clause (i.e, `t1.co`), something that is legal.
  * An aggregate function (i.e., count(*)), something that is also legal.
[3 Jul 2020 14:31] MySQL Verification Team
Hi,

Thank you for your report.

You have to follow SQL by its standard. Your query was not precise as to what was count(*) referring to.

With a small change in your query, I have got:

xxx	count(*)
1	1
xxx	count(*)
1	1

Not a bug.
[3 Jul 2020 14:40] Thodoris Sotiropoulos
Thank you for your reply.

OK, I see. I presume that you also place the column `t1`.`c0` in the SELECT clause, right?

So, do the two queries I posted previously manifest undefined behaviour, meaning that MySQL is free to produce any result (even an error)? Because, in the first query (although I don't have the column `t1`.`c0` in the SELECT clause), MySQL produces a result, while in the second query this is not the case.