MySQL Bugs: #18739: select having field=max(field) causes problems

Bug #18739	select having field=max(field) causes problems
Submitted:	3 Apr 2006 14:32	Modified:	10 May 2006 18:51
Reporter:	Steve Rogers	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	5.0.21-BK, 5.0.19	OS:	Linux (Linux, OS X)
Assigned to:	Evgeny Potemkin	CPU Architecture:	Any

Description:
Executing a select statement with a having clause like "having attribute=max(attribute) crashes the client (MySQL Query Browser 1.1.18) if there is more than one row in the table. It works OK and gets the right answer if there is only one row.

How to repeat:
create table maxthings (
  id     int,
  color  char(12),
  name   char(12),
  shape  char(12)
  )
  
insert into maxthings values ( 1, "blue", "sam", "round")
insert into maxthings values ( 2, "red", "bob", "square")
insert into maxthings values ( 3, "green", "al", "round")
insert into maxthings values ( 4, "orange", "joe", "oval")

select id from maxthings having id = max(id)

Thank you for a problem report. Please, specify, what result you expected, and why. I've got the following in mysql client:

mysql> select id from maxthings having id = max(id);
Empty set (0.01 sec)

No crash. (Any crashed of Quuery Browser is a separate problem. Open another bug report about it, if it is repeatable.)

If you wanted to get a row with maximal id, it is done in another way (http://dev.mysql.com/doc/refman/5.0/en/example-maximum-row.html):

mysql> select id from maxthings where id = (select max(id) from maxthings);
+------+
| id   |
+------+
| 4    |
+------+
1 row in set (0.02 sec)

Yes, if all rows have the same id values, your query "works":

mysql> select id from maxthings having id = max(id);
+------+
| id   |
+------+
| 4    |
+------+
1 row in set (0.00 sec)

mysql> select id from maxthings;
+------+
| id   |
+------+
| 4    |
| 4    |
+------+
2 rows in set (0.00 sec)

mysql> insert into maxthings(id) values(2);
Query OK, 1 row affected (0.00 sec)

mysql> select id from maxthings having id = max(id);
Empty set (0.00 sec)

And if there are 2 or more values, it does not work. So, it is either a bug or (if it was intended, for a reason unknown to me) a request for proper documentation. 

This is out of SQL Standard, IMHO, and thus should be explicitely documented in the manual.

Verified just as described with 5.0.21-BK on Linux.

It is valid SQL, though in practice you would not use this form just to get the max across the whole table - however, that isn't the point. The query should either return the right answer or get kicked back as a syntax error. But it is valid SQL, and the problem is still there if you use a more complex and "realistic" example, like:

create table maxthings (
  id     int,
  color  char(12),
  name   char(12),
  count  int
  )
  
insert into maxthings values ( 1, "blue", "sam", 1)
insert into maxthings values ( 2, "red", "bob", 5)
insert into maxthings values ( 3, "green", "al", 3)
insert into maxthings values ( 4, "orange", "joe", 2)
insert into maxthings values ( 5, "orange", "joe", 3)
insert into maxthings values ( 6, "orange", "joe", 1)

this query using the SUM aggregate gets the right answer:
SELECT id,color FROM maxthings m group by m.color having sum(count) > 2

however this query using the MAX aggregate doesn't return anything for the group of "orange" rows:
SELECT id,color FROM maxthings m group by m.color having id = max(id) 

and if you add another row so that green has two values:
insert into maxthings values ( 7, "green", "al", 5)

then the "green" row disappears from the result. 

The MAX aggregate clearly has some issues. While its certainly possible to query out the max value simply using SELECT MAX(field) and skipping the HAVING entirely, I should think that a failure to execute any SQL command (even a somewhat esoteric one) would at least merit the status of a bug. When the bug causes a crash, at least you know something is up, but when it has the potential for silently returning the wrong answer, that seems like a concern to me.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/4965

The SQL standard doesn't allow to use in HAVING clause fields that is not found
in GROUP BY clause and not enclosed in any aggregate function in the HAVING
clause. However, mysql allows using of such fields to simplify users 
calculations. This extension assume that the non-grouped fields will have the 
same group-wise values. Otherwise, the result will be unpredictable.

The GROUP BY clause have the similar extension. You can read about it here http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html.

I think that the possibly unpredictable result of using this HAVING extension should be explicitly noted somewhere in the manual.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5231

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5423

Fixed in 5.0.22, cset 1.2056

Noted in 5.0.22, 5.1.10 changelogs.

The <literal>ONLY_FULL_GROUP_BY</literal> SQL mode now also
applies to the <literal>HAVING</literal> clause. That is,
columns not named in the <literal>GROUP BY</literal> clause
cannot be used in the <literal>HAVING</literal> clause if not
used in an aggregate function.

Also revised this section to note that the MySQL extension applies
to HAVING:

http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-fields.html

I upgraded to "mysql  Ver 14.12 Distrib 5.0.45, for Win32 (ia32)" and I still have the problem with max aggregate function under having clause, which return no records.

This is the statement I am using:
select value,datetime1
from myTable
where datetime1 between '2007-09-17 00:00:00' and '2007-09-18 00:00:00'
having max(datetime1)=datetime1;

Result:
0 rows fetched.

I'm trying to get the value for the last occurence of datetime1.  With min(datetime1) it works perfect.
Please note that if do not add datetime1 on selected columns, I get "Unknown column 'datetime1' in 'having clause'" error.  It should be solved for this version.

sql-mode variable is set to "STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

I still see this issue in version 5.0.51a-co

The issue is still in place even in 5.0.67.

issue still exist in 5.1.33

issus still in 6.0.11-alpha

Why is this bug still closed when several people have noted it in versions > 5.0.21?

Maybe this can be an workaround :  

SELECT MAX(CASE WHEN @price<=price THEN CONCAT(@price:=price,@article:=article,@dealer:=dealer) ELSE 0 END) AS dummy,
MAX(@article),
MAX(@dealer),
MAX(@price)
FROM
shop,
(SELECT @article:=0) r ,
(SELECT @dealer:=0) r2,
(SELECT @price:=0) r3
GROUP BY
@price:=-10,
@article:='',
@dealer:='';

Using user-defined variables to store the most-expansive item related data in the same query.

About the query in: "[30 Sep 2007 13:02] George Jara":

select value,datetime1
from myTable
where datetime1 between '2007-09-17 00:00:00' and '2007-09-18 00:00:00'
having max(datetime1)=datetime1;

There is a MAX(). As long as there's a MAX() there is aggregation. So what first happens is that the rows of myTable, after applying WHERE, are aggregated and a single row is produced; what value is taken for 'value' and 'datetime' for this row is random, it cannot be anything else.
Then, HAVING compares this random datetime1 with the MAX().

Another justification for aggregation is that there is HAVING and no GROUP BY. The SQL standard says that this implies aggregation into a single "row. Values of 'value' and 'datetime' for this row are impossible to define (what value to prefer).

What you want is probably:
select value,datetime1
from myTable
where datetime1 between '2007-09-17 00:00:00' and '2007-09-18 00:00:00'
and datetime1=(select max(datetime1) from myTable));