Bug #69833 Bad interaction between MIN/MAX and "HAVING SUM(DISTINCT)": wrong results
Submitted: 24 Jul 2013 20:58 Modified: 6 Jun 2014 2:54
Reporter: Arthur O'Dwyer Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.5.31,5,5.32, 5.6.12, 5.7.2-m12 OS:Any
Assigned to: CPU Architecture:Any
Tags: regression

[24 Jul 2013 20:58] Arthur O'Dwyer
Description:
The synopsis of this bug is almost identical to http://bugs.mysql.com/bug.php?id=69729 , but this time it's a different feature interfering with SELECT queries involving MIN/MAX.

How to repeat:
DROP DATABASE IF EXISTS d;
CREATE DATABASE d;
USE d;
CREATE TABLE t (a INT, b INT, KEY(a,b));
INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
SELECT a, MAX(b) FROM t GROUP BY a HAVING SUM(DISTINCT a);

  +------+--------+
  | a    | MAX(b) |
  +------+--------+
  |    1 |      2 |
  |    2 |      3 |
  |    3 |      4 |
  |    4 |      4 |
  +------+--------+

The MAX(b) column is weirdly shifted up one row from where it belongs; the expected output is

  +------+--------+
  | a    | MAX(b) |
  +------+--------+
  |    1 |      1 |
  |    2 |      2 |
  |    3 |      3 |
  |    4 |      4 |
  +------+--------+

Suggested fix:
Once again, I blame get_best_group_min_max(). In MySQL 5.5, it contains the following code:

      else if (min_max_item->sum_func() == Item_sum::COUNT_DISTINCT_FUNC ||
               min_max_item->sum_func() == Item_sum::SUM_DISTINCT_FUNC ||
               min_max_item->sum_func() == Item_sum::AVG_DISTINCT_FUNC)
        continue;

I don't know what this code is doing, but it's skipping *some* sort of check, in three cases, which happen to be the exact three cases in which this bug reproduces.
[25 Jul 2013 7:56] Valeriy Kravchuk
Both 5.5.32 and 5.6.12 are affected:

openxs@ao756:~/dbs/mysql-5.6.12-linux-glibc2.5-x86_64$ bin/mysql --no-defaults -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.12 MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE TABLE t (a INT, b INT, KEY(a,b));
Query OK, 0 rows affected (0,73 sec)

mysql> INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0,14 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT a, MAX(b) FROM t GROUP BY a HAVING SUM(DISTINCT a);
+------+--------+
| a    | MAX(b) |
+------+--------+
|    1 |      2 |
|    2 |      3 |
|    3 |      4 |
|    4 |      4 |
+------+--------+
4 rows in set (0,07 sec)

mysql> SELECT a, MAX(b) FROM t GROUP BY a;
+------+--------+
| a    | MAX(b) |
+------+--------+
|    1 |      1 |
|    2 |      2 |
|    3 |      3 |
|    4 |      4 |
+------+--------+
4 rows in set (0,00 sec)
[25 Jul 2013 8:23] MySQL Verification Team
Hello Arthur,

Thank you for the bug report and the test case. 
Verified as described on recent 5.5, 5.6 and 5.7.2-m12.

Thanks,
Umesh
[25 Jul 2013 8:23] MySQL Verification Team
// 5.1.71 - Not affacted

mysql> select version();
+------------+
| version()  |
+------------+
| 5.1.71-log |
+------------+
1 row in set (0.00 sec)

mysql> DROP DATABASE IF EXISTS d;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE DATABASE d;
Query OK, 1 row affected (0.00 sec)

mysql> USE d;
Database changed
mysql> CREATE TABLE t (a INT, b INT, KEY(a,b));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0.00 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT a, MAX(b) FROM t GROUP BY a HAVING SUM(DISTINCT a);
+------+--------+
| a    | MAX(b) |
+------+--------+
|    1 |      1 |
|    2 |      2 |
|    3 |      3 |
|    4 |      4 |
+------+--------+
4 rows in set (0.00 sec)

// 5.6.12 - Affected

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.6.12    |
+-----------+
1 row in set (0.00 sec)

mysql> DROP DATABASE IF EXISTS d;
Query OK, 1 row affected (0.02 sec)

mysql> CREATE DATABASE d;
Query OK, 1 row affected (0.00 sec)

mysql> USE d;
Database changed
mysql> CREATE TABLE t (a INT, b INT, KEY(a,b));
Query OK, 0 rows affected (0.06 sec)

mysql> INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT a, MAX(b) FROM t GROUP BY a HAVING SUM(DISTINCT a);
+------+--------+
| a    | MAX(b) |
+------+--------+
|    1 |      2 |
|    2 |      3 |
|    3 |      4 |
|    4 |      4 |
+------+--------+
4 rows in set (0.00 sec)

// 5.5.32 - Affected

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.5.32    |
+-----------+
1 row in set (0.00 sec)

mysql> DROP DATABASE IF EXISTS d;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE DATABASE d;
Query OK, 1 row affected (0.00 sec)

mysql> USE d;
Database changed
mysql> CREATE TABLE t (a INT, b INT, KEY(a,b));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT a, MAX(b) FROM t GROUP BY a HAVING SUM(DISTINCT a);
+------+--------+
| a    | MAX(b) |
+------+--------+
|    1 |      2 |
|    2 |      3 |
|    3 |      4 |
|    4 |      4 |
+------+--------+
4 rows in set (0.00 sec)

// 5.7.2-m12 - Affacted

mysql> DROP DATABASE IF EXISTS d;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE DATABASE d;
Query OK, 1 row affected (0.00 sec)

mysql> USE d;
Database changed
mysql> CREATE TABLE t (a INT, b INT, KEY(a,b));
Query OK, 0 rows affected (0.06 sec)

mysql> INSERT INTO t VALUES (1,1), (2,2), (3,3), (4,4);
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT a, MAX(beer) FROM t GROUP BY a HAVING SUM(DISTINCT a);
+------+--------+
| a    | MAX(beer) |
+------+--------+
|    1 |      2 |
|    2 |      3 |
|    3 |      4 |
|    4 |      4 |
+------+--------+
4 rows in set (0.00 sec)

mysql> select version();
+------------------------------------------+
| version()                                |
+------------------------------------------+
| 5.7.2-m12-enterprise-commercial-advanced |
+------------------------------------------+
[25 Jul 2013 19:59] Roy Lyseng
This is surely an execution bug, but I think the user impact is minimal.

For each group, all values of column a are equal, hence SUM(DISTINCT a) within HAVING will accumulate only the single distinct a value, and SUM(DISTINCT a) should then be equivalent to simply a.

Unless I have misunderstood something.
[25 Jul 2013 20:56] Arthur O'Dwyer
@Roy, I *think* your analysis is correct, but without a definite root-cause for the bug, I'm not willing to bet that the problem is limited *only* to cases where SUM(DISTINCT X) is equivalent to X. I'm also not willing to bet that the symptoms are limited to wrong output; perhaps the off-by-one error is exploitable to access memory "one row beyond" the end of the table, which could access unallocated memory and/or trash the heap in the right circumstances.

Also, consider machine-generated queries; the server can't safely assume that every input has been hand-optimized to remove "silly" constructs. This particular input was reduced from a machine-generated query, for example.
[6 Jun 2014 2:54] Paul DuBois
Noted in 5.5.39, 5.6.20, 5.7.5 changelogs.

If a query had both MIN()/MAX() and aggregate_function(DISTINCT) (for
example, SUM(DISTINCT)) and was executed using Loose Index Scan, the
result values of MIN()/MAX() were set improperly.
[1 Aug 2014 16:04] Laurynas Biveinis
5.5 $ bzr log -r 4643
------------------------------------------------------------
revno: 4643
committer: mithun <mithun.c.y@oracle.com>
branch nick: mysql-5.5
timestamp: Thu 2014-05-15 11:46:57 +0530
message:
  Bug#17217128 : BAD INTERACTION BETWEEN MIN/MAX AND
                 "HAVING SUM(DISTINCT)": WRONG RESULTS.
  ISSUE:
  ------
  If a query uses loose index scan and it has both
  AGG(DISTINCT) and MIN()/MAX()functions. Then, result values
  of MIN/MAX() is set improperly.
  When query has AGG(DISTINCT) then end_select is set to
  end_send_group. "end_send_group" keeps doing aggregation
  until it sees a record from next group. And, then it will
  send out the result row of that group.
  Since query also has MIN()/MAX() and loose index scan is
  used, values of MIN/MAX() are set as part of loose index
  scan itself. Setting MIN()/MAX() values as part of loose
  index scan overwrites values computed in end_send_group.
  This caused invalid result.
  For such queries to work loose index scan should stop
  performing MIN/MAX() aggregation. And, let end_send_group to
  do the same. But according to current design loose index
  scan can produce only one row per group key. If we have both
  MIN() and MAX() then it has to give two records out. This is
  not possible as interface has to use common buffer
  record[0]! for both records at a time.
  
  SOLUTIONS:
  ----------
  For such queries to work we need a new interface for loose
  index scan. Hence, do not choose loose_index_scan for such
  cases. So a new rule SA7 is introduced to take care of the
  same.
  
  SA7: "If Q has both AGG_FUNC(DISTINCT ...) and
        MIN/MAX() functions then loose index scan access
        method is not used."
[12 Jan 2016 20:45] Matthew Wilmshorst
This Bug is still active. I just upgraded and now the MIN and MAX never work they always supply wrong numbers.
[12 Jan 2016 20:45] Matthew Wilmshorst
This Bug is still active. I just upgraded and now the MIN and MAX never work they always supply wrong numbers.