Bug #98663 ROLLUP output is not reasonable when same fields exist in the group by syntax
Submitted: 19 Feb 2020 10:53 Modified: 1 Apr 2020 12:38
Reporter: Hope Lee (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:8.0.16 OS:Any
Assigned to: CPU Architecture:Any

[19 Feb 2020 10:53] Hope Lee
Description:
When same fields exist in the group by syntax, the ROLLUP output is not reasonable and misleading. Also it's not good compared to other databases like PostgreSQL. Many strange "NULL" values appear in the results, also the results are not in order.

How to repeat:
CREATE TABLE t1 (
  a INT,
  b INT, 
  c INT);
INSERT INTO t1 VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);

MySQL output:
> SELECT a, b, a as d, SUM(c) FROM t1 GROUP BY a, b, d WITH ROLLUP;
+------+------+------+--------+
| a    | b    | d    | SUM(c) |
+------+------+------+--------+
|    1 |    2 |    1 |      3 |
| NULL |    2 | NULL |      3 |
| NULL | NULL | NULL |      3 |
|    4 |    5 |    4 |      6 |
| NULL |    5 | NULL |      6 |
| NULL | NULL | NULL |      6 |
|    7 |    8 |    7 |      9 |
| NULL |    8 | NULL |      9 |
| NULL | NULL | NULL |      9 |
| NULL | NULL | NULL |     18 |
+------+------+------+--------+
10 rows in set (0.01 sec)

PostgreSQL
> SELECT a, b, a as d, SUM(c) FROM t1 GROUP BY ROLLUP(a, b, d);
a	b	d	sum
1	2	1	3
1	2	(null)	3
1	(null)	(null)	3
4	5	4	6
4	5	(null)	6
4	(null)	(null)	6
7	8	7	9
7	8	(null)	9
7	(null)	(null)	9
(null)	(null)	(null)	18

Suggested fix:
The ROLLUP output should be in order by default. But the case above has broken the character. The suggested fix is that MySQL should consider the original column and the column with alias as the independent different columns.
[19 Feb 2020 11:57] MySQL Verification Team
Thank you for the bug report and test case.
[5 Mar 2020 2:19] Hope Lee
Suggested fix:

diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index 87ea4a0..d0919b4 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -3763,7 +3763,15 @@ bool JOIN::rollup_make_fields(List<Item> &fields_arg, List<Item> &sel_fields,
             from SUM(). ROLLUP code should find and set both NULL in order
             to get correct result.
           */
-          if (item == *group_tmp->item || item->eq(*group_tmp->item, false)) {
+          if (item == *group_tmp->item ||
+              (item->eq(*group_tmp->item, false) &&
+               /*
+                 Distinguish the same Item_fields or the same constant value
+                 with different aliases by their item_names.
+               */
+               (!item->item_name.is_set() ||
+                !(*group_tmp->item)->item_name.is_set() ||
+                item->item_name.eq((*group_tmp->item)->item_name)))) {
             /*
               This is an element that is used by the GROUP BY and should be
               set to NULL in this level
[5 Mar 2020 12:57] MySQL Verification Team
Thank you very much for your patch contribution, we appreciate it!

In order for us to continue the process of reviewing your contribution to MySQL, please send us a signed copy of the Oracle Contributor Agreement (OCA) as outlined in http://www.oracle.com/technetwork/community/oca-486395.html

Signing an OCA needs to be done only once and it's valid for all other Oracle governed Open Source projects as well.

Getting a signed/approved OCA on file will help us facilitate your contribution - this one, and others in the future.  

Please let me know, if you have any questions.

Thank you for your interest in MySQL.
[26 Mar 2020 0:13] Hope Lee
I have sent a signed copy of the Oracle Contributor Agreement (OCA) just now. :)
[26 Mar 2020 13:40] MySQL Verification Team
Thank you Mr. Lee, for your contribution.

Verified as reported.
[1 Apr 2020 12:38] Paul DuBois
Posted by developer:
 
Fixed in 8.0.21.

Using a column that is repeated twice or more in GROUP BY (through an
alias), combined with ROLLUP, had behavior differing from MySQL 5.7.
Example:

SELECT a, b AS a, COUNT(*) FROM t1 GROUP BY a, b WITH ROLLUP;

Behavior of such queries has been changed to better match MySQL 5.7.
They should be avoided, however, because behavior may change again in
the future or such queries may become illegal.
[2 Apr 2020 12:22] MySQL Verification Team
Thank you, Paul.