Bug #80653 COUNT DISTINCT GROUP BY ROLLUP return wrong number
Submitted: 8 Mar 2016 6:43 Modified: 15 Mar 2016 10:10
Reporter: Yinghong Lu Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.7.10 OS:Any
Assigned to: CPU Architecture:Any
Tags: COUNT DISTINCT, rollup

[8 Mar 2016 6:43] Yinghong Lu
Description:
I have a query with 'count distinct' OR 'group by ... with rollup' for counting some varchar column, it returns a wrong number for mysql 5.7.10 in Ubuntu 14.04.1 AND windows 10.

The result with same data and same query in 5.5.35-0ubuntu0.12.04.2-log is right.

The result with same data and same query in 5.7.10 and with some condition is right.

IN 5.7.10, COUNT DISTINCT for INT is right, but for VARCHAR is wrong。

How to repeat:
CREATE TABLE `y` (
  `id` int(11) NOT NULL DEFAULT '0',
  `no` varchar(32) DEFAULT NULL,
  `nid` int(11) DEFAULT NULL,
  `g1` int(11) DEFAULT NULL,
  `g2` int(11) DEFAULT NULL,
  KEY `IDX_g1` (`g1`),
  KEY `IDX_g2` (`g2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

#PREPARE DATA
INSERT INTO y(id) VALUES (0),(1),(2)...,(199999);
UPDATE y SET g1=MOD(id,10), g2=id DIV 100, no=CONCAT('20160308',10000000+id),nid=id;
INSERT INTO y(id, no, nid, g1, g2)
SELECT 1000000+id, no, nid, g1, g2 FROM y;

###################QUERY1
SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
GROUP BY g1
WITH ROLLUP
UNION ALL 
SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y;

#------------------RESULT1
...
|(null)| 400000	| 200000 | 211073 |
|  ''  | 400000	| 200000 | 399998 |

#BOTH search result are wrong for COUNT(DISTINCT no) AND are right for COUNT(DISTINCT nid).

###################QUERY2
SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
WHERE g2<1000
GROUP BY g1
WITH ROLLUP
UNION ALL 
SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
WHERE g2<1000;

#------------------RESULT2
...
|(null)| 200000	| 100000 | 100000 |
|      | 200000	| 100000 | 100000 |

#BOTH search result are right.

###################QUERY3
SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
WHERE g2<1600
GROUP BY g1
WITH ROLLUP
UNION ALL 
SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
WHERE g2<1600;

#------------------RESULT3
...
|(null)| 320000	| 160000 | 163073 |
|      | 320000	| 160000 | 320000 |

#ROLLUP search result is wrong for COUNT(DISTINCT no) AND is right for COUNT(DISTINCT nid).

###################QUERY4
SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT no) FROM y;

#------------------RESULT4
|  ''  | 400000	| 399999 |

#DIRECT search result is wrong for COUNT(DISTINCT no) AND is right for COUNT(DISTINCT nid).
[10 Mar 2016 0:14] MySQL Verification Team
Please check if the below results are the expected ones. Thanks.:

C:\dbs>c:\dbs\5.7\bin\mysql -uroot -p --port=3570 --prompt="mysql 5.7 > "
Enter password: ******
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.13 Source distribution PULL: 2016-MAR-05

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.7 > use test
Database changed

mysql 5.7 > CREATE TABLE `y` (
    ->   `id` int(11) NOT NULL DEFAULT '0',
    ->   `no` varchar(32) DEFAULT NULL,
    ->   `nid` int(11) DEFAULT NULL,
    ->   `g1` int(11) DEFAULT NULL,
    ->   `g2` int(11) DEFAULT NULL,
    ->   KEY `IDX_g1` (`g1`),
    ->   KEY `IDX_g2` (`g2`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.27 sec)

mysql 5.7 > select count(*) from y;
+----------+
| count(*) |
+----------+
|   200000 |
+----------+
1 row in set (0.05 sec)

mysql 5.7 > UPDATE y SET g1=MOD(id,10), g2=id DIV 100, no=CONCAT('20160308',10000000+id),nid=id;
Query OK, 200000 rows affected (6.19 sec)
Rows matched: 200000  Changed: 200000  Warnings: 0

mysql 5.7 > INSERT INTO y(id, no, nid, g1, g2)
    -> SELECT 1000000+id, no, nid, g1, g2 FROM y
    -> ;
Query OK, 200000 rows affected (3.48 sec)
Records: 200000  Duplicates: 0  Warnings: 0

mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
    -> GROUP BY g1
    -> WITH ROLLUP
    -> UNION ALL
    -> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y;
+------+--------------------+---------------------+--------------------+
| g1   | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0    |              40000 |               20000 |              20000 |
| 1    |              40000 |               20000 |              20000 |
| 2    |              40000 |               20000 |              20000 |
| 3    |              40000 |               20000 |              20000 |
| 4    |              40000 |               20000 |              20000 |
| 5    |              40000 |               20000 |              20000 |
| 6    |              40000 |               20000 |              20000 |
| 7    |              40000 |               20000 |              20000 |
| 8    |              40000 |               20000 |              20000 |
| 9    |              40000 |               20000 |              20000 |
| NULL |             400000 |              200000 |             200000 |
|      |             400000 |              200000 |             200000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (3.44 sec)

mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
    -> WHERE g2<1000
    -> GROUP BY g1
    -> WITH ROLLUP
    -> UNION ALL
    -> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
    -> WHERE g2<1000;
+------+--------------------+---------------------+--------------------+
| g1   | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0    |              20000 |               10000 |              10000 |
| 1    |              20000 |               10000 |              10000 |
| 2    |              20000 |               10000 |              10000 |
| 3    |              20000 |               10000 |              10000 |
| 4    |              20000 |               10000 |              10000 |
| 5    |              20000 |               10000 |              10000 |
| 6    |              20000 |               10000 |              10000 |
| 7    |              20000 |               10000 |              10000 |
| 8    |              20000 |               10000 |              10000 |
| 9    |              20000 |               10000 |              10000 |
| NULL |             200000 |              100000 |             100000 |
|      |             200000 |              100000 |             100000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (1.75 sec)

mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
    -> WHERE g2<1600
    -> GROUP BY g1
    -> WITH ROLLUP
    -> UNION ALL
    -> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
    -> WHERE g2<1600;
+------+--------------------+---------------------+--------------------+
| g1   | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0    |              32000 |               16000 |              16000 |
| 1    |              32000 |               16000 |              16000 |
| 2    |              32000 |               16000 |              16000 |
| 3    |              32000 |               16000 |              16000 |
| 4    |              32000 |               16000 |              16000 |
| 5    |              32000 |               16000 |              16000 |
| 6    |              32000 |               16000 |              16000 |
| 7    |              32000 |               16000 |              16000 |
| 8    |              32000 |               16000 |              16000 |
| 9    |              32000 |               16000 |              16000 |
| NULL |             320000 |              160000 |             160000 |
|      |             320000 |              160000 |             160000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (2.78 sec)

mysql 5.7 > SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT no) FROM y;
+--+--------------------+--------------------+
|  | COUNT(DISTINCT id) | COUNT(DISTINCT no) |
+--+--------------------+--------------------+
|  |             400000 |             200000 |
+--+--------------------+--------------------+
1 row in set (1.25 sec)

mysql 5.7 > show variables like "%version%";
+-------------------------+---------------------------------------+
| Variable_name           | Value                                 |
+-------------------------+---------------------------------------+
| innodb_version          | 5.7.13                                |
| protocol_version        | 10                                    |
| slave_type_conversions  |                                       |
| tls_version             | TLSv1,TLSv1.1                         |
| version                 | 5.7.13                                |
| version_comment         | Source distribution PULL: 2016-MAR-05 |
| version_compile_machine | x86_64                                |
| version_compile_os      | Win64                                 |
+-------------------------+---------------------------------------+
8 rows in set (0.00 sec)
[15 Mar 2016 9:26] Yinghong Lu
Yes. Up results are what I expected!
[15 Mar 2016 9:28] Yinghong Lu
mysql>show variables like "%version%";

innodb_version	5.7.10
protocol_version	10
slave_type_conversions	
tls_version	TLSv1,TLSv1.1
version	5.7.10
version_comment	MySQL Community Server (GPL)
version_compile_machine	x86_64
version_compile_os	Win64
[15 Mar 2016 10:10] MySQL Verification Team
Thank you for the feedback. I tested with latest source so it was fixed somewhat. Please check when new releases are done.