| Bug #80653 | COUNT DISTINCT GROUP BY ROLLUP return wrong number | ||
|---|---|---|---|
| Submitted: | 8 Mar 2016 6:43 | Modified: | 15 Mar 2016 10:10 |
| Reporter: | Yinghong Lu | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server | Severity: | S2 (Serious) |
| Version: | 5.7.10 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | COUNT DISTINCT, rollup | ||
[10 Mar 2016 0:14]
MySQL Verification Team
Please check if the below results are the expected ones. Thanks.:
C:\dbs>c:\dbs\5.7\bin\mysql -uroot -p --port=3570 --prompt="mysql 5.7 > "
Enter password: ******
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.13 Source distribution PULL: 2016-MAR-05
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql 5.7 > use test
Database changed
mysql 5.7 > CREATE TABLE `y` (
-> `id` int(11) NOT NULL DEFAULT '0',
-> `no` varchar(32) DEFAULT NULL,
-> `nid` int(11) DEFAULT NULL,
-> `g1` int(11) DEFAULT NULL,
-> `g2` int(11) DEFAULT NULL,
-> KEY `IDX_g1` (`g1`),
-> KEY `IDX_g2` (`g2`)
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.27 sec)
mysql 5.7 > select count(*) from y;
+----------+
| count(*) |
+----------+
| 200000 |
+----------+
1 row in set (0.05 sec)
mysql 5.7 > UPDATE y SET g1=MOD(id,10), g2=id DIV 100, no=CONCAT('20160308',10000000+id),nid=id;
Query OK, 200000 rows affected (6.19 sec)
Rows matched: 200000 Changed: 200000 Warnings: 0
mysql 5.7 > INSERT INTO y(id, no, nid, g1, g2)
-> SELECT 1000000+id, no, nid, g1, g2 FROM y
-> ;
Query OK, 200000 rows affected (3.48 sec)
Records: 200000 Duplicates: 0 Warnings: 0
mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
-> GROUP BY g1
-> WITH ROLLUP
-> UNION ALL
-> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y;
+------+--------------------+---------------------+--------------------+
| g1 | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0 | 40000 | 20000 | 20000 |
| 1 | 40000 | 20000 | 20000 |
| 2 | 40000 | 20000 | 20000 |
| 3 | 40000 | 20000 | 20000 |
| 4 | 40000 | 20000 | 20000 |
| 5 | 40000 | 20000 | 20000 |
| 6 | 40000 | 20000 | 20000 |
| 7 | 40000 | 20000 | 20000 |
| 8 | 40000 | 20000 | 20000 |
| 9 | 40000 | 20000 | 20000 |
| NULL | 400000 | 200000 | 200000 |
| | 400000 | 200000 | 200000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (3.44 sec)
mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
-> WHERE g2<1000
-> GROUP BY g1
-> WITH ROLLUP
-> UNION ALL
-> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
-> WHERE g2<1000;
+------+--------------------+---------------------+--------------------+
| g1 | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0 | 20000 | 10000 | 10000 |
| 1 | 20000 | 10000 | 10000 |
| 2 | 20000 | 10000 | 10000 |
| 3 | 20000 | 10000 | 10000 |
| 4 | 20000 | 10000 | 10000 |
| 5 | 20000 | 10000 | 10000 |
| 6 | 20000 | 10000 | 10000 |
| 7 | 20000 | 10000 | 10000 |
| 8 | 20000 | 10000 | 10000 |
| 9 | 20000 | 10000 | 10000 |
| NULL | 200000 | 100000 | 100000 |
| | 200000 | 100000 | 100000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (1.75 sec)
mysql 5.7 > SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
-> WHERE g2<1600
-> GROUP BY g1
-> WITH ROLLUP
-> UNION ALL
-> SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y
-> WHERE g2<1600;
+------+--------------------+---------------------+--------------------+
| g1 | COUNT(DISTINCT id) | COUNT(DISTINCT nid) | COUNT(DISTINCT no) |
+------+--------------------+---------------------+--------------------+
| 0 | 32000 | 16000 | 16000 |
| 1 | 32000 | 16000 | 16000 |
| 2 | 32000 | 16000 | 16000 |
| 3 | 32000 | 16000 | 16000 |
| 4 | 32000 | 16000 | 16000 |
| 5 | 32000 | 16000 | 16000 |
| 6 | 32000 | 16000 | 16000 |
| 7 | 32000 | 16000 | 16000 |
| 8 | 32000 | 16000 | 16000 |
| 9 | 32000 | 16000 | 16000 |
| NULL | 320000 | 160000 | 160000 |
| | 320000 | 160000 | 160000 |
+------+--------------------+---------------------+--------------------+
12 rows in set (2.78 sec)
mysql 5.7 > SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT no) FROM y;
+--+--------------------+--------------------+
| | COUNT(DISTINCT id) | COUNT(DISTINCT no) |
+--+--------------------+--------------------+
| | 400000 | 200000 |
+--+--------------------+--------------------+
1 row in set (1.25 sec)
mysql 5.7 > show variables like "%version%";
+-------------------------+---------------------------------------+
| Variable_name | Value |
+-------------------------+---------------------------------------+
| innodb_version | 5.7.13 |
| protocol_version | 10 |
| slave_type_conversions | |
| tls_version | TLSv1,TLSv1.1 |
| version | 5.7.13 |
| version_comment | Source distribution PULL: 2016-MAR-05 |
| version_compile_machine | x86_64 |
| version_compile_os | Win64 |
+-------------------------+---------------------------------------+
8 rows in set (0.00 sec)
[15 Mar 2016 9:26]
Yinghong Lu
Yes. Up results are what I expected!
[15 Mar 2016 9:28]
Yinghong Lu
mysql>show variables like "%version%"; innodb_version 5.7.10 protocol_version 10 slave_type_conversions tls_version TLSv1,TLSv1.1 version 5.7.10 version_comment MySQL Community Server (GPL) version_compile_machine x86_64 version_compile_os Win64
[15 Mar 2016 10:10]
MySQL Verification Team
Thank you for the feedback. I tested with latest source so it was fixed somewhat. Please check when new releases are done.

Description: I have a query with 'count distinct' OR 'group by ... with rollup' for counting some varchar column, it returns a wrong number for mysql 5.7.10 in Ubuntu 14.04.1 AND windows 10. The result with same data and same query in 5.5.35-0ubuntu0.12.04.2-log is right. The result with same data and same query in 5.7.10 and with some condition is right. IN 5.7.10, COUNT DISTINCT for INT is right, but for VARCHAR is wrong。 How to repeat: CREATE TABLE `y` ( `id` int(11) NOT NULL DEFAULT '0', `no` varchar(32) DEFAULT NULL, `nid` int(11) DEFAULT NULL, `g1` int(11) DEFAULT NULL, `g2` int(11) DEFAULT NULL, KEY `IDX_g1` (`g1`), KEY `IDX_g2` (`g2`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; #PREPARE DATA INSERT INTO y(id) VALUES (0),(1),(2)...,(199999); UPDATE y SET g1=MOD(id,10), g2=id DIV 100, no=CONCAT('20160308',10000000+id),nid=id; INSERT INTO y(id, no, nid, g1, g2) SELECT 1000000+id, no, nid, g1, g2 FROM y; ###################QUERY1 SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y GROUP BY g1 WITH ROLLUP UNION ALL SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y; #------------------RESULT1 ... |(null)| 400000 | 200000 | 211073 | | '' | 400000 | 200000 | 399998 | #BOTH search result are wrong for COUNT(DISTINCT no) AND are right for COUNT(DISTINCT nid). ###################QUERY2 SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y WHERE g2<1000 GROUP BY g1 WITH ROLLUP UNION ALL SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y WHERE g2<1000; #------------------RESULT2 ... |(null)| 200000 | 100000 | 100000 | | | 200000 | 100000 | 100000 | #BOTH search result are right. ###################QUERY3 SELECT g1, COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y WHERE g2<1600 GROUP BY g1 WITH ROLLUP UNION ALL SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT nid),COUNT(DISTINCT no) FROM y WHERE g2<1600; #------------------RESULT3 ... |(null)| 320000 | 160000 | 163073 | | | 320000 | 160000 | 320000 | #ROLLUP search result is wrong for COUNT(DISTINCT no) AND is right for COUNT(DISTINCT nid). ###################QUERY4 SELECT '', COUNT(DISTINCT id), COUNT(DISTINCT no) FROM y; #------------------RESULT4 | '' | 400000 | 399999 | #DIRECT search result is wrong for COUNT(DISTINCT no) AND is right for COUNT(DISTINCT nid).