Bug #69841 | SELECT COUNT(DISTINCT a,b) incorrectly counts rows containing NULL | ||
---|---|---|---|
Submitted: | 25 Jul 2013 23:41 | Modified: | 1 Oct 2013 18:37 |
Reporter: | Arthur O'Dwyer | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.5/5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | regression |
[25 Jul 2013 23:41]
Arthur O'Dwyer
[26 Jul 2013 1:09]
MySQL Verification Team
Thank you for the bug report. mysql 5.0 > ALTER TABLE t ADD KEY k(a,b); Query OK, 4 rows affected (0.10 sec) Records: 4 Duplicates: 0 Warnings: 0 <cut> mysql 5.0 > SELECT COUNT(DISTINCT a,b) FROM t; +---------------------+ | COUNT(DISTINCT a,b) | +---------------------+ | 1 | +---------------------+ 1 row in set (0.00 sec) mysql 5.0 > SHOW VARIABLES LIKE "%version%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | protocol_version | 10 | | version | 5.0.97-Win X64 | | version_comment | Source distribution | | version_compile_machine | unknown | | version_compile_os | Win64 | +-------------------------+---------------------+ 5 rows in set (0.00 sec) ------------------------------------------------------ mysql 5.1 > ALTER TABLE t ADD KEY k(a,b); Query OK, 4 rows affected (0.11 sec) Records: 4 Duplicates: 0 Warnings: 0 <cut> mysql 5.1 > SELECT COUNT(DISTINCT a,b) FROM t; +---------------------+ | COUNT(DISTINCT a,b) | +---------------------+ | 1 | +---------------------+ 1 row in set (0.00 sec) mysql 5.1 > SHOW VARIABLES LIKE "%version%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | protocol_version | 10 | | version | 5.1.72-Win X64 | | version_comment | Source distribution | | version_compile_machine | unknown | | version_compile_os | Win64 | +-------------------------+---------------------+ 5 rows in set (0.00 sec) mysql 5.1 > ------------------------------------------------------ mysql 5.5 > ALTER TABLE t ADD KEY k(a,b); Query OK, 0 rows affected (0.17 sec) Records: 0 Duplicates: 0 Warnings: 0 <CUT> mysql 5.5 > SELECT COUNT(DISTINCT a,b) FROM t; +---------------------+ | COUNT(DISTINCT a,b) | +---------------------+ | 2 | +---------------------+ 1 row in set (0.00 sec) mysql 5.5 > SHOW VARIABLES LIKE "%version%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | innodb_version | 5.5.34 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.5.34 | | version_comment | Source distribution | | version_compile_machine | AMD64 | | version_compile_os | Win64 | +-------------------------+---------------------+ 7 rows in set (0.03 sec) mysql 5.5 > ------------------------------------------------------ mysql 5.6 > ALTER TABLE t ADD KEY k(a,b); Query OK, 0 rows affected (0.50 sec) Records: 0 Duplicates: 0 Warnings: 0 <cut> mysql 5.6 > SELECT COUNT(DISTINCT a,b) FROM t; +---------------------+ | COUNT(DISTINCT a,b) | +---------------------+ | 2 | +---------------------+ 1 row in set (0.00 sec) mysql 5.6 > SHOW VARIABLES LIKE "%version%"; +-------------------------+---------------------+ | Variable_name | Value | +-------------------------+---------------------+ | innodb_version | 5.6.14 | | protocol_version | 10 | | slave_type_conversions | | | version | 5.6.14 | | version_comment | Source distribution | | version_compile_machine | x86_64 | | version_compile_os | Win64 | +-------------------------+---------------------+ 7 rows in set (0.00 sec) mysql 5.6 >
[1 Oct 2013 18:37]
Paul DuBois
Noted in 5.5.35, 5.6.15, 5.7.3 changelogs. COUNT(DISTINCT) should not count NULL values, but they were counted when the optimizer used Loose Index Scan.
[4 Dec 2013 9:05]
Laurynas Biveinis
5.5$ bzr log -r 4465 ------------------------------------------------------------ revno: 4465 committer: Neeraj Bisht <neeraj.x.bisht@oracle.com> branch nick: 5.5 timestamp: Wed 2013-09-04 10:45:55 +0530 message: Bug#17222452 - SELECT COUNT(DISTINCT A,B) INCORRECTLY COUNTS ROWS CONTAINING NULL Problem:- In MySQL, We can obtain the number of distinct expression combinations that do not contain NULL by giving a list of expressions in COUNT(DISTINCT). However rows with NULL values are incorrectly included in the count when loose index scan is used. Analysis:- In case of loose index scan, we check whether the field is null or not and increase the count in Item_sum_count::add(). But there we are checking for the first field in COUNT(DISTINCT), not for every field. This is causing an incorrect result. Solution:- Check all field in Item_sum_count::add(), whether there values are null or not. Then only increment the count. ****** Bug#17222452 - SELECT COUNT(DISTINCT A,B) INCORRECTLY COUNTS ROWS CONTAINING NULL Problem:- In MySQL, We can obtain the number of distinct expression combinations that do not contain NULL by giving a list of expressions in COUNT(DISTINCT). However rows with NULL values are incorrectly included in the count when loose index scan is used. Analysis:- In case of loose index scan, we check whether the field is null or not and increase the count in Item_sum_count::add(). But there we are checking for the first field in COUNT(DISTINCT), not for every field. This is causing an incorrect result. Solution:- Check all field in Item_sum_count::add(), whether there values are null or not. Then only increment the count.