Bug #68749 | Basic SELECT count(distinct id) is broken. | ||
---|---|---|---|
Submitted: | 22 Mar 2013 18:47 | Modified: | 4 Dec 2013 22:23 |
Reporter: | Igor Babaev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
Version: | 5.5, 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[22 Mar 2013 18:47]
Igor Babaev
[23 Mar 2013 19:36]
MySQL Verification Team
Hello Igor, Thank you for the report. Verified as described. Thanks, Umesh
[4 Dec 2013 22:23]
Paul DuBois
Noted in 5.5.36, 5.6.16, 5.7.4 changelogs. COUNT(DISTINCT) produces an incorrect result when it uses a Unique Tree and its last inserted row has a NULL value.
[5 Dec 2013 15:03]
Paul DuBois
Revised changelog entry: COUNT(DISTINCT) sometimes produced an incorrect result when the last read row contained a NULL value.
[2 Feb 2014 17:37]
Laurynas Biveinis
5.5$ bzr log -r 4557 ------------------------------------------------------------ revno: 4557 committer: Guilhem Bichot <guilhem.bichot@oracle.com> branch nick: 5.5 timestamp: Wed 2013-12-04 12:32:42 +0100 message: Bug#16539979 - BASIC SELECT COUNT(DISTINCT ID) IS BROKEN Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK Problem: COUNT DISTINCT gives incorrect result when it uses a Unique Tree and its last inserted record has null value. Here is how COUNT DISTINCT is processed, given that this query is not using loose index scan. When a row is produced as a result of joining tables (there is only one table here), we store the SELECTed value in a Unique tree. This allows elimination of any duplicates, and thus implements DISTINCT. When we have processed all rows like this, we walk the Unique tree, counting its elements, in Aggregator_distinct::endup() (tree->walk()); for each element we call Item_sum_count::add(). Such function wants to ignore any NULL value, for that it checks item_sum -> args[0] -> null_value. It is a mistake: when walking the Unique tree, the value to be aggregated is not item_sum ->args[0] but rather table -> field[0]. Solution: instead of item_sum -> args[0] -> null_value, use arg_is_null(), which knows where to look (like in fix for bug 57932). As a consequence of this solution, we have to make arg_is_null() a little more general: 1) Because it was so far only used for AVG() (which always has a single argument), this function was looking at a single argument; now that it has to work with COUNT(DISTINCT expression1,expression2), it must look at all arguments. 2) Because we start using arg_is_null () for COUNT(DISTINCT), i.e. in Item_sum_count::add (), it implies that we are also using it for COUNT(no DISTINCT) (same add ()). For COUNT(no DISTINCT), the nullness to check is that of item_sum -> args[0]. But the null_value of such item is reliable only if val_*() has been called on it. So far arg_is_null() was always used after a call to arg_val*(), so could rely on null_value; but for COUNT, there is no call to arg_val*(), so arg_is_null() has to call is_null() instead. Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by Xiaobin Lin from Taobao.
[4 Mar 2014 21:22]
Barb Trout
We are using MySQL Percona Server 5.6.15-56. I believe we are experiencing this bug. Has this bug been fixed? If so, what version(s) was it fixed in? If not, is it planned to be in a future release?
[5 Mar 2014 8:16]
Guilhem Bichot
Hi If you scroll above in the report: [4 Dec 2013 22:23] Paul Dubois Noted in 5.5.36, 5.6.16, 5.7.4 changelogs. So: fixed in MySQL 5.6.16 which is available on dev.mysql.com. For Percona, I don't know.