Bug #71028 | error result when "count + distinct + case when" need merge_walk | ||
---|---|---|---|
Submitted: | 27 Nov 2013 21:26 | Modified: | 14 Jan 2014 18:21 |
Reporter: | xiaobin lin (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: DML | Severity: | S3 (Non-critical) |
Version: | 5.5+ | OS: | Any |
Assigned to: | Guilhem Bichot | CPU Architecture: | Any |
Tags: | case when, count, distinct, regression, tmp_table_size |
[27 Nov 2013 21:26]
xiaobin lin
[27 Nov 2013 21:28]
xiaobin lin
base on 5.6 (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: merge_waik_bug2.5612.diff (application/octet-stream, text), 406 bytes.
[27 Nov 2013 22:07]
MySQL Verification Team
[20:05:04][Pochita:]~ miguel$ mysql -uroot -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 2 Server version: 5.6.14 MySQL Community Server (GPL) Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> CREATE DATABASE xy; Query OK, 1 row affected (0.07 sec) mysql> USE xy Database changed mysql> set tmp_table_size=1024; Query OK, 0 rows affected (0.00 sec) mysql> create table tb(id int auto_increment primary key, v varchar(32))engine=myisam charset=gbk; Query OK, 0 rows affected (0.05 sec) mysql> insert into tb(v) values("aaa"); Query OK, 1 row affected (0.00 sec) mysql> insert into tb(v) (select v from tb); Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> insert into tb(v) (select v from tb); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql> insert into tb(v) (select v from tb); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> insert into tb(v) (select v from tb); Query OK, 8 rows affected (0.00 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql> insert into tb(v) (select v from tb); Query OK, 16 rows affected (0.00 sec) Records: 16 Duplicates: 0 Warnings: 0 mysql> insert into tb(v) (select v from tb); Query OK, 32 rows affected (0.00 sec) Records: 32 Duplicates: 0 Warnings: 0 mysql> mysql> update tb set v=concat(v, id); Query OK, 64 rows affected (0.00 sec) Rows matched: 64 Changed: 64 Warnings: 0 mysql> select count(distinct case when id<=64 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=64 then id end) | +----------------------------------------------+ | 64 | +----------------------------------------------+ 1 row in set (0.00 sec) mysql> select count(distinct case when id<=63 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=63 then id end) | +----------------------------------------------+ | 0 | +----------------------------------------------+ 1 row in set (0.00 sec)
[27 Nov 2013 22:12]
MySQL Verification Team
c:\dbs>c:\dbs\5.1\bin\mysql -uroot --port=3510 --prompt="mysql 5.1 > " Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.1.74-Win X64 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.1 > CREATE DATABASE xy; Query OK, 1 row affected (0.03 sec) mysql 5.1 > USE xy; Database changed mysql 5.1 > set tmp_table_size=1024; Query OK, 0 rows affected (0.00 sec) mysql 5.1 > create table tb(id int auto_increment primary key, v varchar(32))engine=myisam charset=gbk; Query OK, 0 rows affected (0.08 sec) mysql 5.1 > insert into tb(v) values("aaa"); Query OK, 1 row affected (0.01 sec) mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 1 row affected (0.00 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 8 rows affected (0.00 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 16 rows affected (0.00 sec) Records: 16 Duplicates: 0 Warnings: 0 mysql 5.1 > insert into tb(v) (select v from tb); Query OK, 32 rows affected (0.08 sec) Records: 32 Duplicates: 0 Warnings: 0 mysql 5.1 > mysql 5.1 > update tb set v=concat(v, id); Query OK, 64 rows affected (0.00 sec) Rows matched: 64 Changed: 64 Warnings: 0 mysql 5.1 > select count(distinct case when id<=64 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=64 then id end) | +----------------------------------------------+ | 64 | +----------------------------------------------+ 1 row in set (0.00 sec) mysql 5.1 > select count(distinct case when id<=63 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=63 then id end) | +----------------------------------------------+ | 63 | +----------------------------------------------+ 1 row in set (0.00 sec) mysql 5.1 >
[27 Nov 2013 22:19]
MySQL Verification Team
Thank you for the bug report and contribution.
[27 Nov 2013 22:29]
MySQL Verification Team
Forgot to paste: c:\dbs>c:\dbs\5.5\bin\mysql -uroot --port=3550 --prompt="mysql 5.5 > " Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.5.36 Source distribution Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql 5.5 > CREATE DATABASE xy; Query OK, 1 row affected (0.02 sec) mysql 5.5 > USE xy; Database changed mysql 5.5 > set tmp_table_size=1024; Query OK, 0 rows affected (0.00 sec) mysql 5.5 > create table tb(id int auto_increment primary key, v varchar(32))engine=myisam charset=gbk; Query OK, 0 rows affected (0.13 sec) mysql 5.5 > insert into tb(v) values("aaa"); Query OK, 1 row affected (0.00 sec) mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 1 row affected (0.02 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 2 rows affected (0.00 sec) Records: 2 Duplicates: 0 Warnings: 0 mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 4 rows affected (0.00 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 8 rows affected (0.00 sec) Records: 8 Duplicates: 0 Warnings: 0 mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 16 rows affected (0.00 sec) Records: 16 Duplicates: 0 Warnings: 0 mysql 5.5 > insert into tb(v) (select v from tb); Query OK, 32 rows affected (0.09 sec) Records: 32 Duplicates: 0 Warnings: 0 mysql 5.5 > mysql 5.5 > update tb set v=concat(v, id); Query OK, 64 rows affected (0.00 sec) Rows matched: 64 Changed: 64 Warnings: 0 mysql 5.5 > select count(distinct case when id<=64 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=64 then id end) | +----------------------------------------------+ | 64 | +----------------------------------------------+ 1 row in set (0.00 sec) mysql 5.5 > select count(distinct case when id<=63 then id end) from tb; +----------------------------------------------+ | count(distinct case when id<=63 then id end) | +----------------------------------------------+ | 0 | +----------------------------------------------+ 1 row in set (0.00 sec) mysql 5.5 >
[28 Nov 2013 13:29]
Guilhem Bichot
Hello. Coincidence: I have been working on this issue two weeks ago (http://bugs.mysql.com/bug.php?id=68749 ); the patch I made is under review currently. If it is accepted, I'll add your testcase to it. I looked at the "geek method" - thanks for contributing code! It shines by its compactness - one line! However, I prefer to not change Item::maybe_null because this boolean is supposed to be set during query resolution (it tells if an expression may have a NULL value), and to not change during execution. Imagine we change it during execution, then execute the statement a second time (which is possible, if it is part of a stored procedure, or a prepared statement). Then the second execution will start with maybe_null==false, which is incorrect (the aggregated item may be NULL, after all; it's only after going through the Unique tree, that values in the tree cannot be NULL). With the current code base, we re-resolve on second execution, so maybe_null will be set back to "true" in fix_length_and_dec(), so it will work. But in the future, it would be more efficient to not re-resolve (why recompute some information which is not changing between executions, like NULLability, number of decimals of DECIMAL results, number of selected columns...). And if we don't re-resolve anymore, maybe_null won't be corrected. Nevertheless, reviewers will look at your patch too.
[2 Dec 2013 12:50]
Guilhem Bichot
Hello Xiaobin. I'm attaching the patch which has just been approved by two reviewers.
[2 Dec 2013 12:51]
Guilhem Bichot
approved patch
Attachment: patch.txt (text/plain), 13.86 KiB.
[3 Dec 2013 15:02]
xiaobin lin
hi, Guilhem It looks good to me, Thank you.
[14 Jan 2014 18:21]
Paul DuBois
Noted in 5.5.36, 5.6.16, 5.7.4 changelogs. COUNT(DISTINCT) sometimes produced an incorrect result when the last read row contained a NULL value.
[2 Feb 2014 17:38]
Laurynas Biveinis
5.5$ bzr log -r 4557 ------------------------------------------------------------ revno: 4557 committer: Guilhem Bichot <guilhem.bichot@oracle.com> branch nick: 5.5 timestamp: Wed 2013-12-04 12:32:42 +0100 message: Bug#16539979 - BASIC SELECT COUNT(DISTINCT ID) IS BROKEN Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK Problem: COUNT DISTINCT gives incorrect result when it uses a Unique Tree and its last inserted record has null value. Here is how COUNT DISTINCT is processed, given that this query is not using loose index scan. When a row is produced as a result of joining tables (there is only one table here), we store the SELECTed value in a Unique tree. This allows elimination of any duplicates, and thus implements DISTINCT. When we have processed all rows like this, we walk the Unique tree, counting its elements, in Aggregator_distinct::endup() (tree->walk()); for each element we call Item_sum_count::add(). Such function wants to ignore any NULL value, for that it checks item_sum -> args[0] -> null_value. It is a mistake: when walking the Unique tree, the value to be aggregated is not item_sum ->args[0] but rather table -> field[0]. Solution: instead of item_sum -> args[0] -> null_value, use arg_is_null(), which knows where to look (like in fix for bug 57932). As a consequence of this solution, we have to make arg_is_null() a little more general: 1) Because it was so far only used for AVG() (which always has a single argument), this function was looking at a single argument; now that it has to work with COUNT(DISTINCT expression1,expression2), it must look at all arguments. 2) Because we start using arg_is_null () for COUNT(DISTINCT), i.e. in Item_sum_count::add (), it implies that we are also using it for COUNT(no DISTINCT) (same add ()). For COUNT(no DISTINCT), the nullness to check is that of item_sum -> args[0]. But the null_value of such item is reliable only if val_*() has been called on it. So far arg_is_null() was always used after a call to arg_val*(), so could rely on null_value; but for COUNT, there is no call to arg_val*(), so arg_is_null() has to call is_null() instead. Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by Xiaobin Lin from Taobao.