Bug #53531 SUM/COUNT DISTINCT with subquery produce incorrect results
Submitted: 10 May 2010 8:04 Modified: 13 Jul 2010 11:43
Reporter: Ole John Aske Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1.44 OS:Any
Assigned to: CPU Architecture:Any

[10 May 2010 8:04] Ole John Aske
Description:
Adding a 'DISTINCT' to a SUM() or COUNT() aggregate which groups on the result from a scalar subquery returns incorrect results / garbage. (MIN/MAX is not affected)

Looks similar to bug#52582, but I don't think they are the same. However they may both be in the same area of the code.....

How to repeat:
Load mysqldump which I will attach.

1) The following query is our 'baseline' which produces the correct grouped result: (AFAIK)

mysql> SELECT COUNT(table1.col_int), (SELECT table1s.col_int_unique FROM C AS table1s WHERE table1s.pk = table1.pk ) AS subqry FROM E AS table1 GROUP BY subqry;
+-----------------------+--------+
| COUNT(table1.col_int) | subqry |
+-----------------------+--------+
|                    13 |   NULL |
|                     1 |      0 |
|                     1 |      3 |
|                     1 |      6 |
|                     1 |      7 |
|                     1 |      8 |
|                     1 |      9 |
|                     1 |     10 |
|                     0 |     12 |
|                     1 |     15 |
|                     1 |     17 |
|                     1 |     31 |
|                     1 |     32 |
|                     1 |     33 |
|                     1 |     40 |
|                     1 |     45 |
|                     1 |     47 |
+-----------------------+--------+
17 rows in set (0.00 sec)

2) Then change the 'COUNT' to a COUNT DISTINCT':

mysql> SELECT COUNT(DISTINCT table1.col_int), (SELECT table1s.col_int_unique FROM C AS table1s WHERE table1s.pk = table1.pk ) AS subqry FROM E AS table1 GROUP BY subqry;
+--------------------------------+--------+
| COUNT(DISTINCT table1.col_int) | subqry |
+--------------------------------+--------+
|                              1 |     17 |
|                              1 |   NULL |
|                             23 |   NULL |
+--------------------------------+--------+
3 rows in set (0.00 sec)

We observe that the following unexpected changes takes place

 - There are now only 3 groups, this was not expected to change.
 - There are now 23+1 NULL valued subqry columns, changed from 13.
 - The 23+1 NULL valued subqry columns should have been in the same group.

... The result seems to have been severely garbled......
[10 May 2010 8:05] Ole John Aske
schema + data for the testcase

Attachment: create.sql (text/x-sql), 2.91 KiB.

[10 May 2010 8:20] Valeriy Kravchuk
I do not see any data in your create.sql. Is it essential to use ndbcluster tables to demonstrate this problem?
[10 May 2010 8:23] Ole John Aske
Previous attached file was wrong, use this

Attachment: spj_myisam.dump (application/octet-stream, text), 37.34 KiB.

[10 May 2010 15:18] Valeriy Kravchuk
Please, check with a newer version, 5.1.46 at least. Looks like the problem is already fixed in recent code:

valeriy-kravchuks-macbook-pro:5.1 openxs$ bin/mysql -uroot test < ~/Downloads/spj_myisam.dump 
valeriy-kravchuks-macbook-pro:5.1 openxs$ bin/mysql -uroot testReading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.1.48-debug Source distribution

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> SELECT COUNT(table1.col_int), (SELECT table1s.col_int_unique FROM C AS table1s
    -> WHERE table1s.pk = table1.pk ) AS subqry FROM E AS table1 GROUP BY subqry;
+-----------------------+--------+
| COUNT(table1.col_int) | subqry |
+-----------------------+--------+
|                    13 |   NULL |
|                     1 |      0 |
|                     1 |      3 |
|                     1 |      6 |
|                     1 |      7 |
|                     1 |      8 |
|                     1 |      9 |
|                     1 |     10 |
|                     0 |     12 |
|                     1 |     15 |
|                     1 |     17 |
|                     1 |     31 |
|                     1 |     32 |
|                     1 |     33 |
|                     1 |     40 |
|                     1 |     45 |
|                     1 |     47 |
+-----------------------+--------+
17 rows in set (0.40 sec)

mysql> SELECT COUNT(DISTINCT table1.col_int), (SELECT table1s.col_int_unique FROM C AS
    -> table1s WHERE table1s.pk = table1.pk ) AS subqry FROM E AS table1 GROUP BY subqry;
+--------------------------------+--------+
| COUNT(DISTINCT table1.col_int) | subqry |
+--------------------------------+--------+
|                             11 |   NULL |
|                              1 |      0 |
|                              1 |      3 |
|                              1 |      6 |
|                              1 |      7 |
|                              1 |      8 |
|                              1 |      9 |
|                              1 |     10 |
|                              0 |     12 |
|                              1 |     15 |
|                              1 |     17 |
|                              1 |     31 |
|                              1 |     32 |
|                              1 |     33 |
|                              1 |     40 |
|                              1 |     45 |
|                              1 |     47 |
+--------------------------------+--------+
17 rows in set (0.00 sec)
[10 Jun 2010 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[13 Jul 2010 11:43] Ole John Aske
Has tested failing query after merging with 5.1.47 (Server version: 5.1.47-ndb-7.0.16) - Problem seems to have been fixed in this version :-)
Change status to 'Closed'