Bug #15745 COUNT(DISTINCT x,y) and COUNT(DISTINCT CONCAT(x,y)) return different values
Submitted: 14 Dec 2005 16:29 Modified: 10 Mar 2006 18:21
Reporter: Giuseppe Maxia Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.0.16-max OS:Linux (Linux)
Assigned to: Alexey Botchkov

[14 Dec 2005 16:29] Giuseppe Maxia
Description:
A COUNT(DISTINCT x,y ) gives a result different from COUNT(DISTINCT CONCAT(x,y))
when one of the fields is an indexed DECIMAL column.

See the attached file for an reproducible test case.
Here is a sample run:

+----------------+
| version()      |
+----------------+
| 5.0.16-max-log |
+----------------+
+-------+---------------+------+-----+---------+----------------+
| Field | Type          | Null | Key | Default | Extra          |
+-------+---------------+------+-----+---------+----------------+
| ID    | int(11)       | NO   | PRI | NULL    | auto_increment |
| x     | varchar(20)   | YES  |     | NULL    |                |
| y     | decimal(10,0) | YES  | MUL | NULL    |                |
+-------+---------------+------+-----+---------+----------------+
+-----+------+------+
| ID  | x    | y    |
+-----+------+------+
|   1 | ba   |   -1 |
|   2 | ba   | 1150 |
| 306 | ba   |   -1 |
| 307 | ba   | 1150 |
| 611 | ba   |   -1 |
| 612 | ba   | 1150 |
+-----+------+------+
+---------------------+
| count(distinct x,y) |
+---------------------+
|                   2 |
+---------------------+
+-----------------------------+
| count(distinct concat(x,y)) |
+-----------------------------+
|                           3 |
+-----------------------------+
+-------------+
| concat(x,y) |
+-------------+
| ba-1        |
| ba1150      |
+-------------+

Please notice that the original problem occurred over a 5 million record table, and this
is just a simple case.

How to repeat:
create database if not exists test;
use test;

select version();

DROP TABLE IF EXISTS test;
CREATE TABLE test (
  ID int(11) NOT NULL auto_increment,
  x varchar(20) default NULL,
  y decimal(10,0) default NULL,
  PRIMARY KEY  (ID),
  KEY (y)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

desc test;

INSERT INTO test VALUES
(1,'ba','-1'),
(2,'ba','1150'),
(306,'ba','-1'),
(307,'ba','1150'),
(611,'ba','-1'),
(612,'ba','1150');

select * from test;

select count(distinct x,y) from test;
select count(distinct concat(x,y)) from test;
select distinct concat(x,y) from test;

Suggested fix:
none
[14 Dec 2005 17:00] Aleksey Kishkin
verified on slackware 10.2 against mysql 5.0.16-max:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.0.16-max |
+------------+
1 row in set (0.00 sec)

mysql>
mysql> DROP TABLE IF EXISTS test;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE test (
    ->   ID int(11) NOT NULL auto_increment,
    ->   x varchar(20) default NULL,
    ->   y decimal(10,0) default NULL,
    ->   PRIMARY KEY  (ID),
    ->   KEY (y)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> desc test;
+-------+---------------+------+-----+---------+----------------+
| Field | Type          | Null | Key | Default | Extra          |
+-------+---------------+------+-----+---------+----------------+
| ID    | int(11)       | NO   | PRI | NULL    | auto_increment |
| x     | varchar(20)   | YES  |     | NULL    |                |
| y     | decimal(10,0) | YES  | MUL | NULL    |                |
+-------+---------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql>
mysql> INSERT INTO test VALUES
    -> (1,'ba','-1'),
    -> (2,'ba','1150'),
    -> (306,'ba','-1'),
    -> (307,'ba','1150'),
    -> (611,'ba','-1'),
    -> (612,'ba','1150');
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql>
mysql> select * from test;
+-----+------+------+
| ID  | x    | y    |
+-----+------+------+
|   1 | ba   |   -1 |
|   2 | ba   | 1150 |
| 306 | ba   |   -1 |
| 307 | ba   | 1150 |
| 611 | ba   |   -1 |
| 612 | ba   | 1150 |
+-----+------+------+
6 rows in set (0.00 sec)

mysql>
mysql> select count(distinct x,y) from test;
+---------------------+
| count(distinct x,y) |
+---------------------+
|                   2 |
+---------------------+
1 row in set (0.00 sec)

mysql> select count(distinct concat(x,y)) from test;
+-----------------------------+
| count(distinct concat(x,y)) |
+-----------------------------+
|                           3 |
+-----------------------------+
1 row in set (0.03 sec)

mysql> select distinct concat(x,y) from test;
+-------------+
| concat(x,y) |
+-------------+
| ba-1        |
| ba1150      |
+-------------+
2 rows in set (0.00 sec)
mysql> select  concat(x,y) from test;
+-------------+
| concat(x,y) |
+-------------+
| ba-1        |
| ba1150      |
| ba-1        |
| ba1150      |
| ba-1        |
| ba1150      |
+-------------+
6 rows in set (0.00 sec)
[14 Jan 2006 13:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/1080
[5 Mar 2006 16:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/3477
[6 Mar 2006 12:00] Alexander Barkov
Looks ok to push
[10 Mar 2006 18:21] Paul Dubois
Noted in 5.0.20 changelog.

<literal>COUNT(DISTINCT<replaceable>col1</replaceable>,
<replaceable>col2</replaceable>)</literal>
and <literal>COUNT(DISTINCT
CONCAT(<replaceable>col1</replaceable>,<replaceable>col2</replaceable>))</literal>
operations produced different results if one of the columns
was an indexed <literal>DECIMAL</literal> column. (Bug #15745)