MySQL Bugs: #25551: inconsistent behaviour in grouping NULL, depending on index type

Bug #25551	inconsistent behaviour in grouping NULL, depending on index type
Submitted:	11 Jan 2007 16:30	Modified:	8 Feb 2007 18:04
Reporter:	Martin Friebe (Gold Quality Contributor) (OCA)	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S3 (Non-critical)
Version:	4.1.22, 5.0.27, 5.0.34-BK	OS:	Linux (Linux, freebsd)
Assigned to:	Georgi Kodinov	CPU Architecture:	Any
Tags:	distinct, GROUP BY, null

Description:
select distinct or group by handles NULL values in an inconsistent manner.

without index on the column (or with a non-unique index) NULL values are grouped as one

with a unique index, NULL values are displayed in individual rows

How to repeat:
drop table if exists d1; create table d1 (a int );
insert into d1 values (1),(2),(null), (null);

select * from d1 group by a; select distinct a from d1;
# result has one row with null

alter table d1 add key (a);
select * from d1 group by a; select distinct a from d1;
# result has one row with null

alter table d1 drop index a,  add unique (a);
select * from d1 group by a; select distinct a from d1;
# result has TWO rows with null

drop table if exists d1;

Suggested fix:
Not sure, which one should be considered correct.

Thank you for a bug report. Verified just as described with 5.0.34-BK on Linux:

openxs@suse:~/dbs/5.0> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.0.34-debug-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> drop table if exists d1; create table d1 (a int );
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected (0.02 sec)

mysql> insert into d1 values (1),(2),(null), (null);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from d1 group by a; select distinct a from d1;
+------+
| a    |
+------+
| NULL |
|    1 |
|    2 |
+------+
3 rows in set (0.01 sec)

+------+
| a    |
+------+
|    1 |
|    2 |
| NULL |
+------+
3 rows in set (0.00 sec)

mysql> alter table d1 add key (a);
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from d1 group by a; select distinct a from d1;
+------+
| a    |
+------+
| NULL |
|    1 |
|    2 |
+------+
3 rows in set (0.00 sec)

+------+
| a    |
+------+
| NULL |
|    1 |
|    2 |
+------+
3 rows in set (0.01 sec)

mysql> alter table d1 drop index a,  add unique (a);
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from d1 group by a; select distinct a from d1;
+------+
| a    |
+------+
| NULL |
| NULL |
|    1 |
|    2 |
+------+
4 rows in set (0.01 sec)

+------+
| a    |
+------+
| NULL |
| NULL |
|    1 |
|    2 |
+------+
4 rows in set (0.00 sec)

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18697

ChangeSet@1.2389, 2007-01-24 14:06:17+02:00, gkodinov@macbook.gmz +3 -0
  Bug #25551: inconsistent behaviour in grouping NULL, depending on index type
   The optimizer takes away columns from GROUP BY/DISTINCT if they constitute
   all the parts of an unique index.
   However if some of the columns can contain NULLs this cannot be done 
  (because an UNIQUE index can have multiple rows with NULL values).
   Fixed by not using UNIQUE indexes with nullable columns to remove
   grouping columns from GROUP BY/DISTINCT.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/19073

ChangeSet@1.2389, 2007-01-31 10:18:26+02:00, gkodinov@macbook.gmz +3 -0
  Bug #25551: inconsistent behaviour in grouping NULL, depending on index type
   The optimizer takes away columns from GROUP BY/DISTINCT if they constitute
   all the parts of an unique index.
   However if some of the columns can contain NULLs this cannot be done 
  (because an UNIQUE index can have multiple rows with NULL values).
   Fixed by not using UNIQUE indexes with nullable columns to remove
   grouping columns from GROUP BY/DISTINCT.

The fix has been pushed into 5.0.36, 5.1.16-beta main trees.

Noted in 5.0.36, 5.1.16 changelogs.