Bug #18131 GROUP BY doesn't always work
Submitted: 10 Mar 2006 11:03 Modified: 28 Mar 2006 17:26
Reporter: Juergen Neuhoff Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Documentation Severity:S1 (Critical)
Version:MySQL 5.0.18 OS:Windows (Windows XP & Linux Fedora core 4)
Assigned to: CPU Architecture:Any

[10 Mar 2006 11:03] Juergen Neuhoff
Description:
The GROUP BY doesn't always work. 

For example, the 
  
  SELECT field1, COUNT(field2), 0 AS field3 FROM `myTable` WHERE ... GROUP BY field1, field3

won't work for 'field3' if the `myTable` happens to have a field column by the same name 'field3'. It only works when using a name which doesn't exist yet in the underlying database table, e.g.

  SELECT field1, COUNT(field2), 0 AS dummy FROM `myTable` WHERE ... GROUP BY field1, dummy

where 'dummy' is not the name of an existing field column.

How to repeat:
Run the following query for attached database:

SELECT master_number, resource_sort, MIN( group_start_year ) AS start_year, MAX( group_end_year ) AS end_year, 0 AS toc_seq
FROM `master_numbers` 
INNER JOIN sections ON sections.fk_mno = master_numbers.master_number
WHERE master_numbers.released =1
AND master_numbers.master_number <10
AND sections.show_in_toc =1
GROUP BY master_number, toc_seq
[10 Mar 2006 11:04] Juergen Neuhoff
Sample database for reproducing the error and/or bug

Attachment: ohpr_demob.sql.zip (application/zip, text), 100.26 KiB.

[10 Mar 2006 12:05] Valeriy Kravchuk
Thank you for a problem report. Please, try to repeat with a newer version of MySQL, 5.0.18. Describe the expected results and present the results you got. Smaller test case, with several rows in table, will be useful also.
[10 Mar 2006 16:39] Juergen Neuhoff
Thank you for looking into it.

I just installed and tested it with Version 5.0.18. Still the same problem.
I'll try and create a smaller sample database illustrating this problem.
[10 Mar 2006 17:07] Juergen Neuhoff
I have created another attachment with a simpler database, and with the expected and un-expected result sets. This should enable you to easily reproduce the error.

J.Neuhoff
[10 Mar 2006 17:08] Juergen Neuhoff
test cases for the GROUP BY problem

Attachment: testcases-group-by.txt (text/plain), 11.52 KiB.

[13 Mar 2006 8:18] Hartmut Holzgraefe
Test statements:

 DROP TABLE IF EXISTS t1;
 CREATE TABLE t1 (i INT, j INT);

 INSERT INTO t1 VALUES(1,1);
 INSERT INTO t1 VALUES(1,2);
 INSERT INTO t1 VALUES(1,3);
 INSERT INTO t1 VALUES(2,1);
 INSERT INTO t1 VALUES(2,2);
 INSERT INTO t1 VALUES(2,3);
 INSERT INTO t1 VALUES(3,3);
 INSERT INTO t1 VALUES(3,2);
 INSERT INTO t1 VALUES(3,1);

 SELECT i, 0 AS j, COUNT(*) FROM t1 GROUP BY i, j;

 SELECT i, 0 AS x, COUNT(*) FROM t1 GROUP BY i, x;

expected result: both SELECT statements returning the same result data

actual result:

mysql>  SELECT i, 0 AS j, COUNT(*) FROM t1 GROUP BY i, j;
+------+---+----------+
| i    | j | COUNT(*) |
+------+---+----------+
|    1 | 0 |        1 |
|    1 | 0 |        1 |
|    1 | 0 |        1 |
|    2 | 0 |        1 |
|    2 | 0 |        1 |
|    2 | 0 |        1 |
|    3 | 0 |        1 |
|    3 | 0 |        1 |
|    3 | 0 |        1 |
+------+---+----------+
9 rows in set, 1 warning (0.00 sec)

mysql>  SELECT i, 0 AS x, COUNT(*) FROM t1 GROUP BY i, x;
+------+---+----------+
| i    | x | COUNT(*) |
+------+---+----------+
|    1 | 0 |        3 |
|    2 | 0 |        3 |
|    3 | 0 |        3 |
+------+---+----------+
3 rows in set (0.00 sec)
[13 Mar 2006 8:19] Hartmut Holzgraefe
verified on 5.0.19
[17 Mar 2006 10:25] Evgeny Potemkin
Not a bug.
The result of provided example is:
mysql>  SELECT i, 0 AS j, COUNT(*) FROM t1 GROUP BY i, j;
+------+---+----------+
| i    | j | COUNT(*) |
+------+---+----------+
| 1    | 0 | 1        |
| 1    | 0 | 1        |
| 1    | 0 | 1        |
| 2    | 0 | 1        |
| 2    | 0 | 1        |
| 2    | 0 | 1        |
| 3    | 0 | 1        |
| 3    | 0 | 1        |
| 3    | 0 | 1        |
+------+---+----------+
9 rows in set, 1 warning (0.00 sec)

mysql> 
mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1052 | Column 'j' in group statement is ambiguous |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

According to SQL standard if there is an aliased expression and one of the joined tables contains field with same name, grouping by field should be done, not by the aliased expression. So this result is correct.

MySQL in this case produces the warning, you can see it above. Your example shows that there also was a warning.

I think this should be noted somewhere in the manual to prevent further reporting of this "bug".
[17 Mar 2006 14:58] Juergen Neuhoff
Thank you for looking into it.

So, if there is an aliased expression and one of the joined tables contains a field with same name, it is the grouping by field which should be done. 

In that case MySQL 5.0.18 is correct indeed, because it groups it by the field. However,  MySQL 4.1.8 groups it by the alias. I am not sure whether you plan to backpatch the older MySQL 4.1.x branch then. If not, then at least the documentation ought to be updated to warn the user that MySQL 4.1.x is not SQL-standard compliant for this particular case.
[20 Mar 2006 20:45] Mike Hillyer
Documenting status is reserved for bugfixes in need of a changelog entry. Changing status to Verified and category to Documentation.