Bug #3874 A Group By Field is not considered
Submitted: 24 May 2004 17:28 Modified: 13 Oct 2005 14:42
Reporter: Sergei Kulakov (Candidate Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:3.23.58 OS:FreeBSD (Free BSD)
Assigned to: Alexey Botchkov CPU Architecture:Any

[24 May 2004 17:28] Sergei Kulakov
Description:
I have a table PageLog that stores http requests. I try to count some stats on it using a left join with another table VirtPages yet one field from the group by clause is ignored. 

How to repeat:
1. Create the 2 tables and fill them with sample data:

CREATE TABLE `VirtPages` (
  `HostId` smallint(5) unsigned NOT NULL default '0',
  `PageId` mediumint(8) unsigned NOT NULL default '0',
  UNIQUE KEY `HostId` (`HostId`,`PageId`))	

Insert Into VirtPages Values (3, 1), (3, 2), (3, 3), (3, 4)

CREATE TABLE `PageLog` (
  `HostId` smallint(5) unsigned NOT NULL default '0',
  `PageId` mediumint(8) unsigned NOT NULL default '0',
  `GetData` varchar(255) default NULL)

Insert Into PageLog Values 
		(3, 10, 'a=1'),
		(3, 10, 'a=2'),
		(3, 11, 'a=1'),
		(3, 11, 'a=2'),
		(3, 2, 'a=1'),
		(3, 2, 'a=2'),
		(3, 2, 'a=3')
	
2. The query is supposed to count requests for pages. Some pages should be counted without request parameters (GetData field), some with. The table VirtPages lists the second kind of pages. Although this query has fields HostId, PageId, GetData in its group by clause, the 3rd - GetData seems to be ignored. This only happens under certain conditions. For ex., if you remove HostId from Group by, or use GetData instead of the expression for it, or change the order of fields in group by so that GetData goes before HostId it'll be alright. 

Select 
	PageLog.HostId, PageLog.PageId, 
	If(VirtPages.PageId Is Null, '', GetData) as GetData, Count(*) as C
	From PageLog Left Join VirtPages On
		(PageLog.HostId=VirtPages.HostId and PageLog.PageId=VirtPages.PageId)
Group By HostId, PageId, GetData

Suggested fix:
-
[25 May 2004 7:14] Sergei Kulakov
I got the same wrong result with MySQL 4.0.18 for Windows
[26 May 2004 0:50] Dean Ellis
Thank you for the report; verified against 4.0.20/Linux as well, using:

DROP TABLE IF EXISTS t1, t2;
CREATE TABLE t1 ( a INT NOT NULL, b INT NOT NULL, UNIQUE idx (a,b) );
INSERT INTO t1 VALUES (1,1),(1,2),(1,3),(1,4);
CREATE TABLE t2 ( a INT NOT NULL, b INT NOT NULL, c INT );
INSERT INTO t2 VALUES ( 1,10,1), (1,10,2), (1,11,1), (1,11,2), (1,2,1), (1,2,2),  (1,2,3);
SELECT t2.a, t2.b, IF(t1.b IS NULL,'',c) AS c, COUNT(*) AS d FROM t2 LEFT JOIN t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY a, b, c;
SELECT t2.a, t2.b, IF(t1.b IS NULL,'',c) AS c, COUNT(*) AS d FROM t2 LEFT JOIN t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY t1.a, t1.b, c;
SELECT t2.a, t2.b, IF(t1.b IS NULL,'',c) AS c, COUNT(*) AS d FROM t2 LEFT JOIN t1 ON t2.a = t1.a AND t2.b = t1.b GROUP BY t2.a, t2.b, c;
DROP TABLE IF EXISTS t1, t2;

Results alter depending on whether there is a UNIQUE index, a regular INDEX or no index on t1; whether t1.a and t1.b are NULLable or NOT NULLable
[26 May 2004 7:05] Sergei Kulakov
Results also alter when t1 has just 3 rows, not 4 or more:

INSERT INTO t1 VALUES (1,1),(1,2),(1,3);

and when you change the order of fields in GROUP BY (as I wrote)
[11 Oct 2005 13:40] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/30918
[13 Oct 2005 11:35] Michael Widenius
Discussed with Alexey and come up with a safer patch that only disables some optimzation in GROUP BY when we group on elments in a LEFT JOIN table.

Test failed also in 5.0 but because of different reasons. Alexey will check this after fixing this in 5.0
[13 Oct 2005 14:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/internals/31029