Bug #45267 | GROUP BY does not group NULLs on certain JOINs | ||
---|---|---|---|
Submitted: | 2 Jun 2009 13:01 | Modified: | 22 Nov 2010 0:47 |
Reporter: | Philip Stoev | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.4 | OS: | Any |
Assigned to: | Evgeny Potemkin | CPU Architecture: | Any |
Tags: | regression |
[2 Jun 2009 13:01]
Philip Stoev
[2 Jun 2009 13:10]
Philip Stoev
Test case. Apologies for not simplifying it to the absolute possible minimum. SET @saved_cs_client = @@character_set_client; SET character_set_client = utf8; CREATE TABLE `AA` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `int_key` int(11) NOT NULL, `date_key` date NOT NULL, `date_nokey` date NOT NULL, `time_key` time NOT NULL, `time_nokey` time NOT NULL, `datetime_key` datetime NOT NULL, `datetime_nokey` datetime NOT NULL, `varchar_key` varchar(1) NOT NULL, `varchar_nokey` varchar(1) NOT NULL, PRIMARY KEY (`pk`), KEY `int_key` (`int_key`), KEY `date_key` (`date_key`), KEY `time_key` (`time_key`), KEY `datetime_key` (`datetime_key`), KEY `varchar_key` (`varchar_key`) ) ENGINE=MyISAM AUTO_INCREMENT=10 DEFAULT CHARSET=latin1; SET character_set_client = @saved_cs_client; SET @saved_cs_client = @@character_set_client; SET character_set_client = utf8; CREATE TABLE `C` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `int_key` int(11) NOT NULL, `date_key` date NOT NULL, `date_nokey` date NOT NULL, `time_key` time NOT NULL, `time_nokey` time NOT NULL, `datetime_key` datetime NOT NULL, `datetime_nokey` datetime NOT NULL, `varchar_key` varchar(1) NOT NULL, `varchar_nokey` varchar(1) NOT NULL, PRIMARY KEY (`pk`), KEY `int_key` (`int_key`), KEY `date_key` (`date_key`), KEY `time_key` (`time_key`), KEY `datetime_key` (`datetime_key`), KEY `varchar_key` (`varchar_key`) ) ENGINE=MyISAM AUTO_INCREMENT=21 DEFAULT CHARSET=latin1; SET character_set_client = @saved_cs_client; INSERT INTO `C` VALUES (1,0,8,'0000-00-00','0000-00-00','00:16:10','00:16:10','2008-09-03 14:25:40','2008-09-03 14:25:40','h','h'),(2,7,1,'2008-05-01','2008-05-01','00:00:00','00:00:00','2001-01-13 00:00:00','2001-01-13 00:00:00','',''),(3,0,2,'2002-02-04','2002-02-04','00:00:00','00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00','x','x'),(4,2,2,'0000-00-00','0000-00-00','16:29:24','16:29:24','2000-10-16 01:39:08','2000-10-16 01:39:08','w','w'),(5,1,0,'0000-00-00','0000-00-00','09:23:32','09:23:32','0000-00-00 00:00:00','0000-00-00 00:00:00','p','p'),(6,3,7,'0000-00-00','0000-00-00','00:00:00','00:00:00','2007-12-02 00:00:00','2007-12-02 00:00:00','o','o'),(7,3,0,'0000-00-00','0000-00-00','00:00:00','00:00:00','2008-09-11 00:00:00','2008-09-11 00:00:00','',''),(8,0,6,'0000-00-00','0000-00-00','13:59:04','13:59:04','0000-00-00 00:00:00','0000-00-00 00:00:00','s','s'),(9,7,9,'2005-02-05','2005-02-05','09:01:06','09:01:06','0000-00-00 00:00:00','0000-00-00 00:00:00','d','d'),(10,5,2,'2005-12-12','2005-12-12','00:00:00','00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00','n','n'),(11,0,1,'2001-02-27','2001-02-27','21:06:46','21:06:46','0000-00-00 00:00:00','0000-00-00 00:00:00','o','o'),(12,2,2,'2001-09-05','2001-09-05','00:00:00','00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00','',''),(13,6,7,'0000-00-00','0000-00-00','14:45:34','14:45:34','2003-07-28 02:34:08','2003-07-28 02:34:08','w','w'),(14,1,4,'2000-10-11','2000-10-11','15:04:12','15:04:12','0000-00-00 00:00:00','0000-00-00 00:00:00','o','o'),(15,0,8,'2006-04-10','2006-04-10','00:00:00','00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00','x','x'),(16,0,9,'2005-03-26','2005-03-26','15:55:23','15:55:23','2004-03-17 00:32:27','2004-03-17 00:32:27','p','p'),(17,1,0,'2003-02-03','2003-02-03','16:30:00','16:30:00','2004-12-27 19:20:00','2004-12-27 19:20:00','d','d'),(18,0,0,'0000-00-00','0000-00-00','00:00:00','00:00:00','0000-00-00 00:00:00','0000-00-00 00:00:00','h','h'),(19,0,4,'0000-00-00','0000-00-00','14:13:26','14:13:26','2008-11-09 05:53:48','2008-11-09 05:53:48','o','o'),(20,0,0,'2004-01-20','2004-01-20','00:00:00','00:00:00','2009-10-11 06:58:04','2009-10-11 06:58:04','k','k'); SET @saved_cs_client = @@character_set_client; SET character_set_client = utf8; CREATE TABLE `BB` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `int_key` int(11) NOT NULL, `date_key` date NOT NULL, `date_nokey` date NOT NULL, `time_key` time NOT NULL, `time_nokey` time NOT NULL, `datetime_key` datetime NOT NULL, `datetime_nokey` datetime NOT NULL, `varchar_key` varchar(1) NOT NULL, `varchar_nokey` varchar(1) NOT NULL, PRIMARY KEY (`pk`), KEY `int_key` (`int_key`), KEY `date_key` (`date_key`), KEY `time_key` (`time_key`), KEY `datetime_key` (`datetime_key`), KEY `varchar_key` (`varchar_key`) ) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=latin1; SET character_set_client = @saved_cs_client; INSERT INTO `BB` VALUES (10,0,8,'2005-03-23','2005-03-23','19:39:13','19:39:13','0000-00-00 00:00:00','0000-00-00 00:00:00','g','g'),(11,8,5,'2000-11-14','2000-11-14','03:43:53','03:43:53','0000-00-00 00:00:00','0000-00-00 00:00:00','b','b'); SELECT table1 .`datetime_nokey` field1 FROM AA table1 RIGHT JOIN (C table2 JOIN BB ON table2 .`date_nokey` ) ON table2 .`date_key` GROUP BY field1;
[26 Jun 2009 8:53]
Philip Stoev
A tigthter test case: CREATE TABLE `AA` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1; CREATE TABLE `C` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM AUTO_INCREMENT=21 DEFAULT CHARSET=latin1; INSERT INTO `C` VALUES (1,5,'20:53:30'),(2,3,'00:00:00'),(3,0,'12:37:08'),(4,3,' 00:00:00'),(5,1,'15:59:14'),(6,0,'04:43:51'),(7,1,'00:00:00'),(8,7,'09:51:25'),( 9,1,'00:00:00'),(10,0,'00:00:00'),(11,0,'00:00:00'),(12,8,'18:50:55'),(13,4,'00: 00:00'),(14,9,'08:22:36'),(15,0,'00:00:00'),(16,2,'09:37:21'),(17,0,'05:15:03'), (18,8,'20:11:43'),(19,5,'00:00:00'),(20,1,'00:00:00'); CREATE TABLE `BB` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=latin1; INSERT INTO `BB` VALUES (10,9,'22:36:46'),(11,0,'08:46:46'); /* OPTIMIZER SETTINGS: */ SET SESSION optimizer_switch = 'firstmatch=on,index_merge=on,index_merge_union=o n,index_merge_sort_union=on,index_merge_intersection=on,loosescan=on,materializa tion=off,semijoin=off'; SET SESSION optimizer_use_mrr = 'force'; SET SESSION engine_condition_pushdown = '1'; SET SESSION join_cache_level = '1'; SET GLOBAL optimizer_switch = 'firstmatch=on,index_merge=on,index_merge_union=on ,index_merge_sort_union=on,index_merge_intersection=on,loosescan=on,materializat ion=off,semijoin=off'; SET GLOBAL optimizer_use_mrr = 'force'; SET GLOBAL engine_condition_pushdown = '1'; SET GLOBAL join_cache_level = '1'; /* ORIGINAL QUERY: SELECT table1 . `pk` AS field1 FROM (AA AS table1 RIGHT OUTER JOIN (BB AS table 2 STRAIGHT_JOIN C AS table3 ON ( table3 . `datetime_nokey` = table2 . `int_nokey ` )) ON ( table3 . `time_key` = table2 . `time_key` )) WHERE table2 . `int_noke y` <> 6 GROUP BY field1 ; ORIGINAL DIFF: --- /tmp//randgen18812-server0.dump 2009-06-26 11:51:46.000000000 +0300 +++ /tmp//randgen18812-server1.dump 2009-06-26 11:51:46.000000000 +0300 @@ -1 +1,7 @@ NULL +NULL +NULL +NULL +NULL +NULL +NULL SIMPLIFIED QUERY: SELECT table1 .`pk` field1 FROM AA table1 RIGHT JOIN BB table2 STRAIGHT_JOI N C ON table2 .`int_nokey` ON table2 .`time_key` GROUP BY field1 ; SIMPLIFIED DIFF: --- /tmp//randgen18812-server0.dump 2009-06-26 11:51:46.000000000 +0300 +++ /tmp//randgen18812-server1.dump 2009-06-26 11:51:46.000000000 +0300 @@ -1 +1,20 @@ NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL +NULL */
[8 Jul 2009 20:04]
Philip Stoev
Here is another test case, showing that even after using SELECT DISTINCT , the NULLs are separate rows in the result set. --disable_warnings DROP TABLE IF EXISTS AA, C, BB; --enable_warnings CREATE TABLE `AA` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1; CREATE TABLE `C` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM AUTO_INCREMENT=21 DEFAULT CHARSET=latin1; INSERT INTO `C` VALUES (1,5,'20:53:30'),(2,3,'00:00:00'),(3,0,'12:37:08'),(4,3,'00:00:00'),(5,1,'15:59:14'),(6,0,'04:43:51'),(7,1,'00:00:00'),(8,7,'09:51:25'),(9,1,'00:00:00'),(10,0,'00:00:00'),(11,0,'00:00:00'),(12,8,'18:50:55'),(13,4,'00:00:00'),(14,9,'08:22:36'),(15,0,'00:00:00'),(16,2,'09:37:21'),(17,0,'05:15:03'),(18,8,'20:11:43'),(19,5,'00:00:00'),(20,1,'00:00:00'); CREATE TABLE `BB` ( `pk` int(11) NOT NULL AUTO_INCREMENT, `int_nokey` int(11) NOT NULL, `time_key` time NOT NULL, PRIMARY KEY (`pk`), KEY `time_key` (`time_key`) ) ENGINE=MyISAM AUTO_INCREMENT=12 DEFAULT CHARSET=latin1; INSERT INTO `BB` VALUES (10,9,'22:36:46'),(11,0,'08:46:46'); SELECT DISTINCT table1 .`pk` field1 FROM AA table1 RIGHT JOIN BB table2 STRAIGHT_JOIN C ON table2 .`int_nokey` ON table2 .`time_key` GROUP BY field1;
[13 Jul 2009 15:27]
Philip Stoev
This regression was caused by the following changeset: revno: 2727 revision-id: igor@mysql.com-20090310013411-xxe9x3d9kqxuo6ar parent: sergefp@mysql.com-20090301182607-xglayyk67lwson93 committer: Igor Babaev <igor@mysql.com> branch nick: mysql-6.0-opt-bug42955 timestamp: Mon 2009-03-09 18:34:11 -0700 message: Fixed bug #42955. If a join buffer is employed to join a table through BNL or BKA algorithm for a query with a ORDER BY / GROUP BY clause then the result set has always to be sorted unless the clause can be optimized away. The regression is present even if all possible 5.4 optimizations have been turned off.
[21 Jul 2009 11:20]
Evgeny Potemkin
Slightly changed query reports wrong error: mysql> SELECT DISTINCT table1 .`pk` field1 FROM BB table2 LEFT JOIN AA table1 STRAIGHT_JOIN C ON table2 .`int_nokey` ON table2 .`time_key` GROUP BY field1; ERROR 1054 (42S22): Unknown error 1054
[21 Jul 2009 12:09]
Philip Stoev
The error for code 1054 is as follows: Error: 1054 SQLSTATE: 42S22 (ER_BAD_FIELD_ERROR) Message: Unknown column '%s' in '%s'
[23 Jul 2009 12:16]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/79167 2825 Evgeny Potemkin 2009-07-23 Bug#45267: Incomplete check caused wrong result. If a join buffer is used to join a table the ordering by an index for the first non-constant table cannot be employed unless ORDER BY/GROUP BY clauses are optimized away. The JOIN::get_sort_by_join_tab function is employed to choose whether to sort result with filesort or to use index scan. It wasn't checking whether the GROUP BY clause was optimized away and thus returned wrong result in this case. All this caused the optimizer to choose wrong execution plan and return wrong query result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away. @ mysql-test/r/join_cache.result Added a test case for the bug#45267. @ mysql-test/t/join_cache.test Added a test case for the bug#45267. @ sql/sql_select.h Bug#45267: Incomplete check caused wrong result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away.
[23 Jul 2009 12:22]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/79173 2825 Evgeny Potemkin 2009-07-22 Bug#45267: Incomplete check caused wrong result. If a join buffer is used to join a table the ordering by an index for the first non-constant table cannot be employed unless ORDER BY/GROUP BY clauses are optimized away. The JOIN::get_sort_by_join_tab function is employed to choose whether to sort result with filesort or to use index scan. It wasn't checking whether the GROUP BY clause was optimized away and thus returned wrong result in this case. All this caused the optimizer to choose wrong execution plan and return wrong query result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away. @ mysql-test/r/join_cache.result Added a test case for the bug#45267. @ mysql-test/t/join_cache.test Added a test case for the bug#45267. @ sql/sql_select.h Bug#45267: Incomplete check caused wrong result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away.
[4 Sep 2009 7:00]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/82408 2825 Evgeny Potemkin 2009-09-04 Bug#45267: Incomplete check caused wrong result. If a join buffer is used to join a table the ordering by an index for the first non-constant table cannot be employed unless ORDER BY/GROUP BY clauses are optimized away. The JOIN::get_sort_by_join_tab function is employed to choose whether to sort result with filesort or to use index scan. It wasn't checking whether the GROUP BY clause was optimized away and thus returned wrong result in this case. All this caused the optimizer to choose wrong execution plan and return wrong query result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away. @ mysql-test/r/join_cache.result Added a test case for the bug#45267. @ mysql-test/t/join_cache.test Added a test case for the bug#45267. @ sql/sql_select.h Bug#45267: Incomplete check caused wrong result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away.
[8 Sep 2009 8:01]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/82634 2811 Evgeny Potemkin 2009-09-08 [merge] Auto-merged fix for te bug#45267.
[15 Sep 2009 13:52]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090915134838-5nj3ycjfsqc2vr2f) (version source revid:epotemkin@mysql.com-20090908080055-t3xlolxjdj4xsn4s) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 1:24]
Paul DuBois
Noted in 5.4.4 changelog. NULL values were not grouped properly for some joins containing GROUP BY.
[12 May 2010 15:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/108169 3167 Evgeny Potemkin 2010-05-12 Bug#45267: Incomplete check caused wrong result. If a join buffer is used to join a table the ordering by an index for the first non-constant table cannot be employed unless ORDER BY/GROUP BY clauses are optimized away. The JOIN::get_sort_by_join_tab function is employed to choose whether to sort result with filesort or to use index scan. It wasn't checking whether the GROUP BY clause was optimized away and thus returned wrong result in this case. All this caused the optimizer to choose wrong execution plan and return wrong query result. The JOIN::get_sort_by_join_tab function now checks whether the GROUP BY clause was optimized away. Original revid:epotemkin@mysql.com-20090904065824-s97w8gwsls2sqhcy Only the test case is added as the code is already present. @ mysql-test/r/join_cache.result A test case added for the bug#45267. @ mysql-test/t/join_cache.test A test case added for the bug#45267.
[16 Aug 2010 6:40]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100816062819-bluwgdq8q4xysmlg) (version source revid:alik@sun.com-20100816062612-enatdwnv809iw3s9) (pib:20)
[13 Nov 2010 16:20]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[22 Nov 2010 0:47]
Paul DuBois
Noted in 5.6.1 changelog.