Bug #49391 Optimizer chooses wrong execution plan with WHERE and GROUP BY
Submitted: 3 Dec 2009 9:31 Modified: 4 Dec 2009 8:50
Reporter: Oli Sennhauser Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.38 OS:Any
Assigned to: CPU Architecture:Any
Tags: execution plan, GROUP BY, Optimizer, where

[3 Dec 2009 9:31] Oli Sennhauser
Description:
The optimizer does NOT choose the most optimal execution plan and thus the query runs about half as fast as possible.

How to repeat:
CREATE TABLE `test` (
  `P_intID` int(11) DEFAULT NULL,
  `W_intID` int(11) DEFAULT NULL,
  `intCount` int(11) DEFAULT NULL,
  `daDate` date DEFAULT NULL,
  `floatAP` float DEFAULT NULL,
  `floatMP` float DEFAULT NULL,
  `S_intID` smallint(5) unsigned DEFAULT NULL,
  `SW_intID` int(11) DEFAULT NULL,
  `floatMPSub` float DEFAULT NULL,
  KEY `i1` (`P_intID`),
  KEY `i2` (`W_intID`),
  KEY `i4` (`daDate`),
  KEY `i5` (`P_intID`,`W_intID`,`S_intID`),
  KEY `i3` (`P_intID`,`W_intID`,`S_intID`,`intCount`),
  KEY `i6` (`daDate`,`P_intID`,`W_intID`,`S_intID`),
  KEY `i7` (`P_intID`,`W_intID`,`S_intID`,`intCount`,`daDate`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

INSERT INTO test (P_intID, W_intID, S_intID, intCount, daDate) VALUES (1, 1, 1, 1, NOW());

INSERT INTO test (P_intID, W_intID, S_intID, intCount, daDate)
SELECT ROUND(RAND()*1000000, 0), ROUND(RAND()*1000000, 0), ROUND(RAND()*65536, 0), ROUND(RAND()*10, 0), FROM_UNIXTIME(UNIX_TIMESTAMP()-ROUND(RAND()*86400*365, 0))
  FROM test;

EXPLAIN
SELECT SQL_NO_CACHE
       P_intID, W_intID, S_intID, SUM(intCount)
  FROM test
 WHERE daDate >= "2009-10-17"
 GROUP BY P_intID, W_intID, S_intID
;

+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | test  | index | i4,i6         | i7   | 22      | NULL | 2097152 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
avg 4.39 s

EXPLAIN
SELECT SQL_NO_CACHE
       P_intID, W_intID, S_intID, SUM(intCount)
  FROM test FORCE INDEX (i4)
 WHERE daDate >= "2009-10-17"
 GROUP BY P_intID, W_intID, S_intID
;

+----+-------------+-------+-------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows   | Extra                                        |
+----+-------------+-------+-------+---------------+------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | test  | range | i4            | i4   | 4       | NULL | 520836 | Using where; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+--------+----------------------------------------------+
avg 2.06 s

EXPLAIN
SELECT SQL_NO_CACHE
       P_intID, W_intID, S_intID, SUM(intCount)
  FROM test FORCE INDEX (i5)
 WHERE daDate >= "2009-10-17"
 GROUP BY P_intID, W_intID, S_intID
;

+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | test  | index | NULL          | i5   | 13      | NULL | 2097152 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+---------+-------------+
avg 5.67 s

EXPLAIN
SELECT SQL_NO_CACHE
       P_intID, W_intID, S_intID, SUM(intCount)
  FROM test FORCE INDEX (i6)
 WHERE daDate >= "2009-10-17"
 GROUP BY P_intID, W_intID, S_intID
;

+----+-------------+-------+-------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                                        |
+----+-------------+-------+-------+---------------+------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | test  | range | i6            | i6   | 4       | NULL | 1082332 | Using where; Using temporary; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+---------+----------------------------------------------+
avg 2.17 s

EXPLAIN
SELECT SQL_NO_CACHE
       P_intID, W_intID, S_intID, SUM(intCount)
  FROM test FORCE INDEX (i7)
 WHERE daDate >= "2009-10-17"
 GROUP BY P_intID, W_intID, S_intID
;

+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows    | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | test  | index | NULL          | i7   | 22      | NULL | 2097152 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+---------+--------------------------+
avg 4.40 s

Suggested fix:
Choose the right execution plan! :)
[4 Dec 2009 7:41] Oli Sennhauser
Problem does not occur with 5.0.86.
[4 Dec 2009 8:26] Oli Sennhauser
If I change the index sort order to:

CREATE TABLE `test` (
  `P_intID` int(11) DEFAULT NULL,
  `W_intID` int(11) DEFAULT NULL,
  `intCount` int(11) DEFAULT NULL,
  `daDate` date DEFAULT NULL,
  `floatAP` float DEFAULT NULL,
  `floatMP` float DEFAULT NULL,
  `S_intID` smallint(5) unsigned DEFAULT NULL,
  `SW_intID` int(11) DEFAULT NULL,
  `floatMPSub` float DEFAULT NULL,
  KEY `i1` (`P_intID`,`W_intID`,`S_intID`,`intCount`,`daDate`),
  KEY `i2` (`daDate`,`P_intID`,`W_intID`,`S_intID`),
  KEY `i3` (`P_intID`,`W_intID`,`S_intID`,`intCount`),
  KEY `i4` (`P_intID`,`W_intID`,`S_intID`),
  KEY `i5` (`daDate`),
  KEY `i6` (`W_intID`),
  KEY `i7` (`P_intID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

The optimizer chooses correctly i5, which gives the best performance.
[4 Dec 2009 8:50] Valeriy Kravchuk
Let's consider this a duplicate of bug #44969 and escalate it.