Bug #21174 Index degrades sort performance and optimizer does not honor IGNORE INDEX
Submitted: 20 Jul 2006 6:17 Modified: 12 Apr 2007 21:24
Reporter: John David Duncan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.18, 5.1bk OS:Linux (linux)
Assigned to: Georgi Kodinov CPU Architecture:Any
Tags: bfsm_2006_12_07, Q1

[20 Jul 2006 6:17] John David Duncan
Description:
"SELECT col_a, sum(col_b) FROM large_table GROUP BY col_a"  can be much slower if there is an index on col_a than if there is not one. "SELECT col_a, sum(col_b) FROM large_table IGNORE INDEX (index_on_col_a) GROUP BY col_a" does not help because the IGNORE INDEX hint is disregarded. 

How to repeat:
create table t1 (   
  pk int primary key not null auto_increment, 
  category int,
  amount float, 
  misc int  
) engine = innodb;

create table t2 like t1;
alter table t2 add index (category) ; 

create table t3 like t2;
alter table t3 engine = myisam;

DELIMITER //
CREATE PROCEDURE newrows(n int)
BEGIN
 DECLARE j int;
 SET j = 1;
 REPEAT 
  INSERT INTO t1 values (NULL, floor(rand() * 200) , rand() * 100 , 0) ;
  SET j = j + 1 ;
UNTIL j = n
END REPEAT;
END 
//

CALL newrows(250000) ;
insert into t2 select * from t1;
insert into t3 select * from t1;

-- t1 has no problem
explain select category, sum(amount) from t1 group by category ; 

-- t2 has a problem!!  (Index scan instead of table scan )
explain select category, sum(amount) from t2 group by category ; 

-- Also a problem:
explain select category, sum(amount) from t2 IGNORE INDEX (category) group by category;

-- t3 has no problem, so maybe something about innodb triggers the problem with t2.
explain select category, sum(amount) from t3 group by category ;
[20 Jul 2006 8:52] Sveta Smirnova
Verified as described on Linux and both 5.0 and 5.1 BK.
[2 Aug 2006 14:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/9948

ChangeSet@1.2227, 2006-08-02 17:18:00+03:00, gkodinov@macbook.gmz +3 -0
  Bug #21174: Index degrades sort performance and 
               optimizer does not honor IGNORE INDEX
   - Allow an index to be used for sorting the table 
     instead of filesort only if it is not disabled by
     IGNORE INDEX.
[14 Aug 2006 15:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10373

ChangeSet@1.2227, 2006-08-14 18:19:29+03:00, gkodinov@macbook.gmz +3 -0
  Bug #21174: Index degrades sort performance and 
               optimizer does not honor IGNORE INDEX
   - Allow an index to be used for sorting the table 
     instead of filesort only if it is not disabled by
     IGNORE INDEX.
[29 Aug 2006 13:22] Evgeny Potemkin
Fixed in 5.0.25
[31 Aug 2006 16:02] Chad MILLER
Available in 5.0.25.
[4 Sep 2006 11:43] Evgeny Potemkin
Fixed in 5.1.12
[13 Sep 2006 2:07] Paul DuBois
Noted in 5.0.25, 5.1.12 changelog.

InnoDB did not honor IGNORE INDEX, which prevented using IGNORE INDEX in cases where an index sort would be slower than a filesort.
[22 Sep 2006 18:38] Sergei Golubchik
the fix is wrong, old behaviour was intentional and explicitly documented:
"
`USE INDEX', `IGNORE INDEX', and `FORCE INDEX' affect only which indexes
are used when MySQL decides how to find rows in the table and how to do
the join. They do not affect whether an index is used when resolving an
`ORDER BY' or `GROUP BY'.
"
[27 Sep 2006 9:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/12593

ChangeSet@1.2280, 2006-09-27 12:53:53+03:00, gkodinov@macbook.gmz +3 -0
  Bug #21174: Index degrades sort performance and optimizer does not honor IGNORE INDEX
   - reversed the patch for 5.0 and moved to 5.1
[27 Sep 2006 10:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/12597

ChangeSet@1.2338, 2006-09-27 13:11:00+03:00, gkodinov@macbook.gmz +3 -0
  Bug#21174: Index degrades sort performance and optimizer does not honor IGNORE INDEX
   - fix moved to 5.1
[27 Sep 2006 16:42] John David Duncan
The patch was reverted based on the idea that you could use 
  SELECT SQL_BIG_RESULT ... 
in your query instead of  IGNORE INDEX, but get the same results.

This is documented at http://dev.mysql.com/doc/refman/5.0/en/select.html and unfortunately several of us overlooked that in the process of getting to this point.

Derek, can you confirm whether this solves the issue before we take the next step?
[27 Sep 2006 19:27] John David Duncan
If you try SQL_BIG_RESULT in the test case I originally submitted with this bug, it does not make any difference.

mysql> explain select category, sum(amount) from t2 group by category \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: index
possible_keys: NULL
          key: category
      key_len: 5
          ref: NULL
         rows: 250558
        Extra: 
1 row in set (0.01 sec)

mysql> explain select SQL_BIG_RESULT category, sum(amount) from t2 group by category \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: index
possible_keys: NULL
          key: category
      key_len: 5
          ref: NULL
         rows: 250558
        Extra: 
1 row in set (0.00 sec)
[1 Oct 2006 8:58] Georgi Kodinov
Pushed in 5.0.26/5.1.12
[6 Oct 2006 9:17] Michael Widenius
This patch should be removed from 5.1 as it breaks compatiblity with older versions and the current behaviour is fine for a lot of current usage cases
(Especially when using ORDER BY with LIMIT, the current patch is VERY bad)

What needs to done instead in 5.1 or 52, is to complement the IGNORE INDEX syntax they way that Sergei describe earlier in the bug report:

IGNORE INDEX [FOR {JOIN|ORDER BY|GROUP BY}]. If no FOR ... is specified the index will be ignored everywhere (as in the changeset above, except if you start mysqld with --old, in which case if there is no FOR, the index will only be ignored for join.

As a way for the original bug reporter to work around the problem in MySQL:
Add a +0 to the first argument to ORDER BY; This will disable any index optimization for the ORDER BY part.
[11 Mar 2007 13:27] Georgi Kodinov
Implemented (and reviewed) in a separate WorkLog Item (WL#3527).
[11 Apr 2007 9:41] Georgi Kodinov
Implemented in WL#3527.
Pushed in 5.0.40, 5.1.17
[12 Apr 2007 21:24] Paul DuBois
Noted in 5.0.40 changelog.

The syntax for index hints has been extended to enable explicit
specification that the hint applies only to join processing.

Noted in 5.1.17 changelog.

The syntax for index hints has been extended to enable more
fine-grained control over the optimizer's selection of an execution
plan for various phases of query processing.

I also added a note to the 5.0.25 changelog entry that that
patch was reverted in 5.0.26.

The updated operation of index hints now is documented in its
own section of the manual:

http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
http://dev.mysql.com/doc/refman/5.0/en/index-hints.html
[28 Aug 2007 9:27] Arpad Nev
I'm using MySQL 5.0.45 community-nt-log on Windows XP platform with 256M ram.

I have 3M rows in a 1.2G sized table T, with a compound index meant to increase performance of queries like
Select <KEY fields>, sum(<field) FROM T Group By <Key fields>
without WHERE clause.

There are about 900 key field combinations, inserted in random order. So using the index means 3M random file I/O which is far more expensive than reading 1.2G data in sequential order + 3M lookups in the resultset (which is probably entirely in the memory). On my machine queries that use index for grouping are at least 10 times slower.