Bug #31094 Forcing index-based sort doesn't work anymore if joins are done
Submitted: 19 Sep 2007 10:28 Modified: 13 Nov 2007 18:52
Reporter: Domas Mituzas Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.1 OS:Any
Assigned to: Georgi Kodinov CPU Architecture:Any
Tags: bfsm_2007_10_18, Optimizer, regression

[19 Sep 2007 10:28] Domas Mituzas
Description:
In case of two tables joined, there is no way to force index-based sort order and filesort always happens.

How to repeat:
create table t1 (a int primary key, b int);
create table t2 (c int primary key, d int);

(There can be millions of values too)

insert into t1 values (1,1),(2,2),(3,3),(4,4);
insert into t2 values (1,1),(2,2),(3,3),(4,4);

Uses filesort:
select * from t1 order by a; 

Uses index-based sort:
select * from t1 force index(PRIMARY) order by a;

Uses filesort:
select * from t1 join t2 on b=c order by a;

Uses filesort:
select * from t1 force index(PRIMARY) join t2 on b=c order by a;

mysql> explain select * from t1 force index(PRIMARY) join t2 on b=c order by A;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using temporary; Using filesort | 
|  1 | SIMPLE      | t2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    4 | Using where; Using join buffer  | 
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
2 rows in set (0.00 sec)

Limit does not make effect:
mysql> explain select * from t1 force index(PRIMARY) join t2 on b=c order by a limit 1;
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |    4 | Using temporary; Using filesort | 
|  1 | SIMPLE      | t2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    4 | Using where; Using join buffer  | 
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
2 rows in set (0.00 sec)

Same happens with identical 200k-row tables:
mysql> explain select * from mem force index (a) join mem2 using (a) order by a limit 10;
+----+-------------+-------+------+---------------+------+---------+------------+--------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref        | rows   | Extra          |
+----+-------------+-------+------+---------------+------+---------+------------+--------+----------------+
|  1 | SIMPLE      | mem   | ALL  | a             | NULL | NULL    | NULL       | 201625 | Using filesort | 
|  1 | SIMPLE      | mem2  | ref  | a             | a    | 5       | test.mem.a |      1 | Using where    | 
+----+-------------+-------+------+---------------+------+---------+------------+--------+----------------+
2 rows in set (0.00 sec)

Suggested fix:
Do not use NULL key, if FORCE INDEX is done.
[19 Sep 2007 10:38] Domas Mituzas
Apparently 5.0 works properly with large datasets:

mysql> explain select * from mem force index (a) join mem2 using (a) order by a limit 10;
+----+-------------+-------+-------+---------------+------+---------+------------+--------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref        | rows   | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------------+--------+-------------+
|  1 | SIMPLE      | mem   | index | a             | a    | 5       | NULL       | 201625 |             | 
|  1 | SIMPLE      | mem2  | ref   | a             | a    | 5       | test.mem.a |      1 | Using where | 
+----+-------------+-------+-------+---------------+------+---------+------------+--------+-------------+
2 rows in set (0.01 sec)

Tagging this as regression.
[19 Sep 2007 11:01] Domas Mituzas
mysql-test testcase, fails on 5.1, executes properly on 5.0

Attachment: forcedorderindex.test (, text), 455 bytes.

[19 Sep 2007 14:31] Domas Mituzas
Additionally rechecked just with 'FORCE INDEX FOR ORDER BY' syntax:

mysql> explain select * from mem force index for order by (a) join mem2 using (a) order by a limit 10;
+----+-------------+-------+------+---------------+------+---------+------------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------------+------+----------------+
|  1 | SIMPLE      | mem   | ALL  | a             | NULL | NULL    | NULL       | 1000 | Using filesort | 
|  1 | SIMPLE      | mem2  | ref  | a             | a    | 5       | test.mem.a |    1 | Using where    | 
+----+-------------+-------+------+---------------+------+---------+------------+------+----------------+
2 rows in set (0.03 sec)
[5 Oct 2007 12:37] Georgi Kodinov
USE/FORCE/IGNORE INDEX FOR ORDER BY is not meant to influence the method to execute the ordering (filesort or index sort), it just allows the user to specify what indexes to consider when trying to use indexed sort (where the optimizer assumes it's applicable).
The index hints for ORDER BY can disable usage of indexes and effectively force sorted ORDER BY by specifying "USE INDEX FOR ORDER BY ()", but they currently can't force using of index based sorting.
However this bug is still a regression bug : in the fix for bug #27531 a rule was introduced in 5.1 that makes the optimizer to prefer filesort over using indexed order when there's a full table scan on the order by table. But this rule was enforced regardless of the LIMIT.
[5 Oct 2007 14:28] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/34980

ChangeSet@1.2584, 2007-10-05 17:28:34+03:00, gkodinov@magare.gmz +7 -0
  Bug #31094: Forcing index-based sort doesn't work anymore if joins are done
  
  A rule was introduced by the 5.1 part of the fix for bug 27531 to 
  prefer filesort over indexed ORDER BY when accessing all of the rows of a 
  table (because it's faster). This new rule was not accounting for the 
  presence of a LIMIT clause.
  Fixed the condition for this rule so it will prefer filesort over 
  indexed ORDER BY only if no LIMIT.
[29 Oct 2007 8:47] Bugs System
Pushed into 5.1.23-beta
[29 Oct 2007 8:51] Bugs System
Pushed into 6.0.4-alpha
[13 Nov 2007 18:52] Paul DuBois
Noted in 5.1.23, 6.0.4 changelogs.

A rule to prefer filesort over an indexed ORDER BY when accessing all
rows of a table was being used even if a LIMIT clause was present.