Bug #35334 For MyISAM select * from table order by key_field does not use index
Submitted: 17 Mar 2008 12:48 Modified: 22 Apr 2008 15:02
Reporter: Alexander Rubin Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S4 (Feature request)
Version:5.0 all OS:Any
Assigned to: Gleb Shchepa CPU Architecture:Any
Tags: myisam, Optimizer, performance
Triage: Triaged: D5 (Feature request)

[17 Mar 2008 12:48] Alexander Rubin
Description:
For MyISAM select * from table order by key_field does not use index and use "temporary table", for InnoDB it uses the correct index.

mysql>  
CREATE TABLE `q2` ( `i` int(11) NOT NULL auto_increment, 
`strArbo` varchar(255) NOT NULL default '',  
PRIMARY KEY  (`i`),   KEY `strArbo` (`strArbo`) )
ENGINE=MyISAM DEFAULT CHARSET=latin1;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into q2 values(...);
Query OK, 1 row affected (0.00 sec)
....
mysql> update q2 set strArbo='aaaaaaaaa' where i%3=0;
Query OK, 682 rows affected (0.06 sec)
Rows matched: 682  Changed: 682  Warnings: 0

mysql> update q2 set strArbo='ccccccccc' where i%5=0;
Query OK, 409 rows affected (0.06 sec)
Rows matched: 409  Changed: 409  Warnings: 0

mysql> explain select * from q2 order by strArbo;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | q2    | ALL  | NULL          | NULL | NULL    | NULL | 2048 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> alter table q2 engine=InnoDB;
Query OK, 2048 rows affected (0.11 sec)
Records: 2048  Duplicates: 0  Warnings: 0

mysql> explain select * from q2 order by strArbo;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|  1 | SIMPLE      | q2    | index | NULL          | strArbo | 257     | NULL | 1922 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> alter table q2 engine=MyISAM;
Query OK, 2048 rows affected (0.05 sec)
Records: 2048  Duplicates: 0  Warnings: 0

mysql> explain select * from q2 order by strArbo;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | q2    | ALL  | NULL          | NULL | NULL    | NULL | 2048 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> analyze table q2;
+---------+---------+----------+-----------------------------+
| Table   | Op      | Msg_type | Msg_text                    |
+---------+---------+----------+-----------------------------+
| test.q2 | analyze | status   | Table is already up to date |
+---------+---------+----------+-----------------------------+
1 row in set (0.01 sec)

mysql> optimize table q2;
+---------+----------+----------+----------+
| Table   | Op       | Msg_type | Msg_text |
+---------+----------+----------+----------+
| test.q2 | optimize | status   | OK       |
+---------+----------+----------+----------+
1 row in set (0.00 sec)

mysql> optimize table q2;
+---------+----------+----------+-----------------------------+
| Table   | Op       | Msg_type | Msg_text                    |
+---------+----------+----------+-----------------------------+
| test.q2 | optimize | status   | Table is already up to date |
+---------+----------+----------+-----------------------------+
1 row in set (0.00 sec)

mysql> explain select * from q2 order by strArbo;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | q2    | ALL  | NULL          | NULL | NULL    | NULL | 2048 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> explain select * from q2 use index (strArbo) order by strArbo;
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
|  1 | SIMPLE      | q2    | ALL  | NULL          | NULL | NULL    | NULL | 2048 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)

mysql> explain select * from q2 force index (strArbo) order by strArbo;
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
|  1 | SIMPLE      | q2    | index | NULL          | strArbo | 257     | NULL | 2048 |       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------+
1 row in set (0.01 sec)

How to repeat:
see above

Suggested fix:
Fix optimizer, so that it choses the right index for MyISAM tables as it does for InnoDB tables.
[17 Mar 2008 13:28] Miguel Solorzano
Thank you for the bug report.

CREATE TABLE `q2` ( `i` int(11) NOT NULL auto_increment, 
`strArbo` varchar(255) NOT NULL default '',  
PRIMARY KEY  (`i`),   KEY `strArbo` (`strArbo`) )
ENGINE=MyISAM DEFAULT CHARSET=latin1;
insert into q2 values();
insert into q2 select NULL, 'aaaaa' from q2;
insert into q2 select NULL, 'aaaaa' from q2;
insert into q2 select NULL, 'aaaaa' from q2;
update q2 set strArbo='aaaaaaaaa' where i%3=0;
update q2 set strArbo='ccccccccc' where i%5=0;
explain select * from q2 order by strArbo\G
alter table q2 engine=InnoDB;
explain select * from q2 order by strArbo\G

mysql> explain select * from q2 order by strArbo\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: q2
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 8
        Extra: Using filesort
1 row in set (0.00 sec)

mysql> alter table q2 engine=InnoDB;
Query OK, 8 rows affected (0.02 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql>  explain select * from q2 order by strArbo\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: q2
         type: index
possible_keys: NULL
          key: strArbo
      key_len: 257
          ref: NULL
         rows: 8
        Extra: Using index
1 row in set (0.00 sec)
[17 Mar 2008 15:21] Alexander Rubin
Here is the speed profiling:

mysql> update qqqq set strArbo=concat(char(round(rand()*100,0)+100), char(round(rand()*100,0)+100), char(round(rand()*100,0)+100), char(round(rand()*100,0)+100), 'asdasdsads
adsad');
Query OK, 131072 rows affected (6.86 sec)
Rows matched: 131072  Changed: 131072  Warnings: 0

mysql> select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10;
+-------+---------------------+
| i     | strArbo             |
+-------+---------------------+
| 71140 | └┬├┬asdasdsadsadsad |
| 11735 | ├├└оasdasdsadsadsad |
| 71140 | ├├└оasdasdsadsadsad |
| 11735 | ┬├└пasdasdsadsadsad |
| 11735 | ├├┴░asdasdsadsadsad |
| 11735 | ┴┬└▓asdasdsadsadsad |
| 11735 | ┴┬┴┤asdasdsadsadsad |
| 71140 | └┬└┤asdasdsadsadsad |
| 11735 | └┴└╢asdasdsadsadsad |
| 11735 | └┴┴╢asdasdsadsadsad |
+-------+---------------------+
10 rows in set (22.25 sec)

mysql> select SQL_CALC_FOUND_ROWS * from qqqq force index(strArbo)  order by strArbo limit 10;
+-------+---------------------+
| i     | strArbo             |
+-------+---------------------+
| 71140 | └┬├┬asdasdsadsadsad |
| 71140 | ├├└оasdasdsadsadsad |
| 11735 | ├├└оasdasdsadsadsad |
| 11735 | ┬├└пasdasdsadsadsad |
| 11735 | ├├┴░asdasdsadsadsad |
| 11735 | ┴┬└▓asdasdsadsadsad |
| 71140 | └┬└┤asdasdsadsadsad |
| 11735 | ┴┬┴┤asdasdsadsadsad |
| 11735 | └┴└╢asdasdsadsadsad |
| 11735 | └┴┴╢asdasdsadsadsad |
+-------+---------------------+
10 rows in set (0.00 sec)

mysql> explain select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: qqqq
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 131072
        Extra: Using filesort
1 row in set (0.00 sec)

mysql> explain select SQL_CALC_FOUND_ROWS * from qqqq force index (strArbo) order by strArbo limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: qqqq
         type: index
possible_keys: NULL
          key: strArbo
      key_len: 257
          ref: NULL
         rows: 131072
        Extra:
1 row in set (0.00 sec)
[17 Mar 2008 15:22] Alexander Rubin
The above example is for myisam:

Create Table: CREATE TABLE `qqqq` (
  `i` int(11) default NULL,
  `strArbo` varchar(255) NOT NULL default '',
  KEY `strArbo` (`strArbo`),
  KEY `i` (`i`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
[18 Mar 2008 9:36] Sergey Petrunya
Everyone: please pay attention to presense/absense of LIMIT clause as that has a strong effect on which strategy will be used to resolve the ORDER BY. 
 
Bug verification team: please:
 - Check 5.1, we've did some ORDER BY... LIMIT fixes there.
 - Check if the records happen to be read in disk order in the provided examples. That is a factor that optimizer cannot detect and which can affect query speed greatly
[14 Apr 2008 18:32] Alexander Rubin
I've checked with 5.1.23-rc-community-log: the same behavior. 

mysql> alter table qqqq engine=MyISAM;
Query OK, 262144 rows affected (8.30 sec)
Records: 262144  Duplicates: 0  Warnings: 0

mysql> explain select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: qqqq
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 262144
        Extra: Using filesort
1 row in set (0.27 sec)

mysql> alter table qqqq engine=InnoDB;
Query OK, 262144 rows affected (3.52 sec)
Records: 262144  Duplicates: 0  Warnings: 0

mysql> explain select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: qqqq
         type: index
possible_keys: NULL
          key: strArbo
      key_len: 257
          ref: NULL
         rows: 262756
        Extra: Using index
1 row in set (0.00 sec)

Speed profiling for MySQL 5.1 (same results as in MySQL 5.0):
For MyISAM:

mysql> select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10;
+------+---------+
| i    | strArbo |
+------+---------+
| 2703 |         |
| 2686 |         |
| 2685 |         |
| 2684 |         |
| 2683 |         |
| 2682 |         |
| 2681 |         |
| 2680 |         |
| 2679 |         |
| 2678 |         |
+------+---------+
10 rows in set (25.11 sec)

mysql> select SQL_CALC_FOUND_ROWS * from qqqq force index(strArbo)  order by strArbo limit
    -> 10;
+----+---------+
| i  | strArbo |
+----+---------+
|  1 |         |
|  2 |         |
|  3 |         |
|  4 |         |
|  5 |         |
|  6 |         |
|  7 |         |
|  8 |         |
|  9 |         |
| 10 |         |
+----+---------+
10 rows in set (0.09 sec)

For InnoDB:
mysql> alter table qqqq engine=InnoDB;
Query OK, 262144 rows affected (5.33 sec)
Records: 262144  Duplicates: 0  Warnings: 0

mysql>  select SQL_CALC_FOUND_ROWS * from qqqq  order by strArbo limit 10;
+----+---------+
| i  | strArbo |
+----+---------+
|  1 |         |
|  2 |         |
|  3 |         |
|  4 |         |
|  5 |         |
|  6 |         |
|  7 |         |
|  8 |         |
|  9 |         |
| 10 |         |
+----+---------+
10 rows in set (0.17 sec)

So: for InnoDB it chooses the right index, and for MyISAM - it is not.
[21 Apr 2008 21:13] Guilhem Bichot
Testcase: put the lines below in a .test file:
-- source include/have_innodb.inc
CREATE TABLE `q2` ( `i` int(11) NOT NULL auto_increment, 
`strArbo` varchar(255) NOT NULL default '',  
PRIMARY KEY  (`i`),   KEY `strArbo` (`strArbo`) )
ENGINE=MyISAM DEFAULT CHARSET=latin1;
insert into q2 values(null, 'bbbbbbbbb');
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
update q2 set strArbo='aaaaaaaaa' where i%3=0;
update q2 set strArbo='ccccccccc' where i%5=0;
select count(*) from q2;
alter table q2 engine=innodb;
explain select * from q2 order by strArbo;
select now();
select * from q2 order by strArbo;
select now();
alter table q2 engine=myisam;
explain select * from q2 order by strArbo;
select now();
select * from q2 order by strArbo;
select now();

And then run it (I used 5.1-opt tree) and see the result:
CREATE TABLE `q2` ( `i` int(11) NOT NULL auto_increment,
`strArbo` varchar(255) NOT NULL default '',
PRIMARY KEY  (`i`),   KEY `strArbo` (`strArbo`) )
ENGINE=MyISAM DEFAULT CHARSET=latin1;
insert into q2 values(null, 'bbbbbbbbb');
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
insert into q2 (strArbo) select strArbo from q2;
update q2 set strArbo='aaaaaaaaa' where i%3=0;
update q2 set strArbo='ccccccccc' where i%5=0;
select count(*) from q2;
count(*)
524288
alter table q2 engine=innodb;
explain select * from q2 order by strArbo;
id      select_type     table   type    possible_keys   key     key_len ref
rows    Extra
1       SIMPLE  q2      index   NULL    strArbo 257     NULL    524541  Using in
dex
select now();
now()
2008-04-22 00:09:47
select * from q2 order by strArbo;
i       strArbo
3       aaaaaaaaa
<cut millions of lines>
44450  ccccccccc
select now();
now()
2008-04-22 00:09:49
alter table q2 engine=myisam;
explain select * from q2 order by strArbo;
id      select_type     table   type    possible_keys   key     key_len ref
rows    Extra
1       SIMPLE  q2      ALL     NULL    NULL    NULL    NULL    524288  Using fi
lesort
select now();
now()
2008-04-22 00:10:02
select * from q2 order by strArbo;
i       strArbo
342     aaaaaaaaa
<cut millions of lines>
44450  ccccccccc
select now();
now()
2008-04-22 00:10:18

This is the problem which the customer complained about (I know it, I talked with Stéphane Varoqui): InnoDB has "Using index" whereas MyISAM has "Using filesort", and the performance thus suffers (2 seconds for the SELECT in InnoDB, 16 seconds for the SELECT in MyISAM where most time is spent in "Sorting result" state.
I am not using a shm-based data directory, normal disk-based directory.
This should be enough of a testcase for the Optimizer team to consider now.
[22 Apr 2008 5:47] Sergei Golubchik
As far as I remember it's not a bug - it's intentional, but bad, heuristic in the optimizer. When retrieving all rows in the table by doing an index scan, the engine would need to read rows from disk in a series of random reads. When reading all rows with a full table scan the engine would read the disk sequentially, which often is more than an order of magnitude faster, and even taking filesort into account it should be faster than an index scan.

InnoDB is a special case - data are clustered by primary key, there's no way to read them sequentially
[24 Apr 2008 18:08] Stephane Varoqui
The sorting is also present in last 4.1 and 5.0

For OLTP database every peace of data stay in memory so the sorting does
not limit rnd read IO unless the database is cold
 
If the sort_buffer is too small we will also swap to a temporary table
so makes multiple serial IO for the merge pass. 
 
I feel that we should implement the fixe at the storage engine level
with some kine of read ahead buffer, espacially when the optimiser get
no statistics for the level of "ramdominess" associated with the index
and the rate of data that feet in memory .

rds
[26 May 2009 20:14] Sveta Smirnova
Bug #45095 most likely duplicate of this one and contains easier test.