Bug #37264 MySQL wrongly uses filesort
Submitted: 7 Jun 2008 22:35 Modified: 16 Oct 2008 3:58
Reporter: jocelyn fournier (Silver Quality Contributor) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.1.24-rc 5.1.29-rc 5.1.30 OS:Any
Assigned to: CPU Architecture:Any
Tags: qc, regression

[7 Jun 2008 22:35] jocelyn fournier
Description:
Hi,

With MySQL 5.1, MySQL is not able to use an index to avoid filesorting without analyzing the table.
This is especially problematic for MEMORY table for which the ANALYZE command is not available, altough FORCE INDEX could be used to solve the problem. 
Hence performances on 5.1 could be worse than 5.0 in that specific case.

This works properly with MySQL 5.0.

Regards,
  Jocelyn Fournier

How to repeat:
DROP TABLE IF EXISTS t1;
CREATE TABLE `t1` (
  `a` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`a`),
  KEY `date` USING BTREE (`date`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8 MAX_ROWS=1000;

DROP PROCEDURE IF EXISTS fill_table;
DELIMITER //
CREATE PROCEDURE fill_table()
BEGIN
  DECLARE v1 INT DEFAULT 1;
  WHILE v1 < 771 DO
    INSERT INTO t1 (a) VALUES (v1);
    SET v1 = v1 + 1;
  END WHILE;
END
//
DELIMITER ;
CALL fill_table();

EXPLAIN SELECT a FROM t1 ORDER BY date DESC LIMIT 0,50;

Result with MySQL 5.1 (use of filesorting) :

+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
 | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |  770
 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
1 row in set (0.00 sec)

Result with MySQL 5.0 (no filesorting) :
+----+-------------+-------+-------+---------------+------+---------+------+------+-------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | t1    | index | NULL          | date | 4       | NULL |  770 |       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------+
1 row in set (0.00 sec)

After an analyze table :

ANALYZE TABLE t1;
EXPLAIN SELECT a FROM t1 ORDER BY date DESC LIMIT 0,50;

+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | row
s | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------+
|  1 | SIMPLE      | t1    | index | NULL          | date | 4       | NULL |   5
0 |       |
+----+-------------+-------+-------+---------------+------+---------+------+----
--+-------+
1 row in set (0.00 sec)

Now this is using properly the index.

Now, let's create t1 as a MEMORY table :

DROP TABLE IF EXISTS t1;
CREATE TABLE `t1` (
  `a` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`a`),
  KEY `date` USING BTREE (`date`) 
) ENGINE=MEMORY DEFAULT CHARSET=utf8 MAX_ROWS=1000;

DROP PROCEDURE IF EXISTS fill_table;
DELIMITER //
CREATE PROCEDURE fill_table()
BEGIN
  DECLARE v1 INT DEFAULT 1;
  WHILE v1 < 771 DO
    INSERT INTO t1 (a) VALUES (v1);
    SET v1 = v1 + 1;
  END WHILE;
END
//
DELIMITER ;
CALL fill_table();

EXPLAIN SELECT a FROM t1 ORDER BY date DESC LIMIT 0,50;
+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows
 | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL |  770
 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+-----
-+----------------+
1 row in set (0.00 sec)

No index are used here, and only the use of FORCE INDEX (date) could solve the issue.

Suggested fix:
Properly use the index available to avoid filesorting.
[11 Jun 2008 14:01] Susanne Ebrecht
Many thanks for writing a bug report.

Verified as described by using 5.1 bzr tree from today.
[26 Jun 2008 14:38] Sergey Petrunya
> With MySQL 5.1, MySQL is not able to use an index to avoid filesorting without analyzing the table.

ANALYZE command produces table statistics. Generally (and in this particular case) it is not realistic to expect the query optimizer to make the right choices without having any statistics.
  

> This is especially problematic for MEMORY table for which the ANALYZE command is not available, altough FORCE INDEX could be used to solve the problem.

Index statistics for BTREE indexes over HEAP tables can be considered a reasonable feature request.

> Hence performances on 5.1 could be worse than 5.0 in that specific case.

Yes, alas, there may be isolated cases like this. At the moment we can only offer to use hint for this particular case.
[26 Jun 2008 14:40] Sergey Petrunya
Changing status to To Be Fixed Later
[26 Jun 2008 14:41] Sergey Petrunya
At the moment, we have no intent to fix this in MySQL 5.1
[26 Jun 2008 14:56] jocelyn fournier
Hi Sergey,

"ANALYZE command produces table statistics. Generally (and in this particular case) it is
not realistic to expect the query optimizer to make the right choices without having any
statistics."

If no statistics info are available, if would perhaps be safer to always use index to mimic the 5.0 behaviour, instead of doing a wrong guess ?

I assume the change in 5.1 behaviour here is an optimisation to try to not use index to avoid filesorting if the query returns more than 2/3 of the total rows of the table ?