MySQL Bugs: #38745: MySQL 5.1 optimizer uses filesort for ORDER BY when it should use index

Bug #38745	MySQL 5.1 optimizer uses filesort for ORDER BY when it should use index
Submitted:	12 Aug 2008 9:44	Modified:	20 Jul 2010 2:46
Reporter:	Marc Isambart	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S2 (Serious)
Version:	5.1	OS:	Any
Assigned to:	Gleb Shchepa	CPU Architecture:	Any
Tags:	Optimizer, regression

Description:
This is a case where MySQL 5.1.26 uses filesort for an ORDER BY when it should use the index. On the exact same query, MySQL 5.0.45 correctly uses the index.

This greatly affects the performance on large databases. By example, the query will take 100s of seconds on large databases when it will take less than a second on 5.0.45.

How to repeat:
We need to consider 3 tables:

CREATE TABLE data1 ( index1 integer NOT NULL, value1 integer NOT NULL, PRIMARY KEY(index1) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE data2 ( index2 integer NOT NULL, value2 integer NOT NULL, PRIMARY KEY(index2) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE data3 ( index3 integer NOT NULL, value3 integer NOT NULL, PRIMARY KEY(index3) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Let's add some data to data1 data3, but keep data2 empty:

INSERT INTO data1 VALUES (1, 1),(2, 2),(3, 3),(4, 4),(5, 5),(6, 6),(7, 7),(8, 8),(9, 9),(10, 10),(11, 11),(12, 12);
INSERT INTO data3 VALUES (1, 1),(2, 2),(3, 3),(4, 4),(5, 5),(6, 6),(7, 7),(8, 8),(9, 9),(10, 10),(11, 11),(12, 12);

If we consider the following query:

EXPLAIN SELECT data1.*, data3.*, data2.value2 FROM data1 JOIN data3 ON data3.index3 = data1.index1 LEFT JOIN data2 ON data1.index1 = data2.index2 WHERE data1.index1 BETWEEN 3 AND 10 ORDER BY data1.index1 LIMIT 5\G

5.1.26 will use filesort:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: data2
         type: system
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 0
        Extra: const row not found
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: data1
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 7
        Extra: Using where; Using filesort
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: data3
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: bugreport.data1.index1
         rows: 1
        Extra:

On the same query, 5.0.45 will correctly optimize it:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: data2
         type: system
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 0
        Extra: const row not found
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: data1
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 7
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: data3
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: bugreport.data1.index1
         rows: 1
        Extra:

If we now add a single row in data2 and use the same previous select query:

INSERT INTO data2 VALUES (1, 1);

EXPLAIN SELECT data1.*, data3.*, data2.value2 FROM data1 JOIN data3 ON data3.index3 = data1.index1 LEFT JOIN data2 ON data1.index1 = data2.index2 WHERE data1.index1 BETWEEN 3 AND 10 ORDER BY data1.index1 LIMIT 5\G

MySQL 5.1.26 will now optimize it correctly:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: data1
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 7
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: data2
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: bugreport.data1.index1
         rows: 1
        Extra:
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: data3
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: bugreport.data1.index1
         rows: 1
        Extra:

Many thanks for writing a bug report.

Verified as described.

I just want to notice that this bug seems brake dbmail application.

This query hangs for a very very long time on dbmail with 6.5Gb database and mysql 5.1.34, but works fast on mysql 5.0.77:
SELECT messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id = 278699 ORDER BY messageblk_idnr

I fix it in dbmail using 'use index' expression:
SELECT messageblk, is_header FROM dbmail_messageblks  use index(physmessage_id_index) WHERE physmessage_id = 278699 ORDER BY messageblk_idnr

Simplified test case:

CREATE TABLE t1 (i1 integer NOT NULL PRIMARY KEY);
CREATE TABLE t2 (i2 integer);
CREATE TABLE t3 (i3 integer NOT NULL PRIMARY KEY);

INSERT INTO t1 VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12);
INSERT INTO t3 VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12);

EXPLAIN EXTENDED
SELECT t1.*, t3.* FROM t1 JOIN t3 ON t3.i3 = t1.i1 LEFT JOIN t2 ON t1.i1 = t2.i2 ORDER BY t1.i1 LIMIT 5;

DROP TABLE t1, t2, t3;

I think I've also spotted this bug.

Given this simple query to retrieve some information from phpBB:

SELECT t.topic_id, t.topic_title, f.forum_id, f.forum_name, t.topic_replies, t.topic_last_post_time
FROM phpbb_topics t
INNER JOIN phpbb_forums f ON f.forum_id = t.forum_id
ORDER BY t.topic_last_post_time DESC 
LIMIT 7

I get a query plan of:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	f	ALL	PRIMARY	NULL	NULL	NULL	73	Using temporary; Using filesort
1	SIMPLE	t	ref	forum_id,forum_id_type,forum_appr_last,fid_time_mo...	forum_id	3	f.forum_id	782	 

An index is available on the phpbb_topics table: last_post_time (topic_last_post_time)

Re-writing to include a force index statement:
SELECT t.topic_id, t.topic_title, f.forum_id, f.forum_name, t.topic_replies, t.topic_last_post_time
FROM phpbb_topics t FORCE INDEX ( last_post_time ) 
INNER JOIN phpbb_forums f ON f.forum_id = t.forum_id
ORDER BY t.topic_last_post_time DESC 
LIMIT 7

And I get a query plan of:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	t	index	NULL	last_post_time	4	NULL	7	 
1	SIMPLE	f	eq_ref	PRIMARY	PRIMARY	3	t.forum_id	1	 

Which is obviously much more efficient.

I tried running ANALYZE TABLES on both tables yet there was no difference in query plan. Both tables are InnoDB.

Cardinality on last_post_time is 149467. phpbb_topics contains ~150,000 rows, where-as the forums table contains ~100.

MySQL Server version 5.1.42-community

I also wonder why 'Using filesort' is even used on the forums table?

An ORDER BY is done on the topics table, not the forums table.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109113

3057 Gleb Shchepa	2010-05-25
      Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                  when it should use index
      
      Sometimes the LEFT/RIGHT JOIN with an empty table caused an
      unnecessary filesort.
      
      Sample query, where t1.i1 is indexed and t3 is empty:
      
        SELECT t1.*, t2.* FROM t1 JOIN t2 ON t1.i1 = t2.i2
                             LEFT JOIN t3 ON t2.i2 = t3.i3
          ORDER BY t1.i1 LIMIT 5;
      
      The server erroneously used an item of empty outer-joined
      table as a common constant of a Item_equal (multi-equivalence
      expression).
      By the fix for the bug 16590 the constant status of such
      an item has been propagated to st_table::const_key_parts
      map bits related to other Item_equal argument-related
      key parts (those are obviously not constant in our case).
      As far as test_if_skip_sort_order function skips constant
      prefixes of testing keys, this caused an ignorance of
      available indices, since some prefixes were marked as
      constant by mistake.
     @ mysql-test/r/order_by.result
        Test case for bug #38745.
     @ mysql-test/t/order_by.test
        Test case for bug #38745.
     @ sql/item_cmpfunc.cc
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        - Item_equal::is_simple_const() has been added.
        - Item_equal::update_const() and Item_equal::update_used_tables
          have been updated to not take into account constantness
          of empty outer-joined table items.
     @ sql/item_cmpfunc.h
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        Item_equal::is_simple_const() has been added.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109477

3057 Gleb Shchepa	2010-05-28
      Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                  when it should use index
      
      Sometimes the LEFT/RIGHT JOIN with an empty table caused an
      unnecessary filesort.
      
      Sample query, where t1.i1 is indexed and t3 is empty:
      
        SELECT t1.*, t2.* FROM t1 JOIN t2 ON t1.i1 = t2.i2
                             LEFT JOIN t3 ON t2.i2 = t3.i3
          ORDER BY t1.i1 LIMIT 5;
      
      The server erroneously used an item of empty outer-joined
      table as a common constant of a Item_equal (multi-equivalence
      expression).
      By the fix for the bug 16590 the constant status of such
      an item has been propagated to st_table::const_key_parts
      map bits related to other Item_equal argument-related
      key parts (those are obviously not constant in our case).
      As far as test_if_skip_sort_order function skips constant
      prefixes of testing keys, this caused an ignorance of
      available indices, since some prefixes were marked as
      constant by mistake.
     @ mysql-test/r/order_by.result
        Test case for bug #38745.
     @ mysql-test/t/order_by.test
        Test case for bug #38745.
     @ sql/item.h
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        Item::is_outer_field() has been added and overloaded for
        Item_field and Item_ref classes.
     @ sql/item_cmpfunc.cc
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        Item_equal::update_const() and Item_equal::update_used_tables()
        have been updated to not take into account the constantness
        of outer-joined table items.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109610

3079 Gleb Shchepa	2010-05-31
      Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                  when it should use index
      
      Sometimes the LEFT/RIGHT JOIN with an empty table caused an
      unnecessary filesort.
      
      Sample query, where t1.i1 is indexed and t3 is empty:
      
        SELECT t1.*, t2.* FROM t1 JOIN t2 ON t1.i1 = t2.i2
                             LEFT JOIN t3 ON t2.i2 = t3.i3
          ORDER BY t1.i1 LIMIT 5;
      
      The server erroneously used an item of empty outer-joined
      table as a common constant of a Item_equal (multi-equivalence
      expression).
      By the fix for the bug 16590 the constant status of such
      an item has been propagated to st_table::const_key_parts
      map bits related to other Item_equal argument-related
      key parts (those are obviously not constant in our case).
      As far as test_if_skip_sort_order function skips constant
      prefixes of testing keys, this caused an ignorance of
      available indices, since some prefixes were marked as
      constant by mistake.
     @ mysql-test/r/order_by.result
        Test case for bug #38745.
     @ mysql-test/t/order_by.test
        Test case for bug #38745.
     @ sql/item.h
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        Item::is_outer_field() has been added and overloaded for
        Item_field and Item_ref classes.
     @ sql/item_cmpfunc.cc
        Bug #38745: MySQL 5.1 optimizer uses filesort for ORDER BY
                    when it should use index
        
        Item_equal::update_const() and Item_equal::update_used_tables()
        have been updated to not take into account the constantness
        of outer-joined table items.

Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)

Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)

Gleb,

A proper fix for this bug would be:

+++ sql/sql_select.cc   2010-06-05 17:31:33 +0000
@@ -2663,6 +2663,7 @@ make_join_statistics(JOIN *join, TABLE_L
 #endif
       {                                                // Empty table
         s->dependent= 0;                        // Ignore LEFT JOIN depend.
+       *s->on_expr_ref= 0;
        set_position(join,const_count++,s,(KEYUSE*) 0);
        continue;
       }

Please, reconsider your patch.

Regards,
Igor.

Noted in 5.5.5 changelog.

The optimizer sometimes used filesort for ORDER BY when it should
have used an index.