Bug #39258 Optimizer do full table scan in favor of sorting result by index [innodb rel.]
Submitted: 4 Sep 2008 21:00 Modified: 30 Oct 2009 0:52
Reporter: Alexander Y. Fomichev Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.26-rc, 5.1.30 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: Optimizer sort innodb

[4 Sep 2008 21:00] Alexander Y. Fomichev
Description:
Seems like on some conditions optimizer don't like filesort 
so much that prefer to do full a table scan to fetch rows in 
favor of using index at a sort time.

How to repeat:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 507404
Server version: 5.1.26-rc-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> DROP TABLE IF EXISTS a;
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE IF EXISTS b;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `a` (
    -> `id` int(11) NOT NULL DEFAULT '0',
    -> PRIMARY KEY (`id`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE `b` (
    -> `id` int(11) NOT NULL DEFAULT '0',
    -> `c` char(32) NOT NULL,
    -> `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    -> PRIMARY KEY (`id`,`c`),
    -> KEY `c` (`c`),
    -> KEY `time` (`time`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.00 sec)

by shell:
~ $ for i in `seq 100`;do mysql -e "INSERT INTO a VALUES ($i)" test;done
~ $ for i in `seq 100`;do mysql -e "INSERT INTO b VALUES (${i},\"${i}a\", NULL)" test;done

mysql> EXPLAIN SELECT * FROM a, b WHERE b.c='a45' AND a.id=b.id ORDER BY b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: index
possible_keys: PRIMARY,c
          key: time
      key_len: 4
          ref: NULL
         rows: 100
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM a, b USE INDEX (c) WHERE b.c='a45' AND a.id=b.id ORDER BY b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: c
          key: c
      key_len: 32
          ref: const
         rows: 1
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

It is curious with myisam table optimizer do quite right thing:

mysql> ALTER TABLE b ENGINE=MyISAM;
Query OK, 100 rows affected (0.02 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql> EXPLAIN SELECT * FROM a, b WHERE b.c='a45' AND a.id=b.id ORDER BY b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: PRIMARY,c
          key: c
      key_len: 32
          ref: const
         rows: 1
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)
[5 Sep 2008 3:35] Valeriy Kravchuk
Thank you for a problem report. It looks similar to bug #35844 (but that one should be fixed) and/or bug #36259. Maybe it is the same as #35844 but for a timestamp column. Please, check.
[5 Sep 2008 9:15] Alexander Y. Fomichev
> Thank you for a problem report.

Thank you for the quick unswer

>It looks similar to bug #35844 (but that one should be
fixed)

yep, it's fixed for me.
 
> and/or bug #36259.

probably yes, it's quite reproducible for me.

> Maybe it is the same as #35844 but for a timestamp column.
> Please, check.

seem like not:

mysql> SHOW CREATE TABLE broken\G
*************************** 1. row ***************************
       Table: broken
Create Table: CREATE TABLE `broken` (
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `id2` int(10) unsigned NOT NULL,
  `junk` int(10) unsigned NOT NULL,
  PRIMARY KEY (`time`),
  KEY `id2_j_id1` (`id2`,`junk`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=cp1251
1 row in set (0.00 sec)

mysql> SELECT * FROM broken;
+---------------------+-----+------+
| time                | id2 | junk |
+---------------------+-----+------+
| 2008-09-05 12:32:27 |   0 |    2 |
| 2008-09-05 12:32:26 |   1 |    3 |
| 2008-09-05 12:32:25 |   2 |    4 |
| 2008-09-05 12:32:24 |   3 |    5 |
| 2008-09-05 12:32:23 |   4 |    6 |
| 2008-09-05 12:32:22 |   5 |    7 |
| 2008-09-05 12:32:21 |   6 |    8 |
| 2008-09-05 12:32:20 |   7 |    9 |
| 2008-09-05 12:32:19 |   8 |   10 |
| 2008-09-05 12:32:18 |   9 |    0 |
+---------------------+-----+------+
10 rows in set (0.00 sec)

mysql> EXPLAIN SELECT time FROM broken WHERE id2 = 4 ORDER BY time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: broken
         type: ref
possible_keys: id2_j_id1
          key: id2_j_id1
      key_len: 4
          ref: const
         rows: 1
        Extra: Using where; Using index; Using filesort
1 row in set (0.00 sec)
[10 Sep 2008 8:57] Valeriy Kravchuk
Verified with recent 5.1.30 from bzr:

openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 202
Server version: 5.1.30-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> EXPLAIN SELECT * FROM a, b WHERE b.c='a45' AND a.id=b.id ORDER BY b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: index
possible_keys: PRIMARY,c
          key: time
      key_len: 4
          ref: NULL
         rows: 100
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.04 sec)

mysql> EXPLAIN SELECT * FROM a, b USE INDEX (c) WHERE b.c='a45' AND a.id=b.id ORDER BY
    -> b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: c
          key: c
      key_len: 32
          ref: const
         rows: 1
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

For MyISAM it works as expected:

mysql> alter table a engine=MyISAM;
Query OK, 100 rows affected (0.15 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql> alter table b engine=MyISAM;
Query OK, 100 rows affected (0.06 sec)
Records: 100  Duplicates: 0  Warnings: 0

mysql> EXPLAIN SELECT * FROM a, b WHERE b.c='a45' AND a.id=b.id ORDER BY b.time\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b
         type: ref
possible_keys: PRIMARY,c
          key: c
      key_len: 32
          ref: const
         rows: 1
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.b.id
         rows: 1
        Extra: Using index
2 rows in set (0.01 sec)

So, it may be related to InnoDB's way of reporting statistics also...
[26 May 2009 21:31] Sheeri Cabral
I have found this bug on MySQL 5.0.45 and MySQL 5.1.31, in 2 completely different.  This is a pretty serious bug that can be fixed by using an index hint, which is what we used on the client that had the 5.0.45 system.  We are running into it again today with the client that is on 5.1.31, and cannot change the query to use an index hint as it is a third-party application.

The only change we made is that we defragmented the table.  We can't un-OPTIMIZE the table, but we have confirmed that it's the defragmentation that caused the problem, using backups.

On the fragmented table:

mysql> EXPLAIN SELECT main.* FROM Attachments main WHERE (main.Content IS NOT NULL AND main.Content != '') AND (main.Parent = '1213209') AND (main.ContentType = 'text/plain') ORDER BY main.id ASC\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: main
         type: ref
possible_keys: Attachments3
          key: Attachments3
      key_len: 4
          ref: const
         rows: 2
        Extra: Using where; Using filesort

On the defragmented table:
mysql>  EXPLAIN SELECT main.* FROM Attachments main WHERE (main.Content IS NOT NULL AND main.Content != '') AND (main.Parent = '1213209') AND (main.ContentType = 'text/plain') ORDER BY main.id ASC\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: main
         type: index
possible_keys: Attachments3
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 2694273
        Extra: Using where
1 row in set (0.09 sec)

both tables have the following keys:

  PRIMARY KEY  (`id`),
  KEY `Attachments2` (`TransactionId`),
  KEY `Attachments3` (`Parent`,`TransactionId`)

and both tables are InnoDB.  Both tables have the same data, 1185276 rows (that's with count(*), obviously the optimizer above has the wrong information, though we've used ANALYZE TABLE many times to try to help the optimizer).

I think the severity of this bug should be increased, because *anyone* can fall victim to it at any time, simply by defragmenting their table.
[26 May 2009 21:35] Sheeri Cabral
Here's another try on the fragmented system after ANALYZE TABLE was done, with the number of rows being slightly more accurate.

mysql> explain SELECT main.* FROM Attachments main  WHERE (main.Content IS NOT NULL AND main.Content != '') AND (main.Parent = '1208717') AND (main.ContentType = 'text/plain')  ORDER BY main.id ASC\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: main
         type: index
possible_keys: Attachments3
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 1826130
        Extra: Using where
1 row in set (3 min 52.66 sec)
[27 May 2009 13:08] Sheeri Cabral
We eventually found bug 37680, http://bugs.mysql.com/bug.php?id=37680 -- converting the table to MyISAM, as suggested there, fixed the optimizer problem.

This is definitely a serious problem, and due to the fact that converting to MyISAM fixed the problem, it may very well be due to poor InnoDB statistics.  This is still a severe issue, as it can occur any time, just by defragmentation.
[30 Oct 2009 0:52] James Day
This is a duplicate of bug #45828 that is fixed in 5.1.37, 5.4.2 .