Bug #37786 Optimizer does not count rows before range when choosing index
Submitted: 1 Jul 2008 21:48 Modified: 9 Jul 2008 3:31
Reporter: Kolbe Kegel Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.25 OS:Any
Assigned to: CPU Architecture:Any

[1 Jul 2008 21:48] Kolbe Kegel
Description:
The optimizer, when trying to avoid a filesort, seems to ignore the huge number of rows that need to be scanned through to get to the rows the query needs to read.

How to repeat:
I have a very large test case that can be used to reproduce this issue.

CREATE TABLE t1 (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `c1` int(11) DEFAULT NULL,
  `c2` decimal(6,1) DEFAULT NULL,
  `c3` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `k1` (`c2`),
  KEY `k2` (`c1`,`c3`)
) ENGINE=InnoDB AUTO_INCREMENT=2404046480 DEFAULT CHARSET=latin1

> explain SELECT * FROM t1 WHERE c1 in (37924) ORDER BY c2 limit 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: k2
          key: k1
      key_len: 5
          ref: NULL
         rows: 793
        Extra: Using where
1 row in set (0.00 sec)

After executing the SELECT, SHOW STATUS reports this:

| Handler_read_next                 | 36115744      |

Results of the EXPLAIN are quite different if the LIMIT is excluded.

mysql 5.1.23-rc (root) "test"> explain SELECT * FROM t1 WHERE c1 in (37924) ORDER BY c2 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ref
possible_keys: k2
          key: k2
      key_len: 5
          ref: const
         rows: 212280
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

FORCE INDEX *with* the LIMIT produces EXPLAIN output quite similar to the above (unsurprisingly), but much better Handler_read_next results:

mysql 5.1.23-rc (root) "test"> explain SELECT * FROM t1 force index (k2) WHERE c1 in (37924) ORDER BY c2 limit 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ref
possible_keys: k2
          key: k2
      key_len: 5
          ref: const
         rows: 212280
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

mysql 5.1.25-rc (root) "test"> SELECT * FROM t1 force index (k2) WHERE c1 in (37924) ORDER BY c2 limit 1;
...

| Handler_read_next | 168778 |

It may be possible to reproduce with a smaller test case, too. It's easy enough to reproduce the optimizer choosing the "wrong" index, but more difficult to get the distribution right to cause the enormous disconnect between "rows" in EXPLAIN output and Handler_read_next after the query is executed. I'm not sure what feature of the distribution in the large test (above) causes that particular behavior.

CREATE TABLE `i1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `a` tinyint(4) NOT NULL,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`),
  KEY `b` (`b`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=latin1

explain select * from i1 where a=100 order by b limit 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: i1
         type: index
possible_keys: a
          key: b
      key_len: 5
          ref: NULL
         rows: 99
        Extra: Using where

| Handler_read_next | 233   | 

Suggested fix:
In this example, the optimizer should account for the 30 million rows it needs to read to carry out this execution plan.
[1 Jul 2008 23:59] Kolbe Kegel
A workaround is to use FORCE INDEX for queries known to be affected by this problem.
[2 Jul 2008 20:30] Kolbe Kegel
Reproducible on both InnoDB & MyISAM.
[9 Jul 2008 3:31] Valeriy Kravchuk
Looks like a duplicate of bug #36259.