Bug #32272 innodb does sequential scan even when index should be used
Submitted: 12 Nov 2007 10:33 Modified: 9 Sep 2008 6:04
Reporter: River Tarnell Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:5.0.42, 5.0.51 OS:Solaris (Solaris 10 U4 patch 120012-14)
Assigned to: CPU Architecture:Any

[12 Nov 2007 10:33] River Tarnell
Description:
When a SELECT * or DELETE is done from an InnODB table with an index, the index is not used, even though it should be.  This results in serious performance issues for our application.

example 1:

root@localhost:enwiki> show status like 'handler_read_%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| Handler_read_first    | 17       |
| Handler_read_key      | 44       |
| Handler_read_next     | 37195625 |
| Handler_read_prev     | 0        |
| Handler_read_rnd      | 0        |
| Handler_read_rnd_next | 7977260  |
+-----------------------+----------+
6 rows in set (0.18 sec)

root@localhost:enwiki> delete FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
Query OK, 0 rows affected (29.59 sec)

root@localhost:enwiki> show status like 'handler_read_%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| Handler_read_first    | 18       |
| Handler_read_key      | 46       |
| Handler_read_next     | 37195625 |
| Handler_read_prev     | 0        |
| Handler_read_rnd      | 0        |
| Handler_read_rnd_next | 14771813 |
+-----------------------+----------+
6 rows in set (0.18 sec)

example 2:

root@localhost:enwiki> explain SELECT COUNT(*) FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows    | Extra                    |
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | recentchanges | range | rc_timestamp  | rc_timestamp | 16      | NULL | 1589942 | Using where; Using index |
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)

root@localhost:enwiki> explain SELECT * FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table         | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | recentchanges | ALL  | rc_timestamp  | NULL | NULL    | NULL | 8534030 | Using where |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

root@localhost:enwiki> SELECT * FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
Empty set (26.39 sec)

root@localhost:enwiki> SELECT COUNT(*) FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.15 sec)

How to repeat:
Unknown.

Suggested fix:
Unknown.
[12 Nov 2007 10:43] Domas Mituzas
This seems to be caused by purge lag giving skewed statistics and optimizer not understanding that :) 

So, it needs either FORCE INDEX support on DMLs or extracting better statistics from engine.
[27 Nov 2007 23:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38659

ChangeSet@1.2656, 2007-11-28 02:43:46+03:00, sergefp@mysql.com +3 -0
  BUG#32272: partition crash 1: enum column
  The bug was that for ordered index scans, ha_partition::index_init() did
  not put index columns into table->read_set if the underlying storage 
  engine did not have HA_PARTIAL_COLUMN_READ flag. 
  This was causing assertion failure when handle_ordered_index_scan() tried
  to sort the records according to index order.
  
  Fixed by making ha_partition::index_init() put index columns into table->read_set
  for all ordered scans.
[28 Nov 2007 0:00] Sergey Petrunya
Please disregard the above commit, it was intended for another bug
[8 Dec 2007 19:35] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.51 or 5.0.52. In case of the same problem, please, upload table's data to our FTP server. or, at least, send the results of:

SHOW CREATE TABLE recentchanges;
[17 Dec 2007 8:24] River Tarnell
I upgraded to 5.0.51, and the problem is still present.  The innodb data is too large to upload (about 500GB).

CREATE TABLE `recentchanges` (
  `rc_id` int(8) NOT NULL auto_increment,
  `rc_timestamp` varchar(14) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_cur_time` varchar(14) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_user` int(10) unsigned NOT NULL default '0',
  `rc_user_text` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_namespace` int(11) NOT NULL default '0',
  `rc_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_comment` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_minor` tinyint(3) unsigned NOT NULL default '0',
  `rc_bot` tinyint(3) unsigned NOT NULL default '0',
  `rc_new` tinyint(3) unsigned NOT NULL default '0',
  `rc_cur_id` int(10) unsigned NOT NULL default '0',
  `rc_this_oldid` int(10) unsigned NOT NULL default '0',
  `rc_last_oldid` int(10) unsigned NOT NULL default '0',
  `rc_type` tinyint(3) unsigned NOT NULL default '0',
  `rc_moved_to_ns` tinyint(3) unsigned NOT NULL default '0',
  `rc_moved_to_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_patrolled` tinyint(3) unsigned NOT NULL default '0',
  `rc_ip` varchar(15) NOT NULL default '',
  `rc_old_len` int(10) default NULL,
  `rc_new_len` int(10) default NULL,
  `rc_deleted` tinyint(1) unsigned NOT NULL default '0',
  `rc_logid` int(10) unsigned NOT NULL default '0',
  `rc_log_type` varchar(255) character set latin1 collate latin1_bin default NULL,
  `rc_log_action` varchar(255) character set latin1 collate latin1_bin default NULL,
  `rc_params` blob NOT NULL,
  PRIMARY KEY  (`rc_id`),
  KEY `rc_timestamp` (`rc_timestamp`),
  KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
  KEY `rc_cur_id` (`rc_cur_id`),
  KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
  KEY `rc_ip` (`rc_ip`),
  KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
  KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=181654467 DEFAULT CHARSET=latin1
[30 Dec 2007 20:17] Valeriy Kravchuk
Please, check how long the following statement will run (and if it will use the index):

SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
[31 Dec 2007 4:41] River Tarnell
root@localhost:enwiki> SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
Empty set (3.55 sec)

root@localhost:enwiki> explain SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows | Extra       |
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
|  1 | SIMPLE      | recentchanges | range | rc_timestamp  | rc_timestamp | 16      | NULL |    1 | Using where | 
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
1 row in set (0.00 sec)

root@localhost:enwiki>
[1 Jan 2008 18:14] Valeriy Kravchuk
Please, run 

ANALYZE TABLE recentchanges;

and check if index will be used after that for SELECT * ...
[2 Feb 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[7 Apr 2008 21:39] River Tarnell
ANALYZE TABLE does not fix the problem.  However, after OPTIMIZE TABLE, the optimiser behaves correctly.
[8 Apr 2008 7:31] Valeriy Kravchuk
So, do you still think we have a bug here? Looks like optimizer just need proper up-to-date statistics.
[8 Apr 2008 10:54] River Tarnell
If the optimiser is not able to extract correct statistics from the table, then I think that is the bug.  Otherwise I must prevent all activity on the table for ~40 minutes every day while I run OPTIMIZE TABLE.  Surely that's not how it's meant to work?
[29 Apr 2008 5:37] James Day
River, try a between range so that using the index will appear to be more selective. If that fails or as well as that try using ORDER BY rc_timestamp, rc_id to see if that gives the optimiser a greater hint. It's the wrong order by (should be by rc_id for speed of deleting) but that shouldn't matter for recentchanges where inserts are in rc_timestamp order anyway.

If still no progress then getting the SHOW INDEX output and select count(*) for the table can be used to try to work out the apparent selectivity and why the optimiser thinks that enough rows will be involved to make a table scan faster than using the index.

Also check to see if it's safe to delete with rc_id as the range specifier instead of rc_timestamp. If it is then you can get the rc_id for the start and end of the range and delete for that rc_id range. That may persuade the optimiser to do what is desired.
[9 Aug 2008 6:04] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.67, and inform about the results.
[9 Sep 2008 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".