MySQL Bugs: #32272: innodb does sequential scan even when index should be used

Bug #32272	innodb does sequential scan even when index should be used
Submitted:	12 Nov 2007 10:33	Modified:	9 Sep 2008 6:04
Reporter:	River Tarnell	Email Updates:
Status:	No Feedback	Impact on me:	None
Category:	MySQL Server: Optimizer	Severity:	S5 (Performance)
Version:	5.0.42, 5.0.51	OS:	Solaris (Solaris 10 U4 patch 120012-14)
Assigned to:		CPU Architecture:	Any

Description:
When a SELECT * or DELETE is done from an InnODB table with an index, the index is not used, even though it should be.  This results in serious performance issues for our application.

example 1:

root@localhost:enwiki> show status like 'handler_read_%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| Handler_read_first    | 17       |
| Handler_read_key      | 44       |
| Handler_read_next     | 37195625 |
| Handler_read_prev     | 0        |
| Handler_read_rnd      | 0        |
| Handler_read_rnd_next | 7977260  |
+-----------------------+----------+
6 rows in set (0.18 sec)

root@localhost:enwiki> delete FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
Query OK, 0 rows affected (29.59 sec)

root@localhost:enwiki> show status like 'handler_read_%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| Handler_read_first    | 18       |
| Handler_read_key      | 46       |
| Handler_read_next     | 37195625 |
| Handler_read_prev     | 0        |
| Handler_read_rnd      | 0        |
| Handler_read_rnd_next | 14771813 |
+-----------------------+----------+
6 rows in set (0.18 sec)

example 2:

root@localhost:enwiki> explain SELECT COUNT(*) FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows    | Extra                    |
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | recentchanges | range | rc_timestamp  | rc_timestamp | 16      | NULL | 1589942 | Using where; Using index |
+----+-------------+---------------+-------+---------------+--------------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)

root@localhost:enwiki> explain SELECT * FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table         | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | recentchanges | ALL  | rc_timestamp  | NULL | NULL    | NULL | 8534030 | Using where |
+----+-------------+---------------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

root@localhost:enwiki> SELECT * FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
Empty set (26.39 sec)

root@localhost:enwiki> SELECT COUNT(*) FROM `recentchanges`  WHERE rc_timestamp < '20071012204057';
+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+
1 row in set (0.15 sec)

How to repeat:
Unknown.

Suggested fix:
Unknown.

This seems to be caused by purge lag giving skewed statistics and optimizer not understanding that :) 

So, it needs either FORCE INDEX support on DMLs or extracting better statistics from engine.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/38659

ChangeSet@1.2656, 2007-11-28 02:43:46+03:00, sergefp@mysql.com +3 -0
  BUG#32272: partition crash 1: enum column
  The bug was that for ordered index scans, ha_partition::index_init() did
  not put index columns into table->read_set if the underlying storage 
  engine did not have HA_PARTIAL_COLUMN_READ flag. 
  This was causing assertion failure when handle_ordered_index_scan() tried
  to sort the records according to index order.
  
  Fixed by making ha_partition::index_init() put index columns into table->read_set
  for all ordered scans.

Please disregard the above commit, it was intended for another bug

Please, try to repeat with a newer version, 5.0.51 or 5.0.52. In case of the same problem, please, upload table's data to our FTP server. or, at least, send the results of:

SHOW CREATE TABLE recentchanges;

I upgraded to 5.0.51, and the problem is still present.  The innodb data is too large to upload (about 500GB).

CREATE TABLE `recentchanges` (
  `rc_id` int(8) NOT NULL auto_increment,
  `rc_timestamp` varchar(14) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_cur_time` varchar(14) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_user` int(10) unsigned NOT NULL default '0',
  `rc_user_text` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_namespace` int(11) NOT NULL default '0',
  `rc_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_comment` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_minor` tinyint(3) unsigned NOT NULL default '0',
  `rc_bot` tinyint(3) unsigned NOT NULL default '0',
  `rc_new` tinyint(3) unsigned NOT NULL default '0',
  `rc_cur_id` int(10) unsigned NOT NULL default '0',
  `rc_this_oldid` int(10) unsigned NOT NULL default '0',
  `rc_last_oldid` int(10) unsigned NOT NULL default '0',
  `rc_type` tinyint(3) unsigned NOT NULL default '0',
  `rc_moved_to_ns` tinyint(3) unsigned NOT NULL default '0',
  `rc_moved_to_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  `rc_patrolled` tinyint(3) unsigned NOT NULL default '0',
  `rc_ip` varchar(15) NOT NULL default '',
  `rc_old_len` int(10) default NULL,
  `rc_new_len` int(10) default NULL,
  `rc_deleted` tinyint(1) unsigned NOT NULL default '0',
  `rc_logid` int(10) unsigned NOT NULL default '0',
  `rc_log_type` varchar(255) character set latin1 collate latin1_bin default NULL,
  `rc_log_action` varchar(255) character set latin1 collate latin1_bin default NULL,
  `rc_params` blob NOT NULL,
  PRIMARY KEY  (`rc_id`),
  KEY `rc_timestamp` (`rc_timestamp`),
  KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
  KEY `rc_cur_id` (`rc_cur_id`),
  KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
  KEY `rc_ip` (`rc_ip`),
  KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
  KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=181654467 DEFAULT CHARSET=latin1

Please, check how long the following statement will run (and if it will use the index):

SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';

root@localhost:enwiki> SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
Empty set (3.55 sec)

root@localhost:enwiki> explain SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows | Extra       |
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
|  1 | SIMPLE      | recentchanges | range | rc_timestamp  | rc_timestamp | 16      | NULL |    1 | Using where | 
+----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+
1 row in set (0.00 sec)

root@localhost:enwiki>

Please, run 

ANALYZE TABLE recentchanges;

and check if index will be used after that for SELECT * ...

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

ANALYZE TABLE does not fix the problem.  However, after OPTIMIZE TABLE, the optimiser behaves correctly.

So, do you still think we have a bug here? Looks like optimizer just need proper up-to-date statistics.

If the optimiser is not able to extract correct statistics from the table, then I think that is the bug.  Otherwise I must prevent all activity on the table for ~40 minutes every day while I run OPTIMIZE TABLE.  Surely that's not how it's meant to work?

River, try a between range so that using the index will appear to be more selective. If that fails or as well as that try using ORDER BY rc_timestamp, rc_id to see if that gives the optimiser a greater hint. It's the wrong order by (should be by rc_id for speed of deleting) but that shouldn't matter for recentchanges where inserts are in rc_timestamp order anyway.

If still no progress then getting the SHOW INDEX output and select count(*) for the table can be used to try to work out the apparent selectivity and why the optimiser thinks that enough rows will be involved to make a table scan faster than using the index.

Also check to see if it's safe to delete with rc_id as the range specifier instead of rc_timestamp. If it is then you can get the rc_id for the start and end of the range and delete for that rc_id range. That may persuade the optimiser to do what is desired.

Please, try to repeat with a newer version, 5.0.67, and inform about the results.

No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".