Bug #32272 | innodb does sequential scan even when index should be used | ||
---|---|---|---|
Submitted: | 12 Nov 2007 10:33 | Modified: | 9 Sep 2008 6:04 |
Reporter: | River Tarnell | Email Updates: | |
Status: | No Feedback | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.0.42, 5.0.51 | OS: | Solaris (Solaris 10 U4 patch 120012-14) |
Assigned to: | CPU Architecture: | Any |
[12 Nov 2007 10:33]
River Tarnell
[12 Nov 2007 10:43]
Domas Mituzas
This seems to be caused by purge lag giving skewed statistics and optimizer not understanding that :) So, it needs either FORCE INDEX support on DMLs or extracting better statistics from engine.
[27 Nov 2007 23:44]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/38659 ChangeSet@1.2656, 2007-11-28 02:43:46+03:00, sergefp@mysql.com +3 -0 BUG#32272: partition crash 1: enum column The bug was that for ordered index scans, ha_partition::index_init() did not put index columns into table->read_set if the underlying storage engine did not have HA_PARTIAL_COLUMN_READ flag. This was causing assertion failure when handle_ordered_index_scan() tried to sort the records according to index order. Fixed by making ha_partition::index_init() put index columns into table->read_set for all ordered scans.
[28 Nov 2007 0:00]
Sergey Petrunya
Please disregard the above commit, it was intended for another bug
[8 Dec 2007 19:35]
Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.51 or 5.0.52. In case of the same problem, please, upload table's data to our FTP server. or, at least, send the results of: SHOW CREATE TABLE recentchanges;
[17 Dec 2007 8:24]
River Tarnell
I upgraded to 5.0.51, and the problem is still present. The innodb data is too large to upload (about 500GB). CREATE TABLE `recentchanges` ( `rc_id` int(8) NOT NULL auto_increment, `rc_timestamp` varchar(14) character set latin1 collate latin1_bin NOT NULL default '', `rc_cur_time` varchar(14) character set latin1 collate latin1_bin NOT NULL default '', `rc_user` int(10) unsigned NOT NULL default '0', `rc_user_text` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `rc_namespace` int(11) NOT NULL default '0', `rc_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `rc_comment` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `rc_minor` tinyint(3) unsigned NOT NULL default '0', `rc_bot` tinyint(3) unsigned NOT NULL default '0', `rc_new` tinyint(3) unsigned NOT NULL default '0', `rc_cur_id` int(10) unsigned NOT NULL default '0', `rc_this_oldid` int(10) unsigned NOT NULL default '0', `rc_last_oldid` int(10) unsigned NOT NULL default '0', `rc_type` tinyint(3) unsigned NOT NULL default '0', `rc_moved_to_ns` tinyint(3) unsigned NOT NULL default '0', `rc_moved_to_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `rc_patrolled` tinyint(3) unsigned NOT NULL default '0', `rc_ip` varchar(15) NOT NULL default '', `rc_old_len` int(10) default NULL, `rc_new_len` int(10) default NULL, `rc_deleted` tinyint(1) unsigned NOT NULL default '0', `rc_logid` int(10) unsigned NOT NULL default '0', `rc_log_type` varchar(255) character set latin1 collate latin1_bin default NULL, `rc_log_action` varchar(255) character set latin1 collate latin1_bin default NULL, `rc_params` blob NOT NULL, PRIMARY KEY (`rc_id`), KEY `rc_timestamp` (`rc_timestamp`), KEY `rc_namespace_title` (`rc_namespace`,`rc_title`), KEY `rc_cur_id` (`rc_cur_id`), KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`), KEY `rc_ip` (`rc_ip`), KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`), KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`) ) ENGINE=InnoDB AUTO_INCREMENT=181654467 DEFAULT CHARSET=latin1
[30 Dec 2007 20:17]
Valeriy Kravchuk
Please, check how long the following statement will run (and if it will use the index): SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057';
[31 Dec 2007 4:41]
River Tarnell
root@localhost:enwiki> SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057'; Empty set (3.55 sec) root@localhost:enwiki> explain SELECT * FROM `recentchanges` FORCE INDEX (`rc_timestamp`) WHERE rc_timestamp < '20071012204057'; +----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+ | 1 | SIMPLE | recentchanges | range | rc_timestamp | rc_timestamp | 16 | NULL | 1 | Using where | +----+-------------+---------------+-------+---------------+--------------+---------+------+------+-------------+ 1 row in set (0.00 sec) root@localhost:enwiki>
[1 Jan 2008 18:14]
Valeriy Kravchuk
Please, run ANALYZE TABLE recentchanges; and check if index will be used after that for SELECT * ...
[2 Feb 2008 0:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[7 Apr 2008 21:39]
River Tarnell
ANALYZE TABLE does not fix the problem. However, after OPTIMIZE TABLE, the optimiser behaves correctly.
[8 Apr 2008 7:31]
Valeriy Kravchuk
So, do you still think we have a bug here? Looks like optimizer just need proper up-to-date statistics.
[8 Apr 2008 10:54]
River Tarnell
If the optimiser is not able to extract correct statistics from the table, then I think that is the bug. Otherwise I must prevent all activity on the table for ~40 minutes every day while I run OPTIMIZE TABLE. Surely that's not how it's meant to work?
[29 Apr 2008 5:37]
James Day
River, try a between range so that using the index will appear to be more selective. If that fails or as well as that try using ORDER BY rc_timestamp, rc_id to see if that gives the optimiser a greater hint. It's the wrong order by (should be by rc_id for speed of deleting) but that shouldn't matter for recentchanges where inserts are in rc_timestamp order anyway. If still no progress then getting the SHOW INDEX output and select count(*) for the table can be used to try to work out the apparent selectivity and why the optimiser thinks that enough rows will be involved to make a table scan faster than using the index. Also check to see if it's safe to delete with rc_id as the range specifier instead of rc_timestamp. If it is then you can get the rc_id for the start and end of the range and delete for that rc_id range. That may persuade the optimiser to do what is desired.
[9 Aug 2008 6:04]
Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.67, and inform about the results.
[9 Sep 2008 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".