| Bug #70245 | incorrect costing for range scan causes optimizer to choose incorrect index | ||
|---|---|---|---|
| Submitted: | 4 Sep 2013 21:45 | Modified: | 5 Sep 2013 16:57 |
| Reporter: | trevor price | Email Updates: | |
| Status: | Not a Bug | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.6.13 | OS: | Linux (2.6.16.33-xenU x86_64) |
| Assigned to: | Jørgen Løland | CPU Architecture: | Any |
| Tags: | optimizer cost range_scan | ||
[4 Sep 2013 21:48]
trevor price
Query trace on 5.6.13
Attachment: querytrace.rtf (text/rtf), 15.75 KiB.
[4 Sep 2013 22:12]
trevor price
a zip file has been loaded that contains data allowing replication of the issue. the data file is bug-data-70245.zip
[4 Sep 2013 22:24]
trevor price
further demonstrate the wrong index is selected.
Server version: 5.6.13-log MySQL Community Server (GPL)
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> flush status ;
Query OK, 0 rows affected (0.00 sec)
mysql> select audio_media_owner_id
-> from audio_media_play
-> where station_id = 3555 and audio_media_owner_type = 'STATION_CLIP'
-> and audio_media_owner_id in (48901, 48900, 48903, 48902, 48897, 48893, 48892, 48896, 48895, 48899, 48898, 48894)
-> and start_server_unix_timestamp >= 1378227067247 order by start_server_unix_timestamp desc
-> \G
----snipped------
mysql> show session status like 'hand%' ;
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 12 |
| Handler_read_last | 0 |
| Handler_read_next | 11088 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
now , forcing 5.6.13 to use the correct index
mysql> flush status ;
Query OK, 0 rows affected (0.00 sec)
select audio_media_owner_id from audio_media_play use index ( station_id_start_server_unix_timestamp) where station_id = 3555 and audio_media_owner_type = 'STATION_CLIP' and audio_media_owner_id in (48901, 48900, 48903, 48902, 48897, 48893, 48892, 48896, 48895, 48899, 48898, 48894) and start_server_unix_timestamp >= 1378227067247 order by start_server_unix_timestamp desc ;
----snipped----
mysql> show session status like 'hand%' ;
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 2 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 0 |
| Handler_read_prev | 1005 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
so that's 1,000 read_prev for the correct index and 11,000 read_next for the index chosen by 5.6.13
[5 Sep 2013 8:58]
Jørgen Løland
Hi Trevor, MySQL 5.6 has a new variable called eq_range_index_dive_limit. It determines whether or not to do a. expensive but accurate statistics gathering, or b. cheap but less accurate statistics gathering for conditions that can be translated to equality predicates (e.g. IN). The value is the maximum number of ranges there can be before we switch from a. to b. The default value is 10, and in your query you have 12 ranges, which in turn means that method b. is chosen. If you set eq_range_index_dive_limit to something greater than 12 or 0 (a. is always used) you should get the <5.6 behavior back. I'll close the bug - please reopen it if the suggestion above does not work for you.
[5 Sep 2013 16:57]
trevor price
yes, changing the variable specified changes the optimizer plan. Thanks.

Description: The optimizer is producing a very inaccurate index cost, which is causing the wrong index to be selected. I have encountered this same issue multiple times for range scans. Its as if the cost is being calculated AS IF it was not a range scan but an equals to. In this case the range is an in clause, not > or < comparison. This is a single table query. Here is the table Create Table: CREATE TABLE `audio_media_play` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `play_hash` varchar(255) NOT NULL, `streaming` enum('1','0') NOT NULL DEFAULT '1', `audio_media_id` bigint(20) unsigned DEFAULT NULL COMMENT 'audio_media.id', `audio_media_owner_type` enum('track','track_override','user','station_clip','ad') DEFAULT NULL, `audio_media_owner_id` bigint(20) unsigned DEFAULT NULL COMMENT 'track.id or jelli_clip.id or user.id, depending on value of audio_media_type', `station_id` bigint(20) unsigned NOT NULL, `lead_in` bigint(20) unsigned NOT NULL, `lead_out` bigint(20) unsigned NOT NULL, `start_server_time` datetime DEFAULT NULL, `start_device_time` datetime DEFAULT NULL, `stop_server_time` datetime DEFAULT NULL COMMENT 'time that the server was notified of play stop', `stop_device_time` datetime DEFAULT NULL, `percent_played` tinyint(4) DEFAULT NULL, `last_modified_server_time` datetime DEFAULT NULL, `version` int(10) unsigned NOT NULL DEFAULT '1', `start_server_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `start_device_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `stop_server_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `stop_device_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `last_modified_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), UNIQUE KEY `play_hash_idx` (`play_hash`), KEY `audio_media_id_idx` (`audio_media_id`), KEY `streaming_idx` (`streaming`), KEY `audio_media_owner_id_idx` (`audio_media_owner_id`), KEY `station_id_start_server_time` (`station_id`,`start_server_time`), KEY `station_id_start_server_unix_timestamp` (`station_id`,`start_server_unix_timestamp`), KEY `last_modified_server_time_idx` (`last_modified_server_time`), KEY `last_modified_timestamp_idx` (`last_modified_timestamp`) ) ENGINE=InnoDB ; here is the query select audio_media_owner_id from audio_media_play where station_id = 3555 and audio_media_owner_type = 'STATION_CLIP' and audio_media_owner_id in (48901, 48900, 48903, 48902, 48897, 48893, 48892, 48896, 48895,48899, 48898, 48894) and start_server_unix_timestamp >= 1378227067247 order by start_server_unix_timestamp desc Now the explain for 5.5.8 *************************** 1. row *************************** id: 1 select_type: SIMPLE table: audio_media_play type: range possible_keys: audio_media_owner_id_idx,station_id_start_server_time,station_id_start_server_unix_timestamp key: station_id_start_server_unix_timestamp key_len: 17 ref: NULL rows: 996 Extra: Using where And the explain for 5.6.13 *************************** 1. row *************************** id: 1 select_type: SIMPLE table: audio_media_play type: range possible_keys: audio_media_owner_id_idx,station_id_start_server_time,station_id_start_server_unix_timestamp key: audio_media_owner_id_idx key_len: 9 ref: NULL rows: 264 Extra: Using index condition; Using where; Using filesort Looking at the query trace in 5.6.13 it appears the cost for audio_media_owner_id_idx is too small. An analyze table was performed on the table. How to repeat: Create Table: CREATE TABLE `audio_media_play` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `play_hash` varchar(255) NOT NULL, `streaming` enum('1','0') NOT NULL DEFAULT '1', `audio_media_id` bigint(20) unsigned DEFAULT NULL COMMENT 'audio_media.id', `audio_media_owner_type` enum('track','track_override','user','station_clip','ad') DEFAULT NULL, `audio_media_owner_id` bigint(20) unsigned DEFAULT NULL COMMENT 'track.id or jelli_clip.id or user.id, depending on value of audio_media_type', `station_id` bigint(20) unsigned NOT NULL, `lead_in` bigint(20) unsigned NOT NULL, `lead_out` bigint(20) unsigned NOT NULL, `start_server_time` datetime DEFAULT NULL, `start_device_time` datetime DEFAULT NULL, `stop_server_time` datetime DEFAULT NULL COMMENT 'time that the server was notified of play stop', `stop_device_time` datetime DEFAULT NULL, `percent_played` tinyint(4) DEFAULT NULL, `last_modified_server_time` datetime DEFAULT NULL, `version` int(10) unsigned NOT NULL DEFAULT '1', `start_server_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `start_device_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `stop_server_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `stop_device_unix_timestamp` bigint(20) unsigned DEFAULT NULL, `last_modified_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), UNIQUE KEY `play_hash_idx` (`play_hash`), KEY `audio_media_id_idx` (`audio_media_id`), KEY `streaming_idx` (`streaming`), KEY `audio_media_owner_id_idx` (`audio_media_owner_id`), KEY `station_id_start_server_time` (`station_id`,`start_server_time`), KEY `station_id_start_server_unix_timestamp` (`station_id`,`start_server_unix_timestamp`), KEY `last_modified_server_time_idx` (`last_modified_server_time`), KEY `last_modified_timestamp_idx` (`last_modified_timestamp`) ) ENGINE=InnoDB ; here is the query select audio_media_owner_id from audio_media_play where station_id = 3555 and audio_media_owner_type = 'STATION_CLIP' and audio_media_owner_id in (48901, 48900, 48903, 48902, 48897, 48893, 48892, 48896, 48895,48899, 48898, 48894) and start_server_unix_timestamp >= 1378227067247 order by start_server_unix_timestamp desc