Bug #87021 | wrong index was chosen when different condition on update | ||
---|---|---|---|
Submitted: | 12 Jul 2017 3:50 | Modified: | 31 Jul 2017 14:20 |
Reporter: | Ye Jinrong | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.7.18 | OS: | CentOS |
Assigned to: | MySQL Verification Team | CPU Architecture: | Any |
[12 Jul 2017 3:50]
Ye Jinrong
[27 Jul 2017 5:10]
MySQL Verification Team
Hi, statistic for a table with so few rows cannot give optimizer any data that can be of use to calculate the plan. For that few rows difference between any of the plans, including a full table scan, is so negligible that there is no "wrong" plan. If you have a sample with larger dataset where you can reproduce a bug please let us know. kind regards Bogdan Kecman
[27 Jul 2017 6:27]
Ye Jinrong
let's look another table CREATE TABLE `sid` ( `id` int(10) unsigned NOT NULL DEFAULT '0', `name` varchar(50) NOT NULL DEFAULT '', `aid` int(10) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`aid`), KEY `id` (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8 mysql> show index from sid; +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | sid | 0 | PRIMARY | 1 | aid | A | 100063 | NULL | NULL | | BTREE | | | | sid | 1 | id | 1 | id | A | 95810 | NULL | NULL | | BTREE | | | +-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ session1: mysql> explain update sid set name = 'afasdfads' where id >= 94773\G *************************** 1. row *************************** id: 1 select_type: UPDATE table: sid partitions: NULL type: index possible_keys: id key: PRIMARY key_len: 4 ref: NULL rows: 100063 filtered: 100.00 Extra: Using where mysql> begin; update sid set name = 'afasdfads' where id >= 94773; and session2: mysql> begin; select * from sid where id = 9490 for update; -- was blocked until session 1 commit or rollback r+------+-------------------------------------------+-------+ | id | name | aid | +------+-------------------------------------------+-------+ | 9490 | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | 48197 | +------+-------------------------------------------+-------+ 1 row in set (27.71 sec) -- waiting some seconds after session1 commit or rollback look lock wait status mysql> select * from sys.innodb_lock_waits\G *************************** 1. row *************************** wait_started: 2017-07-27 14:22:01 wait_age: 00:00:23 wait_age_secs: 23 locked_table: `yejr`.`sid` locked_index: PRIMARY locked_type: RECORD waiting_trx_id: 266756 waiting_trx_started: 2017-07-27 14:22:01 waiting_trx_age: 00:00:23 waiting_trx_rows_locked: 2 waiting_trx_rows_modified: 0 waiting_pid: 94 waiting_query: select * from sid where id = 9490 for update waiting_lock_id: 266756:255:198:60 waiting_lock_mode: X blocking_trx_id: 266753 blocking_pid: 93 blocking_query: select * from sys.innodb_lock_waits blocking_lock_id: 266753:255:198:60 blocking_lock_mode: X blocking_trx_started: 2017-07-27 14:19:21 blocking_trx_age: 00:03:03 blocking_trx_rows_locked: 100346 blocking_trx_rows_modified: 90784 sql_kill_blocking_query: KILL QUERY 93 sql_kill_blocking_connection: KILL 93
[28 Jul 2017 12:25]
MySQL Verification Team
Hi, Can you provide - show table status - explain of that query - does the explain of that query changes after you do analyze table all best Bogdan
[31 Jul 2017 2:27]
Ye Jinrong
1. table status; mysql>show table status like 'sid'\G *************************** 1. row *************************** Name: sid Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 100063 Avg_row_length: 68 Data_length: 6832128 Max_data_length: 0 Index_length: 1589248 Data_free: 4194304 Auto_increment: 100001 Create_time: 2016-11-26 22:35:02 Update_time: NULL Check_time: NULL Collation: utf8_general_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) 2. the two explain sql1: mysql> explain update sid set name = 'afasdfads' where id >= 94773\G *************************** 1. row *************************** id: 1 select_type: UPDATE table: sid partitions: NULL type: index possible_keys: id key: PRIMARY key_len: 4 ref: NULL rows: 100063 filtered: 100.00 Extra: Using where 1 row in set (0.01 sec) sql2: mysql> explain select * from sid where id = 9490 for update\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: sid partitions: NULL type: ref possible_keys: id key: id key_len: 4 ref: const rows: 1 filtered: 100.00 Extra: NULL 1 row in set, 1 warning (0.00 sec) 3. new explain after analyze table mysql> analyze table sid; +----------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------+---------+----------+----------+ | yejr.sid | analyze | status | OK | +----------+---------+----------+----------+ 1 row in set (0.03 sec) mysql> explain update sid set name = 'afasdfads' where id >= 94773\G *************************** 1. row *************************** id: 1 select_type: UPDATE table: sid partitions: NULL type: index possible_keys: id key: PRIMARY key_len: 4 ref: NULL rows: 99388 filtered: 100.00 Extra: Using where 1 row in set (0.00 sec) mysql>explain select * from sid where id = 9490 for update\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: sid partitions: NULL type: ref possible_keys: id key: id key_len: 4 ref: const rows: 1 filtered: 100.00 Extra: NULL 1 row in set, 1 warning (0.00 sec)
[31 Jul 2017 13:25]
MySQL Verification Team
Hi, This still does not look like a bug. What is the output of SELECT COUNT(*) FROM sid WHERE id >= 94773; If it's more then half of the 100063 (total no of raws) it's expected to do a full table scan (faster) and that will lock the whole table (like every full table scan will do). It is by design. all best Bogdan
[31 Jul 2017 13:41]
Ye Jinrong
Hi, mysql>SELECT COUNT(*) FROM sid WHERE id >= 94773; +----------+ | COUNT(*) | +----------+ | 90784 | +----------+ 1 row in set (0.03 sec) mysql>select count(*) from sid; +----------+ | count(*) | +----------+ | 100000 | +----------+ 1 row in set (0.04 sec) I find the following in document: Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size. from: https://dev.mysql.com/doc/refman/5.7/en/where-optimization.html It is reasonable to do a full table scan when scan more than 20%-30% of the table, but when update a table, i think mysql can not lock all rows, it is not so reasonable :( At last, as you says: If it's more then half of the 100063 (total no of raws) it's expected to do a full table scan (faster) and that will lock the whole table (like every full table scan will do). It is by design. My question is, what is the percentage, 50% or 20%-30%, which shoud be more than accurate? thanks much :)
[31 Jul 2017 13:57]
MySQL Verification Team
> It is reasonable to do a full table scan when scan more than 20%-30% > of the table, but when update a table, i think mysql can not lock all > rows, it is not so reasonable :( You cannot do a full table scan without locking full table. > My question is, what is the percentage, 50% or 20%-30%, > which shoud be more than accurate? There is no "set" percentage. Basic theory say if you need to get more then 30% it's cheaper to scan full table then to jump trough index. This is not the only factor optimizer takes into account so there is no fixed value where the optimizer will decide to do one or the other thing, and the way it works changes trough versions (it's getting better and smarter). all best Bogdan
[31 Jul 2017 14:20]
Ye Jinrong
Hi Bogdan, thank you much once more :)