Bug #78935 | Explain output differs between UPDATE and SELECT | ||
---|---|---|---|
Submitted: | 23 Oct 2015 6:10 | Modified: | 9 Nov 2015 1:15 |
Reporter: | monty solomon | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.6.25 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[23 Oct 2015 6:10]
monty solomon
[23 Oct 2015 13:42]
MySQL Verification Team
Hi Monty, Thank you for your bug report. In order to repeat the behavior, we need the table, which means CREATE TABLE statements , plus all data that you had when you issued those two EXPLAIN statements. Most likely, this is not a bug. UPDATE has to differ from SELECT in its plan of execution, in this case. As you can see for yourself, plan for UPDATE uses the index on column which is changed itself. Hence, a temporary table has to be created first, which changes the cost of using one index over another. We will confirm that once we have your data ...
[23 Oct 2015 19:43]
monty solomon
The EXPLAIN output for the UPDATE shows that it is using the PRIMARY KEY, not the index on the column being changed. The PRIMARY KEY is on the ID column. Here is the EXPLAIN output for the SELECT when I change it to select the same column as the one being updated. The chosen key is the primary_password index for both variations of SELECT. I expected the primary_password index to be used for the UPDATE. mysql> EXPLAIN SELECT primary_password FROM users WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: range possible_keys: password,primary_password key: primary_password key_len: 223 ref: NULL rows: 1057 Extra: Using where; Using index 1 row in set (0.00 sec)
[23 Oct 2015 19:44]
monty solomon
In addition, the EXPLAIN output for UPDATE did not mention a temporary table.
[23 Oct 2015 19:47]
monty solomon
After the UPDATE completes the EXPLAIN output for UPDATE changes to reference the primary_password index and the temporary table. EXPLAIN UPDATE users SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: range possible_keys: password,primary_password key: primary_password key_len: 223 ref: const,const rows: 1057 Extra: Using where; Using temporary 1 row in set (0.00 sec)
[23 Oct 2015 20:04]
monty solomon
After dropping the index and adding it again, the EXPLAIN output for the UPDATE reverts to using the PRIMARY KEY. The EXPLAIN outputs for the SELECT statements use the primary_password index. mysql> EXPLAIN UPDATE users SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: index possible_keys: password,primary_password key: PRIMARY key_len: 4 ref: NULL rows: 93081 Extra: Using where 1 row in set (0.00 sec) mysql> EXPLAIN SELECT * FROM users WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: ref possible_keys: password,primary_password key: primary_password key_len: 103 ref: const rows: 38136 Extra: Using index condition 1 row in set (0.00 sec) mysql> EXPLAIN SELECT primary_password FROM users WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: range possible_keys: password,primary_password key: primary_password key_len: 134 ref: NULL rows: 38136 Extra: Using where; Using index 1 row in set (0.00 sec)
[26 Oct 2015 14:33]
MySQL Verification Team
The optimizer does not use the index that is updated, exactly because it is updated. That is sometimes the sufficient reason to make it faster, as the temporary table is avoided !!!! You could get the optimizer trace, although this might not be enough .... You can get it this way: SET optimizer_trace="enabled=on"; SELECT ...; # your query here SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; # possibly more queries... # When done with tracing, disable it: SET optimizer_trace="enabled=off"; You should know that, at the end of the day, data are essential in order to make a final judgement. In spite of the schema or optimizer's trace.
[26 Oct 2015 16:27]
MySQL Verification Team
We must be able to repeat the bug, so without data, we can not do more work on this bug.
[28 Oct 2015 2:29]
monty solomon
mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G *************************** 1. row *************************** QUERY: EXPLAIN UPDATE users SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL TRACE: { "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" }, { "transformation": "constant_propagation", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" }, { "transformation": "trivial_condition_removal", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" } ] } }, { "table": "`users`", "range_analysis": { "table_scan": { "rows": 101226, "cost": 22776 }, "potential_range_indices": [ { "index": "PRIMARY", "usable": true, "key_parts": [ "ID" ] }, { "index": "uniq_email", "usable": true, "key_parts": [ "email" ] }, { "index": "uniq_userName", "usable": true, "key_parts": [ "userName" ] }, { "index": "guid", "usable": true, "key_parts": [ "guid", "ID" ] }, { "index": "password", "usable": true, "key_parts": [ "password", "salt", "primary_password", "ID" ] }, { "index": "primary_password", "usable": true, "key_parts": [ "primary_password", "salt", "password", "ID" ] } ], "setup_range_conditions": [ ], "group_index_range": { "chosen": false, "cause": "no_join" }, "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "password", "ranges": [ "password < ", " < password" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 50614, "cost": 60739, "chosen": false, "cause": "cost" }, { "index": "primary_password", "ranges": [ "NULL <= primary_password <= NULL AND NULL < salt" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 39348, "cost": 47219, "chosen": false, "cause": "cost" } ], "analyzing_roworder_intersect": { "usable": false, "cause": "too_few_roworder_scans" } } } } ] } MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0 INSUFFICIENT_PRIVILEGES: 0 1 row in set (0.00 sec)
[29 Oct 2015 14:08]
MySQL Verification Team
Thanks for the trace. We do not know whether it will be enough, but you should also send us the trace output for the UPDATE ....
[2 Nov 2015 15:28]
MySQL Verification Team
Hi Mr. Solomon, I have studied this case carefully again. I made my diagnosis and shared it with our Developers from the Optimizer team. They agreed 100 % with me. The explanation is quite simple. With UPDATE, when the optimizer has to choose between the index with columns that could be possibly changed with the UPDATE and the second best index that is created over columns that will not be updated, then this second best index will ALWAYS be chosen. The reason is very strong. When columns that are updated belong to the index by which a search would be made, then a temporary table is created and filled with current values of these columns (or this column). The cost of creation and insertion of data on disk is so high that it offsets all differences in index search costs. This is final diagnosis, as confirmed by our Optimizer team. This closes this bug as "Not a Bug" and finishes communication on it.
[9 Nov 2015 0:51]
monty solomon
The optimizer does not always choose the second-best index with columns that will not be updated. For example, if I add a LIMIT to the UPDATE in the EXPLAIN it selects the index with columns that could be possibly changed. mysql> EXPLAIN UPDATE users SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL LIMIT 92000\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: users type: range possible_keys: password,primary_password key: primary_password key_len: 134 ref: const,const rows: 38198 Extra: Using where; Using temporary 1 row in set (0.00 sec) mysql> SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G *************************** 1. row *************************** QUERY: EXPLAIN UPDATE users SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL LIMIT 92000 TRACE: { "steps": [ { "condition_processing": { "condition": "WHERE", "original_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))", "steps": [ { "transformation": "equality_propagation", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" }, { "transformation": "constant_propagation", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" }, { "transformation": "trivial_condition_removal", "resulting_condition": "(isnull(`users`.`primary_password`) and (`users`.`password` <> '') and (`users`.`salt` is not null))" } ] } }, { "table": "`users`", "range_analysis": { "table_scan": { "rows": 92274, "cost": 110731 }, "potential_range_indices": [ { "index": "PRIMARY", "usable": true, "key_parts": [ "ID" ] }, { "index": "uniq_email", "usable": true, "key_parts": [ "email" ] }, { "index": "uniq_userName", "usable": true, "key_parts": [ "userName" ] }, { "index": "guid", "usable": true, "key_parts": [ "guid", "ID" ] }, { "index": "password", "usable": true, "key_parts": [ "password", "salt", "primary_password", "ID" ] }, { "index": "primary_password", "usable": true, "key_parts": [ "primary_password", "salt", "password", "ID" ] } ], "setup_range_conditions": [ ], "group_index_range": { "chosen": false, "cause": "no_join" }, "analyzing_range_alternatives": { "range_scan_alternatives": [ { "index": "password", "ranges": [ "password < ", " < password" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 46138, "cost": 55368, "chosen": true }, { "index": "primary_password", "ranges": [ "NULL <= primary_password <= NULL AND NULL < salt" ], "index_dives_for_eq_ranges": true, "rowid_ordered": false, "using_mrr": false, "index_only": false, "rows": 38198, "cost": 45839, "chosen": true } ], "analyzing_roworder_intersect": { "usable": false, "cause": "too_few_roworder_scans" } }, "chosen_range_access_summary": { "range_access_plan": { "type": "range_scan", "index": "primary_password", "rows": 38198, "ranges": [ "NULL <= primary_password <= NULL AND NULL < salt" ] }, "rows_for_plan": 38198, "cost_for_plan": 45839, "chosen": true } } } ] } MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0 INSUFFICIENT_PRIVILEGES: 0 1 row in set (0.01 sec)
[9 Nov 2015 1:15]
monty solomon
I found the MySQL Server blog article that states, in part, "query optimization of single-table UPDATE/DELETE statements is more limited than for SELECT statements" and that may explain the difference. [http://mysqlserverteam.com/multi-table-trick/] If I follow the suggestion there to add another table to the UPDATE then the optimizer chooses the index I expected it to use. mysql> EXPLAIN UPDATE users, (SELECT 1) dummy SET primary_password = CONCAT('test', 'explain', password) WHERE primary_password IS NULL AND password != '' AND salt IS NOT NULL\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: <derived2> type: system possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 1 Extra: NULL *************************** 2. row *************************** id: 1 select_type: PRIMARY table: users type: ref possible_keys: password,primary_password key: primary_password key_len: 103 ref: const rows: 38198 Extra: Using where *************************** 3. row *************************** id: 2 select_type: DERIVED table: NULL type: NULL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: No tables used 3 rows in set (0.01 sec)
[9 Nov 2015 1:18]
monty solomon
trace output for the multi-table update explain
Attachment: trace.txt (text/plain), 9.24 KiB.