Bug #78822 | Materialized table from semijoin may change join order and lead to bad plan | ||
---|---|---|---|
Submitted: | 13 Oct 2015 17:15 | Modified: | 20 Oct 2016 8:41 |
Reporter: | Valeriy Kravchuk | Email Updates: | |
Status: | Won't fix | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 5.6, 5.6.27 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | materialization, semijoin |
[13 Oct 2015 17:15]
Valeriy Kravchuk
[13 Oct 2015 17:18]
Valeriy Kravchuk
Dump of the test database to load
Attachment: semijoin3.sql (application/octet-stream, text), 222.52 KiB.
[14 Oct 2015 5:48]
MySQL Verification Team
Hello Valeriy, Thank you for the bug report and test case. Observed this with 5.6.27 build. Thanks, Umesh
[14 Oct 2015 5:55]
MySQL Verification Team
// 5.6.27 [umshastr@hod03]/export/umesh/server/binaries/mysql-advanced-5.6.27: bin/mysql -uroot -S /tmp/mysql_ushastry.sock test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 3 Server version: 5.6.27-enterprise-commercial-advanced MySQL Enterprise Server - Advanced Edition (Commercial) Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> mysql> analyze table t1; +---------+---------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +---------+---------+----------+-----------------------------+ | test.t1 | analyze | status | Table is already up to date | +---------+---------+----------+-----------------------------+ 1 row in set (0.00 sec) mysql> analyze table t2; +---------+---------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +---------+---------+----------+-----------------------------+ | test.t2 | analyze | status | Table is already up to date | +---------+---------+----------+-----------------------------+ 1 row in set (0.00 sec) mysql> analyze table t3; +---------+---------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +---------+---------+----------+-----------------------------+ | test.t3 | analyze | status | Table is already up to date | +---------+---------+----------+-----------------------------+ 1 row in set (0.00 sec) mysql> analyze table t4; +---------+---------+----------+-----------------------------+ | Table | Op | Msg_type | Msg_text | +---------+---------+----------+-----------------------------+ | test.t4 | analyze | status | Table is already up to date | +---------+---------+----------+-----------------------------+ 1 row in set (0.00 sec) mysql> explain select t1.id, t3.data from t1 join t2 on t1.id = t2.other_id join t3 on t2.id = t3.other_id where t3.id in (select t4.data from t4 where t4.other_id = 100) and t1.other_id = 100 and t2.other_id > 997; +----+--------------+-------------+--------+------------------+----------+---------+------------------+------+---------------------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------+-------------+--------+------------------+----------+---------+------------------+------+---------------------------------------------------------------------------+ | 1 | SIMPLE | <subquery2> | ALL | NULL | NULL | NULL | NULL | NULL | Using where | | 1 | SIMPLE | t3 | eq_ref | PRIMARY,other_id | PRIMARY | 4 | <subquery2>.data | 1 | NULL | | 1 | SIMPLE | t2 | range | PRIMARY,other_id | other_id | 5 | NULL | 12 | Using index condition; Using where; Using join buffer (Block Nested Loop) | | 1 | SIMPLE | t1 | eq_ref | PRIMARY,other_id | PRIMARY | 4 | test.t2.other_id | 1 | Using where | | 2 | MATERIALIZED | t4 | ref | other_id | other_id | 5 | const | 24 | NULL | +----+--------------+-------------+--------+------------------+----------+---------+------------------+------+---------------------------------------------------------------------------+ 5 rows in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select t1.id, t3.data from t1 join t2 on t1.id = t2.other_id join t3 on t2.id = t3.other_id where t3.id in (select t4.data from t4 where t4.other_id = 100) and t1.other_id = 100 and t2.other_id > 997; Empty set (0.00 sec) mysql> show status like 'Handler%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 8 | | Handler_mrr_init | 0 | | Handler_prepare | 0 | | Handler_read_first | 0 | | Handler_read_key | 27 | | Handler_read_last | 0 | | Handler_read_next | 31 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 26 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_update | 0 | | Handler_write | 25 | +----------------------------+-------+ 18 rows in set (0.00 sec) mysql> set optimizer_switch='semijoin=off'; Query OK, 0 rows affected (0.00 sec) mysql> explain select t1.id, t3.data from t1 join t2 on t1.id = t2.other_id join t3 on t2.id = t3.other_id where t3.id in (select t4.data from t4 where t4.other_id = 100) and t1.other_id = 100 and t2.other_id > 997; +----+-------------+-------+--------+------------------+----------+---------+------------------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+------------------+----------+---------+------------------+------+-----------------------+ | 1 | PRIMARY | t2 | range | PRIMARY,other_id | other_id | 5 | NULL | 12 | Using index condition | | 1 | PRIMARY | t1 | eq_ref | PRIMARY,other_id | PRIMARY | 4 | test.t2.other_id | 1 | Using where | | 1 | PRIMARY | t3 | ref | other_id | other_id | 5 | test.t2.id | 4 | Using where | | 2 | SUBQUERY | t4 | ref | other_id | other_id | 5 | const | 24 | NULL | +----+-------------+-------+--------+------------------+----------+---------+------------------+------+-----------------------+ 4 rows in set (0.01 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select t1.id, t3.data from t1 join t2 on t1.id = t2.other_id join t3 on t2.id = t3.other_id where t3.id in (select t4.data from t4 where t4.other_id = 100) and t1.other_id = 100 and t2.other_id > 997; Empty set (0.00 sec) mysql> show status like 'Handler%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 8 | | Handler_mrr_init | 0 | | Handler_prepare | 0 | | Handler_read_first | 0 | | Handler_read_key | 4 | | Handler_read_last | 0 | | Handler_read_next | 6 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | | Handler_rollback | 0 | | Handler_savepoint | 0 | | Handler_savepoint_rollback | 0 | | Handler_update | 0 | | Handler_write | 0 | +----------------------------+-------+ 18 rows in set (0.00 sec) mysql>
[20 Oct 2016 8:39]
Sreeharsha Ramanavarapu
Posted by developer: In 5.6 optimizer comes up with a bad plan due to a sub-optimal join order of tables since it doesn't take the filtering estimates into account. This problem doesn't occur in 5.7+ since condition filtering has been added as part of "WL#6635: Make use of condition filtering in the optimizer". On 5.7 when this WL is turned off (using the optimizer_switch "condition_fanout_filter=off"), we get the same plan as 5.6. It is not possible to backport this feature to 5.6. This bug is closed as not to be fixed. Workaround: On 5.6: set optimizer_switch="semijoin=off"; As suggested in the bugpage (by the user), this will make the optimizer choose a plan with materialized lookup and a better join order. On 5.7: Condition filtering is on by default. So a better join order will be chosen.