Bug #76247 | Regression case with semijoin - query with many INs stuck on statistics state | ||
---|---|---|---|
Submitted: | 10 Mar 2015 17:33 | Modified: | 19 Aug 2015 13:16 |
Reporter: | Przemyslaw Malkowski | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.6 5.7 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | semijoin optimizer_prune_level |
[10 Mar 2015 17:33]
Przemyslaw Malkowski
[10 Mar 2015 17:36]
Przemyslaw Malkowski
test case
Attachment: bug_report_76247.txt (text/plain), 38.84 KiB.
[11 Mar 2015 15:14]
MySQL Verification Team
I see from your attached test case that you are using InnoDB tables. Are you using persistent stats for the indices on your tables ??? If the answer is "Yes", then your bug is a duplicate of the bug # 70617. The only difference is that your query requires many, many times more work on the index statistics collection. In that case, a cause of your bug (and # 70617) is that, In 5.6.13 some changes are made on the persistent statistics estimation, that led to spending lots of time in statistics collection.
[11 Mar 2015 15:34]
Przemyslaw Malkowski
Sinisa, yes, it was left with default. However when I disabled it, the problem still takes place: mysql> set global innodb_stats_persistent=off; Query OK, 0 rows affected (0.00 sec) mysql> show variables like '%pers%'; +--------------------------------------+-------+ | Variable_name | Value | +--------------------------------------+-------+ | innodb_stats_persistent | OFF | | innodb_stats_persistent_sample_pages | 20 | +--------------------------------------+-------+ 2 rows in set (0.00 sec) mysql> EXPLAIN SELECT tbl10_.id as col_0_0_, tbl10_.lock_version as col_1_0_, tbl.... .... -- second session: mysql> show processlist\G *************************** 1. row *************************** Id: 32 User: root Host: localhost db: prune Command: Query Time: 30 State: statistics Info: EXPLAIN SELECT tbl10_.id as col_0_0_, tbl10_.lock_version as col_1_0_, tbl10_.name as col_2_0_, subs
[11 Mar 2015 15:55]
MySQL Verification Team
Please, in order to clear up the matter, can you do the test in the following steps: * set persistent stats to 0 or off * create tables as per your .sql file * fill the tables * run ANALYZE on each of the tables * run the query and let us know how much time does gathering statistics take Thanks in advance.
[11 Mar 2015 17:13]
MySQL Verification Team
Smaller testcase based on the file uploaded by reporter: ----- set optimizer_prune_level=0; set optimizer_search_depth=default; set optimizer_switch=default; drop table if exists `t1`,`t2`; create table `t1` (`id` int primary key) engine=innodb; create table `t2` (`id` int,`b` int,`d` int,`c` int,primary key (`id`),key (`c`)) engine=innodb; set @s:=concat( 'explain select 1 from t1 where ', repeat('(id in(select d from t2 where b<>1 and c=1)) and\n',30), '(id in(select d from t2 where b<>1 and c=1));'); select @s; prepare t from @s; execute t; set optimizer_switch='semijoin=off'; prepare t from @s; execute t; ------ Optimize_table_order::best_access_path+0x3d [f:\git\mysql-5.6\sql\sql_planner.cc @ 407] Optimize_table_order::best_extension_by_limited_search+0x25d [f:\git\mysql-5.6\sql\sql_planner.cc @ 1852] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::best_extension_by_limited_search+0x52b [f:\git\mysql-5.6\sql\sql_planner.cc @ 1988] Optimize_table_order::greedy_search+0xf6 [f:\git\mysql-5.6\sql\sql_planner.cc @ 1476] Optimize_table_order::choose_table_order+0x177 [f:\git\mysql-5.6\sql\sql_planner.cc @ 1147] make_join_statistics+0x13a5 [f:\git\mysql-5.6\sql\sql_optimizer.cc @ 3766] JOIN::optimize+0x657 [f:\git\mysql-5.6\sql\sql_optimizer.cc @ 364] mysql_execute_select+0x33 [f:\git\mysql-5.6\sql\sql_select.cc @ 1086] mysql_explain_unit+0x148 [f:\git\mysql-5.6\sql\opt_explain.cc @ 2134] explain_query_expression+0x61 [f:\git\mysql-5.6\sql\opt_explain.cc @ 2035] execute_sqlcom_select+0xe3 [f:\git\mysql-5.6\sql\sql_parse.cc @ 5087]
[11 Mar 2015 18:04]
MySQL Verification Team
This seems like a bug in semi-join optimization. Can you also send us rows for your tables ???
[11 Mar 2015 18:18]
MySQL Verification Team
A smaller test case was produced, which required no data. So, we do not need any additional info from you. Even with zero rows in each table, with semijoin=ON, query was stuck for a long, long time, until it was killed. This is a verified bug now. It seems that there is possibly an eternal loop somewhere in the code ... Fully verified.
[11 Mar 2015 18:26]
MySQL Verification Team
Thank you for your contribution !
[11 Mar 2015 19:06]
Roy Lyseng
What happens if you reduce the search depth for the join planner? E.g use this statement before the SELECT statement: set optimizer_search_depth=8; Especially with a large number of tables with almost the same number of rows may cause the planner to take "forever".
[11 Mar 2015 19:54]
Przemyslaw Malkowski
Sinisa, thank you for confirming this! As I said in the initial report, "even on empty tables", so I had no rows at all in the test case. Roy, using optimizer_search_depth=8 doesn't seem to change much, I can't wait hours to see if the explain ever ends, also it's against the ecology to let the CPU core burn that long ;) Then using optimizer_search_depth=4 lets the explain to finish in 3 seconds :) The test case is quick and easy enough if you want to play.
[12 Mar 2015 11:34]
Roy Lyseng
Thank you for the update, however I do not think this is a bug. The query has a very poorly written set of subqueries that are really hard for the optimizer to handle efficiently. The query is on the form: SELECT ... FROM t1 INNER JOIN t2 ON jc(t1, t2) LEFT JOIN t3 ON jc(t1,t3) LEFT JOIN t4 ON jc( t3, t4) WHERE wc(t1, t2, t3, t4) AND t1.id IN (SELECT it.t1_id FROM it WHERE wc1(it)) AND t1.id IN (SELECT it.t1_id FROM it WHERE wc2(it)) AND ...; The query contains 32 similar subqueries. Since all of them are against the same table and have similar where conditions, the optimizer will calculate their costs as similar. In order to properly optimize a query with 36 tables, the optimizer either has to use a very low search depth, or use very aggressive pruning of possible plans. Such pruning is impossible when all choices available for the optimizer have the same cost. Thus, we need workarounds for this query to perform decently: - Use STRAIGHT_JOIN. I think this is quite good, since the tables anyway may be accessed in the lexical order. - Use OPTIMIZER_SEARCH_DEPTH. This is the recommended strategy when the optimizer uses excessive time and regular pruning is not sufficient. - Turn off semi-join. This is a good choice here, since all semi-joins anyway would use the FirstMatch strategy, and is the same strategy used when semi-join is disabled. - Consolidate the subqueries. Notice that all subqueries have the same inner table, the same selected column and the same outer expression. Thus, all subqueries can be rewritten into one by combining the inner conditions with AND. Ie, this query should be equivalent to the one above: SELECT ... FROM t1 INNER JOIN t2 ON jc(t1, t2) LEFT JOIN t3 ON jc(t1,t3) LEFT JOIN t4 ON jc( t3, t4) WHERE wc(t1, t2, t3, t4) AND t1.id IN (SELECT it.t1_id FROM it WHERE wc1(it) AND wc2(it)); This query seems machine generated, so I am not sure how helpful this advice is, though. Long-term we might be able to detect this kind of query pattern and rewrite the query automatically.
[12 Mar 2015 14:08]
MySQL Verification Team
I still consider this as a bug , since optimizer hangs in statistics gathering, even when there are 0 rows , no rows what so ever, in the tables. Also, the workarounds are OK, but those are not a solution to the bug. I am only changing the severity to "Performance".
[12 Mar 2015 14:49]
Roy Lyseng
Sinisa, it does not hang, the optimizer is simply evaluating plans. Providing empty input tables is both unrealistic and gives the optimizer the worst possible baseline for providing a plan, since all costs look more or less equal and pruning bad plans becomes impossible.
[22 Jul 2015 13:53]
MySQL Verification Team
First of all, our research shows that when optimizer_prune_level is 0, what you experience with your particular query is expected behavior. First, what happens when optimizer_prune_level is set 0. The issue is with subqueries, which are pulledup and converted to semi-joins so that tables in subqueries are added to outer query which participate in join order searching. Our Join order searching is exhaustive it produces all permutations of possible plan before selecting the plan (n!). From the test case we can see there is nearly 30+! plans need to be produced to select the best plan among them when pruning is disabled, hence the delay. If we enable pruning optimization is quite fast !!! However, you report that you get similar behavior even when prune level is greater then zero. That would be a candidate bug. But, we need to be able to repeat the situation which you describe. Hence, in order to proceed on that plan, we need a test case that would reproduce the bug on every possible run. In short, we are waiting on your fully reproducible test case in order to proceed.
[19 Aug 2015 12:51]
Przemyslaw Malkowski
I am afraid I failed to prepare a reproducible test case with optimizer_prune_level=1. Even if I tried with 10M rows in each table, explain is always fast. Also I don't have access to the original data where it happened. However I still think this is a regression comparing to 5.5, where with optimizer_prune_level=0 it is always fast. For 5.6, using the simplified test case, I am getting 36 seconds with 10 subqueries and 7 min 19.72 sec with 11 subqueries. Slow or very slow statistics stage happens with empty tables as well as with some data in them.
[19 Aug 2015 13:16]
MySQL Verification Team
Hi, When you manage to find a fully reproducible test case, we will start working on it immediately. In that case, create a new bug report.