Bug #52344 | Subquery materialization: Assertion if subquery in on-clause of outer join | ||
---|---|---|---|
Submitted: | 24 Mar 2010 20:36 | Modified: | 23 Nov 2010 3:21 |
Reporter: | Patrick Crews | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S3 (Non-critical) |
Version: | 6.0 | OS: | Any |
Assigned to: | Øystein Grøvlen | CPU Architecture: | Any |
Tags: | materialization, optimizer_switch, regression, subquery |
[24 Mar 2010 20:36]
Patrick Crews
[24 Mar 2010 20:37]
Patrick Crews
full MTR test case with original and simplified queries
Attachment: bug52344_test.txt (text/plain), 9.35 KiB.
[24 Mar 2010 20:38]
Patrick Crews
full crash output
Attachment: bug52344_backtrace.txt (text/plain), 16.90 KiB.
[25 Mar 2010 11:58]
Jørgen Løland
Looks similar to BUG#46680, i.e., it would be that JOIN::optimize exited without calling setup_subquery_materialization() after deciding that the materialization strategy should be used
[26 Apr 2010 20:20]
Guilhem Bichot
The final crashing SELECT of "how-to-repeat" can be simplified to SELECT 1 FROM dual WHERE 1 IN ( SELECT B.`col_varchar_nokey` FROM B LEFT JOIN BB ON ( 7 ) IN (SELECT MIN( `pk` ) FROM CC ) ) ; What happens is that: - in top query's JOIN::prepare(), JOIN::prepare() is called for the IN(...) predicate above, and this predicate is marked with exec_method=MATERIALIZATION - later, top query's JOIN::optimize() runs and: - BB is recognized as a const table - join_read_const_table() reads it and does: if (*tab->on_expr_ref && !table->null_row) { ... if ((table->null_row= test((*tab->on_expr_ref)->val_int() == 0))) mark_as_null_row(table); i.e. it evaluates the ON condition, to see whether this condition is false in which case this table could be replaced with NULL. The evaluation of ON... causes execution of the subquery inside it. All this happens before setup_subquery_materialization(), hence the crash.
[29 Apr 2010 7:54]
Øystein Grøvlen
Below is a minimal test case. Note that this is run with semijoin=off. One can get the same result with semijoin=on, if subquery contains an aggregate function. (Because semijoin is not used with aggregation). In other words, this is an issue with subquery materialization. set optimizer_switch='semijoin=off'; CREATE TABLE t1 (i INTEGER); INSERT INTO t1 VALUES (10); CREATE TABLE t2 (j INTEGER); INSERT INTO t2 VALUES (5); CREATE TABLE t3 (k INTEGER); SELECT i FROM t1 LEFT JOIN t2 ON (j) IN (SELECT k FROM t3); DROP TABLE t1, t2, t3;
[29 Apr 2010 9:55]
Øystein Grøvlen
The problem is limited to the case where both tables of the outer join is const tables. In that case, one will try to evaluate the on-clause during optimization. (See join_read_const_table() which is called from make_join_statistics()). At that point, it has been decided that the subquery should be materialized, but the materialization has not yet been set up. Hence, the assert. In this scenario, there is really no point in doing materialization since the subquery will only be needed once anyway. Hence, the solution is probably to find some way to tell subquery not to use materialization when it is executed during optimization.
[28 May 2010 6:22]
John Embretsen
This issue is also observed during RQG testing (optimizer_subquery.yy grammar) for the mysql-next-mr-opt-backporting branch, may 2010.
[26 Jul 2010 12:11]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114340 3217 Oystein Grovlen 2010-07-26 BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join Problem: If tables of an outer join is constant tables, the associated on-clause will be evaluated in the optimization phase. If the on-clause contains a query that is to be executed with subquery materialization, this will not work since the infrastructure for such execution is not yet set up. Solution: If a subquery is executed before subquery materialization is properly set up, revert to traditional execution of this subquery. @ mysql-test/include/subquery_mat.inc Added test case for BUG#52344. @ mysql-test/r/subquery_mat.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_all.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_none.result Updated result file with test case for BUG#52344. @ sql/item_subselect.cc If subquery materialization is to be used, but the proper execution engine has not been set up when the subquery is executed, revert to traditional exeuction based on IN=>EXISTS transformation. Moved the code to revert to IN=>EXISTS transformation to a new method, revert_to_exists_transformation(), in order to be able to re-use this code for the scenario in this bug. @ sql/item_subselect.h Added new method revert_to_exists_transformation() to be used when it is detected that one has to revert the decision to do subquery materialization for IN-subquery. IN=>EXISTS transformation will be used for the query instead.
[27 Jul 2010 11:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/114430 3219 Oystein Grovlen 2010-07-27 BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join Problem: If tables of an outer join is constant tables, the associated on-clause will be evaluated in the optimization phase. If the on-clause contains a query that is to be executed with subquery materialization, this will not work since the infrastructure for such execution is not yet set up. Solution: Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This is how the problem is currently avoided for where-clauses. This works because, Item_in_subselect::is_expensive_processor returns true if query is to be executed with subquery materialization. @ mysql-test/include/subquery_mat.inc Added test case for BUG#52344. @ mysql-test/r/subquery_mat.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_all.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_none.result Updated result file with test case for BUG#52344. @ sql/sql_select.cc Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This prevents executing materialized subqueries in optimization phase. (Proper setup for such execution has not been done at this stage.)
[3 Aug 2010 11:46]
Øystein Grøvlen
I am "re-proposing" the latest committed patch
[17 Aug 2010 11:50]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/115924 3227 Oystein Grovlen 2010-08-17 BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join Problem: If tables of an outer join is constant tables, the associated on-clause will be evaluated in the optimization phase. If the on-clause contains a query that is to be executed with subquery materialization, this will not work since the infrastructure for such execution is not yet set up. Solution: Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This is how the problem is currently avoided for where-clauses. This works because, Item_in_subselect::is_expensive_processor returns true if query is to be executed with subquery materialization. @ mysql-test/include/subquery_mat.inc Added test case for BUG#52344. @ mysql-test/r/subquery_mat.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_all.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_none.result Updated result file with test case for BUG#52344. @ sql/sql_select.cc Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This prevents executing materialized subqueries in optimization phase. (Proper setup for such execution has not been done at this stage.)
[17 Aug 2010 11:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/115928 3227 Oystein Grovlen 2010-08-17 BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join Problem: If tables of an outer join are constant tables, the associated on-clause will be evaluated in the optimization phase. If the on-clause contains a query that is to be executed with subquery materialization, this will not work since the infrastructure for such execution is not yet set up. Solution: Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This is how the problem is currently avoided for where-clauses. This works because, Item_in_subselect::is_expensive_processor returns true if query is to be executed with subquery materialization. @ mysql-test/include/subquery_mat.inc Added test case for BUG#52344. @ mysql-test/r/subquery_mat.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_all.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_none.result Updated result file with test case for BUG#52344. @ sql/sql_select.cc Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This prevents executing materialized subqueries in optimization phase. (Proper setup for such execution has not been done at this stage.)
[17 Aug 2010 13:17]
Øystein Grøvlen
Pushed into mysql-next-mr-opt-backporting with revision id oystein.grovlen@oracle.com-20100817115345-erpngrr8v0yxpt65
[8 Sep 2010 7:45]
Timour Katchaounov
The patch looks OK to me.
[8 Sep 2010 8:20]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/117753 3238 Oystein Grovlen 2010-09-08 BUG#52344 - Subquery materialization: Assertion if subquery in on-clause of outer join Problem: If tables of an outer join are constant tables, the associated on-clause will be evaluated in the optimization phase. If the on-clause contains a query that is to be executed with subquery materialization, this will not work since the infrastructure for such execution is not yet set up. Solution: Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This is how the problem is currently avoided for where-clauses. This works because, Item_in_subselect::is_expensive_processor returns true if query is to be executed with subquery materialization. @ mysql-test/include/subquery_mat.inc Added test case for BUG#52344. @ mysql-test/r/subquery_mat.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_all.result Updated result file with test case for BUG#52344. @ mysql-test/r/subquery_mat_none.result Updated result file with test case for BUG#52344. @ sql/sql_select.cc Do not evaluate on-clause in optimization phase if is_expensive() returns true for this clause. This prevents executing materialized subqueries in optimization phase. (Proper setup for such execution has not been done at this stage.)
[8 Sep 2010 8:30]
Øystein Grøvlen
Seems I had forgotten to push. It is now pushed to mysql-next-mr-opt-backporting with revision id oystein.grovlen@oracle.com-20100908081903-ny1hot0nd2tk749w
[2 Oct 2010 18:15]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[11 Nov 2010 14:48]
Timour Katchaounov
After some analysis of a related problem, I believe that this patch is a bad idea in the long run, because once the meaning of is_expensive changes (e.g. different kinds of expressions become expensive), then the patch will introduce worse plans. In addition, this patch makes it hard to introduce non-expensive materialized subqueries.
[12 Nov 2010 15:39]
Øystein Grøvlen
Timour, Note that this patch only makes the handling of on-clauses match how where-clauses has been handled wrt materialized subqueries all the time since WL#1110 was added. If that was a bad idea in the first place, we might need to do something about that some time, but I am not convinced. I do not understand your comment "once the meaning of is_expensive changes (e.g. different kinds of expressions become expensive), then the patch will introduce worse plans." According to the code documentation, the main idea behind is_expensive is to avoid evaluating expressions in the optimization phase. Why would that change if more expressions become expensive? I recognize that things will break if "non-expensive materialized subqueries" are introduced. But what would the motivation be for such subqueries? I am not convinced that executing subqueries in the optimization phase will ever be a good idea.
[13 Nov 2010 16:25]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:21]
Paul DuBois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.
[25 Nov 2010 8:01]
Timour Katchaounov
Oystein, what I meant is that if we make is_expensive() more intelligent, and let it consider the actual cost of executing an expression, then this would definitely lead to a crashes because suddenly some subqueries may become cheap. For instance it is definitely cheap to execute subqueries that would perform only one lookup/read per table, and thus would allow cheap constant substitution. On the other hand, not allowing subquery execution during optimization at all removes one dependency from the optimizer, thus making it less complex. I agree that the patch makes things in sync with how we currently process the WHERE clause. Given we have adequate test cases, if someone ever decides to make is_expensive based on actual cost, then these test cases will immediately break, so such a change cannot creep unnoticed. So I will implement the same solution for now :).