Bug #50489 another segfault in fix_semijoin_strategies_for_picked_join_order
Submitted: 20 Jan 2010 21:42 Modified: 23 Nov 2010 3:15
Reporter: Guilhem Bichot Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0-codebase-bugfixing OS:Linux (64)
Assigned to: Roy Lyseng CPU Architecture:Any
Tags: optimizer_switch, semijoin, subquery
Triage: Triaged: D1 (Critical)

[20 Jan 2010 21:42] Guilhem Bichot
Description:
I have not tested if 5.x run the test ok.
Apparently not the same bug as BUG#50237 (the segfault happens in a different line of the same function).
It starts with:
==6526== Invalid read of size 4
==6526==    at 0x7B9EA6: advance_sj_state(JOIN*, unsigned long long, st_join_table const*, unsigned int, double*, double*, st_position*) (sql_select.cc:13449)
==6526==    by 0x7ACA13: best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int) (sql_select.cc:7595)
==6526==    by 0x7ACD9E: best_extension_by_limited_search(JOIN*, unsigned long long, unsigned int, double, double, unsigned int, unsigned int) (sql_select.cc:7658)
==6526==    by 0x7AC304: greedy_search(JOIN*, unsigned long long, unsigned int, unsigned int) (sql_select.cc:7322)
==6526==    by 0x7ABA23: choose_plan(JOIN*, unsigned long long) (sql_select.cc:6929)
==6526==    by 0x7A639D: make_join_statistics(JOIN*, TABLE_LIST*, Item*, st_dynamic_array*) (sql_select.cc:4529)
==6526==    by 0x79D233: JOIN::optimize() (sql_select.cc:1623)
==6526==    by 0x7A2933: mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:3125)
and then a lot of other errors. Crashes are random (after running query once, twice, it depends). Note that my line numbers are not exact (can be off by a few lines) as I added comments to code in my tree.
I am using revision-id:horst.hunger@sun.com-20100119125520-r1qh112atxs5m23y

How to repeat:
run the attached test file with mtr, it will create tables and run queries. The second query gives Valgrind errors.
The stored procedure runs the query only once, but is still necessary: I don't get errors if I use no stored procedure and just issue the SELECT directly.
Sorry for not providing a smaller testcase, I just hit this bug while debugging BUG#42620!!
[20 Jan 2010 21:42] Guilhem Bichot
test for valgrind errors

Attachment: newbug.test (application/octet-stream, text), 393.98 KiB.

[21 Jan 2010 8:05] Sveta Smirnova
Thank you for the report.

Verified as described. 5.x tests fail with error : 1231: Variable 'optimizer_switch' can't be set to the value of 'firstmatch=off,materialization=off'
[21 Jan 2010 9:55] Philip Stoev
Simplified test case (unnecessary INSERTS removed automatically):

--disable_abort_on_error
CREATE TABLE `it` (
  `id` int(11) NOT NULL,
  `expr_key` int(11) NOT NULL,
  `expr_nokey` int(11) NOT NULL,
  `expr_padder` char(10) DEFAULT NULL,
  KEY `expr_key` (`expr_key`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `it` VALUES (135,218264606,218264606,'neKk');
INSERT INTO `it` VALUES (201,810783319,810783319,'KKKK');
CREATE TABLE `ot` (
  `id` int(11) NOT NULL,
  `expr_key` int(11) NOT NULL,
  `expr_nokey` int(11) NOT NULL,
  KEY `expr_key` (`expr_key`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
delimiter |;
create procedure run_n_times(x int)
begin
declare c int;
while x do
  set x = x-1;
  select count(expr_key) into c from ot where expr_key in ( SELECT expr_nokey FROM it ) and ot.expr_key<100000000;
end while;
end|
delimiter ;|
set optimizer_switch="default";
call run_n_times(1);
set optimizer_switch="firstmatch=off,materialization=off";
call run_n_times(1);
[27 Jan 2010 8:53] Roy Lyseng
The issue occurs when you have a stored procedure or prepared statement, and switch optimizer settings between two executions.
See also bug#46744 that solved a similar problem.
With the current strategy being taken in the optimizer (i.e keep parsed and transformed query between executions and re-optimize each time), fixing all such problems is virtually impossible.

Generally, I think that we have two options for solving this:
1) Completely re-parse and re-optimize the query to be executed whenever a switch setting is changed, or
2) Store the current optimizer_switch settings with the query when the query is prepared, meaning that subsequent executions will always use the switches that were active at that time.

Of these two options, I like 2) best, because I think that a user will never want to change optimizer switches for a particular query at runtime. During development, a user may want to experiment with different settings to get the best performance, but at runtime s/he will stick to the settings found by experimenting and not pick up the settings currently set in the session object.
[11 Jun 2010 12:56] Roy Lyseng
For stored procedures (SPs), solution 1 is probably the most intuitive one. There is no clear "prepare" point in time, so the execution should probably pick up the current optimizer_switch setting when executing the statement. One way to enforce this is to save the current optimizer settings with the SP object. If the settings are changed when the PS is re-executed, the statement needs to be re-prepared. Sergey Gluhov suggested this solution.
[13 Jul 2010 12:58] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/113453

3213 Roy Lyseng	2010-07-13
      Bug#50489: another segfault in fix_semijoin_strategies...
      
      This particular problem occurs when we try to execute a query multiple times
      with different optimizer switch settings. On first execution, semijoin
      materialization strategy is a candidate and an SJ_MATERIALIZATION_INFO object
      is created. On second execution, materialization strategy is not a candidate,
      but there is a dangling pointer to the SJ_MATERIALIZATION_INFO object created
      in the prior execution. The pointer is followed, and we suffer a segfault.
      
      The solution chosen here is to refactor the SJ_MATERIALIZATION_INFO class
      so that optimization information is put into a new struct st_semijoin_mat
      which is embedded in struct st_nested_join.
      
      If semijoin materialization is an allowed strategy and materialization is
      possible for the semijoin in question, the st_semijoin_mat object is filled
      in and an st_position object is created and filled with information about
      the join strategy for the materialization.
      
      Later in the optimization, the materialization strategy is compared with
      other semijoin strategies, and, if selected, an SJ_MATERIALIZATION_INFO object
      is created and attached to the join nest representing the semijoin operation.
      The object is then filled with data relevant for semijoin execution, such as
      number of tables involved, whether this is a scan, information about
      temporary table for materialization, etc.
      
      NOTE. This bug no longer occurs on the current souce code base.
            It does however fix bug#46744, which is re-introduced by the
            fix for bug#43768, and it should also be useful because of the
            refactoring effect on the semijoin materialization strategy.
      
      mysql-test/r/subquery_bugfixes.result
        Results for test case for bug#50489
      
      mysql-test/t/subquery_bugfixes.test
        Test case for bug#50489
      
      sql/sql_class.h
        Optimization data are removed from class SJ_MATERIALIZATION_INFO.
      
      sql/sql_select.cc
        optimize_semijoin_nests() will no longer create SJ_MATERIALIZATION_INFO
        objects. Instead it populates the st_semijoin_mat struct inside the
        nested_join object with estimated cost for the materialized semijoin.
      
        fix_semijoin_strategies_for_picked_join_order() is renamed to
        fix_semijoin_strategies() and given a boolean return values. The return
        value is necessary because the function may create SJ_MATERIALIZATION_INFO
        objects, hence it must be able to return error conditions. Notice that
        these objects are created and populated only if a semijoin materialization
        strategy is actually chosen.
      
        advance_sj_state() has been slightly simplified: the "emb_sj_nest" is now
        assigned when the function is entered, in addition to changes that was
        necessitated by the splitting up of SJ_MATERIALIZATION_INFO.
      
      sql/sql_test.cc
        New interface for function print_sjm, made necessary by the splitting of
        SJ_MATERIALIZATION_INFO.
      
      sql/table.h
        Defines struct st_semijoin_mat containing estimated cost for a materialized
        semijoin operation. The struct is added as a member in struct st_nested_join.
[27 Jul 2010 8:24] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/114403

3219 Roy Lyseng	2010-07-27
      Bug#50489: another segfault in fix_semijoin_strategies...
      
      commit mail test
[9 Aug 2010 9:14] Guilhem Bichot
approved with minor comments
[12 Aug 2010 12:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/115573

3228 Roy Lyseng	2010-08-12
      Bug#50489: another segfault in fix_semijoin_strategies...
      
      This particular problem occurs when we try to execute a query multiple
      times with different optimizer switch settings. On first execution,
      semijoin materialization strategy is a candidate and an
      SJ_MATERIALIZATION_INFO object is created. On second execution,
      materialization strategy is not a candidate, but there is a dangling
      pointer to the SJ_MATERIALIZATION_INFO object created in the prior
      execution. The pointer is followed, and we suffer a segfault.
      
      The solution chosen here is to refactor the SJ_MATERIALIZATION_INFO
      class so that optimization information is put into a new struct
      st_semijoin_mat which is embedded in struct st_nested_join.
      
      If semijoin materialization is an allowed strategy and materialization
      is possible for the semijoin in question, the st_semijoin_mat object
      is filled in and an st_position object is created and filled with
      information about the join strategy for the materialization.
      
      Later in the optimization, the materialization strategy is compared
      with other semijoin strategies, and, if selected, an
      SJ_MATERIALIZATION_INFO object is created and attached to the join nest
      representing the semijoin operation.
      The object is then filled with data relevant for semijoin execution,
      such as number of tables involved, whether this is a scan, information
      about temporary table for materialization, etc.
      
      NOTE. This bug no longer occurs on the current souce code base.
            It does however fix bug#46744, which is re-introduced by the
            fix for bug#43768, and it should also be useful because of the
            refactoring effect on the semijoin materialization strategy.
      
      mysql-test/r/optimizer_switch.result
        Added results for test case for bug#50489
      
      mysql-test/t/optimizer_switch.test
        Added test case for bug#50489
      
      sql/sql_class.h
        Optimization data are removed from class SJ_MATERIALIZATION_INFO.
      
      sql/sql_select.cc
        optimize_semijoin_nests() will no longer create
        SJ_MATERIALIZATION_INFO objects. Instead it populates the
        st_semijoin_mat struct inside the nested_join object with estimated
        cost for the materialized semijoin.
      
        fix_semijoin_strategies_for_picked_join_order() is given a boolean
        return value. The return value is necessary because the function may
        create SJ_MATERIALIZATION_INFO objects, hence it must be able to
        return error conditions. Notice that these objects are created and
        populated only if a semijoin materialization strategy is actually
        chosen. Reset of sjm.positions is performed in this function.
      
        advance_sj_state() has been slightly simplified: the "emb_sj_nest"
        is now assigned when the function is entered, in addition to changes
        that was necessitated by the splitting up of SJ_MATERIALIZATION_INFO.
      
        Reset of the pointer sj_nest->sj_mat_info is added to JOIN::destroy().
      
      sql/sql_test.cc
        New interface for function print_sjm, made necessary by the splitting
        of SJ_MATERIALIZATION_INFO.
      
      sql/table.h
        Defines struct st_semijoin_mat containing estimated cost for a
        materialized semijoin operation.
        The struct is added as a member in struct st_nested_join.
[19 Aug 2010 7:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/116173

3227 Roy Lyseng	2010-08-19
      Bug#50489: another segfault in fix_semijoin_strategies...
      
      This particular problem occurs when we try to execute a query multiple
      times with different optimizer switch settings. On first execution,
      semijoin materialization strategy is a candidate and an
      SJ_MATERIALIZATION_INFO object is created. On second execution,
      materialization strategy is not a candidate, but there is a dangling
      pointer to the SJ_MATERIALIZATION_INFO object created in the prior
      execution. The pointer is followed, and we suffer a segfault.
      
      The solution chosen here is to refactor the SJ_MATERIALIZATION_INFO
      class so that optimization information is put into a new struct
      Semijoin_mat_optimize which is embedded in struct st_nested_join.
      SJ_MATERIALIZATION_INFO itself is renamed to Semijoin_mat_exec.
      
      If semijoin materialization is an allowed strategy and materialization
      is possible for the semijoin in question, the Semijoin_mat_optimize object
      is filled in and an st_position object is created and filled with
      information about the join strategy for the materialization.
      
      Later in the optimization, the materialization strategy is compared
      with other semijoin strategies, and, if selected, an
      Semijoin_mat_exec object is created and attached to the join nest
      representing the semijoin operation.
      The object is then filled with data relevant for semijoin execution,
      such as number of tables involved, whether this is a scan, information
      about temporary table for materialization, etc.
      
      NOTE. This bug no longer occurs on the current souce code base.
            It does however fix bug#46744, which is re-introduced by the
            fix for bug#43768, and it should also be useful because of the
            refactoring effect on the semijoin materialization strategy.
      
      mysql-test/r/optimizer_switch.result
        Added results for test case for bug#50489
      
      mysql-test/t/optimizer_switch.test
        Added test case for bug#50489
      
      sql/sql_class.h
        Optimization data are removed from class SJ_MATERIALIZATION_INFO.
        SJ_MATERIALIZATION_INFO is renamed to Semijoin_mat_exec.
      
      sql/sql_select.cc
        optimize_semijoin_nests() will no longer create
        SJ_MATERIALIZATION_INFO objects. Instead it populates the
        Semijoin_mat_optimize struct inside the nested_join object with estimated
        cost for the materialized semijoin.
      
        fix_semijoin_strategies_for_picked_join_order() is given a boolean
        return value. The return value is necessary because the function may
        create Semijoin_mat_exec objects, hence it must be able to
        return error conditions. Notice that these objects are created and
        populated only if a semijoin materialization strategy is actually
        chosen. Reset of sjm.positions is performed in this function.
      
        advance_sj_state() has been slightly simplified: the "emb_sj_nest"
        is now assigned when the function is entered, in addition to changes
        that was necessitated by the splitting up of SJ_MATERIALIZATION_INFO.
      
        Reset of the pointer sj_nest->sj_mat_exec is added to JOIN::destroy().
      
      sql/sql_select.h
        List with name sjm_info_list is renamed to sjm_exec_list.
      
      sql/sql_test.cc
        New interface for function print_sjm(), made necessary by the splitting
        of SJ_MATERIALIZATION_INFO.
      
      sql/sql_test.h
        New interface for function print_sjm().
      
      sql/table.h
        Defines struct Semijoin_mat_optimize containing estimated cost for a
        materialized semijoin operation.
        The struct is added as a member in struct st_nested_join.
[2 Oct 2010 18:14] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[13 Nov 2010 16:10] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[23 Nov 2010 3:15] Paul Dubois
Bug does not appear in any released 5.6.x version. No 5.6.1 changelog entry needed.