Bug #50019 Wrong result for IN-subquery with materialization
Submitted: 1 Jan 2010 12:39 Modified: 16 Jun 2010 11:22
Reporter: Øystein Grøvlen Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:6.0 OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: materialization, optimizer_switch, semijoin, subquery

[1 Jan 2010 12:39] Øystein Grøvlen
Description:
The following query gives the wrong result with materialization turned on:

mysql> select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i + t3.i = 5);
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
+------+
10 rows in set (0.00 sec)

If materialization is turned off (or semijoin entirely), the result is different:

mysql> set session optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i + t3.i = 5);
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
+------+
4 rows in set (0.00 sec)

How to repeat:
create table t1(i int);
insert into t1 values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
create table t2(i int);
insert into t2 values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
create table t3(i int);
insert into t3 values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i + t3.i = 5);
set session optimizer_switch='materialization=off';
select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i + t3.i = 5);
[1 Jan 2010 13:29] Valeriy Kravchuk
Verified just as described with recent 6.0.14 from bzr:

77-52-7-73:6.0-codebase openxs$ bin/mysql -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 6.0.14-alpha-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create table t1(i int);
Query OK, 0 rows affected (0.05 sec)

mysql> insert into t1 values (1), (2), (3), (4), (5), (6), (7), (8), (9),
    -> (10);
Query OK, 10 rows affected (0.01 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> create table t2(i int);
Query OK, 0 rows affected (0.08 sec)

mysql> insert into t2 values (1), (2), (3), (4), (5), (6), (7), (8), (9),
    -> (10);
Query OK, 10 rows affected (0.00 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> create table t3(i int);
Query OK, 0 rows affected (0.09 sec)

mysql> insert into t3 values (1), (2), (3), (4), (5), (6), (7), (8), (9),
    -> (10);
Query OK, 10 rows affected (0.01 sec)
Records: 10  Duplicates: 0  Warnings: 0

mysql> select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i
    -> + t3.i = 5);
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
+------+
10 rows in set (0.00 sec)

mysql> set session optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i
    -> + t3.i = 5);
+------+
| i    |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
+------+
4 rows in set (0.00 sec)
[7 Mar 2010 16:22] Sergey Petrunya
Analysis
--------

# Disable join buffering to make things simpler

MariaDB [j2]> set join_cache_level=0;
Query OK, 0 rows affected (0.00 sec)

MariaDB [j2]> explain select * from t1 where t1.i in (select t2.i from t2 join t3 where t2.i + t3.i = 5);
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | PRIMARY     | t1    | ALL  | NULL          | NULL | NULL    | NULL |   10 |                              |
|  1 | PRIMARY     | t2    | ALL  | NULL          | NULL | NULL    | NULL |   10 | Start materialize            |
|  1 | PRIMARY     | t3    | ALL  | NULL          | NULL | NULL    | NULL |   10 | Using where; End materialize |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
3 rows in set (0.00 sec)

...

(gdb) wher
  #0  Item_func_plus::int_op (this=0xaf766b0) at item_func.cc:1124
  #1  0x081ebfaf in Item_func_numhybrid::val_int (this=0xaf766b0) at item_func.cc:850
  #2  0x081fafe8 in Arg_comparator::compare_int_signed (this=0xaf7680c) at item_cmpfunc.cc:1258
  #3  0x0820a617 in Arg_comparator::compare (this=0xaf7680c) at item_cmpfunc.h:71
  #4  0x081ff499 in Item_func_eq::val_int (this=0xaf76790) at item_cmpfunc.cc:1719
  #5  0x08315658 in evaluate_join_record (join=0xafa4b78, join_tab=0xafaf5d8, error=0) at sql_select.cc:12736
  #6  0x08315bdf in sub_select (join=0xafa4b78, join_tab=0xafaf5d8, end_of_records=false) at sql_select.cc:12652
  #7  0x083158eb in evaluate_join_record (join=0xafa4b78, join_tab=0xafaf428, error=0) at sql_select.cc:12832
  #8  0x08315bdf in sub_select (join=0xafa4b78, join_tab=0xafaf428, end_of_records=false) at sql_select.cc:12652
  #9  0x08315efc in sub_select_sjm (join=0xafa4b78, join_tab=0xafaf428, end_of_records=false) at sql_select.cc:12341
  #10 0x083158eb in evaluate_join_record (join=0xafa4b78, join_tab=0xafaf278, error=0) at sql_select.cc:12832
  #11 0x08315bdf in sub_select (join=0xafa4b78, join_tab=0xafaf278, end_of_records=false) at sql_select.cc:12652
  #12 0x083250d0 in do_select (join=0xafa4b78, fields=0xa6cdf10, table=0x0, procedure=0x0) at sql_select.cc:12202
  #13 0x08336789 in JOIN::exec (this=0xafa4b78) at sql_select.cc:2124
  #14 0x08331088 in mysql_select (thd=0xa6cc590, rref_pointer_array=0xa6cdf80, tables=0xaf75740, wild_num=1, fields=@0xa6cdf10, conds=0xaf76888, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0xaf76a00, unit=0xa6cdbf4, select_lex=0xa6cde7c) at sql_select.cc:2319
...

(gdb) p *args[0]->field->table_name
  $152 = 0xaf11fb0 "t1"
(gdb) p args[0]->field->field_name
  $153 = 0xaf81189 "i"

# We're accessing "t1.i" while doing materialization (of t2 JOIN t3 ON ..)  
# This is wrong.
[7 Mar 2010 16:28] Sergey Petrunya
It could be equality propagation. Indeed:

(gdb) list
  4760	          const_item->cmp_context != cmp_context)
  4761	        return this;
  4762	      return const_item;
  4763	    }
  4764	    Item_field *subst= item_equal->get_first();
  4765	    if (subst && field->table != subst->field->table && !field->eq(subst->field))
  4766	      return subst;
  4767	  }
  4768	  return this;
  4769	}

(gdb) wher
  #0  Item_field::replace_equal_field (this=0xaf76570, arg=0x0) at item.cc:4765
  #1  0x081be383 in Item::transform (this=0xaf76570, transformer=&virtual Item::replace_equal_field(unsigned char*), arg=0x0) at item.cc:549
  #2  0x081f5e88 in Item_func::transform (this=0xaf766b0, transformer=&virtual table offset 332, argument=0x0) at item_func.cc:300
  #3  0x081f5e88 in Item_func::transform (this=0xaf76790, transformer=&virtual table offset 332, argument=0x0) at item_func.cc:300
  #4  0x08323266 in substitute_for_best_equal_field (cond=0xaf76790, cond_equal=0xaf771dc, table_join_idx=0xaf77388) at sql_select.cc:9124
  #5  0x08323025 in substitute_for_best_equal_field (cond=0xaf77148, cond_equal=0xaf771dc, table_join_idx=0xaf77388) at sql_select.cc:9087
  #6  0x0832e3ce in JOIN::optimize (this=0xafa4b78) at sql_select.cc:946
  #7  0x08330ffe in mysql_select (thd=0xa6cc590, rref_pointer_array=0xa6cdf80, tables=0xaf75740, wild_num=1, fields=@0xa6cdf10, conds=0xaf76888, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147764736, result=0xaf76a00, unit=0xa6cdbf4, select_lex=0xa6cde7c) at sql_select.cc:2305

(gdb) p this->field_name 
  $156 = 0xaf1fb41 "i"
(gdb) p this->table_name
  $157 = 0xada3020 "t2"
(gdb)  
(gdb) p subst->field_name
  $158 = 0xaf81189 "i"
(gdb) p subst->table_name
  $159 = 0xaf11fb0 "t1"

It should not do such replaces. In fact, the issue of equality propagation with Materialization was addressed in eliminate_item_equal(), see comment starting with:

        item_field might refer to a table that is within a semi-join

The bug is that that logic is applied only in eliminate_item_equal(). eliminate_item_equal takes care of equal-field-substitution for equalities, while Item_field::replace_equal_field() takes care of equal-field-substitution elsewhere  (this is why one needs to write "t2.i + t3.i =5" to observe this bug, simple equalities work correctly, while complex expressions don't).
[9 Mar 2010 10:31] Sergey Petrunya
See also BUG#45174.
[13 Mar 2010 20:39] Sergey Petrunya
http://askmonty.org/wiki/EqualityPropagationAndEqualityPropagationAndSemiJoinMaterializati...  has a write-up about the nature of the problem and how it should be fixed.
[16 Jun 2010 11:22] Jørgen Løland
Should be fixed as part of BUG#45174; comments for how to do that sent on email to commit list. Closing as duplicate.
[29 Sep 2010 14:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/119437

3253 Roy Lyseng	2010-09-29
      Bug#45174: Incorrectly applied equality propagation caused wrong result
      on a query with a materialized semi-join.
      Bug#50019: Wrong result for IN-query with materialization.
      
      When a subquery is subject to a semijoin optimization, it's tables
      are merged to the outer query and later are treated as regular tables. 
      This allows a bunch of optimizations to be applied, equality
      propagation is among them. Equality propagation is done after query
      execution plan is chosen. It substitutes fields from tables being
      retrieved later for fields from tables being retrieved earlier.
      However, it can't be applied as is to any semijoin table.
      The semijoin materialization strategy differs from other semijoin
      strategies that the data from materialized semijoin tables isn't used
      directly but saved to a temporary table first.
      The materialization isn't isolated in a separate step, it is done
      inline within the nested loop execution.
      When it comes to fetching rows from the first table in the block of
      materialized semijoin tables, sub_select() function is called to
      materialize the result of the subquery and save it in the
      materialized table. Later, data from the materialized table is used
      as they were regular table rows.
      Due to this we can't substitute fields that belong to the semi-join
      for fields from outer query and vice versa. 
      
      Example: suppose we have a join order:
      
        ot1 ot2  SJ-Mat(it1  it2  it3)  ot3
      
      and equality ot2.col = it1.col = it2.col
      If we're looking for best substitute for 'it2.col', we should pick
      it1.col and not ot2.col.
      
      For a field that is not in a materialized semijoin we must pick a field
      that's not embedded in any materialized semijoin.
      
      Example: suppose we have a join order:
      
        SJ-Mat(it1  it2)  ot1  ot2
      
      and equality ot2.col = ot1.col = it2.col
      If we're looking for best substitute for 'ot2.col', we should pick
      ot1.col and not it2.col, because when we run a join between ot1
      and ot2, execution of SJ-Mat(...) has already finished and we can't
      rely on the value of it*.*.
      
      Now we have added another Item_equal::get_first function that accepts
      as a parameter a field being substituted and checks whether it belongs
      to a materialized semijoin.
      The field to substitute will be from the same materialized semijoin nest
      (if supplied field is within such nest), or outside any materialized
      semijoin nest (if supplied field is outside such nest).
      
      The new checks rely on the first_sj_inner_tab and first_sj_inner_tab
      fields of the join-tab. These fields are therefore set as soon as
      possible after the join strategy is fixed.
      
      mysql-test/include/subquery_sj.inc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Added new tests.
      
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_jcl6.result
      mysql-test/r/subquery_sj_all_jcl7.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_jcl6.result
      mysql-test/r/subquery_sj_dupsweed_jcl7.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_jcl6.result
      mysql-test/r/subquery_sj_firstmatch_jcl7.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_jcl6.result
      mysql-test/r/subquery_sj_loosescan_jcl7.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_jcl6.result
      mysql-test/r/subquery_sj_mat_jcl7.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_jcl6.result
      mysql-test/r/subquery_sj_none_jcl7.result
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Results for two new tests added.
        Some tests using semijoin materialization show that where clause
        has moved from the outer query into the materialized inner query.
        This is caused by the changed call to get_first() in
        eliminate_item_equal().
        Ex: select * from ot where a in(select b from it where b>0);
        The clause "b>0" is now evaluated on the inner query materialization.
        Performance-wise this is never worse when using MaterializeScan and
        usually better for MaterializeLookup. For the latter strategy, the
        best possible solution is probably to evaluate the clause in both
        queries, this can be subject for a later feature development.
       
      sql/item.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Calling new get_first() function instead of old.
      
      sql/item_cmpfunc.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first that accepts as parameter
        a field being substituted.
      
      sql/item_cmpfunc.h
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first that accepts as parameter
        a field being substituted.
      
      sql/sql_select.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Setting fields first_sj_inner_tab and last_sj_inner_tab moved from
        setup_semijoin_dups_elimination() to get_best_combination(), so they
        are set as early as possible after join order optimization.
[7 Oct 2010 15:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120276

3260 Roy Lyseng	2010-10-07
      Bug#45174: Incorrectly applied equality propagation caused wrong result
      on a query with a materialized semi-join.
      Bug#50019: Wrong result for IN-query with materialization.
      Bug#52068: Optimizer generates invalid semijoin materialization plan
      
      When a subquery is subject to a semijoin optimization, it's tables
      are merged to the outer query and later are treated as regular tables. 
      This allows a bunch of optimizations to be applied, equality
      propagation is among them. Equality propagation is done after query
      execution plan is chosen. It substitutes fields from tables being
      retrieved later for fields from tables being retrieved earlier.
      However, it can't be applied as is to any semijoin table.
      The semijoin materialization strategy differs from other semijoin
      strategies that the data from materialized semijoin tables isn't used
      directly but saved to a temporary table first.
      The materialization isn't isolated in a separate step, it is done
      inline within the nested loop execution.
      When it comes to fetching rows from the first table in the block of
      materialized semijoin tables, sub_select() function is called to
      materialize the result of the subquery and save it in the
      materialized table. Later, data from the materialized table is used
      as they were regular table rows.
      Due to this we can't substitute fields that belong to the semi-join
      for fields from outer query and vice versa. 
      
      Example: suppose we have a join order:
      
        ot1 ot2  SJ-Mat(it1  it2  it3)  ot3
      
      and equality ot2.col = it1.col = it2.col
      If we're looking for best substitute for 'it2.col', we should pick
      it1.col and not ot2.col.
      
      For a field that is not in a materialized semijoin we can use any field,
      even those that are embedded in a materialized semijoin. This is because
      such fields are "copied back" to their original join-tab structures when
      the materialized temporary table is being read.
      
      Now we have added another Item_equal::get_first() function that accepts
      as a parameter a field being substituted and checks whether it belongs
      to a materialized semijoin.
      The field to substitute will be from the same materialized semijoin nest
      (if supplied field is within such nest), otherwise it will be the first
      field in the multiple equality.
      
      The new checks rely on the first_sj_inner_tab and first_sj_inner_tab
      fields of the join-tab. These fields are therefore set as soon as
      possible after the join strategy is fixed.
      
      Also fixed problem appearing in Bug#52068: When MaterializeScan
      semijoin strategy was used and there were one or more outer dependent
      tables before the semijoin tables, the scan over the materialized
      table was not properly reset for each row of the prefix outer tables.
      
      Also fixed problems with pushdown of SJM-aware predicates during
      make_join_select(): wrong predicates were sometimes generated,
      make_cond_after_sjm() was called at the wrong position, and
      make_cond_after_sjm() was never actually considering the pushed-down
      SJM predicates.
      
      mysql-test/include/subquery_sj.inc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
        Added new tests.
      
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_jcl6.result
      mysql-test/r/subquery_sj_all_jcl7.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_jcl6.result
      mysql-test/r/subquery_sj_dupsweed_jcl7.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_jcl6.result
      mysql-test/r/subquery_sj_firstmatch_jcl7.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_jcl6.result
      mysql-test/r/subquery_sj_loosescan_jcl7.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_jcl6.result
      mysql-test/r/subquery_sj_mat_jcl7.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_jcl6.result
      mysql-test/r/subquery_sj_none_jcl7.result
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
        Results for three new tests added.
        Some tests using semijoin materialization show that where clause
        has moved from the outer query into the materialized inner query.
        This is caused by the changed call to get_first() in
        eliminate_item_equal().
        Ex: select * from ot where a in(select b from it where b>0);
        The clause "b>0" is now evaluated on the inner query materialization.
        Performance-wise this is never worse when using MaterializeScan and
        usually better for MaterializeLookup. For the latter strategy, the
        best possible solution is probably to evaluate the clause in both
        queries, this can be subject for a later feature development.
        Another test that applies the same condition to both the outer and
        the inner query is added, to show the plan for such types of queries.
        Earlier wrong semijoin materialization test results are corrected.
       
      sql/item.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Calling new get_first() function instead of old.
      
      sql/item_cmpfunc.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first() that accepts as argument
        a field being substituted.
      
      sql/item_cmpfunc.h
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first() that accepts as argument
        a field being substituted.
      
      sql/sql_select.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
      
        Setting fields first_sj_inner_tab and last_sj_inner_tab moved from
        setup_semijoin_dups_elimination() to get_best_combination(), so they
        are set as early as possible after join order optimization.
        
        In make_join_select(), the test that determined when to pushdown
        SJM-specific predicates was wrong, in addition to approving the
        comments.
      
        The logic of eliminate_item_equal() has been simplified and
        adjusted so that it generates equalities that are useful also
        when the semijoin materialization strategy is being used.
        Some simplification was possible by taking advantage of the new
        Item_equal::get_first() function.
      
        In sub_select_sjm(), moved code that initializes the scan over the
        materialized table so that it is now performed for each scan of
        table, instead of only for the first scan.
      
        In make_cond_for_table_from_pred(), a number of comments has been
        added, and TAB characters are replaced by spaces.
      
        In make_cond_after_sjm(), make sure that it handles equalities
        generated for semijoin materialization (with marker=3).
        Otherwise, removed marker optimizations for this function, as
        it will only be called once per materialized semijoin nest
        in a query. Added comments and removed TAB characters.
[22 Oct 2010 7:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/121634

3265 Roy Lyseng	2010-10-22
      Bug#45174: Incorrectly applied equality propagation caused wrong result
      on a query with a materialized semi-join.
      Bug#50019: Wrong result for IN-query with materialization.
      Bug#52068: Optimizer generates invalid semijoin materialization plan
      
      When a subquery is subject to a semijoin optimization, its tables
      are merged to the outer query and later are treated as regular tables. 
      This allows a bunch of optimizations to be applied, equality
      propagation is among them. Equality propagation is done after query
      execution plan is chosen. It substitutes fields from tables being
      retrieved later for fields from tables being retrieved earlier.
      However, it can't be applied as is to any semijoin table.
      The semijoin materialization strategy differs from other semijoin
      strategies that the data from materialized semijoin tables isn't used
      directly but saved to a temporary table first.
      The materialization isn't isolated in a separate step, it is done
      inline within the nested loop execution.
      When it comes to fetching rows from the first table in the block of
      materialized semijoin tables, sub_select() function is called to
      materialize the result of the subquery and save it in the
      materialized table. Later, data from the materialized table is used
      as if they were regular table rows.
      Due to this we can't substitute fields that belong to the semi-join
      for fields from outer query and vice versa. 
      
      Example: suppose we have a join order:
      
        ot1 ot2  SJ-Mat(it1  it2  it3)  ot3
      
      and equality ot2.col = it1.col = it2.col
      If we're looking for best substitute for 'it2.col', we should pick
      it1.col and not ot2.col.
      
      For a field that is not in a materialized semijoin we can use any field,
      even those that are embedded in a materialized semijoin. This is because
      such fields are "copied back" to their original join-tab structures when
      the materialized temporary table is being read.
      
      Now we have added another Item_equal::get_first() function that accepts
      as a parameter a field being substituted and checks whether it belongs
      to a materialized semijoin.
      The field to substitute will be from the same materialized semijoin nest
      (if supplied field is within such nest), otherwise it will be the first
      field in the multiple equality.
      
      The new checks rely on the first_sj_inner_tab and first_sj_inner_tab
      fields of the join-tab. These fields are therefore set as soon as
      possible after the join strategy is fixed.
      
      Also fixed problem appearing in Bug#52068: When MaterializeScan
      semijoin strategy was used and there were one or more outer dependent
      tables before the semijoin tables, the scan over the materialized
      table was not properly reset for each row of the prefix outer tables.
      
      Also fixed problems with pushdown of SJM-aware predicates during
      make_join_select(): wrong predicates were sometimes generated,
      make_cond_after_sjm() was called at the wrong position, and
      make_cond_after_sjm() was never actually considering the pushed-down
      SJM predicates.
      
      mysql-test/include/subquery_sj.inc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
        Added new tests.
      
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_jcl6.result
      mysql-test/r/subquery_sj_all_jcl7.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_jcl6.result
      mysql-test/r/subquery_sj_dupsweed_jcl7.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_jcl6.result
      mysql-test/r/subquery_sj_firstmatch_jcl7.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_jcl6.result
      mysql-test/r/subquery_sj_loosescan_jcl7.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_jcl6.result
      mysql-test/r/subquery_sj_mat_jcl7.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_jcl6.result
      mysql-test/r/subquery_sj_none_jcl7.result
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
        Results for three new tests added.
        Some tests using semijoin materialization show that where clause
        has moved from the outer query into the materialized inner query.
        This is caused by the changed call to get_first() in
        eliminate_item_equal().
        Ex: select * from ot where a in(select b from it where b>0);
        The clause "b>0" is now evaluated on the inner query materialization.
        Performance-wise this is never worse when using MaterializeScan and
        usually better for MaterializeLookup. For the latter strategy, the
        best possible solution is probably to evaluate the clause in both
        queries, this can be subject for a later feature development.
        Another test that applies the same condition to both the outer and
        the inner query is added, to show the plan for such types of queries.
        Earlier wrong semijoin materialization test results are corrected.
       
      sql/item.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Calling new get_first() function instead of old.
      
      sql/item_cmpfunc.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first() that accepts as argument
        a field being substituted.
      
      sql/item_cmpfunc.h
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_first() that accepts as argument
        a field being substituted.
      
      sql/sql_select.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Bug#52068: Optimizer generates invalid semijoin materialization plan
      
        Setting fields first_sj_inner_tab and last_sj_inner_tab moved from
        setup_semijoin_dups_elimination() to get_best_combination(), so they
        are set as early as possible after join order optimization.
        
        In make_join_select(), the test that determined when to pushdown
        SJM-specific predicates was wrong, in addition to improving the
        comments.
      
        The logic of eliminate_item_equal() has been simplified and
        adjusted so that it generates equalities that are useful also
        when the semijoin materialization strategy is being used.
        Some simplification was possible by taking advantage of the new
        Item_equal::get_first() function.
      
        In sub_select_sjm(), moved code that initializes the scan over the
        materialized table so that it is now performed for each scan of
        table, instead of only for the first scan.
      
        In make_cond_for_table_from_pred(), a number of comments has been
        added, and TAB characters are replaced by spaces.
      
        In make_cond_after_sjm(), make sure that it handles equalities
        generated for semijoin materialization (with marker=3).
        Added comments and removed TAB characters.
[25 Oct 2010 9:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/121779

3269 Roy Lyseng	2010-10-25
      Bug#45174: Incorrectly applied equality propagation caused wrong result
      on a query with a materialized semi-join.
      Bug#50019: Wrong result for IN-query with materialization.
      
      When a subquery is subject to a semijoin transformation, its tables
      are merged to the outer query and are later treated as regular tables. 
      One possible optimization to apply after semijoin transformation is
      equality propagation.
      Equality propagation is done after query execution plan is chosen.
      It substitutes fields from tables being retrieved later for fields from
      tables being retrieved earlier.
      However, equality propagation can't be applied as is when using the
      semijoin materialization strategy. In this case, data from
      materialized semijoin tables are saved to a temporary table first
      before being involved in the join.
      
      This has two consequences for equality propagation:
      1. Equalities that exist within the materialized tables must be processed
         within the materialization operation, and
      2. Equalities between the materialized table and the outer query must be
         processed like regular equalities.
      
      Example: suppose we have a join order:
      
        ot1 ot2  SJ-Mat(it1  it2  it3)  ot3
      
      and equality ot2.col = it1.col = it2.col
      If we're looking for best substitute for it2.col, we should pick
      it1.col and not ot2.col, because the equality it1.col=it2.col
      can be processed during the materialization.
      
      For a field that is not in a materialized semijoin we can use any field,
      even those that are embedded in a materialized semijoin. This is because
      such fields are "copied back" to their original join-tab structures when
      the materialized temporary table is being read.
      
      Now we have added a bew function Item_equal::get_subst_item() that accepts
      as a parameter a field being substituted and checks whether it belongs
      to a materialized semijoin.
      The field to substitute will be from the same materialized semijoin nest
      (if supplied field is within such nest), otherwise it will be the first
      field in the multiple equality.
      
      The new checks rely on the first_sj_inner_tab and first_sj_inner_tab
      fields of the join-tab. These fields are therefore set as soon as
      possible after the join strategy is fixed (before they were only used
      by the DuplicateWeedout strategy, and were not needed until later).
      
      Also fixed problems with pushdown of SJM-aware predicates during
      make_join_select():
       - Wrong predicates were sometimes generated,
       - make_cond_after_sjm() was called at the wrong position
       - make_cond_after_sjm() was never actually considering the pushed-down
         SJM predicates.
      
      mysql-test/include/subquery_sj.inc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
      
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_jcl6.result
      mysql-test/r/subquery_sj_all_jcl7.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_jcl6.result
      mysql-test/r/subquery_sj_dupsweed_jcl7.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_jcl6.result
      mysql-test/r/subquery_sj_firstmatch_jcl7.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_jcl6.result
      mysql-test/r/subquery_sj_loosescan_jcl7.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_jcl6.result
      mysql-test/r/subquery_sj_mat_jcl7.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_jcl6.result
      mysql-test/r/subquery_sj_none_jcl7.result
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Results for three new tests added.
        Some tests using semijoin materialization show that where clause
        has moved from the outer query into the materialized inner query.
        This is caused by the changed call to get_subst_item() in
        eliminate_item_equal().
        Ex: select * from ot where a in(select b from it where b>0);
        The clause "b>0" is now evaluated on the inner query materialization.
        Performance-wise this is never worse than before when using
        MaterializeScan and usually better than before for MaterializeLookup.
        (For MaterializeLookup, it is worse when it is more efficient to move
         the clause to the outer query. The best possible solution for this
         case is probably to evaluate the clause in both queries, this can
         be subject for a later feature development.)
        Another test that applies the same condition to both the outer and
        the inner query is added, to show the plan for such types of queries.
        Earlier wrong semijoin materialization test results are corrected.
       
      sql/item.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Calling new get_subst_item() function instead of get_first().
      
      sql/item_cmpfunc.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_subst_item() that accepts as argument
        a field being substituted.
      
      sql/item_cmpfunc.h
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_subst_item() that accepts as argument
        a field being substituted.
      
      sql/sql_select.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
      
        Setting fields first_sj_inner_tab and last_sj_inner_tab moved from
        setup_semijoin_dups_elimination() to get_best_combination(), so they
        are set as early as possible after join order optimization.
        
        In make_join_select(), the test that determined when to pushdown
        SJM-specific predicates was wrong, in addition to improving the
        comments.
      
        The logic of eliminate_item_equal() has been simplified and
        adjusted so that it generates equalities that are useful also
        when the semijoin materialization strategy is being used.
        Some simplification was possible by taking advantage of the new
        Item_equal::get_subst_item() function.
      
        In make_cond_for_table_from_pred(), a number of comments has been
        added, and TAB characters are replaced by spaces.
      
        In make_cond_after_sjm(), make sure that it handles equalities
        generated for semijoin materialization (with marker=3).
        Added comments and removed TAB characters.
[26 Oct 2010 10:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/121911

3269 Roy Lyseng	2010-10-26
      on a query with a materialized semi-join.
      Bug#50019: Wrong result for IN-query with materialization.
      
      When a subquery is subject to a semijoin transformation, its tables
      are merged to the outer query and are later treated as regular tables. 
      One possible optimization to apply after semijoin transformation is
      equality propagation.
      Equality propagation is done after query execution plan is chosen.
      It substitutes fields from tables being retrieved later for fields from
      tables being retrieved earlier.
      However, equality propagation can't be applied as is when using the
      semijoin materialization strategy. In this case, data from
      materialized semijoin tables are saved to a temporary table first
      before being involved in the join.
      
      This has two consequences for equality propagation:
      1. Equalities that exist within the materialized tables must be processed
         within the materialization operation, and
      2. Equalities between the materialized table and the outer query must be
         processed like regular equalities.
      
      Example: suppose we have a join order:
      
        ot1 ot2  SJ-Mat(it1  it2  it3)  ot3
      
      and equality ot2.col = it1.col = it2.col
      If we're looking for best substitute for it2.col, we should pick
      it1.col and not ot2.col, because the equality it1.col=it2.col
      can be processed during the materialization.
      
      For a field that is not in a materialized semijoin we can use any field,
      even those that are embedded in a materialized semijoin. This is because
      such fields are "copied back" to their original join-tab structures when
      the materialized temporary table is being read.
      
      Now we have added a new function Item_equal::get_subst_item() that accepts
      as a parameter a field being substituted and checks whether it belongs
      to a materialized semijoin.
      The field to substitute will be from the same materialized semijoin nest
      (if supplied field is within such nest), otherwise it will be the first
      field in the multiple equality.
      
      The new checks rely on the first_sj_inner_tab and first_sj_inner_tab
      fields of the join-tab. These fields are therefore set as soon as
      possible after the join strategy is fixed (before they were only used
      by the DuplicateWeedout strategy, and were not needed until later).
      
      Also fixed problems with pushdown of SJM-aware predicates during
      make_join_select():
       - Wrong predicates were sometimes generated,
       - make_cond_after_sjm() was called at the wrong position in the join
         sequence.
       - make_cond_after_sjm() was never actually considering the pushed-down
         SJM predicates.
      
      mysql-test/include/subquery_sj.inc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
      
      mysql-test/r/subquery_mat_all.result
      mysql-test/r/subquery_sj_all.result
      mysql-test/r/subquery_sj_all_jcl6.result
      mysql-test/r/subquery_sj_all_jcl7.result
      mysql-test/r/subquery_sj_dupsweed.result
      mysql-test/r/subquery_sj_dupsweed_jcl6.result
      mysql-test/r/subquery_sj_dupsweed_jcl7.result
      mysql-test/r/subquery_sj_firstmatch.result
      mysql-test/r/subquery_sj_firstmatch_jcl6.result
      mysql-test/r/subquery_sj_firstmatch_jcl7.result
      mysql-test/r/subquery_sj_loosescan.result
      mysql-test/r/subquery_sj_loosescan_jcl6.result
      mysql-test/r/subquery_sj_loosescan_jcl7.result
      mysql-test/r/subquery_sj_mat.result
      mysql-test/r/subquery_sj_mat_jcl6.result
      mysql-test/r/subquery_sj_mat_jcl7.result
      mysql-test/r/subquery_sj_mat_nosj.result
      mysql-test/r/subquery_sj_none.result
      mysql-test/r/subquery_sj_none_jcl6.result
      mysql-test/r/subquery_sj_none_jcl7.result
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
        Results for three new tests added.
        Some tests using semijoin materialization show that where clause
        has moved from the outer query into the materialized inner query.
        This is caused by the changed call to get_subst_item() in
        eliminate_item_equal().
        Ex: select * from ot where a in(select b from it where b>0);
        The clause "b>0" is now evaluated on the inner query materialization.
        Performance-wise this is never worse than before when using
        MaterializeScan and usually better than before for MaterializeLookup.
        (For MaterializeLookup, it is worse when it is more efficient to move
         the clause to the outer query. The best possible solution for this
         case is probably to evaluate the clause in both queries, this can
         be subject for a later feature development.)
        Another test that applies the same condition to both the outer and
        the inner query is added, to show the plan for such types of queries.
        Earlier wrong semijoin materialization test results are corrected.
       
      sql/item.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        Calling new get_subst_item() function instead of get_first().
      
      sql/item_cmpfunc.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_subst_item() that accepts as argument
        a field being substituted.
      
      sql/item_cmpfunc.h
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
      
        New function Item_equal::get_subst_item() that accepts as argument
        a field being substituted.
      
      sql/sql_select.cc
        Bug#45174: Incorrectly applied equality propagation caused wrong result
        on a query with a materialized semi-join.
        Bug#50019: Wrong result for IN-query with materialization.
      
        Setting fields first_sj_inner_tab and last_sj_inner_tab moved from
        setup_semijoin_dups_elimination() to get_best_combination(), so they
        are set as early as possible after join order optimization.
        
        In make_join_select(), the test that determined when to pushdown
        SJM-specific predicates was wrong, in addition to improving the
        comments.
      
        The logic of eliminate_item_equal() has been simplified and
        adjusted so that it generates equalities that are useful also
        when the semijoin materialization strategy is being used.
        Some simplification was possible by taking advantage of the new
        Item_equal::get_subst_item() function.
      
        In make_cond_for_table_from_pred(), a number of comments has been
        added, and TAB characters are replaced by spaces.
      
        In make_cond_after_sjm(), make sure that it handles equalities
        generated for semijoin materialization (with marker=3).
        Added comments and removed TAB characters.
[13 Nov 2010 16:27] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (merge vers: 5.6.99-m5) (pib:21)
[13 Nov 2010 16:42] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:alexander.nozdrin@oracle.com-20101113152540-gxro4g0v29l27f5x) (pib:21)