Bug #103997 Mis-calculate SJM_POS's prefix_rowcount/prefix_costs
Submitted: 14 Jun 2021 3:24 Modified: 15 Jun 2021 12:58
Reporter: henry liang Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S5 (Performance)
Version:8.0 OS:Any
Assigned to: CPU Architecture:Any

[14 Jun 2021 3:24] henry liang
Description:
Example SQL:
SQL example :
    explain select *
    from partsupp
    where ps_suppkey not in (
      select
      s_suppkey
      from
      part, supplier
      where part.P_PARTKEY = supplier.S_SUPPKEY
    );

In get_best_combination(), MySQL rearranges join->join_tabs and add one additional sjm JOIN_TAB for each materialized semi-join nest.
The additional sjm JOIN_TAB is allocated from one temporary JOIN_TAB array and the POSITION object corresponding to it is the subsequent object in JOIN::best_positions().
    
In the above example, it looks like:
join->best_positions
pos[0]    (pos[1]       pos[2])    pos[3] <== sjm_pos
join->best_ref
partsupp   supplier     part       sjm    <== from tmp_join_tab array
    
Then it puts the temporary sjm JOIN_TAB at join->best_ref[first_inner], and calls setup_semijoin_materialized_table() to setup the information for sjm JOIN_TAB and POSITION objects.

In setup_semijoin_materialized_table(), it will recalculate sjm_pos's prefix_rowcount/prefix_costs using sj-nest's previous POSITION, but it mis-use the (sjm JOIN_TAB - join->join_tab) as the index of sjm table, since they come from different join_tab array, the index is not valid.
    
Also when calculating cost, it uses (sjm_pos - 1) as the previus rows/costinformation, but as the above illustration, sjm_pos - 1 is pos[2], instead it should use pos[0] since the actual join happens between partsupp and sjm.

How to repeat:
Executing query using tpch schema:

henry@localhost : tpch_s10 11:15:09> set optimizer_switch="materialization=on,semijoin=on,loosescan=off,firstmatch=off,duplicateweedout=off";

henry@localhost : tpch_s10 11:16:02> explain select * from partsupp
    where ps_suppkey not in (
      select
      s_suppkey
      from
      part, supplier
      where part.P_PARTKEY = supplier.S_SUPPKEY
    );

Suggested fix:
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -3076,8 +3076,29 @@ bool JOIN::setup_semijoin_materialized_table(JOIN_TAB *tab, uint tableno,
     sjm_pos->rows_fetched = static_cast<double>(tab->records());
     tab->set_type(JT_ALL);
   }
-  sjm_pos->set_prefix_join_cost((tab - join_tab), cost_model());
-
+  /*
+    Since this function is used to setup correct join cost for sjm POSITION,
+    we should use the correct join_tab's POSITION preceding sjm join_tab,
+    we can use "outer_target" index during JOIN::join_tab rearrangement,
+    which points to the current sjm join_tab in JOIN::best_ref, for detail
+    of rearrangement algorithm, @see get_best_combination().
+
+    This logic is only appliable in during MySQL's serial optimization phase,
+    in pq context, we should not reach here.
+  */
+  DBUG_ASSERT(inner_pos);
+  uint outer_idx = materialized_tab->idx();
+  if (outer_idx == 0) {
+    prefix_rowcount = sjm_pos->rows_fetched;
+    prefix_cost =
+        sjm_pos->read_cost + cost_model()->row_evaluate_cost(prefix_rowcount);
+  } else {
+    POSITION *prev_pos = best_ref[outer_idx - 1]->position();
+    prefix_rowcount = prev_pos->prefix_rowcount * sjm_pos->rows_fetched;
+    prefix_cost = prev_pos->prefix_cost + sjm_pos->read_cost +
+                  cost_model()->row_evaluate_cost(prefix_rowcount);
+  }
+  sjm_pos->set_prefix_cost(prefix_cost, prefix_rowcount);
   return false;
 }
[14 Jun 2021 12:56] MySQL Verification Team
Hi Mr. liang,

Thank you for your interesting bug report and for your patch.

We need to be able to repeat the behaviour, so we would like to have table and data. Can you please, let us know how can we obtain all we need. Is there an URL for that particular test case and how big is it ???

Also, you are using the ancient release of our 8.0 version, so please, let us know if 8.0.25 has a solution for this problem. Those two lines have not changed, but that is not enough for us to conclude that everything remained the same.

Thank you, in advance.
[15 Jun 2021 6:45] henry liang
I found this issue on my local TPC-H schema, but it is a common problem, I verified using the newest 8.0.24 version and it is still there. For example, I used this simple case:
create table t1 (id bigint(11), c1 bigint(11), c2 bigint(11)); 
create table t2 (id bigint(11), c1 bigint(11), c2 bigint(11)); 
create table t3 (id bigint(11), c1 bigint(11), c2 bigint(11));

insert into t1 values(1,2,3),(1,2,3),(1,2,3),(1,2,3); 
insert into t2 select * from t1;
insert into t3 select * from t1;

set optimizer_switch="materialization=on,semijoin=on,loosescan=off,firstmatch=off,duplicateweedout=off";

explain select * from t1 where t1.c1 in (select t2.c2 from t2,t3 where t2.c2 = t2.c2);

In the above example, the plan is like this:
| -> Inner hash join (t1.c1 = `<subquery2>`.c2)  (cost=31.56 rows=25)
    -> Table scan on t1  (cost=0.35 rows=9)
    -> Hash
        -> Table scan on <subquery2>  (cost=0.11..2.81 rows=25)
            -> Materialize with deduplication  (cost=6.11..8.81 rows=25)
                -> Filter: (t2.c2 is not null)  (cost=3.50 rows=25)
                    -> Inner hash join (no condition)  (cost=3.50 rows=25)
                        -> Table scan on t3  (cost=0.15 rows=5)
                        -> Hash
                            -> Table scan on t2  (cost=0.75 rows=5)

So SJM is the first join table, but in JOIN::setup_semijoin_materialized_table(), sjm's index was calculated using "(tab - join_tab)", which is not correct since tab is allocated from a temporary JOIN_TAB array, in the above example, (tab - join_tab) == 24, but the corrent idx should be 0.
[15 Jun 2021 12:58] MySQL Verification Team
Hi Mr. liang,

Thank you for your contribution.

Verified as reported.