Bug #78946 WRONG RESULT FOR SELECT NULL IN (<SUBQUERY>)
Submitted: 23 Oct 2015 14:40 Modified: 23 Nov 2015 16:51
Reporter: Dag Wanvik Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.7.8 OS:Any
Assigned to: CPU Architecture:Any

[23 Oct 2015 14:40] Dag Wanvik
Description:
"SELECT NULL IN (subquery)" returns 0 instead of NULL. This is apparently a regression, since 5.6 returns NULL as expected.

How to repeat:
CREATE TABLE t1(a MEDIUMTEXT CHARSET UTF8) ENGINE=INNODB;
INSERT INTO t1 VALUES(0x6F76786E7875);
SELECT NULL IN (SELECT 1 FROM t1);

The column type is not significant; the example popped up when working on 21922202, hence the particular type drawn from that issue's repro.
[27 Oct 2015 19:39] Dag Wanvik
Posted by developer:
 
A somewhat improved analysis:

This is a regression introduced in commit c5accd92e6fef WL#6369: EXPLAIN for
other thread.

The reason is as follows:

Optimalization now happens *before* [1] Item_in_optimizer:val_int can impact the result [by turning off and on condition guards], cf this section

      /*
        Turn off the predicates that are based on column compares for
        which the left part is currently NULL
      */
      for (uint i= 0; i < ncols; i++)
      {
        if (cache->element_index(i)->null_value)
          item_subs->set_cond_guard_var(i, FALSE);  <--- impacts optimization
        :
      }

      :
      else
      {
        /* The subquery has to be evaluated */
        (void) item_subs->val_bool_result();    <----- optimized here before

Since we have a NULL, the condition should not be triggered, cf. explanation for OUTER_FIELD_IS_NOT_NULL:

    /**
       In IN->EXISTS subquery transformation, new predicates are
       added: WHERE inner_field=outer_field OR inner_field IS NULL, as
       well as HAVING inner_field IS NOT NULL, are disabled *IF
       OUTER_FIELD IS A NULL VALUE* (emphasized by me)   
    */

The fallout is that the call to Item_func_trig_cond::val_int from
make_join_select:

       longlong val_int() { return *trig_var ? args[0]->val_int() : 1; }

sees *trig_var as true, returns 0, which makes make_join_select believe it is not constant:

  if (const_cond != NULL)
  {
    const bool const_cond_result= const_cond->val_int() != 0;
    :
    if (!const_cond_result)
      DBUG_RETURN(true);

The next piece of the puzzle is that if make_join_select returns 1, as we saw above, JOIN::optimize does:

  zero_result_cause=
      "Impossible WHERE noticed after reading const tables";

This in turn leads to wrong result at execution time.

At execution time this an early return from JOIN::exec, so the calling Item_in_subselect::val_bool doesn't set null_value to true (in part) because value didn't get set:

  bool Item_in_subselect::val_bool()
  {
    DBUG_ASSERT(fixed == 1);
    if (exec())
    {
      reset();
      return 0;
    }
    if (was_null && !value)   <------| value false, so null_value == FALSE
      null_value= TRUE;              | but was_null is false anyway, so
    return value;                    | responsibility falls to caller...
  }

The setting of null_value happened at the caller site here in Item_in_optimizer::val_int :

  /* The subquery has to be evaluated */
  (void) item_subs->val_bool_result();
  if (!item_subs->value)      <------ this used to be true, so we always set
                                      null_value                    /
    null_value= item_subs->null_value;                             /    
  else                                                            / 
    null_value= TRUE;         <----------------------------------/ 

but due to the early cut-off in execution, item_subs->value isn't true and we enter branch one, rather than two and get 0 rather than NULL.

[1] The call to make_join_select now happes from this stack scenario:

#1 make_join_select  .../sql_optimizer.cc:7578
#2 JOIN::optimize  .../sql_optimizer.cc:509
#3 mysql_prepare_and_optimize_select  .../sql_select.cc:1106
#4 mysql_union_prepare_and_optimize  .../sql_union.cc:125
#5 mysql_optimize_prepared_inner_units  .../sql_union.cc:57
#6 mysql_select  .../sql_select.cc:1166
#7 handle_select  .../sql_select.cc:114
#8 execute_sqlcom_select  .../sql_parse.cc:4885

(The method mysql_optimize_prepared_inner_units is new with the regression commit.)

instead of earlier, as part of the Item_in_subselect::exec:

#1 make_join_select  .../sql_optimizer.cc:7568
#2 JOIN::optimize  .../sql_optimizer.cc:514
#3 subselect_single_select_engine::exec  .../item_subselect.cc:2724
#4 Item_subselect::exec  .../item_subselect.cc:641
#5 Item_in_subselect::exec  .../item_subselect.cc:766
#6 Item_in_subselect::val_bool  .../item_subselect.cc:1399
#7 Item::val_bool_result  .../item.h:1199
#8 Item_in_optimizer::val_int  .../item_cmpfunc.cc:2084
#9 Item::send  .../item.cc:6811
#10 Protocol::send_result_set_row  .../protocol.cc:847
#11 select_send::send_data  .../sql_class.cc:2479
#12 JOIN::exec  .../sql_executor.cc:148
#13 mysql_execute_select  .../sql_select.cc:1104
#14 mysql_select  .../sql_select.cc:1225
#15 handle_select  .../sql_select.cc:111
#16 execute_sqlcom_select  .../sql_parse.cc:4843
[28 Oct 2015 19:58] Dag Wanvik
Posted by developer:
 
The last comment lost its stack lines, trying again:
[1] The call to make_join_select now happens from this stack scenario:

#1 make_join_select  .../sql_optimizer.cc:7578
#2 JOIN::optimize  .../sql_optimizer.cc:509
#3 mysql_prepare_and_optimize_select  .../sql_select.cc:1106
#4 mysql_union_prepare_and_optimize  .../sql_union.cc:125
#5 mysql_optimize_prepared_inner_units  .../sql_union.cc:57
#6 mysql_select  .../sql_select.cc:1166
#7 handle_select  .../sql_select.cc:114
#8 execute_sqlcom_select  .../sql_parse.cc:4885

(The method mysql_optimize_prepared_inner_units is new with the regression commit.)

instead of earlier, as part of the Item_in_subselect::exec:

#1 make_join_select  .../sql_optimizer.cc:7568
#2 JOIN::optimize  .../sql_optimizer.cc:514
#3 subselect_single_select_engine::exec  .../item_subselect.cc:2724
#4 Item_subselect::exec  .../item_subselect.cc:641
#5 Item_in_subselect::exec  .../item_subselect.cc:766
#6 Item_in_subselect::val_bool  .../item_subselect.cc:1399
#7 Item::val_bool_result  .../item.h:1199
#8 Item_in_optimizer::val_int  .../item_cmpfunc.cc:2084
#9 Item::send  .../item.cc:6811
#10 Protocol::send_result_set_row  .../protocol.cc:847
#11 select_send::send_data  .../sql_class.cc:2479
#12 JOIN::exec  .../sql_executor.cc:148
#13 mysql_execute_select  .../sql_select.cc:1104
#14 mysql_select  .../sql_select.cc:1225
#15 handle_select  .../sql_select.cc:111
#16 execute_sqlcom_select  .../sql_parse.cc:4843
[23 Nov 2015 16:51] Paul DuBois
Noted in 5.7.11, 5.8.0 changelogs.

IN-to-EXISTS subquery transformation could cause SELECT NULL IN
(subquery) to return 0 rather than NULL.