Bug #93531 mysql bug
Submitted: 9 Dec 2018 18:42 Modified: 12 Dec 2018 15:41
Reporter: baogang hou Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server Severity:S6 (Debug Builds)
Version:8.0.13 OS:CentOS
Assigned to: CPU Architecture:Any

[9 Dec 2018 18:42] baogang hou
Description:
I find a bug in the 8.0.13 , it create_temp_file faild,it will crash,I analyzed the code.it create_temp_file faild,and it do not make thd->is_error() true ,the code logic as follows, create_temp_file->mysql_tmpfile_path->mysql_tmpfile->generate_partition_syntax->generate_partition_syntax->fill_partitioning_from_dd->open_table_def->get_table_share->get_table_share_with_discover->open_table->open_and_process_table->open_tables->open_tables_for_query->Sql_cmd_dml::prepare->Sql_cmd_dml::execute(sql_select.cc:614)

and I found that the whole process did not make thd->is_error() true 。

I think it shoud make thd->is_error() true in the right place。

as sql_base.cc:5007

  if (error) {
    if (!ot_ctx->can_recover_from_failed_open() && safe_to_ignore_table) {
      DBUG_PRINT("info", ("open_table: ignoring table '%s'.'%s'", tables->db,
                          tables->alias));
      error = false;
    }
else
{
if (!thd->get_stmt_da()->is_error()) {
        // Copy the exception condition information.
        thd->get_stmt_da()->set_error_status(mysql_errno,message_text,returned_sqlstate);//
      }
}
    goto end;
  }

How to repeat:
There is something wrong with the code logic, so you don't need to reproduce it. Look at the code logic, you can see it.

Suggested fix:
 think it shoud make thd->is_error() true in the right place。

as sql_base.cc:5007

  if (error) {
    if (!ot_ctx->can_recover_from_failed_open() && safe_to_ignore_table) {
      DBUG_PRINT("info", ("open_table: ignoring table '%s'.'%s'", tables->db,
                          tables->alias));
      error = false;
    }
else
{
if (!thd->get_stmt_da()->is_error()) {
        // Copy the exception condition information.
        thd->get_stmt_da()->set_error_status(mysql_errno,message_text,returned_sqlstate);//
      }
}
    goto end;
  }
[9 Dec 2018 18:53] baogang hou
and Where the crash occurred is Sql_cmd_dml::execute(sql_select.cc:641) 
 ->  DBUG_ASSERT(thd->is_error() || thd->killed);
[10 Dec 2018 4:03] baogang hou
and the other bug, I find create_temp_file return the file which is 0;which may be right,but the code logic  create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(return null)->fill_partitioning_from_dd(return ture)->open_table_def()(dd_table_share.cc:2236 error =true and return true.

the code logic is bug.
[10 Dec 2018 13:34] Sinisa Milivojevic
Hi,

Thank you for your bug report.

What you report is a bug, but it is not clear from your text where the bug truly is.

First of all, the code analysis part and "Suggested fix" are 100 % the same, hence I truly do not see the difference.

Next, you claim that when a function create_temp_file() fails, then THD::error should be raised. You should be specific on where exactly in our code does it happen.

Last, but not least, the handle returned can not be zero nor NULL. But, you have not presented where exactly in our code is that value ignored.

We are awaiting your answers on all our questions ......
[10 Dec 2018 16:09] baogang hou
then I reorganized the logic code with you。 in the create_temp_file,
1)mf_tempfile.cc:149  
if (file >= 0) {
    mysql_mutex_lock(&THR_LOCK_open);
    my_tmp_file_created++;
    mysql_mutex_unlock(&THR_LOCK_open);
  }
 DBUG_RETURN(file);
 
 
This indicates the file can be zero.and I did encounter the file is zero ,during the debugging process

2)and the code logic the code logic  create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(return null)->fill_partitioning_from_dd(return ture)->open_table_def()(dd_table_share.cc:2236 error =true and return true.)->get_table_share(open_table_err = true and DBUG_RETURN(NULL);)->get_table_share_with_discover(DBUG_RETURN(NULL);)->open_table(sql_base.cc:3176 DBUG_RETURN(true);)->open_and_process_table(sql_base.cc:5004 error=ture ->sql_base.cc:5013 goto end;->sql_base.cc:5122 DBUG_RETURN(error);)->open_tables(sql_base.cc:5122 error=ture->sql_base.cc:5683 goto err;->sql_base.cc:5881 DBUG_RETURN(error);)->open_tables_for_query(sql_base.cc:6526 goto end;->sql_base.cc:6545 DBUG_RETURN(true);)->Sql_cmd_dml::prepare(sql_select.cc:390 DBUG_RETURN(true);)->Sql_cmd_dml::execute(sql_select.cc:533 goto err;->sql_select.cc:641,DBUG_ASSERT(thd->is_error() || thd->killed);)
then it hang up. and thd->is_error()  is false.and from the code logic starts to ends,thd->is_error() is always false.
This is what I call the first bug.

3)as you say the handle returned can not be zero nor NULL. the fellow code,why The condition of judgement is 	file >= 0

mf_tempfile.cc:149 
if (file >= 0) {
    mysql_mutex_lock(&THR_LOCK_open);
    my_tmp_file_created++;
    mysql_mutex_unlock(&THR_LOCK_open);
  }
 DBUG_RETURN(file);
 
 and mf_tempfile.cc:141 
 
    if (org_file >= 0 && file < 0) {
      int tmp = my_errno();
      close(org_file);
      (void)my_delete(to, MYF(MY_WME));
      set_my_errno(tmp);
    }
	
	why The condition of judgement is 	org_file >= 0 && file < 0
and suppose the file may be zero, There's a problem with the logic code  above create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(return null)->fill_partitioning_from_dd(return ture)
	
So no matter the file returned can be  zero or not,the Code always has problems。

if the file returned can be  zero,it Caused the latter error,create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(return null)->fill_partitioning_from_dd(return ture)

if the file returned can not be  zero,as the code
mf_tempfile.cc:149 
if (file >= 0) {
    mysql_mutex_lock(&THR_LOCK_open);
    my_tmp_file_created++;
    mysql_mutex_unlock(&THR_LOCK_open);
  }
 DBUG_RETURN(file);
 
 and mf_tempfile.cc:141 
 
    if (org_file >= 0 && file < 0) {
      int tmp = my_errno();
      close(org_file);
      (void)my_delete(to, MYF(MY_WME));
      set_my_errno(tmp);
    }
	
	The condition of judgement is error  。This is what I call the second bug.
[10 Dec 2018 16:48] Sinisa Milivojevic
Actually, your analysis is wrong.

Because an instance of the class `File` is returned from the function that can return 0. It is not a handle it is a value returned from the functions that registers file names, so it's value can be zero.

There is a necessary error checking in the following line:

 if (org_file >= 0 && file < 0) {
.....
}

Hence, when the error occurs, `file` is negative.

So far, I can not spot a bug ....
[11 Dec 2018 1:24] baogang hou
please check out all my instructions,These are not only my analysis, but also the problems I encounter. They are positioned step by step according to the added information.

and I know the ‘file' can be 0.  I was misled by what you said(the handle returned can not be zero nor NULL)。

[New Thread 0x7efbb466a700 (LWP 16777)]
[New Thread 0x7efbb41c7700 (LWP 16778)]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7efbb41c7700 (LWP 16778)]
0x00007efcce6401f7 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007efcce6401f7 in raise () from /lib64/libc.so.6
#1  0x00007efcce6418e8 in abort () from /lib64/libc.so.6
#2  0x00007efcce639266 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007efcce639312 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000002d5f99b in Sql_cmd_dml::execute (this=0x7efab800e968, thd=0x7efab8000be0) at /data/taurus2.0/taurus-root/src/sql/sql/sql_select.cc:620
#5  0x0000000002d0402f in mysql_execute_command (thd=0x7efab8000be0, first_level=true) at /data/taurus2.0/taurus-root/src/sql/sql/sql_parse.cc:4311
#6  0x0000000002d06274 in mysql_parse (thd=0x7efab8000be0, parser_state=0x7efbb41c6350) at /data/taurus2.0/taurus-root/src/sql/sql/sql_parse.cc:5069
#7  0x0000000002cfc5a9 in dispatch_command (thd=0x7efab8000be0, com_data=0x7efbb41c6cc0, command=COM_QUERY)
    at /data/taurus2.0/taurus-root/src/sql/sql/sql_parse.cc:1615
#8  0x0000000002cfaf1f in do_command (thd=0x7efab8000be0) at /data/taurus2.0/taurus-root/src/sql/sql/sql_parse.cc:1240
#9  0x0000000002e884b4 in handle_connection (arg=0x85e3ca0) at /data/taurus2.0/taurus-root/src/sql/sql/conn_handler/connection_handler_per_thread.cc:308
#10 0x00000000041b3d66 in pfs_spawn_thread (arg=0x85f0a50) at /data/taurus2.0/taurus-root/src/sql/storage/perfschema/pfs.cc:2836
#11 0x00007efcd03e0e25 in start_thread () from /lib64/libpthread.so.0
#12 0x00007efcce70334d in clone () from /lib64/libc.so.6
(gdb) f 4
#4  0x0000000002d5f99b in Sql_cmd_dml::execute (this=0x7efab800e968, thd=0x7efab8000be0) at /data/taurus2.0/taurus-root/src/sql/sql/sql_select.cc:620
620       DBUG_ASSERT(thd->is_error() || thd->killed);
(gdb) p thd->is_error()
$1 = false
(gdb) p thd->killed
$2 = {_M_i = THD::NOT_KILLED}
(gdb)

and I know the ‘file' can be 0.  I was misled by what you said(the handle returned can not be zero nor NULL)。

1)why it hang up at sql_select.cc:641 DBUG_ASSERT(thd->is_error() || thd->killed); because 
620       DBUG_ASSERT(thd->is_error() || thd->killed);
(gdb) p thd->is_error()
$1 = false
(gdb) p thd->killed
$2 = {_M_i = THD::NOT_KILLED}
(gdb)

and I added print information to locate the problem step by step. I find create_temp_file(return 0)。 I know the ‘file' can be 0 。it is ok。but why  it hang up at DBUG_ASSERT(thd->is_error() || thd->killed);
and  the code logic  create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(return null)->fill_partitioning_from_dd(return ture)->open_table_def()(dd_table_share.cc:2236 error =true and return true.)->get_table_share(open_table_err = true and DBUG_RETURN(NULL);)->get_table_share_with_discover(DBUG_RETURN(NULL);)->open_table(sql_base.cc:3176 DBUG_RETURN(true);)->open_and_process_table(sql_base.cc:5004 error=ture ->sql_base.cc:5013 goto end;->sql_base.cc:5122 DBUG_RETURN(error);)->open_tables(sql_base.cc:5122 error=ture->sql_base.cc:5683 goto err;->sql_base.cc:5881 DBUG_RETURN(error);)->open_tables_for_query(sql_base.cc:6526 goto end;->sql_base.cc:6545 DBUG_RETURN(true);)->Sql_cmd_dml::prepare(sql_select.cc:390 DBUG_RETURN(true);)->Sql_cmd_dml::execute(sql_select.cc:533 goto err;->sql_select.cc:641,DBUG_ASSERT(thd->is_error() || thd->killed);)
then it hang up. and thd->is_error()  is false.and from the code logic starts to ends,thd->is_error() is always false.
This is what I call the first bug.

please explain the problem。
[11 Dec 2018 14:08] Sinisa Milivojevic
Hi,

I have examined your report and it turns out that you have hit an assert that only happens in debug binaries, not production binaries.

That is OK, because it only reduces the severity of the problem.  However, we have to know two facts, in order to proceed with the verification. First, which of those two conditions is true:

* thd->is_error() 

* thd->killed

Also, we need to know how we can reproduce this scenario.  Hence, a test case.

If you are in difficulty to provide a test case, you should describe to us EXACTLY how can this situation occur, so that we can repeat it.
[11 Dec 2018 14:55] baogang hou
This problem is difficult to recur,You can derive it from the logic code I provided,when create_temp_file return 0.
[11 Dec 2018 15:07] Sinisa Milivojevic
Hi,

Sorry, the function that you mention did not return the negative value.

Hence, you have not presented a clear case of the error in our code.

Not a bug.
[11 Dec 2018 16:38] Sinisa Milivojevic
Hi,

It is quite unclear as to what are you trying to explain.

Hence, please provide an analysis of the code that will clearly point out to the mistake in the code, that is causing a bug.
[12 Dec 2018 3:07] baogang hou
I describe in this way,

if in mf_tempfile.cc:154  file = 0; create_temp_file DBUG_RETURN(file); 
1)create_temp_file(return 0)->mysql_tmpfile_path(return 0)->mysql_tmpfile(return 0)->generate_partition_syntax(sql_partition.cc:2331 return null)->fill_partitioning_from_dd(return ture)->open_table_def()(dd_table_share.cc:2236 error =true and return true.)

and generate_partition_syntax(sql_partition.cc:2331 return null)  According to the sql_partition.cc:2297 ( @retval NULL - error ),

if in mf_tempfile.cc:154  file = 1; create_temp_file DBUG_RETURN(file);

2)create_temp_file(return 1)->mysql_tmpfile_path(return 1)->mysql_tmpfile(return 1)->generate_partition_syntax(return buf)->fill_partitioning_from_dd(return false)->open_table_def()(dd_table_share.cc:2236 error =false and return false.)

i think fill_partitioning_from_dd(return ture)   show there is an error occurred,and fill_partitioning_from_dd(return false) ,it is success.

And the  code comments clearly show as dd_table_share.cc:1881 @return false if success, else true.  I don't know Whether  you can see what's the difference.

3)and create_temp_file(return -1)->mysql_tmpfile_path(return -1)->mysql_tmpfile(return -1)->generate_partition_syntax() sql_partition.cc:2331 if(-1) (Error has occurred,and it did not return null)  

I  don't know Whether  you can understand here
[12 Dec 2018 13:35] Sinisa Milivojevic
Hi,

Let me see if I understood you correctly, because your style of writing is VERY condensed, meaning short and non-verbose ... There is nothing wrong with that, except that it is hard to follow ....

In case 1) and 3) the error is returned. In case under 1), NULL is returned, while in case under 3), I do not see what is returned ???

So what is returned in the third case and which behaviour is wrong, in your opinion ????
[12 Dec 2018 14:17] baogang hou
first we see the 1)and2),as we said befor,the file is 0 or 1, both are legal.but the return value is different。one return true,the other return false.is it right ?
[12 Dec 2018 14:54] Sinisa Milivojevic
HI,

I am sorry to inform you that nobody among us was able to understand your claims.

The easiest way to proceed is that you provide us with a full test case. You claim that you experience the clashes, so just provide us with a test case that we can reproduce, that would lead to the crash. In that manner we shall be able to verify this bug.

Otherwise, it stays "Not a Bug".
[12 Dec 2018 15:41] baogang hou
Sorry ,I can not give the test case.it is  Very complicated.Can you take a look at the source code according to my tips? This is very simple, you look at the source code logic, you will understand, you can ask Daniel Price to help check the source code.Thanks