Bug #95176 Error when querying a table in information_schema while the table is being dropped
Submitted: 29 Apr 7:03 Modified: 18 Jun 12:15
Reporter: xiao Xiao Email Updates:
Status: Can't repeat Impact on me:
None 
Category: MySQL Server Severity: S3 (Non-critical)
Version: 8.0.13 OS: CentOS
Assigned to: CPU Architecture: Any

[29 Apr 7:03] xiao Xiao
Description:
When a table in information_schema is queried while that table is being dropped, an error occurs:
  --------------
  SELECT NAME,SPACE_TYPE FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME LIKE 'CREATE_TABLESPACE_027/TABLE_TEST_0%'
  --------------
! ERROR 1049 (42000) at line 48: Unknown database ''

I debugged the code and found a race condition:

1) get_schema_tables_result collects the table ids,
and
2) dd::cache::Storage_adapter::get reads each table record by table id.
It starts a new transaction with READ COMMITTED isolation:
  // Start a DD transaction to get the object.
  Transaction_ro trx(thd, isolation);

  trx.otx.register_tables<T>();
3) If the table is deleted between 1) and 2), the race condition hits and the error occurs.

I don't understand why MySQL needs to read the record again (after the table ids have already been collected), which is what opens this race window.
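The two-step pattern above is a classic check-then-use race. A minimal single-threaded sketch (all container and function names are illustrative stand-ins, not the real dictionary API) shows why a lookup by a previously collected id can miss:

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for the data dictionary: table id -> table name.
// Id 3268 matches the one in the stack trace; all names are illustrative.
std::map<int, std::string> g_dd_tables = {
    {3268, "CREATE_TABLESPACE_027/TABLE_TEST_01"}};

// Step 1: like get_schema_tables_result, collect the table ids up front.
std::vector<int> collect_table_ids() {
  std::vector<int> ids;
  for (const auto &kv : g_dd_tables) ids.push_back(kv.first);
  return ids;
}

// Step 2: like Storage_adapter::get, re-read the record by id in a fresh
// READ COMMITTED transaction; by then the row may already be gone.
std::optional<std::string> fetch_by_id(int id) {
  auto it = g_dd_tables.find(id);
  if (it == g_dd_tables.end()) return std::nullopt;  // -> error-1049 path
  return it->second;
}
```

If a concurrent DROP lands between the two steps, step 2 is handed a stale id and the lookup fails.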

The error stack is as follows:
(gdb) bt
#0  Diagnostics_area::set_error_status (this=0x7ffeb0534760, mysql_errno=1049, message_text=0x7ffeb0532ea0 "Unknown database ''", returned_sqlstate=0x5f53b17 "42000")
    at /export/home/duyu/code/taurus-root/src/sql/sql/sql_error.cc:454
#1  0x0000000003070e8b in THD::raise_condition (this=0x7ffdcc0b1b20, sql_errno=1049, sqlstate=0x5f53b17 "42000", level=Sql_condition::SL_ERROR, msg=0x7ffeb0532ea0 "Unknown database ''",
    use_condition_handler=false) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_class.cc:726
#2  0x0000000002f74ffa in my_message_sql (error=1049, str=0x7ffeb0532ea0 "Unknown database ''", MyFlags=0) at /export/home/duyu/code/taurus-root/src/sql/sql/mysqld.cc:3069
#3  0x00000000044edf07 in my_error (nr=1049, MyFlags=0) at /export/home/duyu/code/taurus-root/src/sql/mysys/my_error.cc:247
#4  0x0000000004236a16 in dd::cache::Dictionary_client::get_table_name_by_se_private_id (this=0x7ffdcc00ea50, engine="InnoDB", se_private_id=3268, schema_name=0x7ffeb0533990,
    table_name=0x7ffeb0533980) at /export/home/duyu/code/taurus-root/src/sql/sql/dd/impl/cache/dictionary_client.cc:1759
#5  0x00000000049a6fc3 in dd_table_open_on_id_low (thd=0x7ffdcc0b1b20, mdl=0x7ffeb05346f0, table_id=3268)
    at /export/home/duyu/code/taurus-root/src/sql/storage/innobase/dict/dict0dd.cc:420
#6  0x00000000049a7bb7 in dd_table_open_on_id (table_id=3268, thd=0x7ffdcc0b1b20, mdl=0x7ffeb05346f0, dict_locked=true, check_corruption=false)
    at /export/home/duyu/code/taurus-root/src/sql/storage/innobase/dict/dict0dd.cc:634
#7  0x00000000049b1941 in dd_process_dd_tables_rec_and_mtr_commit (heap=0x7ffdcc0180e8, rec=0x7fffc844219b "", table=0x7ffeb05346f8, dd_tables=0x7fffd8beeb38, mdl=0x7ffeb05346f0,
    mtr=0x7ffeb05341e0) at /export/home/duyu/code/taurus-root/src/sql/storage/innobase/dict/dict0dd.cc:4527
#8  0x00000000046977fa in i_s_innodb_tables_fill_table (thd=0x7ffdcc0b1b20, tables=0x7ffdcc019cc8) at /export/home/duyu/code/taurus-root/src/sql/storage/innobase/handler/i_s.cc:5287
#9  0x000000000317f44e in do_fill_table (thd=0x7ffdcc0b1b20, table_list=0x7ffdcc019cc8, qep_tab=0x7ffdcc01bbf0) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_show.cc:4711
#10 0x000000000317f923 in get_schema_tables_result (join=0x7ffdcc01b260, executed_place=PROCESSED_BY_JOIN_EXEC) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_show.cc:4824
#11 0x000000000315d591 in JOIN::prepare_result (this=0x7ffdcc01b260) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_select.cc:1454
#12 0x0000000003097ff3 in JOIN::exec (this=0x7ffdcc01b260) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_executor.cc:202
#13 0x000000000315bee7 in Sql_cmd_dml::execute_inner (this=0x7ffdcc01a4c8, thd=0x7ffdcc0b1b20) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_select.cc:715
#14 0x000000000315b937 in Sql_cmd_dml::execute (this=0x7ffdcc01a4c8, thd=0x7ffdcc0b1b20) at /export/home/duyu/code/taurus-root/src/sql/sql/sql_select.cc:613


How to repeat:
Reproduction steps:
There are 50 test cases running in parallel and the error happens randomly.
The test case is as follows (each case uses different tablespaces and tables):

create tablespace `123456789012345678901234567890123456789012345678901234567890127` add datafile 'A027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJabcd>こん.ibd' file_block_size=8192 engine InnoDB;

--  Check that the tablespace exists
select NAME,FLAG,ROW_FORMAT,PAGE_SIZE,ZIP_PAGE_SIZE,SPACE_TYPE,FS_BLOCK_SIZE,FILE_SIZE from information_schema.innodb_tablespaces where binary name='123456789012345678901234567890123456789012345678901234567890127';

--  Check that the file exists
-- exec echo "Check file exist or not.."
-- exec ls 'A027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJabcdこん.ibd'
-- -- exec ls '$dir/A027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJA027BCDEFGHIJabcdこん.isl'

--  Start adding tables to the tablespace
--  Create the database
create database CREATE_TABLESPACE_027;

--  Use the database
use CREATE_TABLESPACE_027;

--  Create the tables
CREATE TABLE TABLE_TEST_01 (c1 INT PRIMARY KEY) TABLESPACE `123456789012345678901234567890123456789012345678901234567890127` ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
CREATE TABLE TABLE_TEST_02 (c1 INT PRIMARY KEY) TABLESPACE `123456789012345678901234567890123456789012345678901234567890127` ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

--  Check the space type of the tables
SELECT NAME,SPACE_TYPE FROM INFORMATION_SCHEMA.INNODB_TABLES WHERE NAME LIKE 'CREATE_TABLESPACE_027/TABLE_TEST_0%';
--  ---------------------------------------------Clean up-------------------------------------------------
--  Drop the database
DROP DATABASE CREATE_TABLESPACE_027;

--  Drop the tablespace
DROP TABLESPACE `123456789012345678901234567890123456789012345678901234567890127`;

You can also reproduce the issue with gdb. Block in

bool Storage_adapter::get(THD *thd, const K &key, enum_tx_isolation isolation,
                          bool bypass_core_registry, const T **object) {

just before

  // Start a DD transaction to get the object.
  Transaction_ro trx(thd, isolation);
  trx.otx.register_tables<T>();

then drop the table with that id from another session, and continue.
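The gdb steps above amount to forcing a pause between the two reads. A deterministic two-thread reenactment of that sequence (a sketch; names, ids, and synchronization are illustrative, not the server's locking):

```cpp
#include <condition_variable>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// The reader stops right before its second lookup (the "breakpoint"), the
// dropper deletes the row, then the reader resumes and misses.
namespace race_demo {
std::map<int, std::string> dd_tables = {{3268, "TABLE_TEST_01"}};
std::mutex mu;
std::condition_variable cv;
bool ids_collected = false;
bool dropped = false;

// Returns true when the stale-id lookup misses, i.e. the race fired.
bool run() {
  bool missing = false;
  std::thread reader([&] {
    std::vector<int> ids;
    std::unique_lock<std::mutex> lk(mu);
    for (auto &kv : dd_tables) ids.push_back(kv.first);  // step 1: collect ids
    ids_collected = true;
    cv.notify_all();
    cv.wait(lk, [] { return dropped; });  // "blocked at the breakpoint"
    missing = dd_tables.find(ids[0]) == dd_tables.end();  // step 2 misses
  });
  std::thread dropper([] {
    std::unique_lock<std::mutex> lk(mu);
    cv.wait(lk, [] { return ids_collected; });
    dd_tables.erase(3268);  // the DROP issued while the reader is blocked
    dropped = true;
    cv.notify_all();
  });
  reader.join();
  dropper.join();
  return missing;
}
}  // namespace race_demo
```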

Suggested fix:
I don't understand why MySQL needs to read the record again (after the table ids have already been collected) and thereby runs into this race condition.

This query should not raise an error.
[14 May 14:16] Sinisa Milivojevic
Hi Mr. Xiao,

Thank you for your bug report.

We do need certain clarifications regarding the test case that you have provided.

First of all, would you please let us know which transactions are running in parallel, and which ones are not. This is especially necessary for the statement that is querying Information Schema. This is not visible from your test case.

Second, does your test case still reproduce if the Chinese ideograms are omitted from the table and tablespace names?

Many thanks in advance.
[15 Jun 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[29 Jul 12:38] dave do
Xiao is my colleague, and we have found that the issue still exists in 8.0.16.

Let me try to explain the race condition again.

First of all, before the information_schema query starts, there are already several tables belonging to different databases.
Then the query starts; in get_schema_tables_result, every record of mysql.tables is fetched, and the SE_PRIVATE_ID is taken from each record in dd_process_dd_tables_rec_and_mtr_commit.
With that SE_PRIVATE_ID the query drills down to dd::cache::Dictionary_client::get_table_name_by_se_private_id. In this function the table object is acquired successfully, because in the test case the table itself is not dropped explicitly. The code then fetches the schema object via tab_obj->schema_id() and finds that sch_obj is nullptr.
The main reason we get an invalid sch_obj is that it is fetched with Dictionary_client::acquire_uncached, which starts a new Transaction_ro (in Storage_adapter::get) at ISO_READ_COMMITTED isolation. So if the schema has been dropped by another session (not the one running the information_schema query), the error occurs.
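A minimal sketch of that lookup order (the maps and names below are stand-ins for the real dictionary objects, not the actual API): the table object is still reachable by SE_PRIVATE_ID, but its schema row is already gone, so the schema name resolves to the empty string, which is presumably why the message reads "Unknown database ''".

```cpp
#include <map>
#include <string>

// Stand-ins for dd::Table / dd::Schema lookups; ids are illustrative.
struct TableObj {
  int schema_id;
  std::string name;
};

std::map<int, TableObj> g_tables = {{3268, {7, "TABLE_TEST_01"}}};
std::map<int, std::string> g_schemas = {{7, "CREATE_TABLESPACE_027"}};

// Like get_table_name_by_se_private_id: the table object is acquired fine,
// but the schema fetch (acquire_uncached in the real code) can come back
// empty when the schema was dropped by another session.
std::string resolve_schema_name(int se_private_id) {
  const TableObj &tab = g_tables.at(se_private_id);  // table object: OK
  auto it = g_schemas.find(tab.schema_id);           // schema object lookup
  if (it == g_schemas.end()) return "";  // sch_obj == nullptr -> "Unknown database ''"
  return it->second;
}
```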

To make the issue easier to reproduce, you can add a DBUG_EXECUTE_IF that sleeps 10 ms just before the Transaction_ro starts; this widens the gap and increases the probability of hitting the race.
[29 Jul 12:45] Sinisa Milivojevic
Hi Mr. Do,

Thank you for your comment. However:

First of all, our questions from the previous comments were not answered at all.

Second, we do require a repeatable test case that shows the problem without changing anything in our code.

Last, but not least, you should test ONLY with our latest 8.0 release, which is 8.0.17.