Bug #100659 error_code=1146 write to binlog cause slave sql_thread stop
Submitted: 27 Aug 2020 11:21 Modified: 30 Aug 2020 5:28
Reporter: chao gao Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server Severity:S3 (Non-critical)
Version:5.7.17 OS:CentOS (7.2)
Assigned to: CPU Architecture:x86

[27 Aug 2020 11:21] chao gao
Description:
master:
    my.cnf:
        gitd_mode = on
        binlog_gtid_simple_recovery = on
        enforce_gtid_consistency= on
        replicate-ignore-db = a_1
        replicate-ignore-db = a_1   #write again,its my fault
        replicate-rewrite-db = b_2 -> c_3
        slave_parallel_type=logical_clock
        slave_parallel_worker=48
    query:
        drop database d_4;     # client return ok,server running normal
    binlog:
        xxxx Query thread_id=127772901 exec_time=1 error_code=1146
    error_log:
        nothing......

slave:
    my.cnf:
        gitd_mode = on
        binlog_gtid_simple_recovery = on
        enforce_gtid_consistency= on
        replicate-ignore-db = a_1
        replicate-ignore-db = a_1   #write again,its my fault
        replicate-rewrite-db = b_2 -> c_3
        slave_parallel_type=logical_clock
        slave_parallel_worker=48
    relay_log:
        reset slave all clear it,i'm sorry!!!
    sql_thread:
        query caused different errors on master and slave. Error on master:message(format)='Table '%-.192s.%-.192s' doesn't exist' error_code=1146.Error on slave:actual message='no error',error code=0.Default database:'d_4'.Query:"drop database d_4",Error_code:3002.
        Slave SQL for channel '':... The slave coordinator and worker threads are stoped,possibly leaving data in inconsistent state .........
  

        
    

How to repeat:
I have no good idea to repeat it......
[27 Aug 2020 11:31] chao gao
slave binlog:
    Anonymous_GTID last_commited=123147 sequence_number=12348
    set @@session.gtid_next='anonymous'
    xxxxx Query_id=12900425    exec_time=xxx   error_code=1146
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4

    Anonymous_GTID last_commited=123148 sequence_number=12349
    xxxxx Query_id=12900425    exec_time=2327   error_code=1146
    set @@session.gtid_next='anonymous'
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4

    Anonymous_GTID last_commited=123149 sequence_number=12350
    xxxxx Query_id=12900425    exec_time=xxx   error_code=1146
    set @@session.gtid_next='anonymous'
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4

    Anonymous_GTID last_commited=123150 sequence_number=123151
    xxxxx Query_id=12900425    exec_time=xxx   error_code=1146
    set @@session.gtid_next='anonymous'
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4
    

    Anonymous_GTID last_commited=123151 sequence_number=123152
    xxxxx Query_id=12900425    exec_time=xxx   error_code=1146
    set @@session.gtid_next='anonymous'
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4
    

    Anonymous_GTID last_commited=123152 sequence_number=123153
    xxxxx Query_id=12900425    exec_time=xxx   error_code=1146
    set @@session.gtid_next='anonymous'
    set timestamp=1598455544;
    set @@session.sql_mode=1436549152
    drop database d_4

question:
    Why one transaction is executed multiple times ....
[27 Aug 2020 12:10] chao gao
modify Severity
[27 Aug 2020 12:43] MySQL Verification Team
Hi Mr. gao,

Thank you for your bug report.

However, this is a forum for reports with fully repeatable test cases.

Hence, what we need is a set of SQL commands that always lead to the problem. For the replication problems, we also need entire configurations on both master and slave(s) , with a full description of the actions that should be taken in order to repeat the error that you observed.

Hence, without a repeatable test case, we can't even start processing your report !!!!!!!
[27 Aug 2020 12:46] MySQL Verification Team
One more comment.

We do not consider reports which are not using the latest release.

Hence, we need a repeatable test case with 5.7.31. We are not looking for bugs in very old releases. Between 5.7.17 and 5.7.31, there are (literally) hundreds of replication bugs fixed.

Hence, we need a repeatable test case with 5.7.31.
[30 Aug 2020 5:28] chao gao
一、how to repeat:
1.move proc.MYD proc.MYI proc.frm to another directory
    cd $mysql_data_dir/mysql
    mv proc.* ../
2.restart mysqld
3.execute sql
    mysql> create database test;
    Query OK, 1 row affected (0.00 sec)

    mysql> drop database test;
    Query OK, 0 rows affected, 1 warning (0.00 sec)

    mysql> show warnings;
    Empty set (0.01 sec)
4.view binlog
    #200830 13:10:13 server id 1001  end_log_pos 568 CRC32 0xdf2a4e39   GTID [commit=yes]
    SET @@SESSION.GTID_NEXT= 'e5a31538-b488-11ea-b6f5-0920853da06e:3411'/*!*/;
    # at 568
    #200830 13:10:13 server id 1001  end_log_pos 662 CRC32 0x20332d3b   Query thread_id=4 exec_time=0 error_code=0
    SET TIMESTAMP=1598764213/*!*/;
    create database test
    /*!*/;
    # at 662
    #200830 13:10:15 server id 1001  end_log_pos 727 CRC32 0xa6d9d18d   GTID [commit=yes]
    SET @@SESSION.GTID_NEXT= 'e5a31538-b488-11ea-b6f5-0920853da06e:3412'/*!*/;
    # at 727
    #200830 13:10:15 server id 1001  end_log_pos 812 CRC32 0x8be840c6   Query thread_id=4 exec_time=0 error_code=1146
    SET TIMESTAMP=1598764215/*!*/;
    drop database test

二、sorce code(/*I add some print code to find the error code */)
sql_db.cc
    ha_drop_database(path);
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE));
    tmp_disable_binlog(thd);
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE));
    query_cache.invalidate(db.str);
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE));
    (void) sp_drop_db_routines(thd, db.str); /* @todo Do not ignore errors */         
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE)); /*error_code=1146*/
#ifndef EMBEDDED_LIBRARY
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE));
    Events::drop_schema_events(thd, db.str);
    sql_print_error("execute_sql_error_code_to_binlog=============%d",query_error_code(thd, TRUE));

三、my questions
1.drop database test statment return warning,but show warnings return nothing,its a bug?
2.what's means about "@todo Do not ignore errors",it find until latest version 5.7.30.

四、suggestion
1.if error in mysql internal,may print it to error log or return client warning;
2.drop database may drop proc、event .. first ,then drop database.