Bug #108682 Read-Committed Isolation, UPDATES wrapped in DROP TEMPORARY TABLE IF EXISTS
Submitted: 4 Oct 2022 19:32 Modified: 6 Oct 2022 9:00
Reporter: Jervin Real Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.39 OS:Any
Assigned to: CPU Architecture:Any

[4 Oct 2022 19:32] Jervin Real
Description:
When an UPDATE statement is wrapped between DROP TEMPORARY TABLE IF EXISTS, replica can break due to binlog out of order.

This is reproduceable on MySQL 5.7.38, binlog_format=ROW.

How to repeat:
Consider these series of statements.

session1> drop table if exists t;
session1> create table t (id int unsigned not null) engine=innodb;
session1> insert into t values (1);
session1> flush binary logs;

session2> set session autocommit=0;
session2> set session transaction isolation level read committed;
session2> create temporary table tm (id int unsigned not null);
session2> insert into tm values (3);
session2> drop temporary table if exists tm;
session2> update t set id=2 where id=1;

session1> delete from t where id = 1; -- will wait

session2> create temporary table tm (id int unsigned not null);
session2> insert into tm values (3);
session2> drop temporary table if exists tm;
session2> exit

session1> delete from t where id = 1; -- will complete

This will produce binary log output like below.

# at 219
#221004 19:25:31 server id 100  end_log_pos 291 CRC32 0xf8d93c1f        Query   thread_id=19    exec_time=15    error_code=0
BEGIN
/*!*/;
# at 291
#221004 19:25:31 server id 100  end_log_pos 335 CRC32 0xa6968c60        Table_map: `test`.`t` mapped to number 113
# at 335
#221004 19:25:31 server id 100  end_log_pos 375 CRC32 0x283f05f8        Delete_rows: table id 113 flags: STMT_END_F
### DELETE FROM `test`.`t`
### WHERE
###   @1=1
# at 375
#221004 19:25:31 server id 100  end_log_pos 406 CRC32 0x29b22eac        Xid = 223
COMMIT/*!*/;
# at 406
#221004 19:25:46 server id 100  end_log_pos 471 CRC32 0x0b931b08        Anonymous_GTID  last_committed=0        sequence_number=2       rbr_only=no
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 471
#221004 19:24:51 server id 100  end_log_pos 550 CRC32 0x811397a2        Query   thread_id=18    exec_time=0     error_code=0
SET TIMESTAMP=1664911491/*!*/;
SET @@session.pseudo_thread_id=18/*!*/;
BEGIN
/*!*/;
# at 550
#221004 19:24:51 server id 100  end_log_pos 685 CRC32 0xec251f49        Query   thread_id=18    exec_time=0     error_code=0
use `test`/*!*/;
SET TIMESTAMP=1664911491/*!*/;
DROP TEMPORARY TABLE IF EXISTS `tm` /* generated by server */
/*!*/;
# at 685
#221004 19:25:01 server id 100  end_log_pos 729 CRC32 0x4f1ca7cc        Table_map: `test`.`t` mapped to number 113
# at 729
#221004 19:25:01 server id 100  end_log_pos 775 CRC32 0x151397d7        Update_rows: table id 113 flags: STMT_END_F
### UPDATE `test`.`t`
### WHERE
###   @1=1
### SET
###   @1=2
# at 775
#221004 19:25:45 server id 100  end_log_pos 910 CRC32 0xfa72f949        Query   thread_id=18    exec_time=0     error_code=0
SET TIMESTAMP=1664911545/*!*/;
DROP TEMPORARY TABLE IF EXISTS `tm` /* generated by server */
/*!*/;
# at 910
#221004 19:25:46 server id 100  end_log_pos 985 CRC32 0xef2952f8        Query   thread_id=18    exec_time=0     error_code=0
SET TIMESTAMP=1664911546/*!*/;
ROLLBACK
/*!*/;

Suggested fix:
Session2's events should've been written first on the binary log.

However, in the case above, Session1's DELETE was written first, causing the UPDATE on Session2 to cause replication error.
[6 Oct 2022 8:53] MySQL Verification Team
Hi,

Have you maybe tried with 8.0.30? I did not reproduce with 8.0.30.
Will see how 5.7.39 behaves like.

thanks
[6 Oct 2022 9:00] MySQL Verification Team
Verified with 5.7.39

 > show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 20640
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 985
               Relay_Log_File: mysql-relay.000004
                Relay_Log_Pos: 619
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1032
                   Last_Error: Could not execute Update_rows event on table test.t; Unknown table 'test.tm', Error_code: 1051; Can't find record in 't', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000002, end_log_pos 775
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 406
              Relay_Log_Space: 1448
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table test.t; Unknown table 'test.tm', Error_code: 1051; Can't find record in 't', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000002, end_log_pos 775
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 100
                  Master_UUID: 00020640-1111-1111-1111-111111111111
             Master_Info_File: /home/arhimed/sandboxes/rsandbox_5_7_39/node1/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 221006 10:59:10
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
[7 Oct 2022 5:10] David J
The test case can be simplified to:

session1> drop table if exists t;
session1> create table t (id int unsigned not null) engine=innodb;
session1> insert into t values (1);
session1> flush binary logs;

session2> set session autocommit=0;
session2> set session transaction isolation level read committed;
session2> update t set id=2 where id=1;

session1> delete from t where id = 1; -- will wait

session2> drop temporary table if exists tm;
session2> exit

session1> delete from t where id = 1; -- will complete
[9 Oct 2022 12:48] huahua xu
Hi Jervin Real,

For your test cases, the order of flushing binlog about these two transactions is not stable, which is caused by the process of transaction rollback. 

About rolling back the transaction: 
1. Firstly, the transaction coordinator rolls back the transaction in the stotage engine, which will releases it's locks(it wakes up possible other transactions because of waiting on these locks)
2. Secondly, flushing binlog of the transaction that cannot be rolled back safely.