Bug #95064 slave server may has gaps in Executed_Gtid_Set when a special case happen
Submitted: 19 Apr 11:05 Modified: 19 Apr 22:34
Reporter: yoga yoga Email Updates:
Status: Verified Impact on me:
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.7.22 OS:Any
Assigned to: CPU Architecture:x86
Tags: Executed_Gtid_Set, gaps, lock_wait_timeout, rotate_event

[19 Apr 11:05] yoga yoga
First: In a slave server of master-slave replication system, a worker thread execute a trx has the follow error:  "...Lock wait timeout exceeded; try restarting transaction, Error_code: 1205". Then the trx retry 10 times and cause the sql thread and worker thread exit.

Second: We execute show slave status\G and find the execute relay log pos is not consistent with the Executed_Gtid_Set.

Third: If we execute start slave to continue, we can see gaps in Executed_Gtid_Set.

At the end , we think the slave has lost some trxs. 

slave configs:
slave_parallel_workers = 32
slave-parallel-type = LOGICAL_CLOCK
slave_preserve_commit_order = 1
relay_log_recovery = 0
innodb_flush_log_at_trx_commit = 1

How to repeat:
First: Make a master-slave server. Do the change master command:
reset master;
stop slave;
reset salve;
reset slave all;
change master to Master_Host='',Master_User='rpl',Master_Port=5518, Master_Password='123456',MASTER_AUTO_POSITION=1;
start slave;
drop database if exists abczyy_test;
create database abczyy_test;
create table abczyy_test.tb1(a int key, b int);

Second: Do the following cmds.
  reset master;
  reset master;
  stop slave;
  reset slave;
  start slave;
  truncate table abczyy_test.tb1;
  stop slave sql_thread;
  insert into abczyy_test.tb1(a,b) values(1,1);
  insert into abczyy_test.tb1(a,b) values(2,2);
  flush logs;
  insert into abczyy_test.tb1(a,b) values(3,3);
  insert into abczyy_test.tb1(a,b) values(4,4);
  flush logs;
  start transaction;
  insert into abczyy_test.tb1(a,b) values(2,2);
  start slave;
  show slave status\G

Third: Wait until the sql thread and worker threads exit.

[error log in mysqld.log]
2019-04-19T15:50:38.460226+08:00 583 [Warning] Slave SQL for channel '': Worker 1 failed executing transaction 'a902ca7e-2e4a-11e8-a450-00e0ec27ea92:3' at master log mysql-bin.000001, end_log_pos 820; Could not execute Write_rows event on table abczyy_test.tb1; Lock wait timeout exceeded; try restarting transaction, Error_code: 1205; handler error HA_ERR_LOCK_WAIT_TIMEOUT; the event's master log mysql-bin.000001, end_log_pos 820, Error_code: 1205
2019-04-19T15:50:38.460359+08:00 583 [ERROR] Slave SQL for channel '': worker thread retried transaction 10 time(s) in vain, giving up. Consider raising the value of the slave_transaction_retries variable. Error_code: 1205
2019-04-19T15:50:38.460397+08:00 583 [ERROR] Slave SQL for channel '': Worker 1 failed executing transaction 'a902ca7e-2e4a-11e8-a450-00e0ec27ea92:3' at master log mysql-bin.000001, end_log_pos 820; Lock wait timeout exceeded; try restarting transaction, Error_code: 1205
2019-04-19T15:50:38.460484+08:00 582 [Warning] Slave SQL for channel '': ... The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details). Error_code: 1756
2019-04-19T16:30:20.408096+08:00 582 [Note] Slave SQL thread for channel '' exiting, replication stopped in log 'mysql-bin.000002' at position 4

[result of show slave status]
    Relay_Log_File: relay-bin.000004
    Relay_Log_Pos: 241
    Relay_Master_Log_File: mysql-bin.000002
    Exec_Master_Log_Pos: 4
    Retrieved_Gtid_Set: a902ca7e-2e4a-11e8-a450-00e0ec27ea92:1-5
    Executed_Gtid_Set: a902ca7e-2e4a-11e8-a450-00e0ec27ea92:1-2

Fourth: start slave and see the gaps.
  start slave;
  show slave status\G

[result of show slave status]
    Retrieved_Gtid_Set: a902ca7e-2e4a-11e8-a450-00e0ec27ea92:1-5
    Executed_Gtid_Set: a902ca7e-2e4a-11e8-a450-00e0ec27ea92:1-2:4-5

Suggested fix:
This bug is caused by the slave apply the next master rotate_event in sql_thread, which the rli->sql_thread_kill_accepted is set to TURE after the lock wait timeout error.

[code fix:]
In Log_event::apply_event function
      if (rli->current_mts_submode->wait_for_workers_to_finish(rli) == -1)
          // handle synchronization error
          rli->report(WARNING_LEVEL, 0,
                      "Slave worker thread has failed to apply an event. As a "
                      "consequence, the coordinator thread is stopping "
+        if(rli->sql_thread_kill_accepted == TRUE &&
+           get_type_code() == binary_log::ROTATE_EVENT &&
+           rli->current_mts_submode->get_type() == MTS_PARALLEL_TYPE_LOGICAL_CLOCK)
+        {
+          rli->report(WARNING_LEVEL, 0,
+                      "Slave worker thread has failed. Do not need to "
+                      "execute the master rotate event.");
+          DBUG_RETURN(-1);
+        }
          Given not in-group mark the event handler can invoke checkpoint
          update routine in the following course.
        DBUG_ASSERT(rli->mts_group_status == Relay_log_info::MTS_NOT_IN_GROUP
                    || !is_mts_db_partitioned(rli));

At the end, If the code fix is not better please give me the suggested fix.
[19 Apr 22:34] Bogdan Kecman

Thanks for the report.

In order to submit contributions you must first sign the Oracle Contribution Agreement (OCA).
For additional information please check http://www.oracle.com/technetwork/community/oca-486395.html.
If you have any questions, please contact the MySQL community team.