Bug #77496 Replication position lost after crash on MTS configured slave
Submitted: 26 Jun 2015 6:11 Modified: 20 May 2016 11:56
Reporter: Simon Mudd (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.25, 5.6.27 OS:Any
Assigned to: CPU Architecture:Any
Tags: Failed to initialize the master info structure, MTS, recovery

[26 Jun 2015 6:11] Simon Mudd
Description:
If you enable MTS (e.g. slave_parallel_workers = 25) then MySQL may not recover the replication position properly if MySQL crashes. The end result is the SQL position is lost _completely_.

When using single threaded replication the replication position may get out of sync with the master under similar circumstances. From experience I see that this may be ~30 seconds difference and the slave thinks it's up to 30 seconds behind it's real position. This makes it re-execute some transactions again. (not good but may be harmless)

The situation with MTS is that there is _no_ reference left to any possible previous position, thus leaving the DBA with "no clue" as to where the slave may have been replicating from.

How to repeat:
Enable MTS.
Make sure that the server is busily inserting data with multiple threads in parallel.
Shutdown mysqld uncleanly (kill -9 or power off server without notice).

MySQL recovery logging shows:

...
InnoDB: Doing recovery: scanned up to log sequence number 22584973969408
InnoDB: Doing recovery: scanned up to log sequence number 22584979212288
InnoDB: Doing recovery: scanned up to log sequence number 22584983208413
2015-06-24 13:43:06 3895 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 31087193, file name binlog.029280
2015-06-24 13:48:25 3895 [Note] InnoDB: 128 rollback segment(s) are active.
2015-06-24 13:48:25 3895 [Note] InnoDB: Waiting for purge to start
2015-06-24 13:48:25 3895 [Note] InnoDB: 5.6.25 started; log sequence number 22584983208413
2015-06-24 13:48:25 3895 [Note] Recovering after a crash using ../log/binlog
2015-06-24 13:48:26 3895 [Note] Starting crash recovery...
2015-06-24 13:48:26 3895 [Note] Crash recovery finished.
2015-06-24 13:49:03 3895 [Note] Server hostname (bind-address): '0.0.0.0'; port: 3306
2015-06-24 13:49:03 3895 [Note]   - '0.0.0.0' resolves to '0.0.0.0';
2015-06-24 13:49:03 3895 [Note] Server socket created on IP: '0.0.0.0'.
2015-06-24 13:49:03 3895 [Note] Slave: MTS group recovery relay log info based on Worker-Id 24, group_relay_log_name ../log/relaylog.059984, group_relay_log_pos 49853702 group_master_log_name binlog.029323, group_master_log_pos 50299739
2015-06-24 13:49:03 3895 [Note] Slave: MTS group recovery relay log info group_master_log_name binlog.029323, event_master_log_pos 50299739.
2015-06-24 13:49:03 3895 [ERROR] --relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0 followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS
2015-06-24 13:49:03 3895 [ERROR] Failed to initialize the master info structure
2015-06-24 13:49:03 3895 [Note] Check error log for additional messages. You will not be able to start replication until the issue is resolved and the server restarted.
2015-06-24 13:49:04 3895 [ERROR] /usr/sbin/mysqld: Table './mysql/event' is marked as crashed and should be repaired
2015-06-24 13:49:04 3895 [Warning] Checking table:   './mysql/event'
2015-06-24 13:49:04 3895 [ERROR] 1 client is using or hasn't closed the table properly
2015-06-24 13:49:04 3895 [Note] Event Scheduler: Loaded 3 events
2015-06-24 13:49:04 3895 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.25-log'  socket: 'mysql.sock'  port: 3306  MySQL Community Server (GPL)
2015-06-24 13:49:04 3895 [Note] Event Scheduler: scheduler thread started with id 1
2015-06-24 14:01:37 3895 [ERROR] Slave SQL: Slave failed to initialize relay log info structure from the repository, Error_code: 1872

root@slave-5-6-25 [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: master-host
                  Master_User: master_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: 
          Read_Master_Log_Pos: 4
               Relay_Log_File: relaylog.059984
                Relay_Log_Pos: 49853387
        Relay_Master_Log_File: 
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 0
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 0
                  Master_UUID: 42bbc67c-dda2-11e2-a2c3-80c16e230aca
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

[smudd@mad8 ~]$ 

The error message shows something:
InnoDB: Last MySQL binlog file position 0 31087193, file name binlog.029280

I'm not sure what position that actually is. If it's what Innodb thinks is its last position then why is it not being used?

Suggested fix:
MySQL crash recovery on a slave has always been a point of discussion. Even in 5.6 the "solution" is far from ideal as this triggers MysQL on startup to throw away any existing relay logs. If there aren't many that may be ok (but what if the master has just died and you want that unprocessed info?), but if you have a slave that is behind and has several [hundreds of] GB of unprocessed relay logs throwing this information away and forcing the slave to pull it down again is downright unhelpful.  GTID helps a bit with this but has many other issues which make it currently unattractive to use at least in 5.6. I don't think it really solves the problem.

So please consider making the MTS recovery better, so that we don't completely lose the position of replication with respect to the master.  It's fine to stop and say "I'm not sure where I should be, but here are some pointers ...." but just wiping out all of the existing information is most unhelpful.

So any suggestions on how to improve this or the "correct" way to recovery from such a crash would be most welcome.

I'll upload the /etc/my.cnf configuration file which shows the actual configuration of this server (which was configured in a master-master setup).
[26 Jun 2015 6:44] Simon Mudd
/etc/my.cnf for server which had the problem.

Attachment: etc_my.cnf.slave-5-6-25 (application/octet-stream, text), 3.22 KiB.

[30 Jun 2015 9:15] Trent Lloyd
I wonder if the following errors from having relay_log_recovery enabled are causing the server to actually remove the multi-threaded slave execution state.

Will test this.

2015-06-24 13:49:03 3895 [ERROR] --relay-log-recovery cannot be executed when the slave was stopped with an error or killed in MTS mode; consider using RESET SLAVE or restart the server with --relay-log-recovery = 0 followed by START SLAVE UNTIL SQL_AFTER_MTS_GAPS
2015-06-24 13:49:03 3895 [ERROR] Failed to initialize the master info structure
2015-06-24 14:01:37 3895 [ERROR] Slave SQL: Slave failed to initialize relay log info structure from the repository, Error_code: 1872
[30 Jun 2015 10:42] Trent Lloyd
The replication position does not appear lost at all, the slave is failing to initialize due to relay_log_recovery=on which is not compatible with multi-threaded slave.  (It is compatible in 5.6.26+ when using GTID).

slave_worker_info is empty because it is not used unless relay-log-info-repository=TABLE

Lastly the slave state will not be crash safe/consistent unless relay-log-info-repository=TABLE is used.  This is documented here:
http://dev.mysql.com/doc/refman/5.6/en/slave-logs.html#replication-implementation-crash-sa...

There is one main outstanding issue here which is that without GTID, the relay logs are likely to become corrupt and relay_log_recovery=on cannot be used.  Starting with relay_log_recovery=off will let you use SQL_AFTER_MTS_GAPS to catch up and then reset the relay logs but that would be a manual process at this time.
[1 Jul 2015 22:25] Simon Mudd
Trent.

"the slave is failing to initialize due to relay_log_recovery=on which is not compatible with multi-threaded slave".
http://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html does not state that this is incompatible.  Or not imply that it should not be used. (I'm not using GTID.) If this is the case should MySQLD either startup in MTS mode but with relay_log_recovery disabled, or start with relay_log_recovery = 1 and MTS disabled?

Either way this does not seem intuitive or clear.  Safer recovery options would seem to be helpful.

I was using relay-log-info-repository=TABLE so mysql.slave_worker_info and this is supposed to ensure crash safety.

Relay logs can not be corrupt as the binlog checksum should allow us to catch that. it won't solve the problem but it will ensure that if the file is corrupt this will be noticed.  Truncating at the last known good point would seem like a good idea, scanning from the beginning of the last relay log file to ensure there's been no corruption might be a good idea.

So I'm not fully sure I understand the comments here. The configuration I was using was in theory supposed to be somewhat safe and the replication configuration was stored in InnoDB tables. Therefore InnoDB recovery would have been expected to restore some state even if that state were not strictly speaking 100% correct.
[27 Jul 2015 11:33] MySQL Verification Team
Hello Simon,

Thank you for the feedback.
I'm not sure whether this is intentional behavior but observed this during my tests.

Thanks,
Umesh
[27 Jul 2015 11:33] MySQL Verification Team
test results

Attachment: 77496.5_6_27.results (application/octet-stream, text), 14.89 KiB.

[21 Jan 2016 20:47] Jean-François Gagné
Hi,

I am observing the same behavior for:

-MySQL 5.6.28, 
-MySQL 5.7.10 with slave-parallel-type=DATABASE,
-MySQL 5.7.10 with slave-parallel-type=LOGICAL_CLOCK with slave_preserve_commit_order=0,
-and MySQL 5.7.10 with slave-parallel-type=LOGICAL_CLOCK with slave_preserve_commit_order=1.

Concerning slave-parallel-type=LOGICAL_CLOCK with slave_preserve_commit_order=1, it is a special case of this bug where no gap exists.  I logged Bug #80103 for it as if there is more detail in fixing the general case, I would like to see this special case fixed with less delay. 

Related reports: Bug #80102.

Thanks,

JFG
[22 Jan 2016 4:52] Jesper wisborg Krogh
Posted by developer:
 
See also Bug 18881475/Bug 72824
[20 May 2016 11:56] David Moss
Thanks for your feedback, this has been fixed in upcoming versions and the following was added to the 5.6.31 and 5.7.13 changelogs:

If a multi-threaded replication slave running with relay_log_recovery=1 stopped unexpectedly, during restart the relay log recovery process could fail. This was due to transaction inconsistencies not being filled, see Handling an Unexpected Halt of a Replication Slave. Prior to this fix, to recover from this situation required manually setting relay_log_recovery=0, starting the slave with START SLAVE UNTIL SQL_AFTER_MTS_GAPS to fix any transaction inconsistencies and then restarting the slave with relay_log_recovery=1. This process has now been automated, enabling relay log recovery of a multi-threaded slave upon restart automatically.
[5 Aug 2016 13:26] David Moss
Posted by developer:
 
After discussing with Sujatha, this new request is now being covered in Bug#21507981. Jesper has relayed this message via the SR mentioned above. Reclosing.
[5 Aug 2016 13:46] J-F Not Use Gagné
See Bug#81840 for new request.
[19 Sep 2016 8:23] Erlend Dahl
Bug#72824 Can't restart replication after restart with slave_parallel_workers and error

was marked as a duplicate
[9 Feb 2017 23:04] Vojtech Kurka
We ran into this today after a crash on MT slave with 5.7.16 and 5.7.15 master. This workaround did work:

1) keep record of the info in Relay log info;(show slave status)
2) stop slave;
3) reset slave;
4) start slave; 
5) stop slave;
6) SET GLOBAL gtid_purged='gtid position recorded in step 1'; change master to...;
7) start slave;

Did we run into https://bugs.mysql.com/bug.php?id=81840 ? or is there any other unfixed bug?