Bug #72313 | stop sql_thread, start sql_thread causes a trx to log with a different GTID | ||
---|---|---|---|
Submitted: | 11 Apr 2014 4:43 | Modified: | 27 Aug 2014 13:45 |
Reporter: | Santosh Praneeth Banda | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.6.16 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
Tags: | replication |
[11 Apr 2014 4:43]
Santosh Praneeth Banda
[27 Aug 2014 13:45]
David Moss
The following was added to the 5.6.21 and 5.7.5 changelog: When an event group spanned multiple relay log files, a slave could incorrectly identify GTID-header group boundaries. As a result, when a transaction was retried, or when the SQL thread was stopped in the middle of a transaction after some rotates, the Gtid_log_event was silently skipped on the slave, and the transaction was logged with the slave's GTID. This problem also affected START SLAVE UNTIL MASTER_LOG_POS = log_pos; with GTIDs enabled. If log_pos was in the middle of a transaction, the Gtid_log_event was not correctly detected as the beginning of the transaction, and replication stopped before this event. With this fix, threads correctly detect that they are part of a group, and this is used to check whether a Gtid_log_event is part of a transaction.
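The first symptom described in the changelog entry can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: the event kinds, the index-based "position", and the names `Ev`, `sample_events`, and `restart_index_after_stop` are hypothetical. It replays a list of relay-log events and reports where the SQL thread would resume if it were stopped partway through, once with a Gtid event not counted as opening a group (the behaviour described as buggy) and once with it counted (the behaviour described as fixed).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified event kinds; not the server's real event classes.
enum class Ev { Gtid, Begin, Row, Commit, Rotate };

// One transaction whose Gtid event and BEGIN are separated by a Rotate,
// i.e. the group spans two relay log files.
std::vector<Ev> sample_events() {
  return {Ev::Gtid, Ev::Rotate, Ev::Begin, Ev::Row, Ev::Commit};
}

// Apply events[0 .. stop_at) and return the index of the event the SQL
// thread would resume from if it were stopped at that point.
// gtid_opens_group == false models the behaviour described as buggy,
// gtid_opens_group == true models the behaviour described as fixed.
std::size_t restart_index_after_stop(const std::vector<Ev>& events,
                                     std::size_t stop_at,
                                     bool gtid_opens_group) {
  bool has_gtid = false;  // a Gtid event was applied for the current group
  bool in_begin = false;  // currently between BEGIN and COMMIT
  std::size_t saved = 0;  // saved position: index of the next event to apply
  for (std::size_t i = 0; i < stop_at && i < events.size(); ++i) {
    switch (events[i]) {
      case Ev::Gtid:
        has_gtid = true;
        break;
      case Ev::Begin:
        in_begin = true;
        break;
      case Ev::Row:
        break;
      case Ev::Commit:
        has_gtid = false;
        in_begin = false;
        saved = i + 1;  // group finished, so the position moves past it
        break;
      case Ev::Rotate: {
        const bool in_group = in_begin || (gtid_opens_group && has_gtid);
        if (!in_group) saved = i + 1;  // advance only when outside a group
        break;
      }
    }
  }
  return saved;
}
```

In this model, stopping after the row event with `gtid_opens_group == false` yields a restart index of 2, so the Gtid event at index 0 is never re-read. With `gtid_opens_group == true` the restart index stays at 0 and the whole group, including its Gtid event, is replayed.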
[25 Sep 2014 11:52]
Laurynas Biveinis
revno: 6083
committer: Joao Gramacho <joao.gramacho@oracle.com>
branch nick: mysql-5.6
timestamp: Fri 2014-07-18 03:49:01 +0200
message:

BUG#18652178 STOP SQL_THREAD, START SQL_THREAD CAUSES A TRX TO LOG WITH A DIFFERENT GTID

BUG#18306199 START SLAVE UNTIL MASTER_LOG_POS=MIDDLE-OF-GTID-EVENT STOPS BEFORE TRANSACTION

Problem:
=======

The slave loses track of GTID-header group boundaries when the group spans multiple relay log files. This means that when the transaction is retried, or if you stop the SQL thread in the middle of a transaction after some Rotates (assuming the transaction/group spans multiple relay log files), the Gtid_log_event is silently skipped on the slave, and the transaction is logged with the slave's GTID.

Also, when using "START SLAVE UNTIL MASTER_LOG_POS = x;", if "x" is in the middle of a transaction, the server is supposed to complete the transaction. This works fine when GTIDs are disabled. However, when GTIDs are enabled and "x" is in the middle of the Gtid_log_event, the slave stops before the transaction, because it does not consider the Gtid_log_event to be the beginning of the transaction.

Analysis:
========

When Rotate events are applied at the slave, the SQL thread checks that it is not inside a transaction before updating the SQL thread position. For STS (single-threaded slave), the SQL thread uses the Relay_log_info::is_in_group() function to determine whether it is inside a transaction.

A problem was found in Relay_log_info::is_in_group(): it did not consider a Gtid_log_event to be the beginning of a group/transaction. Because of this, the SQL thread was updating its position when applying a Rotate_log_event that immediately follows a Gtid_log_event (i.e., in the middle of a transaction but outside BEGIN...COMMIT).

This should not happen. The SQL thread position must not be updated in the middle of a transaction, so that the transaction can be retried (or re-applied) from the beginning after a failure caused by an InnoDB deadlock, after the transaction's execution time exceeds InnoDB's innodb_lock_wait_timeout, or after a request to stop the SQL thread.

For the problem with "START SLAVE UNTIL MASTER...", the SQL thread uses the Relay_log_info::is_until_satisfied() function to verify whether the until condition is satisfied. In this function, when the until condition is UNTIL_MASTER_POS, the SQL thread uses the current transaction position if it is in the middle of a transaction, or the current event position if it is not. This check did not consider a Gtid_log_event to be the beginning of a transaction.

Fix:
===

Made Relay_log_info::is_in_group() check whether the thread has a GTID set for the current transaction. If so, it returns true, stating that it is already in a group.

Made Relay_log_info::is_until_satisfied() use the is_in_group() function to verify whether the SQL thread is in the middle of a transaction. With this change, is_until_satisfied() considers the Gtid_log_event to be part of the transaction.
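The UNTIL_MASTER_POS part of the analysis can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual Relay_log_info code: the struct `SqlThreadState`, the functions `in_group` and `until_master_pos_satisfied`, the `fixed` switch, and all positions are hypothetical and chosen only for illustration. The check is assumed to run before each event is applied, comparing either the group position (when in a group) or the event's start position (when not) against the requested position.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the SQL thread's state.
struct SqlThreadState {
  std::uint64_t group_pos;  // master position where the last completed group ended
  bool in_begin;            // currently between BEGIN and COMMIT
  bool owns_gtid;           // a Gtid event was applied for the current group
};

// fixed == false: only BEGIN...COMMIT counts as being in a group.
// fixed == true:  owning a GTID for the current transaction also counts.
bool in_group(const SqlThreadState& s, bool fixed) {
  return s.in_begin || (fixed && s.owns_gtid);
}

// Evaluated before applying the event that starts at event_start_pos.
// Inside a group the group position is compared; outside a group the
// start position of the next event is compared.
bool until_master_pos_satisfied(const SqlThreadState& s,
                                std::uint64_t event_start_pos,
                                std::uint64_t until_pos,
                                bool fixed) {
  const std::uint64_t pos = in_group(s, fixed) ? s.group_pos : event_start_pos;
  return pos >= until_pos;
}
```

As an illustration with made-up positions: suppose a Gtid event occupies 100 to 148, BEGIN starts at 148, and the until position is 120. After the Gtid event is applied, the state is `{100, false, true}` and the next event starts at 148. With `fixed == false` the check returns true, so the thread stops before BEGIN. With `fixed == true` it compares 100 against 120, returns false, and the transaction runs to completion.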
[30 Sep 2014 12:27]
David Moss
Posted by developer: Reopened by merge, no new changelog entries needed. Set back to Closed.