Bug #70327 | Assersion error when setting future binlog file/pos with semisync | ||
---|---|---|---|
Submitted: | 13 Sep 2013 8:05 | Modified: | 3 Sep 2014 15:03 |
Reporter: | Yoshinori Matsunobu (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.6.13 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[13 Sep 2013 8:05]
Yoshinori Matsunobu
[13 Sep 2013 8:54]
MySQL Verification Team
Verified as described. ---------------------------------------------------------- On the slave; mysql> > stop slave; change master to master_log_file='mysql-bin.000002'; start slave; Query OK, 0 rows affected (0.06 sec) Query OK, 0 rows affected (0.02 sec) Query OK, 0 rows affected (0.00 sec) Then the server crashes with signal 6; This is because active_tranxs_ doesn't include the future binlog position. (The same problem will happen if a very old position is specified.) ---------------------------------------------------------- 2013-09-13 17:45:49 22182 [Note] Stop asynchronous binlog_dump to slave (server_id: 101) 2013-09-13 17:49:06 22182 [Note] Start semi-sync binlog_dump to slave (server_id: 101), pos(mysql-bin.000002, 20179840) 2013-09-13 17:49:06 22182 [Note] Stop semi-sync binlog_dump to slave (server_id: 101) 2013-09-13 17:49:06 22182 [Note] Stop semi-sync binlog_dump to slave (server_id: 101) mysqld: /export/home/pb2/build/sb_0-9714019-1373474833.36/mysql-5.6.13/plugin/semisync/semisync_master.cc:764: int ReplSemiSyncMaster::commitTrx(const char*, my_off_t): Assertion `!active_tranxs_->is_tranx_end_pos(trx_wait_binlog_name, trx_wait_binlog_pos)' failed. 08:49:06 UTC - mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.
[3 Sep 2014 15:03]
David Moss
The following was added to the 5.6.21 and 5.7.5 changelog with commit 4127: When using semisynchronous replication, if the binary log position was changed to a future position on a slave then an assertion error was generated on the master. This fix ensures that in such a situation the future position is correctly acknowledged and an error is instead generated on the slave.
[3 Sep 2014 15:04]
David Moss
Posted by developer: The following was added to the 5.6.21 and 5.7.5 changelog with commit 4127: When using semisynchronous replication, if the binary log position was changed to a future position on a slave then an assertion error was generated on the master. This fix ensures that in such a situation the future position is correctly acknowledged and an error is instead generated on the slave.
[25 Sep 2014 11:36]
Laurynas Biveinis
revno: 6055 committer: Sujatha Sivakumar <sujatha.sivakumar@oracle.com> branch nick: Bug17453826_mysql-5.6 timestamp: Wed 2014-07-02 11:34:11 +0530 message: Bug#17453826:ASSERTION ERROR WHEN SETTING FUTURE BINLOG FILE/POS WITH SEMISYNC Problem: ======== When DMLs are in progress on the master stopping a slave and setting ahead binlog name/pos will cause an assert on the master. Analysis: ======== Trx1 is waiting for ack on certain position meanwhile change master command is issued from the slave to master with a future log file name and position. Upon receipt master's dump thread assumes that it has received ack till the future log file and position and clears the active transaction list and sets the reply_file_name_ and reply_file_pos_ to future file name and pos. since reply file name and position are ahead of Trx1's current position Trx1 completes successfully. Execute a new DML Trx2. During commit Trx2 waits for an ack on some position but the reply file name and position are already pointing to future file name and position so this current transaction thinks that it has already got an ack and no need to wait and proceeds further. But the active transaction has not actually received any ack the transaction node is not cleared from active transaction list. During the exit of commit operation there is an assert which expects active transaction node to be cleared from the transaction list which fails this leads to an assert. Fix: === Ideally when a request for change master is received the request file name and position should be validated for correctness and only after the validation the transmit_start hook should be fired. Hence moved the transmit_start hook position to appropriate place where it is placed post the validation of file name and position. Post fix when a future log file name and position is specified ER_MASTER_FATAL_ERROR_READING_BINLOG error will be returned to slave. Note: During fixing this bug one more bug was identified. Rpl_semi_sync_master_clients this parameter can become negative at certain conditions causing UNINSTALL PLUGIN rpl_semi_sync_master to fail. 'Rpl_semi_sync_master_clients' client is incremented when dump thread initiates transmission and when a new slave connection is established. This value is decremented when dump thread exits and slave gets disconnected. On a certain condition if IO thread stops due to an error and dump thread exited before "transmit_start" hook dump thread will still try to call "transmit_stop" and this will cause Rpl_semi_sync_master_clients= -1 which is invalid state. Hence as part of the fix a new flag is introduced such that "transmit_stop" will be called only when "transmit_start" is called.
[25 Sep 2014 11:43]
Laurynas Biveinis
revno: 6072 committer: Sujatha Sivakumar <sujatha.sivakumar@oracle.com> branch nick: Bug17453826_mysql-5.6 timestamp: Fri 2014-07-11 11:20:54 +0530 message: Bug#17453826:ASSERTION ERROR WHEN SETTING FUTURE BINLOG FILE/POS WITH SEMISYNC Fixing a post push test issue.
[30 Sep 2014 13:47]
David Moss
Posted by developer: Reopened by merge, no new changelog entries needed. Set back to Closed.