Bug #70391 | uninstall and install semi-sync plugin causes slaves to break | ||
---|---|---|---|
Submitted: | 20 Sep 2013 21:21 | Modified: | 6 May 2014 14:37 |
Reporter: | Santosh Praneeth Banda | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 5.6.13, 5.6.14 | OS: | Any |
Assigned to: | | CPU Architecture: | Any |
[20 Sep 2013 21:21]
Santosh Praneeth Banda
[30 Sep 2013 9:49]
MySQL Verification Team
Hello Santosh, Thank you for the bug report. I tried to reproduce the issue with the provided steps but could not repeat the reported behavior. I also tried with a moderate to heavy load on the master, with no luck. Could you please provide the configuration files from all servers, along with any details that would help us reproduce this issue? Also, could you confirm whether this issue is repeatable on the latest GA release, i.e. 5.6.14? Thanks, Umesh
[21 Oct 2013 19:17]
Santosh Praneeth Banda
Sorry, I should have been clearer in my repro steps. I think you tried uninstalling the slave plugin on the master, but it should be done on a running semi-sync slave. Here is an mtr test that reproduces the problem consistently:

== mysql-test/suite/rpl/t/rpl_semi_sync_uninstall_plugin.test ==
source include/have_semisync_plugin.inc;
source include/not_embedded.inc;
source include/master-slave.inc;

connection master;
eval INSTALL PLUGIN rpl_semi_sync_master SONAME '$SEMISYNC_MASTER_PLUGIN';
set global rpl_semi_sync_master_timeout= 6000000;

connection slave;
eval INSTALL PLUGIN rpl_semi_sync_slave SONAME '$SEMISYNC_SLAVE_PLUGIN';
set global rpl_semi_sync_slave_enabled = ON;
source include/stop_slave.inc;
source include/start_slave.inc;

connection master;
set global rpl_semi_sync_master_enabled = ON;
create table t1 (a int);
insert into t1 values(1);

connection slave;
UNINSTALL PLUGIN rpl_semi_sync_slave;

connection master;
insert into t1 values(2);
drop table t1;
UNINSTALL PLUGIN rpl_semi_sync_master;
source include/rpl_end.inc;

== mysql-test/suite/rpl/t/rpl_semi_sync_uninstall_plugin-master.opt ==
$SEMISYNC_PLUGIN_OPT

== mysql-test/suite/rpl/t/rpl_semi_sync_uninstall_plugin-slave.opt ==
$SEMISYNC_PLUGIN_OPT
[22 Oct 2013 7:49]
MySQL Verification Team
Thank you for the feedback and test case. I'm able to reproduce the issue. Thanks, Umesh
[6 May 2014 14:37]
Paul DuBois
Noted in 5.5.39, 5.6.20, 5.7.5 changelogs. Uninstalling and reinstalling semisynchronous replication plugins while semisynchronous replication was active caused replication failures. The plugins now check whether they can be uninstalled and produce an error if semisynchronous replication is active. To uninstall the master-side plugin, there must be no semisynchronous slaves. To uninstall the slave-side plugin, there must be no semisynchronous I/O threads running.
[1 Aug 2014 15:51]
Laurynas Biveinis
5.5 $ bzr log -r 4631
------------------------------------------------------------
revno: 4631
committer: Venkatesh Duggirala<venkatesh.duggirala@oracle.com>
branch nick: mysql-5.5
timestamp: Mon 2014-05-05 22:22:15 +0530
message:
  Bug#17638477 UNINSTALL AND INSTALL SEMI-SYNC PLUGIN CAUSES SLAVES TO BREAK

  Problem: Uninstallation of semi sync plugin causes replication to break.

  Analysis: A semisync enabled replication is mutual agreement between Master and Slave when the connection (I/O thread) is established. Once I/O thread is started and if semisync is enabled on both master and slave, master appends special magic header to events using semisync plugin functions and sends it to slave. And slave expects that each event will have that special magic header format and reads those bytes using semisync plugin functions.

  When semi sync replication is in use if users execute uninstallation of the plugin on master, slave gets confused while interpreting that event's content because it expects special magic header at the beginning of the event. Slave SQL thread will be stopped with "Missing magic number in the header" error.

  Similar problem will happen if uninstallation of the plugin happens on slave when semi sync replication is in use. Master sends the events with magic header and slave does not know about the added magic header and thinks that it received a corrupted event. Hence slave SQL thread stops with "Found corrupted event" error.

  Fix: Uninstallation of semisync plugin will be blocked when semisync replication is in use and will throw 'ER_UNKNOWN_ERROR' error. To detect that semisync replication is in use, this patch uses semisync status variable values.

  > On Master, it checks for 'Rpl_semi_sync_master_status' to be OFF before allowing the uninstallation of rpl_semi_sync_master plugin.
  >> Rpl_semi_sync_master_status is OFF when
  >>> there is no dump thread running
  >>> there are no semisync slaves

  > On Slave, it checks for 'Rpl_semi_sync_slave_status' to be OFF before allowing the uninstallation of rpl_semi_sync_slave plugin.
  >> Rpl_semi_sync_slave_status is OFF when
  >>> there is no I/O thread running
  >>> replication is asynchronous replication.
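
Editorial sketch, not part of the commit: based on the fix described above, one plausible order for removing both plugins on a patched server is to close the semisync connection on the slave first, confirm the status variables, and only then uninstall. The status variable and plugin names are taken from the commit message and the test case in this report. The `Rpl_semi_sync_master_clients` check and the `SET GLOBAL rpl_semi_sync_master_enabled = OFF` step are assumptions added for illustration, and the exact condition enforced may differ between releases.

```sql
-- On the slave: stop the I/O thread so no semisync connection is active.
STOP SLAVE IO_THREAD;

-- Expected to report OFF once no semisynchronous I/O thread is running.
SHOW STATUS LIKE 'Rpl_semi_sync_slave_status';

UNINSTALL PLUGIN rpl_semi_sync_slave;

-- Optionally resume replication, now asynchronous.
START SLAVE IO_THREAD;

-- On the master, after every semisync slave has been handled as above.
-- Assumption: disabling the plugin first is one way to bring the status to OFF.
SET GLOBAL rpl_semi_sync_master_enabled = OFF;

-- Assumption: this counter is expected to be 0 when no semisync slaves remain.
SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';

UNINSTALL PLUGIN rpl_semi_sync_master;
```

If either UNINSTALL PLUGIN statement is issued while semisynchronous replication is still in use, the patched server is described as rejecting it with an error instead of breaking replication.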
[1 Aug 2014 15:52]
Laurynas Biveinis
5.5 $ bzr log -r 4632
------------------------------------------------------------
revno: 4632
committer: Venkatesh Duggirala<venkatesh.duggirala@oracle.com>
branch nick: mysql-5.5
timestamp: Tue 2014-05-06 11:23:42 +0530
message:
  Bug#17638477 UNINSTALL AND INSTALL SEMI-SYNC PLUGIN CAUSES SLAVES TO BREAK

  Fixing post push failure
[1 Aug 2014 15:53]
Laurynas Biveinis
5.5 laurynas$ bzr log -r 4634
------------------------------------------------------------
revno: 4634
committer: Venkatesh Duggirala<venkatesh.duggirala@oracle.com>
branch nick: mysql-5.5
timestamp: Wed 2014-05-07 14:33:58 +0530
message:
  Bug#17638477 UNINSTALL AND INSTALL SEMI-SYNC PLUGIN CAUSES SLAVES TO BREAK

  Fixing post push failure