Bug #113813 | Semisync will abort due to wrong transaction binlog postion comparison | ||
---|---|---|---|
Submitted: | 31 Jan 2024 2:26 | Modified: | 1 Aug 8:13 |
Reporter: | karry zhang (OCA) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | MySQL8.0,8.0.36 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | Contribution |
[31 Jan 2024 2:26]
karry zhang
[31 Jan 2024 7:18]
MySQL Verification Team
Hello karry zhang, Thank you for the report and feedback. I quickly tried on 8.0.36 build but seems provided mtr test case is not complete(maybe include/have_semisync_plugin.inc missing?). Could you please resend the exact mtr test case to reproduce at our end? Thank you. -- ./mtr --suite=rpl bug113813 Logging: ./mtr --suite=rpl bug113813 MySQL Version 8.0.36 Checking supported features Collecting tests - Adding combinations for rpl Checking leftover processes Removing old var directory Creating var directory '/export/home/tmp/ushastry/mysql-8.0.36/mysql-test/var' Installing system database Using parallel: 1 ============================================================================== TEST NAME RESULT TIME (ms) COMMENT ------------------------------------------------------------------------------ [ 25%] rpl.bug113813 'mix' [ fail ] Test ended at 2024-01-31 08:15:22 CURRENT_TEST: rpl.bug113813 WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a future version. mysqltest: At line 27: Query 'SET @old_rpl_semi_sync_slave_enabled = @@GLOBAL.rpl_semi_sync_slave_enabled' failed. ERROR 1193 (HY000): Unknown system variable 'rpl_semi_sync_slave_enabled' safe_process[15387]: Child process: 15388, exit: 1 regards, Umesh
[31 Jan 2024 12:18]
karry zhang
you can try the follow tese case on 8.0.36: semisync.test ============= source include/have_semisync_plugin.inc; --let $rpl_skip_start_slave= 1 source include/master-slave.inc; connection master; --let $MYSQLD_DATADIR= `select @@datadir` --let $master_binlog= query_get_value(SHOW MASTER STATUS, File, 1) --let $master_binlog_index = `SELECT @@log_bin_index` --let $rpl_server_number= 1 source include/rpl_stop_server.inc; --echo --echo # Change the binlog index entry and binlog file name so that the binlog can reach master-bin.999998, making it easier to reproduce the problem. --exec cat $master_binlog_index --exec sed -i 's/000001/999998/g' $master_binlog_index --exec cat $master_binlog_index --exec mv $MYSQLD_DATADIR/$master_binlog $MYSQLD_DATADIR/master-bin.999998 --let $rpl_server_number= 1 source include/rpl_start_server.inc; connection master; source include/install_semisync_source.inc; SET @old_rpl_semi_sync_source_enabled = @@GLOBAL.rpl_semi_sync_source_enabled; SET GLOBAL rpl_semi_sync_source_enabled = ON; # Set a relatively long rpl_semi_sync_master_timeout to prevent the impact of timeout. SET @old_rpl_semi_sync_source_timeout = @@GLOBAL.rpl_semi_sync_source_timeout; SET global rpl_semi_sync_source_timeout = 100000; SHOW VARIABLES LIKE "%rpl_semi_sync_source_enabled%"; SHOW MASTER STATUS; #SHOW STATUS LIKE 'Rpl_semi_sync%'; --echo --echo # enable semi-sync replication connection slave; --source include/install_semisync_replica.inc SET @old_rpl_semi_sync_replica_enabled = @@GLOBAL.rpl_semi_sync_replica_enabled; SET GLOBAL rpl_semi_sync_replica_enabled = ON; source include/start_slave.inc; SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_replica_enabled'; SHOW STATUS LIKE 'Rpl_semi_sync_replica_status'; SHOW REPLICA STATUS; --echo --echo # create some data. connection master; SHOW STATUS LIKE 'Rpl_semi_sync%'; CREATE DATABASE tt; USE tt; CREATE TABLE t1(id int); INSERT INTO t1 VALUES(0); INSERT INTO t1 VALUES(1); INSERT INTO t1 VALUES(2); SHOW STATUS LIKE 'Rpl_semi_sync%'; --echo --echo # flush binary logs so that binlog reaches master-bin.1000000. FLUSH BINARY LOGS; SHOW MASTER STATUS; SHOW STATUS LIKE 'Rpl_semi_sync%'; --echo --echo # semi-sync aborts due to wrong transaction binlog position comparison INSERT INTO t1 VALUES(3); INSERT INTO t1 VALUES(4); SHOW STATUS LIKE 'Rpl_semi_sync%'; --echo --echo # clean up DROP TABLE t1; DROP DATABASE tt; source include/sync_slave_sql_with_master.inc; connection slave; source include/stop_slave.inc; SET GLOBAL rpl_semi_sync_replica_enabled = @old_rpl_semi_sync_replica_enabled; source include/uninstall_semisync_replica.inc; source include/start_slave.inc; connection master; SET GLOBAL rpl_semi_sync_source_enabled = @old_rpl_semi_sync_source_enabled; SET GLOBAL rpl_semi_sync_source_timeout = @old_rpl_semi_sync_source_timeout; source include/uninstall_semisync_source.inc; source include/rpl_end.inc; semisync-master.opt =================== $SEMISYNC_PLUGIN_OPT semisync-slave.opt ================== $SEMISYNC_PLUGIN_OPT
[31 Jan 2024 12:58]
MySQL Verification Team
Hello karry zhang, Thank you for the report and feedback. regards, Umesh
[7 Feb 2024 1:58]
karry zhang
bugfix patch (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: semisync_bugfix.patch (application/octet-stream, text), 922 bytes.
[7 Feb 2024 8:24]
MySQL Verification Team
Hello karry zhang, Thank you for your contribution. regards, Umesh
[23 Jul 2:01]
karry zhang
In the latest GA of MySQL 9.4, I noticed that the official released the following: Replication: During semisync replication, when the length of the binary log suffix exceeded six digits (.999999), so that the next log file became—for example—mysql-bin.1000000, the replication protocol unexpectedly changed from semisynchronous to asynchronous. Our thanks to Wuhao Cao for the contribution. (Bug #115861, Bug #37024069) I noticed that the fix patch I provided is basically the same as Wuhao Cao's, but the time I reported the bug and the time I fixed it were earlier than Wuhao Cao, but the official did not accept my fix, which made me very disappointed. In addition, I noticed that many of the bugs I reported and the fixes did not attract the official's attention, which is unacceptable for a developer who has been focusing on MySQL kernel development for many years.
[23 Jul 2:01]
karry zhang
In the latest GA of MySQL 9.4, I noticed that the official released the following: Replication: During semisync replication, when the length of the binary log suffix exceeded six digits (.999999), so that the next log file became—for example—mysql-bin.1000000, the replication protocol unexpectedly changed from semisynchronous to asynchronous. Our thanks to Wuhao Cao for the contribution. (Bug #115861, Bug #37024069) I noticed that the fix patch I provided is basically the same as Wuhao Cao's, but the time I reported the bug and the time I fixed it were earlier than Wuhao Cao, but the official did not accept my fix, which made me very disappointed. In addition, I noticed that many of the bugs I reported and the fixes did not attract the official's attention, which is unacceptable for a developer who has been focusing on MySQL kernel development for many years.
[30 Jul 6:55]
Edward Gilmore
Posted by developer: Fixed as part of 37024069 / 115861. Added the following note to the MySQL Server 8.0.43, 8.4.6, and 9.4.0 release notes: During semisync replication, when the length of the binary log suffix exceeded six digits (.999999), so that the next log file became--for example--mysql-bin.1000000, the replication protocol unexpectedly changed from semisynchronous to asynchronous. Our thanks to Wuhao Cao and Karry Zhang and the team at Alibaba for the contribution.
[30 Jul 7:58]
Nuno Carvalho
Hello Karry Zhang, There was a mistake on the handling of this bug which led to its absence from the community contributions. This omission was fixed, being the bug report and release notes updated. We apologise for the inconvenience. Best regards, Nuno Carvalho
[1 Aug 8:13]
karry zhang
Thank you Edward Gilmore and Nuno Carvalho. I sincerely hope that the other bugs I found will be fixed in future versions, and I look forward to continuing to communicate with you in the future.