Bug #46828 | rpl_get_master_version_and_clock fails on PB-2 | ||
---|---|---|---|
Submitted: | 20 Aug 2009 10:30 | Modified: | 19 Dec 2009 11:08 |
Reporter: | Andrei Elkin | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | Tests: Replication | Severity: | S3 (Non-critical) |
Version: | 5.1 | OS: | Any |
Assigned to: | Daogang Qu | CPU Architecture: | Any |
Tags: | test failure |
[20 Aug 2009 10:30]
Andrei Elkin
[23 Sep 2009 18:56]
Alexander Nozdrin
It also happens in 5.1: http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2330459&... May be related to Bug#46931.
[29 Sep 2009 12:44]
Andrei Elkin
As Joro noted the bug must have been fixed by suppression.
[5 Oct 2009 0:02]
Luis Soares
It has been failing recently with different symptom: http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2365035 Sample push: http://pb2.norway.sun.com/web.py?template=push_details&push=559202
[7 Oct 2009 14:55]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/86032 3645 Tor Didriksen 2009-10-07 Bug#46828 rpl_get_master_version_and_clock fails on PB-2 Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Adds some more mtr.add_suppression() calls.
[8 Oct 2009 8:33]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/86095 3648 Tor Didriksen 2009-10-07 Bug#46828 rpl_get_master_version_and_clock fails on PB-2 Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Adds some more mtr.add_suppression() calls.
[8 Oct 2009 10:27]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/86122 3648 Tor Didriksen 2009-10-08 Bug#46828 rpl_get_master_version_and_clock fails on PB-2 Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Adds some more mtr.add_suppression() calls.
[13 Oct 2009 6:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/86640 3649 Tor Didriksen 2009-10-13 Bug#46828 rpl_get_master_version_and_clock fails on PB-2 Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result Adds some more mtr.add_suppression() calls. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Adds some more mtr.add_suppression() calls (to both connections)
[14 Oct 2009 16:14]
Tor Didriksen
pushed to mysql-6.0-codebase-bugfixing
[16 Oct 2009 8:51]
Daogang Qu
Failure happened again on PB-2 has the following details: mysql_pe: http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2384281 rpl.rpl_get_master_version_and_clock w3 [ fail ] Found warnings/errors in server log file! Test ended at 2009-10-15 07:23:27 mysqltest: Logging to ''. mysqltest: Results saved in ''. mysqltest: Connecting to server localhost:13101 (socket /export/home2/pb2/test/sb_1-831087-1255581662.59/tmp/okMyWYzjCI/3/mysqld.2.sock) as 'root', connection 'default', attempt 0 ... mysqltest: ... Connected. mysqltest: Start processing test commands from './include/check-warnings.test' ... file_name line /export/home2/pb2/test/sb_1-831087-1255581662.59/mysql-6.0.14-alpha-linux-i686-test/mysql-test/var/3/log/mysqld.2.err 091015 8:23:15 [Warning] Slave I/O: Get master clock failed with error: Can't connect to MySQL server on '127.0.0.1' (111), Error_code: 2003 /export/home2/pb2/test/sb_1-831087-1255581662.59/mysql-6.0.14-alpha-linux-i686-test/mysql-test/var/3/log/mysqld.2.err 091015 8:23:23 [Warning] Slave I/O: Get master SERVER_ID failed with error: Can't connect to MySQL server on '127.0.0.1' (111), Error_code: 2003 ^ Found warnings!! mysqltest: ... Done processing test commands. mysqltest: Test has succeeded! ok - saving '/export/home2/pb2/test/sb_1-831087-1255581662.59/mysql-6.0.14-alpha-linux-i686-test/mysql-test/var/3/log/rpl.rpl_get_master_version_and_clock/' to '/export/home2/pb2/test/sb_1-831087-1255581662.59/mysql-6.0.14-alpha-linux-i686-test/mysql-test/var/log/rpl.rpl_get_master_version_and_clock/' Retrying test, attempt(2/3)... mysql_5.1_bugteam http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2384521 rpl.rpl_get_master_version_and_clock w8 [ exp-fail ] Test ended at 2009-10-15 12:00:31 CURRENT_TEST: rpl.rpl_get_master_version_and_clock ==28090== Memcheck, a memory error detector. ==28090== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al. ==28090== Using LibVEX rev 1658, a library for dynamic binary translation. ==28090== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP. ==28090== Using valgrind-3.2.1, a dynamic binary instrumentation framework. ==28090== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al. ==28090== For more details, rerun with: -v ==28090== --- /export/home2/pb2/test/sb_1-832872-1255594574.7/mysql-5.1.41-linux-x86_64-test/mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.result 2009-10-15 11:00:24.000000000 +0300 +++ /export/home2/pb2/test/sb_1-832872-1255594574.7/mysql-5.1.41-linux-x86_64-test/mysql-test/suite/rpl/r/rpl_get_master_version_and_clock.reject 2009-10-15 13:00:31.000000000 +0300 @@ -20,25 +20,86 @@ 1 Check network error happened here NETWORK ERROR -SELECT IS_FREE_LOCK("debug_lock.before_get_SERVER_ID"); -IS_FREE_LOCK("debug_lock.before_get_SERVER_ID") -1 -SELECT GET_LOCK("debug_lock.before_get_SERVER_ID", 1000); -GET_LOCK("debug_lock.before_get_SERVER_ID", 1000) -1 -set global debug= 'd,debug_lock.before_get_SERVER_ID'; -stop slave; -start slave; -SELECT RELEASE_LOCK("debug_lock.before_get_SERVER_ID"); -RELEASE_LOCK("debug_lock.before_get_SERVER_ID") -1 -Check network error happened here -NETWORK ERROR -set global debug= ''; -reset master; -include/stop_slave.inc -change master to master_port=SLAVE_PORT; -start slave; -*** must be having the replicate-same-server-id IO thread error *** -Slave_IO_Errno= 1593 -Slave_IO_Error= Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it). +**** ERROR: timeout after seconds while waiting for slave parameter Slave_IO_Running = Yes **** +Message: Failed while waiting for slave to start +Current connection is 'slave' +Note: the following output may have changed since the failure was detected + +[on slave] + +**** SHOW SLAVE STATUS on slave **** +SHOW SLAVE STATUS; +Slave_IO_State +Master_Host 127.0.0.1 +Master_User root +Master_Port 13060 +Connect_Retry 1 +Master_Log_File master-bin.000001 +Read_Master_Log_Pos 664 +Relay_Log_File slave-relay-bin.000004 +Relay_Log_Pos 810 +Relay_Master_Log_File master-bin.000001 +Slave_IO_Running No +Slave_SQL_Running Yes +Replicate_Do_DB +Replicate_Ignore_DB +Replicate_Do_Table +Replicate_Ignore_Table +Replicate_Wild_Do_Table +Replicate_Wild_Ignore_Table +Last_Errno 0 +Last_Error +Skip_Counter 0 +Exec_Master_Log_Pos 664 +Relay_Log_Space 1009 +Until_Condition None +Until_Log_File +Until_Log_Pos 0 +Master_SSL_Allowed No +Master_SSL_CA_File +Master_SSL_CA_Path +Master_SSL_Cert +Master_SSL_Cipher +Master_SSL_Key +Seconds_Behind_Master NULL +Master_SSL_Verify_Server_Cert No +Last_IO_Errno 2013 +Last_IO_Error error reconnecting to master 'root@127.0.0.1:13060' - retry-time: 1 retries: 10 +Last_SQL_Errno 0 +Last_SQL_Error + +**** SHOW PROCESSLIST on slave **** +SHOW PROCESSLIST; +Id User Host db Command Time State Info +2 root localhost:39597 test Query 0 NULL SHOW PROCESSLIST +3 root localhost:39598 test Sleep 430 NULL +7 system user NULL Connect 428 Has read all relay log; waiting for the slave I/O thread to update it NULL + +**** SHOW BINLOG EVENTS on slave **** +SHOW BINLOG EVENTS IN 'slave-bin.000001'; +Log_name Pos Event_type Server_id End_log_pos Info +slave-bin.000001 4 Format_desc 2 106 Server ver: 5.1.41-debug-log, Binlog ver: 4 +slave-bin.000001 106 Query 1 385 use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Slave I/O: .* failed with error: Lost connection to MySQL server at \'reading initial communication packet\'' COLLATE 'latin1_swedish_ci')) +slave-bin.000001 385 Query 1 664 use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Slave I/O: Master command COM_REGISTER_SLAVE failed: failed registering on master, reconnecting to try again' COLLATE 'latin1_swedish_ci')) + +[on master] + +**** SHOW MASTER STATUS on master **** +SHOW MASTER STATUS; +File master-bin.000002 +Position 106 +Binlog_Do_DB +Binlog_Ignore_DB + +**** SHOW PROCESSLIST on master **** +SHOW PROCESSLIST; +Id User Host db Command Time State Info +1 root localhost:49670 test Query 0 NULL SHOW PROCESSLIST + +**** SHOW BINLOG EVENTS on master **** +SHOW BINLOG EVENTS IN 'master-bin.000001'; +Log_name Pos Event_type Server_id End_log_pos Info +master-bin.000001 4 Format_desc 1 106 Server ver: 5.1.41-debug-log, Binlog ver: 4 +master-bin.000001 106 Query 1 385 use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Slave I/O: .* failed with error: Lost connection to MySQL server at \'reading initial communication packet\'' COLLATE 'latin1_swedish_ci')) +master-bin.000001 385 Query 1 664 use `mtr`; INSERT INTO test_suppressions (pattern) VALUES ( NAME_CONST('pattern',_latin1'Slave I/O: Master command COM_REGISTER_SLAVE failed: failed registering on master, reconnecting to try again' COLLATE 'latin1_swedish_ci')) +master-bin.000001 664 Stop 1 683 mysqltest: In included file "./include/wait_for_slave_param.inc": Result content mismatch
[19 Oct 2009 6:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/87249 3183 Dao-Gang.Qu@sun.com 2009-10-19 Bug #46828 rpl_get_master_version_and_clock fails on PB-2 The 'rpl_get_master_version_and_clock' test verifies if the slave I/O thread tries to reconnect to master when it tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection. So the master server is restarted for making the transient network disconnection, so that these warnings and errors are produced in server log file. These warnings and errors are expected, so we should suppress them by mtr suppression. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Added mtr suppression for suppressing warnings and errors in server log file.
[22 Oct 2009 2:04]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/87708 3183 Dao-Gang.Qu@sun.com 2009-10-22 Bug #46828 rpl_get_master_version_and_clock fails on PB-2 The 'rpl_get_master_version_and_clock' test verifies if the slave I/O thread tries to reconnect to master when it tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection. So the master server is restarted for making the transient network disconnection. Restarting master server can bring two problems as following: 1. The time out error is encountered sporadically. The slave I/O thread tries to reconnect master ten times with one second every time. So sporadically the slave I/O thread really stoped when it can't reconnect master in the ten seconds successfully, then the time out error will be encountered while waiting for slave start. 2. These warnings and errors are produced in server log file when the slave I/O thread tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under the transient network disconnection. To fix problem 1, increase the master retry count to three hundred times, so that the slave I/O thread has enough time to reconnect master successfully. To fix problem 2, suppress these warnings and errors by mtr suppression, because they are expected. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock-slave.opt Added the *.opt file for increasing master retry count to three hundred times. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Added mtr suppression for suppressing warnings and errors in server log file.
[29 Oct 2009 2:12]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88545 3163 Dao-Gang.Qu@sun.com 2009-10-29 Bug #46828 rpl_get_master_version_and_clock fails on PB-2 The 'rpl_get_master_version_and_clock' test verifies if the slave I/O thread tries to reconnect to master when it tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection. So the master server is restarted for making the transient network disconnection. Restarting master server can bring two problems as following: 1. The time out error is encountered sporadically. The slave I/O thread tries to reconnect master ten times with one second per time, which is set in my.cnf. So in the test framework sporadically the slave I/O thread really stoped when it can't reconnect to master in the ten times successfully before the master starts, then the time out error will be encountered while waiting for the slave to start. 2. These warnings and errors are produced in server log file when the slave I/O thread tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under the transient network disconnection. To fix problem 1, increase the master retry count to sixty times, so that the slave I/O thread has enough time to reconnect master successfully. To fix problem 2, suppress these warnings and errors by mtr suppression, because they are expected. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock-slave.opt Added the *.opt file for increasing master retry count to sixty times. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Added mtr suppression for suppressing warnings and errors in server log file.
[29 Oct 2009 2:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88548 3164 Dao-Gang.Qu@sun.com 2009-10-29 Bug #46828 rpl_get_master_version_and_clock fails on PB-2 The 'rpl_get_master_version_and_clock' test verifies if the slave I/O thread tries to reconnect to master when it tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection. So the master server is restarted for making the transient network disconnection. Restarting master server can bring two problems as following: 1. The time out error is encountered sporadically. The slave I/O thread tries to reconnect master ten times, which is set in my.cnf. So in the test framework sporadically the slave I/O thread really stoped when it can't reconnect to master in the ten times successfully before the master starts, then the time out error will be encountered while waiting for the slave to start. 2. These warnings and errors are produced in server log file when the slave I/O thread tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under the transient network disconnection. To fix problem 1, increase the master retry count to sixty times, so that the slave I/O thread has enough time to reconnect master successfully. To fix problem 2, suppress these warnings and errors by mtr suppression, because they are expected. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock-slave.opt Added the *.opt file for increasing master retry count to sixty times. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Added mtr suppression for suppressing warnings and errors in server log file.
[29 Oct 2009 3:02]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/88550 3652 Dao-Gang.Qu@sun.com 2009-10-29 [merge] Bug #46828 rpl_get_master_version_and_clock fails on PB-2 The 'rpl_get_master_version_and_clock' test verifies if the slave I/O thread tries to reconnect to master when it tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under network disconnection. So the master server is restarted for making the transient network disconnection. Restarting master server can bring two problems as following: 1. The time out error is encountered sporadically. The slave I/O thread tries to reconnect master ten times, which is set in my.cnf. So in the test framework sporadically the slave I/O thread really stoped when it can't reconnect to master in the ten times successfully before the master starts, then the time out error will be encountered while waiting for the slave to start. 2. These warnings and errors are produced in server log file when the slave I/O thread tries to get the values of the UNIX_TIMESTAMP, SERVER_ID from master under the transient network disconnection. To fix problem 1, increase the master retry count to sixty times, so that the slave I/O thread has enough time to reconnect master successfully. To fix problem 2, suppress these warnings and errors by mtr suppression, because they are expected. @ mysql-test/suite/rpl/t/disabled.def The 'rpl_get_master_version_and_clock' test case is enabled, because all the errors will be fixed by the patch of bug#46828. We should disable the 'rpl_get_master_version_and_clock' test case base on the bug#46828 instead of bug#46931. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock-slave.opt Added the *.opt file for increasing master retry count to sixty times. @ mysql-test/suite/rpl/t/rpl_get_master_version_and_clock.test Added mtr suppression for suppressing warnings and errors in server log file.
[29 Oct 2009 3:17]
Daogang Qu
The patch is pushed to mysql-5.1-bugteam and mysql-pe.
[31 Oct 2009 8:18]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091031081410-qkxmjsdzjmj840aq) (version source revid:tor.didriksen@sun.com-20091013064328-c53a4f2ait9lktkx) (merge vers: 6.0.14-alpha) (pib:13)
[31 Oct 2009 16:56]
Jon Stephens
Changes affect tests only; closed without further action.
[4 Nov 2009 9:25]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091104092152-qz96bzlf2o1japwc) (version source revid:kristofer.pettersson@sun.com-20091103162305-08l4gkeuif2ozsoj) (merge vers: 5.1.41) (pib:13)
[4 Nov 2009 16:34]
Jon Stephens
re-closing; see my previous comment.
[11 Nov 2009 6:51]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091110093407-rw5g8dys2baqkt67) (version source revid:alik@sun.com-20091109080109-7dxapd5y5pxlu08w) (merge vers: 6.0.14-alpha) (pib:13)
[11 Nov 2009 6:59]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091109115615-nuohp02h8mdrz8m2) (version source revid:alik@sun.com-20091105092355-jzukkw6wnd7hmgrj) (merge vers: 5.5.0-beta) (pib:13)
[11 Nov 2009 14:46]
Jon Stephens
Re-closed. See my comment from 4 November.
[18 Dec 2009 10:30]
Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:46]
Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:01]
Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:16]
Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[19 Dec 2009 11:08]
Jon Stephens
Setting back to Closed. state. See my previous comments.