Bug #56416 | errors in first connection for IO thread are not retried | ||
---|---|---|---|
Submitted: | 31 Aug 2010 16:51 | Modified: | 29 Nov 2010 22:45 |
Reporter: | Mark Callaghan | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | 5.1.47 | OS: | Any |
Assigned to: | Daogang Qu | CPU Architecture: | Any |
Tags: | connect, error, replication |
[31 Aug 2010 16:51]
Mark Callaghan
[31 Aug 2010 17:25]
Sveta Smirnova
Thank you for the report. Verified as described: problem is number of retries which is not 86400 for just few seconds.
[17 Sep 2010 10:29]
Daogang Qu
It is not a bug. From the following code, we can see the mysql_real_connect(...) fun will be try to execute 'mi->retry_count' times if the connection always failed. And by default the mi->retry_count is set to 3600*24 (== 86400), which verifys that it really try to connect for 86400 times, just the error infomation will be reported in the first time. while (!(slave_was_killed = io_slave_killed(thd,mi)) && (reconnect ? mysql_reconnect(mysql) != 0 : mysql_real_connect(mysql, mi->host, mi->user, mi->password, 0, mi->port, 0, client_flag) == 0)) { /* Don't repeat last error */ if ((int)mysql_errno(mysql) != last_errno || DBUG_EVALUATE_IF("connect_to_master_always_report_error", TRUE, FALSE)) { /* TODO: would be great that when issuing SHOW SLAVE STATUS the number of retries would actually show err_count instead of mi->retry_count. We can achieve that if we remove the 'if' above and replace mi->retry_count with err_count. However, we would be adding an entry in the error log for each connection attempt (ie, for each retry). */ last_errno=mysql_errno(mysql); suppress_warnings= 0; mi->report(ERROR_LEVEL, last_errno, "error %s to master '%s@%s:%d'" " - retry-time: %d retries: %lu", (reconnect ? "reconnecting" : "connecting"), mi->user, mi->host, mi->port, mi->connect_retry, DBUG_EVALUATE_IF("connect_to_master_always_report_error", err_count+1, mi->retry_count)); } /* By default we try forever. The reason is that failure will trigger master election, so if the user did not set mi->retry_count we do not want to have election triggered on the first failure to connect */ if (++err_count == mi->retry_count) { slave_was_killed=1; break; } safe_sleep(thd,mi->connect_retry,(CHECK_KILLED_FUNC)io_slave_killed, (void*)mi); } {"master-retry-count", 0, "The number of tries the slave will make to connect to the master before giving up.", &master_retry_count, &master_retry_count, 0, GET_ULONG, REQUIRED_ARG, 3600*24, 0, 0, 0, 0, 0},
[17 Sep 2010 16:50]
Mark Callaghan
Yes, it is not a bug but a feature request as the IO thread is stuck in the connection retry loop mostly sleeping. I think that SHOW SLAVE STATUS could do a better job of describing that. *************************** 1. row *************************** Slave_IO_State: Connecting to master Master_Host: foobar Master_User: root Master_Port: 3306 Connect_Retry: 60 Master_Log_File: Read_Master_Log_Pos: 4 Relay_Log_File: dev347-relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: Slave_IO_Running: No Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 0 Relay_Log_Space: 212 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 2005 Last_IO_Error: error connecting to master 'root@foobar:3306' - retry-time: 60 retries: 86400 Last_SQL_Errno: 0 Last_SQL_Error: 1 row in set (0.00 sec)
[20 Sep 2010 3:12]
Zhenxing He
Hi Mark, Slave_IO_State shows status 'Connecting to master', and also for 5.5, the Slave_IO_Running will show 'Connecting' instead of 'No' when slave IO thread is trying to (re-)connecting the master (BUG#41613). So I think by checking these fields, you can know that the slave is connecting. I'm not sure if this can meet your needs.
[20 Sep 2010 6:37]
Mark Callaghan
I am not sure that this needs to be changed. However, I missed what was going on and the person who reported it to me missed what was going on. At this point, I am tainted. Maybe you can ask one or two other people whether they can diagnose the current that of the slave when this problem occurs.
[20 Sep 2010 6:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/118550 3273 Dao-Gang.Qu@sun.com 2010-09-20 Bug #56416 errors in first connection for IO thread are not retried SHOW SLAVE STATUS can not display the number of real retries for each connection attempt of I/O thread. To fix the problem to make SHOW SLAVE STATUS to display the number of retries which would be real retry counts instead of mi->retry_count(3600*24 by default) for each connection attempt of I/O thread. @ mysql-test/suite/rpl/r/rpl_change_master_dbug.result Updated for the patch of BUG #56416 @ mysql-test/suite/rpl/t/rpl_change_master_dbug.test Updated test case to also verifies that the 'Last_IO_Error' entry of SHOW SLAVE STATUS will display the number of real retries. @ sql/rpl_slave.cc Added code to make SHOW SLAVE STATUS to display the number of real retries for each connection attempt of I/O thread.
[26 Sep 2010 23:58]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/119130 3299 Dao-Gang.Qu@sun.com 2010-09-27 Bug #56416 errors in first connection for IO thread are not retried SHOW SLAVE STATUS can not display the number of real retries for each connection attempt of I/O thread by 'Last_IO_Error' entry. To fix the problem to make SHOW SLAVE STATUS to display the number of retries which would be real retry counts instead of mi->retry_count(3600*24 by default) for each connection attempt of I/O thread. @ mysql-test/suite/rpl/r/rpl_change_master_dbug.result Updated for the patch of BUG #56416 @ mysql-test/suite/rpl/t/rpl_change_master_dbug.test Updated test case to also verifies that the 'Last_IO_Error' entry of SHOW SLAVE STATUS will display the number of real retries. @ sql/rpl_slave.cc Added code to make SHOW SLAVE STATUS to display the number of real retries for each connection attempt of I/O thread.
[2 Oct 2010 18:15]
Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101002181053-6iotvl26uurcoryp) (version source revid:alexander.nozdrin@oracle.com-20101002180917-h0n62akupm3z20nt) (pib:21)
[13 Nov 2010 16:21]
Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (merge vers: 5.6.99-m4) (pib:21)
[29 Nov 2010 22:45]
Jon Stephens
Documented bugfix in the 5.6.1 changelog as follows: SHOW SLAVE STATUS now displays the actual number of retries for each connection attempt made by the I/O thread. Closed.