Bug #17978 mysql-max-4.0.20 Master to mysql-max-5.0.18 Slave I/O error connecting to master
Submitted: 6 Mar 2006 18:27
Modified: 24 Apr 2006 14:37
Reporter: Ryan Turnbull
Status: Closed
Category: MySQL Server
Severity: S3 (Non-critical)
Version: mysql-max-5.0.18-linux-i686-glibc23
OS: Linux (Slackware 10.0)
Assigned to:
CPU Architecture: Any

[6 Mar 2006 18:27] Ryan Turnbull
Description:
I have discovered a small error that may or may not be critical. In our operation we are trying to move to the newest version of MySQL (5.0). We currently have a MySQL master running version 4.0.20 replicating its data to a MySQL 5.0.18 slave. Replication works fine for about a week, then we get the following error:

060304  4:12:12 [ERROR] Slave I/O thread: error reconnecting to master 'replicatem@priapus:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400

This doesn't really help, because a) there is no error number (errno is 0) and b) the actual error message is not reported. I have checked the master with mysqlbinlog to see what operations were occurring at this time, and there are none: the master is idle. Is this truly a bug? It could be; however, we are going to be rolling the master to the newest version of MySQL this week. I don't think these problems will occur with the master/slave combination after that, because both will be MySQL 5.0.18 once the operation is complete. Is the issue that the MySQL master writes too old a binary log format, which causes trouble on the MySQL 5 slave? In any case, I thought this was a weird error because it has no error number and no error explanation.
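The fields of the log line can be pulled apart mechanically. The following is a minimal Python sketch, assuming the exact line layout quoted above; the regular expression and the name parse_reconnect_error are illustrative and not part of MySQL. If retry-time is in seconds (an assumption here), 60 x 86400 retries means the slave would keep retrying for 60 days before giving up.

```python
import re

# Pattern for the "error reconnecting to master" line quoted in this report.
# The layout is inferred from that one line; other MySQL versions may differ.
LINE_RE = re.compile(
    r"^(?P<date>\d{6})\s+(?P<time>\d{1,2}:\d{2}:\d{2}) \[ERROR\] Slave I/O thread: "
    r"error reconnecting to master '(?P<user>[^@']*)@(?P<host>[^:']*):(?P<port>\d+)': "
    r"Error: '(?P<message>.*)'\s+errno: (?P<errno>\d+)\s+"
    r"retry-time: (?P<retry_time>\d+)\s+retries: (?P<retries>\d+)\s*$"
)


def parse_reconnect_error(line):
    """Return the fields of a reconnect-error line as a dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("port", "errno", "retry_time", "retries"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "060304  4:12:12 [ERROR] Slave I/O thread: error reconnecting to master "
    "'replicatem@priapus:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400"
)
info = parse_reconnect_error(sample)

# Both the message and the error number are empty, as described in the report.
print(repr(info["message"]), info["errno"])

# Total retry window in days, assuming retry-time is expressed in seconds.
print(info["retry_time"] * info["retries"] / 86400)
```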

How to repeat:
Set up a MySQL 4.0.23 master with a mysql-max 5.0.18 slave and run operations on it for a week (you might not even have to do that; just keep the connection between master and slave active for one week).

I say one week because we have had this in operation for about 2 months and it is consistently dying every Saturday morning. Here are some further errors:

060211  1:46:49 [ERROR] Slave I/O thread: error reconnecting to master 'XXXX@XXXX:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400
060218  4:59:23 [ERROR] Slave I/O thread: error reconnecting to master 'XXXX@XXXX:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400
060225  4:09:43 [ERROR] Slave I/O thread: error reconnecting to master 'XXXX@XXXX:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400
060226  6:18:42 [ERROR] Slave I/O thread: error reconnecting to master 'XXXX@XXXX:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400
060304  4:12:12 [ERROR] Slave I/O thread: error reconnecting to master 'XXXX@XXXX:3306': Error: ''  errno: 0  retry-time: 60  retries: 86400

I have checked the binary logs on the master for each of the above errors, and nothing is happening at any of the times reported. The master is sitting in an idle state, or at least it is not inserting, updating, or deleting any records.
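The weekly pattern can be checked directly from the timestamps in the error lines. This is a small Python sketch, assuming the leading field is YYMMDD followed by a 24-hour time; the helper names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the five error-log lines quoted above.
STAMPS = [
    "060211  1:46:49",
    "060218  4:59:23",
    "060225  4:09:43",
    "060226  6:18:42",
    "060304  4:12:12",
]

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def parse_stamp(stamp):
    """Parse a 'YYMMDD H:MM:SS' error-log timestamp into a datetime."""
    date_part, time_part = stamp.split()
    return datetime.strptime(date_part + " " + time_part, "%y%m%d %H:%M:%S")


def weekday_report(stamps):
    """Return (datetime, weekday name) pairs for each timestamp."""
    parsed = [parse_stamp(s) for s in stamps]
    return [(when, DAY_NAMES[when.weekday()]) for when in parsed]


report = weekday_report(STAMPS)
previous = None
for when, day in report:
    # Show the gap since the previous failure to make the weekly rhythm visible.
    gap = "" if previous is None else "  (+%s since previous)" % (when - previous)
    print(when.isoformat(sep=" "), day + gap)
    previous = when
```

For the five lines above, four failures fall on a Saturday and one (26 Feb 2006) on a Sunday, all in the early morning.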

Suggested fix:
For right now, this is how I fix it. I come back into the office and log into the slave MySQL instance (5.0.18). I issue a stop slave; then, for my own benefit, I do a show slave status; and everything looks normal. Then I issue a start slave; AT THIS POINT the slave picks up where it left off, replication begins again, and it continues to replicate until it is in sync with the master. After that, it stays 'listening' to the master until the next Saturday, when it dies again.
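The manual workaround could be scripted. Below is a rough Python sketch, not a tested fix: it assumes a Python DB-API 2.0 cursor connected to the slave with sufficient privileges, and the function name restart_slave is hypothetical. The RecordingCursor class is a stand-in used only so the example runs without a server; its status row is invented for illustration.

```python
def restart_slave(cursor):
    """Run the manual sequence: STOP SLAVE, SHOW SLAVE STATUS, START SLAVE.

    `cursor` is assumed to be a DB-API 2.0 cursor on the slave.
    Returns the status row as a dict (column name -> value), or None if
    SHOW SLAVE STATUS returned no row.
    """
    cursor.execute("STOP SLAVE")
    cursor.execute("SHOW SLAVE STATUS")
    row = cursor.fetchone()
    status = None
    if row is not None:
        # DB-API: the first item of each description entry is the column name.
        names = [col[0] for col in cursor.description]
        status = dict(zip(names, row))
    cursor.execute("START SLAVE")
    return status


class RecordingCursor:
    """Fake cursor for demonstration only; records statements, returns a made-up row."""

    def __init__(self):
        self.statements = []
        self.description = None
        self._row = None

    def execute(self, sql):
        self.statements.append(sql)
        if sql == "SHOW SLAVE STATUS":
            # Illustrative subset of columns; a real server returns many more.
            self.description = [("Slave_IO_Running",), ("Slave_SQL_Running",)]
            self._row = ("No", "No")
        else:
            self.description = None
            self._row = None

    def fetchone(self):
        return self._row


demo = RecordingCursor()
print(restart_slave(demo))
print(demo.statements)
```

With a real connection, the same function would be called with a cursor from whatever MySQL driver is in use; running it from cron would hide the symptom but would not explain the empty error.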
[9 Mar 2006 13:51] Valeriy Kravchuk
Thank you for the problem report. We need some more information on how to repeat this, or on when it really occurs. Please send the my.cnf files for the master and slave. Could it be just a timeout of some kind?
[9 Mar 2006 15:56] Ryan Turnbull
I have added the master's and slaves' configuration files. I hope this helps in diagnosing the problem. Another possibility: could the master become almost saturated with slave requests? In our current setup, we have 1 master with 4 slaves (2 MySQL 4 slaves, 2 MySQL 5 slaves). The MySQL 4 slaves never have the problem of not being able to read the master's operations. The master is id 1, MySQL 4 slave number 1 is id 2, MySQL 4 slave number 2 is id 3, MySQL 5 slave number 1 is id 4, and MySQL 5 slave number 2 is id 5. Here is a listing of the hardware as well:

The master is a dual Intel(R) Xeon(TM) CPU 3.06GHz with 6 GB of memory and 10,000 RPM SCSI disks. The mysql directory is on a RAID 5 set.

The MySQL slaves are on IBM x365 boxes with dual Intel(R) Xeon(TM) MP CPU 2.20GHz, 6 GB of memory, and the mysql directory on 15,000 RPM SCSI disks on a storage array (DS4000).

I hope this helps.

Thanks
[22 Apr 2006 15:21] Valeriy Kravchuk
Have you seen similar errors again since March 9th?

Please send the my.cnf from your MySQL 4 slaves (those that work without problems, according to your report). You can also try installing a newer version, 5.0.20a, on your MySQL 5 slaves. Many bugs, including some affecting replication, have been fixed since 5.0.18.
[24 Apr 2006 14:37] Ryan Turnbull
I haven't seen any errors on the MySQL slaves since. The reason is that we have converted one of the slaves mentioned in the report to a master server, with the remaining slave server becoming its slave. Because both master and slave are the same MySQL version, I have had no problems. The other boxes that ran MySQL 4 are now retired and doing other work for the company. Because of this, from my standpoint, this problem is now closed.

That being said, it would be nice if MySQL had a replication compatibility page covering the different versions: basically, which master versions are supposed to work with which slave versions. In addition, if any issues or errors are reported for a particular combination, that combination should be removed from the list until they are resolved.

But like I said, we are no longer using that master/slave combination, so it is no longer a problem for us.

Thank you very much for your help with this issue.