Description:
I configured master-master replication on two machines. It worked fine for some time. When the network connection between the two boxes dropped and was then restored, the slave side took much longer than expected to sync data after reconnecting. The master-connect-retry parameter is set to its default value.
Although the problem does not reproduce every time the network is disturbed, it is sometimes difficult to get the sync process restarted.
Please help in resolving this issue.
Thanks in advance.
The MySQL config files and the errors I got are below:
Log file:
----------
070503 15:45:24 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.0.16-standard-log' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Edition - Standard (GPL)
070503 15:45:24 [Note] Slave SQL thread initialized, starting replication in log 'interceptor-bin.000001' at position 376111, relay log './interceptor-relay-bin.000002' position: 241
------------> This is where the sync process stops
070503 15:45:27 [ERROR] Slave I/O thread: error connecting to master 'sqluser2@192.168.150.230:3306': Error: 'Lost connection to MySQL server during query' errno: 2013 retry-time: 60 retries: 86400
070503 15:46:27 [Note] Slave I/O thread: connected to master 'sqluser2@192.168.150.230:3306', replication started in log 'interceptor-bin.000001' at position 376111
-----------------------------------------------
my.cnf file:
Primary Master config file:
----------------------------
[mysqld]
server-id=100
log-bin=/var/lib/mysql/bin.log
log-bin-index=/var/lib/mysql/log-bin.index
relay-log=/var/lib/mysql/relay.log
relay-log-index=/var/lib/mysql/relay-log.index
log-slave-updates
replicate-same-server-id=0
auto_increment_increment=1
auto_increment_offset=1
skip-slave-start
log-slow-queries
log-slow-admin-statements
log-error=/var/log/interceptor-mysql.log
master_host=192.168.150.24
master_user=sqluser1
master_password=password
report_host=192.168.150.23
Secondary Master config file:
---------------------------
[mysqld]
server-id=20
log-bin=/var/lib/mysql/bin.log
log-bin-index=/var/lib/mysql/log-bin.index
relay-log=/var/lib/mysql/relay.log
relay-log-index=/var/lib/mysql/relay-log.index
log-slave-updates
replicate-same-server-id=0
auto_increment_increment=1
auto_increment_offset=1
skip-slave-start
log-slow-queries
log-slow-admin-statements
log-error=/var/log/interceptor-mysql.log
master_host=192.168.150.24
master_user=sqluser2
master_password=password
report_host=192.168.150.23
master_host=192.168.150.23
master_user=sqluser2
master_password=password
report_host=192.168.150.24
How to repeat:
Establish master-master replication between two boxes with the configuration above. Call the two boxes primary and secondary. Populate the primary master with a large amount of data (for example, add more than 5000 records). The same data appears on the secondary master.
Now disconnect the network connection. On the secondary master, delete the previously added records and add the same number of records or more. Reconnect the network connection between the two boxes.
Now check the primary master. After waiting for the "master-connect-retry" time or longer, the data has still not replicated from the secondary master to the primary master.
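The steps above can be sketched in SQL roughly as follows. The table names (repl_test, digits) and the payload column are hypothetical and only serve the illustration; any table with enough rows should do. The statements are assumed to run in a scratch database that both masters replicate.

```sql
-- Hypothetical test schema; names are illustrative, not from the real setup.
CREATE TABLE repl_test (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  payload VARCHAR(32) NOT NULL
);

-- Helper table used to generate rows without a stored procedure.
CREATE TABLE digits (d INT NOT NULL);
INSERT INTO digits VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Step 1, on the primary master while the network is up:
-- a four-way cross join of 10 digits yields 10,000 rows.
INSERT INTO repl_test (payload)
SELECT CONCAT('old-', a.d, b.d, c.d, e.d)
FROM digits a, digits b, digits c, digits e;

-- Step 2, on the secondary master while the network is down:
-- remove the earlier rows and add the same number again.
DELETE FROM repl_test;
INSERT INTO repl_test (payload)
SELECT CONCAT('new-', a.d, b.d, c.d, e.d)
FROM digits a, digits b, digits c, digits e;

-- Step 3, on both masters after the network is reconnected:
-- the results should match once replication has caught up.
SELECT COUNT(*), MIN(payload), MAX(payload) FROM repl_test;
```

If the primary master still reports the 'old-' rows in step 3 well after the retry interval, the changes from the secondary master have not been applied.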
Note: "show slave status" shows both Slave_IO_Running and Slave_SQL_Running as Yes.
Although this does not happen every time we follow the steps above, the problem occurs frequently.
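For reference, the status check mentioned in the note, and the manual restart of the sync process, look roughly like this. It is a sketch of the checks rather than a fix; the \G terminator assumes the statements are typed into the mysql command-line client.

```sql
-- On the secondary master (the side that received the writes):
-- note the current binary log File and Position.
SHOW MASTER STATUS;

-- On the primary master: compare Master_Log_File and Read_Master_Log_Pos
-- with the File and Position reported above. If they lag behind while both
-- threads show Yes, the I/O thread has not fetched the new events yet.
SHOW SLAVE STATUS\G

-- Manual restart of replication on the primary master.
STOP SLAVE;
START SLAVE;

-- Check again that the positions are now advancing.
SHOW SLAVE STATUS\G
```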