Bug #19328 Slave timeout with COM_REGISTER_SLAVE error causing stop
Submitted: 25 Apr 2006 11:00 Modified: 22 Jul 2007 20:02
Reporter: Mats Kindahl
Status: Closed
Category: Server: Replication  Severity: S1 (Critical)
Version: 5.1 source, 5.0  OS: Any
Assigned to: Ramil Kalimullin  Target Version:

[25 Apr 2006 11:00] Mats Kindahl
Description:
Under high load, a slave registering with the master can time out during
COM_REGISTER_SLAVE execution. The timeout raises an error, which prevents the slave
from connecting at all.

How to repeat:
Run a replication test under high load. For example, running some of the cluster tests on
a laptop will cause this error.

Suggested fix:
Do not abort immediately on a timeout, but retry connecting.
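
To illustrate the suggested behaviour, here is a minimal sketch in Python (not the server code). The names register_with_retry, RegistrationTimeout, connect and register are hypothetical stand-ins for the slave I/O thread's connect and COM_REGISTER_SLAVE steps, and the retry count and delay are assumptions, not values from this report.

```python
import time


class RegistrationTimeout(Exception):
    """Hypothetical stand-in for a timeout during COM_REGISTER_SLAVE."""


def register_with_retry(connect, register, max_retries=5, retry_delay=1.0,
                        sleep=time.sleep):
    """Connect and register; on a timeout, reconnect and try again.

    `connect` returns a connection object and `register` performs the
    registration on it. Both are supplied by the caller (assumed names).
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_retries + 1):
        conn = connect()
        try:
            register(conn)
            return attempt
        except RegistrationTimeout:
            # Give up only after the last allowed attempt; otherwise
            # wait and reconnect instead of aborting immediately.
            if attempt == max_retries:
                raise
            sleep(retry_delay)
```

With this shape a single timeout under load only costs one reconnect cycle; the error is propagated only once the retry budget is exhausted.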
[25 Apr 2006 14:37] Valeriy Kravchuk
Please specify the exact tests to run. Which version(s) of MySQL server should I check:
5.0.x, 5.1.x?
[25 Apr 2006 15:00] Mats Kindahl
The fault occurs most frequently when executing cluster replication tests, e.g.,
rpl_ndb_dd_partitions, since they tend to load the server heavily. I haven't seen it
in any other tests, but in principle it can happen in any test, since nothing about the
failure is specific to cluster.
[25 Apr 2006 23:07] Elliot Murphy
The cluster team is also hitting this bug, and it shows up sometimes in pushbuild. This
needs to be fixed because it is interrupting our ability to fix other bugs.
[4 May 2006 23:21] Jonathan Miller
http://bugs.mysql.com/?id=19471
[3 Jul 2006 17:16] Augusto Bott
Just hit this very same problem today.
Dropped a database with 50+ tables (all were InnoDB) on a dual-master setup.
The relay thread died with this in the error log:
[ERROR] Error on COM_REGISTER_SLAVE: 1159 ''
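
For anyone checking whether they hit the same failure, a small sketch for scanning an error log for this message follows. The function name find_register_errors is hypothetical, and the pattern is an assumption based only on the log line quoted above. To my understanding, 1159 corresponds to ER_NET_READ_INTERRUPTED ("Got timeout reading communication packets") in the server error list; treat that mapping as an assumption as well.

```python
import re

# Pattern modelled on the single log line quoted in this report.
REGISTER_ERROR = re.compile(
    r"\[ERROR\] Error on COM_REGISTER_SLAVE: (\d+) '(.*)'"
)

# Assumed mapping from error code to symbolic name (not taken from this report).
ERROR_NAMES = {1159: "ER_NET_READ_INTERRUPTED"}


def find_register_errors(log_text):
    """Return (code, message) pairs for every COM_REGISTER_SLAVE error found."""
    return [(int(m.group(1)), m.group(2))
            for m in REGISTER_ERROR.finditer(log_text)]


sample = "[ERROR] Error on COM_REGISTER_SLAVE: 1159 ''"
for code, message in find_register_errors(sample):
    print(code, ERROR_NAMES.get(code, "unknown"), repr(message))
```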
[6 Jun 2007 12:42] Andrei Elkin
Happened with 5.1-rpl tree after
https://intranet.mysql.com/secure/pushbuild/getlog.pl?dir=mysql-5.1-new-rpl&entry=skozlov@mysql.com-20070605220302&name=elog_ps_row-warnings&plat=sapsrv1

CURRENT_TEST: rpl_server_id2
070606  0:54:55 [Warning] The syntax for replication startup options is deprecated and will be removed in MySQL 5.2. Please use 'CHANGE MASTER' instead.
070606  0:54:55 [Note] Plugin 'InnoDB' disabled by command line option
070606  0:54:55 [Note] Event Scheduler: Loaded 0 events
070606  0:54:55 [Note] /data0/pushbuild/pb/mysql-5.1-new-rpl/277/mysql-5.1-new-rpl-exp/sql/mysqld: ready for connections.
Version: '5.1.20-beta-pb277-debug-log'  socket: '/dev/shm/pbtmp-ps_row-102/slave.sock'  port: 11022  Source distribution
070606  0:54:55 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log '/dev/shm/var-ps_row-102/log/slave-relay-bin.000001' position: 4
070606  0:54:55 [Note] next log '/dev/shm/var-ps_row-102/log/slave-relay-bin.000002' is currently active
070606  0:54:55 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:11020',replication started in log 'FIRST' at position 4
070606  0:54:55 [ERROR] Error on COM_REGISTER_SLAVE: 1159 ''
[11 Jun 2007 11:31] Lars Thalmann
See also BUG#22989
[26 Jun 2007 13:37] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29577

ChangeSet@1.2509, 2007-06-26 13:32:30+05:00, ramil@mysql.com +1 -0
  Fix for bug #19328: Slave timeout with COM_REGISTER_SLAVE error causing stop
  
  Problem: "Under high load, the slave registering to the master can timeout 
  during the COM_REGISTER_SLAVE execution. This causes an error, which 
  prevents the slave from connecting at all."
  
  Fix: Do not abort immediately, but retry registering on master.
[28 Jun 2007 9:39] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29806

ChangeSet@1.2524, 2007-06-28 12:38:43+05:00, ramil@mysql.com +1 -0
  Fix for bug #19328: Slave timeout with COM_REGISTER_SLAVE error causing stop
  
[28 Jun 2007 14:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29847

ChangeSet@1.2524, 2007-06-28 17:18:29+05:00, ramil@mysql.com +1 -0
  Fix for bug #19328: Slave timeout with COM_REGISTER_SLAVE error causing stop
  
[29 Jun 2007 18:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/29968

ChangeSet@1.2524, 2007-06-29 21:48:22+05:00, ramil@mysql.com +1 -0
  Fix for bug #19328: Slave timeout with COM_REGISTER_SLAVE error causing stop
  
[10 Jul 2007 15:26] Bugs System
Pushed into 5.1.21-beta
[22 Jul 2007 20:02] Paul DuBois
Noted in 5.1.21 changelog.

If a slave timed out while registering with the master to which it
was connecting, auto-reconnect failed thereafter.