MySQL Bugs: #32003: rpl.rpl_server_id2 can cause extraneous errors recorded in error log

Bug #32003	rpl.rpl_server_id2 can cause extraneous errors recorded in error log
Submitted:	31 Oct 2007 19:51	Modified:	1 Feb 2008 15:04
Reporter:	Andrei Elkin	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Replication	Severity:	S2 (Serious)
Version:	CGE-6.3	OS:	Any
Assigned to:	Andrei Elkin	CPU Architecture:	Any

Description:
There is a warning on pushbuild:
https://intranet.mysql.com/secure/pushbuild/getlog.pl?dir=mysql-5.1-telco-6.3&entry=tulin@...

and slave.err for the test that describes a possible scenario.

SURRENT_TEST: rpl.rpl_server_id2
071031  2:26:50 [Warning] The syntax 'for replication startup options' is deprecated and will be removed in MySQL 5.2. Please use 'CHANGE MASTER' instead.
071031  2:26:50 [Note] Plugin 'InnoDB' disabled by command line option
071031  2:26:50 [Note] Event Scheduler: Loaded 0 events
071031  2:26:50 [Note] /data0/pushbuild/pb/mysql-5.1-telco-6.3/59/mysql-5.1.22-ndb-6.3.5-pb59/sql/mysqld: ready for connections.
Version: '5.1.22-ndb-6.3.5-pb59-log'  socket: '/dev/shm/pbtmp-ps_row-112/slave.sock'  port: 11122  MySQL Pushbuild Edition, build 59
071031  2:26:50 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:11120',replication started in log 'FIRST' at position 4
071031  2:26:50 [ERROR] The slave I/O thread stops because querying master with 'SET @master_heartbeat_period= 5000000000' failed; error: 1159  ''
071031  2:26:50 [Note] Slave I/O thread exiting, read up to log 'FIRST', position 4
071031  2:26:50 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log '/dev/shm/var-ps_row-112/log/slave-relay-bin.000001' position: 4
071031  2:26:50 [Note] Error reading relay log event: slave SQL thread was killed
071031  2:26:50 [Note] Slave I/O thread: connected to master 'root@127.0.0.1:11122',replication started in log 'FIRST' at position 4
071031  2:26:50 [Note] Slave SQL thread initialized, starting replication in log 'FIRST' at position 0, relay log '/dev/shm/var-ps_row-112/log/slave-relay-bin.000001' position: 4
071031  2:26:51 [Note] Slave I/O thread killed while reading event
071031  2:26:51 [Note] Slave I/O thread exiting, read up to log 'slave-bin.000001', position 182
071031  2:26:51 [Note] Error reading relay log event: slave SQL thread was killed
071031  2:26:51 [Note] /data0/pushbuild/pb/mysql-5.1-telco-6.3/59/mysql-5.1.22-ndb-6.3.5-pb59/sql/mysqld: Normal shutdown

How to repeat:
run rpl.rpl_server_id2 on a slow slave side env.
The idea is not to let slave io to finish its post-connecting operations such as 
sending initializing queries.

Suggested fix:
Synchronize slave with master before `reset master'.

More tests that have reset master which can lead to the same warning:
rpl.rpl_extraColmaster_innodb: 071031 13:22:55 [ERROR] The slave I/O thread stops because querying master with 'SET @master_heartbeat_period= 5000000000' failed; error: 1159
rpl_ndb.rpl_ndb_log

slave.err: rpl.rpl_ssl: 071101  5:48:34 [ERROR] The slave I/O thread stops because querying master with 'SET @master_heartbeat_period= 5000000000' failed; error: 1159
not having RESET MASTER still confines with the theory.
There is a tight loop for stop/start slave which effectively implements resetting of the master.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40651

ChangeSet@1.2827, 2008-01-07 19:11:12+02:00, aelkin@koti.dsl.inet.fi +1 -0
  bug#32003 rpl.rpl_server_id2 can cause extraneous errors recorded in error log
  
  the error printed in the slave error log is due to master's dump thread was
  killed in the middle of slave connecting as it should be because of 
  the tests' logics.
  However, printing the error was not necessary as it's expectable not to be
  able to connect to a killed master.
  
  Fixed with introducing a condition for printing the error that previously
  was introduced by bug#29976.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40652

ChangeSet@1.2827, 2008-01-07 19:25:38+02:00, aelkin@koti.dsl.inet.fi +1 -0
  bug#32003 rpl.rpl_server_id2 can cause extraneous errors recorded in error log
  
  the error printed in the slave error log appeared due to slave's threads
  were killed in the middle of handling START SLAVE.
  The sql command is asynchronous so that START may not be complete by
  the time a test issues STOP.
  Clearly, printing the error is not necessary if the threads are killed.
  
  Fixed with deploying a condition for printing the error that previously
  was introduced by bug#29976.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/40798

ChangeSet@1.2827, 2008-01-09 22:46:58+02:00, aelkin@koti.dsl.inet.fi +1 -0
  bug#32003 rpl.rpl_server_id2 can cause extraneous errors recorded in error log
  
  the error printed in the slave error log appeared due to slave's threads
  were killed in the middle of handling START SLAVE.
  The sql command is asynchronous so that START may not be complete by
  the time a test issues STOP.
  Clearly, printing the error is not necessary if the threads are killed.
  
  Fixed with deploying a condition for printing the error that previously
  was introduced by bug#29976.

pushed to mysql-5.1-telco-6.3

Test framework only; nothing to document for end users; closed w/o further action. 

See Bug #29976.