Bug #37768 rpl_flushlog_loop/row fails randomly
Submitted: 1 Jul 2008 15:01 Modified: 14 Jul 2009 10:53
Reporter: Alexander Nozdrin Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.1.26,6.0-BK OS:Any
Assigned to: Luis Soares CPU Architecture:Any
Tags: pushbuild, sporadic, test failure

[1 Jul 2008 15:01] Alexander Nozdrin
Description:
Symptom 1:
-----------------------------------------------------
mysqltest: At line 50: query 'INSERT INTO t1 VALUE(1)' failed: 1146: Table 'test.t1' doesn't exist

The result from queries just before the failure was:
< snip >
start slave;
show variables like 'relay_log%';
Variable_name	Value
relay_log	MYSQLTEST_VARDIR/master-data/relay-log
relay_log_index	
relay_log_info_file	relay-log.info
relay_log_purge	ON
relay_log_space_limit	0
stop slave;
change master to master_host='127.0.0.1',master_user='root',
master_password='',master_port=MASTER_PORT;
start slave;
stop slave;
change master to master_host='127.0.0.1',master_user='root',
master_password='',master_port=SLAVE_PORT;
start slave;

let $result_pattern= '%127.0.0.1%root%slave-bin.000001%slave-bin.000001%Yes%Yes%0%0%None%' ;

--source include/wait_slave_status.inc
-----------------------------------------------------

Symptom 2:
-----------------------------------------------------
@@ -24,30 +24,31 @@
 
 --source include/wait_slave_status.inc
 FLUSH LOGS;
-SHOW SLAVE STATUS;
-Slave_IO_State	#
+ERROR: failed while waiting for slave parameter Exec_Master_Log_Pos: 746
+show slave status;
+Slave_IO_State	Waiting for master to send event
 Master_Host	127.0.0.1
 Master_User	root
-Master_Port	SLAVE_PORT
+Master_Port	11312
 Connect_Retry	60
 Master_Log_File	slave-bin.000001
-Read_Master_Log_Pos	POSITION
-Relay_Log_File	#
-Relay_Log_Pos	#
+Read_Master_Log_Pos	746
+Relay_Log_File	relay-log.000002
+Relay_Log_Pos	11529215046057707195
 Relay_Master_Log_File	slave-bin.000001
 Slave_IO_Running	Yes
-Slave_SQL_Running	Yes
+Slave_SQL_Running	No
 Replicate_Do_DB	
 Replicate_Ignore_DB	
 Replicate_Do_Table	
-Replicate_Ignore_Table	#
+Replicate_Ignore_Table	
 Replicate_Wild_Do_Table	
 Replicate_Wild_Ignore_Table	
-Last_Errno	0
-Last_Error	
+Last_Errno	1594
+Last_Error	Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
 Skip_Counter	0
-Exec_Master_Log_Pos	POSITION
-Relay_Log_Space	#
+Exec_Master_Log_Pos	465
+Relay_Log_Space	978
 Until_Condition	None
 Until_Log_File	
 Until_Log_Pos	0
@@ -57,9 +58,9 @@
 Master_SSL_Cert	
 Master_SSL_Cipher	
 Master_SSL_Key	
-Seconds_Behind_Master	#
-Master_SSL_Verify_Server_Cert	#
-Last_IO_Errno	#
+Seconds_Behind_Master	NULL
+Master_SSL_Verify_Server_Cert	No
+Last_IO_Errno	0
 Last_IO_Error	
-Last_SQL_Errno	0
-Last_SQL_Error	
+Last_SQL_Errno	1594
+Last_SQL_Error	Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
-----------------------------------------------------

Symptom 3:
-----------------------------------------------------
mysqltest: At line 44: query 'CREATE TABLE t1 (a INT KEY) ENGINE= MyISAM' failed: 1050: Table 't1' already exists

The result from queries just before the failure was:
< snip >
start slave;
show variables like 'relay_log%';
Variable_name	Value
relay_log	MYSQLTEST_VARDIR/master-data/relay-log
relay_log_index	
relay_log_info_file	relay-log.info
relay_log_purge	ON
relay_log_space_limit	0
stop slave;
change master to master_host='127.0.0.1',master_user='root',
master_password='',master_port=MASTER_PORT;
start slave;
stop slave;
change master to master_host='127.0.0.1',master_user='root',
master_password='',master_port=SLAVE_PORT;
start slave;

let $result_pattern= '%127.0.0.1%root%slave-bin.000001%slave-bin.000001%Yes%Yes%0%0%None%' ;

--source include/wait_slave_status.inc
-----------------------------------------------------

How to repeat:
https://intranet.mysql.com/secure/pushbuild/xref.pl?startdate=&enddate=&dir=&plat=&testtyp...
[8 Jul 2008 20:01] Joerg Bruehe
I had all these happen during the build of 5.1.26-rc.

Symptom 2 was specific to Windows-64.

Symptom 1 happened in many Linux RPM builds, and in some other Linuc builds.

Symptom 3 happened only once, also Linux RPM.

To me, symptoms 1 and 3 look similar to the (earlier) bug report #37733.
[9 Jul 2008 13:52] Mats Kindahl
A resolution for symptom 1 was fixed in BUG#37733. It was caused because a circular set up was used, a reset master was not issued on the *slave*, and there were a delay due to a loaded server causing the "DROP TABLE [t1]" to be issued with a delay, deleting the newly created table.
[15 Jul 2008 9:35] Sven Sandberg
Symptom 2 is only to fail on two places so far:

https://intranet.mysql.com/secure/pushbuild/showdir.pl?dir=bzr_mysql-6.0-maria, sles10-ia64-a-1/ps_row

https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=mysql-5.1-maint2&order=52, sapsrv1/n_mix

Symptom 3 is more common:

xref: http://tinyurl.com/5mopgm

for example here:
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-opt&order=14, pb-valgrind/rpl_binlog_row

https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-opt&order=18, win2003-x86/rpl_binlog_row
[15 Jul 2008 9:44] Sven Sandberg
Sorry, the link to the push in the maria tree where failure 2 occurs is https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-maria&order=12
[9 Feb 2009 15:28] Joerg Bruehe
I had symptom1 during the build of cluster-6.4.2, which is based on the server version 5.1.31.

To me, it is not obvious whether the fix for bug#37733 (claimed to fix this one, see comment of July 9) should be present in my build or not.