Bug #41055 rpl_sporadic_master fails in sync_slave_with_master sporadically in pushbuild
Submitted: 26 Nov 2008 14:11 Modified: 9 Dec 2010 20:34
Reporter: Sven Sandberg Email Updates:
Status: Can't repeat Impact on me:
None 
Category:Tests: Replication Severity:S7 (Test Cases)
Version:5.1-rpl OS:Any
Assigned to: Alfranio Tavares Correia Junior CPU Architecture:Any
Tags: pushbuild, rpl_sporadic_master, sporadic, test failure

[26 Nov 2008 14:11] Sven Sandberg
Description:
rpl_sporadic_master failed once in pushbuild because the IO thread had stopped:

rpl.rpl_sporadic_master                  [ fail ]

CURRENT_TEST: rpl.rpl_sporadic_master
safe_process[16070]: parent_pid: 7916
safe_process[16070]: Started child 16071, terminated: 0
=== SHOW MASTER STATUS ===
---- 1. ----
File	slave-bin.000001
Position	826
Binlog_Do_DB	
Binlog_Ignore_DB	
==========================

=== SHOW SLAVE STATUS ===
---- 1. ----
Slave_IO_State	
Master_Host	127.0.0.1
Master_User	root
Master_Port	10050
Connect_Retry	1
Master_Log_File	master-bin.000001
Read_Master_Log_Pos	826
Relay_Log_File	slave-relay-bin.000003
Relay_Log_Pos	972
Relay_Master_Log_File	master-bin.000001
Slave_IO_Running	No
Slave_SQL_Running	Yes
Replicate_Do_DB	
Replicate_Ignore_DB	
Replicate_Do_Table	
Replicate_Ignore_Table	
Replicate_Wild_Do_Table	
Replicate_Wild_Ignore_Table	
Last_Errno	0
Last_Error	
Skip_Counter	0
Exec_Master_Log_Pos	826
Relay_Log_Space	1127
Until_Condition	None
Until_Log_File	
Until_Log_Pos	0
Master_SSL_Allowed	No
Master_SSL_CA_File	
Master_SSL_CA_Path	
Master_SSL_Cert	
Master_SSL_Cipher	
Master_SSL_Key	
Seconds_Behind_Master	
Master_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error	
Last_SQL_Errno	0
Last_SQL_Error	
=========================

=== SHOW PROCESSLIST ===
---- 1. ----
Id	2
User	root
Host	localhost:65418
db	test
Command	Query
Time	0
State	
Info	SHOW PROCESSLIST
---- 2. ----
Id	3
User	root
Host	localhost:65419
db	test
Command	Sleep
Time	303
State	
Info	
---- 3. ----
Id	7
User	system user
Host	
db	
Command	Connect
Time	300
State	Has read all relay log; waiting for the slave I/O thread to update it
Info	
========================

analyze: sync_with_master
mysqltest: At line 23: sync_slave_with_master failed: 'select master_pos_wait('master-bin.000002', 423, 300)' returned -1 indicating timeout after 300 seconds

The result from queries just before the failure was:
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
create table t2(n int);
create table t1(n int not null auto_increment primary key);
insert into t1 values (NULL),(NULL);
truncate table t1;
insert into t1 values (4),(NULL);
stop slave;
start slave;
insert into t1 values (NULL),(NULL);
flush logs;
truncate table t1;
insert into t1 values (10),(NULL),(NULL),(NULL),(NULL),(NULL);

Warnings from just before the error:
Note 1051 Unknown table 't1' 
Note 1051 Unknown table 't2' 
Note 1051 Unknown table 't3' 
Note 1051 Unknown table 't4' 
Note 1051 Unknown table 't5' 
Note 1051 Unknown table 't6' 
Note 1051 Unknown table 't7' 
Note 1051 Unknown table 't8' 
safe_process[16070]: Got signal 20, child_pid: 16071
safe_process[16070]: Killing child: 16071
safe_process[16070]: Child exit: 1

 == /Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/analyze-sync_with_master-mysqld.1.err ==
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
7	root	localhost	NULL	Query	0	NULL	SHOW PROCESSLIST
SHOW BINLOG EVENTS IN 'master-bin.000002';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000002	4	Format_desc	1	106	Server ver: 5.1.31-pb87-log, Binlog ver: 4
master-bin.000002	106	Query	1	186	use `test`; truncate table t1
master-bin.000002	186	Query	1	254	use `test`; BEGIN
master-bin.000002	254	Table_map	1	295	table_id: 25 (test.t1)
master-bin.000002	295	Write_rows	1	354	table_id: 25 flags: STMT_END_F
master-bin.000002	354	Query	1	423	use `test`; COMMIT

 == /Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/analyze-sync_with_master-mysqld.2.err ==
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
7	system user		NULL	Connect	301	Has read all relay log; waiting for the slave I/O thread to update it	NULL
8	root	localhost	NULL	Query	0	NULL	SHOW PROCESSLIST
SHOW BINLOG EVENTS IN 'slave-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
slave-bin.000001	4	Format_desc	2	106	Server ver: 5.1.31-pb87-log, Binlog ver: 4
slave-bin.000001	106	Query	1	191	use `test`; create table t2(n int)
slave-bin.000001	191	Query	1	312	use `test`; create table t1(n int not null auto_increment primary key)
slave-bin.000001	312	Query	1	380	use `test`; BEGIN
slave-bin.000001	380	Table_map	1	421	table_id: 23 (test.t1)
slave-bin.000001	421	Write_rows	1	460	table_id: 23 flags: STMT_END_F
slave-bin.000001	460	Query	1	529	use `test`; COMMIT
slave-bin.000001	529	Query	1	609	use `test`; truncate table t1
slave-bin.000001	609	Query	1	677	use `test`; BEGIN
slave-bin.000001	677	Table_map	1	718	table_id: 24 (test.t1)
slave-bin.000001	718	Write_rows	1	757	table_id: 24 flags: STMT_END_F
slave-bin.000001	757	Query	1	826	use `test`; COMMIT

 - saving '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/log/rpl.rpl_sporadic_master/' to '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/log/rpl.rpl_sporadic_master/'

Retrying test, attempt(2/3)...

worker[1] > Running test: rpl.rpl_sporadic_master
worker[1] > Setting timezone: GMT-3
worker[1] Restarting all servers
worker[1] > Cleaning datadirs...
worker[1] > clean_dir: /Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp
worker[1] > unlink: '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/check-mysqld_1.log'
worker[1] > unlink: '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/check-mysqld_1.result'
worker[1] > unlink: '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/check-mysqld_2.log'
worker[1] > unlink: '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/var-ps_row-5/tmp/check-mysqld_2.result'
worker[1] > clean_dir: /tmp/pbtmp-ps_row-5
worker[1] > unlink: '/tmp/pbtmp-ps_row-5/mysqld.1.sock'
worker[1] > unlink: '/tmp/pbtmp-ps_row-5/mysqld.2.sock'
worker[1] > rmdir: '/tmp/pbtmp-ps_row-5/mysqld.1'
worker[1] > rmdir: '/tmp/pbtmp-ps_row-5/mysqld.2'
worker[1] > Removing MASTER_MYSOCK
worker[1] > Removing SLAVE_MYPORT
worker[1] > Removing SLAVE_MYSOCK
worker[1] > Removing MASTER_MYPORT
worker[1] > Generating my.cnf from '/Users/henry/pb/bzr_mysql-5.1-rpl/87/mysql-5.1.31-pb87/mysql-test/suite/rpl/my.cnf'
worker[1] > MASTER_MYPORT = 10050
worker[1] > MASTER_MYSOCK = /tmp/pbtmp-ps_row-5/mysqld.1.sock
worker[1] > SLAVE_MYPORT = 10051
worker[1] > SLAVE_MYSOCK = /tmp/pbtmp-ps_row-5/mysqld.2.sock
worker[1] > mysqld_start:  [' --sporadic-binlog-dump-fail', '--max-binlog-dump-events=2', '--loose-skip-innodb', '--binlog-format=row ']

worker[1] > Started [mysqld.1 - pid: 16118, winpid: 16118]
worker[1] > mysqld_start:  [' --loose-skip-innodb', '--binlog-format=row ']

worker[1] > Started [mysqld.2 - pid: 16120, winpid: 16120]
worker[1] Started [check-mysqld_1 - pid: 16123, winpid: 16123]
worker[1] Started [check-mysqld_2 - pid: 16124, winpid: 16124]
worker[1] Got [check-mysqld_1 - pid: 16123, winpid: 16123, exit: 0]
worker[1] Got [check-mysqld_2 - pid: 16124, winpid: 16124, exit: 0]
worker[1] > Started [mysqltest - pid: 16153, winpid: 16153]

(there is some extra debug info from mtr because mtr was running with --verbose in this particular push)

How to repeat:
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-5.1-rpl&order=87 powermacg5/ps_row
xref: http://tinyurl.com/5zayh8
[4 Feb 2009 13:43] Andrei Elkin
Bug is regressing on the main 5.1:
 https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-5.1&order=53