Bug #39014 rpl.rpl_sporadic_master failed on pushbuild: "could not parse relay log entry"
Submitted: 25 Aug 2008 16:19 Modified: 20 Mar 2009 3:15
Reporter: Sven Sandberg Email Updates:
Status: Can't repeat Impact on me:
None 
Category:Tests: Replication Severity:S7 (Test Cases)
Version:6.0-bugteam OS:Any
Assigned to: Guangbao Ni CPU Architecture:Any
Tags: pushbuild, relay log corruption, rpl_sporadic_master, test failure

[25 Aug 2008 16:19] Sven Sandberg
Description:
Pushbuild failure:

rpl.rpl_sporadic_master 'row'  [ fail ]

==19954== Memcheck, a memory error detector.
==19954== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==19954== Using LibVEX rev 1732, a library for dynamic binary translation.
==19954== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==19954== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==19954== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==19954== For more details, rerun with: -v
==19954== 
=== SHOW MASTER STATUS ===
---- 1. ----
File	slave-bin.000001
Position	1154
Binlog_Do_DB	
Binlog_Ignore_DB	
==========================

=== SHOW SLAVE STATUS ===
---- 1. ----
Slave_IO_State	Waiting for master to send event
Master_Host	127.0.0.1
Master_User	root
Master_Port	11010
Connect_Retry	1
Master_Log_File	master-bin.000002
Read_Master_Log_Pos	424
Relay_Log_File	slave-relay-bin.000011
Relay_Log_Pos	18446744073444901144
Relay_Master_Log_File	master-bin.000001
Slave_IO_Running	Yes
Slave_SQL_Running	No
Replicate_Do_DB	
Replicate_Ignore_DB	
Replicate_Do_Table	
Replicate_Ignore_Table	
Replicate_Wild_Do_Table	
Replicate_Wild_Ignore_Table	
Last_Errno	1594
Last_Error	Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter	0
Exec_Master_Log_Pos	1044
Relay_Log_Space	2394
Until_Condition	None
Until_Log_File	
Until_Log_Pos	0
Master_SSL_Allowed	No
Master_SSL_CA_File	
Master_SSL_CA_Path	
Master_SSL_Cert	
Master_SSL_Cipher	
Master_SSL_Key	
Seconds_Behind_Master	
Master_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error	
Last_SQL_Errno	1594
Last_SQL_Error	Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
=========================

mysqltest: At line 23: could not sync with master ('select master_pos_wait('master-bin.000002', 424)' returned NULL)

The result from queries just before the failure was:
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
create table t2(n int);
create table t1(n int not null auto_increment primary key);
insert into t1 values (NULL),(NULL);
truncate table t1;
insert into t1 values (4),(NULL);
stop slave;
start slave;
insert into t1 values (NULL),(NULL);
flush logs;
truncate table t1;
insert into t1 values (10),(NULL),(NULL),(NULL),(NULL),(NULL);

More results from queries before failure can be found in /dev/shm/var-rpl_binlog_row-101/log/rpl_sporadic_master.log

Warnings from just before the error:
Note 1051 Unknown table 't1' 
Note 1051 Unknown table 't2' 
Note 1051 Unknown table 't3' 
Note 1051 Unknown table 't4' 
Note 1051 Unknown table 't5' 
Note 1051 Unknown table 't6' 
Note 1051 Unknown table 't7' 
Note 1051 Unknown table 't8' 
==19954== 
==19954== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 3 from 1)
==19954== malloc/free: in use at exit: 344 bytes in 2 blocks.
==19954== malloc/free: 2,900 allocs, 2,898 frees, 611,724 bytes allocated.
==19954== For counts of detected errors, rerun with: -v
==19954== searching for pointers to 2 not-freed blocks.
==19954== checked 9,834,120 bytes.
==19954== 
==19954== LEAK SUMMARY:
==19954==    definitely lost: 0 bytes in 0 blocks.
==19954==      possibly lost: 0 bytes in 0 blocks.
==19954==    still reachable: 0 bytes in 0 blocks.
==19954==         suppressed: 344 bytes in 2 blocks.

Stopping All Servers

How to repeat:
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-bugteam&order=88 pb-valgrind-*/rpl_binlog_row
[25 Aug 2008 16:20] Sven Sandberg
Looks similar to BUG#39010.
[20 Dec 2008 10:14] Sven Sandberg
Another failure found on https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-5.1-rpl&order=106 powermacg5/ps_row
xref: http://tinyurl.com/4ojpnq

Message:

rpl.rpl_sporadic_master                  [ fail ]
        Test ended at 2008-12-05 11:35:21

CURRENT_TEST: rpl.rpl_sporadic_master
=== SHOW MASTER STATUS ===
---- 1. ----
File	slave-bin.000001
Position	826
Binlog_Do_DB	
Binlog_Ignore_DB	
==========================

=== SHOW SLAVE STATUS ===
---- 1. ----
Slave_IO_State	
Master_Host	127.0.0.1
Master_User	root
Master_Port	10050
Connect_Retry	1
Master_Log_File	master-bin.000001
Read_Master_Log_Pos	826
Relay_Log_File	slave-relay-bin.000003
Relay_Log_Pos	972
Relay_Master_Log_File	master-bin.000001
Slave_IO_Running	No
Slave_SQL_Running	Yes
Replicate_Do_DB	
Replicate_Ignore_DB	
Replicate_Do_Table	
Replicate_Ignore_Table	
Replicate_Wild_Do_Table	
Replicate_Wild_Ignore_Table	
Last_Errno	0
Last_Error	
Skip_Counter	0
Exec_Master_Log_Pos	826
Relay_Log_Space	1127
Until_Condition	None
Until_Log_File	
Until_Log_Pos	0
Master_SSL_Allowed	No
Master_SSL_CA_File	
Master_SSL_CA_Path	
Master_SSL_Cert	
Master_SSL_Cipher	
Master_SSL_Key	
Seconds_Behind_Master	
Master_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error	
Last_SQL_Errno	0
Last_SQL_Error	
=========================

=== SHOW PROCESSLIST ===
---- 1. ----
Id	2
User	root
Host	localhost:52236
db	test
Command	Query
Time	0
State	
Info	SHOW PROCESSLIST
---- 2. ----
Id	3
User	root
Host	localhost:52237
db	test
Command	Sleep
Time	303
State	
Info	
---- 3. ----
Id	7
User	system user
Host	
db	
Command	Connect
Time	301
State	Has read all relay log; waiting for the slave I/O thread to update it
Info	
========================

analyze: sync_with_master
mysqltest: At line 23: sync_slave_with_master failed: 'select master_pos_wait('master-bin.000002', 423, 300)' returned -1 indicating timeout after 300 seconds

The result from queries just before the failure was:
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
create table t2(n int);
create table t1(n int not null auto_increment primary key);
insert into t1 values (NULL),(NULL);
truncate table t1;
insert into t1 values (4),(NULL);
stop slave;
start slave;
insert into t1 values (NULL),(NULL);
flush logs;
truncate table t1;
insert into t1 values (10),(NULL),(NULL),(NULL),(NULL),(NULL);

Warnings from just before the error:
Note 1051 Unknown table 't1' 
Note 1051 Unknown table 't2' 
Note 1051 Unknown table 't3' 
Note 1051 Unknown table 't4' 
Note 1051 Unknown table 't5' 
Note 1051 Unknown table 't6' 
Note 1051 Unknown table 't7' 
Note 1051 Unknown table 't8'

 == /Users/henry/pb/bzr_mysql-5.1-rpl/94/mysql-5.1.31-pb94/mysql-test/var-ps_row-5/tmp/analyze-sync_with_master-mysqld.2.err ==
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
7	system user		NULL	Connect	301	Has read all relay log; waiting for the slave I/O thread to update it	NULL
8	root	localhost	NULL	Query	0	NULL	SHOW PROCESSLIST
SHOW BINLOG EVENTS IN 'slave-bin.000001';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
slave-bin.000001	4	Format_desc	2	106	Server ver: 5.1.31-pb94-log, Binlog ver: 4
slave-bin.000001	106	Query	1	191	use `test`; create table t2(n int)
slave-bin.000001	191	Query	1	312	use `test`; create table t1(n int not null auto_increment primary key)
slave-bin.000001	312	Query	1	380	use `test`; BEGIN
slave-bin.000001	380	Table_map	1	421	table_id: 23 (test.t1)
slave-bin.000001	421	Write_rows	1	460	table_id: 23 flags: STMT_END_F
slave-bin.000001	460	Query	1	529	use `test`; COMMIT
slave-bin.000001	529	Query	1	609	use `test`; truncate table t1
slave-bin.000001	609	Query	1	677	use `test`; BEGIN
slave-bin.000001	677	Table_map	1	718	table_id: 24 (test.t1)
slave-bin.000001	718	Write_rows	1	757	table_id: 24 flags: STMT_END_F
slave-bin.000001	757	Query	1	826	use `test`; COMMIT

 == /Users/henry/pb/bzr_mysql-5.1-rpl/94/mysql-5.1.31-pb94/mysql-test/var-ps_row-5/tmp/analyze-sync_with_master-mysqld.1.err ==
SHOW PROCESSLIST;
Id	User	Host	db	Command	Time	State	Info
7	root	localhost	NULL	Query	0	NULL	SHOW PROCESSLIST
SHOW BINLOG EVENTS IN 'master-bin.000002';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info
master-bin.000002	4	Format_desc	1	106	Server ver: 5.1.31-pb94-log, Binlog ver: 4
master-bin.000002	106	Query	1	186	use `test`; truncate table t1
master-bin.000002	186	Query	1	254	use `test`; BEGIN
master-bin.000002	254	Table_map	1	295	table_id: 25 (test.t1)
master-bin.000002	295	Write_rows	1	354	table_id: 25 flags: STMT_END_F
master-bin.000002	354	Query	1	423	use `test`; COMMIT

 - saving '/Users/henry/pb/bzr_mysql-5.1-rpl/94/mysql-5.1.31-pb94/mysql-test/var-ps_row-5/log/rpl.rpl_sporadic_master/' to '/Users/henry/pb/bzr_mysql-5.1-rpl/94/mysql-5.1.31-pb94/mysql-test/var-ps_row-5/log/rpl.rpl_sporadic_master/'

Retrying test, attempt(2/3)...
[19 Jan 2009 2:47] Guangbao Ni
Now the bug is different in result reported.

rpl.rpl_sporadic_master 'row'  [ fail ]

mysqltest: At line 20: query 'flush logs' failed: 2013: Lost connection to MySQL server during query

The result from queries just before the failure was:
stop slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
reset master;
reset slave;
drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
start slave;
create table t2(n int);
create table t1(n int not null auto_increment primary key);
insert into t1 values (NULL),(NULL);
truncate table t1;
insert into t1 values (4),(NULL);
stop slave;
start slave;
insert into t1 values (NULL),(NULL);
flush logs;

More results from queries before failure can be found in /home/gni/mysql/bzr/6.0-bugteam/mysql-test/var/log/rpl_sporadic_master.log

Aborting: rpl.rpl_sporadic_master failed in default mode.
To continue, re-run with '--force'.
Stopping All Servers

And there are error information in master.err:

CURRENT_TEST: rpl.rpl_sporadic_master
090116 22:23:19 [Note] Plugin 'InnoDB' disabled by command line option
090116 22:23:19 [Note] Falcon: unable to open system data files.
090116 22:23:19 [Note] Falcon: creating new system data files.
090116 22:23:24 [Note] Event Scheduler: Loaded 0 events
090116 22:23:24 [Note] /home/gni/mysql/bzr/6.0-bugteam/sql/mysqld: ready for connections.
Version: '6.0.10-alpha-debug-log'  socket: '/home/gni/mysql/bzr/6.0-bugteam/mysql-test/var/tmp/master.sock'  port: 9306  Source distribution
090116 22:23:26 [Note] Start binlog_dump to slave_server(2), pos(, 4)
090116 22:23:26 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 107)
090116 22:23:27 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 107)
090116 22:23:27 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 313)
090116 22:23:28 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 313)
090116 22:23:28 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 422)
090116 22:23:29 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 422)
090116 22:23:29 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 530)
090116 22:23:30 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 530)
090116 22:23:30 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 678)
090116 22:23:31 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 678)
090116 22:23:31 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 758)
090116 22:23:32 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 758)
090116 22:23:33 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 827)
090116 22:23:33 [Note] Start binlog_dump to slave_server(2), pos(master-bin.000001, 827)
mysqld: thr_mutex.c:679: add_to_locked_mutex: Assertion `0' failed.
090116 22:23:33 - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=5
max_threads=151
thread_count=3
connection_count=3
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60060 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xaa25748
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xad99f060 thread_stack 0x30c00
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(my_print_stacktrace+0x35) [0x88d1246]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(handle_segfault+0x274) [0x830e990]
/lib/tls/i686/libpthread.so.0 [0xb14998]
/lib/tls/i686/libc.so.6(abort+0x118) [0x9070b8]
/lib/tls/i686/libc.so.6(__assert_fail+0xdc) [0x8ffa5c]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld [0x890188f]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld [0x8901783]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(safe_mutex_lock+0x431) [0x8900c68]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(my_close+0x94) [0x89020dd]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(MYSQL_LOG::close(unsigned int)+0xe4) [0x83e3ddc]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(MYSQL_QUERY_LOG::reopen_file()+0xb9) [0x83e4061]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(Log_to_file_event_handler::flush()+0x36) [0x83e0fe4]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(LOGGER::flush_logs(THD*)+0x27) [0x83e1293]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(reload_acl_and_cache(THD*, unsigned long, TABLE_LIST*, bool*)+0x20d) [0x832a573]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(mysql_execute_command(THD*)+0x562c) [0x83245d4]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x1d0) [0x8328ac6]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x853) [0x831d64b]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(do_command(THD*)+0x230) [0x831cbf0]
/home/gni/mysql/bzr/6.0-bugteam/sql/mysqld(handle_one_connection+0x123) [0x831a847]
/lib/tls/i686/libpthread.so.0 [0xb0f0dd]
/lib/tls/i686/libc.so.6(__clone+0x5e) [0x995a2e]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0xaa67378 = flush logs
thd->thread_id=2
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file

(gdb) bt
#0  0x00b12150 in pthread_kill () from /lib/tls/i686/libpthread.so.0
#1  0x088d133a in my_write_core (sig=6) at stacktrace.c:309
#2  0x0830eb45 in handle_segfault (sig=6) at mysqld.cc:2678
#3  <signal handler called>
#4  0x00905c38 in raise () from /lib/tls/i686/libc.so.6
#5  0x009070b8 in abort () from /lib/tls/i686/libc.so.6
#6  0x008ffa5c in __assert_fail () from /lib/tls/i686/libc.so.6
#7  0x0890188f in add_to_locked_mutex (locked_mutex=0xa9c0b88,
    current_mutex=0xa90ed6c) at thr_mutex.c:679
#8  0x08901783 in add_used_to_locked_mutex (used_mutex=0xa90ed6c,
    locked_mutex=0xa9c0b88) at thr_mutex.c:653
#9  0x08900c68 in safe_mutex_lock (mp=0x9650ba0, my_flags=0,
    file=0x8b71e88 "my_open.c", line=73) at thr_mutex.c:319
#10 0x089020dd in my_close (fd=12, MyFlags=16) at my_open.c:73
#11 0x083e3ddc in MYSQL_LOG::close (this=0xa90ed6c, exiting=2)
    at log.cc:3167
#12 0x083e4061 in MYSQL_QUERY_LOG::reopen_file (this=0xa90ed6c)
    at log.cc:3239
#13 0x083e0fe4 in Log_to_file_event_handler::flush (this=0xa90e8f0)
    at log.cc:1660
---Type <return> to continue, or q <return> to quit---
#14 0x083e1293 in LOGGER::flush_logs (this=0x8d10320, thd=0xab7a748)
    at log.cc:1762
#15 0x0832a573 in reload_acl_and_cache (thd=0xab7a748, options=2,
    tables=0x0, write_to_binlog=0xad92c43b) at sql_parse.cc:6630
#16 0x083245d4 in mysql_execute_command (thd=0xab7a748)
    at sql_parse.cc:3941
#17 0x08328ac6 in mysql_parse (thd=0xab7a748,
    inBuf=0xaba6ab8 "flush logs", length=10, found_semicolon=0xad92cdf8)
    at sql_parse.cc:5735
#18 0x0831d64b in dispatch_command (command=COM_QUERY, thd=0xab7a748,
    packet=0xab9a7b9 "flush logs", packet_length=10) at sql_parse.cc:1007
#19 0x0831cbf0 in do_command (thd=0xab7a748) at sql_parse.cc:690
#20 0x0831a847 in handle_one_connection (arg=0xab7a748)
    at sql_connect.cc:1145
#21 0x00b0f0dd in start_thread () from /lib/tls/i686/libpthread.so.0
#22 0x00995a2e in clone () from /lib/tls/i686/libc.so.6