Bug #59205 Test rpl.rpl_stm_000001 fail on Valgrind with binlog-checksum enabled
Submitted: 29 Dec 2010 6:14 Modified: 30 Mar 2011 7:58
Reporter: Zhenxing He Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6 OS:Any
Assigned to: Andrei Elkin CPU Architecture:Any
Tags: binlog-checksum, pb2, test
Triage: Triaged: D1 (Critical)

[29 Dec 2010 6:14] Zhenxing He
Description:
Test rpl.rpl_stm_000001 failing with binlog-checksum enabled on Valgrind:

rpl.rpl_stm_000001 'stmt'                w3 [ fail ]
        Test ended at 2010-12-24 01:33:11

CURRENT_TEST: rpl.rpl_stm_000001
==23095== Memcheck, a memory error detector
==23095== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==23095== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==23095== Command: /export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/bin/mysqltest --defaults-file=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/var-all_binlog_checksum/3/my.cnf --silent --tmpdir=/export/home2/pb2/test/sb_1-2708597-1293145839.87/tmp/mptw6xCnC4/3 --character-sets-dir=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/share/charsets --logdir=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/var-all_binlog_checksum/3/log --plugin_dir=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/lib/plugin --database=test --timer-file=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/var-all_binlog_checksum/3/log/timer --test-file=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/suite/rpl/t/rpl_stm_000001.test --tail-lines=20 --result-file=/export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/suite/rpl/r/rpl_stm_000001.result
==23095== 
mysqltest: In included file "./include/wait_for_slave_param.inc": At line 115: Timeout in include/wait_for_slave_param.inc

The result from queries just before the failure was:
< snip >
master-bin.000001	1193771	Query	1	1193843	BEGIN
master-bin.000001	1193843	Query	1	1193934	use `test`; insert into t1 values(1)
master-bin.000001	1193934	Query	1	1194007	COMMIT
master-bin.000001	1194007	Query	1	1194097	use `test`; create table t2(id int)
master-bin.000001	1194097	Query	1	1194169	BEGIN
master-bin.000001	1194169	Query	1	1194274	use `test`; insert into t2 values(connection_id())
master-bin.000001	1194274	Query	1	1194347	COMMIT
master-bin.000001	1194347	Query	1	1194446	use `test`; create temporary table t3(n int)
master-bin.000001	1194446	Query	1	1194518	BEGIN
master-bin.000001	1194518	Query	1	1194644	use `test`; insert into t3 select get_lock('crash_lock%20C', 1) from t2
master-bin.000001	1194644	Query	1	1194717	COMMIT
master-bin.000001	1194717	Query	1	1194789	BEGIN
master-bin.000001	1194789	Query	1	1194907	use `test`; update t1 set n = n + get_lock('crash_lock%20C', 2)
master-bin.000001	1194907	Query	1	1194980	COMMIT
master-bin.000001	1194980	Query	1	1195088	use `test`; DROP TABLE `t2` /* generated by server */

**** SHOW RELAYLOG EVENTS on server_1 ****
relaylog_name = 'No such row'
SHOW RELAYLOG EVENTS IN 'No such row';
Log_name	Pos	Event_type	Server_id	End_log_pos	Info

More results from queries before failure can be found in /export/home2/pb2/test/sb_1-2708597-1293145839.87/mysql-5.6.1-m5-linux-x86_64-test/mysql-test/var-all_binlog_checksum/3/log/rpl_stm_000001.log
==23095== 
==23095== HEAP SUMMARY:
==23095==     in use at exit: 1,560 bytes in 5 blocks
==23095==   total heap usage: 351,765 allocs, 351,760 frees, 257,100,379 bytes allocated
==23095== 
==23095== LEAK SUMMARY:
==23095==    definitely lost: 0 bytes in 0 blocks
==23095==    indirectly lost: 0 bytes in 0 blocks
==23095==      possibly lost: 0 bytes in 0 blocks
==23095==    still reachable: 0 bytes in 0 blocks
==23095==         suppressed: 1,560 bytes in 5 blocks
==23095== 
==23095== For counts of detected and suppressed errors, rerun with: -v
==23095== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

How to repeat:
http://pb2.norway.sun.com/?template=mysql_show_test_failure&test_output_id=60150e60227335b...
[29 Dec 2010 16:35] Andrei Elkin
It's quite possible that the test is just naturally long.

<andrei> jasonh, have you considered the slow valgrind as a reason for Bug
	 #59205 - it's a timeout.				        [12:34]
<andrei> ?
<jasonh> andrei: possible, but it fails several times so there could be other
	 problems						        [12:38]
<andrei> jasonh, i'll check some other test time to compare with time on a
	 not-failing platform..					        [12:39]
<jasonh> andrei: OK
<andrei> jasonh, so for instance rpl_row_sp002_innodb 28080 on valgrind and
	 219 sol10.						        [12:44]
<andrei> 
<andrei>  now, take 8071 * 28080/219 = 1 033 088		        [12:45]
[29 Dec 2010 16:41] Andrei Elkin
And yes, the test generates some time consuming load on master
that could make the slave catching up for over 300 secs:

let $1=5000;
# Generate 16K of relay log
disable_query_log;
while ($1)
{
 eval insert into t1 values($1);
 dec $1;
}
enable_query_log;
[29 Dec 2010 17:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/127689

3475 Andrei Elkin	2010-12-29
      Bug #59205 Test rpl.rpl_stm_000001 fail on Valgrind with binlog-checksum enabled
      
      It was a timeout cause by logics of the test producing rather big volume of data.
      
      Adding a valgrind branch in the test having lesser load size so slow valgrind has
      a great chance to finish the test.
     @ mysql-test/suite/rpl/t/rpl_stm_000001.test
        decreasing the load size few times on the valgrid env.
        Notice, we seem could drop the general case load to the valgrind's value as well,
        if comments about 16k are relevant.
[30 Mar 2011 7:58] Jon Stephens
Issue doesn't appear in release. Closing.