Bug #46827 rpl_circular_for_4_hosts failed on PB-2
Submitted: 20 Aug 2009 10:10
Modified: 12 Mar 2010 17:36
Reporter: Andrei Elkin
Status: Closed
Category: Tests: Replication
Severity: S3 (Non-critical)
Version: 5.1
OS: Any
Assigned to: Libing Song
CPU Architecture: Any

[20 Aug 2009 10:10] Andrei Elkin
Description:
Failure on PB-2 

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2284291

has the following details

Where: mysql-5.1-bugteam
Build: 495429 binary-max-sol10-x86_64-tar-gz
Test: 495604 tyr37 test-max-sol10-x86_64
Run: 495889 siv23 rpl_binlog_row
Suite: rpl
Case: rpl_circular_for_4_hosts
Mode: None
When: 2009-08-19 17:24:07

rpl.rpl_circular_for_4_hosts             w2 [ fail ]

How to repeat:
Look at the URL.
[31 Aug 2009 11:17] Andrei Elkin
PB2 shows a range of failures for this test: timeouts and failed pre- and
post-test mtr checks. The most critical one is a crash on Windows, as
reported in
http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2288058

Where: mysql-5.1-telco-6.3
Build: 501780 tree-max-win-x86-zip
Test: 501951 loki02 test-max-win_ws2008-x86
Run: 502242 tyr27 n_mix
Suite: rpl
Case: rpl_circular_for_4_hosts
Mode: None
When: 2009-08-25 12:14:51

rpl.rpl_circular_for_4_hosts             [ fail ]
        Test ended at 2009-08-25 16:05:46

CURRENT_TEST: rpl.rpl_circular_for_4_hosts
mysqltest: At line 101: query 'STOP SLAVE' failed: 2013: Lost connection to MySQL server during query

The result from queries just before the failure was:
< snip >

* Data on servers (C failed) *
SELECT 'Master A',a,b FROM t1 WHERE c = 2 ORDER BY a,b;
Master A	a	b
Master A	5	A
Master A	8	D
SELECT 'Master B',a,b FROM t1 WHERE c = 2 ORDER BY a,b;
Master B	a	b
Master B	5	A
Master B	6	B
Master B	8	D
SELECT 'Master C',a,b FROM t1 WHERE c = 2 ORDER BY a,b;
Master C	a	b
Master C	6	C
[29 Sep 2009 9:47] Luis Soares
PB2 reports that in mysql-5.1-bugteam we have only been
experiencing Solaris failures since 2009-08-19.

The lack of recent failures with the pattern referred to by "[31 Aug
11:17] Andrei Elkin" may be explained by the following (we base
the analysis on what the logs for the failure show):

CURRENT_TEST: rpl.rpl_circular_for_4_hosts
mysqltest: At line 323: query 'STOP SLAVE' failed: 2013: Lost connection to MySQL server during query

[...]

STACK_TEXT:  
ntdll!RtlpCoalesceFreeBlocks
ntdll!RtlpFreeHeap
ntdll!RtlFreeHeap
kernel32!HeapFree
mysqld!free [f:\dd\vctools\crt_bld\self_x86\crt\src\free.c @ 110]
mysqld!_freefls [f:\dd\vctools\crt_bld\self_x86\crt\src\tidtable.c @ 737]
mysqld!_freeptd [f:\dd\vctools\crt_bld\self_x86\crt\src\tidtable.c @ 794]
mysqld!_endthread [f:\dd\vctools\crt_bld\self_x86\crt\src\thread.c @ 358]
mysqld!handle_slave_io [g:\pb2\build\sb_0-708750-1251198901.38\mysql-5.1.35-ndb-6.3.27-win-x86\sql\slave.cc @ 2713]
mysqld!pthread_start [g:\pb2\build\sb_0-708750-1251198901.38\mysql-5.1.35-ndb-6.3.27-win-x86\mysys\my_winthread.c @ 88]
mysqld!_callthreadstart [f:\dd\vctools\crt_bld\self_x86\crt\src\thread.c @ 293]
mysqld!_threadstart [f:\dd\vctools\crt_bld\self_x86\crt\src\thread.c @ 275]
kernel32!BaseThreadInitThunk
ntdll!__RtlUserThreadStart
ntdll!_RtlUserThreadStart

This resembles the issues that were fixed in BUG#40796,
Bug#45243, Bug#45242, Bug#45238, Bug#46030, Bug#46014. The patch for
these bugs was pushed on Aug 14th.

Given that the last Windows failure in mysql-5.1-bugteam was
observed on 2009-08-12, matching the time frame of the push for
BUG#40796 (and friends), it's fair to conclude that the Windows
failures for this test are fixed.

The remaining failures seem related to Solaris only:

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2348768&...

So, going back to the Solaris issues, if we check the link provided
in the description section, we can find several symptoms (listing
three of them), all related to system resources:

=== Symptom 1 ===

[...]
090819 23:07:37  InnoDB: Started; log sequence number 0 0
090819 23:07:48 [ERROR] Out of memory; restart server and try again (needed 19456 bytes)
090819 23:07:48 [ERROR] Aborting

Notice the "Out of memory".

=== Symptom 2 ===

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2294479

Shows:

090829  4:44:00  InnoDB: Error: cannot allocate 32768 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 16508248 bytes. Operating system errno: 11
InnoDB: Check if you should increase the swap file or
InnoDB: ulimits of your operating system.
InnoDB: On FreeBSD check you have compiled the OS with
InnoDB: a big enough maximum process size.
InnoDB: Note that in most 32-bit computers the process
InnoDB: memory space is limited to 2 GB or 4 GB.

Notice the "cannot allocate 32768 bytes".

=== Symptom 3 ===

Finally, a more recent failure:

http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2348768

shows:

rpl.rpl_circular_for_4_hosts             w2 [ fail ]
        Test ended at 2009-09-29 08:45:35

CURRENT_TEST: rpl.rpl_circular_for_4_hosts

Could not execute 'check-warnings' for testcase 'rpl.rpl_circular_for_4_hosts' (res: 1):

 - saving '/export/home/pb2/test/sb_1-775062-1254198985.86/mysql-5.1.40-solaris10-i386-test/mysql-test/var-rpl_binlog_row/2/log/rpl.rpl_circular_for_4_hosts/' to '/export/home/pb2/test/sb_1-775062-1254198985.86/mysql-5.1.40-solaris10-i386-test/mysql-test/var-rpl_binlog_row/log/rpl.rpl_circular_for_4_hosts/'

Retrying test, attempt(2/3)...

fork failed sleep 1 second and redo: Not enough space at lib/My/SafeProcess/Base.pm line 52.
fork failed sleep 1 second and redo: Not enough space at lib/My/SafeProcess/Base.pm line 52.
rpl.rpl_circular_for_4_hosts 'InnoDB plugin' w1 [ fail ]
        Test ended at 2009-09-29 08:45:47

CURRENT_TEST: rpl.rpl_circular_for_4_hosts

Could not execute 'check-warnings' for testcase 'rpl.rpl_circular_for_4_hosts' (res: 1):
ld.so.1: mysqltest: fatal: /usr/lib/64/libCstd.so.1: mmap failed: Resource temporarily unavailable
ld.so.1: mysqltest: fatal: libCstd.so.1: open failed: No such file or directory

=== Conclusions ===

1. Windows failures seem to be fixed by the patch for BUG#40796, as
   the failures stopped after that patch was pushed.

2. Solaris failures seem related to limited resources (exhausted
   memory?) during the test run.
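
To make it easier to check whether a new failure log falls into the same resource-exhaustion group, here is a minimal sketch of a log scanner. The substrings are taken from the three symptoms quoted above. The function name and the labels are hypothetical and are not part of mtr or PB2.

```python
import re

# Hypothetical pattern table. The matched substrings come from the log
# excerpts quoted in this report; the labels are illustrative only.
RESOURCE_PATTERNS = {
    "server_oom": re.compile(r"\[ERROR\] Out of memory"),
    "innodb_malloc": re.compile(r"InnoDB: Error: cannot allocate \d+ bytes"),
    "fork_failed": re.compile(r"fork failed sleep \d+ second and redo"),
    "mmap_failed": re.compile(r"mmap failed: Resource temporarily unavailable"),
}


def classify_resource_symptoms(log_text):
    """Return the sorted labels of all resource-exhaustion patterns found in log_text."""
    return sorted(
        label
        for label, pattern in RESOURCE_PATTERNS.items()
        if pattern.search(log_text)
    )
```

An empty result only means that none of these four patterns matched. It does not rule out a resource problem that shows up with a different message.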
[3 Oct 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[12 Oct 2009 0:20] Luis Soares
A failure on the Linux platform has been observed, with a different symptom:

rpl.rpl_circular_for_4_hosts             w5 [ fail ]
        Test ended at 2009-10-09 12:15:32

CURRENT_TEST: rpl.rpl_circular_for_4_hosts
--- /export/home/pb2/test/sb_3-807926-1255078194.91/mysql-5.1.40-linux-x86_64-test/mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.result	2009-10-09 11:44:35.000000000 +0300
+++ /export/home/pb2/test/sb_3-807926-1255078194.91/mysql-5.1.40-linux-x86_64-test/mysql-test/suite/rpl/r/rpl_circular_for_4_hosts.reject	2009-10-09 13:15:31.000000000 +0300
@@ -269,9 +269,10 @@
 * Transactions with rollbacks *
 BEGIN;
 BEGIN;
+Timeout in wait_condition.inc for SELECT COUNT(*)=200 FROM t2 WHERE c = 2
 SELECT 'Master A',b,COUNT(*) FROM t2 WHERE c = 2 GROUP BY b ORDER BY b;
 Master A	b	COUNT(*)
-Master A	B	100
+Master A	B	99
 Master A	D	100
 SELECT 'Master B',b,COUNT(*) FROM t2 WHERE c = 2 GROUP BY b ORDER BY b;
 Master B	b	COUNT(*)

Details at:
http://pb2.norway.sun.com/web.py?template=mysql_show_test_failure&test_failure_id=2375742
[7 Dec 2009 14:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/93073

3212 Li-Bing.Song@sun.com	2009-12-07
      Bug #46827  	rpl_circular_for_4_hosts failed on PB2
      
      This test case tests circular replication across four hosts:
      A--->B--->C--->D--->A
      Replication is slow and needs more time to propagate all data around the circle.
      The time replication takes is sometimes longer than the time that
      wait_condition.inc waits for all data to be replicated. This
      causes sporadic failures of this test case.

      This patch uses sync_slave_with_master to ensure that all data is replicated
      successfully around the circle.
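
The race described in the commit message can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it is not mysqltest code, every function name is hypothetical, time is counted in abstract ticks, and rows flow in one direction from host A only. It compares a polling wait that gives up after a fixed number of ticks with a hop-by-hop wait that returns once each replica has caught up with its master. Any timeout the real sync_slave_with_master command may apply is left out.

```python
RING = ["A", "B", "C", "D"]  # A--->B--->C--->D--->A, as in the commit message


def fresh_state(rows_on_a):
    """All rows start on host A only; the other hosts have applied nothing yet."""
    return {"A": rows_on_a, "B": 0, "C": 0, "D": 0}


def tick(applied, rate):
    """Advance one simulated time unit: each host copies up to `rate` rows from its upstream.

    Hosts are updated from the tail of the chain to the head, so a row needs
    one tick per hop. That is what makes the last host lag behind.
    """
    for i in range(len(RING) - 1, 0, -1):
        upstream, host = RING[i - 1], RING[i]
        applied[host] = min(applied[upstream], applied[host] + rate)


def poll_with_timeout(applied, host, expected, rate, timeout_ticks):
    """Polling wait: re-check a row count until it matches or the tick budget runs out."""
    for _ in range(timeout_ticks):
        if applied[host] == expected:
            return True
        tick(applied, rate)
    return applied[host] == expected


def sync_hop_by_hop(applied, rate):
    """Position-style wait: sync each replica with its master in ring order.

    Returns the number of ticks spent. `rate` must be positive, otherwise
    the loop never finishes.
    """
    ticks = 0
    for i in range(1, len(RING)):
        upstream, host = RING[i - 1], RING[i]
        while applied[host] < applied[upstream]:
            tick(applied, rate)
            ticks += 1
    return ticks
```

With 100 rows and a rate of 10 rows per tick, host D needs 12 ticks to receive everything. A polling wait with a budget of 8 ticks therefore reports a timeout even though nothing is broken, while the hop-by-hop wait simply takes the 12 ticks it needs.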
[16 Dec 2009 4:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/94359

3278 Li-Bing.Song@sun.com	2009-12-16
      Bug #46827  	rpl_circular_for_4_hosts failed on PB2
      
      This test case tests circular replication across four hosts:
      A--->B--->C--->D--->A
      Replication is slow and needs more time to propagate all data around the circle.
      The time replication takes is sometimes longer than the time that
      wait_condition.inc waits for all data to be replicated. This
      causes sporadic failures of this test case.

      This patch uses sync_slave_with_master to ensure that all data is replicated
      successfully around the circle.
[19 Dec 2009 8:27] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091219082307-f3i4fn0tm8trb3c0) (version source revid:alik@sun.com-20091216180721-eoa754i79j4ssd3m) (merge vers: 6.0.14-alpha) (pib:15)
[19 Dec 2009 8:31] Bugs System
Pushed into 5.5.1-m2 (revid:alik@sun.com-20091219082021-f34nq4jytwamozz0) (version source revid:alexey.kopytov@sun.com-20091216134707-o96eqw0u2ynvo9gm) (merge vers: 5.5.0-beta) (pib:15)
[19 Dec 2009 8:34] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091219082213-nhjjgmphote4ntxj) (version source revid:alik@sun.com-20091216180221-a5ps59gajad3pip9) (pib:15)
[21 Dec 2009 9:07] Libing Song
Pushed into mysql-5.1-bugteam, merged into mysql-pe
[15 Jan 2010 8:58] Bugs System
Pushed into 5.1.43 (revid:joro@sun.com-20100115085139-qkh0i0fpohd9u9p5) (version source revid:li-bing.song@sun.com-20091216044115-h51xf8wqds1jlp03) (merge vers: 5.1.42) (pib:16)
[15 Jan 2010 18:32] Paul DuBois
Changes to test cases. No changelog entry needed.
[12 Mar 2010 14:06] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:22] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:36] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[12 Mar 2010 17:36] Paul DuBois
No changelog entry needed.