Bug #15671 Test 'rpl_slave_status' fails often on Mac OS X x86
Submitted: 11 Dec 2005 22:55
Modified: 24 Jul 2006 22:48
Reporter: Kent Boortz
Status: Closed
Category: MySQL Server: Replication
Severity: S2 (Serious)
Version: 5.0.17, 5.1.4-alpha-pre
OS: MacOS (Mac OS X 10.4)
Assigned to: Chad MILLER
CPU Architecture: Any

[11 Dec 2005 22:55] Kent Boortz
Description:
The test case 'rpl_slave_status' is failing on Mac OS X x86.
The likely cause is a server crash:

  At line 24: query 'start slave' failed: 2013: Lost connection to MySQL server during query

How to repeat:
Run the 'rpl_slave_status' test case on Mac OS X x86.
It might not fail on every run, but it failed for the max
build in both 5.0.16 and 5.0.17-pre.
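
Because the failure is intermittent, it can help to run the test repeatedly and count how often it fails. The following is only a minimal sketch: the helper name count_failures is hypothetical, and the command passed to it (for example ./mysql-test-run.pl rpl_slave_status, run from the mysql-test directory) is an assumption about the local source tree, not something taken from this report.

```python
import subprocess


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return how many runs exited nonzero."""
    failures = 0
    for _ in range(runs):
        # Output is discarded; only the exit status is inspected.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

A call such as count_failures(["./mysql-test-run.pl", "rpl_slave_status"], 20) would then report how many of 20 runs failed, assuming the test runner exits nonzero on failure.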
[20 Dec 2005 19:04] Kent Boortz
Happens on PowerPC as well, so now showstopper
[9 Mar 2006 6:22] Greg Lehey
Tested on powermacg5.  This test didn't fail, but the first time round 'federated' did.  The second time it didn't.  It seems that we have a number of problems here.  I'm leaving the tests to run to see whether we can reproduce the original problem.
[10 Mar 2006 6:21] Greg Lehey
The failure with the federated test does not happen every time.  Here's the message from one occurrence:

TEST                            RESULT

-------------------------------------------------------

federated                      [ fail ]

Errors are (from /Users/grog/5.0-Bug-15671/mysql-test/var/log/mysqltest-time) :

mysqltest: In included file "./include/federated.inc": At line 9: query 'stop slave' failed: 2013: Lost connection to MySQL server during query
[10 Mar 2006 8:01] Elliot Murphy
Lost connection probably means a server crash, it would be good to grab the core file and get a backtrace.
[14 Mar 2006 7:56] Greg Lehey
As mentioned in private mail, there doesn't appear to be a core file.  There's also no 
"lost connection".  What we got was an ENOENT on the master socket and an
ECONNREFUSED from the slave.  It seems that for some reason the master doesn't
get started.
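
The two errors point at different states of a server socket. As a rough illustration, the sketch below probes a Unix-domain socket path and distinguishes them. It assumes the connection goes over a Unix socket (the report does not say how the slave was contacted), and the function name probe_unix_socket is hypothetical.

```python
import errno
import socket


def probe_unix_socket(path):
    """Try to connect to a Unix-domain socket and classify the result."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
    except OSError as e:
        if e.errno == errno.ENOENT:
            # No socket file at all: the server probably never started.
            return "missing"
        if e.errno == errno.ECONNREFUSED:
            # Socket file exists, but no process is listening on it.
            return "refused"
        # Any other error: report its symbolic errno name.
        return errno.errorcode.get(e.errno, "unknown")
    else:
        return "connected"
    finally:
        s.close()
```

Under these assumptions, "missing" corresponds to the ENOENT case and "refused" to the ECONNREFUSED case.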
[28 Mar 2006 2:06] Greg Lehey
There was a core file after all.  By default, Apple puts *all* core files in the directory /cores:

  $ sysctl kern.corefile
  kern.corefile = /cores/core.%P

This format is not documented in the Apple man pages, but appears to be identical to the FreeBSD format described at http://www.freebsd.org/cgi/man.cgi?query=core&apropos=0&sektion=0&manpath=FreeBSD+6.0-stab...
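
To make the pattern concrete, here is a small sketch of how such a template expands to a file name. It follows the FreeBSD core(5) specifiers %N (process name), %P (process ID) and %U (user ID); whether Mac OS X honors all of them is an assumption, and the function name expand_core_pattern is hypothetical.

```python
def expand_core_pattern(pattern, pid, name="", uid=0):
    """Expand %P, %N, %U and %% in a kern.corefile-style template."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            if spec == "P":
                out.append(str(pid))
            elif spec == "N":
                out.append(name)
            elif spec == "U":
                out.append(str(uid))
            elif spec == "%":
                out.append("%")
            else:
                # Unknown specifier: keep it verbatim.
                out.append("%" + spec)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With the default setting shown above, a process with PID 1234 would therefore leave its core in /cores/core.1234.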

The core file in question shows that the slave crashed during stop:

(gdb) i threads
  7 core thread 6  0x90017238 in semaphore_wait_signal_trap ()
  6 core thread 5  0x90017238 in semaphore_wait_signal_trap ()
  5 core thread 4  0x9000ed4c in read ()
  4 core thread 3  0x900873cc in __pthread_kill ()
  3 core thread 2  0x90017238 in semaphore_wait_signal_trap ()
  2 core thread 1  0x9005efac in sigwait ()
* 1 core thread 0  0x9000b46c in select ()
(gdb) thread 4
[Switching to thread 4 (core thread 3)]
#0  0x900873cc in __pthread_kill ()
(gdb) bt
#0  0x900873cc in __pthread_kill ()
#1  0x90087198 in pthread_kill ()
#2  0x001cc814 in write_core (sig=11) at stacktrace.c:220
#3  0x000945d4 in handle_segfault (sig=11) at mysqld.cc:2083
#4  <signal handler called>
#5  0x8f8f8f8c in ?? ()
Cannot access memory at address 0x8f8f8f8c
Cannot access memory at address 0x8f8f8f8c
Cannot access memory at address 0x8f8f8f8c
Cannot access memory at address 0x8f8f8f8c
Cannot access memory at address 0x8f8f8f8c
#6  0x90087198 in pthread_kill ()
#7  0x00080dec in _ZN3THD5awakeENS_12killed_stateE (this=0x485e218, state_to_set=4027129224) at sql_class.cc:484

This looks very much like stack corruption.
[5 Apr 2006 21:54] Elliot Murphy
This was looking like a race condition during stop slave.
I closed bug#12500 as a duplicate of this bug,
raising this one to P1.
[26 Apr 2006 5:43] Elliot Murphy
Possibly related to bug#12251.
[16 May 2006 18:52] Chad MILLER
See also my attachments to Bug#19437.
[24 Jul 2006 22:47] Chad MILLER
I think this is probably fixed in a previous patch.  Please reopen if warranted.

I'm going to leave my machine testing overnight and re-open if there is a problem.
[24 Jul 2006 22:48] Chad MILLER
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://www.mysql.com/doc/en/Installing_source_tree.html