Bug #38095 Several tests fail in 'stop slave' on RedHat 5 IA64 (icc) + OS X 10.5 PPC
Submitted: 14 Jul 2008 12:18
Modified: 9 Jan 2015 16:01
Reporter: Kent Boortz
Status: Can't repeat
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.0.66, 5.0.68
OS: Other (IA64 (icc) + OS X (PPC))
Assigned to: Assigned Account
CPU Architecture: Any

[14 Jul 2008 12:18] Kent Boortz
Description:
This failure is using Red Hat 5 IA64 RPMs, and the "debug" enabled server.

While it is tempting to ignore this, I think it should be verified that this is
not caused by an assert, i.e. that the bug is not in the normal server as well
and just happens not to cause any trouble there because the assert is not
compiled in.

The tests 'federated', 'federated_archive', 'federated_bug_13118',
'federated_bug_25714' and 'federated_innodb' all fail the same way,
as in this example:

  federated_innodb               [ fail ]

  mysqltest: In included file "./include/federated.inc": At line 11:
  query 'stop slave' failed: 2013: Lost connection to MySQL server during query

  The result from queries just before the failure was:
  stop slave;
  drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
  reset master;
  reset slave;
  drop table if exists t1,t2,t3,t4,t5,t6,t7,t8,t9;
  start slave;
  stop slave;

The 'federated' test also fails the same way on Mac OS X 10.4
PowerPC, compiled as 64-bit.

How to repeat:
Run these tests using packages built on RHEL5 IA64 using the icc
compiler.
[4 Aug 2008 12:33] Valeriy Kravchuk
Looks platform/package specific. Cannot repeat on 5.0.68-bzr on 32-bit SUSE:

openxs@suse:/home2/openxs/dbs/5.0/mysql-test> ./mysql-test-run.pl --force federated federated_archive federated_bug_13118 federated_bug_25714 federated_innodb
Logging: ./mysql-test-run.pl --force federated federated_archive federated_bug_13118 federated_bug_25714 federated_innodb
MySQL Version 5.0.68
Using ndbcluster when necessary, mysqld supports it
Setting mysqld to support SSL connections
Binaries are debug compiled
mysql-test-run: WARNING: Could not find all required ndb binaries, all ndb tests will fail, use --skip-ndbcluster to skip testing it.
Using MTR_BUILD_THREAD      = 0
Using MASTER_MYPORT         = 9306
Using MASTER_MYPORT1        = 9307
Using SLAVE_MYPORT          = 9308
Using SLAVE_MYPORT1         = 9309
Using SLAVE_MYPORT2         = 9310
Using NDBCLUSTER_PORT       = 9311
Using IM_PORT               = 9313
Using IM_MYSQLD1_PORT       = 9314
Using IM_MYSQLD2_PORT       = 9315
Killing Possible Leftover Processes
Removing Stale Files
Creating Directories
Installing Master Database
Installing Slave1 Database
Saving snapshot of installed databases
=======================================================
Starting Tests in the 'main' suite

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

federated                      [ pass ]           3069
skipped 9 bytes from file: socket (3)
federated_archive              [ pass ]           1158
skipped 9 bytes from file: socket (3)
federated_bug_13118            [ pass ]            151
skipped 9 bytes from file: socket (3)
federated_bug_25714            [ skipped ]   Test requires: 'have_bug25714'
skipped 9 bytes from file: socket (3)
federated_innodb               [ pass ]           1094
-------------------------------------------------------
Stopping All Servers
skipped 9 bytes from file: socket (3)
skipped 9 bytes from file: socket (3)
All 4 tests were successful.
The servers were restarted 1 times
Spent 5.472 of 19 seconds executing testcases
[28 Aug 2008 13:54] Sveta Smirnova
Thank you for the report.

Verified as described. Only RPM packages are affected.
[20 Jan 2009 19:10] Joerg Bruehe
Still current in the 5.0.72sp1 builds, but now found only in OS X 10.5 PPC builds (both 32-bit and 64-bit), and with a low probability.

This bug (or at least this symptom) has a long history, and we never got rid of it.
Related bugs seem to include (at least) bug#12500, bug#15671, bug#15673, and bug#35319.

Some of these are marked "closed", not on the basis of an analysis and a corresponding code fix, but because the symptom no longer occurred.
Sadly, this seems to be load- or timing-related, or otherwise caused by some race condition which may or may not be triggered in a given run.

I am changing the category from "federated" to "replication", as it is "stop slave" where this happens, and I am also updating the OS information.
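
Since the failure is reported as low-probability and possibly timing-related, a single passing run says little. Below is a minimal sketch of a loop that repeats a command and counts how often it fails. The helper name `repeat_count_failures` is hypothetical and not part of mysql-test-run, and the sketch assumes the command being run exits with a non-zero status when a test fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of the MySQL test suite):
# run a command RUNS times and print how many runs failed.
# Assumes the command exits non-zero on failure.
# Usage: repeat_count_failures RUNS COMMAND [ARGS...]
repeat_count_failures() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Output is discarded here; redirect to a per-run log file
        # instead if the failing output needs to be inspected.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocation, run from the mysql-test directory
# (command line taken from the comment of 4 Aug 2008):
#   repeat_count_failures 50 ./mysql-test-run.pl --force federated
```

A count of zero after many runs on an affected platform would still not prove the race is gone, only that it was not hit under that particular load.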