Bug #26664 test suite times out on OS X 64bit
Submitted: 27 Feb 2007 5:33
Modified: 20 Jun 2007 1:09
Reporter: Scott Lee
Status: Closed
Category: MySQL Server: Tests
Severity: S2 (Serious)
Version: 5.0.36
OS: Mac OS X (OS X 64bit)
Assigned to: Magnus Blåudd
CPU Architecture: Any

[27 Feb 2007 5:33] Scott Lee
Description:
The test suite does not complete because of multiple timeouts.  This occurs in enterprise, cluster+ndb and classic.

$ cd /data0/mysqldev/my/build-200702201448-5.0.36/mysql-5.0.36-build/log
$ grep ' timeout' osx-tiger-ppc-64bit-5.0-classic.log
rpl000001                      [ fail ]  timeout
rpl000002                      [ fail ]  timeout
rpl000005                      [ fail ]  timeout
rpl000006                      [ fail ]  timeout
rpl000008                      [ fail ]  timeout
rpl000009                      [ fail ]  timeout
rpl000011                      [ fail ]  timeout
rpl000013                      [ fail ]  timeout
rpl000015                      [ fail ]  timeout
rpl_EE_error                   [ fail ]  timeout
rpl_alter                      [ fail ]  timeout
mysql-test-run: *** ERROR: Test suite timeout
$

How to repeat:
$ perl ./mysql-test-run.pl --comment=debug --skip-rpl --skip-ndbcluster --tmpdir=/Users/mysqldev/tmp-200702201448-5.0.36-25396/tmp-25396/my_build-osx-tiger-ppc-64bit-16408  --timer --force --report-features
[27 Feb 2007 9:44] Sveta Smirnova
Thank you for the report.

I cannot repeat it on our osx-tiger-ppc with binaries built by the BUILD/compile-ppc-debug-max script.

Please indicate how you built the binaries and which machine I can verify on.
[27 Feb 2007 14:57] Scott Lee
The build was on osx-tiger-ppc and here are the binaries: /data0/mysqldev/my/build-200702201448-5.0.36/mysql-5.0.36-build/dist/packages/mysql-*-5.0.36-osx10.4-powerpc-64bit.tar.gz
[28 Feb 2007 12:04] Sveta Smirnova
Thank you for the report.

Verified as described.
[2 Apr 2007 10:28] Magnus Blåudd
Verified on osx-tiger-ppc with a 64bit build. It looks like the slave is having problems connecting to the master (although there are lots of error messages in slave.err that I know should be there :( )

Need to find the basic rpl test and run with that.
[2 Apr 2007 19:44] Magnus Blåudd
Looks like the slave fails to set the socket to blocking mode.
[19 Apr 2007 8:42] Magnus Blåudd
The problem occurs when we set SO_SNDTIMEO/SO_RCVTIMEO on the socket that connects from the slave mysqld to the master. It causes read to time out very fast with error EWOULDBLOCK (which has the same errno as EAGAIN - kind of confusing).
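
To illustrate the symptom described above, here is a minimal standalone sketch (not taken from the MySQL sources) of a read() that hits SO_RCVTIMEO and fails with EAGAIN/EWOULDBLOCK. The helper name errno_after_read_timeout and the use of a local socketpair are assumptions made only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
   Returns the errno of a read() that ran into the receive timeout,
   0 if the read unexpectedly succeeded, or -1 if setup failed. */
int errno_after_read_timeout(long timeout_ms)
{
  int fds[2];
  struct timeval tv;
  char buf[16];
  ssize_t n;
  int result;

  assert(timeout_ms > 0);

  /* A connected pair of local sockets; nothing is ever written to it. */
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return -1;

  /* SO_RCVTIMEO takes a struct timeval, passed with its exact size. */
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  if (setsockopt(fds[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0) {
    result = -1;
  } else {
    /* No data arrives, so read() gives up once the timeout expires. */
    n = read(fds[0], buf, sizeof(buf));
    result = (n < 0) ? errno : 0;
  }

  close(fds[0]);
  close(fds[1]);
  return result;
}
```

Calling errno_after_read_timeout(50) should return EAGAIN (or EWOULDBLOCK, where the two differ) after roughly 50 ms. A timeout that is much shorter than intended would look the same to the caller, just sooner.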

There are two problems with this:
1. Since the "client" in mysqld uses alarms there is
   actually no need to use 'setsockopt'. This can be
   quite easily fixed by using net_set_write_timeout/net_set_read_timeout
   in client.c instead of the direct calls to 'vio_timeout'.
   The net_set* functions have an #ifdef that only sets the
   socket timeout if the client is not using alarms.
2. The implementation of 'vio_timeout' in viosocket.c is wrong since
   it casts the parameter to setsockopt to 'char*' instead of 'const void*',
   which causes the call to setsockopt to fail on a 64bit compiled OSX.
   Unfortunately there are 'setsockopt' implementations that
   use char*.

Also tried to read back the timeout that was set with getsockopt, but that just fails with error -1. Will not pursue this any more since the above two corrections will take care of the problem.
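
For reference, a minimal sketch of how a receive timeout can be set and read back using the POSIX prototypes, where the option value is a 'const void*' with an explicit length. This is not the code from viosocket.c (which is not shown in this report); the helper name roundtrip_recv_timeout is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
   Sets SO_RCVTIMEO to 'seconds' on a fresh socket and reads it back
   into *out. Returns 0 on success, -1 on any failure. */
int roundtrip_recv_timeout(long seconds, struct timeval *out)
{
  struct timeval tv;
  socklen_t len = sizeof(*out);
  int fd;
  int rc = -1;

  assert(out != NULL);
  assert(seconds >= 0);

  fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;

  tv.tv_sec = seconds;
  tv.tv_usec = 0;

  /* Pass the struct itself and its real size; always check the result. */
  if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
      getsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, out, &len) == 0)
    rc = 0;

  close(fd);
  return rc;
}
```

Checking both return values makes a failing setsockopt or getsockopt call visible right away instead of showing up later as an unexpected read timeout.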
[28 Apr 2007 9:53] Sergey Vojtovich
BUG#26562 was closed as a duplicate.
[24 May 2007 9:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27261

ChangeSet@1.2456, 2007-05-24 11:21:27+02:00, msvensson@pilot.blaudden +13 -0
  Bug#26664 test suite times out on OS X 64bit
   - The "mysql client in mysqld"(which is used by
     replication and federated) should use alarms instead of setting
     socket timeout value if the rest of the server uses alarm. By
     always calling 'my_net_set_write_timeout'
     or 'my_net_set_read_timeout' when changing the timeout value(s), the
     selection whether to use alarms or timeouts will be handled by
     ifdef's in those two functions. 
   - Move declaration of 'vio_timeout' into "vio_priv.h"
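
The compile-time selection described in this changeset can be sketched roughly as follows. This is only an illustration under stated assumptions: struct demo_net, demo_set_read_timeout and the DEMO_USE_ALARMS macro are hypothetical stand-ins, not the real NET structure, 'my_net_set_read_timeout' or the macro used in the server.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical stand-in for a connection object (not MySQL's NET). */
struct demo_net {
  int fd;                     /* -1 when no socket is attached */
  unsigned int read_timeout;  /* seconds */
};

/* Always records the timeout. Only touches the socket when the build
   does NOT use alarms (DEMO_USE_ALARMS is a made-up macro). */
int demo_set_read_timeout(struct demo_net *net, unsigned int seconds)
{
  assert(net != NULL);
  net->read_timeout = seconds;

#ifndef DEMO_USE_ALARMS
  if (net->fd >= 0) {
    struct timeval tv;
    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(net->fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
  }
#endif
  /* Alarm builds, or no socket attached: nothing more to do here. */
  return 0;
}
```

With a single entry point like this, callers never decide between alarms and socket timeouts themselves; compiling with -DDEMO_USE_ALARMS removes the setsockopt path entirely.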
[25 May 2007 9:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27325

ChangeSet@1.2499, 2007-05-25 11:38:23+02:00, msvensson@pilot.blaudden +1 -0
  Bug #26664 test suite times out on OS X 64bit
   - Make the two "my_net_set*" functions only visible when included
     from my_global.h
[25 May 2007 14:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/27343

ChangeSet@1.2500, 2007-05-25 16:47:43+02:00, msvensson@pilot.blaudden +1 -0
  Bug#26664 test suite times out on OS X 64bit
   - Add checks to make sure net has a vio assigned
   - For example bootstrap will create a fake "net" with vio
     set to 0
[6 Jun 2007 16:54] Bugs System
Pushed into 5.1.20-beta
[6 Jun 2007 16:58] Bugs System
Pushed into 5.0.44
[20 Jun 2007 1:09] Paul Dubois
Noted in 5.0.44, 5.1.20 changelogs.

Connections from one mysqld server to another failed on Mac OS X,
affecting replication and FEDERATED tables.
[28 Nov 2007 10:23] Bugs System
Pushed into 6.0.4-alpha
[28 Nov 2007 10:25] Bugs System
Pushed into 5.1.23-rc
[28 Nov 2007 10:27] Bugs System
Pushed into 5.0.54
[28 Nov 2007 10:47] Bugs System
Pushed into 4.1.24
[28 Nov 2007 18:22] Jon Stephens
Also noted fix in 6.0.4 changelog.