Bug #38992 Server crashes sporadically with 'waiting for initial ...' msg on windows
Submitted: 24 Aug 2008 17:44
Modified: 12 Nov 2009 18:38
Reporter: Alexander Nozdrin
Status: Closed
Category: Tests: Server
Severity: S3 (Non-critical)
Version: 6.0-TRUNK
OS: Windows (vm-win2003-32-a)
Assigned to: Alexander Nozdrin
CPU Architecture: Any
Tags: crash, pushbuild, sporadic, test failures, widespread

[24 Aug 2008 17:44] Alexander Nozdrin
Description:
Symptoms:
----------------------------------------------------------
main.archive                   [ fail ]

mysqltest: Could not open connection 'default': 2013 Lost connection to MySQL server at 'waiting for initial communication packet', system error: 0

Stopping All Servers
----------------------------------------------------------

Seen:
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0-falcon-chris&ord...
https://intranet.mysql.com/secure/pushbuild/showpush.pl?dir=bzr_mysql-6.0&order=52

Platform: cm-win2003-32-a

How to repeat:
https://intranet.mysql.com/secure/pushbuild/xref.pl?startdate=&enddate=&dir=&plat=&testtyp...
[8 Oct 2008 18:21] Patrick Crews
Currently on the last set of 1000 runs of this test.

Have run iterations on Windows and Mac for:
archive test with and without ps-protocol (failures noted with --ps-protocol)
ansi + archive test with and without ps-protocol (to determine if a preceding test might have caused the failure)

Have not been able to duplicate a failure so far.  Suspect this was a random Pushbuild issue rather than a test or server defect.
[9 Oct 2008 12:46] Patrick Crews
Finished running the last of my 1000 iteration runs.
Performed 6k runs of this test and some of the preceding tests without failure.

As noted, there are a number of other tests with this type of failure.  Suspect this is due to a Pushbuild 'hiccup' rather than an actual defect in the test or the server code.
[9 Oct 2008 13:24] Alexander Nozdrin
I think I should reopen this bug as a widespread one.
As shown on XRef (http://tinyurl.com/4gttad),
there are many failures with similar symptoms.

So I changed the title, added the "widespread" tag,
and changed Impact to "I2".
[9 Oct 2008 13:25] Alexander Nozdrin
Symptoms:
Lost connection to MySQL server at 'waiting for initial communication packet'
[11 Jun 2009 4:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76093

2798 Alexander Nozdrin	2009-06-11
      Fix for Bug#38992: Server crashes sporadically with
      'waiting for initial ...' msg on windows.
      
      The problem is that connection timeout is too small
      for busy windows box.
      
      The fix is to
        - add support for connect_timeout command line argument
          to mysqltest;
        - increase connection timeout to 200 seconds.
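
The failure mode described in this commit message can be illustrated with a small sketch: a client connects, then waits for the server's first packet, and gives up if the server is too busy to send it in time. This is not mysqltest or MySQL protocol code. The names slow_greeting_server and wait_for_greeting, the b"hello" greeting, and the delay values are all hypothetical and chosen only for illustration.

```python
import socket
import threading
import time


def slow_greeting_server(delay_seconds, greeting=b"hello"):
    """Start a one-shot TCP server that waits before sending its first packet.

    Returns the port it listens on. The delay stands in for a busy host.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve():
        conn, _ = listener.accept()
        try:
            time.sleep(delay_seconds)  # simulated load before the greeting
            conn.sendall(greeting)
        except OSError:
            pass  # the client may already have given up and closed
        finally:
            conn.close()
            listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return port


def wait_for_greeting(port, timeout_seconds):
    """Connect and wait for the first packet; return it, or None on timeout."""
    with socket.create_connection(("127.0.0.1", port),
                                  timeout=timeout_seconds) as sock:
        try:
            return sock.recv(64)
        except socket.timeout:
            return None  # analogous to giving up while waiting for the greeting
```

With a server that delays its greeting by half a second, a client timeout of 0.1 seconds gives up, while a generous timeout receives the greeting. The same reasoning motivates raising the client-side connection timeout on a loaded test host.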
[11 Jun 2009 14:21] Bjørn Munch
opt_connect_timeout is declared and read as ulong but then assigned to a uint for use.  There's no need for an 18-hour timeout :-) so it should be uint IMHO.

I also wonder whether 200s is really necessary; if it really takes that long, I'd think there's some other problem. But I don't have experience with it.

I'd like Magnus to look at it too.
[11 Jun 2009 17:01] Magnus Blåudd
1). Looking in "mysql-test/include/default_mysqld.cnf" I see that we (I did it) increased the connect_timeout value for mysqld. So, of course, the clients' timeout should have been increased the same way. I would suggest using the server connect_timeout value * 2. Please check the bug associated with that fix.

# Default values that applies to all MySQL Servers
[mysqld]
<snip>
# Increase default connect_timeout to avoid intermittent
# disconnects when test servers are put under load see BUG#28359
connect-timeout=            60

2). Would prefer if we don't modify ConfigFactory.pm for simple settings like this and instead use include/default_mysqld.cnf if possible. Just add a [client] section and all clients will use it. But maybe not all our clients support connect-timeout? Then add a [mysqltest] section... Just an "advice" :)

3). mysql_options takes "const void*" as its third arg -> why cast to "char*"?
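
As a rough illustration of suggestion 1), the client timeout could be derived from the server setting rather than hard-coded. The sketch below is an assumption-laden stand-in, not part of the test framework: SAMPLE_CNF is a hypothetical excerpt (only the connect-timeout value of 60 comes from this thread), and client_connect_timeout is a made-up helper name.

```python
import configparser

# Hypothetical excerpt of a server config; only the value 60 is from the thread.
SAMPLE_CNF = """
[mysqld]
connect-timeout=            60
"""


def client_connect_timeout(cnf_text, factor=2):
    """Return the server's connect-timeout multiplied by `factor`."""
    parser = configparser.ConfigParser()
    parser.read_string(cnf_text)
    # configparser strips the padding around the value before converting it.
    server_timeout = parser.getint("mysqld", "connect-timeout")
    return server_timeout * factor
```

Calling client_connect_timeout(SAMPLE_CNF) with the default factor of 2 yields 120 seconds for a server value of 60.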
[12 Jun 2009 7:52] Magnus Blåudd
We should probably set the default connect_timeout of mysqltest to 120 seconds. That would avoid the need to have that number in any .cnf file.
[12 Jun 2009 8:33] Bjørn Munch
I agree with Magnus in his last comment: making this the mysqltest default makes it much simpler. I had the same thought in my head when reading the patch but the thought didn't quite surface :-)
[19 Jun 2009 12:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76683

2798 Alexander Nozdrin	2009-06-18
      Fix for Bug#38992: Server crashes sporadically with
      'waiting for initial ...' msg on windows.
      
      The problem is that connection timeout is too small
      for busy windows box.
      
      The fix is to
        - add support for connect_timeout command line argument
          to mysqltest;
        - increase connection timeout to 120 seconds.
[19 Jun 2009 13:01] Alexander Nozdrin
Patch queued to mysql-azalea-bugfixing.
[19 Jun 2009 13:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/76706

2803 Alexander Nozdrin	2009-06-19
      Fix for Bug#38992: Server crashes sporadically with
      'waiting for initial ...' msg on windows.
      
      The problem is that connection timeout is too small
      for busy windows box.
      
      The fix is to
        - add support for connect_timeout command line argument
          to mysqltest;
        - set default value of the connect_timeout option to
          120 seconds.
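
The shape of this fix, an option with a built-in default that can still be overridden on the command line, might look roughly like the following. This is only a sketch using Python's argparse as a stand-in; mysqltest has its own option handling, and the program name "client-sketch" and the helper parse_args are hypothetical.

```python
import argparse

# Default from the commit message above: 120 seconds.
DEFAULT_CONNECT_TIMEOUT = 120


def parse_args(argv):
    """Parse a connect_timeout option, falling back to the built-in default."""
    parser = argparse.ArgumentParser(prog="client-sketch")
    parser.add_argument(
        "--connect_timeout",
        type=int,
        default=DEFAULT_CONNECT_TIMEOUT,
        help="number of seconds before the connection attempt times out",
    )
    return parser.parse_args(argv)
```

With the default built into the client, no .cnf file needs to carry the number, and a run on an unusually slow host can still pass a larger value explicitly.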
[3 Jul 2009 6:13] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090702084644-k95gd2asolvz2zpu) (version source revid:jon.hauglid@sun.com-20090625092953-xiur7w0mz78g6nmo) (merge vers: 5.4.4-alpha) (pib:11)
[9 Jul 2009 7:35] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090702084644-k95gd2asolvz2zpu) (version source revid:jon.hauglid@sun.com-20090625092953-xiur7w0mz78g6nmo) (merge vers: 5.4.4-alpha) (pib:11)
[9 Jul 2009 16:26] Paul DuBois
Test suite changes. No changelog entry needed.
[16 Oct 2009 15:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87157

2930 Alexander Nozdrin	2009-10-16
      Backporting patch for Bug#38992 from 6.0.
      Original revision:
      revno: 2617.55.2
      committer: Alexander Nozdrin <alik@sun.com>
      branch nick: azalea-bf-bug38992
      timestamp: Fri 2009-06-19 16:41:16 +0400
      message:
        Fix for Bug#38992: Server crashes sporadically with
        'waiting for initial ...' msg on windows.
        
        The problem is that connection timeout is too small
        for busy windows box.
        
        The fix is to
          - add support for connect_timeout command line argument
            to mysqltest;
          - set default value of the connect_timeout option to
            120 seconds.
[16 Oct 2009 16:38] Alexander Nozdrin
Pushed into 5.5.0.
[16 Oct 2009 17:28] Paul DuBois
Test suite changes. No changelog entry needed.
[3 Nov 2009 7:17] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091102151658-j9o4wgro47m5v84d) (version source revid:alik@sun.com-20091023064702-2f8jdmny61bdl94u) (merge vers: 6.0.14-alpha) (pib:13)
[3 Nov 2009 15:38] Paul DuBois
Test suite changes. No changelog entry needed.
[12 Nov 2009 8:20] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:mikael@mysql.com-20091103113702-p61dlwc6ml6fxg18) (merge vers: 5.5.0-beta) (pib:13)