Bug #19951 wait_timeout: different error from expected
Submitted: 19 May 2006 18:55
Modified: 8 Jul 2006 0:28
Reporter: Andrei Elkin
Status: Closed
Category: MySQL Server
Severity: S3 (Non-critical)
Version: 5.0.24, 5.1.11
OS: Linux (FC4smp x86_64)
Assigned to: Kristian Nielsen
CPU Architecture: Any

[19 May 2006 18:55] Andrei Elkin
Description:
wait_timeout                   [ fail ]

Errors are (from /tmp/andrei/5.1/mysql-test/var/log/mysqltest-time) :
mysqltest: At line 27: query 'select 2' succeeded - should have failed with errno 2006...

I could not reproduce it in 10 consecutive attempts.

How to repeat:
make test
[21 May 2006 14:14] Valeriy Kravchuk
Thank you for the problem report. Sorry, but I was not able to repeat it with today's 5.1-BK debug build (ChangeSet@1.2166, 2006-05-20 14:12:42+02:00):

openxs@suse:~/dbs/5.1/mysql-test> ./mysql-test-run wait_timeout
Logging: ./mysql-test-run wait_timeout
Stopping master cluster
Installing Test Databases
Removing Stale Files
Installing Master Databases
running  ../libexec/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/master-data --skip-innodb --skip-ndbcluster --skip-bdb
Installing Master Databases 1
running  ../libexec/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/master-data1 --skip-innodb --skip-ndbcluster --skip-bdb
Installing Slave Databases
running  ../libexec/mysqld --no-defaults --bootstrap --skip-grant-tables     --basedir=.. --datadir=mysql-test/var/slave-data --skip-innodb --skip-ndbcluster --skip-bdb
Loading Standard Test Databases
Starting Tests

TEST                            RESULT
-------------------------------------------------------
wait_timeout                   [ pass ]
-------------------------------------------------------

Ending Tests
Shutting-down MySQL daemon

Master shutdown finished
Slave shutdown finished
All 1 tests were successful.
[5 Jul 2006 8:04] Kristian Nielsen
This problem is caused by a race condition in the test. It is seen in both
Pushbuild and release builds.

What happens is that the test is waiting for the TCP connection ('con1') to time out. While waiting, the socket connection ('default') times out first, confusing the wait loop. The fix is to explicitly close the socket connection.

The problem is easy to repeat by inserting '--sleep 2' just after the second 'flush status' in the test. This always provokes the error (without the sleep, whether the problem occurs depends on exact timing).
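
To illustrate the first race, here is a minimal Python sketch of a toy model; it is not the test's code and not the patch. It assumes, as a simplification, that the wait loop polls a server-wide counter of timed-out connections, so that a timeout on any connection ends the loop. `ToyServer`, `run_test` and the timeout values are hypothetical names and numbers chosen for the illustration.

```python
class ToyServer:
    """Toy model of a server that drops idle connections (not MySQL code)."""

    def __init__(self):
        self.now = 0.0      # simulated clock, in seconds
        self.timeouts = 0   # server-wide count of timed-out connections
        self.conns = {}     # name -> [last_activity, wait_timeout]

    def connect(self, name, wait_timeout):
        self.conns[name] = [self.now, wait_timeout]

    def close(self, name):
        # Assumption: an explicit close is not counted as a timeout.
        self.conns.pop(name, None)

    def advance(self, seconds):
        # Move the clock forward and drop every connection idle too long.
        self.now += seconds
        for name, (last, limit) in list(self.conns.items()):
            if self.now - last >= limit:
                del self.conns[name]
                self.timeouts += 1

    def query(self, name):
        if name not in self.conns:
            return 2006     # "MySQL server has gone away"
        self.conns[name][0] = self.now
        return 0            # success


def run_test(close_default_first):
    srv = ToyServer()
    srv.connect("default", wait_timeout=2)  # stays idle from here on
    srv.advance(1)
    srv.connect("con1", wait_timeout=2)     # the connection under test
    if close_default_first:
        srv.close("default")                # the fix described above
    # Wait loop: poll until the server reports a timed-out connection.
    while srv.timeouts == 0:
        srv.advance(0.5)
    # The test expects this query to fail with error 2006.
    return srv.query("con1")
```

In this model `run_test(close_default_first=False)` returns 0, because 'default' times out first and ends the wait loop while 'con1' is still alive. `run_test(close_default_first=True)` returns 2006, as the test expects.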

There is another race in that test case, causing this error:

mysqltest: At line 23: query 'select 1' failed: 2006: MySQL server has gone away

This problem happens when timing causes the socket connection to time out before the first statement in the test. It can be provoked by inserting '--sleep 2' just before the first 'select 1' statement in the test. Without the sleep, it happens especially in Valgrind runs and on slow hosts. The fix is to make sure the connection timeout is reset just before the first statement in the test.
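
The second race can be sketched with a similar toy model in Python. This is an illustration under stated assumptions, not the committed patch: it assumes the idle timer can be reset by a throwaway statement issued with reconnect enabled, and `IdleConnection`, `first_statement` and the one-second timeout are hypothetical.

```python
GONE_AWAY = 2006  # "MySQL server has gone away"


class IdleConnection:
    """Toy model of one client connection with a server-side idle timeout."""

    def __init__(self, wait_timeout):
        self.wait_timeout = wait_timeout
        self.idle = 0.0
        self.open = True

    def sleep(self, seconds):
        # Time passes with no traffic; the server drops the connection
        # once the idle time reaches wait_timeout.
        if self.open:
            self.idle += seconds
            if self.idle >= self.wait_timeout:
                self.open = False

    def query(self, reconnect=False):
        if not self.open:
            if not reconnect:
                return GONE_AWAY
            self.open = True    # client silently reconnects
        self.idle = 0.0         # any statement resets the idle timer
        return 0


def first_statement(delay, reset_before):
    conn = IdleConnection(wait_timeout=1)
    conn.sleep(delay)               # slow host, Valgrind, or '--sleep 2'
    if reset_before:
        conn.query(reconnect=True)  # throwaway statement resets the timer
    return conn.query()             # the first 'select 1', no reconnect
```

With `delay=2` and no reset, `first_statement` returns 2006, matching the error shown above; with the reset in place it returns 0 regardless of the delay.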

I have committed a patch that fixes these two races.
[5 Jul 2006 8:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8735
[6 Jul 2006 21:41] Timothy Smith
Kristian,

Looks good to push.  Thanks!
[6 Jul 2006 21:51] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/8869
[7 Jul 2006 11:39] Kristian Nielsen
Pushed to mysql-5.0.
[8 Jul 2006 0:28] Kristian Nielsen
Pushed to mysql-5.1.

Fixed in 5.0.24 and 5.1.12.

This fix only involves the test suite, so no documentation is needed.
[1 Aug 2006 18:40] Joerg Bruehe
Just for the record:
This fix is not contained in 5.0.24, but it should be in 5.0.25.